CN114428837A - Content quality evaluation method, content quality evaluation device, content quality evaluation medium, and electronic device - Google Patents
- Publication number
- CN114428837A CN202111671403.5A
- Authority
- CN
- China
- Prior art keywords
- evaluation
- target
- text
- determining
- content
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3346—Query execution using probabilistic model
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Computational Biology (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure relates to a content quality evaluation method, apparatus, medium, and electronic device. The method includes: acquiring a first evaluation result and an evaluation text of a target user for target content; determining a second evaluation result of the target content according to the evaluation text; determining a matching degree between the target user and the target content; and determining a quality evaluation result of the target user for the target content according to the first evaluation result, the second evaluation result, and the matching degree. Because the quality evaluation result combines these three items, it takes into account both the evaluation score and the comment text input by the user, so that the determined quality evaluation result stays consistent with the evaluation text input by the user. In addition, incorporating the matching degree between the user and the content reduces the influence of user subjectivity to a certain extent, which improves the accuracy and objectivity of the quality evaluation result and how well it fits the target content.
Description
Technical Field
The present disclosure relates to the field of data processing, and in particular, to a content quality evaluation method, apparatus, medium, and electronic device.
Background
Text sentiment analysis refers to the process of analyzing, processing, and extracting information from subjective texts that carry emotional coloring, using methods such as natural language processing, text mining, and computational linguistics.
In a content quality evaluation system for content such as movies and TV series, a viewer can give an evaluation text and an evaluation score for a piece of content, which serves as a quality evaluation of that content and as a basis for recommending content to other users. In this evaluation manner, however, the score is usually given on a five-level scale, and a user who enters a score directly can easily give full marks even though the accompanying evaluation text is not entirely positive. In other words, the evaluation score and the evaluation text input by a user often do not match, so the evaluation scores of many contents are inflated, and it is difficult to give users an accurate reference for choosing what to watch.
Disclosure of Invention
The purpose of the present disclosure is to provide an accurate and objective content quality evaluation method, apparatus, medium, and electronic device.
In order to achieve the above object, according to a first aspect of the present disclosure, there is provided a content quality evaluation method including:
acquiring a first evaluation result and an evaluation text of a target user for target content;
determining a second evaluation result of the target content according to the evaluation text;
determining the matching degree corresponding to the target user and the target content;
and determining the quality evaluation result of the target user for the target content according to the first evaluation result, the second evaluation result and the matching degree.
Optionally, the determining the matching degree corresponding to the target user and the target content includes:
determining a user vector corresponding to the target user based on the historical evaluation text of the target user;
determining a content vector corresponding to the target content based on a plurality of evaluation texts corresponding to the target content;
and determining the matching degree according to the user vector and the content vector.
Optionally, the determining, based on the historical evaluation text of the target user, a user vector corresponding to the target user includes:
clustering the historical evaluation texts to obtain a plurality of clusters corresponding to the historical evaluation texts;
for each cluster, splicing the historical evaluation texts in the cluster whose text length is smaller than a preset length threshold to obtain at least one spliced text;
and determining subject words corresponding to the target user based on the spliced texts in each cluster, the historical evaluation texts whose text length is not less than the length threshold, and a topic generation model, and determining the user vector based on the vectors corresponding to the subject words.
Optionally, the determining a content vector corresponding to the target content based on the plurality of evaluation texts corresponding to the target content includes:
determining a word frequency, an inverse document frequency, and a text length proportion corresponding to each word in the plurality of evaluation texts corresponding to the target content, wherein the text length proportion corresponding to a word is the ratio of the length of the evaluation text to which the word belongs to the average text length of the plurality of evaluation texts;
for each word, determining the product of the word frequency, the inverse document frequency, and the text length proportion corresponding to the word as a target parameter corresponding to the word;
and determining target words corresponding to the target content according to the target parameter corresponding to each word, and determining the content vector based on the vectors corresponding to the target words.
Optionally, the determining, according to the evaluation text, a second evaluation result of the target content includes:
determining a classification corresponding to the evaluation text according to the evaluation text and a text classification model, and determining the score indicated by the classification as the second evaluation result;
wherein, in the training process of the text classification model, feature extraction is performed by a feature extraction sub-model, target features are obtained by a fully connected layer, and the prediction result of the text classification model is obtained by predicting on a subset of the target features.
Optionally, the determining, according to the first evaluation result, the second evaluation result, and the matching degree, a quality evaluation result of the target user for the target content includes:
determining a weighted sum of the first and second evaluation results as an initial evaluation result;
and adjusting the initial evaluation result according to the matching degree to obtain the quality evaluation result.
Optionally, the method further comprises:
determining recommended content corresponding to the target user according to a quality evaluation result corresponding to each content in a content library;
and outputting the recommended content.
According to a second aspect of the present disclosure, there is provided a content quality evaluation apparatus, the apparatus including:
an acquisition module, configured to acquire a first evaluation result and an evaluation text of a target user for target content;
a first determining module, configured to determine a second evaluation result of the target content according to the evaluation text;
a second determining module, configured to determine the matching degree between the target user and the target content;
and a third determining module, configured to determine, according to the first evaluation result, the second evaluation result, and the matching degree, a quality evaluation result of the target user for the target content.
According to a third aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of any of the methods of the first aspect.
According to a fourth aspect of the present disclosure, there is provided an electronic device comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to implement the steps of the method of any of the first aspects.
Therefore, in the above technical solution, the first evaluation result is the score input by the target user, the second evaluation result is a score determined from the evaluation text input by the target user, and the matching degree between the target user and the target content indicates, to a certain extent, how likely the user's evaluation is to be subjective. Determining the quality evaluation result from these three items takes both the evaluation score and the comment text input by the user into account, so that the determined quality evaluation result stays consistent with the evaluation text input by the user. In addition, incorporating the matching degree between the user and the content reduces the influence of user subjectivity to a certain extent, which improves the accuracy and objectivity of the quality evaluation result and how well it fits the target content, provides users with an accurate reference for viewing content, and improves user experience.
Additional features and advantages of the disclosure will be set forth in the detailed description which follows.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure and are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description serve to explain the disclosure without limiting the disclosure. In the drawings:
FIG. 1 is a flowchart of a content quality evaluation method provided according to an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of a text classification model provided in accordance with one embodiment of the present disclosure;
FIG. 3 is a flow diagram of an exemplary implementation of determining a corresponding degree of match of a target user and target content;
FIG. 4 is a block diagram of a content quality evaluation apparatus provided according to an embodiment of the present disclosure;
FIG. 5 is a block diagram illustrating an electronic device in accordance with an exemplary embodiment;
FIG. 6 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
The following detailed description of specific embodiments of the present disclosure is provided in connection with the accompanying drawings. It should be understood that the detailed description and specific examples, while indicating the present disclosure, are given by way of illustration and explanation only, not limitation.
FIG. 1 is a flowchart of a content quality evaluation method provided according to an embodiment of the present disclosure. As shown in FIG. 1, the method may include:
In step 11, a first evaluation result and an evaluation text of the target user for the target content are acquired. The target content may be any multimedia content, such as a movie, a TV series, or an animation. For example, the first evaluation result may be a score input by the target user for the target content, and the evaluation text may be the comment text input by the target user for the target content.
In step 12, a second evaluation result of the target content is determined based on the evaluation text.
The evaluation text is the target user's comment on the target content, so it also reflects, to some extent, the target user's attitude toward the target content. In this embodiment, another evaluation result of the target user for the target content may therefore be determined from the evaluation text. For example, the second evaluation result may be represented by a score whose value range is the same as that of the score corresponding to the first evaluation result, so that the second evaluation result represents a score determined from the evaluation text input by the target user.
In step 13, the matching degree between the target user and the target content is determined.
Each target user may have corresponding fields of interest, and a user may give a better evaluation to target content that falls within those fields; that is, the user may give a biased evaluation. Therefore, in this embodiment, the matching degree between the target user and the target content may be determined to indicate, to some extent, how likely the user's evaluation is to be subjective: a higher matching degree indicates a higher likelihood that the evaluation given by the user is subjective.
In step 14, a quality evaluation result of the target user for the target content is determined according to the first evaluation result, the second evaluation result and the matching degree.
Therefore, in the above technical solution, the first evaluation result is the score input by the target user, the second evaluation result is a score determined from the evaluation text input by the target user, and the matching degree between the target user and the target content indicates, to a certain extent, how likely the user's evaluation is to be subjective. Determining the quality evaluation result from these three items takes both the evaluation score and the comment text input by the user into account, so that the determined quality evaluation result stays consistent with the evaluation text input by the user. In addition, incorporating the matching degree between the user and the content reduces the influence of user subjectivity to a certain extent, which improves the accuracy and objectivity of the quality evaluation result and how well it fits the target content, provides users with an accurate reference for viewing content, and improves user experience.
As an example, score prediction may be performed on the evaluation text based on a scoring model to obtain the second evaluation result. For example, the scoring model may be implemented based on a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or the like, and trained using evaluation texts labeled with scores as training samples. The training may be performed in a manner common in the art and is not described herein again.
As another example, an exemplary implementation of determining the second evaluation result of the target content according to the evaluation text in step 12 includes:
determining a classification corresponding to the evaluation text according to the evaluation text and a text classification model, and determining the score indicated by the classification as the second evaluation result.
In the training process of the text classification model, feature extraction is performed by a feature extraction sub-model, target features are obtained by a fully connected layer, and the prediction result of the text classification model is obtained by predicting on a subset of the target features.
For example, the feature extraction sub-model may be implemented based on a Transformer model. A schematic structural diagram of the corresponding text classification model is shown in FIG. 2, where the number of layers of each feature layer is only an example and does not limit the scheme of the present disclosure.
As shown in FIG. 2, the text classification model includes a feature extraction sub-model, which may be composed of a Transformer model, followed by a fully connected layer and a dropout (random deactivation) layer, and may combine an activation function and a normalization layer to obtain the prediction result. Illustratively, the activation function may be GELU. For example, if scores range from 1 to 5 points, 5 categories can be set at the normalization layer, corresponding to 1, 2, 3, 4, and 5 points respectively.
Thus, in this embodiment, the evaluation text may be segmented into a word sequence, and each word is vectorized and input into the text classification model. During feature extraction, the attention mechanism allows the model to focus more on features related to scoring; after the features are extracted, the target features are obtained by the fully connected layer, which supports accurate feature extraction and provides reliable data for the subsequent prediction. The prediction result of the text classification model is obtained by predicting on a subset of the target features. For example, a dropout probability may be set for each neural network layer through the dropout layer, and during training, units are temporarily removed from the network according to that probability before prediction and classification are performed to obtain the prediction result used for training. In the training process, prediction is therefore based on a randomly selected subset of the target features, and because the subset is selected at random for each step of stochastic gradient descent, each mini-batch effectively trains a different network. This helps prevent the text classification model from overfitting, improves the accuracy of the trained model and of the determined second evaluation result, and supports objective and accurate evaluation of the target content.
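By way of illustration only, the prediction head described above can be sketched in plain Python: dropout is applied to a feature vector during training, and a softmax over five classes is mapped to a score of 1 to 5. This is not the disclosed model. The function names, the inverted-dropout scaling, the use of softmax as the normalization step, and the example logits are all assumptions made for the sketch.

```python
import math
import random


def gelu(x):
    """Exact GELU activation: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def dropout(features, p, rng, training=True):
    """Inverted dropout: zero each feature with probability p during training.

    Surviving features are scaled by 1 / (1 - p) so the expected activation
    is unchanged; at inference time the features pass through untouched.
    """
    if not training or p == 0.0:
        return list(features)
    keep = 1.0 - p
    return [f / keep if rng.random() < keep else 0.0 for f in features]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def score_from_logits(logits):
    """Map 5 class logits to a score in 1..5 (class index 0 -> 1 point)."""
    probs = softmax(logits)
    return probs.index(max(probs)) + 1


# Hypothetical logits for one evaluation text; index 3 holds the largest value.
example_logits = [-1.2, 0.1, 0.8, 2.3, 1.9]
second_evaluation_result = score_from_logits(example_logits)  # 4
```

Because a different random subset of features is zeroed on each call during training, successive mini-batches see different effective networks, which is the regularizing effect the paragraph above describes.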
In one possible embodiment, an exemplary implementation of determining the matching degree between the target user and the target content in step 13 is shown in FIG. 3 and may include:
In step 31, a user vector corresponding to the target user is determined based on the historical evaluation texts of the target user.
In this step, the evaluation texts of the target user may be obtained with the user's authorization, so that the features of the user, that is, the user vector, can be derived from the user's historical evaluations.
As an example, keywords may be extracted from a plurality of historical evaluation texts of the target user and used as feature words of the target user, and the user vector may then be obtained by vectorizing the feature words.
As another example, determining the user vector corresponding to the target user based on the historical evaluation texts of the target user may include:
clustering the historical evaluation texts to obtain a plurality of clusters corresponding to the historical evaluation texts.
The historical evaluation texts may be clustered based on common algorithms such as K-means or KNN to obtain the plurality of clusters.
For each cluster, the historical evaluation texts in the cluster whose text length is smaller than a preset length threshold are spliced to obtain at least one spliced text.
When evaluating content, users often write short texts, and it is difficult to generate subject words from an evaluation text that is too short. Thus, in this embodiment, the historical evaluation texts are clustered first, so that historical evaluation texts with similar features are grouped together.
For example, the preset length threshold may be set based on the actual application scenario, which is not limited by the present disclosure. Within each cluster, a historical evaluation text whose text length is smaller than the length threshold is treated as a short text, and the short texts in the same cluster may be spliced together. In this way, a short text is extended with texts that have similar features, so that the text is lengthened while the characteristics of each historical evaluation text are preserved.
As an example, all the historical evaluation texts in a cluster whose text length is smaller than the preset length threshold may be spliced into one spliced text. As another example, to avoid an excessively long spliced text, a length limit may be set for the spliced text. When splicing the short texts in the current cluster, suppose the current spliced text S is shorter than the length limit. If the spliced text S' obtained by adding a new historical evaluation text A would exceed the length limit, A is not appended: the current spliced text S is taken as a completed spliced text, and A starts a new spliced text to be further spliced with other historical evaluation texts. A plurality of spliced texts is obtained in this way.
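The length-limited splicing described above can be sketched as a single greedy pass over the texts of one cluster. This is a minimal illustration rather than the disclosed implementation: the function name `splice_short_texts`, the space separator, measuring length in characters, and the sample texts are assumptions.

```python
def splice_short_texts(texts, length_threshold, length_limit, sep=" "):
    """Greedily splice the short texts of one cluster into longer texts.

    Returns (spliced, long_texts). Texts shorter than length_threshold are
    concatenated in order; a new spliced text is started whenever adding the
    next short text would exceed length_limit. Texts at or above the
    threshold are returned unchanged in long_texts.
    """
    spliced, long_texts = [], []
    current = ""
    for text in texts:
        if len(text) >= length_threshold:
            long_texts.append(text)
            continue
        candidate = text if not current else current + sep + text
        if current and len(candidate) > length_limit:
            spliced.append(current)  # close the current spliced text S
            current = text           # the new text A starts the next one
        else:
            current = candidate
    if current:
        spliced.append(current)
    return spliced, long_texts


# Hypothetical cluster of historical evaluation texts.
cluster = [
    "great cast",
    "slow start",
    "loved it",
    "the plot twists in the final act were well earned",
]
spliced, long_texts = splice_short_texts(cluster, length_threshold=20, length_limit=25)
# spliced    -> ['great cast slow start', 'loved it']
# long_texts -> ['the plot twists in the final act were well earned']
```

Both returned lists would then be passed to the topic generation model; for unsegmented Chinese text, an empty separator may be more appropriate.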
Then, subject words corresponding to the target user are determined based on the spliced texts in each cluster, the historical evaluation texts whose text length is not less than the length threshold, and a topic generation model, and the user vector is determined based on the vectors corresponding to the subject words.
A historical evaluation text whose text length is not less than the length threshold can be used directly for subject word generation. As described above, the historical evaluation texts in each cluster whose text length is smaller than the length threshold are spliced, so the resulting spliced texts are longer texts. Therefore, in this embodiment, the subject words may be obtained from the topic generation model based on the spliced texts and the unspliced historical evaluation texts in the clusters. For example, a preset number of words may be selected as subject words from the probability distribution output by the topic generation model, in descending order of probability, and the subject words are then vectorized to obtain the user vector representing the user's features.
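Selecting a preset number of subject words in descending order of probability can be sketched as follows. The word distribution shown is hypothetical and simply stands in for the output of a topic generation model; the function name and the alphabetical tie-breaking rule are assumptions.

```python
def top_subject_words(word_probs, n):
    """Return the n most probable words, highest probability first.

    Ties are broken alphabetically so that the result is deterministic.
    """
    ranked = sorted(word_probs.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


# Hypothetical word distribution produced by a topic model.
distribution = {
    "suspense": 0.31,
    "soundtrack": 0.07,
    "twist": 0.24,
    "pacing": 0.12,
    "cast": 0.26,
}
subject_words = top_subject_words(distribution, n=3)
# -> ['suspense', 'cast', 'twist']
```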
For example, the topic generation model may be implemented based on LDA (Latent Dirichlet Allocation); its training follows the prior art and is not described herein again. The LDA model is an unsupervised model. In this embodiment, it may be pre-trained on existing prior texts, such as users' evaluation texts for thriller movies or for comedy movies, which can improve, to a certain extent, the subject words that the topic generation model determines for the specific scenarios users care about.
In this way, short texts with similar features are merged and spliced so that subject words can be determined from longer texts. This improves the accuracy of the determined subject words to a certain extent, that is, the accuracy of the determined user features, and provides accurate data for building the user portrait.
In step 32, a content vector corresponding to the target content is determined based on a plurality of evaluation texts corresponding to the target content.
In this way, the overall characteristics of the target content, that is, the content vector, can be obtained from the evaluation texts written by multiple users about the same target content.
As an example, keywords may be extracted from the plurality of evaluation texts and used as feature words of the target content, and the content vector may then be obtained by vectorizing the feature words.
As another example, determining the content vector corresponding to the target content based on the plurality of evaluation texts corresponding to the target content may include:
determining a word frequency, an inverse document frequency, and a text length proportion corresponding to each word in the plurality of evaluation texts corresponding to the target content, where the text length proportion corresponding to a word is the ratio of the length of the evaluation text to which the word belongs to the average text length of the plurality of evaluation texts;
Each evaluation text may be treated as a document and segmented to obtain its words. The word frequency TF, the inverse document frequency IDF, and the text length proportion DOC_LEN corresponding to each word can then be determined. Per the definition above, for a word $w$ in evaluation text $d$ among $N$ evaluation texts $d_1, \dots, d_N$, the text length proportion is:
$$\mathrm{DOC\_LEN}(w) = \frac{\mathrm{len}(d)}{\frac{1}{N}\sum_{j=1}^{N} \mathrm{len}(d_j)}$$
for each word, determining the product of the word frequency, the inverse document frequency, and the text length proportion corresponding to the word as a target parameter corresponding to the word; and determining target words corresponding to the target content according to the target parameter corresponding to each word, and determining the content vector based on the vectors corresponding to the target words.
The target words corresponding to the target content may be the top N words when the words are ranked by target parameter in descending order, and the target words may then be vectorized to obtain the content vector. The subject words of the target user and the target words of the target content are vectorized in the same manner, and a vectorization method common in the art, such as word2vec, may be selected, which is not limited by the present disclosure.
Thus, when the target words are determined from the evaluation texts, the text length of the evaluation text to which each word belongs is considered together with the word frequency and the inverse document frequency of the word. As noted above, it is difficult to extract keywords accurately from a short evaluation text. In the present disclosure, the text length proportion therefore raises the importance of words from longer evaluation texts, which improves the accuracy of the determined target words and, in turn, the accuracy and comprehensiveness of the features in the content vector.
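The ranking of words by the product of TF, IDF, and text length proportion can be sketched as below. The exact TF and IDF variants are not recoverable from this text, so the sketch assumes TF = count / document length and IDF = ln(N / df) + 1. Keeping the highest score for a word that occurs in several evaluation texts is also an assumption, as are the function name and the sample documents.

```python
import math
from collections import Counter


def target_words(docs, top_n):
    """Rank words by TF * IDF * DOC_LEN and return the top_n words.

    docs: list of tokenized evaluation texts (lists of words).
    Assumed variants: TF = count / document length, IDF = ln(N / df) + 1.
    DOC_LEN = document length / average document length, as in the text.
    A word occurring in several documents keeps its highest score.
    """
    n_docs = len(docs)
    total_len = sum(len(d) for d in docs)
    if n_docs == 0 or total_len == 0:
        return []
    avg_len = total_len / n_docs
    # Document frequency: number of documents containing each word.
    doc_freq = Counter(word for d in docs for word in set(d))

    best = {}
    for doc in docs:
        if not doc:
            continue
        doc_len_ratio = len(doc) / avg_len
        for word, count in Counter(doc).items():
            tf = count / len(doc)
            idf = math.log(n_docs / doc_freq[word]) + 1.0
            score = tf * idf * doc_len_ratio
            best[word] = max(best.get(word, 0.0), score)

    ranked = sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


# Hypothetical segmented evaluation texts for one piece of content.
docs = [
    ["good"],
    ["plot", "twist", "plot", "ending", "good", "acting"],
    ["acting", "good", "music"],
]
top_word = target_words(docs, top_n=1)  # ['plot']
```

With this TF variant, TF multiplied by DOC_LEN reduces to the raw count divided by the average text length, so words from longer texts are not penalized by length normalization, which matches the stated intent of raising their importance.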
In step 33, the matching degree is determined based on the user vector and the content vector.
For example, the cosine similarity between the user vector and the content vector may be calculated and used as the matching degree, for example, by the following formula:
$$\cos(\beta_1, \beta_2) = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}} \, \sqrt{\sum_{i=1}^{n} B_i^{2}}}$$
where $\beta_1$ represents the user vector, $\beta_2$ represents the content vector, $A_i$ represents the i-th feature in the user vector, $B_i$ represents the i-th feature in the content vector, and $n$ represents the feature dimension of the user vector and the content vector, which have the same number of dimensions.
Thus, with the above technical solution, the interest features of the target user can be obtained from the user's historical evaluation texts, and the features of the target content can be obtained from the plurality of evaluation texts of the target content. The matching degree between the target user and the target content indicates whether the user is likely to evaluate the target content objectively, and it serves as a parameter for determining the final quality evaluation result, which supports the objectivity of that result.
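A minimal sketch of the cosine-similarity matching degree follows. The function name, the example vectors, and the choice to return 0 for an all-zero vector are assumptions; in practice the vectors would come from the vectorized subject words and target words.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: treat an all-zero vector as no match
    return dot / (norm_a * norm_b)


# Hypothetical 3-dimensional user and content vectors pointing the same way.
user_vector = [1.0, 2.0, 2.0]
content_vector = [2.0, 4.0, 4.0]
matching_degree = cosine_similarity(user_vector, content_vector)  # 1.0
```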
In one possible embodiment, an exemplary implementation of step 14, in which the quality evaluation result of the target user for the target content is determined according to the first evaluation result, the second evaluation result and the matching degree, may include:
determining the weighted sum of the first evaluation result and the second evaluation result as an initial evaluation result.
For example, the weights of the first evaluation result and the second evaluation result may be preset so that they sum to 1, and may be set according to the actual application scenario, which is not limited by the present disclosure. As an example, both weights are 0.5, in which case the average of the first evaluation result and the second evaluation result is determined as the initial evaluation result, so that both the score entered by the user and the evaluation text are taken into account.
adjusting the initial evaluation result according to the matching degree to obtain the quality evaluation result.
As described above, the matching degree indicates how subjective the target user's evaluation of the target content is: the higher the matching degree, the stronger the subjectivity, and the more likely the target user is to give a biased evaluation. Therefore, when the quality evaluation result needs to offset the influence of subjective preference, the initial evaluation result is adjusted according to the matching degree; for example, the result of subtracting the matching degree from the initial evaluation result may be used as the quality evaluation result. Alternatively, based on a preset adjustment weight for the matching degree, the product of the matching degree and the adjustment weight may be taken as an adjustment value, and the quality evaluation result is obtained by subtracting the adjustment value from the initial evaluation result.
Therefore, with this technical solution, the determined quality evaluation result is consistent with the evaluation score and the evaluation text entered by the user, while the influence of the user's subjective preference is reduced to a certain extent, which ensures the accuracy and objectivity of the determined quality evaluation result. Moreover, compared with the first evaluation results entered by users, the quality evaluation results are more dispersed to a certain extent. This avoids the situation in which the evaluation results of different contents overlap heavily and the quality of the contents is hard to distinguish, and provides a more accurate data reference for users to determine the actual quality of each content.
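The weighted sum and the adjustment by matching degree can be sketched as follows. This is an illustrative sketch only: the function and parameter names are hypothetical, and the score scale in the usage example is assumed rather than taken from the disclosure. Setting `adjustment_weight` to 1.0 reproduces the plain subtraction of the matching degree; other values reproduce the weighted variant.

```python
def quality_evaluation(first_result, second_result, matching_degree,
                       first_weight=0.5, adjustment_weight=1.0):
    """Combine the user's score and the text-derived score, then offset bias.

    first_weight is the weight of the first evaluation result; the second
    evaluation result receives 1 - first_weight so that the weights sum to 1.
    """
    if not 0.0 <= first_weight <= 1.0:
        raise ValueError("first_weight must lie in [0, 1]")
    # Initial evaluation result: weighted sum of the two evaluation results.
    initial = first_weight * first_result + (1.0 - first_weight) * second_result
    # Adjustment value: matching degree scaled by its preset adjustment weight.
    adjustment = adjustment_weight * matching_degree
    return initial - adjustment
```

For instance, with an entered score of 8, a text-derived score of 6, equal weights, and a matching degree of 0.5, the initial evaluation result is 7 and the quality evaluation result is 6.5.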
In one possible embodiment, the method may further comprise:
determining recommended content corresponding to the target user according to a quality evaluation result corresponding to each content in a content library; and outputting the recommended content. Wherein, the quality evaluation result corresponding to all or part of the content in the content library is determined according to the content quality evaluation method.
As an example, the contents in the content library may be sorted in descending order of their quality evaluation results, and the top P contents may be selected as the recommended content for the target user, where P may be set according to actual user requirements. In this way, higher-quality content is recommended to the target user, which safeguards the user's viewing experience.
As another example, the contents in the content library may be sorted in descending order of their quality evaluation results, and the top Q contents may be selected as candidate contents for the target user, where Q may be set according to actual user requirements. Then, based on the selection matching degree between each candidate content and the target user, the top P candidate contents in descending order of selection matching degree are selected as the recommended content and displayed in that order, where P is less than or equal to Q. The selection matching degree may be determined by a similarity calculation between the tag vector of the candidate content and the interest vector of the target user, and indicates whether the candidate content matches the user's interest preferences. Illustratively, the tag vector may be obtained by vectorizing the type tags of the candidate content, such as genre (e.g., feature film, documentary) and type (e.g., comedy, tragedy, urban, rural), and the interest vector may be obtained by vectorizing the user's interest tags, obtained with the user's authorization, such as idol, youth, urban, and the like. In this way, higher-quality content is recommended to the target user while the display order reflects the target user's interest preferences, which ensures the match between the recommended content and the target user, improves the diversity and personalization of the recommended content, and further improves the user experience.
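The two-stage selection in the second example can be sketched as follows. The function name and the dictionary inputs are hypothetical; the selection matching degrees are assumed to have been computed beforehand (for example, by a similarity between tag vectors and the interest vector), and a candidate with no recorded matching degree is assumed to score 0.0.

```python
def recommend(quality_results, selection_matching, q, p):
    """Pick the top-Q contents by quality, then the top-P of those by match.

    quality_results: dict mapping content id -> quality evaluation result.
    selection_matching: dict mapping content id -> selection matching degree.
    """
    if p > q:
        raise ValueError("P must be less than or equal to Q")
    # Stage 1: candidate contents are the Q highest-quality contents.
    candidates = sorted(quality_results, key=lambda c: quality_results[c],
                        reverse=True)[:q]
    # Stage 2: order candidates by how well they match the user's interests.
    ranked = sorted(candidates, key=lambda c: selection_matching.get(c, 0.0),
                    reverse=True)
    return ranked[:p]
```

Filtering on quality first means that a content with a high selection matching degree but a low quality evaluation result never reaches the recommended list.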
Based on the same inventive concept, the present disclosure also provides a content quality evaluation apparatus, as shown in fig. 4, the apparatus 10 includes:
an obtaining module 100, configured to obtain a first evaluation result and an evaluation text of a target user for a target content;
a first determining module 200, configured to determine a second evaluation result of the target content according to the evaluation text;
a second determining module 300, configured to determine a matching degree between the target user and the target content;
a third determining module 400, configured to determine, according to the first evaluation result, the second evaluation result, and the matching degree, a quality evaluation result of the target user for the target content.
Optionally, the second determining module includes:
the first determining submodule is used for determining a user vector corresponding to the target user based on the historical evaluation text of the target user;
the second determining sub-module is used for determining a content vector corresponding to the target content based on the plurality of evaluation texts corresponding to the target content;
and the third determining submodule is used for determining the matching degree according to the user vector and the content vector.
Optionally, the first determining sub-module includes:
the clustering submodule is used for clustering the historical evaluation texts to obtain a plurality of clustering clusters corresponding to the historical evaluation texts;
the splicing sub-module is used for splicing the historical evaluation texts with the text lengths smaller than a preset length threshold value in each cluster to obtain at least one spliced text;
and the fourth determining submodule is used for determining the subject term corresponding to the target user based on the spliced text in each cluster, the historical evaluation text with the text length not less than the length threshold value and the subject generating model, and determining the user vector based on the vector corresponding to the subject term.
Optionally, the second determining submodule includes:
a fifth determining sub-module, configured to determine a word frequency and an inverse document frequency corresponding to each segmented word in a plurality of evaluation texts corresponding to the target content, and a text length ratio corresponding to the segmented word, where the text length ratio corresponding to the segmented word is the ratio of the length of the evaluation text to which the segmented word belongs to the average text length of the plurality of evaluation texts;
a sixth determining sub-module, configured to determine, for each segmented word, the product of the word frequency, the inverse document frequency, and the text length ratio corresponding to the segmented word as a target parameter corresponding to the segmented word;
and a seventh determining sub-module, configured to determine the target segmented word corresponding to the target content according to the target parameter corresponding to each segmented word, and to determine the content vector based on the vector corresponding to the target segmented word.
Optionally, the first determining module includes:
the eighth determining submodule is used for determining the classification corresponding to the evaluation text according to the evaluation text and the text classification model, and determining the grade indicated by the classification as the second evaluation result;
in the training process of the text classification model, feature extraction is carried out based on the feature extraction submodel, target features are obtained based on the full connection layer, and the prediction result of the text classification model is obtained by predicting partial features in the target features.
Optionally, the third determining module includes:
a ninth determining sub-module, configured to determine a weighted sum of the first evaluation result and the second evaluation result as an initial evaluation result;
and the adjusting submodule is used for adjusting the initial evaluation result according to the matching degree to obtain the quality evaluation result.
Optionally, the apparatus further comprises:
a fourth determining module, configured to determine, according to a quality evaluation result corresponding to each content in a content library, recommended content corresponding to the target user;
and the output module is used for outputting the recommended content.
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
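One way to picture how the four modules of the apparatus 10 fit together is the following sketch, in which each module is an injected callable. The class name, the signatures, and the stub values in the usage example are hypothetical and serve only to show the data flow between the modules; the actual modules are defined by the method embodiments.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class ContentQualityEvaluator:
    # Obtaining module 100: returns (first evaluation result, evaluation text).
    obtain: Callable[[str, str], Tuple[float, str]]
    # First determining module 200: evaluation text -> second evaluation result.
    score_text: Callable[[str], float]
    # Second determining module 300: (user, content) -> matching degree.
    match: Callable[[str, str], float]
    # Third determining module 400: combines the three values into one result.
    combine: Callable[[float, float, float], float]

    def evaluate(self, user_id: str, content_id: str) -> float:
        first_result, text = self.obtain(user_id, content_id)
        second_result = self.score_text(text)
        matching_degree = self.match(user_id, content_id)
        return self.combine(first_result, second_result, matching_degree)


# Usage with stub modules (all values are made up for illustration).
evaluator = ContentQualityEvaluator(
    obtain=lambda user, content: (8.0, "great plot"),
    score_text=lambda text: 6.0,
    match=lambda user, content: 0.5,
    combine=lambda first, second, m: 0.5 * first + 0.5 * second - m,
)
result = evaluator.evaluate("user-1", "content-1")
```

Passing the modules in as callables keeps the sketch independent of any particular text classification model or vectorization scheme.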
Fig. 5 is a block diagram illustrating an electronic device 700 according to an example embodiment. As shown in fig. 5, the electronic device 700 may include: a processor 701 and a memory 702. The electronic device 700 may also include one or more of a multimedia component 703, an input/output (I/O) interface 704, and a communication component 705.
The processor 701 is configured to control the overall operation of the electronic device 700, so as to complete all or part of the steps of the content quality evaluation method described above. The memory 702 is configured to store various types of data to support operation of the electronic device 700, such as instructions for any application or method operating on the electronic device 700 and application-related data, for example contact data, transmitted and received messages, pictures, audio, video, and the like. The memory 702 may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk. The multimedia component 703 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory 702 or transmitted through the communication component 705. The audio component further includes at least one speaker for outputting audio signals. The I/O interface 704 provides an interface between the processor 701 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 705 is used for wired or wireless communication between the electronic device 700 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, 4G, NB-IoT, eMTC, 5G, or a combination of one or more of them, which is not limited herein.
Accordingly, the communication component 705 may include a Wi-Fi module, a Bluetooth module, an NFC module, and the like.
In an exemplary embodiment, the electronic device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components for performing the content quality evaluation method described above.
In another exemplary embodiment, there is also provided a computer-readable storage medium including program instructions which, when executed by a processor, implement the steps of the content quality evaluation method described above. For example, the computer readable storage medium may be the memory 702 described above including program instructions that are executable by the processor 701 of the electronic device 700 to perform the content quality assessment method described above.
Fig. 6 is a block diagram illustrating an electronic device 1900 according to an example embodiment. For example, the electronic device 1900 may be provided as a server. Referring to fig. 6, an electronic device 1900 includes a processor 1922, which may be one or more in number, and a memory 1932 for storing computer programs executable by the processor 1922. The computer program stored in memory 1932 may include one or more modules that each correspond to a set of instructions. Further, the processor 1922 may be configured to execute the computer program to perform the content quality evaluation method described above.
Additionally, the electronic device 1900 may also include a power component 1926 and a communication component 1950. The power component 1926 may be configured to perform power management of the electronic device 1900, and the communication component 1950 may be configured to enable communication, e.g., wired or wireless communication, of the electronic device 1900. In addition, the electronic device 1900 may also include an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, and so on.
In another exemplary embodiment, there is also provided a computer-readable storage medium including program instructions which, when executed by a processor, implement the steps of the content quality evaluation method described above. For example, the non-transitory computer readable storage medium may be the memory 1932 described above that includes program instructions executable by the processor 1922 of the electronic device 1900 to perform the content quality assessment method described above.
In another exemplary embodiment, a computer program product is also provided, which comprises a computer program executable by a programmable apparatus, the computer program having code portions for performing the content quality assessment method described above when executed by the programmable apparatus.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments. Various simple modifications may be made to the technical solution of the present disclosure within its technical idea, and these simple modifications all fall within the protection scope of the present disclosure.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. To avoid unnecessary repetition, the disclosure does not separately describe various possible combinations.
In addition, any combination of various embodiments of the present disclosure may be made, and the same should be considered as the disclosure of the present disclosure, as long as it does not depart from the spirit of the present disclosure.
Claims (10)
1. A content quality evaluation method, characterized by comprising:
acquiring a first evaluation result and an evaluation text of a target user aiming at target content;
determining a second evaluation result of the target content according to the evaluation text;
determining the matching degree corresponding to the target user and the target content;
and determining the quality evaluation result of the target user for the target content according to the first evaluation result, the second evaluation result and the matching degree.
2. The method of claim 1, wherein determining the degree of match between the target user and the target content comprises:
determining a user vector corresponding to the target user based on the historical evaluation text of the target user;
determining a content vector corresponding to the target content based on a plurality of evaluation texts corresponding to the target content;
and determining the matching degree according to the user vector and the content vector.
3. The method of claim 2, wherein determining the user vector corresponding to the target user based on the historical evaluation text of the target user comprises:
clustering the historical evaluation texts to obtain a plurality of clusters corresponding to the historical evaluation texts;
for each cluster, splicing the historical evaluation texts in the cluster whose text length is smaller than a preset length threshold to obtain at least one spliced text;
and determining a subject word corresponding to the target user based on the spliced text in each cluster, the historical evaluation texts whose text length is not less than the length threshold, and a subject generation model, and determining the user vector based on the vector corresponding to the subject word.
4. The method of claim 2, wherein determining the content vector corresponding to the target content based on the plurality of evaluation texts corresponding to the target content comprises:
determining a word frequency and an inverse document frequency corresponding to each segmented word in the plurality of evaluation texts corresponding to the target content, and a text length ratio corresponding to the segmented word, wherein the text length ratio corresponding to the segmented word is the ratio of the length of the evaluation text to which the segmented word belongs to the average text length of the plurality of evaluation texts;
for each segmented word, determining the product of the word frequency, the inverse document frequency, and the text length ratio corresponding to the segmented word as a target parameter corresponding to the segmented word;
and determining a target segmented word corresponding to the target content according to the target parameter corresponding to each segmented word, and determining the content vector based on the vector corresponding to the target segmented word.
5. The method of claim 1, wherein determining the second evaluation result of the target content according to the evaluation text comprises:
determining a classification corresponding to the evaluation text according to the evaluation text and the text classification model, and determining the grade indicated by the classification as the second evaluation result;
in the training process of the text classification model, feature extraction is carried out based on the feature extraction submodel, target features are obtained based on the full connection layer, and the prediction result of the text classification model is obtained by predicting partial features in the target features.
6. The method according to claim 1, wherein the determining the quality evaluation result of the target user for the target content according to the first evaluation result, the second evaluation result and the matching degree comprises:
determining a weighted sum of the first and second evaluation results as an initial evaluation result;
and adjusting the initial evaluation result according to the matching degree to obtain the quality evaluation result.
7. The method according to any one of claims 1-6, further comprising:
determining recommended content corresponding to the target user according to a quality evaluation result corresponding to each content in a content library;
and outputting the recommended content.
8. A content quality evaluation apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a first evaluation result and an evaluation text of a target user aiming at target content;
the first determining module is used for determining a second evaluation result of the target content according to the evaluation text;
the second determining module is used for determining the matching degree corresponding to the target user and the target content;
and the third determining module is used for determining the quality evaluation result of the target user aiming at the target content according to the first evaluation result, the second evaluation result and the matching degree.
9. A non-transitory computer readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 7.
10. An electronic device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program in the memory to carry out the steps of the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111671403.5A CN114428837A (en) | 2021-12-31 | 2021-12-31 | Content quality evaluation method, content quality evaluation device, content quality evaluation medium, and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111671403.5A CN114428837A (en) | 2021-12-31 | 2021-12-31 | Content quality evaluation method, content quality evaluation device, content quality evaluation medium, and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114428837A (en) | 2022-05-03 |
Family
ID=81312192
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111671403.5A Pending CN114428837A (en) | 2021-12-31 | 2021-12-31 | Content quality evaluation method, content quality evaluation device, content quality evaluation medium, and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114428837A (en) |
- 2021-12-31: CN application CN202111671403.5A, patent CN114428837A (en), status: active, Pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||