CN108009228B

CN108009228B - Method and device for setting content label and storage medium

Info

Publication number: CN108009228B
Application number: CN201711209262.9A
Authority: CN
Inventors: 邹建波
Original assignee: China Mobile Communications Group Co Ltd; MIGU Interactive Entertainment Co Ltd
Current assignee: China Mobile Communications Group Co Ltd; MIGU Interactive Entertainment Co Ltd
Priority date: 2017-11-27
Filing date: 2017-11-27
Publication date: 2020-10-09
Anticipated expiration: 2037-11-27
Also published as: CN108009228A

Abstract

The invention discloses a method for setting a content label, which comprises the following steps: acquiring text information associated with multimedia content; segmenting the text information to obtain segmentation segments; clustering the word segmentation segments to obtain a first clustering result, wherein the first clustering result comprises word segmentation segment groups formed by the word segmentation segments of all clustering categories; extracting target characteristic words from the first clustering result, and inputting the target characteristic words into a machine learning model; obtaining all probability values output by a machine learning model; the machine learning model is obtained by performing semantic analysis training on a sample comprising the corresponding relation between the text information and the label; each probability value represents the probability of each target feature word as a label of the text information; and selecting the label meeting the probability condition according to the probability values, and associating the selected label with the multimedia content. The invention also discloses a setting device of the content label and a storage medium.

Description

Method and device for setting content label and storage medium

Technical Field

The present invention relates to data processing technology in the field of artificial intelligence, and in particular, to a method and an apparatus for setting a content tag, and a storage medium.

Background

With the development of internet technology, people can browse or watch various multimedia contents through a network. Most of the current multimedia content websites, such as video websites, use tags to classify and mark the provided multimedia content. The tags are keywords with strong relevance to the multimedia contents, and the multimedia contents can be simply described and classified by using the tags, so that a user can conveniently retrieve or search interesting multimedia contents.

At present, the commonly adopted scheme for setting a tag for multimedia content is that the user manually sets the tag according to personal interests and hobbies. However, because the tags are set manually, the workload is large and the efficiency is low when many multimedia contents need tags. In addition, the approach depends heavily on the subjective judgment of the user, and different users may set different tags for the same multimedia content. If multimedia content is recommended to another user according to a tag set by one user, a relatively large deviation may result; that is, a tag set by one user is not suitable for everyone, and for recommendation scenarios facing different users, the accuracy of tags set in this way is low.

The related art has no effective solution for how to quickly and accurately set tags for multimedia content.

Disclosure of Invention

In view of the above, embodiments of the present invention are to provide a method, an apparatus, and a storage medium for setting a content tag, so as to solve the problem that it is difficult to efficiently and accurately set a tag for multimedia content in the related art.

In order to achieve the above purpose, the technical solution of the embodiment of the present invention is realized as follows:

in a first aspect, an embodiment of the present invention provides a method for setting a content tag, where the method includes:

acquiring text information associated with multimedia content;

performing word segmentation on the text information to obtain word segmentation segments;

clustering the word segmentation segments to obtain a first clustering result, wherein the first clustering result comprises word segmentation segment groups formed by the word segmentation segments of all clustering categories;

extracting target characteristic words from the first clustering result, and inputting the target characteristic words into a machine learning model;

obtaining all probability values output by the machine learning model; the machine learning model is obtained by performing semantic analysis training on a sample comprising the corresponding relation between the text information and the label; the probability values respectively represent the probability sizes of the target feature words respectively serving as the labels of the text information;

and selecting the label meeting the probability condition according to the probability values, and associating the selected label with the multimedia content.

In a second aspect, an embodiment of the present invention provides an apparatus for setting a content tag, where the apparatus includes: an acquisition module, a word segmentation module, a clustering module, an extraction module, a generation module and an association module; wherein:

the acquisition module is used for acquiring text information associated with the multimedia content;

the word segmentation module is used for segmenting words of the text information to obtain word segmentation segments;

the clustering module is used for clustering the word segmentation segments to obtain a first clustering result, wherein the first clustering result comprises word segmentation segment groups formed by the word segmentation segments of all clustering categories;

the extraction module is used for extracting target characteristic words from the first clustering result and inputting the target characteristic words into a machine learning model;

the acquisition module is further configured to obtain the probability values output by the machine learning model; the machine learning model is obtained by performing semantic analysis training on a sample comprising the corresponding relation between the text information and the label; the probability values respectively represent the probabilities of the target feature words serving as labels of the text information;

the generation module is used for selecting labels meeting the probability condition according to the probability values;

the association module is used for associating the selected label with the multimedia content.

In a third aspect, an embodiment of the present invention provides a storage medium, on which an executable program is stored, and the executable program, when executed by a processor, implements the steps of the setting method of the content tag provided by the embodiment of the present invention.

In a fourth aspect, an embodiment of the present invention provides a device for setting a content tag, including a memory, a processor, and an executable program stored on the memory and capable of being executed by the processor, where the processor executes the steps of the method for setting a content tag provided in the embodiment of the present invention when executing the executable program.

By adopting at least one technical scheme provided by the embodiment of the invention, the text information associated with the multimedia content can be automatically analyzed and processed by word segmentation, clustering and the like to obtain a first clustering result, the target feature words extracted from the first clustering result are input into the machine learning model to obtain each probability value, the tags meeting the probability condition are selected according to each probability value, and the selected tags are associated with the multimedia content to realize the purpose of setting the tags for the multimedia content. Therefore, subjective influence of manual label setting is avoided, the label can be set for the multimedia content automatically and quickly and accurately, and the label set for the multimedia content in the embodiment of the invention is irrelevant to the interest and hobbies of the user and is only relevant to text information relevant to the multimedia content, so that the set label is more suitable for the requirements of different users, and the use experience of the user is greatly improved.

Drawings

Fig. 1 is a schematic flow chart illustrating an implementation of a method for setting a content tag according to an embodiment of the present invention;

fig. 2 is a functional structure diagram of a setting apparatus for content tags according to an embodiment of the present invention;

fig. 3 is a functional structure diagram of another setting apparatus for content tags according to an embodiment of the present invention;

fig. 4 is a schematic hardware structure diagram of a setting apparatus for content tags according to an embodiment of the present invention.

Detailed Description

So that the manner in which the features and aspects of the embodiments of the present invention can be understood in detail, a more particular description of the embodiments of the invention, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings.

Before further detailed description of the embodiments of the present invention, terms and expressions mentioned in the embodiments of the present invention are explained, and the terms and expressions mentioned in the embodiments of the present invention are applied to the following explanations.

1) Segmentation, also called word segmentation, refers to dividing characters in text information into individual words according to a certain word segmentation strategy.

2) A stop word is a word that can be filtered out of the text information without affecting the classification decision for the text information; stop words generally have no definite meaning on their own (they only play a role within a complete sentence), for example, function words such as pronouns, articles, numerals, prepositions, adverbs, and conjunctions.

3) The target characteristic words refer to words which are extracted from the remaining words and can represent multimedia content associated with the text information after the text information is segmented and stop words are filtered out.

4) The vector space model is a feature space vector obtained by mapping a plurality of feature words extracted from the feature word set of each media type into corresponding word vectors and combining the word vectors.

Fig. 1 is a schematic flow chart illustrating an implementation process of a setting method for a content tag according to an embodiment of the present invention, where the setting method for the content tag is applied to a terminal device; as shown in fig. 1, an implementation flow of the setting method of the content tag in the embodiment of the present invention may include the following steps:

Step 101: acquiring text information associated with multimedia content.

In the embodiment of the present invention, the terminal device may include, but is not limited to, a computer device such as a smart phone, a tablet computer, a palmtop computer, and the like. The multimedia content may include, but is not limited to, video content such as images, audio content such as music, text content such as novels, and the like in various media forms. The multimedia content referred to herein may be obtained by at least one of the following means, for example: the multimedia content may be an image, a picture or a song uploaded by the user, or a video captured and collected from a specific website such as a video website, etc.

Here, the text information associated with the multimedia content refers to related information for representing the multimedia content, such as the name, introduction, author, genre, and the like of the content.

Step 102: performing word segmentation on the text information to obtain word segmentation segments.

In this embodiment, the computer device invokes a word segmentation service to perform word segmentation on all of the text information, so as to obtain a plurality of words corresponding to the text information. Word segmentation here can be understood as using a word segmenter to split the text sequence formed by a piece of text information into individual word segmentation segments. Specifically, an existing or new word segmentation method can be applied according to the formation characteristics of Chinese words and of English words and phrases, so that a continuous text character string is split into a plurality of word segmentation segments. For example, if the content of the text information is "today's weather is too hot", the word segmentation segments obtained by segmenting it are "today", "weather", "is", "too", and "hot".

Here, for text information expressed in Chinese, a word segmentation method based on character string matching may be used, such as the forward maximum matching method, the reverse maximum matching method, N-gram, the shortest path word segmentation method, the improved maximum matching method, the bidirectional maximum matching method, and the like. The forward maximum matching method matches runs of consecutive characters in the text to be segmented against a word list from left to right, and splits off a word segmentation segment whenever a match is found. The improved maximum matching method keeps the core idea of the forward maximum matching method and adds the ambiguity detection and resolution that the forward method lacks, improving segmentation accuracy while keeping segmentation speed essentially unchanged. The embodiment of the present invention does not limit which word segmentation method is used to segment the text information into word segmentation segments.
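By way of non-limiting illustration, the forward maximum matching method described above may be sketched as follows. The function name `forward_max_match`, the word list `WORD_LIST`, the `max_word_len` parameter, and the unspaced English sample string (standing in for a continuous Chinese character string) are assumptions made for this sketch and do not appear in the embodiment.

```python
def forward_max_match(text, word_list, max_word_len=7):
    """Segment `text` from left to right, taking the longest word-list match."""
    segments = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted so the scan makes progress.
            if size == 1 or candidate in word_list:
                segments.append(candidate)
                i += size
                break
    return segments


# Hypothetical word list; a real system would load a full dictionary.
WORD_LIST = {"the", "weather", "is", "hot"}

segments = forward_max_match("theweatherishot", WORD_LIST)
print(segments)
```

Characters that match no word-list entry fall through as single-character segments, which is one simple way to keep the scan moving; the embodiment does not specify how unmatched characters are handled.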

In this embodiment, step 102 specifically includes: performing word segmentation on the text information to obtain a word segmentation segment set;

and filtering the stop words from the segmentation segment set according to stop words stored in a preset corpus, and taking the rest segmentation segments except the filtered stop words in the segmentation segment set as segmentation segments corresponding to the text information.

In short, a stop word is a word that has no substantial influence on determining the content tag, such as a modal particle or an auxiliary word; that is, the stop word has no definite meaning. For example, if the content of the text information is "today's weather is too hot", the word segmentation segment set obtained by segmenting it is "today", "weather", "is", "too", and "hot". According to the stop words stored in the preset corpus, some segments in the set can be determined to be stop words, namely "is" and "too"; these are filtered out of the set, leaving the filtered word segmentation segments "today", "weather", and "hot".
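The stop-word filtering step can be illustrated with a minimal sketch. The contents of `STOP_WORDS` are an assumed stand-in for the preset corpus, and the function name `filter_stop_words` is hypothetical.

```python
# Assumed stand-in for the stop words stored in the preset corpus.
STOP_WORDS = frozenset({"is", "too", "the", "a", "of"})


def filter_stop_words(segment_set, stop_words=STOP_WORDS):
    """Keep only the segments that are not stop words, preserving order."""
    return [segment for segment in segment_set if segment not in stop_words]


segment_set = ["today", "weather", "is", "too", "hot"]
filtered = filter_stop_words(segment_set)
print(filtered)
```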

Step 103: clustering the word segmentation segments to obtain a first clustering result, wherein the first clustering result comprises word segmentation segment groups formed by the word segmentation segments of all clustering categories.

In the embodiment of the invention, clustering can be understood as measuring the semantic similarity between the word segmentation segments of the text information and grouping the segments that are semantically closest into one class. For example, through clustering, words in the word segmentation segments that express emotional coloring, such as "like" and "love", can be clustered into the same word segmentation segment group. Because the amount of acquired text information associated with the multimedia content is huge, word segmentation segment groups of different clustering categories are obtained after clustering.

Here, the word segmentation segments may be clustered using an existing or new clustering algorithm, such as a partition-based clustering algorithm (K-means) or a model-based clustering algorithm (SOM, self-organizing map), to obtain the first clustering result. The semantic similarity of the word segmentation segments can be calculated with existing methods such as Euclidean distance or cosine similarity, which are not described in detail here. Preferably, the embodiment of the invention clusters the word segmentation segments with the SOM-based clustering algorithm, which achieves higher clustering similarity.
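As a hedged sketch of the partition-based option only (K-means with Euclidean distance, not the preferred SOM variant), the clustering of word segmentation segments might look like the following. The two-dimensional vectors in `SEGMENT_VECTORS` are invented for illustration; the embodiment does not say how segments are vectorized. The centroids are seeded from the first `k` vectors purely to keep the sketch deterministic, and a practical implementation would likely use a randomized or k-means++ style initialization.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(vectors, k, iterations=10):
    """Return a cluster index for every vector (plain K-means)."""
    # Deterministic start: the first k vectors seed the centroids.
    centroids = [list(v) for v in vectors[:k]]
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each vector joins its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: euclidean(v, centroids[c]))
            for v in vectors
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignments


# Hypothetical segment vectors (illustrative values only).
SEGMENT_VECTORS = {
    "like": (0.9, 0.1),
    "rock": (0.1, 0.9),
    "love": (1.0, 0.2),
    "jazz": (0.2, 1.0),
}

words = list(SEGMENT_VECTORS)
labels = k_means([SEGMENT_VECTORS[w] for w in words], k=2)
groups = [[w for w, lab in zip(words, labels) if lab == c] for c in range(2)]
print(groups)
```

Each inner list of `groups` plays the role of one word segmentation segment group in the first clustering result.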

In this embodiment, if it is assumed that the multimedia content described in step 101 specifically includes multimedia contents of different media types, before executing step 103, the method may further include:

classifying the word segmentation segments into word segmentation segments of various media types according to different media types of multimedia contents;

correspondingly, step 103 specifically includes: clustering the word segmentation segments of each media type to obtain a first clustering result.

In the embodiment of the present invention, the word segmentation segments may be classified into word segmentation segments of each media type as follows: according to the different media types of the multimedia content, the segments are classified using an existing or new text classification model such as a maximum entropy model or a decision tree model. Specifically, the category of each word segmentation segment can be predicted by calculating the probability that the segment belongs to each media type and taking the type with the highest probability as the media type to which the segment belongs.

For example, by calculating the probability that the segmented word segments "rock", "jazz" and "song" belong to each media type respectively, it can be found through comparison that the segmented word segments "rock", "jazz" and "song" have the highest probability of belonging to the music type compared with other media types, so that the segmented word segments "rock", "jazz" and "song" can be classified as the segmented word segments of the music type. Wherein different media types of the multimedia content may include, but are not limited to, multiple types of video, music, novel, and so on.
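The highest-probability rule in this example can be sketched as below. The probability values in `TYPE_PROBABILITIES` are invented; in the embodiment they would come from a text classification model such as a maximum entropy or decision tree model, which this sketch does not implement. The extra segment "episode" and the function name `classify_segment` are likewise assumptions.

```python
def classify_segment(type_probabilities):
    """Return the media type with the highest probability for one segment."""
    return max(type_probabilities, key=type_probabilities.get)


# Hypothetical per-segment probabilities over media types.
TYPE_PROBABILITIES = {
    "rock": {"video": 0.10, "music": 0.80, "novel": 0.10},
    "jazz": {"video": 0.05, "music": 0.90, "novel": 0.05},
    "song": {"video": 0.20, "music": 0.70, "novel": 0.10},
    "episode": {"video": 0.75, "music": 0.05, "novel": 0.20},
}

segments_by_type = {}
for segment, probabilities in TYPE_PROBABILITIES.items():
    media_type = classify_segment(probabilities)
    segments_by_type.setdefault(media_type, []).append(segment)

print(segments_by_type)
```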

Here, clustering the word segmentation segments of each media type may specifically mean clustering the segments separately for each media type of the multimedia content. For example, all word segmentation segments belonging to the music type are clustered, and all word segmentation segments belonging to the video type are clustered as well, and so on. In this way, after the text information associated with the multimedia content has been segmented, classified, and clustered, the feature words that characterize that text information are extracted, which increases the relevance between the finally set tag and the multimedia content and improves the accuracy of setting tags for the multimedia content.

Step 104: extracting target feature words from the first clustering result and inputting the target feature words into a machine learning model.

In this embodiment, step 104 specifically includes: counting the frequency with which each word segmentation segment occurs in the word segmentation segment groups of all clustering categories, and determining the importance degree value of each word segmentation segment across all clustering categories according to the frequency and the weight value of each word segmentation segment;

and selecting the importance degree value meeting the degree condition from the determined importance degree values, and determining the target characteristic word according to the word segmentation segment corresponding to the selected importance degree value.

Specifically, in order to screen the word segmentation segments in the first clustering result, and thereby reduce the number of feature words that characterize the text information of the same multimedia content, the existing term frequency-inverse document frequency (TF-IDF) feature selection method may be used to evaluate how important a word is to one document in a document set or corpus. In general, the importance of a word increases in proportion to the number of times it appears in the document. In the embodiment of the invention, the importance of a word segmentation segment is determined by combining two factors, frequency and weight value: the importance degree value of each word segmentation segment across all clustering categories is calculated as the product of the frequency of the segment in the word segmentation segment groups of the clustering categories and the weight value of the segment in the text information associated with the multimedia content. The calculated importance degree values are then sorted, the importance degree value meeting the degree condition is screened out based on the sorted result (that is, the largest value is selected), and the target feature word is determined according to the word segmentation segment corresponding to that largest importance degree value. The sorting may be in ascending or descending order; the weight values can be calculated automatically by the computer device, and the weight values of different word segmentation segments may differ.
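A minimal sketch of the frequency-times-weight calculation follows. It reads "frequency" as the total number of occurrences of a segment across all cluster groups, which is one possible interpretation of the text; the weight values, the default weight of 1.0, and the names `importance_values` and `SEGMENT_WEIGHTS` are assumptions for illustration.

```python
from collections import Counter


def importance_values(cluster_groups, weights, default_weight=1.0):
    """Importance of each segment = occurrence count x weight value."""
    frequency = Counter(seg for group in cluster_groups for seg in group)
    return {
        seg: count * weights.get(seg, default_weight)
        for seg, count in frequency.items()
    }


# Hypothetical first clustering result and weight values.
cluster_groups = [["rock", "rock", "jazz"], ["concert", "rock"]]
SEGMENT_WEIGHTS = {"rock": 0.5, "jazz": 0.9, "concert": 0.4}

values = importance_values(cluster_groups, SEGMENT_WEIGHTS)
# Sort in descending order and take the segment with the largest value.
ranked = sorted(values, key=values.get, reverse=True)
target_feature_word = ranked[0]
print(values, target_feature_word)
```

Here "rock" occurs three times with weight 0.5, so its importance degree value of 1.5 is the largest and it is taken as the target feature word.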

In this embodiment of the present invention, if the word segmentation segments are not classified before they are clustered, and assuming that the multimedia content in step 101 specifically includes multimedia content of different media types, then determining the target feature word according to the word segmentation segment corresponding to the selected importance degree value may specifically include:

classifying the word segmentation segments corresponding to the selected importance degree value according to different media types of the multimedia content to obtain a feature word set of each media type;

and determining target characteristic words according to the characteristic words which are selected from the characteristic word set of each media type and used for representing the text information of the media type to which the characteristic words belong.

Here, for multimedia contents of different media types, selecting the feature words that characterize the text information of a media type from that type's feature word set is generally divided into local feature selection and global feature selection. The computer device may automatically determine which of the two to use according to the media type, but in either case the purpose is to extract from the feature word set of each media type the feature words that best express the core content of the text information. For example, when the multimedia content is a movie, a sentence or a few words in the movie's content introduction can best express the subject of the movie in a concentrated way, so the target feature words can be selected from the content introduction; this is local feature selection. When the multimedia content is a song, no single piece of information concentrates the theme of the song, so global characteristics of the song such as lyrics, style, and author need to be combined to select the target feature words; this is global feature selection. In general, local feature selection selects the target feature words more efficiently.

The selected word segmentation segment corresponding to the importance degree value is used as a candidate feature word, then the candidate feature words are classified according to different media types of the multimedia content, such as videos, music, novels and other types, so as to obtain a feature word set of each media type, a feature word used for representing core content of text information is selected from the feature word set of each media type, and a target feature word is determined according to the feature word representing the core content of the text information. Therefore, the dimension reduction is carried out on the candidate feature words by adopting the mode of combining the local feature selection and the global feature selection, the number of the candidate feature words processed by the machine learning model can be reduced, and the processing efficiency is greatly improved. The dimension reduction refers to reduction of the dimension, that is, reduction of the dimension of the candidate feature words, that is, reduction of the overall number of the candidate feature words.

In this embodiment of the present invention, the determining a target feature word according to a feature word selected from the feature word set of each media type and used for characterizing text information of the media type to which the target feature word belongs specifically includes:

selecting characteristic words used for representing text information of the media types from the characteristic word set of each media type;

constructing a vector space model based on the feature vectors corresponding to the selected feature words;

calculating the similarity among the feature vectors based on the vector space model, and clustering the selected feature words according to the calculation result of the similarity to obtain a second clustering result, wherein the second clustering result comprises the feature words of each clustering category;

and extracting target characteristic words from the characteristic words of each cluster category.

In this embodiment, the feature words that best express the core content of the text information may be extracted from the feature word set of each media type by using local feature selection or global feature selection, depending on the media type of the multimedia content. The selected feature words are represented as vectors; that is, they are mapped to corresponding word vectors, and the word vectors are combined to obtain a feature space vector, thereby constructing the vector space model. The similarity between the feature vectors can be calculated with existing methods such as Euclidean distance or cosine similarity, which are not described in detail here. The selected feature words may be clustered using an existing or new clustering algorithm, such as the K-means-based or SOM-based clustering algorithm, and the target feature words are then extracted from the feature words of each cluster category. Clustering the feature words in this way can further improve the accuracy of setting tags for multimedia content.
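The similarity calculation over the vector space model can be sketched with cosine similarity. Note that the grouping below is a simple single-pass threshold scheme chosen for brevity, not the K-means or SOM algorithm named in the embodiment; the word vectors, the 0.9 threshold, and the choice of the first word of each cluster as the extracted target feature word are all assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_by_similarity(word_vectors, threshold=0.9):
    """Put each word in the first cluster whose seed word is similar enough."""
    clusters = []
    for word, vector in word_vectors.items():
        for cluster in clusters:
            seed_vector = word_vectors[cluster[0]]
            if cosine_similarity(vector, seed_vector) >= threshold:
                cluster.append(word)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append([word])
    return clusters


# Hypothetical word vectors forming the vector space model.
WORD_VECTORS = {
    "rock": (0.9, 0.1, 0.0),
    "metal": (0.8, 0.2, 0.0),
    "romance": (0.0, 0.1, 0.9),
}

second_clustering_result = cluster_by_similarity(WORD_VECTORS)
target_feature_words = [cluster[0] for cluster in second_clustering_result]
print(second_clustering_result, target_feature_words)
```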

Step 105: obtaining all probability values output by the machine learning model; the machine learning model is obtained by performing semantic analysis training on a sample comprising the corresponding relation between the text information and the label; and the probability values respectively represent the probability sizes of the target characteristic words respectively serving as the labels of the text information.

In the embodiment of the invention, after the target feature words are input into the machine learning model, the model transforms their vector representations and outputs the transformed result as the probability of each label of the text information, thereby giving the probability value of each target feature word serving as a label of the text information. Specifically, the vector representation of each input target feature word is transformed by the activation functions of the different nodes in the machine learning model, and the result of the transformation is taken as the vector representation of the label together with its corresponding probability.
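The embodiment does not disclose the architecture of the machine learning model. Purely as an assumed stand-in, the sketch below scores each target feature word with a single linear node and normalizes the scores with a softmax, so that each word receives a probability of serving as the label. The vectors, the weights in `NODE_WEIGHTS`, and the use of softmax (rather than, say, an independent sigmoid per word) are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(feature_vectors, weights, bias=0.0):
    """Map each target feature word to its probability of being the label."""
    words = list(feature_vectors)
    scores = [
        sum(w * x for w, x in zip(weights, feature_vectors[word])) + bias
        for word in words
    ]
    return dict(zip(words, softmax(scores)))


# Hypothetical vector representations and (already trained) node weights.
FEATURE_VECTORS = {
    "rock": (1.0, 0.0),
    "concert": (0.5, 0.5),
    "today": (0.0, 1.0),
}
NODE_WEIGHTS = (2.0, -1.0)

probabilities = label_probabilities(FEATURE_VECTORS, NODE_WEIGHTS)
print(probabilities)
```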

Here, the machine learning model is obtained by semantic analysis of training data in the field of natural language learning; the machine learning model is trained by taking the corresponding relation between the text information and the labels as a sample, and each probability value output by the machine learning model is obtained, wherein each probability value represents the probability of each target feature word as the label of the text information.

Here, the text information in the machine learning model may be common words (abbreviated as professional words) in the professional field in each multimedia type, and the common words may be obtained by means of web crawler capture and manual entry. Specifically, the crawler is configured to a professional website to crawl professional words in the corresponding professional field, for example, professional words related to videos such as 'live show' are crawled from a bean video website, and then the crawled professional words are added to the machine learning model in a manual entry mode. Therefore, the text information in the machine learning model can be updated in time, so that the machine learning model is suitable for different professional fields, and meanwhile, the labels set for the multimedia content can be more accurate.

Step 106: selecting the label meeting the probability condition according to the probability values, and associating the selected label with the multimedia content.

Here, the label meeting the probability condition may be the label with the highest probability of serving as the label of the text information. That is, the target feature word with the highest probability is selected as the label from the probability values output by the machine learning model. After the label meeting the probability condition is selected, an association relationship between the selected label and the multimedia content is established, so that the multimedia content corresponding to the label can be quickly found through the association relationship.
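Selecting the highest-probability label and recording the association can be sketched as follows. The probability values, the content identifier `"content-001"`, and the in-memory dictionary used as the association store are assumptions; the embodiment does not specify how the association relationship is stored.

```python
def select_label(probabilities):
    """Return the target feature word with the highest probability."""
    return max(probabilities, key=probabilities.get)


def associate(label_index, label, content_id):
    """Record that `content_id` carries `label`, for lookup by label."""
    label_index.setdefault(label, set()).add(content_id)


# Hypothetical probability values output by the machine learning model.
probabilities = {"rock": 0.72, "concert": 0.21, "today": 0.07}

label_index = {}
chosen_label = select_label(probabilities)
associate(label_index, chosen_label, "content-001")
print(chosen_label, label_index)
```

Indexing by label rather than by content is what allows the multimedia content corresponding to a label to be found directly.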

In this embodiment, in order to achieve an effect of setting a label for multimedia content more accurately, if an operator corrects the label in the semantic analysis process, the corrected label needs to be reversely synchronized to the machine learning model, and then the label is set for the multimedia content again.

Specifically, after the selecting the tags meeting the probability condition according to the probability values in step 106, the method may further include:

acquiring a corrected label, the corrected label being used to update the label output by the machine learning model for the text information;

and when the number of the corrected labels reaches a first preset threshold value and/or a training time interval for semantic analysis training in the machine learning model reaches a second preset threshold value, updating the machine learning model based on the corrected labels and the corresponding text information, and re-determining the labels corresponding to the text information according to the updated machine learning model.

It should be noted that the machine learning model is updated when the number of corrected labels reaches the first preset threshold and/or the training time interval for semantic analysis training in the machine learning model reaches the second preset threshold. This ensures that the samples of the machine learning model are updated in a timely manner within an effective time period, and makes the change in the labels reset for the multimedia content more noticeable.
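The update condition above can be sketched as follows. This is only an illustration under stated assumptions: the class name `RetrainTrigger`, the threshold values and the sample texts are hypothetical, the "or" variant of the "and/or" condition is used, and the sketch additionally assumes that retraining is skipped when no corrected labels are pending.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class RetrainTrigger:
    count_threshold: int       # stands in for the first preset threshold
    interval_seconds: float    # stands in for the second preset threshold
    last_trained: float = 0.0  # timestamp of the previous training run
    pending: List[Tuple[str, str]] = field(default_factory=list)

    def add_correction(self, text: str, corrected_label: str) -> None:
        """Queue a (text information, corrected label) sample."""
        self.pending.append((text, corrected_label))

    def should_retrain(self, now: Optional[float] = None) -> bool:
        """True if corrections are pending and either threshold is reached."""
        now = time.time() if now is None else now
        enough = len(self.pending) >= self.count_threshold
        overdue = now - self.last_trained >= self.interval_seconds
        return bool(self.pending) and (enough or overdue)

    def drain(self, now: Optional[float] = None) -> List[Tuple[str, str]]:
        """Hand the pending samples to a retraining job and reset the timer."""
        batch, self.pending = self.pending, []
        self.last_trained = time.time() if now is None else now
        return batch


trigger = RetrainTrigger(count_threshold=2, interval_seconds=3600.0, last_trained=1000.0)
trigger.add_correction("concert footage of a band", "live show")
first_check = trigger.should_retrain(now=1010.0)   # one correction, not overdue
trigger.add_correction("stage recording", "live show")
second_check = trigger.should_retrain(now=1010.0)  # count threshold reached
```

After `drain` returns the batch, the caller would retrain the model on it and re-determine the labels of the affected text information.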

In this embodiment of the present invention, after the selecting, according to the probability values in this step 106, the tags meeting the probability conditions, the method may further include:

acquiring preference information; the preference information is used for representing the preference of each multimedia content with the same label;

according to the preference information, adjusting the labels of the text information associated with the multimedia contents;

and updating the machine learning model according to the text information and the corresponding adjusted label.

Here, the preference information may be the user's feedback on each multimedia content having the same tag, such as like or dislike. The user's preference information is fed back in real time to a big data platform such as a computer device; the computer device can adjust the labels of the text information associated with each multimedia content according to the preference information, and then update the correspondence between the sample text information and the labels in the machine learning model according to the text information and the corresponding adjusted labels.

For example, a user group A likes multimedia content with tag T1, so multimedia contents C1A, C1B, C1C, C1D and C1E under tag T1 are recommended to user group A. It is then found that user group A frequently browses or views C1A, C1B and C1C but rarely browses or views C1D and C1E, and also likes multimedia content C2F under tag T2. The multimedia content under tag T1 is adjusted accordingly, that is, the multimedia content of T1 becomes C1A, C1B, C1C and C2F.
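The T1 example can be reproduced with a small Python sketch. The view counts and the `min_views` cutoff are invented for illustration; the embodiment does not specify how "frequently" and "rarely" are measured.

```python
from typing import Dict, List


def adjust_label(members: List[str],
                 view_counts: Dict[str, int],
                 liked_elsewhere: List[str],
                 min_views: int) -> List[str]:
    """Keep frequently viewed contents and add contents liked under other tags."""
    kept = [c for c in members if view_counts.get(c, 0) >= min_views]
    added = [c for c in liked_elsewhere if c not in kept]
    return kept + added


# Contents currently carrying tag T1, as in the example above.
t1_members = ["C1A", "C1B", "C1C", "C1D", "C1E"]
# Hypothetical view counts for user group A.
views = {"C1A": 40, "C1B": 35, "C1C": 28, "C1D": 1, "C1E": 0}

new_t1 = adjust_label(t1_members, views, liked_elsewhere=["C2F"], min_views=10)
print(new_t1)
```

The adjusted pairs of text information and label would then be fed back as samples to update the machine learning model.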

By adopting the technical scheme of the embodiment of the invention, the text information associated with the multimedia content is subjected to analysis processing such as word segmentation and clustering to obtain a first clustering result, the target feature words extracted from the first clustering result are input into the machine learning model to obtain the probability values, the tags meeting the probability condition are selected according to the probability values, and the selected tags are associated with the multimedia content, thereby setting tags for the multimedia content. Therefore, the subjective influence of manual label setting can be avoided, and labels can be set for multimedia content automatically, quickly and accurately; because the set labels depend only on the text information associated with the multimedia content, they better fit the requirements of different users and greatly improve the user experience.

In order to implement the above method for setting a content tag, an embodiment of the present invention further provides a device for setting a content tag, which is applied to a terminal device, for example a computer device such as a smart phone, a tablet computer, or a palmtop computer. Fig. 2 is a functional structure diagram of the device for setting a content tag according to the embodiment of the present invention; as shown in fig. 2, the setting device of the content tag includes an obtaining module 201, a word segmentation module 202, a clustering module 203, an extracting module 204, a generating module 205, and an associating module 206; wherein,

the obtaining module 201 is configured to obtain text information associated with multimedia content;

the word segmentation module 202 is configured to segment words of the text information to obtain word segmentation segments;

the clustering module 203 is configured to cluster the word segmentation segments to obtain a first clustering result, where the first clustering result includes word segmentation segment groups formed by the word segmentation segments of all clustering categories;

the extracting module 204 is configured to extract a target feature word from the first clustering result, and input the target feature word into a machine learning model;

the obtaining module 201 is further configured to obtain probability values output by the machine learning model; the machine learning model is obtained by performing semantic analysis training on a sample comprising the corresponding relation between the text information and the label; the probability values respectively represent the probability sizes of the target feature words respectively serving as the labels of the text information;

the generating module 205 is configured to select, according to the probability values, a label meeting a probability condition;

the associating module 206 is configured to associate the selected tag with the multimedia content.

In this embodiment, the extracting module 204 is specifically configured to:

counting the occurrence frequency of each word segmentation segment in the word segmentation segment group of each clustering category in all clustering categories, and determining the importance degree value of each word segmentation segment in all clustering categories according to the frequency and the weight value of each word segmentation segment;

In this embodiment, for determining the target feature word according to the word segmentation segment corresponding to the selected importance degree value, the following method may be adopted: classifying the word segmentation segments corresponding to the selected importance degree value according to different media types of the multimedia content to obtain a feature word set of each media type;

In this embodiment, for determining the target feature word according to the feature word selected from the feature word set of each media type and used for representing the text information of the media type to which the feature word belongs, the following method may be adopted: selecting characteristic words used for representing text information of the media types from the characteristic word set of each media type;

In this embodiment, the word segmentation module 202 is specifically configured to:

performing word segmentation on the text information to obtain a word segmentation segment set;

As an implementation manner, fig. 3 is a schematic functional structure diagram of another content tag setting apparatus provided in an embodiment of the present invention; as shown in fig. 3, the setting device of the content tag further includes: a classifying module 207, configured to classify the word segmentation segments into word segmentation segments of each media type according to different media types of multimedia content before the clustering module 203 clusters the word segmentation segments to obtain a first clustering result;

the clustering module 203 is specifically configured to: and clustering the word segmentation segments of each media type to obtain a first clustering result.

In this embodiment, as an implementation manner, the obtaining module 201 is further configured to obtain a corrected label after the generating module 205 selects a label meeting the probability condition according to the probability values, where the corrected label is used to update the label output by the machine learning model for the text information;

the device further comprises: an updating module 208, configured to update the machine learning model based on the corrected labels and the corresponding text information when the number of the corrected labels reaches a first preset threshold and/or a training time interval for performing semantic analysis training in the machine learning model reaches a second preset threshold, and re-determine labels corresponding to the text information according to the updated machine learning model.

As another embodiment, the obtaining module 201 is further configured to obtain preference information after the generating module 205 selects a tag meeting a probability condition according to the probability values; the preference information is used for representing the preference of each multimedia content with the same label;

the updating module 208 is further configured to adjust the label of the text information associated with each multimedia content according to the preference information, and update the machine learning model according to the text information and the corresponding adjusted label.

It should be noted that the division into program modules described in the above embodiment is only an example of how the content tag setting apparatus may set a content tag. In practical applications, the processing may be allocated to different program modules as needed, that is, the internal structure of the apparatus may be divided into different program modules to complete all or part of the processing described above. In addition, the content tag setting apparatus and the content tag setting method provided in the above embodiments belong to the same concept; the specific implementation processes are detailed in the method embodiments and are not described here again.

In practical applications, the word segmentation module 202, the clustering module 203, the extraction module 204, the generation module 205, the association module 206, the classification module 207 and the update module 208 may be implemented by a Central Processing Unit (CPU), a Micro Processing Unit (MPU), a Digital Signal Processor (DSP), a Field Programmable Gate Array (FPGA), or the like on a computer device; the obtaining module 201 can be implemented by a communication module (including a basic communication suite, an operating system, a communication module, a standardized interface, a protocol, etc.) and a transceiver antenna.

In order to implement the setting method of the content tag, an embodiment of the present invention further provides a hardware structure of a setting apparatus of a content tag. The setting apparatus may be implemented in a terminal device, for example a computer device such as a smart phone, a tablet computer, or a palmtop computer, and is described below with reference to the accompanying drawings. It is to be understood that fig. 4 shows only an exemplary structure of the setting apparatus for content tags rather than the entire structure, and that part or all of the structure shown in fig. 4 may be implemented as required.

Referring to fig. 4, fig. 4 is a schematic diagram of a hardware structure of a setting apparatus for a content tag according to an embodiment of the present invention, which may be applied to the terminal device running an application program in practical application, where the setting apparatus 400 for a content tag shown in fig. 4 includes: at least one processor 401, memory 402, a user interface 403, and at least one network interface 404. The various components in the content tag setup device 400 are coupled together by a bus system 405. It will be appreciated that the bus system 405 is used to enable communications among the components. The bus system 405 includes a power bus, a control bus, and a status signal bus in addition to a data bus. For clarity of illustration, however, the various buses are labeled as bus system 405 in fig. 4.

The user interface 403 may include, among other things, a display, a keyboard, a mouse, a trackball, a click wheel, a key, a button, a touch pad, or a touch screen.

It will be appreciated that the memory 402 can be either volatile memory or nonvolatile memory, and can include both volatile and nonvolatile memory.

The memory 402 in the embodiment of the present invention is used to store various types of data to support the operation of the setting apparatus 400 for a content tag. Examples of such data include: any computer program for operating on the setting apparatus 400 of a content tag, such as the executable program 4021 and the operating system 4022, a program that implements the setting method of a content tag of an embodiment of the present invention may be contained in the executable program 4021.

The setting method of the content tag disclosed by the embodiment of the invention can be applied to the processor 401, or implemented by the processor 401. The processor 401 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the setting method of the content tag may be implemented by an integrated logic circuit of hardware or an instruction in the form of software in the processor 401. The processor 401 described above may be a general purpose processor, a DSP, or other programmable logic device, discrete gate or transistor logic device, discrete hardware components, or the like. Processor 401 may implement or perform the setting methods, steps, and logic blocks of the content tags provided in the embodiments of the present invention. A general purpose processor may be a microprocessor or any conventional processor or the like. The steps of the setting method of the content tag provided by the embodiment of the invention can be directly embodied as the execution of a hardware decoding processor, or the combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium located in the memory 402, and the processor 401 reads the information in the memory 402, and in combination with the hardware thereof, performs the steps of the aforementioned setting method for the content tag.

The embodiment of the present invention further provides a hardware structure of a device for setting a content tag, where the device 400 for setting a content tag includes a memory 402, a processor 401, and an executable program 4021 stored in the memory 402 and capable of being executed by the processor 401, and when the processor 401 executes the executable program 4021, the following are implemented: acquiring text information associated with multimedia content; performing word segmentation on the text information to obtain word segmentation segments; clustering the word segmentation segments to obtain a first clustering result, wherein the first clustering result comprises word segmentation segment groups formed by the word segmentation segments of all clustering categories; extracting target characteristic words from the first clustering result, and inputting the target characteristic words into a machine learning model; obtaining all probability values output by the machine learning model; the machine learning model is obtained by performing semantic analysis training on a sample comprising the corresponding relation between the text information and the label; the probability values respectively represent the probability sizes of the target feature words respectively serving as the labels of the text information; and selecting the label meeting the probability condition according to the probability values, and associating the selected label with the multimedia content.

As an embodiment, when the processor 401 runs the executable program 4021, it implements: before clustering the word segmentation segments to obtain a first clustering result, classifying the word segmentation segments into word segmentation segments of various media types according to different media types of multimedia contents.

As an embodiment, when the processor 401 runs the executable program 4021, it implements: and clustering the word segmentation segments of each media type to obtain a first clustering result.

As an embodiment, when the processor 401 runs the executable program 4021, it implements: counting the occurrence frequency of each word segmentation segment in the word segmentation segment group of each clustering category in all clustering categories, and determining the importance degree value of each word segmentation segment in all clustering categories according to the frequency and the weight value of each word segmentation segment; and selecting the importance degree value meeting the degree condition from the determined importance degree values, and determining the target characteristic word according to the word segmentation segment corresponding to the selected importance degree value.
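The importance computation described in this embodiment can be sketched as below. The exact way frequency and weight are combined is not given in this passage, so the sketch assumes a simple product of frequency and weight, and it treats the "degree condition" as a minimum threshold; the cluster contents, the weights and the threshold are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def importance_values(clusters: Dict[str, List[str]],
                      weights: Dict[str, float],
                      default_weight: float = 1.0) -> Dict[str, float]:
    """Score each segment as (frequency over all clusters) * (its weight)."""
    freq = Counter(seg for group in clusters.values() for seg in group)
    return {seg: n * weights.get(seg, default_weight) for seg, n in freq.items()}


def select_segments(scores: Dict[str, float], threshold: float) -> List[str]:
    """Keep the segments whose importance value meets the assumed condition."""
    return sorted(seg for seg, score in scores.items() if score >= threshold)


# Hypothetical first clustering result: category -> word segmentation segments.
first_clustering = {
    "c1": ["concert", "live", "concert"],
    "c2": ["live", "ticket"],
}
segment_weights = {"concert": 2.0, "live": 1.5}  # "ticket" falls back to 1.0

scores = importance_values(first_clustering, segment_weights)
candidates = select_segments(scores, threshold=2.5)
print(scores, candidates)
```

Other weighting schemes, for example an inverse-document-frequency style weight, would fit the same structure by changing only how `weights` is built.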

As an embodiment, when the processor 401 runs the executable program 4021, it implements: classifying the word segmentation segments corresponding to the selected importance degree value according to different media types of the multimedia content to obtain a feature word set of each media type; and determining target characteristic words according to the characteristic words which are selected from the characteristic word set of each media type and used for representing the text information of the media type to which the characteristic words belong.

As an embodiment, when the processor 401 runs the executable program 4021, it implements: selecting characteristic words used for representing text information of the media types from the characteristic word set of each media type; constructing a vector space model based on the feature vectors corresponding to the selected feature words; calculating the similarity among the feature vectors based on the vector space model, and clustering the selected feature words according to the calculation result of the similarity to obtain a second clustering result, wherein the second clustering result comprises the feature words of each clustering category; and extracting target characteristic words from the characteristic words of each cluster category.
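A minimal sketch of the second clustering step follows. It assumes that feature vectors are already available for the selected feature words, uses cosine similarity as the similarity measure, and groups words with a simple greedy threshold rule; the vectors, the threshold, and the choice of the first word of each cluster as the target feature word are illustrative assumptions, not details from this embodiment.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_words(vectors: Dict[str, Sequence[float]], threshold: float) -> List[List[str]]:
    """Greedy clustering: join the first cluster whose seed word is similar enough."""
    clusters: List[List[str]] = []
    for word, vec in vectors.items():
        for group in clusters:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(word)
                break
        else:
            clusters.append([word])
    return clusters


# Hypothetical two-dimensional feature vectors in the vector space model.
feature_vectors = {
    "concert": [1.0, 0.1],
    "live": [0.9, 0.2],
    "recipe": [0.0, 1.0],
}

second_clustering = cluster_words(feature_vectors, threshold=0.8)
# Placeholder rule: take the seed word of each cluster as a target feature word.
target_words = [group[0] for group in second_clustering]
print(second_clustering, target_words)
```

A production system would more likely use an established clustering algorithm such as k-means or hierarchical clustering over the same similarity values.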

As an embodiment, when the processor 401 runs the executable program 4021, it implements: performing word segmentation on the text information to obtain a word segmentation segment set; and filtering the stop words from the segmentation segment set according to stop words stored in a preset corpus, and taking the rest segmentation segments except the filtered stop words in the segmentation segment set as segmentation segments corresponding to the text information.
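The stop-word filtering described here can be illustrated as follows. The whitespace-based `segment` function is only a stand-in for a real word segmenter (Chinese text would need a dedicated segmentation tool), and the stop-word set stands in for the preset corpus; both are assumptions made for the sketch.

```python
from typing import Iterable, List, Set


def segment(text: str) -> List[str]:
    """Stand-in word segmentation: lowercase and split on whitespace."""
    return text.lower().split()


def filter_stop_words(segments: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Drop every segment that appears in the stop-word corpus."""
    return [seg for seg in segments if seg not in stop_words]


# Hypothetical stop words stored in the preset corpus.
stop_corpus = {"the", "of", "a", "and"}

raw_segments = segment("The live show of the band")
kept_segments = filter_stop_words(raw_segments, stop_corpus)
print(kept_segments)
```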

As an embodiment, when the processor 401 runs the executable program 4021, it implements: after selecting a label meeting the probability condition according to the probability values, acquiring a corrected label, where the corrected label is used to update the label output by the machine learning model for the text information; and when the number of the corrected labels reaches a first preset threshold value and/or a training time interval for semantic analysis training in the machine learning model reaches a second preset threshold value, updating the machine learning model based on the corrected labels and the corresponding text information, and re-determining the labels corresponding to the text information according to the updated machine learning model.

As an embodiment, when the processor 401 runs the executable program 4021, it implements: after selecting the labels meeting probability conditions according to the probability values, acquiring preference information; the preference information is used for representing the preference of each multimedia content with the same label; according to the preference information, adjusting the labels of the text information associated with the multimedia contents; and updating the machine learning model according to the text information and the corresponding adjusted label.

The embodiment of the invention also provides a storage medium, which may be a storage medium such as an optical disk, a flash memory or a magnetic disk, and may be a non-transitory storage medium. The storage medium stores an executable program 4021, and when executed by the processor 401, the executable program 4021 implements: acquiring text information associated with multimedia content; performing word segmentation on the text information to obtain word segmentation segments; clustering the word segmentation segments to obtain a first clustering result, wherein the first clustering result comprises word segmentation segment groups formed by the word segmentation segments of all clustering categories; extracting target characteristic words from the first clustering result, and inputting the target characteristic words into a machine learning model; obtaining all probability values output by the machine learning model; the machine learning model is obtained by performing semantic analysis training on a sample comprising the corresponding relation between the text information and the label; the probability values respectively represent the probability sizes of the target feature words respectively serving as the labels of the text information; and selecting the label meeting the probability condition according to the probability values, and associating the selected label with the multimedia content.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: before clustering the word segmentation segments to obtain a first clustering result, classifying the word segmentation segments into word segmentation segments of various media types according to different media types of multimedia contents.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: and clustering the word segmentation segments of each media type to obtain a first clustering result.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: counting the occurrence frequency of each word segmentation segment in the word segmentation segment group of each clustering category in all clustering categories, and determining the importance degree value of each word segmentation segment in all clustering categories according to the frequency and the weight value of each word segmentation segment; and selecting the importance degree value meeting the degree condition from the determined importance degree values, and determining the target characteristic word according to the word segmentation segment corresponding to the selected importance degree value.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: classifying the word segmentation segments corresponding to the selected importance degree value according to different media types of the multimedia content to obtain a feature word set of each media type; and determining target characteristic words according to the characteristic words which are selected from the characteristic word set of each media type and used for representing the text information of the media type to which the characteristic words belong.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: selecting characteristic words used for representing text information of the media types from the characteristic word set of each media type; constructing a vector space model based on the feature vectors corresponding to the selected feature words; calculating the similarity among the feature vectors based on the vector space model, and clustering the selected feature words according to the calculation result of the similarity to obtain a second clustering result, wherein the second clustering result comprises the feature words of each clustering category; and extracting target characteristic words from the characteristic words of each cluster category.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: performing word segmentation on the text information to obtain a word segmentation segment set; and filtering the stop words from the segmentation segment set according to stop words stored in a preset corpus, and taking the rest segmentation segments except the filtered stop words in the segmentation segment set as segmentation segments corresponding to the text information.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: after selecting a label meeting the probability condition according to the probability values, acquiring a corrected label, where the corrected label is used to update the label output by the machine learning model for the text information; and when the number of the corrected labels reaches a first preset threshold value and/or a training time interval for semantic analysis training in the machine learning model reaches a second preset threshold value, updating the machine learning model based on the corrected labels and the corresponding text information, and re-determining the labels corresponding to the text information according to the updated machine learning model.

As an embodiment, the executable program 4021 when executed by the processor 401 implements: after selecting the labels meeting probability conditions according to the probability values, acquiring preference information; the preference information is used for representing the preference of each multimedia content with the same label; according to the preference information, adjusting the labels of the text information associated with the multimedia contents; and updating the machine learning model according to the text information and the corresponding adjusted label.

In summary, according to at least one of the above technical solutions provided in the embodiments of the present invention, the text information associated with the multimedia content can be automatically analyzed and processed by word segmentation and clustering to obtain a first clustering result, the target feature words extracted from the first clustering result are input into the machine learning model to obtain the probability values, the tags meeting the probability condition are selected according to the probability values, and the selected tags are associated with the multimedia content, thereby setting tags for the multimedia content. Therefore, the subjective influence of manual label setting is avoided, and labels can be set for the multimedia content automatically, quickly and accurately. Because the labels set for the multimedia content in the embodiment of the invention are independent of the interests and hobbies of any particular user and depend only on the text information associated with the multimedia content, the set labels better suit the requirements of different users, and the user experience is greatly improved.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or executable program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of an executable program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.

The present invention has been described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and executable program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by executable program instructions. These executable program instructions may be provided to a general purpose computer, special purpose computer, embedded processor, or processor of other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the computer or the processor of the other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These executable program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These executable program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

The above description is only exemplary of the present invention and should not be taken as limiting the scope of the present invention; any modifications, equivalents, and improvements made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims

1. A method for setting a content tag, the method comprising:

acquiring text information associated with multimedia content;

selecting labels meeting probability conditions according to the probability values, and associating the selected labels with the multimedia content;

the extracting of the target feature words from the first clustering result comprises:

counting the occurrence frequency of each word segmentation segment in the word segmentation segment group of each clustering category in all clustering categories, and determining the importance degree value of each word segmentation segment in all clustering categories according to the frequency and the weight value of each word segmentation segment; selecting an importance degree value meeting the degree condition from the determined importance degree values;

2. The method for setting content tags according to claim 1, wherein before said clustering of said word segmentation segments to obtain a first clustering result, said method further comprises:

the clustering the word segmentation segments to obtain a first clustering result includes:

and clustering the word segmentation segments of each media type to obtain a first clustering result.

3. The method for setting the content tag according to claim 1, wherein the segmenting the text information to obtain each segmented segment includes:

4. The method for setting a content tag according to claim 1, wherein after the selecting of the labels meeting the probability condition according to the probability values, the method further comprises:

5. The method for setting a content tag according to claim 4, wherein after the selecting of the labels meeting the probability condition according to the probability values, the method further comprises:

6. An apparatus for setting a content tag, the apparatus comprising: an acquisition module, a word segmentation module, a clustering module, an extraction module, a generation module and an association module; wherein,

the extraction module is configured to extract target feature words from the first clustering result and input the target feature words into a machine learning model; the extraction module is specifically configured to:

count the occurrence frequency of each word segmentation segment in the word segmentation segment groups of all clustering categories, and determine an importance degree value of each word segmentation segment across all clustering categories according to the frequency and the weight value of that word segmentation segment;

select, from the determined importance degree values, an importance degree value meeting the degree condition;

classify the word segmentation segments corresponding to the selected importance degree value according to the different media types of the multimedia content to obtain a feature word set of each media type;

select, from the feature word set of each media type, feature words used for representing the text information of that media type;

construct a vector space model based on the feature vectors corresponding to the selected feature words;

calculate the similarity between the feature vectors based on the vector space model, and cluster the selected feature words according to the calculated similarity to obtain a second clustering result, wherein the second clustering result comprises the feature words of each clustering category;

extract the target feature words from the feature words of each clustering category;
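
For illustration only, and outside the claim language, the vector space model and similarity-based clustering recited for the extraction module could be sketched as below. Cosine similarity, the greedy single-pass grouping against the first member of each cluster, the threshold of 0.8, and the toy two-dimensional feature vectors are all assumptions; the claims do not specify the similarity measure or the clustering algorithm.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_by_similarity(vectors, threshold):
    """Greedy clustering: a word joins the first cluster whose representative
    (its first member) is at least `threshold` similar; otherwise it starts
    a new cluster. This is an assumed strategy, chosen for brevity."""
    clusters = []
    for word, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])
    return clusters


# Hypothetical feature vectors for three selected feature words.
vectors = {"rock": [1.0, 0.0], "metal": [0.9, 0.1], "comedy": [0.0, 1.0]}
clusters = cluster_by_similarity(vectors, threshold=0.8)
print(clusters)
```

A target feature word could then be drawn from each resulting cluster, for example its first member; how that choice is made is likewise left open by the claims.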

7. The content tag setting apparatus according to claim 6, further comprising a classification module, configured to classify the word segmentation segments into word segmentation segments of each media type according to the different media types of the multimedia content, before the clustering module clusters the word segmentation segments to obtain the first clustering result;

the clustering module is specifically configured to cluster the word segmentation segments of each media type to obtain the first clustering result.

8. The setting apparatus of content tags according to claim 6, wherein the word segmentation module is specifically configured to:

9. The content tag setting apparatus according to claim 6, wherein the acquisition module is further configured to obtain a corrected label after the generation module selects the labels meeting the probability condition according to the probability values, the corrected label being a label that corresponds to the text information and is used for updating the output of the machine learning model;

the apparatus further comprises an updating module, configured to update the machine learning model based on the corrected labels and the corresponding text information when the number of corrected labels reaches a first preset threshold and/or the time interval since the semantic analysis training of the machine learning model reaches a second preset threshold, and to re-determine the labels corresponding to the text information according to the updated machine learning model.
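
As a minimal sketch of the update trigger recited above, and not a statement of how the apparatus is actually built, the "and/or" condition can be read as retraining when either threshold is reached. The function name `should_update`, the use of seconds as the interval unit, and the example threshold values are assumptions.

```python
def should_update(corrected_count, seconds_since_training,
                  count_threshold, interval_threshold):
    """Return True when the model should be retrained.

    Assumed reading of "and/or": either condition alone is sufficient.
    """
    count_reached = corrected_count >= count_threshold        # first preset threshold
    interval_reached = seconds_since_training >= interval_threshold  # second preset threshold
    return count_reached or interval_reached


# Hypothetical thresholds: 100 corrected labels, or one day since last training.
print(should_update(100, 0, count_threshold=100, interval_threshold=86400))
print(should_update(5, 3600, count_threshold=100, interval_threshold=86400))
```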

10. The content tag setting apparatus according to claim 9, wherein the acquisition module is further configured to obtain preference information after the generation module selects the labels meeting the probability condition according to the probability values; the preference information represents the preference for each multimedia content having the same label;

the updating module is further configured to adjust the label of the text information associated with each multimedia content according to the preference information, and to update the machine learning model according to the text information and the corresponding adjusted label.

11. A storage medium having stored thereon an executable program, characterized in that the executable program, when executed by a processor, implements the steps of the setting method of a content tag according to any one of claims 1 to 5.

12. A setting apparatus of a content tag, comprising a memory, a processor and an executable program stored on the memory and capable of being executed by the processor, wherein the processor executes the executable program to perform the steps of the setting method of a content tag according to any one of claims 1 to 5.