CN108009228A - Method, device and storage medium for setting a content label - Google Patents

Info

Publication number
CN108009228A
Authority
CN
China
Prior art keywords
label
text message
content
participle
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201711209262.9A
Other languages
Chinese (zh)
Other versions
CN108009228B (en)
Inventor
邹建波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Interactive Entertainment Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Interactive Entertainment Co Ltd
Application filed by China Mobile Communications Group Co Ltd and MIGU Interactive Entertainment Co Ltd
Priority to CN201711209262.9A
Publication of CN108009228A
Application granted
Publication of CN108009228B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40 Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for setting a content label, including: obtaining text information associated with multimedia content; segmenting the text information to obtain word segments; clustering the word segments to obtain a first clustering result, where the first clustering result includes, for each cluster category, a word segment group made up of the word segments of that category; extracting target feature words from the first clustering result and inputting them into a machine learning model; obtaining the probability values output by the machine learning model, where the machine learning model is obtained by training, through semantic analysis, on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information; and, according to the probability values, selecting a label that satisfies a probability condition and associating the selected label with the multimedia content. The invention also discloses a device for setting a content label and a storage medium.

Description

Method, device and storage medium for setting a content label
Technical Field
The present invention relates to data processing technology in the field of artificial intelligence, and in particular to a method, a device and a storage medium for setting a content label.
Background
With the development of Internet technology, people can browse or watch all kinds of multimedia content over the network. Current multimedia content websites, such as video websites, mostly use labels to classify and mark the multimedia content they provide. A label is a keyword strongly correlated with the multimedia content; labels allow the multimedia content to be briefly described and classified, so that users can retrieve or search for the multimedia content they are interested in.
At present, the technical solution generally used to set labels for multimedia content is that a user manually sets labels for the multimedia content according to his or her own interests and preferences. However, because this approach relies on users setting labels by hand, the workload is heavy and the efficiency is low when a large amount of multimedia content needs labels. In addition, this approach depends too heavily on each user's subjective understanding, so the labels that different users set for the same multimedia content may differ from person to person. If multimedia content is recommended to other users according to the labels set by one user, there may be considerable deviation; that is, the labels set by that user are not suitable for everyone and have low applicability. For recommendation scenarios aimed at different users, the labels set in this way are therefore not very accurate.
The related art does not yet provide an effective solution for setting labels for multimedia content quickly and accurately.
Summary of the Invention
In view of this, embodiments of the present invention aim to provide a method, a device and a storage medium for setting a content label, so as to solve the problem that the related art cannot effectively set labels for multimedia content quickly and accurately.
To achieve the above objective, the technical solutions of the embodiments of the present invention are realized as follows:
In a first aspect, an embodiment of the present invention provides a method for setting a content label, the method including:
obtaining text information associated with multimedia content;
segmenting the text information to obtain word segments;
clustering the word segments to obtain a first clustering result, where the first clustering result includes, for each cluster category, a word segment group made up of the word segments of that category;
extracting target feature words from the first clustering result and inputting them into a machine learning model;
obtaining the probability values output by the machine learning model, where the machine learning model is obtained by training, through semantic analysis, on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information;
according to the probability values, selecting a label that satisfies a probability condition, and associating the selected label with the multimedia content.
In a second aspect, an embodiment of the present invention provides a device for setting a content label, the device including an acquisition module, a word segmentation module, a clustering module, an extraction module, a generation module and an association module, where:
the acquisition module is configured to obtain text information associated with multimedia content;
the word segmentation module is configured to segment the text information to obtain word segments;
the clustering module is configured to cluster the word segments to obtain a first clustering result, where the first clustering result includes, for each cluster category, a word segment group made up of the word segments of that category;
the extraction module is configured to extract target feature words from the first clustering result and input them into a machine learning model;
the acquisition module is further configured to obtain the probability values output by the machine learning model, where the machine learning model is obtained by training, through semantic analysis, on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information;
the generation module is configured to select, according to the probability values, a label that satisfies a probability condition;
the association module is configured to associate the selected label with the multimedia content.
In a third aspect, an embodiment of the present invention provides a storage medium storing an executable program, where the executable program, when executed by a processor, implements the steps of the method for setting a content label provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a device for setting a content label, including a memory, a processor, and an executable program that is stored in the memory and can be run by the processor, where the processor, when running the executable program, performs the steps of the method for setting a content label provided by the embodiments of the present invention.
With at least one of the above technical solutions provided by the embodiments of the present invention, the text information associated with the multimedia content is automatically analyzed and processed (for example, segmented and clustered) to obtain the first clustering result; the target feature words extracted from the first clustering result are input into the machine learning model to obtain the probability values; and, according to the probability values, a label that satisfies the probability condition is selected and associated with the multimedia content, thereby setting a label for the multimedia content. In this way, the subjective influence of manually set labels is avoided, and labels can be set for multimedia content automatically, quickly and accurately. Moreover, the labels set for multimedia content in the embodiments of the present invention are independent of any user's own interests and preferences and depend only on the text information associated with the multimedia content. The labels therefore better fit the needs of different users, which greatly improves the user experience.
Brief Description of the Drawings
Fig. 1 is a schematic flowchart of a method for setting a content label provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the functional structure of a device for setting a content label provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the functional structure of another device for setting a content label provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of the hardware structure of a device for setting a content label provided by an embodiment of the present invention.
Detailed Description
To provide a fuller understanding of the features and technical content of the embodiments of the present invention, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the present invention.
Before the embodiments of the present invention are described in further detail, the nouns and terms involved in the embodiments are explained; the following explanations apply to them throughout.
1) Word segmentation, also called word cutting: splitting the characters in text information into independent words according to a certain segmentation strategy.
2) Stop word: a word that is filtered out of the text information because it has no effect on the classification decision for the text information. A stop word usually has no definite meaning of its own (it plays a role only when placed in a complete sentence); examples include function words such as pronouns, articles, numerals, modal particles, adverbs, prepositions and conjunctions.
3) Target feature word: a word that is extracted from the words remaining after the text information has been segmented and the stop words filtered out, and that can represent the multimedia content associated with the text information.
4) Vector space model: the feature space vector obtained by mapping the feature words extracted from the feature word set of each media type to their corresponding word vectors and combining those word vectors.
Fig. 1 is a schematic flowchart of a method for setting a content label provided by an embodiment of the present invention. The method is applied to a terminal device. As shown in Fig. 1, the implementation flow of the method for setting a content label in the embodiment of the present invention may include the following steps:
Step 101: Obtain the text information associated with multimedia content.
In the embodiment of the present invention, the terminal device may include, but is not limited to, computer devices such as smartphones, tablet computers and palmtop computers. The multimedia content may include, but is not limited to, media forms such as video content (for example images), audio content (for example music) and text content (for example novels). The multimedia content may be obtained in at least one of the following ways: for example, it may be an image, picture or song uploaded by a user, or a video collected from a specific website such as a video website.
Here, the text information associated with the multimedia content refers to information that describes the multimedia content, such as the title, synopsis, author and type of the content.
Step 102: Segment the text information to obtain word segments.
In this embodiment, the computer device calls a Chinese word segmentation service to segment all the text information and obtains multiple word segments corresponding to the text information. Word segmentation here can be understood as the process of using a segmenter to divide the text sequence formed by a piece of text information into independent word segments. Specifically, based on the compositional features of Chinese words and the characteristics of English words and phrases, an existing or new segmentation approach may be used to cut the continuous text string into several word segments. For example, if the text information is "今天的天气太热了" ("the weather today is too hot"), the word segments obtained after segmentation are "今天" (today), "的" (possessive particle), "天气" (weather), "太" (too), "热" (hot) and "了" (sentence-final particle).
Here, text information written in Chinese may be segmented with string-matching segmentation methods, such as forward maximum matching, reverse maximum matching, N-gram segmentation, shortest-path segmentation, improved maximum matching and bidirectional maximum matching. Forward maximum matching scans the text to be segmented from left to right and matches runs of several consecutive characters against a vocabulary; whenever a match is found, a word segment is cut off. Improved maximum matching keeps the core idea of forward maximum matching and adds the ambiguity detection and resolution that forward maximum matching lacks, which improves segmentation accuracy while keeping segmentation speed essentially unchanged. The embodiment of the present invention does not limit which of the above segmentation methods is used to segment the text information into word segments.
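Forward maximum matching as described above can be sketched as follows. This is a minimal illustration, not the segmenter used by the embodiment: the function name `forward_max_match`, the `max_word_len` window, the toy vocabulary and the sample sentence "今天的天气太热了" (roughly "the weather today is too hot") are all assumptions chosen for the example.

```python
def forward_max_match(text, vocabulary, max_word_len=4):
    """Segment `text` left to right, always taking the longest vocabulary match."""
    segments = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_word_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or candidate in vocabulary:
                segments.append(candidate)
                i += size
                break
    return segments


# Hypothetical toy vocabulary; a real segmenter would load a full lexicon.
VOCABULARY = {"今天", "天气", "的", "太", "热", "了"}
print(forward_max_match("今天的天气太热了", VOCABULARY))
```

Because the longest candidate is tried first, "今天" is cut off as one segment before the single character "今" is ever considered; unknown characters fall through to one-character segments.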
In this embodiment, step 102 specifically includes: segmenting the text information to obtain a word segment set;
filtering the stop words out of the word segment set according to the stop words stored in a preset corpus, and taking the word segments that remain in the word segment set after the stop words are removed as the word segments corresponding to the text information.
Put simply, a stop word here is a word that has no substantive effect on determining the content label, such as a modal particle or an auxiliary word; that is, a stop word has no definite meaning. For example, if the text information is "今天的天气太热了" ("the weather today is too hot"), the word segment set obtained after segmentation is "今天" (today), "的", "天气" (weather), "太" (too), "热" (hot) and "了". According to the stop words stored in the preset corpus, it can be determined that some word segments in the set are stop words, namely "的", "太" and "了". These are filtered out of the word segment set, leaving the word segments "今天" (today), "天气" (weather) and "热" (hot). It can be seen that the word segments remaining after filtering still convey the meaning of the text information. Filtering stop words out of the word segment set limits the size of the remaining set, which improves filtering accuracy and helps improve the efficiency of subsequent label setting.
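The stop-word filtering step can be sketched as a simple set lookup. The stop-word set below is a hypothetical stand-in for the preset corpus, and the function name `remove_stop_words` is an assumption made for illustration.

```python
# Hypothetical stop-word list standing in for the preset corpus.
STOP_WORDS = {"的", "太", "了"}


def remove_stop_words(segments, stop_words=STOP_WORDS):
    """Return the segments that are not stop words, preserving their order."""
    return [segment for segment in segments if segment not in stop_words]


segments = ["今天", "的", "天气", "太", "热", "了"]
filtered = remove_stop_words(segments)
print(filtered)
```

Keeping the stop words in a set makes each membership check constant time, which matters once the segment set grows large.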
Step 103: Cluster the word segments to obtain a first clustering result, where the first clustering result includes, for each cluster category, a word segment group made up of the word segments of that category.
In the embodiment of the present invention, clustering can be understood as measuring the semantic similarity between the word segments of the text information and grouping the semantically closest word segments into one class. For example, through clustering, word segments such as "like", "love" and "pity", which express emotion, can be gathered into the same word segment group. Because the amount of text information associated with multimedia content obtained in the embodiment of the present invention is large, clustering yields word segment groups of different cluster categories.
Here, an existing or new clustering algorithm, such as a partition-based clustering algorithm (K-means) or a model-based clustering algorithm (SOM), may be used to cluster the word segments and obtain the first clustering result. The semantic similarity between word segments may be calculated with methods such as Euclidean distance or cosine similarity, which are not described one by one in the embodiment of the present invention. Preferably, the embodiment of the present invention clusters the word segments with the SOM-based clustering algorithm, which yields higher similarity within clusters.
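The following sketch shows the partition-based option (K-means with Euclidean distance) together with a cosine similarity helper. It is an illustration under stated assumptions: the two-dimensional embeddings are invented for the example, the centroids are seeded with the first k vectors to keep the run deterministic, and the SOM-based algorithm preferred by the embodiment is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(vectors, k, iterations=10):
    """Plain K-means; the first k vectors seed the centroids (deterministic)."""
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: euclidean_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Hypothetical 2-D embeddings; real word vectors would come from a trained model.
EMBEDDINGS = {
    "like": [0.9, 0.1],
    "weather": [0.1, 0.9],
    "love": [0.8, 0.2],
    "hot": [0.2, 0.8],
}
words = list(EMBEDDINGS)
labels = k_means([EMBEDDINGS[w] for w in words], k=2)
groups = {}
for word, label in zip(words, labels):
    groups.setdefault(label, []).append(word)
print(groups)
```

Each value in `groups` plays the role of one word segment group in the first clustering result.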
In this embodiment, assuming that the multimedia content described in step 101 specifically includes multimedia content of different media types, then before step 103 is performed the method may further include:
classifying the word segments into word segments of each media type according to the different media types of the multimedia content;
Correspondingly, step 103 specifically includes: clustering the word segments of each media type to obtain the first clustering result.
In the embodiment of the present invention, classifying the word segments into word segments of each media type may be implemented as follows: according to the different media types of the multimedia content, an existing or new text classification model, such as a maximum entropy model or a decision tree model, is used to classify the word segments. Specifically, the probability that each word segment belongs to each media type may be calculated, and the category of each word segment predicted by taking the type with the highest probability as the media type to which the word segment belongs.
For example, the probabilities that the word segments "rock", "jazz" and "musical style" belong to each media type are calculated. Comparison shows that, relative to the other media types, these word segments are most likely to belong to the music type, so "rock", "jazz" and "musical style" can be classified as word segments of the music type. The different media types of the multimedia content may include, but are not limited to, types such as video, music and novel.
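The "take the most probable media type" rule can be sketched as an argmax over per-type probabilities. The probability table below is invented for illustration; in practice it would be produced by a text classification model such as the maximum entropy or decision tree models mentioned above, and the function name `assign_media_type` is an assumption.

```python
def assign_media_type(type_probabilities):
    """Pick the media type with the highest probability for one word segment."""
    return max(type_probabilities, key=type_probabilities.get)


# Hypothetical classifier outputs: P(media type | word segment).
PROBABILITIES = {
    "rock": {"video": 0.15, "music": 0.80, "novel": 0.05},
    "jazz": {"video": 0.10, "music": 0.85, "novel": 0.05},
    "chapter": {"video": 0.05, "music": 0.05, "novel": 0.90},
}

segments_by_type = {}
for segment, probabilities in PROBABILITIES.items():
    media_type = assign_media_type(probabilities)
    segments_by_type.setdefault(media_type, []).append(segment)
print(segments_by_type)
```

The resulting per-type lists are what would then be clustered separately in step 103.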
Here, clustering the word segments of each media type may specifically mean clustering the word segments under each media type separately, according to the different media types of the multimedia content. For example, all word segments belonging to the music type are clustered, and all word segments belonging to the video type are likewise clustered. In this way, the text information associated with the multimedia content is segmented, classified and clustered before the feature words that characterize it are extracted, which increases the correlation between the final label and the multimedia content and improves the accuracy of the labels set for the multimedia content.
Step 104: Extract target feature words from the first clustering result and input them into a machine learning model.
In this embodiment, step 104 specifically includes: counting the frequency with which each word segment in the word segment group of each cluster category appears across all cluster categories, and determining, according to the frequency and the weight value of each word segment, the importance value of each word segment across all cluster categories;
selecting, from the determined importance values, an importance value that satisfies a degree condition, and determining the target feature word according to the word segment corresponding to the selected importance value.
Specifically, in order to screen the word segments in the first clustering result and thereby reduce the number of feature words that characterize the text information of the same multimedia content, the existing term frequency-inverse document frequency (TF-IDF) feature selection method may be used to assess how important a word is to a file set or to one file in a corpus. In general, the importance of a word increases in proportion to the number of times it appears in a file. In the embodiment of the present invention, the importance of a word segment is determined from two factors, frequency and weight value: the frequency with which each word segment in the word segment group of each cluster category appears across all cluster categories is counted and multiplied by the calculated weight value that the word segment carries in the text information associated with the multimedia content, which gives the importance value of each word segment across all cluster categories. The calculated importance values are then sorted, and the importance value that satisfies the degree condition is selected based on the sorted result; that is, the maximum value is chosen from the importance values, and the target feature word is determined from the word segment corresponding to the selected maximum importance value. The sorting may be in ascending or descending order. The weight values may be calculated automatically by the computer device, and different word segments may have different weight values.
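The frequency-times-weight scoring described above can be sketched as follows. The clusters and weight values are invented for the example; the text does not specify how the weights are computed, so they are simply passed in, and segments without a weight default to 1.0 (an assumption). The function names are hypothetical.

```python
from collections import Counter


def importance_scores(clusters, weights):
    """Importance = frequency across all clusters multiplied by the segment's weight."""
    frequency = Counter(segment for group in clusters for segment in group)
    return {
        segment: count * weights.get(segment, 1.0)
        for segment, count in frequency.items()
    }


def top_segments(scores, n=1):
    """Sort by descending importance (ties broken alphabetically) and keep the top n."""
    return sorted(scores, key=lambda s: (-scores[s], s))[:n]


# Hypothetical first clustering result and per-segment weights.
clusters = [["rock", "jazz", "rock"], ["concert", "rock"], ["jazz", "album"]]
weights = {"rock": 1.0, "jazz": 2.0, "concert": 1.5, "album": 0.5}

scores = importance_scores(clusters, weights)
print(top_segments(scores, n=2))
```

Here "jazz" outranks "rock" even though it appears less often, because its weight is higher; this is the effect of combining the two factors rather than ranking on frequency alone.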
In the embodiment of the present invention, if the word segments were not classified before being clustered, and assuming that the multimedia content described in step 101 specifically includes multimedia content of different media types, then determining the target feature word according to the word segment corresponding to the selected importance value may specifically include:
classifying the word segments corresponding to the selected importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type;
determining the target feature word according to the feature words that are selected from the feature word set of each media type to characterize the text information of that media type.
Here, for multimedia content of different media types, the selection of feature words that characterize the text information of a media type from that type's feature word set generally falls into local feature selection and global feature selection. The computer device can decide automatically, according to the media type, whether to use local or global feature selection; in either case, the purpose is to extract from the feature word set of each media type the feature words that best express the core content of the text information. For example, when the multimedia content is a film, a word or a few word segments in the film's synopsis can best express the subject of the film in concentrated form, so the target feature word can be selected from the synopsis; this is selection of the target feature word by local feature selection. When the multimedia content is a song, no single piece of information expresses the subject of the song in concentrated form, so the song's global features, such as lyrics, style and author, need to be considered together to select the target feature word; this is selection of the target feature word by global feature selection. In general, selecting target feature words by local feature selection is more efficient.
Here, the word segments corresponding to the selected importance values are taken as candidate feature words. The candidate feature words are then classified according to the different media types of the multimedia content, such as video, music and novel, to obtain the feature word set of each media type. The feature words that characterize the core content of the text information are selected from the feature word set of each media type, and the target feature word is determined from them. Combining local and global feature selection in this way reduces the dimensionality of the candidate feature words, which reduces the number of candidate feature words the machine learning model must process and greatly improves processing efficiency. Dimensionality reduction here means reducing the number of dimensions; in other words, reducing the dimensionality of the candidate feature words reduces their overall number.
In the embodiment of the present invention, determining the target feature word according to the feature words that are selected from the feature word set of each media type to characterize the text information of that media type specifically includes:
selecting, from the feature word set of each media type, the feature words that characterize the text information of that media type;
building a vector space model based on the feature vectors corresponding to the selected feature words;
calculating, based on the vector space model, the similarity between the feature vectors, and clustering the selected feature words according to the similarity results to obtain a second clustering result, where the second clustering result includes the feature words of each cluster category;
extracting the target feature word from the feature words of each cluster category.
In this embodiment, according to the different media types of the multimedia content, the local or global feature selection described above may be used to extract from the feature word set of each media type the feature words that best express the core content of the text information. Here, the selected feature words are represented as vectors: each selected feature word is mapped to its corresponding word vector, and the word vectors are combined into a feature space vector, from which the vector space model is built. Existing methods such as Euclidean distance or cosine similarity may be used to calculate the similarity between the feature vectors, which are not described one by one in the embodiment of the present invention. An existing or new clustering algorithm, such as a K-means-based or SOM-based clustering algorithm, may be used to cluster the selected feature words, and the target feature word is extracted from the feature words of each cluster category. Clustering the feature words in this way further improves the accuracy of the labels set for the multimedia content.
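A minimal sketch of this second clustering pass is given below. For brevity it swaps in a simple greedy threshold grouping on cosine similarity instead of K-means or SOM, and it takes the first word of each group as that group's target feature word; both choices, the 0.9 threshold and the three-dimensional word vectors, are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_by_similarity(vectors, threshold=0.9):
    """Greedy grouping: a word joins the first group whose seed is similar enough."""
    groups = []  # each group is a list of words; its first word is the seed
    for word, vector in vectors.items():
        for group in groups:
            if cosine_similarity(vector, vectors[group[0]]) >= threshold:
                group.append(word)
                break
        else:
            # No existing group is close enough, so this word seeds a new one.
            groups.append([word])
    return groups


# Hypothetical vector space model: feature word -> word vector.
FEATURE_VECTORS = {
    "rock": [0.9, 0.1, 0.0],
    "jazz": [0.8, 0.2, 0.0],
    "romance": [0.0, 0.1, 0.9],
    "love story": [0.1, 0.0, 0.9],
}

feature_groups = group_by_similarity(FEATURE_VECTORS)
target_feature_words = [group[0] for group in feature_groups]
print(feature_groups)
print(target_feature_words)
```

The greedy pass depends on insertion order, which is why a proper clustering algorithm would normally be preferred; it is used here only because it fits in a few lines.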
Step 105: Obtain the probability values output by the machine learning model, where the machine learning model is obtained by training, through semantic analysis, on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information.
In the embodiment of the present invention, the vector representation of each target feature word input into the machine learning model is transformed, and the result of the transformation is output as the probability that the word serves as a label of the text information, which yields the probability value of each target feature word as a label of the text information. Specifically, based on the activation functions of the different nodes in the machine learning model, the vector representation of each input target feature word is transformed, and the result of the transformation is taken as the vector representation of the label and its corresponding probability.
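As a rough illustration of turning a word vector into a probability through an activation function, the sketch below uses a single node with a sigmoid activation. The text does not disclose the model's architecture, so the one-node structure, the weights, the bias and the word vectors are all hypothetical; a trained model would learn its parameters from the text-to-label samples.

```python
import math


def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def label_probability(vector, weights, bias):
    """One-node model: weighted sum of the word vector passed through a sigmoid."""
    return sigmoid(sum(w * x for w, x in zip(weights, vector)) + bias)


# Hypothetical parameters and word vectors, standing in for a trained model.
WEIGHTS, BIAS = [2.0, -1.0], 0.0
WORD_VECTORS = {"rock": [1.0, 0.0], "weather": [0.0, 1.0]}

probabilities = {
    word: label_probability(vector, WEIGHTS, BIAS)
    for word, vector in WORD_VECTORS.items()
}
print(probabilities)
```

Each entry of `probabilities` corresponds to one probability value in step 105: the likelihood that the given target feature word serves as a label.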
Here, the machine learning model, is obtained by the semantic analysis training data in natural language learning field 's;Wherein, machine learning model includes the correspondence of the text message and label set by operation personnel, with text message With the correspondence of label as sample training machine learning model, each probable value of machine learning model output is obtained, wherein, Each probable value represents probability size of each target signature word respectively as the label of text message respectively.
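The patent does not specify the model architecture or its activation functions. As a hedged sketch only, the step of turning word vectors into label probabilities can be illustrated with a single linear node followed by a softmax; the weights, the function names and the choice of softmax are all assumptions for illustration.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_probabilities(word_vectors, weights, bias=0.0):
    """Score each target feature word with one linear node, then normalize.

    word_vectors: dict mapping a target feature word to its vector.
    weights, bias: hypothetical trained parameters of the node.
    """
    words = list(word_vectors)
    scores = [
        sum(w * x for w, x in zip(weights, word_vectors[word])) + bias
        for word in words
    ]
    return dict(zip(words, softmax(scores)))
```

A trained model would learn the weights from the text information and label samples described above; here they are supplied by hand.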
Here, the text information in the machine learning model can be the common words of the professional domain under each media type (specialized words for short), which can be obtained by web crawling and manual entry. Specifically, a crawler is configured for a professional website to crawl the specialized words of the corresponding professional domain; for example, video-related specialized words such as "reality TV show" are crawled from the Douban video website, and the crawled specialized words are then added to the machine learning model by manual entry. In this way, the text information in the machine learning model can be updated in time so that the model suits different professional domains, and the labels set for multimedia content become more accurate.
Step 106: According to the probability values, select the label that satisfies the probability condition, and associate the selected label with the multimedia content.
Here, the label that satisfies the probability condition can be the one with the highest probability of being a label of the text information. That is, the label with the highest probability is selected from the probability values output by the machine learning model. After the label that satisfies the probability condition is selected, an association between the selected label and the multimedia content is established, so that the multimedia content corresponding to a label can be found quickly through the association.
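Step 106 can be sketched as picking the most probable candidate and recording it in a label-to-content index. The dictionary-based index and the identifiers used below are assumptions for illustration, not structures named in the patent.

```python
def select_label(probabilities):
    """Return the candidate label with the highest probability value."""
    return max(probabilities, key=probabilities.get)


def associate(index, label, content_id):
    """Record that content_id carries the given label; index maps label -> content ids."""
    contents = index.setdefault(label, [])
    if content_id not in contents:
        contents.append(content_id)
    return index
```

Keeping the association keyed by label is what makes the lookup from a label to its multimedia content fast.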
In the present embodiment, to make the labels set for multimedia content more accurate, if operation personnel correct a label during the semantic analysis process, the corrected label needs to be synchronized back to the machine learning model, and the label is then set again for the multimedia content.
Specifically, after the label that satisfies the probability condition is selected according to the probability values in step 106, the method can further include:
obtaining a corrected label, where the corrected label is used to update the label that the machine learning model output for the text information;
when the number of corrected labels reaches a first preset threshold, and/or when the training time interval of the semantic analysis training of the machine learning model reaches a second preset threshold, updating the machine learning model based on the corrected labels and the corresponding text information, and redetermining the label corresponding to the text information according to the updated machine learning model.
It should be noted that updating the machine learning model only when the number of corrected labels reaches the first preset threshold, and/or when the training time interval of the semantic analysis training reaches the second preset threshold, ensures that the samples of the machine learning model are updated in time within an effective period, so that the change in the labels reset for the multimedia content is more noticeable.
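The update trigger described above can be sketched as a simple predicate. The parameter names, the use of seconds as the time unit and the `require_both` switch (which selects between the "and" and "or" readings of "and/or") are assumptions for illustration.

```python
def should_update_model(num_corrected, seconds_since_training,
                        count_threshold, interval_threshold,
                        require_both=False):
    """Decide whether to retrain the model with the accumulated corrected labels.

    count_threshold: the first preset threshold (number of corrected labels).
    interval_threshold: the second preset threshold (training time interval).
    """
    count_reached = num_corrected >= count_threshold
    interval_reached = seconds_since_training >= interval_threshold
    if require_both:
        return count_reached and interval_reached
    return count_reached or interval_reached
```

Batching corrections this way avoids retraining on every single correction while still bounding how stale the model can become.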
In the embodiments of the present invention, after the label that satisfies the probability condition is selected according to the probability values in step 106, the method can further include:
obtaining preference information, where the preference information characterizes the preference for each piece of multimedia content having the same label;
adjusting, according to the preference information, the label of the text information associated with each piece of multimedia content;
updating the machine learning model according to the text information and the corresponding adjusted label.
Here, the preference information can be user feedback on each piece of multimedia content having the same label, such as like or dislike. The user's preference information is fed back in real time to a big data platform such as a computer device; the computer device adjusts the label of the text information associated with each piece of multimedia content according to the preference information, and then updates the correspondence between text information and labels in the samples of the machine learning model according to the text information and the corresponding adjusted label.
For example, a user group A likes multimedia content with the same label T1, so multimedia content C1A, C1B, C1C, C1D and C1E under label T1 is recommended to user group A. It is then found that user group A often browses or watches C1A, C1B and C1C, seldom browses or watches C1D and C1E, and instead likes multimedia content C2F under label T2. The multimedia content having label T1 therefore changes accordingly: the multimedia content of T1 becomes C1A, C1B, C1C and C2F.
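The T1 example can be reproduced with a small sketch. The function name, the explicit liked and disliked lists, and the decision to leave C2F under T2 as well are assumptions for illustration; the patent does not say how preference signals are thresholded or whether content is removed from its original label.

```python
def adjust_label_contents(label_contents, label, liked, disliked):
    """Return a copy of label_contents with one label's content list adjusted.

    disliked: content to drop from the label; liked: content to add to it.
    """
    updated = {name: list(items) for name, items in label_contents.items()}
    kept = [c for c in updated.get(label, []) if c not in disliked]
    for content in liked:
        if content not in kept:
            kept.append(content)
    updated[label] = kept
    return updated


before = {"T1": ["C1A", "C1B", "C1C", "C1D", "C1E"], "T2": ["C2F"]}
after = adjust_label_contents(before, "T1", liked=["C2F"], disliked=["C1D", "C1E"])
```

The adjusted pairs would then serve as new samples when the machine learning model is updated.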
With the technical solution of the embodiments of the present invention, the text information associated with multimedia content is analyzed by word segmentation, clustering and similar processing to obtain a first cluster result; the target feature words extracted from the first cluster result are input into a machine learning model to obtain probability values; the label that satisfies the probability condition is selected according to the probability values; and the selected label is associated with the multimedia content, which achieves the purpose of setting a label for the multimedia content. This avoids the subjective influence of manually set labels and allows labels to be set for multimedia content automatically, quickly and accurately. Because the labels set depend only on the text information associated with the multimedia content, they better fit the needs of different users and greatly improve the user experience.
To implement the above content label setting method, an embodiment of the present invention further provides a content label setting device. The device is applied to a terminal device, for example a computer device such as a smartphone, tablet computer or palmtop computer. Fig. 2 is a schematic functional structure diagram of a content label setting device provided in an embodiment of the present invention. As shown in Fig. 2, the content label setting device includes an acquisition module 201, a word segmentation module 202, a cluster module 203, an extraction module 204, a generation module 205 and an association module 206, where:
the acquisition module 201 is configured to obtain the text information associated with multimedia content;
the word segmentation module 202 is configured to perform word segmentation on the text information to obtain word segments;
the cluster module 203 is configured to cluster the word segments to obtain a first cluster result, where the first cluster result includes, for each cluster category, a word segment group made up of the word segments of that category;
the extraction module 204 is configured to extract target feature words from the first cluster result and input them into a machine learning model;
the acquisition module 201 is further configured to obtain the probability values output by the machine learning model, where the machine learning model is obtained by semantic analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that the respective target feature word serves as a label of the text information;
the generation module 205 is configured to select, according to the probability values, the label that satisfies the probability condition;
the association module 206 is configured to associate the selected label with the multimedia content.
In the present embodiment, the extraction module 204 is specifically configured to:
count how often each word segment in the word segment group of each cluster category occurs across all cluster categories, and determine the importance value of each word segment across all cluster categories according to its frequency and weight value;
select, from the determined importance values, the importance values that satisfy the degree condition, and determine the target feature words according to the word segments corresponding to the selected importance values.
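The frequency-and-weight computation can be sketched as below. The patent does not give the formula; multiplying frequency by weight, and treating the degree condition as a minimum threshold, are assumptions made for this example, as are the function and parameter names.

```python
from collections import Counter


def importance_values(clusters, weights, default_weight=1.0):
    """Compute an importance value per word segment.

    clusters: dict mapping a cluster category to its list of word segments.
    weights: dict mapping a word segment to its weight value.
    """
    frequency = Counter(seg for group in clusters.values() for seg in group)
    return {
        seg: count * weights.get(seg, default_weight)
        for seg, count in frequency.items()
    }


def select_by_importance(importance, min_value):
    """Keep segments whose importance meets the (assumed) threshold, most important first."""
    chosen = [seg for seg, value in importance.items() if value >= min_value]
    return sorted(chosen, key=lambda seg: (-importance[seg], seg))
```

With this product form the calculation resembles a term-frequency weighting scheme, but any monotonic combination of frequency and weight would fit the description equally well.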
In the present embodiment, determining the target feature words according to the word segments corresponding to the selected importance values can be implemented as follows: according to the different media types of multimedia content, classify the word segments corresponding to the selected importance values to obtain a feature word set for each media type;
determine the target feature words according to the feature words selected from the feature word set of each media type to characterize the text information of that media type.
In the present embodiment, determining the target feature words according to the feature words selected from the feature word set of each media type to characterize the text information of that media type can be implemented as follows: select, from the feature word set of each media type, the feature words used to characterize the text information of that media type;
build a vector space model based on the feature vectors corresponding to the selected feature words;
based on the vector space model, calculate the similarity between the feature vectors, and cluster the selected feature words according to the calculated similarity to obtain a second cluster result, where the second cluster result includes the feature words of each cluster category;
extract the target feature words from the feature words of each cluster category.
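The similarity-based clustering of feature words can be sketched as follows. Note that this is a simple greedy threshold clustering substituted for brevity, not the K-means or SOM algorithms the description mentions; the threshold value and the rule of taking the first word of each cluster as its target feature word are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster_feature_words(vectors, threshold=0.8):
    """Greedily group words: join the first cluster whose seed word is similar enough."""
    clusters = []  # each cluster is a list of words; the first word is its seed
    for word, vec in vectors.items():
        for members in clusters:
            if cosine(vec, vectors[members[0]]) >= threshold:
                members.append(word)
                break
        else:
            clusters.append([word])  # no similar cluster found, start a new one
    return clusters


def extract_target_words(clusters):
    """Take the seed word of each cluster as its target feature word."""
    return [members[0] for members in clusters]
```

The result depends on input order, which is a known limitation of greedy clustering; an iterative algorithm such as K-means would refine the cluster centers instead.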
In the present embodiment, the word segmentation module 202 is specifically configured to:
perform word segmentation on the text information to obtain a word segment set;
filter the stop words stored in a preset corpus out of the word segment set, and take the word segments remaining in the set after the stop words are filtered out as the word segments corresponding to the text information.
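Segmentation followed by stop-word filtering can be sketched as below. Whitespace splitting stands in for a real word segmenter (Chinese text would need a dedicated one), and the small stop-word set stands in for the preset corpus; both are assumptions for illustration.

```python
def segment(text):
    """Hypothetical segmenter: split on whitespace and lowercase each piece."""
    return [piece.lower() for piece in text.split()]


def remove_stop_words(segments, stop_words):
    """Drop every segment that appears in the stop-word set, preserving order."""
    return [seg for seg in segments if seg not in stop_words]


STOP_WORDS = {"the", "a", "of", "is"}  # stand-in for the preset corpus
segments = remove_stop_words(segment("The finale of a reality show"), STOP_WORDS)
```

Using a set for the stop words keeps each membership check constant-time, which matters when the corpus is large.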
In one embodiment, Fig. 3 is a schematic functional structure diagram of another content label setting device provided in an embodiment of the present invention. As shown in Fig. 3, the content label setting device further includes a classification module 207, configured to classify the word segments into word segments of each media type according to the different media types of multimedia content, before the cluster module 203 clusters the word segments to obtain the first cluster result;
the cluster module 203 is specifically configured to cluster the word segments of each media type to obtain the first cluster result.
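The classification step that precedes clustering can be sketched as a group-by on media type. The (segment, media type) pair representation and the media type names are assumptions for illustration; the patent does not state how a word segment's media type is recorded.

```python
def group_by_media_type(tagged_segments):
    """Group word segments by media type.

    tagged_segments: iterable of (segment, media_type) pairs.
    Returns a dict mapping each media type to its list of segments.
    """
    groups = {}
    for segment, media_type in tagged_segments:
        groups.setdefault(media_type, []).append(segment)
    return groups
```

Each per-type list can then be clustered separately, so that words from, say, video descriptions are not mixed with words from music descriptions.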
In the present embodiment, in one implementation, the acquisition module 201 is further configured to obtain a corrected label after the generation module 205 selects the label that satisfies the probability condition according to the probability values, where the corrected label is used to update the label that the machine learning model output for the text information;
the device further includes an update module 208, configured to, when the number of corrected labels reaches a first preset threshold, and/or when the training time interval of the semantic analysis training of the machine learning model reaches a second preset threshold, update the machine learning model based on the corrected labels and the corresponding text information, and redetermine the label corresponding to the text information according to the updated machine learning model.
In another implementation, the acquisition module 201 is further configured to obtain preference information after the generation module 205 selects the label that satisfies the probability condition according to the probability values, where the preference information characterizes the preference for each piece of multimedia content having the same label;
the update module 208 is further configured to adjust, according to the preference information, the label of the text information associated with each piece of multimedia content, and to update the machine learning model according to the text information and the corresponding adjusted label.
It should be noted that when the content label setting device provided by the above embodiment sets content labels, the division into the above program modules is only an example. In practical applications, the above processing can be assigned to different program modules as needed; that is, the internal structure of the device can be divided into different program modules to complete all or part of the processing described above. In addition, the content label setting device provided by the above embodiment and the embodiments of the content label setting method belong to the same concept; for its specific implementation, refer to the method embodiments, which are not repeated here.
In practical applications, the word segmentation module 202, cluster module 203, extraction module 204, generation module 205, association module 206, classification module 207 and update module 208 can be implemented by a central processing unit (CPU, Central Processing Unit), microprocessor (MPU, Micro Processor Unit), digital signal processor (DSP, Digital Signal Processor) or field programmable gate array (FPGA, Field Programmable Gate Array) in the computer device; the acquisition module 201 can be implemented by a communication module (including a basic communication suite, operating system, communication module, standard interfaces and protocols, etc.) and a transceiver antenna.
To implement the above content label setting method, an embodiment of the present invention further provides a hardware structure of a content label setting device. The content label setting device of the embodiments of the present invention is now described with reference to the drawings; it can be implemented as a terminal device, for example a computer device such as a smartphone, tablet computer or palmtop computer. The hardware structure of the content label setting device provided in the embodiments of the present invention is further described below. It can be understood that Fig. 4 shows only an example structure of the content label setting device rather than its entire structure, and part or all of the structure shown in Fig. 4 can be implemented as needed.
Referring to Fig. 4, Fig. 4 is a schematic hardware structure diagram of a content label setting device provided in an embodiment of the present invention, which in practice can be applied to the aforementioned terminal device running an application program. The content label setting device 400 shown in Fig. 4 includes at least one processor 401, a memory 402, a user interface 403 and at least one network interface 404. The components of the content label setting device 400 are coupled together by a bus system 405. It can be understood that the bus system 405 is used to implement connection and communication between these components. In addition to a data bus, the bus system 405 includes a power bus, a control bus and a status signal bus. For clarity of description, however, the various buses are all labeled as bus system 405 in Fig. 4.
The user interface 403 can include a display, keyboard, mouse, trackball, click wheel, keys, buttons, touch pad or touch screen, etc.
It can be understood that the memory 402 can be volatile memory or non-volatile memory, and can also include both volatile and non-volatile memory.
The memory 402 in the embodiments of the present invention is used to store various types of data to support the operation of the content label setting device 400. Examples of such data include any computer program to be run on the content label setting device 400, such as the executable program 4021 and the operating system 4022; a program that implements the content label setting method of the embodiments of the present invention can be included in the executable program 4021.
The content label setting method disclosed in the embodiments of the present invention can be applied to, or implemented by, the processor 401. The processor 401 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above content label setting method can be completed by an integrated logic circuit of hardware in the processor 401 or by instructions in the form of software. The processor 401 can be a general-purpose processor, a DSP, or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc. The processor 401 can implement or execute the content label setting methods, steps and logic block diagrams provided in the embodiments of the present invention. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the content label setting method provided in the embodiments of the present invention can be directly embodied as being executed by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module can be located in a storage medium in the memory 402; the processor 401 reads the information in the memory 402 and completes the steps of the foregoing content label setting method in combination with its hardware.
An embodiment of the present invention further provides a hardware structure of a content label setting device. The content label setting device 400 includes a memory 402, a processor 401, and an executable program 4021 that is stored on the memory 402 and can be run by the processor 401. When running the executable program 4021, the processor 401 implements: obtaining text information associated with multimedia content; performing word segmentation on the text information to obtain word segments; clustering the word segments to obtain a first cluster result, where the first cluster result includes, for each cluster category, a word segment group made up of the word segments of that category; extracting target feature words from the first cluster result and inputting them into a machine learning model; obtaining the probability values output by the machine learning model, where the machine learning model is obtained by semantic analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that the respective target feature word serves as a label of the text information; and selecting, according to the probability values, the label that satisfies the probability condition and associating the selected label with the multimedia content.
In one embodiment, when running the executable program 4021, the processor 401 implements: before the word segments are clustered to obtain the first cluster result, classifying the word segments into word segments of each media type according to the different media types of multimedia content.
In one embodiment, when running the executable program 4021, the processor 401 implements: clustering the word segments of each media type to obtain the first cluster result.
In one embodiment, when running the executable program 4021, the processor 401 implements: counting how often each word segment in the word segment group of each cluster category occurs across all cluster categories, and determining the importance value of each word segment across all cluster categories according to its frequency and weight value; and selecting, from the determined importance values, the importance values that satisfy the degree condition, and determining the target feature words according to the word segments corresponding to the selected importance values.
In one embodiment, when running the executable program 4021, the processor 401 implements: classifying, according to the different media types of multimedia content, the word segments corresponding to the selected importance values to obtain a feature word set for each media type; and determining the target feature words according to the feature words selected from the feature word set of each media type to characterize the text information of that media type.
In one embodiment, when running the executable program 4021, the processor 401 implements: selecting, from the feature word set of each media type, the feature words used to characterize the text information of that media type; building a vector space model based on the feature vectors corresponding to the selected feature words; based on the vector space model, calculating the similarity between the feature vectors, and clustering the selected feature words according to the calculated similarity to obtain a second cluster result, where the second cluster result includes the feature words of each cluster category; and extracting the target feature words from the feature words of each cluster category.
In one embodiment, when running the executable program 4021, the processor 401 implements: performing word segmentation on the text information to obtain a word segment set; and filtering the stop words stored in a preset corpus out of the word segment set, and taking the word segments remaining in the set after the stop words are filtered out as the word segments corresponding to the text information.
In one embodiment, when running the executable program 4021, the processor 401 implements: after the label that satisfies the probability condition is selected according to the probability values, obtaining a corrected label, where the corrected label is used to update the label that the machine learning model output for the text information; and when the number of corrected labels reaches a first preset threshold, and/or when the training time interval of the semantic analysis training of the machine learning model reaches a second preset threshold, updating the machine learning model based on the corrected labels and the corresponding text information, and redetermining the label corresponding to the text information according to the updated machine learning model.
In one embodiment, when running the executable program 4021, the processor 401 implements: after the label that satisfies the probability condition is selected according to the probability values, obtaining preference information, where the preference information characterizes the preference for each piece of multimedia content having the same label; adjusting, according to the preference information, the label of the text information associated with each piece of multimedia content; and updating the machine learning model according to the text information and the corresponding adjusted label.
An embodiment of the present invention further provides a storage medium. The storage medium can be an optical disc, flash memory, magnetic disk or similar storage medium, and is optionally a non-transitory storage medium. The storage medium stores an executable program 4021, and when executed by the processor 401, the executable program 4021 implements: obtaining text information associated with multimedia content; performing word segmentation on the text information to obtain word segments; clustering the word segments to obtain a first cluster result, where the first cluster result includes, for each cluster category, a word segment group made up of the word segments of that category; extracting target feature words from the first cluster result and inputting them into a machine learning model; obtaining the probability values output by the machine learning model, where the machine learning model is obtained by semantic analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that the respective target feature word serves as a label of the text information; and selecting, according to the probability values, the label that satisfies the probability condition and associating the selected label with the multimedia content.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: before the word segments are clustered to obtain the first cluster result, classifying the word segments into word segments of each media type according to the different media types of multimedia content.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: clustering the word segments of each media type to obtain the first cluster result.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: counting how often each word segment in the word segment group of each cluster category occurs across all cluster categories, and determining the importance value of each word segment across all cluster categories according to its frequency and weight value; and selecting, from the determined importance values, the importance values that satisfy the degree condition, and determining the target feature words according to the word segments corresponding to the selected importance values.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: classifying, according to the different media types of multimedia content, the word segments corresponding to the selected importance values to obtain a feature word set for each media type; and determining the target feature words according to the feature words selected from the feature word set of each media type to characterize the text information of that media type.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: selecting, from the feature word set of each media type, the feature words used to characterize the text information of that media type; building a vector space model based on the feature vectors corresponding to the selected feature words; based on the vector space model, calculating the similarity between the feature vectors, and clustering the selected feature words according to the calculated similarity to obtain a second cluster result, where the second cluster result includes the feature words of each cluster category; and extracting the target feature words from the feature words of each cluster category.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: performing word segmentation on the text information to obtain a word segment set; and filtering the stop words stored in a preset corpus out of the word segment set, and taking the word segments remaining in the set after the stop words are filtered out as the word segments corresponding to the text information.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: after the label that satisfies the probability condition is selected according to the probability values, obtaining a corrected label, where the corrected label is used to update the label that the machine learning model output for the text information; and when the number of corrected labels reaches a first preset threshold, and/or when the training time interval of the semantic analysis training of the machine learning model reaches a second preset threshold, updating the machine learning model based on the corrected labels and the corresponding text information, and redetermining the label corresponding to the text information according to the updated machine learning model.
In one embodiment, when executed by the processor 401, the executable program 4021 implements: after the label that satisfies the probability condition is selected according to the probability values, obtaining preference information, where the preference information characterizes the preference for each piece of multimedia content having the same label; adjusting, according to the preference information, the label of the text information associated with each piece of multimedia content; and updating the machine learning model according to the text information and the corresponding adjusted label.
In summary, with at least one of the technical solutions provided above by the embodiments of the present invention, the text information associated with multimedia content is automatically analyzed by word segmentation, clustering and similar processing to obtain a first cluster result; the target feature words extracted from the first cluster result are input into a machine learning model to obtain probability values; the label that satisfies the probability condition is selected according to the probability values; and the selected label is associated with the multimedia content, which achieves the purpose of setting a label for the multimedia content. This avoids the subjective influence of manually set labels and allows labels to be set for multimedia content automatically, quickly and accurately. Moreover, the labels set for multimedia content by the embodiments of the present invention are independent of the interests and hobbies of any individual user and depend only on the text information associated with the multimedia content; the labels therefore better fit the needs of different users and greatly improve the user experience.
It should be understood by those skilled in the art that, the embodiment of the present invention can be provided as method, system or executable program Product.Therefore, the shape of the embodiment in terms of the present invention can use hardware embodiment, software implementation or combination software and hardware Formula.Moreover, the present invention can use the computer for wherein including computer usable program code in one or more to use storage The form for the executable program product that medium is implemented on (including but not limited to magnetic disk storage and optical memory etc.).
The present invention be with reference to according to the method for the embodiment of the present invention, the flow of equipment (system) and executable program product Figure and/or block diagram describe.It should be understood that it can be realized by executable program instructions every first-class in flowchart and/or the block diagram The combination of flow and/or square frame in journey and/or square frame and flowchart and/or the block diagram.These executable programs can be provided Instruct all-purpose computer, special purpose computer, Embedded Processor or with reference to programmable data processing device processor to produce A raw machine so that the instruction performed by computer or with reference to the processor of programmable data processing device is produced for real The device for the function of being specified in present one flow of flow chart or one square frame of multiple flows and/or block diagram or multiple square frames.
These executable program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, and the instruction means implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These executable program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (18)

  1. A method for setting a content label, characterized in that the method comprises:
    obtaining text information associated with multimedia content;
    performing word segmentation on the text information to obtain word segments;
    clustering the word segments to obtain a first clustering result, wherein the first clustering result comprises, for each cluster category, a word segment group composed of the word segments of that category;
    extracting target feature words from the first clustering result and inputting them into a machine learning model;
    obtaining probability values output by the machine learning model, wherein the machine learning model is obtained by semantic analysis training on samples comprising correspondences between text information and labels, and each probability value represents the probability that a respective target feature word serves as a label of the text information;
    selecting, according to the probability values, labels that meet a probability condition, and associating the selected labels with the multimedia content.
  2. The method for setting a content label according to claim 1, characterized in that before the clustering of the word segments to obtain the first clustering result, the method further comprises:
    classifying the word segments into word segments of each media type according to the different media types of the multimedia content;
    wherein the clustering of the word segments to obtain the first clustering result comprises:
    clustering the word segments of each media type to obtain the first clustering result.
  3. The method for setting a content label according to claim 1, characterized in that the extracting of target feature words from the first clustering result comprises:
    counting the frequency with which each word segment in the word segment group of each cluster category occurs across all cluster categories, and determining an importance value of each word segment across all cluster categories according to the frequency and a weight value of each word segment;
    selecting, from the determined importance values, the importance values that meet a degree condition, and determining the target feature words according to the word segments corresponding to the selected importance values.
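The frequency-and-weight step of this claim can be illustrated with a short sketch. The claim does not say how frequency and weight are combined or what the selection condition is, so this example assumes a simple product and a minimum-value cutoff; `importance_values`, `pick_feature_words`, `default_weight`, and `min_importance` are hypothetical names.

```python
from collections import Counter
from typing import Dict, List


def importance_values(clusters: Dict[str, List[str]],
                      weights: Dict[str, float],
                      default_weight: float = 1.0) -> Dict[str, float]:
    """Score each segment by (occurrences across all clusters) * (its weight)."""
    # Count how often each segment occurs across every cluster category.
    frequency = Counter(seg for group in clusters.values() for seg in group)
    # Assumed combination rule: frequency multiplied by the segment's weight.
    return {seg: count * weights.get(seg, default_weight)
            for seg, count in frequency.items()}


def pick_feature_words(importance: Dict[str, float],
                       min_importance: float) -> List[str]:
    """Keep segments whose importance meets the assumed cutoff."""
    return sorted(seg for seg, value in importance.items()
                  if value >= min_importance)


if __name__ == "__main__":
    clusters = {"c1": ["football", "match", "football"], "c2": ["match", "goal"]}
    scores = importance_values(clusters, weights={"football": 2.0})
    print(pick_feature_words(scores, min_importance=2.0))
```

Other combination rules, for example a TF-IDF style weighting, would fit the same two-function shape.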
  4. The method for setting a content label according to claim 3, characterized in that the determining of the target feature words according to the word segments corresponding to the selected importance values comprises:
    classifying the word segments corresponding to the selected importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type;
    determining the target feature words according to feature words that are selected from the feature word set of each media type and that characterize the text information of the media type to which they belong.
  5. The method for setting a content label according to claim 4, characterized in that the determining of the target feature words according to the feature words that are selected from the feature word set of each media type and that characterize the text information of the media type to which they belong comprises:
    selecting, from the feature word set of each media type, feature words that characterize the text information of the media type to which they belong;
    building a vector space model based on the feature vectors corresponding to the selected feature words;
    calculating, based on the vector space model, the similarity between the feature vectors, and clustering the selected feature words according to the calculated similarity to obtain a second clustering result, wherein the second clustering result comprises the feature words of each cluster category;
    extracting the target feature words from the feature words of each cluster category.
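A similarity-based grouping of feature vectors, as described in this claim, might look like the sketch below. The claim names neither the similarity measure nor the clustering algorithm, so cosine similarity and a greedy single-pass grouping against each cluster's first member are assumptions made for illustration; `cluster_feature_words` and `min_similarity` are hypothetical, and the vectors are assumed to be given.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_feature_words(vectors: Dict[str, Sequence[float]],
                          min_similarity: float = 0.8) -> List[List[str]]:
    """Greedily group words whose vectors are close to a cluster's first member."""
    clusters: List[List[str]] = []
    for word, vec in vectors.items():
        for cluster in clusters:
            # The first word of each cluster acts as its representative.
            if cosine_similarity(vec, vectors[cluster[0]]) >= min_similarity:
                cluster.append(word)
                break
        else:
            # No sufficiently similar cluster exists, so start a new one.
            clusters.append([word])
    return clusters


if __name__ == "__main__":
    demo = {"goal": [1.0, 0.0], "match": [0.9, 0.1], "singer": [0.0, 1.0]}
    print(cluster_feature_words(demo))
```

The greedy pass depends on input order; a production system would more likely use an established algorithm such as k-means or hierarchical clustering.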
  6. The method for setting a content label according to claim 1, characterized in that the performing of word segmentation on the text information to obtain word segments comprises:
    performing word segmentation on the text information to obtain a word segment set;
    filtering stop words out of the word segment set according to the stop words stored in a preset corpus, and taking the word segments remaining in the word segment set after the stop words are filtered out as the word segments corresponding to the text information.
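Stop-word filtering of this kind can be shown in a few lines. The sketch is illustrative only: whitespace splitting stands in for a real word segmenter (Chinese text would need a dedicated segmentation library), and the stop-word set passed in stands in for the preset corpus; `segment` and `remove_stop_words` are hypothetical names.

```python
from typing import Iterable, List, Set


def segment(text: str) -> List[str]:
    """Stand-in segmenter: lowercase the text and split on whitespace."""
    return text.lower().split()


def remove_stop_words(segments: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Drop every segment that appears in the stop-word set, keeping order."""
    return [seg for seg in segments if seg not in stop_words]


if __name__ == "__main__":
    stop_words = {"the", "of", "a"}
    print(remove_stop_words(segment("The hero of the story"), stop_words))
```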
  7. The method for setting a content label according to claim 1, characterized in that after the selecting, according to the probability values, of labels that meet the probability condition, the method further comprises:
    obtaining correction labels, wherein a correction label is used to update the label that is output by the machine learning model and corresponds to the text information;
    when the number of correction labels reaches a first preset threshold, and/or the time interval since semantic analysis training was performed on the machine learning model reaches a second preset threshold, updating the machine learning model based on the correction labels and the corresponding text information, and re-determining the label corresponding to the text information according to the updated machine learning model.
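The update trigger in this claim reduces to a check against two thresholds. The sketch below covers only the "or" reading of "and/or" and leaves the retraining itself out; `RetrainPolicy`, `max_corrections`, and `max_interval_seconds` are hypothetical names, and the threshold values are arbitrary examples.

```python
from dataclasses import dataclass


@dataclass
class RetrainPolicy:
    """Decides when accumulated corrections should trigger a model update."""
    max_corrections: int          # stands in for the first preset threshold
    max_interval_seconds: float   # stands in for the second preset threshold

    def should_update(self, correction_count: int,
                      seconds_since_training: float) -> bool:
        # Update when either threshold is reached (the "or" reading of "and/or").
        return (correction_count >= self.max_corrections
                or seconds_since_training >= self.max_interval_seconds)


if __name__ == "__main__":
    policy = RetrainPolicy(max_corrections=100, max_interval_seconds=86400.0)
    print(policy.should_update(correction_count=120, seconds_since_training=3600.0))
```

Requiring both conditions at once (the "and" reading) would replace `or` with `and` in `should_update`.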
  8. The method for setting a content label according to claim 7, characterized in that after the selecting, according to the probability values, of labels that meet the probability condition, the method further comprises:
    obtaining preference information, wherein the preference information characterizes the preference for multimedia content items having the same label;
    adjusting, according to the preference information, the labels of the text information associated with each multimedia content item;
    updating the machine learning model according to the text information and the corresponding adjusted labels.
  9. A device for setting a content label, characterized in that the device comprises an obtaining module, a word segmentation module, a clustering module, an extraction module, a generation module, and an association module, wherein:
    the obtaining module is configured to obtain text information associated with multimedia content;
    the word segmentation module is configured to perform word segmentation on the text information to obtain word segments;
    the clustering module is configured to cluster the word segments to obtain a first clustering result, wherein the first clustering result comprises, for each cluster category, a word segment group composed of the word segments of that category;
    the extraction module is configured to extract target feature words from the first clustering result and input them into a machine learning model;
    the obtaining module is further configured to obtain probability values output by the machine learning model, wherein the machine learning model is obtained by semantic analysis training on samples comprising correspondences between text information and labels, and each probability value represents the probability that a respective target feature word serves as a label of the text information;
    the generation module is configured to select, according to the probability values, labels that meet a probability condition;
    the association module is configured to associate the selected labels with the multimedia content.
  10. The device for setting a content label according to claim 9, characterized in that the device further comprises a classification module configured to, before the clustering module clusters the word segments to obtain the first clustering result, classify the word segments into word segments of each media type according to the different media types of the multimedia content;
    the clustering module is specifically configured to cluster the word segments of each media type to obtain the first clustering result.
  11. The device for setting a content label according to claim 9, characterized in that the extraction module is specifically configured to:
    count the frequency with which each word segment in the word segment group of each cluster category occurs across all cluster categories, and determine an importance value of each word segment across all cluster categories according to the frequency and a weight value of each word segment;
    select, from the determined importance values, the importance values that meet a degree condition, and determine the target feature words according to the word segments corresponding to the selected importance values.
  12. The device for setting a content label according to claim 11, characterized in that the extraction module is specifically configured to:
    classify the word segments corresponding to the selected importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type;
    determine the target feature words according to feature words that are selected from the feature word set of each media type and that characterize the text information of the media type to which they belong.
  13. The device for setting a content label according to claim 12, characterized in that the extraction module is specifically configured to:
    select, from the feature word set of each media type, feature words that characterize the text information of the media type to which they belong;
    build a vector space model based on the feature vectors corresponding to the selected feature words;
    calculate, based on the vector space model, the similarity between the feature vectors, and cluster the selected feature words according to the calculated similarity to obtain a second clustering result, wherein the second clustering result comprises the feature words of each cluster category;
    extract the target feature words from the feature words of each cluster category.
  14. The device for setting a content label according to claim 9, characterized in that the word segmentation module is specifically configured to:
    perform word segmentation on the text information to obtain a word segment set;
    filter stop words out of the word segment set according to the stop words stored in a preset corpus, and take the word segments remaining in the word segment set after the stop words are filtered out as the word segments corresponding to the text information.
  15. The device for setting a content label according to claim 9, characterized in that the obtaining module is further configured to obtain correction labels after the generation module selects, according to the probability values, the labels that meet the probability condition, wherein a correction label is used to update the label that is output by the machine learning model and corresponds to the text information;
    the device further comprises an update module configured to, when the number of correction labels reaches a first preset threshold, and/or the time interval since semantic analysis training was performed on the machine learning model reaches a second preset threshold, update the machine learning model based on the correction labels and the corresponding text information, and re-determine the label corresponding to the text information according to the updated machine learning model.
  16. The device for setting a content label according to claim 15, characterized in that the obtaining module is further configured to obtain preference information after the generation module selects, according to the probability values, the labels that meet the probability condition, wherein the preference information characterizes the preference for multimedia content items having the same label;
    the update module is further configured to adjust, according to the preference information, the labels of the text information associated with each multimedia content item, and to update the machine learning model according to the text information and the corresponding adjusted labels.
  17. A storage medium storing an executable program, characterized in that the executable program, when executed by a processor, implements the steps of the method for setting a content label according to any one of claims 1 to 8.
  18. A device for setting a content label, comprising a memory, a processor, and an executable program stored in the memory and runnable by the processor, characterized in that the processor, when running the executable program, performs the steps of the method for setting a content label according to any one of claims 1 to 8.
CN201711209262.9A 2017-11-27 2017-11-27 Method and device for setting content label and storage medium Active CN108009228B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711209262.9A CN108009228B (en) 2017-11-27 2017-11-27 Method and device for setting content label and storage medium

Publications (2)

Publication Number Publication Date
CN108009228A true CN108009228A (en) 2018-05-08
CN108009228B CN108009228B (en) 2020-10-09

Family

ID=62054132

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711209262.9A Active CN108009228B (en) 2017-11-27 2017-11-27 Method and device for setting content label and storage medium

Country Status (1)

Country Link
CN (1) CN108009228B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104111933A (en) * 2013-04-17 2014-10-22 阿里巴巴集团控股有限公司 Method and device for acquiring business object label and building training model
CN105243389A (en) * 2015-09-28 2016-01-13 北京橙鑫数据科技有限公司 Industry classification tag determining method and apparatus for company name
WO2017075939A1 (en) * 2015-11-06 2017-05-11 腾讯科技(深圳)有限公司 Method and device for recognizing image contents
CN107301199A (en) * 2017-05-17 2017-10-27 北京融数云途科技有限公司 A kind of data label generation method and device

Cited By (73)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108733786A (en) * 2018-05-11 2018-11-02 济南浪潮高新科技投资发展有限公司 A kind of method and apparatus for extracting effective information from html text
CN112384911A (en) * 2018-07-11 2021-02-19 株式会社东芝 Label applying device, label applying method, and program
CN109033082A (en) * 2018-07-19 2018-12-18 深圳创维数字技术有限公司 The learning training method, apparatus and computer readable storage medium of semantic model
CN109033082B (en) * 2018-07-19 2022-06-10 深圳创维数字技术有限公司 Learning training method and device of semantic model and computer readable storage medium
CN109241281A (en) * 2018-08-01 2019-01-18 百度在线网络技术(北京)有限公司 Software failure reason generation method, device and equipment
CN109241281B (en) * 2018-08-01 2022-09-23 百度在线网络技术(北京)有限公司 Software failure reason generation method, device and equipment
CN110019563A (en) * 2018-08-09 2019-07-16 北京首钢自动化信息技术有限公司 A kind of portrait modeling method and device based on multidimensional data
CN109145260A (en) * 2018-08-24 2019-01-04 北京科技大学 A kind of text information extraction method
US11798278B2 (en) 2018-09-03 2023-10-24 Tencent Technology (Shenzhen) Company Limited Method, apparatus, and storage medium for classifying multimedia resource
CN109299315A (en) * 2018-09-03 2019-02-01 腾讯科技(深圳)有限公司 Multimedia resource classification method, device, computer equipment and storage medium
CN109447105A (en) * 2018-09-10 2019-03-08 平安科技(深圳)有限公司 Contract audit method, apparatus, computer equipment and storage medium
CN109344253A (en) * 2018-09-18 2019-02-15 平安科技(深圳)有限公司 Add method, apparatus, computer equipment and the storage medium of user tag
CN109271502A (en) * 2018-09-25 2019-01-25 武汉大学 A kind of classifying method and device of the space querying theme based on natural language processing
CN109255066B (en) * 2018-09-30 2021-11-09 武汉斗鱼网络科技有限公司 Label marking method, device, server and storage medium for business object
CN109255066A (en) * 2018-09-30 2019-01-22 武汉斗鱼网络科技有限公司 A kind of label labeling method, device, server and the storage medium of business object
CN111104545A (en) * 2018-10-26 2020-05-05 阿里巴巴集团控股有限公司 Background music configuration method and equipment, client device and electronic equipment
CN111222328A (en) * 2018-11-26 2020-06-02 百度在线网络技术(北京)有限公司 Label extraction method and device and electronic equipment
CN111222328B (en) * 2018-11-26 2023-06-16 百度在线网络技术(北京)有限公司 Label extraction method and device and electronic equipment
CN111369029A (en) * 2018-12-06 2020-07-03 北京嘀嘀无限科技发展有限公司 Service selection prediction method, device, electronic equipment and storage medium
CN111435596A (en) * 2019-01-14 2020-07-21 珠海格力电器股份有限公司 Method and device for adjusting running state of target equipment, storage medium and electronic device
CN111435596B (en) * 2019-01-14 2024-01-30 珠海格力电器股份有限公司 Method and device for adjusting running state of target equipment, storage medium and electronic device
CN111475603A (en) * 2019-01-23 2020-07-31 百度在线网络技术(北京)有限公司 Enterprise identifier identification method and device, computer equipment and storage medium
CN109933662A (en) * 2019-02-15 2019-06-25 北京奇艺世纪科技有限公司 Model training method, information generating method, device, electronic equipment and computer-readable medium
CN109871447A (en) * 2019-03-05 2019-06-11 南京甄视智能科技有限公司 Clustering method, computer program product and the server system of Chinese comment unsupervised learning
CN110070143B (en) * 2019-04-29 2021-07-16 北京达佳互联信息技术有限公司 Method, device and equipment for acquiring training data and storage medium
CN110070143A (en) * 2019-04-29 2019-07-30 北京达佳互联信息技术有限公司 Obtain method, apparatus, equipment and the storage medium of training data
CN110413837A (en) * 2019-05-30 2019-11-05 腾讯科技(深圳)有限公司 Video recommendation method and device
CN110413837B (en) * 2019-05-30 2023-07-25 腾讯科技(深圳)有限公司 Video recommendation method and device
CN110196948A (en) * 2019-06-10 2019-09-03 北京金山安全软件有限公司 Content recommendation method and device, computer equipment and storage medium
CN110390002A (en) * 2019-06-18 2019-10-29 深圳壹账通智能科技有限公司 Call resource allocation method, device, computer readable storage medium and server
CN110321435A (en) * 2019-06-28 2019-10-11 京东数字科技控股有限公司 A kind of data source division methods, device, equipment and storage medium
CN110413787B (en) * 2019-07-26 2023-07-21 腾讯科技(深圳)有限公司 Text clustering method, device, terminal and storage medium
CN110413787A (en) * 2019-07-26 2019-11-05 腾讯科技(深圳)有限公司 Text Clustering Method, device, terminal and storage medium
CN110442767B (en) * 2019-07-31 2023-08-18 腾讯科技(深圳)有限公司 Method and device for determining content interaction platform label and readable storage medium
CN110442767A (en) * 2019-07-31 2019-11-12 腾讯科技(深圳)有限公司 A kind of method, apparatus and readable storage medium storing program for executing of determining content interaction platform label
CN112447173A (en) * 2019-08-16 2021-03-05 阿里巴巴集团控股有限公司 Voice interaction method and device and computer storage medium
CN110717326A (en) * 2019-09-17 2020-01-21 平安科技(深圳)有限公司 Text information author identification method and device based on machine learning
CN110717326B (en) * 2019-09-17 2022-12-23 平安科技(深圳)有限公司 Text information author identification method and device based on machine learning
CN112580329A (en) * 2019-09-30 2021-03-30 北京国双科技有限公司 Text noise data identification method and device, computer equipment and storage medium
CN112580329B (en) * 2019-09-30 2024-02-20 北京国双科技有限公司 Text noise data identification method, device, computer equipment and storage medium
CN110765778B (en) * 2019-10-23 2023-08-29 北京锐安科技有限公司 Label entity processing method, device, computer equipment and storage medium
CN110765778A (en) * 2019-10-23 2020-02-07 北京锐安科技有限公司 Label entity processing method and device, computer equipment and storage medium
CN111078885B (en) * 2019-12-18 2023-04-07 腾讯科技(深圳)有限公司 Label classification method, related device, equipment and storage medium
CN111078885A (en) * 2019-12-18 2020-04-28 腾讯科技(深圳)有限公司 Label classification method, related device, equipment and storage medium
CN111125460A (en) * 2019-12-24 2020-05-08 腾讯科技(深圳)有限公司 Information recommendation method and device
CN111191003B (en) * 2019-12-26 2023-04-18 东软集团股份有限公司 Method and device for determining text association type, storage medium and electronic equipment
CN111191003A (en) * 2019-12-26 2020-05-22 东软集团股份有限公司 Method and device for determining text association type, storage medium and electronic equipment
CN111291688B (en) * 2020-02-12 2023-07-14 咪咕文化科技有限公司 Video tag acquisition method and device
CN111291688A (en) * 2020-02-12 2020-06-16 咪咕文化科技有限公司 Video tag obtaining method and device
CN111339304A (en) * 2020-03-16 2020-06-26 闪捷信息科技有限公司 Text data automatic classification method based on machine learning
CN113111174A (en) * 2020-04-28 2021-07-13 北京明亿科技有限公司 Group identification method, device, equipment and medium based on deep learning model
CN111625716B (en) * 2020-05-12 2023-10-31 聚好看科技股份有限公司 Media asset recommendation method, server and display device
CN111625716A (en) * 2020-05-12 2020-09-04 聚好看科技股份有限公司 Media asset recommendation method, server and display device
CN111680156A (en) * 2020-05-25 2020-09-18 中国工商银行股份有限公司 Data multi-label classification method and system
CN111680156B (en) * 2020-05-25 2024-02-09 中国工商银行股份有限公司 Data multi-label classification method and system
CN111711869B (en) * 2020-06-24 2022-05-17 腾讯科技(深圳)有限公司 Label data processing method and device and computer readable storage medium
CN111711869A (en) * 2020-06-24 2020-09-25 腾讯科技(深圳)有限公司 Label data processing method and device and computer readable storage medium
CN111666452A (en) * 2020-07-09 2020-09-15 腾讯科技(深圳)有限公司 Method and device for clustering videos
CN112131511A (en) * 2020-09-29 2020-12-25 中国银行股份有限公司 Method and device for displaying negotiation information in matching activities
CN112579738A (en) * 2020-12-23 2021-03-30 广州博冠信息科技有限公司 Target object label processing method, device, equipment and storage medium
CN112612888A (en) * 2020-12-25 2021-04-06 航天信息股份有限公司 Method and system for intelligently clustering text files
CN112612888B (en) * 2020-12-25 2023-06-16 航天信息股份有限公司 Method and system for intelligent clustering of text files
CN113157851A (en) * 2021-02-23 2021-07-23 北京三快在线科技有限公司 Category information generation method and device, electronic equipment and computer readable medium
CN113221533A (en) * 2021-04-29 2021-08-06 支付宝(杭州)信息技术有限公司 Experience sound label extraction method, device and equipment
CN113221533B (en) * 2021-04-29 2024-07-05 支付宝(杭州)信息技术有限公司 Label extraction method, device and equipment for experience sound
CN115599903A (en) * 2021-07-07 2023-01-13 腾讯科技(深圳)有限公司(Cn) Object tag obtaining method and device, electronic equipment and storage medium
CN115599903B (en) * 2021-07-07 2024-06-04 腾讯科技(深圳)有限公司 Object tag acquisition method and device, electronic equipment and storage medium
CN113806542A (en) * 2021-09-18 2021-12-17 上海幻电信息科技有限公司 Text analysis method and system
CN113806542B (en) * 2021-09-18 2024-05-17 上海幻电信息科技有限公司 Text analysis method and system
CN115271851B (en) * 2022-07-04 2023-10-10 天翼爱音乐文化科技有限公司 Video color ring recommending method, system, electronic equipment and storage medium
CN115271851A (en) * 2022-07-04 2022-11-01 天翼爱音乐文化科技有限公司 Video color ring recommendation method, system, electronic equipment and storage medium
CN116912845A (en) * 2023-06-16 2023-10-20 广东电网有限责任公司佛山供电局 Intelligent content identification and analysis method and device based on NLP and AI
CN116912845B (en) * 2023-06-16 2024-03-19 广东电网有限责任公司佛山供电局 Intelligent content identification and analysis method and device based on NLP and AI

Also Published As

Publication number Publication date
CN108009228B (en) 2020-10-09

Similar Documents

Publication Publication Date Title
CN108009228A (en) Content tag setting method, device and storage medium
CN112784130B (en) Twin network model training and measuring method, device, medium and equipment
CN107818781B (en) Intelligent interaction method, equipment and storage medium
CN108509465A (en) Video data recommendation method, apparatus and server
CN108984530A (en) Detection method and detection system for network sensitive content
CN104199822B (en) Method and system for identifying the demand classification corresponding to a search
CN110097085A (en) Lyrics document creation method, training method, device, server and storage medium
CN108182279A (en) Object classification method, device and computer equipment based on text feature
CN109271493A (en) Language text processing method, device and storage medium
CN109408665A (en) Information recommendation method and device, and storage medium
CN110134792B (en) Text recognition method and device, electronic equipment and storage medium
CN111831802B (en) Urban domain knowledge detection system and method based on LDA topic model
CN111783468B (en) Text processing method, device, equipment and medium
CN103365867A (en) Method and device for emotion analysis of user evaluation
CN106105096A (en) System and method for continuous social communication
CN111046225B (en) Audio resource processing method, device, equipment and storage medium
CN103984741A (en) Method and system for extracting user attribute information
CN103810162A (en) Method and system for recommending network information
CN110851650B (en) Comment output method and device and computer storage medium
CN109271550A (en) Personalized music classification and recommendation method based on deep learning
CN110309114A (en) Media information processing method, device, storage medium and electronic device
CN111523324A (en) Training method and device for named entity recognition model
CN106528538A (en) Method and device for intelligent emotion recognition
CN103631874A (en) UGC label classification determining method and device for social platform
CN107273546A (en) Counterfeit application detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: No.88-1, Yurun street, Jianye District, Nanjing City, Jiangsu Province, 210000

Patentee after: MIGU INTERACTIVE ENTERTAINMENT Co.,Ltd.

Patentee after: CHINA MOBILE COMMUNICATIONS GROUP Co.,Ltd.

Address before: No.88-1, Yurun street, Jianye District, Nanjing City, Jiangsu Province, 210000

Patentee before: MIGU INTERACTIVE ENTERTAINMENT Co.,Ltd.

Patentee before: China Mobile Communications Corp.