CN108009228A - Method, device and storage medium for setting content labels
- Publication number
- CN108009228A, CN201711209262.9A
- Authority
- CN
- China
- Prior art keywords
- label
- text information
- content
- word segment
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/40—Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Multimedia (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for setting content labels, comprising: obtaining text information associated with multimedia content; segmenting the text information to obtain word segments; clustering the word segments to obtain a first clustering result, wherein the first clustering result includes, for each cluster category, a word segment group composed of the word segments of that category; extracting target feature words from the first clustering result and inputting them into a machine learning model; obtaining the probability values output by the machine learning model, wherein the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information; and, according to the probability values, selecting the labels that satisfy a probability condition and associating the selected labels with the multimedia content. The invention also discloses a device and a storage medium for setting content labels.
Description
Technical field
The present invention relates to data processing technology in the field of artificial intelligence, and in particular to a method, device and storage medium for setting content labels.
Background
With the development of Internet technology, people can browse or watch a wide variety of multimedia content over the network. Most current multimedia content websites, such as video websites, use labels to categorize the multimedia content they provide. A label is a keyword that is strongly correlated with the multimedia content; labels briefly describe and classify the content so that users can retrieve or search for the multimedia content they are interested in.
At present, labels are generally set for multimedia content as follows: users manually set labels according to their own interests and preferences. Because this approach relies on users setting labels by hand, the workload is heavy and the efficiency low when a large amount of multimedia content needs labels. In addition, this approach depends too heavily on each user's subjective understanding, so different users may set different, personalized labels for the same multimedia content. If multimedia content is recommended to other users according to the labels set by one user, the recommendation may deviate considerably; that is, the labels set by that user do not suit everyone and have low applicability. For recommendation scenarios that target different users, labels set in this way have low accuracy.
The related art currently offers no effective solution for setting labels for multimedia content quickly and accurately.
Summary of the invention
In view of this, embodiments of the present invention are intended to provide a method, device and storage medium for setting content labels, so as to solve the problem that the related art cannot effectively set labels for multimedia content quickly and accurately.
To achieve the above objective, the technical solutions of the embodiments of the present invention are implemented as follows:
In a first aspect, an embodiment of the present invention provides a method for setting content labels, the method comprising:
obtaining text information associated with multimedia content;
segmenting the text information to obtain word segments;
clustering the word segments to obtain a first clustering result, wherein the first clustering result includes, for each cluster category, a word segment group composed of the word segments of that category;
extracting target feature words from the first clustering result and inputting them into a machine learning model;
obtaining the probability values output by the machine learning model, wherein the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information;
according to the probability values, selecting the labels that satisfy a probability condition, and associating the selected labels with the multimedia content.
In a second aspect, an embodiment of the present invention provides a device for setting content labels, the device comprising an acquisition module, a word segmentation module, a clustering module, an extraction module, a generation module and an association module, wherein:
the acquisition module is configured to obtain text information associated with multimedia content;
the word segmentation module is configured to segment the text information to obtain word segments;
the clustering module is configured to cluster the word segments to obtain a first clustering result, wherein the first clustering result includes, for each cluster category, a word segment group composed of the word segments of that category;
the extraction module is configured to extract target feature words from the first clustering result and input them into a machine learning model;
the acquisition module is further configured to obtain the probability values output by the machine learning model, wherein the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information;
the generation module is configured to select, according to the probability values, the labels that satisfy a probability condition;
the association module is configured to associate the selected labels with the multimedia content.
In a third aspect, an embodiment of the present invention provides a storage medium storing an executable program which, when executed by a processor, implements the steps of the method for setting content labels provided by the embodiments of the present invention.
In a fourth aspect, an embodiment of the present invention provides a device for setting content labels, comprising a memory, a processor and an executable program that is stored in the memory and can be run by the processor, wherein the processor performs the steps of the method for setting content labels provided by the embodiments of the present invention when it runs the executable program.
With at least one of the technical solutions provided above, the text information associated with multimedia content is automatically analyzed, through segmentation, clustering and similar processing, to obtain a first clustering result; the target feature words extracted from the first clustering result are input into a machine learning model to obtain probability values; and, according to the probability values, labels that satisfy a probability condition are selected and associated with the multimedia content, thereby setting labels for the multimedia content. This avoids the subjective influence of manually set labels and allows labels to be set for multimedia content automatically, quickly and accurately. Moreover, the labels set by the embodiments of the present invention are independent of any user's own interests and preferences and depend only on the text information associated with the multimedia content, so they better fit the needs of different users and greatly improve the user experience.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a method for setting content labels provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of the functional structure of a device for setting content labels provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of the functional structure of another device for setting content labels provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of the hardware structure of a device for setting content labels provided by an embodiment of the present invention.
Detailed description
To provide a fuller understanding of the features and technical content of the embodiments of the present invention, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the present invention.
Before the embodiments are described in further detail, the nouns and terms involved in them are explained; the following explanations apply to these nouns and terms.
1) Word segmentation, also called word cutting, refers to dividing the characters in text information into independent words according to a given segmentation strategy.
2) Stop words are words that are filtered out of text information because they do not affect the classification decision for the text information. Stop words usually have no definite meaning of their own (they play a role only within a complete sentence); examples include function words such as pronouns, articles, numerals, modal particles, adverbs, prepositions and conjunctions.
3) Target feature words are words, extracted from the words remaining after the text information is segmented and stop words are filtered out, that can represent the multimedia content associated with the text information.
4) A vector space model is built by mapping the feature words extracted from the feature word set of each media type to corresponding word vectors and combining these into a feature space vector.
Fig. 1 is a schematic flowchart of a method for setting content labels provided by an embodiment of the present invention; the method is applied to a terminal device. As shown in Fig. 1, the implementation flow of the method may include the following steps:
Step 101: Obtain the text information associated with multimedia content.
In this embodiment, the terminal device may include, but is not limited to, computer equipment such as a smartphone, a tablet computer or a palmtop computer. The multimedia content may include, but is not limited to, media forms such as video content (for example, images), audio content (for example, music) and text content (for example, novels). The multimedia content may be obtained in at least one of the following ways: it may be an image, picture or song uploaded by a user, or a video indexed and collected from a specific website such as a video website.
Here, the text information associated with the multimedia content refers to information that describes the multimedia content, such as the title, synopsis, author and type of the content.
Step 102: Segment the text information to obtain word segments.
In this embodiment, the computer equipment calls a Chinese word segmentation service to segment all of the text information and obtain the word segments corresponding to it. Word segmentation here can be understood as using a segmenter to divide the text sequence formed by a piece of text information into independent word segments. Specifically, according to the compositional features of Chinese words and the characteristics of English words and phrases, an existing or new segmentation method can be used to cut the continuous text string into several word segments. For example, if the text information is "今天的天气太热了" ("the weather today is too hot"), the word segments obtained are "今天" (today), "的" (possessive particle), "天气" (weather), "太" (too), "热" (hot) and "了" (sentence-final particle).
Here, text information written in Chinese can be segmented with string-matching segmentation methods, such as forward maximum matching, reverse maximum matching, N-gram, shortest-path segmentation, improved maximum matching and bidirectional maximum matching. Forward maximum matching takes several consecutive characters from the text to be segmented, scanning from left to right, and matches them against a vocabulary; if a match is found, a word segment is cut off. Improved maximum matching keeps the core idea of forward maximum matching and adds the ambiguity detection and resolution that forward maximum matching lacks, which improves segmentation accuracy while keeping segmentation speed essentially unchanged. The embodiments of the present invention do not limit which of these methods is used to segment the text information.
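As an illustration of forward maximum matching, the following is a minimal Python sketch and not the patent's implementation. The function name, the `max_len` window size, the toy vocabulary and the single-character fallback are all assumptions made for this example, and the sample sentence is a reconstruction of the translated example.

```python
def forward_max_match(text, vocabulary, max_len=4):
    """Segment text left to right, always taking the longest vocabulary match."""
    segments = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink the window.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Assumption: a single character is accepted when nothing matches.
            if size == 1 or candidate in vocabulary:
                segments.append(candidate)
                i += size
                break
    return segments


# Hypothetical toy vocabulary; a real segmenter would load a full dictionary.
vocab = {"今天", "天气", "的", "太", "热", "了"}
segments = forward_max_match("今天的天气太热了", vocab)
print(segments)
```

The window shrinks from `max_len` down to one character, so longer dictionary words always win over their prefixes; this greedy choice is the source of the ambiguity that the improved variant is said to address.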
In this embodiment, step 102 specifically includes: segmenting the text information to obtain a word segment set;
filtering the stop words out of the word segment set according to the stop words stored in a preset corpus, and taking the word segments remaining in the set as the word segments corresponding to the text information.
Put simply, the stop words here are words, such as modal particles and auxiliary words, that have no substantive effect on determining content labels; that is, stop words have no definite meaning. For example, if the text information is "今天的天气太热了" ("the weather today is too hot"), the word segment set obtained by segmentation is "今天", "的", "天气", "太", "热" and "了". According to the stop words stored in the preset corpus, it can be determined that "的", "太" and "了" in the set are stop words. These are filtered out, leaving the word segments "今天" (today), "天气" (weather) and "热" (hot). The combination of the remaining word segments still conveys the meaning of the text information. Filtering stop words out of the word segment set limits the length of the remaining word segment set, which improves filtering accuracy and helps make the subsequent label setting more efficient.
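The stop-word filtering described above can be sketched in a few lines of Python. The function name and the in-memory stop-word set are assumptions for illustration; the patent only states that stop words are stored in a preset corpus.

```python
def remove_stop_words(segments, stop_words):
    """Keep only the segments that are not listed as stop words."""
    return [segment for segment in segments if segment not in stop_words]


# Hypothetical stop-word list standing in for the preset corpus.
stop_words = {"的", "太", "了"}
word_set = ["今天", "的", "天气", "太", "热", "了"]
remaining = remove_stop_words(word_set, stop_words)
print(remaining)
```

Keeping the stop words in a set makes each membership check constant time, which matters when the volume of text information is large.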
Step 103: Cluster the word segments to obtain a first clustering result, wherein the first clustering result includes, for each cluster category, a word segment group composed of the word segments of that category.
In this embodiment, clustering can be understood as measuring the semantic similarity between the word segments of the text information and grouping the semantically closest word segments into one category. For example, clustering can group words that express emotion, such as "like", "love" and "pity", into the same word segment group. Because the volume of text information associated with multimedia content obtained in this embodiment is large, clustering yields word segment groups for different cluster categories.
Here, an existing or new clustering algorithm, such as a partition-based clustering algorithm (K-means) or a model-based clustering algorithm (self-organizing map, SOM), can be used to cluster the word segments and obtain the first clustering result. The semantic similarity between word segments can be computed with methods such as Euclidean distance or cosine similarity, which are not described in detail here. Preferably, the embodiments of the present invention cluster the word segments with the SOM-based algorithm, which achieves higher clustering similarity.
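The two similarity measures named above can be written as follows. This is a sketch under the assumption that each word segment has already been mapped to a numeric vector; the patent does not specify how those vectors are produced, and the sample vectors are invented for the example.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (smaller is closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (larger is closer)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Assumption: a zero vector is treated as having no similarity to anything.
    return dot / norm if norm else 0.0


# Hypothetical two-dimensional vectors for two word segments.
like_vec, love_vec = [1.0, 0.0], [1.0, 0.0]
print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))
print(cosine_similarity(like_vec, love_vec))
```

Euclidean distance is sensitive to vector length while cosine similarity compares direction only, so the choice affects which segments end up in the same cluster.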
In this embodiment, if the multimedia content described in step 101 includes multimedia content of different media types, then before step 103 is performed the method may further include:
classifying the word segments by media type, according to the different media types of the multimedia content.
Correspondingly, step 103 specifically includes: clustering the word segments of each media type to obtain the first clustering result.
In this embodiment, classifying the word segments by media type can be implemented as follows: according to the different media types of the multimedia content, an existing or new text classification model, such as a maximum entropy model or a decision tree model, is used to classify each word segment. Specifically, the probability that each word segment belongs to each media type is calculated, and the media type with the highest probability is taken as the media type to which the word segment belongs.
For example, the probabilities that the word segments "rock", "jazz" and "music style" belong to each media type are calculated. Comparison shows that these word segments are more likely to belong to the music type than to any other media type, so "rock", "jazz" and "music style" are classified as word segments of the music type. The media types of multimedia content may include, but are not limited to, video, music and novels.
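The "highest probability wins" rule can be sketched as below. The per-type probabilities are invented for illustration; in practice they would come from a trained text classification model such as the maximum entropy or decision tree models mentioned above, which this sketch does not implement.

```python
def assign_media_type(type_probabilities):
    """Return the media type with the highest probability for one word segment."""
    return max(type_probabilities, key=type_probabilities.get)


# Hypothetical classifier outputs: segment -> probability per media type.
probabilities = {
    "rock": {"video": 0.15, "music": 0.80, "novel": 0.05},
    "jazz": {"video": 0.10, "music": 0.85, "novel": 0.05},
    "music style": {"video": 0.20, "music": 0.70, "novel": 0.10},
}

# Group the segments under the media type each one was assigned to.
segments_by_type = {}
for segment, probs in probabilities.items():
    segments_by_type.setdefault(assign_media_type(probs), []).append(segment)
print(segments_by_type)
```

Grouping the segments by assigned type produces exactly the per-type input that the subsequent clustering step operates on.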
Here, clustering the word segments of each media type means clustering the word segments under each media type separately. For example, all word segments of the music type are clustered, and all word segments of the video type are clustered as well. In this way, the text information associated with the multimedia content is segmented, classified and clustered before the feature words that characterize it are extracted, which increases the correlation between the labels finally set and the multimedia content and improves the accuracy of label setting.
Step 104: Extract target feature words from the first clustering result and input them into the machine learning model.
In this embodiment, step 104 specifically includes: counting the frequency with which each word segment in the word segment group of each cluster category appears across all cluster categories, and determining the importance value of each word segment across all cluster categories according to that frequency and the weight value of the word segment;
selecting, from the determined importance values, the importance values that satisfy a degree condition, and determining the target feature words according to the word segments corresponding to the selected importance values.
Specifically, to screen the word segments in the first clustering result and reduce the number of feature words that characterize the text information of the same multimedia content, the existing term frequency-inverse document frequency (TF-IDF) feature selection method can be used to assess how important a word segment is to a file set or to one file in a corpus. In general, the importance of a word segment increases in proportion to the number of times it appears in the file. In this embodiment, the importance of a word segment is determined from two factors, frequency and weight: the frequency with which each word segment in the word segment group of each cluster category appears across all cluster categories is counted and multiplied by the computed weight of that word segment in the text information associated with the multimedia content, giving the importance value of the word segment across all cluster categories. The importance values are then sorted, in ascending or descending order, and the importance values that satisfy the degree condition are selected based on the ranking; that is, the maximum importance value is selected, and the target feature word is determined from the corresponding word segment. The weight values can be computed automatically by the computer equipment, and different word segments may have different weight values.
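The frequency-times-weight calculation can be sketched as follows. The cluster contents and weight values are made up for this example, and the default weight of 1.0 for unlisted segments is an assumption; the patent only says that weights are computed automatically and does not give a formula.

```python
from collections import Counter


def importance_values(clusters, weights):
    """Multiply each segment's frequency across all clusters by its weight."""
    frequency = Counter(seg for segs in clusters.values() for seg in segs)
    # Assumption: segments without an explicit weight default to 1.0.
    return {seg: count * weights.get(seg, 1.0) for seg, count in frequency.items()}


# Hypothetical first clustering result: cluster category -> word segments.
clusters = {"emotion": ["love", "like", "love"], "genre": ["rock", "love"]}
# Hypothetical weights of each segment in the associated text information.
weights = {"love": 0.5, "like": 1.0, "rock": 2.0}

importance = importance_values(clusters, weights)
# Rank in descending order and take the segment with the maximum importance.
ranked = sorted(importance, key=importance.get, reverse=True)
target_feature_word = ranked[0]
print(importance, target_feature_word)
```

In this toy data the most frequent segment does not win, because its weight is low; this is the effect of combining the two factors rather than ranking on frequency alone.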
In this embodiment, suppose the word segments were not classified before being clustered and the multimedia content described in step 101 includes multimedia content of different media types. Determining the target feature words according to the word segments corresponding to the selected importance values may then specifically include:
classifying the word segments corresponding to the selected importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type;
determining the target feature words according to feature words that are selected from the feature word set of each media type and that characterize the text information of that media type.
Here, for multimedia content of different media types, the selection of feature words that characterize the text information of a media type from its feature word set is generally divided into local feature selection and global feature selection. The computer equipment can decide automatically, according to the media type, whether to use local or global feature selection; in either case the aim is to extract from the feature word set of each media type the feature words that best express the core content of the text information. For example, when the multimedia content is a film, one or several word segments in the film's synopsis may best capture its subject, so the target feature words can be selected from the synopsis; this is local feature selection. When the multimedia content is a song, no single piece of information captures the subject of the song, so global features of the song, such as lyrics, style and author, must be combined to select the target feature words; this is global feature selection. In general, selecting target feature words with local feature selection is more efficient.
Here, the word segments corresponding to the selected importance values serve as candidate feature words. The candidate feature words are classified according to the media types of the multimedia content, such as video, music and novels, to obtain the feature word set of each media type. Feature words that characterize the core content of the text information are selected from the feature word set of each media type, and the target feature words are determined from them. Combining local and global feature selection in this way reduces the dimensionality of the candidate feature words, which reduces the number of candidate feature words the machine learning model must process and greatly improves processing efficiency. Dimensionality reduction here means reducing the overall number of candidate feature words.
In this embodiment, determining the target feature words according to the feature words that are selected from the feature word set of each media type and that characterize the text information of that media type specifically includes:
selecting, from the feature word set of each media type, the feature words that characterize the text information of that media type;
building a vector space model based on the feature vectors corresponding to the selected feature words;
calculating, based on the vector space model, the similarity between the feature vectors, and clustering the selected feature words according to the similarity results to obtain a second clustering result, wherein the second clustering result includes the feature words of each cluster category;
extracting the target feature words from the feature words of each cluster category.
In this embodiment, the local or global feature selection described above can be used, according to the media types of the multimedia content, to extract from the feature word set of each media type the feature words that best express the core content of the text information. The selected feature words are then represented as vectors: each is mapped to a corresponding word vector, and the word vectors are combined into a feature space vector, from which the vector space model is built. The similarity between feature vectors can be calculated with existing methods such as Euclidean distance or cosine similarity, which are not described in detail here. An existing or new clustering algorithm, such as a K-means-based or SOM-based algorithm, can be used to cluster the selected feature words, and the target feature words are extracted from the feature words of each cluster category. Clustering the feature words in this way further improves the accuracy of the labels set for the multimedia content.
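To make the clustering of feature vectors concrete, here is a small K-means sketch in plain Python using Euclidean distance. The word vectors are hand-made placeholders, the centroids are seeded from the first `k` vectors for determinism, and the fixed iteration count is an assumption; the patent prefers an SOM-based algorithm, which is not shown here.

```python
import math


def kmeans(vectors, k, iterations=10):
    """Plain K-means; returns the cluster index assigned to each vector."""
    # Assumption: the first k vectors seed the centroids.
    centroids = [list(v) for v in vectors[:k]]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach each vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment


# Hypothetical feature words and their word vectors.
feature_words = ["rock", "jazz", "comedy", "thriller"]
word_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
assignment = kmeans(word_vectors, k=2)
print(dict(zip(feature_words, assignment)))
```

Seeding from the first vectors keeps the example reproducible; a production system would more likely use randomized or k-means++ initialization and a convergence check instead of a fixed iteration count.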
Step 105: Obtain the probability values output by the machine learning model, wherein the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text information.
In this embodiment, the vector representation of each target feature word input into the machine learning model is transformed, and the transformed result is output as the probability that the word serves as a label of the text information; this yields a probability value for each target feature word. Specifically, the vector representation of each input target feature word is transformed by the activation functions of the different nodes in the machine learning model, and the result of the transformation is taken as the vector representation of a label together with its corresponding probability.
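The patent does not name the activation function that turns the transformed vectors into probabilities. One common choice for producing a probability per candidate is softmax, shown here purely as an assumed stand-in; the raw scores are invented and do not come from any trained model.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for three target feature words.
words = ["rock", "jazz", "love"]
raw_scores = [2.0, 1.0, 0.1]
label_probabilities = dict(zip(words, softmax(raw_scores)))
print(label_probabilities)
```

Softmax preserves the ranking of the raw scores, so the word with the highest score also receives the highest probability of serving as the label.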
Here, the machine learning model is trained on semantic-analysis training data from the field of natural language learning. The machine learning model includes correspondences between text information and labels set by operators; the model is trained with these correspondences as samples, and the probability values it outputs each represent the probability that the corresponding target feature word serves as a label of the text information.
Here, the text information in the machine learning model may be words commonly used in the professional domain of each media type (specialized words for short). These words can be obtained by web crawling and manual entry. Specifically, a crawler is configured for a professional website to crawl the specialized words of the corresponding domain; for example, specialized words related to video, such as "reality TV show", are crawled from the Douban video website, and the crawled words are then added to the machine learning model by manual entry. In this way the text information in the machine learning model can be updated promptly, so that the model suits different professional domains and the labels set for multimedia content are more accurate.
Step 106: According to the probability values, choose a label that meets the probability condition, and associate the chosen label with the multimedia content.
Here, the label that meets the probability condition can be the label with the highest probability of serving as the label of the text information. That is, the label with the highest probability is chosen from the probability values output by the machine learning model. After the label that meets the probability condition is chosen, an association between the chosen label and the multimedia content is established, so that the multimedia content corresponding to a label can be found quickly through the association.
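The selection and association in step 106 can be sketched as follows. This is only an illustration under stated assumptions: the in-memory dictionary `label_index`, the helper names, and the content identifier `video-001` are hypothetical and stand in for whatever storage an implementation actually uses.

```python
from collections import defaultdict

def choose_label(probabilities):
    """Return the candidate label with the highest probability."""
    return max(probabilities, key=probabilities.get)

# Hypothetical association store: label -> ids of the multimedia content it tags.
label_index = defaultdict(set)

def associate(content_id, probabilities):
    """Choose the most probable label and record its association with the content."""
    label = choose_label(probabilities)
    label_index[label].add(content_id)
    return label

chosen = associate("video-001", {"reality show": 0.62, "variety": 0.30, "interview": 0.08})
```

Looking up `label_index[chosen]` then returns the multimedia content associated with that label, which is the quick lookup the association is meant to support.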
In this embodiment, to make the labels set for the multimedia content more accurate, if operation personnel correct a label during the semantic analysis, the corrected label needs to be synchronized back to the machine learning model, and a label is then set for the multimedia content again.
Specifically, after the label that meets the probability condition is chosen according to the probability values in step 106, the method can also include:
Obtaining a corrected label, where the corrected label is used to update the label, output by the machine learning model, that corresponds to the text information;
When the number of corrected labels reaches a first preset threshold, and/or the training time interval for semantic-analysis training of the machine learning model reaches a second preset threshold, updating the machine learning model based on the corrected labels and the corresponding text information, and re-determining the label corresponding to the text information according to the updated machine learning model.
It should be noted that updating the machine learning model when the number of corrected labels reaches the first preset threshold, and/or when the training time interval for semantic-analysis training of the machine learning model reaches the second preset threshold, ensures that the samples of the machine learning model are updated in time within an effective period, so that the effect of resetting labels for the multimedia content is more noticeable.
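The update trigger described above can be sketched as a simple predicate. The sketch assumes the "or" reading of "and/or", that is, either condition alone triggers an update; the function name, parameter names, and threshold values are hypothetical.

```python
def should_update_model(corrected_count, seconds_since_training,
                        count_threshold, interval_threshold):
    """Return True when either update condition is met (the assumed 'or' reading)."""
    enough_corrections = corrected_count >= count_threshold
    interval_elapsed = seconds_since_training >= interval_threshold
    return enough_corrections or interval_elapsed

# Hypothetical thresholds: 100 corrected labels, or one day since the last training.
update_now = should_update_model(120, 3600, 100, 86400)
```

An implementation that requires both conditions would replace `or` with `and`; the text allows either combination.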
In embodiments of the present invention, after the label that meets the probability condition is chosen according to the probability values in step 106, the method can also include:
Obtaining preference information, where the preference information characterizes the preference for each item of multimedia content with the same label;
Adjusting, according to the preference information, the label of the text information associated with each item of multimedia content;
Updating the machine learning model according to the text information and the corresponding adjusted label.
Here, the preference information can be a user's feedback on each item of multimedia content with the same label, such as like or dislike. The user's preference information is fed back in real time to a big data platform, such as a computer device. The computer device adjusts the label of the text information associated with each item of multimedia content according to the preference information, and then, according to the text information and the corresponding adjusted label, updates the correspondences between text information and labels in the samples of the machine learning model.
For example, a certain user group A likes multimedia content with the same label T1, so multimedia content C1A, C1B, C1C, C1D and C1E under label T1 is recommended to user group A. It is then found that user group A often browses or watches C1A, C1B and C1C, but seldom browses or watches C1D and C1E, and instead likes multimedia content C2F under label T2. The multimedia content with label T1 then changes accordingly, that is, the multimedia content of T1 becomes C1A, C1B, C1C and C2F.
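The example above can be reproduced with a short sketch. The text does not specify how "often" and "seldom" are measured, so the view counts, the `min_views` cut-off, and the function name below are assumptions for illustration only.

```python
def adjust_label_contents(label_contents, label, view_counts, min_views, liked_elsewhere):
    """Return a copy of label_contents with one label's content list adjusted."""
    # Keep content the group actually browses or watches under this label.
    kept = [c for c in label_contents[label] if view_counts.get(c, 0) >= min_views]
    # Add content the group likes under other labels.
    added = [c for c in liked_elsewhere if c not in kept]
    updated = dict(label_contents)
    updated[label] = kept + added
    return updated

label_contents = {"T1": ["C1A", "C1B", "C1C", "C1D", "C1E"], "T2": ["C2F"]}
view_counts = {"C1A": 40, "C1B": 35, "C1C": 28, "C1D": 1, "C1E": 0}  # hypothetical counts
adjusted = adjust_label_contents(label_contents, "T1", view_counts, 10, ["C2F"])
```

With these assumed counts, `adjusted["T1"]` contains C1A, C1B, C1C and C2F, matching the outcome described for user group A. The sketch leaves label T2 unchanged because the text does not say whether C2F is also removed from T2.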
With the technical solution of the embodiment of the present invention, the text information associated with the multimedia content is analyzed, for example by segmentation and clustering, to obtain the first clustering result. The target feature words extracted from the first clustering result are input into the machine learning model to obtain the probability values. According to the probability values, the label that meets the probability condition is chosen and associated with the multimedia content, thereby setting a label for the multimedia content. In this way, the subjective influence of manually set labels is avoided, labels can be set for multimedia content automatically, quickly and accurately, and the labels that are set depend only on the text information associated with the multimedia content, so they better fit the needs of different users and greatly improve the user experience.
To implement the above content label setting method, the embodiment of the present invention also provides a content label setting device. The content label setting device is applied to a terminal device, for example a computer device such as a smartphone, tablet computer or palmtop computer. Fig. 2 is a schematic diagram of the functional structure of a content label setting device provided in an embodiment of the present invention. As shown in Fig. 2, the content label setting device includes an acquisition module 201, a word segmentation module 202, a clustering module 203, an extraction module 204, a generation module 205 and an association module 206, wherein:
The acquisition module 201 is used to obtain the text information associated with the multimedia content;
The word segmentation module 202 is used to segment the text information to obtain word segments;
The clustering module 203 is used to cluster the word segments to obtain a first clustering result, where the first clustering result includes word segment groups formed by the word segments of each cluster category;
The extraction module 204 is used to extract target feature words from the first clustering result and input them into a machine learning model;
The acquisition module 201 is also used to obtain the probability values output by the machine learning model, where the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that a respective target feature word serves as the label of the text information;
The generation module 205 is used to choose, according to the probability values, a label that meets the probability condition;
The association module 206 is used to associate the chosen label with the multimedia content.
In this embodiment, the extraction module 204 is specifically used to:
Count the frequency with which each word segment in the word segment group of each cluster category occurs across all cluster categories, and determine the importance value of each word segment across all cluster categories according to the frequency and the weight value of each word segment;
Choose, from the determined importance values, the importance values that meet the degree condition, and determine the target feature words according to the word segments corresponding to the chosen importance values.
In this embodiment, determining the target feature words according to the word segments corresponding to the chosen importance values can be implemented as follows: classify the word segments corresponding to the chosen importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type;
Determine the target feature words according to the feature words, chosen from the feature word set of each media type, that characterize the text information of the media type to which they belong.
In this embodiment, determining the target feature words according to the feature words, chosen from the feature word set of each media type, that characterize the text information of the media type to which they belong can be implemented as follows: choose, from the feature word set of each media type, the feature words that characterize the text information of the media type to which they belong;
Build a vector space model based on the feature vectors corresponding to the chosen feature words;
Based on the vector space model, calculate the similarity between the feature vectors, and cluster the chosen feature words according to the calculated similarity to obtain a second clustering result, where the second clustering result includes the feature words of each cluster category;
Extract the target feature words from the feature words of each cluster category.
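The similarity calculation and second clustering can be sketched as follows. The text names neither the similarity measure nor the clustering algorithm, so cosine similarity, the greedy single-pass grouping, the threshold, and the two-dimensional example vectors are all assumptions chosen to keep the sketch short.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def cluster_words(vectors, threshold):
    """Greedily group words whose vectors are similar to a cluster's first word."""
    clusters = []  # each cluster is a list of words; its first word is the representative
    for word, vec in vectors.items():
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(word)
                break
        else:
            clusters.append([word])  # no similar cluster found, start a new one
    return clusters

# Hypothetical feature vectors for three chosen feature words.
vectors = {"comedy": [1.0, 0.1], "sitcom": [0.9, 0.2], "football": [0.1, 1.0]}
second_clustering = cluster_words(vectors, 0.9)
```

Here "comedy" and "sitcom" fall into one cluster category and "football" into another; a target feature word could then be taken from each category, for example its representative.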
In this embodiment, the word segmentation module 202 is specifically used to:
Segment the text information to obtain a word segment set;
Filter the stop words out of the word segment set according to the stop words stored in a preset corpus, and take the word segments remaining in the word segment set after the stop words are filtered out as the word segments corresponding to the text information.
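The segmentation and stop-word filtering can be sketched as follows. Real Chinese text would need a dedicated word segmenter; the whitespace split used here is only a stand-in, and the stop-word set and example sentence are hypothetical.

```python
def segment(text):
    """Stand-in segmenter: lowercase the text and split it on whitespace."""
    return text.lower().split()

def remove_stop_words(segments, stop_words):
    """Drop every segment that appears in the stop-word set, keeping the order."""
    return [seg for seg in segments if seg not in stop_words]

# Hypothetical stop words, as they might be stored in a preset corpus.
stop_words = {"the", "of", "a"}
segments = remove_stop_words(segment("The host of the reality show"), stop_words)
```

The remaining entries in `segments` are the word segments passed on to clustering.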
As one embodiment, Fig. 3 is a schematic diagram of the functional structure of another content label setting device provided in an embodiment of the present invention. As shown in Fig. 3, the content label setting device further includes a classification module 207, used to classify the word segments into word segments of each media type according to the different media types of the multimedia content, before the clustering module 203 clusters the word segments to obtain the first clustering result;
The clustering module 203 is specifically used to cluster the word segments of each media type to obtain the first clustering result.
In this embodiment, as one embodiment, the acquisition module 201 is also used to obtain a corrected label after the generation module 205 chooses, according to the probability values, the label that meets the probability condition, where the corrected label is used to update the label, output by the machine learning model, that corresponds to the text information;
The device further includes an update module 208, used to update the machine learning model based on the corrected labels and the corresponding text information when the number of corrected labels reaches a first preset threshold, and/or the training time interval for semantic-analysis training of the machine learning model reaches a second preset threshold, and to re-determine the label corresponding to the text information according to the updated machine learning model.
As another embodiment, the acquisition module 201 is also used to obtain preference information after the generation module 205 chooses, according to the probability values, the label that meets the probability condition, where the preference information characterizes the preference for each item of multimedia content with the same label;
The update module 208 is also used to adjust, according to the preference information, the label of the text information associated with each item of multimedia content, and to update the machine learning model according to the text information and the corresponding adjusted label.
It should be noted that, when the content label setting device provided by the above embodiment sets a content label, the division into the above program modules is only an example. In practical applications, the above processing can be assigned to different program modules as needed, that is, the internal structure of the device can be divided into different program modules to complete all or part of the processing described above. In addition, the content label setting device provided by the above embodiment and the embodiment of the content label setting method belong to the same concept. For the specific implementation process, refer to the method embodiment, which is not repeated here.
In practical applications, the word segmentation module 202, clustering module 203, extraction module 204, generation module 205, association module 206, classification module 207 and update module 208 can be implemented by a central processing unit (CPU, Central Processing Unit), microprocessor (MPU, Micro Processor Unit), digital signal processor (DSP, Digital Signal Processor) or field programmable gate array (FPGA, Field Programmable Gate Array) located on the computer device. The acquisition module 201 can be implemented by a communication module (including a basic communication suite, an operating system, a communication module, standard interfaces and protocols, etc.) and a transceiver antenna.
To implement the above content label setting method, the embodiment of the present invention also provides a hardware structure of a content label setting device. The content label setting device that implements the embodiment of the present invention is now described with reference to the drawings. The content label setting device can be implemented as a terminal device, for example a computer device such as a smartphone, tablet computer or palmtop computer. The hardware structure of the content label setting device provided in an embodiment of the present invention is further described below. It will be understood that Fig. 4 shows only an example structure of the content label setting device rather than its entire structure, and part or all of the structure shown in Fig. 4 can be implemented as needed.
Referring to Fig. 4, Fig. 4 is a schematic diagram of the hardware structure of a content label setting device provided in an embodiment of the present invention, which in practical applications can be applied to the aforementioned terminal device that runs application programs. The content label setting device 400 shown in Fig. 4 includes at least one processor 401, a memory 402, a user interface 403 and at least one network interface 404. The components in the content label setting device 400 are coupled together by a bus system 405. It can be understood that the bus system 405 is used to implement connection and communication between these components. In addition to a data bus, the bus system 405 also includes a power bus, a control bus and a status signal bus. For clarity of illustration, however, all the buses are denoted as the bus system 405 in Fig. 4.
The user interface 403 can include a display, keyboard, mouse, trackball, click wheel, keys, buttons, touchpad or touch screen, etc.
It can be understood that the memory 402 can be a volatile memory or a non-volatile memory, and can also include both volatile and non-volatile memory.
The memory 402 in the embodiment of the present invention is used to store various types of data to support the operation of the content label setting device 400. Examples of such data include any computer program that operates on the content label setting device 400, such as the executable program 4021 and the operating system 4022. The program that implements the content label setting method of the embodiment of the present invention may be contained in the executable program 4021.
The content label setting method disclosed in the embodiment of the present invention can be applied in the processor 401, or implemented by the processor 401. The processor 401 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above content label setting method can be completed by an integrated logic circuit of hardware in the processor 401 or by instructions in the form of software. The processor 401 can be a general-purpose processor, a DSP, or another programmable logic device, discrete gate or transistor logic device, discrete hardware component, etc. The processor 401 can implement or perform the content label setting methods, steps and logic diagrams provided in the embodiment of the present invention. The general-purpose processor can be a microprocessor or any conventional processor. The steps of the content label setting method provided by the embodiment of the present invention can be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module can be located in a storage medium, the storage medium is located in the memory 402, and the processor 401 reads the information in the memory 402 and completes the steps of the foregoing content label setting method in combination with its hardware.
The embodiment of the present invention also provides a hardware structure of a content label setting device. The content label setting device 400 includes a memory 402, a processor 401, and an executable program 4021 that is stored on the memory 402 and can be run by the processor 401. When running the executable program 4021, the processor 401 implements: obtaining text information associated with multimedia content; segmenting the text information to obtain word segments; clustering the word segments to obtain a first clustering result, where the first clustering result includes word segment groups formed by the word segments of each cluster category; extracting target feature words from the first clustering result and inputting them into a machine learning model; obtaining the probability values output by the machine learning model, where the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that a respective target feature word serves as the label of the text information; and, according to the probability values, choosing a label that meets the probability condition and associating the chosen label with the multimedia content.
As one embodiment, when running the executable program 4021, the processor 401 implements: before the word segments are clustered to obtain the first clustering result, classifying the word segments into word segments of each media type according to the different media types of the multimedia content.
As one embodiment, when running the executable program 4021, the processor 401 implements: clustering the word segments of each media type to obtain the first clustering result.
As one embodiment, when running the executable program 4021, the processor 401 implements: counting the frequency with which each word segment in the word segment group of each cluster category occurs across all cluster categories, and determining the importance value of each word segment across all cluster categories according to the frequency and the weight value of each word segment; and choosing, from the determined importance values, the importance values that meet the degree condition, and determining the target feature words according to the word segments corresponding to the chosen importance values.
As one embodiment, when running the executable program 4021, the processor 401 implements: classifying the word segments corresponding to the chosen importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type; and determining the target feature words according to the feature words, chosen from the feature word set of each media type, that characterize the text information of the media type to which they belong.
As one embodiment, when running the executable program 4021, the processor 401 implements: choosing, from the feature word set of each media type, the feature words that characterize the text information of the media type to which they belong; building a vector space model based on the feature vectors corresponding to the chosen feature words; based on the vector space model, calculating the similarity between the feature vectors, and clustering the chosen feature words according to the calculated similarity to obtain a second clustering result, where the second clustering result includes the feature words of each cluster category; and extracting the target feature words from the feature words of each cluster category.
As one embodiment, when running the executable program 4021, the processor 401 implements: segmenting the text information to obtain a word segment set; and filtering the stop words out of the word segment set according to the stop words stored in a preset corpus, and taking the word segments remaining in the word segment set after the stop words are filtered out as the word segments corresponding to the text information.
As one embodiment, when running the executable program 4021, the processor 401 implements: after the label that meets the probability condition is chosen according to the probability values, obtaining a corrected label, where the corrected label is used to update the label, output by the machine learning model, that corresponds to the text information; and, when the number of corrected labels reaches a first preset threshold, and/or the training time interval for semantic-analysis training of the machine learning model reaches a second preset threshold, updating the machine learning model based on the corrected labels and the corresponding text information, and re-determining the label corresponding to the text information according to the updated machine learning model.
As one embodiment, when running the executable program 4021, the processor 401 implements: after the label that meets the probability condition is chosen according to the probability values, obtaining preference information, where the preference information characterizes the preference for each item of multimedia content with the same label; adjusting, according to the preference information, the label of the text information associated with each item of multimedia content; and updating the machine learning model according to the text information and the corresponding adjusted label.
The embodiment of the present invention also provides a storage medium. The storage medium can be a storage medium such as an optical disc, flash memory or magnetic disk, and is optionally a non-transitory storage medium. An executable program 4021 is stored on the storage medium, and when executed by the processor 401, the executable program 4021 implements: obtaining text information associated with multimedia content; segmenting the text information to obtain word segments; clustering the word segments to obtain a first clustering result, where the first clustering result includes word segment groups formed by the word segments of each cluster category; extracting target feature words from the first clustering result and inputting them into a machine learning model; obtaining the probability values output by the machine learning model, where the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text information and labels, and each probability value represents the probability that a respective target feature word serves as the label of the text information; and, according to the probability values, choosing a label that meets the probability condition and associating the chosen label with the multimedia content.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: before the word segments are clustered to obtain the first clustering result, classifying the word segments into word segments of each media type according to the different media types of the multimedia content.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: clustering the word segments of each media type to obtain the first clustering result.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: counting the frequency with which each word segment in the word segment group of each cluster category occurs across all cluster categories, and determining the importance value of each word segment across all cluster categories according to the frequency and the weight value of each word segment; and choosing, from the determined importance values, the importance values that meet the degree condition, and determining the target feature words according to the word segments corresponding to the chosen importance values.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: classifying the word segments corresponding to the chosen importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type; and determining the target feature words according to the feature words, chosen from the feature word set of each media type, that characterize the text information of the media type to which they belong.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: choosing, from the feature word set of each media type, the feature words that characterize the text information of the media type to which they belong; building a vector space model based on the feature vectors corresponding to the chosen feature words; based on the vector space model, calculating the similarity between the feature vectors, and clustering the chosen feature words according to the calculated similarity to obtain a second clustering result, where the second clustering result includes the feature words of each cluster category; and extracting the target feature words from the feature words of each cluster category.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: segmenting the text information to obtain a word segment set; and filtering the stop words out of the word segment set according to the stop words stored in a preset corpus, and taking the word segments remaining in the word segment set after the stop words are filtered out as the word segments corresponding to the text information.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: after the label that meets the probability condition is chosen according to the probability values, obtaining a corrected label, where the corrected label is used to update the label, output by the machine learning model, that corresponds to the text information; and, when the number of corrected labels reaches a first preset threshold, and/or the training time interval for semantic-analysis training of the machine learning model reaches a second preset threshold, updating the machine learning model based on the corrected labels and the corresponding text information, and re-determining the label corresponding to the text information according to the updated machine learning model.
As one embodiment, when executed by the processor 401, the executable program 4021 implements: after the label that meets the probability condition is chosen according to the probability values, obtaining preference information, where the preference information characterizes the preference for each item of multimedia content with the same label; adjusting, according to the preference information, the label of the text information associated with each item of multimedia content; and updating the machine learning model according to the text information and the corresponding adjusted label.
In conclusion, with at least one of the technical solutions provided above by the embodiment of the present invention, the text information associated with the multimedia content can be analyzed automatically, for example by segmentation and clustering, to obtain the first clustering result. The target feature words extracted from the first clustering result are input into the machine learning model to obtain the probability values. According to the probability values, the label that meets the probability condition is chosen and associated with the multimedia content, thereby setting a label for the multimedia content. In this way, the subjective influence of manually set labels is avoided, and labels can be set for multimedia content automatically, quickly and accurately. Moreover, the labels that the embodiment of the present invention sets for the multimedia content are unrelated to the user's own interests and hobbies and depend only on the text information associated with the multimedia content. The labels therefore better fit the needs of different users and greatly improve the user experience.
Those skilled in the art should understand that the embodiment of the present invention can be provided as a method, a system or an executable program product. Therefore, the present invention can take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present invention can take the form of an executable program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and executable program product according to the embodiment of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by executable program instructions. These executable program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These executable program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture that includes an instruction device, the instruction device implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These executable program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (18)
- 1. A method for setting a content label, characterized in that the method comprises: obtaining a text message associated with multimedia content; segmenting the text message to obtain participle fragments; clustering the participle fragments to obtain a first cluster result, wherein the first cluster result includes participle fragment groups, each composed of the participle fragments of one cluster category; extracting target feature words from the first cluster result and inputting them into a machine learning model; obtaining each probability value output by the machine learning model, wherein the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text messages and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text message; and selecting, according to the probability values, the labels that meet a probability condition, and associating the selected labels with the multimedia content.
- 2. The method for setting a content label according to claim 1, characterized in that, before clustering the participle fragments to obtain the first cluster result, the method further comprises: classifying the participle fragments by media type according to the different media types of the multimedia content; and clustering the participle fragments to obtain the first cluster result comprises: clustering the participle fragments of each media type to obtain the first cluster result.
- 3. The method for setting a content label according to claim 1, characterized in that extracting target feature words from the first cluster result comprises: counting the frequency with which each participle fragment in the participle fragment group of each cluster category occurs across all cluster categories, and determining, according to the frequency and the weight value of each participle fragment, an importance value of each participle fragment across all cluster categories; and selecting, from the determined importance values, the importance values that meet a matching degree condition, and determining the target feature words according to the participle fragments corresponding to the selected importance values.
- 4. The method for setting a content label according to claim 3, characterized in that determining the target feature words according to the participle fragments corresponding to the selected importance values comprises: classifying the participle fragments corresponding to the selected importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type; and determining the target feature words according to the feature words that are selected from the feature word set of each media type and used to characterize the text message of that media type.
- 5. The method for setting a content label according to claim 4, characterized in that determining the target feature words according to the feature words that are selected from the feature word set of each media type and used to characterize the text message of that media type comprises: selecting, from the feature word set of each media type, the feature words used to characterize the text message of that media type; building a vector space model based on the feature vectors corresponding to the selected feature words; calculating, based on the vector space model, the similarity between the feature vectors, and clustering the selected feature words according to the calculated similarity to obtain a second cluster result, wherein the second cluster result includes the feature words of each cluster category; and extracting the target feature words from the feature words of each cluster category.
- 6. The method for setting a content label according to claim 1, characterized in that segmenting the text message to obtain participle fragments comprises: segmenting the text message to obtain a participle fragment set; and filtering the stop words out of the participle fragment set according to the stop words stored in a preset corpus, and taking the participle fragments that remain in the participle fragment set after the stop words are filtered out as the participle fragments corresponding to the text message.
- 7. The method for setting a content label according to claim 1, characterized in that, after selecting, according to the probability values, the labels that meet the probability condition, the method further comprises: obtaining modifying labels, a modifying label being used to update the label, output by the machine learning model, that corresponds to the text message; and when the number of modifying labels reaches a first preset threshold, and/or the training time interval of the semantic-analysis training of the machine learning model reaches a second preset threshold, updating the machine learning model based on the modifying labels and the corresponding text messages, and re-determining the label corresponding to the text message according to the updated machine learning model.
- 8. The method for setting a content label according to claim 7, characterized in that, after selecting, according to the probability values, the labels that meet the probability condition, the method further comprises: obtaining preference information, the preference information being used to characterize the preference for multimedia content items that have the same label; adjusting, according to the preference information, the labels of the text messages associated with the multimedia content items; and updating the machine learning model according to the text messages and the corresponding adjusted labels.
- 9. A device for setting a content label, characterized in that the device comprises an acquisition module, a word segmentation module, a cluster module, an extraction module, a generation module, and an association module, wherein: the acquisition module is configured to obtain a text message associated with multimedia content; the word segmentation module is configured to segment the text message to obtain participle fragments; the cluster module is configured to cluster the participle fragments to obtain a first cluster result, wherein the first cluster result includes participle fragment groups, each composed of the participle fragments of one cluster category; the extraction module is configured to extract target feature words from the first cluster result and input them into a machine learning model; the acquisition module is further configured to obtain each probability value output by the machine learning model, wherein the machine learning model is obtained by semantic-analysis training on samples that include correspondences between text messages and labels, and each probability value represents the probability that the corresponding target feature word serves as a label of the text message; the generation module is configured to select, according to the probability values, the labels that meet a probability condition; and the association module is configured to associate the selected labels with the multimedia content.
- 10. The device for setting a content label according to claim 9, characterized in that the device further comprises a classification module, configured to classify, before the cluster module clusters the participle fragments to obtain the first cluster result, the participle fragments by media type according to the different media types of the multimedia content; and the cluster module is specifically configured to cluster the participle fragments of each media type to obtain the first cluster result.
- 11. The device for setting a content label according to claim 9, characterized in that the extraction module is specifically configured to: count the frequency with which each participle fragment in the participle fragment group of each cluster category occurs across all cluster categories, and determine, according to the frequency and the weight value of each participle fragment, an importance value of each participle fragment across all cluster categories; and select, from the determined importance values, the importance values that meet a matching degree condition, and determine the target feature words according to the participle fragments corresponding to the selected importance values.
- 12. The device for setting a content label according to claim 11, characterized in that the extraction module is specifically configured to: classify the participle fragments corresponding to the selected importance values according to the different media types of the multimedia content, to obtain a feature word set for each media type; and determine the target feature words according to the feature words that are selected from the feature word set of each media type and used to characterize the text message of that media type.
- 13. The device for setting a content label according to claim 12, characterized in that the extraction module is specifically configured to: select, from the feature word set of each media type, the feature words used to characterize the text message of that media type; build a vector space model based on the feature vectors corresponding to the selected feature words; calculate, based on the vector space model, the similarity between the feature vectors, and cluster the selected feature words according to the calculated similarity to obtain a second cluster result, wherein the second cluster result includes the feature words of each cluster category; and extract the target feature words from the feature words of each cluster category.
- 14. The device for setting a content label according to claim 9, characterized in that the word segmentation module is specifically configured to: segment the text message to obtain a participle fragment set; and filter the stop words out of the participle fragment set according to the stop words stored in a preset corpus, and take the participle fragments that remain in the participle fragment set after the stop words are filtered out as the participle fragments corresponding to the text message.
- 15. The device for setting a content label according to claim 9, characterized in that the acquisition module is further configured to obtain modifying labels after the generation module selects, according to the probability values, the labels that meet the probability condition, a modifying label being used to update the label, output by the machine learning model, that corresponds to the text message; and the device further comprises an update module, configured to, when the number of modifying labels reaches a first preset threshold, and/or the training time interval of the semantic-analysis training of the machine learning model reaches a second preset threshold, update the machine learning model based on the modifying labels and the corresponding text messages, and re-determine the label corresponding to the text message according to the updated machine learning model.
- 16. The device for setting a content label according to claim 15, characterized in that the acquisition module is further configured to obtain preference information after the generation module selects, according to the probability values, the labels that meet the probability condition, the preference information being used to characterize the preference for multimedia content items that have the same label; and the update module is further configured to adjust, according to the preference information, the labels of the text messages associated with the multimedia content items, and to update the machine learning model according to the text messages and the corresponding adjusted labels.
- 17. A storage medium storing an executable program, characterized in that the executable program, when executed by a processor, implements the steps of the method for setting a content label according to any one of claims 1 to 8.
- 18. A device for setting a content label, comprising a memory, a processor, and an executable program stored in the memory and runnable by the processor, characterized in that the processor, when running the executable program, performs the steps of the method for setting a content label according to any one of claims 1 to 8.
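For readers who want a concrete picture of two steps recited in the claims, the sketch below illustrates (a) selecting labels from the probability values output by the model, and (b) deciding when to update the model from modifying labels. It is an informal illustration, not part of the claims. Several details are assumptions: the probability condition is taken to be a simple minimum threshold, the "and/or" trigger in claim 7 is read as a logical OR, and the names `select_labels` and `RetrainPolicy`, the units, and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


def select_labels(probabilities, threshold=0.5):
    """Return the feature words whose probability meets the condition.

    Assumed condition: probability >= threshold. Results are ordered by
    descending probability, with ties broken alphabetically.
    """
    chosen = [(w, p) for w, p in probabilities.items() if p >= threshold]
    return [w for w, _ in sorted(chosen, key=lambda wp: (-wp[1], wp[0]))]


@dataclass
class RetrainPolicy:
    """Decides when the model should be updated (assumed OR semantics)."""

    min_modifying_labels: int   # first preset threshold
    min_interval_s: float       # second preset threshold, in seconds

    def should_update(self, n_modifying_labels, seconds_since_training):
        return (n_modifying_labels >= self.min_modifying_labels
                or seconds_since_training >= self.min_interval_s)


# Hypothetical model output: target feature word -> probability of being a label.
probs = {"goal": 0.9, "match": 0.4, "striker": 0.6}
labels = select_labels(probs, threshold=0.5)

policy = RetrainPolicy(min_modifying_labels=10, min_interval_s=3600.0)
update_now = policy.should_update(n_modifying_labels=3, seconds_since_training=60.0)
```

A stricter reading of the trigger would require both thresholds to be reached; swapping `or` for `and` in `should_update` gives that variant.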
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711209262.9A CN108009228B (en) | 2017-11-27 | 2017-11-27 | Method and device for setting content label and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108009228A (en) | 2018-05-08 |
CN108009228B (en) | 2020-10-09 |
Family
ID=62054132
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711209262.9A, Active, CN108009228B (en) | 2017-11-27 | 2017-11-27 | Method and device for setting content label and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108009228B (en) |
Cited By (49)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108733786A (en) * | 2018-05-11 | 2018-11-02 | 济南浪潮高新科技投资发展有限公司 | A kind of method and apparatus for extracting effective information from html text |
CN109033082A (en) * | 2018-07-19 | 2018-12-18 | 深圳创维数字技术有限公司 | The learning training method, apparatus and computer readable storage medium of semantic model |
CN109145260A (en) * | 2018-08-24 | 2019-01-04 | 北京科技大学 | A kind of text information extraction method |
CN109241281A (en) * | 2018-08-01 | 2019-01-18 | 百度在线网络技术(北京)有限公司 | Software failure reason generation method, device and equipment |
CN109255066A (en) * | 2018-09-30 | 2019-01-22 | 武汉斗鱼网络科技有限公司 | A kind of label labeling method, device, server and the storage medium of business object |
CN109271502A (en) * | 2018-09-25 | 2019-01-25 | 武汉大学 | A kind of classifying method and device of the space querying theme based on natural language processing |
CN109299315A (en) * | 2018-09-03 | 2019-02-01 | 腾讯科技(深圳)有限公司 | Multimedia resource classification method, device, computer equipment and storage medium |
CN109344253A (en) * | 2018-09-18 | 2019-02-15 | 平安科技(深圳)有限公司 | Add method, apparatus, computer equipment and the storage medium of user tag |
CN109447105A (en) * | 2018-09-10 | 2019-03-08 | 平安科技(深圳)有限公司 | Contract audit method, apparatus, computer equipment and storage medium |
CN109871447A (en) * | 2019-03-05 | 2019-06-11 | 南京甄视智能科技有限公司 | Clustering method, computer program product and the server system of Chinese comment unsupervised learning |
CN109933662A (en) * | 2019-02-15 | 2019-06-25 | 北京奇艺世纪科技有限公司 | Model training method, information generating method, device, electronic equipment and computer-readable medium |
CN110019563A (en) * | 2018-08-09 | 2019-07-16 | 北京首钢自动化信息技术有限公司 | A kind of portrait modeling method and device based on multidimensional data |
CN110070143A (en) * | 2019-04-29 | 2019-07-30 | 北京达佳互联信息技术有限公司 | Obtain method, apparatus, equipment and the storage medium of training data |
CN110196948A (en) * | 2019-06-10 | 2019-09-03 | 北京金山安全软件有限公司 | Content recommendation method and device, computer equipment and storage medium |
CN110321435A (en) * | 2019-06-28 | 2019-10-11 | 京东数字科技控股有限公司 | A kind of data source division methods, device, equipment and storage medium |
CN110390002A (en) * | 2019-06-18 | 2019-10-29 | 深圳壹账通智能科技有限公司 | Call resource allocation method, device, computer readable storage medium and server |
CN110413837A (en) * | 2019-05-30 | 2019-11-05 | 腾讯科技(深圳)有限公司 | Video recommendation method and device |
CN110413787A (en) * | 2019-07-26 | 2019-11-05 | 腾讯科技(深圳)有限公司 | Text Clustering Method, device, terminal and storage medium |
CN110442767A (en) * | 2019-07-31 | 2019-11-12 | 腾讯科技(深圳)有限公司 | A kind of method, apparatus and readable storage medium storing program for executing of determining content interaction platform label |
CN110717326A (en) * | 2019-09-17 | 2020-01-21 | 平安科技(深圳)有限公司 | Text information author identification method and device based on machine learning |
CN110765778A (en) * | 2019-10-23 | 2020-02-07 | 北京锐安科技有限公司 | Label entity processing method and device, computer equipment and storage medium |
CN111078885A (en) * | 2019-12-18 | 2020-04-28 | 腾讯科技(深圳)有限公司 | Label classification method, related device, equipment and storage medium |
CN111104545A (en) * | 2018-10-26 | 2020-05-05 | 阿里巴巴集团控股有限公司 | Background music configuration method and equipment, client device and electronic equipment |
CN111125460A (en) * | 2019-12-24 | 2020-05-08 | 腾讯科技(深圳)有限公司 | Information recommendation method and device |
CN111191003A (en) * | 2019-12-26 | 2020-05-22 | 东软集团股份有限公司 | Method and device for determining text association type, storage medium and electronic equipment |
CN111222328A (en) * | 2018-11-26 | 2020-06-02 | 百度在线网络技术(北京)有限公司 | Label extraction method and device and electronic equipment |
CN111291688A (en) * | 2020-02-12 | 2020-06-16 | 咪咕文化科技有限公司 | Video tag obtaining method and device |
CN111339304A (en) * | 2020-03-16 | 2020-06-26 | 闪捷信息科技有限公司 | Text data automatic classification method based on machine learning |
CN111369029A (en) * | 2018-12-06 | 2020-07-03 | 北京嘀嘀无限科技发展有限公司 | Service selection prediction method, device, electronic equipment and storage medium |
CN111435596A (en) * | 2019-01-14 | 2020-07-21 | 珠海格力电器股份有限公司 | Method and device for adjusting running state of target equipment, storage medium and electronic device |
CN111475603A (en) * | 2019-01-23 | 2020-07-31 | 百度在线网络技术(北京)有限公司 | Enterprise identifier identification method and device, computer equipment and storage medium |
CN111625716A (en) * | 2020-05-12 | 2020-09-04 | 聚好看科技股份有限公司 | Media asset recommendation method, server and display device |
CN111666452A (en) * | 2020-07-09 | 2020-09-15 | 腾讯科技(深圳)有限公司 | Method and device for clustering videos |
CN111680156A (en) * | 2020-05-25 | 2020-09-18 | 中国工商银行股份有限公司 | Data multi-label classification method and system |
CN111711869A (en) * | 2020-06-24 | 2020-09-25 | 腾讯科技(深圳)有限公司 | Label data processing method and device and computer readable storage medium |
CN112131511A (en) * | 2020-09-29 | 2020-12-25 | 中国银行股份有限公司 | Method and device for displaying negotiation information in matching activities |
CN112384911A (en) * | 2018-07-11 | 2021-02-19 | 株式会社东芝 | Label applying device, label applying method, and program |
CN112447173A (en) * | 2019-08-16 | 2021-03-05 | 阿里巴巴集团控股有限公司 | Voice interaction method and device and computer storage medium |
CN112580329A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Text noise data identification method and device, computer equipment and storage medium |
CN112579738A (en) * | 2020-12-23 | 2021-03-30 | 广州博冠信息科技有限公司 | Target object label processing method, device, equipment and storage medium |
CN112612888A (en) * | 2020-12-25 | 2021-04-06 | 航天信息股份有限公司 | Method and system for intelligently clustering text files |
CN113111174A (en) * | 2020-04-28 | 2021-07-13 | 北京明亿科技有限公司 | Group identification method, device, equipment and medium based on deep learning model |
CN113157851A (en) * | 2021-02-23 | 2021-07-23 | 北京三快在线科技有限公司 | Category information generation method and device, electronic equipment and computer readable medium |
CN113221533A (en) * | 2021-04-29 | 2021-08-06 | 支付宝(杭州)信息技术有限公司 | Experience sound label extraction method, device and equipment |
CN113806542A (en) * | 2021-09-18 | 2021-12-17 | 上海幻电信息科技有限公司 | Text analysis method and system |
CN115271851A (en) * | 2022-07-04 | 2022-11-01 | 天翼爱音乐文化科技有限公司 | Video color ring recommendation method, system, electronic equipment and storage medium |
CN115599903A (en) * | 2021-07-07 | 2023-01-13 | 腾讯科技(深圳)有限公司(Cn) | Object tag obtaining method and device, electronic equipment and storage medium |
CN116912845A (en) * | 2023-06-16 | 2023-10-20 | 广东电网有限责任公司佛山供电局 | Intelligent content identification and analysis method and device based on NLP and AI |
CN113221533B (en) * | 2021-04-29 | 2024-07-05 | 支付宝(杭州)信息技术有限公司 | Label extraction method, device and equipment for experience sound |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104111933A (en) * | 2013-04-17 | 2014-10-22 | 阿里巴巴集团控股有限公司 | Method and device for acquiring business object label and building training model |
CN105243389A (en) * | 2015-09-28 | 2016-01-13 | 北京橙鑫数据科技有限公司 | Industry classification tag determining method and apparatus for company name |
WO2017075939A1 (en) * | 2015-11-06 | 2017-05-11 | 腾讯科技(深圳)有限公司 | Method and device for recognizing image contents |
CN107301199A (en) * | 2017-05-17 | 2017-10-27 | 北京融数云途科技有限公司 | A kind of data label generation method and device |
Cited By (73)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108733786A (en) * | 2018-05-11 | 2018-11-02 | 济南浪潮高新科技投资发展有限公司 | A kind of method and apparatus for extracting effective information from html text |
CN112384911A (en) * | 2018-07-11 | 2021-02-19 | 株式会社东芝 | Label applying device, label applying method, and program |
CN109033082A (en) * | 2018-07-19 | 2018-12-18 | 深圳创维数字技术有限公司 | The learning training method, apparatus and computer readable storage medium of semantic model |
CN109033082B (en) * | 2018-07-19 | 2022-06-10 | 深圳创维数字技术有限公司 | Learning training method and device of semantic model and computer readable storage medium |
CN109241281A (en) * | 2018-08-01 | 2019-01-18 | 百度在线网络技术(北京)有限公司 | Software failure reason generation method, device and equipment |
CN109241281B (en) * | 2018-08-01 | 2022-09-23 | 百度在线网络技术(北京)有限公司 | Software failure reason generation method, device and equipment |
CN110019563A (en) * | 2018-08-09 | 2019-07-16 | 北京首钢自动化信息技术有限公司 | A kind of portrait modeling method and device based on multidimensional data |
CN109145260A (en) * | 2018-08-24 | 2019-01-04 | 北京科技大学 | A kind of text information extraction method |
US11798278B2 (en) | 2018-09-03 | 2023-10-24 | Tencent Technology (Shenzhen) Company Limited | Method, apparatus, and storage medium for classifying multimedia resource |
CN109299315A (en) * | 2018-09-03 | 2019-02-01 | 腾讯科技(深圳)有限公司 | Multimedia resource classification method, device, computer equipment and storage medium |
CN109447105A (en) * | 2018-09-10 | 2019-03-08 | 平安科技(深圳)有限公司 | Contract audit method, apparatus, computer equipment and storage medium |
CN109344253A (en) * | 2018-09-18 | 2019-02-15 | 平安科技(深圳)有限公司 | Add method, apparatus, computer equipment and the storage medium of user tag |
CN109271502A (en) * | 2018-09-25 | 2019-01-25 | 武汉大学 | A kind of classifying method and device of the space querying theme based on natural language processing |
CN109255066B (en) * | 2018-09-30 | 2021-11-09 | 武汉斗鱼网络科技有限公司 | Label marking method, device, server and storage medium for business object |
CN109255066A (en) * | 2018-09-30 | 2019-01-22 | 武汉斗鱼网络科技有限公司 | A kind of label labeling method, device, server and the storage medium of business object |
CN111104545A (en) * | 2018-10-26 | 2020-05-05 | 阿里巴巴集团控股有限公司 | Background music configuration method and equipment, client device and electronic equipment |
CN111222328A (en) * | 2018-11-26 | 2020-06-02 | 百度在线网络技术(北京)有限公司 | Label extraction method and device and electronic equipment |
CN111222328B (en) * | 2018-11-26 | 2023-06-16 | 百度在线网络技术(北京)有限公司 | Label extraction method and device and electronic equipment |
CN111369029A (en) * | 2018-12-06 | 2020-07-03 | 北京嘀嘀无限科技发展有限公司 | Service selection prediction method, device, electronic equipment and storage medium |
CN111435596A (en) * | 2019-01-14 | 2020-07-21 | 珠海格力电器股份有限公司 | Method and device for adjusting running state of target equipment, storage medium and electronic device |
CN111435596B (en) * | 2019-01-14 | 2024-01-30 | 珠海格力电器股份有限公司 | Method and device for adjusting running state of target equipment, storage medium and electronic device |
CN111475603A (en) * | 2019-01-23 | 2020-07-31 | 百度在线网络技术(北京)有限公司 | Enterprise identifier identification method and device, computer equipment and storage medium |
CN109933662A (en) * | 2019-02-15 | 2019-06-25 | 北京奇艺世纪科技有限公司 | Model training method, information generating method, device, electronic equipment and computer-readable medium |
CN109871447A (en) * | 2019-03-05 | 2019-06-11 | 南京甄视智能科技有限公司 | Clustering method, computer program product and the server system of Chinese comment unsupervised learning |
CN110070143B (en) * | 2019-04-29 | 2021-07-16 | 北京达佳互联信息技术有限公司 | Method, device and equipment for acquiring training data and storage medium |
CN110070143A (en) * | 2019-04-29 | 2019-07-30 | 北京达佳互联信息技术有限公司 | Obtain method, apparatus, equipment and the storage medium of training data |
CN110413837A (en) * | 2019-05-30 | 2019-11-05 | 腾讯科技(深圳)有限公司 | Video recommendation method and device |
CN110413837B (en) * | 2019-05-30 | 2023-07-25 | 腾讯科技(深圳)有限公司 | Video recommendation method and device |
CN110196948A (en) * | 2019-06-10 | 2019-09-03 | 北京金山安全软件有限公司 | Content recommendation method and device, computer equipment and storage medium |
CN110390002A (en) * | 2019-06-18 | 2019-10-29 | 深圳壹账通智能科技有限公司 | Call resource allocation method, device, computer readable storage medium and server |
CN110321435A (en) * | 2019-06-28 | 2019-10-11 | 京东数字科技控股有限公司 | A kind of data source division methods, device, equipment and storage medium |
CN110413787B (en) * | 2019-07-26 | 2023-07-21 | 腾讯科技(深圳)有限公司 | Text clustering method, device, terminal and storage medium |
CN110413787A (en) * | 2019-07-26 | 2019-11-05 | 腾讯科技(深圳)有限公司 | Text Clustering Method, device, terminal and storage medium |
CN110442767B (en) * | 2019-07-31 | 2023-08-18 | 腾讯科技(深圳)有限公司 | Method and device for determining content interaction platform label and readable storage medium |
CN110442767A (en) * | 2019-07-31 | 2019-11-12 | 腾讯科技(深圳)有限公司 | A kind of method, apparatus and readable storage medium storing program for executing of determining content interaction platform label |
CN112447173A (en) * | 2019-08-16 | 2021-03-05 | 阿里巴巴集团控股有限公司 | Voice interaction method and device and computer storage medium |
CN110717326A (en) * | 2019-09-17 | 2020-01-21 | 平安科技(深圳)有限公司 | Text information author identification method and device based on machine learning |
CN110717326B (en) * | 2019-09-17 | 2022-12-23 | 平安科技(深圳)有限公司 | Text information author identification method and device based on machine learning |
CN112580329A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Text noise data identification method and device, computer equipment and storage medium |
CN112580329B (en) * | 2019-09-30 | 2024-02-20 | 北京国双科技有限公司 | Text noise data identification method, device, computer equipment and storage medium |
CN110765778B (en) * | 2019-10-23 | 2023-08-29 | 北京锐安科技有限公司 | Label entity processing method, device, computer equipment and storage medium |
CN110765778A (en) * | 2019-10-23 | 2020-02-07 | 北京锐安科技有限公司 | Label entity processing method and device, computer equipment and storage medium |
CN111078885B (en) * | 2019-12-18 | 2023-04-07 | 腾讯科技(深圳)有限公司 | Label classification method, related device, equipment and storage medium |
CN111078885A (en) * | 2019-12-18 | 2020-04-28 | 腾讯科技(深圳)有限公司 | Label classification method, related device, equipment and storage medium |
CN111125460A (en) * | 2019-12-24 | 2020-05-08 | 腾讯科技(深圳)有限公司 | Information recommendation method and device |
CN111191003B (en) * | 2019-12-26 | 2023-04-18 | 东软集团股份有限公司 | Method and device for determining text association type, storage medium and electronic equipment |
CN111191003A (en) * | 2019-12-26 | 2020-05-22 | 东软集团股份有限公司 | Method and device for determining text association type, storage medium and electronic equipment |
CN111291688B (en) * | 2020-02-12 | 2023-07-14 | 咪咕文化科技有限公司 | Video tag acquisition method and device |
CN111291688A (en) * | 2020-02-12 | 2020-06-16 | 咪咕文化科技有限公司 | Video tag obtaining method and device |
CN111339304A (en) * | 2020-03-16 | 2020-06-26 | 闪捷信息科技有限公司 | Text data automatic classification method based on machine learning |
CN113111174A (en) * | 2020-04-28 | 2021-07-13 | 北京明亿科技有限公司 | Group identification method, device, equipment and medium based on deep learning model |
CN111625716B (en) * | 2020-05-12 | 2023-10-31 | 聚好看科技股份有限公司 | Media asset recommendation method, server and display device |
CN111625716A (en) * | 2020-05-12 | 2020-09-04 | 聚好看科技股份有限公司 | Media asset recommendation method, server and display device |
CN111680156A (en) * | 2020-05-25 | 2020-09-18 | 中国工商银行股份有限公司 | Data multi-label classification method and system |
CN111680156B (en) * | 2020-05-25 | 2024-02-09 | 中国工商银行股份有限公司 | Data multi-label classification method and system |
CN111711869B (en) * | 2020-06-24 | 2022-05-17 | 腾讯科技(深圳)有限公司 | Label data processing method and device and computer readable storage medium |
CN111711869A (en) * | 2020-06-24 | 2020-09-25 | 腾讯科技(深圳)有限公司 | Label data processing method and device and computer readable storage medium |
CN111666452A (en) * | 2020-07-09 | 2020-09-15 | 腾讯科技(深圳)有限公司 | Method and device for clustering videos |
CN112131511A (en) * | 2020-09-29 | 2020-12-25 | 中国银行股份有限公司 | Method and device for displaying negotiation information in matching activities |
CN112579738A (en) * | 2020-12-23 | 2021-03-30 | 广州博冠信息科技有限公司 | Target object label processing method, device, equipment and storage medium |
CN112612888A (en) * | 2020-12-25 | 2021-04-06 | 航天信息股份有限公司 | Method and system for intelligently clustering text files |
CN112612888B (en) * | 2020-12-25 | 2023-06-16 | 航天信息股份有限公司 | Method and system for intelligent clustering of text files |
CN113157851A (en) * | 2021-02-23 | 2021-07-23 | 北京三快在线科技有限公司 | Category information generation method and device, electronic equipment and computer readable medium |
CN113221533A (en) * | 2021-04-29 | 2021-08-06 | 支付宝(杭州)信息技术有限公司 | Experience sound label extraction method, device and equipment |
CN113221533B (en) * | 2021-04-29 | 2024-07-05 | 支付宝(杭州)信息技术有限公司 | Label extraction method, device and equipment for experience sound |
CN115599903A (en) * | 2021-07-07 | 2023-01-13 | 腾讯科技(深圳)有限公司 | Object tag obtaining method and device, electronic equipment and storage medium |
CN115599903B (en) * | 2021-07-07 | 2024-06-04 | 腾讯科技(深圳)有限公司 | Object tag acquisition method and device, electronic equipment and storage medium |
CN113806542A (en) * | 2021-09-18 | 2021-12-17 | 上海幻电信息科技有限公司 | Text analysis method and system |
CN113806542B (en) * | 2021-09-18 | 2024-05-17 | 上海幻电信息科技有限公司 | Text analysis method and system |
CN115271851B (en) * | 2022-07-04 | 2023-10-10 | 天翼爱音乐文化科技有限公司 | Video color ring recommending method, system, electronic equipment and storage medium |
CN115271851A (en) * | 2022-07-04 | 2022-11-01 | 天翼爱音乐文化科技有限公司 | Video color ring recommendation method, system, electronic equipment and storage medium |
CN116912845A (en) * | 2023-06-16 | 2023-10-20 | 广东电网有限责任公司佛山供电局 | Intelligent content identification and analysis method and device based on NLP and AI |
CN116912845B (en) * | 2023-06-16 | 2024-03-19 | 广东电网有限责任公司佛山供电局 | Intelligent content identification and analysis method and device based on NLP and AI |
Also Published As
Publication number | Publication date |
---|---|
CN108009228B (en) | 2020-10-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108009228A (en) | | Content tag setting method, device and storage medium |
CN112784130B (en) | | Twin network model training and measuring method, device, medium and equipment |
CN107818781B (en) | | Intelligent interaction method, equipment and storage medium |
CN108509465A (en) | | Video data recommendation method, apparatus and server |
CN108984530A (en) | | Detection method and detection system for sensitive network content |
CN104199822B (en) | | Method and system for identifying the demand category corresponding to a search |
CN110097085A (en) | | Lyrics document creation method, training method, device, server and storage medium |
CN108182279A (en) | | Object classification method, device and computer equipment based on text features |
CN109271493A (en) | | Language text processing method, device and storage medium |
CN109408665A (en) | | Information recommendation method, device and storage medium |
CN110134792B (en) | | Text recognition method and device, electronic equipment and storage medium |
CN111831802B (en) | | Urban domain knowledge detection system and method based on LDA topic model |
CN111783468B (en) | | Text processing method, device, equipment and medium |
CN103365867A (en) | | Method and device for emotion analysis of user evaluation |
CN106105096A (en) | | System and method for continuous social communication |
CN111046225B (en) | | Audio resource processing method, device, equipment and storage medium |
CN103984741A (en) | | Method and system for extracting user attribute information |
CN103810162A (en) | | Method and system for recommending network information |
CN110851650B (en) | | Comment output method and device and computer storage medium |
CN109271550A (en) | | Personalized music classification and recommendation method based on deep learning |
CN110309114A (en) | | Media information processing method, device, storage medium and electronic device |
CN111523324A (en) | | Training method and device for named entity recognition model |
CN106528538A (en) | | Method and device for intelligent emotion recognition |
CN103631874A (en) | | UGC label classification determining method and device for social platform |
CN107273546A (en) | | Counterfeit application detection method and system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | Address after: No.88-1, Yurun street, Jianye District, Nanjing City, Jiangsu Province, 210000; Patentee after: MIGU INTERACTIVE ENTERTAINMENT Co.,Ltd.; Patentee after: CHINA MOBILE COMMUNICATIONS GROUP Co.,Ltd.; Address before: No.88-1, Yurun street, Jianye District, Nanjing City, Jiangsu Province, 210000; Patentee before: MIGU INTERACTIVE ENTERTAINMENT Co.,Ltd.; Patentee before: China Mobile Communications Corp. |