CN106528633B - A method for improving the social attention of videos based on keyword recommendation - Google Patents
- Publication number
- CN106528633B (application CN201610884840.8A)
- Authority
- CN
- China
- Prior art keywords
- keyword
- video
- words
- key
- semantic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/73—Querying
- G06F16/738—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/70—Information retrieval; Database structures therefor; File system structures therefor of video data
- G06F16/78—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/7867—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for improving the social attention of videos based on keyword recommendation. The method uses semantic relatedness and deep learning to recommend keywords for a video and thereby raise its social attention. Starting from the user's initial keywords, the method first finds the semantic keywords most closely related in meaning to the initial keywords, based on the semantic relatedness between keywords and on deep learning of the video content; it then mines entity keywords by analyzing the video content with deep learning; finally, it ranks these two groups of keywords by a given criterion and recommends the most relevant keywords to the user. The recommended keywords balance relevance to the video content with the potential to attract social attention, which raises the social attention of the video and makes this an efficient and practical video keyword recommendation method. The invention can be used in online social media analysis, data mining, and video tag recommendation.
Description
Technical field
The invention belongs to the technical fields of online social media analysis, data mining, and video tag recommendation, and in particular relates to a method for improving the social attention of videos based on keyword recommendation.
Background technique
In traditional internet applications, the search engine is the main tool by which users discover web content. Existing methods for raising the social attention of web content are therefore designed mainly around search engines. On social media, however, and especially on multimedia sharing websites such as YouTube, Flickr, and Youku, the recommender system is another important source of social attention alongside the search engine. The potential of both search engines and recommender systems must therefore be exploited together in order to raise the social attention of social media content more effectively.
Although search engines offer users plenty of content to discover, the rapid growth of information on the internet and users' rising expectations have exposed their limitations, such as low coverage, inaccurate results, and irrelevant results. Recommender systems can return highly relevant results, but the range of their recommendations is limited, and the results of different recommender systems can differ greatly.
Deep learning, a recent technology, can also be applied to raising the social attention of videos. Its application to video lies mainly in extracting video content information, and it can ensure the accuracy of the extracted information.
Summary of the invention
To address the above problems, the invention discloses a method for improving the social attention of videos based on keyword recommendation. The keywords it recommends balance relevance to the video content with the potential to attract social attention.
The technical solution adopted by the invention to solve its technical problem is as follows:
A method for improving the social attention of videos based on keyword recommendation, implemented by the following steps:
Step 1. Obtain the initial video keywords:
For a given video, K initial keywords related to the video are extracted from the initial title that the user supplies when uploading the video.
Step 2. WordNet semantic expansion of the initial keywords:
Semantically similar keywords are looked up in WordNet for each initial keyword, expanding the initial keywords into the preliminary WordNet semantic keyword set.
Step 3. Further expansion using major video sharing websites:
The preliminary semantic keyword set is used to search major video sharing websites, and keywords that can attract more social attention are extracted, expanding it into the final semantic keyword set.
Step 4. Extract the video entity keyword set:
Deep learning is used to mine the video content information and form the video entity keyword set.
Step 5. Rank the keyword sets:
Taking both the relevance and the social attention of keywords into account, the semantic keyword set and the entity keyword set of the video are ranked by two factors, the occurrence count of each keyword and the average relatedness between the keyword and the initial keyword set, to determine the keyword set finally offered to the user.
The advantages of the invention are:
1. The invention semantically expands the initial keywords using the WordNet semantic dictionary. Because WordNet itself organizes its entries well by meaning, the expanded semantic keyword set remains semantically related to the initial video title, improves quality at the semantic level, and increases keyword diversity.
2. The invention further expands the semantic keyword set using major video sharing websites. Based on the observation that videos on the same or similar topics can usually be found on multiple websites, it combines the search engine and the recommender system of video sharing websites to expand the semantic keyword set. The resulting set is not only related to the video content but also increases keyword diversity and social attention.
3. The invention analyzes and recognizes video content with deep learning, collecting the entity information that best matches the actual content of the video and improving the authenticity and accuracy of the keywords finally offered to the user.
4. The invention ranks the keyword sets by two factors, the occurrence count of each keyword and the average relatedness between the keyword and the initial keywords, which measures both the relatedness of a keyword to the initial keyword set and its social attention.
5. The invention can be used in online social media analysis and data mining, and in particular in video tag recommendation.
Description of the drawings
Fig. 1 is the overall framework diagram of keyword recommendation according to the invention.
Fig. 2 is the flow chart of keyword ranking according to the invention.
Specific embodiments
The invention is further described below with reference to the drawings and a specific implementation.
The implementation process is explained by following the steps in Fig. 1:
Step 1. Obtain the initial video keywords:
For a given video, the user supplies, in the video upload interface, the initial title that the user considers most appropriate for the video. K keywords are extracted from this initial title to form the initial title keyword set X.
Step 2. WordNet semantic expansion of the initial keywords:
Each keyword in the initial keyword set is entered into the WordNet semantic dictionary, which returns several entries semantically related to that keyword. The 2-3 most semantically related keywords are selected for each initial keyword, and together they form the preliminary semantic keyword set based on WordNet expansion. Keywords expanded in this way remain semantically related to the initial video title, improve quality at the semantic level, and increase keyword diversity.
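A minimal Python sketch of this expansion step follows. It assumes a small hand-written thesaurus (`TOY_THESAURUS`, hypothetical) in place of a real WordNet lookup, and the similarity scores used to pick the top 2-3 candidates are made up for illustration. Following claim 1, the initial keywords are kept in the expanded set.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a WordNet lookup:
# keyword -> [(related term, similarity score in [0, 1])]
TOY_THESAURUS: Dict[str, List[Tuple[str, float]]] = {
    "dog": [("puppy", 0.9), ("canine", 0.85), ("hound", 0.7), ("animal", 0.4)],
    "trick": [("stunt", 0.8), ("prank", 0.6)],
}


def expand_keywords(initial: List[str],
                    thesaurus: Dict[str, List[Tuple[str, float]]],
                    per_keyword: int = 3) -> List[str]:
    """Return the initial keywords plus the top `per_keyword` related terms of each."""
    expanded = list(initial)
    for word in initial:
        # Most similar candidates first; unknown words contribute nothing.
        candidates = sorted(thesaurus.get(word, []), key=lambda p: p[1], reverse=True)
        for term, _score in candidates[:per_keyword]:
            if term not in expanded:
                expanded.append(term)
    return expanded
```

For example, `expand_keywords(["dog", "trick"], TOY_THESAURUS)` keeps both initial keywords and appends up to three related terms for each.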
Step 3. Further expansion using major video sharing websites:
Videos on the same or similar topics can usually be found on multiple video sharing websites, so information about such videos can be collected from multiple sites. Using the two capabilities of a video sharing website, its search engine and its recommender system, keyword expansion based on major video sharing websites is divided into the following two steps:
1) Search engine search
The preliminary semantic keyword set from the WordNet expansion is formed into 2-3 keyword groups. Each group is searched with the search engine of each major video sharing website, the top 10 videos returned by each site are extracted, and the keywords in the titles of these videos are added to the semantic keyword set.
2) Recommender system recommendation
For the top-ranked videos found by the search engine in the first step, the related videos that the website's recommender system recommends for them are collected, and the title keywords of these related videos are likewise added to the semantic keyword set.
These two steps make full use of the search capability of the search engine and the recommendation capability of the recommender system of video sharing websites. The semantic keyword set expanded through these two capabilities is not only related to the video content but also increases keyword diversity and social attention.
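The two sub-steps can be sketched in Python as below. The `search` and `related` callables are hypothetical placeholders for a site's search engine and recommender system (no real website API is implied), and `title_keywords` with its tiny stopword list is an illustrative tokenizer, not the patent's extraction method.

```python
from typing import Callable, Iterable, List

# Hypothetical interfaces: each returns video titles, best match first.
SearchFn = Callable[[str], List[str]]    # query -> titles of matching videos
RelatedFn = Callable[[str], List[str]]   # video title -> titles of recommended videos

STOPWORDS = {"a", "an", "the", "of", "and", "to", "in", "my"}


def title_keywords(title: str) -> List[str]:
    """Lower-case a title and keep alphabetic tokens that are not stopwords."""
    tokens = [tok.strip(".,!?:;()\"'").lower() for tok in title.split()]
    return [tok for tok in tokens if tok.isalpha() and tok not in STOPWORDS]


def expand_via_sites(keyword_groups: Iterable[List[str]],
                     search: SearchFn,
                     related: RelatedFn,
                     top_n: int = 10) -> List[str]:
    """Collect title keywords from top search hits and from their recommended videos.

    Repeated keywords are kept so that a later step can count occurrences.
    """
    collected: List[str] = []
    for group in keyword_groups:
        hits = search(" ".join(group))[:top_n]
        for title in hits:
            collected.extend(title_keywords(title))           # 1) search engine
            for rec_title in related(title):
                collected.extend(title_keywords(rec_title))   # 2) recommender system
    return collected
```

Keeping duplicates in the returned list is a deliberate choice here: the ranking step described later relies on how often each keyword occurs.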
Step 4. Extract the video entity keyword set:
Key frames are extracted from the video at fixed intervals according to its duration to form the key frame set. Each key frame is fed as input to the deep learning framework Caffe with a model trained on ImageNet, and the entity recognition results output for the key frames are added to the video entity keyword set. By analyzing and recognizing video content with deep learning, the entity information that best matches the actual content of the video can be collected, which improves to some extent the authenticity and accuracy of the keywords finally offered to the user.
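As a rough illustration of fixed-interval key frame sampling and label collection, the Python sketch below computes sampling timestamps and aggregates the labels returned by a classifier. The `classify` callable is a hypothetical stand-in for the ImageNet-trained Caffe model (it receives a timestamp rather than an image so the sketch stays self-contained), and the interval length is an assumed parameter that the text does not specify.

```python
from collections import Counter
from typing import Callable, List


def keyframe_timestamps(duration_s: float, interval_s: float) -> List[float]:
    """Timestamps (in seconds) of frames sampled every `interval_s` seconds from t = 0."""
    if duration_s <= 0 or interval_s <= 0:
        return []
    count = int(duration_s // interval_s) + 1
    # Drop a sample that would fall exactly at or beyond the end of the video.
    return [i * interval_s for i in range(count) if i * interval_s < duration_s]


def entity_keywords(timestamps: List[float],
                    classify: Callable[[float], List[str]]) -> Counter:
    """Count the entity labels that `classify` (hypothetical) returns across key frames."""
    counts: Counter = Counter()
    for ts in timestamps:
        counts.update(classify(ts))
    return counts
```

In a real pipeline the timestamps would be used to decode frames, and each decoded frame would be passed to the image classifier.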
Step 5. Rank the keyword sets:
The NGD (Normalized Google Distance) similarity distance is calculated as follows:
NGD(t, X_i) = (max{log h(t), log h(X_i)} - log h(t, X_i)) / (log N - min{log h(t), log h(X_i)}) (1)
where h(t) and h(X_i) denote the numbers of search results returned by the Google search engine for keyword t and for keyword X_i in the initial title keyword set X, respectively; h(t, X_i) denotes the number of results returned when both keywords are searched together; and N denotes the number of web pages indexed by Google (the number of pages Google returns when no search condition is entered). The closer the distance is to 0, the more semantically related the two keywords are; the closer it is to infinity, the less related they are.
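Reading formula (1) as the standard Normalized Google Distance, which is an assumption based on the symbol definitions given here, a minimal Python sketch of the computation is shown below. The hit counts are passed in as plain numbers; how they are obtained from a search engine is outside the sketch, and the handling of zero counts is an illustrative choice.

```python
import math


def ngd(h_t: float, h_x: float, h_tx: float, n_pages: float) -> float:
    """Normalized Google Distance from hit counts h(t), h(X_i), h(t, X_i) and index size N."""
    if h_t <= 0 or h_x <= 0 or h_tx <= 0:
        # A term with no hits, or no co-occurrence: treat the pair as unrelated.
        return math.inf
    log_t, log_x, log_tx = math.log(h_t), math.log(h_x), math.log(h_tx)
    denominator = math.log(n_pages) - min(log_t, log_x)
    if denominator <= 0:
        raise ValueError("n_pages must be larger than the smaller hit count")
    # Clamp at 0 because noisy counts can make the numerator slightly negative.
    return max(0.0, (max(log_t, log_x) - log_tx) / denominator)
```

Two keywords that always appear together give a distance of 0, while keywords that never co-occur give an infinite distance, matching the interpretation in the text.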
The TF-SIM ranking algorithm is as follows:
where T_t denotes the number of times keyword t occurs, X denotes the initial title keyword set, and n denotes the number of keywords in the initial title keyword set X.
The numbers of semantic and entity keywords are allocated as follows:
T_n = T_s + δT_s (3)
where T_n denotes the number of keywords to recommend to the user, T_s denotes the number of keywords taken from the semantic keyword set, and δT_s denotes the number of keywords taken from the entity keyword set. The value of δ is set empirically to 0.5.
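Formula (3) can be solved for T_s as T_s = T_n / (1 + δ). The Python sketch below applies this; rounding T_s to the nearest integer and giving the remainder to the entity set is an assumption, since the text does not say how fractional counts are handled.

```python
from typing import Tuple


def split_counts(total: int, delta: float = 0.5) -> Tuple[int, int]:
    """Split `total` recommended keywords into (semantic, entity) counts
    following T_n = T_s + delta * T_s."""
    if total < 0 or delta < 0:
        raise ValueError("total and delta must be non-negative")
    semantic = round(total / (1 + delta))   # T_s, rounded (assumed)
    return semantic, total - semantic       # entity count is approximately delta * T_s
```

With δ = 0.5 and 15 keywords to recommend, this gives 10 semantic keywords and 5 entity keywords.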
Following the flow in Fig. 2, relevance ranking of the keyword sets is divided into the following four steps:
1) Keyword frequency calculation: the keywords in each keyword set are sorted by occurrence count, and the occurrence count of each keyword is recorded. This yields a keyword set without duplicates.
2) NGD distance calculation: formula (1) is used to compute the average relatedness between each keyword in the deduplicated keyword set and the initial title keyword set.
3) TF-SIM (similarity value) ranking: the occurrence count of each keyword and its average relatedness to the initial keyword set, obtained in the first two steps, are substituted into formula (2), and the keyword set is ranked by the result.
4) Final keyword extraction: the counts are computed by formula (3) to obtain the keyword set finally recommended to the user.
The keyword ranking criterion should take both relevance and social attention into account, and formula (2) measures two factors: the occurrence count of a keyword and its average relatedness to the initial keyword set. The first factor uses the occurrence count to measure a keyword's potential to attract social attention. The second factor uses relatedness to the initial keyword set to measure how related the keyword is to the video. The two factors constrain each other: a keyword that occurs often but has low relatedness receives a lower score, and vice versa. Computing both factors determines the best keywords to recommend to the user.
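The four ranking steps can be sketched in Python as below. The exact form of formula (2) is not reproduced in this text, so the score used here, occurrence count multiplied by exp(-mean distance), is only an assumed stand-in chosen because it rewards frequent keywords and penalizes unrelated ones. The `distance` callable is likewise a hypothetical placeholder for an NGD lookup.

```python
import math
from collections import Counter
from typing import Callable, List, Tuple


def rank_keywords(candidates: List[str],
                  initial: List[str],
                  distance: Callable[[str, str], float]) -> List[Tuple[str, float]]:
    """Rank candidates by a stand-in score: occurrence count * exp(-mean distance).

    The score is an assumption, not the patent's formula (2).
    """
    if not initial:
        raise ValueError("initial keyword set must not be empty")
    counts = Counter(candidates)                       # step 1: frequency, duplicates merged
    scored = []
    for word, freq in counts.items():
        mean_d = sum(distance(word, x) for x in initial) / len(initial)  # step 2
        scored.append((word, freq * math.exp(-mean_d)))                  # step 3
    return sorted(scored, key=lambda p: p[1], reverse=True)


def recommend(semantic: List[Tuple[str, float]],
              entity: List[Tuple[str, float]],
              n_semantic: int,
              n_entity: int) -> List[str]:
    """Step 4: take the top-ranked keywords from each ranked list."""
    return [w for w, _ in semantic[:n_semantic]] + [w for w, _ in entity[:n_entity]]
```

In this sketch a keyword that occurs three times but lies far from the initial keywords can still rank below one that occurs twice and is closely related, which is the trade-off the paragraph above describes.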
The above embodiment does not limit the invention, and the invention is not restricted to it; anything that meets the requirements of the invention falls within its scope of protection.
Claims (4)
1. A method for improving the social attention of videos based on keyword recommendation, characterized in that the method is implemented by the following steps:
Step 1. Obtain the initial video keywords: for a given video, extract K initial keywords related to the video from the initial title supplied by the user when uploading the video, forming the initial keyword set X;
Step 2. WordNet semantic expansion of the initial keywords: look up semantically similar keywords in WordNet for each of the K initial keywords obtained above and combine them with the initial keyword set to form the preliminary WordNet semantic keyword set;
Step 3. Further expansion using major video sharing websites: search the major video sharing websites with the preliminary WordNet semantic keyword set, extract keywords that can attract more social attention, and combine them with the preliminary WordNet semantic keyword set to form the final WordNet semantic keyword set;
Step 4. Extract the video entity keyword set: mine the video content information with deep learning to form the video entity keyword set;
Step 5. Rank the keyword sets: taking both the relevance and the social attention of keywords into account, rank the semantic keyword set and the entity keyword set of the video by two factors, the occurrence count of each keyword and the average relatedness between the keyword and the initial keyword set, to determine the keyword set finally offered to the user.
2. The method for improving the social attention of videos based on keyword recommendation according to claim 1, characterized in that: in step 3, based on the observation that information about videos on the same or similar topics can be found on multiple websites, the preliminary WordNet semantic keywords obtained in step 2 are searched on the major video websites, and keywords that can attract more social attention are extracted by means of the search engine and the recommender system built into the video sharing websites, forming the final semantic keyword set.
3. The method for improving the social attention of videos based on keyword recommendation according to claim 1, characterized in that: in step 4, key frames of the video are extracted at fixed intervals and, with the ImageNet image set as the backend, video content information is mined with deep learning to form the video entity keyword set.
4. The method for improving the social attention of videos based on keyword recommendation according to claim 1, characterized in that step 5 is carried out as follows:
5.1 Keyword frequency calculation: the keywords in the final WordNet semantic keyword set and in the video entity keyword set are each re-sorted by occurrence count and consolidated into new keyword sets without duplicates, and the occurrence count of each keyword is recorded;
5.2 NGD distance calculation: formula (1) is used to compute the NGD distance between the deduplicated keyword sets obtained in step 5.1 and the initial keyword set obtained in step 1:
NGD(t, X_i) = (max{log h(t), log h(X_i)} - log h(t, X_i)) / (log N - min{log h(t), log h(X_i)}) (1)
where h(t) and h(X_i) denote the numbers of search results returned by search engine G for keyword t in the deduplicated keyword set from step 5.1 and for keyword X_i in the initial keyword set X, respectively; h(t, X_i) denotes the number of results returned when both keywords are searched together; and N denotes the number of web pages that search engine G can index;
5.3 TF-SIM ranking: the occurrence count of each keyword and its NGD distance to the initial keyword set X, obtained in the previous two steps, are substituted into formula (2), and the keywords are re-sorted by the resulting TF-SIM similarity value to form a new keyword set;
where T_t denotes the number of times keyword t occurs, X denotes the initial keyword set, and n denotes the number of keywords in the initial keyword set X;
5.4 Final keyword extraction: formula (3) is used to compute how many keywords the final WordNet semantic keyword set and the video entity keyword set each contribute to the keyword set recommended to the user, yielding the most relevant keywords required:
T_n = T_s + δT_s (3)
where T_n denotes the number of keywords to recommend to the user, T_s denotes the number of keywords taken from the semantic keyword set, δT_s denotes the number of keywords taken from the entity keyword set, and δ is an empirical value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610884840.8A (CN106528633B) | 2016-10-11 | 2016-10-11 | A method for improving the social attention of videos based on keyword recommendation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610884840.8A (CN106528633B) | 2016-10-11 | 2016-10-11 | A method for improving the social attention of videos based on keyword recommendation |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106528633A (en) | 2017-03-22 |
CN106528633B (en) | 2019-07-02 |
Family
ID=58331590
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610884840.8A (CN106528633B, Active) | 2016-10-11 | 2016-10-11 | A method for improving the social attention of videos based on keyword recommendation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106528633B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108009293B (en) * | 2017-12-26 | 2022-08-23 | 北京百度网讯科技有限公司 | Video tag generation method and device, computer equipment and storage medium |
CN109992720A (en) * | 2018-11-15 | 2019-07-09 | 厦门笨鸟电子商务有限公司 | A kind of system and method for promoting user and writing attention rate of the content in social media |
CN109992656A (en) * | 2018-11-15 | 2019-07-09 | 厦门笨鸟电子商务有限公司 | A kind of machine writing system and method with high attention rate content issued in social media |
CN111061939B (en) * | 2019-12-31 | 2023-03-24 | 西安理工大学 | Scientific research academic news keyword matching recommendation method based on deep learning |
CN113836289B (en) * | 2021-08-16 | 2023-06-09 | 北京邮电大学 | Entity evolution rule recommendation method and device |
CN116304315B (en) * | 2023-02-27 | 2024-02-06 | 广州兴趣岛信息科技有限公司 | Intelligent content recommendation system for online teaching |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101901249A (en) * | 2009-05-26 | 2010-12-01 | 复旦大学 | Text-based query expansion and sort method in image retrieval |
CN101977319A (en) * | 2010-11-03 | 2011-02-16 | 上海交通大学 | Method for generating and authenticating hidden video tags based on video characteristics and digital signatures |
CN102214173A (en) * | 2010-04-02 | 2011-10-12 | 富士通株式会社 | Method and device for choosing keywords for web publishing |
CN102982076A (en) * | 2012-10-30 | 2013-03-20 | 新华通讯社 | Multi-dimensionality content labeling method based on semanteme label database |
CN104657376A (en) * | 2013-11-20 | 2015-05-27 | 航天信息股份有限公司 | Searching method and searching device for video programs based on program relationship |
CN105404619A (en) * | 2015-09-08 | 2016-03-16 | 华南理工大学 | Similarity based semantic Web service clustering labeling method |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080294624A1 (en) * | 2007-05-25 | 2008-11-27 | Ontogenix, Inc. | Recommendation systems and methods using interest correlation |
Non-Patent Citations (1)
Title |
---|
Boosting video popularity through keyword suggestion;Renjie Zhou 等;《Neurocomputing》;20160520;第205卷;529-541 * |
Also Published As
Publication number | Publication date |
---|---|
CN106528633A (en) | 2017-03-22 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||