CN106528633B - A method for improving the social attention of videos based on keyword recommendation - Google Patents

A method for improving the social attention of videos based on keyword recommendation

Info

Publication number
CN106528633B
Authority
CN
China
Prior art keywords
keyword
video
words
key
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610884840.8A
Other languages
Chinese (zh)
Other versions
CN106528633A (en)
Inventor
周仁杰
万健
夏冬晨
张纪林
殷昱煜
张伟
任祖杰
贾刚勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2016-10-11
Filing date
2016-10-11
Publication date
2019-07-02
Application filed by Hangzhou Dianzi University
Priority to CN201610884840.8A
Publication of CN106528633A
Application granted
Publication of CN106528633B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/738 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for improving the social attention of videos based on keyword recommendation. The method uses semantic relatedness and deep learning to recommend keywords for a video and thereby raise the video's social attention. Based on the semantic relatedness between keywords and on deep learning of the video content, the method first takes the user's initial keywords and finds the semantic keywords most closely related to them in meaning. It then analyzes the video content with deep learning to mine entity keywords. Finally, it ranks these two groups of keywords by a defined criterion and recommends the most relevant keywords to the user. The recommended keywords balance relevance to the video content against the potential to attract social attention, which raises the video's social attention and makes this an efficient and practical video keyword recommendation method. The invention can be applied to online social media analysis, data mining, and video tag recommendation.

Description

A method for improving the social attention of videos based on keyword recommendation
Technical field
The invention belongs to the technical fields of online social media analysis, data mining, and video tag recommendation, and specifically concerns a method for improving the social attention of videos based on keyword recommendation.
Background art
In traditional internet applications, the search engine is the main tool users rely on to discover web content. Existing methods for raising the social attention of web content are therefore mostly designed around search engines. In social media, however, and especially on multimedia sharing sites such as YouTube, Flickr, and Youku, the recommender system is another important source of social attention besides the search engine. The potential of both search engines and recommender systems must therefore be exploited together to raise the social attention of social media content more effectively.
Search engines offer users a large body of content to discover, but as the volume of information on the internet grows rapidly and users expect more and more of them, search engines also show limitations such as low coverage, inaccurate results, and irrelevant results. Recommender systems can return highly relevant results, but the range of their recommendations is limited, and the results of different recommender systems can differ greatly.
Deep learning, as a recent technology, can also be applied to raising the social attention of videos. Its main use with video is extracting information about the video content while ensuring that the extracted information is accurate.
Summary of the invention
To address the above problems, the invention discloses a method for improving the social attention of videos based on keyword recommendation. The keywords it recommends balance relevance to the video content against the potential to attract social attention.
The technical solution adopted by the invention to solve the technical problem is as follows:
A method for improving the social attention of videos based on keyword recommendation, implemented with the following steps:
Step 1 --- Obtain the initial video keywords:
For a given video, extract K initial keywords related to the video from the initial title the user provides when uploading it.
Step 2 --- WordNet semantic expansion of the initial keywords:
Look up semantically similar keywords in WordNet for the initial keywords, expanding them into the WordNet preliminary semantic keyword set.
Step 3 --- Further expansion via major video sharing sites:
Search the major video sharing sites with the preliminary semantic keyword set and extract keywords that can attract more social attention, expanding the set into the final semantic keyword set.
Step 4 --- Extract the video entity keyword set:
Mine the video content with deep learning to form the video entity keyword set.
Step 5 --- Rank the keyword sets:
Taking both relevance and social attention into account, rank the semantic and entity keyword sets by two measures, keyword frequency and the keyword's average relevance to the initial keyword set, to determine the keyword set finally offered to the user.
The invention has the following beneficial effects:
1. The invention semantically expands the initial keywords using the WordNet semantic dictionary. Because WordNet already organizes its entries by meaning, the expanded semantic keyword set stays semantically related to the initial video title, which improves semantic quality and increases keyword diversity.
2. The invention further expands the semantic keyword set through the major video sharing sites. Since videos on the same or similar topics can usually be found on multiple sites, it combines the sites' search engines and recommender systems to expand the set. The resulting set is related to the video content and also gains keyword diversity and social attention.
3. The invention analyzes and recognizes the video content with deep learning, collecting the entity information that best matches the actual content and improving the authenticity and accuracy of the keywords finally provided to the user.
4. The invention ranks the keyword sets by keyword frequency and by the keyword's average relevance to the initial keywords, measuring both a keyword's relevance to the initial keyword set and its social attention.
5. The invention can be applied to online social media analysis and data mining, and in particular to video tag recommendation.
Brief description of the drawings
Fig. 1 is the overall framework diagram of the keyword recommendation of the invention.
Fig. 2 is the flow chart of the keyword ranking of the invention.
Detailed description of the embodiments
The invention is further described below with reference to the drawings and a specific implementation process.
The implementation process is explained through the steps shown in Fig. 1:
Step 1 --- Obtain the initial video keywords:
When uploading a given video, the user enters on the upload page an initial title that the user considers suitable and accurate. K keywords are extracted from this initial title to form the initial title keyword set X.
Step 2 --- WordNet semantic expansion of the initial keywords:
Each keyword in the initial keyword set is entered into the WordNet semantic dictionary, which returns several entries semantically related to it. The 2-3 most closely related entries are selected for each keyword, and together they form the preliminary semantic keyword set based on WordNet expansion. Keywords expanded this way stay semantically related to the initial video title, improve semantic quality, and increase keyword diversity.
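The expansion in step 2 can be sketched as follows. This is a minimal illustration, not the patented implementation: `TOY_THESAURUS` and `toy_lookup` are hypothetical stand-ins for a real WordNet query, and the sketch assumes the lookup already returns entries ordered from most to least related.

```python
# Hypothetical stand-in for a WordNet lookup: maps a keyword to related
# entries, assumed to be ordered from most to least semantically related.
TOY_THESAURUS = {
    "cat": ["feline", "kitten", "pet", "animal"],
    "funny": ["amusing", "comic", "humorous"],
}


def toy_lookup(keyword):
    """Return related entries for a keyword (empty list if unknown)."""
    return TOY_THESAURUS.get(keyword, [])


def expand_keywords(initial, lookup=toy_lookup, per_keyword=3):
    """Build the preliminary semantic keyword set.

    Keeps each initial keyword and adds up to `per_keyword` of its most
    related entries, preserving order and skipping duplicates.
    """
    expanded = []
    seen = set()
    for keyword in initial:
        for candidate in [keyword, *lookup(keyword)[:per_keyword]]:
            if candidate not in seen:
                seen.add(candidate)
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    print(expand_keywords(["cat", "funny"], per_keyword=2))
```

In a real system the `lookup` argument would be replaced by a function that queries WordNet; keeping it as a parameter makes the selection logic testable without the dictionary.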
Step 3 --- Further expansion via major video sharing sites:
Videos on the same or similar topics can usually be found on multiple video sharing sites, so information about such videos can be collected across sites. This step uses two capabilities of the video sharing sites, their search engines and their recommender systems, and expands the video keywords in two sub-steps:
1) Search engine search
The preliminary semantic keyword set from the WordNet expansion is formed into 2-3 groups of keywords. Each group is submitted to the search engine of each major video sharing site, the top 10 videos returned by each site are taken, and the title keywords of these videos are added to the semantic keyword set.
2) Recommender system recommendation
For the top-ranked videos found by the search engines in sub-step 1), the related videos that each site's recommender system suggests for them are collected, and the title keywords of these related videos are likewise added to the semantic keyword set.
Through these two sub-steps, the search capability of the sites' search engines and the recommendation capability of their recommender systems are both fully used. The semantic keyword set expanded this way is related to the video content and also gains keyword diversity and social attention.
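The two sub-steps of step 3 can be sketched as a single collection loop. Everything site-specific here is an assumption: `Video` and `FakeSite` are hypothetical in-memory stand-ins, and a real site would need its own search and related-video access. Keywords are collected with repetition because their frequency is used later for ranking.

```python
from dataclasses import dataclass


@dataclass
class Video:
    video_id: str
    title_keywords: list


@dataclass
class FakeSite:
    """Hypothetical in-memory stand-in for one video sharing site."""
    index: dict    # query string -> ranked list of Video
    related: dict  # video_id -> list of recommended Video

    def search(self, query):
        return self.index.get(query, [])

    def recommend(self, video_id):
        return self.related.get(video_id, [])


def expand_via_sites(keyword_groups, sites, top_n=10):
    """Collect title keywords from search results and their related videos."""
    collected = []
    for site in sites:
        for group in keyword_groups:
            # Sub-step 1: search each keyword group, keep the top-ranked videos.
            for video in site.search(" ".join(group))[:top_n]:
                collected.extend(video.title_keywords)
                # Sub-step 2: add keywords of the videos recommended alongside.
                for other in site.recommend(video.video_id):
                    collected.extend(other.title_keywords)
    return collected


if __name__ == "__main__":
    a = Video("a", ["cat", "funny"])
    b = Video("b", ["cat", "kitten"])
    site = FakeSite(index={"cat funny": [a]}, related={"a": [b]})
    print(expand_via_sites([["cat", "funny"]], [site]))
```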
Step 4 --- Extract the video entity keyword set:
Key frames are extracted from the video at fixed intervals according to its duration, forming the key frame set. Each key frame is fed into the deep learning framework Caffe with a model trained on ImageNet, and the entity recognition results output for the key frame are added to the video entity keyword set. Analyzing and recognizing the video content with deep learning collects the entity information that best matches the actual content, which improves, to some extent, the authenticity and accuracy of the keywords finally provided to the user.
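A minimal sketch of the fixed-interval sampling in step 4 follows. The classifier is deliberately left as a caller-supplied function: `classify_frame` is a hypothetical callable standing in for the Caffe model trained on ImageNet, and the choice to sample from time 0 is an assumption, since the text does not say where sampling starts.

```python
import math


def keyframe_timestamps(duration_s, interval_s):
    """Timestamps in seconds of frames sampled every `interval_s` seconds.

    Sampling starts at 0 and stops before the end of the video.
    """
    if duration_s <= 0 or interval_s <= 0:
        return []
    count = math.ceil(duration_s / interval_s)
    return [i * interval_s for i in range(count)]


def entity_keywords(duration_s, interval_s, classify_frame):
    """Collect entity labels for every sampled key frame.

    `classify_frame(timestamp)` is a hypothetical callable that returns the
    labels recognized in the frame at that timestamp.
    """
    labels = []
    for timestamp in keyframe_timestamps(duration_s, interval_s):
        labels.extend(classify_frame(timestamp))
    return labels


if __name__ == "__main__":
    def fake(t):
        return ["dog"] if t < 5 else ["ball"]

    print(keyframe_timestamps(10, 4))
    print(entity_keywords(10, 4, fake))
```

Labels are kept with repetition so that an entity seen in many key frames has a higher frequency in the later ranking step.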
Step 5 --- Rank the keyword sets:
The NGD (Normalized Google Distance) similarity distance is calculated as follows:
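Formula (1) is not reproduced in this text. Assuming it is the standard Normalized Google Distance, it can be written in the symbols of this step as:

```latex
\mathrm{NGD}(t, X_i) =
  \frac{\max\{\log h(t),\ \log h(X_i)\} - \log h(t, X_i)}
       {\log N - \min\{\log h(t),\ \log h(X_i)\}}
\tag{1}
```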
where h(t) and h(X_i) denote the number of search results Google returns for keyword t and for keyword X_i of the initial title keyword set X, respectively; h(t, X_i) denotes the number of results returned when both keywords are searched together; and N denotes the number of web pages indexed by Google (the number of pages Google can return when no search condition is entered). The closer the distance is to 0, the more semantically related the two keywords are; the closer it is to infinity, the less related they are.
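The distance can be computed directly from the four counts defined above. The sketch below assumes formula (1) is the standard Normalized Google Distance; the function name `ngd`, its argument names, and the handling of zero counts are illustrative choices, not taken from the patent.

```python
import math


def ngd(h_t, h_x, h_tx, n_pages):
    """Normalized Google Distance from raw search-result counts.

    h_t, h_x -- result counts for each keyword searched alone
    h_tx     -- result count when both keywords are searched together
    n_pages  -- number of pages indexed by the search engine
    Returns math.inf when the keywords never co-occur (assumed convention).
    """
    if h_t <= 0 or h_x <= 0 or h_tx <= 0:
        return math.inf
    log_t, log_x, log_tx = math.log(h_t), math.log(h_x), math.log(h_tx)
    denominator = math.log(n_pages) - min(log_t, log_x)
    if denominator <= 0:
        return math.inf
    return (max(log_t, log_x) - log_tx) / denominator


if __name__ == "__main__":
    # Keywords that always appear together have distance 0.
    print(ngd(1000, 1000, 1000, 10**6))
```

Using logarithms of counts keeps the distance insensitive to the absolute size of the index, so only the relative overlap of the two result sets matters.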
The TF-SIM ranking algorithm is as follows:
where T_t denotes the number of times keyword t occurs, X denotes the initial title keyword set, and n denotes the number of keywords in X.
The numbers of semantic and entity keywords are allocated as follows:
T_n = T_s + δT_s (3)
where T_n denotes the number of keywords to recommend to the user, T_s denotes the number of keywords drawn from the semantic keyword set, and δT_s denotes the number drawn from the entity keyword set. δ is set empirically to 0.5.
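Solving formula (3) for T_s gives T_s = T_n / (1 + δ), so with δ = 0.5 and T_n = 9 the split is 6 semantic keywords and 3 entity keywords. The sketch below applies that rearrangement; rounding to the nearest integer when the division is not exact is an assumption, as the text does not say how fractional counts are handled.

```python
def split_quota(total, delta=0.5):
    """Split `total` recommended keywords between the two sets.

    Uses T_n = T_s + delta * T_s, so T_s = T_n / (1 + delta).
    Returns (semantic_count, entity_count); rounding is an assumed choice.
    """
    semantic = round(total / (1 + delta))
    return semantic, total - semantic


if __name__ == "__main__":
    print(split_quota(9))
```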
With reference to the flow in Fig. 2, ranking the keyword sets by relevance takes the following four steps:
1) Keyword frequency calculation: sort the keywords in the keyword set by number of occurrences and record the count for each keyword, yielding a keyword set without duplicates.
2) NGD distance calculation: for the deduplicated keyword set, use formula (1) to calculate each keyword's average relevance to the initial title keyword set.
3) TF-SIM (similarity value) ranking calculation: substitute the keyword frequency and the average relevance to the initial keyword set obtained in the first two steps into formula (2), and rank the keyword set by the result.
4) Final keyword extraction: apply formula (3) to obtain the keyword set finally recommended to the user.
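The first three steps can be sketched as one ranking function. Because formula (2) is not reproduced in this text, the scoring rule is passed in as a parameter; `illustrative_score` is a hypothetical placeholder that only mimics the described behavior (higher frequency raises the score, larger distance lowers it) and is not the patent's TF-SIM formula. The `distance` argument is likewise a caller-supplied function, for example one based on NGD.

```python
from collections import Counter


def illustrative_score(frequency, average_distance):
    """Hypothetical placeholder score, NOT formula (2) of the patent."""
    return frequency / (1.0 + average_distance)


def rank_keywords(candidates, initial, distance, score=illustrative_score):
    """Rank candidate keywords against the initial keyword set.

    candidates -- keywords with repetition (semantic or entity set)
    initial    -- non-empty initial title keyword set X
    distance   -- callable (t, x) -> distance between two keywords
    score      -- callable (frequency, average_distance) -> ranking score
    """
    # Step 1: count occurrences, which also removes duplicates.
    frequencies = Counter(candidates)
    scored = []
    for keyword, count in frequencies.items():
        # Step 2: average distance to every keyword of the initial set.
        average = sum(distance(keyword, x) for x in initial) / len(initial)
        scored.append((score(count, average), keyword))
    # Step 3: highest score first; ties keep first-seen order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [keyword for _, keyword in scored]


if __name__ == "__main__":
    toy = {("cat", "cat"): 0.0, ("kitten", "cat"): 0.2, ("car", "cat"): 0.9}
    words = ["kitten", "car", "car", "kitten", "kitten", "cat"]
    print(rank_keywords(words, ["cat"], lambda t, x: toy[(t, x)]))
```

Step 4 then takes the top entries of each ranked list according to the counts given by formula (3).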
The keyword ranking criterion should consider both relevance and social attention, and formula (2) measures two factors: keyword frequency and the keyword's average relevance to the initial keyword set. The first factor, keyword frequency, measures a keyword's potential to attract social attention. The second, relevance to the initial keyword set, measures how relevant the keyword is to the video. The two factors constrain each other: a keyword that occurs often but has low relevance scores lower, and vice versa. Together they determine the best keywords to recommend to the user.
The above embodiment does not limit the invention, and the invention is not restricted to it; any implementation that meets the requirements of the invention falls within its scope of protection.

Claims (4)

1. A method for improving the social attention of videos based on keyword recommendation, characterized in that the method is implemented with the following steps:
Step 1. Obtain the initial video keywords: for a given video, extract K initial keywords related to the video from the initial title provided by the user when uploading it, forming the initial keyword set X;
Step 2. WordNet semantic expansion of the initial keywords: look up semantically similar keywords in WordNet for each of the K initial keywords obtained above and, combined with the initial keyword set, expand them into the WordNet preliminary semantic keyword set;
Step 3. Further expansion via major video sharing sites: search the major video sharing sites with the WordNet preliminary semantic keyword set, extract keywords that can attract more social attention and, combined with the WordNet preliminary semantic keyword set, expand them into the WordNet final semantic keyword set;
Step 4. Extract the video entity keyword set: mine the video content with deep learning to form the video entity keyword set;
Step 5. Rank the keyword sets: taking both relevance and social attention into account, rank the semantic and entity keyword sets by keyword frequency and by the keyword's average relevance to the initial keyword set, to determine the keyword set finally offered to the user.
2. The method for improving the social attention of videos based on keyword recommendation according to claim 1, characterized in that: in step 3, based on the idea that videos on the same or similar topics can be found on multiple sites, the WordNet preliminary semantic keywords obtained in step 2 are searched on the major video sites and, using the search engine and recommender system that the video sharing sites themselves provide, keywords that can attract more social attention are extracted and expanded into the final semantic keyword set.
3. The method for improving the social attention of videos based on keyword recommendation according to claim 1, characterized in that: in step 4, key frames of the video are extracted at fixed intervals and, with the ImageNet image set as the backend, the video content is mined with deep learning to form the video entity keyword set.
4. The method for improving the social attention of videos based on keyword recommendation according to claim 1, characterized in that step 5 proceeds as follows:
5.1 Keyword frequency calculation: sort the keywords in the WordNet final semantic keyword set and in the video entity keyword set separately by number of occurrences, consolidate each into a new keyword set without duplicates, and record the number of occurrences of each keyword;
5.2 NGD distance calculation: use formula (1) to calculate the NGD distance between the deduplicated keyword sets obtained in step 5.1 and the initial keyword set obtained in step 1:
where h(t) and h(X_i) denote the number of search results returned by search engine G for keyword t of the deduplicated keyword set from step 5.1 and for keyword X_i of the initial keyword set X, respectively; h(t, X_i) denotes the number of results returned when both keywords are searched together; and N denotes the number of web pages that search engine G can index;
5.3 TF-SIM ranking calculation: substitute the keyword frequency and the NGD distance to the initial keyword set X obtained in the first two steps into formula (2), and re-rank the keywords by the magnitude of the resulting TF-SIM similarity value to form a new keyword set;
where T_t denotes the number of times keyword t occurs, X denotes the initial keyword set, and n denotes the number of keywords in X;
5.4 Final keyword extraction: use formula (3) to calculate how many keywords the WordNet final semantic keyword set and the video entity keyword set each contribute to the keyword set recommended to the user, finally obtaining the required most relevant keywords:
T_n = T_s + δT_s (3)
where T_n denotes the number of keywords to recommend to the user, T_s denotes the number of keywords drawn from the semantic keyword set, δT_s denotes the number drawn from the entity keyword set, and δ is an empirical value.
CN201610884840.8A 2016-10-11 2016-10-11 A kind of video society attention rate improvement method recommended based on keyword Active CN106528633B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610884840.8A CN106528633B (en) 2016-10-11 2016-10-11 A kind of video society attention rate improvement method recommended based on keyword

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610884840.8A CN106528633B (en) 2016-10-11 2016-10-11 A kind of video society attention rate improvement method recommended based on keyword

Publications (2)

Publication Number Publication Date
CN106528633A CN106528633A (en) 2017-03-22
CN106528633B true CN106528633B (en) 2019-07-02

Family

ID=58331590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610884840.8A Active CN106528633B (en) 2016-10-11 2016-10-11 A kind of video society attention rate improvement method recommended based on keyword

Country Status (1)

Country Link
CN (1) CN106528633B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108009293B (en) * 2017-12-26 2022-08-23 北京百度网讯科技有限公司 Video tag generation method and device, computer equipment and storage medium
CN109992720A (en) * 2018-11-15 2019-07-09 厦门笨鸟电子商务有限公司 A kind of system and method for promoting user and writing attention rate of the content in social media
CN109992656A (en) * 2018-11-15 2019-07-09 厦门笨鸟电子商务有限公司 A kind of machine writing system and method with high attention rate content issued in social media
CN111061939B (en) * 2019-12-31 2023-03-24 西安理工大学 Scientific research academic news keyword matching recommendation method based on deep learning
CN113836289B (en) * 2021-08-16 2023-06-09 北京邮电大学 Entity evolution rule recommendation method and device
CN116304315B (en) * 2023-02-27 2024-02-06 广州兴趣岛信息科技有限公司 Intelligent content recommendation system for online teaching

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101901249A (en) * 2009-05-26 2010-12-01 复旦大学 Text-based query expansion and sort method in image retrieval
CN101977319A (en) * 2010-11-03 2011-02-16 上海交通大学 Method for generating and authenticating hidden video tags based on video characteristics and digital signatures
CN102214173A (en) * 2010-04-02 2011-10-12 富士通株式会社 Method and device for choosing keywords for web publishing
CN102982076A (en) * 2012-10-30 2013-03-20 新华通讯社 Multi-dimensionality content labeling method based on semanteme label database
CN104657376A (en) * 2013-11-20 2015-05-27 航天信息股份有限公司 Searching method and searching device for video programs based on program relationship
CN105404619A (en) * 2015-09-08 2016-03-16 华南理工大学 Similarity based semantic Web service clustering labeling method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080294624A1 (en) * 2007-05-25 2008-11-27 Ontogenix, Inc. Recommendation systems and methods using interest correlation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101901249A (en) * 2009-05-26 2010-12-01 复旦大学 Text-based query expansion and sort method in image retrieval
CN102214173A (en) * 2010-04-02 2011-10-12 富士通株式会社 Method and device for choosing keywords for web publishing
CN101977319A (en) * 2010-11-03 2011-02-16 上海交通大学 Method for generating and authenticating hidden video tags based on video characteristics and digital signatures
CN102982076A (en) * 2012-10-30 2013-03-20 新华通讯社 Multi-dimensionality content labeling method based on semanteme label database
CN104657376A (en) * 2013-11-20 2015-05-27 航天信息股份有限公司 Searching method and searching device for video programs based on program relationship
CN105404619A (en) * 2015-09-08 2016-03-16 华南理工大学 Similarity based semantic Web service clustering labeling method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Boosting video popularity through keyword suggestion;Renjie Zhou 等;《Neurocomputing》;20160520;第205卷;529-541 *

Also Published As

Publication number Publication date
CN106528633A (en) 2017-03-22

Similar Documents

Publication Publication Date Title
CN106528633B (en) A kind of video society attention rate improvement method recommended based on keyword
CN105488196B (en) A kind of hot topic automatic mining system based on interconnection corpus
CN108763321B (en) Related entity recommendation method based on large-scale related entity network
CN102929873B (en) Method and device for extracting searching value terms based on context search
Armentano et al. Followee recommendation based on text analysis of micro-blogging activity
CN104885081A (en) Search system and corresponding method
US10176265B2 (en) Awareness engine
US20160085869A1 (en) Social media content analysis and output
CN104866554B (en) A kind of individuation search method and system based on socialization mark
Cui et al. Social-sensed image search
CN103793434A (en) Content-based image search method and device
Ma et al. Your Tweets Reveal What You Like: Introducing Cross-media Content Information into Multi-domain Recommendation.
Armentano et al. Recommending information sources to information seekers in Twitter
CN108021667A (en) A kind of file classification method and device
Kawase et al. Exploiting the wisdom of the crowds for characterizing and connecting heterogeneous resources
Wu et al. Web video recommendation and long tail discovering
Milajevs et al. Real time discussion retrieval from twitter
JP2008102790A (en) Retrieval system
Lossio-Ventura et al. Communication overload management through social interactions clustering
Redondoio Garcia et al. Describing and contextualizing events in tv news show
Kim et al. TV program searching and ranking for supporting TV personaliztion
Musto et al. A tag recommender system exploiting user and community behavior
Musto et al. Combining collaborative and content-based techniques for tag recommendation
Shaikh et al. IRuSL: image recommendation using semantic link
Belém et al. Tagging and Tag Recommendation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant