CN107357889A - A cross-social-platform picture recommendation algorithm based on content or emotion similarity - Google Patents

A cross-social-platform picture recommendation algorithm based on content or emotion similarity

Info

Publication number
CN107357889A
Authority
CN
China
Prior art keywords
picture
text
emotion
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710560717.5A
Other languages
Chinese (zh)
Other versions
CN107357889B (en)
Inventor
毋立芳
祁铭超
刘爽
简萌
杨博文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology
Priority to CN201710560717.5A
Publication of CN107357889A
Application granted
Publication of CN107357889B
Legal status: Active
Anticipated expiration

Links

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 - Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/35 - Clustering; Classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/50 - Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F 16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F 16/5866 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, manually generated location and time information
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/90 - Details of database functions independent of the retrieved data types
    • G06F 16/95 - Retrieval from the web
    • G06F 16/953 - Querying, e.g. by the use of web search engines
    • G06F 16/9535 - Search customisation based on user profiles and personalisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/20 - Natural language analysis
    • G06F 40/279 - Recognition of textual entities
    • G06F 40/289 - Phrasal analysis, e.g. finite state techniques or chunking
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Library & Information Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A cross-social-platform picture recommendation algorithm based on content or emotion similarity, relating to the technical fields of smart media computing and big-data analysis. First, the text generated by a user on an image-and-text sharing platform is analyzed and its keyword information is extracted; the pictures on a picture-oriented social platform are analyzed for content in the same way, and text is matched against pictures by content consistency to obtain an initial content-based recommendation list. Second, sentiment analysis is performed on the text; if the text carries emotion, text and pictures are matched by emotion consistency to obtain an initial emotion-based recommendation list. Then the user is modeled on the two social platforms separately to obtain the user's preferences. Finally, content, emotion, and user preference are fused to produce the final picture recommendation list. The invention makes the user's expression fuller and more apt, helping to significantly improve user experience and platform usage on social networks.

Description

A cross-social-platform picture recommendation algorithm based on content or emotion similarity
Technical field
The present invention relates to the technical fields of smart media computing and big-data analysis, and in particular to a picture recommendation method.
Background art
In the Web 2.0 era, social networks have developed along two trends: ever richer multimedia interaction and an ever lower threshold for user participation. From the early long-text social networks (news sites, blogs), to short-text networks (Twitter, Facebook, Sina Weibo, Renren), and on to visually oriented networks (Flickr, Snapchat, Instagram), media types have grown steadily richer (text, image, video) while the barrier to participation has steadily fallen. Later additions such as forwarding, favoriting, and liking let users express a personal viewpoint with one click on content generated, published, or forwarded by others. In step with this, expression that pairs text with images has become popular among users.
Social-network users often post text to express a viewpoint, mood, or emotion. As the saying goes, "a picture is worth a thousand words": if a passage of text can be paired with a suitable image, the user's mood and emotion are expressed more fully. In recent years this combined text-and-image style of expression has been widely adopted. As important platforms for obtaining information, exchanging ideas, and communicating, social networks have on the one hand become rich data sources; on the other hand, the convenience of publishing leads to extremely fast updates, enormous redundancy, and serious information overload. Under these conditions it is difficult for users to find the data they need within the mass of available data, and in particular to find a satisfactory picture within a short time.
To address this need, the present invention proposes a cross-social-platform picture recommendation method based on content or emotion similarity. Given user-generated text, the method automatically analyzes its content and emotional expression, automatically matches pictures from picture-oriented websites, and further incorporates the user's habits when making recommendations. This makes the user's expression fuller and more apt, helping to significantly improve user experience and platform usage on social networks.
Summary of the invention
The object of the present invention is to provide a cross-social-platform picture recommendation method based on content or emotion similarity, applicable to an image-and-text sharing platform and a picture-oriented social platform, where the picture platform supplies pictures to the image-and-text platform and the two platforms have users in common.
The invention comprises four parts: content-matched picture recommendation, emotion-matched picture recommendation, cross-platform user preference modeling, and re-ranking of the recommendation results by fusing content, emotion, and user preference.
First, the text generated by a user on the image-and-text sharing platform is analyzed and its keyword information is extracted; the pictures on the picture-oriented platform are analyzed for content in the same way, and text is matched against pictures by content consistency, yielding an initial content-based recommendation list. Second, sentiment analysis is performed on the text; if the text carries emotion, text and pictures are matched by emotion consistency, yielding an initial emotion-based recommendation list. Then the user is modeled on the two social platforms separately to obtain the user's preferences. Finally, content, emotion, and user preference are fused to produce the final picture recommendation list.
The concrete steps of the above cross-platform picture recommendation method are as follows:
1. Picture recommendation based on content matching
1.1 Text content analysis
User-generated text from the image-and-text sharing platform is first segmented and stopword-filtered with an existing word-segmentation tool, and parts of speech are tagged. Keywords are then extracted with the existing TextRank algorithm, sorted by weight in descending order, and a suitable number of top keywords is retained.
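To illustrate the kind of ranking TextRank performs, the following is a minimal pure-Python sketch (not the textrank4zh implementation the embodiment uses): it builds an undirected co-occurrence graph over already-segmented, stopword-free tokens and ranks words with a PageRank-style iteration. The toy token list is hypothetical.

```python
from collections import defaultdict

def textrank_keywords(tokens, window=3, d=0.85, iters=30):
    """Rank tokens by a TextRank-style random walk over a
    co-occurrence graph (tokens within `window` positions co-occur)."""
    graph = defaultdict(set)
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + window, len(tokens))):
            if tokens[i] != tokens[j]:
                graph[tokens[i]].add(tokens[j])
                graph[tokens[j]].add(tokens[i])
    score = {w: 1.0 for w in graph}
    for _ in range(iters):
        new = {}
        for w in graph:
            # Each neighbor v donates score[v] / degree(v).
            rank = sum(score[v] / len(graph[v]) for v in graph[w])
            new[w] = (1 - d) + d * rank
        score = new
    return sorted(score, key=score.get, reverse=True)

# Toy example: segmented, stopword-free tokens of a short post.
tokens = ["sunset", "beach", "sunset", "photo", "beach", "travel"]
print(textrank_keywords(tokens)[:2])
```

Real usage would feed in the segmented, stopword-filtered, POS-tagged tokens of the microblog text and keep only the nouns and adjectives, as the method prescribes.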
1.2 Picture content analysis
Since most pictures on picture-oriented social platforms come with fairly rich textual data, in the present invention picture content information is obtained from that textual data.
For a picture's description text, extraction proceeds exactly as in the text content analysis: the keywords obtained with TextRank serve as the picture's content labels.
For picture tags, part-of-speech analysis is applied directly, and the nouns and adjectives among them serve as picture content labels.
1.3 Content consistency analysis
Content consistency between the keywords produced in 1.1 and 1.2 is analyzed with the Word2vec tool. Word2vec applies ideas from deep learning to map words, through training, into a K-dimensional vector space, where the similarity of word vectors can be used to characterize the semantic similarity of words. Similarity is usually measured with cosine similarity: semantically close words have higher cosine similarity in the Word2vec space, i.e. the distance between them is smaller.
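The cosine measure used here can be sketched as follows; the 4-dimensional "embeddings" are toy stand-ins for real K-dimensional Word2vec vectors.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two word vectors: 1.0 for identical
    directions, 0.0 for orthogonal (unrelated) ones."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical tiny embeddings: "happy" should sit closer to
# "joyful" than to "table" in the trained space.
vec = {
    "happy":  [0.9, 0.1, 0.0, 0.2],
    "joyful": [0.8, 0.2, 0.1, 0.3],
    "table":  [0.0, 0.1, 0.9, 0.0],
}
print(cosine_similarity(vec["happy"], vec["joyful"]) >
      cosine_similarity(vec["happy"], vec["table"]))  # True
```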
2. Picture recommendation based on emotion matching
2.1 Text sentiment analysis
Sentiment analysis is applied to the text segmented in 1.1. First it is determined whether the text carries emotion, i.e. text emotion recognition; if it does, fine-grained sentiment analysis, i.e. multi-class emotion classification, is applied. In the multi-class stage, text emotion is divided into seven classes: joy, liking, anger, sorrow, fear, disgust, and surprise.
2.1.1 Text emotion recognition
When expressing emotion, users tend to choose words with a clear sentiment orientation. In the recognition stage we therefore first identify emotion indicator words/symbols according to part of speech and symbol features: adjectives, degree adverbs, interjections, microblog emoticons, internet slang, punctuation such as "!", and some nouns and verbs. Because many existing nouns and verbs carry no emotion, a sentiment dictionary must be built first when selecting noun and verb indicators: a noun or verb is treated as an emotion indicator only if it appears in the sentiment dictionary. The emotion intensity of the text is then determined according to formula (1).
Here EI denotes the emotion intensity of the user-generated text; N is the total length of the text (in words, punctuation included); f_i is the number of occurrences of each kind of emotion indicator word/symbol; and α_i ∈ (0,1) is the emotion intensity of each kind of indicator. Given a threshold TH ∈ (0,1), the text is considered emotional if EI > TH and unemotional otherwise. The values of α_i and TH must be determined by repeated experiments, taking the values that perform best.
Texts judged unemotional by this method are further analyzed with an SVM to decide again whether emotion is present. If the SVM still judges a text unemotional, the emotion label of the user's microblog is set directly to [0, 0, 0, 0, 0, 0, 0], where the dimensions of the vector are the ranking values for joy, liking, anger, sorrow, fear, disgust, and surprise, and 0 denotes absence of that emotion.
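Formula (1) itself is not reproduced in this text; from the definitions of EI, N, f_i, and α_i, one natural reading is EI = (Σ_i f_i·α_i) / N, which the sketch below assumes. The indicator counts are hypothetical; the α and TH values are those reported later in the implementation section.

```python
def emotion_intensity(counts, alphas, n_tokens):
    """Assumed reading of formula (1), which the source does not
    reproduce: EI = (sum_i f_i * alpha_i) / N."""
    return sum(f * a for f, a in zip(counts, alphas)) / n_tokens

# One microblog emoticon (alpha 0.9) and one interjection (alpha 0.73)
# in a 20-token microblog.
ei = emotion_intensity([1, 1], [0.9, 0.73], 20)
TH = 0.05  # threshold reported in the implementation section
print(ei, ei > TH)  # 0.0815 True -> text is judged emotional
```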
2.1.2 Multi-class text emotion classification
Fine-grained sentiment analysis is applied to the texts judged emotional in 2.1.1. A classifier f(·) is first built on a training set; from it the confidence f(x_i, y_j) of each of the seven emotion labels of a user-generated text is obtained, and formula (2) determines the emotion-label ranking values of each microblog.
Here rank_f(x_i, y_j) denotes the ranking value of emotion label y_j for microblog x_i, j = 1, ..., 7.
Following this method, this stage first applies a naive Bayes classifier to perform multi-class emotion classification on the user-generated text and sorts the results by probability, giving the ranking Rank1. An SVM classifier then performs multi-class classification; since the SVM is a binary classifier, the 1-v-1 scheme is used, giving Rank2. Finally the mean of Rank1 and Rank2 is taken as the final ranking, denoted Rank_w = [r_w1, r_w2, ..., r_w7], where Rank_w is the text's emotion ranking and r_wi (i ∈ [1,7]) is the ranking value of each emotion class with values in [1,7]; the smaller r_wi, the stronger the corresponding emotion.
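The averaging of the two rankings can be sketched directly; the example ranking vectors are hypothetical.

```python
def fuse_rankings(rank1, rank2):
    """Average the per-emotion ranking values from the naive Bayes
    and SVM classifiers; a lower value means a stronger emotion."""
    return [(a + b) / 2 for a, b in zip(rank1, rank2)]

EMOTIONS = ["joy", "liking", "anger", "sorrow", "fear", "disgust", "surprise"]
rank_nb  = [1, 3, 2, 4, 6, 5, 7]   # Rank1 from naive Bayes
rank_svm = [2, 1, 3, 4, 5, 7, 6]   # Rank2 from the 1-v-1 SVM
rank_w = fuse_rankings(rank_nb, rank_svm)
strongest = EMOTIONS[rank_w.index(min(rank_w))]
print(rank_w, strongest)  # joy has the smallest averaged rank
```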
2.2 Picture sentiment analysis
Picture sentiment analysis mainly comprises three parts: data collection, deep-learning model training, and picture emotion classification.
First, relevant pictures are retrieved from a picture website using emotion words as search terms, and the emotion corresponding to each search term is used as the picture's label, yielding an initial dataset. The dataset is then cleaned by checking the consistency of the sentiment polarity of the search term, the picture tags, and the picture's description text, yielding a purer dataset. A CNN-based emotion classification model is then trained on the cleaned dataset with deep-learning methods. Finally, the trained CNN model performs sentiment analysis on each picture, and the results are sorted by probability, denoted Rank_p = [r_p1, r_p2, ..., r_p7], where Rank_p is the picture's emotion ranking and r_pi (i ∈ [1,7]) is the ranking value of each emotion class with values in [1,7]; the smaller r_pi, the stronger the corresponding emotion.
For the fine-grained picture sentiment task, the emotion category labels are identical to those of multi-class text emotion, i.e. the seven classes joy, liking, anger, sorrow, fear, disgust, and surprise. For every picture in the dataset its class label is stored. In particular, because the seven emotion categories are imbalanced in the dataset, the classes are re-sampled after the initial dataset is assembled so that the number of pictures per class is balanced.
2.3 Emotion consistency analysis
The purpose of emotion consistency computation is to measure the similarity of text and picture in emotion space. We use a modified Kendall coefficient to compute the emotion consistency of the text's emotion ranking Rank_w and the picture's emotion ranking Rank_p, expressed as follows:
Here T_w,p is the emotion consistency of text and picture, with range [0,1]; the closer its value is to 1, the more emotionally correlated the text and picture are. C and D are, respectively, the numbers of emotion ranking values that agree and disagree between Rank_w of 2.1 and Rank_p of 2.2; n is the length of the emotion ranking vector, and since there are 7 emotion classes in the present invention, n takes the value 7.
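Formula (3) is not reproduced in this text. The sketch below assumes one common [0,1] rescaling of a Kendall-style coefficient, T = C / (C + D), with C and D counted as concordant and discordant pairs over the two rankings; the patent's exact formula may differ.

```python
from itertools import combinations

def emotion_consistency(rank_w, rank_p):
    """Pairwise agreement of two emotion rankings. C counts emotion
    pairs ordered the same way in both rankings, D pairs ordered
    oppositely; T = C / (C + D) is an assumed [0,1] normalization."""
    c = d = 0
    for i, j in combinations(range(len(rank_w)), 2):
        s = (rank_w[i] - rank_w[j]) * (rank_p[i] - rank_p[j])
        if s > 0:
            c += 1
        elif s < 0:
            d += 1
    return c / (c + d) if c + d else 1.0

# Identical rankings agree completely, reversed ones not at all.
print(emotion_consistency([1, 2, 3, 4, 5, 6, 7], [1, 2, 3, 4, 5, 6, 7]))  # 1.0
print(emotion_consistency([1, 2, 3, 4, 5, 6, 7], [7, 6, 5, 4, 3, 2, 1]))  # 0.0
```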
3. Cross-platform user preference modeling
Cross-platform user preference modeling is deployed on two social platforms: an image-and-text sharing platform and a picture-oriented social platform. The requirements on the image-and-text sharing platform are: 1) the platform allows users to post text accompanied by pictures; 2) the platform allows users to look up the image-and-text posts in their history. The requirements on the picture social platform are: 1) the platform allows users to post pictures with the necessary descriptions; 2) the platform provides a preset set of common picture categories and requires users to select the relevant category for each picture.
3.1 Building the picture attribute model
A picture-oriented social platform usually includes several picture categories. For each category, corresponding pictures are crawled from the picture website and split into a training set, a validation set, and a test set.
For classification-model training, the strategy is to fine-tune CaffeNet: the first 7 layers of CaffeNet are kept unchanged, and the output length of the last fully connected layer is changed to equal the number of picture categories of the picture platform described above. The collected data are then used to fine-tune a model pretrained on ImageNet as the initial model, and finally the model's confusion matrix on the test set is obtained.
3.2 User interest modeling on the picture social platform
User preference modeling on the picture sharing platform revolves around pictures and picture attributes, i.e. it characterizes the user's overall picture taste.
For a user u, all pictures in the user's collection are first gathered into a dataset I, and for each picture i its category attribute c is collected. Thus I_u = {i_1:c_1, i_2:c_2, ..., i_M:c_M}, where I_u is user u's picture dataset, i_m is the m-th collected picture, c_m is the picture's attribute drawn from the η chosen attribute categories, and M is the number of pictures the user has collected.
The user's preference model on this site can be characterized by the accumulated probability histogram of the picture categories:
Here M is the number of pictures in dataset I, and Hist(·) is a histogram function that returns, as a vector, the number of pictures under each category attribute.
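The histogram formula itself is not shown in this text; the sketch below assumes the straightforward reading ImagePrefer_u = Hist(I_u) / M, a normalized category histogram. The category names and collected pictures are hypothetical.

```python
from collections import Counter

def image_prefer(collected_categories, categories):
    """Normalized category histogram of a user's collected pictures:
    an assumed reading of the preference formula, Hist(I_u) / M."""
    m = len(collected_categories)
    hist = Counter(collected_categories)
    return [hist[c] / m for c in categories]

categories = ["travel", "food", "pets", "art"]      # platform's preset classes
collected = ["travel", "travel", "food", "pets", "travel"]
print(image_prefer(collected, categories))  # [0.6, 0.2, 0.2, 0.0]
```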
3.3 User picture preference modeling on the image-and-text sharing platform
The image-and-text sharing platform contains rich pictures and text, so the user preference model built on its data combines text and pictures, i.e. it characterizes the user's picture taste under each particular emotion.
For a user u, all posted image-and-text items are gathered into a dataset W. Each element of W consists of a text w and the corresponding pictures {i_1, i_2, ..., i_k}.
For the text data w, the text sentiment analysis model of 2.1 yields the predicted ranking vector of the seven emotion classes, E_w = (e_1, e_2, ..., e_7).
For a picture i of the picture set, the picture attribute model proposed in 3.1 yields the prediction probability vector of the picture's content attributes, P_i = (p_1, p_2, ..., p_11).
For a particular emotion e_m, all of the user's image-and-text items are traversed to obtain the attribute set of the corresponding pictures. The user's picture preference model under emotion e_m can then be characterized by the cumulative histogram of the per-category probabilities of the picture categories:
Here H is the number of pictures contained in the set.
3.4 Fusion of the cross-platform user preference models
For a given user, the image-and-text posts published on the sharing platform reflect the user's picture habits fairly directly and constitute the user's explicit interest. The pictures the user shares on the picture platform reflect a liking for the pictures themselves, but there is no clear evidence that the user would use them to accompany text, so the interest expressed on the picture platform constitutes the user's latent interest. Modeling the user on the combined data of both platforms captures explicit and latent interest together, so the recommendation effect is better.
This patent fuses the cross-platform user preference models by linear weighting, and when the user's explicit and latent interests differ, the explicit interest is emphasized. The fusion coefficient is formalized as follows:
Here dist(·) is a similarity measure between two vectors; in this patent we use cosine similarity as the metric function. ImplicitPrefer_u corresponds to ImagePrefer_u of 3.2, and the explicit preference corresponds to the emotion-conditioned preference model of 3.3.
Then, for a particular emotion e_m, the user's overall preference model is:
At this point we have obtained the user preference model expressed in the picture attribute space.
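Formulas (6) and (7) are not reproduced in this text. The sketch below is one assumed instance of the stated design: a linear weighting whose coefficient comes from the cosine similarity of the two profiles, so that when the profiles diverge (low similarity) the explicit profile dominates. The example profiles are hypothetical.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def fuse_preferences(explicit, implicit):
    """Assumed fusion: w = cosine(explicit, implicit);
    fused = w * implicit + (1 - w) * explicit. When the two profiles
    diverge, w shrinks and the explicit profile is emphasized."""
    w = cosine(explicit, implicit)
    return [w * i + (1 - w) * e for e, i in zip(explicit, implicit)]

explicit = [0.6, 0.2, 0.2, 0.0]   # emotion-conditioned profile from 3.3
implicit = [0.3, 0.3, 0.2, 0.2]   # ImagePrefer_u from 3.2
print(fuse_preferences(explicit, implicit))
```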
4. Picture recommendation fusing content, emotion, and user preference
For user u, the cross-platform user preference modeling method of section 3 gives the user preference model Prefer_u.
For a text w generated by user u, the recommendation index of each candidate picture p is computed as
Rec(Prefer_u, w, p) = min(S_w,p) · T_w,p · Prefer_u|type=p.type    (8)
Here Prefer_u|type=p.type is the user's degree of preference for the picture category to which p belongs, obtained from formula (7); T_w,p is the emotion consistency of text w and picture p, obtained from formula (3); and S_w,p is the keyword similarity matrix of w and p, obtained from the cosine similarities of the keywords of text and picture computed in 1.3.
Finally, candidates are sorted by recommendation index in descending order to obtain the recommended picture sequence.
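The scoring and sorting of formula (8) can be sketched directly, given precomputed per-candidate factors; the candidate tuples below are hypothetical.

```python
def recommend(candidates):
    """Score each candidate picture with formula (8),
    Rec = min(S_wp) * T_wp * Prefer_u|type, then sort descending.
    Each candidate is (picture_id, s_min, t_wp, prefer)."""
    scored = [(pid, s * t * pref) for pid, s, t, pref in candidates]
    return [pid for pid, score in sorted(scored, key=lambda x: x[1],
                                         reverse=True)]

candidates = [
    ("p1", 0.8, 0.9, 0.6),   # Rec = 0.432
    ("p2", 0.9, 0.5, 0.3),   # Rec = 0.135
    ("p3", 0.7, 0.8, 0.9),   # Rec = 0.504
]
print(recommend(candidates))  # ['p3', 'p1', 'p2']
```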
Brief description of the drawings
Fig. 1 is the overall framework of the algorithm designed in the present invention;
Fig. 2 is the fine-grained sentiment-analysis deep-learning framework used in the present invention (CaffeNet-emo7);
Fig. 3 shows the picture categories of the petal net used in the implementation of the present invention;
Embodiment
To make the implementation of the present invention clearer, the invention is described in detail below with reference to the accompanying drawings, taking Sina Weibo and the petal net as examples, where Sina Weibo is the image-and-text sharing platform, the petal net is the picture sharing platform, and the link between the two is their common users. Applied to Sina Weibo and the petal net, the purpose of the invention is to provide, when a user posts a microblog, pictures that match the user's current emotion and textual content. Fig. 1 shows the overall framework of the invention; the concrete implementation steps are as follows:
1. Experimental data collection
Because Sina Weibo and the petal net do not provide fully open data collection interfaces, in the concrete implementation we used web crawlers to collect the data, choosing Scrapy as the crawler framework to ensure the efficiency and durability of the program.
For petal-net users we crawled basic personal information together with all favorites and board information, and stored the collected data by category. For microblog users we crawled basic personal information and all posted microblogs. In total we crawled 11,000 users common to Sina Weibo and the petal net and filtered out the 100 users with relatively high activity as experiment users. For each experiment user we collected all original microblogs posted on Weibo with their corresponding pictures, and all repinned pictures on the petal net with all their metadata. We also collected the 11 picture classes that the invention uses for user modeling.
2. Picture recommendation based on content matching
2.1 Text content analysis
When posting an original microblog, users may start a topic (#XXXX#) or mention other users (@XXX). Non-essential information such as #XXXX# and @XXX interferes with text analysis, so before applying content and sentiment analysis to the microblog dataset these interfering items must be filtered out. We filter the microblog texts collected in step 1 with regular expressions.
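A minimal sketch of this regex cleaning step (the exact patterns the implementation used are not given, so these are assumed equivalents for #XXXX# topics and @XXX mentions):

```python
import re

def clean_microblog(text):
    """Strip topic tags (#XXXX#) and user mentions (@XXX) before
    content and sentiment analysis, then normalize whitespace."""
    text = re.sub(r"#[^#]+#", " ", text)   # topic tags
    text = re.sub(r"@\S+", " ", text)      # user mentions
    return re.sub(r"\s+", " ", text).strip()

print(clean_microblog("Great sunset today #travel# thanks @photo_fan !"))
# Great sunset today thanks !
```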
This stage extracts keywords from the text with the TextRank algorithm. Keyword extraction is performed on the cleaned test set with the textrank4zh module on the Python 2.7 platform. Considering that microblogs are short texts, the present invention sets the TextRank co-occurrence window to 3 and retains only nouns and adjectives. Fig. 2 shows keyword examples extracted from some microblog texts.
2.2 Picture content analysis
The petal net is an interest-based social sharing website intended to help users discover pictures they like and to reorganize and collect those pictures. Petal-net pictures come with corresponding description text, so in the present invention picture content information is obtained from the description text.
When extracting a picture's content information, the method is kept consistent with the content analysis of microblog text: the keyword extraction of 2.1 is applied directly to the picture's description text, finally yielding the keywords of the picture content.
2.3 Content consistency analysis
Content consistency between the keywords produced in 2.1 and 2.2 is analyzed with the Word2vec tool. Word2vec applies ideas from deep learning to map words, through training, into a K-dimensional vector space, where the similarity of word vectors can characterize the semantic similarity of words; cosine similarity is usually used as the measure, and semantically close words have higher cosine similarity, i.e. smaller distance, in the Word2vec space.
For the content consistency analysis, the Word2vec model was pretrained on the complete web-news database of the Sogou Laboratory together with a partial collection of microblog text, with the feature vector dimension set to 200.
3. Picture recommendation based on emotion matching
3.1 Text sentiment analysis
Sentiment analysis is applied to the texts from which interference information was filtered in 2.1. First it is determined whether the text carries emotion, i.e. text emotion recognition; if it does, fine-grained sentiment analysis, i.e. multi-class emotion classification, is applied. In the multi-class stage, text emotion is divided into seven classes: joy, liking, anger, sorrow, fear, disgust, and surprise.
3.1.1 Text emotion recognition
When expressing emotion, users tend to choose words with a clear sentiment orientation. In the recognition stage we first identify emotion indicator words/symbols according to part of speech and symbol features: adjectives, degree adverbs, interjections, microblog emoticons, internet slang, punctuation such as "!", and some nouns and verbs. Because many existing nouns and verbs carry no emotion, a sentiment dictionary must be built first when selecting noun and verb indicators: a noun or verb is treated as an emotion indicator only if it appears in the sentiment dictionary. For convenience of implementation, a subset of the emotion words in the Chinese emotion vocabulary ontology of the Information Retrieval Laboratory of Dalian University of Technology was selected as the sentiment dictionary. The emotion intensity of the microblog is then determined according to formula (1).
Here EI denotes the microblog's emotion intensity; N is the total length of the microblog (in words, punctuation included); f_i is the occurrence count of each kind of emotion indicator word/symbol; and α_i ∈ (0,1) is the emotion intensity of each kind of indicator. Given a threshold TH ∈ (0,1), the microblog is considered emotional if EI > TH and unemotional otherwise. α_i and TH must be determined by repeated experiments, taking the values that perform best.
In the implementation, repeated testing set the emotion intensities α_i of adjectives, degree adverbs, interjections, microblog emoticons, internet slang, punctuation, and nouns/verbs to 0.45, 0.35, 0.73, 0.9, 0.85, 0.75, and 0.34 respectively, with the threshold TH set to 0.05. Microblog texts are segmented and part-of-speech tagged with the jieba segmentation tool, from which the occurrence counts f_i of adjectives, degree adverbs, interjections, nouns, and verbs are computed; regular expressions count the occurrences of microblog emoticons, internet slang, and the punctuation symbols. The emoticon library and internet-slang library needed for this computation were built manually; the emoticon library contains 1,516 emoticons and the slang library contains 159 internet slang terms.
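The regex-based counting of the non-POS indicators can be sketched as follows; the two-entry libraries below are tiny hypothetical stand-ins for the 1,516-emoticon and 159-term libraries described above.

```python
import re

def count_indicators(text, emoticons, slang):
    """Count regex-matchable indicator occurrences: library emoticons,
    internet-slang entries, and '!' marks."""
    n_emo = sum(len(re.findall(re.escape(e), text)) for e in emoticons)
    n_slang = sum(len(re.findall(re.escape(s), text)) for s in slang)
    n_bang = len(re.findall(r"!", text))
    return n_emo, n_slang, n_bang

text = "[smile] so cool!! lol [smile]"
print(count_indicators(text, ["[smile]", "[cry]"], ["lol"]))  # (2, 1, 2)
```

The returned counts would feed into the f_i terms of formula (1), alongside the POS-based counts from jieba.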
To the ameleia text determined in aforementioned manners, further analyzed with SVM methods, judge the presence or absence of emotion.
In implementation process, one of the task of SVM training set in NLPCC2014:Emotion Analysis in Chinese Weibo Texts training set, altogether comprising 14000 microbloggings.First have to enter line number to training set with 2.1 method According to cleaning, feature extraction then is carried out with the Word2vec ameleia microbloggings judged to training set and previous step, finally uses SVM The ameleia microblogging judged to previous step determines whether.In the wherein Python of the implementation of SVM algorithm Svm bags in sklearn storehouses, kernel function are rbf cores, C=2.
If the SVM also judges the post emotionless, the emotion label of the user's post is set directly to [0,0,0,0,0,0,0].
3.1.2 Multi-class text emotion classification
Fine-grained sentiment analysis is performed on the posts judged emotional in 3.1.1. A classifier f(·) is first built with the training set; from it the confidence f(x_i, y_j) of each of the seven emotion labels of each post is obtained, and the emotion-label ranking value of each post is determined by formula (2):

rank_f(x_i, y_j) = Σ_{k=1, k≠j}^{7} ‖f(x_i, y_j) > f(x_i, y_k)‖    (2)

where ‖·‖ is 1 if the enclosed inequality holds and 0 otherwise.
Here rank_f(x_i, y_j) denotes the ranking value of emotion label y_j for post x_i, j ∈ [1,7].
Following this method, this stage first applies a Naive Bayes classifier for multi-class emotion classification of the posts; the results are sorted by probability to obtain the ranking Rank1. An SVM classifier then performs the multi-class classification; since SVM is a binary classifier, the one-vs-one (1-v-1) scheme is used, requiring 21 SVM classifiers in total and yielding Rank2. Finally the average of Rank1 and Rank2 is taken as the final ranking, denoted Rank_w = [r_w1, r_w2, …, r_w7], where Rank_w is the emotion ranking of the text and r_wi (i ∈ [1,7]) is the ranking value of each emotion class, taking values in [1,7]; the smaller r_wi, the stronger the corresponding emotion.
When classifying with the Naive Bayes classifier, text features are extracted with TF-IDF, whereas the SVM classifier again uses Word2vec features; this ensures the complementarity of the two feature sets.
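The ranking-and-averaging step above can be sketched in a few lines; `probs_to_ranks` and `fuse_ranks` are illustrative names, and the probability vectors would in practice come from the Naive Bayes and SVM classifiers.

```python
def probs_to_ranks(probs):
    """Rank the 7 emotion classes by probability: rank 1 = most probable,
    so a smaller ranking value means a stronger emotion."""
    return [1 + sum(q > p for q in probs) for p in probs]

def fuse_ranks(rank1, rank2):
    """Average the Naive Bayes ranking (Rank1) and the SVM ranking (Rank2)
    to obtain the final text emotion ranking Rank_w."""
    return [(a + b) / 2 for a, b in zip(rank1, rank2)]
```

For instance, the probability vector [0.30, 0.25, 0.15, 0.12, 0.10, 0.05, 0.03] maps to the ranking [1, 2, 3, 4, 5, 6, 7].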
3.2 picture sentiment analysis
Picture sentiment analysis comprises three parts: data collection, deep-learning model training, and picture sentiment analysis.
First, relevant pictures are obtained from Huaban ('petal net', huaban.com) using emotion words as search terms, and the emotion corresponding to each search term is taken as the picture label, yielding an initial data set. The data set is then cleaned according to the consistency of the sentiment polarity of the search term, the picture label and the picture's descriptive text, yielding a purer data set. A CNN-based sentiment classification model is then trained on the cleaned data set by deep learning. Finally, the trained CNN model performs sentiment analysis on the pictures; the results are sorted by probability and denoted Rank_p = [r_p1, r_p2, …, r_p7], where Rank_p is the emotion ranking of the picture, r_pi (i ∈ [1,7]) is the ranking value of each emotion class with values in [1,7], and the smaller r_pi, the stronger the corresponding emotion.
For the fine-grained picture sentiment analysis task, the emotion category labels are the same seven as in multi-class text emotion classification: happiness, goodness, anger, sorrow, fear, disgust and surprise. For each picture in the data set, its class label is stored. In particular, since the seven emotion categories are imbalanced in the data set, resampling is performed per category after the initial data set is built, so that the number of pictures per class is balanced.
In our implementation, the deep-learning framework for fine-grained sentiment analysis, CaffeNet-emo7, is shown in Figure 2. The network consists of 5 convolutional layers, 3 fully connected layers and 1 softmax layer; ReLU is chosen as the neuron activation function, and pooling layers are added after the first two convolutional layers and the fifth convolutional layer. So that a pre-trained model can be fine-tuned, we keep the configuration of the convolutional layers, the pooling layers and the first two fully connected layers fully consistent with CaffeNet; the size of the last fully connected layer is changed to 7, and it is named fc8_s. The softmax layer outputs the picture's emotion category (happiness, goodness, anger, sorrow, fear, disgust, surprise).
3.3 emotion consistency analysis
The purpose of the emotion consistency computation is to measure the similarity of the text and the picture in the emotion space. We use a modified Kendall coefficient to compute the emotion consistency of the text emotion ranking Rank_w and the picture emotion ranking Rank_p, expressed as follows:

T_{w,p} = (1/2) · ((C − D) / (n(n−1)/2) + 1)    (3)
Here T_{w,p} is the emotion consistency of the text and the picture, with value range [0,1]; the closer it is to 1, the more correlated the emotions of the text and the picture. C and D denote the numbers of emotion ranking values that agree and that differ between Rank_w of 3.1.2 and Rank_p of 3.2; n is the length of the emotion ranking vector, and since there are 7 emotion categories in this invention, n = 7.
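A small sketch of formula (3), reading C and D as the concordant and discordant label pairs of the two rankings (the usual Kendall sense, which is our interpretation of the text's "agreeing/differing" counts):

```python
from itertools import combinations

def emotion_consistency(rank_w, rank_p):
    """Formula (3): T = 0.5 * ((C - D) / (n*(n-1)/2) + 1), where C and D count
    the emotion-label pairs ordered the same way / the opposite way in the
    text ranking Rank_w and the picture ranking Rank_p."""
    n = len(rank_w)  # 7 emotion categories in this invention
    c = d = 0
    for i, j in combinations(range(n), 2):
        s = (rank_w[i] - rank_w[j]) * (rank_p[i] - rank_p[j])
        if s > 0:
            c += 1
        elif s < 0:
            d += 1
    return 0.5 * ((c - d) / (n * (n - 1) / 2) + 1)
```

Identical rankings give T = 1 and fully reversed rankings give T = 0, matching the stated [0,1] range.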
4. cross-platform user preference modeling
4.1 establish picture attribute model
The picture taxonomy of Huaban comprises 32 top-level categories; after screening, we finally selected 11 categories that can express emotion and are readily used by users as accompanying pictures, as shown in Figure 3.
For each of the above categories, 11,000 images were crawled from Huaban; 9,000 were taken as the training set, 1,000 as the validation set, and 1,000 as the test set.
For classification model training we adopt a fine-tuning strategy on CaffeNet: the first 7 layers of CaffeNet are kept unchanged, and the output vector length of the last fully connected layer is changed to equal the number of picture categories selected above, i.e. it is set to 11 in our implementation. The collected data are then used to fine-tune the model pre-trained on ImageNet, which serves as the initial model. The confusion matrix of the resulting model on the test set is shown in the table below.
As the table shows, the overall performance of the classifier broadly meets the requirements; deep learning discriminates concrete categories such as cars, food and children rather well, while recognition of more abstract, higher-level semantic categories such as gifts is somewhat worse.
| True \ Predicted | Cars/motor | Food | Fitness/sports | Pets | Anime/games | Beauty | Geek | Gifts | Children | Travel | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Cars/motor | 0.902 | 0 | 0.022 | 0 | 0.006 | 0.004 | 0.062 | 0.008 | 0.002 | 0.014 | 0.008 |
| Food | 0 | 0.832 | 0.02 | 0.024 | 0.006 | 0.008 | 0.006 | 0.078 | 0.008 | 0.018 | 0.01 |
| Fitness/sports | 0.008 | 0.01 | 0.76 | 0.01 | 0.008 | 0.056 | 0.026 | 0.012 | 0.022 | 0.01 | 0.02 |
| Pets | 0.002 | 0.016 | 0.01 | 0.83 | 0.018 | 0.02 | 0.006 | 0.038 | 0.028 | 0.028 | 0.004 |
| Anime/games | 0 | 0.01 | 0.018 | 0.018 | 0.798 | 0.022 | 0.044 | 0.026 | 0.018 | 0.054 | 0.018 |
| Beauty | 0.008 | 0.008 | 0.05 | 0.022 | 0.028 | 0.734 | 0.006 | 0.044 | 0.082 | 0.018 | 0.006 |
| Geek | 0.054 | 0.006 | 0.028 | 0.004 | 0.044 | 0.002 | 0.764 | 0.056 | 0.002 | 0.006 | 0.016 |
| Gifts | 0.004 | 0.062 | 0.014 | 0.018 | 0.01 | 0.028 | 0.044 | 0.648 | 0.022 | 0.014 | 0.042 |
| Children | 0 | 0.002 | 0.038 | 0.014 | 0.006 | 0.1 | 0.006 | 0.016 | 0.81 | 0.018 | 0.006 |
| Travel | 0.006 | 0.018 | 0.024 | 0.038 | 0.036 | 0.022 | 0.004 | 0.014 | 0.006 | 0.706 | 0.15 |
| Architecture | 0.016 | 0.036 | 0.016 | 0.022 | 0.04 | 0.004 | 0.032 | 0.06 | 0 | 0.114 | 0.72 |
4.2 User interest modeling on the picture social platform
Because the data of Huaban are picture-centric, the user preference model based on Huaban data is built mainly around picture attributes, i.e. it counts the user's overall picture-content preferences.
For a user u, the set I of all pictures the user has collected is gathered first, and for each picture i its category attribute c is collected. For user u, we have I_u = {i_1:c_1, i_2:c_2, …, i_m:c_η}, where I_u denotes user u's picture data set, i_m is the m-th picture collected by the user, c_η is the picture attribute of that picture, η is the number of selected picture-attribute categories, and m is the number of pictures the user has collected.
The user preference model on this site can be characterized by the cumulative probability histogram of picture categories:

ImagePrefer_u = hist({c_1, c_2, …, c_η}) / M    (4)
Here M is the number of pictures in data set I, and hist(·) is the histogram function, whose role is to obtain the picture count under each category attribute; its return value is a vector.
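Formula (4) amounts to a normalized category histogram. A sketch follows, where the English category names are illustrative renderings of the 11 Huaban classes:

```python
from collections import Counter

# Illustrative names for the 11 picture categories of the attribute model.
CATEGORIES = ["cars", "food", "fitness", "pets", "anime", "beauty",
              "geek", "gifts", "children", "travel", "architecture"]

def image_prefer(collected_categories):
    """Formula (4): normalized histogram of the categories of the M pictures
    a user has collected on the picture social platform."""
    m = len(collected_categories)
    counts = Counter(collected_categories)
    return [counts[c] / m for c in CATEGORIES]
```

The returned vector sums to 1 and each entry is the fraction of the user's collected pictures that fall in the corresponding category.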
4.3 Picture-pairing preference modeling on the image-text sharing social platform
Microblog data contain rich picture and text information, so the user preference model based on microblog data is built from text and pictures jointly, i.e. it counts the user's picture-pairing preference under each particular emotion.
For a user u, the data set W of all image-text posts the user has published is collected. Each element of W consists of a text w and its accompanying pictures {i_1, i_2, …, i_k}.
For a text w, the text sentiment analysis model of 3.1 gives the predicted ranking vector of the seven emotion classes of the text, E_w = (e_1, e_2, …, e_7).
For each picture i of the picture set, the picture attribute model proposed in 4.1 gives the predicted probability vector of the picture's content attributes, P_i = (p_1, p_2, …, p_11).
For a particular emotion e_m, all of the user's image-text posts are traversed to obtain the attribute set IP_{e_m} of the accompanying pictures. The user's picture-pairing preference model under emotion e_m can then be characterized by the cumulative histogram of the per-category probabilities of those pictures:

TextPrefer_{u,e_m} = (Σ IP_{e_m}) / H    (5)
Here H is the number of pictures contained in the IP_{e_m} set.
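Formula (5) is an element-wise average of the attribute probability vectors; a minimal sketch follows (the selection of the posts carrying emotion e_m is assumed to have been done upstream):

```python
def text_prefer(attribute_vectors):
    """Formula (5): element-wise mean of the attribute probability vectors P_i
    of all H pictures a user attached to posts carrying emotion e_m."""
    h = len(attribute_vectors)
    dim = len(attribute_vectors[0])
    return [sum(p[k] for p in attribute_vectors) / h for k in range(dim)]
```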
4.4 Fusion of the cross-platform user preference models
For a user, the image-text posts the user publishes reflect the user's picture-pairing habits fairly directly and constitute the user's explicit interest. The pictures the user shares on the picture social platform reflect the user's liking for pictures, but there is no clear evidence that the user would use those pictures to accompany text, so the user's interest on the picture social platform constitutes the user's latent interest. User modeling that combines the data of both platforms captures both the explicit and the latent interest of the user, yielding better recommendations.
Accordingly, in our implementation this patent fuses the Weibo and Huaban cross-platform user preference models by linear weighting, and when the user's explicit and latent interests differ, the explicit interest is weighted more heavily. The fusion coefficient is formalized as follows:

β_{u,e_m} = dist(ImplicitPrefer_u, ExplicitPrefer_{u,e_m})    (6)
Here dist(·) is a similarity measure between two vectors; in this invention we use cosine similarity as the metric. ImplicitPrefer_u corresponds to ImagePrefer_u of 4.2, and ExplicitPrefer_{u,e_m} corresponds to TextPrefer_{u,e_m} of 4.3.
Then, for a particular emotion e_m, the user's overall preference model is:

Prefer_{u,e_m} = ExplicitPrefer_{u,e_m} + β_{u,e_m} · ImplicitPrefer_u    (7)
This yields the user preference model expressed in the picture-attribute space.
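Formulas (6) and (7) can be sketched with a plain cosine similarity; vector names mirror the text:

```python
import math

def cosine(u, v):
    """Cosine similarity between two preference vectors."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv)

def fused_prefer(explicit, implicit):
    """Formulas (6)-(7): the latent (picture-platform) preference is weighted by
    its cosine similarity beta to the explicit (image-text platform) preference,
    so it contributes less when the two interests diverge."""
    beta = cosine(implicit, explicit)
    return [e + beta * i for e, i in zip(explicit, implicit)]
```

When the two interests coincide, beta = 1 and the latent preference contributes in full; when they are orthogonal, beta = 0 and only the explicit interest remains.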
5. Picture recommendation fusing content, emotion and user preference
For a user u, the cross-platform user preference modeling method of Section 4 gives the user preference model Prefer_u.
For a text w generated by user u, the recommendation index of each candidate picture p is computed as:
Rec(Prefer_u, w, p) = min(S_{w,p}) · T_{w,p} · Prefer_{u|type=p.type}    (8)
Here Prefer_{u|type=p.type} denotes the user's preference degree for the picture category that picture p belongs to, given by formula (7); T_{w,p} is the emotion consistency of text w and picture p, given by formula (3); S_{w,p} is the keyword similarity matrix of text w and picture p, obtained from the cosine similarities of the keywords of the text and the picture computed in 2.3.
Finally, the candidate pictures are sorted by recommendation index in descending order to obtain the recommended picture sequence.
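A sketch of the final scoring and sorting of formula (8). The candidate fields (`type`, `T_wp`, `S_wp`) are illustrative names, and `min(S_wp)` follows the formula as printed:

```python
def recommend(candidates, prefer):
    """Formula (8): Rec = min(S_wp) * T_wp * Prefer[u | type = p.type];
    candidates are then sorted by recommendation index in descending order."""
    def rec_index(p):
        return min(p["S_wp"]) * p["T_wp"] * prefer[p["type"]]
    return sorted(candidates, key=rec_index, reverse=True)

# Illustrative candidate pictures: per-keyword similarities S_wp to the text w,
# emotion consistency T_wp from formula (3), and the user's category preference.
prefer = {"food": 0.5, "pets": 0.2}
candidates = [
    {"id": "b", "type": "pets", "T_wp": 0.9, "S_wp": [0.7]},
    {"id": "a", "type": "food", "T_wp": 0.8, "S_wp": [0.6, 0.9]},
]
ranked = recommend(candidates, prefer)
```

In this toy example picture "a" scores 0.6 · 0.8 · 0.5 = 0.24 and picture "b" scores 0.7 · 0.9 · 0.2 = 0.126, so "a" is recommended first.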
6. Algorithm evaluation
The present invention links image-text sharing social platforms with picture social platforms, making it convenient for users to accompany their text with pictures that express their feelings and viewpoints. We compared user satisfaction with the 5 pictures recommended by the method of this patent against pictures users retrieved from Huaban on their own. Experiments show that user satisfaction with this patent's method is more than 30 times that of users' own retrieval, indicating that the method can significantly improve users' information acquisition efficiency and thereby raise user satisfaction.

Claims (4)

  1. A cross-social-platform picture recommendation algorithm based on content or emotion similarity, characterized in that:
    it is divided into four parts: picture recommendation based on content matching, picture recommendation based on emotion matching, cross-platform user preference modeling, and re-ranking of the recommendation results by fusing content, emotion and user preference;
    first, the text generated by a user on the image-text sharing social platform is analyzed for content and its keyword information is extracted, while the pictures on the picture social platform are analyzed for content by the same method; the text and pictures are matched by content consistency to obtain an initial content-based recommendation list; second, sentiment analysis is performed on the text, and if the text contains emotion, matching is performed on the emotion consistency of the text and the pictures to obtain an initial recommendation list based on emotion matching; then the user is modeled on each of the two social network platforms to obtain the user preference; finally, content, emotion and user preference are fused to produce the final picture recommendation list.
  2. The algorithm according to claim 1, characterized in that the picture recommendation based on content matching comprises the following steps:
    1.1 text content analysis
    The text generated by the user on the image-text sharing social platform is first segmented, stop words are removed and parts of speech are tagged with an existing segmentation tool; keywords are then extracted with the existing TextRank algorithm, sorted by weight in descending order, and the required number of keywords is retained;
    1.2 picture content analysis
    The content information of a picture is obtained from its text data;
    for the descriptive text of a picture, extraction is identical to text content analysis, and the keywords obtained by TextRank serve as the labels of the picture content;
    for picture tags, part-of-speech analysis is applied directly, and the nouns and adjectives among them serve as the labels of the picture content;
    1.3 content consistency analysis
    Content consistency analysis is performed on the keywords produced in 1.1 and 1.2 with the Word2vec tool; Word2vec applies the idea of deep learning, mapping words by training into a K-dimensional vector space, where the similarity of words in the vector space characterizes their semantic similarity; similarity is measured by cosine similarity: semantically close words have higher cosine similarity in the Word2vec space, i.e. the distance between them is smaller;
    the picture recommendation based on emotion matching comprises the following steps:
    2.1 text sentiment analysis
    Sentiment analysis is performed on the text segmented in 1.1; first it is determined whether the text contains emotion, i.e. text emotion identification; if it does, fine-grained sentiment analysis, i.e. multi-class text emotion classification, is performed; in the multi-class stage, text emotion is divided into seven classes: happiness, goodness, anger, sorrow, fear, disgust and surprise;
    2.1.1 text emotion identification
    According to part-of-speech and symbol features, the emotion indicator words/symbols are first determined: adjectives, degree adverbs, interjections, microblog emoticons, internet slang, emphatic symbols, and selected nouns and verbs; when choosing nouns and verbs as emotion indicators, a sentiment dictionary is first built, and a noun or verb is considered an emotion indicator if it appears in the sentiment dictionary, and not an emotion indicator otherwise; the emotional intensity of the text is then determined by formula (1);
    EI = (1/N) · Σ_{i=0}^{7} α_i f_i    (1)
    where EI denotes the emotional intensity of the user-generated text, N is the total length of the text in words, punctuation included, f_i is the number of occurrences of each kind of emotion indicator word/symbol, and α_i ∈ (0,1) is the emotional intensity of each kind of indicator; given a threshold TH ∈ (0,1), if EI > TH the text is judged to carry emotion, otherwise it is judged emotionless;
    texts judged emotionless by the above method are further analyzed with an SVM to decide whether emotion is present; if the SVM also judges the text emotionless, the emotion label of the user's post is set directly to [0,0,0,0,0,0,0], each dimension of the vector denoting the ranking value of happiness, goodness, anger, sorrow, fear, disgust and surprise respectively, with 0 denoting absence of that emotion;
    2.1.2 multi-class text emotion classification
    Fine-grained sentiment analysis is performed on the texts judged emotional in 2.1.1; a classifier f(·) is first built with the training set, from which the confidence f(x_i, y_j) of each of the seven emotion labels of a user-generated text is obtained, and the emotion-label ranking value of each post is determined by formula (2);
    rank_f(x_i, y_j) = Σ_{k=1, k≠j}^{7} ‖f(x_i, y_j) > f(x_i, y_k)‖    (2)
    where rank_f(x_i, y_j) denotes the ranking value of emotion label y_j for post x_i, j ∈ [1,7];
    a Naive Bayes classifier is first applied to the user-generated text for multi-class emotion classification, and the results are sorted by probability to obtain the ranking Rank1; an SVM classifier then performs the multi-class classification; since SVM is a binary classifier, the one-vs-one (1-v-1) scheme is used, yielding Rank2; finally the average of Rank1 and Rank2 is taken as the final ranking, denoted Rank_w = [r_w1, r_w2, …, r_w7], where Rank_w is the emotion ranking of the text and r_wi (i ∈ [1,7]) is the ranking value of each emotion class, taking values in [1,7]; the smaller r_wi, the stronger the corresponding emotion;
    2.2 picture sentiment analysis
    Picture sentiment analysis comprises three parts: data collection, deep-learning model training, and picture sentiment classification;
    first, relevant pictures are obtained from a picture website using emotion words as search terms, and the emotion corresponding to each search term is taken as the picture label, yielding an initial data set; the data set is then cleaned according to the consistency of the sentiment polarity of the search term, the picture label and the picture's descriptive text, yielding a purer data set; a CNN-based sentiment classification model is then trained on the cleaned data set by deep learning; finally, the trained CNN model performs sentiment analysis on the pictures, and the results are sorted by probability and denoted Rank_p = [r_p1, r_p2, …, r_p7], where Rank_p is the emotion ranking of the picture, r_pi (i ∈ [1,7]) is the ranking value of each emotion class with values in [1,7], and the smaller r_pi, the stronger the corresponding emotion;
    for the fine-grained picture sentiment analysis task, the emotion category labels are the same seven as in multi-class text emotion classification: happiness, goodness, anger, sorrow, fear, disgust and surprise; for each picture in the data set, its class label is stored; since the seven emotion categories are imbalanced in the data set, resampling is performed per category after the initial data set is built, so that the number of pictures per class is balanced;
    2.3 emotion consistency analysis
    The purpose of the emotion consistency computation is to measure the similarity of the text and the picture in the emotion space; a modified Kendall coefficient is used to compute the emotion consistency of the text emotion ranking Rank_w and the picture emotion ranking Rank_p, expressed as follows:
    T_{w,p} = (1/2) · ((C − D) / (n(n−1)/2) + 1)    (3)
    where T_{w,p} is the emotion consistency of the text and the picture, with value range [0,1], and the closer it is to 1, the more correlated the emotions of the text and the picture; C and D denote the numbers of emotion ranking values that agree and that differ between Rank_w of 2.1 and Rank_p of 2.2; n is the length of the emotion ranking vector, and n = 7.
  3. The algorithm according to claim 2, characterized in that the cross-platform user preference modeling comprises the following steps:
    cross-platform user preference modeling is deployed on two social platforms, one an image-text sharing social platform, the other a picture social platform; the requirements for the image-text sharing social platform are: 1) the platform allows users to publish text with accompanying pictures; 2) the platform allows users to query the image-text posts they have published; the requirements for the picture social platform are: 1) the platform allows users to publish pictures with the necessary descriptions; 2) the platform provides preset common picture categories for users to choose from and requires users to select the relevant category for each picture;
    3.1 building the picture attribute model
    The picture social platform generally includes several picture categories; for each category, corresponding pictures are crawled from the picture website and divided into a training set, a validation set and a test set;
    for classification model training, a fine-tuning strategy on CaffeNet is chosen: the first 7 layers of CaffeNet are kept unchanged, and the output vector length of the last fully connected layer is changed to equal the number of picture categories of the picture social platform; the collected data are then used to fine-tune the model pre-trained on ImageNet, which serves as the initial model, finally yielding the model's confusion matrix on the test set;
    3.2 user interest modeling on the picture social platform
    The user preference model based on the picture sharing platform is built mainly around pictures and picture attributes, i.e. it counts the user's overall picture preferences;
    for a user u, the set I of all pictures the user has collected is gathered first, and for each picture i its category attribute c is collected; for user u, I_u = {i_1:c_1, i_2:c_2, …, i_m:c_η}, where I_u denotes user u's picture data set, i_m is the m-th picture collected by the user, c_η is the picture attribute of that picture, η is the number of selected picture-attribute categories, and m is the number of pictures the user has collected;
    the user preference model on the site is characterized by the cumulative probability histogram of picture categories:
    ImagePrefer_u = hist({c_1, c_2, …, c_η}) / M    (4)
    where M is the number of pictures in data set I, and hist(·) is the histogram function, whose role is to obtain the picture count under each category attribute; its return value is a vector;
    3.3 picture-pairing preference modeling on the image-text sharing social platform
    The image-text sharing social platform contains rich picture and text information, so the user preference model based on its data is built from text and pictures jointly, i.e. it counts the user's picture-pairing preference under each particular emotion;
    for a user u, the data set W of all image-text posts the user has published is collected; each element of W consists of a text w and its accompanying pictures {i_1, i_2, …, i_k};
    for a text w, the text sentiment analysis model of 2.1 gives the predicted ranking vector of the seven emotion classes of the text, E_w = (e_1, e_2, …, e_7);
    for each picture i of the picture set, the picture attribute model proposed in 3.1 gives the predicted probability vector of the picture's content attributes, P_i = (p_1, p_2, …, p_11);
    for a particular emotion e_m, all of the user's image-text posts are traversed to obtain the attribute set IP_{e_m} of the accompanying pictures; the user's picture-pairing preference model under emotion e_m is then characterized by the cumulative histogram of the per-category probabilities of those pictures:
    TextPrefer_{u,e_m} = (Σ IP_{e_m}) / H    (5)
    where H is the number of pictures contained in the IP_{e_m} set;
    3.4 fusion of the cross-platform user preference models
    For a user, the image-text posts published on the image-text sharing social platform reflect the user's picture-pairing habits fairly directly and constitute the user's explicit interest; the pictures shared on the picture social platform reflect the user's liking for pictures, but there is no clear evidence that the user would use those pictures to accompany text, so the user's interest on the picture social platform constitutes the user's latent interest; user modeling combining the data of both platforms captures both the explicit and the latent interest of the user;
    the cross-platform user preference models are fused by linear weighting, and when the user's explicit and latent interests differ, the explicit interest is weighted more heavily; the fusion coefficient is formalized as follows:
    β_{u,e_m} = dist(ImplicitPrefer_u, ExplicitPrefer_{u,e_m})    (6)
    where dist(·) is a similarity measure between two vectors; in this patent, cosine similarity is used as the metric; ImplicitPrefer_u corresponds to ImagePrefer_u of 3.2, and ExplicitPrefer_{u,e_m} corresponds to TextPrefer_{u,e_m} of 3.3;
    then, for a particular emotion e_m, the user's overall preference model is:
    Prefer_{u,e_m} = ExplicitPrefer_{u,e_m} + β_{u,e_m} · ImplicitPrefer_u    (7)
    thus the user preference model expressed in the picture-attribute space is obtained.
  4. The algorithm according to claim 1, characterized in that the picture recommendation fusing content, emotion and user preference is as follows:
    for a user u, the cross-platform user preference modeling method gives the user preference model Prefer_u;
    for a text w generated by user u, the recommendation index of each candidate picture p is computed as:
    Rec(Prefer_u, w, p) = min(S_{w,p}) · T_{w,p} · Prefer_{u|type=p.type}    (8)
    where Prefer_{u|type=p.type} denotes the user's preference degree for the picture category that picture p belongs to, given by formula (7); T_{w,p} is the emotion consistency of text w and picture p, given by formula (3); S_{w,p} is the keyword similarity matrix of text w and picture p, obtained from the cosine similarities of the keywords of the text and the picture; finally, the candidates are sorted by recommendation index in descending order to obtain the recommended picture sequence.
CN201710560717.5A 2017-07-11 2017-07-11 Cross-social platform picture recommendation algorithm based on content or emotion similarity Active CN107357889B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710560717.5A CN107357889B (en) 2017-07-11 2017-07-11 Cross-social platform picture recommendation algorithm based on content or emotion similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710560717.5A CN107357889B (en) 2017-07-11 2017-07-11 Cross-social platform picture recommendation algorithm based on content or emotion similarity

Publications (2)

Publication Number Publication Date
CN107357889A true CN107357889A (en) 2017-11-17
CN107357889B CN107357889B (en) 2020-07-17

Family

ID=60291888

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710560717.5A Active CN107357889B (en) 2017-07-11 2017-07-11 Cross-social platform picture recommendation algorithm based on content or emotion similarity

Country Status (1)

Country Link
CN (1) CN107357889B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102663010A (en) * 2012-03-20 2012-09-12 复旦大学 Personalized image browsing and recommending method based on labelling semantics and system thereof
CN104933113A (en) * 2014-06-06 2015-09-23 北京搜狗科技发展有限公司 Expression input method and device based on semantic understanding
US9489401B1 (en) * 2015-06-16 2016-11-08 My EyeSpy PTY Ltd. Methods and systems for object recognition
CN106649603A (en) * 2016-11-25 2017-05-10 北京资采信息技术有限公司 Webpage text data sentiment classification designated information push method
CN106886580A (en) * 2017-01-23 2017-06-23 北京工业大学 A kind of picture feeling polarities analysis method based on deep learning

Cited By (40)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107862620A (en) * 2017-12-11 2018-03-30 四川新网银行股份有限公司 A kind of similar users method for digging based on social data
CN109948401A (en) * 2017-12-20 2019-06-28 北京京东尚科信息技术有限公司 Data processing method and its system for text
CN108230171A (en) * 2017-12-26 2018-06-29 爱品克科技(武汉)股份有限公司 One kind is based on timing node LDA theme algorithms
CN109978645A (en) * 2017-12-28 2019-07-05 北京京东尚科信息技术有限公司 A kind of data recommendation method and device
CN109978645B (en) * 2017-12-28 2022-04-12 北京京东尚科信息技术有限公司 Data recommendation method and device
CN108563663A (en) * 2018-01-04 2018-09-21 出门问问信息科技有限公司 Picture recommendation method, device, equipment and storage medium
CN108388544A (en) * 2018-02-10 2018-08-10 桂林电子科技大学 A kind of picture and text fusion microblog emotional analysis method based on deep learning
CN108494741A (en) * 2018-03-05 2018-09-04 同济大学 The identity theft detection method of behavior is synthesized based on user on line
CN108494741B (en) * 2018-03-05 2020-09-15 同济大学 Identity embezzlement detection method based on-line user synthetic behavior
CN108551419A (en) * 2018-03-19 2018-09-18 联想(北京)有限公司 A kind of information processing method and device
CN108733779B (en) * 2018-05-04 2022-10-04 百度在线网络技术(北京)有限公司 Text matching method and device
CN108733779A (en) * 2018-05-04 2018-11-02 百度在线网络技术(北京)有限公司 Method and apparatus for matching pictures to text
CN108804650A (en) * 2018-06-07 2018-11-13 北京工业大学 A kind of image recommendation method based on modular manifold ranking
CN108804650B (en) * 2018-06-07 2021-07-02 北京工业大学 Modularized manifold sorting-based image recommendation method
CN109062995A (en) * 2018-07-05 2018-12-21 北京工业大学 A kind of social activity plan opens up the personalized recommendation algorithm of drawing board (Board) cover on network
CN109062995B (en) * 2018-07-05 2021-07-30 北京工业大学 Personalized recommendation algorithm for drawing Board (Board) cover on social strategy exhibition network
CN109034248A (en) * 2018-07-27 2018-12-18 电子科技大学 A kind of classification method of the Noise label image based on deep learning
CN109034248B (en) * 2018-07-27 2022-04-05 电子科技大学 Deep learning-based classification method for noise-containing label images
CN109299253A (en) * 2018-09-03 2019-02-01 华南理工大学 A kind of social text Emotion identification model construction method of Chinese based on depth integration neural network
CN109614486A (en) * 2018-11-28 2019-04-12 宇捷东方(北京)科技有限公司 A kind of service automatic Recommendation System and method based on natural language processing technique
CN109753563A (en) * 2019-03-28 2019-05-14 深圳市酷开网络科技有限公司 Tag extraction method, apparatus and computer readable storage medium based on big data
CN110083684B (en) * 2019-04-24 2021-11-19 吉林大学 Interpretable recommendation model for fine-grained emotion
CN110083684A (en) * 2019-04-24 2019-08-02 吉林大学 Interpretable recommended models towards fine granularity emotion
US11416539B2 (en) 2019-06-10 2022-08-16 International Business Machines Corporation Media selection based on content topic and sentiment
CN110287319B (en) * 2019-06-13 2021-06-15 南京航空航天大学 Student evaluation text analysis method based on emotion analysis technology
CN110287319A (en) * 2019-06-13 2019-09-27 南京航空航天大学 Students' evaluation text analyzing method based on sentiment analysis technology
CN110309308A (en) * 2019-06-27 2019-10-08 北京金山安全软件有限公司 Text information classification method and device and electronic equipment
CN110719525A (en) * 2019-08-28 2020-01-21 咪咕文化科技有限公司 Bullet screen expression package generation method, electronic equipment and readable storage medium
CN111125460A (en) * 2019-12-24 2020-05-08 腾讯科技(深圳)有限公司 Information recommendation method and device
CN111339338B (en) * 2020-02-29 2023-03-07 西安理工大学 Text picture matching recommendation method based on deep learning
CN111339338A (en) * 2020-02-29 2020-06-26 西安理工大学 Text picture matching recommendation method based on deep learning
CN112699949A (en) * 2021-01-05 2021-04-23 百威投资(中国)有限公司 Potential user identification method and device based on social platform data
CN112926569A (en) * 2021-03-16 2021-06-08 重庆邮电大学 Method for detecting natural scene image text in social network
CN113158082A (en) * 2021-05-13 2021-07-23 聂佼颖 Artificial intelligence-based media content reality degree analysis method
CN113326374A (en) * 2021-05-25 2021-08-31 成都信息工程大学 Short text emotion classification method and system based on feature enhancement
CN113326374B (en) * 2021-05-25 2022-12-20 成都信息工程大学 Short text emotion classification method and system based on feature enhancement
CN115546355A (en) * 2022-11-28 2022-12-30 北京红棉小冰科技有限公司 Text matching method and device
CN116628317A (en) * 2023-04-19 2023-08-22 上海顺多网络科技有限公司 Method for analyzing user group preference by using small amount of information
CN117333800A (en) * 2023-10-12 2024-01-02 广州有好戏网络科技有限公司 Cross-platform content operation optimization method and system based on artificial intelligence
CN117333800B (en) * 2023-10-12 2024-04-05 广州有好戏网络科技有限公司 Cross-platform content operation optimization method and system based on artificial intelligence

Also Published As

Publication number Publication date
CN107357889B (en) 2020-07-17

Similar Documents

Publication Publication Date Title
CN107357889A (en) Cross-social-platform picture recommendation algorithm based on content or emotion similarity
Zhao et al. An image-text consistency driven multimodal sentiment analysis approach for social media
CN107992531B (en) News personalized intelligent recommendation method and system based on deep learning
CN109492157B (en) News recommendation method and theme characterization method based on RNN and attention mechanism
Li et al. Imbalanced text sentiment classification using universal and domain-specific knowledge
Ezaldeen et al. A hybrid E-learning recommendation integrating adaptive profiling and sentiment analysis
Xu et al. Hierarchical emotion classification and emotion component analysis on Chinese micro-blog posts
Pan et al. Social media-based user embedding: A literature review
Breitfuss et al. Representing emotions with knowledge graphs for movie recommendations
CN107357793A (en) Information recommendation method and device
Dubey et al. Item-based collaborative filtering using sentiment analysis of user reviews
CN111309936A (en) Method for constructing portrait of movie user
Jain et al. Deceptive reviews detection using deep learning techniques
Liu et al. Sentiment recognition for short annotated GIFs using visual-textual fusion
Okazaki et al. How to mine brand Tweets: Procedural guidelines and pretest
Mehta et al. Sentiment analysis of tweets using supervised learning algorithms
Song et al. Text sentiment analysis based on convolutional neural network and bidirectional LSTM model
CN107103093B (en) Short text recommendation method and device based on user behavior and emotion analysis
Chaudhuri Visual and text sentiment analysis through hierarchical deep learning networks
Rahman et al. Sentiment analysis on Twitter data: comparative study on different approaches
Yu et al. Emoticon analysis for Chinese social media and e-commerce: The AZEmo system
Ghorbanali et al. A comprehensive survey on deep learning-based approaches for multimodal sentiment analysis
Sheeba et al. A fuzzy logic based on sentiment classification
Choi et al. Classifications of restricted web streaming contents based on convolutional neural network and long short-term memory (CNN-LSTM).
Vayadande et al. Mood detection and emoji classification using tokenization and convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant