CN105279288B - A kind of online content recommendation method based on deep neural network - Google Patents

Online content recommendation method based on a deep neural network

Info

Publication number
CN105279288B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510883752.1A
Other languages
Chinese (zh)
Other versions
CN105279288A (en
Inventor
陈亮
王娜
李霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Abstract

The invention discloses an online content recommendation method based on a deep neural network. Building on traditional content-based recommendation, it introduces a deep neural network (DNN) word-vector tool. Using the text of the content to be pushed and each user's historical behavior, it maps content and users into a high-dimensional vector space and, by computing the cosine distance between vectors, screens out the user group likely to be interested in the recommended content. Experiments in a large-scale mobile content service system show that the proposed recommendation strategy significantly improves recommendation performance compared with random recommendation, ContentKNN, and ItemCF.

Description

Online content recommendation method based on a deep neural network
Technical field
The present invention relates to the technical field of information processing, and in particular to an online content recommendation method based on a deep neural network.
Background
With the continuous enrichment of online content and the rapid development of the mobile Internet, pushing suitable content to interested users has become one of the key requirements of online content service providers. The main challenges are: 1. effective representation of user features and content features; 2. the accuracy required of personalized push notifications (too many irrelevant push notifications disturb users and harm the user experience); 3. moderate algorithmic complexity, so that the recommendation algorithm can run on large-scale data using existing systems.
Prior-art methods based on conventional recommendation algorithms do not mine users and content in depth. In large-scale online field experiments their recommendation click-through rate is low: when recommendations are delivered as push notifications, they often fail to match user interests, so most pushed messages are ignored and users feel disturbed, and accurate push recommendation is not achieved. For example, recommender systems face a cold-start problem for both new items and new users. To handle such cases, most current recommender systems consider hybrid models and recommendation strategies based on content analysis. Traditional content-based recommendation algorithms mainly rely on descriptive information such as item and user tags. These tags are usually added manually; different people hold different views of the same thing and describe it differently, which introduces inconsistencies in the data and leads to fluctuating recommendation quality and a low recommendation click-through rate (CTR).
Summary of the invention
In view of the shortcomings of the prior art, the present invention aims to provide an online content recommendation method based on a deep neural network. The method uses a deep learning model to analyze users and online content in depth, builds vector representations of users and content with the deep neural network, and efficiently realizes personalized push recommendation to individual users within a general user population, markedly improving the recommendation hit rate.
To achieve the above object, the present invention adopts the following technical solution:
An online content recommendation method based on a deep neural network, comprising the following steps:
S1: for the content to be pushed, build a content corpus lexicon of important words and extract keywords from it; then feed the content corpus lexicon to a word-vector tool to train a word-vector model;
S2: build the vector of the content to be pushed using the word-vector model obtained in step S1;
S3: based on the word-vector model and the content vector obtained in steps S1 and S2, define a user clicking a pushed message as positive behavior and a user not clicking a pushed message as negative behavior, and build a positive behavior vector model and a negative behavior vector model for each user;
S4: compute the distances between each user's positive and negative behavior vector models and the content vector obtained in step S2, and determine the push target users accordingly.
Note that in step S1 the content corpus lexicon is built by filtering, merging, and segmenting the content text and removing stop words.
Note that in step S1 word2vec is used as the word-vector tool, and the word vectors of the content corpus lexicon are built with the HS-CBOW model (CBOW trained with hierarchical softmax).
Note that in step S1 the content corpus lexicon used for training contains the text of the online content provider; each record contains the content itself and the text describing the content.
Note that in step S1 the word-vector dimension is set to 200 and the text window is set to 5.
Note that in step S2 the vector of the content to be pushed is built using the additive property of word vectors in the vector space, according to the following formula:
$$V_V = \frac{1}{n}\sum_{i=1}^{n} w_i$$
where $V_V$ is the vector of the content to be pushed V; $n$ is the number of keywords extracted from the content to be pushed; $\frac{1}{n}$ is the normalization coefficient, which prevents differences caused by different contents yielding different numbers of keywords; and $w_i$ is the vector of the i-th keyword of content V as represented by the word-vector tool.
Note that in step S3, when building a user's positive and negative behavior vector models, the number of negative behavior records used to build the negative behavior vector is 1.7 times the number of positive behavior records used to build the positive behavior vector.
Further, in step S3 the user's positive and negative behavior vectors are built using the additive property of word vectors combined with the TF-IDF method, as follows:
The positive behavior vector is computed according to the following formula:
[formula not reproduced in the extracted text]
where $U_u^+$ denotes the positive behavior vector of user u; $m^+$ is the number of contents clicked by user u; $n^+$ is the number of keywords of content $V^+$; the normalization coefficient prevents differences caused by users clicking different numbers of contents and by contents yielding different numbers of keywords; $a_i^+$ is the TF-IDF weight of the i-th keyword of content $V^+$; $w_i^+$ is the vector, as represented by the word-vector tool, of the i-th keyword of the content $V^+$ clicked by the user; and a further coefficient depends on N, the click volume of the corresponding content in the system, and reduces the bias that popular content introduces into the result;
The negative behavior vector is computed according to the following formula:
[formula not reproduced in the extracted text]
where $U_u^-$ denotes the negative behavior vector of user u; $m^-$ is the number of contents user u did not click; $n^-$ is the number of keywords of a pushed content $V^-$ that was not clicked; the normalization coefficient handles differences caused by users leaving different numbers of contents unclicked and by contents yielding different numbers of keywords; $a_i^-$ is the TF-IDF weight of the i-th keyword of content $V^-$; $w_i^-$ is the vector of the i-th keyword of content $V^-$ as represented by the word-vector tool; and a further coefficient depends on N, the click volume of the corresponding content in the system, and reduces the bias introduced by popular content.
Note that step S4 proceeds as follows:
4.1) For each user, compute the cosine distances x and y between the content vector and the user's positive and negative behavior vectors respectively, and compute their ratio $P = \frac{x}{y}$, where -1 ≤ x ≤ 1 and -1 ≤ y ≤ 1;
4.2) Initialize the candidate user group to contain all users, and process the users in the candidate user group as follows:
For users with 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, retain those with P ≥ 1;
For users with -1 ≤ x ≤ 0 and 0 ≤ y ≤ 1, remove them from the candidate user group;
For users with -1 ≤ x ≤ 0 and -1 ≤ y ≤ 0, retain those with P ≤ 1;
For users with 0 ≤ x ≤ 1 and -1 ≤ y ≤ 0, retain all of them in the candidate user group;
Here x = 0 means that the content vector is uncorrelated with the positive behavior vector, and y = 0 means that the content vector is uncorrelated with the negative behavior vector; the cases x = 0 and y = 0 therefore do not occur in practice;
4.3) For each user in the candidate user group obtained in step 4.2), compute the distance between the point (x, y) and the straight line $\frac{x}{y} = p$, sort the users in descending order of distance, and select the top M as push target users, where p is the selected threshold.
Further, the cosine distances x and y are computed according to the following formulas:
$$x = \cos\left(V_v, U_u^+\right) = \frac{V_v \cdot U_u^+}{\lVert V_v \rVert\,\lVert U_u^+ \rVert}, \qquad y = \cos\left(V_v, U_u^-\right) = \frac{V_v \cdot U_u^-}{\lVert V_v \rVert\,\lVert U_u^- \rVert}$$
where $V_v$ is the content vector obtained in step S2, $U_u^+$ denotes the positive behavior vector of user u, and $U_u^-$ denotes the negative behavior vector of user u. The closer x is to 1, the more correlated the user's positive behavior vector is with the content vector, and the more likely the user is to be interested in the pushed content; the closer x is to -1, the less correlated the two are, and the more likely the user is to be uninterested. The closer y is to 1, the more correlated the user's negative behavior vector is with the content vector, which indicates that the user is more likely to be uninterested in the content; the closer y is to -1, the less correlated the two are, which indicates that the user is more likely to be interested. Given these meanings of x and y, the optimal target user for a pushed content has x = 1 and y = -1.
Further, in step 4.3) the value of p is 1.
The beneficial effects of the present invention are as follows. Building on traditional content-based recommendation, the invention introduces a deep neural network (DNN) word-vector tool; using the text of the content to be pushed and each user's historical behavior, it maps content and users into a high-dimensional vector space and, by computing the cosine distance between vectors, screens out the user group likely to be interested in the recommended content. Experiments in a large-scale mobile content service system show that, compared with the random method, ContentKNN, and ItemCF, the proposed DNN-based recommendation strategy achieves average relative improvements in click-through rate of 106%, 41%, and 57% respectively. In terms of coverage, it also mitigates to some extent the bias toward pushing to active users. Overall it achieves good recommendation performance.
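As a reading aid: the text does not state how the relative improvement is computed. Assuming the usual definition of relative lift (an assumption, not taken from the source), with $CTR_{\text{DNN}}$ and $CTR_{\text{base}}$ as illustrative symbols for the click-through rates of the proposed method and of a baseline:

```latex
\text{relative improvement}
  = \frac{CTR_{\text{DNN}} - CTR_{\text{base}}}{CTR_{\text{base}}} \times 100\%
```

Under that assumption, a 106% improvement over the random method would correspond to a click-through rate about 2.06 times that of random pushing.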
Brief description of the drawings
Fig. 1 is a schematic flowchart of the implementation of the present invention;
Fig. 2 is a sub-flowchart of step S1 in Fig. 1;
Fig. 3 is a schematic diagram of the plane structure used in step S4 of Fig. 1 to establish the candidate user group;
Fig. 4 is a schematic comparison of the experimental results of the present invention with the random method, ContentKNN, and ItemCF.
Detailed description
The invention is further described below with reference to the drawings. Note that this embodiment is based on the above technical solution and gives a detailed implementation and specific operating procedure, but the protection scope of the present invention is not limited to this embodiment.
As shown in Fig. 1, the online content recommendation method based on a deep neural network comprises the following steps:
S1: word-vector model training.
The word-vector model training process is shown in Fig. 2. Before the text of the content to be pushed is analyzed, the text is first segmented and stop words are removed to build the content corpus lexicon of important words, which serves as the input of the word-vector tool. Once the content corpus lexicon is available, keywords can be extracted from it in preparation for building the content vectors and user vectors.
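The preprocessing described here (together with the filtering and merging mentioned elsewhere in the text) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `build_corpus`, the tiny English stop-word list, and the regex tokenizer are all hypothetical. The patent targets Chinese text, which would need a dedicated word segmenter instead of the naive tokenizer used here.

```python
import re

# Hypothetical stop-word list; a real system would load a full list for
# the content language.
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


def build_corpus(records):
    """Filter, merge, segment, and strip stop words from content records.

    Each record is a (title, description) pair; the result is a list of
    token lists, one per retained record.
    """
    corpus = []
    seen = set()
    for title, description in records:
        # Merge the content text and its description into one string.
        text = f"{title} {description}".strip().lower()
        # Filter out empty records and verbatim duplicates.
        if not text or text in seen:
            continue
        seen.add(text)
        # Naive segmentation (assumption: whitespace/alphanumeric tokens).
        tokens = re.findall(r"[a-z0-9]+", text)
        # Remove stop words.
        tokens = [t for t in tokens if t not in STOP_WORDS]
        if tokens:
            corpus.append(tokens)
    return corpus
```

The returned list of token lists is the shape most word-vector trainers expect as input.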
Note that word-vector model training is carried out with the word2vec tool. The content corpus lexicon used for training contains the text of the online content provider; each record contains the content itself and the text describing the content. To reduce the impact of dirty data on word-vector training, the data are first filtered and merged; after this data cleaning, the valid data are obtained. In addition, considering training speed and the complexity of implementing the recommendation, the HS-CBOW model, which trains quickly and is relatively easy to implement in engineering terms, is chosen to build the word vectors of the content corpus lexicon.
Further, regarding the choice of word-vector dimension: in general, the higher the dimension and the larger the text window, the better the word vectors represent features, but the longer the training takes and the more storage the training result occupies. For a large data set, setting the dimension to 200 and the text window to 5 keeps computation fast. Training finally yields word vectors for a vocabulary of a certain size.
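The stated settings can be written down as a configuration sketch. The patent uses the word2vec tool; the example below assumes the gensim 4.x Python library as a stand-in, and `train_word_vectors`, `min_count`, and `workers` are illustrative choices that do not come from the source.

```python
# Hyperparameters stated in the text, using gensim 4.x argument names
# (gensim is an assumed stand-in for the word2vec tool).
HS_CBOW_PARAMS = {
    "vector_size": 200,  # word-vector dimension
    "window": 5,         # text window
    "sg": 0,             # 0 selects CBOW (1 would select skip-gram)
    "hs": 1,             # enable hierarchical softmax
    "negative": 0,       # disable negative sampling, leaving pure HS
}


def train_word_vectors(corpus, min_count=1, workers=4):
    """Train an HS-CBOW model on a list of token lists.

    min_count and workers are illustrative values, not from the patent.
    """
    # Imported lazily so the configuration above is usable without gensim.
    from gensim.models import Word2Vec

    model = Word2Vec(sentences=corpus, min_count=min_count,
                     workers=workers, **HS_CBOW_PARAMS)
    # KeyedVectors: maps each vocabulary word to its 200-dimensional vector.
    return model.wv
```

Setting `hs=1` together with `negative=0` is what makes the CBOW model train with hierarchical softmax only.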
S2: build the vector of the content to be pushed using the word-vector model obtained in step S1.
One major difference between a word-vector model and traditional semantic analysis models (such as LDA and LSI) lies in the analysis result: a word-vector model builds a vector for each word in the text rather than for the text as a whole, and the word vectors obtained by training such a model can be added and subtracted in the vector space. The present invention therefore builds the vector of the content to be pushed using the additive property of word vectors in the vector space, according to the following formula:
$$V_V = \frac{1}{n}\sum_{i=1}^{n} w_i$$
where $V_V$ is the vector of the content to be pushed V; $n$ is the number of keywords extracted from the content to be pushed; $\frac{1}{n}$ is the normalization coefficient, which prevents differences caused by different contents yielding different numbers of keywords; and $w_i$ is the vector of the i-th keyword of content V as represented by the word-vector tool.
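A minimal sketch of this step, assuming the normalization coefficient is 1/n, so that the content vector is the mean of its keyword vectors. The function name `content_vector` is hypothetical, and skipping keywords that are missing from the vocabulary is an added assumption; `word_vectors` can be any mapping from word to vector.

```python
def content_vector(keywords, word_vectors):
    """Average the word vectors of a content's extracted keywords.

    Assumption: keywords absent from the vocabulary are skipped, and n
    counts only the keywords that were found.
    """
    found = [word_vectors[k] for k in keywords if k in word_vectors]
    if not found:
        raise ValueError("no keyword has a word vector")
    n = len(found)
    dim = len(found[0])
    # Component-wise sum of the keyword vectors, scaled by 1/n.
    return [sum(vec[d] for vec in found) / n for d in range(dim)]
```

Averaging rather than summing keeps a content with twenty keywords on the same scale as one with three.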
S3: vector representation of users.
Based on the word-vector model and the content vector obtained in steps S1 and S2, a user clicking a pushed message is defined as positive behavior and a user not clicking a pushed message is defined as negative behavior, and a positive behavior vector model and a negative behavior vector model are built for each user.
Building both a positive and a negative behavior vector model for each user makes it possible to characterize user behavior and interest from two sides. Positive behavior means that the user clicked the recommended content, which to some extent expresses that the user accepted it. Negative behavior means that the user did not click the recommended content. Using negative behavior to express a lack of interest requires more care, however, because not clicking does not necessarily mean that the user is uninterested (the push time or scenario may simply not have suited the user). Observation of real data shows that most users have more negative behavior than positive behavior. To address this imbalance between positive and negative behavior data, the present invention sets the number of negative behavior records used to build the negative behavior vector to 1.7 times the number of positive behavior records used to build the positive behavior vector.
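The 1.7:1 cap can be sketched as below. The text does not say how the retained negative records are chosen, so uniform random sampling is an assumption here, as are the function name `sample_negatives`, the rounding rule, and keeping all negatives when there are fewer than the cap.

```python
import random

NEG_TO_POS_RATIO = 1.7  # ratio stated in the text


def sample_negatives(positives, negatives, ratio=NEG_TO_POS_RATIO, seed=0):
    """Cap the negative records at ratio * len(positives).

    Assumptions: the subset is drawn uniformly at random, and users with
    fewer negatives than the cap keep all of them.
    """
    limit = int(round(ratio * len(positives)))
    if len(negatives) <= limit:
        return list(negatives)
    rng = random.Random(seed)  # fixed seed for reproducible sampling
    return rng.sample(list(negatives), limit)
```

For a user with 10 clicked and 100 unclicked pushes, this keeps 17 of the unclicked records.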
The user's positive and negative behavior vectors are built using the additive property of word vectors combined with the TF-IDF method, as follows:
The positive behavior vector is computed according to the following formula:
[formula not reproduced in the extracted text]
where $U_u^+$ denotes the positive behavior vector of user u; $m^+$ is the number of contents clicked by user u; $n^+$ is the number of keywords of content $V^+$; the normalization coefficient prevents differences caused by users clicking different numbers of contents and by contents yielding different numbers of keywords; $a_i^+$ is the TF-IDF weight of the i-th keyword of content $V^+$; $w_i^+$ is the vector, as represented by the word-vector tool, of the i-th keyword of the content $V^+$ clicked by the user; and a further coefficient depends on N, the click volume of the corresponding content in the system, and reduces the influence of popular content;
The negative behavior vector is computed according to the following formula:
[formula not reproduced in the extracted text]
where $U_u^-$ denotes the negative behavior vector of user u; $m^-$ is the number of contents user u did not click; $n^-$ is the number of keywords of a pushed content $V^-$ that was not clicked; the normalization coefficient handles differences caused by users leaving different numbers of contents unclicked and by contents yielding different numbers of keywords; $a_i^-$ is the TF-IDF weight of the i-th keyword of content $V^-$; $w_i^-$ is the vector of the i-th keyword of content $V^-$ as represented by the word-vector tool; and a further coefficient depends on N, the click volume of the corresponding content in the system, and reduces the influence of popular content.
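Because the exact formulas are not available in this text, the following is only one possible reading of the description, under loudly stated assumptions: the normalization coefficient is taken to be 1/(m·n), and the popularity coefficient is a hypothetical damping 1/(1 + ln(1 + N)). Neither is confirmed by the source; `popularity_damping` and `behavior_vector` are illustrative names.

```python
import math


def popularity_damping(clicks):
    """Hypothetical coefficient that shrinks as click volume N grows."""
    return 1.0 / (1.0 + math.log(1.0 + clicks))


def behavior_vector(contents, word_vectors):
    """Build one behavior vector (positive or negative) for a user.

    contents: list of dicts with keys
        "keywords": {keyword: tfidf_weight}
        "clicks":   click volume N of the content in the system
    Assumption: each keyword term is scaled by 1 / (m * n), where m is
    the number of contents and n the keyword count of that content.
    """
    dim = len(next(iter(word_vectors.values())))
    total = [0.0] * dim
    m = len(contents)
    for content in contents:
        damp = popularity_damping(content["clicks"])
        known = {k: a for k, a in content["keywords"].items()
                 if k in word_vectors}
        n = len(known)
        if n == 0:
            continue  # no usable keywords in this content
        for keyword, tfidf in known.items():
            vec = word_vectors[keyword]
            for d in range(dim):
                # TF-IDF-weighted, popularity-damped, normalized sum.
                total[d] += damp * tfidf * vec[d] / (m * n)
    return total
```

Calling the same function once on the clicked contents and once on the retained unclicked contents would give the positive and negative vectors respectively.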
S4: compute the distances between each user's positive and negative behavior vector models and the content vector obtained in step S2, and determine the push target users accordingly. This comprises the following steps:
4.1) For each user, compute the cosine distances x and y between the content vector and the user's positive and negative behavior vectors respectively, and compute their ratio $P = \frac{x}{y}$. Among methods for computing the distance between high-dimensional vectors, Euclidean distance, Pearson distance, cosine distance, and a hybrid of cosine and Euclidean distance were compared. The experimental click-through rate (CTR) of Euclidean distance was slightly worse than that of the other three; cosine distance gave the best average CTR, although its CTR fluctuated more than that of the other three. To maximize the CTR of the recommender system, the present invention selects cosine distance as the distance measure between high-dimensional vectors.
The cosine distances x and y are computed according to the following formulas:
$$x = \cos\left(V_v, U_u^+\right) = \frac{V_v \cdot U_u^+}{\lVert V_v \rVert\,\lVert U_u^+ \rVert}, \qquad y = \cos\left(V_v, U_u^-\right) = \frac{V_v \cdot U_u^-}{\lVert V_v \rVert\,\lVert U_u^- \rVert}$$
where $V_v$ is the content vector obtained in step S2, $U_u^+$ denotes the positive behavior vector of user u, and $U_u^-$ denotes the negative behavior vector of user u.
The closer x is to 1, the more correlated the user's positive behavior vector is with the content vector, and the more likely the user is to be interested in the pushed content; the closer x is to -1, the less correlated the two are, and the more likely the user is to be uninterested. The closer y is to 1, the more correlated the user's negative behavior vector is with the content vector, which indicates that the user is more likely to be uninterested in the content; the closer y is to -1, the less correlated the two are, which indicates that the user is more likely to be interested. Given these meanings of x and y, the optimal target user for a pushed content has x = 1 and y = -1.
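The quantities x and y take values in [-1, 1], so they are the cosine of the angle between two vectors. A minimal pure-Python sketch follows; the function names `cosine` and `score_user` are illustrative.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / norm


def score_user(content_vec, pos_vec, neg_vec):
    """Return (x, y): the content's cosine with the user's positive and
    negative behavior vectors."""
    return cosine(content_vec, pos_vec), cosine(content_vec, neg_vec)
```

A user whose positive vector points the same way as the content and whose negative vector points the opposite way scores (1, -1), the optimal case described above.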
4.2) Initialize the candidate user group to contain all users, and process the users in the candidate user group as follows to obtain the final candidate user group:
For users with 0 ≤ x ≤ 1 and 0 ≤ y ≤ 1, retain those with P ≥ 1;
For users with -1 ≤ x ≤ 0 and 0 ≤ y ≤ 1, remove them from the candidate user group;
For users with -1 ≤ x ≤ 0 and -1 ≤ y ≤ 0, retain those with P ≤ 1, where 1 is the judgment threshold;
For users with 0 ≤ x ≤ 1 and -1 ≤ y ≤ 0, retain all of them in the candidate user group.
Here x = 0 means that the content vector is uncorrelated with the positive behavior vector, and y = 0 means that the content vector is uncorrelated with the negative behavior vector; the cases x = 0 and y = 0 therefore do not occur in practice.
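The four rules can be sketched as a single predicate. The text treats x = 0 and y = 0 as cases that do not occur in practice; dropping such users, as done below, is an added assumption, and `keep_candidate` is a hypothetical name.

```python
def keep_candidate(x, y):
    """Return True if a user with cosines (x, y) stays in the candidate group."""
    if x == 0 or y == 0:
        # Assumption: the text says these cases do not occur; drop them
        # here to avoid dividing by zero.
        return False
    if x > 0 and y < 0:
        return True   # similar to clicked content, dissimilar to ignored: keep
    if x < 0 and y > 0:
        return False  # dissimilar to clicked, similar to ignored: remove
    p = x / y
    if x > 0 and y > 0:
        return p >= 1  # equivalent to x >= y when both are positive
    return p <= 1      # both negative: also equivalent to x >= y
```

In both same-sign quadrants the retained users are exactly those with x ≥ y, which is why the threshold direction flips from P ≥ 1 to P ≤ 1 when both values are negative.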
The above method can also be realized by establishing a plane coordinate system, as shown in Fig. 3. Establish a plane coordinate system with -1 ≤ x ≤ 1 and -1 ≤ y ≤ 1, and divide it into the four regions C1, C2, C3, and C4 shown in the figure; the x and y values of each user correspond to a point in this coordinate system.
Given the meanings of x and y, users whose (x, y) values fall in region C2 are removed from the candidate user group directly, and users in region C4 are retained in the candidate user group.
For users in region C1, those with $P \geq 1$ (subregion ① of C1) are retained in the candidate user group, and those falling in subregion ② of C1 are filtered out. The users remaining in region C1 are thus those whose likelihood of being interested in the pushed content is greater than or equal to their likelihood of being uninterested.
Likewise, in C3 the users with $P \leq 1$ (subregion ① of C3) are retained in the candidate user group, and those falling in subregion ② of C3 are filtered out.
The arrows in the different regions of Fig. 3 indicate how the x and y values trend as candidate users in each region approach the push target users. From the trends of x and y, the trend of |P| can be derived, as shown in Table 1.
Table 1 [table contents not reproduced in the extracted text]
C1-① denotes subregion ① of region C1, and C3-① denotes subregion ① of region C3. As Table 1 shows, the trend of |P| for target users is not consistent across regions: in subregion C1-①, users with larger |P| are selected as push target users, whereas in the other two cases users with smaller |P| are selected. Because of this inconsistency, target users cannot be selected effectively by the size of |P| alone. It can be observed, however, that in every region the x and y values of target users tend to move away from the straight line $\frac{x}{y} = p$, and the ideal target user in region C4 (x = 1, y = -1) lies farthest from it. The present invention therefore determines the push target users by computing the distance from each candidate user's point (x, y) to this straight line.
4.3) For each user in the candidate user group obtained in step 4.2), compute the distance between the point (x, y) and the straight line $\frac{x}{y} = p$. In this embodiment p = 1, so the reference line is $y = x$, the line on which the content vector is equally related to the user's positive and negative behavior vectors. The distance from the point $(x_i, y_i)$ of the i-th user to the line $y = x$ is computed by the formula $d_i = \frac{|x_i - y_i|}{\sqrt{2}}$. The users are sorted in descending order of distance and the top M are selected as push target users, where $x_i$ is the cosine distance between the positive behavior vector of the i-th user and the content vector, and $y_i$ is the cosine distance between the negative behavior vector of the i-th user and the content vector.
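For the p = 1 case, the reference line is y = x and the point-to-line distance is |x - y| / √2. A minimal sketch of the ranking step, with illustrative names `distance_to_reference_line` and `select_targets`:

```python
import math


def distance_to_reference_line(x, y):
    """Distance from the point (x, y) to the line y = x (the p = 1 case)."""
    return abs(x - y) / math.sqrt(2)


def select_targets(candidates, m):
    """Pick the m candidates farthest from the line y = x.

    candidates: {user_id: (x, y)} for users that passed the screening.
    """
    ranked = sorted(
        candidates,
        key=lambda uid: distance_to_reference_line(*candidates[uid]),
        reverse=True,  # descending order of distance
    )
    return ranked[:m]
```

Since every user retained by the screening step has x ≥ y, the largest distance belongs to the ideal point (1, -1), at distance √2.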
The performance of the present invention is further illustrated below by comparing, in an offline environment, the CTR results for pushed content of the online content recommendation method based on a deep neural network (DNN), ContentKNN, ItemCF, and the random method.
As shown in Fig. 4, in the offline test the upper quartile, median, and lower quartile of CTR for ContentKNN, ItemCF, and the method of the present invention are all higher than those of the random method. The CTR results of the present invention and of ContentKNN are more stable than those of ItemCF, and the CTR results of the present invention are clearly better than those of ContentKNN and ItemCF. Therefore, compared with ContentKNN, ItemCF, and the random method, the online content recommendation method based on a deep neural network performs better.
Those skilled in the art can make various corresponding changes and modifications based on the above technical solution and concept, and all such changes and modifications shall fall within the protection scope of the claims of the present invention.

Claims (8)

1. An online content recommendation method based on a deep neural network, characterized by comprising the following steps:
S1: for the content to be pushed, build a content corpus lexicon of important words and extract keywords from it; then feed the content corpus lexicon to a word-vector tool to train a word-vector model;
S2: build the vector of the content to be pushed using the word-vector model obtained in step S1;
S3: based on the word-vector model and the content vector obtained in steps S1 and S2, define a user clicking a pushed message as positive behavior and a user not clicking a pushed message as negative behavior, and build a positive behavior vector model and a negative behavior vector model for each user; when building the user's positive and negative behavior vector models, the number of negative behavior records used to build the negative behavior vector is 1.7 times the number of positive behavior records used to build the positive behavior vector; the user's positive and negative behavior vectors are built using the additive property of word vectors combined with the TF-IDF method, as follows:
The positive behavior vector is computed according to the following formula:
[formula not reproduced in the extracted text]
where $U_u^+$ denotes the positive behavior vector of user u; $m^+$ is the number of contents clicked by user u; $n^+$ is the number of keywords of content $V^+$; the normalization coefficient prevents differences caused by users clicking different numbers of contents and by contents yielding different numbers of keywords; $a_i^+$ is the TF-IDF weight of the i-th keyword of content $V^+$; $w_i^+$ is the vector, as represented by the word-vector tool, of the i-th keyword of the content $V^+$ clicked by the user; and a further coefficient depends on N, the click volume of the corresponding content in the system, and reduces the bias that popular content introduces into the result;
The negative behavior vector is computed according to the following formula:
[formula not reproduced in the extracted text]
where $U_u^-$ denotes the negative behavior vector of user u; $m^-$ is the number of contents user u did not click; $n^-$ is the number of keywords of a pushed content $V^-$ that was not clicked; the normalization coefficient handles differences caused by users leaving different numbers of contents unclicked and by contents yielding different numbers of keywords; $a_i^-$ is the TF-IDF weight of the i-th keyword of content $V^-$; $w_i^-$ is the vector of the i-th keyword of content $V^-$ as represented by the word-vector tool; and a further coefficient depends on N, the click volume of the corresponding content in the system, and reduces the bias introduced by popular content;
S4: compute the distances between each user's positive and negative behavior vector models and the content vector obtained in step S2, and determine the push target users accordingly.
2. the online content according to claim 1 based on deep neural network recommends method, which is characterized in that step S1 In, by being filtered, merging to content text message, segment, go stop words with the important lexicon of content construction language material.
3. the online content according to claim 1 based on deep neural network recommends method, which is characterized in that step S1 In, using word2vec as term vector tool, and using the word of the important lexicon of HS-CBOW model foundation content language materials to Amount.
4. the online content according to claim 1 based on deep neural network recommends method, which is characterized in that step S1 In, term vector dimension set is 200 dimensions, and text window is set as 5.
5. The online content recommendation method based on a deep neural network according to claim 1, characterized in that, in step S2, the content-to-be-pushed vector is constructed using the additive property that word vectors have in the vector space, specifically according to the following formula:
where V_V denotes the vector of the content to be pushed V; n is the number of keywords extracted from the content to be pushed; the normalization coefficient prevents discrepancies arising from different content items yielding different numbers of extracted keywords; and each summed term is the vector of the i-th keyword of content V as represented by the word-vector tool.
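The construction in claim 5 can be illustrated with a short Python sketch that sums the keyword vectors of a content item and normalizes by the keyword count. Taking the normalization coefficient to be 1/n is an assumption, since the formula is not reproduced in this text, and `content_vector` is a hypothetical name. Keywords without a word vector are skipped and not counted in n.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def content_vector(keywords: Sequence[str],
                   word_vectors: Dict[str, Vector],
                   dim: int) -> Vector:
    """Normalized sum of the word vectors of a content item's keywords."""
    vecs = [word_vectors[w] for w in keywords if w in word_vectors]
    if not vecs:
        return [0.0] * dim  # no keyword has a vector: return the zero vector
    n = len(vecs)
    # Assumed normalization coefficient 1/n applied to the vector sum
    return [sum(v[k] for v in vecs) / n for k in range(dim)]
```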
6. The online content recommendation method based on a deep neural network according to claim 1, characterized in that step S4 is specifically carried out as follows:
4.1) for each user, calculate the cosine distances x and y between the content-to-be-pushed vector and the user's positive behavior vector and negative behavior vector, respectively, and calculate the ratio P = x/y between the two, where -1 ≤ x ≤ 1 and -1 ≤ y ≤ 1;
4.2) initialize the candidate user group to contain all users, and process the users in the candidate user group as follows:
for users with 0 < x ≤ 1 and 0 ≤ y ≤ 1, retain those with P ≥ 1;
users with -1 ≤ x ≤ 0 and 0 < y ≤ 1, or with -1 ≤ x < 0 and 0 ≤ y ≤ 1, are removed from the candidate user group;
for users with -1 ≤ x ≤ 0 and -1 ≤ y < 0, or with -1 ≤ x < 0 and -1 ≤ y ≤ 0, retain those with P ≤ 1;
users with 0 ≤ x ≤ 1 and -1 ≤ y < 0 are all retained in the candidate user group;
where x = 0 indicates that the content-to-be-pushed vector and the positive behavior vector are uncorrelated, and y = 0 indicates that the content-to-be-pushed vector and the negative behavior vector are uncorrelated; the cases x = 0 and y = 0 therefore do not occur in practice;
4.3) for each user in the candidate user group obtained from the screening in step 4.2), calculate the distance between the point (x, y) and the straight line determined by the selected threshold p, sort the users by this distance in descending order, and choose the top M users as push target users.
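The screening and ranking of steps 4.2) and 4.3) can be sketched in Python as follows. Several points are assumptions rather than statements of the claimed formulas: the ratio is taken as P = x/y, the straight line is taken as x = p·y (which coincides with y = x when p = 1), and the degenerate case y = 0, which the claim says does not arise in practice, is simply excluded. The function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def keep_user(x: float, y: float) -> bool:
    """Apply the quadrant rules of step 4.2) to one user's (x, y)."""
    if y == 0.0:
        return False  # P is undefined; stated not to occur in practice
    if x >= 0 and y < 0:
        return True   # similar to liked content, dissimilar to disliked content
    if x <= 0 and y > 0:
        return False  # dissimilar to liked content, similar to disliked content
    ratio = x / y     # assumed form of the ratio P
    if x > 0:         # here y > 0 as well
        return ratio >= 1
    return ratio <= 1  # here x < 0 and y < 0


def distance_to_line(x: float, y: float, p: float = 1.0) -> float:
    """Distance from (x, y) to the assumed line x - p*y = 0."""
    return abs(x - p * y) / math.sqrt(1.0 + p * p)


def select_targets(scores: Dict[str, Tuple[float, float]],
                   top_m: int,
                   p: float = 1.0) -> List[str]:
    """Screen users, rank by distance in descending order, keep the top M."""
    kept = [(distance_to_line(x, y, p), user)
            for user, (x, y) in scores.items() if keep_user(x, y)]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [user for _, user in kept[:top_m]]
```

Under these assumptions a user at (1, -1) lies farthest from the line and is ranked first, which matches the optimal target user described for the values of x and y.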
7. The online content recommendation method based on a deep neural network according to claim 6, characterized in that the cosine distances x and y are calculated according to the following formula:
where V_V is the content-to-be-pushed vector obtained in step S2, and the other two operands are the positive behavior vector and the negative behavior vector of user u, respectively; the closer x is to 1, the more correlated the user's positive behavior vector is with the content-to-be-pushed vector, reflecting that the user is more likely to be interested in the pushed content; the closer x is to -1, the less correlated the two are, reflecting that the user is more likely to be uninterested in the pushed content; the closer y is to 1, the more correlated the user's negative behavior vector is with the content-to-be-pushed vector, reflecting that the user is more likely to be uninterested in the content; the closer y is to -1, the less correlated the two are, reflecting that the user is more likely to be interested in the content; given the practical meaning of x and y, the optimal target user for a pushed content item is the one with x = 1 and y = -1.
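The quantities x and y range over [-1, 1], which is consistent with the standard cosine of the angle between two vectors. A minimal Python sketch of that computation follows; returning 0 for a zero-length vector is a choice made here for illustration, and the names `cosine` and `user_scores` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v, in the range [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # illustrative choice for a zero vector
    return dot / (norm_u * norm_v)


def user_scores(content_vec: Sequence[float],
                positive_vec: Sequence[float],
                negative_vec: Sequence[float]) -> Tuple[float, float]:
    """Return (x, y): cosines of the content vector with the positive and negative vectors."""
    return cosine(positive_vec, content_vec), cosine(negative_vec, content_vec)
```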
8. The online content recommendation method based on a deep neural network according to claim 6, characterized in that, in step 4.3), the selected threshold p takes the value 1.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510883752.1A CN105279288B (en) 2015-12-04 2015-12-04 A kind of online content recommendation method based on deep neural network

Publications (2)

Publication Number Publication Date
CN105279288A CN105279288A (en) 2016-01-27
CN105279288B true CN105279288B (en) 2018-08-24

Family

ID=55148302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510883752.1A Expired - Fee Related CN105279288B (en) 2015-12-04 2015-12-04 A kind of online content recommendation method based on deep neural network

Country Status (1)

Country Link
CN (1) CN105279288B (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107025228B (en) * 2016-01-29 2021-01-26 阿里巴巴集团控股有限公司 Question recommendation method and equipment
CN107193832A (en) * 2016-03-15 2017-09-22 北京京东尚科信息技术有限公司 Similarity method for digging and device
CN105787100A (en) * 2016-03-18 2016-07-20 浙江大学 User session recommendation method based on deep neural network
CN106227792B (en) * 2016-07-20 2019-10-15 百度在线网络技术(北京)有限公司 Method and apparatus for pushed information
CN106934007B (en) * 2017-02-14 2021-02-12 北京时间股份有限公司 Associated information pushing method and device
WO2018218405A1 (en) * 2017-05-27 2018-12-06 深圳大学 Similar user selecting method and device
CN107908698B (en) * 2017-11-03 2021-04-13 广州索答信息科技有限公司 Topic web crawler method, electronic device, storage medium and system
CN108182621A (en) * 2017-12-07 2018-06-19 合肥美的智能科技有限公司 The Method of Commodity Recommendation and device for recommending the commodity, equipment and storage medium
CN108804595B (en) * 2018-05-28 2021-07-27 中山大学 Short text representation method based on word2vec
CN108831548B (en) * 2018-06-21 2021-07-06 中国联合网络通信集团有限公司 Remote intelligent medical optimization method, device and system
CN109190372B (en) * 2018-07-09 2021-11-12 四川大学 JavaScript malicious code detection method based on bytecode
CN109190046A (en) * 2018-09-18 2019-01-11 北京点网聚科技有限公司 Content recommendation method, device and content recommendation service device
CN109508421B (en) * 2018-11-26 2020-11-13 中国电子科技集团公司第二十八研究所 Word vector-based document recommendation method
CN110765368B (en) * 2018-12-29 2020-10-27 滴图(北京)科技有限公司 Artificial intelligence system and method for semantic retrieval
CN110222328B (en) * 2019-04-08 2022-11-22 平安科技(深圳)有限公司 Method, device and equipment for labeling participles and parts of speech based on neural network and storage medium
CN112307312A (en) * 2019-07-30 2021-02-02 北京三好互动教育科技有限公司 Article recommendation method and device
CN110597977B (en) * 2019-09-16 2022-01-11 腾讯科技(深圳)有限公司 Data processing method, data processing device, computer equipment and storage medium
CN111191833B (en) * 2019-12-25 2023-04-18 武汉美和易思数字科技有限公司 Intelligent experiment process recommendation method and system based on neural network
CN113301065B (en) * 2020-02-24 2022-07-08 北京达佳互联信息技术有限公司 Content pushing method and device, electronic equipment and storage medium
CN112836081A (en) * 2021-03-01 2021-05-25 腾讯音乐娱乐科技(深圳)有限公司 Neural network model training method, information recommendation method and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104408115A (en) * 2014-11-25 2015-03-11 三星电子(中国)研发中心 Semantic link based recommendation method and device for heterogeneous resource of TV platform
CN104462593A (en) * 2014-12-29 2015-03-25 北京奇虎科技有限公司 Method and device for providing user personalized resource message pushing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100903961B1 (en) * 2007-12-17 2009-06-25 한국전자통신연구원 Indexing And Searching Method For High-Demensional Data Using Signature File And The System Thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
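Algorithm design for intelligent recommendation of personalized video content (个性化视频内容智能推荐的算法设计); Liu Yingying et al.; 《视听界(广播电视技术)》 (Audio-Visual World, Broadcasting and Television Technology); 20131231; full text *
Research on community detection in social network analysis (社会网络分析之社区发现研究); Wang Na et al.; 《深圳大学学报理工版》 (Journal of Shenzhen University, Science and Engineering edition); 20140131; Vol. 31 (No. 1); full text *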

Also Published As

Publication number Publication date
CN105279288A (en) 2016-01-27

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180824

Termination date: 20181204