CN103164470A - Directional application method based on user gender distinguished results and system thereof - Google Patents

Directional application method based on user gender distinguished results and system thereof

Publication number
CN103164470A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN 201110422555
Other languages
Chinese (zh)
Inventor
曹臻
张秉豪
邓爱林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shengqu Information Technology (Shanghai) Co., Ltd.
Original Assignee
Shanda Computer Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Shanda Computer Shanghai Co Ltd
Priority to CN 201110422555
Publication of CN103164470A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a targeted application method based on user gender determination results, and a system thereof. The relationship between gender tendency and behavioral data is calculated from the behavioral data and known genders of sample users; the gender tendency of every user is then obtained from the collected behavioral data of all users and that relationship. Because real, objective, and complete user behavioral data are used to determine the gender of Internet users, the calculated results are accurate and reliable, and fairly accurate gender information can be obtained even when a user's real gender information is missing or false. The method and system greatly improve the accuracy and efficiency of gender-based targeted applications such as personalized search, personalized recommendation, and targeted advertisement delivery, and thereby improve the efficiency of personalized Internet applications.

Description

Targeted application method based on user gender determination results, and system thereof
Technical field
The present invention relates to the field of Internet applications, and in particular to a targeted application method based on user gender determination results and a system thereof.
Background art
As Internet applications grow richer, user demands keep rising. The application models and marketing characteristics of online media have changed enormously. The shift from Web 1.0, in which users browse HTML pages through a browser, toward Web 2.0, with richer content, stronger interactivity, and stronger personalization, is the new trend in Internet development.
Identifying users' gender and age makes it possible to build a complete user profile system for Internet users and to gain insight into their behavior patterns and preferences. Only when such a system supports the various Web 2.0 applications can their personalization truly be realized, for example in personalized search, personalized recommendation, and targeted advertisement delivery.
However, because of the particular nature of Internet applications, collecting users' gender information is difficult. The vast majority of users are unwilling to fill in the relevant information, or fill it in incorrectly, so most websites cannot obtain complete and true gender information for their users. For example, the invention patents with application numbers 200610117050.3 and 200810226414.0 use face recognition to determine a user's gender, which has limitations because not all users are willing to upload photographs of themselves.
At present, most Internet applications have a log collection mechanism that records users' behavioral data. A system and method are therefore needed that combine user behavioral data with data mining classification algorithms to determine user gender, so as to cover most of a website's users and effectively overcome the problem of users being unwilling to fill in gender information or filling it in incorrectly.
Summary of the invention
The object of the present invention is to provide a targeted application method and system based on user gender determination results that can calculate a user's gender tendency fairly accurately, effectively overcome the problem that a user's gender cannot be known accurately because the user is unwilling to fill in gender information or fills it in incorrectly, and improve the efficiency of gender-based targeted applications such as personalized search, personalized recommendation, and targeted advertisement delivery.
To solve the above problem, the invention provides a targeted application method based on user gender determination results, comprising the following steps:
Step 1: collect and organize the behavioral data of sample users of a website whose true gender is known;
Step 2: obtain the relationship between gender tendency and behavioral data from the behavioral data and known true gender of the sample users, and store it in the database of the website;
Step 3: collect and organize the behavioral data of all users of the website;
Step 4: obtain the gender tendency of all users from the relationship between gender tendency and behavioral data and from the behavioral data of all users, and store it in the database of the website;
Step 5: query the database of the website and output the gender tendency of the queried user;
Step 6: provide the queried user with information based on the output gender tendency of the queried user.
Further, in step 1, the collected and organized behavioral data of the sample users comprise: the content of the website accessed by each sample user, and each sample user's access weight for each content.
Further, in step 1, the collected and organized behavioral data of the sample users also comprise: the behavior time and/or behavior vector of each sample user's access to the website.
Further, in step 3, the collected and organized behavioral data of all users of the website comprise: each content of the website accessed by each user, and each user's access weight for each content.
Further, in step 3, the collected and organized behavioral data of all users of the website also comprise: the behavior time and/or behavior vector of each user's access to the website.
Further, in step 2, the relationship between gender tendency and behavioral data comprises:
the relationship between the gender tendency of the website as a whole and the behavioral data of all sample users; and
the relationship between the gender tendency of each content of the website and the behavioral data of the sample users who access that content.
Further, the relationship between the gender tendency of the website as a whole and the behavioral data of all sample users is calculated as:
$$P(g) = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} r_{ij}\,u_i(g)}{\sum_{i=1}^{I}\sum_{j=1}^{J} r_{ij}};$$
The relationship between the gender tendency of each content of the website and the behavioral data of all sample users is calculated as:
$$P(g \mid w_j) = \frac{\sum_{i=1}^{I} r_{ij}\,u_i(g)}{\sum_{i=1}^{I} r_{ij}}$$
where $P(g \mid w_j)$ is the probability that the $j$-th content tends toward gender $g$; $u_i(g)$ is a Dirac function indicating whether the $i$-th sample user has gender $g$; and $r_{ij}$ is the access weight of the $i$-th sample user for the $j$-th content.
Further, in step 4, the gender tendency of all users is calculated from the relationship between gender tendency and behavioral data and from the behavioral data of all users as:
$$P(g_0 \mid u_i) = \frac{1}{1 + \mathrm{odds}(g, X_{u_i})}, \qquad P(g_1 \mid u_i) = \frac{\mathrm{odds}(g, X_{u_i})}{1 + \mathrm{odds}(g, X_{u_i})},$$
$$\mathrm{odds}(g, X_{u_i}) = \frac{P(g_1)}{P(g_0)} \cdot \prod_{j=1}^{J} \left( \frac{P(g_1 \mid w_j)}{P(g_0 \mid w_j)} \Big/ \frac{P(g_1)}{P(g_0)} \right)^{r_{ij}};$$
where $g_0$ denotes female and $g_1$ denotes male; $P(g \mid w_j)$ is the probability that the $j$-th content tends toward gender $g$; $r_{ij}$ is the access weight of the queried user $u_i$ for the $j$-th content; $\mathrm{odds}(g, X_{u_i})$ is the gender likelihood ratio of the queried user $u_i$; $P(g_0)$ is the probability that the website as a whole tends toward female; and $P(g_1)$ is the probability that the website as a whole tends toward male.
Further, in step 2, the relationship between gender tendency and behavioral data also comprises: the relationship between a user's individual behavior and the user's gender tendency, obtained by processing the behavioral data and known true gender of the sample users with a decision tree, logistic regression, a neural network, or a support vector machine.
Further, in step 4, the gender tendency of all users is obtained from the relationship between a user's individual behavior and gender tendency and from the behavioral data of all users, and is stored in the database of the website.
Further, in step 5, when the database of the website is queried and the gender tendency of the queried user is output: if the database contains the gender tendency of the queried user, that gender tendency is output; if the database does not contain it, the content of the website that the queried user is currently accessing is captured, the gender tendency of that current content is looked up in the database, and the gender tendency of the current content is output as the gender tendency of the queried user; and if the database does not contain the gender tendency of the current content, the gender tendency of the website as a whole is output as the gender tendency of the queried user.
Accordingly, the present invention also provides a targeted application system based on user gender determination results, comprising:
a sample user data collection unit, configured to collect and organize the behavioral data of sample users of a website whose true gender is known;
a behavioral data and gender tendency relationship unit, configured to obtain the relationship between gender tendency and behavioral data from the behavioral data and known true gender of the sample users, and to store it in the database of the website;
an all-user data collection unit, configured to collect and organize the behavioral data of all users of the website;
a gender tendency calculation unit, configured to obtain the gender tendency of all users from the relationship between gender tendency and behavioral data and from the behavioral data of all users, and to store it in the database of the website;
a gender tendency output unit, configured to query the database of the website and output the gender tendency of the queried user;
a targeted application unit, configured to provide the queried user with information based on the gender tendency output by the gender tendency output unit.
Compared with the prior art, the targeted application method and system based on user gender determination results of the present invention calculate the relationship between gender tendency and behavioral data from the behavioral data and gender of sample users, and then obtain each user's gender tendency from that user's collected behavioral data and the calculated relationship. Real, objective, and complete user behavioral data are thus applied to determining the gender of Internet users, and the calculated results are accurate and reliable. When a user's real gender information is missing or false, fairly accurate gender information can still be obtained. The method and system greatly improve the accuracy and efficiency of gender-based targeted applications such as personalized search, personalized recommendation, and targeted advertisement delivery, and improve the efficiency of personalized Internet applications. Further, the present invention imposes no user filtering conditions and can cover essentially all users of a website, so user coverage is high. It also realizes a progressive determination process: as a user's behavioral data on the website accumulate, the accuracy of the calculated gender tendency keeps improving.
Description of drawings
Fig. 1 is a flowchart of the targeted application method based on user gender determination results according to embodiment one of the present invention;
Fig. 2 is a structural diagram of the targeted application system based on user gender determination results according to embodiment two of the present invention.
Embodiment
The targeted application method and system based on user gender determination results proposed by the present invention are described in further detail below with reference to the drawings and specific embodiments.
Embodiment one
As shown in Fig. 1, this embodiment provides a targeted application method based on user gender determination results, comprising:
Step 1: collect and organize the behavioral data of sample users of a website whose true gender is known;
Step 2: obtain the relationship between gender tendency and behavioral data from the behavioral data and known true gender of the sample users, and store it in the database of the website;
Step 3: collect and organize the behavioral data of all users of the website;
Step 4: obtain the gender tendency of all users from the relationship between gender tendency and behavioral data and from the behavioral data of all users, and store it in the database of the website;
Step 5: query the database of the website and output the gender tendency of the queried user;
Step 6: send the queried user, in a targeted manner, information based on the output gender tendency of the queried user.
It should be noted that, in the present invention, sample users are users whose true gender information is known; their gender data can be obtained from identity cards, customer service communication, questionnaires, and similar means. Behavioral data are the data generated when a user accesses any content of the website. Such content may be a specific web page, video, book, and so on, and its granularity can be generalized or refined. For example, specific videos can be generalized into a video category, which can in turn be subdivided into finance, sports, comedy, and so on. The behavioral data collected and organized for any user comprise: all the content of the website accessed by the user, the access frequency, and the access weight between the user and each content. All user behavioral data can be described by the parameter $S = (U, W, R)$, where $U = \{u_1, u_2, u_3, \ldots, u_I\}$ denotes the users; $W = \{w_1, w_2, w_3, \ldots, w_J\}$ denotes all the distinct content of the website accessed by all users; and $R = \{r_{ij}\}$ is the access matrix, in which $r_{ij}$ is the relationship weight between user $u_i$ and content $w_j$. In this embodiment, $r_{ij}$ is the access weight of user $u_i$ for content $w_j$ and can generally be estimated from the user's access frequency $f_j$ for that content, for example $r_{ij} = A_{w_j} \cdot f_j$, where $A_{w_j}$ is the weight coefficient of content $w_j$.
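As an illustration of the S = (U, W, R) representation, the following minimal Python sketch builds the access matrix R from an access log. The log format (a list of user and content pairs), the function name `build_access_weights`, and the default weight coefficient of 1.0 are assumptions made for this example and do not come from the patent; the weight follows the frequency-times-coefficient estimate mentioned above.

```python
from collections import Counter

def build_access_weights(log, content_coeff=None):
    """Return R as {user: {content: r_ij}}, with r_ij = coefficient * frequency."""
    content_coeff = content_coeff or {}
    freq = Counter(log)  # (user, content) -> access frequency
    weights = {}
    for (user, content), f in freq.items():
        a = content_coeff.get(content, 1.0)  # assumed default coefficient
        weights.setdefault(user, {})[content] = a * f
    return weights

# Hypothetical log: each entry is one access by a user to a content category.
log = [("u1", "sports"), ("u1", "sports"), ("u1", "finance"), ("u2", "comedy")]
R = build_access_weights(log, {"finance": 0.5})
```

Storing R as a nested dictionary rather than a dense matrix is a design choice for this sketch: most users access only a small fraction of a site's content, so the matrix is sparse.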
Accordingly, both the behavioral data of the sample users in step 1 and the behavioral data of all users in step 3 are collected and organized in the form $S = (U, W, R)$. That is, in step 1 the behavioral data of the sample users comprise the content of the website accessed by each sample user, the access frequency, and the access weight between each sample user and each content; in step 3 the behavioral data of all users comprise each content of the website accessed by each user, the access frequency, and the access weight between each user and each content. In other embodiments of the invention, the behavioral data of the sample users collected and organized in step 1 may also comprise the behavior time, the behavior vector, and any other relevant information that can describe each access to the website by each sample user; correspondingly, the behavioral data of all users collected and organized in step 3 also comprise the behavior time, the behavior vector, and any other relevant information that can describe each access to the website by each user.
In step 2 of this embodiment, the relationship between the gender tendency of the website as a whole and the behavioral data of all sample users is calculated as:
$$P(g) = \frac{\sum_{i=1}^{I}\sum_{j=1}^{J} r_{ij}\,u_i(g)}{\sum_{i=1}^{I}\sum_{j=1}^{J} r_{ij}}$$
In step 2 of this embodiment, the relationship between the gender tendency of each content of the website and the behavioral data of all sample users is calculated as:
$$P(g \mid w_j) = \frac{\sum_{i=1}^{I} r_{ij}\,u_i(g)}{\sum_{i=1}^{I} r_{ij}}$$
where $P(g \mid w_j)$ is the probability that the $j$-th content tends toward gender $g$; $u_i(g)$ is a Dirac function indicating whether the $i$-th sample user has gender $g$; and $r_{ij}$ is the access weight of the $i$-th sample user for the $j$-th content. If $g_0$ denotes female and $g_1$ denotes male, then $P(g_0) + P(g_1) = P(g_0 \mid w_j) + P(g_1 \mid w_j) = 1$.
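The two step 2 quantities can be sketched in Python as weighted shares of access weight. This is a minimal illustration, not the patent's implementation: the function names, the nested-dictionary layout of R, and the labels "g0" (female) and "g1" (male) as dictionary keys are assumptions, and all weights are assumed to be positive.

```python
def site_prior(R, gender):
    """P(g): share of the total access weight contributed by sample users of gender g."""
    totals = {"g0": 0.0, "g1": 0.0}
    for user, row in R.items():
        totals[gender[user]] += sum(row.values())
    grand = sum(totals.values())
    return {g: t / grand for g, t in totals.items()}

def content_tendency(R, gender):
    """P(g | w_j): share of content j's access weight from sample users of gender g."""
    acc = {}
    for user, row in R.items():
        for content, r in row.items():
            acc.setdefault(content, {"g0": 0.0, "g1": 0.0})[gender[user]] += r
    return {c: {g: v / sum(t.values()) for g, v in t.items()} for c, t in acc.items()}

# Hypothetical sample users with known gender and their access weights.
R = {"a": {"sports": 3, "comedy": 1}, "b": {"sports": 1, "comedy": 3}, "c": {"finance": 2}}
gender = {"a": "g1", "b": "g0", "c": "g1"}
prior = site_prior(R, gender)           # {"g0": 0.4, "g1": 0.6}
tendency = content_tendency(R, gender)  # e.g. tendency["sports"]["g1"] == 0.75
```

Because each content's two shares are normalized by the same total, the values for g0 and g1 sum to 1 for every content, matching the constraint stated above.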
In step 4 of this embodiment, a Bayesian method can be used to process the relationship between gender tendency and behavioral data together with the behavioral data of all users, so as to obtain the gender tendency of all users. Bayes' theorem states:
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
where $P(A)$ and $P(B)$ are the prior probabilities of events $A$ and $B$, $P(A \mid B)$ is the conditional probability that $A$ occurs given that $B$ has occurred, and $P(B \mid A)$ is the conditional probability that $B$ occurs given that $A$ has occurred.
In step 4 of this embodiment, let the behavior pattern be $X = \{x_1, x_2, x_3, \ldots, x_K\}$, that is, behavior pattern $X$ consists of $K$ content accesses, and let $C_i$ be the class label. Assuming that the individual accesses $x_k$ are mutually independent and that the content accessed in the $k$-th access is $w_k$, we have:
$$P(X \mid C_i) = \prod_{k=1}^{K} P(x_k \mid C_i) = \prod_{k=1}^{K} P(w_k \mid C_i)$$
Let $\mathrm{odds}(g, X)$ denote the gender likelihood ratio of users who have the same behavior pattern $X$:
$$\mathrm{odds}(g, X) = \frac{P(g_1 \mid X)}{P(g_0 \mid X)} = \frac{P(X \mid g_1)\,P(g_1)}{P(X \mid g_0)\,P(g_0)} = \frac{P(g_1)}{P(g_0)} \cdot \frac{\prod_{k=1}^{K} P(w_k \mid g_1)}{\prod_{k=1}^{K} P(w_k \mid g_0)} = \frac{P(g_1)}{P(g_0)} \cdot \prod_{k=1}^{K} \frac{P(g_1 \mid w_k) / P(g_0 \mid w_k)}{P(g_1) / P(g_0)}$$
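The last equality in the expression for odds(g, X) is not spelled out in the text. One way to justify it, offered here as a worked step rather than as part of the original derivation, is to apply Bayes' theorem to each factor $P(w_k \mid g)$, after which the content probability $P(w_k)$ cancels:

$$\frac{P(w_k \mid g_1)}{P(w_k \mid g_0)} = \frac{P(g_1 \mid w_k)\,P(w_k) / P(g_1)}{P(g_0 \mid w_k)\,P(w_k) / P(g_0)} = \frac{P(g_1 \mid w_k) / P(g_0 \mid w_k)}{P(g_1) / P(g_0)}$$

Each access therefore multiplies the prior odds by the ratio of the content's gender odds to the site-wide gender odds, so content whose audience matches the site average leaves the estimate unchanged.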
Let $X_{u_i}$ denote the behavior pattern of user $u_i$:
$$\mathrm{odds}(g, X_{u_i}) = \frac{P(g_1)}{P(g_0)} \cdot \prod_{k=1}^{K} \frac{P(g_1 \mid w_k) / P(g_0 \mid w_k)}{P(g_1) / P(g_0)} = \frac{P(g_1)}{P(g_0)} \cdot \prod_{j=1}^{J} \left( \frac{P(g_1 \mid w_j) / P(g_0 \mid w_j)}{P(g_1) / P(g_0)} \right)^{r_{ij}}$$
Using $\mathrm{odds}(g, X_{u_i})$, the gender probabilities of user $u_i$ are estimated as:
$$P(g_0 \mid u_i) = \frac{1}{1 + \mathrm{odds}(g, X_{u_i})}, \qquad P(g_1 \mid u_i) = \frac{\mathrm{odds}(g, X_{u_i})}{1 + \mathrm{odds}(g, X_{u_i})}$$
From the above formulas for the gender probabilities, together with the step 2 formulas for the gender tendency of the website as a whole and for the gender tendency of each content, the gender probability estimate of each user can be calculated and stored in the database as that user's gender tendency.
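The step 4 calculation can be sketched in Python as follows. This is a hedged illustration under several assumptions that are not in the patent: the function name `gender_posterior`, the dictionary layout, working in log space to avoid overflow in the product, clamping content probabilities with `eps` to avoid taking the logarithm of zero, and skipping content that has no stored tendency.

```python
import math

def gender_posterior(user_row, prior, tendency, eps=1e-6):
    """Return (P(g0|u), P(g1|u)) from the odds formula, computed in log space."""
    prior_log_odds = math.log(prior["g1"] / prior["g0"])
    log_odds = prior_log_odds
    for content, r in user_row.items():
        t = tendency.get(content)
        if t is None:
            continue  # assumption: content unseen in the sample adds no evidence
        p1 = min(max(t["g1"], eps), 1.0 - eps)  # assumption: clamp to avoid log(0)
        # Each unit of access weight scales the odds by (content odds / site odds).
        log_odds += r * (math.log(p1 / (1.0 - p1)) - prior_log_odds)
    # Numerically stable conversion from log-odds to P(g1 | u) = odds / (1 + odds).
    if log_odds >= 0:
        p_g1 = 1.0 / (1.0 + math.exp(-log_odds))
    else:
        e = math.exp(log_odds)
        p_g1 = e / (1.0 + e)
    return 1.0 - p_g1, p_g1

# Hypothetical values: a balanced site and one content with a 75% male tendency.
prior = {"g0": 0.5, "g1": 0.5}
tendency = {"sports": {"g0": 0.25, "g1": 0.75}}
p_g0, p_g1 = gender_posterior({"sports": 2}, prior, tendency)  # odds = 3^2 = 9, P(g1|u) is about 0.9
```

With a balanced prior and two accesses to content whose odds are 3 to 1, the odds become 9 and the male probability is 9/10, which matches the closed-form expressions above.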
In step 5 of this embodiment, when the database of the website is queried and the gender tendency of the queried user is output: if the database contains the gender tendency of the queried user, the user's gender tendency $P(g_0 \mid u)$ and $P(g_1 \mid u)$ is output; if the database does not contain it, the content of the website that the queried user is currently accessing is captured, the gender tendency $P(g_0 \mid w_j)$ and $P(g_1 \mid w_j)$ of that current content is looked up in the database, and it is output as the queried user's gender tendency $P(g_0 \mid u)$ and $P(g_1 \mid u)$; and if the database does not contain the gender tendency of the current content, the gender tendency of the website as a whole, $P(g_0)$ and $P(g_1)$, is output as the queried user's gender tendency $P(g_0 \mid u)$ and $P(g_1 \mid u)$.
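The three-level fallback of step 5 can be sketched as a simple lookup chain. The function name `lookup_tendency` and the use of in-memory dictionaries in place of the website database are assumptions for illustration only.

```python
def lookup_tendency(user_id, current_content, user_db, content_db, site_prior):
    """Return the gender tendency to output for a queried user (step 5 fallback)."""
    if user_id in user_db:             # 1. stored per-user tendency
        return user_db[user_id]
    if current_content in content_db:  # 2. tendency of the content being accessed
        return content_db[current_content]
    return site_prior                  # 3. tendency of the website as a whole

# Hypothetical stored values.
user_db = {"u1": {"g0": 0.1, "g1": 0.9}}
content_db = {"sports": {"g0": 0.25, "g1": 0.75}}
site = {"g0": 0.4, "g1": 0.6}

known = lookup_tendency("u1", "comedy", user_db, content_db, site)
new_on_known_content = lookup_tendency("u9", "sports", user_db, content_db, site)
new_on_new_content = lookup_tendency("u9", "comedy", user_db, content_db, site)
```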
As can be seen from the above, this embodiment is a method of determining the gender tendency of Internet users based on a Bayesian classification algorithm. In step 2, the prior probability of the gender tendency of the website as a whole and the prior probability of the gender tendency of each content are obtained from the collected behavioral data and known true gender of the sample users; in step 4, these are combined with the behavioral data of each user collected in step 3, and the Bayesian formula is used to calculate the posterior probability of each user's gender tendency. A gender tendency can be calculated for a user with any number of accesses, so the method can cover the great majority of a website's users. When a user accesses website content for the first time, the behavior is somewhat accidental and the confidence of the calculated result is relatively low. As the number of accesses grows and a fairly stable behavior pattern forms, the calculated result gradually stabilizes. A progressive determination process is thus realized: as a user's behavioral data on the website accumulate, the accuracy of the calculated gender tendency keeps improving and the effect of chance decreases. No training process is required, and maintenance cost is low.
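The progressive character of the Bayesian estimate can be illustrated by updating a user's log-odds one access at a time. The class `RunningTendency` and its methods are hypothetical names for this sketch, and each content's male probability is assumed to lie strictly between 0 and 1.

```python
import math

class RunningTendency:
    """Keeps a user's gender log-odds and updates it after every access."""

    def __init__(self, prior_g1):
        self.prior_log_odds = math.log(prior_g1 / (1.0 - prior_g1))
        self.log_odds = self.prior_log_odds  # no accesses yet: site-wide prior

    def observe(self, p_g1_given_content, weight=1.0):
        # Multiply the odds by (content odds / site odds), raised to the weight.
        p = p_g1_given_content
        self.log_odds += weight * (math.log(p / (1.0 - p)) - self.prior_log_odds)

    def p_g1(self):
        # Numerically stable conversion from log-odds to probability.
        if self.log_odds >= 0:
            return 1.0 / (1.0 + math.exp(-self.log_odds))
        e = math.exp(self.log_odds)
        return e / (1.0 + e)

# Hypothetical user on a balanced site who keeps accessing content with P(g1|w) = 0.75.
user = RunningTendency(prior_g1=0.5)
history = []
for _ in range(3):
    user.observe(0.75)
    history.append(user.p_g1())  # roughly 0.75, 0.90, 0.964
```

Each additional consistent access moves the estimate further from the prior, which is one reading of the statement that the result stabilizes as behavioral data accumulate; no retraining of the content-level probabilities is needed for the per-user update.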
In other embodiments of the invention, step 2 may instead use a classification algorithm such as a decision tree, logistic regression, a neural network, or a support vector machine to process the behavioral data and known true gender of the sample users and obtain the relationship between a user's individual behavior and gender tendency; in step 4, the gender tendency of all users is then obtained from that relationship and the behavioral data of all users. For example, if step 2 uses such a classification algorithm to obtain a functional relationship between a user's individual behavior and gender tendency, then in step 4 the behavioral data of each user collected in step 3 are substituted into that functional relationship to obtain each user's gender tendency. User behavior patterns may change gradually over time. To maintain high accuracy, a method that determines the gender tendency of Internet users based on such classification algorithms needs to train and update its model regularly or irregularly, so its maintenance cost is higher. In addition, such a method cannot realize a progressive process in which the accuracy of the gender determination improves gradually as user behavior accumulates. Once a fairly accurate user gender has been obtained, the targeted application method based on user gender determination results of the present invention can provide the user with information based on that gender, which greatly improves the accuracy and efficiency of gender-based targeted applications such as personalized search, personalized recommendation, and targeted advertisement delivery, and improves the efficiency of personalized Web 2.0 applications.
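For the classifier-based alternative, the following sketch trains a small logistic regression by stochastic gradient descent on sample users' access weights. It is only one possible reading of the "functional relationship" mentioned above: the feature layout (one column per content), the learning rate, the epoch count, and the function names are all assumptions, and a production system would more likely use an established machine learning library.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights w and bias b so that sigmoid(w.x + b) approximates P(g1 | x)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # gradient of the log loss with respect to the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict_g1(x, w, b):
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical sample users: columns are access weights for two contents;
# the label is 1 for male (g1) and 0 for female (g0).
X = [[3, 0], [2, 1], [0, 3], [1, 2]]
y = [1, 1, 0, 0]
w, b = train_logistic(X, y)
```

Unlike the Bayesian update, the fitted weights are fixed once training ends, so the model has to be refitted when behavior patterns drift, which is the maintenance cost the paragraph above refers to.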
Embodiment two
As shown in Fig. 2, this embodiment provides a targeted application system based on user gender determination results, comprising:
a sample user data collection unit 21, configured to collect and organize the behavioral data of sample users of a website whose true gender is known;
a behavioral data and gender tendency relationship unit 22, configured to obtain the relationship between gender tendency and behavioral data from the behavioral data and known true gender of the sample users of the sample user data collection unit 21, and to store it in the database of the website;
an all-user data collection unit 23, configured to collect and organize the behavioral data of all users of the website;
a gender tendency calculation unit 24, configured to obtain the gender tendency of all users from the relationship between gender tendency and behavioral data of the relationship unit 22 and from the behavioral data of all users of the all-user data collection unit 23, and to store it in the database of the website;
a gender tendency output unit 25, configured to query the database of the website and output the gender tendency of the queried user;
a targeted application unit 26, configured to provide the queried user with information based on the gender tendency output by the gender tendency output unit 25.
In this embodiment, the behavioral data of the sample users collected and organized by the sample user data collection unit 21 comprise: each content of the website accessed by each sample user, the access frequency, and the access weight between each sample user and each content. The behavioral data of all users of the website collected and organized by the all-user data collection unit 23 comprise: each content of the website accessed by each user, the access frequency, and the access weight between each user and each content.
In this embodiment, the relationship between gender tendency and behavioral data obtained by the behavioral data and gender tendency relationship unit 22 comprises: the relationship between the gender tendency of the website as a whole and the behavioral data of all sample users, and the relationship between the gender tendency of each content of the website and the behavioral data of all sample users. The gender tendency calculation unit 24 uses the relationship obtained by the relationship unit 22, together with the behavioral data of all users, to obtain the gender tendency of all users.
In this embodiment, when the gender tendency output unit 25 queries the database of the website and outputs the gender tendency of the queried user: if the database contains the gender tendency of the queried user, that gender tendency is output; if the database does not contain it, the content of the website that the queried user is currently accessing is captured, the gender tendency of that current content is looked up in the database, and the gender tendency of the current content is output as the gender tendency of the queried user; and if the database does not contain the gender tendency of the current content, the gender tendency of the website as a whole is output as the gender tendency of the queried user.
It should be noted that, in this embodiment, the behavioral data and gender tendency relationship unit 22 can obtain the prior probability of the gender tendency of the website as a whole and the prior probability of the gender tendency of each content from the collected behavioral data and known true gender of the sample users. The gender tendency calculation unit 24 then combines these with the behavioral data of each user collected by the all-user data collection unit 23 and uses the Bayesian formula to calculate the posterior probability of each user's gender tendency, thereby realizing a system for determining the gender tendency of Internet users based on a Bayesian classification algorithm. When a user accesses website content for the first time, the behavior is somewhat accidental and the confidence of the calculated result is relatively low. As the number of accesses grows and a fairly stable behavior pattern forms, the calculated result gradually stabilizes. A progressive determination process is thus realized: as a user's behavioral data on the website accumulate, the accuracy of the calculated gender tendency keeps improving and the effect of chance decreases. No training process is required, and maintenance cost is low.
In other embodiments of the invention, the behavioral data and gender tendency relationship unit 22 may instead use a classification algorithm such as a decision tree, logistic regression, a neural network, or a support vector machine to process the behavioral data and known true gender of the sample users and obtain the relationship between a user's individual behavior and gender tendency. The gender tendency calculation unit 24 then obtains the gender tendency of all users from that relationship and from the behavioral data of all users collected by the all-user data collection unit 23. For example, if the relationship unit 22 uses such a classification algorithm to obtain a functional relationship between a user's individual behavior and gender tendency, the gender tendency calculation unit 24 substitutes the behavioral data of each user in the all-user data collection unit 23 into that functional relationship to obtain each user's gender tendency. User behavior patterns may change gradually over time. To maintain high accuracy, a gender determination system that determines the gender tendency of Internet users based on such classification algorithms needs to train and update its model regularly or irregularly, so its maintenance cost is higher, and it cannot realize a progressive process in which the accuracy of the gender determination improves gradually as user behavior accumulates.
The directional application system based on user gender discrimination results of the present invention may be any application system that relies on user gender, such as personalized search, personalized recommendation, or targeted advertisement delivery.
In summary, the directional application method and system based on user gender discrimination results of the present invention compute the relation between gender tendency and behavioral data from the behavioral data and genders of sample users, and then obtain each user's gender tendency from that user's collected behavioral data and this relation. Users' real, objective, and complete behavioral data is thus applied to Internet user gender discrimination, and the computed results are accurate and reliable. Where a user's true gender information is missing or false, relatively accurate gender information can still be obtained. The method and system greatly improve the accuracy and efficiency of gender-based directional applications such as personalized search, personalized recommendation, and targeted advertisement delivery, and thereby improve the efficiency of personalized Internet applications. Further, the present invention imposes no user filtering conditions, so it can cover essentially all users of the site and achieves high user coverage. It can also realize a progressive discrimination process: as a user's behavioral data on the internet site keeps accumulating, the accuracy of the computed gender tendency keeps improving.
Obviously, those skilled in the art can make various changes and modifications to the invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

Claims (22)

1. A directional application method based on user gender discrimination results, characterized by comprising:
Step 1: collecting and organizing the behavioral data of sample users of an internet site whose true gender is known;
Step 2: obtaining the relation between gender tendency and behavioral data from the behavioral data and known true genders of the sample users, and storing it in the database of the internet site;
Step 3: collecting and organizing the behavioral data of all users of the internet site;
Step 4: obtaining the gender tendency of all users from the relation between gender tendency and behavioral data and from the behavioral data of all users, and storing it in the database of the internet site;
Step 5: querying the database of the internet site and outputting the gender tendency of the user to be queried;
Step 6: providing the user to be queried with information based on the output gender tendency of that user.
2. The directional application method based on user gender discrimination results as claimed in claim 1, characterized in that, in step 1, the collected and organized behavioral data of the sample users comprises: the content items of the internet site accessed by each sample user, the access frequency, and the access weight between each sample user and each content item.
3. The directional application method based on user gender discrimination results as claimed in claim 2, characterized in that, in step 1, the collected and organized behavioral data of the sample users further comprises: the behavior time and/or behavior vector of each sample user's access to the internet site.
4. The directional application method based on user gender discrimination results as claimed in claim 2, characterized in that, in step 3, the collected and organized behavioral data of all users of the internet site comprises: the content items of the internet site accessed by each user, the access frequency, and the access weight between each user and each content item.
5. The directional application method based on user gender discrimination results as claimed in claim 4, characterized in that, in step 3, the collected and organized behavioral data of all users of the internet site further comprises: the behavior time and/or behavior vector of each user's access to the internet site.
6. The directional application method based on user gender discrimination results as claimed in claim 2, characterized in that, in step 2, the relation between gender tendency and behavioral data comprises:
the relation between the gender tendency of the internet site as a whole and the behavioral data of all sample users; and
the relation between the gender tendency of each content item of the internet site and the behavioral data of the sample users who access that content item.
7. The directional application method based on user gender discrimination results as claimed in claim 6, characterized in that the formula for the relation between the gender tendency of the internet site as a whole and the behavioral data of all sample users is:
$$P(g) = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} r_{ij}\, u_i(g)}{\sum_{i=1}^{I} \sum_{j=1}^{J} r_{ij}};$$
and the formula for the relation between the gender tendency of each content item of the internet site and the behavioral data of all sample users is:
$$P(g \mid w_j) = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} r_{ij}\, u_i(g)}{\sum_{i=1}^{I} r_{ij}}$$
where $P(g \mid w_j)$ denotes the probability that the $j$-th content item is inclined toward gender $g$; $u_i(g)$ is a Dirac function indicating whether the $i$-th sample user has gender $g$; and $r_{ij}$ denotes the access weight of the $i$-th sample user on the $j$-th content item.
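The two formulas of claim 7 can be sketched in code as weighted counts. This is a minimal sketch under stated assumptions: the function name, the dictionary layout and the toy data are hypothetical, $u_i(g)$ is treated as a 0/1 indicator, and the numerator of $P(g \mid w_j)$ is read as a sum over sample users $i$ only for the fixed content item $j$, which is the reading under which the ratio is a probability per content item.

```python
from collections import defaultdict


def gender_priors(weights, genders):
    """Return (site_prior, item_prior) from sample-user data.

    weights -- dict: user id -> dict: content id -> access weight r_ij
    genders -- dict: user id -> known gender label (e.g. "g0" or "g1")
    """
    labels = sorted(set(genders.values()))
    site_num = {g: 0.0 for g in labels}
    site_den = 0.0
    item_num = defaultdict(lambda: {g: 0.0 for g in labels})
    item_den = defaultdict(float)

    for user, visits in weights.items():
        g = genders[user]  # u_i(g) is 1 for this label and 0 for the others
        for item, r in visits.items():
            site_num[g] += r
            site_den += r
            item_num[item][g] += r
            item_den[item] += r

    if site_den == 0:
        raise ValueError("no access weights in the sample data")

    site_prior = {g: site_num[g] / site_den for g in labels}
    item_prior = {
        item: {g: item_num[item][g] / item_den[item] for g in labels}
        for item in item_num
        if item_den[item] > 0
    }
    return site_prior, item_prior


# Hypothetical sample users with known gender (g0 = female, g1 = male).
weights = {
    "u1": {"news": 2, "games": 1},
    "u2": {"games": 3},
    "u3": {"news": 1, "fashion": 2},
}
genders = {"u1": "g1", "u2": "g1", "u3": "g0"}
site_prior, item_prior = gender_priors(weights, genders)
print(site_prior)
print(item_prior["news"])
```

In this toy data the total weight is 9, of which 6 comes from male sample users, so the site-wide prior is 6/9 for `g1` and 3/9 for `g0`.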
8. The directional application method based on user gender discrimination results as claimed in claim 7, characterized in that, in step 4, the formula for obtaining the gender tendency of all users from the relation between gender tendency and behavioral data and from the behavioral data of all users is:
$$P(g_0 \mid u_i) = \frac{1}{1 + \mathrm{odds}(g, X_{u_i})}, \qquad P(g_1 \mid u_i) = \frac{\mathrm{odds}(g, X_{u_i})}{1 + \mathrm{odds}(g, X_{u_i})},$$
$$\mathrm{odds}(g, X_{u_i}) = \frac{P(g_1)}{P(g_0)} \cdot \prod_{j=1}^{J} \left( \frac{P(g_1 \mid w_j)}{P(g_0 \mid w_j)} \Big/ \frac{P(g_1)}{P(g_0)} \right)^{r_{ij}};$$
where $g_0$ denotes female and $g_1$ denotes male; $P(g \mid w_j)$ denotes the probability that the $j$-th content item is inclined toward gender $g$; $r_{ij}$ denotes the access weight of the user to be queried $u_i$ on the $j$-th content item; $\mathrm{odds}(g, X_{u_i})$ is the gender likelihood ratio of the user to be queried $u_i$; $P(g_0)$ denotes the probability of a female tendency for the internet site as a whole; and $P(g_1)$ denotes the probability of a male tendency for the internet site as a whole.
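The posterior of claim 8 can be evaluated in log space to avoid overflow when many content items are multiplied together. The following is a sketch, not the patent's implementation: the function name and data layout are hypothetical, exactly two genders are assumed so that $P(g_0 \mid w_j) = 1 - P(g_1 \mid w_j)$, and the clamping parameter `eps` is an added assumption that keeps a content item seen by only one gender from producing an infinite ratio.

```python
import math


def gender_posterior(user_weights, site_prior, item_prior, eps=1e-6):
    """Return {"g0": P(g0|u_i), "g1": P(g1|u_i)} for one user.

    user_weights -- dict: content id -> access weight r_ij of this user
    site_prior   -- {"g0": P(g0), "g1": P(g1)} for the whole site
    item_prior   -- dict: content id -> {"g0": P(g0|w_j), "g1": P(g1|w_j)}
    eps          -- clamp for per-content probabilities (assumption, not in the source)
    """
    prior_log_odds = math.log(site_prior["g1"] / site_prior["g0"])
    log_odds = prior_log_odds
    for item, r in user_weights.items():
        probs = item_prior.get(item)
        if probs is None:
            continue  # content without a known tendency contributes nothing
        p1 = min(max(probs["g1"], eps), 1.0 - eps)
        p0 = 1.0 - p1
        # One factor of the product: (content odds / site odds) ** r_ij
        log_odds += r * (math.log(p1 / p0) - prior_log_odds)

    # P(g1|u_i) = odds / (1 + odds), computed as a stable logistic function
    if log_odds >= 0:
        p_male = 1.0 / (1.0 + math.exp(-log_odds))
    else:
        e = math.exp(log_odds)
        p_male = e / (1.0 + e)
    return {"g0": 1.0 - p_male, "g1": p_male}


# Hypothetical priors and one user who visited "games" with weight 2.
site_prior = {"g0": 0.4, "g1": 0.6}
item_prior = {
    "news": {"g0": 0.5, "g1": 0.5},
    "games": {"g0": 0.2, "g1": 0.8},
}
print(gender_posterior({"games": 2}, site_prior, item_prior))  # g1 is 32/35, about 0.914
print(gender_posterior({}, site_prior, item_prior))            # falls back to the site prior
```

For the first call the odds are 1.5 × (4 / 1.5)² = 32/3, giving P(g1 | u) = 32/35. A user with no recorded behavior simply receives the site-wide prior.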
9. The directional application method based on user gender discrimination results as claimed in claim 6, characterized in that, in step 2, the relation between gender tendency and behavioral data further comprises: using a decision tree, logistic regression, a neural network, or a support vector machine to process the behavioral data and known true genders of the sample users and obtain the relation between a user's individual behavior and that user's gender tendency.
10. The directional application method based on user gender discrimination results as claimed in claim 9, characterized in that, in step 4, the gender tendency of all users is obtained from the relation between a user's individual behavior and gender tendency and from the behavioral data of all users, and is stored in the database of the internet site.
11. The directional application method based on user gender discrimination results as claimed in claim 6, characterized in that, in step 5, when the database of the internet site is queried and the gender tendency of the user to be queried is output: if the database of the internet site contains the gender tendency of the user to be queried, that gender tendency is output; if the database of the internet site does not contain the gender tendency of the user to be queried, the current content of the internet site that the user is currently accessing is captured, the gender tendency of that current content is looked up in the database of the internet site, and the gender tendency of the current content is output as the gender tendency of the user to be queried; if the database of the internet site does not contain the gender tendency of the current content either, the gender tendency of the internet site as a whole is output as the gender tendency of the user to be queried.
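The three-level fallback of claim 11 can be sketched as a simple lookup chain. The function name, the in-memory dictionaries standing in for the site database, and the returned source tag are hypothetical choices made for illustration.

```python
def lookup_gender_tendency(user_id, current_item, user_db, item_db, site_prior):
    """Return (tendency, source) for the user to be queried.

    user_db    -- dict: user id -> stored gender tendency of that user
    item_db    -- dict: content id -> stored gender tendency of that content
    site_prior -- gender tendency of the internet site as a whole
    """
    if user_id in user_db:
        return user_db[user_id], "user"        # the user's own tendency is stored
    if current_item in item_db:
        return item_db[current_item], "content"  # fall back to the current content
    return site_prior, "site"                  # last resort: whole-site tendency


# Hypothetical database contents.
user_db = {"u1": {"g0": 0.1, "g1": 0.9}}
item_db = {"fashion": {"g0": 0.8, "g1": 0.2}}
site_prior = {"g0": 0.4, "g1": 0.6}

print(lookup_gender_tendency("u1", "fashion", user_db, item_db, site_prior))
print(lookup_gender_tendency("new-user", "fashion", user_db, item_db, site_prior))
print(lookup_gender_tendency("new-user", "unseen-page", user_db, item_db, site_prior))
```

Returning the source alongside the tendency is optional; it only makes visible which of the three levels answered the query.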
12. A directional application system based on user gender discrimination results, characterized by comprising:
a sample user data collection unit, configured to collect and organize the behavioral data of sample users of an internet site whose true gender is known;
a behavioral data and gender tendency relation unit, configured to obtain the relation between gender tendency and behavioral data from the behavioral data and known true genders of the sample users, and to store it in the database of the internet site;
an all-user data collection unit, configured to collect and organize the behavioral data of all users of the internet site;
a gender tendency computing unit, configured to obtain the gender tendency of all users from the relation between gender tendency and behavioral data and from the behavioral data of all users, and to store it in the database of the internet site;
a gender tendency output unit, configured to query the database of the internet site and output the gender tendency of the user to be queried;
a directional application unit, configured to provide the user to be queried with information based on the gender tendency output by the gender tendency output unit.
13. The directional application system based on user gender discrimination results as claimed in claim 12, characterized in that the behavioral data of the sample users collected and organized by the sample user data collection unit comprises: the content items of the internet site accessed by each sample user, the access frequency, and the access weight between each sample user and each content item.
14. The directional application system based on user gender discrimination results as claimed in claim 13, characterized in that the behavioral data of the sample users collected and organized by the sample user data collection unit further comprises: the behavior time and/or behavior vector of each sample user's access to the internet site.
15. The directional application system based on user gender discrimination results as claimed in claim 13, characterized in that the behavioral data of all users of the internet site collected and organized by the all-user data collection unit comprises: the content items of the internet site accessed by each user, the access frequency, and the access weight between each user and each content item.
16. The directional application system based on user gender discrimination results as claimed in claim 15, characterized in that the behavioral data of all users of the internet site collected and organized by the all-user data collection unit further comprises: the behavior time and/or behavior vector of each user's access to the internet site.
17. The directional application system based on user gender discrimination results as claimed in claim 13, characterized in that the relation between gender tendency and behavioral data obtained by the behavioral data and gender tendency relation unit comprises:
the relation between the gender tendency of the internet site as a whole and the behavioral data of all sample users; and
the relation between the gender tendency of each content item of the internet site and the behavioral data of the sample users who access that content item.
18. The directional application system based on user gender discrimination results as claimed in claim 17, characterized in that the formula for the relation between the gender tendency of the internet site as a whole and the behavioral data of all sample users is:
$$P(g) = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} r_{ij}\, u_i(g)}{\sum_{i=1}^{I} \sum_{j=1}^{J} r_{ij}};$$
and the formula for the relation between the gender tendency of each content item of the internet site and the behavioral data of all sample users is:
$$P(g \mid w_j) = \frac{\sum_{i=1}^{I} \sum_{j=1}^{J} r_{ij}\, u_i(g)}{\sum_{i=1}^{I} r_{ij}}$$
where $P(g \mid w_j)$ denotes the probability that the $j$-th content item is inclined toward gender $g$; $u_i(g)$ is a Dirac function indicating whether the $i$-th sample user has gender $g$; and $r_{ij}$ denotes the access weight of the $i$-th sample user on the $j$-th content item.
19. The directional application system based on user gender discrimination results as claimed in claim 18, characterized in that the formula by which the gender tendency computing unit obtains the gender tendency of all users from the relation between gender tendency and behavioral data and from the behavioral data of all users is:
$$P(g_0 \mid u_i) = \frac{1}{1 + \mathrm{odds}(g, X_{u_i})}, \qquad P(g_1 \mid u_i) = \frac{\mathrm{odds}(g, X_{u_i})}{1 + \mathrm{odds}(g, X_{u_i})},$$
$$\mathrm{odds}(g, X_{u_i}) = \frac{P(g_1)}{P(g_0)} \cdot \prod_{j=1}^{J} \left( \frac{P(g_1 \mid w_j)}{P(g_0 \mid w_j)} \Big/ \frac{P(g_1)}{P(g_0)} \right)^{r_{ij}};$$
where $g_0$ denotes female and $g_1$ denotes male; $P(g \mid w_j)$ denotes the probability that the $j$-th content item is inclined toward gender $g$; $r_{ij}$ denotes the access weight of the user to be queried $u_i$ on the $j$-th content item; $\mathrm{odds}(g, X_{u_i})$ is the gender likelihood ratio of the user to be queried $u_i$; $P(g_0)$ denotes the probability of a female tendency for the internet site as a whole; and $P(g_1)$ denotes the probability of a male tendency for the internet site as a whole.
20. The directional application system based on user gender discrimination results as claimed in claim 17, characterized in that the relation between gender tendency and behavioral data obtained by the behavioral data and gender tendency relation unit further comprises: using a decision tree, logistic regression, a neural network, or a support vector machine to process the behavioral data and known true genders of the sample users and obtain the relation between a user's individual behavior and that user's gender tendency.
21. The directional application system based on user gender discrimination results as claimed in claim 20, characterized in that the gender tendency computing unit obtains the gender tendency of all users from the relation between a user's individual behavior and gender tendency and from the behavioral data of all users, and stores it in the database of the internet site.
22. The directional application system based on user gender discrimination results as claimed in claim 17, characterized in that, when the gender tendency output unit queries the database of the internet site and outputs the gender tendency of the user to be queried: if the database of the internet site contains the gender tendency of the user to be queried, that gender tendency is output; if the database of the internet site does not contain the gender tendency of the user to be queried, the current content of the internet site that the user is currently accessing is captured, the gender tendency of that current content is looked up in the database of the internet site, and the gender tendency of the current content is output as the gender tendency of the user to be queried; if the database of the internet site does not contain the gender tendency of the current content either, the gender tendency of the internet site as a whole is output as the gender tendency of the user to be queried.
CN 201110422555 2011-12-15 2011-12-15 Directional application method based on user gender distinguished results and system thereof Pending CN103164470A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN 201110422555 CN103164470A (en) 2011-12-15 2011-12-15 Directional application method based on user gender distinguished results and system thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN 201110422555 CN103164470A (en) 2011-12-15 2011-12-15 Directional application method based on user gender distinguished results and system thereof

Publications (1)

Publication Number Publication Date
CN103164470A true CN103164470A (en) 2013-06-19

Family

ID=48587564

Family Applications (1)

Application Number Title Priority Date Filing Date
CN 201110422555 Pending CN103164470A (en) 2011-12-15 2011-12-15 Directional application method based on user gender distinguished results and system thereof

Country Status (1)

Country Link
CN (1) CN103164470A (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646341A (en) * 2013-11-29 2014-03-19 北京奇虎科技有限公司 A method and an apparatus for recommending website-provided objects
CN103729785A (en) * 2014-01-26 2014-04-16 合一信息技术(北京)有限公司 Video user gender classification method and device for method
CN104205100A (en) * 2014-06-05 2014-12-10 深圳市推想大数据信息技术有限公司 Method and device for estimating recessive character distribution of users
CN104281634A (en) * 2014-03-13 2015-01-14 电子科技大学 Neighborhood-based mobile subscriber basic attribute forecasting method
CN104317822A (en) * 2014-09-29 2015-01-28 新浪网技术(中国)有限公司 Population property prediction method and device of network user
CN104598452A (en) * 2013-10-30 2015-05-06 北京思博途信息技术有限公司 Method and device for analyzing user gender
CN104636504A (en) * 2015-03-10 2015-05-20 飞狐信息技术(天津)有限公司 Method and system for identifying sexuality of user
CN105095401A (en) * 2015-07-07 2015-11-25 北京嘀嘀无限科技发展有限公司 Method and apparatus for identifying gender
CN105426395A (en) * 2015-10-28 2016-03-23 上汽通用汽车有限公司 Audience portrait generation method and system
CN105809557A (en) * 2016-03-15 2016-07-27 微梦创科网络科技(中国)有限公司 Method and device for mining genders of users in social network
US9489592B2 (en) 2014-12-05 2016-11-08 Xerox Corporation User characteristic prediction using images posted in online social networks
CN106372151A (en) * 2016-08-30 2017-02-01 多盟睿达科技(中国)有限公司 Message push method and message push device based on user gender recognition
CN106371750A (en) * 2016-08-30 2017-02-01 北京奇艺世纪科技有限公司 Method and device for confirming user gender
CN106656943A (en) * 2015-11-03 2017-05-10 秒针信息技术有限公司 Network user attribute matching method and device
CN106897727A (en) * 2015-12-21 2017-06-27 百度在线网络技术(北京)有限公司 A kind of user's gender identification method and device
CN107180044A (en) * 2016-03-09 2017-09-19 精硕科技(北京)股份有限公司 Recognize Internet user's sex method and system
CN107704547A (en) * 2017-09-26 2018-02-16 硕诺科技(深圳)有限公司 One kind passes through mobile phone usage behavior identity method for distinguishing
WO2019120007A1 (en) * 2017-12-22 2019-06-27 Oppo广东移动通信有限公司 Method and apparatus for predicting user gender, and electronic device
CN111582900A (en) * 2019-02-19 2020-08-25 腾讯科技(深圳)有限公司 Media file delivery method and device, storage medium and electronic device

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104598452A (en) * 2013-10-30 2015-05-06 北京思博途信息技术有限公司 Method and device for analyzing user gender
CN104598452B (en) * 2013-10-30 2018-09-11 秒针信息技术有限公司 User's gender analysis method and apparatus
CN103646341B (en) * 2013-11-29 2018-06-22 北京奇虎科技有限公司 A kind of website provides the recommendation method and apparatus of object
CN103646341A (en) * 2013-11-29 2014-03-19 北京奇虎科技有限公司 A method and an apparatus for recommending website-provided objects
CN103729785A (en) * 2014-01-26 2014-04-16 合一信息技术(北京)有限公司 Video user gender classification method and device for method
CN104281634A (en) * 2014-03-13 2015-01-14 电子科技大学 Neighborhood-based mobile subscriber basic attribute forecasting method
CN104281634B (en) * 2014-03-13 2018-04-20 电子科技大学 A kind of mobile subscriber's primary attribute Forecasting Methodology based on neighborhood
WO2015184619A1 (en) * 2014-06-05 2015-12-10 深圳市推想大数据信息技术有限公司 Method and apparatus for estimating recessive character distribution of users
CN104205100B (en) * 2014-06-05 2018-02-02 北京推想科技有限公司 A kind of method and device for the recessive character distribution for estimating user
CN104205100A (en) * 2014-06-05 2014-12-10 深圳市推想大数据信息技术有限公司 Method and device for estimating recessive character distribution of users
CN104317822A (en) * 2014-09-29 2015-01-28 新浪网技术(中国)有限公司 Population property prediction method and device of network user
CN104317822B (en) * 2014-09-29 2018-02-27 新浪网技术(中国)有限公司 The ascribed characteristics of population Forecasting Methodology and device of the network user
US9489592B2 (en) 2014-12-05 2016-11-08 Xerox Corporation User characteristic prediction using images posted in online social networks
CN104636504A (en) * 2015-03-10 2015-05-20 飞狐信息技术(天津)有限公司 Method and system for identifying sexuality of user
CN105095401A (en) * 2015-07-07 2015-11-25 北京嘀嘀无限科技发展有限公司 Method and apparatus for identifying gender
CN105426395A (en) * 2015-10-28 2016-03-23 上汽通用汽车有限公司 Audience portrait generation method and system
CN105426395B (en) * 2015-10-28 2019-02-19 上汽通用汽车有限公司 A kind of audient draws a portrait generation method and system
CN106656943A (en) * 2015-11-03 2017-05-10 秒针信息技术有限公司 Network user attribute matching method and device
CN106656943B (en) * 2015-11-03 2019-09-17 秒针信息技术有限公司 A kind of matching process and device of network user's attribute
CN106897727A (en) * 2015-12-21 2017-06-27 百度在线网络技术(北京)有限公司 A kind of user's gender identification method and device
WO2017107422A1 (en) * 2015-12-21 2017-06-29 百度在线网络技术(北京)有限公司 Method and device for user gender identification
CN107180044A (en) * 2016-03-09 2017-09-19 精硕科技(北京)股份有限公司 Recognize Internet user's sex method and system
CN105809557A (en) * 2016-03-15 2016-07-27 微梦创科网络科技(中国)有限公司 Method and device for mining genders of users in social network
CN106371750A (en) * 2016-08-30 2017-02-01 北京奇艺世纪科技有限公司 Method and device for confirming user gender
CN106372151A (en) * 2016-08-30 2017-02-01 多盟睿达科技(中国)有限公司 Message push method and message push device based on user gender recognition
CN106372151B (en) * 2016-08-30 2019-10-08 多盟睿达科技(中国)有限公司 A kind of information push method and device based on the identification of user's gender
CN107704547A (en) * 2017-09-26 2018-02-16 硕诺科技(深圳)有限公司 One kind passes through mobile phone usage behavior identity method for distinguishing
CN107704547B (en) * 2017-09-26 2022-01-14 英望科技(山东)有限公司 Method for identifying gender through mobile phone using behaviors
WO2019120007A1 (en) * 2017-12-22 2019-06-27 Oppo广东移动通信有限公司 Method and apparatus for predicting user gender, and electronic device
CN111582900A (en) * 2019-02-19 2020-08-25 腾讯科技(深圳)有限公司 Media file delivery method and device, storage medium and electronic device

Similar Documents

Publication Publication Date Title
CN103164470A (en) Directional application method based on user gender distinguished results and system thereof
CN104281718B (en) A kind of method that intelligent recommendation is excavated based on user group's behavioral data
CN102609533B (en) Kernel method-based collaborative filtering recommendation system and method
CN109934721A (en) Finance product recommended method, device, equipment and storage medium
CN103955842B (en) A kind of online advertisement commending system and method towards mass media data
CN103329151B (en) Recommendation based on topic cluster
WO2021025926A1 (en) Digital content prioritization to accelerate hyper-targeting
CN101841435B (en) Method, apparatus and system for detecting abnormality of DNS (domain name system) query flow
CN104166668A (en) News recommendation system and method based on FOLFM model
CN101645066B (en) Method for monitoring novel words on Internet
CN103235812B (en) Method and system for identifying multiple query intents
CN103514239A (en) Recommendation method and system integrating user behaviors and object content
CN106447463A (en) Commodity recommendation method based on Markov decision-making process model
CN102541920A (en) Method and device for improving accuracy degree by collaborative filtering jointly based on user and item
CN102663022A (en) Classification recognition method based on URL (uniform resource locator)
CN105005708B (en) A kind of broad sense load Specialty aggregation method based on AP clustering algorithms
CN104851025A (en) Case-reasoning-based personalized recommendation method for E-commerce website commodity
Rahmani et al. Solving economic dispatch problem using particle swarm optimization by an evolutionary technique for initializing particles
CN102902775A (en) Internet real-time computing method and internet real-time computing system
Feng et al. [Retracted] Design and Simulation of Human Resource Allocation Model Based on Double‐Cycle Neural Network
CN103034963A (en) Service selection system and selection method based on correlation
CN108572988A (en) A kind of house property assessment data creation method and device
CN110188268A (en) A kind of personalized recommendation method based on label and temporal information
CN106095939A (en) The acquisition methods of account authority and device
CN108171545A (en) A kind of conversion ratio predictor method based on level of hierarchy data

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
ASS Succession or assignment of patent right

Owner name: SHENGQU INFORMATION TECH (SHANGHAI) CO., LTD.

Free format text: FORMER OWNER: SHANDA NETWORKING CO., LTD.

Effective date: 20130909

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 201203 PUDONG NEW AREA, SHANGHAI TO: 200241 MINHANG, SHANGHAI

TA01 Transfer of patent application right

Effective date of registration: 20130909

Address after: 200241 No. 1, building 690, blue wave road, Zhangjiang hi tech park, Shanghai

Applicant after: Shengqu Information Technology (Shanghai) Co., Ltd.

Address before: 201203 712-A room, No. 625 Zhangjiang Road, Shanghai, Pudong New Area

Applicant before: Shanda computer (Shanghai) Co., Ltd.

C05 Deemed withdrawal (patent law before 1993)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130619