CN108241989A - Otherness data capture method and device - Google Patents

Info

Publication number
CN108241989A
Authority
CN
China
Prior art keywords
index
otherness
crowd
crowds
probability
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611218196.7A
Other languages
Chinese (zh)
Inventor
卢金金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gridsum Technology Co Ltd
Original Assignee
Beijing Gridsum Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gridsum Technology Co Ltd
Priority to CN201611218196.7A
Publication of CN108241989A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0251 Targeted advertisements
    • G06Q30/0269 Targeted advertisements based on user profile or attribute
    • G06Q30/0271 Personalized advertisement
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241 Advertisements
    • G06Q30/0277 Online advertisement

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Finance (AREA)
  • Game Theory and Decision Science (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present invention provides an otherness data capture method and device. The method first determines the attribute information of each of two crowds and then, according to that attribute information, selects from at least one index at least one reference index capable of characterizing the otherness of the two crowds; that is, the at least one reference index can reflect the otherness between the two crowds. The target group index with which each crowd pays attention to the at least one reference index is then obtained and, based on these target group indexes, at least one item of otherness data reflecting the otherness between the two crowds is obtained. This avoids mis-delivering personalized advertisements when they are provided for different crowds, thereby improving the ad conversion rate.

Description

Otherness data capture method and device
Technical field
The present invention relates to the technical field of big data analysis, and more particularly to an otherness data capture method and device.
Background
In recent years, with the rapid development of the Internet, the Internet advertising market has kept expanding. The arrival of the big data era in particular has greatly improved the ability to collect and analyze user information. In a big data environment, Internet advertising is measurable and its effects are trackable, which makes it possible to analyze the behavior and preferences of different crowds, provide personalized advertisements for different crowds, and improve ad conversion rates. As a result, precise online advertising is attracting increasing attention.
The prior art mainly uses the TGI (Target Group Index) analysis method to analyze the behavior and preferences of different crowds. For example, suppose crowd a and crowd b each contain 20000 users, 4 users in crowd a pay attention to automobiles, and 1 user in crowd b pays attention to automobiles. The TGI analysis method gives a target group index (TGI) of 160 for crowd a and 40 for crowd b, which leads to the conclusion that crowd a and crowd b differ markedly in their attention to the automobile index and that investment in automobile advertising for crowd a can be increased. However, when each crowd contains 20000 users, a comparison based on 1 or 4 users obviously cannot accurately describe the overall situation of crowd a and crowd b.
It can be seen that the otherness data corresponding to the indexes determined by prior-art methods for analyzing the behavior and preferences of different crowds may not accurately reflect the otherness between the crowds. When personalized advertisements are provided for different crowds, such data may serve as a misleading reference and cause personalized advertisements to be mis-delivered, thereby reducing the ad conversion rate.
Summary of the invention
In view of the above problems, the present invention is proposed to provide an otherness data capture method and device that overcome the above problems or at least partially solve them.
An otherness data capture method, including:
determining the attribute information of each of two crowds and at least one index that the two crowds pay attention to, wherein the attribute information of each crowd includes the total number of users of the crowd and the number of target users in the crowd who pay attention to each of the at least one index;
selecting, from the at least one index and according to the attribute information of the two crowds, at least one reference index capable of characterizing the otherness of the two crowds;
obtaining, for each crowd, the target group index corresponding to each of the at least one reference index;
obtaining, based on the target group indexes of the at least one reference index in each crowd, at least one item of otherness data reflecting the otherness between the two crowds.
Wherein, the otherness data is the difference between the target group indexes of the two crowds for the corresponding reference index.
Preferably, the method further includes:
sorting the at least one reference index according to the difference between the target group indexes corresponding to each reference index;
selecting, from the at least one reference index and according to the sorting result, the reference indexes that meet a preset sorting condition.
Wherein, selecting, from the at least one index and according to the attribute information of the two crowds, at least one reference index capable of characterizing the otherness of the two crowds includes:
calculating, according to the attribute information of the two crowds, a probability for each of the at least one index, wherein the probability of each index characterizes the possibility that the two crowds have otherness on that index;
selecting, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold.
Preferably, the method further includes: presetting a confidence level for each of the at least one index, wherein the confidence level of each index characterizes the possibility that the index can reflect that the two crowds have otherness;
calculating, according to the attribute information of the two crowds, a probability for each of the at least one index includes:
determining the type of probability distribution that matches the at least one index;
calculating, according to the attribute information of the two crowds, the statistic of the at least one index under the probability distribution, wherein the statistic of each index characterizes the one-sided upper confidence limit of the confidence level of that index;
determining the probability density function of the at least one index under the probability distribution;
calculating the accumulated probability of each index according to its probability density function and its statistic.
Wherein, selecting, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold includes:
taking the confidence level of each index as the preset threshold, and taking the accumulated probability of the index as the probability of the index;
selecting, from the at least one index, at least one reference index whose accumulated probability is greater than or equal to the corresponding confidence level.
An otherness data capture device, including:
a first determining module, for determining the attribute information of each of two crowds and at least one index that the two crowds pay attention to, wherein the attribute information of each crowd includes the total number of users of the crowd and the number of target users in the crowd who pay attention to each of the at least one index;
a first screening module, for selecting, from the at least one index and according to the attribute information of the two crowds, at least one reference index capable of characterizing the otherness of the two crowds;
a first acquisition module, for obtaining, for each crowd, the target group index corresponding to each of the at least one reference index;
a second acquisition module, for obtaining, based on the target group indexes of the at least one reference index in each crowd, at least one item of otherness data reflecting the otherness between the two crowds.
Wherein, the otherness data is the difference between the target group indexes of the two crowds for the corresponding reference index.
Preferably, the device further includes:
a sorting module, for sorting the at least one reference index according to the difference between the target group indexes corresponding to each reference index;
a second screening module, for selecting, from the at least one reference index and according to the sorting result, the reference indexes that meet a preset sorting condition.
Wherein, the first screening module includes:
a computing unit, for calculating, according to the attribute information of the two crowds, a probability for each of the at least one index, wherein the probability of each index characterizes the possibility that the two crowds have otherness on that index;
a screening unit, for selecting, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold.
With the above technical solution, the otherness data capture method provided by the present invention first determines the attribute information of each of two crowds and then, according to that attribute information, selects from at least one index at least one reference index capable of characterizing the otherness of the two crowds; that is, the at least one reference index can reflect the otherness between the two crowds. The target group index with which each crowd pays attention to the at least one reference index is then obtained and, based on these target group indexes, at least one item of otherness data reflecting the otherness between the two crowds is obtained. This avoids mis-delivering personalized advertisements when they are provided for different crowds, thereby improving the ad conversion rate.
The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented according to the contents of the specification, and so that the above and other objects, features and advantages of the present invention are more apparent, specific embodiments of the present invention are set out below.
Description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flow diagram of an otherness data capture method provided by an embodiment of the present application;
Fig. 2 shows a flow diagram of the sorting method in an otherness data capture method provided by an embodiment of the present application;
Fig. 3 shows a flow diagram of one implementation, in an otherness data capture method provided by an embodiment of the present application, of selecting from the at least one index, according to the attribute information of the two crowds, at least one reference index capable of characterizing the otherness of the two crowds;
Fig. 4 shows a flow chart of one implementation, in an otherness data capture method provided by an embodiment of the present application, of calculating the probability of each of the at least one index according to the attribute information of the two crowds;
Fig. 5 shows a structure diagram of an otherness data capture device provided by an embodiment of the present application.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.
Referring to Fig. 1, a flow diagram of an otherness data capture method provided by an embodiment of the present application, the method includes:
Step S101: Determine the attribute information of each of two crowds and at least one index that the two crowds pay attention to.
The attribute information of each crowd includes the total number of users of the crowd and the number of target users in the crowd who pay attention to each of the at least one index.
The embodiments of the present application provide, but are not limited to, the following two ways of dividing crowds:
Network users can be divided into different crowds, for example by age. If an advertisement for products aimed at university students needs to be delivered, users aged 18 to 25 can be treated as the university-student crowd and users of other ages as the non-university-student crowd. The university-student crowd and the non-university-student crowd can then serve as the two crowds in the embodiments of the present application.
For a given advertiser, the advertisement to be delivered may be an automobile advertisement. Crowds can be divided according to whether a network user has registered on an automobile-related website: users who have registered on such a website form the automobile crowd, and users who have not form the non-automobile crowd. The automobile crowd and the non-automobile crowd can then serve as the two crowds in the embodiments of the present application.
An index can be a numeric index and/or an attribute index and/or a distribution index, wherein:
Numeric indexes include: average session duration, average number of page views, and so on.
Attribute indexes include: the number of users paying attention to automobiles, the number of users paying attention to book reading, the number of users located in Beijing, and the number of users searching for competitor-product keywords.
Distribution indexes include: the distribution of a crowd across different cities.
The two crowds in the embodiments of the present application can be determined in many ways, which are not specifically limited here.
Indexes can be acquired in many ways; the embodiments of the present application provide, but are not limited to, the following:
First, using cookies.
A cookie is data that a website stores on the user's local terminal in order to identify the user and track sessions. An advertiser can use cookies to detect which information a network user has browsed, such as automobile information, cosmetics information or clothing information, and determine indexes from this information, such as an automobile index, a cosmetics index or a clothing index.
Second, using the account information of network users.
Using the account information of a network user, such as an e-mail address, mobile phone number or ID card number, the information that the user browses after logging in to these accounts is obtained, and indexes are determined from it.
Third, using MAC addresses.
A MAC address is built into the network interface card and can uniquely identify a device. The MAC address can be used to determine which information a network user has browsed, and indexes are determined from it.
Step S102: According to the attribute information of the two crowds, select from the at least one index at least one reference index capable of characterizing the otherness of the two crowds.
In the embodiments of the present application, an index among the at least one index that can characterize the otherness of the two crowds is called a reference index.
Again taking crowd a and crowd b as an example, each crowd contains 20000 users, 4 users in crowd a pay attention to automobiles, and 1 user in crowd b pays attention to automobiles. Since a comparison based on 1 or 4 users obviously cannot accurately describe the overall situation of crowd a and crowd b, the automobile index is not a reference index that can characterize the otherness of the two crowds.
Suppose 5000 users in crowd a and 2600 users in crowd b pay attention to cosmetics. Since a comparison based on 5000 or 2600 users can clearly describe the overall situation of crowd a and crowd b accurately, the cosmetics index is a reference index that can characterize the otherness of the two crowds.
There are many methods for selecting, from the at least one index, at least one reference index that can characterize the otherness of the two crowds; the embodiments of the present application provide, but are not limited to, the following method:
For each crowd, calculate the ratio of the number of target users paying attention to each of the at least one index to the total number of users of that crowd; then select, from the at least one index, at least one reference index whose ratio is greater than or equal to a preset value.
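The ratio-based screening described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `screen_by_ratio`, the dictionary layout and the 5% preset value are assumptions, and the sketch assumes an index qualifies only when the ratio reaches the preset value in both crowds, a point the text leaves open.

```python
def screen_by_ratio(target_counts, totals, preset=0.05):
    """Return indexes whose target-user ratio reaches `preset` in both crowds.

    target_counts: {index name: (target users in crowd a, target users in crowd b)}
    totals: (total users in crowd a, total users in crowd b)
    """
    reference = []
    for name, counts in target_counts.items():
        ratios = [count / total for count, total in zip(counts, totals)]
        # Assumption: both crowds must reach the preset value.
        if all(ratio >= preset for ratio in ratios):
            reference.append(name)
    return reference


# Figures from the crowd a / crowd b example in the text.
counts = {"automobile": (4, 1), "cosmetics": (5000, 2600)}
print(screen_by_ratio(counts, totals=(20000, 20000)))
```

With these figures only the cosmetics index passes, which matches the reasoning in the text that 1 or 4 users out of 20000 are too few to describe a crowd.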
Step S103: Obtain, for each crowd, the target group index corresponding to each of the at least one reference index.
Suppose the two crowds in the embodiments of the present application are called the first crowd and the second crowd. Let A be the number of target users in the first crowd paying attention to a reference index, B the number of target users in the second crowd paying attention to the same reference index, C the total number of users of the first crowd, and D the total number of users of the second crowd. The target group index TGI of each reference index in the first crowd is then calculated as follows:
$$TGI_1 = \frac{A/C}{(A+B)/(C+D)} \times 100$$
The target group index TGI of each reference index in the second crowd is calculated as follows:
$$TGI_2 = \frac{B/D}{(A+B)/(C+D)} \times 100$$
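The TGI calculation can be sketched in Python as follows. The sketch assumes that a crowd's TGI is its share of attentive users divided by the share across both crowds combined, multiplied by 100; this assumption reproduces the values 160 and 40 quoted in the background example. The function and parameter names are illustrative only.

```python
def target_group_index(target_users, total_users, other_target_users, other_total_users):
    """TGI of one crowd for one index, relative to both crowds combined."""
    crowd_share = target_users / total_users
    overall_share = (target_users + other_target_users) / (total_users + other_total_users)
    return crowd_share / overall_share * 100


# Background example: 4 of 20000 users in crowd a and 1 of 20000 users in
# crowd b pay attention to automobiles.
tgi_a = target_group_index(4, 20000, 1, 20000)
tgi_b = target_group_index(1, 20000, 4, 20000)
print(round(tgi_a), round(tgi_b))  # 160 40
```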
Step S104: Based on the target group indexes of the at least one reference index in each crowd, obtain at least one item of otherness data reflecting the otherness between the two crowds.
Optionally, the otherness data is the difference between the target group indexes of the two crowds for the corresponding reference index.
In the otherness data capture method provided by the embodiments of the present application, the attribute information of each of two crowds is determined first and then, according to that attribute information, at least one reference index capable of characterizing the otherness of the two crowds is selected from the at least one index; that is, the at least one reference index can reflect the otherness between the two crowds. The target group index with which each crowd pays attention to the at least one reference index is then obtained and, based on these target group indexes, at least one item of otherness data reflecting the otherness between the two crowds is obtained. This avoids mis-delivering personalized advertisements when they are provided for different crowds and improves the ad conversion rate.
Referring to Fig. 2, a flow diagram of the sorting method in an otherness data capture method provided by an embodiment of the present application, the method includes:
Step S201: Sort the at least one reference index according to the difference between the target group indexes corresponding to each reference index.
For each reference index, each crowd has a target group index for that reference index, so the difference between the two crowds' target group indexes for that reference index can be calculated.
Step S202: According to the sorting result, select from the at least one reference index the reference indexes that meet a preset sorting condition.
The preset sorting condition can include: the corresponding difference is greater than or equal to a preset difference; or the reference index is among the first M reference indexes of the sorting result (sorted by difference in descending order), where M is a positive integer greater than or equal to 1.
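Steps S201 and S202 can be sketched as follows. The function name, the parameter names `top_m` and `min_difference`, and the input layout are assumptions made for illustration; the TGI values are those of the automobile and novel examples worked through later in the text.

```python
def select_by_tgi_difference(tgi_pairs, top_m=None, min_difference=None):
    """Sort reference indexes by TGI difference (descending) and select some.

    tgi_pairs: {index name: (TGI in first crowd, TGI in second crowd)}
    top_m: keep the first M indexes after sorting, if given.
    min_difference: keep indexes whose difference is at least this value, if given.
    """
    differences = {name: abs(first - second) for name, (first, second) in tgi_pairs.items()}
    ranked = sorted(differences.items(), key=lambda item: item[1], reverse=True)
    if min_difference is not None:
        ranked = [item for item in ranked if item[1] >= min_difference]
    if top_m is not None:
        ranked = ranked[:top_m]
    return ranked


pairs = {"automobile": (90.11, 109.89), "novel": (40.26, 159.74)}
print(select_by_tgi_difference(pairs, top_m=1))
```

Either condition can be used on its own; passing neither simply returns all reference indexes in descending order of difference.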
Referring to Fig. 3, a flow diagram of one implementation, in an otherness data capture method provided by an embodiment of the present application, of selecting from the at least one index, according to the attribute information of the two crowds, at least one reference index capable of characterizing the otherness of the two crowds, the method includes:
Step S301: According to the attribute information of the two crowds, calculate a probability for each of the at least one index, wherein the probability of each index characterizes the possibility that the two crowds have otherness on that index.
Step S302: Select, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold.
Referring to Fig. 4, a flow chart of one implementation, in an otherness data capture method provided by an embodiment of the present application, of calculating the probability of each of the at least one index according to the attribute information of the two crowds, the method includes:
Step S401: Determine the type of probability distribution that matches the at least one index.
A confidence level is preset for each of the at least one index; the confidence level of each index characterizes the possibility that the index can reflect that the two crowds have otherness.
In the embodiments of the present application, confidence refers to the degree to which the truth of a particular proposition is believed for a particular individual such as an index (for example, the proposition that the index can reflect that the two crowds have otherness). The confidence level is the probability that the population parameter value falls within a certain interval around the sample statistic, and is generally expressed as 1 − α; the confidence interval is the error range between the sample statistic and the population parameter value at a given confidence level. The larger the confidence interval, the higher the confidence level.
In the embodiments of the present application the confidence level may be, for example, 90%, 95%, 96% or 97%. In short, the higher the confidence level is set, the greater the possibility that the index is believed to reflect that the two crowds have otherness.
Step S402: According to the attribute information of the two crowds, calculate the statistic of the at least one index under the probability distribution, wherein the statistic of each index characterizes the one-sided upper confidence limit of the confidence level of that index.
A statistic is a variable used in statistical theory to analyze and test data. Different probability distributions have different formulas for the statistic. For example, the statistic for the normal distribution is calculated as follows, where δ is the variance, n is the dimension of the variable, μ0 is the expectation, and the remaining symbol denotes the standardized variable:
As another example, for the chi-square distribution, assume the first crowd and the second crowd pay attention to an index as shown in Table 1.
Table 1: Attention of the first crowd and the second crowd to the index
The statistic of the chi-square distribution is then calculated as follows:
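The chi-square formula is not reproduced in this text. As a reference, a standard form of the Pearson chi-square statistic for a 2×2 table is given below. The cell labels are assumptions introduced here, not taken from Table 1: a and b are the numbers of users in the first crowd who do and do not pay attention to the index, and c and d are the same counts for the second crowd.

```latex
\chi^2 = \frac{n\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}, \qquad n = a + b + c + d
```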
Step S403: Determine the probability density function of the at least one index under the probability distribution.
Different probability distributions have different probability density functions.
For example, the probability density function of the normal distribution is:
The probability density function of the chi-square distribution is:
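The density formulas are not reproduced in this text. The textbook forms are given below as a reference; the symbols μ (expectation), σ (standard deviation) and k (degrees of freedom) are notation assumed here and may differ from the notation of the original filing. The first expression is the normal density and the second is the chi-square density.

```latex
f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
\qquad
f_k(x) = \frac{x^{k/2-1}\,e^{-x/2}}{2^{k/2}\,\Gamma(k/2)}, \quad x > 0
```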
Step S404: Calculate the accumulated probability of each index according to its probability density function and its statistic.
Taking the chi-square distribution as an example, the accumulated probability is obtained by integrating the probability density function from 0 up to the value of the statistic.
The above method is illustrated below taking the chi-square distribution as an example. Suppose the total numbers of users of the first crowd and the second crowd are both 20000, the number of users paying attention to automobiles in the first crowd is 2611, and the number of users paying attention to automobiles in the second crowd is 3184.
Assume that the automobile index can reflect that the two crowds have otherness, and that the confidence level of this hypothesis is 97% (different probability distributions may use the same confidence level; the two are not related).
Establish the contingency table of the first crowd and the second crowd relative to the automobile index data.
Table 2: Relation table of the first crowd and the second crowd relative to the automobile index data
Calculate the chi-square value of the automobile index:
In the relation table of the first crowd and the second crowd relative to the automobile index data, the numeric part with light horizontal-line shading is the contingency table.
According to the χ² value and the degrees of freedom corresponding to the two crowds, determine the accumulated probability that the index can reflect that the two crowds have otherness.
Degrees of freedom = (r − 1) × (c − 1), where r is the number of rows and c is the number of columns of the contingency table. Here r = c = 2, so the number of degrees of freedom is 1.
Still taking Table 2 as an example, the accumulated probability of the index is 1 − 4.44 × 10^{-16}, which is clearly greater than 97%; therefore the hypothesis that the automobile index can reflect that the two crowds have otherness holds.
Still taking Table 2 as an example, the target group index TGI_1 of the first crowd is calculated as follows:
$$TGI_1 = \frac{2611/20000}{(2611+3184)/(20000+20000)} \times 100 \approx 90.11$$
The target group index TGI_2 of the second crowd is calculated as follows:
$$TGI_2 = \frac{3184/20000}{(2611+3184)/(20000+20000)} \times 100 \approx 109.89$$
Target group index difference of the automobile index = |109.89 − 90.11| = 19.78.
Suppose further that the numbers of users paying attention to novels in the first crowd and the second crowd are 122 and 484 respectively. With the above calculation method, the accumulated probability of the novel index is close to 100%; for the novel index, the TGI of the first crowd is 40.26 and the TGI of the second crowd is 159.74. This shows that the first crowd and the second crowd differ significantly in their attention to the novel index, and the target group index difference of the novel index is 119.48.
Finally, the same method is applied to crowd a and crowd b with respect to the automobile index: 4 users in crowd a and 1 user in crowd b pay attention to automobiles. For crowd a and crowd b, the accumulated probability of the automobile index is 82%; assuming a confidence level of 95%, the automobile index obviously cannot reflect the otherness of crowd a and crowd b.
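The worked examples above can be checked with a short Python sketch. It assumes the Pearson chi-square statistic for a 2×2 table without continuity correction and uses the closed form of the chi-square cumulative distribution for one degree of freedom, erf(sqrt(x / 2)). The function names are illustrative, and the 97% level applied to the novel index is an assumption, since the text does not state one for that index.

```python
import math


def chi_square_2x2(target_a, total_a, target_b, total_b):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table."""
    a, b = target_a, total_a - target_a   # first crowd: attentive / not attentive
    c, d = target_b, total_b - target_b   # second crowd: attentive / not attentive
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def accumulated_probability(chi_square):
    """Chi-square CDF for 1 degree of freedom: P(X <= x) = erf(sqrt(x / 2))."""
    return math.erf(math.sqrt(chi_square / 2))


def is_reference_index(target_a, total_a, target_b, total_b, confidence_level):
    """True when the accumulated probability reaches the confidence level."""
    chi_square = chi_square_2x2(target_a, total_a, target_b, total_b)
    return accumulated_probability(chi_square) >= confidence_level


# Worked examples from the text (each crowd has 20000 users).
print(is_reference_index(2611, 20000, 3184, 20000, 0.97))  # automobile, first/second crowd
print(is_reference_index(122, 20000, 484, 20000, 0.97))    # novel, first/second crowd
print(is_reference_index(4, 20000, 1, 20000, 0.95))        # automobile, crowd a / crowd b
```

Under these assumptions the 4-versus-1 automobile case yields an accumulated probability of about 82%, in line with the figure quoted in the text, so that index is rejected at the 95% level while the other two are accepted.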
An embodiment of the present application further provides an otherness data acquisition device corresponding to the otherness data capture method described above. The modules of the device are introduced below. For a detailed explanation of each module, refer to the description of the corresponding step in the otherness data capture method; it is not repeated here.
Fig. 5 is a structural diagram of an otherness data acquisition device provided by an embodiment of the present application. The device includes a first determining module 51, a first screening module 52, a first acquisition module 53 and a second acquisition module 54, wherein:
The first determining module 51 is configured to determine the attribute information of each of two crowds and at least one index followed by the two crowds, where the attribute information of each crowd includes the total number of users in the crowd and the number of target users in the crowd who follow each of the at least one index;
The first screening module 52 is configured to screen out, from the at least one index and according to the attribute information of the two crowds, at least one reference index that can characterize the difference between the two crowds;
The first acquisition module 53 is configured to obtain, for each crowd, the Target Group Index corresponding to the at least one reference index;
The second acquisition module 54 is configured to obtain, based on the Target Group Index of each crowd for the at least one reference index, at least one item of otherness data reflecting the difference between the two crowds.
Optionally, the otherness data is the difference between the Target Group Indexes of the two crowds for the corresponding reference index.
Optionally, the above otherness data acquisition device embodiment may further include:
A sorting module, configured to sort the at least one reference index according to the Target Group Index difference corresponding to each reference index;
A second screening module, configured to screen out, from the at least one reference index and according to the sorting result, the reference indexes that satisfy a preset ranking condition.
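The sorting and second screening modules might be sketched as follows in Python. The text does not specify the preset ranking condition, so this sketch assumes it is "keep the top k indexes by TGI difference"; the function name and the `top_k` parameter are hypothetical. The two TGI differences are the ones quoted in the description for the first and second crowds.

```python
def select_by_tgi_difference(tgi_differences, top_k):
    """Sort reference indexes by TGI difference (largest first) and keep the first top_k."""
    ranked = sorted(tgi_differences.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:top_k]]


# TGI differences quoted in the description for the first and second crowds.
differences = {"automobile": 19.78, "novel": 119.48}
selected = select_by_tgi_difference(differences, top_k=1)
print(selected)
```

Other ranking conditions, such as a minimum TGI difference, would fit the same structure by filtering the sorted list instead of slicing it.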
Optionally, the first screening module in any of the above otherness data acquisition device embodiments includes:
A computing unit, configured to calculate, according to the attribute information of the two crowds, the probability corresponding to each of the at least one index, where the probability corresponding to an index characterizes the likelihood that the two crowds differ on that index;
A screening unit, configured to screen out, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold.
Optionally, a confidence level is preset for each of the at least one index, where the confidence level of an index characterizes the likelihood that the index can reflect a difference between the two crowds. The computing unit includes:
A first determining subunit, configured to determine the type of probability distribution that matches the at least one index;
A first computing subunit, configured to calculate, according to the attribute information of the two crowds, the statistic of the at least one index under the probability distribution, where the statistic of each index characterizes the one-sided upper confidence limit for the confidence level of that index;
A second determining subunit, configured to determine the probability density function of the at least one index under the probability distribution;
A second computing subunit, configured to calculate the cumulative probability of each index according to the probability density function and the statistic corresponding to that index.
Optionally, the screening unit includes:
A third determining subunit, configured to use the confidence level of each of the at least one index as the preset threshold, and to use the cumulative probability of the index as the probability corresponding to the index;
A screening subunit, configured to screen out, from the at least one index, at least one reference index whose cumulative probability is greater than or equal to the corresponding confidence level.
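A minimal Python sketch of the screening step, assuming the cumulative probability of each index has already been computed by the computing unit. The function name and the input values are hypothetical and serve only to show the comparison against a per-index confidence level.

```python
def screen_reference_indexes(cumulative_probabilities, confidence_levels):
    """Keep indexes whose cumulative probability reaches their own confidence level."""
    return [
        name
        for name, probability in cumulative_probabilities.items()
        if probability >= confidence_levels[name]
    ]


# Hypothetical inputs: in the device, the probabilities come from the computing unit.
probabilities = {"automobile": 0.82, "novel": 0.999}
confidences = {"automobile": 0.95, "novel": 0.95}
reference_indexes = screen_reference_indexes(probabilities, confidences)
print(reference_indexes)
```

Keeping a separate confidence level per index mirrors the optional feature above, in which each index has its own preset threshold.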
The above otherness data acquisition device includes a processor and a memory. The first determining module 51, the first screening module 52, the first acquisition module 53, the second acquisition module 54 and so on are stored in the memory as program units, and the processor executes these program units to implement the corresponding functions.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided, and whether index data is otherness data capable of reflecting the difference between the two crowds is determined by adjusting kernel parameters.
The memory may include computer-readable media in the form of volatile memory, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
In the otherness data acquisition device provided by the embodiments of the present application, the first determining module 51 first determines the attribute information of the two crowds, and the first screening module 52 then screens out, from the at least one index and according to that attribute information, at least one reference index that can characterize the difference between the two crowds. That is, each reference index reflects a difference between the two crowds. The first acquisition module 53 can therefore obtain the Target Group Index of each crowd for the at least one reference index, and the second acquisition module 54 obtains, based on those Target Group Indexes, at least one item of otherness data reflecting the difference between the two crowds. This avoids delivering the wrong personalized advertisement when personalized advertisements are provided for different crowds, and thus improves the advertisement conversion rate.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute program code initialized with the following method steps:
Determining the attribute information of each of two crowds and at least one index followed by the two crowds, where the attribute information of each crowd includes the total number of users in the crowd and the number of target users in the crowd who follow each of the at least one index;
Screening out, from the at least one index and according to the attribute information of the two crowds, at least one reference index that can characterize the difference between the two crowds;
Obtaining, for each crowd, the Target Group Index corresponding to the at least one reference index;
Obtaining, based on the Target Group Index of each crowd for the at least one reference index, at least one item of otherness data reflecting the difference between the two crowds.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) that contain computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces and memory.
The memory may include computer-readable media in the form of volatile memory, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The above are merely embodiments of the present application and are not intended to limit it. Various modifications and variations of the present application are possible for those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.

Claims (10)

1. An otherness data capture method, characterized by comprising:
Determining the attribute information of each of two crowds and at least one index followed by the two crowds, wherein the attribute information of each crowd includes the total number of users in the crowd and the number of target users in the crowd who follow each of the at least one index;
Screening out, from the at least one index and according to the attribute information of the two crowds, at least one reference index that can characterize the difference between the two crowds;
Obtaining, for each crowd, the Target Group Index corresponding to the at least one reference index;
Obtaining, based on the Target Group Index of each crowd for the at least one reference index, at least one item of otherness data reflecting the difference between the two crowds.
2. The otherness data capture method according to claim 1, characterized in that the otherness data is the difference between the Target Group Indexes of the two crowds for the corresponding reference index.
3. The otherness data capture method according to claim 2, characterized by further comprising:
Sorting the at least one reference index according to the Target Group Index difference corresponding to each reference index;
Screening out, from the at least one reference index and according to the sorting result, the reference indexes that satisfy a preset ranking condition.
4. The otherness data capture method according to claim 1, characterized in that screening out, from the at least one index and according to the attribute information of the two crowds, at least one reference index that can characterize the difference between the two crowds comprises:
Calculating, according to the attribute information of the two crowds, the probability corresponding to each of the at least one index, wherein the probability corresponding to an index characterizes the likelihood that the two crowds differ on that index;
Screening out, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold.
5. The otherness data capture method according to claim 4, characterized by further comprising: presetting a confidence level for each of the at least one index, wherein the confidence level of an index characterizes the likelihood that the index can reflect a difference between the two crowds;
Wherein calculating, according to the attribute information of the two crowds, the probability corresponding to each of the at least one index comprises:
Determining the type of probability distribution that matches the at least one index;
Calculating, according to the attribute information of the two crowds, the statistic of the at least one index under the probability distribution, wherein the statistic of each index characterizes the one-sided upper confidence limit for the confidence level of that index;
Determining the probability density function of the at least one index under the probability distribution;
Calculating the cumulative probability of each index according to the probability density function and the statistic corresponding to that index.
6. The otherness data capture method according to claim 5, characterized in that screening out, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold comprises:
Using the confidence level of each of the at least one index as the preset threshold, and using the cumulative probability of the index as the probability corresponding to the index;
Screening out, from the at least one index, at least one reference index whose cumulative probability is greater than or equal to the corresponding confidence level.
7. An otherness data acquisition device, characterized by comprising:
A first determining module, configured to determine the attribute information of each of two crowds and at least one index followed by the two crowds, wherein the attribute information of each crowd includes the total number of users in the crowd and the number of target users in the crowd who follow each of the at least one index;
A first screening module, configured to screen out, from the at least one index and according to the attribute information of the two crowds, at least one reference index that can characterize the difference between the two crowds;
A first acquisition module, configured to obtain, for each crowd, the Target Group Index corresponding to the at least one reference index;
A second acquisition module, configured to obtain, based on the Target Group Index of each crowd for the at least one reference index, at least one item of otherness data reflecting the difference between the two crowds.
8. The otherness data acquisition device according to claim 7, characterized in that the otherness data is the difference between the Target Group Indexes of the two crowds for the corresponding reference index.
9. The otherness data acquisition device according to claim 8, characterized by further comprising:
A sorting module, configured to sort the at least one reference index according to the Target Group Index difference corresponding to each reference index;
A second screening module, configured to screen out, from the at least one reference index and according to the sorting result, the reference indexes that satisfy a preset ranking condition.
10. The otherness data acquisition device according to claim 7, characterized in that the first screening module comprises:
A computing unit, configured to calculate, according to the attribute information of the two crowds, the probability corresponding to each of the at least one index, wherein the probability corresponding to an index characterizes the likelihood that the two crowds differ on that index;
A screening unit, configured to screen out, from the at least one index, at least one reference index whose probability is greater than or equal to a preset threshold.
CN201611218196.7A 2016-12-26 2016-12-26 Otherness data capture method and device Pending CN108241989A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611218196.7A CN108241989A (en) 2016-12-26 2016-12-26 Otherness data capture method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611218196.7A CN108241989A (en) 2016-12-26 2016-12-26 Otherness data capture method and device

Publications (1)

Publication Number Publication Date
CN108241989A true CN108241989A (en) 2018-07-03

Family

ID=62701326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611218196.7A Pending CN108241989A (en) 2016-12-26 2016-12-26 Otherness data capture method and device

Country Status (1)

Country Link
CN (1) CN108241989A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
    Address after: 100080 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing
    Applicant after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.
    Address before: 100086 Cuigong Hotel, 76 Zhichun Road, Shuangyushu District, Haidian District, Beijing
    Applicant before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd.
RJ01 Rejection of invention patent application after publication
    Application publication date: 20180703