CN108596815A - User behavior similarity recognition method, system and device based on mobile terminal - Google Patents
User behavior similarity recognition method, system and device based on mobile terminal
- Publication number
- CN108596815A (application CN201810307705.6A)
- Authority
- CN
- China
- Prior art keywords
- user
- line
- app
- behavior
- mobile terminal
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/40—Business processes related to the transportation industry
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0203—Market surveys; Market polls
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W4/00—Services specially adapted for wireless communication networks; Facilities therefor
- H04W4/02—Services making use of location information
- H04W4/029—Location-based management or tracking services
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Strategic Management (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Theoretical Computer Science (AREA)
- Economics (AREA)
- General Physics & Mathematics (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Data Mining & Analysis (AREA)
- Game Theory and Decision Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Human Resources & Organizations (AREA)
- Primary Health Care (AREA)
- Tourism & Hospitality (AREA)
- Telephonic Communication Services (AREA)
Abstract
A mobile-terminal-based method for recognizing user behavior similarity. For massive user data, the method first identifies and associates the mobile terminals of a user who owns multiple devices, and then combines the online behavior data of users' mobile terminals with offline LBS data to quantify the association between users' behavioral features. This makes it possible to judge accurately whether several mobile terminals are used by the same person, to extract users' online behavioral features, to characterize users' offline movement trajectories, and to calculate the online and offline behavior similarity between any two terminals, providing data support for enterprises in locating target users. Correspondingly, the application also provides a mobile-terminal-based user behavior similarity recognition system and device, and a computer-readable storage medium.
Description
Technical field
The present invention relates to the field of big data technology, and in particular to a mobile-terminal-based user behavior similarity recognition method, system and device.
Background
With the rapid development of the mobile Internet, the number of mobile Internet users in China has exceeded 800 million, accounting for 96.3% of all Chinese Internet users. The daily online and offline behavior of this large population of mobile terminal users provides abundant data, yet the degree of association between the subjects who generate these data has never been accurately quantified.
In the prior art, the similarity between users is typically calculated by converting several developed labels into numerical values and taking their weighted sum as a score. Results obtained this way are often coarse and cannot reflect a user's behavioral preferences in the current state; when such data are used for enterprise precision marketing, the results are unsatisfactory.
Most current research on user similarity simply weights user labels to obtain the similarity between users. Because the label dimensions are incomplete and real-time performance is poor, this approach has many drawbacks in practice.
Summary of the invention
The application provides a mobile-terminal-based user behavior similarity recognition method, system and device, intended to combine the online behavior data of users' mobile terminals with offline LBS location information to quantify the association between users' behavioral features.
According to a first aspect, one embodiment provides a mobile-terminal-based user behavior similarity recognition method, including:
A data acquisition step: obtaining the fingerprint information of at least two mobile terminals, the LBS location information of the mobile terminals, and the multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data including app installation and uninstallation behavior data and WiFi connection behavior data;
A device association step: according to the fingerprint information and the multi-dimensional app behavior data, associating the mobile terminals that belong to one user with multiple devices and generating corresponding label information;
An online behavior similarity calculation step: extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation step: according to the LBS location information, aggregating the reported location points of higher density to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis step: quantifying the user behavior similarity according to the label information, the online behavior similarity and the offline behavior similarity.
In some embodiments, the device association step includes:
According to the fingerprint information and the multi-dimensional app behavior data, establishing a correspondence between user fingerprint information and multi-dimensional app behavior data, and determining the degree of association among one user's multiple devices through cross-referencing analysis of the device fingerprints and the multi-dimensional app identifiers; when the degree of association is high, associating the corresponding mobile terminals and generating corresponding label information.
In some embodiments, the online behavior similarity calculation step includes calculating an app installation feature similarity and calculating a WiFi feature association degree, wherein the WiFi feature association degree includes a working-period WiFi usage feature association degree and a rest-period WiFi usage feature association degree.
In some embodiments, the app installation feature similarity is calculated using the generalized Jaccard coefficient, by the formula:
[formula not reproduced]
where C_i and C_j are the app installation feature vectors of the devices, and m denotes the penetration rate of an app.
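In some embodiments, the WiFi feature association degree is calculated using cosine similarity, by the formula:
$$\cos(x_a, x_b) = \frac{x_a \cdot x_b}{\lVert x_a \rVert \, \lVert x_b \rVert}$$
where x_a and x_b are the WiFi usage feature vectors of user a and user b respectively, representing how strongly each user connects to and uses a given WiFi network.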
In some embodiments, the offline behavior similarity calculation step includes:
Removing the LBS noise points from the user's historical LBS information according to the user's historical behavioral features, to obtain the optimized reported user locations;
Performing cluster analysis on the optimized reported user locations to obtain the user's aggregated behavior trajectories;
Binary-coding the user's aggregated behavior trajectories, and using the codes to calculate the weight of each aggregated behavior trajectory;
According to the weights of the aggregated behavior trajectories, calculating the Hamming distance between any two trajectories of two users, and normalizing the Hamming distance to obtain the similarity of the two trajectories;
According to the Hamming distances between all pairs of trajectories of the two users, obtaining the aggregated behavior trajectory similarity of the two users.
In some embodiments, the weight of a user's aggregated behavior trajectory is given by a formula [formula not reproduced] built from the following quantities:
the k-th bit of the weight code of the i-th trajectory of user u;
the interest region weight of a trajectory point,
$$IRW(r) = \log \frac{|U|}{|\{u : r \in T_u\}|}$$
where r denotes a trajectory point, |U| is the total number of users, T_u denotes the set of all trajectory points of user u, and |{u : r ∈ T_u}| is the number of users whose trajectories contain trajectory point r;
the weight mapping rule of a trajectory point [formula not reproduced];
and the k-th bit of the code of trajectory point j in trajectory i of user u.
According to a second aspect, one embodiment provides a mobile-terminal-based user behavior similarity recognition system, including:
A data acquisition module, for obtaining the fingerprint information of at least two mobile terminals, the LBS location information of the mobile terminals, and the multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data including app install/reinstall/uninstall behavior data, app click and usage behavior data, and WiFi connection behavior data;
A device association module, for associating, according to the fingerprint information and the multi-dimensional app behavior data, the mobile terminals that belong to one user with multiple devices and generating corresponding label information;
An online behavior similarity calculation module, for extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation module, for aggregating, according to the LBS location information, the reported location points of higher density to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis module, for quantifying the user behavior similarity according to the label information, the online behavior similarity and the offline behavior similarity.
According to a third aspect, one embodiment provides a mobile-terminal-based user behavior similarity recognition device, including:
A memory, for storing a program;
A processor, for implementing the method of any embodiment of the first aspect by executing the program stored in the memory.
According to a fourth aspect, one embodiment provides a computer-readable storage medium including a program that can be executed by a processor to implement the method of any embodiment of the first aspect.
According to the above embodiments, during data analysis the application first identifies and associates the mobile terminals of one user with multiple devices, and then combines the online behavior data of users' mobile terminals with offline LBS data to quantify the association between users' behavioral features. It can therefore judge accurately whether several mobile terminals are used by the same person, extract users' online behavioral features, characterize users' offline movement trajectories, and calculate the online and offline behavior similarity between any two terminals, providing data support for enterprises in locating target users.
Description of the drawings
Fig. 1 is a flow chart of a mobile-terminal-based user behavior similarity recognition method;
Fig. 2 is a flow chart of the offline behavior similarity calculation step in one embodiment.
Detailed description
The invention is described in further detail below through specific embodiments in conjunction with the accompanying drawings. Similar elements in different embodiments use associated similar reference numerals. In the following embodiments, many details are described so that the application can be better understood. However, those skilled in the art will readily recognize that some of these features may be omitted in different circumstances, or may be replaced by other elements, materials or methods. In some cases, certain operations related to the application are not shown or described in the specification, to avoid the core of the application being overwhelmed by excessive description; for those skilled in the art, a detailed description of these operations is not necessary, since they can fully understand them from the description in the specification and the general technical knowledge in the field.
In addition, the features, operations or characteristics described in the specification may be combined in any suitable manner to form various embodiments. The steps or actions in the method description may also be reordered or adjusted in ways obvious to those skilled in the art. Therefore, the various orders in the specification and drawings serve only to describe a particular embodiment clearly and do not imply a required order, unless it is otherwise stated that a certain order must be followed.
The similarity between user behaviors has broad application in enterprises' data operations management and marketing, so studying the similarity between pairs of devices is very necessary. For the multi-dimensional online and offline behavior data of mobile terminals at a large user scale, the computation required by the whole model is enormous, and performance and precision are the key points we continue to optimize.
When processing large-scale online and offline user data, some existing algorithms compute the similarity between every pair of users, which involves a huge amount of computation and occupies substantial computing resources; in large-scale applications, their efficiency and timeliness are severely restricted. Similarity calculations based on user labels evaluate a limited number of dimensions: some labels cannot reflect users' online and offline behavioral features in detail, and their measurement precision is limited, which lowers the precision of the user similarity. The accuracy of label-based similarity calculation is also strongly affected by the accuracy of the user labels, and the timeliness of such algorithms is poor.
In the embodiments of the present invention, for massive user data, multi-device identification is carried out first: behaviors such as one user owning multiple devices or a user switching devices are accurately identified, and the mobile terminals belonging to one user are associated and given corresponding label information, which reduces the amount of data to be processed. Multi-device identification can use the correspondence among mobile terminal identifiers (including IMEI, MAC address or IDFA), in-app identifiers (including uid, ukey, alias or msg_id), and other unique user identifiers (mobile phone number or id). In a specific embodiment, the application establishes a correspondence between user device fingerprints and multi-dimensional app identifiers, and determines the association among one user's multiple devices through cross-referencing analysis of the device fingerprints and the multi-dimensional app identifiers, so that the data can be associated during subsequent analysis.
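One way to picture the cross-referencing of device fingerprints and in-app identifiers is as a union-find over observed pairs. The following is a minimal Python sketch under stated assumptions: the input format of (device_id, app_identifier) pairs and the function name `associate_devices` are hypothetical, and a single shared identifier is treated as sufficient evidence, whereas the patent only says that devices are associated when the degree of association is high.

```python
from collections import defaultdict


def associate_devices(observations):
    """Group device IDs that share at least one in-app identifier.

    observations: iterable of (device_id, app_identifier) pairs, for example
    an IMEI seen together with an in-app uid (hypothetical input format).
    Returns a list of sets of device IDs, one set per inferred user.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    devices = set()
    for device_id, app_identifier in observations:
        devices.add(device_id)
        # Namespace the two node kinds so a device ID can never collide with
        # an app identifier that happens to have the same string value.
        union(("dev", device_id), ("app", app_identifier))

    groups = defaultdict(set)
    for device_id in devices:
        groups[find(("dev", device_id))].add(device_id)
    return list(groups.values())


groups = associate_devices([
    ("imei-1", "uid-A"),
    ("imei-2", "uid-A"),  # same in-app account seen on a second device
    ("imei-3", "uid-B"),
])
```

Each resulting group could then carry one label, so that later steps treat the devices in it as a single user.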
Specifically, referring to Fig. 1, the mobile-terminal-based user behavior similarity recognition method provided by the application includes:
Data acquisition step S1: obtaining the fingerprint information of at least two mobile terminals, the LBS location information of the mobile terminals, and the multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data including app new-install/install/uninstall behavior data and WiFi connection behavior data;
Device association step S2: according to the fingerprint information and the multi-dimensional app behavior data, associating the mobile terminals that belong to one user with multiple devices and generating corresponding label information;
Online behavior similarity calculation step S3: extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
Offline behavior similarity calculation step S4: according to the LBS location information, aggregating the reported location points of higher density to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
Comprehensive analysis step S5: quantifying the user behavior similarity according to the label information, the online behavior similarity and the offline behavior similarity. In the actual analysis, data from the multiple devices of one user that carry the association label information are analyzed together.
For step S3, users' online behavioral similarity is measured mainly in two respects: app installation feature similarity and WiFi feature association degree.
App installation feature similarity: a user app installation behavior feature matrix is constructed, and the users' app installation feature similarity is calculated using the generalized Jaccard coefficient. Different apps differ greatly in how much they reveal about user similarity, so the similarity is weighted by app penetration rate to obtain a corrected app installation feature similarity, expressed as:
[formula not reproduced]
where C_i and C_j are the app installation feature vectors of the devices; each feature value is 0 or 1, with 0 indicating not installed and 1 indicating installed (whether newly installed or installed earlier); and m denotes the penetration rates of the different apps.
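Since the weighted formula itself is not reproduced in this text, the following Python sketch shows one common form of a weighted generalized Jaccard coefficient, the ratio of weighted minima to weighted maxima. Both that form and the weighting rule used in the example (1 minus the penetration rate, so that near-universal apps count for less) are assumptions rather than the patent's stated definition.

```python
def weighted_jaccard(c_i, c_j, weights):
    """Weighted generalized Jaccard similarity of two 0/1 install vectors.

    c_i, c_j: equal-length sequences of 0/1 flags (1 = app installed).
    weights:  per-app non-negative weights derived from penetration rate.
    """
    if not (len(c_i) == len(c_j) == len(weights)):
        raise ValueError("vectors and weights must have the same length")
    num = sum(w * min(a, b) for a, b, w in zip(c_i, c_j, weights))
    den = sum(w * max(a, b) for a, b, w in zip(c_i, c_j, weights))
    return num / den if den else 0.0


# Assumed weighting: an app installed by almost everyone says little about a
# user, so each app is weighted by (1 - penetration rate).
penetration = [0.9, 0.1, 0.2]
weights = [1.0 - m for m in penetration]
sim = weighted_jaccard([1, 1, 0], [1, 1, 1], weights)
```

In this example the two devices share the first two apps and differ on the third, so the similarity is the weight of the shared apps divided by the weight of all three.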
WiFi feature association degree: the data are cleaned according to information such as time, location and WiFi attributes, and a user WiFi usage feature matrix is established. Based on this matrix, and using the WiFi connection features between devices in different periods, cosine similarity is used to calculate, for each pair of users, the working-period WiFi usage feature association degree and the rest-period WiFi usage feature association degree.
The WiFi feature association degree is calculated as:
$$\cos(x_a, x_b) = \frac{x_a \cdot x_b}{\lVert x_a \rVert \, \lVert x_b \rVert}$$
where x_a and x_b are the WiFi usage feature vectors of user a and user b; each feature value is the user's WiFi connection frequency within a period of time.
According to the above formula:
Working-period WiFi usage feature association degree: $\cos(t_a, t_b)$
Rest-period WiFi usage feature association degree: $\cos(r_a, r_b)$
where t_a and t_b are the working-period WiFi usage feature vectors of user a and user b, and r_a and r_b are the rest-period WiFi usage feature vectors of user a and user b.
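The two association degrees can be sketched in Python as the same cosine similarity applied to two sets of vectors. The function names and the convention of returning 0.0 for an all-zero vector are illustrative assumptions; each vector position is taken to hold the connection frequency for one WiFi network.

```python
import math


def cosine_similarity(x_a, x_b):
    """Cosine similarity of two equal-length frequency vectors."""
    if len(x_a) != len(x_b):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x_a, x_b))
    norm_a = math.sqrt(sum(a * a for a in x_a))
    norm_b = math.sqrt(sum(b * b for b in x_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumed convention: no connections means no association
    return dot / (norm_a * norm_b)


def wifi_association(t_a, t_b, r_a, r_b):
    """Return (working-period, rest-period) WiFi association degrees."""
    return cosine_similarity(t_a, t_b), cosine_similarity(r_a, r_b)


# Two users who share an office network but never the same home network.
work, rest = wifi_association([3, 0, 1], [3, 0, 1], [0, 5], [4, 0])
```

Keeping the two periods separate lets colleagues (high working-period value) be distinguished from household members (high rest-period value).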
For step S4, referring to Fig. 2, the offline behavior similarity calculation step includes:
Step S401: for the user's historical LBS information within a certain time period, the LBS noise points are removed according to the user's historical behavioral features, yielding the optimized reported user locations. Each user has a large number of reported locations at different times, many of which are noise that does not reflect the user's offline behavior; such noise not only increases the amount of computation but also interferes with the analysis of the user's real behavior, so it needs to be removed.
Step S402: cluster analysis is performed on the optimized reported user locations to obtain the user's aggregated behavior trajectories. Specifically, the user's reported locations are clustered using a density-based clustering algorithm; each region where the user's reported points are dense is merged into one cluster and defined as an interest region. After clustering, the behavior trajectories of all users are converted into aggregated behavior trajectories expressed in terms of interest regions, and the trajectory points in such a trajectory are the interest regions obtained from clustering.
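The patent names only "a density-based clustering algorithm". As an illustration, the sketch below uses DBSCAN, one well-known member of that family, written in plain Python. The planar (x, y) coordinates, the Euclidean distance, and the `eps` and `min_pts` values are assumptions; real LBS points would first need projecting or a great-circle distance.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point: 0, 1, ... for clusters
    (interest regions) and -1 for noise.

    points: list of (x, y) tuples in a planar coordinate system (assumed).
    """
    NOISE, UNVISITED = -1, None
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force range query; includes the point itself.
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=3)
```

Replacing each reported point with its cluster label turns a raw trajectory into a sequence of interest regions; the isolated point is labeled as noise and dropped.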
Step S403: the user's aggregated behavior trajectories are binary-coded, and the codes are used to calculate the weight of each aggregated behavior trajectory. The user's aggregated behavior trajectories have been obtained by cluster analysis, but the interest regions in a trajectory do not contribute equally when the trajectory similarity between users is calculated. The weight of each interest region is inversely related to the number of times it appears in historical trajectories: the more often a place appears, the smaller its influence on trajectory similarity; conversely, the less often a place appears, the greater its influence.
The application borrows the inverse document frequency used in information retrieval (IR) systems and defines an interest region weight IRW to reflect how much each interest region in an aggregated behavior trajectory contributes to user similarity:
$$IRW(r) = \log \frac{|U|}{|\{u : r \in T_u\}|}$$
where r denotes a trajectory point, |U| is the total number of users, T_u denotes the set of all trajectory points of user u, and |{u : r ∈ T_u}| is the number of users whose trajectories contain trajectory point r.
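A minimal Python sketch of an IDF-style interest region weight follows. It assumes the plain form log(total users / users whose trajectories contain the region), with a natural logarithm and no smoothing; the dictionary input format and the function name are hypothetical.

```python
import math


def interest_region_weights(user_regions):
    """Compute an IDF-style weight for every interest region.

    user_regions: dict mapping user ID to the interest regions in that
    user's trajectories (the set T_u). Hypothetical input format.
    Returns a dict mapping region to log(|U| / number of users visiting it).
    """
    total_users = len(user_regions)
    users_per_region = {}
    for regions in user_regions.values():
        for r in set(regions):  # count each user at most once per region
            users_per_region[r] = users_per_region.get(r, 0) + 1
    return {r: math.log(total_users / n) for r, n in users_per_region.items()}


irw = interest_region_weights({
    "u1": ["home-1", "mall"],
    "u2": ["mall"],
    "u3": ["mall", "gym"],
})
```

Here the mall, visited by every user, gets weight 0, while the rarely visited regions get the largest weight, matching the inverse relationship described above.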
To calculate the trajectory similarity between all users, the application first uses the Geohash algorithm to binary-code the users' aggregated behavior trajectories; after coding, a trajectory point r in the aggregated behavior trajectory s_{u,i} of user u is expressed as a binary code [expression not reproduced].
The weight of a user's aggregated behavior trajectory reflects the weights of the interest regions in that trajectory. The weight of the i-th trajectory of user u is w_{u,i}, given by a formula [formula not reproduced] built from the following quantities:
the k-th bit of the weight code of the i-th trajectory of user u;
the weight mapping rule of a trajectory point [formula not reproduced];
and the k-th bit of the code of trajectory point j in trajectory i of user u. If that bit is 1, the point's weight is counted as positive; otherwise it is counted as negative.
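The signed-weight rule described above resembles a SimHash-style fingerprint. The Python sketch below is one possible reading, not the patent's formula: each trajectory point is Geohash-coded into bits, each bit position accumulates plus or minus the point's interest region weight, and the final bit is 1 where the total is positive. The thresholding at zero, the 20-bit default and both function names are assumptions.

```python
def geohash_bits(lat, lon, n_bits=20):
    """Binary Geohash code of a point: alternating longitude/latitude
    bisection bits, longitude first."""
    ranges = {"lon": [-180.0, 180.0], "lat": [-90.0, 90.0]}
    value = {"lon": lon, "lat": lat}
    bits = []
    for k in range(n_bits):
        axis = "lon" if k % 2 == 0 else "lat"
        lo, hi = ranges[axis]
        mid = (lo + hi) / 2
        if value[axis] >= mid:
            bits.append(1)
            ranges[axis][0] = mid  # keep the upper half
        else:
            bits.append(0)
            ranges[axis][1] = mid  # keep the lower half
    return bits


def track_fingerprint(point_codes, point_weights):
    """Combine per-point binary codes into one binary trajectory weight.

    point_codes:   non-empty list of equal-length 0/1 lists, one per point.
    point_weights: interest region weight of each point.
    """
    totals = [0.0] * len(point_codes[0])
    for code, w in zip(point_codes, point_weights):
        for k, bit in enumerate(code):
            totals[k] += w if bit == 1 else -w  # bit 1 adds, bit 0 subtracts
    return [1 if t > 0 else 0 for t in totals]  # assumed sign threshold


fingerprint = track_fingerprint([[1, 0, 1], [0, 0, 1]], [2.0, 1.0])
```

Because heavily weighted points dominate each bit, two trajectories that share their rare interest regions end up with fingerprints that differ in few bits.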
Step S404: according to the weights of the users' aggregated behavior trajectories, the Hamming distance between any two trajectories of two users is calculated and then normalized to obtain the similarity of the two trajectories. The application uses the Hamming distance to express the difference between users: the smaller the Hamming distance, the smaller the difference between the two trajectories. After the Hamming distance is normalized, the similarity sim_{x,y} between trajectories x and y is obtained:
[formula not reproduced]
where H(w_{u,x}, w_{v,y}) denotes the Hamming distance between trajectory x of user u and trajectory y of user v.
Step S405: according to the Hamming distances between all pairs of trajectories of the two users, the aggregated behavior trajectory similarity DSIM_{u,v} of the two users is obtained:
[formula not reproduced]
where |T_u| and |T_v| denote the numbers of trajectories of user u and user v respectively.
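Steps S404 and S405 can be sketched in Python as follows. Two choices here are assumptions, since the formulas are not reproduced in this text: the Hamming distance is normalized by the code length, and the user-level similarity is the mean over all |T_u| × |T_v| trajectory pairs.

```python
def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def track_similarity(w_x, w_y):
    """Similarity of two trajectories: 1 minus the Hamming distance divided
    by the code length (assumed normalization)."""
    return 1.0 - hamming_distance(w_x, w_y) / len(w_x)


def user_track_similarity(tracks_u, tracks_v):
    """Mean pairwise similarity over all trajectory pairs of two users
    (assumed aggregation)."""
    if not tracks_u or not tracks_v:
        return 0.0
    total = sum(track_similarity(x, y) for x in tracks_u for y in tracks_v)
    return total / (len(tracks_u) * len(tracks_v))


pair_sim = track_similarity([1, 0, 1, 1], [1, 1, 1, 0])
dsim = user_track_similarity([[1, 0, 1, 1]], [[1, 0, 1, 1], [0, 1, 0, 0]])
```

Comparing short bit strings with a Hamming distance is far cheaper than comparing raw location sequences, which is consistent with the concern about computation cost raised earlier in the description.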
Correspondingly, the application also provides a mobile-terminal-based user behavior similarity recognition system, including:
A data acquisition module, for obtaining the fingerprint information of at least two mobile terminals, the LBS location information of the mobile terminals, and the multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data including app install/reinstall/uninstall behavior data, app click and usage behavior data, and WiFi connection behavior data;
A device association module, for associating, according to the fingerprint information and the multi-dimensional app behavior data, the mobile terminals that belong to one user with multiple devices and generating corresponding label information;
An online behavior similarity calculation module, for extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation module, for aggregating, according to the LBS location information, the reported location points of higher density to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis module, for quantifying the user behavior similarity according to the label information, the online behavior similarity and the offline behavior similarity.
Correspondingly, the application also provides a mobile-terminal-based user behavior similarity recognition device, including:
A memory, for storing a program;
A processor, for implementing the above mobile-terminal-based user behavior similarity recognition method by executing the program stored in the memory.
Those skilled in the art will understand that all or part of the functions of the methods in the above embodiments can be implemented in hardware or by a computer program. When all or part of the functions are implemented by a computer program, the program can be stored in a computer-readable storage medium, which may include read-only memory, random access memory, a magnetic disk, an optical disc, a hard disk and the like, and the functions are realized when a computer executes the program. For example, the program is stored in the memory of a device, and all or part of the above functions are realized when the processor executes the program in the memory. Alternatively, the program can be stored in a storage medium such as a server, another computer, a magnetic disk, an optical disc, a flash drive or a removable hard disk, and then downloaded or copied into the memory of the local device, or used to update the version of the local device's system; all or part of the functions in the above embodiments are realized when the processor executes the program in the memory.
The above uses specific examples to illustrate the present invention; they are intended only to help in understanding the invention and not to limit it. Those skilled in the art may also make several simple deductions, variations or substitutions according to the idea of the invention.
Claims (10)
1. A mobile-terminal-based user behavior similarity recognition method, characterized by including:
A data acquisition step: obtaining the fingerprint information of at least two mobile terminals, the LBS location information of the mobile terminals, and the multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data including app installation and uninstallation behavior data and WiFi connection behavior data;
A device association step: according to the fingerprint information and the multi-dimensional app behavior data, associating the mobile terminals that belong to one user with multiple devices and generating corresponding label information;
An online behavior similarity calculation step: extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation step: according to the LBS location information, aggregating the reported location points of higher density to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis step: quantifying the user behavior similarity according to the label information, the online behavior similarity and the offline behavior similarity.
2. The method of claim 1, characterized in that the device association step comprises:
according to the fingerprint information and the multi-dimensional app behavior data, establishing a correspondence between user fingerprint information and multi-dimensional app behavior data; determining, through cross-positioning of device fingerprint identification and multi-dimensional app analysis, the degree of association among one person's multiple devices; and, when the degree of association is high, associating the corresponding mobile terminals and generating corresponding label information.
3. The method of claim 1, characterized in that the online behavior similarity calculation step comprises calculating an app installation feature similarity and calculating a WiFi feature association degree, wherein the WiFi feature association degree comprises a working-period WiFi usage feature association degree and a rest-period WiFi usage feature association degree.
4. The method of claim 3, characterized in that the app installation feature similarity is calculated using the generalized Jaccard coefficient, by the formula:
[formula not reproduced in the source text]
where C_i and C_j are the feature vectors of the apps installed on the devices, and m denotes the penetration rate of an app.
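As an illustration of the app installation feature similarity in claim 4, the following is a minimal Python sketch of the standard generalized (Tanimoto) form of the Jaccard coefficient for two app feature vectors. The claim's own formula is not preserved in this text, so how the penetration rate m enters the calculation is unknown and is left out here; the function name and the sample vectors are hypothetical.

```python
from typing import Sequence


def generalized_jaccard(c_i: Sequence[float], c_j: Sequence[float]) -> float:
    """Tanimoto form: (Ci . Cj) / (|Ci|^2 + |Cj|^2 - Ci . Cj).

    Assumption: the penetration-rate weighting (m) mentioned in the claim
    is not applied, because its formula is not available in the source.
    """
    if len(c_i) != len(c_j):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(c_i, c_j))
    denom = sum(a * a for a in c_i) + sum(b * b for b in c_j) - dot
    # Two all-zero vectors (no apps recorded) are treated as dissimilar here.
    return dot / denom if denom else 0.0


# Hypothetical 0/1 install vectors over the same five apps.
device_a = [1, 1, 0, 1, 0]
device_b = [1, 0, 0, 1, 1]
print(generalized_jaccard(device_a, device_b))
```

For 0/1 vectors this reduces to the ordinary Jaccard index (shared apps divided by apps installed on either device), which is one reason the generalized form is a natural fit for install vectors.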
5. The method of claim 3, characterized in that the WiFi feature association degree is calculated using cosine similarity, by the formula:
$$\cos(x_a, x_b) = \frac{x_a \cdot x_b}{\lVert x_a \rVert \, \lVert x_b \rVert}$$
where x_a and x_b are the WiFi usage feature vectors of user a and user b, respectively, each component indicating how strongly the user connects to and uses a given WiFi network.
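The cosine similarity named in claim 5 can be sketched in a few lines of Python. The function name and the example vectors are hypothetical; the components are assumed to be per-network connection strengths (for instance, connection counts), and under claim 3 one such pair of vectors would presumably be built for the working period and another for the rest period.

```python
import math
from typing import Sequence


def cosine_similarity(x_a: Sequence[float], x_b: Sequence[float]) -> float:
    """Cosine of the angle between two WiFi usage feature vectors."""
    if len(x_a) != len(x_b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(p * q for p, q in zip(x_a, x_b))
    norm_a = math.sqrt(sum(p * p for p in x_a))
    norm_b = math.sqrt(sum(q * q for q in x_b))
    # A user with no WiFi activity has no direction; report zero association.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical connection strengths for three WiFi networks.
user_a = [3, 0, 4]
user_b = [6, 0, 8]
print(cosine_similarity(user_a, user_b))
```

Because cosine similarity ignores vector length, two users who favor the same networks in the same proportions score as fully associated even if one connects far more often than the other.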
6. The method of claim 1, characterized in that the offline behavior similarity calculation step comprises:
removing LBS noise report points from the user's historical LBS information according to the user's historical behavior features, to obtain optimized user report point positions;
clustering the optimized user report point positions to obtain user aggregated behavior trajectories;
binary-coding the user aggregated behavior trajectories, and calculating the weights of the user aggregated behavior trajectories using the codes;
according to the weights of the user aggregated behavior trajectories, calculating the Hamming distance between any two trajectories of two users, and normalizing the Hamming distance to obtain the similarity of the two trajectories;
according to the pairwise Hamming distances of all trajectories between the two users, obtaining the aggregated behavior trajectory similarity of the two users.
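The last two steps of claim 6 (normalized Hamming distance between binary-coded trajectories, then a user-level similarity) can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: each trajectory is assumed to be already encoded as a non-negative integer of fixed bit length, the trajectory weights are not applied, and the user-level score is assumed to be the mean of all pairwise trajectory similarities, since the text does not say how the pairwise values are combined. All names are hypothetical.

```python
from itertools import product
from typing import Sequence

CODE_BITS = 16  # assumed fixed length of each trajectory code


def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of bit positions in which two codes differ."""
    return bin(code_a ^ code_b).count("1")


def trajectory_similarity(code_a: int, code_b: int, bits: int = CODE_BITS) -> float:
    """Normalize the Hamming distance to [0, 1] and turn it into a similarity."""
    return 1.0 - hamming_distance(code_a, code_b) / bits


def user_similarity(tracks_a: Sequence[int], tracks_b: Sequence[int],
                    bits: int = CODE_BITS) -> float:
    """Assumed aggregation: mean similarity over all trajectory pairs."""
    pairs = list(product(tracks_a, tracks_b))
    if not pairs:
        return 0.0
    return sum(trajectory_similarity(a, b, bits) for a, b in pairs) / len(pairs)


# Hypothetical 4-bit trajectory codes for two users.
print(trajectory_similarity(0b1010, 0b0110, bits=4))
print(user_similarity([0b1111], [0b1111, 0b0000], bits=4))
```

Other aggregations, such as taking the best match for each trajectory, would fit the same interface; the mean is used only because it is the simplest choice.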
7. The method of claim 6, characterized in that the weight of the user aggregated behavior trajectory is:
[formula not reproduced in the source text]
where [symbol not reproduced] denotes the k-th bit of the weight code of the i-th trajectory of user u;
where [formula not reproduced in the source text]
where r denotes a trajectory point, u is the total number of users, T_u denotes the set of all trajectory points of user u, and {u : r ∈ T_u} is the number of users whose trajectories contain trajectory point r;
where [symbol not reproduced] is the weight mapping rule for trajectory points;
where [symbol not reproduced] denotes the k-th bit of the code of trajectory point j in trajectory i of user u.
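The formulas of claim 7 are not preserved in this text, but the surviving definitions (the total number of users, and the number of users whose trajectories contain a point r) resemble an inverse-document-frequency weight. Assuming that reading, which is an interpretation and not confirmed by the source, a trajectory point weight could be sketched in Python as log(total users / users containing r). The function name and sample data are hypothetical.

```python
import math
from typing import Dict, Hashable, Set


def trajectory_point_weights(
    points_by_user: Dict[str, Set[Hashable]]
) -> Dict[Hashable, float]:
    """Assumed IDF-style weight: log(total users / users whose set T_u contains r)."""
    total_users = len(points_by_user)
    users_per_point: Dict[Hashable, int] = {}
    for points in points_by_user.values():
        for r in points:
            users_per_point[r] = users_per_point.get(r, 0) + 1
    return {r: math.log(total_users / n) for r, n in users_per_point.items()}


# Hypothetical trajectory point sets T_u for three users.
weights = trajectory_point_weights({
    "u1": {"home_1", "mall"},
    "u2": {"mall", "office"},
    "u3": {"mall"},
})
print(weights["mall"], weights["office"])
```

Under this assumption a point visited by every user (the mall above) gets weight zero, while a point visited by a single user gets the largest weight, so rare locations contribute most to distinguishing trajectories.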
8. A mobile-terminal-based system for recognizing user behavior similarity, characterized by comprising:
a data acquisition module, configured to obtain fingerprint information of at least two mobile terminals, LBS location information of the mobile terminals, and multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data comprising: app install/reinstall/uninstall behavior data, app click and usage behavior data, and WiFi connection behavior data;
a device association module, configured to associate, according to the fingerprint information and the multi-dimensional app behavior data, the mobile terminals that belong to one person with multiple devices, and to generate corresponding label information;
an online behavior similarity calculation module, configured to extract a user online behavior feature matrix from the multi-dimensional app behavior data and to calculate the online behavior similarity between users according to the user online behavior feature matrix;
an offline behavior similarity calculation module, configured to aggregate, according to the LBS location information, the report points of higher density to obtain a user historical trajectory feature matrix, and to calculate the offline behavior similarity between users according to the user historical trajectory feature matrix;
a comprehensive analysis module, configured to quantify the user behavior similarity according to the label information, the online behavior similarity, and the offline behavior similarity.
9. A mobile-terminal-based device for recognizing user behavior similarity, characterized by comprising:
a memory, configured to store a program;
a processor, configured to execute the program stored in the memory so as to implement the method of any one of claims 1-7.
10. A computer-readable storage medium, characterized by comprising a program, the program being executable by a processor to implement the method of any one of claims 1-7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810307705.6A CN108596815A (en) | 2018-04-08 | 2018-04-08 | User behavior similarity recognition method, system and device based on mobile terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810307705.6A CN108596815A (en) | 2018-04-08 | 2018-04-08 | User behavior similarity recognition method, system and device based on mobile terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108596815A (en) | 2018-09-28 |
Family
ID=63621303
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810307705.6A Pending CN108596815A (en) | 2018-04-08 | 2018-04-08 | User behavior similarity recognition method, system and device based on mobile terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108596815A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20070042051A (en) * | 2005-10-17 | 2007-04-20 | 에스케이 텔레콤주식회사 | Method and system for providing shopping information in mobile telecommunication environment |
CN102929928A (en) * | 2012-09-21 | 2013-02-13 | 北京格致璞科技有限公司 | Multidimensional-similarity-based personalized news recommendation method |
CN103116614A (en) * | 2013-01-25 | 2013-05-22 | 北京奇艺世纪科技有限公司 | Collaborative filtering recommendation method, device and system base on user track |
CN104796468A (en) * | 2015-04-14 | 2015-07-22 | 蔡宏铭 | Method and system for realizing instant messaging of people travelling together and travel-together information sharing |
CN105095909A (en) * | 2015-07-13 | 2015-11-25 | 中国联合网络通信集团有限公司 | User similarity evaluation method and apparatus for mobile network |
CN107515915A (en) * | 2017-08-18 | 2017-12-26 | 晶赞广告(上海)有限公司 | User based on user behavior data identifies correlating method |
Non-Patent Citations (1)
Title |
---|
印桂生,程伟杰等: "使用轨迹指纹和地点相似性的地点推荐", 《哈尔滨工程大学学报》 * |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111507732A (en) * | 2019-01-30 | 2020-08-07 | 北京嘀嘀无限科技发展有限公司 | System and method for identifying similar trajectories |
CN111507732B (en) * | 2019-01-30 | 2023-07-07 | 北京嘀嘀无限科技发展有限公司 | System and method for identifying similar trajectories |
CN110035393A (en) * | 2019-04-22 | 2019-07-19 | 浙江每日互动网络科技股份有限公司 | The recognition methods of mobile terminal relationship |
CN110807052A (en) * | 2019-11-05 | 2020-02-18 | 佳都新太科技股份有限公司 | User group classification method, device, equipment and storage medium |
CN110825785A (en) * | 2019-11-05 | 2020-02-21 | 佳都新太科技股份有限公司 | Data mining method and device, electronic equipment and storage medium |
CN111669710A (en) * | 2020-04-21 | 2020-09-15 | 上海因势智能科技有限公司 | Demographic deduplication method |
CN112269937A (en) * | 2020-11-16 | 2021-01-26 | 加和(北京)信息科技有限公司 | Method, system and device for calculating user similarity |
CN112269937B (en) * | 2020-11-16 | 2024-02-02 | 加和(北京)信息科技有限公司 | Method, system and device for calculating user similarity |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN108596815A (en) | User behavior similarity recognition method, system and device based on mobile terminal | |
CN107798557A (en) | Electronic installation, the service location based on LBS data recommend method and storage medium | |
TWI696194B (en) | Sorting method and device of complaint report type | |
CN107358247B (en) | Method and device for determining lost user | |
CN111222976B (en) | Risk prediction method and device based on network map data of two parties and electronic equipment | |
CN111199474B (en) | Risk prediction method and device based on network map data of two parties and electronic equipment | |
CN105988988A (en) | Method and device for processing text address | |
CN111163072B (en) | Method and device for determining characteristic value in machine learning model and electronic equipment | |
CN111125658B (en) | Method, apparatus, server and storage medium for identifying fraudulent user | |
CN109543040A (en) | Similar account recognition methods and device | |
CN110728526A (en) | Address recognition method, apparatus and computer readable medium | |
CN110599200A (en) | Detection method, system, medium and device for false address of OTA hotel | |
CN110909540B (en) | Method and device for identifying new words of short message spam and electronic equipment | |
CN110728313A (en) | Classification model training method and device for intention classification recognition | |
CN108900619A (en) | A kind of independent Statistics of accessing population method and device | |
CN109308615B (en) | Real-time fraud transaction detection method, system, storage medium and electronic terminal based on statistical sequence characteristics | |
CN106874760A (en) | A kind of Android malicious code sorting techniques based on hierarchy type SimHash | |
CN112818162A (en) | Image retrieval method, image retrieval device, storage medium and electronic equipment | |
CN110909804B (en) | Method, device, server and storage medium for detecting abnormal data of base station | |
CN106301979B (en) | Method and system for detecting abnormal channel | |
Liu et al. | Extracting, ranking, and evaluating quality features of web services through user review sentiment analysis | |
CN109409959A (en) | A kind of user information analysis method, device, equipment and medium | |
CN110598122B (en) | Social group mining method, device, equipment and storage medium | |
CN111538925A (en) | Method and device for extracting Uniform Resource Locator (URL) fingerprint features | |
CN109739840A (en) | Data processing empty value method, apparatus and terminal device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180928 |