CN108596815A - User behavior similarity recognition method, system and device based on mobile terminal - Google Patents

User behavior similarity recognition method, system and device based on mobile terminal

Info

Publication number
CN108596815A
Authority
CN
China
Prior art keywords
user
line
app
behavior
mobile terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810307705.6A
Other languages
Chinese (zh)
Inventor
贺智谋
洪晶
陈宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Information Technology Co Ltd
Original Assignee
Shenzhen Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Information Technology Co Ltd
Priority to CN201810307705.6A
Publication of CN108596815A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/40 Business processes related to the transportation industry
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0203 Market surveys; Market polls
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/029 Location-based management or tracking services

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Strategic Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Theoretical Computer Science (AREA)
  • Economics (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Primary Health Care (AREA)
  • Tourism & Hospitality (AREA)
  • Telephonic Communication Services (AREA)

Abstract

A user behavior similarity recognition method based on mobile terminals, intended for massive volumes of user data. During data analysis, the mobile terminals of one person with multiple devices are first identified and associated. The method then combines the online behavior data of users' mobile terminals with offline LBS data to quantify the behavioral-feature association between users. This makes it possible to judge accurately whether several mobile terminals are used by the same person, to extract users' online behavior features, to characterize users' offline movement trajectories, and to compute the online and offline behavior similarity between any two terminals, providing data support for enterprises locating target users. Correspondingly, the present application also provides a user behavior similarity recognition system and device based on mobile terminals, and a computer-readable storage medium.

Description

User behavior similarity recognition method, system and device based on mobile terminal
Technical field
The present invention relates to the technical field of big data, and in particular to a user behavior similarity recognition method, system and device based on mobile terminals.
Background art
With the rapid development of the mobile Internet, the number of mobile Internet users in China has exceeded 800 million, accounting for 96.3% of all Chinese Internet users. The daily online and offline behavior of this large body of mobile terminal users provides a large amount of rich data, yet the degree of association between the subjects generating these data has never been accurately quantified.
In the prior art, the similarity between users is typically calculated by developing several labels, converting them to numeric values and summing them with weights to obtain a score. Results calculated in this way are often very coarse and cannot reflect a user's behavioral preferences in the current state; when such data are used for enterprise precision marketing, the effect is unsatisfactory.
At present, most research on user similarity simply weights users' labels to obtain the similarity between users. Because the label dimensions are incomplete and the real-time performance is poor, this approach has many drawbacks in practical application.
Summary of the invention
The present application provides a user behavior similarity recognition method, system and device based on mobile terminals, which aims to combine the online behavior data of users' mobile terminals with offline LBS location information to quantify the behavioral-feature association between users.
According to a first aspect, an embodiment provides a user behavior similarity recognition method based on mobile terminals, comprising:
A data acquisition step: obtaining fingerprint information of at least two mobile terminals, LBS location information of the mobile terminals, and multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data comprising: app installed and uninstalled behavior data, app uninstall behavior data, and WiFi connection behavior data;
A device association step: according to the fingerprint information and the multi-dimensional app behavior data, associating the mobile terminals of one person with multiple devices and generating corresponding marker information;
An online behavior similarity calculation step: extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation step: according to the LBS location information, aggregating the higher-density report points to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis step: quantifying user behavior similarity according to the marker information, the online behavior similarity and the offline behavior similarity.
In some embodiments, the device association step comprises:
According to the fingerprint information and the multi-dimensional app behavior data, establishing a correspondence between user fingerprint information and multi-dimensional app behavior data; determining the degree of association of one person's multiple devices through cross-positioning analysis of device fingerprints and multi-dimensional app identifiers; and, when the degree of association is high, associating the corresponding mobile terminals and generating corresponding marker information.
In some embodiments, the online behavior similarity calculation step comprises calculating an app installation feature similarity and calculating a WiFi feature association degree, wherein the WiFi feature association degree comprises a working-period WiFi usage feature association degree and a rest-period WiFi usage feature association degree.
In some embodiments, the app installation feature similarity is calculated using the generalized Jaccard correlation coefficient, in which C_i and C_j are the feature vectors of the apps installed on the devices and m denotes the penetration rate of an app.
In some embodiments, the WiFi feature association degree is calculated using cosine similarity, with the formula:
$$\cos(x_a, x_b) = \frac{x_a \cdot x_b}{\lVert x_a \rVert \, \lVert x_b \rVert}$$
where x_a and x_b are the WiFi usage feature vectors of user a and user b respectively, representing the intensity with which a user connects to and uses a given WiFi network.
In some embodiments, the offline behavior similarity calculation step comprises:
Rejecting the noisy LBS report points in the user's historical LBS information according to the user's historical behavior features, to obtain optimized user report-point positions;
Performing cluster analysis on the optimized user report-point positions to obtain user aggregated behavior trajectories;
Binary-encoding the user aggregated behavior trajectories, and calculating the weights of the user aggregated behavior trajectories using the encoding;
According to the weights of the user aggregated behavior trajectories, calculating the Hamming distance between any two trajectories of two users, and normalizing the Hamming distance to obtain the similarity of the two trajectories;
According to the Hamming distances between all pairs of trajectories of two users, obtaining the aggregated behavior trajectory similarity of the two users.
In some embodiments, the weight of the user aggregated behavior trajectory is determined from the following quantities:
the k-th encoded bit of the weight of the i-th trajectory of user u;
the interest region weight of each trajectory point r, where u is the total number of users, T_u denotes the set of all trajectory points of user u, and |{u : r ∈ T_u}| is the number of users whose trajectories contain trajectory point r;
the weight mapping rule of each trajectory point; and
the k-th encoded bit of trajectory point j in trajectory i of user u.
According to a second aspect, an embodiment provides a user behavior similarity recognition system based on mobile terminals, comprising:
A data acquisition module, configured to obtain fingerprint information of at least two mobile terminals, LBS location information of the mobile terminals, and multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data comprising: app install/reinstall/uninstall behavior data, app click and usage behavior data, and WiFi connection behavior data;
A device association module, configured to associate the mobile terminals of one person with multiple devices and generate corresponding marker information, according to the fingerprint information and the multi-dimensional app behavior data;
An online behavior similarity calculation module, configured to extract a user online behavior feature matrix from the multi-dimensional app behavior data and calculate the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation module, configured to aggregate the higher-density report points according to the LBS location information to obtain a user historical trajectory feature matrix, and calculate the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis module, configured to quantify user behavior similarity according to the marker information, the online behavior similarity and the offline behavior similarity.
According to a third aspect, an embodiment provides a user behavior similarity recognition device based on mobile terminals, comprising:
A memory, configured to store a program;
A processor, configured to implement the method of any implementation of the first aspect by executing the program stored in the memory.
According to a fourth aspect, an embodiment provides a computer-readable storage medium comprising a program, the program being executable by a processor to implement the method of any implementation of the first aspect.
According to the above embodiments, during data analysis the present application first identifies and associates the mobile terminals of one person with multiple devices, and combines the online behavior data of users' mobile terminals with offline LBS data to quantify the behavioral-feature association between users. This makes it possible to judge accurately whether several mobile terminals are used by the same person, to extract users' online behavior features, to characterize users' offline movement trajectories, and to compute the online and offline behavior similarity between any two terminals, providing data support for enterprises locating target users.
Description of the drawings
Fig. 1 is a flow chart of a user behavior similarity recognition method based on mobile terminals;
Fig. 2 is a flow chart of the offline behavior similarity calculation step of an embodiment.
Detailed description
The invention is described in further detail below through specific embodiments in conjunction with the accompanying drawings. Similar elements in different embodiments use associated similar element reference numbers. In the following embodiments, many details are described so that the present application can be better understood. However, those skilled in the art will readily recognize that some of these features may be omitted in different situations, or may be replaced by other elements, materials or methods. In some cases, certain operations related to the present application are not shown or described in the specification, so that the core of the application is not overwhelmed by excessive description. For those skilled in the art, a detailed description of these operations is not necessary, since they can fully understand them from the description in the specification and the general technical knowledge of the field.
In addition, the characteristics, operations or features described in the specification may be combined in any suitable manner to form various embodiments. Likewise, the steps or actions in the method description may be reordered or adjusted in ways obvious to those skilled in the art. The various orders in the specification and drawings are therefore only for clearly describing a particular embodiment and do not imply a required order, unless it is otherwise stated that a certain order must be followed.
Similarity between user behaviors is widely useful in enterprise data operations and marketing, so studying the similarity between pairs of devices is very necessary. For multi-dimensional online and offline mobile terminal behavior data at a large user scale, the computational load of the whole model is enormous, and performance and accuracy are the key points we continue to optimize.
Some existing algorithms, when processing large-scale online and offline user data, compute the similarity between every pair of users. This involves a huge amount of computation and occupies substantial computing resources, so efficiency and timeliness are greatly restricted in large-scale applications. Similarity calculation based on user labels evaluates only limited dimensions; some labels cannot finely reflect users' online and offline behavior features, and their measurement precision is limited, which lowers the precision of the user similarity. The accuracy of label-based similarity calculation is also strongly affected by the accuracy of the labels themselves, and the timeliness of such algorithms is poor.
In embodiments of the present invention, for massive user data, one-person-multiple-device identification is performed first: cases such as one person owning multiple devices or a user switching devices are identified accurately, and the mobile terminals of one person are associated and given corresponding marker information, which reduces the volume of data to be processed. This identification can use the correspondence among mobile terminal identifiers (including IMEI, MAC address or IDFA), identifiers internal to an app (including uid, ukey, alias or msg_id) and other unique user identifiers (mobile phone number or id). In a specific embodiment, the application establishes a correspondence between user device fingerprints and multi-dimensional app identifiers, determines the one-person-multiple-device association through cross-positioning analysis of device fingerprints and multi-dimensional app identifiers, and uses it to associate data in subsequent analysis.
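The cross-referencing of device fingerprints and app-internal identifiers can be read as a connected-components problem: two devices are linked whenever the same app-internal identifier is observed on both. The sketch below illustrates that reading only. The function name `associate_devices`, the input layout, and the rule that a single shared identifier is enough to link two devices are assumptions; the application does not specify how the degree of association is computed.

```python
def associate_devices(observations):
    """Group device fingerprints that share at least one app-internal ID.

    observations: iterable of (device_fingerprint, app_user_id) pairs.
    Returns a dict mapping each device fingerprint to a group marker.
    Assumption: one shared app ID is enough to link two devices.
    """
    parent = {}

    def find(x):
        # Walk to the root of x's group, shortening the path on the way.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the smaller root so the marker is deterministic.
            parent[max(ra, rb)] = min(ra, rb)

    first_device_for_id = {}
    for device, app_id in observations:
        find(device)  # register the device even if it links to nothing
        if app_id in first_device_for_id:
            union(device, first_device_for_id[app_id])
        else:
            first_device_for_id[app_id] = device

    return {device: find(device) for device in list(parent)}
```

Devices that end up with the same marker would then be treated as belonging to one person in the later steps.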
Specifically, referring to Fig. 1, a user behavior similarity recognition method based on mobile terminals provided by the present application comprises:
Data acquisition step S1: obtaining fingerprint information of at least two mobile terminals, LBS location information of the mobile terminals, and multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data comprising: app new-install/installed/uninstall behavior data and WiFi connection behavior data;
Device association step S2: according to the fingerprint information and the multi-dimensional app behavior data, associating the mobile terminals of one person with multiple devices and generating corresponding marker information;
Online behavior similarity calculation step S3: extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
Offline behavior similarity calculation step S4: according to the LBS location information, aggregating the higher-density report points to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
Comprehensive analysis step S5: quantifying user behavior similarity according to the marker information, the online behavior similarity and the offline behavior similarity. In the concrete analysis, data from multiple devices of the same person that carry association marker information are analyzed together.
For step S3, the online behavior similarity of users is measured mainly in two respects: app installation feature similarity and WiFi feature association degree.
App installation feature similarity: a user app installation behavior feature matrix is constructed, and the generalized Jaccard correlation coefficient is used to calculate the app installation feature similarity between users. Different apps differ greatly in how well they reflect user similarity, so the similarity is weighted by app penetration rate to obtain a corrected app installation feature similarity.
In this similarity, C_i and C_j are the feature vectors of the apps installed on the devices. Each feature value is 0 or 1, where 0 means not installed and 1 means installed (newly installed or previously installed), and m denotes the penetration rate of each app.
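A penetration-weighted generalized Jaccard coefficient could take the following shape. This is a sketch under stated assumptions, not the application's formula: the weighted min/max form is the usual generalized Jaccard definition, while the mapping from penetration rate m to a weight (here `1 - m`, so that rarer apps count more) and both function names are hypothetical.

```python
def weighted_jaccard(c_i, c_j, weights):
    """Generalized (weighted) Jaccard similarity of two install vectors.

    c_i, c_j: sequences of 0/1 install flags, one entry per app.
    weights:  one non-negative weight per app (assumed to be derived
              from the app's penetration rate m).
    """
    if not (len(c_i) == len(c_j) == len(weights)):
        raise ValueError("all inputs must have the same length")
    num = sum(w * min(a, b) for a, b, w in zip(c_i, c_j, weights))
    den = sum(w * max(a, b) for a, b, w in zip(c_i, c_j, weights))
    # Two devices with no installed apps are treated as dissimilar.
    return num / den if den > 0 else 0.0


def penetration_weights(penetration):
    # Hypothetical mapping: apps with low penetration weigh more.
    return [1.0 - m for m in penetration]
```

With this weighting, sharing a niche app raises the similarity more than sharing an app that almost every device has installed.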
WiFi feature association degree: the data are cleaned according to information such as time, space and wireless WiFi attributes, and a user WiFi usage feature matrix is built. Based on this matrix, and for the WiFi connection features between devices in different periods, cosine similarity is used to calculate, for each pair of users, the working-period WiFi usage feature association degree and the rest-period WiFi usage feature association degree.
The WiFi feature association degree is calculated as:
$$\cos(x_a, x_b) = \frac{x_a \cdot x_b}{\lVert x_a \rVert \, \lVert x_b \rVert}$$
where x_a and x_b are the WiFi usage feature vectors of user a and user b, each feature value being the user's WiFi connection frequency within a period of time.
Applying this formula gives:
Working-period WiFi usage feature association degree: $\cos(t_a, t_b) = \frac{t_a \cdot t_b}{\lVert t_a \rVert \, \lVert t_b \rVert}$
Rest-period WiFi usage feature association degree: $\cos(r_a, r_b) = \frac{r_a \cdot r_b}{\lVert r_a \rVert \, \lVert r_b \rVert}$
where t_a and t_b are the WiFi usage feature vectors of user a and user b in the working period, and r_a and r_b are their WiFi usage feature vectors in the rest period.
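The two WiFi association degrees are plain cosine similarities over per-period connection-frequency vectors, which can be sketched as follows. The function names and the choice to return 0.0 for an all-zero vector are assumptions made for illustration.

```python
import math


def cosine_similarity(x_a, x_b):
    """Cosine similarity of two WiFi connection-frequency vectors."""
    if len(x_a) != len(x_b):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x_a, x_b))
    norm_a = math.sqrt(sum(a * a for a in x_a))
    norm_b = math.sqrt(sum(b * b for b in x_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: a user with no connections matches nobody
    return dot / (norm_a * norm_b)


def wifi_association(work_a, work_b, rest_a, rest_b):
    """Return (working-period, rest-period) association degrees."""
    return cosine_similarity(work_a, work_b), cosine_similarity(rest_a, rest_b)
```

Each vector position stands for one WiFi network, so two users who connect to the same office network during working hours score high on the first value even if their rest-period networks never overlap.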
For step S4, referring to Fig. 2, the offline behavior similarity calculation step comprises:
Step S401: for the user's historical LBS information within a certain period, the noisy LBS report points are rejected according to the user's historical behavior features, giving optimized user report-point positions. Each user has a large number of report-point positions at different times, and many of them are noise that does not reflect the user's offline behavior features. Such data not only increase the computation but also interfere with the analysis of the user's real behavior, so they need to be removed.
Step S402: cluster analysis is performed on the optimized user report-point positions to obtain user aggregated behavior trajectories. Specifically, a density-based clustering algorithm is used to cluster the user's report-point positions: a region where the user's report points are dense is grouped into one cluster and defined as an interest region. After clustering, the behavior trajectories of all users are converted into aggregated behavior trajectories expressed in terms of interest regions, and the trajectory points in such a trajectory are the interest regions obtained by clustering.
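The application names only "a density-based clustering algorithm". As one concrete possibility, the sketch below uses DBSCAN, which is a substitution chosen here for illustration. The parameters `eps` and `min_pts` and the use of planar coordinates are likewise assumptions; raw latitude and longitude would need a geographic distance function instead of `math.dist`.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN. Returns one label per point; -1 marks noise.

    points: list of (x, y) tuples in a planar coordinate system.
    eps:     neighbourhood radius.
    min_pts: minimum neighbours (including the point itself) for a
             point to count as a core point.
    """
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j, q in enumerate(points)
                if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point: keep expanding
    return labels
```

Each non-negative label would correspond to one interest region, and a user's trajectory becomes the sequence of region labels visited.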
Step S403: the user aggregated behavior trajectories are binary-encoded, and the weights of the user aggregated behavior trajectories are calculated using the encoding. Cluster analysis yields the users' aggregated behavior trajectories, but the interest regions in a trajectory do not all influence the trajectory similarity equally. The weight of each interest region is inversely related to the number of times it occurs in the historical trajectories: a place that occurs more often has less influence on user trajectory similarity, and a place that occurs less often has more influence.
The application borrows the inverse document frequency used in information retrieval (IR) systems and defines an interest region weight (IRW) to reflect how much each interest region in an aggregated behavior trajectory contributes to user similarity.
In the IRW definition, r denotes a trajectory point, u is the total number of users, T_u denotes the set of all trajectory points of user u, and |{u : r ∈ T_u}| is the number of users whose trajectories contain trajectory point r.
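Since the IRW is modeled on inverse document frequency, one plausible form is the logarithm of the total number of users divided by the number of users whose trajectories contain the region. The sketch below assumes exactly that form; the natural logarithm, the absence of smoothing, and the function name are assumptions rather than details given in the text.

```python
import math
from collections import Counter


def interest_region_weights(user_tracks):
    """IDF-style weight for each interest region.

    user_tracks: dict mapping user id -> iterable of region ids visited
                 (the set T_u in the text).
    Assumed form: IRW(r) = log(N / n_r), with N the total number of
    users and n_r the number of users whose trajectories contain r.
    """
    n_users = len(user_tracks)
    users_per_region = Counter()
    for regions in user_tracks.values():
        users_per_region.update(set(regions))  # count each user once
    return {r: math.log(n_users / n) for r, n in users_per_region.items()}
```

A region visited by every user gets weight 0 under this form, which matches the stated intent that common places should contribute little to trajectory similarity.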
To calculate the trajectory similarity between all users, the application first uses the Geohash algorithm to binary-encode the user aggregated behavior trajectories. After encoding, each trajectory point r in the aggregated behavior trajectory s_{u,i} of user u is represented by a binary code.
The weight of a user aggregated behavior trajectory reflects the weights of the interest regions in that trajectory. The weight of the i-th trajectory of user u is w_{u,i}, and it is determined from the following quantities:
the k-th encoded bit of the weight of the i-th trajectory of user u;
the weight mapping rule of each trajectory point; and
the k-th encoded bit of trajectory point j in trajectory i of user u. If that bit is 1, the weight of the trajectory point is taken as positive; otherwise it is taken as negative.
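The encoding and weighting described here resemble a SimHash-style fingerprint: each trajectory point contributes its weight positively to the bit positions where its code is 1 and negatively where it is 0. The sketch below follows that reading. The bit length, the final sign step that turns the per-bit totals into a 0/1 fingerprint, and both function names are assumptions; the bisection itself is the standard Geohash scheme before base-32 packing.

```python
def geohash_bits(lat, lon, n_bits=32):
    """Encode a point as a list of 0/1 bits using Geohash bisection.

    Bits alternate between longitude and latitude, longitude first.
    """
    lat_rng, lon_rng = [-90.0, 90.0], [-180.0, 180.0]
    bits = []
    for k in range(n_bits):
        rng, value = (lon_rng, lon) if k % 2 == 0 else (lat_rng, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits.append(1)
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits.append(0)
            rng[1] = mid  # keep the lower half of the interval
    return bits


def trajectory_fingerprint(points, n_bits=32):
    """Fold weighted point codes into one n_bits-long 0/1 fingerprint.

    points: list of (lat, lon, irw) tuples, irw being the region weight.
    For each bit position the weight is added when the point's bit is 1
    and subtracted when it is 0; the sign of the total gives the bit.
    """
    totals = [0.0] * n_bits
    for lat, lon, irw in points:
        for k, bit in enumerate(geohash_bits(lat, lon, n_bits)):
            totals[k] += irw if bit == 1 else -irw
    return [1 if t > 0 else 0 for t in totals]
```

Under this scheme a heavily weighted interest region dominates the fingerprint, so two trajectories that share their rarest regions end up with fingerprints that differ in few bits.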
Step S404: according to the weights of the user aggregated behavior trajectories, the Hamming distance between any two trajectories of two users is calculated, and the Hamming distance is normalized to obtain the similarity of the two trajectories. The application uses the Hamming distance to express the difference between users: a smaller Hamming distance means a smaller difference between two trajectories. After the Hamming distance is normalized, the similarity sim_{x,y} between trajectories x and y is obtained,
where H(w_{u,x}, w_{v,y}) denotes the Hamming distance between trajectory x of user u and trajectory y of user v.
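One common way to normalize a Hamming distance into a similarity is to divide by the fingerprint length and subtract from 1. The sketch below assumes that normalization and assumes trajectory weights are stored as equal-length lists of 0/1 bits; neither detail is confirmed by the text.

```python
def hamming_distance(w_x, w_y):
    """Number of positions at which two equal-length bit lists differ."""
    if len(w_x) != len(w_y):
        raise ValueError("fingerprints must have the same length")
    return sum(a != b for a, b in zip(w_x, w_y))


def track_similarity(w_x, w_y):
    """Assumed normalisation: sim = 1 - H / K for K-bit fingerprints."""
    if not w_x:
        raise ValueError("fingerprints must not be empty")
    return 1.0 - hamming_distance(w_x, w_y) / len(w_x)
```

Identical fingerprints give a similarity of 1.0 and fully complementary ones give 0.0, so the value can be compared directly across trajectory pairs.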
Step S405: according to the Hamming distances between all pairs of trajectories of two users, the aggregated behavior trajectory similarity DSIM_{u,v} of the two users is obtained,
where |T_u| and |T_v| denote the numbers of trajectories of user u and user v respectively.
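Because the definition refers to all trajectory pairs and to the trajectory counts of both users, a natural reading is an average of the pairwise trajectory similarities. The sketch below assumes that averaging form; the function name and the callable parameter `pair_similarity` are hypothetical, and any per-pair similarity in [0, 1] can be plugged in.

```python
def aggregate_similarity(tracks_u, tracks_v, pair_similarity):
    """Average pairwise track similarity between two users.

    tracks_u, tracks_v: lists of track fingerprints for users u and v.
    pair_similarity:    callable returning a similarity in [0, 1] for
                        one track of u and one track of v.
    Assumed form: DSIM(u, v) = sum of sim(x, y) over all pairs,
    divided by |T_u| * |T_v|.
    """
    if not tracks_u or not tracks_v:
        return 0.0  # assumption: no trajectories means no similarity
    total = sum(pair_similarity(x, y) for x in tracks_u for y in tracks_v)
    return total / (len(tracks_u) * len(tracks_v))
```

Dividing by the product of the two counts keeps the result in [0, 1] regardless of how many trajectories each user has.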
Correspondingly, the application also provides a user behavior similarity recognition system based on mobile terminals, the system comprising:
A data acquisition module, configured to obtain fingerprint information of at least two mobile terminals, LBS location information of the mobile terminals, and multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data comprising: app install/reinstall/uninstall behavior data, app click and usage behavior data, and WiFi connection behavior data;
A device association module, configured to associate the mobile terminals of one person with multiple devices and generate corresponding marker information, according to the fingerprint information and the multi-dimensional app behavior data;
An online behavior similarity calculation module, configured to extract a user online behavior feature matrix from the multi-dimensional app behavior data and calculate the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation module, configured to aggregate the higher-density report points according to the LBS location information to obtain a user historical trajectory feature matrix, and calculate the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis module, configured to quantify user behavior similarity according to the marker information, the online behavior similarity and the offline behavior similarity.
Correspondingly, the application also provides a user behavior similarity recognition device based on mobile terminals, the device comprising:
A memory, configured to store a program;
A processor, configured to implement the above user behavior similarity recognition method based on mobile terminals by executing the program stored in the memory.
Those skilled in the art will understand that all or part of the functions of the methods in the above embodiments may be implemented in hardware or by a computer program. When all or part of the functions are implemented by a computer program, the program may be stored in a computer-readable storage medium, which may include read-only memory, random access memory, magnetic disk, optical disc, hard disk and the like, and the functions are realized when the program is executed by a computer. For example, the program is stored in the memory of a device, and all or part of the functions are realized when the processor executes the program in the memory. In addition, the program may also be stored in a storage medium such as a server, another computer, a magnetic disk, an optical disc, a flash drive or a removable hard disk, and be downloaded or copied into the memory of a local device, or used to update the version of the local device's system. All or part of the functions in the above embodiments are then realized when the processor executes the program in the memory.
The present invention has been described above using specific examples, which are intended only to aid understanding of the invention and not to limit it. Those skilled in the art may make several simple deductions, variations or substitutions in accordance with the idea of the present invention.

Claims (10)

1. A user behavior similarity recognition method based on mobile terminals, characterized by comprising:
A data acquisition step: obtaining fingerprint information of at least two mobile terminals, LBS location information of the mobile terminals, and multi-dimensional app behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data comprising: app installed and uninstalled behavior data, app uninstall behavior data, and WiFi connection behavior data;
A device association step: according to the fingerprint information and the multi-dimensional app behavior data, associating the mobile terminals of one person with multiple devices and generating corresponding marker information;
An online behavior similarity calculation step: extracting a user online behavior feature matrix from the multi-dimensional app behavior data, and calculating the online behavior similarity between users according to the user online behavior feature matrix;
An offline behavior similarity calculation step: according to the LBS location information, aggregating the higher-density report points to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users according to the user historical trajectory feature matrix;
A comprehensive analysis step: quantifying user behavior similarity according to the marker information, the online behavior similarity and the offline behavior similarity.
2. the method as described in claim 1, which is characterized in that the equipment associated steps include:
According to the finger print information and the app various dimensions behavioral data information, user fingerprints information and app multidimensional behaviors are established Correspondence incidence relation between data, the cross bearing identified by device-fingerprint and app multidimensional are analyzed, and determine that people's multimachine is standby The degree of association is associated correspondingly mobile terminal and generates correspondingly label information when the degree of association is high.
3. The method of claim 1, characterized in that the online behavior similarity calculation step comprises calculating an app installation feature similarity and calculating a WiFi feature association degree, wherein the WiFi feature association degree comprises a working-hours WiFi usage feature association degree and a rest-period WiFi usage feature association degree.
4. The method of claim 3, characterized in that the app installation feature similarity is calculated using the generalized Jaccard coefficient, with the formula:
where C_i and C_j are the feature vectors of the apps installed on the devices, and m denotes the penetration rate of an app.
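The exact formula of claim 4 is not reproduced in this text, so the following is only a sketch of the standard generalized Jaccard coefficient (sum of element-wise minima divided by sum of element-wise maxima). How the penetration rate m enters the formula is unknown here; the helper `weighted_install_vector`, which down-weights widely installed apps by `1 - m`, is a hypothetical illustration and not the patent's method.

```python
from typing import Dict, List, Sequence, Set


def generalized_jaccard(c_i: Sequence[float], c_j: Sequence[float]) -> float:
    """Standard generalized Jaccard coefficient of two non-negative vectors."""
    if len(c_i) != len(c_j):
        raise ValueError("feature vectors must have the same length")
    numerator = sum(min(a, b) for a, b in zip(c_i, c_j))
    denominator = sum(max(a, b) for a, b in zip(c_i, c_j))
    # Two all-zero vectors carry no information; treat them as dissimilar.
    return numerator / denominator if denominator > 0 else 0.0


def weighted_install_vector(installed: Set[str],
                            penetration: Dict[str, float]) -> List[float]:
    """Hypothetical weighting: an installed app contributes 1 - m,
    so rarely installed apps count more than ubiquitous ones."""
    apps = sorted(penetration)  # fixed app order shared by all devices
    return [(1.0 - penetration[a]) if a in installed else 0.0 for a in apps]
```

With binary install vectors the coefficient reduces to the ordinary Jaccard index of the two app sets.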
5. The method of claim 3, characterized in that the WiFi feature association degree is calculated using cosine similarity, with the formula:
$$\mathrm{sim}(x_a, x_b) = \frac{x_a \cdot x_b}{\lVert x_a \rVert \, \lVert x_b \rVert}$$
where x_a and x_b are the WiFi usage feature vectors of user a and user b, respectively, each component indicating how strongly the user connects to a given WiFi network.
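Cosine similarity is the dot product of the two vectors divided by the product of their Euclidean norms. A minimal sketch follows; the function name and the convention of returning 0.0 for an all-zero vector are assumptions made for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(x_a: Sequence[float], x_b: Sequence[float]) -> float:
    """Cosine of the angle between two WiFi usage feature vectors."""
    if len(x_a) != len(x_b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(x_a, x_b))
    norm_a = math.sqrt(sum(a * a for a in x_a))
    norm_b = math.sqrt(sum(b * b for b in x_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: no WiFi usage means no association
    return dot / (norm_a * norm_b)
```

Following claim 3, the same function could be applied twice per user pair, once to working-hours vectors and once to rest-period vectors.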
6. The method of claim 1, characterized in that the offline behavior similarity calculation step comprises:
removing LBS noise report points from the user's historical LBS information according to the user's historical behavior features, to obtain optimized user report point positions;
performing cluster analysis on the optimized user report point positions to obtain the user's aggregated behavior trajectories;
binary-encoding the user's aggregated behavior trajectories, and using the codes to calculate the weights of the aggregated behavior trajectories;
calculating, according to the weights of the aggregated behavior trajectories, the Hamming distance between any two trajectories of two users, and normalizing the Hamming distance to obtain the similarity of the two trajectories;
obtaining the aggregated behavior trajectory similarity of the two users from the pairwise Hamming distances between all of their trajectories.
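The last two steps of claim 6 can be sketched as follows, assuming each trajectory has already been encoded as a non-negative integer of `n_bits` bits. The normalization (one minus distance divided by code length) and the aggregation rule (mean over all trajectory pairs) are assumptions; the claim only states that the Hamming distance is normalized and that all pairwise distances are used.

```python
from itertools import product
from typing import Sequence


def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def trajectory_similarity(code_a: int, code_b: int, n_bits: int) -> float:
    """Normalize the Hamming distance to a similarity in [0, 1]."""
    if n_bits <= 0:
        raise ValueError("n_bits must be positive")
    return 1.0 - hamming_distance(code_a, code_b) / n_bits


def user_trajectory_similarity(codes_u: Sequence[int],
                               codes_v: Sequence[int],
                               n_bits: int) -> float:
    """Assumed aggregation: mean similarity over all trajectory pairs."""
    pairs = list(product(codes_u, codes_v))
    if not pairs:
        return 0.0
    return sum(trajectory_similarity(a, b, n_bits) for a, b in pairs) / len(pairs)
```

Other aggregation rules, such as taking the best match per trajectory, fit the claim wording equally well.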
7. The method of claim 6, characterized in that the weight of the user's aggregated behavior trajectory is:
Wherein the term denotes the k-th bit of the weight code of the i-th trajectory of user u;
Wherein,
Wherein r denotes a trajectory point, U is the total number of users, T_u denotes the set of all trajectory points of user u, and |{u : r ∈ T_u}| is the number of users whose trajectories contain trajectory point r;
Wherein the term is the weight mapping rule for trajectory points;
Wherein the term denotes the k-th bit of the code of trajectory point j in trajectory i of user u.
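The formulas of claim 7 are not reproduced in this text, so the sketch below is a guess at their general shape rather than the patent's definition. It assumes an IDF-style point weight (logarithm of total users over the number of users whose trajectories contain the point) and a SimHash-style mapping in which a 1 bit adds the weight and a 0 bit subtracts it. The hash-based `point_code` and all function names are hypothetical.

```python
import hashlib
import math
from collections import defaultdict
from typing import Dict, List, Sequence


def point_weights(trajs_by_user: Dict[str, List[List[str]]]) -> Dict[str, float]:
    """Assumed IDF-style weight: log(total users / users containing point r)."""
    total_users = len(trajs_by_user)
    users_with_point = defaultdict(set)
    for user, trajectories in trajs_by_user.items():
        for trajectory in trajectories:
            for r in trajectory:
                users_with_point[r].add(user)
    return {r: math.log(total_users / len(users))
            for r, users in users_with_point.items()}


def point_code(point: str, n_bits: int) -> int:
    """Hypothetical binary code of a trajectory point (n_bits <= 256)."""
    digest = hashlib.sha256(point.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") & ((1 << n_bits) - 1)


def trajectory_code(trajectory: Sequence[str],
                    weights: Dict[str, float],
                    n_bits: int = 16) -> int:
    """SimHash-style code: bit k is 1 when the signed weight sum is positive."""
    totals = [0.0] * n_bits
    for r in trajectory:
        code = point_code(r, n_bits)
        w = weights.get(r, 0.0)
        for k in range(n_bits):
            # Assumed mapping rule: a 1 bit contributes +w, a 0 bit -w.
            totals[k] += w if (code >> k) & 1 else -w
    return sum(1 << k for k in range(n_bits) if totals[k] > 0)
```

Under this weighting, a point visited by every user gets weight zero and does not influence the trajectory code, while rarely shared points dominate it.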
8. A system for user behavior similarity recognition based on mobile terminals, characterized by comprising:
a data acquisition module for obtaining device fingerprint information of at least two mobile terminals, LBS location information of the mobile terminals, and multi-dimensional behavior data of the apps installed on the mobile terminals, the multi-dimensional app behavior data comprising: app installation/reinstallation/uninstallation behavior data, app click-and-use behavior data, and WiFi connection behavior data;
a device association module for associating, according to the fingerprint information and the multi-dimensional app behavior data, the mobile terminals that belong to one person using multiple devices and generating corresponding label information;
an online behavior similarity calculation module for extracting a user online behavior feature matrix from the multi-dimensional app behavior data and calculating the online behavior similarity between users from that matrix;
an offline behavior similarity calculation module for aggregating, according to the LBS location information, the report points of higher density to obtain a user historical trajectory feature matrix, and calculating the offline behavior similarity between users from that matrix;
a comprehensive analysis module for quantifying user behavior similarity according to the label information, the online behavior similarity, and the offline behavior similarity.
9. A device for user behavior similarity recognition based on mobile terminals, characterized by comprising:
a memory for storing a program;
a processor for implementing the method of any one of claims 1-7 by executing the program stored in the memory.
10. A computer-readable storage medium, characterized by comprising a program executable by a processor to implement the method of any one of claims 1-7.
CN201810307705.6A 2018-04-08 2018-04-08 User behavior similarity recognition method, system and device based on mobile terminal Pending CN108596815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810307705.6A CN108596815A (en) 2018-04-08 2018-04-08 User behavior similarity recognition method, system and device based on mobile terminal

Publications (1)

Publication Number Publication Date
CN108596815A true CN108596815A (en) 2018-09-28

Family

ID=63621303

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810307705.6A Pending CN108596815A (en) 2018-04-08 2018-04-08 User behavior similarity recognition method, system and device based on mobile terminal

Country Status (1)

Country Link
CN (1) CN108596815A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110035393A (en) * 2019-04-22 2019-07-19 浙江每日互动网络科技股份有限公司 The recognition methods of mobile terminal relationship
CN110807052A (en) * 2019-11-05 2020-02-18 佳都新太科技股份有限公司 User group classification method, device, equipment and storage medium
CN110825785A (en) * 2019-11-05 2020-02-21 佳都新太科技股份有限公司 Data mining method and device, electronic equipment and storage medium
CN111507732A (en) * 2019-01-30 2020-08-07 北京嘀嘀无限科技发展有限公司 System and method for identifying similar trajectories
CN111669710A (en) * 2020-04-21 2020-09-15 上海因势智能科技有限公司 Demographic deduplication method
CN112269937A (en) * 2020-11-16 2021-01-26 加和(北京)信息科技有限公司 Method, system and device for calculating user similarity

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070042051A (en) * 2005-10-17 2007-04-20 에스케이 텔레콤주식회사 Method and system for providing shopping information in mobile telecommunication environment
CN102929928A (en) * 2012-09-21 2013-02-13 北京格致璞科技有限公司 Multidimensional-similarity-based personalized news recommendation method
CN103116614A (en) * 2013-01-25 2013-05-22 北京奇艺世纪科技有限公司 Collaborative filtering recommendation method, device and system base on user track
CN104796468A (en) * 2015-04-14 2015-07-22 蔡宏铭 Method and system for realizing instant messaging of people travelling together and travel-together information sharing
CN105095909A (en) * 2015-07-13 2015-11-25 中国联合网络通信集团有限公司 User similarity evaluation method and apparatus for mobile network
CN107515915A (en) * 2017-08-18 2017-12-26 晶赞广告(上海)有限公司 User based on user behavior data identifies correlating method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
印桂生,程伟杰等: ""使用轨迹指纹和地点相似性的地点推荐"", 《哈尔滨工程大学学报》 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111507732A (en) * 2019-01-30 2020-08-07 北京嘀嘀无限科技发展有限公司 System and method for identifying similar trajectories
CN111507732B (en) * 2019-01-30 2023-07-07 北京嘀嘀无限科技发展有限公司 System and method for identifying similar trajectories
CN110035393A (en) * 2019-04-22 2019-07-19 浙江每日互动网络科技股份有限公司 The recognition methods of mobile terminal relationship
CN110807052A (en) * 2019-11-05 2020-02-18 佳都新太科技股份有限公司 User group classification method, device, equipment and storage medium
CN110825785A (en) * 2019-11-05 2020-02-21 佳都新太科技股份有限公司 Data mining method and device, electronic equipment and storage medium
CN111669710A (en) * 2020-04-21 2020-09-15 上海因势智能科技有限公司 Demographic deduplication method
CN112269937A (en) * 2020-11-16 2021-01-26 加和(北京)信息科技有限公司 Method, system and device for calculating user similarity
CN112269937B (en) * 2020-11-16 2024-02-02 加和(北京)信息科技有限公司 Method, system and device for calculating user similarity

Similar Documents

Publication Publication Date Title
CN108596815A (en) User behavior similarity recognition method, system and device based on mobile terminal
CN107798557A (en) Electronic installation, the service location based on LBS data recommend method and storage medium
TWI696194B (en) Sorting method and device of complaint report type
CN107358247B (en) Method and device for determining lost user
CN111222976B (en) Risk prediction method and device based on network map data of two parties and electronic equipment
CN111199474B (en) Risk prediction method and device based on network map data of two parties and electronic equipment
CN105988988A (en) Method and device for processing text address
CN111163072B (en) Method and device for determining characteristic value in machine learning model and electronic equipment
CN111125658B (en) Method, apparatus, server and storage medium for identifying fraudulent user
CN109543040A (en) Similar account recognition methods and device
CN110728526A (en) Address recognition method, apparatus and computer readable medium
CN110599200A (en) Detection method, system, medium and device for false address of OTA hotel
CN110909540B (en) Method and device for identifying new words of short message spam and electronic equipment
CN110728313A (en) Classification model training method and device for intention classification recognition
CN108900619A (en) A kind of independent Statistics of accessing population method and device
CN109308615B (en) Real-time fraud transaction detection method, system, storage medium and electronic terminal based on statistical sequence characteristics
CN106874760A (en) A kind of Android malicious code sorting techniques based on hierarchy type SimHash
CN112818162A (en) Image retrieval method, image retrieval device, storage medium and electronic equipment
CN110909804B (en) Method, device, server and storage medium for detecting abnormal data of base station
CN106301979B (en) Method and system for detecting abnormal channel
Liu et al. Extracting, ranking, and evaluating quality features of web services through user review sentiment analysis
CN109409959A (en) A kind of user information analysis method, device, equipment and medium
CN110598122B (en) Social group mining method, device, equipment and storage medium
CN111538925A (en) Method and device for extracting Uniform Resource Locator (URL) fingerprint features
CN109739840A (en) Data processing empty value method, apparatus and terminal device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180928