CN106844673A - Method and system for obtaining multidimensional interpersonal relationship closeness from public security data - Google Patents

Publication number
CN106844673A
Authority
CN
China
Prior art keywords
degree
relation
cohesion
behavior
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710054364.1A
Other languages
Chinese (zh)
Inventor
任爱敏
田峰
王贤然
郑冰
曹传卓
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Billion Billion Rand Communication Technology Co Ltd
Original Assignee
Shandong Billion Billion Rand Communication Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Billion Billion Rand Communication Technology Co Ltd
Priority to CN201710054364.1A
Publication of CN106844673A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465Query processing support for facilitating data mining operations in structured databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28Databases characterised by their database models, e.g. relational or object models
    • G06F16/283Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Tourism & Hospitality (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Mathematical Physics (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method and system for obtaining multidimensional interpersonal relationship closeness from public security data. The method first obtains data on the related persons and calculates their single-relation closeness, that is, the closeness of the relationship as expressed by a single type of behavior. It then applies a weighted-average algorithm to the single-relation closeness values to calculate the multidimensional combined closeness, that is, the closeness of the relationship as expressed by several types of behavior. Finally, for the related persons, it obtains a relationship-degree decay function that decays linearly, exponentially, or by half-life, and recalculates the closeness between the persons to obtain their relationship. Compared with the prior art, the method and system improve on the original approach of calculating closeness from behavior counts: they are suitable for defining multidimensional relationship closeness and also handle the decay of closeness over time. They are practical, widely applicable, and easy to promote.

Description

Method and system for obtaining multidimensional interpersonal relationship closeness from public security data
Technical field
The present invention relates to the field of computer application technology, and specifically to a practical method and system for obtaining multidimensional interpersonal relationship closeness from public security data.
Background art
With the rapid development and widespread adoption of computer and information technology, the data accumulated in the application systems of the public security sector has become increasingly rich. The various kinds of personal behavior data hold great value, and mining the relationships between persons has become imperative.
The prevailing technique for mining relationships between persons is mining based on relationship networks. Whether a traditional relational database or an emerging big-data graph computing approach is used, the closeness of the relationships between persons has to be calculated, and a series of relationship analysis results is derived from that closeness.
At present, relationship closeness based on public security data is defined by behavior count: the closeness is simply the number of times a behavior occurred. For example, if two people have stayed at a hotel together 15 times, the closeness of their lodging behavior is 15; if they have gone online together 20 times, the closeness of their internet behavior is 20.
However, defining multidimensional relationship closeness on public security data involves two major difficulties:
1. Public security data has many dimensions, and the relationships among the various kinds of behavior data are complex. With the traditional count-based closeness calculation, it is difficult to find reasonable weights when merging several kinds of closeness.
2. Public security data spans a very long period of time. Calculating closeness from historical behavior data ignores the decay of closeness over time.
The count-based definition reflects closeness within a single behavior relation simply and intuitively, but it cannot reasonably calculate the closeness of a multidimensional combined relationship. For example, if the internet-behavior closeness of A and B is 20 and the lodging-behavior closeness of A and C is 20, it is impossible to judge whether A is closer to B or to C.
Relationships between persons decay over time, but the count-based definition does not take this into account, so its accuracy is deficient.
In view of this, the present invention proposes a method and system for obtaining multidimensional interpersonal relationship closeness from public security data. It improves on the count-based definition so that it can be applied to calculating multidimensional relationship closeness while also handling the decay of closeness over time.
Summary of the invention
The technical task of the present invention is to address the above shortcomings by providing a practical method and system for obtaining multidimensional interpersonal relationship closeness from public security data.
A method for obtaining multidimensional interpersonal relationship closeness from public security data is implemented as follows:
First, data on the related persons is obtained and their single-relation closeness is calculated, that is, the closeness of the relationship as expressed by a single type of behavior.
Then a weighted-average algorithm is applied to the single-relation closeness values to calculate the multidimensional combined closeness, that is, the closeness of the relationship as expressed by several types of behavior.
For the related persons, if no behavior relation is found within a period of time, the closeness gradually decays with the passage of time until it disappears. A relationship-degree decay function that decays linearly, exponentially, or by half-life is therefore obtained, and the closeness between the related persons is recalculated based on this decay function, so that their relationship is obtained accurately.
The data on related persons is obtained from public security system data. Data acquisition is implemented on an architecture consisting of a Zookeeper cluster, a Hadoop cluster, and a Spark cluster: the bottom layer uses the Spark on Yarn architecture mode, with HDFS as storage, Spark as the computing framework, and Flume and Sqoop as the data extraction tools. Internal public security data, including hotel lodging, internet cafe usage, permanent residents, temporary residents, and suspects' mobile phone contacts, is then extracted into Hadoop's HDFS. Preliminary cleaning is performed during extraction to handle null values and invalid data, yielding the data on the related persons.
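The preliminary cleaning step can be illustrated with a minimal sketch in plain Python. It is not the Spark/Sqoop pipeline described above; the record layout, field names, and timestamp format are assumptions made only for illustration.

```python
from datetime import datetime

# Hypothetical raw lodging records; the field names are assumptions.
RAW_RECORDS = [
    {"person_id": "P001", "hotel_id": "H1", "room": "301", "check_in": "2017-01-05 14:02"},
    {"person_id": None, "hotel_id": "H1", "room": "301", "check_in": "2017-01-05 14:03"},
    {"person_id": "P002", "hotel_id": "H1", "room": "302", "check_in": "not-a-date"},
    {"person_id": "P003", "hotel_id": "H1", "room": "", "check_in": "2017-01-05 14:05"},
]

REQUIRED_FIELDS = ("person_id", "hotel_id", "room", "check_in")


def clean(records):
    """Drop records with null/empty required fields or unparseable timestamps."""
    cleaned = []
    for record in records:
        # Null-value handling: skip records missing any required field.
        if any(record.get(field) in (None, "") for field in REQUIRED_FIELDS):
            continue
        # Invalid-data handling: skip records whose timestamp cannot be parsed.
        try:
            check_in = datetime.strptime(record["check_in"], "%Y-%m-%d %H:%M")
        except ValueError:
            continue
        cleaned.append({**record, "check_in": check_in})
    return cleaned


cleaned = clean(RAW_RECORDS)
```

In a real deployment the same filtering would more likely run inside the extraction job itself; the sketch only shows the kind of rule that "handling null values and invalid data" might involve.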
The relationship closeness is measured through behavior relations, which include sharing a room, staying in company, going online together, being colleagues, traveling together, and belonging to the same clan, where:
Sharing a room: the persons stay in the same room of the same hotel at the same time;
Staying in company: the persons stay in two rooms of the same hotel at the same time, checking in together and checking out together, that is, with a time difference within N minutes, where N is less than or equal to 10;
Going online together: the persons go online at the same internet cafe at the same time, logging on together and logging off together, that is, with a time difference within N minutes, where N is less than or equal to 10;
Colleagues: the persons have worked at the same enterprise or organization during the same period;
Traveling together: the persons travel from one place to another at the same time, along the same route and arriving at the same time;
Same clan: the household registration information of the persons belongs to the same clan.
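As an illustration of how one of these rules might be checked, the sketch below tests the "staying in company" rule for two lodging records. The record fields (`hotel`, `room`, `check_in`, `check_out`) are hypothetical; only the rule itself (same hotel, two rooms, check-in and check-out each within N minutes, N at most 10) comes from the text.

```python
from datetime import datetime, timedelta


def is_companion_stay(stay_a, stay_b, n_minutes=10):
    """Return True if two stays satisfy the 'staying in company' rule."""
    if n_minutes > 10:
        raise ValueError("the text limits N to at most 10 minutes")
    limit = timedelta(minutes=n_minutes)
    return (
        stay_a["hotel"] == stay_b["hotel"]
        and stay_a["room"] != stay_b["room"]  # two rooms, not a shared room
        and abs(stay_a["check_in"] - stay_b["check_in"]) <= limit
        and abs(stay_a["check_out"] - stay_b["check_out"]) <= limit
    )


# Made-up sample stays for illustration only.
stay_1 = {"hotel": "H1", "room": "301",
          "check_in": datetime(2017, 1, 5, 14, 0),
          "check_out": datetime(2017, 1, 6, 9, 0)}
stay_2 = {"hotel": "H1", "room": "302",
          "check_in": datetime(2017, 1, 5, 14, 6),
          "check_out": datetime(2017, 1, 6, 9, 4)}
stay_3 = {"hotel": "H1", "room": "303",
          "check_in": datetime(2017, 1, 5, 18, 0),
          "check_out": datetime(2017, 1, 6, 9, 0)}

together = is_companion_stay(stay_1, stay_2)
apart = is_companion_stay(stay_1, stay_3)
```

The "going online together" rule has the same shape, with the internet cafe in place of the hotel and log-on/log-off times in place of check-in/check-out times.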
The single-relation closeness is calculated by the following formula (formula image not reproduced in this text):
In the formula, $p_1$ and $p_2$ denote the two related persons and $r_i$ denotes a rule;
$d_{r_i}(p_1, p_2)$ denotes the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$c_{r_i}(p_1, p_2)$ denotes the behavior count of $p_1$ and $p_2$ under rule $r_i$;
$\alpha$ is the proportion of the overall behavior count accounted for by the behavior count used in this single-relation closeness calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
$a$ is the scaling factor of $c_{r_i}(p_1, p_2)$, with a value between 0 and 1, which controls how fast the relationship degree $d_{r_i}(p_1, p_2)$ grows with the behavior count;
$b$ is the offset of $c_{r_i}(p_1, p_2)$, which shifts the relationship degree $d_{r_i}(p_1, p_2)$ along the behavior count: the relationship degree starts to be calculated only once the behavior count reaches the threshold set by $b$; its value is an integer between 1 and 100;
For closeness $d$ as a function of behavior count $c$, when the behavior count $c_{r_i}(p_1, p_2)$ tends to infinity, the relationship degree $d_{r_i}(p_1, p_2)$ of $p_1$ and $p_2$ under rule $r_i$ tends to 100%, that is:
$$\lim_{c_{r_i}(p_1, p_2) \to \infty} d_{r_i}(p_1, p_2) = 100\%$$
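The formula itself is not available in this text, so the following is only a sketch of one function with the stated properties: it is zero below the offset $b$, grows at a rate set by $a$ and $\alpha$, and tends to 100% as the behavior count grows. The exponential saturating form, and the way $\alpha$ enters as a multiplier, are assumptions and should not be read as the patent's formula.

```python
import math


def single_relation_closeness(count, a, b, alpha=1.0):
    """Assumed saturating form, NOT the source formula.

    Returns 0 while count is below the offset b, then
    1 - exp(-a * alpha * (count - b)), which approaches 1 as count grows.
    """
    if not 0 < a <= 1:
        raise ValueError("a is expected to lie between 0 and 1")
    if count < b:
        return 0.0  # below the offset, no relationship degree is calculated
    return 1.0 - math.exp(-a * alpha * (count - b))


# Relationship degree for a range of behavior counts (illustrative parameters).
curve = [single_relation_closeness(c, a=0.5, b=2, alpha=0.5) for c in range(2, 50)]
```

Any other monotone function bounded by 1, such as a logistic or rational form, would satisfy the same limit; the choice here is purely for illustration.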
The multidimensional combined closeness is calculated by the following formula (formula image not reproduced in this text):
$p_1$, $p_2$: the two related persons; $r_i$: a rule;
The result is the total relationship degree of $p_1$ and $p_2$;
$d_{r_i}(p_1, p_2)$: the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$w_i$: the weight of rule $r_i$, $w_i \in \mathbb{R}^+$;
$\alpha$: the proportion of the overall behavior count accounted for by the behavior count used in this multidimensional combined closeness calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
The rules considered are all the relation rules for $p_1 \to p_2$;
For all rules $r_i$ that exist between $p_1$ and $p_2$, when every $d_{r_i}(p_1, p_2)$ tends to 100%, the total relationship degree also tends to 100%.
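The text names a weighted-average algorithm with positive weights $w_i$, and states that the total tends to 100% when every per-rule degree does. A normalized weighted average has exactly that property, so the sketch below uses it. The exact formula is not available in this text; in particular, $\alpha$ is assumed to be 1 and is left out.

```python
def combined_closeness(degrees, weights):
    """Normalized weighted average of per-rule relationship degrees.

    degrees: dict mapping rule name -> relationship degree in [0, 1]
    weights: dict mapping rule name -> positive weight w_i
    Assumption: alpha = 1, so it does not appear here.
    """
    if not degrees:
        raise ValueError("at least one rule is required")
    if any(weights[rule] <= 0 for rule in degrees):
        raise ValueError("weights must be positive")
    total_weight = sum(weights[rule] for rule in degrees)
    weighted_sum = sum(weights[rule] * degrees[rule] for rule in degrees)
    return weighted_sum / total_weight


# Illustrative values only; the rule names are hypothetical labels.
weights = {"room_sharing": 0.7, "co_online": 0.3}
partial = combined_closeness({"room_sharing": 0.8, "co_online": 0.4}, weights)
saturated = combined_closeness({"room_sharing": 1.0, "co_online": 1.0}, weights)
```

Dividing by the sum of the weights keeps the result between 0 and 1 even when the weights do not sum to 1.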
The relationship-degree decay function Weaken(d) decays linearly, exponentially, or by half-life. Based on this decay function, for rules that have a time-decay attribute, the relationship degree of $p_1$ and $p_2$ is given by the following expression (formula image not reproduced in this text):
The expression uses the decay function of relation $r_i$. Rules without a decay attribute are not decayed; for rules with a decay attribute, the decay is realized by one of the following algorithms:
Linear decay: $d' = d\,(1 - aT)$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time elapsed between the most recent occurrence of the behavior under the current rule and the present;
Or, exponential decay: $d' = a^{T} \times d$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time elapsed between the most recent occurrence of the behavior under the current rule and the present.
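The two decay rules translate directly into code. In the sketch below, the unit of $T$ (for example, years) is not stated in the text and is left to the caller, and clamping the linear result at 0 is an added assumption so that closeness "disappears" rather than turning negative.

```python
def linear_decay(d, a, t):
    """Linear decay d' = d * (1 - a*T), clamped at 0 (the clamp is an assumption)."""
    return max(0.0, d * (1.0 - a * t))


def exponential_decay(d, a, t):
    """Exponential decay d' = a**T * d, with a between 0 and 1."""
    return (a ** t) * d


# Illustrative values: closeness 0.8, elapsed time in arbitrary units.
linear_mid = linear_decay(0.8, a=0.1, t=5)
linear_gone = linear_decay(0.8, a=0.1, t=20)
exp_recent = exponential_decay(0.8, a=0.95, t=0)
exp_later = exponential_decay(0.8, a=0.5, t=2)
```

With $T = 0$ both rules leave the relationship degree unchanged, which matches the intuition that very recent behavior is not discounted.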
After the relationship-degree decay function has been obtained and the definition of relationship closeness is complete, the method further includes a step of building a personal relationship network: a graph visualization tool is used to build the relationship network from the existing data. That is, after closeness has been calculated for the historical data, closeness is calculated daily for the incremental data, and a list of persons related to a given person is obtained, ranked by closeness from high to low.
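The ranking step can be sketched as follows. The storage layout (a dictionary keyed by person pairs) and the closeness values are made up for illustration; the text does not specify how the relationship network is stored.

```python
def rank_related(person, pair_closeness):
    """Return (other_person, closeness) pairs for `person`, highest closeness first."""
    related = []
    for (p, q), value in pair_closeness.items():
        if person == p:
            related.append((q, value))
        elif person == q:
            related.append((p, value))
    return sorted(related, key=lambda item: item[1], reverse=True)


# Hypothetical closeness values between three persons.
pair_closeness = {("A", "B"): 0.96, ("A", "C"): 0.41, ("B", "C"): 0.77}
ranking_for_a = rank_related("A", pair_closeness)
```

A daily incremental run would update the entries of `pair_closeness` for the pairs that appear in the new data and then re-rank.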
A system for obtaining multidimensional interpersonal relationship closeness from public security data comprises:
A data acquisition module, for obtaining data on related persons from public security system data. The data is internal public security data including hotel lodging, internet cafe usage, permanent residents, temporary residents, and suspects' mobile phone contacts. During acquisition, the module also performs preliminary cleaning to handle null values and invalid data;
A single-relation closeness calculation module, for calculating single-relation closeness from the data obtained by the data acquisition module, that is, obtaining the closeness between related persons from one behavior relation. The behavior relations include sharing a room, staying in company, going online together, being colleagues, traveling together, and belonging to the same clan;
A multidimensional combined closeness calculation module, which combines several behavior relations to give a comprehensive view of the closeness between related persons;
A relationship-degree decay calculation module, for calculating the decay function produced by the passage of time between related persons and calculating the relationship degree between them based on that decay function, where the decay function decays linearly, exponentially, or by half-life.
The single-relation closeness calculation module calculates by the following formula (formula image not reproduced in this text):
In the formula, $p_1$ and $p_2$ denote the two related persons and $r_i$ denotes a rule;
$d_{r_i}(p_1, p_2)$ denotes the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$c_{r_i}(p_1, p_2)$ denotes the behavior count of $p_1$ and $p_2$ under rule $r_i$;
$\alpha$ is the proportion of the overall behavior count accounted for by the behavior count used in this single-relation closeness calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
$a$ is the scaling factor of $c_{r_i}(p_1, p_2)$, with a value between 0 and 1, which controls how fast the relationship degree $d_{r_i}(p_1, p_2)$ grows with the behavior count;
$b$ is the offset of $c_{r_i}(p_1, p_2)$, which shifts the relationship degree $d_{r_i}(p_1, p_2)$ along the behavior count: the relationship degree starts to be calculated only once the behavior count reaches the threshold set by $b$; its value is an integer between 1 and 100;
For closeness $d$ as a function of behavior count $c$, when the behavior count $c_{r_i}(p_1, p_2)$ tends to infinity, the relationship degree $d_{r_i}(p_1, p_2)$ of $p_1$ and $p_2$ under rule $r_i$ tends to 100%, that is:
$$\lim_{c_{r_i}(p_1, p_2) \to \infty} d_{r_i}(p_1, p_2) = 100\%$$
The multidimensional combined closeness calculation module calculates by the following formula (formula image not reproduced in this text):
$p_1$, $p_2$: the two related persons; $r_i$: a rule;
The result is the total relationship degree of $p_1$ and $p_2$;
$d_{r_i}(p_1, p_2)$: the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$w_i$: the weight of rule $r_i$, $w_i \in \mathbb{R}^+$;
$\alpha$: the proportion of the overall behavior count accounted for by the behavior count used in this multidimensional combined closeness calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
The rules considered are all the relation rules for $p_1 \to p_2$;
For all rules $r_i$ that exist between $p_1$ and $p_2$, when every $d_{r_i}(p_1, p_2)$ tends to 100%, the total relationship degree also tends to 100%.
The relationship-degree decay calculation module calculates the relationship degree by the following expression (formula image not reproduced in this text):
The expression uses the decay function of relation $r_i$. Rules without a decay attribute are not decayed; for rules with a decay attribute, the decay is realized by one of the following algorithms:
Linear decay: $d' = d\,(1 - aT)$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time elapsed between the most recent occurrence of the behavior under the current rule and the present;
Or, exponential decay: $d' = a^{T} \times d$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time elapsed between the most recent occurrence of the behavior under the current rule and the present.
The system further includes a UI display module. After the relationship-degree decay function has been obtained and the definition of relationship closeness is complete, the UI display module builds a personal relationship network from the existing data using a graph visualization tool. That is, after closeness has been calculated for the historical data, closeness is calculated daily for the incremental data, and a list of persons related to a given person is obtained, ranked by closeness from high to low; the ranking is then displayed on the UI.
The method and system of the present invention for obtaining multidimensional interpersonal relationship closeness from public security data have the following advantages:
Compared with the traditional count-based definition of relationship closeness, the proposed method and system differ little when calculating single-relation closeness, except that the present invention bounds the closeness between 0 and 1. When calculating multidimensional combined closeness, which the traditional method cannot handle well, this method solves the problem effectively and again bounds the result between 0 and 1. For the decay of closeness over time, the present invention uses linear decay or exponential decay, depending on the situation, to control how closeness decays with time, making the final result more accurate and reasonable. The invention improves on the original count-based closeness calculation, is suitable for defining multidimensional relationship closeness, and also handles the decay of closeness over time; it is practical, widely applicable, and easy to promote.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of the system implementation of the present invention.
Fig. 2 is the single-relation closeness curve.
Fig. 3 is the multidimensional relationship closeness curve.
Fig. 4 is the linear decay curve.
Fig. 5 is the exponential decay curve.
Detailed description of the embodiments
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
As shown in the drawings, an information presentation system based on a program axis relates specifically to the fields of distributed graph computing and algorithms. The multidimensional relationship closeness in public security data is defined using a new closeness calculation method, which is a technical advance that differs from the public-security-data closeness calculation methods already applied at home and abroad.
Embodiment 1:
A method for obtaining multidimensional interpersonal relationship closeness from public security data is implemented as follows:
First, data on the related persons is obtained and their single-relation closeness is calculated, that is, the closeness of the relationship as expressed by a single type of behavior.
Then a weighted-average algorithm is applied to the single-relation closeness values to calculate the multidimensional combined closeness, that is, the closeness of the relationship as expressed by several types of behavior.
For the related persons, if no behavior relation is found within a period of time, the closeness gradually decays with the passage of time until it disappears. A relationship-degree decay function that decays linearly, exponentially, or by half-life is therefore obtained, and the closeness between the related persons is recalculated based on this decay function, so that their relationship is obtained accurately.
The data on related persons is obtained from public security system data. Data acquisition is implemented on an architecture consisting of a Zookeeper cluster, a Hadoop cluster, and a Spark cluster: the bottom layer uses the Spark on Yarn architecture mode, with HDFS as storage, Spark as the computing framework, and Flume and Sqoop as the data extraction tools. Internal public security data, including hotel lodging, internet cafe usage, permanent residents, temporary residents, and suspects' mobile phone contacts, is then extracted into Hadoop's HDFS. Preliminary cleaning is performed during extraction to handle null values and invalid data, yielding the data on the related persons.
The relationship closeness is measured through behavior relations, which include sharing a room, staying in company, going online together, being colleagues, traveling together, and belonging to the same clan, where:
Sharing a room: the persons stay in the same room of the same hotel at the same time;
Staying in company: the persons stay in two rooms of the same hotel at the same time, checking in together and checking out together, that is, with a time difference within N minutes, where N is less than or equal to 10;
Going online together: the persons go online at the same internet cafe at the same time, logging on together and logging off together, that is, with a time difference within N minutes, where N is less than or equal to 10;
Colleagues: the persons have worked at the same enterprise or organization during the same period;
Traveling together: the persons travel from one place to another at the same time, along the same route and arriving at the same time;
Same clan: the household registration information of the persons belongs to the same clan.
The single-relation closeness is calculated by the following formula (formula image not reproduced in this text); Fig. 2 shows the curve of relationship degree against behavior count:
In the formula, $p_1$ and $p_2$ denote the two related persons and $r_i$ denotes a rule;
$d_{r_i}(p_1, p_2)$ denotes the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$c_{r_i}(p_1, p_2)$ denotes the behavior count of $p_1$ and $p_2$ under rule $r_i$;
$\alpha$ is the proportion of the overall behavior count accounted for by the behavior count used in this single-relation closeness calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
$a$ is the scaling factor of $c_{r_i}(p_1, p_2)$, with a value between 0 and 1, which controls how fast the relationship degree $d_{r_i}(p_1, p_2)$ grows with the behavior count;
$b$ is the offset of $c_{r_i}(p_1, p_2)$, which shifts the relationship degree $d_{r_i}(p_1, p_2)$ along the behavior count: the relationship degree starts to be calculated only once the behavior count reaches the threshold set by $b$; its value is an integer between 1 and 100;
For closeness $d$ as a function of behavior count $c$, when the behavior count $c_{r_i}(p_1, p_2)$ tends to infinity, the relationship degree $d_{r_i}(p_1, p_2)$ of $p_1$ and $p_2$ under rule $r_i$ tends to 100%, that is:
$$\lim_{c_{r_i}(p_1, p_2) \to \infty} d_{r_i}(p_1, p_2) = 100\%$$
The multidimensional combined closeness is calculated by the following formula (formula image not reproduced in this text); Fig. 3 shows the curve of relationship degree against behavior count:
$p_1$, $p_2$: the two related persons; $r_i$: a rule;
The result is the total relationship degree of $p_1$ and $p_2$;
$d_{r_i}(p_1, p_2)$: the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$w_i$: the weight of rule $r_i$, $w_i \in \mathbb{R}^+$;
$\alpha$: the proportion of the overall behavior count accounted for by the behavior count used in this multidimensional combined closeness calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
The rules considered are all the relation rules for $p_1 \to p_2$;
For all rules $r_i$ that exist between $p_1$ and $p_2$, when every $d_{r_i}(p_1, p_2)$ tends to 100%, the total relationship degree also tends to 100%.
The relationship-degree decay function Weaken(d) decays linearly, exponentially, or by half-life. Based on this decay function, for rules that have a time-decay attribute, the relationship degree of $p_1$ and $p_2$ is given by the following expression (formula image not reproduced in this text):
The expression uses the decay function of relation $r_i$. Rules without a decay attribute are not decayed; for rules with a decay attribute, the decay is realized by one of the following algorithms:
Linear decay: $d' = d\,(1 - aT)$, with the curve shown in Fig. 4, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time elapsed between the most recent occurrence of the behavior under the current rule and the present;
Or, exponential decay: $d' = a^{T} \times d$, with the curve shown in Fig. 5, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time elapsed between the most recent occurrence of the behavior under the current rule and the present.
Further, taking hotel lodging data and internet cafe data as an example, a two-dimensional combined closeness is calculated below. A and B have shared a room 20 times and gone online together 20 times; the closeness of A and B is calculated as follows:
Room-sharing closeness, where b = 2, a = 0.5, α = 0.5.
Going-online-together closeness, where b = 5, a = 0.3, α = 0.5.
Combined closeness, where the room-sharing weight is w = 0.7 and the going-online-together weight is w = 0.3; time decay uses exponential decay with decay scaling factor a = 0.95; the behavior is recent, so the time distance is T = 0.
Thus, when the closeness between persons is calculated by the above algorithm, whether the relation types are single or complex, the result is always a number between 0 and 1, and time decay is taken into account: as time passes, the closeness decreases.
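The worked example can be traced end to end with the parameters given above. The numeric results of the example are not preserved in this text, and the single-relation formula is not available, so the sketch substitutes an assumed saturating form; the values it produces are therefore illustrative and are not the patent's figures. Applying the decay per rule before the weighted average is also an assumption.

```python
import math


def single(count, a, b, alpha):
    # Assumed saturating form; the source formula is not available.
    return 0.0 if count < b else 1.0 - math.exp(-a * alpha * (count - b))


def exp_decay(d, a, t):
    # Exponential decay from the text: d' = a**T * d.
    return (a ** t) * d


# Parameters taken from the example: 20 occurrences of each behavior.
d_room = single(20, a=0.5, b=2, alpha=0.5)
d_online = single(20, a=0.3, b=5, alpha=0.5)

# Decay scaling 0.95 and T = 0 (recent behavior), applied per rule.
decayed = {
    "room_sharing": exp_decay(d_room, a=0.95, t=0),
    "co_online": exp_decay(d_online, a=0.95, t=0),
}

# Weighted average with the weights from the example.
weights = {"room_sharing": 0.7, "co_online": 0.3}
combined = sum(weights[k] * decayed[k] for k in weights) / sum(weights.values())
```

Because $T = 0$, the decay factor is $0.95^0 = 1$ and the combined value is simply the weighted average of the two single-relation degrees.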
Embodiment 2:
A method for obtaining multidimensional interpersonal relationship closeness from public security data is implemented as follows:
First, data on the related persons is obtained and their single-relation closeness is calculated, that is, the closeness between the related persons is obtained from one behavior relation.
Then the multidimensional combined closeness of the related persons is calculated, that is, the closeness between the related persons is obtained from several behavior relations.
For the related persons, if no behavior relation is found within a period of time, the closeness gradually decays with the passage of time until it disappears, and a relationship-degree decay function is finally obtained. Specifically, for individuals $p_1$ and $p_2$, if no $p_1 \to p_2$ behavior is found within a period of time, the relationship degree gradually decays with the passage of time until it disappears. For example, for the same number of going-online-together behaviors between $p_1$ and $p_2$, recent behavior has more influence on their relationship degree than behavior from 10 years ago. The relationship-degree decay function Weaken(d) can decay linearly, exponentially, by half-life, or in a similar manner. Based on this decay function, the closeness between the related persons is recalculated, so that their relationship is obtained accurately.
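The text names half-life decay as one option without giving a formula for it. A common way to express it, used here as an assumption, is to halve the relationship degree for every `half_life` units of elapsed time; the `half_life` parameter is hypothetical and does not appear in the source.

```python
def half_life_decay(d, half_life, t):
    """Assumed half-life form: the degree halves every `half_life` units of time."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return d * 0.5 ** (t / half_life)


# Same starting closeness, behavior seen just now versus 10 time units ago.
recent = half_life_decay(0.8, half_life=5, t=0)
decade_old = half_life_decay(0.8, half_life=5, t=10)
```

This is the exponential rule $d' = a^{T} \times d$ with $a = 0.5^{1/\text{half\_life}}$, so the two options differ only in how the decay rate is parameterized.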
The data on related persons is obtained from public security system data, and data acquisition is implemented on the following architecture: the bottom layer uses the Spark on Yarn architecture mode, with HDFS as storage, Spark as the computing framework, and Flume and Sqoop as the data extraction tools. Accordingly, a Zookeeper cluster, a Hadoop cluster, and a Spark cluster must first be set up, and tools such as Flume and Sqoop installed. Internal public security data, including hotel lodging, internet cafe usage, permanent residents, temporary residents, and suspects' mobile phone contacts, is then extracted into Hadoop's HDFS, with preliminary cleaning during extraction to handle null values and invalid data, yielding the data on the related persons.
The intimacy is measured by behavior relations, which include sharing a room, accompanying stay, co-surfing, colleagues, traveling together and same clan, where:
Sharing a room: two people stay in the same room of the same hotel at the same time;
Accompanying stay: two people stay in two rooms of the same hotel at the same time, checking in and checking out simultaneously (time difference within 2 minutes);
Co-surfing: two people are online in the same internet café at the same time, logging on and logging off simultaneously (time difference within 2 minutes);
Colleagues: two people have held positions at the same enterprise or organization during the same period;
Traveling together: two people travel from one place to another at the same time, on the same route and arriving at the same time;
Same clan: the household registration information of the two people belongs to the same clan.
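The two hotel-based relations defined above can be sketched as a small classifier. This is an illustration under assumptions: the dictionary keys (`hotel`, `room`, `check_in`, `check_out`) and the function name are hypothetical, and "at the same time" is interpreted here as overlapping stay intervals.

```python
from datetime import datetime, timedelta

# The description uses a 2-minute tolerance for simultaneous check-in/check-out.
TOLERANCE = timedelta(minutes=2)


def classify_hotel_relation(stay_a, stay_b, tolerance=TOLERANCE):
    """Return 'sharing_a_room', 'accompanying_stay', or None for two hotel stays."""
    if stay_a["hotel"] != stay_b["hotel"]:
        return None
    # Assumption: "at the same time" means the two stay intervals overlap.
    overlap = (stay_a["check_in"] < stay_b["check_out"]
               and stay_b["check_in"] < stay_a["check_out"])
    if not overlap:
        return None
    if stay_a["room"] == stay_b["room"]:
        return "sharing_a_room"
    # Different rooms: require check-in and check-out within the tolerance.
    same_in = abs(stay_a["check_in"] - stay_b["check_in"]) <= tolerance
    same_out = abs(stay_a["check_out"] - stay_b["check_out"]) <= tolerance
    if same_in and same_out:
        return "accompanying_stay"
    return None
```

The co-surfing relation follows the same pattern with an internet café in place of the hotel and log-on/log-off times in place of check-in/check-out.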
The single-relation intimacy is calculated as a function $d = f(c)$ of the behavior count, with the following symbols:
$p_1$ and $p_2$ denote two related persons, and $r_i$ denotes a rule;
$d$ denotes the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$c$ denotes the behavior count of $p_1$ and $p_2$ under rule $r_i$;
$\alpha$ is the percentage of the overall behavior count accounted for by the behavior count used in this single-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
$a$ is the contraction factor of the function $f$; its value is between 0 and 1, and it controls how fast the relationship degree grows with the behavior count;
$b$ is the offset of the function $f$; it controls the offset of the relationship degree with respect to the behavior count, so that the relationship degree is only calculated once the behavior count reaches the threshold set by $b$; its value is an integer between 1 and 100;
$f$ is the function $d = f(c)$, i.e. the intimacy $d$ as a function of the behavior count $c$; as the behavior count $c$ tends to infinity, the relationship degree $d$ of $p_1$ and $p_2$ under rule $r_i$ tends to 100%, i.e. $\lim_{c \to \infty} f(c) = 100\%$.
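The stated properties of the single-relation function (no score below the offset b, growth speed controlled by a, saturation toward 100%) can be illustrated with a stand-in curve. The sketch below is not the formula of this method: the function name, the curve $1 - a^{c-b+1}$, the use of "count below b gives 0" and the omission of α are all assumptions chosen only so that the example satisfies those properties.

```python
def single_relation_intimacy(count, a=0.5, b=1):
    """Stand-in saturating curve (an assumption, not the patent's formula).

    count: number of observed behaviors under one rule
    a:     contraction factor in (0, 1); smaller values saturate faster
    b:     offset; counts below b yield a relationship degree of 0
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    if count < b:
        return 0.0
    # 1 - a**k rises monotonically toward 1 as k grows.
    return 1.0 - a ** (count - b + 1)
```

Any other monotone curve bounded by 1, such as a shifted logistic function, would serve the same illustrative purpose.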
The multidimensional combined-relation intimacy is calculated as a weighted average of the single-relation relationship degrees, with the following symbols:
$p_1$, $p_2$: two related persons; $r_i$: a rule;
the total relationship degree of $p_1$ and $p_2$ is the result of the calculation;
the relationship degree of $p_1$ and $p_2$ under rule $r_i$ is the input for each rule;
$w_i$: the weight of rule $r_i$, $w_i \in \mathbb{R}^+$;
$\alpha$: the percentage of the overall behavior count accounted for by the behavior count used in this multidimensional combined-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
the rule set consists of all relation rules for $p_1 \to p_2$;
for all rules that exist between $p_1$ and $p_2$, when the relationship degree under every rule tends to 100%, the total relationship degree also tends to 100%.
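A weighted average over the per-rule relationship degrees can be sketched as follows. The function name, the dictionary-based inputs and the example rule names are hypothetical, and the role of α in the combined formula is not recoverable from this text, so it is left out of the sketch.

```python
def combined_intimacy(degrees, weights):
    """Weighted average of per-rule relationship degrees.

    degrees: {rule_name: relationship degree in [0, 1]}
    weights: {rule_name: positive weight w_i}
    """
    if not degrees:
        raise ValueError("at least one rule is required")
    total_weight = sum(weights[rule] for rule in degrees)
    if total_weight <= 0:
        raise ValueError("weights must be positive")
    weighted_sum = sum(weights[rule] * d for rule, d in degrees.items())
    return weighted_sum / total_weight
```

Because the result is an average, it stays between the smallest and largest per-rule degree, so it tends to 100% when every per-rule degree does.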
The relationship-degree decay function Weaken(d) decays linearly, exponentially or by half-life. Based on this decay function, for rules that have the time-decay attribute, the relationship degree of $p_1$ and $p_2$ is obtained by applying the decay function of each relation $r_i$ to the relationship degree under that rule.
For rules without the decay attribute, the relationship degree is left unchanged; for rules with the decay attribute, one of the following algorithms is used:
Linear decay: $d' = d(1 - aT)$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule;
Or, exponential decay: $d' = a^{T} \times d$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule.
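The two decay rules can be written directly from their formulas. In this sketch the function names are hypothetical, the unit of T is left to the caller, and clamping the linear rule at 0 is an assumption based on the statement that the relationship degree decays until it disappears. The half-life variant is mentioned in the text without a formula; the standard form $d' = d \cdot 0.5^{T/h}$ is shown as an assumption.

```python
def linear_decay(d, a, t):
    """Linear decay d' = d * (1 - a*t), clamped at 0 (clamp is an assumption)."""
    return max(0.0, d * (1.0 - a * t))


def exponential_decay(d, a, t):
    """Exponential decay d' = a**t * d, with 0 < a < 1."""
    return (a ** t) * d


def half_life_decay(d, half_life, t):
    """Standard half-life form (assumed): the degree halves every `half_life`."""
    return d * 0.5 ** (t / half_life)
```

For example, with a = 0.1 and T = 5 the linear rule halves a relationship degree, while with a = 0.5 and T = 2 the exponential rule reduces it to one quarter.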
Therefore, the intimacy between people is calculated by the above algorithm: whether the relations are of a single kind or complex, a number between 0 and 1 is always obtained in the end, and because time decay is taken into account, the intimacy decreases as time passes.
After the relationship-degree decay function is obtained and the intimacy of the persons is determined, the method further includes a step of building a personal relationship network, in which the network is built from the existing data using the Spark GraphX tool. In its initial phase the system needs a long time to calculate the intimacy over the historical data; afterwards the intimacy is calculated daily on the incremental data. The relationship network graph of any person can be searched in the system, and a list of the persons related to that person is obtained (ranked by intimacy from high to low).
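The lookup of related persons ranked by intimacy can be sketched with an in-memory adjacency map. This is a plain-Python stand-in, not Spark GraphX; the function names are hypothetical, and treating the relation as symmetric is an assumption.

```python
from collections import defaultdict


def build_network(pair_scores):
    """Build an adjacency map from {(person_a, person_b): intimacy}."""
    network = defaultdict(dict)
    for (person_a, person_b), score in pair_scores.items():
        # Assumption: intimacy is symmetric, so store the edge in both directions.
        network[person_a][person_b] = score
        network[person_b][person_a] = score
    return network


def related_persons(network, person):
    """Return (related person, intimacy) pairs, highest intimacy first."""
    neighbours = network.get(person, {})
    return sorted(neighbours.items(), key=lambda item: item[1], reverse=True)
```

A daily incremental run would update the scores of the affected pairs and rebuild or patch the adjacency map before the ranked list is queried.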
As shown in Figure 1, a system for obtaining multidimensional person-relationship intimacy based on public security data comprises:
A data acquisition module, for obtaining the data on related persons from public security system data, the data being internal public security data including hotel stays, internet café sessions, permanent residents, temporary residents and suspects' mobile phone contacts; during data acquisition the module also performs preliminary cleaning of the data to handle null values and invalid data;
A single-relation intimacy calculation module, for calculating the single-relation intimacy from the data obtained by the data acquisition module, i.e. obtaining the closeness between related persons from a single behavior relation, the behavior relations including sharing a room, accompanying stay, co-surfing, colleagues, traveling together and same clan;
A multidimensional combined-relation intimacy calculation module, which combines multiple behavior relations in the calculation in order to assess comprehensively the closeness between related persons;
A relationship-degree decay calculation module, for calculating the decay function that results from the passage of time between related persons and calculating the relationship degree between them based on the decay function, the decay function decaying linearly, exponentially or by half-life.
The single-relation intimacy calculation module calculates the intimacy as a function $d = f(c)$ of the behavior count, with the following symbols:
$p_1$ and $p_2$ denote two related persons, and $r_i$ denotes a rule;
$d$ denotes the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$c$ denotes the behavior count of $p_1$ and $p_2$ under rule $r_i$;
$\alpha$ is the percentage of the overall behavior count accounted for by the behavior count used in this single-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
$a$ is the contraction factor of the function $f$; its value is between 0 and 1, and it controls how fast the relationship degree grows with the behavior count;
$b$ is the offset of the function $f$; it controls the offset of the relationship degree with respect to the behavior count, so that the relationship degree is only calculated once the behavior count reaches the threshold set by $b$; its value is an integer between 1 and 100;
$f$ is the function $d = f(c)$, i.e. the intimacy $d$ as a function of the behavior count $c$; as the behavior count $c$ tends to infinity, the relationship degree $d$ of $p_1$ and $p_2$ under rule $r_i$ tends to 100%, i.e. $\lim_{c \to \infty} f(c) = 100\%$.
The multidimensional combined-relation intimacy calculation module calculates the intimacy as a weighted average of the single-relation relationship degrees, with the following symbols:
$p_1$, $p_2$: two related persons; $r_i$: a rule;
the total relationship degree of $p_1$ and $p_2$ is the result of the calculation;
the relationship degree of $p_1$ and $p_2$ under rule $r_i$ is the input for each rule;
$w_i$: the weight of rule $r_i$, $w_i \in \mathbb{R}^+$;
$\alpha$: the percentage of the overall behavior count accounted for by the behavior count used in this multidimensional combined-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
the rule set consists of all relation rules for $p_1 \to p_2$;
for all rules that exist between $p_1$ and $p_2$, when the relationship degree under every rule tends to 100%, the total relationship degree also tends to 100%.
The relationship-degree decay calculation module calculates the relationship degree by applying the decay function of each relation $r_i$ to the relationship degree under that rule.
For rules without the decay attribute, the relationship degree is left unchanged; for rules with the decay attribute, one of the following algorithms is used:
Linear decay: $d' = d(1 - aT)$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule;
Or, exponential decay: $d' = a^{T} \times d$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule.
The system also includes a UI display module. After the relationship-degree decay function is obtained and the intimacy of the persons is determined, the UI display module builds the personal relationship network from the existing data using a graph visualization tool. After the intimacy has been calculated over the historical data, the intimacy is calculated daily on the incremental data, and a list of the persons related to a given person is obtained, ranked by intimacy from high to low; the ranking is then displayed on the UI.
The visualization tool is the ECharts charting tool.
The embodiments in this specification are described progressively; each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be referred to one another. Since the device disclosed in an embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant details can be found in the description of the method.
Those skilled in the art will further appreciate that the modules and corresponding mathematical calculation steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To illustrate clearly the interchangeability of hardware and software, the composition and steps of each example have been described above in general terms of their function. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled persons may implement the described functions in different ways for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The method steps and structural modules described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), main memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The above specific embodiments are only specific cases of the present invention. The scope of patent protection of the present invention includes but is not limited to the above specific embodiments; any appropriate change or replacement made by a person of ordinary skill in the art that is consistent with the claims of the method and system of the present invention for obtaining multidimensional person-relationship intimacy based on public security data shall fall within the scope of patent protection of the present invention.

Claims (10)

1. A method for obtaining multidimensional person-relationship intimacy based on public security data, characterized in that its implementation process is:
First, the data of the related persons is obtained and the single-relation intimacy between them is calculated, i.e. the closeness of the relationship is expressed through a single behavior;
Then, a weighted average algorithm is applied to the various single-relation intimacies to calculate the multidimensional combined-relation intimacy, i.e. the closeness of the relationship is expressed through multiple behaviors;
For the related persons, if no behavior relation is found within a period of time, the intimacy gradually declines over time until it disappears, which yields a relationship-degree decay function that decays linearly, exponentially or by half-life; based on this decay function, the intimacy between the related persons is recalculated, so that the relationship between them is obtained accurately.
2. The method for obtaining multidimensional person-relationship intimacy based on public security data according to claim 1, characterized in that the data on related persons is obtained from public security system data, the data acquisition being implemented on an architecture of a Zookeeper cluster, a Hadoop cluster and a Spark cluster: the bottom layer uses the Spark on Yarn architecture mode, with HDFS as storage, Spark as the computing framework, and Flume and Sqoop as the data extraction tools; then the internal public security data, including hotel stays, internet café sessions, permanent residents, temporary residents and suspects' mobile phone contacts, is extracted into the HDFS of Hadoop, with preliminary cleaning performed during extraction to handle null values and invalid data, thereby obtaining the data of the related persons.
3. The method for obtaining multidimensional person-relationship intimacy based on public security data according to claim 1, characterized in that the intimacy is measured by behavior relations, which include sharing a room, accompanying stay, co-surfing, colleagues, traveling together and same clan, where:
Sharing a room: the related persons stay in the same room of the same hotel at the same time;
Accompanying stay: the related persons stay in two rooms of the same hotel at the same time, checking in and checking out simultaneously, i.e. with a time difference within N minutes, where N is less than or equal to 10;
Co-surfing: the related persons are online in the same internet café at the same time, logging on and logging off simultaneously, i.e. with a time difference within N minutes, where N is less than or equal to 10;
Colleagues: the related persons have held positions at the same enterprise or organization during the same period;
Traveling together: the related persons travel from one place to another at the same time, on the same route and arriving at the same time;
Same clan: the household registration information of the related persons belongs to the same clan.
4. The method for obtaining multidimensional person-relationship intimacy based on public security data according to claim 3, characterized in that the single-relation intimacy is calculated as a function $d = f(c)$ of the behavior count, with the following symbols:
$p_1$ and $p_2$ denote two related persons, and $r_i$ denotes a rule;
$d$ denotes the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$c$ denotes the behavior count of $p_1$ and $p_2$ under rule $r_i$;
$\alpha$ is the percentage of the overall behavior count accounted for by the behavior count used in this single-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
$a$ is the contraction factor of the function $f$; its value is between 0 and 1, and it controls how fast the relationship degree grows with the behavior count;
$b$ is the offset of the function $f$; it controls the offset of the relationship degree with respect to the behavior count, so that the relationship degree is only calculated once the behavior count reaches the threshold set by $b$; its value is an integer between 1 and 100;
$f$ is the function $d = f(c)$, i.e. the intimacy $d$ as a function of the behavior count $c$; as the behavior count $c$ tends to infinity, the relationship degree $d$ of $p_1$ and $p_2$ under rule $r_i$ tends to 100%, i.e. $\lim_{c \to \infty} f(c) = 100\%$.
5. The method for obtaining multidimensional person-relationship intimacy based on public security data according to any one of claims 1-4, characterized in that the multidimensional combined-relation intimacy is calculated as a weighted average of the single-relation relationship degrees, with the following symbols:
$p_1$, $p_2$: two related persons; $r_i$: a rule;
the total relationship degree of $p_1$ and $p_2$ is the result of the calculation;
the relationship degree of $p_1$ and $p_2$ under rule $r_i$ is the input for each rule;
$w_i$: the weight of rule $r_i$, $w_i \in \mathbb{R}^+$;
$\alpha$: the percentage of the overall behavior count accounted for by the behavior count used in this multidimensional combined-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
the rule set consists of all relation rules for $p_1 \to p_2$;
for all rules that exist between $p_1$ and $p_2$, when the relationship degree under every rule tends to 100%, the total relationship degree also tends to 100%.
6. The method for obtaining multidimensional person-relationship intimacy based on public security data according to claim 5, characterized in that the relationship-degree decay function Weaken(d) decays linearly, exponentially or by half-life; based on this decay function, for rules that have the time-decay attribute, the relationship degree of $p_1$ and $p_2$ is obtained by applying the decay function of each relation $r_i$ to the relationship degree under that rule.
For rules without the decay attribute, the relationship degree is left unchanged; for rules with the decay attribute, one of the following algorithms is used:
Linear decay: $d' = d(1 - aT)$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule;
Or, exponential decay: $d' = a^{T} \times d$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule.
7. The method for obtaining multidimensional person-relationship intimacy based on public security data according to claim 1, characterized in that, after the relationship-degree decay function is obtained and the intimacy of the persons is determined, the method further includes a step of building a personal relationship network, in which the network is built from the existing data using a graph visualization tool; that is, after the intimacy has been calculated over the historical data, the intimacy is calculated daily on the incremental data, and a list of the persons related to a given person is obtained, ranked by intimacy from high to low.
8. A system for obtaining multidimensional person-relationship intimacy based on public security data, characterized in that its structure comprises:
A data acquisition module, for obtaining the data on related persons from public security system data, the data being internal public security data including hotel stays, internet café sessions, permanent residents, temporary residents and suspects' mobile phone contacts; during data acquisition the module also performs preliminary cleaning of the data to handle null values and invalid data;
A single-relation intimacy calculation module, for calculating the single-relation intimacy from the data obtained by the data acquisition module, i.e. obtaining the closeness between related persons from a single behavior relation, the behavior relations including sharing a room, accompanying stay, co-surfing, colleagues, traveling together and same clan;
A multidimensional combined-relation intimacy calculation module, which combines multiple behavior relations in the calculation in order to assess comprehensively the closeness between related persons;
A relationship-degree decay calculation module, for calculating the decay function that results from the passage of time between related persons and calculating the relationship degree between them based on the decay function, the decay function decaying linearly, exponentially or by half-life.
9. The system for obtaining multidimensional person-relationship intimacy based on public security data according to claim 8, characterized in that the single-relation intimacy calculation module calculates the intimacy as a function $d = f(c)$ of the behavior count, with the following symbols:
$p_1$ and $p_2$ denote two related persons, and $r_i$ denotes a rule;
$d$ denotes the relationship degree of $p_1$ and $p_2$ under rule $r_i$;
$c$ denotes the behavior count of $p_1$ and $p_2$ under rule $r_i$;
$\alpha$ is the percentage of the overall behavior count accounted for by the behavior count used in this single-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
$a$ is the contraction factor of the function $f$; its value is between 0 and 1, and it controls how fast the relationship degree grows with the behavior count;
$b$ is the offset of the function $f$; it controls the offset of the relationship degree with respect to the behavior count, so that the relationship degree is only calculated once the behavior count reaches the threshold set by $b$; its value is an integer between 1 and 100;
$f$ is the function $d = f(c)$, i.e. the intimacy $d$ as a function of the behavior count $c$; as the behavior count $c$ tends to infinity, the relationship degree $d$ of $p_1$ and $p_2$ under rule $r_i$ tends to 100%, i.e. $\lim_{c \to \infty} f(c) = 100\%$.
The multidimensional combined-relation intimacy calculation module calculates the intimacy as a weighted average of the single-relation relationship degrees, with the following symbols:
$p_1$, $p_2$: two related persons; $r_i$: a rule;
the total relationship degree of $p_1$ and $p_2$ is the result of the calculation;
the relationship degree of $p_1$ and $p_2$ under rule $r_i$ is the input for each rule;
$w_i$: the weight of rule $r_i$, $w_i \in \mathbb{R}^+$;
$\alpha$: the percentage of the overall behavior count accounted for by the behavior count used in this multidimensional combined-relation intimacy calculation; when the overall behavior count cannot be obtained, $\alpha$ is 1;
the rule set consists of all relation rules for $p_1 \to p_2$;
for all rules that exist between $p_1$ and $p_2$, when the relationship degree under every rule tends to 100%, the total relationship degree also tends to 100%.
The relationship-degree decay calculation module calculates the relationship degree by applying the decay function of each relation $r_i$ to the relationship degree under that rule.
For rules without the decay attribute, the relationship degree is left unchanged; for rules with the decay attribute, one of the following algorithms is used:
Linear decay: $d' = d(1 - aT)$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule;
Or, exponential decay: $d' = a^{T} \times d$, where
$d$: the relationship degree under the original rule $i$;
$a$: the scaling factor applied to $T$, with a value between 0 and 1;
$T$: the time difference between now and the most recent occurrence of the behavior under the current rule.
10. The system for obtaining multidimensional person-relationship intimacy based on public security data according to claim 8 or 9, characterized in that the system also includes a UI display module; after the relationship-degree decay function is obtained and the intimacy of the persons is determined, the UI display module builds the personal relationship network from the existing data using a graph visualization tool; after the intimacy has been calculated over the historical data, the intimacy is calculated daily on the incremental data, and a list of the persons related to a given person is obtained, ranked by intimacy from high to low; the ranking is then displayed on the UI.
CN201710054364.1A 2017-01-24 2017-01-24 A kind of method and system based on the public security data acquisition intimate degree of multidimensional personnel Pending CN106844673A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710054364.1A CN106844673A (en) 2017-01-24 2017-01-24 A kind of method and system based on the public security data acquisition intimate degree of multidimensional personnel

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710054364.1A CN106844673A (en) 2017-01-24 2017-01-24 A kind of method and system based on the public security data acquisition intimate degree of multidimensional personnel

Publications (1)

Publication Number Publication Date
CN106844673A (en) 2017-06-13

Family

ID=59120545

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710054364.1A Pending CN106844673A (en) 2017-01-24 2017-01-24 A kind of method and system based on the public security data acquisition intimate degree of multidimensional personnel

Country Status (1)

Country Link
CN (1) CN106844673A (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874381A (en) * 2017-01-09 2017-06-20 重庆邮电大学 A kind of radio environment map datum processing system based on Hadoop
CN110020025A (en) * 2017-09-28 2019-07-16 阿里巴巴集团控股有限公司 A kind of data processing method and device
CN110020025B (en) * 2017-09-28 2022-11-15 阿里巴巴集团控股有限公司 Data processing method and device
CN109361895A (en) * 2017-12-11 2019-02-19 罗普特(厦门)科技集团有限公司 The searching method and system of suspect relationship personnel
CN110109908A (en) * 2017-12-29 2019-08-09 成都蜀信信用服务有限公司 Analysis system and method based on the potential relationship of social base information excavating personage
CN108491226A (en) * 2018-02-05 2018-09-04 西安电子科技大学 Spark based on cluster scaling configures parameter automated tuning method
CN108491226B (en) * 2018-02-05 2021-03-23 西安电子科技大学 Spark configuration parameter automatic tuning method based on cluster scaling
CN109615572A (en) * 2018-11-30 2019-04-12 武汉烽火众智数字技术有限责任公司 The method and system of personnel's cohesion analysis based on big data
CN111680077A (en) * 2020-06-17 2020-09-18 郑州市中之易科技有限公司 Method for determining mutual relation through correlation degree grading and model comparison
CN111680077B (en) * 2020-06-17 2023-10-27 郑州市中之易科技有限公司 Method for determining interrelationship through association degree scoring and model comparison
CN113407594A (en) * 2021-06-18 2021-09-17 重庆紫光华山智安科技有限公司 Fusion relation analysis method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20170613)