CN106815240A - Identity relationship analysis method and system based on mobile MAC - Google Patents
- Publication number
- CN106815240A (application number CN201510859716.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- personal status
- identity
- status relationship
- mac
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/1805—Append-only file systems, e.g. using logs or journals to store data
- G06F16/1815—Journaling file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
Abstract
The present invention relates to an identity relationship analysis method and system based on mobile MAC. The method comprises the following steps: extracting multiple identity data items of a person from network log files, and determining the positions of those identity data items relative to one another according to their priority to obtain identity relationship data, wherein the identity data items include a mobile device MAC and the mobile device MAC has the highest priority; storing the person's identity data and identity relationship data as graph data in the form of directed edges, and traversing the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC. The invention uses massive network log files, extracts the identity data of persons from them, determines identity relationship data according to priority, stores the result as graph data with directed edges, and then builds the portrait of each person with the mobile device MAC as the principal identity, or further analyzes the relationships between persons.
Description
Technical field
The present invention relates to Internet technology, and more specifically to an identity relationship analysis method and system based on mobile MAC.
Background
Traditionally, personal identity relationships are built with the identity card as the principal identity of a person. With the rapid development of the mobile Internet, mobile phones have become ubiquitous, and users carry out network activity through their phones more and more frequently. To safeguard network security or to meet the demand for customized services, personal identity relationships need to be established from identity information obtained over the network. However, there is currently no effective method that can use ever-growing network logs to establish personal identity information.
Summary of the invention
The technical problem to be solved by the present invention is that the prior art lacks an identity relationship analysis method suited to the mobile Internet. To address this defect, the invention provides an identity relationship analysis method and system based on mobile MAC, which can extract identity information from massive user network log files and build identity relationships.
The technical solution adopted by the present invention to solve this technical problem is to construct an identity relationship analysis method based on mobile MAC, comprising the following steps:
S1: extracting multiple identity data items of a person from network log files, and determining the positions of the identity data items relative to one another according to their priority to obtain identity relationship data, wherein the identity data items include a mobile device MAC and the mobile device MAC has the highest priority;
S2: storing the person's identity data and identity relationship data as graph data in the form of directed edges, and traversing the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC.
In the identity relationship analysis method based on mobile MAC according to the present invention, the method further comprises the following step: S3: detecting whether the subgraphs of persons intersect, and determining the friend relationships between person portraits according to the detection result.
In the identity relationship analysis method based on mobile MAC according to the present invention, the method further comprises, before step S1: S0: receiving network log files, processing them in real time with a cleaning program, and pushing the valid data to a distributed message queue.
In the identity relationship analysis method based on mobile MAC according to the present invention, step S1 further comprises:
S11: extracting multiple identity data items of a person from network log files, and then performing steps S12 and S13 in parallel;
S12: performing local deduplication of the identity data in memory, then performing global deduplication of the identity data through a distributed cache service, and finally persisting the identity data to a distributed file system;
S13: determining the positions of the identity data items relative to one another according to their priority to obtain identity relationship data; then performing local deduplication of the identity relationship data in memory and global deduplication through the distributed cache service; and finally persisting the identity relationship data to the distributed file system.
In the identity relationship analysis method based on mobile MAC according to the present invention, step S2 further comprises:
S21: loading the data from the distributed file system, wherein the identity data are loaded as the vertices of the graph and the identity relationship data are loaded as the edges of the graph; identity relationship data containing a mobile device MAC are loaded as one edge, and identity relationship data not containing a mobile device MAC are expanded into two edges; the overall graph is initialized from the vertex set and the edge set;
S22: filtering out problematic data by the out-degree of the mobile device MAC and, taking each mobile device MAC as a starting point, traversing the maximal subgraph to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC;
S23: persisting the identity data and identity relationship data, and simplifying the graph data through relation conversion and weight conversion, wherein the relation conversion converts indirect relations in the identity relationship data into direct relations, and the weight conversion first calculates the maximum level n of the subgraph and then calculates the new weight by the following formula: [formula not reproduced], where G_k denotes the weight of the k-th level.
The present invention also provides an identity relationship analysis system based on mobile MAC, comprising:
an identity extraction module, for extracting multiple identity data items of a person from network log files, and determining the positions of the identity data items relative to one another according to their priority to obtain identity relationship data, wherein the identity data items include a mobile device MAC and the mobile device MAC has the highest priority;
a portrait construction module, for storing the person's identity data and identity relationship data as graph data in the form of directed edges, and traversing the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC.
In the identity relationship analysis system based on mobile MAC according to the present invention, the system further comprises: a person relationship analysis module, for detecting whether the subgraphs of persons intersect and determining the friend relationships between person portraits according to the detection result.
In the identity relationship analysis system based on mobile MAC according to the present invention, the system further comprises: a log receiving module, for receiving network log files, processing them in real time with a cleaning program, and pushing the valid data to a distributed message queue for the identity extraction module.
In the identity relationship analysis system based on mobile MAC according to the present invention, the identity extraction module further comprises:
an identity data extraction unit, for extracting multiple identity data items of a person from network log files;
an identity data processing unit, for performing local deduplication of the identity data in memory, then performing global deduplication of the identity data through a distributed cache service, and finally persisting the identity data to a distributed file system;
an identity relationship data processing unit, for determining the positions of the identity data items relative to one another according to their priority to obtain identity relationship data; then performing local deduplication of the identity relationship data in memory and global deduplication through the distributed cache service; and finally persisting the identity relationship data to the distributed file system.
In the identity relationship analysis system based on mobile MAC according to the present invention, the portrait construction module further comprises:
a data loading unit, for loading the data from the distributed file system, wherein the identity data are loaded as the vertices of the graph and the identity relationship data are loaded as the edges of the graph; identity relationship data containing a mobile device MAC are loaded as one edge, and identity relationship data not containing a mobile device MAC are expanded into two edges; the overall graph is initialized from the vertex set and the edge set;
a portrait construction unit, for filtering out problematic data by the out-degree of the mobile device MAC and, taking each mobile device MAC as a starting point, traversing the maximal subgraph to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC;
a data simplification unit, for persisting the identity data and identity relationship data, and simplifying the graph data through relation conversion and weight conversion, wherein the relation conversion converts indirect relations in the identity relationship data into direct relations, and the weight conversion first calculates the maximum level n of the subgraph and then calculates the new weight by the following formula: [formula not reproduced], where G_k denotes the weight of the k-th level.
Implementing the identity relationship analysis method and system based on mobile MAC of the present invention has the following beneficial effects: the invention uses massive network log files, extracts the identity data of persons from them, determines identity relationship data according to priority, stores the result as graph data with directed edges, and then builds the portrait of each person with the mobile device MAC as the principal identity, or further analyzes the relationships between persons.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments. In the drawings:
Fig. 1 is a flowchart of the first embodiment of the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 2 is a flowchart of the second embodiment of the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 3 is a detailed flowchart of the identity extraction step in the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 4 is a detailed flowchart of the portrait construction step in the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 5 is a block diagram of the first embodiment of the identity relationship analysis system based on mobile MAC according to the present invention;
Fig. 6 is a block diagram of the second embodiment of the identity relationship analysis system based on mobile MAC according to the present invention;
Fig. 7 is a detailed block diagram of the identity extraction module in the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 8 is a detailed block diagram of the portrait construction module in the identity relationship analysis system based on mobile MAC according to the present invention.
Detailed description of the embodiments
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments.
Refer to Fig. 1, which is a flowchart of the first embodiment of the identity relationship analysis method based on mobile MAC according to the present invention. The method provided by this embodiment mainly comprises the following steps:
First, the identity extraction step is performed in step S1: multiple identity data items of a person are extracted from network log files, and the positions of those identity data items relative to one another are determined according to their priority to obtain identity relationship data. The identity data items include the mobile device MAC, i.e., the physical address of the mobile device, and the mobile device MAC has the highest priority.
In a preferred embodiment of the present invention, the aforementioned identity data refer to two or more identity data items, which fall into two classes: real-name identities and virtual identities. Real-name identities include, but are not limited to: mobile device MAC, mobile phone number, IMEI (International Mobile Equipment Identity), identity card number, Hong Kong and Macau travel permit, passport, and military officer card; their number is limited. The mobile device MAC may be the physical address of a mobile device such as a mobile phone. Virtual identities are network virtual accounts, including but not limited to QQ numbers, WeChat IDs, and Taobao accounts; their number is limited.
The network log files used in the present invention fall into two classes. The first class consists of logs that must contain a mobile device MAC: each record must contain a mobile device MAC, together with one or more real-name identities (other than the mobile device MAC) and zero or one virtual identity. The second class consists of logs that may contain a mobile device MAC: each record contains one or more real-name identities (the mobile device MAC may be absent) and zero or one virtual identity.
In this step, real-name identities have higher priority than virtual identities, and among real-name identities the mobile device MAC has the highest priority. In a preferred embodiment of the present invention, the priority of real-name identities from high to low is: mobile device MAC, mobile phone number, identity card number, IMEI, Hong Kong and Macau travel permit, passport, military officer card, and so on. It should be understood that, although a specific embodiment of real-name identity priority is given here, a person of ordinary skill in the art can set the specific priority among the real-name identities according to actual needs.
Identity relationship data in the present invention refer to the positional relationship between two identity data items, in which the identity data item with higher priority is placed on the left side of the relation and the one with lower priority is placed on the right side. For example, if the same log record contains real-name identity A and real-name identity B, and A has higher priority than B, then the extracted identity data are real-name identity A and real-name identity B, and the extracted identity relationship data are that A points to B. If the same log record contains real-name identity A and virtual identity C, then the extracted identity data are real-name identity A and virtual identity C, and the extracted identity relationship data are that A points to C. If the same log record contains real-name identity A, real-name identity B, and virtual identity C, then the extracted identity data are A, B, and C, and the extracted identity relationship data are that A points to B and B points to C.
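The priority rule above can be illustrated with a minimal Python sketch. The type labels (`mac`, `phone`, and so on), the `(kind, value)` tuple layout, and the function name `extract_relations` are hypothetical and chosen only for illustration; the ordering follows the preferred embodiment, with all virtual identities ranked last.

```python
# Hypothetical type labels, ordered from highest to lowest priority
# as in the preferred embodiment; "virtual" covers all virtual accounts.
PRIORITY = ["mac", "phone", "id_card", "imei", "hk_macau_permit",
            "passport", "officer_card", "virtual"]
RANK = {kind: position for position, kind in enumerate(PRIORITY)}


def extract_relations(identities):
    """Turn the identities found in one log record into directed relations.

    identities: iterable of (kind, value) pairs.
    Returns a list of (higher_priority, lower_priority) pairs that chain
    neighbouring identities, e.g. A -> B and B -> C for A, B, C.
    """
    ordered = sorted(set(identities), key=lambda item: (RANK[item[0]], item[1]))
    return list(zip(ordered, ordered[1:]))


record = [("virtual", "qq:10001"),
          ("phone", "13800000000"),
          ("mac", "AA:BB:CC:DD:EE:FF")]
relations = extract_relations(record)
for left, right in relations:
    print(left, "->", right)
```

For the sample record, the MAC points to the phone number and the phone number points to the virtual account, which mirrors the A, B, C case described above.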
Then, the portrait construction step is performed in step S2: the identity data and identity relationship data of the person extracted in step S1 are stored as graph data in the form of directed edges, and the maximal subgraph is traversed starting from the person's mobile device MAC to obtain the person's portrait. Each subgraph has one and only one mobile device MAC.
In this step, the extracted identity data can be loaded as the vertices of the graph, and the identity relationship data as the edges (directed edges) of the graph. Identity relationship data containing a mobile device MAC are loaded as one edge, whereas identity relationship data not containing a mobile device MAC are expanded into two edges. For example, the relation from a mobile phone number to an identity card number becomes two edges: phone number to identity card number, and identity card number to phone number. The overall graph is then initialized from the vertex set and the edge set. Next, the maximal subgraph is traversed starting from the person's mobile device MAC, where each subgraph has one and only one mobile device MAC. The subgraph obtained in this way is the person's portrait. Under normal circumstances, if a person in reality has multiple mobile device MACs, then a subgraph traversed from one of them may contain multiple mobile device MACs; because of complex false relations among the data, the person's subgraph may then contain identity data of other persons, which affects the accuracy of the data. Therefore, the person portrait subgraph defined in the present invention has one and only one mobile device MAC: when a traversal path encounters another mobile device MAC, data collection along that path stops. As noted above, when a person has multiple mobile device MACs, the method of the invention traverses one subgraph for each mobile device MAC, so such a real person corresponds to multiple person portraits.
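The edge expansion and the bounded traversal described above can be sketched in plain Python as follows. This is only an illustration under assumptions: vertices are hypothetical `(kind, value)` tuples, the graph is an in-memory adjacency dictionary rather than a distributed graph, and breadth-first search is used as one possible traversal order.

```python
from collections import defaultdict, deque


def is_mac(vertex):
    return vertex[0] == "mac"


def build_graph(relations):
    """Load relations as directed edges.

    A relation that contains a MAC stays a single edge; a relation without
    a MAC is expanded into two edges, one in each direction.
    """
    graph = defaultdict(set)
    for src, dst in relations:
        graph[src].add(dst)
        if not (is_mac(src) or is_mac(dst)):
            graph[dst].add(src)
    return graph


def portrait(graph, start_mac):
    """Collect every vertex reachable from start_mac without entering another MAC."""
    seen = {start_mac}
    queue = deque([start_mac])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, ()):
            if neighbour in seen or is_mac(neighbour):
                continue  # stop collecting along a path that reaches another MAC
            seen.add(neighbour)
            queue.append(neighbour)
    return seen


mac1 = ("mac", "AA:AA:AA:AA:AA:01")
mac2 = ("mac", "AA:AA:AA:AA:AA:02")
phone = ("phone", "13800000000")
id_card = ("id_card", "ID-0001")

graph = build_graph([(mac1, phone), (phone, id_card), (mac2, phone)])
portrait1 = portrait(graph, mac1)
portrait2 = portrait(graph, mac2)
```

In this sample, both portraits contain the shared phone number and identity card number, but each contains only its own starting MAC.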
Refer to Fig. 2, which is a flowchart of the second embodiment of the identity relationship analysis method based on mobile MAC according to the present invention. On the basis of the first embodiment, the method provided by this embodiment may further include step S0, performed before step S1, in which network log files are collected and cleaned. Specifically, network log files can be received and processed in real time by a cleaning program, which validates each type of log and pushes the valid data to a distributed message queue (Kafka). The invention can collect logs via Representational State Transfer (REST), File Transfer Protocol (FTP), and similar means.
In a more preferred embodiment of the present invention, the method may further include step S3, which detects whether the subgraphs of persons obtained in step S2 intersect and determines the friend relationships between person portraits according to the detection result. Specifically, a graph intersection operation can be performed on the subgraphs, i.e., portraits, of the persons. If the portraits of any two persons intersect, a friend relationship can be judged to exist between the portraits; otherwise no friend relationship is judged to exist. The friend relationship between person portraits in the present invention covers two cases: in one, the two real persons corresponding to the portraits are friends; in the other, the two portraits belong to one real person. When the same real person has multiple mobile device MACs, the corresponding subgraphs may intersect, so the portraits corresponding to these subgraphs can be determined to belong to the same person in reality. The invention can distinguish these two cases through further analysis.
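The intersection test can be sketched in a few lines of Python, assuming each portrait is represented as a set of hypothetical `(kind, value)` vertices. The function names are illustrative, and the sketch does not attempt the further analysis needed to tell the two cases apart.

```python
def common_identities(portrait_a, portrait_b):
    """Identities shared by two portraits (each portrait is a set of vertices)."""
    return portrait_a & portrait_b


def possibly_related(portrait_a, portrait_b):
    """True when the two portraits intersect, i.e. a friend relation is judged to exist."""
    return bool(common_identities(portrait_a, portrait_b))


phone_portrait = {("mac", "AA:AA:AA:AA:AA:01"),
                  ("phone", "13800000000"),
                  ("virtual", "qq:10001")}
tablet_portrait = {("mac", "AA:AA:AA:AA:AA:02"),
                   ("phone", "13800000000")}
other_portrait = {("mac", "BB:BB:BB:BB:BB:01"),
                  ("phone", "13900000000")}

print(possibly_related(phone_portrait, tablet_portrait))
print(possibly_related(phone_portrait, other_portrait))
```

The first pair shares a phone number, which is consistent with either two friends or one person with two devices; the second pair shares nothing.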
In a more preferred embodiment of the present invention, the method may also include a retrieval service step, which searches the persisted identity data or identity relationship data according to user input to obtain the corresponding person data. For example, the person's subgraph can be looked up from a mobile device MAC entered by the user, yielding all identity data of that person. Alternatively, a QQ number entered by the user can be used to check whether the person's subgraph intersects other subgraphs, yielding the friend relationships of that person.
Refer to Fig. 3, which is a detailed flowchart of the identity extraction step in the identity relationship analysis method based on mobile MAC according to the present invention. Identity extraction step S1 preferably uses Spark stream processing, for example processing a batch every 30 seconds. As shown in Fig. 3, identity extraction step S1 in the present invention may further include:
First, in step S11, multiple identity data items of a person are extracted from network log files. Specifically, all identity data can be extracted from each log record; the identities include real-name identities and virtual identities.
Then, in step S12, local deduplication of the identity data is performed in memory, followed by global deduplication through a distributed cache service (Redis); newly added data are passed to the next stage, and the number of occurrences of each identity data item is recorded. Finally, the identity data are persisted to a distributed file system (HDFS).
Finally, in step S13, the positions of the identity data items relative to one another are determined according to their priority to obtain identity relationship data. Local deduplication of the identity relationship data is then performed in memory, followed by global deduplication through the distributed cache service; newly added data are passed to the next stage, and the number of occurrences of each identity relationship data item is recorded. Finally, the identity relationship data are persisted to the distributed file system (HDFS) for subsequent graph data analysis. In the present invention, offline analysis and statistics can be carried out with Impala. In a preferred implementation, this step can also persist the identity data and identity relationship data to a full-text search engine (Elasticsearch) for real-time retrieval.
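The two-level deduplication in steps S12 and S13 can be sketched as below. The `GlobalCache` class is a hypothetical in-process stand-in for the distributed cache service, written only so that the example runs without a Redis server; the function name `dedup_batch` and the string keys are likewise assumptions.

```python
from collections import Counter


class GlobalCache:
    """In-process stand-in for the distributed cache service (Redis in the text)."""

    def __init__(self):
        self._counts = Counter()

    def add(self, key, occurrences=1):
        """Record occurrences of key; return True if the key was not seen before."""
        is_new = key not in self._counts
        self._counts[key] += occurrences
        return is_new

    def count(self, key):
        return self._counts[key]


def dedup_batch(batch, cache):
    """Deduplicate one micro-batch locally, then against the global cache.

    Returns only the items that are new globally, i.e. the data that would be
    passed on to the next stage.
    """
    local_counts = Counter(batch)  # local deduplication in memory
    return [item for item, n in local_counts.items() if cache.add(item, n)]


cache = GlobalCache()
first = dedup_batch(["mac:01", "phone:138", "mac:01"], cache)
second = dedup_batch(["phone:138", "id:0001"], cache)
```

Here `first` holds the two distinct items of the first batch and `second` holds only the item not seen before, while the cache keeps the occurrence count of every item across batches.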
Refer to Fig. 4, which is a detailed flowchart of the portrait construction step in the identity relationship analysis method based on mobile MAC according to the present invention. Portrait construction step S2 can use Spark GraphX to analyze person portraits. As shown in Fig. 4, portrait construction step S2 in the present invention may further include:
First, in step S21, a data loading operation is performed to load row data as graph data. Specifically, the data persisted to the distributed file system (HDFS) in step S13 are loaded into Spark, where the identity data are loaded as the vertices of the graph and the identity relationship data as the edges of the graph. Identity relationship data containing a mobile device MAC are loaded as one edge, and identity relationship data not containing a mobile device MAC are expanded into two edges. For example, the relation from a mobile phone number to an identity card number becomes two edges: phone number to identity card number, and identity card number to phone number. The overall graph is initialized from the vertex set and the edge set.
Then, in step S22, the person portraits are built. First, person portrait filtering is performed. Through analysis of historical data, the present invention determines that the number of attributes corresponding to one person does not exceed 20. Therefore, problematic mobile device MAC data can be filtered out by the out-degree of the mobile phone MAC. Next, taking each mobile device MAC as a starting point, the maximal subgraph is traversed such that each subgraph has one and only one mobile device MAC; that subgraph is the person's portrait.
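The out-degree filter in step S22 can be sketched as follows. The limit of 20 comes from the description; comparing it directly against the out-degree of each MAC vertex is an assumption, as are the adjacency-dictionary layout and the function name `usable_macs`.

```python
MAX_OUT_DEGREE = 20  # limit from the description; the direct comparison is an assumption


def usable_macs(graph, max_out_degree=MAX_OUT_DEGREE):
    """Return the MAC vertices whose number of outgoing edges does not exceed the limit."""
    return {
        vertex
        for vertex, neighbours in graph.items()
        if vertex[0] == "mac" and len(neighbours) <= max_out_degree
    }


normal_mac = ("mac", "AA:AA:AA:AA:AA:01")
noisy_mac = ("mac", "CC:CC:CC:CC:CC:01")
graph = {
    normal_mac: {("phone", "13800000000"), ("virtual", "qq:10001")},
    # 25 distinct phone numbers attached to one MAC: treated as problematic data.
    noisy_mac: {("phone", f"1390000{i:04d}") for i in range(25)},
}
kept = usable_macs(graph)
```

Only the MAC with two outgoing edges is kept as a traversal starting point; the MAC with 25 outgoing edges is dropped.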
Finally, in step S23, the identity data and identity relationship data are persisted, and the graph data are simplified through relation conversion and weight conversion, where:
1) Relation conversion: indirect relations in the identity relationship data are converted into direct relations. For example, the relations mobile device MAC to mobile phone number and mobile phone number to identity card number are converted into mobile device MAC to mobile phone number and mobile device MAC to identity card number.
2) Weight conversion: the maximum level n of the subgraph is calculated first, and the new weight is then calculated by the following formula: [formula not reproduced], where G_k denotes the weight of the k-th level.
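The relation conversion, together with the computation of the maximum level n, can be sketched as below. The sketch assumes that the level of an identity is its hop distance from the starting MAC, which the text does not state explicitly, and it deliberately leaves out the weight calculation because the formula is not available here. The function name `flatten_portrait` and the graph layout are hypothetical.

```python
from collections import deque


def flatten_portrait(graph, start_mac):
    """Convert indirect relations in one portrait into direct MAC -> identity edges.

    Returns (direct_edges, level, n), where level maps each identity to its
    hop distance from start_mac (assumed meaning of "level") and n is the
    maximum level in the subgraph.
    """
    level = {start_mac: 0}
    queue = deque([start_mac])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, ()):
            if neighbour in level or neighbour[0] == "mac":
                continue  # skip visited vertices and other MACs
            level[neighbour] = level[vertex] + 1
            queue.append(neighbour)
    direct_edges = {(start_mac, vertex) for vertex in level if vertex != start_mac}
    return direct_edges, level, max(level.values())


mac = ("mac", "AA:AA:AA:AA:AA:01")
phone = ("phone", "13800000000")
id_card = ("id_card", "ID-0001")
graph = {mac: {phone}, phone: {id_card}, id_card: {phone}}

direct_edges, level, n = flatten_portrait(graph, mac)
```

For the sample graph, the chain MAC to phone number to identity card number becomes two direct edges from the MAC, and the recorded levels (1 and 2) would be the inputs to the weight conversion.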
Refer to Fig. 5, which is a block diagram of the first embodiment of the identity relationship analysis system based on mobile MAC according to the present invention. The system provided by this embodiment mainly includes an identity extraction module 100 and a portrait construction module 200.
The identity extraction module 100 is used to extract multiple identity data items of a person from network log files, and to determine the positions of those identity data items relative to one another according to their priority to obtain identity relationship data. The identity data items include the mobile device MAC, i.e., the physical address of the mobile device, and the mobile device MAC has the highest priority.
The portrait construction module 200 is connected to the identity extraction module 100 and is used to store the identity data and identity relationship data of the person extracted by the identity extraction module 100 as graph data in the form of directed edges, and to traverse the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait. Each subgraph has one and only one mobile device MAC.
The portrait construction module 200 can load the extracted identity data as the vertices of the graph, and the identity relationship data as the edges (directed edges) of the graph. Identity relationship data containing a mobile device MAC are loaded as one edge, whereas identity relationship data not containing a mobile device MAC are expanded into two edges. For example, the relation from a mobile phone number to an identity card number becomes two edges: phone number to identity card number, and identity card number to phone number. The overall graph is then initialized from the vertex set and the edge set. Next, the maximal subgraph is traversed starting from the person's mobile device MAC, where each subgraph has one and only one mobile device MAC. The subgraph obtained in this way is the person's portrait. Under normal circumstances, if a person in reality has multiple mobile device MACs, then a subgraph traversed from one of them may contain multiple mobile device MACs; because of complex false relations among the data, the person's subgraph may then contain identity data of other persons, which affects the accuracy of the data. Therefore, the person portrait subgraph defined in the present invention has one and only one mobile device MAC: when a traversal path encounters another mobile device MAC, data collection along that path stops. As noted above, when a person has multiple mobile device MACs, the method of the invention traverses one subgraph for each mobile device MAC, so such a real person corresponds to multiple person portraits.
Refer to Fig. 6, which is a block diagram of the second embodiment of the identity relationship analysis system based on mobile MAC according to the present invention. On the basis of the first embodiment, the system provided by this embodiment may further include a log receiving module 10 for collecting and cleaning network log files. Specifically, the log receiving module 10 can receive network log files and process them in real time with a cleaning program, which validates each type of log and pushes the valid data to a distributed message queue (Kafka) for the identity extraction module 100. The invention can collect logs via Representational State Transfer (REST), File Transfer Protocol (FTP), and similar means.
In a more preferred embodiment of the present invention, the personal status relationship
analysis system based on mobile MAC may further include a character relation analysis
module 300, connected with the portrait building module 200, for detecting whether the
subgraphs of persons obtained by the portrait building module 200 intersect, and for
judging the friend relationship between person portraits according to the detection
result. Specifically, the subgraph of each person is taken as a portrait and a graph
intersection operation is performed: if the portraits of any two persons intersect, a
friend relationship is judged to exist between the two portraits; otherwise no friend
relationship is judged to exist. In the present invention the friend relationship between
person portraits covers two cases: either the two real persons corresponding to the two
portraits are friends, or the two portraits belong to one real person. The second case
arises because, when the same real person has multiple mobile device MACs, the resulting
subgraphs may intersect, from which it can be determined that the corresponding portraits
belong to the same real-world person. The present invention can also distinguish these
two cases through further analysis.
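The intersection test can be sketched with plain Python sets, one vertex set per portrait keyed by its mobile device MAC. The data and the name `related_portraits` are illustrative assumptions; the sketch only reports overlaps and does not attempt to tell the two cases apart.

```python
from itertools import combinations


def related_portraits(portraits):
    """Return (mac_a, mac_b, shared_vertices) for every pair of intersecting portraits."""
    related = []
    ordered = sorted(portraits.items(), key=lambda item: item[0])
    for (mac_a, set_a), (mac_b, set_b) in combinations(ordered, 2):
        shared = set_a & set_b
        if shared:
            related.append((mac_a, mac_b, shared))
    return related


portraits = {
    "mac:AA": {"mac:AA", "phone:138", "id:4201"},
    "mac:BB": {"mac:BB", "phone:138"},
    "mac:CC": {"mac:CC", "qq:777"},
}
relations = related_portraits(portraits)
```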
In a more preferred embodiment of the present invention, a retrieval service module 400
may also be included, connected with the identity extraction module 100, the portrait
building module 200 and/or the character relation analysis module 300. According to user
input, it searches the persisted identity data or personal status relationship data and
obtains the corresponding person data. For example, the subgraph of a person is retrieved
from a mobile device MAC entered by the user, which yields all identity data of that
person. Alternatively, a QQ number entered by the user is used to check whether the
person's subgraph intersects other subgraphs, which yields the friend relationships of
that person.
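As a rough illustration of retrieval by any identity, the sketch below builds an inverted index from each identity to the portraits containing it. A dictionary stands in for the persisted store, and the names `build_index` and `lookup` are hypothetical.

```python
def build_index(portraits):
    """Map every identity to the set of portrait keys (MACs) that contain it."""
    index = {}
    for mac, vertices in portraits.items():
        for vertex in vertices:
            index.setdefault(vertex, set()).add(mac)
    return index


def lookup(index, portraits, identity):
    """Return all identities found in portraits that contain the queried identity."""
    result = set()
    for mac in index.get(identity, ()):
        result |= portraits[mac]
    return result


stored_portraits = {
    "mac:AA": {"mac:AA", "phone:138", "qq:10001"},
    "mac:BB": {"mac:BB", "phone:139"},
}
identity_index = build_index(stored_portraits)
by_mac = lookup(identity_index, stored_portraits, "mac:AA")
by_qq = lookup(identity_index, stored_portraits, "qq:10001")
```

A query that matches more than one portrait returns the union, which also exposes identities shared across portraits.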
Referring to Fig. 7, which is a detailed block diagram of the identity extraction module in
the personal status relationship analysis method based on mobile MAC according to the
present invention. The identity extraction module is preferably implemented with Spark
stream processing, for example running once every 30 seconds. As shown in Fig. 7, the
identity extraction module 100 of the present invention may further include an identity
data extraction unit 110, an identity data processing unit 120 and a personal status
relationship data processing unit 130.
The identity data extraction unit 110 is used to extract multiple identity data of a
person from network log files. Specifically, all identity data can be extracted from each
log entry; the identities include real-name identities and virtual identities.
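Extraction from a single log entry might look like the sketch below. The log line layout, the three identity types and the regular expressions are assumptions for illustration; the patent does not define the log format.

```python
import re

# Hypothetical patterns; real logs would need patterns matched to their format.
PATTERNS = {
    "mac": re.compile(r"\b(?:[0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}\b"),
    "phone": re.compile(r"(?<!\d)1\d{10}(?!\d)"),
    "qq": re.compile(r"\bqq=(\d{5,11})\b"),
}


def extract_identities(line):
    """Return (kind, value) pairs for every identity found in one log line."""
    found = []
    for kind, pattern in PATTERNS.items():
        for match in pattern.finditer(line):
            # Use the capturing group when the pattern defines one.
            value = match.group(1) if pattern.groups else match.group(0)
            found.append((kind, value))
    return found


sample = "2015-11-30 10:00:00 mac=AA:BB:CC:DD:EE:FF phone=13800138000 qq=10001"
identities = extract_identities(sample)
```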
The identity data processing unit 120 is connected with the identity data extraction unit
110. It first deduplicates identity data locally in memory and then deduplicates it
globally through a distributed caching service (Redis). Newly added data is passed to the
next stage, and the number of times each identity datum occurs is recorded. Finally, the
identity data is persisted to a distributed file system (HDFS).
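The two-level deduplication can be sketched as follows. A shared Python set stands in for the distributed cache, and the class and method names are hypothetical.

```python
from collections import Counter


class IdentityDeduplicator:
    """Local (per-batch) deduplication followed by a global check against a shared set."""

    def __init__(self, global_seen):
        self.global_seen = global_seen  # stand-in for the distributed caching service
        self.counts = Counter()         # occurrences of each identity across batches

    def process_batch(self, identities):
        """Return the identities of this batch that were never seen globally before."""
        local_seen = set()
        local_unique = []
        for identity in identities:
            self.counts[identity] += 1
            if identity not in local_seen:
                local_seen.add(identity)
                local_unique.append(identity)
        new_items = []
        for identity in local_unique:
            if identity not in self.global_seen:
                self.global_seen.add(identity)
                new_items.append(identity)
        return new_items


shared_cache = set()
dedup = IdentityDeduplicator(shared_cache)
first_batch = dedup.process_batch(["mac:AA", "qq:1", "mac:AA"])
second_batch = dedup.process_batch(["qq:1", "phone:138"])
```

Deduplicating locally first reduces the number of lookups sent to the shared cache, which is presumably why the text orders the two steps this way.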
The personal status relationship data processing unit 130 is connected with the identity
data extraction unit 110 and determines the positions among the multiple identity data by
the priority of the identity data, thereby obtaining personal status relationship data. It
then deduplicates the personal status relationship data locally in memory and globally
through the distributed caching service, passes newly added data to the next stage, and
records the number of times each relationship occurs. Finally, the personal status
relationship data is persisted to the distributed file system (HDFS) for subsequent graph
data analysis.
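How priority fixes the position of each identity in a relationship can be sketched as below. Only the rule that the mobile device MAC ranks highest comes from the text; the ordering of the other types and the choice to emit every pair from one log entry are assumptions.

```python
from itertools import combinations

# Lower number means higher priority. Only "mac" being first is from the source;
# the order of the remaining types is hypothetical.
PRIORITY = {"mac": 0, "id": 1, "phone": 2, "qq": 3}


def build_relations(identities):
    """Return (higher_priority, lower_priority) pairs for identities seen together."""
    ordered = sorted(identities, key=lambda item: PRIORITY[item[0]])
    return list(combinations(ordered, 2))


seen_together = [("qq", "10001"), ("mac", "AA:BB:CC:DD:EE:FF"), ("phone", "13800138000")]
relation_pairs = build_relations(seen_together)
```

In every pair the first element is the source of the directed relationship, so the MAC always appears as a source and never as a target.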
In the present invention, offline analysis and statistics can be carried out with Impala.
In a preferred implementation, the identity data processing unit 120 and the personal
status relationship data processing unit 130 can also persist the identity data and the
personal status relationship data to a full-text search engine (Elasticsearch) for
real-time retrieval by the retrieval service module 400.
Referring to Fig. 8, which is a detailed block diagram of the portrait building module in
the personal status relationship analysis system based on mobile MAC according to the
present invention. The portrait building module 200 can analyze person portraits using
Spark GraphX. As shown in Fig. 8, the portrait building module 200 of the present
invention may further include a data loading unit 210, a portrait construction unit 220
and a data reduction unit 230.
The data loading unit 210 performs the data loading operation, loading row data as graph
data. Specifically, the data persisted to the distributed file system (HDFS) by the
identity data processing unit 120 and the personal status relationship data processing
unit 130 is loaded into Spark: identity data is loaded as the vertices of the graph and
personal status relationship data is loaded as its edges. A relationship that contains a
mobile device MAC is loaded as a single edge, while a relationship that does not contain a
mobile device MAC is expanded into two edges. For example, a relationship from a phone
number to an ID card number becomes two edges: phone number to ID card number, and ID card
number to phone number. The overall graph is then initialized from the vertex set and the
edge set.
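The loading rule above can be sketched in plain Python, leaving out Spark itself. Vertices are hypothetical `type:value` strings and `load_graph` is an illustrative name.

```python
def is_mac(vertex):
    # Assumed naming convention: a mobile device MAC vertex starts with "mac:".
    return vertex.startswith("mac:")


def load_graph(identities, relations):
    """Build the vertex set and the directed edge set from persisted rows."""
    vertices = set(identities)
    edges = set()
    for src, dst in relations:
        vertices.update((src, dst))
        edges.add((src, dst))
        # A relationship without any MAC is expanded into two opposite edges.
        if not (is_mac(src) or is_mac(dst)):
            edges.add((dst, src))
    return vertices, edges


graph_vertices, graph_edges = load_graph(
    ["mac:AA", "phone:138", "id:4201"],
    [("mac:AA", "phone:138"), ("phone:138", "id:4201")],
)
```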
The portrait construction unit 220 is connected with the data loading unit 210 and builds
the person portraits. First, portrait filtering is performed. Analysis of historical data
in the present invention shows that one person has no more than 20 corresponding
attributes, so problematic mobile device MAC data can be filtered out by the out-degree of
the mobile device MAC. Then, starting from each mobile device MAC, the maximal subgraph is
traversed such that each subgraph contains one and only one mobile device MAC; that
subgraph is the portrait of the person.
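The out-degree filter can be sketched as follows. The threshold of 20 is taken from the text; dropping every edge that touches a suspect MAC is an assumption about what "filtering out" means here, and the names are hypothetical.

```python
from collections import Counter

MAX_ATTRIBUTES = 20  # upper bound on attributes per person stated in the text


def filter_suspect_macs(edges, limit=MAX_ATTRIBUTES):
    """Drop edges touching any MAC whose out-degree exceeds the limit."""
    out_degree = Counter(src for src, _ in edges if src.startswith("mac:"))
    suspect = {mac for mac, degree in out_degree.items() if degree > limit}
    kept = [(s, d) for s, d in edges if s not in suspect and d not in suspect]
    return kept, suspect


noisy_edges = [("mac:AA", f"qq:{i}") for i in range(21)] + [("mac:BB", "phone:138")]
kept_edges, suspect_macs = filter_suspect_macs(noisy_edges)
```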
The data reduction unit 230 is connected with the portrait construction unit 220. It
persists the identity data and personal status relationship data and simplifies the graph
data through relation transformation and weight conversion, where:
1) Relation transformation: indirect relations in the personal status relationship data
are converted into direct relations. For example, the relations mobile device MAC to phone
number and phone number to ID card number are transformed into mobile device MAC to phone
number and mobile device MAC to ID card number.
2) Weight conversion: the maximum level n of the subgraph is calculated first, and a new
weight is then calculated by the following formula: [formula image not available], where
G_k denotes the weight of the k-th level.
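The relation transformation can be sketched as a flattening of one portrait, so that every identity hangs directly off the mobile device MAC. The function name and vertex naming are assumptions, and the weight conversion is left out because its formula is not available in this text.

```python
def flatten_relations(mac, edges):
    """Replace indirect relations with direct MAC-to-identity relations."""
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)
    direct = []
    seen = {mac}
    stack = [mac]
    while stack:
        current = stack.pop()
        for neighbor in adjacency.get(current, []):
            if neighbor not in seen:
                seen.add(neighbor)
                # Every reachable identity gets a direct edge from the MAC.
                direct.append((mac, neighbor))
                stack.append(neighbor)
    return sorted(direct)


portrait_edges = [
    ("mac:AA", "phone:138"),
    ("phone:138", "id:4201"),
    ("id:4201", "phone:138"),
]
direct_edges = flatten_relations("mac:AA", portrait_edges)
```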
In summary, unlike approaches that use the ID card as the main body of a portrait, the
present invention uses the mobile device MAC address as the main body. Using big data
technology, basic data is obtained through cleaning, extraction and deduplication; person
portraits are then derived from the graph data, and the person relationship graph is
derived through graph operations.
It should be understood that the personal status relationship analysis method and system
based on mobile MAC in the present invention share the same principle and implementation,
so the description of specific embodiments of the method also applies to the system.
The present invention has been described with reference to specific embodiments, but those
skilled in the art will understand that various changes and equivalent substitutions can
be made without departing from the scope of the invention. In addition, many modifications
can be made to adapt the technology of the present invention to a particular situation or
material without departing from its scope of protection. Therefore, the present invention
is not limited to the specific embodiments disclosed herein, but includes all embodiments
falling within the scope of the claims.
Claims (10)
1. A personal status relationship analysis method based on mobile MAC, characterized by
comprising the following steps:
S1, extracting multiple identity data of a person from network log files, and determining
the positions among the multiple identity data by the priority of the identity data to
obtain personal status relationship data, wherein the multiple identity data include a
mobile device MAC, and the mobile device MAC has the highest priority;
S2, storing the multiple identity data and the personal status relationship data of the
person as graph data in the form of directed edges, and traversing the maximal subgraph
starting from the mobile device MAC of the person to obtain the portrait of the person,
wherein each subgraph contains only one mobile device MAC.
2. The personal status relationship analysis method based on mobile MAC according to
claim 1, characterized in that the method further comprises the following step:
S3, detecting whether the subgraphs of persons intersect, and judging the friend
relationship between person portraits according to the detection result.
3. The personal status relationship analysis method based on mobile MAC according to
claim 1, characterized in that the method further comprises, before step S1:
S0, receiving network log files, processing them in real time through a cleaning program,
and pushing the valid data to a distributed message queue.
4. The personal status relationship analysis method based on mobile MAC according to
claim 1, characterized in that step S1 further comprises:
S11, extracting multiple identity data of a person from network log files, and then
performing step S12 and step S13 in parallel;
S12, deduplicating the identity data locally in memory, then deduplicating the identity
data globally through a distributed caching service, and finally persisting the identity
data to a distributed file system;
S13, determining the positions among the multiple identity data by the priority of the
identity data to obtain personal status relationship data; then deduplicating the personal
status relationship data locally in memory and globally through the distributed caching
service; and finally persisting the personal status relationship data to the distributed
file system.
5. The personal status relationship analysis method based on mobile MAC according to
claim 4, characterized in that step S2 further comprises:
S21, loading the data of the distributed file system, wherein identity data is loaded as
the vertices of the graph and personal status relationship data is loaded as the edges of
the graph, a relationship containing a mobile device MAC being loaded as one edge and a
relationship not containing a mobile device MAC being expanded into two edges, and
initializing the overall graph from the vertex set and the edge set;
S22, filtering out problematic data by the out-degree of the mobile device MAC, and,
starting from each mobile device MAC, traversing the maximal subgraph to obtain the
portrait of the person, wherein each subgraph contains only one mobile device MAC;
S23, persisting the identity data and personal status relationship data, and simplifying
the graph data through relation transformation and weight conversion, wherein the relation
transformation converts indirect relations in the personal status relationship data into
direct relations, and the weight conversion first calculates the maximum level n of the
subgraph and then calculates a new weight by the following formula: [formula image not
available], where G_k denotes the weight of the k-th level.
6. A personal status relationship analysis system based on mobile MAC, characterized by
comprising:
an identity extraction module, for extracting multiple identity data of a person from
network log files, and determining the positions among the multiple identity data by the
priority of the identity data to obtain personal status relationship data, wherein the
multiple identity data include a mobile device MAC, and the mobile device MAC has the
highest priority;
a portrait building module, for storing the multiple identity data and the personal status
relationship data of the person as graph data in the form of directed edges, and
traversing the maximal subgraph starting from the mobile device MAC of the person to
obtain the portrait of the person, wherein each subgraph contains only one mobile device
MAC.
7. The personal status relationship analysis system based on mobile MAC according to
claim 6, characterized in that the system further comprises:
a character relation analysis module, for detecting whether the subgraphs of persons
intersect, and judging the friend relationship between person portraits according to the
detection result.
8. The personal status relationship analysis system based on mobile MAC according to
claim 6, characterized in that the system further comprises:
a log receiving module, for receiving network log files, processing them in real time
through a cleaning program, and pushing the valid data to a distributed message queue for
use by the identity extraction module.
9. The personal status relationship analysis system based on mobile MAC according to
claim 6, characterized in that the identity extraction module further comprises:
an identity data extraction unit, for extracting multiple identity data of a person from
network log files;
an identity data processing unit, for deduplicating the identity data locally in memory,
then deduplicating the identity data globally through a distributed caching service, and
finally persisting the identity data to a distributed file system;
a personal status relationship data processing unit, for determining the positions among
the multiple identity data by the priority of the identity data to obtain personal status
relationship data; then deduplicating the personal status relationship data locally in
memory and globally through the distributed caching service; and finally persisting the
personal status relationship data to the distributed file system.
10. The personal status relationship analysis system based on mobile MAC according to
claim 9, characterized in that the portrait building module further comprises:
a data loading unit, for loading the data of the distributed file system, wherein identity
data is loaded as the vertices of the graph and personal status relationship data is
loaded as the edges of the graph, a relationship containing a mobile device MAC being
loaded as one edge and a relationship not containing a mobile device MAC being expanded
into two edges, and for initializing the overall graph from the vertex set and the edge
set;
a portrait construction unit, for filtering out problematic data by the out-degree of the
mobile device MAC, and, starting from each mobile device MAC, traversing the maximal
subgraph to obtain the portrait of the person, wherein each subgraph contains only one
mobile device MAC;
a data reduction unit, for persisting the identity data and personal status relationship
data, and simplifying the graph data through relation transformation and weight
conversion, wherein the relation transformation converts indirect relations in the
personal status relationship data into direct relations, and the weight conversion first
calculates the maximum level n of the subgraph and then calculates a new weight by the
following formula: [formula image not available], where G_k denotes the weight of the k-th
level.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510859716.1A CN106815240A (en) | 2015-11-30 | 2015-11-30 | A kind of personal status relationship analysis method and system based on mobile MAC |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510859716.1A CN106815240A (en) | 2015-11-30 | 2015-11-30 | A kind of personal status relationship analysis method and system based on mobile MAC |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106815240A true CN106815240A (en) | 2017-06-09 |
Family
ID=59156647
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510859716.1A Pending CN106815240A (en) | 2015-11-30 | 2015-11-30 | A kind of personal status relationship analysis method and system based on mobile MAC |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106815240A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109871415A (en) * | 2019-01-21 | 2019-06-11 | 武汉光谷信息技术股份有限公司 | A kind of user's portrait construction method, system and storage medium based on chart database |
CN109871415B (en) * | 2019-01-21 | 2021-04-30 | 武汉光谷信息技术股份有限公司 | User portrait construction method and system based on graph database and storage medium |
CN111444368A (en) * | 2020-03-25 | 2020-07-24 | 平安科技(深圳)有限公司 | Method and device for constructing user portrait, computer equipment and storage medium |
CN111444368B (en) * | 2020-03-25 | 2023-01-17 | 平安科技(深圳)有限公司 | Method and device for constructing user portrait, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107358146B (en) | Method for processing video frequency, device and storage medium | |
CN108596041B (en) | A kind of human face in-vivo detection method based on video | |
CN106940794A (en) | A yard adjoint system is detectd in a kind of target collection | |
CN106372606A (en) | Target object information generation method and unit identification method and unit and system | |
CN105389549A (en) | Object recognition method and device based on human body action characteristic | |
CN104283918B (en) | A kind of WLAN terminal type acquisition methods and system | |
CN105208528B (en) | A kind of system and method for identifying with administrative staff | |
CN111079699A (en) | Commodity identification method and device | |
CN103761279B (en) | Method and system for scheduling network crawlers on basis of keyword search | |
CN109190586A (en) | Customer's visiting analysis method, device and storage medium | |
CN109253888A (en) | Detection method and system for vehicle vehicle condition | |
CN107204975A (en) | A kind of industrial control system network attack detection technology based on scene fingerprint | |
CN109271793A (en) | Internet of Things cloud platform device class recognition methods and system | |
CN106357416A (en) | Group information recommendation method, device and terminal | |
CN108009497A (en) | Image recognition monitoring method, system, computing device and readable storage medium storing program for executing | |
CN107360145A (en) | A kind of multinode honey pot system and its data analysing method | |
CN108108897B (en) | Rail transit passenger flow clearing method and system and electronic equipment | |
CN106874372A (en) | The method and device of destination object identification information is obtained based on unmanned plane | |
CN106534784A (en) | Acquisition analysis storage statistical system for video analysis data result set | |
CN108229262A (en) | A kind of pornographic video detecting method and device | |
CN111382808A (en) | Vehicle detection processing method and device | |
CN108319672A (en) | Mobile terminal malicious information filtering method and system based on cloud computing | |
CN106326835A (en) | Human face data collection statistical system and method for gas station convenience store | |
CN106789242A (en) | A kind of identification application intellectual analysis engine based on mobile phone client software behavioral characteristics storehouse | |
CN110188717A (en) | Image acquiring method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170609 |