CN106815240A - Identity relationship analysis method and system based on mobile MAC

Publication number
CN106815240A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510859716.1A
Other languages
Chinese (zh)
Inventor
刘臣
胡文鹏
张东升
景晓军
沈智杰
唐新民
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SURFILTER NETWORK TECHNOLOGY Co Ltd
Original Assignee
SURFILTER NETWORK TECHNOLOGY Co Ltd
Application filed by SURFILTER NETWORK TECHNOLOGY Co Ltd
Priority to CN201510859716.1A
Publication of CN106815240A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures

Abstract

The present invention relates to an identity relationship analysis method and system based on mobile MAC. The method comprises the following steps: extracting multiple identity data items of a person from network log files, and determining the positions of the identity data items relative to one another according to their priority to obtain identity relationship data, wherein the identity data items include a mobile device MAC, and the mobile device MAC has the highest priority; storing the person's identity data and identity relationship data as graph data in the form of directed edges, and traversing the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC. The invention uses massive network log files, extracts the identity data of persons from them, determines identity relationship data according to priority, stores the result as graph data with directed edges, and then builds each person's portrait with the mobile device MAC as the subject, or further analyzes the relationships between persons.

Description

Identity relationship analysis method and system based on mobile MAC
Technical field
The present invention relates to Internet technology, and more specifically to an identity relationship analysis method and system based on mobile MAC.
Background
Traditionally, a person's identity relationships are built with the identity card as the subject. With the rapid development of the mobile Internet, mobile phones have become ubiquitous, and users carry out more and more of their network activity through them. To safeguard network security and to meet the demand for customized services, person identity relationships need to be established from identity information obtained over the network. However, there is currently no effective method that can use the ever-growing volume of network logs to establish person identity information.
Summary of the invention
The technical problem to be solved by the present invention is the lack, in the prior art, of an identity relationship analysis method suited to the mobile Internet. To address this defect, the invention provides an identity relationship analysis method and system based on mobile MAC that can extract identity information from massive user network log files and build identity relationships.
The technical solution adopted by the present invention to solve this technical problem is to construct an identity relationship analysis method based on mobile MAC, comprising the following steps:
S1. Extracting multiple identity data items of a person from network log files, and determining the positions of the identity data items relative to one another according to their priority to obtain identity relationship data, wherein the identity data items include a mobile device MAC, and the mobile device MAC has the highest priority;
S2. Storing the person's identity data and identity relationship data as graph data in the form of directed edges, and traversing the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC.
In the identity relationship analysis method based on mobile MAC according to the present invention, the method further comprises the following step: S3. Detecting whether the subgraphs of persons intersect, and determining the friend relationships between person portraits according to the detection result.
In the identity relationship analysis method based on mobile MAC according to the present invention, the method further comprises, before step S1: S0. Receiving network log files, processing them in real time with a cleaning program, and pushing the valid data to a distributed message queue.
In the identity relationship analysis method based on mobile MAC according to the present invention, step S1 further comprises:
S11. Extracting multiple identity data items of a person from network log files, then performing step S12 and step S13 in parallel;
S12. Performing local deduplication of the identity data in memory, then performing global deduplication of the identity data through a distributed cache service, and finally persisting the identity data to a distributed file system;
S13. Determining the positions of the identity data items relative to one another according to their priority to obtain identity relationship data; then performing local deduplication of the identity relationship data in memory and global deduplication through the distributed cache service; and finally persisting the identity relationship data to the distributed file system.
In the identity relationship analysis method based on mobile MAC according to the present invention, step S2 further comprises:
S21. Loading the data from the distributed file system, wherein identity data is loaded as vertices of the graph and identity relationship data is loaded as edges of the graph; identity relationship data containing a mobile device MAC is loaded as one edge, identity relationship data not containing a mobile device MAC is expanded into two edges, and the whole graph is initialized from the vertex set and the edge set;
S22. Filtering out problematic data by the out-degree of the mobile device MAC, and, starting from each mobile device MAC, traversing the maximal subgraph to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC;
S23. Persisting the identity data and identity relationship data, and simplifying the graph data through relationship conversion and weight conversion, wherein the relationship conversion converts indirect relationships in the identity relationship data into direct relationships, and the weight conversion first calculates the maximum level n of the subgraph and then calculates the new weight by the following formula: [formula not reproduced in the source text], where G_k denotes the weight of the k-th level.
The present invention also provides an identity relationship analysis system based on mobile MAC, comprising:
an identity extraction module, configured to extract multiple identity data items of a person from network log files, and to determine the positions of the identity data items relative to one another according to their priority to obtain identity relationship data, wherein the identity data items include a mobile device MAC, and the mobile device MAC has the highest priority;
a portrait construction module, configured to store the person's identity data and identity relationship data as graph data in the form of directed edges, and to traverse the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC.
In the identity relationship analysis system based on mobile MAC according to the present invention, the system further comprises: a person relationship analysis module, configured to detect whether the subgraphs of persons intersect and to determine the friend relationships between person portraits according to the detection result.
In the identity relationship analysis system based on mobile MAC according to the present invention, the system further comprises: a log receiving module, configured to receive network log files, process them in real time with a cleaning program, and push the valid data to a distributed message queue for the identity extraction module.
In the identity relationship analysis system based on mobile MAC according to the present invention, the identity extraction module further comprises:
an identity data extraction unit, configured to extract multiple identity data items of a person from network log files;
an identity data processing unit, configured to perform local deduplication of the identity data in memory, then perform global deduplication of the identity data through a distributed cache service, and finally persist the identity data to a distributed file system;
an identity relationship data processing unit, configured to determine the positions of the identity data items relative to one another according to their priority to obtain identity relationship data; then perform local deduplication of the identity relationship data in memory and global deduplication through the distributed cache service; and finally persist the identity relationship data to the distributed file system.
In the identity relationship analysis system based on mobile MAC according to the present invention, the portrait construction module further comprises:
a data loading unit, configured to load the data from the distributed file system, wherein identity data is loaded as vertices of the graph and identity relationship data is loaded as edges of the graph; identity relationship data containing a mobile device MAC is loaded as one edge, identity relationship data not containing a mobile device MAC is expanded into two edges, and the whole graph is initialized from the vertex set and the edge set;
a portrait construction unit, configured to filter out problematic data by the out-degree of the mobile device MAC, and, starting from each mobile device MAC, to traverse the maximal subgraph to obtain the person's portrait, wherein each subgraph contains only one mobile device MAC;
a data simplification unit, configured to persist the identity data and identity relationship data, and to simplify the graph data through relationship conversion and weight conversion, wherein the relationship conversion converts indirect relationships in the identity relationship data into direct relationships, and the weight conversion first calculates the maximum level n of the subgraph and then calculates the new weight by the following formula: [formula not reproduced in the source text], where G_k denotes the weight of the k-th level.
Implementing the identity relationship analysis method and system based on mobile MAC of the present invention has the following beneficial effects: the invention uses massive network log files, extracts the identity data of persons from them, determines identity relationship data according to priority, stores the result as graph data with directed edges, and then builds each person's portrait with the mobile device MAC as the subject, or further analyzes the relationships between persons.
Brief description of the drawings
The invention is further described below with reference to the drawings and embodiments. In the drawings:
Fig. 1 is a flow chart of the first embodiment of the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 2 is a flow chart of the second embodiment of the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 3 is a detailed flow chart of the identity extraction step in the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 4 is a detailed flow chart of the portrait construction step in the identity relationship analysis method based on mobile MAC according to the present invention;
Fig. 5 is a block diagram of the first embodiment of the identity relationship analysis system based on mobile MAC according to the present invention;
Fig. 6 is a block diagram of the second embodiment of the identity relationship analysis system based on mobile MAC according to the present invention;
Fig. 7 is a detailed block diagram of the identity extraction module in the identity relationship analysis system based on mobile MAC according to the present invention;
Fig. 8 is a detailed block diagram of the portrait construction module in the identity relationship analysis system based on mobile MAC according to the present invention.
Detailed description
To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments.
Referring to Fig. 1, which is a flow chart of the first embodiment of the identity relationship analysis method based on mobile MAC according to the present invention. The method provided by this embodiment mainly comprises the following steps:
First, the identity extraction step is performed in step S1: multiple identity data items of a person are extracted from network log files, and the positions of the identity data items relative to one another are determined according to their priority to obtain identity relationship data. The identity data items include a mobile device MAC, i.e. the physical address of the mobile device, and the mobile device MAC has the highest priority.
In a preferred embodiment of the invention, the identity data refers to two or more identity data items, which fall into two classes: real-name identities and virtual identities. Real-name identities include but are not limited to: mobile device MAC, mobile phone number, IMEI (International Mobile Equipment Identity), identity card number, Hong Kong and Macau travel permit, passport, and military officer ID; they are limited in number. The mobile device MAC may be the physical address of a mobile device such as a mobile phone. Virtual identities are network virtual accounts, including but not limited to QQ numbers, WeChat IDs, and Taobao accounts; they are limited in number.
The network log files used in the present invention fall into two classes. The first class consists of logs that must contain a mobile device MAC: each log contains one mobile device MAC, one or more other real-name identities (besides the mobile device MAC), and zero or one virtual identity. The second class consists of logs that may or may not contain a mobile device MAC: each log contains one or more real-name identities (possibly without a mobile device MAC) and zero or one virtual identity.
In this step, the priority of identity data is such that real-name identities rank above virtual identities, and among real-name identities the mobile device MAC has the highest priority. In a preferred embodiment of the invention, the priority of real-name identities from high to low is: mobile device MAC, mobile phone number, identity card number, IMEI, Hong Kong and Macau travel permit, passport, military officer ID, and so on. It should be understood that, although a specific priority order of real-name identities is given here, those of ordinary skill in the art can set the specific priority among the real-name identities according to actual needs.
Identity relationship data in the present invention refers to the positional relationship between two identity data items, in which the identity data item with the higher priority is placed on the left side of the relationship and the one with the lower priority on the right side. For example, if the same log contains real-name identity A and real-name identity B, and the priority of A is higher than that of B, then the extracted identity data is real-name identity A and real-name identity B, and the extracted identity relationship data is: real-name identity A points to real-name identity B. If the same log contains real-name identity A and virtual identity C, then the extracted identity data is real-name identity A and virtual identity C, and the extracted identity relationship data is: real-name identity A points to virtual identity C. If the same log contains real-name identity A, real-name identity B, and virtual identity C, then the extracted identity data is A, B, and C, and the extracted identity relationship data is: real-name identity A points to real-name identity B, and real-name identity B points to virtual identity C.
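The priority rule described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the type names, the numeric priority values, and the function `extract_relationships` are assumptions introduced only for illustration. The ordering of real-name identities follows the preferred embodiment, and consecutive identities are linked as in the A, B, C example.

```python
# Hypothetical priority table: a lower number means a higher priority.
# The order of real-name identities follows the preferred embodiment;
# the type names and numeric values are illustrative assumptions.
PRIORITY = {
    "mac": 0,
    "phone": 1,
    "id_card": 2,
    "imei": 3,
    "hk_macau_permit": 4,
    "passport": 5,
    "officer_id": 6,
    "virtual": 100,  # any virtual account ranks below every real-name identity
}


def extract_relationships(log_identities):
    """Return directed (higher_priority, lower_priority) pairs for one log.

    log_identities is a list of (identity_type, value) tuples found in a
    single log record.
    """
    ordered = sorted(log_identities, key=lambda item: PRIORITY[item[0]])
    # Link each identity to the next one in priority order, mirroring the
    # "A points to B, B points to C" example in the text.
    return list(zip(ordered, ordered[1:]))


record = [
    ("virtual", "qq:10001"),
    ("mac", "AA:BB:CC:DD:EE:FF"),
    ("phone", "13800000000"),
]
relationships = extract_relationships(record)
```

Here the higher-priority identity is always the first element of each pair, which corresponds to the "left side" of a relationship in the description.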
Then, the portrait construction step is performed in step S2: the person's identity data and identity relationship data extracted in step S1 are stored as graph data in the form of directed edges, and the maximal subgraph is traversed starting from the person's mobile device MAC to obtain the person's portrait. Each subgraph has one and only one mobile device MAC.
In this step, the extracted identity data can be loaded as vertices of the graph, and the identity relationship data as edges (directed edges) of the graph. Identity relationship data containing a mobile device MAC is loaded as one edge; identity relationship data not containing a mobile device MAC is expanded into two edges. For example, mobile phone number to identity card number becomes two edges: mobile phone number to identity card number, and identity card number to mobile phone number. The whole graph is then initialized from the vertex set and the edge set. Next, the maximal subgraph is traversed starting from the person's mobile device MAC, with one and only one mobile device MAC in each subgraph. The subgraph traversed in this way is the person's portrait. In general, if a real person has multiple mobile device MACs, then a subgraph traversed from one of them may contain several mobile device MACs; because of complex, spurious relationships in the data, the person's subgraph could then contain identity data of other persons, which would reduce the accuracy of the data. The present invention therefore defines a person portrait subgraph as having one and only one mobile device MAC: when a traversal path meets another mobile device MAC, data collection on that path stops. As noted above, when a person has multiple mobile device MACs, the method of the invention traverses one subgraph for each mobile device MAC, so such a real person corresponds to multiple person portraits.
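The edge expansion and the MAC-bounded traversal described above could be sketched as follows. This is an illustrative sketch in plain Python, not the patent's implementation; the vertex representation as `(identity_type, value)` tuples and the function names `build_graph` and `portrait` are assumptions.

```python
from collections import defaultdict, deque


def is_mac(vertex):
    # Assumption: vertices are (identity_type, value) tuples.
    return vertex[0] == "mac"


def build_graph(relationships):
    """Build an adjacency map from directed identity relationships."""
    graph = defaultdict(set)
    for src, dst in relationships:
        graph[src].add(dst)
        # A relationship without a mobile device MAC is expanded into
        # two edges, one in each direction.
        if not is_mac(src) and not is_mac(dst):
            graph[dst].add(src)
    return graph


def portrait(graph, start_mac):
    """Breadth-first traversal from one MAC that never enters another MAC."""
    seen = {start_mac}
    queue = deque([start_mac])
    while queue:
        vertex = queue.popleft()
        for neighbour in graph.get(vertex, ()):
            if neighbour in seen or is_mac(neighbour):
                continue  # stop collecting along a path that meets another MAC
            seen.add(neighbour)
            queue.append(neighbour)
    return seen


mac1 = ("mac", "AA:AA:AA:AA:AA:01")
mac2 = ("mac", "AA:AA:AA:AA:AA:02")
phone = ("phone", "13800000000")
id_card = ("id_card", "ID-1")
qq = ("virtual", "qq:10001")

graph = build_graph([(mac1, phone), (phone, id_card), (mac2, qq)])
result = portrait(graph, mac1)
```

With this sample data, the portrait of `mac1` contains `mac1`, `phone`, and `id_card`, while `mac2` and its QQ account stay in a separate subgraph.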
Referring to Fig. 2, which is a flow chart of the second embodiment of the identity relationship analysis method based on mobile MAC according to the present invention. On the basis of the first embodiment, the method provided by this embodiment may further include a step S0 performed before step S1, in which the network log files are collected and cleaned. Specifically, network log files can be received and processed in real time by a cleaning program, validity checks can be performed on all kinds of logs, and the valid data can be pushed to a distributed message queue (Kafka). The invention can collect logs through Representational State Transfer (REST), File Transfer Protocol (FTP), and similar means.
In a more preferred embodiment of the invention, the method may further include step S3: detecting whether the person subgraphs obtained in step S2 intersect, and determining the friend relationships between person portraits according to the detection result. Specifically, a graph intersection operation can be performed on the subgraph of each person, i.e. the portrait. If the portraits of any two persons intersect, a friend relationship can be determined to exist between the portraits; otherwise no friend relationship is determined to exist between them. A friend relationship between person portraits in the present invention covers two cases: in one, the real persons corresponding to the two portraits are friends; in the other, the two portraits belong to the same real person. When the same real person has multiple mobile device MACs, the corresponding subgraphs may intersect, so it can be determined that the portraits corresponding to these subgraphs belong to the same person in reality. The invention can also distinguish the two cases by further analysis.
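The intersection test of step S3 can be sketched with ordinary set operations. This is a simplified illustration, not the patent's implementation: portraits are assumed to be plain sets of identity strings keyed by their starting MAC, and the name `related_portraits` is hypothetical.

```python
from itertools import combinations


def related_portraits(portraits):
    """Return pairs of MACs whose portraits share at least one identity.

    portraits maps each starting MAC to the set of identities in its
    subgraph. Each subgraph holds only its own MAC, so any overlap must
    come from a shared non-MAC identity.
    """
    pairs = []
    for mac_a, mac_b in combinations(sorted(portraits), 2):
        if portraits[mac_a] & portraits[mac_b]:
            pairs.append((mac_a, mac_b))
    return pairs


portraits = {
    "mac:1": {"mac:1", "phone:1", "qq:1"},
    "mac:2": {"mac:2", "qq:1"},
    "mac:3": {"mac:3", "phone:3"},
}
pairs = related_portraits(portraits)
```

The sketch only reports that two portraits intersect; deciding whether the pair represents two friends or one person with two devices is the further analysis the text mentions and is not attempted here.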
In a more preferred embodiment of the invention, the method may also include a retrieval service step, which searches the persisted identity data or identity relationship data according to user input and obtains the corresponding person data. For example, the person's subgraph is looked up from a mobile device MAC entered by the user, which yields all identity data of that person. Alternatively, a QQ number entered by the user is used to detect whether the person's subgraph intersects other subgraphs, which yields the person's friend relationships.
Referring to Fig. 3, which is a detailed flow chart of the identity extraction step in the identity relationship analysis method based on mobile MAC according to the present invention. The identity extraction step S1 preferably uses Spark stream processing, for example running on a timer once every 30 seconds. As shown in Fig. 3, the identity extraction step S1 in the present invention may further comprise:
First, in step S11, multiple identity data items of a person are extracted from network log files. Specifically, all identity data can be extracted from each log, including both real-name identities and virtual identities.
Then, in step S12, local deduplication of the identity data is performed in memory, followed by global deduplication of the identity data through a distributed cache service (Redis). The newly added data is passed on to the next stage, and the number of occurrences of each identity data item is recorded. Finally, the identity data is persisted to a distributed file system (HDFS).
Finally, in step S13, the positions of the identity data items relative to one another are determined according to their priority to obtain identity relationship data. Local deduplication of the identity relationship data is then performed in memory, followed by global deduplication through the distributed cache service; the newly added data is passed on to the next stage, and the number of occurrences of each identity relationship data item is recorded. Finally, the identity relationship data is persisted to the distributed file system (HDFS) for subsequent graph data analysis. In the present invention, offline analysis and statistics can be carried out with Impala. In a preferred implementation, this step can also persist the identity data and identity relationship data to a full-text search engine (Elasticsearch) for real-time retrieval.
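The two-stage deduplication in steps S12 and S13 can be sketched as follows. This is a minimal single-process illustration under stated assumptions: a plain Python `set` stands in for the distributed cache service, occurrence counts are kept in a local `Counter` (the text does not say where they are stored), and the class name `TwoLevelDeduplicator` is hypothetical.

```python
from collections import Counter


class TwoLevelDeduplicator:
    """Local in-memory dedup per batch, then global dedup against a shared store."""

    def __init__(self, global_store):
        # global_store stands in for the distributed cache service; in this
        # sketch it is any object that supports `in` and `.add()`.
        self.global_store = global_store
        self.counts = Counter()

    def process_batch(self, items):
        """Count every occurrence and return only items never seen before."""
        self.counts.update(items)
        local_unique = dict.fromkeys(items)  # local dedup, keeps first-seen order
        new_items = []
        for item in local_unique:
            if item not in self.global_store:  # global dedup
                self.global_store.add(item)
                new_items.append(item)
        return new_items


shared_store = set()
dedup = TwoLevelDeduplicator(shared_store)
first = dedup.process_batch(["mac:1", "phone:1", "mac:1"])
second = dedup.process_batch(["phone:1", "qq:1"])
```

Deduplicating locally first keeps repeated items within a batch from ever reaching the shared store, which is presumably why the text orders the two stages this way; only the items returned by `process_batch` would be passed on and persisted.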
Referring to Fig. 4, which is a detailed flow chart of the portrait construction step in the identity relationship analysis method based on mobile MAC according to the present invention. The portrait construction step S2 can use Spark GraphX to analyze person portraits. As shown in Fig. 4, the portrait construction step S2 in the present invention may further comprise:
First, in step S21, the data loading operation is performed, loading row data as graph data. Specifically, the data persisted to the distributed file system (HDFS) in step S13 is loaded into Spark, where identity data is loaded as vertices of the graph and identity relationship data as edges of the graph. Identity relationship data containing a mobile device MAC is loaded as one edge, and identity relationship data not containing a mobile device MAC is expanded into two edges. For example, mobile phone number to identity card number becomes two edges: mobile phone number to identity card number, and identity card number to mobile phone number. The whole graph is initialized from the vertex set and the edge set.
Then, in step S22, the person portraits are built. First, person portrait filtering is performed. Through analysis of historical data, the invention determines that the number of attributes corresponding to one person does not exceed 20. Therefore, problematic mobile device MAC data can be filtered out by the out-degree of the mobile phone MAC. Then, starting from each mobile device MAC, the maximal subgraph is traversed, with one and only one mobile device MAC in each subgraph; this subgraph is the person's portrait.
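The out-degree filter can be sketched as below. This is an illustration only: the threshold of 20 comes from the text, but the exact comparison, the `"mac:"` string prefix used to recognize MAC vertices, and the function name `filter_macs_by_out_degree` are assumptions.

```python
MAX_OUT_DEGREE = 20  # from the text: a person has no more than 20 attributes


def filter_macs_by_out_degree(graph, max_out_degree=MAX_OUT_DEGREE):
    """Return the MAC vertices whose out-degree is within the threshold.

    graph maps each vertex to the set of vertices it points to. MAC
    vertices are assumed to be strings starting with "mac:".
    """
    return sorted(
        vertex
        for vertex, targets in graph.items()
        if vertex.startswith("mac:") and len(targets) <= max_out_degree
    )


sample_graph = {
    "mac:1": {"phone:1", "qq:1"},
    "mac:2": {f"qq:{i}" for i in range(25)},  # implausibly many accounts
    "phone:1": {"id:1"},
}
kept = filter_macs_by_out_degree(sample_graph)
```

Only the MACs in `kept` would then be used as starting points for subgraph traversal.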
Finally, in step S23, the identity data and identity relationship data are persisted, and the graph data is simplified through relationship conversion and weight conversion, where:
1) Relationship conversion: indirect relationships in the identity relationship data are converted into direct relationships. For example, mobile device MAC to mobile phone number and mobile phone number to identity card number are converted into mobile device MAC to mobile phone number and mobile device MAC to identity card number.
2) Weight conversion: the maximum level n of the subgraph is calculated first, and the new weight is then calculated by the following formula: [formula not reproduced in the source text], where G_k denotes the weight of the k-th level.
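The relationship conversion in item 1) can be sketched as a level-by-level traversal that re-attaches every identity directly to the MAC. This is a hedged illustration: the function name `flatten_portrait` and the string vertex labels are assumptions. The sketch also records each vertex's level, since the weight conversion needs the maximum level n, but it does not implement the weight formula because that formula is not reproduced in the source text.

```python
def flatten_portrait(start_mac, edges):
    """Rewrite a portrait so that every identity hangs directly off the MAC.

    edges is an iterable of (source, target) pairs inside one portrait.
    Returns (direct_edges, level), where level[v] is the hop count of v
    from the MAC in the original graph.
    """
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    level = {start_mac: 0}
    frontier = [start_mac]
    while frontier:
        next_frontier = []
        for vertex in frontier:
            for neighbour in adjacency.get(vertex, []):
                if neighbour not in level:
                    level[neighbour] = level[vertex] + 1
                    next_frontier.append(neighbour)
        frontier = next_frontier

    direct_edges = [(start_mac, v) for v in level if v != start_mac]
    return direct_edges, level


direct_edges, level = flatten_portrait(
    "mac:1", [("mac:1", "phone:1"), ("phone:1", "id:1")]
)
max_level = max(level.values())
```

For the example in the text, the indirect path MAC to phone number to identity card number becomes two direct edges from the MAC, and the maximum level of the original subgraph is 2.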
Referring to Fig. 5, which is a block diagram of the first embodiment of the identity relationship analysis system based on mobile MAC according to the present invention. The system provided by this embodiment mainly comprises an identity extraction module 100 and a portrait construction module 200.
The identity extraction module 100 is configured to extract multiple identity data items of a person from network log files, and to determine the positions of the identity data items relative to one another according to their priority to obtain identity relationship data. The identity data items include a mobile device MAC, i.e. the physical address of the mobile device, and the mobile device MAC has the highest priority.
The portrait construction module 200 is connected to the identity extraction module 100 and is configured to store the person's identity data and identity relationship data extracted by the identity extraction module 100 as graph data in the form of directed edges, and to traverse the maximal subgraph starting from the person's mobile device MAC to obtain the person's portrait. Each subgraph has one and only one mobile device MAC.
The portrait construction module 200 can load the extracted identity data as vertices of the graph, and the identity relationship data as edges (directed edges) of the graph. Identity relationship data containing a mobile device MAC is loaded as one edge; identity relationship data not containing a mobile device MAC is expanded into two edges. For example, mobile phone number to identity card number becomes two edges: mobile phone number to identity card number, and identity card number to mobile phone number. The whole graph is then initialized from the vertex set and the edge set. Next, the maximal subgraph is traversed starting from the person's mobile device MAC, with one and only one mobile device MAC in each subgraph. The subgraph traversed in this way is the person's portrait. In general, if a real person has multiple mobile device MACs, then a subgraph traversed from one of them may contain several mobile device MACs; because of complex, spurious relationships in the data, the person's subgraph could then contain identity data of other persons, which would reduce the accuracy of the data. The present invention therefore defines a person portrait subgraph as having one and only one mobile device MAC: when a traversal path meets another mobile device MAC, data collection on that path stops. As noted above, when a person has multiple mobile device MACs, one subgraph is traversed for each mobile device MAC, so such a real person corresponds to multiple person portraits.
Fig. 6 is referred to, is real according to the present invention based on the personal status relationship analysis system for moving MAC second Apply the block diagram of example.The personal status relationship analysis system based on mobile MAC that the embodiment is provided is implemented first Daily record receiver module 10 can also be included on the basis of example, for being collected to network log file and clearly Wash.Specifically, daily record receiver module 10 can be received and by cleaning procedure real time processing network daily record text Part, carries out all kinds of daily record legitimate verifications, by effective data-pushing to Distributed Message Queue (KAFKA), being supplied to identity extraction module 100.Can support that statement proterties state is transmitted in the present invention Or the mode collector journal such as FTP (FTP) (REST).
In a more preferred embodiment of the invention, the personal status relationship analysis system based on mobile MAC may further include a person relationship analysis module 300, connected to the portrait building module 200, for detecting whether the person subgraphs obtained by the portrait building module 200 intersect, and for judging the friend relationship between person portraits according to the detection result. Specifically, the subgraph of each person is taken as a portrait and a graph intersection operation is performed. If the portraits of any two persons intersect, a friend relationship is judged to exist between the two person portraits; otherwise no friend relationship is judged to exist between them. In the present invention, a friend relationship between person portraits covers two cases: in one, the real persons corresponding to the two portraits are friends; in the other, the two person portraits belong to the same real person. When one real person has several mobile device MACs, the corresponding subgraphs may intersect, so the person portraits corresponding to those subgraphs can be judged to belong to the same person in reality. The invention can also distinguish these two cases by further analysis.
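The intersection test can be illustrated with a short Python sketch in which each portrait is represented simply as a set of vertex labels keyed by its mobile device MAC. This representation, the function name and the sample labels are assumptions made for the example; distinguishing the two kinds of friend relationship would require the further analysis mentioned above and is not attempted here.

```python
from itertools import combinations


def find_related(portraits):
    """Map each pair of portrait keys to the vertices their subgraphs share."""
    related = {}
    for mac_1, mac_2 in combinations(sorted(portraits), 2):
        shared = portraits[mac_1] & portraits[mac_2]
        if shared:
            related[(mac_1, mac_2)] = shared
    return related


# Hypothetical portraits: one vertex set per mobile device MAC.
portraits = {
    "mac:A": {"mac:A", "phone:1380", "qq:10001"},
    "mac:B": {"mac:B", "qq:10001", "id:0002"},
    "mac:C": {"mac:C", "phone:1390"},
}
related = find_related(portraits)
```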
In a more preferred embodiment of the invention, the system may further include a retrieval service module 400, connected to the identity extraction module 100, the portrait building module 200 and/or the person relationship analysis module 300. According to user input, it searches the persisted identity data or personal status relationship data and obtains the corresponding person data. For example, the subgraph of a person is retrieved from a mobile device MAC entered by the user, which yields all identity data of that person. Alternatively, a QQ number entered by the user is used to check whether the subgraph of that person intersects other subgraphs, which yields the friend relationships of that person.
Referring to Fig. 7, which is a detailed block diagram of the identity extraction module in the personal status relationship analysis method based on mobile MAC according to the present invention. The identity extraction module is preferably implemented with Spark stream processing, for example processing one batch every 30 seconds. As shown in Fig. 7, the identity extraction module 100 of the present invention may further include an identity data extraction unit 110, an identity data processing unit 120 and a personal status relationship data processing unit 130.
The identity data extraction unit 110 extracts multiple pieces of identity information of a person from network log files. Specifically, all identity data are extracted from each log record; the identities include real-name identities and virtual identities.
The identity data processing unit 120 is connected to the identity data extraction unit 110. It deduplicates the identity data locally in memory and then deduplicates them globally through a distributed cache service (REDIS); newly added data are passed on to the next stage, and the number of occurrences of each identity datum is recorded. Finally, the identity data are persisted to a distributed file system (HDFS).
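The two-stage deduplication can be sketched as follows. This is an illustration under stated assumptions, not the disclosed implementation: a plain Python `set` shared between batches stands in for the distributed cache service, and the class name, method name and the place where occurrence counts are kept are hypothetical.

```python
from collections import Counter


class TwoLevelDeduplicator:
    """Local (per-batch) deduplication followed by global deduplication."""

    def __init__(self, global_seen):
        # Stand-in for the shared cache; a real deployment would query the cache service.
        self.global_seen = global_seen
        self.counts = Counter()

    def process_batch(self, batch):
        """Return only the items never seen before, in first-seen order."""
        local_seen = set()
        new_items = []
        for item in batch:
            self.counts[item] += 1          # record every occurrence
            if item in local_seen:          # local deduplication in memory
                continue
            local_seen.add(item)
            if item in self.global_seen:    # global deduplication
                continue
            self.global_seen.add(item)
            new_items.append(item)          # newly added data for the next stage
        return new_items


shared_cache = set()
dedup = TwoLevelDeduplicator(shared_cache)
first_batch = dedup.process_batch(["phone:1380", "qq:10001", "phone:1380"])
second_batch = dedup.process_batch(["qq:10001", "id:0001"])
```

Checking the in-memory set first keeps repeated items within one batch from generating repeated lookups against the shared cache.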
The personal status relationship data processing unit 130 is connected to the identity data extraction unit 110. It determines the positions of the multiple identity data relative to one another according to the priority of the identity data, thereby obtaining the personal status relationship data. It then deduplicates the personal status relationship data locally in memory and globally through the distributed cache service, passes newly added data on to the next stage, and records the number of occurrences of each personal status relationship datum. Finally, the personal status relationship data are persisted to the distributed file system (HDFS) for subsequent graph data analysis. In the present invention, offline analysis and statistics can be performed with Impala. In a preferred implementation, the identity data processing unit 120 and the personal status relationship data processing unit 130 can also persist the identity data and the personal status relationship data to a full-text search engine (Elasticsearch) for real-time retrieval by the retrieval service module 400.
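One possible reading of the priority-based ordering is sketched below: the identities found in a single log record are sorted by priority, and every pair is emitted with the higher-priority identity as the source. Only the rule that the mobile device MAC has the highest priority comes from the original text; the remaining priority values, the pairing of every identity with every lower-priority identity, and all names in the code are assumptions made for illustration.

```python
from itertools import combinations

# Lower number means higher priority. Only "mac" being highest is taken from
# the source; the order of the other kinds is a hypothetical choice.
PRIORITY = {"mac": 0, "phone": 1, "id_card": 2, "qq": 3}


def extract_relations(identities):
    """Return (source, target) pairs with the higher-priority identity first."""
    ordered = sorted(identities, key=lambda identity: PRIORITY[identity[0]])
    # combinations() keeps the sorted order, so each pair runs from higher
    # priority to lower priority.
    return list(combinations(ordered, 2))


record_identities = [("qq", "10001"), ("mac", "AA:AA:AA:AA:AA:AA"), ("phone", "13800000000")]
relations = extract_relations(record_identities)
```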
Referring to Fig. 8, which is a detailed block diagram of the portrait building module in the personal status relationship analysis system based on mobile MAC according to the present invention. The portrait building module 200 can use Spark GraphX to analyze person portraits. As shown in Fig. 8, the portrait building module 200 of the present invention may further include a data loading unit 210, a portrait construction unit 220 and a data reduction unit 230.
The data loading unit 210 performs the data loading operation, loading row data as graph data. Specifically, the data persisted to the distributed file system (HDFS) by the identity data processing unit 120 and the personal status relationship data processing unit 130 are loaded into Spark. The identity data are loaded as the vertices of the graph and the personal status relationship data as its edges: a personal status relationship record containing a mobile device MAC is loaded as a single edge, and a record not containing a mobile device MAC is expanded into two edges. For example, a relationship from a mobile phone number to an ID card number becomes two edges: phone number to ID card number, and ID card number to phone number. The total graph is initialized from the vertex set and the edge set.
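The edge-expansion rule can be sketched in plain Python, without Spark, as follows. The `(kind, value)` vertex encoding and the function name are assumptions made for this example; an actual implementation on Spark GraphX would build vertex and edge collections instead of Python sets.

```python
def load_graph(identities, relations):
    """Build the vertex set and directed edge set of the total graph."""
    vertices = set(identities)
    edges = set()
    for source, target in relations:
        edges.add((source, target))
        # A relation that contains no mobile device MAC is expanded into two edges.
        if source[0] != "mac" and target[0] != "mac":
            edges.add((target, source))
    return vertices, edges


# Hypothetical sample data.
mac = ("mac", "AA:AA:AA:AA:AA:AA")
phone = ("phone", "13800000000")
id_card = ("id_card", "ID-0001")

vertices, edges = load_graph(
    identities=[mac, phone, id_card],
    relations=[(mac, phone), (phone, id_card)],
)
```

Keeping MAC edges one-directional means a traversal that starts at one MAC cannot walk backwards into another MAC through a shared attribute.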
The portrait construction unit 220 is connected to the data loading unit 210 and builds the person portraits. First, person portrait filtering is performed. From an analysis of historical data, the invention determines that the attributes corresponding to one person do not exceed 20. Problematic mobile device MAC data can therefore be filtered out by the out-degree of the mobile device MAC. Then, with each mobile device MAC as a starting point, the maximal subgraph is traversed, with one and only one mobile device MAC in each subgraph; that subgraph is the portrait of the person.
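A minimal sketch of the out-degree filter is given below. The threshold of 20 comes from the text above; treating an out-degree strictly greater than 20 as problematic, dropping every edge that touches a flagged MAC, and the names used in the code are assumptions of this example.

```python
from collections import Counter

MAX_ATTRIBUTES = 20  # upper bound on attributes per person, as stated in the text


def filter_suspect_macs(edges, max_out_degree=MAX_ATTRIBUTES):
    """Drop edges touching any MAC whose out-degree exceeds the threshold."""
    out_degree = Counter(source for source, _ in edges if source[0] == "mac")
    suspect = {mac for mac, degree in out_degree.items() if degree > max_out_degree}
    kept = [(s, t) for s, t in edges if s not in suspect and t not in suspect]
    return kept, suspect


# Hypothetical sample data: one ordinary MAC and one MAC with 21 distinct attributes.
normal_mac = ("mac", "AA:AA:AA:AA:AA:AA")
noisy_mac = ("mac", "FF:FF:FF:FF:FF:FF")
sample_edges = [(normal_mac, ("phone", "13800000000")), (normal_mac, ("qq", "10001"))]
sample_edges += [(noisy_mac, ("qq", str(20000 + i))) for i in range(21)]

kept_edges, suspect_macs = filter_suspect_macs(sample_edges)
```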
The data reduction unit 230 is connected to the portrait construction unit 220. It persists the identity data and the personal status relationship data, and simplifies the graph data through relationship conversion and weight conversion, where:
1) Relationship conversion: indirect relationships in the personal status relationship data are converted into direct relationships. For example, the pair mobile device MAC to phone number and phone number to ID card number is converted into mobile device MAC to phone number and mobile device MAC to ID card number.
2) Weight conversion: the maximum level n of the subgraph is calculated first, and the new weight is then calculated by a formula (not reproduced in this text) in which G_k denotes the weight of the k-th level.
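The relationship conversion and the calculation of the maximum level n can be sketched as below. Because the weight formula is not reproduced in this text, the sketch only records the level k at which each vertex is reached and does not compute any weight. The adjacency-dictionary representation, the string vertex labels and the function name are assumptions made for illustration.

```python
from collections import deque


def flatten_portrait(adjacency, mac):
    """Replace indirect relations by direct MAC-to-vertex edges, keeping levels."""
    level = {mac: 0}
    queue = deque([mac])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency.get(current, ()):
            if neighbour not in level:
                level[neighbour] = level[current] + 1
                queue.append(neighbour)
    # One direct edge per reachable vertex, annotated with its level k.
    direct_edges = {(mac, vertex): k for vertex, k in level.items() if vertex != mac}
    max_level = max(level.values())  # the maximum level n of the subgraph
    return direct_edges, max_level


# Hypothetical portrait: MAC -> phone number -> ID card number.
portrait_adjacency = {
    "mac:A": ["phone:1380"],
    "phone:1380": ["id:0001"],
    "id:0001": ["phone:1380"],
}
direct_edges, max_level = flatten_portrait(portrait_adjacency, "mac:A")
```

After flattening, every attribute is one hop from the mobile device MAC, and the stored level preserves how far away it was in the original subgraph.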
In summary, unlike approaches that use the ID card as the subject of the portrait, the present invention uses the mobile device MAC address as the subject of the portrait. By combining big data technologies, the basic data are obtained through cleaning, extraction and deduplication; person portraits are then derived through a graph database, and the person relationship graph is derived through graph operations.
It should be understood that the personal status relationship analysis method and system based on mobile MAC of the present invention share the same principle and implementation, so the description of the specific embodiments of the method also applies to the system.
The present invention has been described with reference to specific embodiments, but those skilled in the art will understand that various changes and equivalent substitutions can be made without departing from the scope of the invention. In addition, many modifications can be made to adapt the technology of the invention to a particular situation or material without departing from its scope of protection. The invention is therefore not limited to the specific embodiments disclosed herein, and includes all embodiments falling within the scope of the claims.

Claims (10)

1. A personal status relationship analysis method based on mobile MAC, characterized by comprising the following steps:
S1. extracting multiple identity data of a person from network log files, and determining the positions of the multiple identity data relative to one another according to the priority of the identity data so as to obtain personal status relationship data, wherein the multiple identity data include a mobile device MAC and the mobile device MAC has the highest priority;
S2. storing the multiple identity data and the personal status relationship data of the person as graph data in the form of directed edges, and traversing the maximal subgraph with the person's mobile device MAC as the starting point to obtain the portrait of the person, wherein each subgraph contains only one mobile device MAC.
2. The personal status relationship analysis method based on mobile MAC according to claim 1, characterized in that the method further comprises the following step:
S3. detecting whether the subgraphs of persons intersect, and judging the friend relationship between person portraits according to the detection result.
3. The personal status relationship analysis method based on mobile MAC according to claim 1, characterized in that the method further comprises, performed before step S1:
S0. receiving network log files and processing them in real time with a cleaning program, and pushing the valid data to a distributed message queue.
4. The personal status relationship analysis method based on mobile MAC according to claim 1, characterized in that step S1 further comprises:
S11. extracting multiple identity data of a person from network log files, and then performing step S12 and step S13 in parallel;
S12. deduplicating the identity data locally in memory, then deduplicating the identity data globally through a distributed cache service, and finally persisting the identity data to a distributed file system;
S13. determining the positions of the multiple identity data relative to one another according to the priority of the identity data so as to obtain personal status relationship data; then deduplicating the personal status relationship data locally in memory and globally through the distributed cache service; and finally persisting the personal status relationship data to the distributed file system.
5. The personal status relationship analysis method based on mobile MAC according to claim 4, characterized in that step S2 further comprises:
S21. loading the data of the distributed file system, wherein the identity data are loaded as the vertices of a graph and the personal status relationship data are loaded as the edges of the graph, personal status relationship data containing a mobile device MAC being loaded as a single edge and personal status relationship data not containing a mobile device MAC being expanded into two edges, and initializing the total graph from the vertex set and the edge set;
S22. filtering out problematic data by the out-degree of the mobile device MAC, and traversing the maximal subgraph with each mobile device MAC as the starting point to obtain the portrait of the person, wherein each subgraph contains only one mobile device MAC;
S23. persisting the identity data and the personal status relationship data, and simplifying the graph data through relationship conversion and weight conversion, wherein the relationship conversion converts indirect relationships in the personal status relationship data into direct relationships, and the weight conversion first calculates the maximum level n of the subgraph and then calculates the new weight by a formula (not reproduced in this text) in which G_k denotes the weight of the k-th level.
6. A personal status relationship analysis system based on mobile MAC, characterized by comprising:
an identity extraction module for extracting multiple identity data of a person from network log files, and for determining the positions of the multiple identity data relative to one another according to the priority of the identity data so as to obtain personal status relationship data, wherein the multiple identity data include a mobile device MAC and the mobile device MAC has the highest priority;
a portrait building module for storing the multiple identity data and the personal status relationship data of the person as graph data in the form of directed edges, and for traversing the maximal subgraph with the person's mobile device MAC as the starting point to obtain the portrait of the person, wherein each subgraph contains only one mobile device MAC.
7. The personal status relationship analysis system based on mobile MAC according to claim 6, characterized in that the system further comprises:
a person relationship analysis module for detecting whether the subgraphs of persons intersect, and for judging the friend relationship between person portraits according to the detection result.
8. The personal status relationship analysis system based on mobile MAC according to claim 6, characterized in that the system further comprises:
a log receiving module for receiving network log files and processing them in real time with a cleaning program, and for pushing the valid data to a distributed message queue, from which they are supplied to the identity extraction module.
9. The personal status relationship analysis system based on mobile MAC according to claim 6, characterized in that the identity extraction module further comprises:
an identity data extraction unit for extracting multiple identity data of a person from network log files;
an identity data processing unit for deduplicating the identity data locally in memory, then deduplicating the identity data globally through a distributed cache service, and finally persisting the identity data to a distributed file system;
a personal status relationship data processing unit for determining the positions of the multiple identity data relative to one another according to the priority of the identity data so as to obtain personal status relationship data, then deduplicating the personal status relationship data locally in memory and globally through the distributed cache service, and finally persisting the personal status relationship data to the distributed file system.
10. The personal status relationship analysis system based on mobile MAC according to claim 9, characterized in that the portrait building module further comprises:
a data loading unit for loading the data of the distributed file system, wherein the identity data are loaded as the vertices of a graph and the personal status relationship data are loaded as the edges of the graph, personal status relationship data containing a mobile device MAC being loaded as a single edge and personal status relationship data not containing a mobile device MAC being expanded into two edges, and for initializing the total graph from the vertex set and the edge set;
a portrait construction unit for filtering out problematic data by the out-degree of the mobile device MAC, and for traversing the maximal subgraph with each mobile device MAC as the starting point to obtain the portrait of the person, wherein each subgraph contains only one mobile device MAC;
a data reduction unit for persisting the identity data and the personal status relationship data, and for simplifying the graph data through relationship conversion and weight conversion, wherein the relationship conversion converts indirect relationships in the personal status relationship data into direct relationships, and the weight conversion first calculates the maximum level n of the subgraph and then calculates the new weight by a formula (not reproduced in this text) in which G_k denotes the weight of the k-th level.
CN201510859716.1A 2015-11-30 2015-11-30 A kind of personal status relationship analysis method and system based on mobile MAC Pending CN106815240A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510859716.1A CN106815240A (en) 2015-11-30 2015-11-30 A kind of personal status relationship analysis method and system based on mobile MAC

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510859716.1A CN106815240A (en) 2015-11-30 2015-11-30 A kind of personal status relationship analysis method and system based on mobile MAC

Publications (1)

Publication Number Publication Date
CN106815240A true CN106815240A (en) 2017-06-09

Family

ID=59156647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510859716.1A Pending CN106815240A (en) 2015-11-30 2015-11-30 A kind of personal status relationship analysis method and system based on mobile MAC

Country Status (1)

Country Link
CN (1) CN106815240A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109871415A (en) * 2019-01-21 2019-06-11 武汉光谷信息技术股份有限公司 A kind of user's portrait construction method, system and storage medium based on chart database
CN109871415B (en) * 2019-01-21 2021-04-30 武汉光谷信息技术股份有限公司 User portrait construction method and system based on graph database and storage medium
CN111444368A (en) * 2020-03-25 2020-07-24 平安科技(深圳)有限公司 Method and device for constructing user portrait, computer equipment and storage medium
CN111444368B (en) * 2020-03-25 2023-01-17 平安科技(深圳)有限公司 Method and device for constructing user portrait, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN107358146B (en) Method for processing video frequency, device and storage medium
CN108596041B (en) A kind of human face in-vivo detection method based on video
CN106940794A (en) A yard adjoint system is detectd in a kind of target collection
CN106372606A (en) Target object information generation method and unit identification method and unit and system
CN105389549A (en) Object recognition method and device based on human body action characteristic
CN104283918B (en) A kind of WLAN terminal type acquisition methods and system
CN105208528B (en) A kind of system and method for identifying with administrative staff
CN111079699A (en) Commodity identification method and device
CN103761279B (en) Method and system for scheduling network crawlers on basis of keyword search
CN109190586A (en) Customer's visiting analysis method, device and storage medium
CN109253888A (en) Detection method and system for vehicle vehicle condition
CN107204975A (en) A kind of industrial control system network attack detection technology based on scene fingerprint
CN109271793A (en) Internet of Things cloud platform device class recognition methods and system
CN106357416A (en) Group information recommendation method, device and terminal
CN108009497A (en) Image recognition monitoring method, system, computing device and readable storage medium storing program for executing
CN107360145A (en) A kind of multinode honey pot system and its data analysing method
CN108108897B (en) Rail transit passenger flow clearing method and system and electronic equipment
CN106874372A (en) The method and device of destination object identification information is obtained based on unmanned plane
CN106534784A (en) Acquisition analysis storage statistical system for video analysis data result set
CN108229262A (en) A kind of pornographic video detecting method and device
CN111382808A (en) Vehicle detection processing method and device
CN108319672A (en) Mobile terminal malicious information filtering method and system based on cloud computing
CN106326835A (en) Human face data collection statistical system and method for gas station convenience store
CN106789242A (en) A kind of identification application intellectual analysis engine based on mobile phone client software behavioral characteristics storehouse
CN110188717A (en) Image acquiring method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170609