CN109088788A - Data processing method, device, equipment and computer readable storage medium
- Publication number
- CN109088788A (application number CN201810752308.XA)
- Authority
- CN
- China
- Prior art keywords
- user
- data
- identity
- similarity
- agent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/02—Capturing of monitoring data
- H04L43/028—Capturing of monitoring data by filtering
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2221/00—Indexing scheme relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/21—Indexing scheme relating to G06F21/00 and subgroups addressing additional information or applications relating to security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F2221/2117—User registration
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Theoretical Computer Science (AREA)
- Computer Hardware Design (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present invention provides a data processing method, apparatus, device, and computer-readable storage medium. In the method, first identity feature data of a first user and of a second user are extracted from first user data and second user data to be processed, respectively, the first identity feature data including at least one item of identity information that uniquely identifies a user entity. Based on the first identity feature data of the first user and the second user, it is determined whether the two users belong to the same user entity. If they do, the first user data and the second user data are merged. Multiple sets of user data belonging to the same user entity are thereby merged into panoramic user feature data, which reduces data redundancy across the DPI system.
Description
Technical field
The present invention relates to the field of information data processing, and in particular to a data processing method, apparatus, device, and computer-readable storage medium.
Background art
Deep packet inspection (DPI) is an application-layer traffic detection and control technology based on data packets. It inspects and analyzes the information in the different layers of a data packet in depth to obtain the application-layer information of an entire data flow or data packet, and then performs statistical analysis and control of the traffic according to policies defined by the DPI system.
With the development of big data and Internet technology, a wide variety of applications have entered people's lives. Because different applications have no unified requirements for user registration information, the same user may register with different applications under different user identifiers, and different users may register with different applications under the same user identifier. At present, when a DPI system obtains users' behavior data, it builds separate user behavior data for each user of each application, so it stores a large amount of redundant data and cannot form panoramic user feature data.
Summary of the invention
The present invention provides a data processing method, apparatus, device, and computer-readable storage medium to solve the problem that current DPI systems, when obtaining users' behavior feature data, build separate user behavior feature data for each user of each application, store a large amount of redundant data, and cannot form panoramic user feature data.
One aspect of the invention provides a data processing method, comprising:
extracting first identity feature data of a first user and of a second user from first user data and second user data to be processed, respectively, the first identity feature data including at least one item of identity information that uniquely identifies a user entity;
determining, according to the first identity feature data of the first user and the second user, whether the first user and the second user belong to the same user entity; and
if it is determined that the first user and the second user belong to the same user entity, merging the first user data and the second user data.
Another aspect of the invention provides a data processing apparatus, comprising:
a data extraction module, configured to extract first identity feature data of a first user and of a second user from first user data and second user data to be processed, respectively, the first identity feature data including at least one item of identity information that uniquely identifies a user entity;
a determining module, configured to determine, according to the first identity feature data of the first user and the second user, whether the first user and the second user belong to the same user entity; and
a processing module, configured to merge the first user data and the second user data if it is determined that the first user and the second user belong to the same user entity.
Another aspect of the invention provides a deep packet inspection device, comprising:
a memory, a processor, and a computer program stored in the memory and executable on the processor,
wherein the processor implements the method described above when running the computer program.
Another aspect of the invention provides a computer-readable storage medium storing a computer program that implements the method described above when executed by a processor.
With the data processing method, apparatus, device, and computer-readable storage medium provided by the invention, first identity feature data of a first user and of a second user are extracted from first user data and second user data to be processed, respectively, the first identity feature data including at least one item of identity information that uniquely identifies a user entity. Based on these data, it is determined whether the first user and the second user belong to the same user entity, and if so, the first user data and the second user data are merged. Multiple sets of user data belonging to the same user entity are thereby merged into panoramic user feature data, which reduces data redundancy across the DPI system.
Brief description of the drawings
Fig. 1 is a flowchart of the data processing method provided by Embodiment one of the invention;
Fig. 2 is a flowchart of the data processing method provided by Embodiment two of the invention;
Fig. 3 is a schematic structural diagram of the data processing apparatus provided by Embodiment three of the invention;
Fig. 4 is a schematic structural diagram of the deep packet inspection device provided by Embodiment five of the invention.
The above drawings show specific embodiments of the invention, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concept in any way, but to explain the concept of the invention to those skilled in the art by reference to specific embodiments.
Detailed description of the embodiments
Exemplary embodiments are described in detail here, and examples of them are shown in the drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatuses and methods that are consistent with some aspects of the invention as detailed in the appended claims.
In the embodiments of the invention, the terms "first", "second", and so on are used for descriptive purposes only and should not be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. In the description of the following embodiments, "multiple" means two or more, unless specifically defined otherwise.
The specific embodiments below may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the invention are described below with reference to the drawings.
Embodiment one
Fig. 1 is a flowchart of the data processing method provided by Embodiment one of the invention. This embodiment provides a data processing method that addresses the problem that current DPI systems, when obtaining users' behavior feature data, build separate user behavior data for each user of each application, store a large amount of redundant data, and cannot form panoramic user feature data. The method in this embodiment is applied to a deep packet inspection device, which may be the computer device on which the DPI system runs. In other embodiments of the invention, the method may also be applied to other computer devices; this embodiment uses a deep packet inspection device as the example. As shown in Fig. 1, the method includes the following steps:
Step S101: first identity feature data of a first user and of a second user are extracted from first user data and second user data to be processed, respectively. The first identity feature data include at least one item of identity information that uniquely identifies a user entity.
In this embodiment, the first user data and the second user data are user behavior data that the DPI system has obtained either for two different user accounts of one application or for two user accounts of two different applications.
In practice, the first user data and the second user data to be processed may be specified by a technician through an application identifier and a registered user account, or they may be the user data corresponding to any two registered user accounts among the user data the DPI system has obtained. This embodiment places no specific limit on this.
The first identity feature data include at least one item of identity information that uniquely identifies a user entity. Such identity information may include at least: an identity card number, a mobile phone number, an e-mail address, and the like.
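The extraction in step S101 can be pictured with a minimal sketch. It assumes each set of user data is already available as a dictionary; the field names `id_card_number`, `phone_number`, and `email` and the normalization by trimming and lower-casing are illustrative assumptions, not something the source specifies.

```python
from typing import Dict, Mapping

# Hypothetical field names: the source only lists the kinds of identifiers
# (identity card number, mobile phone number, e-mail address).
UNIQUE_ID_FIELDS = ("id_card_number", "phone_number", "email")


def extract_first_identity_features(user_data: Mapping[str, object]) -> Dict[str, str]:
    """Collect the uniquely identifying fields present in one user record."""
    features: Dict[str, str] = {}
    for field in UNIQUE_ID_FIELDS:
        value = user_data.get(field)
        # Keep only non-empty strings; normalize so equal values compare equal.
        if isinstance(value, str) and value.strip():
            features[field] = value.strip().lower()
    return features


record = {
    "account": "alice01",
    "phone_number": " 13800000000 ",
    "email": "Alice@Example.com",
    "visited": ["news"],
}
features = extract_first_identity_features(record)
# features == {"phone_number": "13800000000", "email": "alice@example.com"}
```

Fields that do not uniquely identify a person, such as the account name or browsing history here, are deliberately left out of the result.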
Step S102: according to the first identity feature data of the first user and the second user, it is determined whether the first user and the second user belong to the same user entity.
Because a user's first identity feature data include at least one item of identity information that uniquely identifies a user entity, if the first identity feature data of the first user and of the second user both include at least one kind of such identity information, then the two users can be determined to belong to the same user entity when any one kind of identity information that both include is consistent.
Likewise, if the first identity feature data of the first user and of the second user both include at least one kind of such identity information, then the two users can be determined not to belong to the same user entity when any one kind of identity information that both include is inconsistent.
If the first identity feature data of the first user and of the second user have no kind of uniquely identifying identity information in common, then it can neither be determined from these data that the two users belong to the same user entity nor that they do not.
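The three outcomes of step S102 can be sketched as a small comparison function. The sketch assumes each user's first identity feature data is a dictionary keyed by identifier kind. The source does not say what happens when one shared identifier matches and another differs; giving the match precedence, as done here, is an assumption.

```python
from enum import Enum
from typing import Mapping


class MatchResult(Enum):
    SAME = "same"                  # at least one shared identifier is equal
    DIFFERENT = "different"        # identifiers are shared, but none is equal
    UNDETERMINED = "undetermined"  # no identifier kind is present for both users


def compare_first_identity_features(
    first: Mapping[str, str], second: Mapping[str, str]
) -> MatchResult:
    """Compare the identifier kinds that both users have in common."""
    shared_kinds = set(first) & set(second)
    if not shared_kinds:
        return MatchResult.UNDETERMINED
    # Assumption: a single matching identifier outweighs any mismatch.
    if any(first[kind] == second[kind] for kind in shared_kinds):
        return MatchResult.SAME
    return MatchResult.DIFFERENT
```

For example, two records that share the same phone number yield `MatchResult.SAME`, while a record holding only a phone number compared with one holding only an e-mail address yields `MatchResult.UNDETERMINED`.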
Step S103: if it is determined that the first user and the second user belong to the same user entity, the first user data and the second user data are merged.
Specifically, merging the first user data and the second user data includes:
generating a unified user data identifier that corresponds to both the first user data and the second user data, removing the redundant information in the first user data and the second user data, and generating more complete user data corresponding to that user data identifier.
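The merge in step S103 might look roughly like the following sketch. It assumes each set of user data maps a field name to a list of observed values, and it uses a random UUID as the unified user data identifier. Both choices are illustrative assumptions, since the source does not fix a data layout or an identifier scheme.

```python
import uuid
from typing import Dict, List


def merge_user_data(
    first: Dict[str, List[str]], second: Dict[str, List[str]]
) -> Dict[str, object]:
    """Merge two records under one identifier, dropping duplicate values."""
    merged: Dict[str, List[str]] = {}
    for record in (first, second):
        for field, values in record.items():
            bucket = merged.setdefault(field, [])
            for value in values:
                # Redundant information is stored only once.
                if value not in bucket:
                    bucket.append(value)
    # Hypothetical identifier scheme: any unique key would serve.
    return {"unified_id": uuid.uuid4().hex, "data": merged}


merged = merge_user_data(
    {"phone": ["138"], "sites": ["a", "b"]},
    {"phone": ["138"], "sites": ["b", "c"]},
)
# merged["data"] == {"phone": ["138"], "sites": ["a", "b", "c"]}
```

Keeping first-seen order makes the merged record deterministic apart from the generated identifier.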
In this embodiment, first identity feature data of the first user and of the second user are extracted from the first user data and the second user data to be processed, respectively, the first identity feature data including at least one item of identity information that uniquely identifies a user entity. Based on these data, it is determined whether the first user and the second user belong to the same user entity, and if so, the first user data and the second user data are merged. Multiple sets of user data belonging to the same user entity are thereby merged into panoramic user feature data, which reduces data redundancy across the DPI system.
Embodiment two
Fig. 2 is a flowchart of the data processing method provided by Embodiment two of the invention. On the basis of Embodiment one, in this embodiment, if it is not determined that the first user and the second user belong to the same user entity, second identity feature data of the first user and of the second user are extracted from the first user data and the second user data to be processed, respectively. The second identity feature data include at least: home address, friend information, association relationships, and behavior feature data. The similarity between the second identity feature data of the two users is calculated and compared with a first preset threshold. If the similarity is greater than the first preset threshold, it is determined that the first user and the second user belong to the same user entity, and the first user data and the second user data are merged. If the similarity is less than or equal to the first preset threshold, it is compared with a second preset threshold, which is smaller than the first preset threshold. If the similarity is greater than the second preset threshold, an association relationship is established between the first user data and the second user data.
As shown in Fig. 2, the method includes the following steps:
Step S201: first identity feature data of a first user and of a second user are extracted from first user data and second user data to be processed, respectively. The first identity feature data include at least one item of identity information that uniquely identifies a user entity.
In this embodiment, the first user data and the second user data are user behavior data that the DPI system has obtained either for two different user accounts of one application or for two user accounts of two different applications.
In practice, the first user data and the second user data to be processed may be specified by a technician through an application identifier and a registered user account, or they may be the user data corresponding to any two registered user accounts among the user data the DPI system has obtained. This embodiment places no specific limit on this.
The first identity feature data include at least one item of identity information that uniquely identifies a user entity. Such identity information may include at least: an identity card number, a mobile phone number, an e-mail address, and the like.
Optionally, the first identity feature data of the first user and of the second user may be extracted from the first user data and the second user data to be processed, respectively, and recorded in a data list.
Step S202: according to the first identity feature data of the first user and the second user, it is determined whether the first user and the second user belong to the same user entity.
In this embodiment, this determination may be implemented as follows:
Judge whether any item of identity information is consistent between the first identity feature data of the first user and of the second user. If any item of identity information is consistent, determine that the first user and the second user belong to the same user entity.
Because a user's first identity feature data include at least one item of identity information that uniquely identifies a user entity, if the first identity feature data of the first user and of the second user both include at least one kind of such identity information, then the two users can be determined to belong to the same user entity when any one kind of identity information that both include is consistent.
Likewise, if the first identity feature data of the first user and of the second user both include at least one kind of such identity information, then the two users can be determined not to belong to the same user entity when any one kind of identity information that both include is inconsistent.
If the first identity feature data of the first user and of the second user have no kind of uniquely identifying identity information in common, then it can neither be determined from these data that the two users belong to the same user entity nor that they do not.
Step S203: if it is determined that the first user and the second user belong to the same user entity, the first user data and the second user data are merged.
Specifically, merging the first user data and the second user data includes:
generating a unified user data identifier that corresponds to both the first user data and the second user data, removing the redundant information in the first user data and the second user data, and generating more complete user data corresponding to that user data identifier.
Step S204: if it is not determined that the first user and the second user belong to the same user entity, second identity feature data of the first user and of the second user are extracted from the first user data and the second user data to be processed, respectively.
The second identity feature data include at least: home address, friend information, association relationships, and behavior feature data. The association relationships may be mobile phone contact information. Optionally, the second identity feature data may also include accounts of instant messaging tools and the like.
Optionally, before the second identity feature data of the first user and of the second user are extracted in this step, the method further includes:
extracting the registered accounts of the first user and of the second user from the first user data and the second user data to be processed, respectively; judging whether the two registered accounts are identical; and, if they are identical, performing the subsequent step of extracting the second identity feature data of the first user and of the second user from the first user data and the second user data to be processed.
If the registered accounts of the first user and the second user are not identical, the similarity of the two registered accounts is calculated, and it is judged whether that similarity is greater than a third preset threshold. If it is, the subsequent step of extracting the second identity feature data of the first user and of the second user is performed.
The registered accounts of the first user and the second user are two character strings. Their similarity may be calculated with any existing method for measuring how similar two character strings are; this embodiment places no specific limit on this. For example, the two strings may be matched against each other to find their longest matching substring, and the proportion that this longest substring accounts for is then calculated.
In addition, the third preset threshold may be set by a technician according to actual needs; this embodiment places no specific limit on it.
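The longest-matching-substring example above can be sketched with Python's standard `difflib` module. The source does not say what the proportion is measured against; dividing by the length of the longer account string is an assumption, as are the function names and the sample threshold value.

```python
from difflib import SequenceMatcher


def account_similarity(first: str, second: str) -> float:
    """Length of the longest common substring over the longer string's length."""
    if not first or not second:
        return 0.0
    matcher = SequenceMatcher(None, first, second, autojunk=False)
    match = matcher.find_longest_match(0, len(first), 0, len(second))
    # Assumption: the proportion is taken relative to the longer account.
    return match.size / max(len(first), len(second))


def should_extract_second_features(first: str, second: str, third_threshold: float) -> bool:
    """Gate for step S204: identical accounts, or similarity above the threshold."""
    return first == second or account_similarity(first, second) > third_threshold


similarity = account_similarity("alice2018", "alice_2018")
# The longest common substring is "alice" (5 characters) and the longer
# account has 10 characters, so similarity == 0.5.
```

With a hypothetical third preset threshold of 0.4, this pair of accounts would proceed to the extraction of second identity feature data.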
Step S205: the similarity between the second identity feature data of the first user and of the second user is calculated and compared with the first preset threshold.
In this embodiment, the similarity between the second identity feature data of the two users may be calculated with any existing method that computes the similarity of two users from their behavior data and attribute information; this embodiment places no specific limit on this.
The first preset threshold may be set by a technician according to actual needs; this embodiment places no specific limit on it.
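The source leaves the similarity measure for step S205 open. As one possible illustration only, the sketch below averages the Jaccard similarity over the feature categories that both users have. The choice of Jaccard similarity, the equal weighting of categories, and the category names are all assumptions rather than the patent's method.

```python
from typing import Mapping, Set


def jaccard(first: Set[str], second: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    union = first | second
    return len(first & second) / len(union) if union else 0.0


def second_feature_similarity(
    first: Mapping[str, Set[str]], second: Mapping[str, Set[str]]
) -> float:
    """Average Jaccard similarity over the categories present for both users."""
    shared = [category for category in first if category in second]
    if not shared:
        return 0.0
    return sum(jaccard(first[c], second[c]) for c in shared) / len(shared)


user_a = {"friends": {"u1", "u2", "u3"}, "home_address": {"1 main st"}}
user_b = {"friends": {"u2", "u3", "u4"}, "home_address": {"1 main st"}}
score = second_feature_similarity(user_a, user_b)
# friends: 2 shared of 4 distinct -> 0.5; home_address: 1.0; average 0.75
```

A production system would likely weight categories differently, for example by trusting a matching home address more than overlapping browsing behavior.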
Step S206: if the similarity between the second identity feature data of the first user and of the second user is greater than the first preset threshold, it is determined that the first user and the second user belong to the same user entity, and the first user data and the second user data are merged.
In this embodiment, a similarity greater than the first preset threshold indicates that the second identity feature data of the two users are highly similar, so the first user and the second user can be considered to belong to the same user entity, and the first user data and the second user data are merged.
The merging of the first user data and the second user data is the same as in step S203 and is not described again here.
Step S207: if the similarity between the second identity feature data of the first user and of the second user is less than or equal to the first preset threshold, the similarity is compared with a second preset threshold, which is smaller than the first preset threshold.
The second preset threshold may be set by a technician according to actual needs; this embodiment places no specific limit on it.
If the similarity is less than or equal to the second preset threshold, the degree of association between the first user data and the second user data is small. The first user data and the second user data are then neither merged nor linked by an association relationship.
Step S208: if the similarity between the second identity feature data of the first user and of the second user is greater than the second preset threshold, an association relationship is established between the first user data and the second user data.
A similarity greater than the second preset threshold means that, although the currently available user data of the two users are not enough to determine that they belong to the same user entity, the two users are closely associated. An association relationship is therefore established between the first user data and the second user data, so that once more important identity data of the two users are obtained later, it can be determined more accurately whether they belong to the same user entity, which improves the precision of user data merging.
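Steps S206 to S208 amount to a decision with two thresholds, which the following sketch spells out. The function name, the returned labels, and the sample threshold values are illustrative assumptions; the source only requires that the second preset threshold be smaller than the first.

```python
def decide_action(similarity: float, first_threshold: float, second_threshold: float) -> str:
    """Map a similarity score to 'merge', 'associate', or 'none'."""
    if second_threshold >= first_threshold:
        raise ValueError("the second threshold must be smaller than the first")
    if similarity > first_threshold:
        return "merge"      # S206: treat both users as one user entity
    if similarity > second_threshold:
        return "associate"  # S208: record an association for later re-checking
    return "none"           # S207: neither merge nor associate


action = decide_action(0.75, first_threshold=0.8, second_threshold=0.5)
# action == "associate"
```

Both comparisons are strict, so a similarity exactly equal to the first threshold falls into the association branch, and one exactly equal to the second threshold leads to no action.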
In this embodiment, when it is not determined that the first user and the second user belong to the same user entity, second identity feature data of the first user and of the second user are extracted from the first user data and the second user data to be processed, respectively, and the similarity between them is evaluated. If the similarity is greater than the first preset threshold, it is determined that the first user and the second user belong to the same user entity, and the first user data and the second user data are merged. If the similarity is less than or equal to the first preset threshold but greater than the second preset threshold, an association relationship is established between the first user data and the second user data. This makes it possible to accurately identify multiple users belonging to the same user entity and to merge multiple sets of user data of the same user entity into panoramic user feature data, which reduces data redundancy across the DPI system.
Embodiment three
Fig. 3 is a schematic structural diagram of the data processing apparatus provided by Embodiment three of the invention. The data processing apparatus provided by this embodiment can execute the processing flow provided by the data processing method embodiments. As shown in Fig. 3, the apparatus 30 includes: a data extraction module 301, a determining module 302, and a processing module 303.
Specifically, the data extraction module 301 is configured to extract first identity feature data of a first user and of a second user from first user data and second user data to be processed, respectively. The first identity feature data include at least one item of identity information that uniquely identifies a user entity.
The determining module 302 is configured to determine, according to the first identity feature data of the first user and the second user, whether the first user and the second user belong to the same user entity.
The processing module 303 is configured to merge the first user data and the second user data if it is determined that the first user and the second user belong to the same user entity.
The apparatus provided by this embodiment can be used to execute the method embodiment provided in Embodiment one above; its specific functions are not described again here.
In this embodiment, first identity feature data of the first user and of the second user are extracted from the first user data and the second user data to be processed, respectively, the first identity feature data including at least one item of identity information that uniquely identifies a user entity. Based on these data, it is determined whether the first user and the second user belong to the same user entity, and if so, the first user data and the second user data are merged. Multiple sets of user data belonging to the same user entity are thereby merged into panoramic user feature data, which reduces data redundancy across the DPI system.
Example IV
On the basis of Embodiment three above, in this embodiment the processing module is further configured to:
If it is not determined that the first user and the second user belong to the same user agent, extract the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively, where the second identity characteristic data includes at least: home address, friend information, association relationships, and behavioral characteristic data; calculate the similarity between the second identity characteristic data of the first user and that of the second user; compare this similarity with a first preset threshold; and if the similarity is greater than the first preset threshold, determine that the first user and the second user belong to the same user agent and merge the first user data and the second user data.
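The text does not state how the similarity between two sets of second identity characteristic data is computed. The Python sketch below shows one possible measure, assumed purely for illustration: each of the four kinds of data is represented as a set of strings, a Jaccard index is computed per field, and the four scores are averaged. The field names and the choice of Jaccard similarity are hypothetical.

```python
from typing import Dict, Set

# Hypothetical field names for the four kinds of second identity data.
SECOND_IDENTITY_FIELDS = ("home_address", "friends", "associations", "behaviors")


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two sets; two empty sets are treated as sharing nothing."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def second_identity_similarity(u1: Dict[str, Set[str]], u2: Dict[str, Set[str]]) -> float:
    """Average per-field Jaccard similarity (an assumed measure, not the patent's)."""
    scores = [
        jaccard(u1.get(field, set()), u2.get(field, set()))
        for field in SECOND_IDENTITY_FIELDS
    ]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    user1 = {
        "home_address": {"1 Example Road"},
        "friends": {"f1", "f2"},
        "associations": set(),
        "behaviors": {"news", "video"},
    }
    user2 = {
        "home_address": {"1 Example Road"},
        "friends": {"f1", "f3"},
        "associations": set(),
        "behaviors": {"news", "video"},
    }
    print(second_identity_similarity(user1, user2))
```

The resulting score lies between 0 and 1 and would then be compared with the first preset threshold using a strict greater-than test, as described above.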
Optionally, the processing module is further configured to:
If the similarity between the second identity characteristic data of the first user and that of the second user is less than or equal to the first preset threshold, compare the similarity with a second preset threshold, where the second preset threshold is less than the first preset threshold; and if the similarity is greater than the second preset threshold, establish an association relationship between the first user data and the second user data.
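The two-threshold comparison can be sketched in Python as follows. The threshold values 0.8 and 0.5 are placeholders chosen for the example, and the `NONE` outcome for a similarity at or below the second preset threshold is an assumption, since the text does not say what happens in that case.

```python
from enum import Enum


class Decision(Enum):
    MERGE = "merge"          # treat as the same user agent and merge the data
    ASSOCIATE = "associate"  # only record an association relationship
    NONE = "none"            # assumed outcome: no action is taken


def decide(similarity: float,
           first_threshold: float = 0.8,
           second_threshold: float = 0.5) -> Decision:
    """Map a similarity score to an action using two preset thresholds."""
    if not second_threshold < first_threshold:
        raise ValueError("the second threshold must be less than the first")
    if similarity > first_threshold:
        return Decision.MERGE
    # Here similarity <= first_threshold, so test the lower threshold.
    if similarity > second_threshold:
        return Decision.ASSOCIATE
    return Decision.NONE


if __name__ == "__main__":
    for score in (0.9, 0.8, 0.6, 0.5):
        print(score, decide(score).value)
```

Because both comparisons are strict, a similarity exactly equal to the first preset threshold leads to an association rather than a merge.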
Optionally, the processing module is further configured to:
Extract the registered accounts of the first user and the second user from the first user data and the second user data to be processed, respectively; judge whether the registered accounts of the first user and the second user are consistent; and if the registered accounts are consistent, perform the subsequent step of extracting the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively.
Optionally, the processing module is further configured to:
If the registered accounts of the first user and the second user are inconsistent, calculate the similarity of the registered accounts of the first user and the second user; judge whether this similarity is greater than a third preset threshold; and if the similarity of the registered accounts is greater than the third preset threshold, perform the subsequent step of extracting the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively.
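The registered-account check acts as a gate before the second identity characteristic data is extracted. A minimal Python sketch of that gate follows. The string-similarity measure (`difflib.SequenceMatcher.ratio` from the Python standard library) and the threshold value 0.7 are assumptions for illustration; the text names neither a measure nor a value for the third preset threshold.

```python
from difflib import SequenceMatcher


def accounts_allow_comparison(account1: str,
                              account2: str,
                              third_threshold: float = 0.7) -> bool:
    """Return True if the second identity data should be extracted and compared.

    Consistent accounts pass directly; otherwise the account strings must be
    more similar than the (placeholder) third preset threshold.
    """
    if account1 == account2:
        return True
    similarity = SequenceMatcher(None, account1, account2).ratio()
    return similarity > third_threshold


if __name__ == "__main__":
    print(accounts_allow_comparison("alice_2018", "alice_2018"))
    print(accounts_allow_comparison("alice_2018", "alice2018"))
    print(accounts_allow_comparison("alice_2018", "bob"))
```

Running the cheap account comparison first avoids computing the more expensive second identity similarity for pairs of users that are clearly unrelated.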
Optionally, the processing module is further configured to:
Judge whether any piece of identity information in the first identity characteristic data of the first user is consistent with the corresponding identity information of the second user; and if any piece of identity information in the first identity characteristic data of the first user and the second user is consistent, determine that the first user and the second user belong to the same user agent.
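The "any one identity information is consistent" rule can be sketched in Python as below. The field names `id_number`, `phone`, and `email` are hypothetical examples of uniquely identifying information, and treating a missing or empty value as "no match" is an assumption of this sketch.

```python
from typing import Dict, Optional

# Hypothetical examples of identity information that uniquely identifies a user agent.
FIRST_IDENTITY_FIELDS = ("id_number", "phone", "email")


def same_user_agent(first: Dict[str, Optional[str]],
                    second: Dict[str, Optional[str]]) -> bool:
    """Return True if at least one unique identity field matches in both records."""
    for field in FIRST_IDENTITY_FIELDS:
        a = first.get(field)
        b = second.get(field)
        # A field only counts when both sides carry a non-empty value.
        if a and b and a == b:
            return True
    return False


if __name__ == "__main__":
    u1 = {"phone": "13800000000", "email": "x@example.com"}
    u2 = {"phone": "13800000000", "email": "y@example.com"}
    print(same_user_agent(u1, u2))
```

A single matching field is sufficient here; the other fields may differ, as the two e-mail addresses do in the usage example.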
The device provided in this embodiment of the present invention can be used to execute the method embodiment provided in Embodiment two above; its specific functions are not repeated here.
In this embodiment of the present invention, when it is not determined that the first user and the second user belong to the same user agent, the second identity characteristic data of the first user and the second user are extracted from the first user data and the second user data to be processed, respectively, and the decision is made according to the similarity between the two sets of second identity characteristic data. If the similarity is greater than the first preset threshold, it is determined that the first user and the second user belong to the same user agent, and the first user data and the second user data are merged. If the similarity is less than or equal to the first preset threshold but greater than the second preset threshold, an association relationship is established between the first user data and the second user data. In this way, the multiple users belonging to the same user agent are accurately determined, and multiple pieces of user data of the same user agent are merged into panoramic user characteristic data, which reduces the data redundancy of the DPI system as a whole.
Embodiment five
Fig. 4 is a schematic structural diagram of the deep packet inspection device provided in Embodiment five of the present invention. As shown in Fig. 4, the device 40 includes: a processor 401, a memory 402, and a computer program that is stored on the memory 402 and executable by the processor 401.
When executing the computer program stored on the memory 402, the processor 401 implements the data processing method provided in any of the above method embodiments.
In this embodiment of the present invention, the first identity characteristic data of the first user and the second user are extracted from the first user data and the second user data to be processed, respectively, where the first identity characteristic data includes at least one piece of identity information that uniquely identifies a user agent; whether the first user and the second user belong to the same user agent is determined according to their first identity characteristic data; and if it is determined that they belong to the same user agent, the first user data and the second user data are merged. In this way, multiple pieces of user data of the same user agent are merged into panoramic user characteristic data, which reduces the data redundancy of the DPI system as a whole.
In addition, an embodiment of the present invention further provides a computer-readable storage medium storing a computer program; when executed by a processor, the computer program implements the data processing method provided in any of the above method embodiments.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be implemented through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional modules above is used only as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. For the specific working process of the device described above, reference may be made to the corresponding process in the foregoing method embodiments, which is not repeated here.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed by the present invention. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention indicated by the following claims.
It should be understood that the present invention is not limited to the precise structure described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (13)
1. A data processing method, characterized by comprising:
extracting first identity characteristic data of a first user and a second user from first user data and second user data to be processed, respectively, the first identity characteristic data including at least one piece of identity information for uniquely identifying a user agent;
determining, according to the first identity characteristic data of the first user and the second user, whether the first user and the second user belong to the same user agent;
if it is determined that the first user and the second user belong to the same user agent, merging the first user data and the second user data.
2. The method according to claim 1, characterized in that determining, according to the first identity characteristic data of the first user and the second user, whether the first user and the second user belong to the same user agent comprises:
judging whether any piece of the identity information in the first identity characteristic data of the first user and the second user is consistent;
if any piece of the identity information in the first identity characteristic data of the first user and the second user is consistent, determining that the first user and the second user belong to the same user agent.
3. The method according to claim 1 or 2, characterized in that, after determining, according to the first identity characteristic data of the first user and the second user, whether the first user and the second user belong to the same user agent, the method further comprises:
if it is not determined that the first user and the second user belong to the same user agent, extracting second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively, the second identity characteristic data including at least: home address, friend information, association relationships, and behavioral characteristic data;
calculating the similarity between the second identity characteristic data of the first user and the second user;
comparing the similarity between the second identity characteristic data of the first user and the second user with a first preset threshold;
if the similarity between the second identity characteristic data of the first user and the second user is greater than the first preset threshold, determining that the first user and the second user belong to the same user agent, and merging the first user data and the second user data.
4. The method according to claim 3, characterized in that, after comparing the similarity between the second identity characteristic data of the first user and the second user with the first preset threshold, the method further comprises:
if the similarity between the second identity characteristic data of the first user and the second user is less than or equal to the first preset threshold, comparing the similarity between the second identity characteristic data of the first user and the second user with a second preset threshold, the second preset threshold being less than the first preset threshold;
if the similarity between the second identity characteristic data of the first user and the second user is greater than the second preset threshold, establishing an association relationship between the first user data and the second user data.
5. The method according to claim 3, characterized in that, if it is not determined that the first user and the second user belong to the same user agent, before extracting the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively, the method further comprises:
extracting registered accounts of the first user and the second user from the first user data and the second user data to be processed, respectively;
judging whether the registered accounts of the first user and the second user are consistent;
if the registered accounts of the first user and the second user are consistent, performing the subsequent step of extracting the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively.
6. The method according to claim 5, characterized in that, after judging whether the registered accounts of the first user and the second user are consistent, the method further comprises:
if the registered accounts of the first user and the second user are inconsistent, calculating the similarity of the registered accounts of the first user and the second user;
judging whether the similarity of the registered accounts of the first user and the second user is greater than a third preset threshold;
if the similarity of the registered accounts of the first user and the second user is greater than the third preset threshold, performing the subsequent step of extracting the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively.
7. A data processing device, characterized by comprising:
a data extraction module, configured to extract first identity characteristic data of a first user and a second user from first user data and second user data to be processed, respectively, the first identity characteristic data including at least one piece of identity information for uniquely identifying a user agent;
a determining module, configured to determine, according to the first identity characteristic data of the first user and the second user, whether the first user and the second user belong to the same user agent;
a processing module, configured to merge the first user data and the second user data if it is determined that the first user and the second user belong to the same user agent.
8. The device according to claim 7, characterized in that the processing module is further configured to:
if it is not determined that the first user and the second user belong to the same user agent, extract second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively, the second identity characteristic data including at least: home address, friend information, association relationships, and behavioral characteristic data;
calculate the similarity between the second identity characteristic data of the first user and the second user;
compare the similarity between the second identity characteristic data of the first user and the second user with a first preset threshold;
if the similarity between the second identity characteristic data of the first user and the second user is greater than the first preset threshold, determine that the first user and the second user belong to the same user agent, and merge the first user data and the second user data.
9. The device according to claim 8, characterized in that the processing module is further configured to:
if the similarity between the second identity characteristic data of the first user and the second user is less than or equal to the first preset threshold, compare the similarity between the second identity characteristic data of the first user and the second user with a second preset threshold, the second preset threshold being less than the first preset threshold;
if the similarity between the second identity characteristic data of the first user and the second user is greater than the second preset threshold, establish an association relationship between the first user data and the second user data.
10. The device according to claim 8, characterized in that the processing module is further configured to:
extract registered accounts of the first user and the second user from the first user data and the second user data to be processed, respectively;
judge whether the registered accounts of the first user and the second user are consistent;
if the registered accounts of the first user and the second user are consistent, perform the subsequent step of extracting the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively.
11. The device according to claim 10, characterized in that the processing module is further configured to:
if the registered accounts of the first user and the second user are inconsistent, calculate the similarity of the registered accounts of the first user and the second user;
judge whether the similarity of the registered accounts of the first user and the second user is greater than a third preset threshold;
if the similarity of the registered accounts of the first user and the second user is greater than the third preset threshold, perform the subsequent step of extracting the second identity characteristic data of the first user and the second user from the first user data and the second user data to be processed, respectively.
12. A deep packet inspection device, characterized by comprising:
a memory, a processor, and a computer program stored on the memory and executable on the processor,
wherein the processor implements the method according to any one of claims 1-6 when running the computer program.
13. A computer-readable storage medium, characterized in that a computer program is stored thereon,
and the computer program, when executed by a processor, implements the method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810752308.XA CN109088788B (en) | 2018-07-10 | 2018-07-10 | Data processing method, device, equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810752308.XA CN109088788B (en) | 2018-07-10 | 2018-07-10 | Data processing method, device, equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109088788A (en) | 2018-12-25 |
CN109088788B (en) | 2021-02-02 |
Family
ID=64837484
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810752308.XA Active CN109088788B (en) | 2018-07-10 | 2018-07-10 | Data processing method, device, equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109088788B (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101729682A (en) * | 2009-11-11 | 2010-06-09 | 南京联创科技集团股份有限公司 | Method for automatically tracing communication network users |
US9774670B2 (en) * | 2010-08-22 | 2017-09-26 | Qwilt, Inc. | Methods for detection of content servers and caching popular content therein |
CN103905379A (en) * | 2012-12-25 | 2014-07-02 | 腾讯科技(深圳)有限公司 | Method for identifying internet users and device thereof |
CN106572048A (en) * | 2015-10-09 | 2017-04-19 | 腾讯科技(深圳)有限公司 | Identification method and system of user information in social network |
CN105844489A (en) * | 2016-03-21 | 2016-08-10 | 联想(北京)有限公司 | Information processing method and electronic device |
CN106570719A (en) * | 2016-08-24 | 2017-04-19 | 阿里巴巴集团控股有限公司 | Data processing method and apparatus |
CN108235368A (en) * | 2016-12-15 | 2018-06-29 | 中国电信股份有限公司 | For determining the method and device of the radio resource of business occupancy |
CN108259314A (en) * | 2016-12-29 | 2018-07-06 | 乐视汽车(北京)有限公司 | Information-pushing method and device |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111767348A (en) * | 2019-04-02 | 2020-10-13 | 上海晶赞融宣科技有限公司 | Data fusion method and device, storage medium and server |
CN110245146A (en) * | 2019-05-20 | 2019-09-17 | 中国平安人寿保险股份有限公司 | A kind of user knows method for distinguishing and relevant apparatus |
CN110245146B (en) * | 2019-05-20 | 2022-11-25 | 中国平安人寿保险股份有限公司 | User identification method and related device |
CN110557363A (en) * | 2019-06-03 | 2019-12-10 | 北京城市网邻信息技术有限公司 | identity verification method, device and storage medium |
CN112395320A (en) * | 2020-11-26 | 2021-02-23 | 深圳市房多多网络科技有限公司 | Building information merging method, device, equipment and computer readable storage medium |
CN112395320B (en) * | 2020-11-26 | 2023-03-07 | 深圳市房多多网络科技有限公司 | Building information merging method, device, equipment and computer readable storage medium |
CN113641657A (en) * | 2021-08-23 | 2021-11-12 | 苏州良医汇网络科技有限公司 | Method, device and equipment for merging user accounts |
Also Published As
Publication number | Publication date |
---|---|
CN109088788B (en) | 2021-02-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |