CN109598525A - Data processing method and device - Google Patents
Data processing method and device
- Publication number
- CN109598525A (CN201710916816.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- abnormal
- application platform
- application
- side attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0241—Advertisements
- G06Q30/0251—Targeted advertisements
- G06Q30/0255—Targeted advertisements based on user history
Landscapes
- Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Economics (AREA)
- Game Theory and Decision Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a data processing method and device. ID data recorded by multiple application platforms are obtained and abnormal ID data are screened out of them. The number of abnormal ID data recorded by each application platform is then taken into account: all ID data recorded by an application platform judged unreliable on that basis are rejected, together with the abnormal ID data screened out from the records of the other application platforms. The remaining normal ID data serve as ID data to be processed, and the association relationships among them are used to determine, accurately and quickly, at least one target object to which these ID data map, which makes it easier to deliver advertisements accurately to different objects.
Description
Technical field
The present invention relates to the technical field of data processing, and more particularly to a data processing method and device.
Background art
Digital marketing is the practice of promoting products and services through digital communication channels, so as to communicate with consumers in a timely, relevant, customized, and cost-saving manner. Analyzing users' internet behavior data in order to deliver targeted advertisements is a common technique in digital advertising; it can meet the needs of different users and substantially improve user experience and product sales.
In practice, the same user may leave data in different scenarios on different platforms, such as a browser cookie, a mobile device ID, a website account, a mobile phone number, and other user ID data. To attribute a user's various ID data to the same user, the association relationships among the monitored ID data are usually used to link the ID data together (ID linking), and a globally unique virtual ID is generated for the user.
However, ID linking is easily affected by abnormal ID data, which can invalidate the association results. For example, when users fill in data carelessly, a large number of different users may end up with the same mobile phone number. Under existing data processing schemes, that phone number causes different users to be identified as the same user, which makes the whole set of ID association relationships unreliable, lowers the accuracy of user behavior analysis, and in turn harms the reliability and accuracy of advertisement delivery.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a data processing method and device that overcome the above problems or at least partially solve them.
An embodiment of the present invention provides a data processing method, the method comprising:
Obtaining ID data recorded by multiple application platforms;
Screening out abnormal ID data from the ID data;
Counting the number of abnormal ID data contained in the ID data recorded by each application platform;
When the counted number is determined to meet a first preset standard, rejecting the ID data recorded by the corresponding application platform and the abnormal ID data recorded by the other application platforms, and taking the remaining obtained ID data as ID data to be processed;
Determining, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed map.
Preferably, screening out the abnormal ID data from the ID data comprises:
Grouping the obtained ID data according to the type of the ID data;
Counting, for each ID data item in any one group, its association relationships with the ID data of the other groups;
Determining that ID data whose association relationships meet a second preset standard are abnormal ID data.
Preferably, screening out the abnormal ID data from the ID data comprises:
Constructing an undirected graph in which the obtained ID data are the vertices and the application platform to which the ID data belong serves as the edges of those vertices;
Extracting ID association subgraphs according to the edge attributes of the undirected graph;
Obtaining the ID data quality feature corresponding to each edge attribute in each ID association subgraph;
Determining the abnormal ID data corresponding to each edge attribute by using the judgment standard corresponding to each kind of ID data quality feature.
Preferably, screening out the abnormal ID data from the ID data comprises:
Screening out the abnormal ID data from the ID data according to a preset blacklist filtering rule;
Or
Screening out, as abnormal ID data, the ID data that do not conform to a preset whitelist filtering rule.
Preferably, obtaining the ID data quality feature corresponding to each edge attribute in each ID association subgraph comprises:
For each edge attribute of each ID association subgraph, computing the distribution of the quantity ratio between the different kinds of ID data of that attribute;
And determining the abnormal ID data corresponding to each edge attribute by using the judgment standard corresponding to each kind of ID data quality feature comprises:
Obtaining the preset quantile corresponding to each edge attribute;
Judging whether the quantity ratio of the second ID data corresponding to one first ID data item of each edge attribute exceeds the corresponding preset quantile, the first ID data and the second ID data being ID data of different types corresponding to the same edge attribute;
If so, determining that the first ID data item is abnormal ID data;
If not, selecting new first ID data and second ID data and returning to the step of judging whether the quantity ratio of the second ID data corresponding to one first ID data item of each edge attribute exceeds the corresponding preset quantile, until the quantity-ratio distributions of the different types of ID data of all edge attributes have been judged.
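The quantile check above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function names, the input format (pairs of first-type and second-type ID values observed under one edge attribute), the nearest-rank quantile, and the default of 0.95 are all hypothetical and are not specified by the patent. The sketch also simplifies the quantity-ratio feature to the number of distinct second-type IDs linked to each first-type ID.

```python
import math
from collections import defaultdict


def quantile(values, q):
    """Nearest-rank quantile of a non-empty list, with q in (0, 1]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def abnormal_first_ids(pairs, q=0.95):
    """Flag first-type IDs linked to more second-type IDs than the q-quantile.

    pairs: iterable of (first_id, second_id) tuples for ONE edge attribute
    (hypothetical input format, e.g. phone number -> mailbox address).
    """
    linked = defaultdict(set)
    for first_id, second_id in pairs:
        linked[first_id].add(second_id)
    if not linked:
        return set()
    counts = {fid: len(seconds) for fid, seconds in linked.items()}
    threshold = quantile(list(counts.values()), q)
    # A first ID is treated as abnormal only if it exceeds the quantile value.
    return {fid for fid, count in counts.items() if count > threshold}
```

Calling `abnormal_first_ids` once per edge attribute, each with its own quantile, mirrors the loop over edge attributes in the steps above.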
An embodiment of the present invention further provides a data processing device, the device comprising:
An obtaining module, configured to obtain ID data recorded by multiple application platforms;
A screening module, configured to screen out abnormal ID data from the ID data;
A statistics module, configured to count the number of abnormal ID data contained in the ID data recorded by each application platform;
A data processing module, configured to, when the counted number is determined to meet a first preset standard, reject the ID data recorded by the corresponding application platform and the abnormal ID data recorded by the other application platforms, and take the remaining obtained ID data as ID data to be processed;
A target object determining module, configured to determine, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed map.
Preferably, the screening module comprises:
A grouping unit, configured to group the obtained ID data according to the type of the ID data;
A first statistics unit, configured to count, for each ID data item in any one group, its association relationships with the ID data of the other groups;
A first determining unit, configured to determine that ID data whose association relationships meet a second preset standard are abnormal ID data.
Preferably, the screening module comprises:
A construction unit, configured to construct an undirected graph in which the obtained ID data are the vertices and the application platform to which the ID data belong serves as the edges of those vertices;
An extraction unit, configured to extract ID association subgraphs according to the edge attributes of the undirected graph;
A feature obtaining unit, configured to obtain the ID data quality feature corresponding to each edge attribute in each ID association subgraph;
A second determining unit, configured to determine the abnormal ID data corresponding to each edge attribute by using the judgment standard corresponding to each kind of ID data quality feature.
An embodiment of the present invention further provides a storage medium comprising a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the data processing method described above.
An embodiment of the present invention further provides a processor configured to run a program, wherein the program, when running, executes the data processing method described above.
With the above technical solution, the data processing method provided by the present invention obtains the ID data recorded by multiple application platforms and, after screening out abnormal ID data from them, further considers the number of abnormal ID data recorded by each application platform. When that number is determined to meet the first preset standard, the ID data recorded by the corresponding application platform are considered unreliable; all ID data recorded by that platform are rejected, together with the abnormal ID data screened out from the records of the other application platforms. The remaining obtained ID data serve as ID data to be processed, and the association relationships among them are used to determine at least one target object to which they map. The embodiments of the present application thus bring both the ID data themselves and their sources into abnormality judgment and identification, achieving anomaly identification that adapts to the ID data recorded by different application platforms. This greatly improves the accuracy of user behavior analysis, which in turn improves the efficiency and accuracy of target object identification and makes accurate advertisement delivery easier to achieve.
The above description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be understood more clearly and implemented in accordance with the contents of the specification, and in order that the above and other objects, features, and advantages of the present invention may be more readily apparent, specific embodiments of the present invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings serve only to illustrate the preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 shows a flowchart of a data processing method provided by an embodiment of the present application;
Fig. 2 shows a schematic diagram of ID linking provided by an embodiment of the present application;
Fig. 3 shows a flowchart of another data processing method provided by an embodiment of the present application;
Fig. 4 shows a structural block diagram of a data processing device provided by an embodiment of the present application;
Fig. 5 shows a structural block diagram of another data processing device provided by an embodiment of the present application;
Fig. 6 shows a structural block diagram of yet another data processing device provided by an embodiment of the present application;
Fig. 7 shows a hardware structure diagram of an electronic device provided by an embodiment of the present application;
Fig. 8 shows a structural block diagram of a data processing system provided by an embodiment of the present application.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.
Referring to Fig. 1, which is a flowchart of a data processing method provided by an embodiment of the present application, the method may include the following steps:
Step S101: obtain the ID data recorded by multiple application platforms;
In the present application, the obtained ID data can be used to characterize the identity of an internet object and may specifically include browser cookies, mobile device IDs, website accounts, mobile phone numbers, and other user ID data. The present application does not limit the content or quantity of the ID data collected by each application platform.
Here, a cookie is data that certain websites store on the user's local terminal in order to identify the user and track sessions; it is sometimes referred to as browser cache. Under the HTTP protocol, it is a way for a server or script to maintain information on the client workstation: a Web server can store it as a small text file in the user's browser (client), and it can contain information about the user as well as information about the corresponding browser or electronic device.
Note that if multiple browsers are installed on one electronic device, each browser stores cookies in its own separate space, and the same user logging in with different browsers or with different electronic devices will obtain different cookie information. The present application can therefore identify users by combining browser cookies with other ID data.
In practice, one real object may have multiple ID data on different application platforms, such as various browser cookies, various social platform accounts, multiple mobile phone IMEI (International Mobile Equipment Identity) numbers, and various financial accounts. After a user logs in to an application platform, the platform usually collects and records the user's ID data.
Step S102: screen out the abnormal ID data from the ID data;
To identify target objects reliably and accurately, the embodiments of the present application can obtain the ID data recorded by multiple application platforms, perform user ID linking using the association relationships among these ID data, and generate a globally unique virtual ID for each user. That is, the ID data are used to identify at least one corresponding target object and thereby generate a user profile of the target object.
Referring to the ID linking schematic diagram shown in Fig. 2, the presence of abnormal ID data during ID linking easily invalidates the association results. For example, if users fill in mobile phone numbers carelessly and a large number of different users all fill in the same number, that number causes different users to be identified as the same user. Such abnormal associations tend to spread virally and ultimately leave the entire ID association graph unusable.
Therefore, to guarantee the quality of user ID linking, after obtaining the ID data sent by the multiple application platforms, the present application reliably identifies the abnormal ID data they contain and rejects them. To this end, the present embodiment may use a blacklist, a whitelist, statistical rules, an undirected graph, or similar means to identify and reject abnormal ID data, without being limited to the implementations described here.
Optionally, the embodiments of the present application can screen out abnormal ID data from the obtained ID data according to a preset blacklist filtering rule. The blacklist filtering rule can be determined from experience or historical records. For example, ID data that history has shown to be abnormal, such as the mobile phone numbers 13800000000 and 13612345678 or the mailbox address 123@xxx.xx, are added to a blacklist library, which then serves as the blacklist filtering rule for filtering the obtained ID data directly. That is, the obtained ID data are compared with the data stored in the blacklist library: if they are identical, the ID data item being compared is abnormal ID data; otherwise, it can be considered normal ID data.
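The blacklist comparison described above amounts to a set-membership test. The following Python sketch is illustrative only: the function name and the in-memory set are assumptions, and the three entries are the sample values quoted in the text.

```python
# Sample blacklist library; the entries are the examples given in the text.
BLACKLIST = {"13800000000", "13612345678", "123@xxx.xx"}


def split_by_blacklist(id_values, blacklist=BLACKLIST):
    """Return (normal, abnormal) lists; an exact match with the blacklist is abnormal."""
    normal, abnormal = [], []
    for value in id_values:
        if value in blacklist:
            abnormal.append(value)
        else:
            normal.append(value)
    return normal, abnormal
```

A set gives constant-time lookups, which matters when the blacklist library is large; a production system would more likely keep the library in a database or key-value store.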
Optionally, the embodiments of the present application can also directly screen out, as abnormal ID data, the ID data that do not conform to a preset whitelist filtering rule. Specifically, the whitelist filtering rule can be determined from the generation rules of the ID data, such as cookie generation rules and rules for valid mobile phone numbers (which may be determined by the different operators; the present application does not limit this). Each obtained ID data item that is verified to conform to the whitelist filtering rule can be considered normal ID data and used for subsequent ID linking. A mobile phone number such as 123456789, by contrast, clearly does not conform to the phone number generation rule and can be screened out as abnormal ID data.
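A whitelist rule of this kind can be sketched as one format pattern per ID type. The patterns below are assumptions made for illustration (an 11-digit number starting with 1 for phone numbers, and a loose mailbox shape); the patent leaves the actual generation rules to operators and platforms.

```python
import re

# Hypothetical whitelist rules keyed by ID type; real rules depend on the
# operator or platform that issues the IDs.
WHITELIST_RULES = {
    "phone": re.compile(r"1\d{10}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}


def is_normal(id_type, value, rules=WHITELIST_RULES):
    """True if the value fully matches the whitelist pattern for its type.

    ID types without a rule are treated as abnormal in this sketch.
    """
    pattern = rules.get(id_type)
    return pattern is not None and pattern.fullmatch(value) is not None
```

With these assumed rules, `is_normal("phone", "123456789")` is false because the value has only nine digits, which matches the example in the text.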
As another embodiment of the present application, a preset statistical rule can also be used to screen abnormal ID data. The process may include: grouping the obtained ID data recorded by the multiple application platforms according to the type of the ID data; then counting, for each ID data item in any one group, its association relationships with the ID data of the other groups; and determining that ID data whose association relationships meet a second preset standard are abnormal ID data.
For example, the number of different mailbox addresses corresponding to the same mobile phone number is counted; if that number exceeds a predetermined quantity, the associations between the phone number and those mailbox addresses can be rejected directly. If one mobile phone number corresponds to 30 different mailbox addresses, say, this is unlikely to happen in normal use, and the phone number can be considered abnormal. Note that the content of the second preset standard may differ for different ID data and is not limited to the criterion described in this embodiment, in which the corresponding count exceeds a predetermined quantity; the present application does not enumerate the alternatives here.
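The phone-to-mailbox example above can be sketched as a count of distinct linked IDs per phone number. The function name, the pair-based input format, and the default limit of 10 are assumptions for illustration; the patent does not fix the predetermined quantity.

```python
from collections import defaultdict


def abnormal_by_link_count(links, max_links=10):
    """Flag phone numbers linked to more than max_links distinct mailbox addresses.

    links: iterable of (phone, email) pairs (hypothetical input format).
    max_links: stand-in for the predetermined quantity in the second preset standard.
    """
    emails_per_phone = defaultdict(set)
    for phone, email in links:
        emails_per_phone[phone].add(email)
    return {
        phone
        for phone, emails in emails_per_phone.items()
        if len(emails) > max_links
    }
```

Using a set per phone number means repeated observations of the same pair are counted once, so only distinct mailbox addresses contribute to the total.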
In addition, as yet another embodiment of the present application, an undirected graph can be constructed from the obtained ID data and the application platforms to which they belong. Multiple ID association subgraphs are then extracted according to the edge attributes of the undirected graph (such as the type of application platform), and the ID data quality feature corresponding to each edge attribute in each ID association subgraph is used to screen out the abnormal ID data in that subgraph. The specific procedure is described in the corresponding embodiment below and is not detailed here.
It can be seen that the present application can screen out abnormal ID data from the obtained ID data by means including, but not limited to, those listed above, thereby improving the reliability and accuracy of target object identification.
Step S103: count the number of abnormal ID data contained in the ID data recorded by each application platform;
The applicant has found that the data recorded by different application platforms may differ greatly in quality. For example, the user mobile phone numbers of website A are relatively accurate, while the user mailbox addresses of website B are relatively accurate; browser C disables third-party cookies by default, so one first-party cookie easily ends up corresponding to many third-party cookies, whereas this rarely occurs with browser D.
It can be seen that the source of the ID data, i.e. the application platform, has a significant effect on the accurate and reliable identification of abnormal ID data. If, for example, website B determines a mobile phone number to be abnormal because of its own ID data, the ID data associations monitored for that phone number under website A would also be removed, which lowers the reliability and accuracy of abnormal ID data identification.
Therefore, the present application can set a corresponding data processing standard, i.e. an abnormal ID data identification standard, for the ID data from each application platform. Taking browsers C and D above as an example, the abnormality threshold for the ratio of first-party to third-party cookie counts would be set higher for browser C than for browser D. Note that for other types of application platform, the way of identifying the abnormal ID data they record may differ; it can be determined from the type of the application platform and the content of the ID data, and is not described case by case here.
On this basis, after screening out abnormal ID data from the obtained ID data, the embodiments of the present application can further check the number of abnormal ID data coming from each application platform. In other words, in addition to verifying the ID data associations, the embodiments also verify the source information of the ID data.
Optionally, after all abnormal ID data have been screened out, they can be classified by source to obtain the number of abnormal ID data from each application platform. Alternatively, the screening itself can proceed by source: all the ID data recorded by one application platform are screened, then the ID data recorded by the next application platform, which directly yields the abnormal ID data recorded by each application platform. The present application does not limit how the abnormal ID data of each application platform are counted.
Step S104: judge whether there is an application platform whose counted number meets the first preset standard; if so, go to step S105; if not, go to step S106;
From the above analysis, the reliability of the ID data monitored by different application platforms differs, and so do the standards used to screen their abnormal ID data. Nevertheless, whichever application platform the ID data come from, if that platform has a very large number of abnormal ID data, the ID data it records are generally considered untrustworthy. To further improve the reliability of target object identification, the ID data recorded by such an application platform can be rejected.
On this basis, after the number of abnormal ID data of each application platform has been obtained by screening and counting in the respective manner, the embodiments of the present application can further calculate the ratio or percentage of each platform's abnormal ID data to the total number of ID data it records, and check whether that ratio or percentage reaches the abnormality threshold corresponding to the platform. If it does, the ID data recorded by the platform contain a large amount of abnormal ID data, and the ID data recorded by this platform are no longer used for subsequent processing. If it does not, the ID data recorded by the platform contain relatively few abnormal ID data, and the normal ID data recorded by the platform can still be used for subsequent processing.
It can be seen that the first preset standard in step S104 may be that the ratio or percentage of the counted number of abnormal ID data to the total number of ID data recorded by the application platform reaches the preset abnormality threshold of that platform. The preset abnormality thresholds of different application platforms may be the same or different; the embodiments of the present application can determine them according to actual needs and do not limit their values.
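The ratio check in step S104 can be sketched as follows. The function name, the dictionary inputs, and the fallback threshold of 0.5 are illustrative assumptions; the patent only requires that each platform's abnormal-to-total ratio be compared with that platform's preset threshold.

```python
def unreliable_platforms(total_counts, abnormal_counts, thresholds,
                         default_threshold=0.5):
    """Return platforms whose abnormal-ID ratio reaches their threshold.

    total_counts: {platform: number of ID data items recorded}
    abnormal_counts: {platform: number of those items judged abnormal}
    thresholds: {platform: preset abnormality threshold in [0, 1]}
    default_threshold: hypothetical fallback for platforms without an entry.
    """
    unreliable = set()
    for platform, total in total_counts.items():
        if total == 0:
            continue  # nothing recorded, so no ratio can be computed
        ratio = abnormal_counts.get(platform, 0) / total
        if ratio >= thresholds.get(platform, default_threshold):
            unreliable.add(platform)
    return unreliable
```

An empty result corresponds to the "no" branch of step S104 (go to step S106); a non-empty result corresponds to the "yes" branch (go to step S105).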
Step S105: reject the ID data recorded by the corresponding application platform and the abnormal ID data recorded by the other application platforms, and take the remaining obtained ID data as ID data to be processed;
Step S106: reject the screened-out abnormal ID data, and take the remaining obtained ID data as ID data to be processed;
When rejecting abnormal ID data, the embodiments of the present application take the source of the ID data into account and verify the reliability of each application platform that records the different types of ID data. The ID data recorded by application platforms considered unreliable are rejected, and the ID data considered reliable are taken as ID data to be processed, on the basis of which target objects are identified.
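Steps S105 and S106 can be expressed as a single filter, since S106 is the special case in which no platform is unreliable. The record format (platform, ID value) and the function name are assumptions made for this sketch.

```python
def ids_to_process(records, abnormal, unreliable):
    """Keep records that are neither abnormal nor from an unreliable platform.

    records: list of (platform, id_value) tuples (hypothetical format)
    abnormal: set of (platform, id_value) tuples screened out as abnormal
    unreliable: set of platforms that met the first preset standard
    """
    return [
        record
        for record in records
        if record[0] not in unreliable and record not in abnormal
    ]
```

Passing an empty `unreliable` set reproduces step S106; passing a non-empty set reproduces step S105, where every record of an unreliable platform is dropped regardless of whether the individual record was flagged.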
Step S107: determine, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed map.
Referring to the associations among ID data shown in Fig. 2, user ID linking is performed on the ID data to be processed recorded by each application platform: ID data to be processed that map to the same object are associated with one another. In Fig. 2, a thick line segment between two ID data indicates an association between them, and the multiple ID data mapped to the same object are linked together.
The associations among the ID data to be processed can be determined from the data content. As shown in Fig. 2, when a user performs an operation on an application platform, a behavior event is generated; the platform usually records the behavior event and the information related to it, such as the account used to log in to the platform and the client through which the login took place. The present application refers to the data recorded by an application platform as ID data.
Therefore, if the same user logs in to different application platforms, the ID data recorded by these platforms may contain identical ID data content, and the present embodiment can establish associations between the ID data of different application platforms on the basis of that identical content. The same approach applies to different ID data recorded by a single application platform: associations are established between ID data that share identical content.
Optionally, the embodiments of the present application can also use multiple pre-recorded ID data known to belong to the same user to determine associations among the ID data to be processed from the multiple application platforms, i.e. establish associations among the ID data to be processed that correspond to or are identical to the multiple ID data of that user. The present application does not limit the method of determining associations among the ID data to be processed, nor the method of performing user ID linking on them; they are not limited to the implementations described above.
With the above method, the ID data to be processed that are associated with one another can be treated as one ID data group. The ID data in the group can be associated as shown in Fig. 2: every ID data item to be processed is associated with at least one other, and most are associated with at least two others. In the present application, one ID data group usually corresponds to one target object; that is, the ID data to be processed in one group are usually the data monitored when one target object logs in to different application platforms.
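Grouping associated ID data in this way is a connected-components problem. The sketch below uses a union-find (disjoint-set) structure, which is a technique chosen here for illustration and is not named in the patent; the function name and the input format (one tuple of co-occurring ID values per recorded event) are likewise assumptions.

```python
def group_ids(records):
    """Group ID values into connected components (one group per target object).

    records: iterable of tuples of ID values that were observed together,
    e.g. ("cookie:c1", "acct:u1"). IDs sharing a record are associated, and
    identical ID values appearing in several records link those records.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for ids in records:
        ids = list(ids)
        if not ids:
            continue
        find(ids[0])  # register single-ID records as their own group
        for other in ids[1:]:
            union(ids[0], other)

    groups = {}
    for value in list(parent):
        groups.setdefault(find(value), set()).add(value)
    return list(groups.values())
```

Each returned set plays the role of one ID data group; assigning a fresh identifier to each set would give the globally unique virtual ID mentioned in the text.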
As another embodiment of the present application, the multiple ID data to be processed that map to the same target object can be used to construct a user profile of the target object, and a unique virtual ID corresponding to the target object can also be generated. When information such as advertisements needs to be pushed to certain users, querying a user's virtual ID then identifies the user and the user's profile accurately, so that suitable advertisements and other information can be pushed to that user.
In summary, the embodiments of the present application obtain the ID data recorded by multiple application platforms and, after screening out abnormal ID data from them, further consider the number of abnormal ID data recorded by each application platform. When that number is determined to meet the first preset standard, the ID data recorded by the corresponding application platform are considered unreliable; all ID data recorded by that platform are rejected, together with the abnormal ID data screened out from the records of the other application platforms. The remaining obtained ID data serve as ID data to be processed, and the association relationships among them are used to determine at least one target object to which they map. The embodiments of the present application thus bring both the ID data themselves and their sources into abnormality judgment and identification, achieving anomaly identification that adapts to the ID data recorded by different application platforms. This greatly improves the accuracy of user behavior analysis, which in turn improves the efficiency and accuracy of target object identification and makes accurate advertisement delivery easier to achieve.
Referring to Fig. 3, which is a flowchart of another data processing method provided by an embodiment of the application, the method may include the following steps:
Step S301: obtain the ID data recorded by multiple application platforms;
Here, the ID data may include data that characterize the identity of a user who logs in to the corresponding application platform, such as browser cookies, mobile device IDs, website accounts, mobile phone numbers, and email addresses. The application does not limit the content of the different types of ID data recorded by each application platform. The types of ID data recorded by the application platforms may be the same or different, and can be determined by factors such as the type of the application platform and the operations the user performs on it.
Step S302: construct an undirected graph, taking the obtained ID data as vertices and the application platform to which the ID data belong as the edges of those vertices;
An undirected graph is a graph whose edges have no direction. In this embodiment, the obtained ID data can be taken as the vertex set and the corresponding application platforms as the edge set. Each ID datum is extracted as a vertex of the undirected graph, and the application platform to which the ID datum belongs is taken as an edge of that vertex, thereby generating the undirected graph of the ID data.
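Step S302 can be sketched as follows. The record layout is an assumption: each record is taken to be a pair of IDs that one application platform observed together, with the platform name stored as the edge attribute. The function name `build_undirected_graph` is hypothetical.

```python
from collections import defaultdict

def build_undirected_graph(records):
    """Build an adjacency map and an edge-attribute map.

    records: iterable of (id_a, id_b, platform), where each ID is an
    (id_type, id_value) tuple. This record shape is an assumption.
    """
    adjacency = defaultdict(set)
    edge_attrs = {}
    for id_a, id_b, platform in records:
        # Undirected: store the link in both directions.
        adjacency[id_a].add(id_b)
        adjacency[id_b].add(id_a)
        # An unordered pair of vertices may be reported by several platforms.
        edge_attrs.setdefault(frozenset((id_a, id_b)), set()).add(platform)
    return dict(adjacency), edge_attrs

records = [
    (("cookie", "c1"), ("phone", "138xxxxxxxx"), "media_A"),
    (("phone", "138xxxxxxxx"), ("email", "123@xxx.xx"), "media_B"),
    (("cookie", "c1"), ("phone", "138xxxxxxxx"), "media_B"),
]
adjacency, edge_attrs = build_undirected_graph(records)
```

Keeping the platform on the edge rather than on the vertex makes it possible to drop everything one platform contributed without touching links reported by other platforms.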
Combining the description of the vertex attributes of ID data in the above embodiment, the vertex attributes of the resulting undirected graph may include, but are not limited to, ID types extracted from the ID data, such as IMEI number, enterprise application cookie, mobile phone number, and email address, as well as ID values such as 138xxxxxxxx and 123@xxx.xx. The edge attribute of the undirected graph may be attribute information of the application platform to which the corresponding vertex belongs; the edge attribute may therefore include, but is not limited to, various media names, various browser types, and so on.
Step S303: extract ID association subgraphs according to the edge attributes of the undirected graph;
In this embodiment, the edges of the undirected graph and their vertices can be grouped by edge attribute type, and the edges of one attribute type together with their vertices form one ID association subgraph. For example, the edges whose attribute is a browser type, together with their vertices, form one ID association subgraph; the edges whose attribute is a media name, together with their vertices, form another ID association subgraph; and so on. The embodiment does not describe each case in detail here.
It can be seen that one ID association subgraph usually contains multiple edge attributes of the same type, such as the multiple browsers, multiple media names, or multiple mobile phone numbers mentioned above.
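The grouping in step S303 can be sketched as below. The edge tuple layout `(vertex_u, vertex_v, attr_type, attr_value)` and the function name `extract_subgraphs` are assumptions for illustration; the text only states that edges are grouped by edge attribute type.

```python
from collections import defaultdict

def extract_subgraphs(edges):
    """Group edges and their vertices by edge attribute type.

    edges: iterable of (vertex_u, vertex_v, attr_type, attr_value).
    Returns {attr_type: {"vertices": set, "edges": list}}.
    """
    subgraphs = defaultdict(lambda: {"vertices": set(), "edges": []})
    for u, v, attr_type, attr_value in edges:
        sub = subgraphs[attr_type]
        sub["vertices"].update((u, v))
        sub["edges"].append((u, v, attr_value))
    return dict(subgraphs)

edges = [
    ("cookie:a1", "cookie:b1", "browser", "browser_X"),
    ("cookie:a2", "cookie:b1", "browser", "browser_Y"),
    ("phone:138xxxxxxxx", "cookie:a1", "media", "media_A"),
]
subgraphs = extract_subgraphs(edges)
```

Here the `browser` subgraph holds two edges with different attribute values of the same type, matching the observation that one subgraph usually contains several edge attributes of one type.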
Step S304: for each edge attribute of each ID association subgraph, count the quantity-ratio distribution between the various types of ID data under that attribute;
In practice, if one first ID datum corresponds to a very large number of second ID data, the first ID datum may be abnormal, where the first ID data and the second ID data are different types of ID data under the same edge attribute. The application can therefore count, for each edge attribute, the quantity-ratio distribution between the various types of ID data: for example, the distribution of the number of application B cookies corresponding to one application A cookie, and the distribution of the number of application A cookies corresponding to one application B cookie; or the distribution of the number of media B IDs corresponding to one media A ID, and the distribution of the number of media A IDs corresponding to one media B ID; and so on.
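The counting in step S304 can be sketched as follows, assuming that the "quantity-ratio distribution" means, for each ID of a first type, the number of distinct IDs of a second type linked to it. The edge layout and the name `fanout_distribution` are hypothetical.

```python
from collections import Counter, defaultdict

def fanout_distribution(edges, first_type, second_type):
    """Count, per first-type ID, how many distinct second-type IDs it links to.

    edges: iterable of ((type_u, value_u), (type_v, value_v)) within one
    edge attribute. Returns (fanout per ID, histogram of fan-out values).
    """
    linked = defaultdict(set)
    for (type_u, value_u), (type_v, value_v) in edges:
        if (type_u, type_v) == (first_type, second_type):
            linked[value_u].add(value_v)
        elif (type_v, type_u) == (first_type, second_type):
            linked[value_v].add(value_u)  # edge stored in the opposite order
    fanout = {value: len(others) for value, others in linked.items()}
    return fanout, Counter(fanout.values())

edges = [
    (("A_cookie", "a1"), ("B_cookie", "b1")),
    (("A_cookie", "a1"), ("B_cookie", "b2")),
    (("B_cookie", "b1"), ("A_cookie", "a2")),
    (("A_cookie", "a1"), ("B_cookie", "b2")),  # duplicate observation, counted once
]
fanout, distribution = fanout_distribution(edges, "A_cookie", "B_cookie")
```

Calling the function again with the two types swapped gives the reverse distribution (application A cookies per application B cookie) described in the text.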
Step S305: obtain the preset quantile corresponding to each edge attribute;
In line with the above description, the preset quantile can serve as the criterion for judging whether the ID data under the corresponding edge attribute are abnormal. The preset quantiles of different edge attributes may be different or the same, and can be determined according to actual needs; the embodiment does not limit the value of the preset quantile for each edge attribute. The preset quantile may be a percentage, but is not limited to this.
Step S306: judge whether the quantity-ratio distribution of the second ID data corresponding to one first ID datum under each edge attribute exceeds the corresponding preset quantile; if so, go to step S307; if not, execute step S308;
Here, the first ID data and the second ID data are different types of ID data corresponding to the same edge attribute, and each may be any ID datum in the set of ID data of its type. The embodiment can determine the first ID data and the second ID data to be judged one after another in a certain order, and the same judgment method can be used for the quantity-ratio distribution between any two types of ID data.
As an example, assume that the preset quantile of an edge attribute such as browser is 95%, and that for 95% of application A cookies the number of corresponding application B cookies is within 20. An application A cookie whose number of corresponding application B cookies is within 20 is then considered normal; if the number of application B cookies corresponding to one application A cookie exceeds 20, that application A cookie is considered abnormal. The judgment method for other edge attributes, such as media name, is similar and is not detailed here.
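The quantile judgment in steps S305 to S307 can be sketched as below. The nearest-rank percentile used to turn the preset quantile into a cutoff is an assumption, since the text does not name a percentile method; `quantile_cutoff` and `flag_abnormal` are hypothetical names.

```python
def quantile_cutoff(fanout, quantile_pct):
    """Nearest-rank percentile of the fan-out counts (quantile_pct in 1..100)."""
    ordered = sorted(fanout.values())
    rank = (quantile_pct * len(ordered) + 99) // 100  # integer ceil(pct * n / 100)
    return ordered[max(rank, 1) - 1]

def flag_abnormal(fanout, quantile_pct=95):
    """Return (cutoff, IDs whose fan-out exceeds the cutoff)."""
    if not fanout:
        return None, set()
    cutoff = quantile_cutoff(fanout, quantile_pct)
    abnormal = {id_value for id_value, count in fanout.items() if count > cutoff}
    return cutoff, abnormal

# 19 application A cookies each map to 20 application B cookies; one maps to 500.
fanout = {f"a{i}": 20 for i in range(19)}
fanout["a_bot"] = 500
cutoff, abnormal = flag_abnormal(fanout, 95)
```

With these inputs the 95% cutoff is 20, so only `a_bot` is flagged, which mirrors the worked example in the text.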
Step S307: determine that the first ID datum is abnormal ID data;
Step S308: detect whether the judgment of the quantity-ratio distribution has been completed for the different types of ID data of all edge attributes; if not, go to step S309; if so, execute step S310;
Optionally, the embodiment can carry out the judgment of the quantity-ratio distribution between the different types of ID data of each edge attribute in a certain order or according to a certain rule. After one judgment is completed, it can detect whether there are ID data for which the judgment has not yet been performed; if so, judgment continues in the manner described above until it has been completed for all types of ID data of all edge attributes, so that all abnormal ID data in the obtained ID data are screened out.
Step S309: select new first ID data and second ID data, and return to step S306;
Step S310: count the quantity proportion of abnormal ID data under each edge attribute;
In this embodiment, after the abnormal ID data corresponding to each edge attribute, that is, the abnormal ID data of the different application platforms, have been screened out by the above method, the number of abnormal ID data corresponding to each edge attribute can be counted in order to further verify whether each application platform is reliable. The ratio of that number to the total number of ID data under the edge attribute is then calculated and taken as the quantity proportion of abnormal ID data under that edge attribute.
Step S311: verify whether there currently exists an edge attribute whose quantity proportion is greater than the corresponding abnormality threshold; if so, go to step S312; if not, execute step S313;
In this application, abnormal ID data may occur among the ID data corresponding to any edge attribute. If there are too many abnormal ID data, the application platform corresponding to that edge attribute is usually considered unreliable, and the ID data monitored and recorded by that platform are no longer used for user behavior analysis. The application can therefore set, for different edge attributes, a critical value for judging whether the corresponding application platform is reliable, that is, preset how many abnormal ID data must be present among the ID data recorded by an application platform for the platform to be considered unreliable. This critical value serves as the abnormality threshold; the embodiment does not limit its size.
Step S312: reject the ID data under every edge attribute whose quantity proportion is greater than the corresponding abnormality threshold, as well as the abnormal ID data screened out under the other edge attributes;
As described in the corresponding part of the above embodiment, when the quantity proportion of abnormal ID data among the recorded ID data is greater than the abnormality threshold, the corresponding application platform is generally considered unreliable. To avoid its adverse effect on the result of user behavior analysis, the embodiment can reject the ID data recorded by such application platforms so that they are not used in subsequent ID linking.
Moreover, to improve the accuracy of target-object identification, in addition to rejecting the ID data recorded by application platforms considered unreliable, the abnormal ID data screened out from the other application platforms, that is, the abnormal ID data under the other edge attributes, can also be rejected.
Step S313: reject the abnormal ID data screened out under each edge attribute;
When it is determined that the ID data recorded by each application platform do not contain many abnormal ID data, that is, when every application platform has been verified as reliable, only the abnormal ID data recorded by each application platform need to be deleted. The normal ID data obtained are retained and used for ID linking, which improves the accuracy of target-object identification.
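Steps S310 to S313 can be sketched together as follows. The input shapes, the default threshold of 0.5, and the name `reject_unreliable` are illustrative assumptions; the text leaves the abnormality threshold open.

```python
def reject_unreliable(ids_by_attr, abnormal_ids, thresholds, default_threshold=0.5):
    """Drop whole sources with too many abnormal IDs, and abnormal IDs elsewhere.

    ids_by_attr: {edge_attribute: iterable of IDs recorded under it}
    abnormal_ids: set of IDs already flagged as abnormal
    thresholds: {edge_attribute: maximum tolerated abnormal share}
    """
    kept = set()
    rejected_attrs = []
    for attr, ids in ids_by_attr.items():
        ids = set(ids)
        abnormal = ids & abnormal_ids
        share = len(abnormal) / len(ids) if ids else 0.0  # S310: abnormal share
        if share > thresholds.get(attr, default_threshold):  # S311
            rejected_attrs.append(attr)   # S312: reject everything from this source
        else:
            kept |= ids - abnormal        # S312/S313: reject only the abnormal IDs
    return kept, rejected_attrs

ids_by_attr = {
    "media_A": {"a1", "a2", "a3", "a4"},
    "media_B": {"b1", "b2", "b3", "b4"},
}
abnormal_ids = {"a1", "a2", "a3", "b1"}
kept, rejected_attrs = reject_unreliable(
    ids_by_attr, abnormal_ids, {"media_A": 0.5, "media_B": 0.5}
)
```

In this example `media_A` has a 75% abnormal share and is dropped entirely, including its one normal ID, while `media_B` loses only `b1`.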
Step S314: take the remaining ID data as the ID data to be processed, and use the association relationships among the ID data to be processed to determine at least one target object to which they map.
In this embodiment, an ID linking algorithm can be applied to the ID data to be processed to determine the association relationships among them, and thereby obtain the target object corresponding to each group of associated ID data to be processed. This achieves reliable identification of the target objects described by the ID data to be processed. The ID data to be processed that map to each target object are then analyzed, giving an accurate and reliable analysis of user behavior, on the basis of which advertisements can be delivered accurately.
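The text does not specify the ID linking algorithm. One common way to group associated IDs is to compute connected components with a union-find structure; the sketch below uses that technique as an assumption, and `group_ids` is a hypothetical name.

```python
def group_ids(pairs):
    """Group IDs into connected components using union-find.

    pairs: iterable of (id_a, id_b) association relationships between
    ID data to be processed. Returns a list of sets, one per component.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b  # merge the two components

    groups = {}
    for x in list(parent):
        groups.setdefault(find(x), set()).add(x)
    return list(groups.values())

pairs = [("c1", "p1"), ("p1", "m1"), ("c2", "p2")]
groups = group_ids(pairs)
```

Each returned set would correspond to one candidate target object; here `c1`, `p1`, and `m1` end up together, and `c2` and `p2` form a second group.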
In determining the ID data to be processed, the embodiment not only takes into account the association relationships among ID data but also considers the differences among the sources of different ID data. It excludes the ID data recorded by application platforms with poor data quality, that is, all ID data recorded by platforms whose ID recording is unreliable or inaccurate. This prevents the ID data source from causing abnormal ID data to be misidentified and from degrading the accuracy and reliability of target-object identification.
Referring to Fig. 4, which is a structural block diagram of a data processing device provided by an embodiment of the application, the device may include:
Acquisition module 41, configured to obtain the ID data recorded by multiple application platforms;
Screening module 42, configured to screen out the abnormal ID data in the ID data;
Optionally, the application can screen abnormal ID data in different ways. Thus, referring to Fig. 5, the screening module 42 may include:
Grouping unit 4211, configured to group the obtained ID data according to the type of the ID data;
First statistics unit 4212, configured to count the association relationship of each ID datum in any one group relative to the ID data of the other groups;
First determination unit 4213, configured to determine that the ID data whose association relationship meets the second preset standard are abnormal ID data.
As another embodiment of the application, referring to the structural block diagram shown in Fig. 6, the screening module 42 may also include:
Construction unit 4221, configured to construct an undirected graph, taking the obtained ID data as vertices and the application platform to which the ID data belong as the edges of those vertices;
Extraction unit 4222, configured to extract ID association subgraphs according to the edge attributes of the undirected graph;
Feature acquisition unit 4223, configured to obtain the ID data quality feature corresponding to each edge attribute in each ID association subgraph;
In this application, the feature acquisition unit 4223 can specifically be configured to count, for each edge attribute of each ID association subgraph, the quantity-ratio distribution between the various types of ID data under that attribute. The ID data quality feature obtained in this embodiment can thus be the quantity-ratio distribution between ID data, but is not limited to this.
Second determination unit 4224, configured to determine the abnormal ID data corresponding to each edge attribute by using the judgment criterion corresponding to each type of ID data quality feature.
Based on the specific implementation of the feature acquisition unit 4223 described above, the second determination unit 4224 may include:
Acquisition subunit, configured to obtain the preset quantile corresponding to each edge attribute;
Judgment subunit, configured to judge whether the quantity-ratio distribution of the second ID data corresponding to one first ID datum under each edge attribute exceeds the corresponding preset quantile, where the first ID data and the second ID data are different types of ID data corresponding to the same edge attribute;
First determination subunit, configured to determine that the first ID datum is abnormal ID data when the judgment result of the judgment subunit is yes;
Second determination subunit, configured to select new first ID data and second ID data when the judgment result of the judgment subunit is no, and to trigger the judgment subunit to continue judging until the judgment of the quantity-ratio distribution has been completed for the different types of ID data of all edge attributes.
Optionally, the screening module 42 may also include:
First screening unit, configured to screen out the abnormal ID data in the ID data according to a preset blacklist filtering rule;
or
Second screening unit, configured to screen out the abnormal ID data in the ID data that do not meet a preset whitelist filtering rule.
Statistics module 43, configured to count the number of abnormal ID data contained in the ID data recorded by each application platform;
Data processing module 44, configured to, when the counted number is determined to meet the first preset standard, reject the ID data recorded by the corresponding application platform and the abnormal ID data recorded by the other application platforms, and take the remaining obtained ID data as the ID data to be processed;
Target object determination module 45, configured to determine, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed map.
In summary, this embodiment obtains the ID data recorded by multiple application platforms and, after screening abnormal ID data out of them, further considers the number of abnormal ID data recorded by each application platform. When that number is determined to meet the first preset standard, the ID data recorded by the corresponding application platform are considered unreliable; all ID data recorded by that platform are rejected, together with the abnormal ID data screened out from the other application platforms. The remaining ID data are taken as the ID data to be processed, and the association relationships among them are used to determine at least one target object to which they map.
The embodiment thus brings both the ID data themselves and their sources into abnormality judgment and identification, achieving abnormality identification that adapts to the ID data recorded by different application platforms. This greatly improves the accuracy of user behavior analysis, which in turn improves the efficiency and accuracy of target-object identification and facilitates accurate advertisement delivery.
The device structure has been described above mainly in terms of the functional modules that implement the data processing method. In terms of hardware structure, the device may include a processor and a memory. The acquisition module, screening module, statistics module, data processing module, target object determination module, and so on are stored in the memory as program units, and the processor executes these program units stored in the memory to realize the corresponding functions.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided. By adjusting kernel parameters, the ID data themselves and their sources are brought into abnormality judgment and identification, achieving abnormality identification that adapts to the ID data recorded by different application platforms. This greatly improves the accuracy of user behavior analysis, which in turn improves the efficiency and accuracy of target-object identification and facilitates accurate advertisement delivery.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one storage chip.
An embodiment of the invention provides a storage medium on which a program is stored. When executed by a processor, the program implements the above data processing method. For the implementation of the method, refer to the description of the corresponding part of the above method embodiment; it is not repeated here.
An embodiment of the invention provides a processor configured to run a program, where the program, when run, executes the above data processing method. For the implementation of the method, refer to the description of the corresponding part of the above method embodiment; it is not repeated here.
Referring to Fig. 7, which is a hardware structure diagram of an electronic device provided by an embodiment of the application, the device may include, but is not limited to, the following hardware components. The electronic device provided by the application may be a product such as a server, PC, iPad, or mobile phone; the application does not limit the product type of the electronic device. The electronic device may include: a communication port 71, a memory 72, a processor 73, and a program that is stored in the memory 72 and can run on the processor 73.
Communication port 71, configured to connect and communicate with multiple application platforms;
In the embodiments of the invention, the communication port 71 may be the port of a wireless communication module such as a Wi-Fi module, GSM module, or GPRS module, or the port of a wired communication module, such as a USB port. The application does not limit the type or structure of the communication port.
Memory 72, configured to store multiple instructions that implement the data processing method described in the above method embodiment.
In practice, the memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one storage chip.
Processor 73, configured to load and execute the program stored in the memory, the program including:
obtaining the ID data recorded by multiple application platforms;
screening out the abnormal ID data in the ID data;
counting the number of abnormal ID data contained in the ID data recorded by each application platform;
when the counted number is determined to meet the first preset standard, rejecting the ID data recorded by the corresponding application platform and the abnormal ID data recorded by the other application platforms, and taking the remaining obtained ID data as the ID data to be processed;
determining, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed map.
Optionally, when the processor 73 executes the program stored in the memory 72, the screening of abnormal ID data may include:
grouping the obtained ID data according to the type of the ID data;
counting the association relationship of each ID datum in any one group relative to the ID data of the other groups;
determining that the ID data whose association relationship meets the second preset standard are abnormal ID data.
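The type-grouping variant above can be sketched as follows. The second preset standard is not defined in this passage; the sketch assumes it is a fixed upper bound `max_links` on how many IDs of any one other type a single ID may be associated with. The name `screen_by_type_groups` is hypothetical.

```python
from collections import defaultdict

def screen_by_type_groups(links, max_links):
    """Flag IDs associated with too many IDs of another type.

    links: iterable of ((type_a, value_a), (type_b, value_b)).
    max_links: assumed form of the second preset standard.
    """
    partners = defaultdict(lambda: defaultdict(set))
    for a, b in links:
        if a[0] == b[0]:
            continue  # only associations across different type groups are counted
        partners[a][b[0]].add(b)
        partners[b][a[0]].add(a)
    return {
        vertex
        for vertex, by_type in partners.items()
        if any(len(others) > max_links for others in by_type.values())
    }

links = [
    (("phone", "p1"), ("cookie", "c1")),
    (("phone", "p1"), ("cookie", "c2")),
    (("phone", "p1"), ("cookie", "c3")),
    (("phone", "p2"), ("cookie", "c4")),
]
abnormal = screen_by_type_groups(links, max_links=2)
```

Here phone `p1` is tied to three cookies and is flagged, while each cookie is tied to a single phone and passes.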
Alternatively, the screening may include:
constructing an undirected graph, taking the obtained ID data as vertices and the application platform to which the ID data belong as the edges of those vertices;
extracting ID association subgraphs according to the edge attributes of the undirected graph;
obtaining the ID data quality feature corresponding to each edge attribute in each ID association subgraph;
determining the abnormal ID data corresponding to each edge attribute by using the judgment criterion corresponding to each type of ID data quality feature.
Optionally, the processor 73 may also execute a program that implements the following steps:
for each edge attribute of each ID association subgraph, counting the quantity-ratio distribution between the various types of ID data under that attribute;
obtaining the preset quantile corresponding to each edge attribute;
judging whether the quantity-ratio distribution of the second ID data corresponding to one first ID datum under each edge attribute exceeds the corresponding preset quantile, where the first ID data and the second ID data are different types of ID data corresponding to the same edge attribute;
if so, determining that the first ID datum is abnormal ID data;
if not, selecting new first ID data and second ID data and returning to the step of judging whether the quantity-ratio distribution of the second ID data corresponding to one first ID datum under each edge attribute exceeds the corresponding preset quantile, until the judgment of the quantity-ratio distribution has been completed for the different types of ID data of all edge attributes.
Alternatively, the processor 73 may also execute a program that implements the following steps:
screening out the abnormal ID data in the ID data according to a preset blacklist filtering rule;
or, screening out the abnormal ID data in the ID data that do not meet a preset whitelist filtering rule.
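Rule-based screening of this kind can be sketched as below. The text presents the blacklist and whitelist rules as alternatives and does not give their contents; the sketch applies both for brevity, and the placeholder values, the regular expressions, and the name `is_abnormal` are all assumptions.

```python
import re

# Hypothetical blacklist: placeholder values that carry no identity.
BLACKLIST = {"", "unknown", "null", "00000000-0000-0000-0000-000000000000"}

# Hypothetical whitelist: expected formats per ID type.
WHITELIST_PATTERNS = {
    "phone": re.compile(r"1\d{10}"),
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
}

def is_abnormal(id_type, value):
    """Return True if the ID hits the blacklist or fails its whitelist format."""
    if value.strip().lower() in BLACKLIST:
        return True
    pattern = WHITELIST_PATTERNS.get(id_type)
    # ID types without a whitelist pattern are accepted in this sketch.
    return pattern is not None and pattern.fullmatch(value) is None

samples = [("phone", "13812345678"), ("phone", "12345"), ("cookie", "unknown")]
flags = [is_abnormal(id_type, value) for id_type, value in samples]
```

A deployment would replace both rule sets with values suited to its own platforms; the structure only shows where each rule is consulted.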
The embodiment thus brings both the ID data themselves and their sources into abnormality judgment and identification, achieving abnormality identification that adapts to the ID data recorded by different application platforms. This greatly improves the accuracy of user behavior analysis, which in turn improves the efficiency and accuracy of target-object identification and facilitates accurate advertisement delivery.
Referring to Fig. 8, which is a structural block diagram of a data processing system provided by an embodiment of the application, the system may include: multiple application devices 81 corresponding to different application platforms, and an electronic device 82 having the electronic device hardware structure described above.
The application device 81 may be a terminal, a server, a database, or the like; the application does not limit the product type of the application device. In the embodiment of the application, each application device can be used to monitor and record the ID data generated when an object logs in to the corresponding application platform. For the specific implementation, refer to the description of the corresponding part of the above method embodiment; it is not detailed again here.
The application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps:
obtaining the ID data recorded by multiple application platforms;
screening out the abnormal ID data in the ID data;
counting the number of abnormal ID data contained in the ID data recorded by each application platform;
when the counted number is determined to meet the first preset standard, rejecting the ID data recorded by the corresponding application platform and the abnormal ID data recorded by the other application platforms, and taking the remaining obtained ID data as the ID data to be processed;
determining, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed map.
Optionally, the program that executes the step of screening out the abnormal ID data in the ID data may include:
grouping the obtained ID data according to the type of the ID data;
counting the association relationship of each ID datum in any one group relative to the ID data of the other groups;
determining that the ID data whose association relationship meets the second preset standard are abnormal ID data.
As another embodiment of the application, the program that executes the step of screening out the abnormal ID data in the ID data may include:
constructing an undirected graph, taking the obtained ID data as vertices and the application platform to which the ID data belong as the edges of those vertices;
extracting ID association subgraphs according to the edge attributes of the undirected graph;
obtaining the ID data quality feature corresponding to each edge attribute in each ID association subgraph;
determining the abnormal ID data corresponding to each edge attribute by using the judgment criterion corresponding to each type of ID data quality feature.
The program that executes the step of obtaining the ID data quality feature corresponding to each edge attribute in each ID association subgraph may specifically include:
for each edge attribute of each ID association subgraph, counting the quantity-ratio distribution between the various types of ID data under that attribute;
Correspondingly, the program that executes the step of determining the abnormal ID data corresponding to each edge attribute by using the judgment criterion corresponding to each type of ID data quality feature may specifically include:
obtaining the preset quantile corresponding to each edge attribute;
judging whether the quantity-ratio distribution of the second ID data corresponding to one first ID datum under each edge attribute exceeds the corresponding preset quantile, where the first ID data and the second ID data are different types of ID data corresponding to the same edge attribute;
if so, determining that the first ID datum is abnormal ID data;
if not, selecting new first ID data and second ID data and returning to the step of judging whether the quantity-ratio distribution of the second ID data corresponding to one first ID datum under each edge attribute exceeds the corresponding preset quantile, until the judgment of the quantity-ratio distribution has been completed for the different types of ID data of all edge attributes.
Optionally, the program that executes the step of screening out the abnormal ID data in the ID data may include:
screening out the abnormal ID data in the ID data according to a preset blacklist filtering rule;
or
screening out the abnormal ID data in the ID data that do not meet a preset whitelist filtering rule.
In summary, the computer program product provided by the embodiment of the application brings both the ID data themselves and their sources into abnormality judgment and identification, achieving abnormality identification that adapts to the ID data recorded by different application platforms. This greatly improves the accuracy of user behavior analysis, which in turn improves the efficiency and accuracy of target-object identification and facilitates accurate advertisement delivery.
Those skilled in the art should understand that embodiments of the application may be provided as a method, a device, an electronic device, a system, or a computer program product. Therefore, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The application is described with reference to flowcharts and/or block diagrams of the method, device, electronic device, system, and computer program product according to the embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, where the instruction means implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing. The instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, commodity, or device that includes the element.
Those skilled in the art will understand that embodiments of the present application may be provided as a method, an apparatus, an electronic device, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The above are merely embodiments of the present application and are not intended to limit it. Various modifications and changes to the present application are possible for those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within the scope of its claims.
Claims (10)
1. A data processing method, characterized in that the method comprises:
obtaining ID data recorded by a plurality of application platforms;
screening out abnormal ID data from the ID data;
counting, for each application platform, the number of abnormal ID data items contained in the ID data recorded by that application platform;
when the counted number meets a first preset standard, rejecting all ID data recorded by the corresponding application platform as well as the abnormal ID data recorded by the other application platforms, and taking the remaining obtained ID data as ID data to be processed;
determining, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed are mapped.
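By way of illustration only, the steps of claim 1 might be sketched in Python as follows. The function name `map_ids_to_objects`, the record layout `(platform, id_a, id_b)`, the reading of the first preset standard as "more than `max_abnormal_per_platform` distinct abnormal IDs", and the use of connected components to derive target objects are all assumptions made for this sketch; the claim itself does not fix any of them.

```python
from collections import defaultdict


def map_ids_to_objects(records, abnormal_ids, max_abnormal_per_platform):
    """records: list of (platform, id_a, id_b), one association recorded by a platform.
    abnormal_ids: set of IDs already flagged by the screening step.
    Returns (unreliable_platforms, groups), each group being one assumed target object."""
    # Count the distinct abnormal IDs recorded by each platform.
    abnormal_by_platform = defaultdict(set)
    for platform, a, b in records:
        for id_ in (a, b):
            if id_ in abnormal_ids:
                abnormal_by_platform[platform].add(id_)

    # Assumed first preset standard: too many abnormal IDs makes a platform unreliable.
    unreliable = {p for p, ids in abnormal_by_platform.items()
                  if len(ids) > max_abnormal_per_platform}

    # Reject every record of an unreliable platform, plus abnormal IDs from the others.
    kept = [(a, b) for platform, a, b in records
            if platform not in unreliable
            and a not in abnormal_ids and b not in abnormal_ids]

    # Union-find over the remaining associations (assumed mapping rule:
    # IDs in the same connected component belong to the same target object).
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in kept:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for x in list(parent):
        groups[find(x)].add(x)
    return unreliable, sorted(sorted(g) for g in groups.values())
```

Under these assumptions, a platform with a single abnormal ID keeps its normal records, while a platform exceeding the threshold contributes nothing to the ID data to be processed.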
2. The method according to claim 1, characterized in that screening out the abnormal ID data from the ID data comprises:
grouping the obtained ID data according to the type of the ID data;
counting, for each ID data item in any one group, its association relationships with the ID data of the other groups;
determining that ID data whose association relationships meet a second preset standard are abnormal ID data.
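A minimal Python sketch of the screening in claim 2 is given below, purely as an illustration. It assumes that each ID is a `(type, value)` tuple, that an association is a pair of such IDs, and that the second preset standard is "linked to more than `max_links` distinct IDs of some other type"; the function name and the threshold are hypothetical.

```python
from collections import defaultdict


def screen_by_association_count(pairs, max_links):
    """pairs: iterable of ((type_a, value_a), (type_b, value_b)) associations.
    Returns the set of IDs linked to more than max_links IDs of another type."""
    # linked[id][other_type] = distinct IDs of that other type associated with id
    linked = defaultdict(lambda: defaultdict(set))
    for a, b in pairs:
        if a[0] == b[0]:
            continue  # same group: only cross-group associations are counted
        linked[a][b[0]].add(b)
        linked[b][a[0]].add(a)

    abnormal = set()
    for id_, by_type in linked.items():
        # Assumed second preset standard: too many partners in any other group.
        if any(len(others) > max_links for others in by_type.values()):
            abnormal.add(id_)
    return abnormal
```

For example, a phone number associated with far more cookies than a single user would plausibly own would be flagged under this assumed standard.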
3. The method according to claim 1, characterized in that screening out the abnormal ID data from the ID data comprises:
constructing an undirected graph with the obtained ID data as vertices and the application platform to which the ID data belong as the edges between the vertices;
extracting ID association subgraphs according to the edge attributes of the undirected graph;
obtaining the ID data quality feature corresponding to each edge attribute in each ID association subgraph;
determining the abnormal ID data corresponding to each edge attribute by using the judgment standard corresponding to each type of ID data quality feature.
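The graph construction of claim 3 could be sketched as follows. This is an illustrative reading only: the adjacency-dictionary representation, the function names, the choice to treat the recording platform as the edge attribute, the extraction of one subgraph per platform, and the use of vertex degree as a quality feature are assumptions, not details stated in the claim.

```python
from collections import defaultdict


def build_undirected_graph(records):
    """records: iterable of (platform, id_a, id_b).
    graph[u][v] is the set of platforms (edge attributes) linking u and v."""
    graph = defaultdict(lambda: defaultdict(set))
    for platform, a, b in records:
        graph[a][b].add(platform)
        graph[b][a].add(platform)  # undirected: store both directions
    return graph


def association_subgraph(graph, platform):
    """Keep only the edges that carry the given edge attribute."""
    sub = defaultdict(set)
    for u, neighbours in graph.items():
        for v, platforms in neighbours.items():
            if platform in platforms:
                sub[u].add(v)
    return dict(sub)


def degree_feature(subgraph):
    """A simple assumed quality feature: number of distinct neighbours per ID."""
    return {u: len(vs) for u, vs in subgraph.items()}
```

A judgment standard, such as a threshold on the degree, can then be applied per edge attribute to the values returned by `degree_feature`.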
4. The method according to claim 1, characterized in that screening out the abnormal ID data from the ID data comprises:
screening out the abnormal ID data from the ID data according to a preset blacklist filtering rule;
or
screening out, as abnormal ID data, the ID data that do not conform to a preset whitelist filtering rule.
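The two alternatives of claim 4 can be illustrated with the short Python sketch below. The blacklist entries and the whitelist pattern are hypothetical examples chosen for the sketch (placeholder values and an eight-character hexadecimal format); the patent does not specify the actual filtering rules.

```python
import re

# Hypothetical blacklist: placeholder values that cannot identify a real object.
BLACKLIST = {"", "null", "unknown", "000000000000000"}

# Hypothetical whitelist rule: a valid ID is exactly eight lowercase hex digits.
WHITELIST_PATTERN = re.compile(r"[0-9a-f]{8}")


def screen_with_blacklist(ids, blacklist=BLACKLIST):
    """Abnormal IDs are those that match a blacklist entry (case-insensitive)."""
    return {i for i in ids if i.strip().lower() in blacklist}


def screen_with_whitelist(ids, pattern=WHITELIST_PATTERN):
    """Abnormal IDs are those that do NOT conform to the whitelist rule."""
    return {i for i in ids if not pattern.fullmatch(i)}
```

The blacklist variant flags only known bad values, whereas the whitelist variant flags everything that fails the expected format, so the latter is the stricter of the two under these assumptions.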
5. The method according to claim 3, characterized in that obtaining the ID data quality feature corresponding to each edge attribute in each ID association subgraph comprises:
for each edge attribute of each ID association subgraph, counting the quantity-ratio distribution between the types of ID data under that edge attribute;
and determining the abnormal ID data corresponding to each edge attribute by using the judgment standard corresponding to each type of ID data quality feature comprises:
obtaining the preset quantile corresponding to each edge attribute;
judging whether the quantity-ratio distribution of the second ID data corresponding to a first ID data item under each edge attribute exceeds the corresponding preset quantile, the first ID data and the second ID data being ID data of different types corresponding to the same edge attribute;
if so, determining that the first ID data item is abnormal ID data;
if not, selecting new first ID data and second ID data and returning to the step of judging whether the quantity-ratio distribution of the second ID data corresponding to a first ID data item under each edge attribute exceeds the corresponding preset quantile, until the quantity-ratio distributions of the different types of ID data have been judged for all edge attributes.
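One possible reading of the quantile test in claim 5 is sketched below in Python. It assumes that the "quantity ratio" of a first ID is the number of distinct second-type IDs linked to it under one edge attribute (here, one platform), and that the preset quantile is given as a level `q` applied to the empirical distribution of those counts using the nearest-rank method. The function names, the edge layout, and the choice of quantile definition are all assumptions of this sketch.

```python
import math
from collections import defaultdict


def nearest_rank_quantile(values, q):
    """Nearest-rank quantile of a non-empty collection, for 0 < q <= 1."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def screen_by_quantile(edges, first_type, second_type, q):
    """edges: iterable of (platform, (type, value), (type, value)).
    Flags first-type IDs whose count of linked second-type IDs exceeds
    the q-quantile of that count within the same platform."""
    # linked[platform][first_id] = distinct second-type IDs linked to first_id
    linked = defaultdict(lambda: defaultdict(set))
    for platform, a, b in edges:
        for x, y in ((a, b), (b, a)):
            if x[0] == first_type and y[0] == second_type:
                linked[platform][x].add(y)

    abnormal = set()
    for platform, by_id in linked.items():
        counts = {i: len(s) for i, s in by_id.items()}
        cutoff = nearest_rank_quantile(counts.values(), q)
        abnormal |= {i for i, n in counts.items() if n > cutoff}
    return abnormal
```

Repeating the call for each pair of ID types corresponds to the "select new first ID data and second ID data" loop of the claim.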
6. A data processing device, characterized in that the device comprises:
an obtaining module, configured to obtain ID data recorded by a plurality of application platforms;
a screening module, configured to screen out abnormal ID data from the ID data;
a statistical module, configured to count, for each application platform, the number of abnormal ID data items contained in the ID data recorded by that application platform;
a data processing module, configured to, when the counted number meets a first preset standard, reject all ID data recorded by the corresponding application platform as well as the abnormal ID data recorded by the other application platforms, and take the remaining obtained ID data as ID data to be processed;
a target object determining module, configured to determine, by using the association relationships among the ID data to be processed, at least one target object to which the ID data to be processed are mapped.
7. The device according to claim 6, characterized in that the screening module comprises:
a grouping unit, configured to group the obtained ID data according to the type of the ID data;
a first statistical unit, configured to count, for each ID data item in any one group, its association relationships with the ID data of the other groups;
a first determination unit, configured to determine that ID data whose association relationships meet a second preset standard are abnormal ID data.
8. The device according to claim 6, characterized in that the screening module comprises:
a construction unit, configured to construct an undirected graph with the obtained ID data as vertices and the application platform to which the ID data belong as the edges between the vertices;
an extraction unit, configured to extract ID association subgraphs according to the edge attributes of the undirected graph;
a feature obtaining unit, configured to obtain the ID data quality feature corresponding to each edge attribute in each ID association subgraph;
a second determination unit, configured to determine the abnormal ID data corresponding to each edge attribute by using the judgment standard corresponding to each type of ID data quality feature.
9. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the data processing method according to any one of claims 1 to 5.
10. A processor, characterized in that the processor is configured to run a program, wherein the program, when run, executes the data processing method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710916816.2A CN109598525B (en) | 2017-09-30 | 2017-09-30 | Data processing method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710916816.2A CN109598525B (en) | 2017-09-30 | 2017-09-30 | Data processing method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109598525A (en) | 2019-04-09 |
CN109598525B (en) | 2023-01-17 |
Family
ID=65955783
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710916816.2A Active CN109598525B (en) | 2017-09-30 | 2017-09-30 | Data processing method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109598525B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20120114040A (en) * | 2011-04-06 | 2012-10-16 | 주식회사 바닐라하우스텐 | Abusing observing method and abusing observing system about advertisement type of cost per click |
US20140137226A1 (en) * | 2011-07-20 | 2014-05-15 | Tencent Technology (Shenzhen) Company Ltd. | Method and System for Processing Identity Information |
CN103886068A (en) * | 2014-03-20 | 2014-06-25 | 北京国双科技有限公司 | Data processing method and device for Internet user behavior analysis |
CN104216985A (en) * | 2014-09-04 | 2014-12-17 | 深圳供电局有限公司 | Method and system for discriminating abnormal data |
CN106656929A (en) * | 2015-10-30 | 2017-05-10 | 北京国双科技有限公司 | Information processing method and apparatus |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113396433A (en) * | 2019-06-11 | 2021-09-14 | 深圳市欢太科技有限公司 | User portrait construction method and related product |
CN113396433B (en) * | 2019-06-11 | 2023-12-26 | 深圳市欢太科技有限公司 | User portrait construction method and related products |
CN111523034A (en) * | 2020-04-24 | 2020-08-11 | 腾讯科技(深圳)有限公司 | Application processing method, device, equipment and medium |
CN111523034B (en) * | 2020-04-24 | 2023-08-18 | 腾讯科技(深圳)有限公司 | Application processing method, device, equipment and medium |
Also Published As
Publication number | Publication date |
---|---|
CN109598525B (en) | 2023-01-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20160140626A1 (en) | Web page advertisement configuration and optimization with visual editor and automatic website and webpage analysis | |
CN109544166A (en) | A kind of Risk Identification Method and device | |
CN107844518B (en) | Method for evaluating download quantity of specified APP, data server, packaging platform and system | |
CN110163647A (en) | A kind of data processing method and device | |
US20140089040A1 (en) | System and Method for Customer Experience Measurement & Management | |
CN112751711B (en) | Alarm information processing method and device, storage medium and electronic equipment | |
CN110362453A (en) | Log statistic alarm method and device, terminal and storage medium | |
EP3570242A1 (en) | Method and system for quantifying quality of customer experience (cx) of an application | |
CN110503545A (en) | Loan is independently into part method, terminal device, storage medium and device | |
CN106713242B (en) | Data request processing method and processing device | |
CN111368862A (en) | Method for distinguishing indoor and outdoor marks, training method and device of classifier and medium | |
CN109598525A (en) | Data processing method and device | |
CN113448834A (en) | Buried point testing method and device, electronic equipment and storage medium | |
CN107430590B (en) | System and method for data comparison | |
CN110377821A (en) | Generate method, apparatus, computer equipment and the storage medium of interest tags | |
CN105677677A (en) | Information classification and device | |
WO2016032531A1 (en) | Improvement message based on element score | |
CN107818477A (en) | One kind reward feedback method and device | |
CN109756762A (en) | A kind of determination method and device of terminal class | |
CN108021464A (en) | A kind of method and device of the processing of revealing all the details of application response data | |
CN111127050A (en) | Content channel evaluation method and device, electronic equipment and storage medium | |
CN111694872B (en) | Method and device for providing service handling data scheme | |
WO2016184318A1 (en) | Barcode popularity display method and apparatus | |
KR20220117676A (en) | Review reliability validation device and method of thereof | |
CN110618915B (en) | Method and equipment for cluster deployment decision power evaluation tool and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: 100080 No. 401, 4th Floor, Haitai Building, 229 North Fourth Ring Road, Haidian District, Beijing Applicant after: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd. Address before: 100086 Beijing city Haidian District Shuangyushu Area No. 76 Zhichun Road cuigongfandian 8 layer A Applicant before: BEIJING GRIDSUM TECHNOLOGY Co.,Ltd. |
GR01 | Patent grant | ||