CN106372668A - Data matching method and device - Google Patents
Data matching method and device
- Publication number
- CN106372668A (application CN201610797496.9A)
- Authority
- CN
- China
- Prior art keywords
- data
- acquisition system
- matched
- undetermined
- data acquisition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a data matching method that aims to solve the relatively low accuracy of matching results obtained in the prior art when data are matched according to the IP address. The method comprises: acquiring a first data set and a second data set, each of which comprises at least one data group, each data group comprising at least two data items; matching the data groups in the first data set against the data groups in the second data set to obtain to-be-matched data group pairs; determining the matching accuracy of each to-be-matched data group pair; and determining at least one matching data group pair from the to-be-matched data group pairs according to the matching accuracy of each to-be-matched data group pair. The invention further discloses a data matching device.
Description
Technical field
The present application relates to the field of computer technology, and in particular to a data matching method and device.
Background
With the continuous development of Internet information technology, recommending information to users through Internet channels has become increasingly common; for example, advertisements can be pushed to users over the Internet.
When information is recommended over the Internet, the recommending party usually cares about the recommendation effect. The recommendation effect here refers to whether the recommended information influences the user who receives it. For example, after receiving the recommended information, the user may click it to view the recommended object, or may visit the website of the party that sent it.
At present, to achieve a good recommendation effect, the recommending party often matches users' click data for the recommended information (for example, the model of the device used when the user clicked the information and the network address of that device) against the recommendation effect that the information produced (for example, a visit to the recommending party's website or a download of the recommending party's application). The matching determines which user clicked which piece of recommended information and what effect resulted; analyzing the matching results then shows for which kinds of users the information produces a good recommendation effect.
Take advertisement recommendation over the Internet as an example. After a user receives an advertisement pushed by the advertiser, the following advertising effect may occur: the user clicks the received advertisement and is redirected to an application market (app store) to download the application recommended in the advertisement.
For this kind of advertising effect, in the prior art the advertisement sender often uses the network address (Internet Protocol address, IP address) from which the user clicked the advertisement and the IP address from which the application was downloaded as the basis for determining the correspondence between the user's click data and the advertising effect.
In practice, however, many companies or schools may share a single public IP address, so a correspondence between click data and advertising effect that is determined from the IP address yields matching results of poor accuracy, and the advertisement sender cannot obtain the desired conclusions by analyzing them.
How to avoid the low accuracy that results when the prior art matches user click data with recommendation effect data according to the IP address has therefore become a problem that urgently needs to be solved.
Summary of the invention
The embodiments of the present application provide a data matching method to solve the problem that, in the prior art, matching data according to the IP address alone yields matching results of low accuracy.
The embodiments of the present application also provide a data matching device to solve the same problem.
The embodiments of the present application adopt the following technical solutions:
A data matching method, comprising:
acquiring a first data set and a second data set, wherein the first data set and the second data set each comprise at least one data group, and each data group comprises at least two data items;
matching the data groups in the first data set against the data groups in the second data set to obtain to-be-matched data group pairs;
determining the matching accuracy of each to-be-matched data group pair; and
determining at least one matching data group pair from the to-be-matched data group pairs according to the matching accuracy of each to-be-matched data group pair.
A data matching device, comprising:
a data set acquiring unit, configured to acquire a first data set and a second data set, wherein the first data set and the second data set each comprise at least one data group, and each data group comprises at least two data items;
a data matching unit, configured to match the data groups in the first data set against the data groups in the second data set to obtain to-be-matched data group pairs;
an accuracy determining unit, configured to determine the matching accuracy of each to-be-matched data group pair; and
a matched data determining unit, configured to determine at least one matching data group pair from the to-be-matched data group pairs according to the matching accuracy of each to-be-matched data group pair.
At least one of the above technical solutions adopted by the embodiments of the present application can achieve the following beneficial effects:
In the data matching method provided by the embodiments of the present application, every data group in the first data set and the second data set comprises at least two data items. The data groups in the first data set are matched against the data groups in the second data set to obtain to-be-matched data group pairs, the matching accuracy of each to-be-matched data group pair is determined, and at least one matching data group pair is determined from the to-be-matched data group pairs according to those matching accuracies. Compared with the prior-art approach of deciding whether two data groups match only by checking whether one particular data item is identical in both, the solution provided by the present application relies on more, and more reasonable, matching criteria when determining matching data groups, so the matching result is more accurate.
Brief description of the drawings
The drawings described here provide a further understanding of the present application and form a part of it. The illustrative embodiments of the present application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:
Fig. 1 is a schematic flowchart of a data matching method provided by an embodiment of the present application;
Fig. 2 is a schematic flowchart of matching click data with effect data provided by an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a data matching device provided by an embodiment of the present application.
Detailed description
To make the purpose, technical solutions, and advantages of the present application clearer, the technical solutions of the present application are described clearly and completely below with reference to specific embodiments and the corresponding drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present application.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the drawings.
The embodiments of the present application provide a data matching method to solve the problem that, in the prior art, matching data according to the IP address alone yields matching results of low accuracy.
The method may be executed by, but is not limited to, at least one terminal device such as a mobile phone, a tablet computer, a personal computer (PC), or a smart television. The method may also be executed by a server, for example the server of a shopping website, the server of an advertising website, or the server of an application download website.
For ease of description, the implementation of the method is described below taking the server of an advertising website as the executing entity. It should be understood that this choice is only an illustrative example and does not limit the method.
The flow of the method is shown schematically in Fig. 1 and mainly comprises the following steps:
Step 11: acquire a first data set and a second data set.
The first data set and the second data set each comprise at least one data group, and each data group comprises at least two data items.
Taking the server of the advertising website as an example: to understand the advertising effect, an advertiser may need the advertising website to match the data of users clicking an advertisement with the effect data of that advertisement, and to feed the advertising effect obtained by analyzing the matching results back to the advertiser. To this end, the advertising website first needs to collect the click data and the effect data of the advertiser's advertisement.
The click data of the advertisement may include, for example, the model of the terminal the user used when clicking the advertisement, the terminal's operating system, the operating system version, the IP address, and the click time. When a user visits the advertising website from a terminal device and clicks an advertisement displayed there, the advertising website can record information about the terminal used, the time of the click, and the user's IP address, and treat the recorded information as the user's click data.
The advertising effect generally refers to whether the advertisement influences the user who receives it, for example whether the user, after seeing an advertisement displayed by the advertising website, clicks it to view the advertised product or visits the advertiser's website. The advertising effect data mentioned here are the data generated when the user, after clicking the advertisement, accesses the product recommended by the advertiser. The effect data may likewise include information about the terminal the user used, time information (for example, the time at which the user visited the advertiser's website or purchased the advertised product), and the IP address of the user's terminal device.
Suppose that a user clicks an advertisement and the resulting advertising effect is a redirect to an app store to download the application recommended in the advertisement. The effect data produced by the click would then typically be the time at which the user downloaded the app, information about the terminal that downloaded it, the IP address, and so on. In this case, information-collecting code can be embedded in the installation package of the promoted app. After the user downloads the package and installs the app on a terminal, the embedded code can collect effect data such as the model of the terminal on which the application was installed, the terminal's operating system, the operating system version number, and the app download time, and send the collected effect data to the server of the advertising website.
Continuing the example in which the advertising website collects ad click data and advertising effect data, the advertising website can store the collected click data and effect data in two separate data sets. Within each data set, each click or each advertising effect forms one data group, and each data group may contain the terminal information, time, IP address, and other data described above.
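The two data sets described above can be pictured with a minimal Python sketch. The class name `DataGroup`, its field names, the sample values, and the calendar date are illustrative assumptions and do not come from the application; only the kinds of fields (terminal information, IP address, time) follow the description.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataGroup:
    """One click event or one effect event (hypothetical field names)."""
    model: str        # terminal model
    os: str           # terminal operating system
    os_version: str   # operating system version
    ip: str           # IP address seen for the event
    time: datetime    # click time or download time


# Two independent data sets; every event is stored as one data group.
# The date below is arbitrary and used only to build valid timestamps.
click_set = [
    DataGroup("aaa", "a", "a1", "192.168.1.122", datetime(2016, 8, 31, 15, 20)),
]
effect_set = [
    DataGroup("aaa", "a", "a1", "192.168.1.122", datetime(2016, 8, 31, 15, 23)),
]
```

Keeping clicks and effects in separate collections mirrors the "two separate data sets" of step 11; each data group here holds more than the minimum of two data items.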
Step 12: match the data groups in the first data set against the data groups in the second data set to obtain to-be-matched data group pairs.
In one embodiment, each data group in the first data set is matched against each data group in the second data set to obtain the to-be-matched data group pairs.
Because every data group in the data sets obtained in step 11 contains at least two data items, data groups in the first data set can be matched against data groups in the second data set by matching their data items with one another correspondingly. In one embodiment, step 12 may be implemented as follows: for any data group in the first data set and any data group in the second data set, data items of the same type in the two data groups are matched against each other; when at least one pair of data items in the two data groups is matched successfully, a to-be-matched data group pair composed of the two data groups is obtained.
For example, let the first data set be the click data collected by the server of the advertising website and the second data set be the advertising effect data collected by that server. The advertising website can match the click data in the click data set against the effect data in the effect data set to determine the effect that corresponds to a particular click. Suppose the click data set contains this click record: phone model aaa, phone operating system a, operating system version a1, IP address 192.168.1.122, ad click time 15:20. Suppose the effect data set contains this effect record: phone model aaa, phone operating system a, operating system version a1, IP address 192.168.1.122, app download time 15:23. Matching the phone information, IP address, and click time in the click record against the phone information, IP address, and download time in the effect record shows that the phone model, operating system, operating system version, and IP address are identical in both records, and that the interval between the click time and the app download time falls within a preset duration. The click record and the effect record can therefore be determined to be a to-be-matched data group pair.
Note that in this scenario, where clicking an advertisement leads to a redirect to the app store to download an application, the user must click the advertisement first and only then download the advertised app from the app store. If the click and the download were triggered by the same user, the click time necessarily precedes the download time, and the interval between them will not be very long. Whether the click time precedes the app download time, and whether the interval between them is less than a preset duration, can therefore serve as a criterion for judging whether a click record matches an effect record.
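The pairing rule in the example above (identical device fields and IP address, click before download, interval within a preset duration) might be sketched as follows. The record layout, the function names, and the 30-minute `MAX_DELAY` are assumptions made for illustration; the application does not state a value for the preset duration. The sketch follows the worked example, in which all same-type fields agree, rather than the looser "at least one pair of data items" variant.

```python
from datetime import datetime, timedelta

# Same-type fields that are compared for equality (hypothetical key names).
FIELDS = ("model", "os", "os_version", "ip")
# Hypothetical preset duration; the application leaves the value open.
MAX_DELAY = timedelta(minutes=30)


def is_candidate(click, effect, max_delay=MAX_DELAY):
    """Return True if a click record and an effect record form a to-be-matched pair."""
    same_fields = all(click[f] == effect[f] for f in FIELDS)
    delay = effect["time"] - click["time"]
    # The click must come first, and the download must follow within the window.
    in_window = timedelta(0) < delay <= max_delay
    return same_fields and in_window


def candidate_pairs(clicks, effects):
    """Compare every click record with every effect record."""
    return [(c, e) for c in clicks for e in effects if is_candidate(c, e)]
```

With the click at 15:20 and the download at 15:23 from the example, the two records form a pair under this sketch; swapping the two times, or stretching the gap beyond `MAX_DELAY`, yields none.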
With the above method, matching the data groups in the first data set against those in the second data set does not rely on a single matching criterion (such as the IP address). Instead, every data item in a data group of the first data set is matched against the corresponding data item in a data group of the second data set, and the two data groups are determined to match only when all of these matches succeed. Because matching data are determined from several matching criteria that must all be satisfied rather than from a single criterion, the method of step 12 determines to-be-matched data group pairs more accurately than the prior art.
Step 13: determine the matching accuracy of each to-be-matched data group pair.
Each to-be-matched data group pair obtained in step 12 contains at least two pairs of matched data items, so the matching accuracy of the pair can be determined from the matching accuracy of each pair of matched data items it contains. In one embodiment, step 13 may be implemented as follows: determine the data matching accuracy of every pair of matched data items in the to-be-matched data group pair; then determine the matching accuracy of the to-be-matched data group pair from the data matching accuracies of its matched data items.
The data items in the data groups of the first and second data sets are not necessarily unique to a single data group. Taking the click data set and the effect data set as an example, the device model in several click records may be huawei, and the device model in several effect records may also be huawei. In one embodiment, the number of times that a matched data item of a to-be-matched data group pair occurs in the first data set and in the second data set is used to determine all the possible ways in which that data item could be matched between the two data sets; the data matching accuracy of every pair of matched data items in the to-be-matched data group pair is then determined from those possible matchings.
In one embodiment, the number of possible matchings between the first data set and the second data set for a matched data item in a to-be-matched data group pair is calculated with the permutation formula, and the data matching accuracy p of every pair of matched data items in the to-be-matched data group pair is then determined from that number using formula [1]:
[Formula [1] is not reproduced in this text.]
where m is the number of times the matched data item occurs in the first data set, and n is the number of times it occurs in the second data set.
Suppose that 10 of the click records in the click data set have the model huawei and 2 of the effect records in the effect data set have the model huawei. If click data and effect data were matched using the model as the only criterion, the huawei click records and the huawei effect records could be matched in [number not reproduced in this text] possible ways, and the accuracy with which a to-be-matched data pair containing the model huawei is a true match would be [value not reproduced in this text].
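The occurrence counts m and n that the accuracy calculation takes as input can be gathered with a short sketch. The function name `occurrence_counts` and the dictionary-based record layout are assumptions for illustration; the sketch only counts occurrences and does not compute p.

```python
from collections import Counter


def occurrence_counts(first_set, second_set, field):
    """Map each value of `field` present in both sets to its counts (m, n).

    m is the number of occurrences in the first data set,
    n is the number of occurrences in the second data set.
    """
    first = Counter(group[field] for group in first_set)
    second = Counter(group[field] for group in second_set)
    shared = first.keys() & second.keys()
    return {value: (first[value], second[value]) for value in shared}
```

For the example above, 10 click records and 2 effect records with the model huawei give the entry `"huawei": (10, 2)`.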
Similarly, for the click time and the download time in the data sets, the data matching accuracy of the two times can be determined on the basis that the click time precedes the download time and the interval between them falls within a preset duration. In one embodiment, the data matching accuracy of the click time and the download time is calculated with the following formula: [formula not reproduced in this text], where t_effect is the application download time and t_click is the click time.
In one embodiment, once the data matching accuracy of every pair of matched data items in a to-be-matched data group pair has been determined by the above steps, those data matching accuracies are combined in a weighted sum, and the result is taken as the matching accuracy of the to-be-matched data group pair.
For example, let the first data set be the click data set and the second data set be the effect data set, and let the matched data items in a to-be-matched data group pair be the model (denoted model below), the operating system (os), the operating system version (osversion), and the IP address (ip). If the data matching accuracies of these matched data items are p(model), p(os), p(osversion), and p(ip) respectively, the matching accuracy of the to-be-matched data group pair is p = p(model) + p(os) + p(osversion) + p(ip).
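The weighted sum can be sketched in a few lines. The function name `pair_accuracy` and the `weights` parameter are illustrative assumptions; with no weights given, every weight is 1, which reproduces the unweighted sum in the example above.

```python
from typing import Dict, Optional


def pair_accuracy(field_accuracy: Dict[str, float],
                  weights: Optional[Dict[str, float]] = None) -> float:
    """Combine per-field data matching accuracies into one pair-level value."""
    if weights is None:
        # Default: all weights equal to 1, i.e. a plain sum.
        weights = {name: 1.0 for name in field_accuracy}
    return sum(weights[name] * p for name, p in field_accuracy.items())
```

Passing explicit weights lets a caller emphasize fields that discriminate better between users; which weights are appropriate is not specified in the application.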
Step 13 thus determines the matching accuracy of every to-be-matched data group pair obtained in step 12. At least one matching data group pair is then determined from the to-be-matched data group pairs according to those matching accuracies, as described in step 14 below.
Step 14: determine at least one matching data group pair from the to-be-matched data group pairs according to the matching accuracy of each to-be-matched data group pair.
The to-be-matched data group pairs are sorted by the matching accuracy determined in step 13, from highest to lowest, and the several pairs with the highest accuracy are selected as the determined matching data group pairs.
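The selection in step 14 amounts to a descending sort followed by taking the first few entries. In this sketch the name `select_top`, the `(pair, accuracy)` tuple layout, and the parameter `k` are assumptions for illustration.

```python
def select_top(scored_pairs, k=1):
    """Return the k to-be-matched pairs with the highest matching accuracy.

    `scored_pairs` is a list of (pair, accuracy) tuples.
    """
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    return ranked[:k]
```

Because Python's sort is stable, pairs with equal accuracy keep their original relative order.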
In one embodiment, to prevent the accuracy of a to-be-matched data group pair obtained in step 13 from exceeding 1, the matching accuracy can be post-processed, for example by taking its base-2 logarithm, log2 p.
Note that once a matching data group pair has been determined in step 14, the data it contains could still affect the matching accuracy of the other to-be-matched data group pairs in the data sets. To avoid this, in one embodiment the data groups contained in the determined matching data group pair are deleted from the first and second data sets, and steps 11 to 14 are performed again on the data sets that remain. Specifically, the method provided by the embodiments of the present application may include: removing the data groups contained in the matching data group pair from the first data set and the second data set to obtain an updated first data set and an updated second data set; and determining at least one matching data group pair from the updated first data set and the updated second data set, until a preset condition is met. The preset condition includes: the number of data groups in the first data set or the second data set is less than one.
Note that the preset condition can be set flexibly as needed. For example, processing may end once a specified number of matching data group pairs has been determined; it is not necessary to repeat the above steps until the first data set or the second data set contains no more than one data group.
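The remove-and-repeat procedure can be sketched as a loop. Everything named here is a hypothetical helper: `find_candidates` stands in for step 12, `score` for step 13, and `max_pairs` for a "specified number of pairs" stop condition. As a simplifying assumption, the sketch stops when either set is exhausted, when no candidates remain, or when `max_pairs` is reached, and it accepts one best pair per round.

```python
def match_iteratively(first_set, second_set, find_candidates, score, max_pairs=None):
    """Repeatedly pick the best pair, remove its data groups, and rematch."""
    first, second = list(first_set), list(second_set)  # work on copies
    matched = []
    while first and second:
        if max_pairs is not None and len(matched) >= max_pairs:
            break
        candidates = find_candidates(first, second)
        if not candidates:
            break
        # Scores are recomputed on the updated sets in every round.
        best = max(candidates, key=lambda pair: score(pair, first, second))
        matched.append(best)
        first.remove(best[0])
        second.remove(best[1])
    return matched
```

Recomputing the score after each removal is the point of the procedure: once a confirmed pair is taken out, the occurrence counts of the remaining values shrink, which can change the ranking of the remaining candidates.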
The following describes in detail a data matching method provided by an embodiment of the present application, taking the matching of ad click data with effect data as an example. The flow of the method is shown in Fig. 2 and mainly comprises the following steps:
Step 21: collect click data and effect data;
Step 22: save the collected click data and effect data in a click data set and an effect data set, respectively;
Step 23: count the number of occurrences of each data item in the click data set and in the effect data set;
Step 24: match the data in the click data set against the data in the effect data set to obtain to-be-matched data pairs;
Step 25: calculate the matching accuracy of each to-be-matched data pair;
Step 26: sort the calculated matching accuracies of the to-be-matched data pairs from highest to lowest and judge whether any matching accuracy equals 1; if so, perform step 27; if not, perform step 28;
Step 27: determine the to-be-matched data pairs whose matching accuracy equals 1 to be matching data pairs, remove the data contained in those matching data pairs from the click data set and the effect data set, and then perform step 23;
Step 28: select the several to-be-matched data pairs with the highest matching accuracy as matching data pairs.
In the data matching method provided by the embodiments of the present application, every data group in the first data set and the second data set comprises at least two data items. The data groups in the first data set are matched against the data groups in the second data set to obtain to-be-matched data group pairs, the matching accuracy of each to-be-matched data group pair is determined, and at least one matching data group pair is determined from the to-be-matched data group pairs according to those matching accuracies. Compared with the prior-art approach of deciding whether two data groups match only by checking whether one particular data item is identical in both, the solution provided by the present application relies on more, and more reasonable, matching criteria when determining matching data groups, so the matching result is more accurate.
The embodiments of the present application further provide a data matching device to solve the problem that, in the prior art, matching data according to the IP address alone yields matching results of low accuracy. The structure of the device is shown schematically in Fig. 3. The device includes a data set acquiring unit 31, a data matching unit 32, an accuracy determining unit 33, and a matched data determining unit 34.
Wherein, data acquisition system acquiring unit 31, for obtaining the first data acquisition system and the second data acquisition system, described first number
Comprise at least one data set according to set, the second data acquisition system respectively, each described data set comprises at least two data;
Data matching unit 32, for the data set and described second data acquisition system that will comprise in described first data acquisition system
In the data set that comprises mated, obtain matched data group pair undetermined;
Accuracy determining unit 33, for determining the matching accuracy of each described matched data group pair undetermined;
Matched data determining unit 34, for the matching accuracy according to each described matched data group pair undetermined, from each institute
State matched data group centering undetermined and determine at least one matched data group pair.
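How the four units cooperate can be sketched as a small pipeline. This is an illustrative sketch only: the class name `DataMatchingDevice`, the callable signatures, the dict-based data groups and the stand-in scoring rule (fraction of equal fields) are all assumptions made for the example, not details given in the application.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Group = Dict[str, str]        # one data group: data type -> value (assumed layout)
Pair = Tuple[Group, Group]    # a candidate (to-be-matched) data group pair


@dataclass
class DataMatchingDevice:
    acquire: Callable[[], Tuple[List[Group], List[Group]]]     # unit 31
    match: Callable[[List[Group], List[Group]], List[Pair]]    # unit 32
    accuracy: Callable[[Pair], float]                          # unit 33
    decide: Callable[[List[Tuple[Pair, float]]], List[Pair]]   # unit 34

    def run(self) -> List[Pair]:
        first, second = self.acquire()
        candidates = self.match(first, second)
        scored = [(pair, self.accuracy(pair)) for pair in candidates]
        return self.decide(scored)


def shared(pair: Pair) -> List[str]:
    """Data types whose values are equal in both groups of the pair."""
    a, b = pair
    return [k for k in a if k in b and a[k] == b[k]]


# Stand-in units, chosen only to make the sketch runnable.
device = DataMatchingDevice(
    acquire=lambda: (
        [{"ip": "1.1.1.1", "os": "android"}],
        [{"ip": "1.1.1.1", "os": "android"}, {"ip": "1.1.1.1", "os": "ios"}],
    ),
    match=lambda f, s: [(a, b) for a in f for b in s if shared((a, b))],
    accuracy=lambda pair: len(shared(pair)) / len(pair[0]),
    decide=lambda scored: [max(scored, key=lambda x: x[1])[0]] if scored else [],
)
result = device.run()
```

Passing the units in as callables keeps each responsibility separate, mirroring the unit split in Fig. 2, so the scoring rule can be swapped without touching the rest.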
In one embodiment, the device further includes a data set updating unit 35, configured to remove the data groups contained in the matched data group pair from the first data set and the second data set, obtaining an updated first data set and an updated second data set;
The data matching unit 32 is further configured to determine at least one matched data group pair from the updated first data set and the updated second data set, until a preset condition is met, the preset condition including: the first data set or the second data set contains no more than one data group.
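The match, remove and re-match loop described here can be sketched as below. The function name `match_iteratively`, the dict-based data groups, the `equal_fields` score and the early exit on a zero score are assumptions for illustration; only the stopping rule (either set holds no more than one data group) comes from the text.

```python
from typing import Callable, Dict, List, Tuple

Group = Dict[str, str]  # one data group: data type -> value (assumed layout)


def match_iteratively(
    first: List[Group],
    second: List[Group],
    score: Callable[[Group, Group], float],
) -> Tuple[List[Tuple[Group, Group]], List[Group], List[Group]]:
    """Repeat match -> remove -> re-match until the preset condition holds."""
    first, second = list(first), list(second)  # work on copies
    matched: List[Tuple[Group, Group]] = []
    while first and second:
        # Score every remaining pair and keep the best one.
        best_score, i, j = max(
            ((score(a, b), i, j)
             for i, a in enumerate(first)
             for j, b in enumerate(second)),
            key=lambda item: item[0],
        )
        if best_score <= 0:  # assumption: a zero score means "no match"
            break
        # Remove the matched groups from both sets (the update step).
        matched.append((first.pop(i), second.pop(j)))
        # Preset condition: either set holds no more than one data group.
        if len(first) <= 1 or len(second) <= 1:
            break
    return matched, first, second


def equal_fields(a: Group, b: Group) -> float:
    """Hypothetical score: number of same-type fields with equal values."""
    return sum(1 for key in a if key in b and a[key] == b[key])


first_set = [{"ip": "A", "os": "x"}, {"ip": "A", "os": "y"}, {"ip": "B", "os": "x"}]
second_set = [{"ip": "A", "os": "y"}, {"ip": "B", "os": "x"}, {"ip": "C", "os": "z"}]
matched, left_first, left_second = match_iteratively(first_set, second_set, equal_fields)
```

Removing confirmed pairs before re-matching means a weak candidate such as the first group (which only shares an IP with another terminal) no longer competes with groups that have already been paired.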
In one embodiment, the data matching unit 32 is specifically configured to match each data group contained in the first data set with each data group contained in the second data set respectively, obtaining to-be-matched data group pairs.
In one embodiment, the data matching unit 32 is specifically configured to: for any data group contained in the first data set and any data group contained in the second data set, match the data of the same type in the two data groups respectively; and when at least one pair of data in the two data groups is matched successfully, obtain a to-be-matched data group pair composed of the two data groups.
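Generating candidate pairs by comparing data of the same type can be sketched as follows. The dict-based data group layout, the field names `ip` and `os`, and treating "matched successfully" as plain equality are assumptions made for this example.

```python
from itertools import product
from typing import Dict, List, Tuple

Group = Dict[str, str]  # one data group: data type -> value (assumed layout)


def matched_types(a: Group, b: Group) -> List[str]:
    """Data types present in both groups whose values are equal."""
    return [t for t in a if t in b and a[t] == b[t]]


def candidate_pairs(
    first: List[Group], second: List[Group]
) -> List[Tuple[Group, Group, List[str]]]:
    """Pair every group of the first set with every group of the second set;
    keep a pair only if at least one same-type piece of data matches."""
    pairs = []
    for a, b in product(first, second):
        types = matched_types(a, b)
        if types:
            pairs.append((a, b, types))
    return pairs


first_set = [{"ip": "10.0.0.1", "os": "android"}]
second_set = [
    {"ip": "10.0.0.1", "os": "ios"},
    {"ip": "10.0.0.2", "os": "windows"},
]
pairs = candidate_pairs(first_set, second_set)
```

Recording which data types matched alongside each pair is a convenience for the later accuracy step, which works per matched piece of data.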
In one embodiment, the accuracy determination unit 33 includes: a first determination subunit 331, configured to determine the data matching accuracy of each pair of matched data in a to-be-matched data group pair; and a second determination subunit 332, configured to determine the matching accuracy of the to-be-matched data group pair according to the data matching accuracy of each pair of matched data in it.
In one embodiment, the first determination subunit 331 is specifically configured to: determine, according to the number of times the matched data in the to-be-matched data group pair occurs in the first data set and in the second data set respectively, all the match cases that can occur for the matched data between the first data set and the second data set; and determine, according to the determined match cases, the data matching accuracy between each pair of matched data in the to-be-matched data group pair;
And/or, the second determination subunit 332 is specifically configured to compute a weighted sum of the data matching accuracies between the pairs of matched data in the to-be-matched data group pair, and to take the weighted sum as the matching accuracy of the to-be-matched data group pair.
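The two inputs to this calculation can be sketched as below: the occurrence counts of a matched value in each data set, and the weighted sum over per-data accuracies. The function names, the dict layout and the example weights are hypothetical. The per-data accuracy values are supplied by hand here, because this sketch deliberately does not assume how they are derived from the occurrence counts.

```python
from typing import Dict, List, Tuple

Group = Dict[str, str]  # one data group: data type -> value (assumed layout)


def occurrence_counts(
    data_type: str, value: str, first: List[Group], second: List[Group]
) -> Tuple[int, int]:
    """Count how often a matched value occurs in each data set."""
    n = sum(1 for g in first if g.get(data_type) == value)
    m = sum(1 for g in second if g.get(data_type) == value)
    return n, m


def pair_accuracy(per_data_accuracy: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted sum of the per-data accuracies of one candidate pair."""
    return sum(weights[t] * p for t, p in per_data_accuracy.items())


first_set = [{"ip": "A", "os": "x"}, {"ip": "A", "os": "y"}]
second_set = [{"ip": "A", "os": "y"}]
n, m = occurrence_counts("ip", "A", first_set, second_set)

# Hand-picked per-data accuracies and weights, for illustration only.
accuracy = pair_accuracy({"ip": 0.5, "os": 0.25}, {"ip": 0.5, "os": 0.5})
```

A value that occurs many times in both sets (for example a shared IP address) is weaker evidence than a rare one, which is why the counts feed into the per-data accuracy before the weighting step.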
In one embodiment, the first data set and the second data set correspond to different terminals respectively; the data contained in a data group includes at least two of the following: hardware information of the terminal; a network address of the terminal; operating system information of the terminal.
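One possible shape for such a data group is sketched below. The class name `TerminalDataGroup`, its field names and the use of `None` for absent data are assumptions; the only rule taken from the text is that a data group carries at least two of the three kinds of data.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TerminalDataGroup:
    hardware_info: Optional[str] = None     # e.g. a device model string
    network_address: Optional[str] = None   # e.g. an IP address
    os_info: Optional[str] = None           # e.g. an operating system name

    def present_fields(self) -> Dict[str, str]:
        """Only the pieces of data that were actually collected."""
        return {k: v for k, v in asdict(self).items() if v is not None}

    def is_valid(self) -> bool:
        """A data group must contain at least two pieces of data."""
        return len(self.present_fields()) >= 2


group = TerminalDataGroup(network_address="10.0.0.1", os_info="android")
ip_only = TerminalDataGroup(network_address="10.0.0.1")
```

In this sketch `group.is_valid()` holds, while `ip_only` is rejected because a single network address is exactly the IP-only evidence the application sets out to improve on.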
With the data matching device provided by the embodiments of the present application, each data group in the first data set and in the second data set contains at least two pieces of data. The data groups contained in the first data set are matched with the data groups contained in the second data set to obtain to-be-matched data group pairs, the matching accuracy of each to-be-matched data group pair is determined, and at least one matched data group pair is determined from the to-be-matched data group pairs according to the determined matching accuracy. The prior art judges whether two data groups match only by checking whether a certain piece of data in the two groups is identical. By comparison, the scheme provided by the present application relies on more, and more reasonable, matching evidence when determining matched data groups, so the accuracy of the matching result is higher.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface and memory.
Memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, commodity or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to the process, method, commodity or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, commodity or device that includes the element.
Those skilled in the art will understand that the embodiments of the present application may be provided as a method, a system or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The above is only an embodiment of the present application and is not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.
Claims (14)
1. A data matching method, characterized by comprising:
obtaining a first data set and a second data set, where the first data set and the second data set each contain at least one data group, and each data group contains at least two pieces of data;
matching the data groups contained in the first data set with the data groups contained in the second data set, to obtain to-be-matched data group pairs;
determining the matching accuracy of each to-be-matched data group pair;
determining at least one matched data group pair from the to-be-matched data group pairs according to the matching accuracy of each to-be-matched data group pair.
2. the method for claim 1 is it is characterised in that accurate according to the coupling of each described matched data group pair undetermined
Degree, determines at least one matched data group to rear from each described matched data group centering undetermined, methods described also includes:
Reject, from described first data acquisition system and described second data acquisition system, the data set that described matched data group centering comprises,
The first data acquisition system after being updated and the second data acquisition system after renewal;
Determine at least one matched data from the first data acquisition system after described renewal and the second data acquisition system after renewal
Group is right, until meeting pre-conditioned, described pre-conditioned inclusion: wrap in described first data acquisition system or described second data acquisition system
The data set containing is not more than one.
3. The method of claim 1, characterized in that matching the data groups contained in the first data set with the data groups contained in the second data set to obtain to-be-matched data group pairs specifically comprises:
matching each data group contained in the first data set with each data group contained in the second data set respectively, to obtain to-be-matched data group pairs.
4. The method of claim 1, characterized in that matching the data groups contained in the first data set with the data groups contained in the second data set to obtain to-be-matched data group pairs specifically comprises:
for any data group contained in the first data set and any data group contained in the second data set, matching the data of the same type in the two data groups respectively;
when at least one pair of data in the two data groups is matched successfully, obtaining a to-be-matched data group pair composed of the two data groups.
5. The method of claim 1, characterized in that determining the matching accuracy of the to-be-matched data group pair specifically comprises:
determining the data matching accuracy of each pair of matched data in the to-be-matched data group pair;
determining the matching accuracy of the to-be-matched data group pair according to the data matching accuracy of each pair of matched data in the to-be-matched data group pair.
6. The method of claim 5, characterized in that determining the data matching accuracy of each pair of matched data in the to-be-matched data group pair specifically comprises:
determining, according to the number of times the matched data in the to-be-matched data group pair occurs in the first data set and in the second data set respectively, all the match cases that can occur for the matched data between the first data set and the second data set;
determining, according to the determined match cases, the data matching accuracy between each pair of matched data in the to-be-matched data group pair;
and/or, determining the matching accuracy of the to-be-matched data group pair according to the data matching accuracy of each pair of matched data in the to-be-matched data group pair specifically comprises:
computing a weighted sum of the data matching accuracies between the pairs of matched data in the to-be-matched data group pair, and taking the weighted sum as the matching accuracy of the to-be-matched data group pair.
7. The method of claim 6, characterized in that the data matching accuracy p between each pair of matched data in the to-be-matched data group pair is determined according to the following formula:
[formula not reproduced in the source text]
where n denotes the number of times the matched data occurs in the first data set, m denotes the number of times the matched data occurs in the second data set, and the permutation symbol in the formula (also not reproduced) denotes the number of arrangements of m items chosen from n pieces of data.
8. The method of any one of claims 1 to 7, characterized in that the first data set and the second data set correspond to different terminals respectively;
the data contained in the data group includes at least two of the following:
hardware information of the terminal;
a network address of the terminal;
operating system information of the terminal.
9. A data matching device, characterized by comprising:
a data set acquisition unit, configured to obtain a first data set and a second data set, where the first data set and the second data set each contain at least one data group, and each data group contains at least two pieces of data;
a data matching unit, configured to match the data groups contained in the first data set with the data groups contained in the second data set, to obtain to-be-matched data group pairs;
an accuracy determination unit, configured to determine the matching accuracy of each to-be-matched data group pair;
a matched data determination unit, configured to determine at least one matched data group pair from the to-be-matched data group pairs according to the matching accuracy of each to-be-matched data group pair.
10. The device of claim 9, characterized in that the device further comprises a data set updating unit, configured to remove the data groups contained in the matched data group pair from the first data set and the second data set, to obtain an updated first data set and an updated second data set;
the data matching unit is further configured to determine at least one matched data group pair from the updated first data set and the updated second data set, until a preset condition is met, the preset condition including: the first data set or the second data set contains no more than one data group.
11. The device of claim 9, characterized in that the data matching unit is specifically configured to match each data group contained in the first data set with each data group contained in the second data set respectively, to obtain to-be-matched data group pairs.
12. The device of claim 9, characterized in that the data matching unit is specifically configured to: for any data group contained in the first data set and any data group contained in the second data set, match the data of the same type in the two data groups respectively;
and when at least one pair of data in the two data groups is matched successfully, obtain a to-be-matched data group pair composed of the two data groups.
13. The device of claim 9, characterized in that the accuracy determination unit comprises:
a first determination subunit, configured to determine the data matching accuracy of each pair of matched data in the to-be-matched data group pair;
a second determination subunit, configured to determine the matching accuracy of the to-be-matched data group pair according to the data matching accuracy of each pair of matched data in the to-be-matched data group pair.
14. The device of claim 13, characterized in that the first determination subunit is specifically configured to: determine, according to the number of times the matched data in the to-be-matched data group pair occurs in the first data set and in the second data set respectively, all the match cases that can occur for the matched data between the first data set and the second data set; and determine, according to the determined match cases, the data matching accuracy between each pair of matched data in the to-be-matched data group pair;
and/or, the second determination subunit is specifically configured to compute a weighted sum of the data matching accuracies between the pairs of matched data in the to-be-matched data group pair, and to take the weighted sum as the matching accuracy of the to-be-matched data group pair.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610797496.9A CN106372668A (en) | 2016-08-31 | 2016-08-31 | Data matching method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106372668A (en) | 2017-02-01 |
Family
ID=57898854
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610797496.9A (Pending) CN106372668A (en) | Data matching method and device | 2016-08-31 | 2016-08-31 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106372668A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107193884A (en) * | 2017-04-27 | 2017-09-22 | 北京小米移动软件有限公司 | A kind of method and apparatus of matched data |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103646110A (en) * | 2013-12-26 | 2014-03-19 | 中国人民银行征信中心 | Natural person basic identity information matching method |
CN103678327A (en) * | 2012-09-04 | 2014-03-26 | 中国移动通信集团四川有限公司 | Method and device for information association |
CN103810527A (en) * | 2008-10-23 | 2014-05-21 | 起元技术有限责任公司 | Method and system for operating data operations, mesuring data quality and joining data elements |
CN104239301A (en) * | 2013-06-06 | 2014-12-24 | 阿里巴巴集团控股有限公司 | Data comparing method and device |
CN104298736A (en) * | 2014-09-30 | 2015-01-21 | 华为软件技术有限公司 | Method and device for aggregating and connecting data as well as database system |
CN104504021A (en) * | 2014-12-11 | 2015-04-08 | 北京国双科技有限公司 | Data matching method and device |
CN105224649A (en) * | 2015-09-29 | 2016-01-06 | 北京奇艺世纪科技有限公司 | A kind of data processing method and device |
CN105630867A (en) * | 2015-12-01 | 2016-06-01 | 广东小天才科技有限公司 | Method and device for matching data |
Non-Patent Citations (1)
Title |
---|
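甄灵敏 (Zhen Lingmin) et al.: "基于属性权重的实体解析技术" (Entity resolution technique based on attribute weights), 《计算机研究与发展》 (Journal of Computer Research and Development) * |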
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170201 |