CN110493368B - Matching method and device of equipment identifiers - Google Patents

Matching method and device of equipment identifiers

Publication number
CN110493368B
Authority
CN
China
Prior art keywords
device identifier
identifier
pair
acquisition
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910775847.XA
Other languages
Chinese (zh)
Other versions
CN110493368A (en
Inventor
林晓明
江金陵
梁秀钦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd
Priority to CN201910775847.XA
Publication of CN110493368A
Application granted
Publication of CN110493368B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/06 Generation of reports
    • H04L43/065 Generation of reports related to network devices
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/09 Mapping addresses
    • H04L61/10 Mapping addresses of different types
    • H04L61/106 Mapping addresses of different types across networks, e.g. mapping telephone numbers to data network addresses
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/60 Types of network addresses
    • H04L2101/618 Details of network addresses
    • H04L2101/622 Layer-2 addresses, e.g. medium access control [MAC] addresses
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L2101/00 Indexing scheme associated with group H04L61/00
    • H04L2101/60 Types of network addresses
    • H04L2101/618 Details of network addresses
    • H04L2101/654 International mobile subscriber identity [IMSI] numbers

Abstract

The invention provides a method and a device for matching device identifiers, wherein the method comprises the following steps: acquiring a first device identifier list and a second device identifier list collected in a target time period; constructing a device identifier pair set according to the first device identifier list and the second device identifier list; dividing the device identifier pair set into a first device identifier pair subset and a second device identifier pair subset; and training an initial model by using the first device identifier pair subset as positive samples and the second device identifier pair subset as negative samples to obtain a target model. The invention solves the problem of low matching efficiency of device identifiers in the related art and thereby improves the matching efficiency of device identifiers.

Description

Matching method and device of equipment identifiers
Technical Field
The invention relates to the field of computers, in particular to a method and a device for matching equipment identifiers.
Background
Wi-Fi probes and electronic fences are effective technologies for collecting device information. However, the two kinds of equipment belong to different services, and because of privacy concerns neither can collect additional information about a mobile phone. The Wi-Fi probe collects the MAC address of the mobile phone, while the electronic fence collects the IMSI number of the mobile phone. No table exists that maps one of these IDs to the other, so the problem to be solved is how to accurately associate the IDs collected by the two kinds of equipment.
In view of the above problems, no effective solution has been proposed.
Disclosure of Invention
The embodiment of the invention provides a method and a device for matching equipment identifiers, which are used for at least solving the problem of low matching efficiency of the equipment identifiers in the related technology.
According to an embodiment of the present invention, there is provided a method for matching device identifiers, including:
acquiring a first equipment identification list and a second equipment identification list acquired in a target time period, wherein the first equipment identification list is used for recording first equipment identifications acquired in a first acquisition mode, and the second equipment identification list is used for recording second equipment identifications acquired in a second acquisition mode;
constructing an equipment identifier pair set according to the first equipment identifier list and the second equipment identifier list, wherein the equipment identifier pair set is used for recording a first equipment identifier and a second equipment identifier which have a corresponding relationship;
dividing the device identifier pair set into a first device identifier pair subset and a second device identifier pair subset, wherein the first device identifier and the second device identifier recorded as corresponding in the first device identifier pair subset match, and the first device identifier and the second device identifier recorded as corresponding in the second device identifier pair subset do not match;
and training an initial model by using the first equipment identification pair subset as a positive sample and the second equipment identification pair subset as a negative sample to obtain a target model, wherein the target model is used for matching the first equipment identification with the second equipment identification.
Optionally, the obtaining the first device identifier list and the second device identifier list collected in the target time period includes:
acquiring a first device identifier in a first acquisition mode in a target time period, and recording first acquisition information when the first device identifier is acquired to obtain a first device identifier list, wherein the first device identifier list records the first device identifier and first acquisition information which have a corresponding relationship, and the first acquisition information is used for indicating an acquisition place and acquisition time when the corresponding first device identifier is acquired;
and acquiring a second equipment identifier in a second acquisition mode in a target time period, and recording second acquisition information when the second equipment identifier is acquired to obtain a second equipment identifier list, wherein the second equipment identifier list records the second equipment identifier and the second acquisition information which have a corresponding relationship, and the second acquisition information is used for indicating an acquisition place and acquisition time when the corresponding second equipment identifier is acquired.
Optionally, constructing the device identifier pair set according to the first device identifier list and the second device identifier list includes:
extracting feature information from the first device identifier list and the second device identifier list, wherein the feature information is used for indicating a relationship between an acquisition place of each first device identifier and an acquisition place of each second device identifier, and/or a relationship between acquisition time of each first device identifier and acquisition time of each second device identifier;
and establishing a corresponding relation among each first equipment identifier, each second equipment identifier and the characteristic information to obtain an equipment identifier pair set.
Optionally, dividing the device identifier pair set into a first device identifier pair subset and a second device identifier pair subset comprises:
determining a screening rule corresponding to the characteristic information;
screening a first equipment identification pair meeting the screening rule from the equipment identification pair set;
determining the first device identification pair as the device identification pair comprised by the first device identification pair subset;
and determining the device identification pairs in the device identification pair set except the first device identification pair as the device identification pairs included in the second device identification pair sub-set.
Optionally, training the initial model by using the first device identifier pair subset as a positive sample and the second device identifier pair subset as a negative sample, and obtaining the target model includes:
determining a label corresponding to the first device identifier pair sub-set as a first label value, and determining a label corresponding to the second device identifier pair sub-set as a second label value, wherein the first label value is used for indicating that the first device identifier is matched with the second device identifier, and the second label value is used for indicating that the first device identifier is not matched with the second device identifier;
inputting the characteristic information recorded in the first equipment identification pair subset into the initial model to obtain a first output value, and inputting the characteristic information recorded in the second equipment identification pair subset into the initial model to obtain a second output value;
adjusting the model parameters of the initial model according to the difference between the first output value and the first label value and the difference between the second output value and the second label value until the adjusted model converges;
and determining the adjusted converged model as a target model.
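The training steps above (label the two subsets, feed the feature information to the model, adjust the parameters from the output/label differences until convergence) can be sketched as follows. This is a minimal illustration only: the source does not name a model type, so logistic regression trained by gradient descent is an assumption, as are the label values 1 and 0, the function names `train` and `predict`, the learning rate, the convergence tolerance, and the toy one-dimensional feature values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Model output in (0, 1): estimated probability that a pair matches."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def train(samples, labels, lr=0.1, tol=1e-6, max_iter=10000):
    """Assumed model: logistic regression fitted by batch gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    prev_loss = float("inf")
    n = len(samples)
    for _ in range(max_iter):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        loss = 0.0
        for x, y in zip(samples, labels):
            p = predict(weights, bias, x)
            err = p - y                      # difference between output and label
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
        # Adjust the model parameters against the averaged gradient.
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
        if abs(prev_loss - loss) < tol:      # treat a flat loss as convergence
            break
        prev_loss = loss
    return weights, bias

# Hypothetical feature vectors: first subset (label 1), second subset (label 0).
positive = [[1.0], [0.9], [0.8]]
negative = [[0.1], [0.2], [0.0]]
weights, bias = train(positive + negative, [1] * 3 + [0] * 3)
```

After training, `predict(weights, bias, features)` plays the role of the target model: an output above a chosen threshold (0.5 here, also an assumption) is read as "the two identifiers match".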
Optionally, the step of screening out a first device identifier pair satisfying the screening rule from the set of device identifier pairs comprises:
screening initial equipment identification pairs meeting the screening rule from the equipment identification pair set;
determining target parameters of an initial equipment identification pair;
and under the condition that the target parameter meets the parameter condition, determining the initial equipment identification pair as a first equipment identification pair.
Optionally, after determining the target parameter of the initial device identification pair, the method further includes:
under the condition that the target parameter does not meet the parameter condition, adjusting the screening rule until the target parameter of the equipment identification pair meeting the adjusted screening rule meets the parameter condition;
and determining the device identification pair with the target parameter meeting the parameter condition as a first device identification pair.
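The adjustment loop described above can be sketched as below. The source does not say what the target parameter, the parameter condition, or the direction of adjustment is, so everything concrete here is hypothetical: the target parameter is taken to be the number of pairs selected, the condition is a minimum count `min_pairs`, and the rule is relaxed by lowering a `min_places` threshold one step at a time.

```python
def select_with_adjustment(pair_place_counts, min_places=10, min_pairs=2, floor=1):
    """Apply the screening rule, relaxing it until enough pairs are selected.

    pair_place_counts maps (mac, imsi) to the number of distinct places at
    which the pair co-occurred (a hypothetical precomputed feature).
    Returns the selected pairs and the threshold that was finally used.
    """
    while True:
        selected = {pair for pair, n in pair_place_counts.items() if n >= min_places}
        # Hypothetical parameter condition: at least `min_pairs` pairs selected.
        if len(selected) >= min_pairs or min_places <= floor:
            return selected, min_places
        min_places -= 1  # adjust the screening rule and screen again

counts = {("M1", "I1"): 12, ("M2", "I2"): 8, ("M3", "I3"): 2}
first_pairs, used_threshold = select_with_adjustment(counts)
```

With these toy counts the initial threshold of 10 selects only one pair, so the rule is relaxed to 8 before the condition is met. A different target parameter, such as the share of selected pairs confirmed by manual checks, would fit the same loop.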
According to another embodiment of the present invention, there is provided an apparatus for matching device identifiers, including:
an acquisition module, configured to acquire a first device identifier list and a second device identifier list collected in a target time period, where the first device identifier list is used to record first device identifiers collected in a first acquisition mode, and the second device identifier list is used to record second device identifiers collected in a second acquisition mode;
a building module, configured to build a device identifier pair set according to the first device identifier list and the second device identifier list, where the device identifier pair set is used to record a first device identifier and a second device identifier that have a correspondence relationship;
a dividing module, configured to divide the device identifier pair set into a first device identifier pair subset and a second device identifier pair subset, where a first device identifier and a second device identifier that have a correspondence in a record in the first device identifier pair subset match, and a first device identifier and a second device identifier that have a correspondence in a record in the second device identifier pair subset do not match;
and the training module is used for training an initial model by using the first equipment identifier pair subset as a positive sample and the second equipment identifier pair subset as a negative sample to obtain a target model, wherein the target model is used for matching the first equipment identifier with the second equipment identifier.
Optionally, the acquisition module includes:
a first acquisition unit, configured to acquire the first device identifier in the target time period in the first acquisition manner, and record first acquisition information when the first device identifier is acquired, so as to obtain a first device identifier list, where the first device identifier list records the first device identifier and the first acquisition information that have a corresponding relationship, and the first acquisition information is used to indicate an acquisition location and an acquisition time when the corresponding first device identifier is acquired;
and the second acquisition unit is used for acquiring the second equipment identifier in the second acquisition mode in the target time period, recording second acquisition information when the second equipment identifier is acquired, and obtaining a second equipment identifier list, wherein the second equipment identifier list records the second equipment identifier and the second acquisition information which have corresponding relations, and the second acquisition information is used for indicating an acquisition place and acquisition time when the corresponding second equipment identifier is acquired.
Optionally, the building module comprises:
an extracting unit, configured to extract feature information from the first device identifier list and the second device identifier list, where the feature information is used to indicate a relationship between an acquisition location of each first device identifier and an acquisition location of each second device identifier, and/or a relationship between an acquisition time of each first device identifier and an acquisition time of each second device identifier;
and the establishing unit is used for establishing a corresponding relation among each first equipment identifier, each second equipment identifier and the characteristic information to obtain the equipment identifier pair set.
Optionally, the dividing module includes:
a first determining unit, configured to determine a filtering rule corresponding to the feature information;
a screening unit, configured to screen out, from the set of device identifier pairs, a first device identifier pair that meets the screening rule;
a second determining unit, configured to determine the first device identification pair as a device identification pair included in the first device identification pair subset;
a third determining unit, configured to determine, as the device identifier pair included in the second device identifier pair subset, a device identifier pair in the device identifier pair set other than the first device identifier pair.
Optionally, the training module comprises:
a fourth determining unit, configured to determine a label corresponding to the first device identifier pair subset as a first label value, and determine a label corresponding to the second device identifier pair subset as a second label value, where the first label value is used to indicate that the first device identifier and the second device identifier match, and the second label value is used to indicate that the first device identifier and the second device identifier do not match;
the input unit is used for inputting the characteristic information recorded in the first equipment identification pair subset into the initial model to obtain a first output value, and inputting the characteristic information recorded in the second equipment identification pair subset into the initial model to obtain a second output value;
an adjusting unit, configured to adjust the model parameters of the initial model according to the difference between the first output value and the first label value and the difference between the second output value and the second label value until the adjusted model converges;
and a fifth determining unit, configured to determine the adjusted converged model as the target model.
Optionally, the screening unit comprises:
a screening subunit, configured to screen an initial device identifier pair that meets the screening rule from the device identifier pair set;
a sixth determining unit, configured to determine a target parameter of the initial device identifier pair;
a seventh determining unit, configured to determine the initial device identifier pair as the first device identifier pair when the target parameter satisfies a parameter condition.
Optionally, the apparatus further comprises:
an adjusting module, configured to, after determining a target parameter of the initial device identifier pair, adjust the screening rule until the target parameter of the device identifier pair that satisfies the adjusted screening rule satisfies the parameter condition when the target parameter does not satisfy the parameter condition;
a determining module, configured to determine, as the first device identifier pair, a device identifier pair in which the target parameter satisfies the parameter condition.
According to a further embodiment of the present invention, there is also provided a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
According to yet another embodiment of the present invention, there is also provided an electronic device, including a memory in which a computer program is stored and a processor configured to execute the computer program to perform the steps in any of the above method embodiments.
According to the invention, a first device identifier list and a second device identifier list collected in a target time period are acquired, wherein the first device identifier list records first device identifiers collected in a first acquisition mode and the second device identifier list records second device identifiers collected in a second acquisition mode; a device identifier pair set is constructed according to the first device identifier list and the second device identifier list, wherein the device identifier pair set records first device identifiers and second device identifiers that have a corresponding relationship; the device identifier pair set is divided into a first device identifier pair subset, in which the corresponding first and second device identifiers match, and a second device identifier pair subset, in which the corresponding first and second device identifiers do not match; and an initial model is trained using the first device identifier pair subset as positive samples and the second device identifier pair subset as negative samples to obtain a target model, which is used for matching the first device identifier with the second device identifier. In other words, a device identifier pair set is constructed from the collected lists and divided into a positive sample set and a negative sample set, a model is trained on these samples, and the trained target model is subsequently used to match the first device identifier with the second device identifier. A rule engine with high precision but low recall is thereby combined with a suitable machine learning algorithm, so that the overall ID fusion algorithm greatly improves recall while preserving precision. Therefore, the problem of low matching efficiency of device identifiers in the related art can be solved, and the matching efficiency of device identifiers is improved.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the invention without limiting the invention. In the drawings:
FIG. 1 is a block diagram of a hardware structure of a mobile terminal according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method of matching device identifiers according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a method of matching device identifiers according to an alternative embodiment of the present invention;
FIG. 4 is a block diagram of an apparatus for matching device identifiers according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an apparatus for matching device identifiers according to an alternative embodiment of the present invention.
Detailed Description
The invention will be described in detail hereinafter with reference to the accompanying drawings in conjunction with embodiments. It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order.
The method provided by the first embodiment of the present application may be executed in a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a mobile terminal as an example, FIG. 1 is a block diagram of the hardware structure of a mobile terminal that runs the method for matching device identifiers according to the embodiment of the present invention. As shown in FIG. 1, the mobile terminal 10 may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)) and a memory 104 for storing data, and optionally may also include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those skilled in the art that the structure shown in FIG. 1 is only an illustration and does not limit the structure of the mobile terminal. For example, the mobile terminal 10 may include more or fewer components than shown in FIG. 1, or have a different configuration from that shown in FIG. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as computer programs corresponding to the device identification matching method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some instances, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the mobile terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the mobile terminal 10. In one example, the transmission device 106 includes a Network Interface Controller (NIC), which can be connected to other network devices through a base station so as to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used for communicating with the internet in a wireless manner.
In this embodiment, a method for matching device identifiers is provided, and fig. 2 is a flowchart of the method for matching device identifiers according to the embodiment of the present invention, as shown in fig. 2, the flowchart includes the following steps:
step S202, a first device identifier list and a second device identifier list acquired in a target time period are acquired, wherein the first device identifier list is used for recording first device identifiers acquired in a first acquisition mode, and the second device identifier list is used for recording second device identifiers acquired in a second acquisition mode;
step S204, constructing an equipment identifier pair set according to the first equipment identifier list and the second equipment identifier list, wherein the equipment identifier pair set is used for recording a first equipment identifier and a second equipment identifier which have a corresponding relationship;
step S206, dividing the device identification pair set into a first device identification pair sub-set and a second device identification pair sub-set, wherein the first device identification having the corresponding relation in the first device identification pair sub-set is matched with the second device identification, and the first device identification having the corresponding relation in the second device identification pair sub-set is not matched with the second device identification;
step S208, training an initial model by using the first device identifier pair subset as positive samples and the second device identifier pair subset as negative samples to obtain a target model, wherein the target model is used for matching the first device identifier with the second device identifier.
Optionally, in this embodiment, the first device identifier may include, but is not limited to, a MAC address of the device, and the second device identifier may include, but is not limited to, an IMSI number of the device.
Through the above steps, a device identifier pair set is constructed from the collected first device identifier list and second device identifier list and divided into a positive sample set and a negative sample set; the positive and negative sample sets are used to train a model, and the trained target model is subsequently used to match the first device identifier with the second device identifier. A rule engine with high precision but low recall is thereby combined with a suitable machine learning algorithm, so that the overall ID fusion algorithm greatly improves recall while preserving precision. Therefore, the problem of low matching efficiency of device identifiers in the related art can be solved, and the matching efficiency of device identifiers is improved.
Optionally, the MAC address of the device may be collected by, but is not limited to, a Wi-Fi probe, and the IMSI number of the device may be collected by, but is not limited to, an electronic fence. For example, in step S202, a first device identifier is collected in a first acquisition mode in the target time period, and first acquisition information is recorded when the first device identifier is collected, so as to obtain a first device identifier list, where the first device identifier list records the first device identifier and the first acquisition information having a corresponding relationship, and the first acquisition information indicates the acquisition place and acquisition time at which the corresponding first device identifier was collected; and a second device identifier is collected in a second acquisition mode in the target time period, and second acquisition information is recorded when the second device identifier is collected, so as to obtain a second device identifier list, where the second device identifier list records the second device identifier and the second acquisition information having a corresponding relationship, and the second acquisition information indicates the acquisition place and acquisition time at which the corresponding second device identifier was collected.
Optionally, the first acquisition mode may be a Wi-Fi probe, and the second acquisition mode may be an electronic fence.
For example, data collected by the Wi-Fi probe are shown in Table 1, and data collected by the electronic fence are shown in Table 2.

TABLE 1
MAC                STARTTIME (time)     LOCATION (place)
DA:A1:19:17:AC:12  2019-01-04 16:20:13  307001
DA:A5:11:19:AC:10  2019-01-04 16:20:12  306015

TABLE 2
IMSI               STARTTIME (time)     LOCATION (place)
460003111370161    2019-01-04 16:20:10  305002
460001211370160    2019-01-04 16:19:11  307920
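
As an illustration of how the records in the two tables above might be held in memory, the sketch below parses each row into a small record type. The class name `Sighting`, the function `parse_row`, and the whitespace-separated row layout are assumptions made for this example; the source only specifies the three columns.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class Sighting:
    """One collected record: an identifier seen at a place at a given time."""
    identifier: str       # MAC (Wi-Fi probe) or IMSI (electronic fence)
    start_time: datetime  # STARTTIME column
    location: str         # LOCATION column, kept as an opaque place code

def parse_row(row: str) -> Sighting:
    # Assumed layout: "<identifier> <YYYY-MM-DD> <HH:MM:SS> <location>"
    identifier, date, time, location = row.split()
    stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
    return Sighting(identifier, stamp, location)

mac_rows = [
    "DA:A1:19:17:AC:12 2019-01-04 16:20:13 307001",
    "DA:A5:11:19:AC:10 2019-01-04 16:20:12 306015",
]
imsi_rows = [
    "460003111370161 2019-01-04 16:20:10 305002",
    "460001211370160 2019-01-04 16:19:11 307920",
]
mac_list = [parse_row(r) for r in mac_rows]    # first device identifier list
imsi_list = [parse_row(r) for r in imsi_rows]  # second device identifier list
```

Keeping the time as a `datetime` makes the later interval comparisons (less than 3 minutes, less than 3 seconds) a simple subtraction.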
Optionally, the device identifier pair set may be constructed by, but is not limited to, extracting feature information from the acquisition information of the device identifiers. For example, in step S204, feature information is extracted from the first device identifier list and the second device identifier list, where the feature information indicates a relationship between the acquisition place of each first device identifier and the acquisition place of each second device identifier, and/or a relationship between the acquisition time of each first device identifier and the acquisition time of each second device identifier; and a corresponding relationship is established among each first device identifier, each second device identifier and the feature information to obtain the device identifier pair set.
Optionally, the characteristic information may include, but is not limited to: the number of places where the MAC and IMSI appear in the time interval of less than 3 minutes, the number of places where the MAC appears, the number of places where the IMSI appears, the number of times that the MAC and IMSI are in the same place and the time interval of appearance is less than 3 seconds, etc.
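The feature information listed above can be sketched as follows. This is a minimal illustration under stated assumptions: rows are `(identifier, time, location)` tuples, a pair is only formed when the MAC and the IMSI are seen at the same place within the 3-minute window, and the function name `extract_features` and the feature keys are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def extract_features(mac_rows, imsi_rows,
                     window=timedelta(minutes=3),
                     tight=timedelta(seconds=3)):
    """Build per-pair features from two lists of (identifier, time, location)."""
    mac_places = defaultdict(set)    # MAC  -> places where it appeared
    imsi_places = defaultdict(set)   # IMSI -> places where it appeared
    imsi_by_loc = defaultdict(list)  # place -> [(IMSI, time), ...]
    for imsi, t, loc in imsi_rows:
        imsi_places[imsi].add(loc)
        imsi_by_loc[loc].append((imsi, t))
    for mac, t, loc in mac_rows:
        mac_places[mac].add(loc)

    co_places = defaultdict(set)     # pair -> places shared within `window`
    tight_hits = defaultdict(int)    # pair -> co-occurrences within `tight`
    for mac, t, loc in mac_rows:
        for imsi, t2 in imsi_by_loc[loc]:
            gap = abs(t - t2)
            if gap < window:
                co_places[(mac, imsi)].add(loc)
            if gap < tight:
                tight_hits[(mac, imsi)] += 1

    return {
        pair: {
            "co_places": len(places),
            "mac_places": len(mac_places[pair[0]]),
            "imsi_places": len(imsi_places[pair[1]]),
            "tight_hits": tight_hits[pair],
        }
        for pair, places in co_places.items()
    }
```

Indexing the IMSI records by place avoids comparing every MAC record with every IMSI record; only records collected at the same place are paired.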
Optionally, the device identification pair subsets may be divided, but are not limited to, by: determining a screening rule corresponding to the characteristic information; screening a first equipment identification pair meeting the screening rule from the equipment identification pair set; determining the first device identification pair as the device identification pair comprised by the first device identification pair subset; and determining the device identification pairs in the device identification pair set except the first device identification pair as the device identification pairs included in the second device identification pair sub-set.
Optionally, the screening rule may also be referred to as a rule engine. The screening rule may be determined based on, but is not limited to, the extracted feature information. For example, the screening rule corresponding to the above feature information may be that a MAC-IMSI pair appears at 10 different places in one day, each time with a time interval of less than 3 s.
For example, in the process of screening positive and negative samples, Fig. 3 is a schematic diagram of a matching method for device identifiers according to an alternative embodiment of the present invention. As shown in Fig. 3, taking the data of day1 and day2 as an example, the screening rule is used to screen the feature tables features1 and features2 of day1 and day2 respectively to obtain table1 and table2, and table1 and table2 are merged and deduplicated to obtain a table. The rows of features1 and features2 whose MAC or IMSI appears in the table are then selected and merged to obtain the table features. For each MAC-IMSI pair in the table features, if the pair appears in the table, it is taken as a positive sample; otherwise it is taken as a negative sample.
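The sample screening of Fig. 3 can be sketched as follows, purely for illustration. It assumes that each day's feature table is a dictionary keyed by `(MAC, IMSI)` and that the rule counts a hypothetical `tight_places` feature (distinct places with a co-occurrence under 3 s); the function names `screen` and `label_samples` are likewise assumptions.

```python
def screen(features, min_places=10):
    """Screening rule: keep pairs seen together at >= min_places places."""
    return {pair for pair, f in features.items()
            if f["tight_places"] >= min_places}


def label_samples(features_by_day, rule=screen):
    """Return [(pair, features, label)] with label 1 = positive, 0 = negative."""
    # Screen every day's feature table, then merge and deduplicate.
    trusted = set()
    for feats in features_by_day:
        trusted |= rule(feats)

    trusted_macs = {mac for mac, _ in trusted}
    trusted_imsis = {imsi for _, imsi in trusted}

    samples = []
    for feats in features_by_day:
        for (mac, imsi), f in feats.items():
            # Keep only rows whose MAC or IMSI appears in the merged table.
            if mac in trusted_macs or imsi in trusted_imsis:
                label = 1 if (mac, imsi) in trusted else 0
                samples.append(((mac, imsi), f, label))
    return samples
```

Restricting the negative samples to rows that share a MAC or an IMSI with a screened pair means each negative is a pairing that competes with a trusted match, rather than an arbitrary unrelated pair.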
Optionally, the initial model may be trained, but not limited to, in the following way:
determining a label corresponding to the first device identifier pair sub-set as a first label value, and determining a label corresponding to the second device identifier pair sub-set as a second label value, wherein the first label value is used for indicating that the first device identifier is matched with the second device identifier, and the second label value is used for indicating that the first device identifier is not matched with the second device identifier;
inputting the characteristic information recorded in the first equipment identification pair subset into the initial model to obtain a first output value, and inputting the characteristic information recorded in the second equipment identification pair subset into the initial model to obtain a second output value;
adjusting the model parameters of the initial model according to the difference between the first output value and the first label value and the difference between the second output value and the second label value until the adjusted model converges;
and determining the adjusted converged model as a target model.
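The training steps above can be sketched as follows. The embodiment does not name a model type, so logistic regression trained by batch gradient descent is used here purely as an assumption; the label values 1 and 0, the learning rate, and the convergence tolerance are also illustrative choices.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, lr=0.5, tol=1e-6, max_iter=10000):
    """samples: list of (feature_vector, label) with label 1 or 0."""
    n = len(samples[0][0])
    m = len(samples)
    w, b = [0.0] * n, 0.0          # parameters of the initial model
    prev_loss = float("inf")
    for _ in range(max_iter):
        grad_w, grad_b, loss = [0.0] * n, 0.0, 0.0
        for x, y in samples:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y            # difference between output and label value
            for j in range(n):
                grad_w[j] += err * x[j]
            grad_b += err
            loss -= y * math.log(p + 1e-12) + (1 - y) * math.log(1 - p + 1e-12)
        # Adjust the model parameters according to the accumulated difference.
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
        loss /= m
        if abs(prev_loss - loss) < tol:   # treat a stalled loss as convergence
            break
        prev_loss = loss
    return w, b                    # the converged (target) model


def predict(model, x):
    """Probability that the pair described by features x is a match."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
```

In this sketch the feature values are assumed to be scaled to a similar range (for example, 0 to 1) before training, so that a single learning rate suits every feature.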
Optionally, the initial device identifier pair may be first screened out according to a screening rule, and then whether the initial device identifier pair satisfies a condition is determined according to a target parameter of the initial device identifier pair, and the initial device identifier pair satisfying the condition is determined as the first device identifier pair. For example: the first device identification pair may be screened out by, but is not limited to: screening initial equipment identification pairs meeting the screening rule from the equipment identification pair set; determining target parameters of an initial equipment identification pair; and under the condition that the target parameter meets the parameter condition, determining the initial equipment identification pair as a first equipment identification pair.
Optionally, after the screening rule is determined, the screening rule may be verified. If the verification passes, the screening rule is used; if the verification fails, the screening rule is replaced with a new one.
Optionally, after determining the target parameter of the initial device identifier pair, under the condition that the target parameter does not satisfy the parameter condition, adjusting the screening rule until the target parameter of the device identifier pair satisfying the adjusted screening rule satisfies the parameter condition; and determining the device identification pair with the target parameter meeting the parameter condition as a first device identification pair.
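A minimal sketch of this adjust-and-verify loop is given below. It assumes, for illustration only, that the screening rule is a single threshold on a per-pair co-occurrence count, that the target parameter is the proportion of one-to-one MAC-IMSI pairs, and that the parameter condition is a minimum proportion; `one_to_one_ratio` and `tune_threshold` are hypothetical names.

```python
from collections import Counter


def one_to_one_ratio(pairs):
    """Share of pairs whose MAC and IMSI each occur in exactly one pair."""
    if not pairs:
        return 0.0
    mac_count = Counter(mac for mac, _ in pairs)
    imsi_count = Counter(imsi for _, imsi in pairs)
    ok = sum(1 for mac, imsi in pairs
             if mac_count[mac] == 1 and imsi_count[imsi] == 1)
    return ok / len(pairs)


def tune_threshold(hits_by_pair, start=1, max_threshold=20, min_ratio=0.9):
    """Raise the rule threshold until the screened pairs meet the condition."""
    for threshold in range(start, max_threshold + 1):
        pairs = {p for p, hits in hits_by_pair.items() if hits >= threshold}
        if pairs and one_to_one_ratio(pairs) >= min_ratio:
            return threshold, pairs      # these become the first pairs
    return None, set()                   # no threshold satisfied the condition
```

Returning `None` when no threshold qualifies leaves the caller free to replace the rule altogether instead of only tightening it.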
Optionally, the target parameters may include, but are not limited to: the proportion of one-to-one MAC-IMSI pairs, the amount of one-to-one MAC-IMSI data, the coincidence rate, the contradiction rate, the difference rate, and the like.
For example, the verification method may include, but is not limited to, the following two modes:
In the first mode, one day of data is used for verification. The proportion of one-to-one MAC-IMSI pairs in the data screened using the rule is checked; the higher the proportion, the better the effect is considered to be, and the verification passes. The amount of one-to-one MAC-IMSI data screened out is also checked; the larger the amount, the better the effect is considered to be, and the verification passes.
In the second mode, two days of data are used for verification. The rule is used to screen each day's data, and the screening results of the two days are compared. The coincidence rate is the rate of matches that occur on both days; the higher the coincidence rate, the better the effect is considered to be. The contradiction rate is the rate at which the results of the two days contradict each other, that is, the same MAC corresponds to different IMSIs; the lower the contradiction rate, the better the effect is considered to be. The difference rate is the rate of MAC-IMSI pairs remaining after the coinciding and contradictory pairs are removed; the larger the difference rate, the more room the rule has for improvement and the worse the effect.
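The two-day comparison can be sketched as follows. The text does not state the denominators of the three rates, so this example assumes each rate is taken over all distinct MAC-IMSI pairs screened on either day; `two_day_rates` is a hypothetical name.

```python
def two_day_rates(day1_pairs, day2_pairs):
    """Compare two sets of screened (MAC, IMSI) pairs from different days."""
    all_pairs = day1_pairs | day2_pairs
    if not all_pairs:
        return {"coincidence": 0.0, "contradiction": 0.0, "difference": 0.0}

    # Coinciding pairs: the same match was screened out on both days.
    overlap = day1_pairs & day2_pairs

    # Contradictory pairs: the MAC also appears on the other day,
    # but paired with a different IMSI.
    macs1 = {mac for mac, _ in day1_pairs}
    macs2 = {mac for mac, _ in day2_pairs}
    contradictory = {(m, i) for m, i in day1_pairs - day2_pairs if m in macs2}
    contradictory |= {(m, i) for m, i in day2_pairs - day1_pairs if m in macs1}

    # Remaining pairs are neither confirmed nor contradicted.
    remaining = all_pairs - overlap - contradictory

    total = len(all_pairs)
    return {
        "coincidence": len(overlap) / total,
        "contradiction": len(contradictory) / total,
        "difference": len(remaining) / total,
    }
```

Under this assumption the three rates always sum to 1, so a rule change that raises the coincidence rate necessarily lowers the other two combined.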
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
In this embodiment, a device for matching device identifiers is further provided, and the device is used to implement the foregoing embodiments and preferred embodiments, which have already been described and are not described again. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware, or a combination of software and hardware is also possible and contemplated.
Fig. 4 is a block diagram of a device identifier matching apparatus according to an embodiment of the present invention, and as shown in fig. 4, the apparatus includes:
an obtaining module 42, configured to obtain a first device identifier list and a second device identifier list acquired in a target time period, where the first device identifier list is used to record a first device identifier acquired in a first acquisition manner, and the second device identifier list is used to record a second device identifier acquired in a second acquisition manner;
a constructing module 44, configured to construct a device identifier pair set according to the first device identifier list and the second device identifier list, where the device identifier pair set is used to record a first device identifier and a second device identifier that have a corresponding relationship;
a dividing module 46, configured to divide the device identifier pair set into a first device identifier pair subset and a second device identifier pair subset, where a first device identifier having a corresponding relationship in a record in the first device identifier pair subset matches a second device identifier, and a first device identifier having a corresponding relationship in a record in the second device identifier pair subset does not match the second device identifier;
and a training module 48, configured to train the initial model using the first device identifier pair subset as a positive sample and the second device identifier pair subset as a negative sample, so as to obtain a target model, where the target model is used to match the first device identifier and the second device identifier.
Optionally, the obtaining module includes:
a first acquisition unit, configured to acquire a first device identifier in a first acquisition manner in a target time period, and record first acquisition information when the first device identifier is acquired, so as to obtain a first device identifier list, where the first device identifier list records the first device identifier and the first acquisition information having a corresponding relationship, and the first acquisition information is used to indicate an acquisition location and an acquisition time when the corresponding first device identifier is acquired;
and a second acquisition unit, configured to acquire a second device identifier in a second acquisition manner in the target time period, and record second acquisition information when the second device identifier is acquired, so as to obtain a second device identifier list, where the second device identifier list records the second device identifier and the second acquisition information having a corresponding relationship, and the second acquisition information is used to indicate an acquisition location and an acquisition time when the corresponding second device identifier is acquired.
Optionally, the constructing module includes:
the extraction unit is used for extracting characteristic information from the first equipment identification list and the second equipment identification list, wherein the characteristic information is used for indicating the relation between the acquisition place of each first equipment identification and the acquisition place of each second equipment identification and/or the relation between the acquisition time of each first equipment identification and the acquisition time of each second equipment identification;
and the establishing unit is used for establishing the corresponding relation among each first equipment identifier, each second equipment identifier and the characteristic information to obtain an equipment identifier pair set.
Optionally, the dividing module includes:
a first determination unit configured to determine a filtering rule corresponding to the feature information;
the screening unit is used for screening out a first equipment identification pair meeting the screening rule from the equipment identification pair set;
a second determining unit, configured to determine the first device identification pair as a device identification pair included in the first device identification pair subset;
a third determining unit, configured to determine device identifier pairs other than the first device identifier pair in the device identifier pair set as device identifier pairs included in the second device identifier pair subset.
Optionally, the training module comprises:
a fourth determining unit, configured to determine a label corresponding to the first device identifier pair subset as a first label value, and determine a label corresponding to the second device identifier pair subset as a second label value, where the first label value is used to indicate that the first device identifier and the second device identifier match, and the second label value is used to indicate that the first device identifier and the second device identifier do not match;
an input unit, configured to input the feature information recorded in the first device identifier pair subset into the initial model to obtain a first output value, and input the feature information recorded in the second device identifier pair subset into the initial model to obtain a second output value;
an adjusting unit, configured to adjust the model parameters of the initial model according to the difference between the first output value and the first label value and the difference between the second output value and the second label value until the adjusted model converges;
and a fifth determining unit, configured to determine the adjusted converged model as the target model.
Optionally, the screening unit includes:
a screening subunit, configured to screen out initial device identifier pairs meeting the screening rule from the device identifier pair set;
a sixth determining unit, configured to determine a target parameter of the initial device identifier pair;
and a seventh determining unit, configured to determine the initial device identifier pair as the first device identifier pair when the target parameter satisfies the parameter condition.
Optionally, the apparatus further comprises:
the adjusting module is used for adjusting the screening rule under the condition that the target parameter does not meet the parameter condition after the target parameter of the initial equipment identification pair is determined until the target parameter of the equipment identification pair meeting the adjusted screening rule meets the parameter condition;
and the determining module is used for determining the equipment identifier pair with the target parameter meeting the parameter condition as a first equipment identifier pair.
It should be noted that the above modules may be implemented by software or hardware. In the latter case, this may be achieved in, but is not limited to, the following manner: the modules are all located in the same processor; or the modules are located in different processors in any combination.
Reference will now be made in detail to the alternative embodiments of the present invention.
An alternative embodiment of the present invention provides a device for matching device identifiers, and fig. 5 is a schematic diagram of a device for matching device identifiers according to an alternative embodiment of the present invention, as shown in fig. 5, the device includes:
a feature extraction module 502 for extracting features from the raw data;
a rule engine module 504 for screening data using rules;
an effect verification module 506 for verifying the effect of the rules engine/model results;
a tag data screening module 508 for selecting appropriate positive and negative samples;
and a model training and predicting module 510, configured to train the initial model using the positive and negative samples to obtain a target model, and predict data using the target model.
Embodiments of the present invention also provide a storage medium having a computer program stored therein, wherein the computer program is arranged to perform the steps of any of the above method embodiments when executed.
Alternatively, in the present embodiment, the storage medium may be configured to store a computer program for executing the steps of:
S1, acquiring a first device identifier list and a second device identifier list acquired in a target time period, wherein the first device identifier list is used for recording first device identifiers acquired in a first acquisition mode, and the second device identifier list is used for recording second device identifiers acquired in a second acquisition mode;
S2, constructing a device identifier pair set according to the first device identifier list and the second device identifier list, wherein the device identifier pair set is used for recording a first device identifier and a second device identifier which have a corresponding relationship;
S3, dividing the device identification pair set into a first device identification pair sub-set and a second device identification pair sub-set, wherein the first device identification with the corresponding relation in the first device identification pair sub-set is matched with the second device identification, and the first device identification with the corresponding relation in the second device identification pair sub-set is not matched with the second device identification;
and S4, training an initial model by using the first device identifier pair subset as a positive sample and the second device identifier pair subset as a negative sample to obtain a target model, wherein the target model is used for matching the first device identifier and the second device identifier.
Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing computer programs, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
Embodiments of the present invention also provide an electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the steps of any of the above method embodiments.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, wherein the transmission device is connected to the processor, and the input/output device is connected to the processor.
Optionally, in this embodiment, the processor may be configured to execute the following steps by a computer program:
S1, acquiring a first device identifier list and a second device identifier list acquired in a target time period, wherein the first device identifier list is used for recording first device identifiers acquired in a first acquisition mode, and the second device identifier list is used for recording second device identifiers acquired in a second acquisition mode;
S2, constructing a device identifier pair set according to the first device identifier list and the second device identifier list, wherein the device identifier pair set is used for recording a first device identifier and a second device identifier which have a corresponding relationship;
S3, dividing the device identification pair set into a first device identification pair sub-set and a second device identification pair sub-set, wherein the first device identification with the corresponding relation in the first device identification pair sub-set is matched with the second device identification, and the first device identification with the corresponding relation in the second device identification pair sub-set is not matched with the second device identification;
and S4, training an initial model by using the first device identifier pair subset as a positive sample and the second device identifier pair subset as a negative sample to obtain a target model, wherein the target model is used for matching the first device identifier and the second device identifier.
Optionally, the specific examples in this embodiment may refer to the examples described in the above embodiments and optional implementation manners, and this embodiment is not described herein again.
It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, they may be centralized on a single computing device or distributed across a network of multiple computing devices, and alternatively, they may be implemented by program code executable by a computing device, such that they may be stored in a storage device and executed by a computing device, and in some cases, the steps shown or described may be performed in an order different than that described herein, or they may be separately fabricated into individual integrated circuit modules, or multiple ones of them may be fabricated into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A method for matching device identifiers, comprising:
acquiring a first device identifier list and a second device identifier list acquired in a target time period, wherein the first device identifier list is used for recording a first device identifier acquired in a first acquisition mode, and the second device identifier list is used for recording a second device identifier acquired in a second acquisition mode;
constructing an equipment identifier pair set according to the first equipment identifier list and the second equipment identifier list, wherein the equipment identifier pair set is used for recording a first equipment identifier and a second equipment identifier which have a corresponding relationship;
dividing the device identification pair set into a first device identification pair sub-set and a second device identification pair sub-set, wherein the first device identification having the corresponding relation in the first device identification pair sub-set is matched with the second device identification, and the second device identification having the corresponding relation in the second device identification pair sub-set is not matched with the first device identification having the corresponding relation in the second device identification pair sub-set;
using the first equipment identification pair subset as a positive sample, and using the second equipment identification pair subset as a negative sample to train an initial model to obtain a target model, wherein the target model is used for matching the first equipment identification with the second equipment identification;
the acquiring the first device identifier list and the second device identifier list acquired in the target time period comprises: acquiring the first equipment identifier in the target time period in the first acquisition mode, and recording first acquisition information when the first equipment identifier is acquired to obtain a first equipment identifier list, wherein the first equipment identifier list records the first equipment identifier and the first acquisition information which have a corresponding relationship, and the first acquisition information is used for indicating an acquisition place and acquisition time when the corresponding first equipment identifier is acquired; acquiring the second equipment identifier in the second acquisition mode in the target time period, and recording second acquisition information when the second equipment identifier is acquired to obtain a second equipment identifier list, wherein the second equipment identifier list records the second equipment identifier and the second acquisition information which have a corresponding relationship, and the second acquisition information is used for indicating an acquisition place and acquisition time when the corresponding second equipment identifier is acquired;
constructing a device identifier pair set according to the first device identifier list and the second device identifier list comprises: extracting feature information from the first device identifier list and the second device identifier list, wherein the feature information is used for indicating a relationship between an acquisition place of each first device identifier and an acquisition place of each second device identifier, and/or a relationship between an acquisition time of each first device identifier and an acquisition time of each second device identifier; and establishing a corresponding relation among each first device identification, each second device identification and the characteristic information to obtain the device identification pair set.
2. The method of claim 1, wherein dividing the device identification pair set into a first device identification pair subset and a second device identification pair subset comprises:
determining a screening rule corresponding to the characteristic information;
screening a first equipment identification pair meeting the screening rule from the equipment identification pair set;
determining the first device identification pair as a device identification pair comprised by the first device identification pair subset;
determining the device identification pairs in the device identification pair set other than the first device identification pair as the device identification pairs included in the second device identification pair subset.
3. An apparatus for matching device identifiers, comprising:
an acquisition module, configured to acquire a first device identifier list and a second device identifier list acquired in a target time period, where the first device identifier list is used to record a first device identifier acquired in a first acquisition manner, and the second device identifier list is used to record a second device identifier acquired in a second acquisition manner;
a building module, configured to build a device identifier pair set according to the first device identifier list and the second device identifier list, where the device identifier pair set is used to record a first device identifier and a second device identifier that have a correspondence relationship;
a dividing module, configured to divide the device identifier pair set into a first device identifier pair subset and a second device identifier pair subset, where a first device identifier and a second device identifier that have a correspondence in a record in the first device identifier pair subset match, and a first device identifier and a second device identifier that have a correspondence in a record in the second device identifier pair subset do not match;
the training module is used for training an initial model by using the first equipment identifier pair subset as a positive sample and the second equipment identifier pair subset as a negative sample to obtain a target model, wherein the target model is used for matching the first equipment identifier with the second equipment identifier;
the acquisition module includes: a first acquisition unit, configured to acquire the first device identifier in the target time period in the first acquisition manner, and record first acquisition information when the first device identifier is acquired, so as to obtain a first device identifier list, where the first device identifier list records the first device identifier and the first acquisition information that have a corresponding relationship, and the first acquisition information is used to indicate an acquisition location and an acquisition time when the corresponding first device identifier is acquired; a second acquisition unit, configured to acquire the second device identifier in the target time period in the second acquisition manner, and record second acquisition information when the second device identifier is acquired, so as to obtain a second device identifier list, where the second device identifier list records the second device identifier and the second acquisition information that have a corresponding relationship, and the second acquisition information is used to indicate an acquisition location and an acquisition time when the corresponding second device identifier is acquired;
the building module comprises: an extracting unit, configured to extract feature information from the first device identifier list and the second device identifier list, where the feature information is used to indicate a relationship between an acquisition location of each first device identifier and an acquisition location of each second device identifier, and/or a relationship between an acquisition time of each first device identifier and an acquisition time of each second device identifier; and the establishing unit is used for establishing a corresponding relation among each first equipment identifier, each second equipment identifier and the characteristic information to obtain the equipment identifier pair set.
4. The apparatus of claim 3, wherein the dividing module comprises:
a first determining unit, configured to determine a filtering rule corresponding to the feature information;
a screening unit, configured to screen out, from the set of device identifier pairs, a first device identifier pair that meets the screening rule;
a second determining unit, configured to determine the first device identification pair as a device identification pair included in the first device identification pair subset;
a third determining unit, configured to determine, as the device identifier pair included in the second device identifier pair subset, a device identifier pair in the device identifier pair set other than the first device identifier pair.
5. A storage medium, in which a computer program is stored, wherein the computer program is arranged to perform the method of any of claims 1 to 2 when executed.
6. An electronic device comprising a memory and a processor, wherein the memory has stored therein a computer program, and wherein the processor is arranged to execute the computer program to perform the method of any of claims 1 to 2.
CN201910775847.XA 2019-08-21 2019-08-21 Matching method and device of equipment identifiers Active CN110493368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910775847.XA CN110493368B (en) 2019-08-21 2019-08-21 Matching method and device of equipment identifiers

Publications (2)

Publication Number Publication Date
CN110493368A CN110493368A (en) 2019-11-22
CN110493368B true CN110493368B (en) 2022-02-25

Family

ID=68552647

Country Status (1)

Country Link
CN (1) CN110493368B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110944290B (en) * 2019-12-02 2021-09-10 北京明略软件系统有限公司 Companion relationship analysis method and apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156210A (en) * 2015-04-23 2016-11-23 腾讯科技(深圳)有限公司 A kind of method and apparatus determining application identities list of matches
CN108985954A (en) * 2018-07-02 2018-12-11 武汉斗鱼网络科技有限公司 A kind of method and relevant device of incidence relation that establishing each mark
CN109886204A (en) * 2019-02-25 2019-06-14 武汉烽火众智数字技术有限责任公司 A kind of Multidimensional Awareness system based on the application of big data police service
CN109951289A (en) * 2019-01-25 2019-06-28 北京三快在线科技有限公司 A kind of recognition methods, device, equipment and readable storage medium storing program for executing

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
EP2628344A1 (en) * 2010-10-11 2013-08-21 Telefonaktiebolaget L M Ericsson (publ) A method for associating a tracking area identity list with a user equipment in a wireless communications network
CN105721629B (en) * 2016-03-24 2019-04-26 百度在线网络技术(北京)有限公司 User identifier matching process and device

Also Published As

Publication number Publication date
CN110493368A (en) 2019-11-22

Similar Documents

Publication Publication Date Title
CN108509658B (en) XML file parsing method and device
CN109635857B (en) Human-vehicle track monitoring and analyzing method, device, equipment and storage medium
CN104283918B (en) A kind of WLAN terminal type acquisition methods and system
CN109982361B (en) Signal interference analysis method, device, equipment and medium
CN112434039A (en) Data storage method, device, storage medium and electronic device
CN106843941B (en) Information processing method, device and computer equipment
CN107483381B (en) Monitoring method and device of associated account
CN105931123A (en) Method and apparatus for recommending friends based on network account
CN104199945A (en) Data storing method and device
CN110493368B (en) Matching method and device of equipment identifiers
WO2017000817A1 (en) Method and device for acquiring matching relationship between data
CN108197050B (en) Equipment identification method, device and system
CN114493028A (en) Method and device for establishing prediction model, storage medium and electronic device
US10419885B2 (en) Communication device and method, and computer program product for associating a mobile telephony identifier and a computer network identifier
CN107094306B (en) Terminal performance evaluation method and device
CN111672128A (en) Game mall game recommendation method and system based on local reserved time identification
CN116302889A (en) Performance test method and device for functional module and server
CN110471926B (en) File establishing method and device
CN112600715B (en) Distribution network operation analysis method and device, storage medium and electronic device
CN115967906A (en) User resident position identification method, terminal, electronic device and storage medium
CN110781178B (en) Data storage method, data storage device, storage medium and electronic device
CN113065058A (en) Family member identification method and device, electronic equipment and readable storage medium
CN111866848B (en) Mobile base station identification method and device and computer equipment
CN113064926B (en) Data screening method and device, storage medium and electronic device
CN112752252B (en) Cell home location identification method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant