CN110674879B - Identification matching method and device, electronic equipment and readable storage medium - Google Patents

Info

Publication number
CN110674879B
Authority
CN
China
Legal status
Active
Application number
CN201910919106.4A
Other languages
Chinese (zh)
Other versions
CN110674879A (en
Inventor
林晓明
江金陵
Current Assignee
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/021 Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W8/00 Network data management
    • H04W8/22 Processing or transfer of terminal data, e.g. status or physical capabilities

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Databases & Information Systems (AREA)
  • Mobile Radio Communication Systems (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application provides an identification matching method, an identification matching device, an electronic device and a readable storage medium. Terminal information of terminal devices collected at various places is first obtained, and the place type of each place is determined according to the number of terminal devices collected at that place. Feature information of a first terminal and a second terminal is then extracted from the terminal information collected at the places corresponding to each place type, and a first feature vector representing the association relation between the first terminal and the second terminal is generated from the feature information corresponding to each place type. Because the place type is taken as the unit for extracting feature information, the vector dimension can be reduced. Whether the first identification and the second identification are from the same terminal is determined by inputting the first feature vector into a matching model. Because the influence of the pedestrian flow of different places on identification matching is taken into account, the accuracy of identification matching can be improved.

Description

Identification matching method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of information matching technologies, and in particular, to an identifier matching method and apparatus, an electronic device, and a readable storage medium.
Background
WIFI probes and mobile phone electronic fences are effective technologies for acquiring information about terminal devices. They are widely deployed at locations such as highway intersections, commercial district intersections and residential community entrances, and can be used for highway speed measurement and for monitoring pedestrian flow in shopping malls and residential communities. However, a WIFI probe collects the Media Access Control (MAC) address of a terminal device, whereas a mobile phone electronic fence collects its International Mobile Subscriber Identity (IMSI). No table maps one kind of terminal identifier to the other, so how to match the two identifiers and determine whether they come from the same terminal is a problem of concern.
At present, the matching approach adopted in the prior art is to deploy a WIFI probe and a mobile phone electronic fence in pairs at the same location and to use a rule engine to match the MAC address and the IMSI number collected from a terminal device at the same time. In locations with heavy pedestrian traffic, however, the WIFI probe and the mobile phone electronic fence each collect many terminal identifiers, so the MAC-IMSI pairs found by the rule engine form many-to-many data sets, and it is impossible to determine accurately which MAC-IMSI pair really comes from the same terminal.
Disclosure of Invention
In view of this, embodiments of the present application provide an identifier matching method, an identifier matching apparatus, an electronic device, and a readable storage medium, which can improve the accuracy of identifier matching by considering the influence of pedestrian flow in different locations on identifier matching.
The application mainly comprises the following aspects:
In a first aspect, an embodiment of the present application provides an identifier matching method, where the identifier matching method includes:
acquiring terminal information of terminal equipment acquired at each place; the terminal information comprises an identifier of the terminal equipment;
for each of the locations, determining a location type of each location based on the number of terminal devices collected at each location;
generating a first feature vector representing the association relation between the first terminal and the second terminal based on terminal information of the first terminal and the second terminal acquired by a place corresponding to each place type in the place types; the collected first identification of the first terminal and the collected second identification of the second terminal are different in identification type;
inputting the first feature vector into a matching model, and determining whether the first identifier and the second identifier are from the same terminal.
In a possible implementation manner, the acquiring terminal information of the terminal device collected at each location includes:
acquiring terminal information of terminal equipment acquired by the paired WIFI probes and the mobile phone electronic fence at each place;
the WIFI probe and the mobile phone electronic fence collect identifiers of different identification types from the terminal equipment.
In a possible embodiment, the determining, for each of the locations, a location type of each location based on the number of terminal devices collected at each location includes:
generating a second feature vector representing the pedestrian volume of each location based on the number of terminal devices acquired by the WIFI probe and/or the mobile phone electronic fence deployed at each location for each of the locations;
and inputting the second feature vector corresponding to each place into a place clustering model, and determining the place type of each place.
In one possible embodiment, each element in the second feature vector includes at least one of the following elements:
average value of the number of terminal devices collected every day; the average value of the number of the terminal devices collected in each preset time period; the average value of the number of the terminal devices collected in each working day; the average value of the number of terminal devices collected on non-working days.
In a possible implementation manner, each element in the first feature vector represents the association relationship between the first terminal and the second terminal as reflected in the terminal information acquired at the places corresponding to one place type; the number of dimensions of the first feature vector is determined by the number of place types and the number of elements corresponding to each place type.
In one possible implementation, the elements in the first feature vector include:
for each place type, among the places corresponding to that place type: the number of places at which the first terminal and the second terminal were collected with a time interval between their appearances of less than a first preset duration; the number of places at which the first terminal appears; the number of places at which the second terminal appears; and the number of times the first terminal and the second terminal appear at the same place with a time interval between their appearances of less than a second preset duration.
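The four elements per place type described above could be computed roughly as follows. This is a minimal sketch rather than the patented implementation: the `Sighting` record, the function name `first_feature_vector`, and the reading that the first element counts places where both terminals were seen at the same place within the first preset duration are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sighting:
    """One collected record (hypothetical structure)."""
    identifier: str   # MAC address or IMSI number
    place: str        # place where the record was collected
    timestamp: float  # collection time in seconds


def first_feature_vector(
    mac_sightings: List[Sighting],
    imsi_sightings: List[Sighting],
    place_type: Dict[str, str],
    type_order: List[str],
    first_duration: float,
    second_duration: float,
) -> List[int]:
    """Concatenate four counts per place type, in the order of type_order."""
    vector: List[int] = []
    for current_type in type_order:
        mac = [s for s in mac_sightings if place_type[s.place] == current_type]
        imsi = [s for s in imsi_sightings if place_type[s.place] == current_type]
        close_places = set()  # places where both appear within first_duration
        close_count = 0       # same-place appearances within second_duration
        for m in mac:
            for i in imsi:
                if m.place != i.place:
                    continue
                gap = abs(m.timestamp - i.timestamp)
                if gap < first_duration:
                    close_places.add(m.place)
                if gap < second_duration:
                    close_count += 1
        vector += [
            len(close_places),
            len({s.place for s in mac}),   # places where the first terminal appears
            len({s.place for s in imsi}),  # places where the second terminal appears
            close_count,
        ]
    return vector
```

The length of the result is the number of place types multiplied by four, which matches the statement that the dimension depends on the number of place types and the number of elements per type, not on the number of individual places.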
In a second aspect, an embodiment of the present application further provides an identifier matching apparatus, where the identifier matching apparatus includes:
the acquisition module is used for acquiring terminal information of the terminal equipment acquired at each place; the terminal information comprises an identifier of the terminal equipment;
a first determining module, configured to determine, for each of the locations, a location type of each location based on the number of terminal devices collected at each location;
the generating module is used for generating a first feature vector representing the association relation between the first terminal and the second terminal based on the terminal information of the first terminal and the second terminal acquired by the places corresponding to each place type in each place type; the collected first identification of the first terminal and the collected second identification of the second terminal are different in identification type;
and the second determining module is used for inputting the first feature vector into a matching model and determining whether the first identifier and the second identifier come from the same terminal.
In a possible implementation manner, the obtaining module is configured to obtain the terminal information according to the following steps:
acquiring terminal information of terminal equipment acquired by the paired WIFI probes and the mobile phone electronic fence at each place;
the WIFI probe and the mobile phone electronic fence collect identifiers of different identification types from the terminal equipment.
In one possible implementation, the first determining module includes:
a generating unit, configured to generate, for each of the various locations, a second feature vector representing a pedestrian volume of each location based on the number of terminal devices acquired by the WIFI probes and/or the mobile phone electronic fences deployed at each location;
and the determining unit is used for inputting the second feature vector corresponding to each place into the place clustering model and determining the place type of each place.
In one possible embodiment, each element in the second feature vector includes at least one of the following elements:
average value of the number of terminal devices collected every day; the average value of the number of the terminal devices collected in each preset time period; the average value of the number of the terminal devices collected in each working day; the average value of the number of terminal devices collected on non-working days.
In a possible implementation manner, each element in the first feature vector represents the association relationship between the first terminal and the second terminal as reflected in the terminal information acquired at the places corresponding to one place type; the number of dimensions of the first feature vector is determined by the number of place types and the number of elements corresponding to each place type.
In one possible implementation, the elements in the first feature vector include:
for each place type, among the places corresponding to that place type: the number of places at which the first terminal and the second terminal were collected with a time interval between their appearances of less than a first preset duration; the number of places at which the first terminal appears; the number of places at which the second terminal appears; and the number of times the first terminal and the second terminal appear at the same place with a time interval between their appearances of less than a second preset duration.
In a third aspect, an embodiment of the present application further provides an electronic device, including: a processor, a memory and a bus, wherein the memory stores machine-readable instructions executable by the processor, and when the electronic device runs, the processor and the memory communicate with each other through the bus, and the machine-readable instructions are executed by the processor to perform the steps of the identity matching method according to the first aspect or any one of the possible implementation manners of the first aspect.
In a fourth aspect, this application provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the identity matching method described in the first aspect or any one of the possible implementation manners of the first aspect are performed.
In the embodiment of the application, the terminal information of the terminal devices collected at each location is acquired, and the location type of each location is determined according to the number of terminal devices collected at that location, that is, according to the pedestrian traffic at that location. Feature information of the first terminal and the second terminal is then extracted from the terminal information collected at the locations corresponding to each location type, and a first feature vector representing the association relationship between the first terminal and the second terminal is generated from the feature information corresponding to each location type. Because the location type is taken as the unit for extracting feature information, the vector dimension can be reduced. By inputting the first feature vector into the matching model, it can be determined whether the first identifier and the second identifier come from the same terminal. Because the influence of the pedestrian traffic at different locations on identifier matching is taken into account, the accuracy of identifier matching can be improved.
In order to make the aforementioned objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are required to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present application and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained from the drawings without inventive effort.
Fig. 1 is a flowchart of an identifier matching method provided in an embodiment of the present application;
Fig. 2 is a first functional block diagram of an identifier matching apparatus provided in an embodiment of the present application;
Fig. 3 is a second functional block diagram of an identifier matching apparatus provided in an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It should be understood that the drawings in the present application are for illustrative and descriptive purposes only and are not used to limit the scope of protection of the present application. Additionally, it should be understood that the schematic drawings are not necessarily drawn to scale. The flowcharts used in this application illustrate operations implemented according to some embodiments of the present application. It should be understood that the operations of the flowcharts need not be performed in the order shown, and that steps with no logical dependency on one another may be performed in reverse order or concurrently. One skilled in the art, under the guidance of this application, may add one or more other operations to, or remove one or more operations from, the flowchart.
In addition, the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
To enable those skilled in the art to utilize the present disclosure, the following embodiments are presented in conjunction with a specific application scenario "matching of terminal identifications", and it will be apparent to those skilled in the art that the general principles defined herein may be applied to other embodiments and application scenarios without departing from the spirit and scope of the present disclosure.
The method, apparatus, electronic device or computer-readable storage medium described in the embodiments of the present application may be applied to any scenario that requires identifier matching, and the embodiments of the present application do not limit a specific application scenario, and any scheme that uses the identifier matching method and apparatus provided in the embodiments of the present application is within the scope of the present application.
It is worth noting that, before the present application was proposed, the existing scheme deployed a WIFI probe and a mobile phone electronic fence in pairs at the same location and used a rule engine to match the MAC address and the IMSI number collected from a terminal device at the same time. In locations with heavy pedestrian traffic, however, the WIFI probe and the mobile phone electronic fence collect many terminal identifiers, so the MAC-IMSI pairs found by the rule engine form many-to-many data sets, and it is impossible to determine accurately which MAC-IMSI pair really comes from the same terminal.
In view of the above problems, in the embodiment of the present application, the terminal information of the terminal devices collected at each location is acquired, and the location type of each location is determined according to the number of terminal devices collected at that location, that is, according to the pedestrian traffic at that location. Feature information of the first terminal and the second terminal is then extracted from the terminal information collected at the locations corresponding to each location type, and a first feature vector representing the association relationship between the first terminal and the second terminal is generated from the feature information corresponding to each location type. Because the location type is taken as the unit for extracting feature information, the vector dimension can be reduced. By inputting the first feature vector into a matching model, it can be determined whether the first identifier and the second identifier come from the same terminal. Because the influence of the pedestrian traffic at different locations on identifier matching is taken into account, the accuracy of identifier matching can be improved.
For the convenience of understanding of the present application, the technical solutions provided in the present application will be described in detail below with reference to specific embodiments.
Fig. 1 is a flowchart of an identifier matching method according to an embodiment of the present application. As shown in fig. 1, the identifier matching method provided in the embodiment of the present application includes the following steps:
S101: acquiring terminal information of terminal equipment acquired at each place; the terminal information includes an identifier of the terminal device.
In specific implementation, the terminal information collected by acquisition devices deployed in advance at each location is obtained. Specifically, the terminal information collected at each location within a preset time period before the current time may be obtained, and the preset time period may be several months or another length of time. Here, the acquisition devices include WIFI probes and mobile phone electronic fences, and the terminal information includes the terminal identifier, the collection location and the collection time.
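As a rough illustration of such a record (terminal identifier, collection location, collection time) and of restricting the data to a preset period before the current time, a sketch might look like the following. The class `TerminalRecord`, its field names, and the helper `records_in_window` are hypothetical names introduced here, not part of the source.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass(frozen=True)
class TerminalRecord:
    """One piece of terminal information (hypothetical field names)."""
    identifier: str         # MAC address or IMSI number
    id_type: str            # "MAC" (WIFI probe) or "IMSI" (electronic fence)
    location: str           # collection location
    collected_at: datetime  # collection time


def records_in_window(
    records: List[TerminalRecord], now: datetime, window: timedelta
) -> List[TerminalRecord]:
    """Keep only records collected within `window` before `now`."""
    start = now - window
    return [r for r in records if start <= r.collected_at <= now]
```

A call such as `records_in_window(records, datetime.now(), timedelta(days=90))` would keep roughly the last three months of data, in line with the "several months" mentioned above.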
The terminal device may be a mobile terminal, a personal computer terminal, or the like.
Further, in S101, acquiring the terminal information of the terminal devices collected at each location includes:
acquiring terminal information of terminal equipment collected at each place by the paired WIFI probes and mobile phone electronic fences; the WIFI probe and the mobile phone electronic fence collect identifiers of different identification types from the terminal equipment.
In specific implementation, the WIFI probe and the mobile phone electronic fence can be deployed in pairs at each place in advance, and then the terminal information of the same terminal device can be acquired through the WIFI probe and the mobile phone electronic fence at the same time.
It should be noted that the WIFI probe and the mobile phone electronic fence provide different services and collect different types of identifier from a terminal device: the WIFI probe collects the MAC address of the terminal device, and the mobile phone electronic fence collects its IMSI number.
S102: for each of the locations, determining the location type of each location based on the number of terminal devices collected at that location.
In a specific implementation, after the terminal information of the terminal devices collected at each location is obtained, the location type of each location can be determined according to the number of terminal devices collected at that location, that is, according to its pedestrian traffic. The terminal information includes terminal identifiers; each terminal identifier represents one terminal device, and one terminal device is usually carried by one person.
It should be noted that the location types are divided according to pedestrian traffic, the number of location types may be set according to actual needs, and locations of the same location type collect a similar number of terminal devices.
Further, in S102, for each of the locations, determining a location type of each location based on the number of terminal devices collected at each location includes the following steps:
Step A: for each of the locations, generating a second feature vector representing the pedestrian volume of the location based on the number of terminal devices collected by the WIFI probe and/or the mobile phone electronic fence deployed at that location.
In a specific implementation, because the WIFI probe and the mobile phone electronic fence are deployed in pairs at each location, the pedestrian volume of a location can be represented by the number of terminal devices collected by the WIFI probe, by the number collected by the mobile phone electronic fence, or by the total number collected by both. Here, the number of terminal devices may be the average number collected at the location per day over the most recent several months, the average number collected at the location in any given time period of the day, or the average number collected at the location per working day over those months. A second feature vector characterizing the pedestrian flow at the location can then be generated from these quantities.
It should be noted that the same calculation method and the same type of data must be used for every location when determining location types: if the location type of one location is determined from the number of terminal devices collected by its WIFI probe, the location types of the other locations should also be determined from the number of terminal devices collected by their WIFI probes.
Step B: inputting the second feature vector corresponding to each location into a location clustering model, and determining the location type of each location.
In a specific implementation, the second feature vector representing the pedestrian volume of each location is input into the location clustering model to determine the location type of each location; the model parameters of the clustering model include the number of location types. Because the clustering considers time as well as the number of people, the locations can be clustered more quickly.
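The source does not name a particular clustering algorithm. Purely as an illustration, the sketch below uses plain k-means, with the number of location types passed in as the parameter `k`; the function names and the deterministic initialisation are assumptions made here.

```python
from typing import Dict, List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Sequence[float]], k: int, iterations: int = 50) -> List[int]:
    """Plain k-means; returns the cluster index assigned to each point."""
    # Deterministic start: spread initial centres over points sorted by total traffic.
    ordered = sorted(points, key=sum)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centres = [list(ordered[round(i * step)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centre.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centres[c]))
            for p in points
        ]
        # Move each centre to the mean of its members (keep it if it has none).
        new_centres = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centres.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centres.append(centres[c])
        if new_centres == centres:
            break
        centres = new_centres
    return labels


def cluster_locations(
    second_vectors: Dict[str, Sequence[float]], k: int
) -> Dict[str, int]:
    """Map each location name to a location-type index."""
    names = list(second_vectors)
    labels = kmeans([second_vectors[name] for name in names], k)
    return dict(zip(names, labels))
```

Any clustering method that takes the number of clusters as a parameter would fit the description equally well; k-means is used here only because it is short to write out.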
Further, each element in the second feature vector comprises at least one of the following elements: average value of the number of terminal devices collected every day; the average value of the number of the terminal devices collected in each preset time period; the average value of the number of the terminal devices collected in each working day; the average value of the number of terminal devices collected on non-working days.
In specific implementation, the preset time periods may be 01:00-02:00, 02:00-03:00, 03:00-04:00, 04:00-05:00, and so on; the working days may be Monday to Friday, and the non-working days may be Saturday, Sunday and other holidays.
In one example, the elements of the second feature vector may include the average number of MAC addresses collected per day; the average number of MAC addresses collected from 01:00 to 02:00, from 02:00 to 03:00, …, and from 24:00 to 01:00; the average number of MAC addresses collected on Monday, …, and on Sunday; and likewise the average number of IMSI numbers collected per day, the average number of IMSI numbers collected from 01:00 to 02:00, and so on.
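A second feature vector of this kind could be assembled along the following lines for one location and one identifier type. This is a sketch under stated assumptions: the function name `second_feature_vector` is hypothetical, averages are taken only over days on which at least one identifier was collected, Saturday and Sunday are treated as the only non-working days (public holidays are ignored), and the per-weekday averages from the example are omitted for brevity.

```python
from collections import defaultdict
from datetime import datetime
from typing import Iterable, List, Tuple


def second_feature_vector(sightings: Iterable[Tuple[str, datetime]]) -> List[float]:
    """Return [daily mean, 24 hourly means, working-day mean, non-working-day mean]."""
    per_day = defaultdict(set)       # date -> distinct identifiers seen that day
    per_day_hour = defaultdict(set)  # (date, hour) -> distinct identifiers
    for identifier, when in sightings:
        per_day[when.date()].add(identifier)
        per_day_hour[(when.date(), when.hour)].add(identifier)

    def mean(values) -> float:
        values = list(values)
        return sum(values) / len(values) if values else 0.0

    days = sorted(per_day)
    daily = mean(len(per_day[d]) for d in days)
    hourly = [
        mean(len(per_day_hour.get((d, hour), ())) for d in days)
        for hour in range(24)
    ]
    working = mean(len(per_day[d]) for d in days if d.weekday() < 5)
    non_working = mean(len(per_day[d]) for d in days if d.weekday() >= 5)
    return [daily, *hourly, working, non_working]
```

Running the same function on the MAC records and on the IMSI records of a location and concatenating the two results would give a vector covering both identifier types, as in the example above.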
S103: generating a first feature vector representing the association relation between the first terminal and the second terminal based on terminal information of the first terminal and the second terminal acquired by a place corresponding to each place type in the place types; the collected first identification of the first terminal and the collected second identification of the second terminal are different in identification type.
In specific implementation, after the location type of each location has been determined, that is, after the locations have been clustered, the feature information of the first terminal and the second terminal is extracted from the terminal information collected at the locations corresponding to each location type, and a first feature vector representing the association relationship between the first terminal and the second terminal is then generated from the feature information corresponding to each location type. Here, the first terminal is a device collected by the WIFI probe and the second terminal is a device collected by the mobile phone electronic fence: the WIFI probe collects the MAC address of the first terminal, and the mobile phone electronic fence collects the IMSI number of the second terminal.
S104: inputting the first feature vector into a matching model, and determining whether the first identifier and the second identifier are from the same terminal.
In a specific implementation, the first feature vector representing the association relationship between the first terminal and the second terminal is input into the matching model to determine whether the first identifier and the second identifier come from the same terminal. Here, different weights may be assigned in advance to the elements corresponding to different location types; for example, the elements of a location type with less pedestrian traffic may be assigned a larger weight. In this way the influence of the pedestrian traffic at different locations on identifier matching is taken into account further, and the accuracy of identifier matching can be improved.
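One way to picture the per-type weighting is to scale each location type's block of elements before the vector reaches the matching model. The sketch below assumes the vector is laid out as consecutive blocks of four elements per location type; the function name `apply_type_weights` and the weight values are hypothetical.

```python
from typing import Dict, List, Sequence


def apply_type_weights(
    vector: Sequence[float],
    type_order: List[str],
    weights: Dict[str, float],
    elements_per_type: int = 4,
) -> List[float]:
    """Multiply each location type's block of elements by that type's weight."""
    if len(vector) != len(type_order) * elements_per_type:
        raise ValueError("vector length does not match the number of location types")
    weighted: List[float] = []
    for index, location_type in enumerate(type_order):
        start = index * elements_per_type
        block = vector[start:start + elements_per_type]
        weighted.extend(weights[location_type] * value for value in block)
    return weighted


# Illustrative weights only: quieter location types count for more.
example = apply_type_weights(
    [1, 1, 1, 1, 1, 1, 2, 0],
    ["quiet", "busy"],
    {"quiet": 2.0, "busy": 0.5},
)
```

Whether explicit weights are needed depends on the matching model; a trained model may also learn the relative importance of each location type from the data.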
It should be noted that, in the prior art, a rule engine is used to match MAC addresses and IMSI numbers that are acquired at the same time. The prior art does not consider the influence of pedestrian flow at different locations on identifier matching: at a location with heavy pedestrian flow, the WIFI probe and the mobile phone electronic fence each acquire many terminal identifiers, so the MAC-IMSI pairs found by the rule engine form many-to-many data sets, and it is impossible to determine accurately which MAC-IMSI pair really comes from the same terminal. To address this problem, the present application takes the influence of pedestrian flow at different locations into account; that is, an IMSI-MAC pair that appears simultaneously at a sparsely trafficked intersection is more likely to be a real IMSI-MAC pair than one that appears simultaneously at an intersection with heavy pedestrian flow. Accordingly, the feature information of the first terminal and of the second terminal is extracted from the terminal information acquired at the locations corresponding to each location type, a first feature vector is generated from the feature information corresponding to each location type, and whether the first identifier and the second identifier come from the same terminal is determined from the first feature vector, which can improve the accuracy of identifier matching. Because the location type is used as the unit for extracting feature information, the vector dimension can be reduced, which in turn reduces the amount of calculation in the identifier matching process.
Further, the matching model may be trained in advance. Specifically, the feature vector corresponding to a MAC-IMSI pair determined to come from the same terminal device is used as a positive sample, the feature vector corresponding to a MAC-IMSI pair determined not to come from the same terminal device is used as a negative sample, labels are added to the positive samples and the negative samples, and the initial matching model is trained on the positive samples, the negative samples and their labels to obtain the trained matching model. Here, the matching model may be any one of a logistic regression model, a support vector machine model, a random forest model, a machine learning model, and a deep learning model.
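The training step can be illustrated with a small logistic regression fitted by plain gradient descent, logistic regression being one of the model families listed above. This is a sketch under assumptions rather than the actual model of the application: the function names, the learning rate, the number of epochs, and the two-element toy samples (standing in for real first feature vectors) are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, b, x):
    """Linear score of feature vector x under weights w and bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_matching_model(samples, labels, lr=0.1, epochs=10000):
    """Fit logistic regression by batch gradient descent on the mean log loss."""
    dim = len(samples[0])
    w, b = [0.0] * dim, 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(samples, labels):
            err = sigmoid(score(w, b, x)) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def same_terminal(w, b, x, threshold=0.5):
    """Return True when the model judges the MAC-IMSI pair to be one terminal."""
    return sigmoid(score(w, b, x)) >= threshold


# Toy samples (assumed): pairs from the same terminal co-occur often.
positives = [[4, 5], [5, 6], [6, 8]]   # label 1: same terminal
negatives = [[0, 0], [1, 0], [0, 1]]   # label 0: different terminals
samples = positives + negatives
labels = [1] * len(positives) + [0] * len(negatives)

w, b = train_matching_model(samples, labels)
predictions = [same_terminal(w, b, x) for x in samples]
print(predictions)
```

In practice the samples would be full first feature vectors, and a library implementation of any of the listed model families could replace the hand-written training loop.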
Further, each element in the first feature vector represents an association relationship between the terminal information of the first terminal and the terminal information of the second terminal acquired at the locations corresponding to each location type; the number of dimensions of the first feature vector is determined by the number of location types and the number of elements corresponding to each location type.
In a specific implementation, each element in the first feature vector is calculated from the terminal information acquired at the locations corresponding to each location type, and the terminal information includes the acquisition time, the acquisition location and the terminal identifier. The number of elements corresponding to each location type can be set according to actual needs. Because the location type is used as the unit for extracting feature information, the vector dimension can be reduced, which reduces the amount of calculation needed to perform identifier matching with the first feature vector.
In an example, if the number of location types is m and the number of elements corresponding to each location type is n, the dimension number of the first feature vector is m × n.
Further, for each location type, the elements in the first feature vector include: the number of locations of that type at which the first terminal and the second terminal were captured within a time interval shorter than a first preset duration; the number of locations of that type at which the first terminal appears; the number of locations of that type at which the second terminal appears; and the number of times the first terminal and the second terminal were captured at the same location of that type within a time interval shorter than a second preset duration. Here, the first preset duration and the second preset duration may be set according to actual needs.
In one example, there are two location types A1 and A2, the first preset duration is 3 seconds, the second preset duration is 2 seconds, and the number of elements corresponding to each location type is 4, so the number of dimensions of the first feature vector between the first terminal and the second terminal is 2 × 4 = 8. In the A1 location type, the number of locations where the first terminal and the second terminal appear within a time interval of less than 3 seconds is 10, the number of locations where the first terminal appears is 8, the number of locations where the second terminal appears is 12, and the number of times the first terminal and the second terminal appear at the same location within a time interval of less than 2 seconds is 15. In the A2 location type, the number of locations where the first terminal and the second terminal appear within a time interval of less than 3 seconds is 11, the number of locations where the first terminal appears is 9, the number of locations where the second terminal appears is 10, and the number of times the first terminal and the second terminal appear at the same location within a time interval of less than 2 seconds is 12. The first feature vector is therefore (10, 8, 12, 15, 11, 9, 10, 12).
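The four elements per location type can be computed from raw sightings roughly as follows. This is a hedged sketch, not the patented implementation: the function name first_feature_vector, the (location, timestamp) record layout and the sample data are assumptions, and "number of times" is interpreted here as the number of sighting pairs at the same location whose time gap is below the second preset duration.

```python
from collections import defaultdict


def first_feature_vector(mac_sightings, imsi_sightings, location_type,
                         type_order, t1=3.0, t2=2.0):
    """Build the first feature vector for one MAC-IMSI pair.

    mac_sightings / imsi_sightings: lists of (location_id, timestamp_seconds).
    location_type: dict mapping location_id to its location type.
    """
    mac_by_loc = defaultdict(list)
    for loc, ts in mac_sightings:
        mac_by_loc[loc].append(ts)
    imsi_by_loc = defaultdict(list)
    for loc, ts in imsi_sightings:
        imsi_by_loc[loc].append(ts)

    vector = []
    for ltype in type_order:
        mac_locs = {l for l in mac_by_loc if location_type[l] == ltype}
        imsi_locs = {l for l in imsi_by_loc if location_type[l] == ltype}
        close_locations = 0  # locations with at least one gap below t1
        close_times = 0      # sighting pairs with a gap below t2
        for loc in mac_locs & imsi_locs:
            gaps = [abs(a - b) for a in mac_by_loc[loc] for b in imsi_by_loc[loc]]
            if any(gap < t1 for gap in gaps):
                close_locations += 1
            close_times += sum(1 for gap in gaps if gap < t2)
        vector.extend([close_locations, len(mac_locs), len(imsi_locs), close_times])
    return vector


# Hypothetical data: three locations, two location types.
location_type = {"p1": "A1", "p2": "A1", "p3": "A2"}
mac = [("p1", 100.0), ("p1", 200.0), ("p2", 50.0), ("p3", 10.0)]
imsi = [("p1", 101.0), ("p1", 202.5), ("p3", 20.0)]
vector = first_feature_vector(mac, imsi, location_type, ["A1", "A2"])
print(vector)
```

With m location types and 4 elements per type the result always has m × 4 entries, independent of how many individual locations are deployed.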
In the embodiment of the application, the terminal information of the terminal devices collected at each location is obtained, and the location type of each location is determined according to the number of terminal devices collected at that location, that is, according to its pedestrian flow. The feature information of the first terminal and the second terminal is then extracted from the terminal information collected at the locations corresponding to each location type, and a first feature vector representing the association relationship between the first terminal and the second terminal is generated from the feature information corresponding to each location type. Because the location type is used as the unit for extracting feature information, the vector dimension can be reduced. By inputting the first feature vector into the matching model, it can be determined whether the first identifier and the second identifier come from the same terminal; because the influence of pedestrian flow at different locations on identifier matching is taken into account, the accuracy of identifier matching can be improved.
Based on the same application concept, an identifier matching device corresponding to the identifier matching method shown in fig. 1 is further provided in the embodiment of the present application. Since the principle by which the device solves the problem is similar to that of the identifier matching method shown in fig. 1, the implementation of the device may refer to the implementation of the method, and repeated description is omitted.
Referring to fig. 2, there is shown one functional block diagram of an identifier matching apparatus 200 according to an embodiment of the present application, and referring to fig. 3, there is shown another functional block diagram of an identifier matching apparatus 200 according to an embodiment of the present application. As shown in fig. 2 and 3, the identification matching apparatus 200 includes:
an obtaining module 210, configured to obtain terminal information of terminal devices collected in various places; the terminal information comprises an identifier of the terminal equipment;
a first determining module 220, configured to determine, for each of the locations, a location type of each location based on the number of terminal devices collected at each location;
a generating module 230, configured to generate a first feature vector representing an association relationship between a first terminal and a second terminal based on terminal information of the first terminal and the second terminal acquired by a place corresponding to each place type in each place type; the collected first identification of the first terminal and the collected second identification of the second terminal are different in identification type;
a second determining module 240, configured to input the first feature vector into a matching model, and determine whether the first identifier and the second identifier are from the same terminal.
In a possible implementation manner, as shown in fig. 2 and fig. 3, the obtaining module 210 is configured to obtain the terminal information according to the following steps:
acquiring terminal information of terminal equipment acquired by the paired WIFI probes and the mobile phone electronic fence at each place;
the identifiers of the terminal equipment acquired by the WIFI probe and by the mobile phone electronic fence are of different identification types.
In one possible implementation, as shown in fig. 3, the first determining module 220 includes:
a generating unit 222, configured to generate, for each of the various locations, a second feature vector representing a human traffic of each location based on the number of terminal devices collected by the WIFI probes and/or the mobile phone electronic fences deployed at each location;
a determining unit 224, configured to input the second feature vector corresponding to each location into the location clustering model, and determine a location type of each location.
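The location clustering model is not tied to a particular algorithm in the text above. As one possible illustration, the sketch below groups locations with a small k-means routine; the choice of k-means, the deterministic initialisation, the location names and the traffic figures are all assumptions made for the example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iterations=20):
    """Cluster vectors into k groups with Lloyd's algorithm.

    Initial centroids are picked deterministically: vectors are ordered by
    total traffic and k evenly spaced ones are taken.
    """
    ordered = sorted(vectors, key=sum)
    step = (len(ordered) - 1) / max(k - 1, 1)
    centroids = [list(ordered[round(i * step)]) for i in range(k)]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assign each vector to its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Hypothetical second feature vectors: [daily avg, workday avg, non-workday avg].
locations = ["alley", "lane", "park_gate", "station", "mall", "crossing"]
second_vectors = [
    [20, 18, 25], [25, 22, 30], [30, 28, 35],
    [900, 950, 800], [1000, 900, 1200], [950, 1000, 850],
]
assignment, centroids = kmeans(second_vectors, k=2)
location_types = dict(zip(locations, assignment))
print(location_types)
```

Here cluster 0 collects the low-traffic locations and cluster 1 the high-traffic ones; the cluster index then serves as the location type in later steps.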
In one possible embodiment, each element in the second feature vector includes at least one of the following elements:
average value of the number of terminal devices collected every day; the average value of the number of the terminal devices collected in each preset time period; the average value of the number of the terminal devices collected in each working day; the average value of the number of terminal devices collected on non-working days.
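A second feature vector built from three of the averages listed above might be computed as in the following sketch. It rests on assumptions: the function name second_feature_vector and the sample counts are hypothetical, working days are approximated as Monday to Friday (public holidays are ignored), and the per-time-period average is left out for brevity.

```python
from datetime import date


def second_feature_vector(daily_counts):
    """Return [daily average, working-day average, non-working-day average].

    daily_counts: dict mapping a date to the number of terminal devices
    collected at one location on that day.
    """
    def mean(values):
        values = list(values)
        return sum(values) / len(values) if values else 0.0

    workdays = [n for d, n in daily_counts.items() if d.weekday() < 5]
    non_workdays = [n for d, n in daily_counts.items() if d.weekday() >= 5]
    return [mean(daily_counts.values()), mean(workdays), mean(non_workdays)]


# Hypothetical counts for one location (23 Sep 2019 was a Monday).
daily_counts = {
    date(2019, 9, 23): 100,  # Monday
    date(2019, 9, 24): 120,  # Tuesday
    date(2019, 9, 28): 300,  # Saturday
    date(2019, 9, 29): 280,  # Sunday
}
traffic_vector = second_feature_vector(daily_counts)
print(traffic_vector)
```

One such vector per location is what a clustering model would then take as input to assign location types.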
In a possible implementation manner, each element in the first feature vector represents an association relationship between the terminal information of the first terminal and the terminal information of the second terminal acquired at the locations corresponding to each location type; the number of dimensions of the first feature vector is determined by the number of location types and the number of elements corresponding to each location type.
In one possible implementation, the elements in the first feature vector include:
the collected number of places with the time interval between the occurrence of the first terminal and the occurrence of the second terminal being less than a first preset duration is acquired at the place corresponding to each place type in each place type; the number of the places where the first terminal appears is acquired according to the place corresponding to each place type in the place types; the number of the places where the second terminal appears is acquired from the place corresponding to each place type in the place types; and the times of the occurrence time interval of the first terminal and the second terminal in the same place, which is acquired by the place corresponding to each place type in each place type, is less than a second preset time length.
In this embodiment, the obtaining module 210 obtains the terminal information of the terminal devices collected at each location, and the first determining module 220 determines the location type of each location according to the number of terminal devices collected at that location, that is, according to its pedestrian flow. The feature information of the first terminal and the second terminal is then extracted from the terminal information collected at the locations corresponding to each location type, and the first feature vector representing the association relationship between the first terminal and the second terminal is generated from the feature information corresponding to each location type. Because the location type is used as the unit for extracting feature information, the vector dimension can be reduced. By inputting the first feature vector into the matching model, it can be determined whether the first identifier and the second identifier come from the same terminal; because the influence of pedestrian flow at different locations on identifier matching is taken into account, the accuracy of identifier matching can be improved.
Based on the same application concept, referring to fig. 4, which is a schematic structural diagram of an electronic device 400 provided in the embodiment of the present application, the electronic device 400 includes: a processor 410, a memory 420 and a bus 430, wherein the memory 420 stores machine-readable instructions executable by the processor 410; when the electronic device 400 is running, the processor 410 communicates with the memory 420 via the bus 430, and the machine-readable instructions are executed by the processor 410 to perform the steps of the identifier matching method shown in fig. 1.
In particular, the machine readable instructions, when executed by the processor 410, may perform the following:
acquiring terminal information of terminal equipment acquired at each place; the terminal information comprises an identifier of the terminal equipment;
for each of the various sites, determining a site type for each site based on the number of terminal devices collected at each site;
generating a first feature vector representing the association relation between the first terminal and the second terminal based on terminal information of the first terminal and the second terminal acquired by a place corresponding to each place type in the place types; the collected first identification of the first terminal and the collected second identification of the second terminal are different in identification type;
inputting the first feature vector into a matching model, and determining whether the first identifier and the second identifier are from the same terminal.
In the embodiment of the application, the terminal information of the terminal devices collected at each location is obtained, and the location type of each location is determined according to the number of terminal devices collected at that location, that is, according to its pedestrian flow. The feature information of the first terminal and the second terminal is then extracted from the terminal information collected at the locations corresponding to each location type, and a first feature vector representing the association relationship between the first terminal and the second terminal is generated from the feature information corresponding to each location type. Because the location type is used as the unit for extracting feature information, the vector dimension can be reduced. By inputting the first feature vector into the matching model, it can be determined whether the first identifier and the second identifier come from the same terminal; because the influence of pedestrian flow at different locations on identifier matching is taken into account, the accuracy of identifier matching can be improved.
Based on the same application concept, the embodiments of the present application further provide a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the identity matching method shown in fig. 1 are performed.
Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk or a hard disk, and when the computer program on the storage medium is run, the identifier matching method can be executed.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (8)

1. An identity matching method, characterized in that the identity matching method comprises:
acquiring terminal information of terminal equipment acquired at each place; the terminal information comprises an identifier of the terminal equipment;
for each of the various sites, determining a site type for each site based on the number of terminal devices collected at each site;
generating a first feature vector representing the association relation between the first terminal and the second terminal based on terminal information of the first terminal and the second terminal acquired by a place corresponding to each place type in the place types; the collected first identification of the first terminal and the collected second identification of the second terminal are different in identification type; each element in the first feature vector represents an association relationship between the first terminal and the second terminal and the terminal information acquired at the place corresponding to each place type; the dimension number of the first feature vector is determined by the number of the location types and the number of elements corresponding to each location type;
the elements in the first feature vector include: the collected number of places with the time interval between the occurrence of the first terminal and the occurrence of the second terminal being less than a first preset duration is acquired at the place corresponding to each place type in each place type; the number of the places where the first terminal appears is acquired according to the place corresponding to each place type in the place types; the number of the places where the second terminal appears is acquired from the place corresponding to each place type in the place types; the times that the time interval of the occurrence time of the first terminal and the second terminal in the same place is smaller than a second preset time length are acquired for the place corresponding to each place type in each place type;
inputting the first feature vector into a matching model, and determining whether the first identifier and the second identifier are from the same terminal.
2. The identifier matching method according to claim 1, wherein the acquiring the terminal information of the terminal device collected at each location includes:
acquiring terminal information of terminal equipment acquired by the paired WIFI probes and the mobile phone electronic fence at each place;
the identifiers of the terminal equipment acquired by the WIFI probe and by the mobile phone electronic fence are of different identification types.
3. The identity matching method according to claim 2, wherein the determining, for each of the locations, a location type for each location based on the number of terminal devices collected at each location comprises:
generating a second feature vector representing the pedestrian volume of each location based on the number of terminal devices acquired by the WIFI probe and/or the mobile phone electronic fence deployed at each location for each of the locations;
and inputting the second feature vector corresponding to each place into a place clustering model, and determining the place type of each place.
4. The identity matching method of claim 3, wherein each element in the second feature vector comprises at least one of:
average value of the number of terminal devices collected every day; the average value of the number of the terminal devices collected in each preset time period; the average value of the number of the terminal devices collected in each working day; the average value of the number of terminal devices collected on non-working days.
5. An identity matching apparatus, characterized in that the identity matching apparatus comprises:
the acquisition module is used for acquiring terminal information of the terminal equipment acquired at each place; the terminal information comprises an identifier of the terminal equipment;
a first determining module, configured to determine, for each of the locations, a location type of each location based on the number of terminal devices collected at each location;
the generating module is used for generating a first feature vector representing the association relation between the first terminal and the second terminal based on the terminal information of the first terminal and the second terminal acquired by the places corresponding to each place type in each place type; the collected first identification of the first terminal and the collected second identification of the second terminal are different in identification type; each element in the first feature vector represents an association relationship between the first terminal and the second terminal and the terminal information acquired at the place corresponding to each place type; the dimension number of the first feature vector is determined by the number of the location types and the number of elements corresponding to each location type;
the elements in the first feature vector include: the collected number of places with the time interval between the occurrence of the first terminal and the occurrence of the second terminal being less than a first preset duration is acquired at the place corresponding to each place type in each place type; the number of the places where the first terminal appears is acquired according to the place corresponding to each place type in the place types; the number of the places where the second terminal appears is acquired from the place corresponding to each place type in the place types; the times that the time interval of the occurrence time of the first terminal and the second terminal in the same place is smaller than a second preset time length are acquired for the place corresponding to each place type in each place type;
and the second determining module is used for inputting the first feature vector into a matching model and determining whether the first identifier and the second identifier come from the same terminal.
6. The identifier matching apparatus according to claim 5, wherein the obtaining module is configured to obtain the terminal information according to the following steps:
acquiring terminal information of terminal equipment acquired by the paired WIFI probes and the mobile phone electronic fence at each place;
the identifiers of the terminal equipment acquired by the WIFI probe and by the mobile phone electronic fence are of different identification types.
7. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is run, the machine-readable instructions when executed by the processor performing the steps of the identity matching method of any of claims 1 to 4.
8. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when being executed by a processor, performs the steps of the identity matching method according to one of the claims 1 to 4.
CN201910919106.4A 2019-09-26 2019-09-26 Identification matching method and device, electronic equipment and readable storage medium Active CN110674879B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910919106.4A CN110674879B (en) 2019-09-26 2019-09-26 Identification matching method and device, electronic equipment and readable storage medium


Publications (2)

Publication Number Publication Date
CN110674879A CN110674879A (en) 2020-01-10
CN110674879B true CN110674879B (en) 2022-03-25

Family

ID=69079329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910919106.4A Active CN110674879B (en) 2019-09-26 2019-09-26 Identification matching method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN110674879B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant