WO2021147431A1 - Method and apparatus for mapping wireless hotspots and points of interest, computer-readable storage medium, and computer device - Google Patents

Info

Publication number
WO2021147431A1
WO2021147431A1 PCT/CN2020/124594 CN2020124594W WO2021147431A1 WO 2021147431 A1 WO2021147431 A1 WO 2021147431A1 CN 2020124594 W CN2020124594 W CN 2020124594W WO 2021147431 A1 WO2021147431 A1 WO 2021147431A1
Authority
WO
WIPO (PCT)
Prior art keywords
interest
point
sniffing
wireless
mapping
Prior art date
Application number
PCT/CN2020/124594
Other languages
English (en)
French (fr)
Inventor
肖邱勇
张长旺
黄新营
张纪红
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2021147431A1
Priority to US17/678,478 (US20220286956A1)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W48/00 Access restriction; Network selection; Access point selection
    • H04W48/20 Selecting an access point
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W48/00 Access restriction; Network selection; Access point selection
    • H04W48/16 Discovering, processing access restriction or access information
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/10 Active monitoring, e.g. heartbeat, ping or trace-route
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/12 Network monitoring probes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/04 Arrangements for maintaining operational condition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/08 Testing, supervising or monitoring using real traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/10 Scheduling measurement reports; Arrangements for measurement reports
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W4/00 Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/02 Services making use of location information
    • H04W4/021 Services related to particular areas, e.g. point of interest [POI] services, venue services or geofences
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W48/00 Access restriction; Network selection; Access point selection
    • H04W48/18 Selecting a network or a communication service
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W88/00 Devices specially adapted for wireless communication networks, e.g. terminals, base stations or access point devices
    • H04W88/08 Access point devices

Definitions

  • This application relates to the field of computer technology, and in particular to a method and device for mapping wireless hotspots and points of interest, and computer-readable storage media and computer equipment.
  • Wireless hotspots have become one of the necessary facilities and services for individuals, families, businesses, restaurants, hotels, retail and other service industries.
  • Wireless hotspots can provide Internet access services for users within a certain distance of the surrounding area through wireless local area networks. If the user sniffs or connects to the wireless hotspot, it can be considered that the user has visited the point of interest (POI) where the wireless hotspot is located. Therefore, the construction of the mapping relationship between wireless hotspots and POI is of great significance to the mining of crowd activity patterns, shop location selection, and transportation planning.
  • Traditionally, the mapping relationship between wireless hotspots and POIs is constructed mainly from their names.
  • A name-based mapping method requires a strong correlation between the wireless hotspot name and the POI name.
  • However, most wireless hotspot names customized by users correlate only weakly with the POI name, so the scenarios in which a name-based mapping method applies are very limited.
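  • A method, an apparatus, a computer-readable storage medium, and a computer device for mapping wireless hotspots and points of interest are provided.
  • A method for mapping wireless hotspots and points of interest, executed by a computer device, includes:
  • obtaining a sniffing record, where the sniffing record includes data of the wireless hotspots sniffed by a sniffing device;
  • determining the degree of overlap of sniffing devices between wireless hotspots according to the sniffing record;
  • determining an initial mapping probability between each wireless hotspot and the corresponding point of interest according to the distance between them, and iteratively propagating the initial mapping probabilities based on the degree of overlap of sniffing devices to obtain, at the end of the iteration, a target mapping probability between the wireless hotspot and the point of interest; and
  • establishing a mapping between the wireless hotspot and the point of interest according to the target mapping probability.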
  • An apparatus for mapping wireless hotspots and points of interest includes:
  • a hotspot correlation measurement module, used to obtain sniffing records, the sniffing records including data of the wireless hotspots sniffed by a sniffing device, and to determine the degree of overlap of sniffing devices according to the sniffing records;
  • a mapping probability propagation module, used to determine the initial mapping probability between the wireless hotspot and the corresponding point of interest according to the distance between them, and to perform iterative propagation between the initial mapping probabilities based on the degree of overlap of sniffing devices, obtaining the target mapping probability between the wireless hotspot and the point of interest at the end of the iteration; and
  • a hotspot and point-of-interest mapping module, used to establish a mapping between the wireless hotspot and the point of interest according to the target mapping probability.
  • A computer-readable storage medium stores a computer program; when the computer program is executed by a processor, the processor executes the steps of the method for mapping wireless hotspots and points of interest.
  • A computer device includes a memory and a processor, and the memory stores a computer program.
  • When the computer program is executed by the processor, the processor executes the steps of the method for mapping wireless hotspots and points of interest.
  • FIG. 1 is an application environment diagram of a method for mapping wireless hotspots and points of interest in an embodiment
  • FIG. 2 is a schematic flowchart of a method for mapping wireless hotspots and points of interest in an embodiment
  • FIG. 3 is a schematic diagram of a complete graph used when performing mapping probability propagation based on a label propagation algorithm in an embodiment
  • FIG. 4 is a schematic flowchart of a method for mapping wireless hotspots and points of interest in another embodiment
  • FIG. 5 is a schematic diagram of the principle of introducing hierarchical relationships among points of interest in an embodiment
  • FIG. 6 is a schematic flowchart of a method for mapping wireless hotspots and points of interest in another embodiment
  • FIG. 7 is a schematic diagram of a statistical area divided into multiple sub-areas in an embodiment
  • FIG. 8 is a schematic flowchart of a method for mapping wireless hotspots and points of interest in another embodiment
  • FIG. 9 is a schematic flowchart of a method for mapping wireless hotspots and points of interest in a specific embodiment
  • FIG. 10 is a schematic flowchart of a method for mapping wireless hotspots and points of interest in another specific embodiment
  • FIG. 11 is a structural block diagram of an apparatus for mapping wireless hotspots and points of interest in an embodiment
  • FIG. 12 is a structural block diagram of an apparatus for mapping wireless hotspots and points of interest in another embodiment
  • FIG. 13 is a structural block diagram of a computer device in an embodiment.
  • FIG. 1 is an application environment diagram of a method for mapping wireless hotspots and points of interest in an embodiment.
  • The method for mapping wireless hotspots and points of interest is applied to a wireless hotspot and point-of-interest mapping system.
  • The system includes a terminal 110, a server 120, and a sniffing device 130.
  • The terminal 110 is connected to the server 120 through a network.
  • The terminal 110 can be a desktop terminal or a mobile terminal, and the mobile terminal can be at least one of a mobile phone, a tablet computer, a notebook computer, a smart wearable device, and the like.
  • The server 120 can be an independent server or a server cluster composed of multiple servers.
  • The sniffing device 130 is a device with wireless hotspot sniffing and connection functions, such as a mobile phone, a computer, a smart wearable device, or an electronic reading device.
  • The sniffing device 130 is used to detect wireless hotspots.
  • The sniffing record is reported directly to the terminal 110 or the server 120, or is reported to another storage device and then pulled from that storage device by the terminal 110 or the server 120.
  • The terminal 110 and the server 120 can each separately execute, based on the sniffing records, the method for mapping wireless hotspots and points of interest provided in the embodiments of this application.
  • The terminal 110 and the server 120 can also cooperate to execute the method for mapping wireless hotspots and points of interest provided in the embodiments of this application based on the sniffing records.
  • In an embodiment, a method for mapping wireless hotspots and points of interest is provided.
  • This embodiment is described mainly by taking application of the method to a computer device as an example.
  • The computer device may be the terminal 110 or the server 120 in FIG. 1.
  • The method for mapping wireless hotspots and points of interest specifically includes the following steps:
  • S202 Obtain a sniffing record; the sniffing record includes data of the wireless hotspot sniffed by the sniffing device.
  • the sniffing record refers to the data reported by the sniffing device when it sniffs the wireless hotspot.
  • the wireless hotspot can be a WiFi (Wireless-Fidelity) network provided by a wireless access point (AP, Access Point) or router, or a mobile hotspot provided by a mobile terminal and other devices, such as a mobile phone WiFi hotspot, a car WiFi hotspot, etc.
  • the sniffing record includes the device identification of the sniffing device, the generation time of the sniffing record, and the data of each wireless hotspot sniffed by the sniffing device.
  • the data of the wireless hotspot includes the name, location coordinates, signal strength, etc. of the wireless hotspot.
  • The name of the wireless hotspot refers to the SSID (Service Set Identifier) broadcast by the wireless hotspot. Specifically, it can be a character string customized by the user who provides the wireless hotspot, such as "TP-LINK-XX" or "Guangming Community-13".
  • the location coordinates of the wireless hotspot are spherical coordinates (lon, lat) that use longitude lon and latitude lat to represent the location of the surface point where the wireless hotspot is located.
  • the geographic coordinates may specifically be astronomical longitude and latitude, geodetic longitude and latitude, or geocentric longitude and latitude.
  • the sniffing record reported by sniffing device A at time t1 can be [A, t1, (TP-LINK-YY, Guangming community-13, 3-506A, Yuij99), [(114.32,30.51), (110.22, 35.09), (11.32,31.77), (109.92,30.01)]].
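  • As an illustration only, the example record above could be held in a small typed structure. The class and field names below (`HotspotObservation`, `SniffingRecord`, `parse_record`) are hypothetical and are not defined by the application; signal strength is omitted because the example record does not carry it.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class HotspotObservation:
    """One wireless hotspot listed in a sniffing record (illustrative field names)."""
    ssid: str    # hotspot name broadcast by the hotspot
    lon: float   # longitude of the hotspot
    lat: float   # latitude of the hotspot


@dataclass(frozen=True)
class SniffingRecord:
    device_id: str   # identifier of the reporting sniffing device
    timestamp: str   # generation time of the record
    hotspots: Tuple[HotspotObservation, ...]


def parse_record(raw) -> SniffingRecord:
    """Convert [device, time, names, coordinates] into a SniffingRecord."""
    device_id, timestamp, names, coords = raw
    if len(names) != len(coords):
        raise ValueError("each hotspot name needs exactly one coordinate pair")
    hotspots = tuple(
        HotspotObservation(ssid=name, lon=lon, lat=lat)
        for name, (lon, lat) in zip(names, coords)
    )
    return SniffingRecord(device_id, timestamp, hotspots)


# The record from the text, with t1 kept as an opaque placeholder.
record = parse_record([
    "A", "t1",
    ("TP-LINK-YY", "Guangming community-13", "3-506A", "Yuij99"),
    [(114.32, 30.51), (110.22, 35.09), (11.32, 31.77), (109.92, 30.01)],
])
```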
  • The sniffing device sniffs the wireless hotspots around it and displays them to the user as a list, from which the user can select one wireless hotspot to connect to.
  • Each sniffing device generates sniffing records from the data of the sniffed wireless hotspots at a preset time and frequency, and reports the generated sniffing records to a designated device.
  • the computer device pulls the sniffing records of the statistical area in the statistical period from the designated device through a communication method such as a USB (Universal Serial Bus) interface connection or a network connection.
  • the statistical area is a geographic area where wireless hotspots and points of interest in the area need to be mapped.
  • the regional boundary of the statistical area can be freely defined according to statistical needs, such as the territory of the entire country, the area of a certain province or town, etc. In a digital map, the statistical area can be an area within a closed contour surrounded by multiple consecutive coordinate points.
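  • A closed contour of coordinate points can be tested with a standard ray-casting check. The sketch below is one possible way to decide whether a hotspot lies in the statistical area; it treats longitude and latitude as planar coordinates, which is a simplifying assumption, and the function name `in_statistical_area` and the sample contour are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (lon, lat)


def in_statistical_area(point: Point, contour: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the closed contour."""
    x, y = point
    inside = False
    n = len(contour)
    for k in range(n):
        x1, y1 = contour[k]
        x2, y2 = contour[(k + 1) % n]   # wrap around to close the contour
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # Longitude at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangular statistical area.
area = [(114.0, 30.0), (115.0, 30.0), (115.0, 31.0), (114.0, 31.0)]
```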
  • the statistical period refers to the time span of the generation time of the sniffing record that the mapping of wireless hotspots and points of interest relies on, including the statistical start time and the statistical end time.
  • If the statistical period is too short, the accuracy of the final mapping between wireless hotspots and points of interest suffers; if it is too long, the amount of computation increases.
  • The length of the statistical period therefore needs to be set reasonably according to requirements, for example, 1 month.
  • The computer device discards acquired sniffing records that contain data of only one wireless hotspot. The sniffing records in this application are used to measure the correlation between wireless hotspots, and a record that contains data of only a single wireless hotspot has no analytical value for that purpose. Excluding such records reduces the amount of sniffing record data to be processed without affecting the accuracy of the correlation analysis.
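  • The two filtering criteria described above (the statistical period and the single-hotspot exclusion) might be combined as in the following sketch. The dictionary keys and the function name `select_records` are hypothetical, and counting distinct hotspot names rather than raw entries is an assumption.

```python
from datetime import datetime


def select_records(records, start, end):
    """Keep records generated within [start, end] that list at least two hotspots."""
    kept = []
    for rec in records:
        in_period = start <= rec["time"] <= end
        distinct_hotspots = set(rec["hotspots"])   # assumption: count distinct names
        if in_period and len(distinct_hotspots) >= 2:
            kept.append(rec)
    return kept


records = [
    {"device": "A", "time": datetime(2020, 1, 5), "hotspots": ["w1", "w2"]},
    {"device": "B", "time": datetime(2020, 1, 9), "hotspots": ["w1"]},        # single hotspot
    {"device": "C", "time": datetime(2020, 3, 1), "hotspots": ["w2", "w3"]},  # outside period
]
kept = select_records(records, datetime(2020, 1, 1), datetime(2020, 1, 31))
```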
  • the sniffing device can also directly report the generated sniffing record to the computer device used for wireless hotspot and point of interest mapping.
  • The wireless hotspot may also be a point-to-point transmission network provided by short-range communication, such as a BLE (Bluetooth Low Energy), NFC (Near Field Communication), or RFID (Radio Frequency Identification) wireless connection network.
  • S204 Determine the degree of overlap of the sniffing devices according to the sniffing records.
  • the wireless hotspot has a corresponding sniffing distance range, such as 50 meters around.
  • Each wireless hotspot can be sniffed by any sniffing device within the sniffing distance.
  • wireless hotspots in close proximity may be sniffed by the same sniffing device, so the data of the same wireless hotspot may appear in multiple different sniffing records.
  • Sniffing devices and wireless hotspots that appear in the same sniffing record can be considered to be related.
  • the sniffing devices associated with different wireless hotspots may overlap.
  • the overlap degree of sniffing devices is a value that can reflect the degree of overlap of sniffing devices associated with wireless hotspots.
  • The degree of overlap of sniffing devices between two wireless hotspots may be any of the following: the ratio of the number of overlapping sniffing devices among the sniffing devices associated with the two wireless hotspots to the total number of sniffing devices associated with one of the wireless hotspots; the ratio of the deduplicated total number of sniffing devices associated with the two wireless hotspots to the total number of sniffing devices associated with one of the wireless hotspots; or the ratio of the number of overlapping sniffing devices to the deduplicated total number of sniffing devices associated with the two wireless hotspots.
  • The computer device determines the wireless hotspots involved in the pulled sniffing records and the sniffing devices associated with each wireless hotspot. It counts the total number $\mathrm{NUM}_i$ of sniffing devices associated with each wireless hotspot $\mathrm{WiFi}_i$, and counts the number $\mathrm{NUM}_{ij}$ of overlapping sniffing devices among the sniffing devices associated with every two wireless hotspots $\mathrm{WiFi}_i$ and $\mathrm{WiFi}_j$, where $i$ and $j$ are integers greater than zero.
  • The computer device can calculate the ratio of the number of overlapping sniffing devices $\mathrm{NUM}_{ij}$ to the total number of sniffing devices $\mathrm{NUM}_i$ associated with $\mathrm{WiFi}_i$, and use this ratio as the sniffing device overlap degree of $\mathrm{WiFi}_i$ relative to $\mathrm{WiFi}_j$: $W_{ij} = \mathrm{NUM}_{ij} / \mathrm{NUM}_i$.
  • Likewise, the ratio of $\mathrm{NUM}_{ij}$ to the total number of sniffing devices $\mathrm{NUM}_j$ associated with $\mathrm{WiFi}_j$ is used as the sniffing device overlap degree of $\mathrm{WiFi}_j$ relative to $\mathrm{WiFi}_i$: $W_{ji} = \mathrm{NUM}_{ij} / \mathrm{NUM}_j$.
  • The computer device can also calculate the deduplicated total number $\mathrm{NUM}_i + \mathrm{NUM}_j - \mathrm{NUM}_{ij}$ of sniffing devices associated with $\mathrm{WiFi}_i$ and $\mathrm{WiFi}_j$, and use the ratio of this total to $\mathrm{NUM}_i$ as the sniffing device overlap degree of $\mathrm{WiFi}_i$ relative to $\mathrm{WiFi}_j$: $W_{ij} = (\mathrm{NUM}_i + \mathrm{NUM}_j - \mathrm{NUM}_{ij}) / \mathrm{NUM}_i$.
  • Likewise, the ratio of the deduplicated total to $\mathrm{NUM}_j$ is used as the sniffing device overlap degree of $\mathrm{WiFi}_j$ relative to $\mathrm{WiFi}_i$: $W_{ji} = (\mathrm{NUM}_i + \mathrm{NUM}_j - \mathrm{NUM}_{ij}) / \mathrm{NUM}_j$.
  • Alternatively, the computer device calculates the ratio of the number of overlapping sniffing devices $\mathrm{NUM}_{ij}$ to the deduplicated total number of sniffing devices associated with $\mathrm{WiFi}_i$ and $\mathrm{WiFi}_j$, and uses this ratio as the sniffing device overlap degree between the two wireless hotspots: $W_{i+j} = \mathrm{NUM}_{ij} / (\mathrm{NUM}_i + \mathrm{NUM}_j - \mathrm{NUM}_{ij})$.
  • In this case, the overlap degree $W_{ij}$ of $\mathrm{WiFi}_i$ relative to $\mathrm{WiFi}_j$ is the same as the overlap degree $W_{ji}$ of $\mathrm{WiFi}_j$ relative to $\mathrm{WiFi}_i$, and both equal $W_{i+j}$.
  • In an embodiment, the sniffing record includes the sniffing device identifier and the hotspot names of at least two wireless hotspots. Determining the degree of overlap of sniffing devices between wireless hotspots includes: determining, based on the sniffing device identifiers, the deduplicated sniffing device set corresponding to each hotspot name; identifying the overlapping sniffing device identifiers in every two deduplicated sniffing device sets; and determining the degree of overlap of sniffing devices between the two corresponding wireless hotspots based on the number of overlapping sniffing device identifiers and the number of sniffing device identifiers in the corresponding deduplicated sniffing device set.
  • the sniffing device identifier is information that can uniquely identify a sniffing device.
  • the identification of the sniffing device in the sniffing record may be identification data that has been irreversibly encrypted by the data producer.
  • The sniffing device identifier may specifically be a SUPI (Subscription Permanent Identifier), a GPSI (Generic Public Subscription Identifier), a PEI (Permanent Equipment Identifier), etc.
  • IMSI: International Mobile Subscriber Identification Number
  • NAI: Network Access Identifier
  • the sniffing device set is a set composed of one or more sniffing device identifiers. Each set of sniffing devices corresponds to the hotspot name of a wireless hotspot. The same sniffing device may report multiple sniffing records during the statistical period, and the sniffing records reported multiple times may contain the same wireless hotspot. In this way, the identification of the sniffing device in the set of sniffing devices corresponding to the wireless hotspot may be repeated.
  • The deduplicated sniffing device set is the sniffing device set after its sniffing device identifiers have been deduplicated.
  • The computer device parses the pulled sniffing records and constructs a sniffing device set for each wireless hotspot involved. Although the same user may carry multiple sniffing devices at the same time, each sniffing device can still be considered, with high confidence, to uniquely represent one user. Since the sniffing device identifier is recorded directly in the sniffing record, users can be reliably distinguished based on it.
  • The computer device deduplicates the sniffing device identifiers in each sniffing device set to obtain the deduplicated sniffing device set.
  • The computer device compares every two deduplicated sniffing device sets and identifies the sniffing device identifiers that appear in both.
  • The computer device counts the overlapping sniffing device identifiers and calculates the ratio of this count to the total number of sniffing device identifiers in the deduplicated sniffing device set of the target wireless hotspot (one of the two wireless hotspots). This ratio is the sniffing device overlap degree of the target wireless hotspot relative to the other wireless hotspot.
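  • The set-based steps above can be sketched with Python sets, which deduplicate identifiers automatically. This is a minimal illustration, not the application's implementation: the function names and the `(device_id, hotspot_names)` record shape are hypothetical. `overlap_degrees` computes the directional ratio (overlap count over the target hotspot's device count) and `symmetric_overlap` computes the variant that divides by the deduplicated total of both hotspots.

```python
from collections import defaultdict
from itertools import combinations


def device_sets(records):
    """Map each hotspot name to the deduplicated set of device identifiers."""
    sets = defaultdict(set)
    for device_id, hotspot_names in records:
        for name in hotspot_names:
            sets[name].add(device_id)   # adding to a set drops repeated identifiers
    return dict(sets)


def overlap_degrees(sets):
    """Return w[(i, j)] = (devices shared by i and j) / (devices of i)."""
    w = {}
    for i, j in combinations(sorted(sets), 2):
        shared = len(sets[i] & sets[j])
        w[(i, j)] = shared / len(sets[i])
        w[(j, i)] = shared / len(sets[j])
    return w


def symmetric_overlap(sets, i, j):
    """Shared devices divided by all deduplicated devices of both hotspots."""
    union = len(sets[i] | sets[j])
    return len(sets[i] & sets[j]) / union if union else 0.0


records = [
    ("dev1", ["w1", "w2"]),
    ("dev2", ["w1", "w2"]),
    ("dev3", ["w1", "w3"]),
    ("dev1", ["w1", "w2"]),   # repeated report from the same device
]
sets = device_sets(records)
w = overlap_degrees(sets)
```

  • With these sample records, every device that sniffed w2 also sniffed w1, while only two of the three devices that sniffed w1 also sniffed w2, so the two directional values differ.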
  • The correlation between wireless hotspots can be measured based on the degree of overlap of sniffing devices; the calculated correlation falls in the range [0, 1], so no normalization is required. More importantly, since each sniffing device can uniquely represent a user, the degree of overlap of sniffing devices reflects, with high confidence, the similarity of users between different wireless hotspots, which can help in judging user attributes, such as mobile users and resident users.
  • Compared with methods based solely on the location coordinates of the wireless hotspots, or on the number of times hotspots appear in the same sniffing record, measuring the correlation between wireless hotspots by the degree of overlap of sniffing devices retains the user's spatial behavior characteristics. It distinguishes wireless hotspots by spatial position more effectively and addresses the difficulty of telling apart the wireless hotspots of adjacent points of interest, so the resulting wireless hotspot correlation is more objective, stable, and reliable.
  • S206 Determine an initial mapping probability between the wireless hotspot and the corresponding point of interest according to the distance between the wireless hotspot and the corresponding point of interest.
  • Points of interest (POIs) refer to landmarks, scenic spots, and similar features in a geographic information system, such as government departments, community buildings, commercial institutions (gas stations, department stores, supermarkets, restaurants, hotels, convenience stores, post boxes, hospitals, etc.), historical sites, tourist attractions (parks, public toilets, etc.), and transportation facilities (stations, parking lots, toll gates, speed limit signs, etc.) in a certain area.
  • the mapping probability refers to the probability that a wireless hotspot is mapped to a certain POI.
  • the initial mapping probability reflects the correlation between the wireless hotspot and the point of interest in the dimension of the spatial distance. In other words, considering the spatial distance purely, the probability of the wireless hotspot being mapped to the corresponding POI is the initial mapping probability. It can be understood that multiple wireless hotspots appearing in the same sniffing record have similar locations. The sniffing devices associated with wireless hotspots in similar locations have a high degree of overlap and have a greater probability of belonging to the same POI.
  • the computer device obtains data of points of interest.
  • the point-of-interest data can be obtained from a third-party channel provider or through a web crawler. There is no restriction on how the point-of-interest data can be obtained.
  • the data of the point of interest includes the name of the point of interest, the location of the point of interest, and so on.
  • the names of points of interest are people's naming of points of interest, such as "Starry Sky Elementary School", "China Technology Exchange Building” and so on.
  • the location of the point of interest may be the geographic coordinates of the point of interest.
  • the computer device calculates the distance between each wireless hotspot and each point of interest in the statistical area according to the geographic coordinates of the wireless hotspot and the geographic coordinates of the point of interest.
  • The computer device normalizes the distance between the wireless hotspot and each point of interest, converting the distance value into a probability value in the range [0, 1], and uses that probability value as the initial mapping probability between the corresponding wireless hotspot and point of interest.
  • The normalization method may specifically be 0-1 (min-max) standardization, Z-score standardization, sigmoid function standardization, or the like. The computer device can also use other methods to determine the initial mapping probability between the point of interest and the wireless hotspot. For example, it can determine the farthest distance between a wireless hotspot and a point of interest in the statistical area, and use the ratio of the distance between the current wireless hotspot and the point of interest to that farthest distance as the initial mapping probability between the current wireless hotspot and the point of interest.
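  • A minimal sketch of distance-based initialization follows, assuming great-circle (haversine) distance on a spherical Earth and min-max normalization. The text only states that distances are normalized into [0, 1]; inverting the normalized distance so that the nearest POI receives the highest value is an assumption made here, as are the function names and sample coordinates.

```python
import math


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (lon, lat) points."""
    r = 6371000.0  # mean Earth radius in metres (spherical approximation)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def initial_probabilities(hotspot, pois):
    """Min-max normalise distances and invert them (assumption: nearer is likelier)."""
    dist = {name: haversine_m(*hotspot, *coord) for name, coord in pois.items()}
    lo, hi = min(dist.values()), max(dist.values())
    if hi == lo:   # all POIs equally far: distance gives no preference
        return {name: 1.0 for name in dist}
    return {name: 1.0 - (d - lo) / (hi - lo) for name, d in dist.items()}


# Hypothetical POIs given as (lon, lat).
pois = {
    "school": (114.321, 30.511),
    "mall": (114.330, 30.520),
    "park": (114.400, 30.600),
}
p0 = initial_probabilities((114.32, 30.51), pois)
```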
  • the location of the point of interest may also be an address text used to describe the location of the point of interest.
  • Address text is the text used to describe the geographic location information of points of interest, such as "McDonald's, Haidian Street, Zhongguancun, Haidian District, Beijing".
  • The computer device converts the address text into the geographic coordinates of the point of interest based on a geocoding service.
  • The geographic coordinates obtained from the geocoding service correspond one-to-one with the address text.
  • the computer equipment can also search for the geographic coordinates corresponding to each address text based on the coordinate retrieval service.
  • the geographic coordinates and address text obtained by the coordinate search service may have a one-to-one relationship or a many-to-one relationship. In other words, the coordinate search service can obtain one or more geographic coordinates corresponding to each address text.
  • Different coordinate retrieval service providers provide different coordinate retrieval methods. For example, Baidu Maps and Google Maps provide different coordinate retrieval methods.
  • When the conversion yields multiple geographic coordinates, the computer device identifies the key address element in the address text.
  • The key address element is the address element that brings the location information described by the address text into a convergent state.
  • The convergent state is a state in which a large number of scattered candidate areas are narrowed down and accurately positioned to one particular candidate area.
  • The key address element may specifically be a POI prefix that narrows the geographic location from a large number of POIs down to one or a few POIs.
  • The computer device can select one target geographic coordinate from the multiple geographic coordinates obtained by the conversion according to the key address element.
  • S208 Perform iterative propagation between initial mapping probabilities based on the degree of overlap of the sniffing devices, and obtain the target mapping probability between the wireless hotspot and the point of interest at the end of the iteration.
  • The purpose of this application is to establish a mapping relationship between wireless hotspots and points of interest while ensuring that each wireless hotspot is mapped to only one point of interest, which is a non-overlapping area division problem.
  • Algorithms for solving the non-overlapping area division problem include area division based on modularity optimization, on spectral analysis, on information theory, and on label propagation.
  • The label-propagation-based area division algorithm can be LPA (Label Propagation Algorithm), COPRA (Community Overlap PRopagation Algorithm, an overlapping community discovery algorithm based on label propagation), SLPA (Speaker-listener Label Propagation Algorithm, a community discovery algorithm), etc.
  • The computer device can take the sniffing-device overlap between the target wireless hotspot WiFi i and each other wireless hotspot in the statistical area as a propagation weight, weight each other hotspot's initial mapping probability with the point of interest POI j by that propagation weight, and add the weighted values to the initial mapping probability between the target wireless hotspot WiFi i and the point of interest POI j to obtain the intermediate mapping probability between WiFi i and POI j. The intermediate mapping probability is then used as the initial mapping probability between WiFi i and POI j for the next iteration, until the preset iteration stop condition is reached.
  • the iteration stop condition may be convergence of the intermediate mapping probability, or reaching the set maximum number of iterations, etc.
  • the computer device confirms the intermediate mapping probability calculated when the iteration is stopped as the target mapping probability between the corresponding wireless hotspot and the point of interest.
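The update described above can be written compactly. As a sketch only, let $w_{ik}$ denote the sniffing-device overlap between WiFi i and WiFi k, and $p^{(t)}_{ij}$ the mapping probability between WiFi i and POI j after $t$ iterations; these symbols are illustrative and do not appear in the original text.

```latex
p^{(t+1)}_{ij} \;=\; p^{(t)}_{ij} \;+\; \sum_{k \neq i} w_{ik}\, p^{(t)}_{kj},
\qquad t = 0, 1, 2, \dots
```

Under this reading, $p^{(0)}_{ij}$ is the initial mapping probability, and the value of $p^{(t)}_{ij}$ at the iteration where the stop condition is met is taken as the target mapping probability. If the overlap of a hotspot with itself is taken to be 1, the same update is one row of a matrix product between the overlap weights and the mapping probabilities.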
  • the label propagation algorithm LPA is a semi-supervised learning method based on graphs.
  • the basic idea is to use the label information of the labeled nodes to predict the label information of the unlabeled nodes.
  • The nodes include labeled and unlabeled data, each edge represents the similarity of the two nodes it connects, and the label of a node is passed to other nodes according to that similarity.
  • Labeled data acts as a source from which unlabeled data is labeled; the more similar two nodes are, the more easily the label spreads between them.
  • The computer device can build a complete graph whose nodes 302 are the groups of wireless hotspots and POIs to be mapped.
  • Each node has corresponding label information, and each connecting edge 304 between nodes has a corresponding edge weight.
  • The label information of a node is the initial mapping probability between the corresponding wireless hotspot and point of interest; the edge weight of a connecting edge is the sniffing-device overlap between the two corresponding wireless hotspots.
  • The above-mentioned mapping method of wireless hotspots and points of interest further includes: establishing a propagation matrix with the sniffing-device overlap between wireless hotspots as matrix elements; and establishing an initial mapping matrix with the initial mapping probabilities between wireless hotspots and points of interest as matrix elements. Performing iterative propagation between initial mapping probabilities based on the sniffing-device overlap, and obtaining the target mapping probability between the wireless hotspot and the point of interest at the end of the iteration, includes: multiplying the propagation matrix by the initial mapping matrix to obtain the mapping matrix after probability propagation; and iterating with the mapping matrix after probability propagation as the initial mapping matrix until the iteration stop condition is met, to obtain the target mapping matrix, which records the target mapping probability between each wireless hotspot and each point of interest.
  • The computer device initializes the propagation matrix according to the number of wireless hotspots in the statistical area. Assuming that n wireless hotspots are deployed in the statistical area, the propagation matrix can be an n×n two-dimensional matrix $W_{n \times n}$.
  • Every matrix element $W_{ij}$ of the initialized propagation matrix $W_{n \times n}$ defaults to 0.
  • The matrix element $W_{ij}$ records a value reflecting the degree of correlation of the wireless hotspot WiFi i relative to the wireless hotspot WiFi j.
  • After the computer device calculates the sniffing-device overlap between wireless hotspots, it fills the values into the initialized propagation matrix $W_{n \times n}$ as matrix elements.
  • The computer device initializes a mapping matrix according to the numbers of wireless hotspots and points of interest in the statistical area. Assuming that the statistical area contains n wireless hotspots and m POIs, the mapping matrix can be an n×m two-dimensional matrix, written here as $P^{0}_{n \times m}$, where the superscript 0 indicates the initialized state.
  • Every matrix element $P^{0}_{ij}$ of the initialized mapping matrix defaults to 0. The element $P^{0}_{ij}$ records a value reflecting the correlation between the wireless hotspot WiFi i and the point of interest POI j. After the computer device calculates the initial mapping probability between each wireless hotspot and point of interest, it fills the values into the initial mapping matrix as matrix elements.
  • The computer device multiplies the propagation matrix $W_{n \times n}$ by the initialized mapping matrix $P^{0}_{n \times m}$ to obtain the mapping matrix after probability propagation, $P^{1}_{n \times m} = W_{n \times n} P^{0}_{n \times m}$.
  • The computer device takes the mapping matrix after probability propagation as the initial mapping matrix and iterates until the iteration stop condition is met, obtaining the target mapping matrix.
  • The target mapping matrix records the target mapping probability between wireless hotspots and points of interest.
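The matrix form of the propagation can be sketched in a few lines of Python. This is a minimal illustration, not the applicant's implementation: the function names, the convergence tolerance `tol`, the iteration cap `max_iter`, and the toy matrices are all assumptions. The example propagation matrix is chosen with rows that sum to 1 so that the plain product converges.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate(w, p0, max_iter=50, tol=1e-9):
    """Repeat P <- W * P until P stops changing or max_iter is reached."""
    p = [row[:] for row in p0]
    for _ in range(max_iter):
        nxt = matmul(w, p)
        # Largest element-wise change, used as the convergence test.
        delta = max(abs(x - y)
                    for r_new, r_old in zip(nxt, p)
                    for x, y in zip(r_new, r_old))
        p = nxt
        if delta < tol:
            break
    return p


# Toy data: 2 hotspots, 2 POIs (hypothetical values).
w = [[0.5, 0.5],
     [0.5, 0.5]]          # propagation matrix (device overlap)
p0 = [[1.0, 0.0],
      [0.0, 1.0]]         # initial mapping matrix
target = propagate(w, p0)
```

Both stop conditions named in the text appear here: convergence of the intermediate probabilities (`tol`) and a maximum number of iterations (`max_iter`).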
  • S210 Establish a mapping between wireless hotspots and points of interest according to the target mapping probability.
  • the target mapping probability reflects the probability that a wireless hotspot belongs to a POI at the address location.
  • the computer equipment determines the POI with the highest target mapping probability corresponding to each wireless hotspot, and records it as the target POI.
  • the computer equipment establishes a mapping between the wireless hotspot and the target POI. In other words, the computer device establishes a mapping between the wireless hotspot WiFi i and the POI corresponding to the largest element value in the i-th row of the target mapping matrix.
  • Establishing a mapping between a wireless hotspot and a point of interest according to the target mapping probability includes: removing the wireless hotspots whose target mapping probability with every point of interest is less than a second threshold; and establishing a mapping between each retained wireless hotspot and the point of interest with the highest corresponding target mapping probability.
  • the second threshold is the minimum target mapping probability set for determining whether a wireless hotspot WiFi i needs to be mapped to a certain POI.
  • the size of the second threshold can be freely set according to requirements.
  • The computer device traverses the target mapping probabilities between the wireless hotspot WiFi i and each POI to check whether any of them reaches the second threshold.
  • If none does, the computer device judges the wireless hotspot WiFi i to be noise and removes it. For each retained wireless hotspot WiFi i, it establishes a mapping to the POI whose target mapping probability reaches the second threshold and is the largest.
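The selection step can be sketched as a row-wise maximum combined with the second threshold. The function name, the dictionary return type, and the sample values are assumptions made for illustration.

```python
def build_mapping(target, second_threshold):
    """Map each hotspot (row) to its most probable POI (column).

    Hotspots whose best probability is below the threshold are treated
    as noise and left out of the result.
    """
    mapping = {}
    for i, row in enumerate(target):
        best_j = max(range(len(row)), key=lambda j: row[j])
        if row[best_j] >= second_threshold:
            mapping[i] = best_j
    return mapping


# Hypothetical target mapping matrix: 3 hotspots, 3 POIs.
target = [[0.1, 0.8, 0.1],
          [0.4, 0.3, 0.3],   # no POI reaches 0.7, so this row is noise
          [0.0, 0.2, 0.8]]
mapping = build_mapping(target, 0.7)
```

Because only the row maximum is compared with the threshold, a hotspot is dropped exactly when all of its target mapping probabilities fall below the threshold.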
  • The mapping relationship between wireless hotspots and points of interest established by the method provided in this application makes it possible to identify users on site and to mine patterns of crowd activity, thereby supporting important business and policy decisions such as shop location selection and transportation planning, and therefore has high application value.
  • noisy wireless hotspots whose target mapping probability with each interest point is less than the second threshold are eliminated, which can improve the accuracy and reliability of the established mapping relationship.
  • The above mapping method establishes the mapping relationship between wireless hotspots and points of interest based on the sniffing records of wireless hotspots, without manually collecting and reporting POI visit data. This not only improves the mapping efficiency but also reduces the dependence on the names of wireless hotspots and points of interest, so the method is widely applicable and can also improve the recall rate of wireless hotspots.
  • Measuring the correlation between wireless hotspots by the sniffing-device overlap helps judge how users flow between points of interest, retains information on users' spatial behavior, and better distinguishes wireless hotspots in space.
  • Establishing the mapping between wireless hotspots and points of interest from the target mapping probabilities obtained through this propagation improves the accuracy of the mapping.
  • the above-mentioned mapping method of wireless hotspots and points of interest further includes: identifying the mobile hotspots in the wireless hotspots according to the location changes of the wireless hotspots in different sniffing records; and excluding the mobile hotspots in each sniffing record.
  • the wireless hotspots can be divided into mobile hotspots and stable hotspots according to their location mobility attributes.
  • Mobile hotspots are wireless hotspots whose location will change over time, such as mobile WiFi hotspots, car WiFi hotspots, etc.
  • Mobile hotspots may be in different POI locations at different times, and it is difficult to establish a stable mapping relationship between mobile hotspots and POIs.
  • the computer device identifies mobile hotspots in the wireless hotspots involved in the statistical area.
  • There are many ways to distinguish mobile hotspots from stable hotspots. For example, the position coordinates of each wireless hotspot at different time points can be derived from the sniffing records of the statistical period, and the position change between adjacent time points can be calculated. When the number of position changes is greater than a preset value, the wireless hotspot can be determined to be a mobile hotspot.
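The position-change test can be illustrated as follows. This is a sketch under stated assumptions: positions are planar coordinates in metres ordered by time, and the distance `move_dist` above which a displacement counts as a "position change" is a hypothetical parameter, since the text only specifies the count threshold (`max_moves` here).

```python
import math


def is_mobile(track, move_dist=100.0, max_moves=2):
    """Return True if a hotspot changes position more than max_moves times.

    track: list of (x, y) positions in metres, ordered by time.
    """
    moves = 0
    for (x0, y0), (x1, y1) in zip(track, track[1:]):
        # Displacement between two adjacent time points.
        if math.hypot(x1 - x0, y1 - y0) > move_dist:
            moves += 1
    return moves > max_moves


stable_track = [(0, 0), (3, 4), (1, 1), (2, 2)]
mobile_track = [(0, 0), (500, 0), (1000, 0), (1500, 0), (2000, 0)]
stable_is_mobile = is_mobile(stable_track)
mobile_is_mobile = is_mobile(mobile_track)
```

Small displacements, such as those caused by positioning jitter, stay below `move_dist` and are not counted, so a fixed hotspot is not misclassified.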
  • The computer device can cluster the position coordinates of a wireless hotspot at different points in time, calculate the clustering feature of each geographic coordinate through a clustering algorithm, and determine a cluster center point among the multiple geographic coordinates according to the clustering features.
  • A clustering feature characterizes how strongly geographic coordinates aggregate, for example the Gaussian density distribution value. The larger the Gaussian density distribution value, the more aggregated the corresponding geographic coordinate, which can then be used as the cluster center point.
  • The cluster center point is the geographic coordinate point with the highest aggregation among the multiple geographic coordinate points.
  • Suitable clustering algorithms include k-means (a partition-based clustering method), fuzzy clustering, DBSCAN (Density-Based Spatial Clustering of Applications with Noise, a density-based clustering algorithm), and Fast Search and Find of Density Peaks (a clustering algorithm that quickly searches for and finds density peaks).
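As one possible reading of the Gaussian density criterion, the sketch below scores every coordinate by the sum of Gaussian kernels centred on all coordinates and returns the highest-scoring one as the cluster center point. The kernel bandwidth `sigma` and the sample coordinates are assumptions; the text does not fix how the density value is computed.

```python
import math


def cluster_center(points, sigma=50.0):
    """Return the point with the highest Gaussian density value."""
    def density(p):
        # Sum of Gaussian kernels between p and every point (including p).
        return sum(
            math.exp(-((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)
                     / (2 * sigma ** 2))
            for q in points
        )
    return max(points, key=density)


# Four nearby observations and one outlier (planar metres, hypothetical).
points = [(0, 0), (10, 0), (0, 10), (5, 5), (1000, 1000)]
center = cluster_center(points)
```

The outlier contributes almost nothing to the density of the other points, so it does not pull the chosen center away from the dense group.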
  • The computer device removes the data about mobile hotspots from each sniffing record. It then only needs to calculate the sniffing-device overlap between stable hotspots, and the initial and target mapping probabilities between stable hotspots and points of interest, to establish the mapping between stable hotspots and points of interest.
  • Identifying mobile hotspots from the position changes of wireless hotspots across sniffing records filters noise data out of the sniffing records. This improves the stability of the established mapping relationship, reduces the amount of data the mapping has to process, improves the mapping efficiency, and saves data processing resources of the computer device.
  • The above-mentioned method for mapping wireless hotspots and points of interest further includes: removing the deduplicated sniffing device sets of wireless hotspots whose number of sniffing device identifiers is less than a first threshold, to obtain the target deduplicated sniffing device sets. Identifying the overlapping sniffing device identifiers in every two sniffing device sets then includes: identifying the overlapping sniffing device identifiers in every two retained target deduplicated sniffing device sets.
  • the first threshold is the minimum number of sniffing device identifications set to determine whether a wireless hotspot WiFi i needs to be mapped to a certain POI.
  • the size of the first threshold can be freely set according to requirements.
  • the computer device counts the number of sniffing device identifiers in the sniffing device set to determine the number of visiting users corresponding to each wireless hotspot.
  • the computer device traverses whether the number of sniffer device identifications associated with each wireless hotspot WiFi i reaches the first threshold.
  • When the number of sniffing device identifiers associated with the wireless hotspot WiFi i is less than the first threshold, either the wireless hotspot may have malfunctioned at some point in the statistical period, so that the sniffing devices carried by visiting users could not be detected, or the wireless hotspot itself has few visiting users. The computer device judges such a wireless hotspot WiFi i to be a faulty hotspot or an unpopular hotspot.
  • The computer device eliminates faulty hotspots and unpopular hotspots.
  • Subsequent calculation then only needs the sniffing-device overlap between the retained wireless hotspots, and only the initial and target mapping probabilities between the retained wireless hotspots and the points of interest, to establish the mapping between the retained wireless hotspots and the points of interest.
  • Wireless hotspots with few sniffing device identifiers are excluded from the overlap statistics, which improves the accuracy of the sniffing-device overlap and reduces the dimension of the propagation matrix $W_{n \times n}$, helping to improve the mapping efficiency.
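A small sketch of the filtering step follows. It keeps only hotspots whose deduplicated device set reaches the first threshold and then finds the overlapping device identifiers for every pair of retained hotspots. The data layout (a dictionary of Python sets) and all identifiers are hypothetical; how the overlap degree is derived from the overlapping identifiers is not shown here.

```python
from itertools import combinations


def filter_and_overlap(device_sets, first_threshold):
    """Drop sparse hotspots, then intersect the device sets pairwise.

    device_sets: dict mapping hotspot id -> set of sniffing device ids
                 (a set is already deduplicated).
    """
    kept = {h: s for h, s in device_sets.items()
            if len(s) >= first_threshold}
    overlaps = {(a, b): kept[a] & kept[b]
                for a, b in combinations(sorted(kept), 2)}
    return kept, overlaps


device_sets = {
    "wifi_a": {"d1", "d2", "d3"},
    "wifi_b": {"d2", "d3", "d4"},
    "wifi_c": {"d9"},              # too few devices: faulty or unpopular
}
kept, overlaps = filter_and_overlap(device_sets, first_threshold=2)
```

Removing `wifi_c` before the pairwise step shrinks the number of pairs from three to one, which is the dimension reduction the text refers to.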
  • Calculating the initial mapping probability between the wireless hotspot and the corresponding point of interest includes: selecting the wireless hotspots whose distance from a point of interest is less than a preset value as seed hotspots of that point of interest; determining the initial mapping probability between each seed hotspot and its corresponding point of interest as 1; and determining the initial mapping probability between wireless hotspots other than seed hotspots and the points of interest as 0.
  • the preset value is a preset maximum distance used to determine whether a wireless hotspot can be used as a seed hotspot of a certain point of interest.
  • Seed hotspots are wireless hotspots whose distance from at least one point of interest is less than the preset value, so that the point of interest they belong to can be determined with high confidence from the distance alone. In fact, the computer device can already establish the mapping between a seed hotspot and its corresponding point of interest at this stage. Wireless hotspots other than seed hotspots can be called hotspots to be mapped.
  • A hotspot to be mapped is a wireless hotspot whose distance from every point of interest is greater than or equal to the preset value, so that the point of interest it belongs to cannot yet be determined from the distance alone.
  • Each point of interest can have multiple corresponding seed hotspots, but each wireless hotspot can serve as the seed hotspot of only one point of interest.
  • In each row of the initialized mapping matrix, the initial mapping probability of the wireless hotspot WiFi i is 1 for at most one POI. When the distance between a wireless hotspot and multiple points of interest is less than the preset value, the wireless hotspot can be determined as the seed hotspot of the closest point of interest.
  • Because each wireless hotspot can serve as the seed hotspot of only one point of interest, the preset value should not be too large; it can be a distance much smaller than the radiation range of the wireless hotspot, such as 20 m.
  • the computer device determines the number corresponding to each point of interest POI.
  • the number range can be 0 to m-1.
  • m is the number of POIs included in the statistical area.
  • The number may be determined randomly by the computer device, or may be determined by the position coordinates of the POI, for example by numbering in order of decreasing longitude and/or latitude. In this way, the point-of-interest number can be used directly as the column index of the mapping matrix.
  • The computer device traverses the points of interest in numbering order and checks whether the distance between each wireless hotspot WiFi i and each point of interest POI j is less than the preset value.
  • When the distance is greater than or equal to the preset value, the computer device sets the initial mapping probability between the wireless hotspot WiFi i and the point of interest POI j to 0.
  • When the distance is less than the preset value, the computer device sets the initial mapping probability between the wireless hotspot WiFi i and the point of interest POI j to 1, and sets the initial mapping probabilities between the wireless hotspot WiFi i and the subsequent points of interest POI j+k to 0.
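The seed initialization can be sketched as below. The sketch follows the nearest-POI rule for hotspots that lie within the preset distance of several POIs; planar coordinates in metres, the function name, and the sample data are assumptions, and the 20 m default echoes the example value in the text.

```python
import math


def init_mapping(hotspots, pois, preset=20.0):
    """Build the initial n x m mapping matrix from seed hotspots.

    A hotspot closer than `preset` to at least one POI becomes a seed of
    its nearest POI (probability 1); every other entry stays 0.
    """
    matrix = [[0.0] * len(pois) for _ in hotspots]
    for i, (hx, hy) in enumerate(hotspots):
        dists = [math.hypot(hx - px, hy - py) for px, py in pois]
        nearest = min(range(len(pois)), key=lambda j: dists[j])
        if dists[nearest] < preset:
            matrix[i][nearest] = 1.0
    return matrix


pois = [(20, 0), (45, 0)]                   # column index = POI number
hotspots = [(30, 0), (200, 200), (50, 0)]   # row index = hotspot number
p0 = init_mapping(hotspots, pois)
```

The first hotspot is within 20 m of both POIs (10 m and 15 m) and is assigned to the closer one, so each row contains at most one entry equal to 1.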
  • the above-mentioned mapping method of wireless hotspots and points of interest includes:
  • S402 Obtain a sniffing record; the sniffing record includes data of the wireless hotspot sniffed by the sniffing device.
  • S404 Determine the degree of overlap of the sniffing devices according to the sniffing record.
  • S408 Divide multiple points of interest that have an inclusion relationship between the names of points of interest into one point of interest group.
  • S410 Calculate the initial mapping probability between the wireless hotspot and the corresponding interest point group according to the distance between the wireless hotspot and the interest point in the corresponding interest point group.
  • S412 Perform iterative propagation between initial mapping probabilities based on the degree of overlap of the sniffing devices, and obtain the target mapping probability between the wireless hotspot and the interest point group at the end of the iteration.
  • S414 Establish a mapping between the wireless hotspot and the interest point group according to the target mapping probability.
  • the name of the point of interest includes one or more address elements.
  • For example, the POI "Guangming Community East District Building 1" includes the three address elements "Guangming Community", "East District" and "Building 1".
  • An inclusion relationship between point-of-interest names means that one name consists of one or more address elements of another name.
  • For example, the POI "Guangming Community" is included in the POI "Guangming Community East District Building 1". Determining the inclusion relationship only requires a character-by-character comparison of the names; word segmentation of the names is not required.
  • the computer device traverses whether the name of each point of interest is included in the name of another point of interest.
  • When the point-of-interest name POIi is included in the point-of-interest name POIj, the computer device determines the point of interest named POIi as the parent point of interest of the one named POIj, and the point of interest named POIj as a child point of interest of the one named POIi.
  • Points of interest that have a parent-child relationship constitute an interest point group. For example, the parent point of interest "Guangming Community" and its child points of interest "Guangming Community East District Building 1", "Guangming Community East District Building 3", etc. form the interest point group {"Guangming Community", "Guangming Community East District Building 1", "Guangming Community East District Building 3", ...}.
  • The same interest point group may include points of interest at multiple levels; that is, a child point of interest may itself serve as the parent of other points of interest.
  • For example, "Guangming Community East District" is a child point of interest of "Guangming Community", but it is also the parent point of interest of "Guangming Community East District Building 1" and "Guangming Community East District Building 3".
  • When an interest point group contains points of interest at multiple levels, the computer device determines the highest-level point of interest as the parent point of interest of the group.
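The name-based grouping can be sketched with plain substring tests, which matches the statement that no word segmentation is needed. This is an illustrative simplification: each name is assigned to the shortest name it contains, which is taken as the highest-level parent. The function name and sample names are assumptions.

```python
def group_by_name(names):
    """Group POI names under their highest-level parent name.

    The parent of a name is the shortest name (possibly itself) that
    occurs inside it as a substring.
    """
    groups = {}
    for name in names:
        # Every name contained in this one, including the name itself.
        ancestors = [other for other in names if other in name]
        root = min(ancestors, key=len)
        groups.setdefault(root, []).append(name)
    return groups


names = [
    "Guangming Community",
    "Guangming Community East District",
    "Guangming Community East District Building 1",
    "Longdu Garden",
]
groups = group_by_name(names)
```

The multi-level case is handled implicitly: "Guangming Community East District" is both a child and a parent, yet all three related names end up under the top-level "Guangming Community".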
  • The computer device may also split the point-of-interest names into multiple batches, perform the hierarchical division on each batch simultaneously in the above-mentioned manner, and finally merge the per-batch results to determine the final hierarchical relationship of the points of interest in the statistical area.
  • The computer device can perform the mapping between wireless hotspots and points of interest in units of interest point groups. Specifically, the computer device calculates the initial mapping probability between the wireless hotspot and the corresponding interest point group according to the distance between the wireless hotspot and each point of interest in the group. For example, the computer device can calculate the initial mapping probability between the wireless hotspot WiFi i and the interest point group {POIi} based on the minimum or average distance between WiFi i and the points of interest in {POIi}.
  • The computer device iteratively propagates the initial mapping probabilities between the wireless hotspots and the interest point groups {POIi} based on the sniffing-device overlap in the above-mentioned manner, obtains the target mapping probability between each wireless hotspot and each interest point group {POIi}, and establishes a mapping between the wireless hotspot WiFi i and the interest point group {POIi} with the highest target mapping probability.
  • When a minimum target mapping probability for mapping a wireless hotspot WiFi i to a POI is defined, that is, the second threshold, the computer device traverses the target mapping probabilities between WiFi i and each interest point group {POIi} to check whether any reaches the second threshold.
  • If the target mapping probability between WiFi i and every interest point group {POIi} is less than the second threshold, the computer device judges WiFi i to be noise and eliminates it. For each retained wireless hotspot, it establishes a mapping to the points of interest in the interest point group {POIi} that reaches the second threshold and has the largest target mapping probability.
  • The mapping between wireless hotspots and points of interest is established from the target mapping probabilities between them. Assume that the second threshold of the target mapping probability required to establish a mapping is 0.7. As shown in Figure 5, the target mapping probability between the wireless hotspot WiFi i and the point of interest "Runcheng Garden" is 0.2, with "Runcheng Garden Building 1" it is 0.5, with "Runcheng Garden Building 9" it is 0.1, and with "Longdu Garden" it is 0.2.
  • Without the POI hierarchical relationship, the target mapping probability of WiFi i is diluted across multiple nearby POIs: the probability mapped to each individual POI is small and falls below the second threshold, so WiFi i is eventually eliminated, resulting in a low recall rate for wireless hotspots.
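To make the dilution effect concrete, the Figure 5 values (0.2, 0.5, 0.1 and 0.2 against a threshold of 0.7) can be worked through. The second line assumes, purely for illustration, that the probability of an interest point group is the sum of its members' probabilities; the text does not state how the group value is obtained.

```latex
\begin{align*}
\max(0.2,\ 0.5,\ 0.1,\ 0.2) &= 0.5 < 0.7
  && \text{(no single POI reaches the threshold)} \\
0.2 + 0.5 + 0.1 &= 0.8 \ge 0.7
  && \text{(group of the three ``Runcheng Garden'' POIs)}
\end{align*}
```

Under that assumption the hotspot is retained once the three "Runcheng Garden" points are treated as one group, whereas it would be discarded as noise if each point were considered separately.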
  • Introducing the POI hierarchical relationship also reduces the dimension of the mapping matrix in the iterative propagation of the initial mapping probabilities, lessens the impact of data skew, greatly reduces the amount of calculation, and improves the stability of the established mapping relationship.
  • The points of interest are divided into groups based on their names so that the POI hierarchical relationship is reflected. Mapping in units of interest point groups improves the mapping efficiency, and reflecting the POI hierarchical relationship also improves the recall rate of the mapping between wireless hotspots and POIs.
  • Performing iterative propagation between the initial mapping probabilities based on the sniffing-device overlap, and obtaining the target mapping probabilities between the wireless hotspots and the points of interest at the end of the iteration, includes: multiplying the propagation matrix by the initial mapping matrix to obtain the intermediate mapping matrix; and resetting the intermediate mapping probabilities between the seed hotspots and their corresponding points of interest in the intermediate mapping matrix to 1, then iterating with the result as the initial mapping matrix until the iteration stop condition is met, to obtain the target mapping matrix, which records the target mapping probabilities between wireless hotspots and points of interest.
  • The computer device multiplies the propagation matrix $W_{n \times n}$ by the initialized mapping matrix to obtain the intermediate mapping matrix.
  • The intermediate mapping matrix records the intermediate mapping probabilities between wireless hotspots and points of interest.
  • The computer device resets the intermediate mapping probability between each seed hotspot and its corresponding point of interest in the intermediate mapping matrix to the initial value, which is 1.
  • The computer device normalizes the intermediate mapping probabilities between the hotspots to be mapped and the points of interest in the intermediate mapping matrix.
  • The computer device then uses the intermediate mapping matrix, with the seed hotspot probabilities reset and the probabilities of the hotspots to be mapped normalized, as the initial mapping matrix for the next iteration.
  • Because the initial mapping probability between a seed hotspot and its corresponding point of interest is reliable, resetting it during iterative propagation keeps that mapping probability at 1 from beginning to end. The reliable seed hotspots thus act as the source of propagation: through iteration, their initial mapping probabilities are continuously propagated to the surrounding hotspots to be mapped, which improves the accuracy of the mapping relationship.
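The propagate, reset, normalize loop can be sketched as follows. Several details are assumptions of this sketch rather than statements from the text: the other entries of a seed row are reset to 0, normalization divides each non-seed row by its sum, and a fixed iteration count stands in for the stop condition. All names and values are hypothetical.

```python
def propagate_with_seeds(w, p0, seeds, iterations=20):
    """Label propagation with seed rows clamped after every step.

    w:     n x n propagation matrix (device overlap between hotspots)
    p0:    n x m initial mapping matrix
    seeds: dict mapping seed hotspot index -> index of its POI
    """
    n, m = len(p0), len(p0[0])
    p = [row[:] for row in p0]
    for _ in range(iterations):
        nxt = [[sum(w[i][k] * p[k][j] for k in range(n)) for j in range(m)]
               for i in range(n)]
        for i in range(n):
            if i in seeds:
                # Reset the seed row: probability 1 for its own POI.
                nxt[i] = [1.0 if j == seeds[i] else 0.0 for j in range(m)]
            else:
                total = sum(nxt[i])
                if total > 0:
                    # Normalize rows of hotspots still to be mapped.
                    nxt[i] = [v / total for v in nxt[i]]
        p = nxt
    return p


# Hotspots 0 and 1 are seeds of POI 0 and POI 1; hotspot 2 is unmapped
# and shares more devices with hotspot 0 (0.6) than with hotspot 1 (0.2).
w = [[1.0, 0.1, 0.6],
     [0.1, 1.0, 0.2],
     [0.6, 0.2, 1.0]]
p0 = [[1.0, 0.0],
      [0.0, 1.0],
      [0.0, 0.0]]
p = propagate_with_seeds(w, p0, seeds={0: 0, 1: 1})
```

In this toy case the unmapped hotspot settles at roughly 0.75 for POI 0 and 0.25 for POI 1, mirroring the 0.6 to 0.2 ratio of its device overlap with the two seeds.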
  • The hotspot locations recorded in the sniffing records may deviate from the true locations, so this data is of limited reliability. Setting the initial mapping probability between seed hotspots and their corresponding points of interest to 1, and the initial mapping probability between all other wireless hotspots and points of interest to 0, minimizes the use of hotspot location data from the sniffing records: only the highly reliable seed hotspots serve as the source of propagation, which improves the accuracy of the mapping relationship.
  • the above-mentioned mapping method of wireless hotspots and points of interest includes:
  • S602 Obtain a sniffing record; the sniffing record includes data of the wireless hotspot sniffed by the sniffing device.
  • S604 Divide a statistical area that needs to be mapped to wireless hotspots and points of interest into multiple sub-areas.
  • S606 Determine the degree of overlap of sniffing devices between wireless hotspots in each sub-area according to the sniffing record.
  • S608 According to the distance between the wireless hotspot and the point of interest, determine an initial mapping probability between the wireless hotspot and the point of interest in the corresponding sub-area.
  • S610 Perform iterative propagation between initial mapping probabilities based on the degree of overlap of the sniffing devices, and obtain target mapping probabilities between wireless hotspots and points of interest in the corresponding sub-area at the end of the iteration.
  • S612 Establish a mapping between the wireless hotspots and the points of interest in the corresponding sub-area according to the target mapping probability between the wireless hotspots and the points of interest in the same sub-area.
  • S614 Establish a mapping between wireless hotspots and points of interest in the statistical area by fusing data mapped between wireless hotspots and points of interest in all sub-regions.
  • The embodiment of the present application divides the statistical area into multiple sub-areas.
  • The area and the boundary contour of each sub-area may differ.
  • In that case, the computer device may specifically divide the statistical area into a plurality of sub-areas with different areas and/or different boundary contours according to the population distribution and the usual passenger flow. Locations with a large population or heavy passenger flow can be divided into small sub-areas by narrowing the sub-area boundary, and locations with a small population or light passenger flow can be divided into large sub-areas by widening the boundary, so that the sub-areas contain similar numbers of wireless hotspots and points of interest.
  • The area and the boundary contour of each sub-area may also be the same.
  • In that case, the computer device may specifically divide the statistical area into multiple sub-areas with the same area and/or boundary contour based on a preset grid.
  • The computer device establishes a plane coordinate system based on the location of the statistical area in the digital map, and divides the statistical area into a plurality of sub-areas of equal area using a square grid 702 of a preset size.
  • The preset grid may also use other polygons, such as triangles, parallelograms, rhombuses, and so on.
  • The preset grid may also have an irregular border. Different sub-areas may be disjoint, or a certain overlapping area may be set between them for transition. Those skilled in the art can also use other area division methods; none of these choices is limited here.
  • the computer equipment separately establishes the mapping of wireless hotspots and points of interest in each sub-area according to the above-mentioned method, and finally merges the mapping data of wireless hotspots and points of interest in all sub-areas to obtain the statistics of all wireless hotspots and points of interest in the statistical area. Complete mapping relationship.
  • the statistical area is divided into multiple sub-areas. Each sub-area contains less wireless hotspot and point of interest data to be mapped, and the wireless hotspots and points of interest in multiple sub-areas can be mapped simultaneously. This greatly improves the mapping efficiency and makes the mapping method of wireless hotspots and points of interest provided in this application suitable for scenarios with large statistical areas.
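  • As a minimal sketch of the divide-then-fuse idea described above, the following Python fragment maps each sub-area independently and merges the per-area results. The function names `map_all_sub_areas` and `stub_mapper`, the dictionary layout, and the use of a thread pool are assumptions made for illustration; the stub stands in for the real per-area mapping and is not the method of this application.

```python
from concurrent.futures import ThreadPoolExecutor


def map_all_sub_areas(sub_areas, map_sub_area, max_workers=4):
    """Map every sub-area independently, then fuse the per-area results.

    sub_areas: dict of sub-area id -> data for that sub-area.
    map_sub_area: hypothetical callable returning {hotspot: point_of_interest}.
    """
    merged = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Sub-areas share no state, so they can be processed concurrently.
        for result in pool.map(map_sub_area, sub_areas.values()):
            merged.update(result)
    return merged


def stub_mapper(area):
    # Stand-in mapper used only to exercise the fusion step.
    return {hotspot: area["poi"] for hotspot in area["hotspots"]}


areas = {
    "A": {"hotspots": ["wifi_1", "wifi_2"], "poi": "poi_a"},
    "B": {"hotspots": ["wifi_3"], "poi": "poi_b"},
}
full_mapping = map_all_sub_areas(areas, stub_mapper)
```

  • A thread pool is used here only to keep the sketch short; for CPU-bound mapping work a process pool or a distributed job per sub-area would be the more likely choice.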
  • the above-mentioned mapping method of wireless hotspots and points of interest includes:
  • the sniffing record includes data of the wireless hotspot sniffed by the sniffing device.
  • S804 Divide the statistical area in which wireless hotspots and points of interest need to be mapped into multiple sub-areas.
  • S806 Determine the degree of overlap of sniffing devices between wireless hotspots in the same sub-area according to the sniffing record.
  • S808 Group multiple points of interest in the same subregion to obtain one or more points of interest groups.
  • S810 In two adjacent sub-areas, merge the two points of interest whose point of interest names have an inclusion relationship.
  • S812 Calculate the initial mapping probability between the wireless hotspot and the corresponding interest point group according to the distance between the wireless hotspot and the corresponding interest point group.
  • S814 Perform iterative propagation between initial mapping probabilities based on the degree of overlap of the sniffing devices, and obtain the target mapping probability between the wireless hotspots and interest point groups in the corresponding sub-regions at the end of the iteration.
  • S816 According to the target mapping probability between the wireless hotspots and the points of interest in the same sub-area, establish a mapping between the wireless hotspots and the points of interest in the corresponding sub-area.
  • S818 Establish a mapping between wireless hotspots and points of interest in the statistical area by fusing data mapped between wireless hotspots and points of interest in all sub-regions.
  • the adjacent sub-areas may be two sub-areas whose area boundaries adjoin.
  • each sub-region has corresponding position coordinates for representing the position of the region, such as the position coordinates of the center point.
  • the adjacent sub-areas may also be two sub-areas whose position coordinate distance is less than a preset distance threshold. For example, in the statistical area shown in FIG. 7, when adjacent sub-areas are defined as two sub-areas whose boundaries adjoin, the adjacent sub-areas corresponding to the sub-area E include the sub-areas B, D, F, and H.
  • when adjacent sub-areas are defined as two sub-areas whose position coordinate distance is less than the distance threshold, and the distance threshold is the diagonal length of a sub-area, the adjacent sub-areas corresponding to the sub-area E include the sub-areas A, B, C, D, F, G, H, and J.
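  • The two adjacency definitions above can be sketched as follows, assuming the sub-areas form a regular grid of equal squares indexed by (row, column), with the central cell of a 3 x 3 grid playing the role of sub-area E. The function name `adjacent_cells` and the flag `include_diagonal` are hypothetical, and treating diagonal cells as lying within the distance threshold is an assumption.

```python
def adjacent_cells(row, col, n_rows, n_cols, include_diagonal=False):
    """Return (row, col) indices of the sub-areas adjacent to one grid cell."""
    # Cells that share an edge with the current cell.
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if include_diagonal:
        # Cells whose centres lie one diagonal length away.
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    neighbours = []
    for d_row, d_col in offsets:
        r, c = row + d_row, col + d_col
        if 0 <= r < n_rows and 0 <= c < n_cols:
            neighbours.append((r, c))
    return sorted(neighbours)


edge_neighbours = adjacent_cells(1, 1, 3, 3)
all_neighbours = adjacent_cells(1, 1, 3, 3, include_diagonal=True)
```

  • For the central cell, the first call returns the four edge-sharing cells and the second returns all eight surrounding cells; cells on the border of the grid simply have fewer neighbours.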
  • the computer device separately groups the POIs in each sub-area in the above-mentioned manner to obtain one or more point of interest groups in each sub-area.
  • grouping multiple points of interest in the same sub-area to obtain one or more point of interest groups includes: obtaining the point of interest name of each point of interest in the same sub-area; dividing multiple points of interest whose point of interest names have an inclusion relationship into one point of interest group; and, in the point of interest group, determining the point of interest whose name is included in the other names as the parent point of interest.
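  • A minimal sketch of grouping by name inclusion, assuming plain substring matching on the point of interest names. The function name `group_by_name_inclusion` and the sample names are illustrative only; a real implementation would likely also need name normalization and a rule for names that contain more than one candidate parent.

```python
def group_by_name_inclusion(poi_names):
    """Group points of interest whose names contain a common shorter name.

    Returns a dict of parent name -> list of member names (parent first).
    """
    groups = {}
    # Shorter names are visited first so that a contained name becomes the parent.
    for name in sorted(poi_names, key=len):
        parent = next((p for p in groups if p in name), None)
        if parent is None:
            groups[name] = [name]  # start a new group with this POI as parent
        else:
            groups[parent].append(name)
    return groups


names = [
    "Runcheng Garden Building 1",
    "Runcheng Garden",
    "City Library",
    "Runcheng Garden Building 9",
]
poi_groups = group_by_name_inclusion(names)
```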
  • the computer device merges the interest point groups in each sub-region to obtain the POI hierarchical relationship in the entire statistical area.
  • merging the two points of interest whose point of interest names have an inclusion relationship includes: when the point of interest name of a parent point of interest in the current sub-area contains the point of interest name of a parent point of interest in an adjacent sub-area, merging the point of interest group of the parent point of interest in the current sub-area into the point of interest group of the corresponding parent point of interest in the adjacent sub-area; and when the point of interest name of a parent point of interest in the current sub-area is contained in the point of interest name of a parent point of interest in an adjacent sub-area, merging the point of interest group of the parent point of interest in the adjacent sub-area into the point of interest group of the corresponding parent point of interest in the current sub-area.
  • the computer device traverses the sub-areas and identifies whether any parent point of interest in the current sub-area and any parent point of interest in the adjacent sub-areas of the current sub-area have a point of interest name inclusion relationship. If the point of interest name of the parent point of interest POIi of a point of interest group in the current sub-area is included in the point of interest name of the parent point of interest POIj of a point of interest group in the adjacent sub-area, the point of interest group where the parent point of interest POIj is located is merged into the point of interest group where the parent point of interest POIi is located.
  • if the point of interest name of the parent point of interest POIi of a point of interest group in the current sub-area includes the point of interest name of the parent point of interest POIj of a point of interest group in the adjacent sub-area, the point of interest group where the parent point of interest POIi is located is merged into the point of interest group where the parent point of interest POIj is located.
  • if the point of interest group {POIe} in the sub-area E were merged into both the point of interest group {POId} in the sub-area D and the point of interest group {POIh} in the sub-area H, the merged groups {POId, POIe} and {POIh, POIe} would contain duplicate points of interest, which in turn would cause points of interest to be mapped to wireless hotspots repeatedly.
  • when the embodiment of the present application merges two point of interest groups in adjacent sub-areas that have a point of interest name inclusion relationship, if it is found that one point of interest group can be merged into point of interest groups in multiple adjacent sub-areas, the computer device randomly merges that point of interest group into the point of interest group in one of the adjacent sub-areas.
  • alternatively, each sub-area has a corresponding number, and the point of interest group can be merged into the point of interest group in the adjacent sub-area with the largest sub-area number.
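  • The largest-number tie-break can be sketched as below, assuming each adjacent sub-area is identified by an integer number and contributes one parent point of interest name. The function name `choose_merge_target` and the sample data are hypothetical.

```python
def choose_merge_target(parent_name, adjacent_parents):
    """Pick the single adjacent sub-area whose group absorbs this group.

    adjacent_parents: dict of sub-area number -> parent name in that sub-area.
    Returns the chosen sub-area number, or None if no name is contained.
    """
    candidates = [
        number
        for number, other_parent in adjacent_parents.items()
        if other_parent in parent_name
    ]
    # Tie-break: the largest sub-area number wins, so the group is merged once.
    return max(candidates) if candidates else None


target = choose_merge_target(
    "Runcheng Garden East Gate",
    {3: "Runcheng Garden", 7: "Runcheng Garden", 5: "City Library"},
)
```

  • Because the choice is deterministic, repeated runs merge the group into the same sub-area, which the random variant does not guarantee.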
  • the computer equipment calculates the overlapping degree of sniffing devices between the wireless hotspots in each sub-area according to the above-mentioned method, and calculates the distance between the wireless hotspots and the points of interest in each sub-area.
  • the initial mapping probability is the initial probability that a wireless hotspot in a sub-area maps to a point of interest in that sub-area.
  • calculating the initial mapping probability between the wireless hotspot and the corresponding point of interest group according to the distance between the wireless hotspot and each point of interest in the point of interest group includes: selecting the wireless hotspots whose distance to at least one point of interest in the point of interest group is less than a preset value as the seed hotspots of the corresponding point of interest group; determining that the initial mapping probability between a seed hotspot and the corresponding point of interest group is 1; and determining that the initial mapping probability between each wireless hotspot other than the seed hotspots and the point of interest group is 0.
  • each point of interest group {POI} has a corresponding number. Assuming that the m POIs contained in the statistical area are divided into p point of interest groups, the numbers may range from 0 to p-1. In this way, the point of interest group number can be used directly as the column index of the mapping matrix.
  • the computer device traverses the point of interest groups in order of group number and checks whether the distance between each wireless hotspot WiFi i and each point of interest in the point of interest group {POIj} is less than a preset value.
  • if the distance between the wireless hotspot WiFi i and every point of interest in the point of interest group {POIj} is not less than the preset value, the computer device sets the initial mapping probability of the wireless hotspot WiFi i and the point of interest group {POIj} to 0.
  • if the distance between the wireless hotspot WiFi i and at least one point of interest in the point of interest group {POIj} is less than the preset value, that is, when it is determined that the wireless hotspot WiFi i is the seed node of the point of interest group {POIj}, the computer device sets the initial mapping probability of the wireless hotspot WiFi i and the point of interest group {POIj} to 1, and sets the initial mapping probability of the wireless hotspot WiFi i and each point of interest group {POIj+k} to 0, where j+k ≤ p-1.
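  • The seed selection described above can be sketched as follows, assuming hotspot and point of interest positions are planar (x, y) coordinates in meters and that distance is Euclidean. The function name `initial_mapping_matrix` and the parameter `max_distance` (the preset value) are hypothetical.

```python
import math


def initial_mapping_matrix(hotspots, poi_groups, max_distance):
    """Build the n x p initial mapping matrix of 0/1 probabilities.

    hotspots: list of (x, y) positions, one per wireless hotspot.
    poi_groups: list of groups, each a list of (x, y) POI positions.
    """
    matrix = [[0.0] * len(poi_groups) for _ in hotspots]
    for i, hotspot in enumerate(hotspots):
        for j, group in enumerate(poi_groups):
            if any(math.dist(hotspot, poi) < max_distance for poi in group):
                matrix[i][j] = 1.0  # hotspot i is a seed of group j
                break  # the remaining groups j+k keep probability 0
    return matrix


hotspots = [(0, 0), (50, 50), (100, 0)]
groups = [[(1, 1), (200, 200)], [(99, 1)]]
initial = initial_mapping_matrix(hotspots, groups, max_distance=5)
```

  • In this example the first and third hotspots become seeds of groups 0 and 1 respectively, and the second hotspot is left as a hotspot to be mapped with an all-zero row.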
  • based on the overlap degree of the sniffing devices between the wireless hotspots in each sub-area, the computer device iteratively propagates the initial mapping probability of the seed hotspots in the corresponding sub-area to the hotspots to be mapped, and obtains the target mapping probability between the wireless hotspots and points of interest in the corresponding sub-area at the end of the iteration. Finally, the mapping between each wireless hotspot and point of interest in the corresponding sub-area is established according to the target mapping probability.
  • the POI hierarchical relationship is introduced, which improves the mapping efficiency and extends the usage scenarios of the embodiments of this application. It also improves the recall rate of the mapping between wireless hotspots (WiFi) and POIs.
  • mapping method of wireless hotspots and points of interest includes:
  • S902 Obtain a sniffing record; the sniffing record includes the identification of the sniffing device and the location and hotspot name of at least one wireless hotspot.
  • S904 Identify mobile hotspots in the wireless hotspots according to changes in the positions of the wireless hotspots in different sniffing records.
  • S906 Eliminate data about mobile hotspots in each sniffing record.
  • S908 Divide the statistical area in which wireless hotspots and points of interest need to be mapped into multiple sub-areas.
  • S910 According to the sniffing records from which the data has been removed, determine the deduplication sniffing device set corresponding to each wireless hotspot in the same sub-area.
  • S912 From all the deduplication sniffing device sets, remove the sets whose number of sniffing device identifiers is less than a first threshold, to obtain the target deduplication sniffing device sets.
  • S914 Identify the overlapping sniffing device identifiers in every two target deduplication sniffing device sets.
  • S916 Based on the number of overlapping sniffing device identities and the number of sniffing device identities in the corresponding deduplication sniffing device set, determine the degree of overlap of sniffing devices corresponding to two wireless hotspots in the corresponding sub-area.
  • S920 Divide multiple points of interest that have an inclusion relationship between the names of points of interest into one point of interest group.
  • S922 Determine the interest point corresponding to the included interest point name as the parent interest point in the interest point group.
  • S928 Select wireless hotspots whose distance from at least one interest point in the interest point group is less than a preset value as the seed hotspot of the corresponding interest point group.
  • S930 Determine that the initial mapping probability between the seed hotspot and the corresponding interest point group is 1; determine that the initial mapping probability between each wireless hotspot and the interest point group except the seed node is 0.
  • the overlap degree of the sniffing devices between the wireless hotspots is used as a matrix element to establish a propagation matrix.
  • S934 The initial mapping probability between the wireless hotspot and the point of interest is used as a matrix element to establish an initial mapping matrix.
  • S938 Reset the intermediate mapping probability between each seed hotspot and the corresponding point of interest in the intermediate mapping matrix to 1, then use the result as the initial mapping matrix for the next iteration, and stop when the iteration stop condition is met to obtain the target mapping matrix; the target mapping matrix records the target mapping probability between each wireless hotspot and each point of interest.
  • S942 Establish a mapping between each retained wireless hotspot in the corresponding sub-area and the point of interest with the highest corresponding target mapping probability.
  • S944 Establish a mapping between wireless hotspots and points of interest in the statistical area by fusing data mapped between wireless hotspots and points of interest in all sub-regions.
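  • The propagation steps in the flow above (multiply the propagation matrix by the mapping matrix, reset the seed entries to 1, repeat) can be sketched in plain Python as follows. The function name `propagate`, the stop condition (a maximum iteration count or a change smaller than `tol`), and the zero diagonal of the example propagation matrix are assumptions; the text does not specify them.

```python
def propagate(propagation, initial, max_iter=100, tol=1e-6):
    """Iteratively propagate seed labels through the hotspot graph.

    propagation: n x n matrix of sniffing-device overlap degrees.
    initial: n x p matrix of initial mapping probabilities (seeds are 1).
    """
    n, p = len(initial), len(initial[0])
    seeds = [(i, j) for i in range(n) for j in range(p) if initial[i][j] == 1.0]
    current = [row[:] for row in initial]
    for _ in range(max_iter):
        # Intermediate mapping matrix = propagation matrix x current matrix.
        nxt = [
            [sum(propagation[i][k] * current[k][j] for k in range(n))
             for j in range(p)]
            for i in range(n)
        ]
        for i, j in seeds:
            nxt[i][j] = 1.0  # reset seed probabilities back to 1
        change = max(abs(nxt[i][j] - current[i][j])
                     for i in range(n) for j in range(p))
        current = nxt
        if change < tol:  # assumed stop condition
            break
    return current


# Hotspot 0 seeds POI 0, hotspot 2 seeds POI 1, hotspot 1 is to be mapped.
overlap = [[0.0, 0.8, 0.0],
           [0.8, 0.0, 0.1],
           [0.0, 0.1, 0.0]]
seed_matrix = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
target = propagate(overlap, seed_matrix)
```

  • In this example hotspot 1 shares far more sniffing devices with hotspot 0 than with hotspot 2, so its target probability for POI 0 ends up higher than for POI 1.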
  • the above-mentioned mapping method of wireless hotspots and points of interest establishes the mapping relationship between wireless hotspots and points of interest based on the sniffing records of wireless hotspots, without the need to manually collect and report POI visit data. This not only improves the mapping efficiency, but also reduces the dependence on the names of wireless hotspots and points of interest, so that the mapping method is widely applicable and can also improve the recall rate of wireless hotspots.
  • the correlation between wireless hotspots is measured based on the overlap degree of sniffing devices, which can assist in judging the user's flow attributes between points of interest, retain the user's spatial behavior characteristic information, and better realize the distinction of wireless hotspots in space.
  • the mapping between the wireless hotspot and the point of interest is established, which can improve the accuracy of the mapping.
  • mapping method for wireless hotspots and points of interest includes:
  • S1002 Obtain a sniffing record, identify and remove mobile hotspots in the sniffing record.
  • Wireless hotspot mobile/fixed identification: a series of location data for each wireless hotspot can be inferred from the sniffing records over a period of time. If the locations of a wireless hotspot vary greatly, the hotspot can be judged to be mobile; otherwise it is fixed.
  • the wireless hotspot mobile/fixed identification can also use other means, which is not limited.
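  • One possible reading of "locations vary greatly" is sketched below, assuming planar (x, y) positions in meters and using the largest pairwise distance as the measure of variation. The function name `is_mobile` and the threshold `max_spread` are hypothetical; the text leaves the criterion open.

```python
import math


def is_mobile(positions, max_spread):
    """Flag a hotspot as mobile when its observed positions are far apart.

    positions: list of (x, y) positions inferred from sniffing records.
    max_spread: hypothetical threshold on the largest pairwise distance.
    """
    spread = max(
        (math.dist(a, b) for a in positions for b in positions),
        default=0.0,  # no observations means no evidence of movement
    )
    return spread > max_spread


fixed_router = is_mobile([(0, 0), (3, 4), (1, 1)], max_spread=50)
phone_tether = is_mobile([(0, 0), (300, 400)], max_spread=50)
```

  • The pairwise comparison is quadratic in the number of observations; for long histories a bounding-box diagonal or a variance estimate would be a cheaper stand-in.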
  • S1004 Determine the number of the geographic grid where each wireless hotspot and point of interest are located.
  • the method adopted in this embodiment is to first perform grid division of wireless hotspots and POIs, then calculate the mapping relationship between wireless hotspots and POIs in each geographic grid, and finally perform fusion to obtain a complete mapping result.
  • Each geographic grid has a corresponding number.
  • the number can be determined in the following way: a two-dimensional coordinate system is established in a plane area containing the geographic area in which wireless hotspots and points of interest need to be mapped. Each geographic grid can be a square grid with a side length of d meters, and each square grid has corresponding coordinates in the two-dimensional coordinate system, such as the coordinates (x, y) of its center point or of a vertex.
  • the geographic grid number corresponding to a wireless hotspot or POI is calculated as (x, y) -> ([x/d]*d, [y/d]*d), where x is the abscissa, y is the ordinate, and [] is the rounding operation.
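  • The grid number calculation can be written directly as a small function. This sketch assumes that the rounding operation [] rounds down (floor), which makes the number equal to the lower-left vertex of the grid cell; the function name `grid_number` is hypothetical.

```python
import math


def grid_number(x, y, d):
    """Map a planar position to the number of its square grid of side d."""
    # Assumption: [] is taken as floor, so negative coordinates round down too.
    return (math.floor(x / d) * d, math.floor(y / d) * d)


hotspot_cell = grid_number(250.0, 99.9, 100)
poi_cell = grid_number(299.0, 5.0, 100)
```

  • Two positions receive the same grid number exactly when they fall in the same d x d cell, so the number can be used as a grouping key for hotspots and POIs.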
  • S1006 Identify the parent interest points based on the inclusion relationship between the names of the interest points.
  • POI level division: in reality, many POIs have a hierarchical relationship. Taking the residential community POI "Runcheng Garden" as an example, there are nearby POIs such as "Runcheng Garden Building 1" and "Runcheng Garden Building 9". Each POI data record includes id, name, lon (longitude), lat (latitude), and so on. The steps of POI level division are as follows:
  • The propagation matrix is an n*n square matrix, where n is the number of wireless hotspots; the element at position [i, j] represents the correlation between wifi i and wifi j, and its value range is [0, 1].
  • This embodiment proposes a correlation measurement method between wireless hotspots based on the overlap degree of sniffing devices.
  • the correlation calculated by this method lies in [0, 1] and requires no normalization; at the same time, individual user information is retained, so the wireless hotspot relationship network can be segmented better.
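  • A sketch of one overlap measure that stays in [0, 1] without normalization is given below. It assumes the overlap degree is the number of shared sniffing device identifiers divided by the number of distinct identifiers in both sets (a Jaccard ratio); this exact formula is an assumption, as is the function name `overlap_degree`.

```python
def overlap_degree(devices_a, devices_b):
    """Overlap degree of two hotspots' deduplicated sniffing-device sets.

    Assumed form: shared devices divided by all distinct devices (Jaccard).
    """
    # Converting to sets removes repeated sightings of the same device.
    set_a, set_b = set(devices_a), set(devices_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


degree = overlap_degree(["d1", "d2", "d3", "d1"], ["d2", "d3", "d4"])
```

  • Here two of the four distinct devices sniffed both hotspots, so the overlap degree is 0.5.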
  • the steps for constructing a wireless hotspot relationship network are as follows:
  • the elements of the label matrix take one of two values, 0 and 1, where 1 corresponds to a seed hotspot and 0 corresponds to a hotspot to be mapped.
  • S1014 Identify and filter noise hotspots in the label matrix L to obtain mapping data of wireless hotspots and points of interest in each geographic grid.
  • when the propagation iteration ends, the final label matrix is obtained, and the maximum element value of each row is calculated. If the maximum value is greater than the given filter threshold, the wireless hotspot is retained and a mapping is established between it and the POI corresponding to the maximum value; otherwise, the wireless hotspot is determined to be a noise hotspot and is removed.
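  • The row-maximum filter can be sketched as follows, assuming the final label matrix is a list of rows (one per wireless hotspot) with one column per POI. The function name `build_mapping`, the identifier lists, and the sample values are hypothetical.

```python
def build_mapping(label_matrix, hotspot_ids, poi_ids, filter_threshold):
    """Keep hotspots whose best score exceeds the threshold and map them."""
    mapping = {}
    for hotspot, row in zip(hotspot_ids, label_matrix):
        # Column index of the largest element in this hotspot's row.
        best = max(range(len(row)), key=row.__getitem__)
        if row[best] > filter_threshold:
            mapping[hotspot] = poi_ids[best]
        # Otherwise the hotspot is treated as a noise hotspot and dropped.
    return mapping


final_labels = [[1.0, 0.0], [0.81, 0.28], [0.05, 0.02]]
result = build_mapping(final_labels, ["w0", "w1", "w2"], ["p0", "p1"], 0.5)
```

  • The third hotspot never exceeds the threshold for any POI, so it is excluded from the result as noise.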
  • S1016 Combine the mapping data in all geographic grids to obtain a mapping result of a full amount of wireless hotspots and points of interest.
  • within a range of 10 residential areas, the number of wireless hotspots recalled with an accuracy higher than 80% exceeds 5000, whereas under the same accuracy condition the traditional name-based mapping method recalls only about 100 wireless hotspots and the location-based mapping method recalls only about 2000 wireless hotspots.
  • the mapping method provided in this application is significantly better than the traditional name-based or location-based mapping method in terms of accuracy and recall rate.
  • Figures 2, 4, 6, 8, 9 and 10 are schematic flowcharts of a method for mapping wireless hotspots and points of interest in an embodiment. It should be understood that although the steps in the flowcharts of FIGS. 2, 4, 6, 8, 9 and 10 are displayed in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they can be executed in other orders. Moreover, at least some of the steps in Figures 2, 4, 6, 8, 9 and 10 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same moment, but can be executed at different moments; nor are they necessarily executed sequentially, but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
  • a wireless hotspot and interest point mapping device 1100 which includes a hotspot correlation measurement module 1102, a mapping probability propagation module 1104, and a hotspot interest point mapping module 1106, where:
  • the hotspot correlation measurement module 1102 is used to obtain sniffing records, where a sniffing record includes data of the wireless hotspots sniffed by a sniffing device, and to determine the degree of overlap of sniffing devices between wireless hotspots according to the sniffing records.
  • the mapping probability propagation module 1104 is used to determine the initial mapping probability between the wireless hotspot and the corresponding point of interest according to the distance between the wireless hotspot and the corresponding point of interest, and to perform iterative propagation between the initial mapping probabilities based on the overlap degree of the sniffing devices, obtaining the target mapping probability between the wireless hotspot and the point of interest at the end of the iteration.
  • the hotspot and interest point mapping module 1106 is used to establish a mapping between the wireless hotspot and the interest point according to the target mapping probability.
  • the data of the wireless hotspot includes the location of the wireless hotspot; as shown in FIG. 12, the above-mentioned wireless hotspot and point of interest mapping device 1100 further includes a mobile hotspot culling module 1108, which is used to identify mobile hotspots among the wireless hotspots according to changes in the positions of the wireless hotspots in different sniffing records, and to remove the data about mobile hotspots from each sniffing record; determining the overlap degree of sniffing devices between wireless hotspots according to the sniffing records includes: determining the overlap degree of sniffing devices between wireless hotspots according to the sniffing records from which the data has been removed.
  • the sniffing record includes the identification of the sniffing device and the hotspot names of at least two wireless hotspots; the hotspot correlation measurement module 1102 is further configured to determine the deduplication sniffing corresponding to each hotspot name based on the identification of the sniffing device Device collection; identify the overlapping sniffing device IDs in every two deduplication sniffing device sets; determine the corresponding two based on the number of overlapping sniffing device IDs and the number of sniffing device IDs in the corresponding deduplication sniffing device set The degree of overlap of sniffing devices for wireless hotspots.
  • the device 1100 for mapping wireless hotspots and points of interest described above further includes an unpopular hotspot elimination module 1110, which is used to remove, from all the deduplication sniffing device sets, the sets whose number of sniffing device identifiers is less than a first threshold, to obtain the target deduplication sniffing device sets; the hotspot correlation measurement module 1102 is also used to identify the overlapping sniffing device identifiers in every two target deduplication sniffing device sets.
  • the device 1100 for mapping wireless hotspots and points of interest described above further includes a POI level division module 1112, which is used to obtain the name of each point of interest to be mapped and to divide multiple points of interest whose names have an inclusion relationship into one point of interest group; the mapping probability propagation module 1104 is also used to calculate the initial mapping probability between the wireless hotspot and the corresponding point of interest group according to the distance between the wireless hotspot and the points of interest in the corresponding point of interest group.
  • the mapping probability propagation module 1104 is also used to select wireless hotspots whose distance from the point of interest is less than a preset value as seed hotspots of the corresponding point of interest; determine that the initial mapping probability between a seed hotspot and the corresponding point of interest is 1; and determine that the initial mapping probability between each wireless hotspot other than the seed hotspots and the point of interest is 0.
  • the device 1100 for mapping wireless hotspots and points of interest described above further includes a statistical area dividing module 1114, which is used to divide the statistical area in which wireless hotspots and points of interest need to be mapped into multiple sub-areas; the POI level division module 1112 is also used to group multiple points of interest in the same sub-area to obtain one or more point of interest groups, and, in adjacent sub-areas, to merge two points of interest whose point of interest names have an inclusion relationship; the mapping probability propagation module 1104 is also used to calculate the initial mapping probability between the wireless hotspot and the corresponding point of interest group according to the distance between the wireless hotspot and each point of interest in the point of interest group.
  • the POI level division module 1112 is also used to obtain the point of interest name of each point of interest in the same sub-area; divide multiple points of interest whose point of interest names have an inclusion relationship into one point of interest group; and, in the point of interest group, determine the point of interest whose name is included in the other names as the parent point of interest.
  • the POI level division module 1112 is also used to, when the point of interest name of a parent point of interest in the current sub-area contains the point of interest name of a parent point of interest in an adjacent sub-area, merge the point of interest group of the parent point of interest in the current sub-area into the point of interest group of the corresponding parent point of interest in the adjacent sub-area; and, when the point of interest name of a parent point of interest in the current sub-area is included in the point of interest name of a parent point of interest in an adjacent sub-area, merge the point of interest group of the parent point of interest in the adjacent sub-area into the point of interest group of the corresponding parent point of interest in the current sub-area.
  • the mapping probability propagation module 1104 is also used to select wireless hotspots whose distance from at least one point of interest in the point of interest group is less than a preset value as seed hotspots of the corresponding point of interest group; determine that the initial mapping probability between a seed hotspot and the corresponding point of interest group is 1; and determine that the initial mapping probability between each wireless hotspot other than the seed hotspots and the point of interest group is 0.
  • the device 1100 for mapping wireless hotspots and points of interest described above further includes a matrix initialization module 1116, which is used to establish a propagation matrix using the overlap degree of sniffing devices between wireless hotspots as matrix elements, and to establish an initial mapping matrix using the initial mapping probability between wireless hotspots and points of interest as matrix elements; the mapping probability propagation module 1104 is also used to multiply the propagation matrix and the initial mapping matrix to calculate an intermediate mapping matrix, reset the intermediate mapping probability between each seed hotspot and the corresponding point of interest in the intermediate mapping matrix to 1, and then use the result as the initial mapping matrix for the next iteration until the iteration stop condition is met, to obtain the target mapping matrix; the target mapping matrix records the target mapping probability between each wireless hotspot and each point of interest.
  • the hotspot interest point mapping module 1106 is also used to remove the wireless hotspots whose target mapping probabilities with all the points of interest are less than a second threshold, and to establish a mapping between each retained wireless hotspot and the point of interest with the highest corresponding target mapping probability.
  • the statistical area dividing module 1114 is used to divide the statistical area in which wireless hotspots and points of interest need to be mapped into multiple sub-areas; the hotspot and interest point mapping module 1106 is also used to establish, according to the target mapping probability between the wireless hotspots and points of interest in the same sub-area, a mapping between the wireless hotspots and points of interest in the corresponding sub-area, and, by fusing the data mapped between wireless hotspots and points of interest in all sub-areas, to establish the mapping between each wireless hotspot and point of interest in the statistical area.
  • the above-mentioned mapping device for wireless hotspots and points of interest establishes the mapping relationship between wireless hotspots and points of interest based on the sniffing records of wireless hotspots, without the need to manually collect and report POI visit data. This not only improves the mapping efficiency, but also reduces the dependence on the names of wireless hotspots and points of interest, so that the mapping method is widely applicable and can also improve the recall rate of wireless hotspots.
  • the correlation between wireless hotspots is measured based on the overlap degree of sniffing devices, which can assist in judging the user's flow attributes between points of interest, retain the user's spatial behavior characteristic information, and better realize the distinction of wireless hotspots in space.
  • the mapping between the wireless hotspot and the point of interest is established, which can improve the accuracy of the mapping.
  • Fig. 13 shows an internal structure diagram of a computer device in an embodiment.
  • the computer device may specifically be the terminal 110 or the server 120 in FIG. 1.
  • the computer device includes a processor, a memory, and a network interface connected through a system bus.
  • the memory includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium of the computer device stores an operating system and may also store a computer program. When the computer program is executed by the processor, the processor can realize the mapping method between wireless hotspots and points of interest.
  • a computer program can also be stored in the internal memory, and when the computer program is executed by the processor, the processor can execute the mapping method between wireless hotspots and points of interest.
  • FIG. 13 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied. A specific computer device may include more or fewer parts than shown in the figure, combine some parts, or have a different arrangement of parts.
  • the device for mapping wireless hotspots and points of interest provided in the present application can be implemented in the form of a computer program, and the computer program can be run on a computer device as shown in FIG. 13.
  • the memory of the computer device can store the various program modules that make up the wireless hotspot and interest point mapping device, such as the hotspot correlation measurement module, the mapping probability propagation module, and the hotspot interest point mapping module shown in FIG. 11.
  • the computer program composed of each program module causes the processor to execute the steps in the wireless hotspot and interest point mapping method described in the specification in each embodiment of the present application.
  • the computer device shown in FIG. 13 may perform steps S202 and S204 through the hotspot correlation measurement module in the wireless hotspot and interest point mapping apparatus shown in FIG. 11.
  • the computer device can execute steps S206 and S208 through the mapping probability propagation module.
  • the computer device may execute step S210 through the hotspot and interest point mapping module.
  • a computer device is provided, including a memory and a processor, where the memory stores a computer program; when the computer program is executed by the processor, the processor executes the steps of the above-mentioned method for mapping wireless hotspots and points of interest.
  • the steps of the method for mapping wireless hotspots and points of interest may be the steps in the method for mapping wireless hotspots and points of interest in each of the foregoing embodiments.
  • a computer-readable storage medium which stores a computer program, and when the computer program is executed by a processor, the processor executes the steps of the above-mentioned method for mapping wireless hotspots and points of interest.
  • the steps of the method for mapping wireless hotspots and points of interest may be the steps in the method for mapping wireless hotspots and points of interest in each of the foregoing embodiments.
  • a computer program product or computer program includes computer instructions stored in a computer-readable storage medium; when the processor of an electronic device reads the computer instructions from the computer-readable storage medium and executes them, the electronic device is caused to execute the steps of the method for mapping wireless hotspots and points of interest.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Cardiology (AREA)
  • Health & Medical Sciences (AREA)
  • Remote Sensing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

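A method and an apparatus for mapping wireless hotspots to points of interest, a computer-readable storage medium, and a computer device. The method includes: obtaining sniffing records, each sniffing record containing data of the wireless hotspots sniffed by a sniffing device (S202); determining, according to the sniffing records, a sniffing device overlap degree between wireless hotspots (S204); determining an initial mapping probability between a wireless hotspot and a point of interest according to the distance between them (S206); iteratively propagating the initial mapping probabilities based on the sniffing device overlap degree, and obtaining a target mapping probability between the wireless hotspot and the point of interest when the iteration ends (S208); and establishing a mapping between the wireless hotspot and the point of interest according to the target mapping probability (S210).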

Description

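Method and apparatus for mapping wireless hotspots to points of interest, computer-readable storage medium, and computer device
This application claims priority to Chinese Patent Application No. 2020100722893, filed with the China Patent Office on January 21, 2020 and entitled "无线热点与兴趣点的映射方法、装置和存储介质" (method, apparatus, and storage medium for mapping wireless hotspots to points of interest), which is incorporated herein by reference in its entirety.
Technical Field
This application relates to the field of computer technologies, and in particular to a method and an apparatus for mapping wireless hotspots to points of interest, a computer-readable storage medium, and a computer device.
Background
With the continued development and spread of the mobile internet and mobile devices, wireless hotspots have become a standard facility and service for individuals, households, enterprises, and service industries such as catering, hotels, and retail. A wireless hotspot provides internet access over a wireless local area network to users within a certain distance. If a user has sniffed or connected to a wireless hotspot, the user can be considered to have visited the point of interest (POI) where that hotspot is located. Building a mapping between wireless hotspots and POIs is therefore important for mining crowd activity patterns, store location selection, traffic planning, and similar tasks.
Conventional approaches build the mapping between wireless hotspots and POIs mainly from names. Name-based mapping requires a strong correlation between the hotspot name and the POI name. In practice, however, most hotspot names are user-defined and correlate only weakly with POI names, so name-based mapping applies to very few scenarios.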
发明内容
根据本申请的各种实施例,提供了一种无线热点与兴趣点的映射方法、装置、计算机可读存储介质和计算机设备。
一种无线热点与兴趣点的映射方法,由计算机设备执行,所述方法包括:
获取嗅探记录,所述嗅探记录包括嗅探设备所嗅探到的无线热点的数据;
根据所述嗅探记录,确定嗅探设备重叠度;
根据所述无线热点与相应的兴趣点的距离,确定所述无线热点与所述相应的兴趣点间的初始映射概率;
基于所述嗅探设备重叠度进行所述初始映射概率之间的迭代传播,在迭代结束时得到所述无线热点与所述兴趣点之间的目标映射概率;
根据所述目标映射概率,建立所述无线热点与所述兴趣点之间的映射。
一种无线热点与兴趣点的映射装置,所述装置包括:
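a hotspot correlation measurement module, configured to obtain sniffing records, the sniffing records including data of the wireless hotspots sniffed by sniffing devices, and to determine a sniffing device overlap degree according to the sniffing records;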
映射概率传播模块,用于根据所述无线热点与相应的兴趣点的距离,确定所述无线热点与所述相应的兴趣点间的初始映射概率;基于所述嗅探设备重叠度进行所述初始映射概率之间的迭代传播,在迭代结束时得到所述无线热点与所述兴趣点之间的目标映射概率;
热点兴趣点映射模块,用于根据所述目标映射概率,建立所述无线热点与所述兴趣点之间的映射。
一种计算机可读存储介质,存储有计算机程序,所述计算机程序被处理器执行时,使得所述处理器执行所述无线热点与兴趣点的映射方法的步骤。
一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,所述计算机程序被所述处理器执行时,使得所述处理器执行所述无线热点与兴趣点的映射方法的步骤。
一种计算机程序产品或计算机程序,所述计算机程序产品或计算机程序包括计算机指令,所述计算机指令存储在计算机可读存储介质中;电子装置的处理器从所述计算机可读存储介质读取并执行所述计算机指令时,使得所述电子装置执行所述无线热点与兴趣点的映射方法的步骤。
本申请的一个或多个实施例的细节在下面的附图和描述中提出。本申请的其它特征和优点将从说明书、附图以及权利要求书变得明显。
附图说明
此处所说明的附图用来提供对本申请的进一步理解,构成本申请的一部分,本申请的示意性实施例及其说明用于解释本申请,并不构成对本申请的不当限定。在附图中:
图1为一个实施例中无线热点与兴趣点的映射方法的应用环境图;
图2为一个实施例中无线热点与兴趣点的映射方法的流程示意图;
图3为一个实施例中基于标签传播算法进行映射概率传播时采用的完全图的示意图;
图4为另一个实施例中无线热点与兴趣点的映射方法的流程示意图;
图5为一个实施例中引入兴趣点层级关系的原理示意图;
图6为又一个实施例中无线热点与兴趣点的映射方法的流程示意图;
图7为一个实施例中划分为多个子区域的统计区域的示意图;
图8为再一个实施例中无线热点与兴趣点的映射方法的流程示意图;
图9为一个具体实施例中无线热点与兴趣点的映射方法的流程示意图;
图10为另一个具体实施例中无线热点与兴趣点的映射方法的流程示意图;
图11为一个实施例中无线热点与兴趣点的映射装置的结构框图;
图12为另一个实施例中无线热点与兴趣点的映射装置的结构框图;
图13为一个实施例中计算机设备的结构框图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处所描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。
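FIG. 1 is a diagram of an application environment of the method for mapping wireless hotspots to points of interest in one embodiment. Referring to FIG. 1, the method is applied to a system for mapping wireless hotspots to points of interest. The system includes a terminal 110, a server 120, and a sniffing device 130. The terminal 110 and the server 120 are connected through a network. The terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, a smart wearable device, and the like. The server 120 may be implemented as an independent server or as a cluster of multiple servers. The sniffing device 130 is a device capable of sniffing and connecting to wireless hotspots, such as a mobile phone, a computer, a smart wearable device, or an e-reader. The sniffing device 130 reports its sniffing records of wireless hotspots directly to the terminal 110 or the server 120, or to another storage device from which the terminal 110 or the server 120 pulls them. The terminal 110 and the server 120 may each independently execute the mapping method provided in the embodiments of this application based on the sniffing records, or may execute it cooperatively.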
如图2所示,在一个实施例中,提供了一种无线热点与兴趣点的映射方法。本实施例主要以该方法应用于计算机设备来举例说明,该计算机设备具体可以是上图中的终端110或者服务器120。参照图2,该无线热点与兴趣点的映射方法具体包括如下步骤:
S202,获取嗅探记录;嗅探记录包括嗅探设备所嗅探到的无线热点的数据。
其中,嗅探记录是指嗅探设备在嗅探到无线热点时上报的数据。无线热点可以是通过无线接入点(AP,Access Point)或路由器提供的WiFi(Wireless-Fidelity)网络,也可以是通过移动终端等设备提供的移动热点,如手机WiFi热点、车载WiFi热点等。
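A sniffing record includes the device identifier of the sniffing device, the time at which the record was generated, and the data of each wireless hotspot sniffed by the device. The data of a wireless hotspot includes its name, position coordinates, signal strength, and the like. The name of a wireless hotspot is the SSID (Service Set Identifier) that it broadcasts, which may be a character string defined by the user who provides the hotspot, such as "TP-LINK-XX" or "光明小区-13". The position coordinates of a wireless hotspot are spherical coordinates (lon, lat) that give the ground position of the hotspot by longitude lon and latitude lat. The position coordinates may be astronomical, geodetic, or geocentric longitude and latitude. For example, the sniffing record reported by sniffing device A at time t1 may be [A,t1,(TP-LINK-YY,光明小区-13,3-506A,Yuij99),[(114.32,30.51),(110.22,35.09),(11.32,31.77),(109.92,30.01)]].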
用户携带嗅探设备在某个位置停留或进行位置移动时,嗅探设备对周围存在的无线热点进行嗅探,并将嗅探到的无线热点以列表的形式展示给用户,用户可以选择其中一个无线热点进行连接。每个嗅探设备按照预设时间频率基于嗅探到的无线热点的数据生成嗅探记录,并将生成的嗅探记录上报至指定设备。
具体地,计算机设备通过USB(Universal Serial Bus,通用串行总线)接口连接或网络连接等通信方式从指定设备拉取统计区域在统计时段内的嗅探记录。统计区域是需要对区域内无线热点和兴趣点进行映射的地理区域。统计区域的区域边界可以根据统计需求自由限定,如整个国家的领土地域,某个省份或城镇的地域等。在数字地图中,统计区域可以是多个连续坐标点围成的封闭轮廓内的区域。统计时段是指进行无线热点和兴趣点映射时所依赖嗅探记录的生成时间的时间跨度,包括统计起始时间和统计结束时间。统计时段的时间长度太短会影响最终无线热点与兴趣点的映射准确性;时间长度过长会导致数据计算量增大。统计时段的时间长度需要根据需求合理设定,如1个月等。
在一个实施例中,计算机设备将所获取的仅包含一个无线热点的数据的嗅探记录剔除。可以理解,本申请的嗅探记录用于度量无线热点之间的相关性,仅包含单个无线热点数据的嗅探记录对度量无线热点之间的相关性并不具分析价值,剔除该类嗅探记录可以在不影响相关性分析准确性的情况下,减小需要处理的嗅探记录数据量。
在一个实施例中,嗅探设备也可以将产生的嗅探记录直接上报至用于无线热点和兴趣点映射的计算机设备。
在一个实施例中,无线热点还可以是通过近场通信方式提供的点对点传输网络,如BLE(Bluetooth Low Energy,蓝牙低能耗)、NFC(near field communication,近场通信)或RFID(Radio Frequency Identification,无线射频识别)等无线连接网络。
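S204: Determine the sniffing device overlap degree according to the sniffing records.
A wireless hotspot has a corresponding sniffing range, for example 50 meters around it. Each wireless hotspot can be sniffed by any sniffing device within that range. Hotspots that are close to each other may therefore be sniffed by the same device, and the data of one hotspot may appear in several different sniffing records. A sniffing device and a wireless hotspot that appear in the same sniffing record are considered to be associated. The sniffing devices associated with different wireless hotspots may overlap.
The sniffing device overlap degree is a value that reflects the extent to which the sniffing devices associated with wireless hotspots overlap. For two wireless hotspots it may be any of the following: the number of sniffing devices shared by the two hotspots as a proportion of all sniffing devices associated with one of them; the deduplicated total number of sniffing devices associated with the two hotspots as a proportion of all sniffing devices associated with one of them; or the number of shared sniffing devices as a proportion of the deduplicated total number of sniffing devices associated with the two hotspots.
Specifically, the computer device determines the wireless hotspots involved in the pulled sniffing records and the sniffing devices associated with each hotspot. It counts the total number NUM_i of sniffing devices associated with each wireless hotspot WiFi_i, and the number NUM_ij of sniffing devices shared by every two wireless hotspots WiFi_i and WiFi_j, where i and j are integers greater than 0.
The computer device may calculate the proportion of the number of shared sniffing devices NUM_ij to the total number NUM_i of sniffing devices associated with WiFi_i, and take NUM_ij/NUM_i as the sniffing device overlap degree W_ij of WiFi_i relative to WiFi_j. Likewise, it takes NUM_ij/NUM_j as the sniffing device overlap degree W_ji of WiFi_j relative to WiFi_i.
In one embodiment, the computer device may instead calculate the deduplicated total number NUM_i+NUM_j-NUM_ij of sniffing devices associated with WiFi_i and WiFi_j, and take (NUM_i+NUM_j-NUM_ij)/NUM_i as the sniffing device overlap degree W_ij of WiFi_i relative to WiFi_j, and (NUM_i+NUM_j-NUM_ij)/NUM_j as the sniffing device overlap degree W_ji of WiFi_j relative to WiFi_i.
In one embodiment, the computer device calculates the proportion of the number of shared sniffing devices NUM_ij to the deduplicated total number NUM_i+NUM_j-NUM_ij of sniffing devices associated with WiFi_i and WiFi_j, and takes NUM_ij/(NUM_i+NUM_j-NUM_ij) as the sniffing device overlap degree W_i+j of WiFi_i and WiFi_j. In this case the overlap degree W_ij of WiFi_i relative to WiFi_j and the overlap degree W_ji of WiFi_j relative to WiFi_i are the same, both equal to W_i+j.
In one embodiment, a sniffing record contains a sniffing device identifier and the hotspot names of at least two wireless hotspots, and determining the sniffing device overlap degree between wireless hotspots includes: determining, based on the sniffing device identifiers, a deduplicated sniffing device set for each hotspot name; identifying the sniffing device identifiers shared by every two deduplicated sniffing device sets; and determining the sniffing device overlap degree of the two corresponding wireless hotspots based on the number of shared identifiers and the number of identifiers in the corresponding deduplicated sniffing device set.
A sniffing device identifier is information that uniquely identifies a sniffing device. The identifier in a sniffing record may already have been irreversibly encrypted by the data producer. It may be a SUPI (Subscription Permanent Identifier), a GPSI (Generic Public Subscription Identifier), a PEI (Permanent Equipment Identifier), or the like. When the SUPI takes the value 0, the sniffing device identifier is an IMSI (International Mobile Subscriber Identification Number); when the SUPI takes the value 1, it is an NAI (Network Access Identifier).
A sniffing device set is a set of one or more sniffing device identifiers, and each set corresponds to the hotspot name of one wireless hotspot. The same sniffing device may report several sniffing records during the statistical period, and these records may contain the same wireless hotspot, so identifiers in the set of a hotspot may repeat. A deduplicated sniffing device set is a sniffing device set from which repeated identifiers have been removed.
Specifically, the computer device parses the pulled sniffing records and builds the sniffing device set of each wireless hotspot involved. Although one user may carry several sniffing devices at once, each sniffing device can be assumed with fairly high confidence to represent exactly one user. Because the sniffing records store the sniffing device identifiers directly, users can be distinguished well on that basis. The computer device deduplicates the identifiers in each sniffing device set to obtain the deduplicated sniffing device set.
Further, the computer device compares every two deduplicated sniffing device sets and identifies the identifiers that appear in both. It counts these shared identifiers and calculates their proportion to the total number of identifiers in the deduplicated set of the target hotspot of the pair, and takes this proportion as the sniffing device overlap degree of the target hotspot relative to the other hotspot. For example, after comparing the deduplicated sets of WiFi_i and WiFi_j, the overlap degree of WiFi_i relative to WiFi_j is W_ij = (number of deduplicated sniffing device identifiers of WiFi_i + number of deduplicated sniffing device identifiers of WiFi_j - number of deduplicated sniffing device identifiers of WiFi_i and WiFi_j combined) / number of deduplicated sniffing device identifiers of WiFi_i. The numerator is the number of identifiers shared by the two hotspots.
In this embodiment, the sniffing device overlap degree measures the correlation between wireless hotspots. The resulting value lies in [0,1] and needs no normalization. More importantly, because each sniffing device can be taken to represent one user, the overlap degree reflects with fairly high confidence how similar the users of different hotspots are, and can help determine user attributes such as transient or resident. Compared with measuring correlation only by hotspot position coordinates or by the number of co-occurrences in sniffing records, this approach retains information about users' spatial behavior and separates hotspots better by location. It addresses the difficulty of distinguishing the wireless hotspots of adjacent points of interest, so the measured correlation is more objective, stable, and reliable.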
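The set-based overlap computation above can be sketched as follows. This is a minimal illustration rather than the implementation of this application: the record layout `(device_id, [hotspot names])` and the names `device_sets` and `overlap` are assumptions made for the example, and only the first variant, W_ij = NUM_ij / NUM_i, is shown.

```python
from collections import defaultdict


def device_sets(records):
    """Build the deduplicated sniffing-device set of every hotspot name.

    records: iterable of (device_id, [hotspot_name, ...]) pairs (assumed layout).
    """
    sets = defaultdict(set)
    for device_id, hotspot_names in records:
        if len(hotspot_names) < 2:
            continue  # single-hotspot records say nothing about co-occurrence
        for name in hotspot_names:
            sets[name].add(device_id)  # a set deduplicates repeated reports
    return sets


def overlap(sets, i, j):
    """Overlap degree of hotspot i relative to hotspot j: shared devices / devices of i."""
    return len(sets[i] & sets[j]) / len(sets[i])


records = [
    ("dev1", ["A", "B"]),
    ("dev2", ["A", "B"]),
    ("dev3", ["A", "C"]),
    ("dev1", ["A", "B"]),  # repeated report from the same device
    ("dev4", ["B"]),       # dropped: contains a single hotspot
]
sets = device_sets(records)
w_ab = overlap(sets, "A", "B")  # 2 shared devices out of 3 for A
w_ba = overlap(sets, "B", "A")  # 2 shared devices out of 2 for B
```

The measure is asymmetric: hotspot B shares all of its devices with A, while A shares only two of its three with B.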
S206,根据无线热点与相应的兴趣点的距离,确定无线热点与相应的兴趣点间的初始映射概率。
其中,兴趣点POI是指地理信息系统中的地标、景点等,如某地区的政府部门、社区建筑、商业机构(如加油站、百货公司、超市、餐厅、酒店、便利店、邮筒、医院等)、名胜古迹、旅游景点(如公园、公共厕所等)、交通设施(如车站、停车场、收费站、速限标示)等地物。
映射概率是指无线热点归属映射于某个POI的概率。初始映射概率从空间距离的维度上反映了无线热点与兴趣点的关联度,换言之,单纯从空间距离的维度考虑,无线热点归属映射于相应POI的概率为初始映射概率。可以理解,出现在同一嗅探记录中的多个无线热点位置相近。位置相近的无线热点所关联的嗅探设备的重叠度高且归属于同一个POI的概率较大。
具体地,计算机设备获取兴趣点的数据。兴趣点的数据可以是从第三方渠道商获取得到,也可以通过网络爬虫获取得到,对于兴趣点数据的获取方式不作限制。兴趣点的数据包括兴趣点名称、兴趣点位置等。兴趣点名称是人们对兴趣点的命名,如“星空小学校”、“中国技术交易大厦”等。兴趣点位置可以是兴趣点的地理坐标。计算机设备根据无线热点的地理坐标和兴趣点的地理坐标,计算统计区域内每个无线热点与各兴趣点的距离。
进一步地,计算机设备对无线热点和各兴趣点的距离进行归一化处理,将距离值转换为[0,1]范围的概率值,将该概率值作为相应无线热点与兴趣点之间的初始映射概率。归一化处理所采用的方法具体可以是01标准化、Z-score标准化、sigmoid函数标准化等。可以理解,计算机设备还可以采用其他方式确定兴趣点与无线热点之间的初始映射概率,比如,还可以确定统计区域内无线热点与兴趣点的最远距离,将当前无线热点与兴趣点之间的距离与该最远距离的比值作为当前无线热点与兴趣点之间的初始映射概率。
在一个实施例中,兴趣点位置也可以是用于描述兴趣点所在位置的地址文本。地址文本是用于描述兴趣点的地理位置信息的文本,如“北京市海淀区中关村海淀大街麦当劳”。计算机设备基于地址编码服务将地址文本转换为兴趣点的地理坐标。地理编码服务得到的地理坐标与地址文本一一对应。计算机设备也可以基于坐标检索服务搜索每个地址文本对应的地理坐标。坐标检索服务得到地理坐标与地址文本可能是一对一的关系,也可能是多对一的关系。换言之,基于坐标检索服务可得到每个地址文本对应的一个或多个地理坐标。不同的坐标检索服务提供方提供不同的坐标检索方式。比如,百度地图、谷歌地图等提供的坐标检索方式不同。
当转换得到的地理坐标有多个时,计算机设备识别地址文本中的关键地址元素。关键地址元素是指能够使地址文本描述的地址位置信息处于收敛状态的地址元素。收敛状态是指可从大量分散的可能区域集中精准定位到某个可能区域的状态。关键地址元素具体可以是将地理位置从大量POI中限缩至其中一个或几个POI的POI前缀。比如,在上述举例的地址文本中POI“麦当劳”在北京市有很多,但是“海淀大街麦当劳”或者“中关村麦当劳”则只有少量的几个, 表明海淀大街或中关村是能够帮助地址文本描述的地理位置信息收敛的关键地址元素。最终,计算机设备可以根据关键地址元素从转换得到的多个地理坐标中筛选目标的一个地理坐标。
S208,基于嗅探设备重叠度进行初始映射概率之间的迭代传播,在迭代结束时得到无线热点与兴趣点之间的目标映射概率。
其中,本申请目的在于建立无线热点与兴趣点之间的映射关系,且需要保证每个无线热点只能归属映射于一个兴趣点,属于非重叠区域划分问题。用于解决非重叠区域划分问题的算法具体可以采用基于模块度优化的区域划分、基于谱分析的区域划分、基于信息论的区域划分、基于标签传播的区域划分等。其中,基于标签传播的区域划分算法可以是LPA(Label Propagation Algorithm,标签传播算法)、COPRA(基于标签传递的重叠社区发现算法)、SLPA(Speaker-listener Label Propagation Algorithm,社区发现算法)等。
具体地,假设根据拉取的统计区域在统计时段产生的嗅探记录,解析发现统计区域内部署了n个无线热点;根据兴趣点的数据,发现统计区域包含有m个POI。当需要计算目标无线热点WiFi i(0≤i<n)与兴趣点POIj(0≤j<m)的目标映射概率时,计算机设备可以基于上述非重叠区域划分算法,将统计区域内除了目标无线热点WiFi i的其他无线热点分别与目标无线热点WiFi i之间的嗅探设备重叠度作为传播权重,将统计区域内每个其他无线热点与兴趣点POIj的初始映射概率以传播权重传播叠加在目标无线热点WiFi i与兴趣点POIj的初始映射概率上,得到目标无线热点WiFi i与兴趣点POIj的中间映射概率,将该中间映射概率作为目标无线热点WiFi i与兴趣点POIj的初始映射概率进行迭代,直至达到预设的迭代停止条件。迭代停止条件可以是中间映射概率收敛,或达到设定的最大迭代次数等。计算机设备将停止迭代时计算得到的中间映射概率确认为相应无线热点与兴趣点之间的目标映射概率。
以标签传播算法LPA为例,标签传播算法LPA是一种基于图的半监督学习方法,其基本思路是用已标记节点的标签信息去预测未标记节点的标签信息。利用样本间的关系建立关系完全图模型,在完全图中,节点包括已标注和未标注数据,其边表示两个节点的相似度,节点的标签按相似度传递给其他节点。 标签数据就像是一个源头,可以对无标签数据进行标注,节点的相似度越大,标签越容易传播。
参考图3所示,计算机设备可以以每组待映射的无线热点与兴趣点POI作为节点302建立完全图。完全图中每个节点具有对应的标签信息,用于连接节点的每个连接边304具有对应的边权重。在本申请的实施例中,节点对应的标签信息是指相应无线热点与兴趣点POI之间的初始映射概率;节点之间连接边对应的边权重是指对应两个无线热点之间的嗅探设备重叠度。
在一个实施例中,上述无线热点与兴趣点的映射方法还包括:将无线热点之间的嗅探设备重叠度作为矩阵元素建立传播矩阵;将无线热点与兴趣点间的初始映射概率作为矩阵元素建立初始映射矩阵;基于嗅探设备重叠度进行初始映射概率之间的迭代传播,在迭代结束时得到无线热点与兴趣点间的目标映射概率包括:将传播矩阵与初始映射矩阵相乘,计算得到概率传播后的映射矩阵;将概率传播后的映射矩阵作为初始映射矩阵进行迭代,直至满足迭代停止条件时停止迭代,得到目标映射矩阵;目标映射矩阵记录了各无线热点与兴趣点之间的目标映射概率。
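Specifically, the computer device initializes the propagation matrix according to the number of wireless hotspots in the statistical region. If n wireless hotspots are deployed in the region, the propagation matrix may be an n×n two-dimensional matrix W. All elements W_ij of the initialized matrix default to 0. Element W_ij records the correlation of wireless hotspot WiFi_i relative to wireless hotspot WiFi_j. After calculating the sniffing device overlap degrees between the wireless hotspots, the computer device fills them into the initialized propagation matrix W as matrix elements.
The computer device initializes the mapping matrix according to the numbers of wireless hotspots and points of interest in the statistical region. If the region contains n wireless hotspots and m POIs, the mapping matrix may be an n×m two-dimensional matrix L^0, where the superscript 0 denotes the initialized state. All elements L^0_ij of the initialized mapping matrix default to 0. Element L^0_ij records the correlation between wireless hotspot WiFi_i and point of interest POI_j. After calculating the initial mapping probabilities between the wireless hotspots and the points of interest, the computer device fills them into the initialized mapping matrix L^0 as matrix elements.
Further, the computer device multiplies the propagation matrix W by the initialized mapping matrix L^0 to obtain the mapping matrix after probability propagation, L^1 = W × L^0. The computer device then iterates with L^1 as the initial mapping matrix until the iteration stop condition is met, which yields the target mapping matrix L. The target mapping matrix records the target mapping probability between each wireless hotspot and each point of interest.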
S210,根据目标映射概率,建立无线热点与兴趣点之间的映射。
具体地,目标映射概率反映了一个无线热点在地址位置上归属于一个POI的概率。计算机设备确定每个无线热点所对应的目标映射概率最大的POI,记作目标POI。计算机设备建立无线热点与目标POI之间的映射。换言之,计算机设备建立无线热点WiFi i与目标映射矩阵中第i行中最大元素值对应的POI之间的映射。
在一个实施例中,根据目标映射概率,建立无线热点与兴趣点之间的映射包括:在所有兴趣点中,剔除目标映射概率均小于第二阈值的无线热点;建立所保留的每个无线热点与对应目标映射概率最大的兴趣点间的映射。
其中,第二阈值是设定的用于判定是否需要将一个无线热点WiFi i映射至某个POI的目标映射概率最小值。第二阈值的大小可以根据需求自由设定。
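Specifically, the computer device traverses the target mapping probabilities between wireless hotspot WiFi_i and every POI and checks whether any of them reaches the second threshold. When the target mapping probabilities between WiFi_i and every POI are all less than the second threshold, the computer device treats WiFi_i as noise and removes it. For each retained wireless hotspot, it establishes a mapping to the POI whose target mapping probability is the largest and reaches the second threshold. The mapping established by the method provided in this application can be used to identify user visits and to mine crowd activity patterns, which in turn supports business and policy decisions such as store location selection and traffic planning.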
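A minimal sketch of this selection step, assuming the target mapping matrix is held as a list of rows (one row per hotspot, one column per POI). The function name `build_mapping` and the sample values are illustrative only.

```python
def build_mapping(target, second_threshold):
    """Map each hotspot to its most probable POI, dropping noise hotspots.

    target: list of rows; target[i][j] is the target mapping probability
    between hotspot i and POI j.
    """
    mapping = {}
    for i, row in enumerate(target):
        best_j = max(range(len(row)), key=row.__getitem__)  # column of the row maximum
        if row[best_j] >= second_threshold:
            mapping[i] = best_j
        # otherwise every probability is below the threshold: hotspot i is noise
    return mapping


target = [
    [0.90, 0.10],
    [0.40, 0.35],  # all below 0.7, treated as noise
    [0.20, 0.75],
]
mapping = build_mapping(target, second_threshold=0.7)
```

Here hotspots 0 and 2 are mapped to POI 0 and POI 1 respectively, and hotspot 1 is left unmapped.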
本实施例中,剔除与每个兴趣点之间的目标映射概率均小于第二阈值的噪声无线热点,可以提高所建立的映射关系的准确可靠性。
上述无线热点与兴趣点的映射方法,基于无线热点的嗅探记录建立无线热点与兴趣点的映射关系,无需人工采集上报POI到访数据,不仅提高映射效率;由于减少了对无线热点和兴趣点名称的依赖,使这种映射方式适用范围广,还可以提高无线热点召回率。基于嗅探设备重叠度度量无线热点之间的相关性, 可以辅助判断用户在兴趣点之间的流动属性,保留了用户的空间行为特征信息,可以更好的实现无线热点在空间位置上的区分,从而使反映出的无线热点相关性可靠性更高。进而,综合无线热点与兴趣点的距离,以及无线热点之间的嗅探设备重叠度,建立无线热点及兴趣点之间的映射,可以提高映射准确性。
在一个实施例中,上述无线热点与兴趣点的映射方法还包括:根据无线热点在不同嗅探记录中的位置变化,识别无线热点中的移动热点;剔除每条嗅探记录中关于移动热点的数据;根据无线热点的嗅探记录,确定嗅探设备重叠度包括:根据完成数据剔除的嗅探记录,确定嗅探设备重叠度。
其中,根据无线热点的位置移动属性,可以将无线热点区分为移动热点和稳定热点。移动热点是热点位置会随着时间变化的无线热点,比如,手机WiFi热点、车载WiFi热点等。移动热点在不同时间可能处于不同POI的位置,难以建立移动热点与POI之间的稳定映射关系。
具体地,为了提高所建立的无线热点与兴趣点之间映射关系的准确性,计算机设备识别统计区域所涉及无线热点中的移动热点。移动热点和稳定热点的识别方法有多种,比如,可以根据统计时段内的嗅探记录反推出每个无线热点在不同时间点的位置坐标,计算相邻时间点的位置变化值,当存在预设数量的位置变化值大于预设值时,可以判定为该无线热点为移动热点。
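In one embodiment, the computer device may cluster the position coordinates of a wireless hotspot at different time points, calculate a clustering feature for each geographic coordinate with a clustering algorithm, and determine one cluster center among the coordinates according to the clustering features. A clustering feature characterizes how strongly the coordinates aggregate around a given coordinate, for example a Gaussian density value. The larger the Gaussian density value, the more aggregated the corresponding coordinate is, and the better it serves as the cluster center. The cluster center is the coordinate with the highest aggregation among the coordinates. Clustering algorithms include, for example, k-means (a partition-based clustering method), fuzzy clustering, DBSCAN (Density-Based Spatial Clustering of Applications with Noise, a density-based clustering algorithm), and Fast Search and Find of Density Peaks (a clustering algorithm based on fast search for density peaks).
The computer device calculates the ratio of the number of geographic coordinates whose distance to the cluster center is less than a target value to the total number of geographic coordinates of the wireless hotspot over all time points. This ratio reflects how concentrated the coordinates of the hotspot are over all time points. When the concentration is less than a preset value, the computer device determines that the wireless hotspot is a mobile hotspot.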
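The concentration test can be sketched as below. This is a simplified stand-in, not the clustering procedure of this application: instead of a Gaussian density value, the cluster center is taken to be the observation with the most neighbours inside the radius, and the radius (100 m) and concentration threshold (0.8) are assumed values chosen for the example.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in metres between two (lon, lat) points given in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def concentration(points, radius_m):
    """Share of observations lying within radius_m of the densest observation."""
    def neighbours(center):
        return sum(1 for p in points if haversine_m(center, p) < radius_m)
    return max(neighbours(c) for c in points) / len(points)


def is_mobile(points, radius_m=100.0, min_concentration=0.8):
    """A hotspot whose observed positions are not concentrated is treated as mobile."""
    return concentration(points, radius_m) < min_concentration


fixed = [(114.3200, 30.5100), (114.3201, 30.5101), (114.3199, 30.5100), (114.3200, 30.5099)]
moving = [(114.32, 30.51), (114.40, 30.55), (114.10, 30.60), (114.25, 30.45)]
```

The positions in `fixed` lie within a few tens of metres of each other, so the hotspot is kept; the positions in `moving` are kilometres apart, so the hotspot is flagged as mobile.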
计算机设备剔除每条嗅探记录中关于移动热点的数据,后续只需计算稳定热点之间的嗅探设备重叠度,只需计算稳定热点与兴趣点之间初始映射概率和目标映射概率,建立稳定热点与兴趣点之间的映射。
本实施例中,根据无线热点在不同嗅探记录中的位置变化,对移动热点进行识别,过滤了嗅探记录中的噪音数据,在提高所建立映射关系稳定性的同时,精准缩小了需要进行映射处理的数据量,提高映射效率,节约计算机设备数据处理资源。
在一个实施例中,上述无线热点与兴趣点的映射方法还包括:剔除去重嗅探设备集合中嗅探设备标识数量小于第一阈值的无线热点,得到目标的去重嗅探设备集合;识别每两个嗅探设备集合中重叠的嗅探设备标识包括:识别保留的每两个目标的去重嗅探设备集合中重叠的嗅探设备标识。
其中,第一阈值是设定的用于判定是否需要将一个无线热点WiFi i映射至某个POI的嗅探设备标识数量最小值。第一阈值的大小可以根据需求自由设定。
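Specifically, the computer device counts the sniffing device identifiers in each sniffing device set to determine the number of visiting users of each wireless hotspot, and checks for each wireless hotspot WiFi_i whether the number of associated sniffing device identifiers reaches the first threshold. When the number is less than the first threshold, either the hotspot may have malfunctioned at some point during the statistical period, so that the sniffing devices carried by visiting users could not sniff it, or the hotspot simply has few visiting users. The computer device treats such hotspots as faulty hotspots or unpopular hotspots and removes them. Subsequently, only the sniffing device overlap degrees between the retained wireless hotspots, and the initial and target mapping probabilities between the retained wireless hotspots and the points of interest, need to be calculated, and mappings are established only for the retained hotspots.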
本实施例中,对于嗅探设备标识数量少的无线热点不纳入嗅探设备重叠度统计范畴,提高嗅探设备重叠度准确性同时降低了传播矩阵W n*n的维度,有助于提高映射效率。
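In one embodiment, calculating the initial mapping probability between a wireless hotspot and the corresponding point of interest according to their distance includes: selecting the wireless hotspots whose distance to a point of interest is less than a preset value as seed hotspots of that point of interest; setting the initial mapping probability between a seed hotspot and its point of interest to 1; and setting the initial mapping probability between every wireless hotspot other than the seed hotspots and the points of interest to 0.
The preset value is the maximum distance at which a wireless hotspot can be treated as a seed hotspot of a point of interest. A seed hotspot is a wireless hotspot whose distance to at least one point of interest is less than the preset value, so that the point of interest it belongs to can be determined from distance alone with fairly high confidence. In fact, for a seed hotspot the computer device can already establish the mapping to its point of interest at this stage. Wireless hotspots other than the seed hotspots are called hotspots to be mapped. A hotspot to be mapped is at a distance greater than or equal to the preset value from every point of interest, so distance alone cannot determine which point of interest it belongs to.
It should be emphasized that a point of interest may have several seed hotspots, but a wireless hotspot can be a seed hotspot of only one point of interest. In other words, in each row of the initialized mapping matrix L^0, wireless hotspot WiFi_i has an initial mapping probability of 1 with at most one POI. When a wireless hotspot is within the preset value of several points of interest, it may be assigned as a seed hotspot of the nearest one. Because the preset value is the threshold for identifying seed hotspots and each hotspot can seed only one point of interest, it should not be too large; it may be a distance far smaller than the coverage range of a wireless hotspot, such as 20 m.
Specifically, the computer device determines a number for each POI. The numbers may range from 0 to m-1, where m is the number of POIs in the statistical region. The numbers may be assigned randomly by the computer device or according to the position coordinates of the POIs, for example in order of decreasing longitude and/or latitude. The POI number can then be used directly as the column index of the mapping matrix L^0.
Further, the computer device traverses the points of interest in order of their numbers and checks whether the distance between each wireless hotspot WiFi_i and each point of interest POI_j is less than the preset value. When the distance between WiFi_i and POI_j is greater than or equal to the preset value, the computer device sets the initial mapping probability between WiFi_i and POI_j to 0. When the distance is less than the preset value, the computer device sets the initial mapping probability between WiFi_i and POI_j to 1 and sets the initial mapping probability between WiFi_i and every POI_j+k to 0. In other words, once WiFi_i has been determined to be a seed hotspot of POI_j, there is no need to check whether the distance between WiFi_i and any later point of interest POI_j+k is less than the preset value, where j+k ≤ m-1.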
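The seed initialization can be sketched as follows, assuming the hotspot-to-POI distances have already been computed into a matrix. The names `init_mapping`, `distances`, and `preset` are hypothetical, and the sketch follows the nearest-POI rule for hotspots that fall within the preset value of several POIs rather than the first-match traversal.

```python
def init_mapping(distances, preset):
    """Build the initial mapping matrix L0 and the seed assignment.

    distances[i][j]: distance between hotspot i and POI j (same unit as preset).
    Returns (l0, seeds), where seeds maps a seed hotspot index to its POI index.
    """
    l0 = [[0.0] * len(row) for row in distances]
    seeds = {}
    for i, row in enumerate(distances):
        nearest = min(range(len(row)), key=row.__getitem__)
        if row[nearest] < preset:
            l0[i][nearest] = 1.0  # at most one entry per row is set to 1
            seeds[i] = nearest
    return l0, seeds


distances = [
    [5.0, 12.0],   # within 20 m of both POIs, assigned to the nearer one (POI 0)
    [40.0, 18.0],  # seed hotspot of POI 1
    [60.0, 75.0],  # hotspot to be mapped: row stays all zero
]
l0, seeds = init_mapping(distances, preset=20.0)
```

Rows that remain all zero correspond to hotspots to be mapped, whose probabilities are filled in only by the later propagation.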
在一个实施例中,参考图4所示,上述无线热点与兴趣点的映射方法包括:
S402,获取嗅探记录;嗅探记录包括嗅探设备所嗅探到的无线热点的数据。
S404,根据嗅探记录,确定嗅探设备重叠度。
S406,获取待映射的每个兴趣点的兴趣点名称。
S408,将兴趣点名称之间具有包含关系的多个兴趣点划分至一个兴趣点组。
S410,根据无线热点与相应的兴趣点组中兴趣点的距离,计算无线热点与相应的兴趣点组间的初始映射概率。
S412,基于嗅探设备重叠度进行初始映射概率之间的迭代传播,在迭代结束时得到无线热点与兴趣点组间的目标映射概率。
S414,根据目标映射概率,建立无线热点与兴趣点组之间的映射。
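A POI name consists of one or more address elements. For example, the POI "光明小区东区1栋" consists of the three address elements "光明小区", "东区", and "1栋". In real scenarios POIs are organized hierarchically, and this hierarchy shows up as containment between POI names. Two POI names have a containment relationship when one name consists of one or more of the address elements of the other. For example, the POI "光明小区" is contained in the POI "光明小区东区1栋". Checking containment only requires comparing the names as strings; the names do not need to be segmented into words.
Specifically, the computer device traverses the POI names and checks whether each is contained in another POI name. When POI name POI_i is contained in POI name POI_j, the computer device determines the point of interest named POI_i to be the parent of the point of interest named POI_j, and the point of interest named POI_j to be a child of the point of interest named POI_i. Points of interest in a parent-child relationship form a POI group. For example, the parent "光明小区" and its children "光明小区东区1栋", "光明小区东区3栋", and so on form the POI group {"光明小区","光明小区东区1栋","光明小区东区3栋",…}.
In one embodiment, a POI group may contain several levels, that is, a child may itself be the parent of other points of interest, as in {"光明小区","光明小区东区","光明小区东区1栋","光明小区东区3栋",…}. Here "光明小区东区" is a child of "光明小区" and at the same time the parent of "光明小区东区1栋" and "光明小区东区3栋". When a POI group contains several levels, the computer device determines the highest-level point of interest to be the parent of the group.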
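The name-containment grouping can be sketched with plain substring checks. The function `group_pois` is a hypothetical helper; it assumes that the shortest contained name is the highest-level parent, which holds for the nested names used here but is not guaranteed for arbitrary data.

```python
def group_pois(names):
    """Group POI names under the top-level name that each of them contains.

    Returns a dict: parent name -> list of child names (in input order).
    """
    groups = {}
    for name in names:
        # every other name that appears inside this one is an ancestor
        ancestors = [other for other in names if other != name and other in name]
        root = min(ancestors, key=len) if ancestors else name
        groups.setdefault(root, [])
        if root != name:
            groups[root].append(name)
    return groups


names = ["光明小区", "光明小区东区", "光明小区东区1栋", "龙都花园"]
groups = group_pois(names)
```

With these names, "光明小区" becomes the parent of the multi-level group containing "光明小区东区" and "光明小区东区1栋", while "龙都花园" forms a group of its own.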
在一个实施例中,当统计区域涉及的兴趣点名称较多时,计算机设备也可以将多个兴趣点名称划分为多组,同步按照上述方式对每组兴趣点名称进行层级划分,最后将各组层级划分的结果融合,确定统计区域最终的兴趣点层级关系。
如此,计算机设备可以以兴趣点组为单位进行无线热点和兴趣点之间的映射。具体地,计算机设备根据无线热点与兴趣点组中每个兴趣点的距离,计算无线热点与相应兴趣点组间的初始映射概率。比如,计算机设备可以根据无线热点WiFi i与根据兴趣点组{POIi}中兴趣点的最小距离或者平均距离等,计算无线热点与相应兴趣点组间{POIi}的初始映射概率。计算机设备按照上述方式基于嗅探设备重叠度对无线热点与相应兴趣点组间{POIi}的初始映射概率进行迭代传播,得到无线热点与相应兴趣点组间{POIi}的目标映射概率,建立无线热点WiFi i与目标映射概率最高的一个兴趣点组间{POIi}中每个兴趣点之间的映射。
当限定了将一个无线热点WiFi i映射至某个POI的目标映射概率最小值,即第二阈值时,计算机设备对无线热点WiFi i与每个兴趣点组间{POIi}之间的目标映射概率是否达到第二阈值进行遍历。当无线热点WiFi i与每个兴趣点组间{POIi}之间的目标映射概率均小于第二阈值时,计算机设备将无线热点WiFi i判定为噪声WiFi i,剔除噪声WiFi i,建立所保留的无线热点WiFi ii与达到第二阈值且最大的目标映射概率的兴趣点组间{POIi}中每个兴趣点之间的映射。
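Note that the mapping between wireless hotspots and points of interest is based on the target mapping probability. Suppose the second threshold for establishing a mapping is 0.7. Referring to FIG. 5, the target mapping probability of wireless hotspot WiFi_i is 0.2 with the POI "润城花园", 0.5 with "润城花园1栋", 0.1 with "润城花园9栋", and 0.2 with "龙都花园". Without the POI hierarchy, the target mapping probability of WiFi_i is diluted across several nearby POIs, so its probability for each individual POI is small and the hotspot is eventually removed, which lowers the recall of wireless hotspots.
In the embodiments of this application, introducing the POI hierarchy places "润城花园1栋" and "润城花园9栋" in the same POI group as "润城花园", namely {"润城花园","润城花园1栋","润城花园9栋"}. The probability that WiFi_i maps to this group is then 0.2 + 0.5 + 0.1 = 0.8, which reaches the second threshold, so WiFi_i is recalled. Introducing the POI hierarchy therefore improves the recall of the mapping between WiFi hotspots and POIs.
After the POI hierarchy is introduced, if the m POIs in the statistical region are divided into p POI groups, the n×m mapping matrix L^0 above can be reduced to an n×p matrix, where m ≥ p. Introducing the POI hierarchy therefore also reduces the dimension of the mapping matrix used in the iterative propagation of the initial mapping probabilities, mitigates the effect of data skew, greatly reduces the amount of computation, and improves the stability of the established mapping.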
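In this embodiment, the points of interest are grouped according to their names and a POI hierarchy is introduced. This allows wireless hotspots to be mapped at the granularity of POI groups, which improves mapping efficiency. Where a second threshold on the target mapping probability is applied, introducing the POI hierarchy also improves the recall of the mapping between WiFi hotspots and POIs.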
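In one embodiment, iteratively propagating the initial mapping probabilities based on the sniffing device overlap degree and obtaining the target mapping probability between the wireless hotspots and the points of interest when the iteration ends includes: multiplying the propagation matrix by the initial mapping matrix to obtain an intermediate mapping matrix; resetting to 1 the intermediate mapping probability between each seed hotspot and its point of interest in the intermediate mapping matrix, and iterating with the result as the initial mapping matrix until the iteration stop condition is met, which yields the target mapping matrix. The target mapping matrix records the target mapping probability between each wireless hotspot and each point of interest.
Specifically, the computer device multiplies the n×n propagation matrix W by the initialized mapping matrix L^0 to obtain the intermediate mapping matrix L^1, which records the intermediate mapping probabilities between the wireless hotspots and the points of interest. The computer device resets the intermediate mapping probability between each seed hotspot and its point of interest in L^1 to the initial value, namely 1, and normalizes the intermediate mapping probabilities between the hotspots to be mapped and the points of interest in L^1. The computer device then iterates with the reset and normalized L^1 as the initial mapping matrix to obtain the intermediate mapping matrix L^2. After the same seed reset and normalization are applied to L^2, iteration continues with L^2 as the initial mapping matrix, until the iteration stop condition is met and the target mapping matrix L is obtained.
Because the initial mapping probability between a seed hotspot and its point of interest is reliable, resetting it during iterative propagation keeps the mapping probability between the seed hotspot and its point of interest at 1 throughout. The highly reliable seed hotspots thus act as the sources of propagation, and iteration keeps spreading their initial mapping probabilities to the surrounding hotspots to be mapped, which improves the accuracy of the mapping.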
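The propagate, reset, and normalize loop can be sketched in plain Python as below. Several details are assumptions made for the example rather than statements of this application: normalization divides each non-seed row by its sum, a seed row is restored in full to its initial one-hot form, and the stop condition is a maximum element change below `tol` or `max_iter` iterations.

```python
def matmul(a, b):
    """Multiply matrix a (n x k) by matrix b (k x m), both given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate(w, l0, seeds, max_iter=100, tol=1e-6):
    """Iterate L_t = W x L_(t-1), clamping seed rows and normalizing the others.

    w: n x n propagation matrix of overlap degrees.
    l0: n x m initial mapping matrix.
    seeds: dict mapping a seed hotspot index to its POI index.
    """
    current = [row[:] for row in l0]
    for _ in range(max_iter):
        nxt = matmul(w, current)
        for i, row in enumerate(nxt):
            if i in seeds:
                # reset the seed row to its initial value
                nxt[i] = [1.0 if j == seeds[i] else 0.0 for j in range(len(row))]
            else:
                total = sum(row)
                if total > 0:
                    nxt[i] = [v / total for v in row]  # row-sum normalization
        delta = max(abs(x - y) for r1, r2 in zip(nxt, current) for x, y in zip(r1, r2))
        current = nxt
        if delta < tol:
            break
    return current


# Hotspots 0 and 1 are seeds of POI 0 and POI 1; hotspot 2 is to be mapped.
w = [
    [0.0, 0.1, 0.8],
    [0.1, 0.0, 0.2],
    [0.8, 0.2, 0.0],
]
l0 = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
seeds = {0: 0, 1: 1}
result = propagate(w, l0, seeds)
```

In this toy case hotspot 2 shares most of its devices with seed hotspot 0, so its row converges to roughly 0.8 for POI 0 and 0.2 for POI 1, while the seed rows stay fixed.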
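In this embodiment, the hotspot positions recorded in the sniffing records may be inaccurate and are therefore not fully reliable. Using a small distance threshold between wireless hotspots and points of interest makes it possible to identify, among the many hotspots to be mapped, the seed hotspots that can be mapped reliably from distance alone. Setting the initial mapping probability between a seed hotspot and its point of interest to 1, and the initial mapping probability between every other wireless hotspot and every point of interest to 0, minimizes reliance on the hotspot position data in the sniffing records. Only the highly reliable seed hotspots act as sources of propagation, which improves the accuracy of the mapping.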
在一个实施例中,参考图6所示,上述无线热点与兴趣点的映射方法包括:
S602,获取嗅探记录;嗅探记录包括嗅探设备所嗅探到的无线热点的数据。
S604,将需要进行无线热点和兴趣点映射的统计区域划分为多个子区域。
S606,根据嗅探记录,确定每个子区域内无线热点之间的嗅探设备重叠度。
S608,根据无线热点与兴趣点的距离,确定相应子区域内无线热点与兴趣点间的初始映射概率。
S610,基于嗅探设备重叠度进行初始映射概率之间的迭代传播,在迭代结束时得到相应子区域内无线热点与兴趣点间的目标映射概率。
S612,根据同一子区域内的无线热点与兴趣点之间的目标映射概率,建立相应子区域内无线热点与兴趣点之间的映射。
S614,通过融合全部子区域内无线热点与兴趣点之间映射的数据,建立统计区域内各无线热点与兴趣点之间的映射。
其中,当统计区域的区域面积较小时,所包含的POI及无线热点数量较少,可以按照上述方式遍历计算统计区域内每个无线热点和兴趣点的映射关系。但 当统计区域的区域面积较大时,所包含的POI及无线热点数量通常是非常大的。比如全国区域内的无线热点和兴趣点的数量在十亿量级。为了实现对大面积的统计区域内无线热点和兴趣点之间的高效映射,本申请的实施例将统计区域划分为多个子区域。
具体地,各子区域的区域面积和区域边界轮廓形状可以不同。对统计区域进行区域划分的方法具体可以是计算机设备根据人口分布以及常规情况下的客流量将统计区域划分为多个区域面积和/或区域边界轮廓形状不等的子区域。可以理解,对于人口数量较多或者客流量较大的位置,可以通过限缩区域边界将其划分为区域面积小的子区域;对于人口数量较少或者客流量较小的位置,可以通过增大区域边界将其划分为区域面积大的子区域,以使各个子区域内所包含的无线热点的数量和兴趣点的数量相近。
各子区域的区域面积和区域边界轮廓形状也可以相同。对统计区域进行区域划分的方法具体可以是计算机设备基于预设的网格将统计区域划分为多个区域面积和/或区域边界轮廓形状相同的子区域。参考图7所示,计算机设备在数字地图中,基于统计区域所在地面建立平面坐标系,在该平面坐标系基于预设大小的正方向网格702将统计区域平均划分为多个区域面积相等的子区域。可以理解,预设的网格还可以是其他规则的多边形,如三角形、平行四边形、菱形等。预设的网格还可以是不规则边框,对此不做限定。不同子区域之间可以没有重叠区域,也可以设置一定的用于过渡的重叠区域,对此不作限制。本领域技术人员还可以采用其他区域划分方法,对此不作限制。
进一步地,计算机设备按照上述方式分别建立每个子区域内无线热点和兴趣点的映射,最后再将全部子区域内无线热点与兴趣点的映射数据合并,得到统计区域内全部无线热点与兴趣点的完整映射关系。
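In this embodiment, the statistical region is divided into several subregions. Each subregion contains less hotspot and POI data to be mapped, and the hotspots and points of interest of several subregions can be mapped in parallel. This greatly improves mapping efficiency and makes the mapping method provided in this application suitable for large statistical regions.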
在一个实施例中,参考图8所示,上述无线热点与兴趣点的映射方法包括:
S802,获取嗅探记录;嗅探记录包括嗅探设备所嗅探到的无线热点的数据。
S804,将需要进行无线热点和兴趣点映射的统计区域划分为多个子区域。
S806,根据嗅探记录,确定同一子区域内无线热点之间的嗅探设备重叠度。
S808,对同一子区域内多个兴趣点进行分组,得到一个或多个兴趣点组。
S810,在相邻的两个子区域中,将兴趣点名称存在包含关系的两个兴趣点组合并。
S812,根据无线热点与相应的兴趣点组中兴趣点的距离,计算无线热点与相应的兴趣点组间的初始映射概率。
S814,基于嗅探设备重叠度进行初始映射概率之间的迭代传播,在迭代结束时得到相应子区域内无线热点与兴趣点组间的目标映射概率。
S816,根据同一子区域内的无线热点与兴趣点之间的目标映射概率,建立相应子区域内无线热点与兴趣点之间的映射。
S818,通过融合全部子区域内无线热点与兴趣点之间映射的数据,建立统计区域内各无线热点与兴趣点之间的映射。
其中,相邻子区域可以是区域边界相邻的两个子区域。在一个实施例中,每个子区域具有对应的用于表征区域位置的位置坐标,如中心点的位置坐标。相邻子区域也可以是位置坐标距离小于预设的距离阈值的两个子区域。比如,在图7所示的统计区域内,当相邻子区域为区域边界相邻的两个子区域时,子区域E对应的相邻子区域包括子区域B、D、F和H。当相邻子区域为位置坐标距离小于距离阈值的两个子区域,且距离阈值为子区域对角线长度时,子区域E对应的相邻子区域包括子区域A、B、C、D、F、G、H和J。
具体地,计算机设备按照上述方式分别对每个子区域内的POI进行层级划分,得到每个子区域内的一个或多个兴趣点组。在一个实施例中,对同一子区域内多个兴趣点进行分组,得到一个或多个兴趣点组包括:获取同一子区域内每个兴趣点的兴趣点名称;将兴趣点名称之间具有包含关系的多个兴趣点划分至一个兴趣点组;在兴趣点组中,将被包含的兴趣点名称对应的兴趣点确定为父级兴趣点。
进一步地,在进行区域划分,并确定每个子区域内POI的层级关系后,计 算机设备对各个子区域内的兴趣点组进行融合,得到整个统计区域内POI的层级关系。在一个实施例中,在相邻的两个子区域中,将兴趣点名称存在包含关系的两个兴趣点组合并包括:在当前子区域内一个父级兴趣点的兴趣点名称包含了相邻子区域内一个父级兴趣点的兴趣点名称时,将当前子区域内父级兴趣点所在的兴趣点组合并至相邻子区域内相应父级兴趣点所在的兴趣点组中;在当前子区域内一个父级兴趣点的兴趣点名称包含于相邻子区域内一个父级兴趣点的兴趣点名称时,将相邻子区域内父级兴趣点所在的兴趣点组合并至当前子区域内相应父级兴趣点所在的兴趣点组中。
计算机设备遍历各子区域,识别当前子区域内的所有父级兴趣点以及当前子区域的相邻子区域内的所有父级兴趣点是否存在兴趣点名称包含关系。如果当前子区域内的某个兴趣点组中父级兴趣点POIi的兴趣点名称包含于相邻子区域内的某个兴趣点组中父级兴趣点POIj的兴趣点名称中,则将父级兴趣点POIj所在的兴趣点组都划到父级兴趣点POIi所在的兴趣点组中。如果当前子区域内的某个兴趣点组中父级兴趣点POIi的兴趣点名称包含了相邻子区域内的某个兴趣点组中父级兴趣点POIj的兴趣点名称中,则将父级兴趣点POIi所在的兴趣点组都划到父级兴趣点POIj所在的兴趣点组中。
比如,在当前子区域E中有父级POI“光明小区”及其子POI集{“光明小区东区1栋”,“光明小区东区3栋”,…},其相邻子区域B中有父级POI“光明小区西区”及其子POI集{“光明小区西区1栋”,“光明小区西区2栋”,…}。按照上述方式合并后的父级POI为“光明小区”,其子POI集为{“光明小区东区1栋”,“光明小区东区3栋”,“光明小区西区1栋”,“光明小区西区2栋”,…}。
值得说明的是,在对相邻子区域中存在兴趣点名称包含关系的两个兴趣点组进行合并时,可能发生同一个兴趣点组被合并至不同的兴趣点组中的问题,继而发生同一兴趣点在不同子区域的兴趣点组中重复出现的情况。比如,在上述举例中,子区域E中的兴趣点组{POIe}被同时合并至子区域D中的兴趣点组{POId}和子区域H中的兴趣点组{POIh}中,进而合并后的{POId,POIe}与{POIh,POIe}存在部分兴趣点重复的问题,进而引发兴趣点与无线热点的重复映射的问 题。
为了解决上述问题,本申请的实施例,在对相邻子区域中存在兴趣点名称包含关系的两个兴趣点组进行合并时,若发现一个兴趣点组可以合并至多个相邻子区域内的兴趣点组中,计算机设备随机将兴趣点组合并至其中一个相邻子区域内的兴趣点组中。或者,各个子区域具有对应的编号,可以将兴趣点组合并至其中子区域编号最大的相邻子区域内的兴趣点组中。
进一步地,在融合确定每个子区域的POI层级关系后,计算机设备按照上述方式分别计算每个子区域内无线热点之间的嗅探设备重叠度,分别计算每个子区域内无线热点与兴趣点之间的初始映射概率。在一个实施例中,根据无线热点与兴趣点组中每个兴趣点的距离,计算无线热点与相应兴趣点组间的初始映射概率,包括:筛选与兴趣点组中至少一个兴趣点的距离小于预设值的无线热点作为相应兴趣点组的种子热点;确定种子热点与相应兴趣点组之间的初始映射概率为1;确定除种子节点之外的各无线热点与兴趣点组之间的初始映射概率为0。
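If the POI hierarchy is introduced, each POI group {POI} has a corresponding number. If the m POIs in the statistical region are divided into p POI groups, the numbers may range from 0 to p-1. The POI group number can then be used directly as the column index of the mapping matrix L^0.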
计算机设备按照兴趣点组编号,对每个无线热点WiFi i与兴趣点组{POIj}中每个兴趣点的距离是否小于预设值进行遍历。当无线热点WiFi i与兴趣点组{POIj}中每个兴趣点的距离大于或等于预设值时,计算机设备将无线热点WiFi i与兴趣点组{POIj}的初始映射概率设定为0。当无线热点WiFi i与兴趣点组{POIj}中至少一个兴趣点的距离小于预设值时,计算机设备将无线热点WiFi i与兴趣点组{POIj}的初始映射概率设定为1,并将无线热点WiFi i与兴趣点组{POIj+k}的初始映射概率设定为0。换言之,当确定无线热点WiFi i为一个兴趣点组{POIj}的种子节点后,则无需对兴趣点组{POIj}之后的兴趣点组{POIj+k}与无线热点WiFi i的距离是否小于预设值进行判断。其中,j+k≤p-1。
计算机设备基于每个子区域内无线热点之间的嗅探设备重叠度,将相应子区域内种子热点的初始映射概率迭代传播至待映射热点,在迭代结束时得到相 应子区域内各个无线热点与兴趣点间的目标映射概率,最后根据目标映射概率,建立相应子区域内各无线热点与兴趣点之间的映射。
本实施例中,对统计区域进行区域划分的同时,引入了POI层级关系,进而在提高映射效率,延伸本申请实施例使用场景的同时,还可以提升无线热点WiFi与POI的映射召回率,换句话说,本申请的实施例可以提高无线热点与兴趣点的准召率。
在一个具体的实施例中,参考图9所示,上述无线热点与兴趣点的映射方法包括:
S902,获取嗅探记录;嗅探记录包括嗅探设备标识及至少一个无线热点的位置及热点名称。
S904,根据无线热点在不同嗅探记录中的位置变化,识别无线热点中的移动热点。
S906,剔除每条嗅探记录中关于移动热点的数据。
S908,将需要进行无线热点和兴趣点映射的统计区域划分为多个子区域。
S910,根据完成数据剔除的嗅探记录,确定同一子区域内每个无线热点对应的去重嗅探设备集合。
S912,在所有的去重嗅探设备集合中,剔除嗅探设备标识数量小于第一阈值的去重嗅探设备集合,得到目标的去重嗅探设备集合。
S914,识别每两个目标的去重嗅探设备集合中重叠的嗅探设备标识。
S916,基于重叠的嗅探设备标识的数量和相应去重嗅探设备集合中嗅探设备标识的数量,确定相应子区域内对应两个无线热点的嗅探设备重叠度。
S918,获取同一子区域内每个兴趣点的兴趣点名称。
S920,将兴趣点名称之间具有包含关系的多个兴趣点划分至一个兴趣点组。
S922,将被包含的兴趣点名称对应兴趣点确定为兴趣点组中的父级兴趣点。
S924,在当前子区域内一个父级兴趣点的兴趣点名称包含了相邻子区域内一个父级兴趣点的兴趣点名称时,将当前子区域内父级兴趣点所在的兴趣点组合并至相邻子区域内相应父级兴趣点所在的兴趣点组中。
S926,在当前子区域内一个父级兴趣点的兴趣点名称包含于相邻子区域内 一个父级兴趣点的兴趣点名称时,将相邻子区域内父级兴趣点所在的兴趣点组合并至当前子区域内相应父级兴趣点所在的兴趣点组中。
S928,筛选与兴趣点组中至少一个兴趣点的距离小于预设值的无线热点作为相应兴趣点组的种子热点。
S930,确定种子热点与相应兴趣点组之间的初始映射概率为1;确定除种子节点之外的各无线热点与兴趣点组之间的初始映射概率为0。
S932,将无线热点之间的嗅探设备重叠度作为矩阵元素建立传播矩阵。
S934,将无线热点与兴趣点间的初始映射概率作为矩阵元素建立初始映射矩阵。
S936,将传播矩阵与初始映射矩阵相乘,计算得到中间映射矩阵。
S938,将中间映射矩阵中种子热点与相应兴趣点的中间映射概率重置为1后作为初始映射矩阵进行迭代,直至满足迭代停止条件时停止迭代,得到目标映射矩阵;目标映射矩阵记录了各无线热点与兴趣点之间的目标映射概率。
S940,在所有兴趣点中,剔除目标映射概率均小于第二阈值的无线热点。
S942,建立相应子区域内所保留的每个无线热点与对应目标映射概率最大的兴趣点间的映射。
S944,通过融合全部子区域内无线热点与兴趣点之间映射的数据,建立统计区域内各无线热点与兴趣点之间的映射。
上述无线热点与兴趣点的映射方法,基于无线热点的嗅探记录建立无线热点与兴趣点的映射关系,无需人工采集上报POI到访数据,不仅提高映射效率;由于减少了对无线热点和兴趣点名称的依赖,使这种映射方式适用范围广,还可以提高无线热点召回率。基于嗅探设备重叠度度量无线热点之间的相关性,可以辅助判断用户在兴趣点之间的流动属性,保留了用户的空间行为特征信息,可以更好的实现无线热点在空间位置上的区分,从而使反映出的无线热点相关性可靠性更高。进而,综合无线热点与兴趣点的距离,以及无线热点之间的嗅探设备重叠度,建立无线热点及兴趣点之间的映射,可以提高映射准确性。
在一个最具体的实施例中,参考图10所示,上述无线热点与兴趣点的映射方法包括:
S1002,获取嗅探记录,识别嗅探记录中的移动热点并剔除。
移动热点的位置是随着时间变化的,无法得到其与POI的稳定映射关系,因此需要剔除。无线热点移动/固定识别可以通过一段时间的嗅探记录反推出各无线热点一系列位置数据,如果一个无线热点的多个位置变化非常大,则可以判断为移动,反之为固定。无线热点移动/固定识别还可以采用其他手段,对此不做限定。
S1004,确定各无线热点与兴趣点所在地理网格的编号。
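Nationwide there are on the order of billions of wireless hotspots and POIs, so computing the mapping between every hotspot and every POI by exhaustive traversal is unnecessary. This embodiment first assigns the wireless hotspots and POIs to grid cells, then calculates the mapping between hotspots and POIs within each geographic grid cell, and finally merges the results into the complete mapping. Each geographic grid cell has a corresponding number, which may be determined as follows. A two-dimensional coordinate system is established on a plane that contains the geographic region to be mapped. The grid cells may be squares with a side length of d meters, and each cell has a corresponding coordinate (x,y) in the coordinate system, such as that of its center or a vertex. The grid cell number of a wireless hotspot or POI is then calculated as (x,y)->([x/d]*d,[y/d]*d), where x is the abscissa, y is the ordinate, and [] is the rounding operation.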
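The grid numbering rule (x,y)->([x/d]*d,[y/d]*d) can be sketched as follows. The rounding operation [] is assumed here to be floor, and the item layout `(name, x, y)` with planar coordinates in meters is an assumption for the example.

```python
import math
from collections import defaultdict


def grid_id(x, y, d):
    """Return the number of the d-meter square cell containing planar point (x, y)."""
    return (math.floor(x / d) * d, math.floor(y / d) * d)


def bucket(items, d):
    """Group (name, x, y) items by the grid cell they fall into."""
    cells = defaultdict(list)
    for name, x, y in items:
        cells[grid_id(x, y, d)].append(name)
    return dict(cells)


items = [
    ("wifi_a", 1234.5, 987.0),
    ("poi_b", 1400.0, 600.0),
    ("wifi_c", 90.0, 40.0),
]
cells = bucket(items, d=500)
```

Hotspots and POIs that share a cell number, such as `wifi_a` and `poi_b` here, are then processed together when the per-cell mapping is computed.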
S1006,基于兴趣点的名称之间的包含关系识别其中的父级兴趣点。
现实场景很多POI之间是存在层级关系的,以住宅小区类POI“润城花园”为例,其附近会存在“润城花园1栋”、“润城花园9栋”等POI。每个POI数据包括id,name(名称),lon(经度),lat(纬度)等。POI层级划分的步骤如下:
(1)根据POI的lon、lat计算各POI所属的地理网格。
(2)遍历各地理网格,执行计算:如果某个POI的名称中不包含其它任何POI的名称,则将该POI置为父级POI,所有包含该POI名称的POI为其子POI。
(3)遍历各地理网格,执行计算:在以当前网格为中心的九宫格区域内,获取当前网格内的所有父级POI和与该网格相邻的8个地理网格内的所有父级POI,如果当前网格内的某个父级POI i的名称包含于相邻8个网格内的某个父级POI j的名称中,则将该POI j以及POI j的子POI都划到POI i的子POI集合中。
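S1008: Generate the propagation matrix W according to the sniffing device overlap degrees between the wireless hotspots.
The propagation matrix W is an n×n two-dimensional square matrix, where n, the range of the row and column indices, is the number of wireless hotspots. The element at position [i,j] represents the correlation between wifi_i and wifi_j and lies in [0,1]. The correlation between WiFi hotspots can be measured in several ways, such as by distance or by the number of co-occurrences in sniffing records. This embodiment proposes a measure based on the sniffing device overlap degree. The correlation it yields lies in [0,1] and needs no normalization, and it retains information about individual users, so the hotspot relationship network can be partitioned better. The hotspot relationship network is built as follows:
(1) Parse the WiFi sniffing records (for example, those of one month) to obtain the sniffing device set of each wireless hotspot.
(2) Remove the WiFi hotspots that were sniffed by fewer devices than the threshold within the month.
(3) Initialize the propagation matrix W with all elements set to 0 by default.
(4) Calculate the sniffing user overlap degree w_i,j = (number of deduplicated sniffing devices of wifi_i + number of deduplicated sniffing devices of wifi_j - number of deduplicated sniffing devices of wifi_i and wifi_j combined) / number of deduplicated sniffing devices of wifi_i.
(5) Return the propagation matrix W.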
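S1010: Initialize the mapping matrix L^0 according to the distances between the wireless hotspots and the points of interest.
The mapping matrix L^0 is an n×m two-dimensional matrix, where the superscript 0 denotes the initialized state, the number of rows n is the number of WiFi hotspots, and the number of columns m is the number of parent POIs over all POIs. The element at position [i,j] is the probability that wifi_i belongs to parent poi_j. All elements of L^0 default to 0 in the initial state. The column index j=poi.parentid.no is the number of the parent POI of the current POI; numbering starts from 0, and the maximum value is the number of parent POIs minus 1.
The mapping matrix L^0 is initialized as follows: obtain all parent POIs in the geographic grid cell and number them from 0; traverse all POIs (parent POIs and their child POIs), and if there is a wireless hotspot wifi_i such that Distance(wifi_i,poi) < the default distance threshold, set the mapping matrix element L^0_ij=1. After initialization, the elements of the label matrix L^0 take only the values 0 and 1, where 1 corresponds to a seed hotspot and 0 corresponds to a hotspot to be mapped.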
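S1012: Execute the label propagation algorithm based on the propagation matrix W and the mapping matrix L^0 to obtain the final label matrix L between all wireless hotspots and the parent points of interest in the current geographic grid cell.
The semi-supervised label propagation algorithm is executed as follows:
(1) Start from the initialized propagation matrix W and mapping matrix L^0.
(2) Propagate: use the sniffing device overlap degrees between wireless hotspots recorded in the propagation matrix W as propagation weights, propagate the initial mapping probability between each seed hotspot and its point of interest to the surrounding hotspots to be mapped in the mapping matrix L^(t-1) according to these weights, and update the probability distribution of each hotspot to be mapped, which gives the mapping matrix L^t=W*L^(t-1).
(3) Reset the mapping probabilities of the initial seed hotspots in L^t to their initial values.
(4) Repeat (2) and (3) until the mapping matrix L converges or the maximum number of iterations is reached.
(5) Return the mapping matrix L.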
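S1014: Identify and filter out the noise hotspots in the label matrix L to obtain the mapping data between wireless hotspots and points of interest in each geographic grid cell.
When the propagation iteration ends, the final label matrix L is obtained. The maximum element of each row is calculated. If the maximum is greater than the given filtering threshold, the wireless hotspot is retained and mapped to the POI corresponding to the maximum; otherwise the wireless hotspot is determined to be a noise hotspot and is removed.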
S1016,合并所有地理网格内的映射数据,得到全量无线热点与兴趣点映射结果。
经实际测试,在具有10个住宅小区的范围内以高于80%的精度召回无线热点数量超过5000,而传统基于名称的映射方法仅召回无线热点数量约100个,在相同精度条件下基于位置的映射方法仅召回无线热点数量约2000个,本申请提供的映射方法无论在准确性方面,还是召回率方面均明显优于传统基于名称或者位置的映射方法。
图2、4、6、8、9和10为一个实施例中无线热点与兴趣点的映射方法的流程示意图。应该理解的是,虽然图2、4、6、8、9和10的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,图2、4、6、8、9和10中的至少一部分步骤可以包括多个子步骤或者多个阶段,这些子步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些子步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。
如图11所示,在一个实施例中,提供了无线热点与兴趣点的映射装置1100,包括热点相关性度量模块1102、映射概率传播模块1104和热点兴趣点映射模块1106,其中,
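The hotspot correlation measurement module 1102 is configured to obtain sniffing records, the sniffing records including data of the wireless hotspots sniffed by sniffing devices, and to determine the sniffing device overlap degree according to the sniffing records.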
映射概率传播模块1104,用于根据无线热点与相应的兴趣点的距离,确定无线热点与相应的兴趣点间的初始映射概率;基于嗅探设备重叠度进行初始映射概率之间的迭代传播,在迭代结束时得到无线热点与兴趣点之间的目标映射概率。
热点兴趣点映射模块1106,用于根据目标映射概率,建立无线热点与兴趣点之间的映射。
在一个实施例中,无线热点的数据包括无线热点的位置;参考图12所示,上述无线热点与兴趣点的映射装置1100还包括移动热点剔除模块1108,用于根据无线热点在不同嗅探记录中的位置变化,识别无线热点中的移动热点;剔除每条嗅探记录中关于移动热点的数据;根据无线热点的嗅探记录,确定无线热点之间的嗅探设备重叠度包括:根据完成数据剔除的嗅探记录,确定无线热点之间的嗅探设备重叠度。
在一个实施例中,嗅探记录包含嗅探设备标识及至少两个无线热点的热点名称;热点相关性度量模块1102还用于基于嗅探设备标识,确定每个热点名称对应的去重嗅探设备集合;识别每两个去重嗅探设备集合中重叠的嗅探设备标识;基于重叠的嗅探设备标识的数量与相应去重嗅探设备集合中嗅探设备标识的数量,确定对应两个无线热点的嗅探设备重叠度。
在一个实施例中,上述无线热点与兴趣点的映射装置1100还包括冷门热点剔除模块1110,用于在所有的去重嗅探设备集合中,剔除嗅探设备标识数量小于第一阈值的去重嗅探设备集合,得到目标的去重嗅探设备集合;热点相关性度量模块1102还用于在每两个目标的去重嗅探设备集合中识别重叠的嗅探设备标识。
在一个实施例中,上述无线热点与兴趣点的映射装置1100还包括POI层级 划分模块1112,用于获取待映射的每个兴趣点的兴趣点名称;将兴趣点名称之间具有包含关系的多个兴趣点划分至一个兴趣点组;映射概率传播模块1104还用于根据无线热点与相应的兴趣点组中兴趣点的距离,计算无线热点与相应的兴趣点组间的初始映射概率。
在一个实施例中,映射概率传播模块1104还用于筛选与兴趣点的距离小于预设值的无线热点作为相应兴趣点的种子热点;确定种子热点与相应兴趣点之间的初始映射概率为1;确定除种子节点之外的各无线热点与兴趣点之间的初始映射概率为0。
在一个实施例中,上述无线热点与兴趣点的映射装置1100还包括统计区域划分模块1114,用于将需要进行无线热点和兴趣点映射的统计区域划分为多个子区域;POI层级划分模块1112还用于对同一子区域内多个兴趣点进行分组,得到一个或多个兴趣点组;在相邻子区域中,将兴趣点名称存在包含关系的两个兴趣点组合并;映射概率传播模块1104还用于根据无线热点与兴趣点组中每个兴趣点的距离,计算无线热点与相应兴趣点组间的初始映射概率。
在一个实施例中,POI层级划分模块1112还用于获取同一子区域内每个兴趣点的兴趣点名称;将兴趣点名称之间具有包含关系的多个兴趣点划分至一个兴趣点组;在兴趣点组中,将被包含的兴趣点名称对应的兴趣点确定为父级兴趣点。
在一个实施例中,POI层级划分模块1112还用于在当前子区域内一个父级兴趣点的兴趣点名称包含了相邻子区域内一个父级兴趣点的兴趣点名称时,将当前子区域内父级兴趣点所在的兴趣点组合并至相邻子区域内相应父级兴趣点所在的兴趣点组中;在当前子区域内一个父级兴趣点的兴趣点名称包含于相邻子区域内一个父级兴趣点的兴趣点名称时,将相邻子区域内父级兴趣点所在的兴趣点组合并至当前子区域内相应父级兴趣点所在的兴趣点组中。
在一个实施例中,映射概率传播模块1104还用于筛选与兴趣点组中至少一个兴趣点的距离小于预设值的无线热点作为相应兴趣点组的种子热点;确定种子热点与相应兴趣点组之间的初始映射概率为1;确定除种子节点之外的各无线热点与兴趣点组之间的初始映射概率为0。
在一个实施例中,上述无线热点与兴趣点的映射装置1100还包括矩阵初始化模块1116,用于将无线热点之间的嗅探设备重叠度作为矩阵元素建立传播矩阵;将无线热点与兴趣点间的初始映射概率作为矩阵元素建立初始映射矩阵;映射概率传播模块1104还用于将传播矩阵与初始映射矩阵相乘,计算得到中间映射矩阵;将中间映射矩阵中种子热点与相应兴趣点的中间映射概率重置为1后作为初始映射矩阵进行迭代,直至满足迭代停止条件时停止迭代,得到目标映射矩阵;目标映射矩阵记录了各无线热点与兴趣点之间的目标映射概率。
在一个实施例中,热点兴趣点映射模块1106还用于在所有兴趣点中,剔除目标映射概率均小于第二阈值的无线热点;建立所保留的每个无线热点与对应目标映射概率最大的兴趣点间的映射。
在一个实施例中,统计区域划分模块1114,用于将需要进行无线热点和兴趣点映射的统计区域划分为多个子区域;热点兴趣点映射模块1106还用于根据同一子区域内的无线热点与兴趣点之间的目标映射概率,建立相应子区域内无线热点与兴趣点之间的映射;通过融合全部子区域内无线热点与兴趣点之间映射的数据,建立统计区域内各无线热点与兴趣点之间的映射。
上述无线热点与兴趣点的映射装置,基于无线热点的嗅探记录建立无线热点与兴趣点的映射关系,无需人工采集上报POI到访数据,不仅提高映射效率;由于减少了对无线热点和兴趣点名称的依赖,使这种映射方式适用范围广,还可以提高无线热点召回率。基于嗅探设备重叠度度量无线热点之间的相关性,可以辅助判断用户在兴趣点之间的流动属性,保留了用户的空间行为特征信息,可以更好的实现无线热点在空间位置上的区分,从而使反映出的无线热点相关性可靠性更高。进而,综合无线热点与兴趣点的距离,以及无线热点之间的嗅探设备重叠度,建立无线热点及兴趣点之间的映射,可以提高映射准确性。
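FIG. 13 shows the internal structure of a computer device in one embodiment. The computer device may be the terminal 110 or the server 120 in FIG. 1. As shown in FIG. 13, the computer device includes a processor, a memory, and a network interface connected through a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system and may also store a computer program which, when executed by the processor, causes the processor to implement the method for mapping wireless hotspots to points of interest. The internal memory may also store a computer program which, when executed by the processor, causes the processor to execute the method for mapping wireless hotspots to points of interest.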
本领域技术人员可以理解,图13中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
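In one embodiment, the apparatus for mapping wireless hotspots to points of interest provided in this application may be implemented in the form of a computer program, and the computer program may run on the computer device shown in FIG. 13. The memory of the computer device may store the program modules that make up the apparatus, for example the hotspot correlation measurement module, the mapping probability propagation module, and the hotspot-POI mapping module shown in FIG. 11. The computer program formed by these program modules causes the processor to perform the steps of the method for mapping wireless hotspots to points of interest in the embodiments of this application described in this specification.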
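For example, the computer device shown in FIG. 13 may perform steps S202 and S204 through the hotspot correlation measurement module of the apparatus shown in FIG. 11. The computer device may perform steps S206 and S208 through the mapping probability propagation module, and may perform step S210 through the hotspot-POI mapping module.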
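In one embodiment, a computer device is provided, including a memory and a processor. The memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the above method for mapping wireless hotspots to points of interest. The steps of the method here may be the steps of the method in any of the above embodiments.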
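In one embodiment, a computer-readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the above method for mapping wireless hotspots to points of interest. The steps of the method here may be the steps of the method in any of the above embodiments.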
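In one embodiment, a computer program product or computer program is provided. The computer program product or computer program includes computer instructions stored in a computer-readable storage medium. When a processor of an electronic apparatus reads the computer instructions from the computer-readable storage medium and executes them, the electronic apparatus performs the steps of the method for mapping wireless hotspots to points of interest.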
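A person of ordinary skill in the art will understand that all or part of the procedures of the methods in the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the procedures of the above method embodiments. Any reference to a memory, storage, database, or other medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).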
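The technical features of the above embodiments may be combined arbitrarily. For brevity, not every possible combination of these technical features is described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.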
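The above embodiments express only several implementations of this application. Their description is relatively specific and detailed, but it should not be construed as limiting the patent scope of this application. It should be noted that a person of ordinary skill in the art may make several variations and improvements without departing from the concept of this application, and these all fall within the protection scope of this application. The protection scope of this patent application is therefore defined by the appended claims.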

Claims (20)

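  1. A method for mapping wireless hotspots to points of interest, performed by a computer device, the method comprising:
    obtaining sniffing records, the sniffing records including data of wireless hotspots sniffed by sniffing devices;
    determining a sniffing device overlap degree according to the sniffing records;
    determining, according to the distance between a wireless hotspot and a corresponding point of interest, an initial mapping probability between the wireless hotspot and the corresponding point of interest;
    performing iterative propagation among the initial mapping probabilities based on the sniffing device overlap degree, and obtaining, when the iteration ends, a target mapping probability between the wireless hotspot and the point of interest; and
    establishing a mapping between the wireless hotspot and the point of interest according to the target mapping probability.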
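  2. The method according to claim 1, wherein the data of the wireless hotspots includes positions of the wireless hotspots, and the method further comprises:
    identifying mobile hotspots among the wireless hotspots according to changes in the positions of the wireless hotspots across different sniffing records; and
    removing the data about the mobile hotspots from each of the sniffing records;
    wherein the determining a sniffing device overlap degree according to the sniffing records comprises:
    determining the sniffing device overlap degree according to the sniffing records from which the data has been removed.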
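  3. The method according to claim 1 or 2, wherein each sniffing record contains a sniffing device identifier and hotspot names of at least two wireless hotspots, and the determining a sniffing device overlap degree comprises:
    determining, based on the sniffing device identifiers, a de-duplicated sniffing device set corresponding to each hotspot name;
    identifying the sniffing device identifiers that overlap in every two de-duplicated sniffing device sets; and
    determining the sniffing device overlap degree of the corresponding two wireless hotspots based on the number of overlapping sniffing device identifiers and the number of sniffing device identifiers in the respective de-duplicated sniffing device sets.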
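  4. The method according to claim 3, wherein the method further comprises:
    removing, from all the de-duplicated sniffing device sets, the de-duplicated sniffing device sets whose number of sniffing device identifiers is less than a first threshold, to obtain target de-duplicated sniffing device sets;
    wherein the identifying the sniffing device identifiers that overlap in every two sniffing device sets comprises:
    identifying the overlapping sniffing device identifiers in every two of the target de-duplicated sniffing device sets.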
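  5. The method according to claim 1, wherein the method further comprises:
    obtaining the point-of-interest name of each point of interest to be mapped; and
    assigning multiple points of interest whose point-of-interest names have a containment relationship to one point-of-interest group;
    wherein the determining, according to the distance between a wireless hotspot and a corresponding point of interest, an initial mapping probability between the wireless hotspot and the corresponding point of interest comprises:
    calculating, according to the distance between the wireless hotspot and the points of interest in the corresponding point-of-interest group, the initial mapping probability between the wireless hotspot and the corresponding point-of-interest group.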
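  6. The method according to claim 1, wherein the determining, according to the distance between a wireless hotspot and a corresponding point of interest, an initial mapping probability between the wireless hotspot and the corresponding point of interest comprises:
    selecting the wireless hotspots whose distance to a point of interest is less than a preset value as seed hotspots of that point of interest;
    determining that the initial mapping probability between each seed hotspot and the corresponding point of interest is 1; and
    determining that the initial mapping probability between each wireless hotspot other than the seed hotspots and the point of interest is 0.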
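  7. The method according to claim 1, wherein the method further comprises:
    dividing the statistical region in which wireless hotspots are to be mapped to points of interest into multiple sub-regions;
    grouping the points of interest within the same sub-region to obtain one or more point-of-interest groups; and
    merging, across two adjacent sub-regions, two point-of-interest groups whose point-of-interest names have a containment relationship;
    wherein the determining, according to the distance between a wireless hotspot and a corresponding point of interest, an initial mapping probability between the wireless hotspot and the corresponding point of interest comprises:
    calculating, according to the distance between the wireless hotspot and the points of interest in the corresponding point-of-interest group, the initial mapping probability between the wireless hotspot and the corresponding point-of-interest group.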
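  8. The method according to claim 7, wherein the grouping the points of interest within the same sub-region to obtain one or more point-of-interest groups comprises:
    obtaining the point-of-interest name of each point of interest within the same sub-region; and
    assigning multiple points of interest whose point-of-interest names have a containment relationship to one point-of-interest group;
    and the method further comprises: determining, within the point-of-interest group, the point of interest whose name is contained in the other names as the parent point of interest.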
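  9. The method according to claim 8, wherein the merging, across two adjacent sub-regions, two point-of-interest groups whose point-of-interest names have a containment relationship comprises:
    when the point-of-interest name of a parent point of interest in the current sub-region contains the point-of-interest name of a parent point of interest in an adjacent sub-region, merging the point-of-interest group of that parent point of interest in the current sub-region into the point-of-interest group of the corresponding parent point of interest in the adjacent sub-region; and
    when the point-of-interest name of a parent point of interest in the current sub-region is contained in the point-of-interest name of a parent point of interest in an adjacent sub-region, merging the point-of-interest group of that parent point of interest in the adjacent sub-region into the point-of-interest group of the corresponding parent point of interest in the current sub-region.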
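  10. The method according to any one of claims 6 to 9, wherein the determining, according to the distance between the wireless hotspot and the points of interest in the corresponding point-of-interest group, the initial mapping probability between the wireless hotspot and the corresponding point-of-interest group comprises:
    selecting the wireless hotspots whose distance to at least one point of interest in the point-of-interest group is less than a preset value as seed hotspots of that point-of-interest group;
    determining that the initial mapping probability between each seed hotspot and the corresponding point-of-interest group is 1; and
    determining that the initial mapping probability between each wireless hotspot other than the seed hotspots and the corresponding point-of-interest group is 0.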
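  11. The method according to claim 6, wherein the method further comprises:
    building a propagation matrix whose matrix elements are the sniffing device overlap degrees between the wireless hotspots; and
    building an initial mapping matrix whose matrix elements are the initial mapping probabilities between the wireless hotspots and the points of interest;
    wherein the performing iterative propagation among the initial mapping probabilities based on the sniffing device overlap degree, and obtaining, when the iteration ends, a target mapping probability between the wireless hotspot and the point of interest comprises:
    multiplying the propagation matrix by the initial mapping matrix to obtain an intermediate mapping matrix; and
    resetting to 1 the intermediate mapping probabilities between the seed hotspots and the corresponding points of interest in the intermediate mapping matrix, using the result as the initial mapping matrix for iteration, and stopping the iteration when an iteration stop condition is met, to obtain a target mapping matrix, the target mapping matrix recording the target mapping probability between each wireless hotspot and each point of interest.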
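  12. The method according to claim 11, wherein the establishing a mapping between the wireless hotspot and the point of interest according to the target mapping probability comprises:
    removing the wireless hotspots whose target mapping probabilities for all points of interest are each less than a second threshold; and
    establishing a mapping between each retained wireless hotspot and the point of interest with the largest target mapping probability for that hotspot.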
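  13. The method according to claim 1, wherein the method further comprises:
    dividing the statistical region in which wireless hotspots are to be mapped to points of interest into multiple sub-regions;
    wherein the establishing a mapping between the wireless hotspot and the point of interest according to the target mapping probability comprises:
    establishing, according to the target mapping probabilities between the wireless hotspots and the points of interest within the same sub-region, the mapping between the wireless hotspots and the points of interest in that sub-region; and
    establishing the mapping between the wireless hotspots and the points of interest in the statistical region by fusing the mapping data of all the sub-regions.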
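  14. An apparatus for mapping wireless hotspots to points of interest, the apparatus comprising:
    a hotspot correlation measurement module, configured to obtain sniffing records, the sniffing records including data of wireless hotspots sniffed by sniffing devices, and to determine a sniffing device overlap degree according to the sniffing records;
    a mapping probability propagation module, configured to determine, according to the distance between a wireless hotspot and a corresponding point of interest, an initial mapping probability between the wireless hotspot and the corresponding point of interest, and to perform iterative propagation among the initial mapping probabilities based on the sniffing device overlap degree, obtaining, when the iteration ends, a target mapping probability between the wireless hotspot and the point of interest; and
    a hotspot-POI mapping module, configured to establish a mapping between the wireless hotspot and the point of interest according to the target mapping probability.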
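  15. The apparatus according to claim 14, wherein the data of the wireless hotspots includes positions of the wireless hotspots, and the apparatus further comprises:
    a mobile hotspot removal module, configured to identify mobile hotspots among the wireless hotspots according to changes in the positions of the wireless hotspots across different sniffing records, and to remove the data about the mobile hotspots from each of the sniffing records;
    wherein the hotspot correlation measurement module is further configured to determine the sniffing device overlap degree according to the sniffing records from which the data has been removed.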
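  16. The apparatus according to claim 14 or 15, wherein each sniffing record contains a sniffing device identifier and hotspot names of at least two wireless hotspots;
    the hotspot correlation measurement module is further configured to determine, based on the sniffing device identifiers, a de-duplicated sniffing device set corresponding to each hotspot name; identify the sniffing device identifiers that overlap in every two de-duplicated sniffing device sets; and determine the sniffing device overlap degree of the corresponding two wireless hotspots based on the number of overlapping sniffing device identifiers and the number of sniffing device identifiers in the respective de-duplicated sniffing device sets.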
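  17. The apparatus according to claim 16, wherein the apparatus further comprises:
    an unpopular hotspot removal module, configured to remove, from all the de-duplicated sniffing device sets, the de-duplicated sniffing device sets whose number of sniffing device identifiers is less than a first threshold, to obtain target de-duplicated sniffing device sets;
    wherein the hotspot correlation measurement module is further configured to identify the overlapping sniffing device identifiers in every two of the target de-duplicated sniffing device sets.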
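  18. The apparatus according to claim 14, wherein the apparatus further comprises:
    a POI hierarchy division module, configured to obtain the point-of-interest name of each point of interest to be mapped, and to assign multiple points of interest whose point-of-interest names have a containment relationship to one point-of-interest group;
    wherein the mapping probability propagation module is further configured to calculate, according to the distance between the wireless hotspot and the points of interest in the corresponding point-of-interest group, the initial mapping probability between the wireless hotspot and the corresponding point-of-interest group.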
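  19. A computer device, comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 13.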
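  20. A computer-readable storage medium, storing a computer program which, when executed by a processor, causes the processor to perform the steps of the method according to any one of claims 1 to 13.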
PCT/CN2020/124594 2020-01-21 2020-10-29 Method and apparatus for mapping wireless hotspots and points of interest, computer-readable storage medium, and computer device WO2021147431A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/678,478 US20220286956A1 (en) 2020-01-21 2022-02-23 Method and apparatus for mapping wireless hotspots and points of interest, computer-readable storage medium, and computer device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010072289.3 2020-01-21
CN202010072289.3A CN111291145B (zh) 2020-01-21 2020-01-21 Method, apparatus, and storage medium for mapping wireless hotspots and points of interest

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/678,478 Continuation US20220286956A1 (en) 2020-01-21 2022-02-23 Method and apparatus for mapping wireless hotspots and points of interest, computer-readable storage medium, and computer device

Publications (1)

Publication Number Publication Date
WO2021147431A1 (zh) 2021-07-29

Family

ID=71029222

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/124594 WO2021147431A1 (zh) 2020-01-21 2020-10-29 Method and apparatus for mapping wireless hotspots and points of interest, computer-readable storage medium, and computer device

Country Status (3)

Country Link
US (1) US20220286956A1 (zh)
CN (1) CN111291145B (zh)
WO (1) WO2021147431A1 (zh)

Also Published As

Publication number Publication date
CN111291145B (zh) 2022-07-29
US20220286956A1 (en) 2022-09-08
CN111291145A (zh) 2020-06-16

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20915197

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20915197

Country of ref document: EP

Kind code of ref document: A1