CN115730023A - Data processing method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115730023A
Authority
CN
China
Prior art keywords
grid
positioning
area
grids
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111006953.5A
Other languages
Chinese (zh)
Inventor
杨帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN202111006953.5A
Publication of CN115730023A
Legal status: Pending

Landscapes

  • Mobile Radio Communication Systems (AREA)

Abstract

The embodiment of the application provides a data processing method and device, electronic equipment and a storage medium, and relates to the technical field of electronic maps. The method comprises the following steps: determining the quantity of LBS data respectively corresponding to at least two grids in a preset geographic area, and taking the grid with the LBS data quantity larger than the current positioning behavior quantity threshold value as a positioning heat grid; and acquiring a hot spot area in the preset geographic area according to the outline range of the positioning heat grid in the preset geographic area. The embodiment of the application can avoid dependence on satellite remote sensing image data.

Description

Data processing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of electronic map technologies, and in particular, to a data processing method, an apparatus, an electronic device, and a storage medium.
Background
The existing technical scheme about hot spot area mining is mainly based on remote sensing images, such as high-definition remote sensing images and night light remote sensing images, and identification is carried out on construction land (impervious surface) based on methods of multiple indexes, threshold cutting, deep learning and the like.
The mining scheme based on the remote sensing image has the following defects:
1) Satellite remote sensing image data is expensive to acquire, particularly for nationwide coverage;
2) The remote sensing image products with strong timeliness and short revisit period are often low in spatial resolution, and the products with high spatial resolution are often weak in timeliness and long in revisit period.
Disclosure of Invention
Embodiments of the present invention provide a data processing method, apparatus, electronic device and storage medium that overcome the above-mentioned problems or at least partially solve the above-mentioned problems.
In a first aspect, a data processing method is provided, where the method includes:
determining the quantity of LBS data respectively corresponding to at least two grids in a preset geographic area, and taking the grid with the LBS data quantity larger than the current positioning behavior quantity threshold value as a positioning heat grid;
and obtaining a hot spot area in the preset geographic area according to the outline range of the positioning heat grid in the preset geographic area.
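The two steps above start from a per-grid count of LBS records. A minimal sketch of the counting and thresholding step is given below; it assumes each LBS record has already been mapped to a grid identifier string, and the function name `find_heat_grids` and the sample records are hypothetical, not part of the application.

```python
from collections import Counter


def find_heat_grids(record_grid_ids, threshold):
    """Return the grid IDs whose LBS record count is larger than threshold."""
    counts = Counter(record_grid_ids)  # grid ID -> number of LBS records
    return {gid for gid, n in counts.items() if n > threshold}


# Hypothetical input: one grid ID per LBS record.
records = ["1_1", "1_1", "1_1", "1_2", "2_2", "2_2"]
print(sorted(find_heat_grids(records, threshold=1)))  # ['1_1', '2_2']
```

The comparison is strict (`>`), matching the wording "larger than the current positioning behavior quantity threshold value".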
In a possible implementation manner, obtaining a hotspot area in a preset geographic area according to a contour range of a positioning heat grid in the preset geographic area includes:
determining at least one connected region, wherein each connected region comprises at least one positioning heat grid, and the distance between any positioning heat grid in one connected region and any positioning heat grid in any other connected region is greater than a preset distance;
and determining the administrative division level corresponding to each connected region, and if the difference between the total area of the connected regions corresponding to an administrative division of a preset level and the built-up area in the preset geographic area meets a preset condition, acquiring the outline range of each connected region and taking the outline range as a hot spot area.
In one possible implementation manner, determining an administrative division level corresponding to each connected region further includes:
if the difference between the total area of the connected regions corresponding to the administrative division of the preset level and the built-up area in the preset geographic area does not meet the preset condition, executing an updating step of the positioning behavior quantity threshold until the difference obtained according to the updated positioning behavior quantity threshold meets the preset condition;
wherein the updating step of the positioning behavior quantity threshold comprises:
obtaining an updated positioning behavior quantity threshold according to the difference, the current positioning behavior quantity threshold, and the maximum value and the minimum value of the LBS data in all grids, and taking the updated positioning behavior quantity threshold as the current positioning behavior quantity threshold.
In a possible implementation manner, obtaining the updated positioning behavior amount threshold according to the difference, the current positioning behavior amount threshold, and the maximum value and the minimum value of the LBS data in all grids includes:
determining the value range of the updated positioning behavior quantity threshold value according to the size relationship between the difference value and a preset value and by combining the current positioning behavior quantity threshold value and the maximum value and the minimum value of LBS data in all grids;
and obtaining the updated positioning behavior quantity threshold according to the value range.
In a possible implementation manner, determining a value range of the updated positioning behavior amount threshold according to a magnitude relationship between the difference and a preset value and by combining the current positioning behavior amount threshold and the maximum and minimum values of the LBS data in all grids includes:
if the difference is larger than a first preset value, taking the current positioning behavior quantity threshold as the right boundary of the value range, and taking the minimum value of the LBS data in all grids as the left boundary of the value range;
if the difference is smaller than the first preset value, taking the current positioning behavior quantity threshold as the left boundary of the value range, and taking the maximum value of the LBS data in all grids as the right boundary of the value range.
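The boundary rules above can be sketched as a single update step. The sketch below is illustrative only: the function name `update_threshold` is hypothetical, the sign convention of the difference is not fixed by this passage and is simply taken as an input, and choosing the midpoint of the range as the new threshold is an assumption in the spirit of binary search rather than something the text prescribes.

```python
def update_threshold(difference, current, lbs_min, lbs_max, preset=0.0):
    """One update step; returns (left, right, new_threshold)."""
    if difference > preset:
        # Current threshold becomes the right boundary, the minimum the left.
        left, right = lbs_min, current
    elif difference < preset:
        # Current threshold becomes the left boundary, the maximum the right.
        left, right = current, lbs_max
    else:
        # The equal case is not covered by the text; keep the threshold.
        return current, current, current
    # Assumption: pick the midpoint of the value range as the new threshold.
    return left, right, (left + right) / 2


print(update_threshold(5.0, current=100, lbs_min=0, lbs_max=1000))
```

A caller would repeat this step, recomputing the connected regions and the area difference with each new threshold, until the difference meets the preset condition.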
In one possible implementation, determining at least one connected region includes:
determining at least one initial connected region through a preset search algorithm, wherein the initial connected region comprises at least one positioning heat grid, and when the initial connected region comprises a plurality of positioning heat grids, each positioning heat grid in the initial connected region is adjacent to at least one other positioning heat grid in that region;
and taking the initial connected regions in which the number of positioning heat grids is smaller than a preset number as debris regions, traversing all the debris regions, and if the distance between any debris region and any positioning heat grid in another initial connected region is smaller than a first preset distance, merging that debris region with the other initial connected region.
In one possible implementation, the search algorithm includes any one of breadth-first search, depth-first search, and union-find (disjoint set).
In one possible implementation, determining at least one initial connected region includes:
for any positioning heat grid which is not marked as accessed, adding the positioning heat grid as a head node into a pre-established queue, and taking the unique identifier of the head node grid as the unique identifier of the current connected region;
if the queue is not empty, taking out a positioning heat grid from the queue, marking the taken-out positioning heat grid as an accessed grid, executing a search operation on the accessed grid, and judging again whether the queue is empty after executing the search operation;
if the queue is empty, reselecting a positioning heat grid which is not marked as accessed, until all the positioning heat grids are marked as accessed;
wherein performing a search operation on the accessed grid comprises:
recording the unique identification of the connected region to which the positioning heat grid belongs as the unique identification of the current connected region, and recording the unique identification of the positioning heat grid into the grid set of the current connected region;
and traversing the adjacent grids of the positioning heat grid, and for any adjacent grid, if the adjacent grid is the positioning heat grid and is not marked as the visited grid, adding the adjacent grid into the queue until the traversal is completed.
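The queue-based procedure above is a breadth-first search over the positioning heat grids. A minimal sketch follows, assuming grids are represented by integer index pairs `(x, y)` with four-neighbour adjacency; the function names are hypothetical. One small deviation from the text is labelled in the code: grids are marked as visited when they are enqueued rather than when they are dequeued, which yields the same regions while avoiding duplicate queue entries.

```python
from collections import deque


def neighbors(cell):
    """Four-neighbourhood of a grid given as an (x, y) index pair."""
    x, y = cell
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def initial_connected_regions(heat_grids):
    """Map each region ID (the head grid of the region) to its set of grids."""
    visited = set()
    regions = {}
    for head in sorted(heat_grids):
        if head in visited:
            continue
        region_id = head              # head grid ID doubles as the region ID
        regions[region_id] = set()
        queue = deque([head])
        visited.add(head)             # deviation: mark on enqueue, not dequeue
        while queue:
            cell = queue.popleft()
            regions[region_id].add(cell)   # record the grid in the region set
            for nb in neighbors(cell):
                if nb in heat_grids and nb not in visited:
                    visited.add(nb)
                    queue.append(nb)
    return regions


heat = {(0, 0), (0, 1), (1, 1), (5, 5)}
print(initial_connected_regions(heat))
```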
In one possible implementation, traversing all the debris regions, and if the distance between any debris region and any positioning heat grid in another initial connected region is smaller than a first preset distance, merging that debris region with the other initial connected region, includes:
for any grid in any debris region, if it is determined that positioning heat grids belonging to other initial connected regions exist within the range of N×N grids centered on the grid, determining the shortest distance between the other initial connected regions and the debris region, where N is a positive integer;
and updating the connected region to which the positioning heat grids in the debris region belong to a target connected region, wherein the target connected region is the initial connected region with the shortest distance to the debris region.
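The window search and reassignment described above can be sketched as follows. This is an illustration under several assumptions that the text does not fix: grids are `(x, y)` index pairs, regions are given as a dictionary from region ID to a set of grids, N is odd, and distance is measured as Chebyshev distance in grid units. The function name `merge_debris` is hypothetical, and the sketch only reassigns region membership.

```python
def merge_debris(regions, min_size, n):
    """Merge regions with fewer than min_size grids into the nearest region
    found inside the n x n window around any of their grids."""
    owner = {cell: rid for rid, cells in regions.items() for cell in cells}
    merged = {rid: set(cells) for rid, cells in regions.items()}
    redirect = {}                     # merged-away region ID -> absorbing ID
    half = n // 2

    def resolve(rid):
        while rid in redirect:
            rid = redirect[rid]
        return rid

    for rid, cells in regions.items():
        if len(cells) >= min_size:
            continue                  # not a debris region
        best = None                   # (distance, candidate region ID)
        for x, y in cells:
            for dx in range(-half, half + 1):
                for dy in range(-half, half + 1):
                    other = owner.get((x + dx, y + dy))
                    if other is None or other == rid:
                        continue
                    cand = (max(abs(dx), abs(dy)), other)  # Chebyshev distance
                    if best is None or cand < best:
                        best = cand
        if best is None:
            continue                  # nothing within the window
        target = resolve(best[1])
        if target != rid:
            merged[target] |= merged.pop(rid)
            redirect[rid] = target
    return merged


regions = {(0, 0): {(0, 0), (0, 1), (1, 0), (1, 1)}, (3, 0): {(3, 0)}}
print(merge_debris(regions, min_size=2, n=5))
```

The `redirect` table handles the case where the nearest region is itself a debris region that has already been merged elsewhere.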
In one possible implementation, determining the shortest distance between the other initial connected regions and the debris region further includes:
determining the shortest path between other initial connected regions and the debris region;
and updating all grids in the shortest path corresponding to the target connected region into positioning heat grids.
In one possible implementation, obtaining the profile range of each connected region includes:
if the number of grids in the connected region is not higher than a second preset value, taking the outline formed by the grids in the region as the outline range of the connected region;
and if the number of grids in the connected region is higher than the second preset value, expanding the contour of each grid in the connected region by a plurality of second preset distances, performing a Union operation of spatial overlay analysis to merge the expanded contours into one contour, and contracting that contour by a third preset distance to serve as the outline range of the connected region.
In a second aspect, there is provided a data processing apparatus comprising:
the positioning heat grid determining module is used for determining the quantity of LBS data respectively corresponding to at least two grids in a preset geographic area, and taking the grid with the LBS data quantity larger than the current positioning behavior quantity threshold value as a positioning heat grid;
and the mining result determining module is used for obtaining the hot spot area in the preset geographic area according to the outline range of the positioning heat grid in the preset geographic area.
In one possible implementation, the mining result determining module includes:
the connected region determining module is used for determining at least one connected region, wherein each connected region comprises at least one positioning heat grid, and the distance between any positioning heat grid in one connected region and any positioning heat grid in any other connected region is greater than a preset distance;
and the region range determining module is used for determining the administrative division level corresponding to each connected region, and if the difference between the total area of the connected regions corresponding to an administrative division of a preset level and the built-up area in the preset geographic area meets a preset condition, acquiring the outline range of each connected region and taking the outline range as a hot spot area.
In one possible implementation, the data processing apparatus further includes:
the threshold updating module is used for executing the updating step of the positioning behavior quantity threshold if the difference between the total area of the connected regions corresponding to the administrative division of the preset level and the built-up area in the preset geographic area does not meet the preset condition, until the difference obtained according to the updated positioning behavior quantity threshold meets the preset condition;
wherein, the updating step of the positioning behavior quantity threshold value comprises the following steps:
and obtaining an updated positioning behavior quantity threshold according to the difference value, the current positioning behavior quantity threshold and the maximum value and the minimum value of the LBS data in all grids, and taking the updated positioning behavior quantity threshold as the current positioning behavior quantity threshold.
In one possible implementation, the threshold updating module includes:
a threshold value range determining submodule, configured to determine a value range of the updated positioning behavior amount threshold according to a magnitude relationship between the difference value and a preset value, in combination with the current positioning behavior amount threshold, and the maximum value and the minimum value of the LBS data in all grids;
and the threshold value determining submodule is used for obtaining the updated positioning behavior quantity threshold according to the value range.
In one possible implementation, the threshold value range determining sub-module includes:
a first boundary determining unit, configured to, if the difference is greater than a first preset value, take the current positioning behavior quantity threshold as the right boundary of the value range, and take the minimum value of the LBS data in all grids as the left boundary of the value range;
and the second boundary determining unit is used for taking the current positioning behavior amount threshold as the left boundary of the value range and taking the maximum value of the LBS data in all grids as the right boundary of the value range if the difference value is smaller than the first preset value.
In one possible implementation, the connected region determining module includes:
the initial connected search submodule is used for determining at least one initial connected region through a preset search algorithm, wherein the initial connected region comprises at least one positioning heat grid, and when the initial connected region comprises a plurality of positioning heat grids, each positioning heat grid in the initial connected region is adjacent to at least one other positioning heat grid in that region;
and the debris merging submodule is used for taking the initial connected regions in which the number of positioning heat grids is smaller than a preset number as debris regions, traversing all the debris regions, and if the distance between any debris region and any positioning heat grid in another initial connected region is smaller than a first preset distance, merging that debris region with the other initial connected region.
In one possible implementation, the search algorithm includes any one of breadth-first search, depth-first search, and union-find (disjoint set).
In one possible implementation, the initial connectivity search sub-module includes:
the initialization unit is used for adding any positioning heat grid which is not marked as accessed into a pre-established queue as a head node, and taking the unique identifier of the head node grid as the unique identifier of the current connected region;
the searching unit is used for taking out a positioning heat grid from the queue if the queue is not empty, marking the taken-out positioning heat grid as an accessed grid, executing a search operation on the accessed grid, and judging again whether the queue is empty after executing the search operation;
the reselection unit is used for reselecting a positioning heat grid which is not marked as accessed if the queue is empty, until all the positioning heat grids are marked as accessed;
wherein, search unit includes:
the identification recording unit is used for recording the unique identification of the connected region to which the positioning heat grid belongs as the unique identification of the current connected region and recording the unique identification of the positioning heat grid into the grid set of the current connected region;
and the traversing unit is used for traversing the adjacent grids of the positioning heat grid, and for any adjacent grid, if the adjacent grid is the positioning heat grid and is not marked as the visited grid, the adjacent grid is added into the queue until the traversing is finished.
In one possible implementation, the debris merging submodule includes:
the shortest distance determining unit is used for, for any grid in any debris region, determining the shortest distance between other initial connected regions and the debris region if it is determined that positioning heat grids belonging to the other initial connected regions exist within the range of N×N grids centered on the grid, where N is a positive integer;
and the region updating unit is used for updating the connected region to which the positioning heat grids in the debris region belong to a target connected region, wherein the target connected region is the initial connected region with the shortest distance to the debris region.
In one possible implementation, the debris merging submodule further includes:
the shortest path determining unit is used for determining the shortest paths between other initial connected areas and the debris areas;
and a heat updating unit for updating the grids in the shortest path corresponding to the target connected region to positioning heat grids.
In one possible implementation, the region range determining module includes:
the first contour determining unit is used for taking the outline formed by the grids in the region as the outline range of the connected region if the number of grids in the connected region is not higher than a second preset value;
and the second contour determining unit is used for, if the number of grids in the connected region is higher than the second preset value, expanding the contour of each grid in the connected region by a plurality of second preset distances, then performing a Union operation of spatial overlay analysis to merge the expanded contours into one contour, and contracting that contour by a third preset distance to serve as the outline range of the connected region.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method provided in the first aspect.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the steps of the method as provided in the first aspect.
In a fifth aspect, an embodiment of the present invention provides a computer program, where the computer program includes computer instructions stored in a computer-readable storage medium, and when a processor of a computer device reads the computer instructions from the computer-readable storage medium, the processor executes the computer instructions, so that the computer device executes the steps of implementing the method provided in the first aspect.
According to the data processing method and device, electronic equipment and storage medium provided by the embodiments of the present application, the quantity of LBS data of each grid in the preset geographic area is determined. Because LBS data is generated based on entity information, the quantity of LBS data reflects how frequently entity information occurs at the corresponding geographic position. Grids whose LBS data quantity is larger than the current positioning behavior quantity threshold are taken as positioning heat grids, that is, as areas where entity information is more prominent, namely hot spot areas. The embodiments of the present application thus rely on LBS data, which can be updated in real time, and avoid dependence on satellite remote sensing image data.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic diagram of an implementation environment of a data processing method according to an embodiment of the present application;
Fig. 2 is a schematic flow chart of a data processing method according to an embodiment of the present application;
Fig. 3 is a schematic diagram of a connected region according to an embodiment of the present application;
Fig. 4 is a flowchart illustrating a data processing method according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of an updating method of a positioning behavior quantity threshold based on binary search according to an embodiment of the present application;
Fig. 6 is a flowchart illustrating a process of searching for an initial connected region according to an embodiment of the present application;
Fig. 7 is a schematic diagram of various grid ranges according to an embodiment of the present application;
Fig. 8 is a schematic diagram of determining the shortest distance between other initial connected regions and the debris region according to an embodiment of the present application;
Fig. 9 is a schematic flow chart illustrating merging debris regions with other initial connected regions according to an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative and are only for the purpose of explaining the present application and are not to be construed as limiting the present invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. As used herein, the term "and/or" includes all or any element and all combinations of one or more of the associated listed items.
To make the objects, technical solutions and advantages of the present application more clear, the following detailed description of the embodiments of the present application will be made with reference to the accompanying drawings.
The terms referred to in this application will first be introduced and explained:
Location Based Services (LBS): a service that uses various positioning technologies to obtain the current location of a positioning device and provides information resources and basic services to the positioning device through the mobile internet. A user first determines his or her spatial position using a positioning technology, and then acquires resources and information related to that position through the mobile internet. LBS integrates various information technologies such as mobile communication, the internet, spatial positioning, location information and big data, and uses a mobile internet service platform to update and exchange data, so that a user can obtain corresponding services through spatial positioning.
Built-up area: generally refers to the built-up area of a city, namely an area within a city's administrative area that has actually been developed and constructed on a large scale and is basically equipped with municipal utilities and public facilities; in a broad sense, the built-up area also includes rural built-up areas, that is, construction land (homesteads) in rural areas that mainly serves residential functions. Urban and rural built-up areas can be understood as similar in spatial range to the hot spot areas described above;
Union: a spatial overlay analysis operation that overlays the input polygons; the output layer contains all polygons of each original input layer.
The application provides a data processing method, a data processing device, an electronic device and a computer-readable storage medium, which aim to solve the above technical problems in the prior art.
The following describes the technical solutions of the present application and how to solve the above technical problems with specific embodiments. The following several specific embodiments may be combined with each other, and details of the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present application will be described below with reference to the accompanying drawings.
Referring to Fig. 1, which schematically illustrates an implementation environment of a data processing method according to an embodiment of the present application: terminal devices 10, 20, and 30 request an LBS service from a location server 40. After providing the LBS service to each terminal, the location server 40 stores the LBS data in a location log 41, and privacy protection processing is performed on the LBS data in the location log. The location server 40 sends the location log for a certain time period to an area mining server 50, and the area mining server 50 determines a hot spot area in a preset area from the location log 41 according to the data processing method of the embodiment of the present application.
It should be noted that the terminal device may be any electronic device with a positioning function. Such as a cell phone, laptop, tablet, drone, smart watch, and so forth.
The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a CDN (Content Delivery Network), a big data and an artificial intelligence platform, which is not limited in this embodiment of the present application.
The execution method of the server in the embodiment of the present application may be implemented in the form of cloud computing, which is a computing mode that distributes computing tasks over a resource pool formed by a large number of computers, so that various application systems can obtain computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". Resources in the "cloud" appear to the user to be infinitely expandable, and can be acquired at any time, used on demand, expanded at any time, and paid for according to use.
As a basic capability provider of cloud computing, a cloud computing resource pool (a cloud platform, generally referred to as an IaaS (Infrastructure as a Service) platform) is established, and multiple types of virtual resources are deployed in the resource pool for external clients to select and use.
According to the logic function division, a PaaS (Platform as a Service) layer can be deployed on an IaaS (Infrastructure as a Service) layer, and a SaaS (Software as a Service) layer can be deployed on the PaaS layer; SaaS can also be deployed directly on IaaS. PaaS is a platform on which software runs, such as a database or a web container. SaaS is a variety of business software, such as web portals and bulk SMS messaging. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.
Referring to fig. 2, a schematic flow chart of a data processing method according to an embodiment of the present application is exemplarily shown, and as shown, the method includes:
S101, determining the quantity of LBS data corresponding to at least two grids in a preset geographic area respectively, and taking the grids with the quantity of the LBS data larger than a current positioning behavior quantity threshold value as positioning heat grids.
The embodiment of the application can divide the landmark space of the preset geographic area based on the regular grids, and count various spatial elements based on the grids. The regular grid can be based on a geographic coordinate system or a projected planar coordinate system.
In the embodiment of the application, the following regular grid subdivision mode is adopted: if the grid division is performed based on the geographic coordinate system (latitude and longitude) and the size of the grid cell is d, for example, d =0.002, the following grid division rule may be constructed:
1) For an input longitude and latitude, divide each by d and round down to obtain two integers, which serve as the indexes of the grid. Joining the two indexes in order with an underscore gives the unique identifier (GridID) of the grid, which is stored as a character string.
For example: for the input longitude and latitude (113.41872, 35.19727), dividing each by 0.002 and rounding down gives grid index 1 (longitude) of 56709 and grid index 2 (latitude) of 17598, so the unique identifier (GridID) of the grid containing this coordinate is 56709_17598.
2) From the unique identifier (GridID) of a grid, its two index numbers can be recovered, and the four corner points of the grid can be derived from the grid cell size d. For example, for the GridID 56709_17598 above, multiplying each index by d gives the longitude and latitude of the lower-left corner of the grid, (113.418, 35.196); the other three corner points follow from the side length d.
3) From the unique identifier (GridID) of a grid, the GridIDs of its four adjacent grids can be obtained as follows: write the GridID as x_y, where x and y are its two indexes; the grid adjacent on the upper side is then x_{y+1}, the grid adjacent on the left side is {x-1}_y, and so on. For example, for GridID 56709_17598, the GridIDs of the four adjacent grids are 56710_17598, 56708_17598, 56709_17597 and 56709_17599.
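The three rules above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the function names `grid_id`, `grid_corners` and `neighbor_ids` are hypothetical, and the default cell size is the example value d = 0.002 from the text.

```python
import math

D = 0.002  # grid cell size in degrees (the example value from the text)

def grid_id(lon, lat, d=D):
    """Rule 1: floor(lon / d) and floor(lat / d), joined with an underscore."""
    return f"{math.floor(lon / d)}_{math.floor(lat / d)}"

def grid_corners(gid, d=D):
    """Rule 2: recover both indexes and derive the four corners of the cell."""
    x, y = (int(part) for part in gid.split("_"))
    left, bottom = x * d, y * d  # lower-left corner = index * d
    return [(left, bottom), (left + d, bottom),
            (left + d, bottom + d), (left, bottom + d)]

def neighbor_ids(gid):
    """Rule 3: GridIDs of the right, left, lower and upper neighbors."""
    x, y = (int(part) for part in gid.split("_"))
    return [f"{x + 1}_{y}", f"{x - 1}_{y}", f"{x}_{y - 1}", f"{x}_{y + 1}"]
```

Because the indexes are computed in floating point, a coordinate lying almost exactly on a cell boundary could land in either neighboring cell; a production system would need to decide how to handle that case.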
It is emphasized that the positioning data obtained in the related art is usually the positioning data attached to posts that users publish on social networking sites. For example, when users publish their own posts through software such as a friend circle or a microblog, they often actively add positioning information to the posts. However, the amount of public positioning data that can be obtained by crawling is often small; the embodiment of the application addresses this problem by using LBS data.
The format of the LBS data in the embodiment of the present application may be represented as a triple of (x, y, t), where x is the longitude of the geographic coordinate of the location request, y is the latitude of the geographic coordinate of the location request, and t is the timestamp of the location request.
The embodiment of the application also needs to execute privacy protection processing before using the LBS data, and does not need to use any positioning information related to individual identification in the whole processing flow.
After the LBS data is obtained, the grid containing each LBS data item can be determined from its geographic position, and the number of LBS data items in each grid can then be counted. Combined with the current positioning behavior quantity threshold, this determines the positioning heat grids, namely the grids whose LBS data quantity is greater than the current positioning behavior quantity threshold.
The amount of LBS data in the embodiment of the present application may be an average amount over a certain time period scale, for example, a daily average LBS data amount, a weekly average LBS data amount, a monthly average LBS data amount, etc., may also be a total amount counted from a certain calculation date, and may also be a median within a plurality of time period scales, such as a median of a multi-day LBS data amount, etc.
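As a minimal sketch of the counting and screening just described, the following Python code counts (x, y, t) triples per grid and keeps the grids whose daily average count exceeds the threshold. The function names and the `days` parameter are assumptions made for this illustration; the daily average is only one of the statistics the text allows.

```python
import math
from collections import Counter

def count_per_grid(points, d=0.002):
    """Count LBS triples (lon, lat, timestamp) per GridID."""
    counts = Counter()
    for lon, lat, _timestamp in points:
        counts[f"{math.floor(lon / d)}_{math.floor(lat / d)}"] += 1
    return counts

def heat_grids(counts, threshold, days=1):
    """Return the GridIDs whose daily average count is greater than the threshold."""
    return {gid for gid, n in counts.items() if n / days > threshold}
```

For example, `heat_grids(count_per_grid(points), threshold=1)` returns the set of positioning heat grids for one day of data.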
S102, obtaining a hot spot area in the preset geographic area according to the outline range of the positioning heat grid in the preset geographic area.
Specifically, in the embodiment of the present application, each positioning heat grid may be used as a hot spot region, and the contour range of the positioning heat grid is also referred to as the contour range of the hot spot region.
The quantity of LBS data in each grid of the preset geographic area is determined. Because LBS data is generated from entity information (entities in the embodiment of the application can be people and can also include animals), the quantity of LBS data reflects how frequently entity information occurs at the corresponding geographic position. Taking the grids whose LBS data quantity is greater than the current positioning behavior quantity threshold as positioning heat grids therefore gives a preliminary screening of the areas with significant entity information. Verification shows that, although no satellite remote sensing data is used, the obtained contour range fits closely with the built-up area range on satellite remote sensing images, which indicates that the hot spot area mining result reflects the real situation with high accuracy.
On the basis of the foregoing embodiments, as an optional embodiment, obtaining a hotspot area in the preset geographic area according to the contour range of the positioning heat grid in the preset geographic area includes:
and determining at least one connected region, wherein each connected region comprises at least one positioning heat grid, and the distance between any positioning heat grid in a connected region and any positioning heat grid in any other connected region is greater than a preset distance.
After the positioning heat grids in the preset geographic area are determined, the connected regions in the preset geographic area can be further determined. Each connected region in the embodiment of the application includes at least one positioning heat grid, and the distance between any positioning heat grid in a connected region and any positioning heat grid in any other connected region is greater than a preset distance. That is, when a connected region contains more than one positioning heat grid, each positioning heat grid is no more than the preset distance away from at least one other positioning heat grid in the same connected region.
Referring to fig. 3, which exemplarily shows a schematic diagram of connected regions in the present embodiment, the preset geographic area is divided into 6 × 6 grids and contains 2 connected regions, connected region 1 and connected region 2. The preset distance in the drawing is 2 grid lengths, so two positioning heat grids that are adjacent or separated by only one grid length belong to the same connected region. All the positioning heat grids in connected region 1 are adjacent to each other. Connected region 2 contains one positioning heat grid (the grid at the lower right corner in the drawing) that is not adjacent to any other positioning heat grid but is only 1 grid length (less than the preset distance of 2 grid lengths) away from another positioning heat grid in connected region 2.
And determining the administrative division level corresponding to each connected region, and if the difference between the total area of the connected regions corresponding to administrative divisions of a preset level and the built-up area in the preset geographic area meets a preset condition, acquiring the contour range of each connected region and using it as a hot spot area.
It should be noted that, after the connected regions are obtained, the embodiment of the present application labels each connected region with an administrative division level. This serves two purposes: 1) enriching the information of the mined connected regions; 2) assisting the search for the best screening threshold in the next module.
The embodiment of the application can acquire the geographic positions of the government seats of administrative divisions at all levels in the preset geographic area, and then determine the connected region in which each government seat is located. If a connected region contains government seats of only one level, the administrative division level of the connected region is that level. If a connected region contains government seats of multiple levels, the administrative division level of the connected region is the highest of those levels.
For example, if a connected region contains only the seat of a township government, the administrative division level of the connected region is the township level; if a connected region contains both county-level and city-level government seats, the administrative division level of the connected region is the city level.
Specifically, the labeling of the connected region at the administrative division level may include the following steps:
1) Initialize a hash table of key-value pairs AdminLevel, in which the key stores the unique identifier (RegionID) of a connected region and the value stores the highest administrative level among the government seats in that connected region;
2) Traverse the government seat point data of administrative divisions at all levels; for the coordinate (x, y) of a government seat, obtain the unique identifier v of the grid in which the seat is located;
3) If v is not in the positioning heat grid set S, move on to the next government seat point;
4) If v is in the positioning heat grid set S, find the unique identifier of the connected region to which v belongs, namely F[v], and update the administrative division level information of F[v] according to the following logic:
a) If F[v] is not among the keys of the hash table AdminLevel, set AdminLevel[F[v]] directly to the current administrative division level;
b) If F[v] is among the keys of the hash table AdminLevel and the current administrative division level is higher than AdminLevel[F[v]], set AdminLevel[F[v]] to the current administrative division level. Administrative division levels are ordered, from highest to lowest, as prefecture level, county level and township level;
5) Once the government seat point data of all levels has been traversed, all connected regions can be classified according to the hash table AdminLevel: connected regions containing a prefecture-level government seat, connected regions containing a county-level government seat, connected regions containing a township-level government seat, and connected regions without administrative division level information.
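The labeling steps above can be sketched as follows. This is an illustrative sketch only: the level names `prefecture`, `county` and `township` are this sketch's labels for the three ordered levels named in the text, `LEVEL_RANK` is a hypothetical helper table, and `grid_of` is a caller-supplied function (an assumption) that maps a coordinate to its GridID.

```python
# Hypothetical ranking of the three levels; a larger number means a higher level.
LEVEL_RANK = {"township": 1, "county": 2, "prefecture": 3}

def label_admin_levels(seats, heat_set, F, grid_of):
    """Return AdminLevel: RegionID -> highest level of any government seat inside.

    seats    : iterable of (x, y, level) for each government seat
    heat_set : the positioning heat grid set S (GridIDs)
    F        : dict mapping GridID -> RegionID of its connected region
    grid_of  : function (x, y) -> GridID
    """
    admin_level = {}
    for x, y, level in seats:
        v = grid_of(x, y)
        if v not in heat_set:      # step 3: seat is outside every heat grid
            continue
        region = F[v]              # step 4: region that contains the seat
        current = admin_level.get(region)
        if current is None or LEVEL_RANK[level] > LEVEL_RANK[current]:
            admin_level[region] = level
    return admin_level
```

Connected regions that never appear as a key in the returned dictionary are the ones without administrative division level information.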
It can be known from the above steps that the positioning behavior quantity threshold is an important parameter affecting the mining result, and the higher the value of the positioning behavior quantity threshold is, the fewer the number of grids meeting the threshold is, and the smaller the area of the screened connected region is.
In the embodiment of the present application, the built-up area published in the China City Statistical Yearbook can be used as the true target value: the sum of the areas of the connected regions nationwide that are labeled as containing a prefecture-level or county-level government seat should be close to the built-up area A_target in the latest China City Statistical Yearbook. The optimal value of the positioning behavior quantity threshold can be obtained on this basis.
And if the difference value between the total area of the connected areas corresponding to the preset administrative division and the built area in the preset geographic area meets the preset condition, acquiring the outline range of each connected area and taking the outline range as the mining result of the hot spot area.
On the basis of the foregoing embodiments, as an optional embodiment, after determining the administrative division level corresponding to each connected region, the method further includes:
if the difference between the total area of the connected regions corresponding to administrative divisions of the preset level and the built-up area in the preset geographic area does not meet the preset condition, executing an updating step of the positioning behavior quantity threshold until the difference obtained with the updated positioning behavior quantity threshold meets the preset condition.
Referring to fig. 4, a schematic flow chart of a data processing method according to an embodiment of the present application is exemplarily shown, and as shown, the method includes:
S201, acquiring LBS data;
S202, performing grid subdivision on a preset geographic area to obtain at least two grids;
S203, determining the quantity of LBS data corresponding to each grid;
S204, determining a current positioning behavior quantity threshold;
S205, taking the grids whose LBS data quantity is greater than the current positioning behavior quantity threshold as positioning heat grids;
S206, determining at least one connected region;
S207, determining the administrative division level corresponding to each connected region;
S208, determining the difference between the total area of the connected regions corresponding to administrative divisions of the preset level and the built-up area in the preset geographic area;
S209, judging whether the difference meets a preset condition; if not, executing step S210, and if so, executing step S211;
S210, obtaining an updated positioning behavior quantity threshold according to the difference, the current positioning behavior quantity threshold, and the maximum and minimum LBS data quantity over all grids, taking the updated positioning behavior quantity threshold as the current positioning behavior quantity threshold, and returning to step S204;
S211, acquiring the contour range of each connected region and taking it as the hot spot area mining result.
According to the embodiment of the application, after the difference is obtained, whether it meets the preset condition is judged. If it does not, the positioning behavior quantity threshold is adjusted, and the connected regions and the total area of the connected regions corresponding to administrative divisions of the preset level are determined again with the adjusted threshold. This adjustment is repeated until the difference between the total area of the connected regions corresponding to administrative divisions of the preset level and the built-up area in the preset geographic area meets the preset condition.
The method for obtaining the updated positioning behavior amount threshold according to the difference value, the current positioning behavior amount threshold and the maximum value and the minimum value of the LBS data in all grids comprises the following steps:
S301, determining the value range of the updated positioning behavior quantity threshold according to the relationship between the difference and a preset value, in combination with the current positioning behavior quantity threshold and the maximum and minimum LBS data quantity over all grids;
Specifically, because a higher threshold yields a smaller total area, if the difference (the total area of the connected regions minus the built-up area) is greater than a first preset value, the current positioning behavior quantity threshold is taken as the left boundary of the value range, and the maximum LBS data quantity over all grids is taken as the right boundary of the value range;
if the difference is smaller than the first preset value, the current positioning behavior quantity threshold is taken as the right boundary of the value range, and the minimum LBS data quantity over all grids is taken as the left boundary of the value range.
S302, obtaining the updated positioning behavior quantity threshold according to the value range.
In the embodiment of the present application, any value in the value range may be used as the positioning behavior quantity threshold; for example, the midpoint of the value range may be used.
Referring to fig. 5, a flowchart of a method for updating the positioning behavior quantity threshold based on binary search according to an embodiment of the present application is exemplarily shown:
1) Denote the following variables: the left boundary T_left of the value range, the right boundary T_right of the value range, the current positioning behavior quantity threshold T, and the current number of search iterations i. The search also depends on two user-defined search parameters: the maximum number of search iterations maxI (e.g., maxI = 10) and the error control range eps (e.g., eps = 1%). Then:
a) Let PV be an array storing the positioning behavior quantity of every grid, and initialize T_left = min(PV) and T_right = max(PV), that is, initialize the left and right boundaries to the minimum and maximum daily positioning behavior quantity over all grids;
b) Initialize T = T_0, where T_0 is an initial value that can be set empirically;
c) Initialize i = 0;
2) Input T as the threshold parameter, execute steps S205 to S208 in sequence, and calculate the total area A of the connected regions labeled as prefecture-level and the connected regions labeled as county-level in the output;
3) Compare A with A_target (namely the built-up area in the China City Statistical Yearbook):
a) If the relative error between A and A_target is less than the error control range, that is, |A - A_target| / A_target < eps, return T as the best threshold and stop updating the positioning behavior quantity threshold;
b) Otherwise, if A > A_target, the threshold is too low (a higher threshold yields a smaller area), so update T_left = T;
c) Otherwise, if A < A_target, the threshold is too high, so update T_right = T;
d) Update T = 0.5 × (T_left + T_right) and i = i + 1. If both T_left < T_right and i < maxI hold, repeat step 2); otherwise, return the current positioning behavior quantity threshold T as the best threshold and stop searching.
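A binary search of this kind can be sketched as below. The sketch is hedged in two ways. First, `area_for` is a hypothetical callback standing in for steps S205 to S208 (threshold in, labeled connected-region area out). Second, the sketch relies on the relationship stated earlier in the text, that a higher threshold yields a smaller area, so it raises the lower bound when A exceeds A_target and lowers the upper bound otherwise.

```python
def search_threshold(area_for, pv, a_target, t0, max_iter=10, eps=0.01):
    """Binary search for the positioning behavior quantity threshold.

    area_for : function T -> area A (assumed non-increasing in T)
    pv       : positioning behavior quantity of every grid
    a_target : target built-up area
    t0       : empirically chosen initial threshold
    """
    t_left, t_right = min(pv), max(pv)
    t, i = t0, 0
    while True:
        a = area_for(t)
        if abs(a - a_target) / a_target < eps:
            return t                 # relative error within eps
        if a > a_target:
            t_left = t               # area too large: threshold must rise
        else:
            t_right = t              # area too small: threshold must fall
        t = 0.5 * (t_left + t_right)
        i += 1
        if not (t_left < t_right and i < max_iter):
            return t                 # range exhausted or iteration cap reached
```

When the iteration cap is reached first, the returned threshold is only the best midpoint found so far, so a caller may want to check the resulting relative error before using it.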
On the basis of the foregoing embodiments, as an optional embodiment, the determining at least one connected region includes:
S401, determining at least one initial connected region through a preset search algorithm, wherein each initial connected region comprises at least one positioning heat grid, and when an initial connected region comprises a plurality of positioning heat grids, every positioning heat grid in the region is adjacent to at least one other positioning heat grid in the region.
The search algorithm of the embodiment of the present application may be any one of breadth-first search, depth-first search and union-find; an initial connected region search method based on breadth-first search (BFS) is described below.
S501, for any positioning heat grid not marked as visited, adding the positioning heat grid to a pre-established queue as the head node, and taking the unique identifier of the head node grid as the unique identifier of the current connected region.
Performing a search operation on a visited grid comprises:
S501a, recording the unique identifier of the connected region to which the positioning heat grid belongs as the unique identifier of the current connected region, and recording the unique identifier of the positioning heat grid in the grid set of the current connected region;
S501b, traversing the grids adjacent to the positioning heat grid, and adding each adjacent grid that is a positioning heat grid and is not marked as visited to the queue, until the traversal is completed.
S502, if the queue is not empty, taking a positioning heat grid out of the queue, marking it as visited, executing the search operation on it, and then checking again whether the queue is empty; if the queue is empty, selecting another positioning heat grid not marked as visited, until all positioning heat grids are marked as visited.
Referring to fig. 6, which schematically illustrates a flowchart of searching for an initial connected component according to an embodiment of the present application, as shown in the drawing, the method includes:
1) The following data structure is initialized:
Initialize a hash table of key-value pairs V, where V[k] records whether the grid with unique identifier (GridID) k has been visited during the search: V[k] = 1 if it has, otherwise V[k] = 0;
Initialize a hash table of key-value pairs F, where F[k] records the unique identifier (RegionID) of the connected region to which the grid with GridID k belongs;
Initialize a hash table of key-value pairs R, where R[k] stores a list recording the GridIDs of the grids contained in the connected region whose unique identifier (RegionID) is k;
2) Randomly select an unvisited positioning heat grid v from the positioning heat grid set S as the head node grid, take the GridID of the head node grid as the unique identifier (RegionID) of the current connected region, initialize a first-in first-out queue Q, and add the head node grid v to Q;
3) If the queue Q is not empty, take a grid a out of Q and update:
V[a] = 1;
F[a] = v (v is the current RegionID);
add a to R[v] (v is the current RegionID);
4) Traverse the GridIDs of the four grids adjacent to grid a; for an adjacent grid n, if n is in the set S and V[n] = 0, add n to the queue Q. After the traversal is finished, return to step 3) if the queue Q is not empty; otherwise, go to step 5);
5) At this point, all grids in the connected region containing the head node grid have been found. If the set S still has unvisited positioning heat grids, return to step 2); otherwise, all connected regions have been found;
6) Store the search results F and R, and end the process.
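The breadth-first search above can be sketched as follows. The sketch makes three simplifying assumptions that are not in the text: grids are represented as (x, y) index tuples rather than GridID strings, head nodes are taken in sorted order rather than at random so that the result is reproducible, and a grid is marked as visited when it is enqueued rather than when it is dequeued, which prevents the same grid from entering the queue twice.

```python
from collections import deque

def find_regions(heat_set):
    """Group positioning heat grids into 4-connected initial regions.

    heat_set : set of (x, y) grid index tuples (the set S)
    Returns (F, R): F maps grid -> RegionID, R maps RegionID -> list of grids.
    The RegionID of a region is the head node grid it was grown from.
    """
    F, R = {}, {}
    visited = set()
    for head in sorted(heat_set):
        if head in visited:
            continue
        visited.add(head)
        R[head] = []
        queue = deque([head])
        while queue:
            a = queue.popleft()
            F[a] = head
            R[head].append(a)
            x, y = a
            for n in ((x + 1, y), (x - 1, y), (x, y - 1), (x, y + 1)):
                if n in heat_set and n not in visited:
                    visited.add(n)   # mark on enqueue to avoid duplicates
                    queue.append(n)
    return F, R
```

The `visited` set plays the role of the hash table V, and the returned dictionaries correspond to F and R in the steps above.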
S402, taking the initial connected regions whose number of positioning heat grids is smaller than a preset number (for example, 5) as debris regions, and traversing all the debris regions; if the distance between a debris region and another initial connected region is smaller than a first preset distance, merging the debris region with that initial connected region.
S402a, for any grid in a debris region, if positioning heat grids belonging to other initial connected regions exist within the range of N × N grids centered on the grid, determining the shortest distance between each of those other initial connected regions and the debris region; N is a positive integer.
Referring to fig. 7, which schematically illustrates various grid ranges of the embodiment of the present application, as shown in the figure, 8 grids are included in a range of 1 × 1 grid around the grid 101 as a center, and 24 grids are included in a range of 2 × 2 grid around the grid 101 as a center, so that if the coordinates of a grid are (x, y), the coordinates of 4 boundary points of a range of N × N grid around the grid as a center are: (x + N, y + N), (x-N, y-N), (x + N, y-N), and (x-N, y + N).
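As a small illustration of the range just described, the following sketch enumerates the grids within N grids of a center grid in both directions; the function name `cells_in_range` is hypothetical. The range spans 2N + 1 grids per side, so it contains (2N + 1)² − 1 grids besides the center, which matches the counts of 8 and 24 given for N = 1 and N = 2.

```python
def cells_in_range(x, y, n):
    """Grids around (x, y) whose index offsets are at most n in each direction."""
    return [(x + dx, y + dy)
            for dx in range(-n, n + 1)
            for dy in range(-n, n + 1)
            if (dx, dy) != (0, 0)]
```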
Referring to fig. 8, a schematic diagram of determining the shortest distance between other initial connected areas and a debris area according to an embodiment of the present application is exemplarily shown, as shown in the figure, the debris area a includes only one grid (and also a location heat grid) a, and there is one initial connected area B around the debris area.
When other initial connected regions are searched for in the range of 1 × 1 grid centered on grid a, none of the 8 grids around grid a (grids 1 to 8) is a positioning heat grid, so no other initial connected region is found. When the search is widened to the range of 2 × 2 grids centered on grid a, 1 positioning heat grid b is found among the 24 grids around grid a; it belongs to the initial connected region B, which also includes positioning heat grids c and d.
The distances between the grid a and the grids b to d are respectively calculated, and it can be known that the grid a needs to pass through 1 grid to the grid b, the grid a needs to pass through 2 grids to the grid c, and the grid a needs to pass through 3 grids to the grid d.
It can be seen that the shortest distance between the debris region and the other initially connected regions is 1.
When there are a plurality of other initially connected regions around the debris region, the shortest distance between the debris region and each of the other initially connected regions can be obtained using the method described above.
In the embodiment of the application, when the shortest distance between the debris region and other initial connected regions is determined, search algorithms such as breadth-first search (BFS) and Dijkstra's algorithm can be adopted.
S402b, updating the connected region of the positioning heat grids in the debris region to a target connected region, wherein the target connected region is the initial connected region with the shortest distance to the debris region.
On the basis of the above embodiments, as an optional embodiment, determining the shortest distance between the other initial connected regions and the debris region further includes:
determining the shortest path between the other initial connected regions and the debris region;
and updating all grids on the shortest path to the target connected region into positioning heat grids.
Taking fig. 8 as an example, since the shortest path from the debris area where the grid a is located to the target connected area passes through the grid e, the grid e may be updated to be the location heat grid.
Referring to fig. 9, a schematic flow chart illustrating merging of a debris region with other initial communication regions according to an embodiment of the present application is shown, and includes:
1) Traverse each initial connected region r and obtain the number of grids it contains from R[r]. When the number of grids in R[r] is less than a threshold T_N, the initial connected region is a debris region and the expanded search of step 2) is carried out, after initializing the following data structures:
Initialize a temporary hash table of key-value pairs Neighbors, in which each key stores the RegionID of another connected region near the debris region and each value stores the GridIDs of the grids within the debris region that are near that RegionID;
Initialize a temporary hash table of key-value pairs Dist, in which each key stores the RegionID of another connected region near the debris region and each value stores the shortest distance and shortest path from the debris region to that RegionID;
2) Traverse each grid in R[r]. For a grid v, search for positioning heat grids within the range of N × N grids centered on v. If another grid v' in the positioning heat grid set S is found, and the RegionID of the initial connected region to which v' belongs differs from r, namely F[v'] != r (where "!=" means "not equal to"), another connected region exists near v. Then:
if F[v'] is not among the keys of the temporary hash table Neighbors, initialize Neighbors[F[v']] as a set; then add v to Neighbors[F[v']];
3) If the temporary hash table Neighbors is empty, the current debris region has nothing to merge with; return to step 1) to process the next debris region;
4) If the temporary hash table Neighbors is not empty, then for each RegionID in Neighbors, traverse each GridID in Neighbors[RegionID], use a shortest path search algorithm to find the shortest path length (measured in number of grids) from the grid with that GridID to the region RegionID together with the specific path (stored as a list of the GridIDs of the grids on the path), and update them into the temporary hash table Dist as follows:
a) If the RegionID is not among the keys of Dist, store the current shortest path length and specific path in Dist[RegionID];
b) Otherwise, if the current shortest path length is smaller than the one already stored in Dist[RegionID], update Dist[RegionID] with the current shortest path length and specific path;
5) Sort the temporary hash table Dist to obtain the RegionID with the smallest shortest path length (if several share the smallest length, randomly select one of them), obtain the corresponding shortest path, and complete the merge as follows:
a) For each grid k in the debris region R[r], modify its connected region information in two steps: (1) update F[k] = RegionID, that is, change the connected region identifier of each grid in the debris region to RegionID; (2) add k to R[RegionID];
b) For each grid k on the shortest path in Dist[RegionID], modify its connected region information in three steps: (1) update F[k] = RegionID, that is, change the connected region identifier of each grid on the shortest path to RegionID; (2) add k to R[RegionID]; (3) add k to the positioning heat grid set S;
c) Clear R[r] and delete r from the keys of R;
6) After the current debris region is processed, return to step 1); the algorithm ends when no debris region remains or no mergeable connected region can be found for any remaining debris region;
7) Save the new positioning heat grid set S, the hash table F and the hash table R.
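The merge can be sketched in a simplified form. Instead of building the Neighbors and Dist tables, this sketch swaps in a single multi-source breadth-first search: it starts from all grids of a debris region at once, walks over 4-adjacent grids inside the N × N ranges around those grids, and stops at the first positioning heat grid of another region, which is therefore a nearest one. Grids are (x, y) index tuples, ties are broken by search order rather than at random, and each debris region is examined once; all of these are assumptions of the sketch.

```python
from collections import deque

def merge_debris(heat_set, F, R, min_size=5, n=2):
    """Merge each region with fewer than min_size grids into its nearest region.

    heat_set, F and R are modified in place and also returned.
    """
    for rid in [r for r in R if len(R[r]) < min_size]:
        if len(R[rid]) >= min_size:      # grew through an earlier merge
            continue
        cells = list(R[rid])
        allowed = {(x + dx, y + dy) for x, y in cells
                   for dx in range(-n, n + 1) for dy in range(-n, n + 1)}
        parent = {c: None for c in cells}  # BFS tree; sources have no parent
        queue = deque(cells)
        hit = None
        while queue and hit is None:
            cur = queue.popleft()
            x, y = cur
            for nxt in ((x + 1, y), (x - 1, y), (x, y - 1), (x, y + 1)):
                if nxt in parent or nxt not in allowed:
                    continue
                parent[nxt] = cur
                if nxt in heat_set:      # heat grid of another region
                    hit = nxt
                    break
                queue.append(nxt)
        if hit is None:                  # nothing to merge with
            continue
        target = F[hit]
        path, step = [], parent[hit]
        while parent[step] is not None:  # collect grids between the regions
            path.append(step)
            step = parent[step]
        for k in cells + path:
            F[k] = target
            R[target].append(k)
        heat_set.update(path)            # path grids become heat grids
        del R[rid]
    return heat_set, F, R
```

A faithful implementation of the steps in the text would keep the per-region shortest paths in Dist and choose among equally near regions at random; the sketch only shows the overall effect of the merge on S, F and R.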
on the basis of the above-described embodiments, as an alternative embodiment,
acquiring the outline range of each connected region, wherein the outline range comprises the following steps:
and if the number of the grids in the communication area is not higher than a second preset value, taking the outline formed by the grids in the area as the outline range of the communication area.
If the number of the grids in the communication area is higher than a second preset value, expanding the contour of each grid in the communication area by a plurality of second preset distances, performing Union operation of spatial superposition analysis to merge the contour into one contour, and contracting one contour by a third preset distance to serve as the contour range of the communication area.
It should be noted that, if a connected region contains more than one grid, the contour of each grid is first buffered (expanded) outward by a certain distance d (e.g., d = 0.00005), the Union operation of spatial overlay analysis is then performed to merge the contours into one contour, and the merged contour is finally buffered (contracted) inward by the distance d to serve as the contour representation of the connected region. The contour representations of all connected regions are finally output as the spatial contour mining result of the hot spot area.
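To illustrate what a merged contour of a connected region looks like, the sketch below swaps in a simpler technique than the buffer, Union, and contract sequence described above: edge cancellation on integer grid indices. An edge shared by two cells is interior, and an edge that appears only once lies on the outline. The function name `region_outline_edges` and the unit-square cell model are assumptions, and the sketch does not reproduce the rounding or corner-bridging effects of buffering.

```python
from collections import Counter

def region_outline_edges(cells):
    """Return the boundary edges of a set of unit grid cells.

    Each cell is (col, row) and covers [col, col+1] x [row, row+1].
    Edges are returned as ((x1, y1), (x2, y2)) with endpoints sorted.
    """
    counts = Counter()
    for c, r in cells:
        corners = [(c, r), (c + 1, r), (c + 1, r + 1), (c, r + 1)]
        for i in range(4):
            a, b = corners[i], corners[(i + 1) % 4]
            counts[(min(a, b), max(a, b))] += 1
    # an edge counted exactly once belongs to one cell only, so it is outline
    return {edge for edge, n in counts.items() if n == 1}
```

For two edge-adjacent cells the shared edge cancels, leaving six outline edges instead of eight.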
An embodiment of the present application provides a hot spot area data processing apparatus. As shown in fig. 10, the apparatus may include a positioning heat grid determining module 201 and a mining result determining module 202. Specifically:
the positioning heat grid determining module 201 is configured to determine the amount of LBS data respectively corresponding to at least two grids included in a preset geographic area, and to take each grid whose amount of LBS data is greater than a current positioning behavior quantity threshold as a positioning heat grid;
and the mining result determining module 202 is configured to obtain a hot spot area in the preset geographic area according to the contour range of the positioning heat grids in the preset geographic area.
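As a minimal sketch of what module 201 computes, the following Python bins LBS points into square grid cells and keeps the cells whose count exceeds the threshold. The function name `positioning_heat_grids`, the `cell_size` parameter, and the use of planar (x, y) coordinates are illustrative assumptions; the application does not specify here how grids are indexed.

```python
from collections import Counter
import math

def positioning_heat_grids(points, cell_size, threshold):
    """Count LBS points per grid cell and keep cells above the threshold.

    points: iterable of (x, y) coordinates, one per LBS record.
    Returns (hot_cells, counts), where cells are (col, row) index pairs.
    """
    counts = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size)) for x, y in points
    )
    # strictly greater than the threshold, as in the text
    hot = {cell for cell, n in counts.items() if n > threshold}
    return hot, counts
```

For example, with `cell_size=1.0` and `threshold=2`, three points falling in cell (0, 0) make it a positioning heat grid, while a cell holding a single point is discarded.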
The data processing apparatus provided in this embodiment of the present invention specifically executes the processes of the foregoing method embodiments; for details, refer to the foregoing data processing method embodiments, which are not repeated here. The data processing apparatus determines the amount of LBS data of the grids in the preset geographic area. Because LBS data is generated based on entity information, the amount of LBS data reflects how frequently entity information occurs at the corresponding geographic location. Grids whose amount of LBS data is greater than the current positioning behavior quantity threshold are therefore taken as positioning heat grids, that is, as areas where entity information is more prominent, namely hot spot areas. Verification shows that, although no satellite remote sensing data is used, the obtained contour range closely fits the built-up area range on the satellite remote sensing image, which indicates that the hot spot area mining result reflects the actual situation with high accuracy.
In one possible implementation, the mining result determining module includes:
a connected region determining module, configured to determine at least one connected region, where each connected region includes at least one positioning heat grid, and the distance between any positioning heat grid in a connected region and any positioning heat grid in any other connected region is greater than a preset distance;
and a region range determining module, configured to determine the administrative division level corresponding to each connected region and, if the difference between the total area of the connected regions corresponding to an administrative division of a preset level and the built-up area in the preset geographic area meets a preset condition, to acquire the contour range of each connected region as a hot spot area.
In one possible implementation, the data processing apparatus further includes:
a threshold updating module, configured to, if the difference between the total area of the connected regions corresponding to the administrative division of the preset level and the built-up area in the preset geographic area does not meet the preset condition, execute an updating step of the positioning behavior quantity threshold until the difference obtained according to the updated positioning behavior quantity threshold meets the preset condition;
wherein the updating step of the positioning behavior quantity threshold includes:
obtaining an updated positioning behavior quantity threshold according to the difference, the current positioning behavior quantity threshold, and the maximum and minimum values of the LBS data in all grids, and taking the updated positioning behavior quantity threshold as the current positioning behavior quantity threshold.
In one possible implementation, the threshold updating module includes:
a threshold value range determining submodule, configured to determine the value range of the updated positioning behavior quantity threshold according to the magnitude relationship between the difference and a preset value, in combination with the current positioning behavior quantity threshold and the maximum and minimum values of the LBS data in all grids;
and a threshold determining submodule, configured to obtain the updated positioning behavior quantity threshold according to the value range.
In one possible implementation, the threshold value range determining submodule includes:
a first boundary determining unit, configured to, if the difference is greater than a first preset value, take the current positioning behavior quantity threshold as the right boundary of the value range and take the minimum value of the LBS data in all grids as the left boundary of the value range;
and a second boundary determining unit, configured to, if the difference is smaller than the first preset value, take the current positioning behavior quantity threshold as the left boundary of the value range and take the maximum value of the LBS data in all grids as the right boundary of the value range.
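The boundary rule above can be sketched as a single update step. The text in this section states how the value range is chosen but not how the new threshold is picked from it, so taking the midpoint of the range is an assumption here, as are the function name `update_threshold` and the handling of a difference exactly equal to the first preset value.

```python
def update_threshold(difference, current, lbs_min, lbs_max, first_preset):
    """Return (left, right, updated_threshold) for one update step.

    difference: the area difference compared against first_preset.
    current: the current positioning behavior quantity threshold.
    lbs_min, lbs_max: minimum and maximum LBS data amounts over all grids.
    """
    if difference > first_preset:
        # current threshold becomes the right boundary
        left, right = lbs_min, current
    elif difference < first_preset:
        # current threshold becomes the left boundary
        left, right = current, lbs_max
    else:
        # case not covered by the text; keep the threshold unchanged
        return current, current, current
    return left, right, (left + right) / 2  # midpoint is an assumption
```

Repeating this step until the difference meets the preset condition gives a bisection-style search over the threshold.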
In one possible implementation, the connected region determining module includes:
an initial connected search submodule, configured to determine at least one initial connected region through a preset search algorithm, where each initial connected region includes at least one positioning heat grid, and when an initial connected region includes a plurality of positioning heat grids, each positioning heat grid in that region is adjacent to at least one other positioning heat grid in the same region;
and a debris merging submodule, configured to take the initial connected regions whose number of positioning heat grids is smaller than a preset number as debris regions, traverse all the debris regions, and, if the distance between any debris region and any positioning heat grid in another initial connected region is smaller than a first preset distance, merge that debris region with that other initial connected region.
In one possible implementation, the search algorithm includes any one of breadth-first search, depth-first search, and union-find (disjoint-set).
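Of the three options, union-find is the only one that is not a graph traversal, so a short sketch may help. The Python below groups positioning heat grids into connected regions with a disjoint-set forest. It assumes cells are (col, row) index pairs and that "adjacent" means sharing an edge (4-neighbourhood); the function name `connected_regions_union_find` is hypothetical.

```python
def connected_regions_union_find(hot):
    """Group hot grid cells into connected regions with a disjoint-set forest.

    hot: set of (col, row) cells. Returns a list of sets of cells.
    """
    parent = {cell: cell for cell in hot}

    def find(cell):
        # follow parent links to the root, halving the path on the way
        while parent[cell] != cell:
            parent[cell] = parent[parent[cell]]
            cell = parent[cell]
        return cell

    for c, r in hot:
        # checking the right and upper neighbours covers every adjacent pair once
        for nb in ((c + 1, r), (c, r + 1)):
            if nb in parent:
                ra, rb = find((c, r)), find(nb)
                if ra != rb:
                    parent[rb] = ra
    regions = {}
    for cell in hot:
        regions.setdefault(find(cell), set()).add(cell)
    return list(regions.values())
```

Each returned set corresponds to one initial connected region; an isolated hot cell forms a region of size one.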
In one possible implementation, the initial connectivity search sub-module includes:
an initialization unit, configured to add any positioning heat grid that is not marked as visited to a pre-established queue as the head node, and to take the unique identifier of the head node grid as the unique identifier of the current connected region;
a search unit, configured to, if the queue is not empty, take a positioning heat grid out of the queue, mark it as visited, execute a search operation on the visited grid, and check again whether the queue is empty after the search operation;
a reselection unit, configured to, if the queue is empty, reselect a positioning heat grid that is not marked as visited, until all positioning heat grids are marked as visited;
wherein the search unit includes:
an identifier recording unit, configured to record the unique identifier of the current connected region as the identifier of the connected region to which the positioning heat grid belongs, and to record the unique identifier of the positioning heat grid into the grid set of the current connected region;
and a traversing unit, configured to traverse the adjacent grids of the positioning heat grid and, for any adjacent grid that is a positioning heat grid and is not marked as visited, add the adjacent grid to the queue, until the traversal is completed.
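The units above describe a breadth-first search, which can be sketched as follows. The sketch assumes cells are (col, row) pairs, that adjacency means sharing an edge, and that the head cell itself serves as the region's unique identifier; the function name `bfs_connected_regions` is hypothetical. One deliberate deviation from the text: cells are marked as visited when they are enqueued rather than when they are dequeued, so that no cell enters the queue twice.

```python
from collections import deque

def bfs_connected_regions(hot):
    """Label connected regions of hot grid cells with breadth-first search.

    Returns F (cell -> region id) and R (region id -> set of cells).
    The region id is the head cell that started the search.
    """
    F, R, visited = {}, {}, set()
    for head in sorted(hot):
        if head in visited:
            continue  # already assigned to an earlier region
        region_id = head
        R[region_id] = set()
        queue = deque([head])
        visited.add(head)  # mark on enqueue so no cell is queued twice
        while queue:
            c, r = queue.popleft()
            F[(c, r)] = region_id
            R[region_id].add((c, r))
            for nb in ((c + 1, r), (c - 1, r), (c, r + 1), (c, r - 1)):
                if nb in hot and nb not in visited:
                    visited.add(nb)
                    queue.append(nb)
    return F, R
```

The outer loop plays the role of the reselection unit: whenever the queue empties, the next unvisited positioning heat grid becomes the head of a new region.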
In one possible implementation, the debris merging submodule includes:
a shortest distance determining unit, configured to, for any grid in any debris region, if it is determined that positioning heat grids belonging to other initial connected regions exist within the N × N grid range centered on the grid, determine the shortest distance between each such initial connected region and the debris region, where N is a positive integer;
and a region updating unit, configured to update the connected region to which the positioning heat grids in the debris region belong to a target connected region, where the target connected region is the initial connected region with the shortest distance to the debris region.
In one possible implementation, the debris merging submodule further includes:
a shortest path determining unit, configured to determine the shortest paths between the other initial connected regions and the debris region;
and a heat updating unit, configured to update the grids on the shortest path corresponding to the target connected region to positioning heat grids.
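The neighbourhood search performed by the shortest distance determining unit might look like the sketch below. It assumes an odd N so that the window is centered on the grid, measures distance as Chebyshev distance in grid cells (the text does not name a metric), and breaks ties by region identifier; the function name `nearest_region` is hypothetical. It only selects the target connected region and does not construct the shortest path.

```python
def nearest_region(debris_cells, F, n):
    """Find the other region closest to a debris region within an n x n window.

    debris_cells: set of (col, row) cells in the debris region.
    F: dict mapping every positioning heat grid to its region id.
    Returns (region_id, distance), or None if no other region is in range.
    """
    half = n // 2
    own = {F[cell] for cell in debris_cells}
    best = None
    for c, r in debris_cells:
        for dc in range(-half, half + 1):
            for dr in range(-half, half + 1):
                rid = F.get((c + dc, r + dr))
                if rid is None or rid in own:
                    continue  # not a heat grid, or part of the debris region
                d = max(abs(dc), abs(dr))  # Chebyshev distance (assumption)
                if best is None or (d, rid) < (best[1], best[0]):
                    best = (rid, d)
    return best
```

A `None` result means the debris region has no mergeable neighbour within the window and is left as it is.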
In one possible implementation, the region range determining module includes:
a first contour determining unit, configured to, if the number of grids in the connected region is not higher than a second preset value, take the contour formed by the grids in the region as the contour range of the connected region;
and a second contour determining unit, configured to, if the number of grids in the connected region is higher than the second preset value, expand the contour of each grid in the connected region outward by a second preset distance, perform the Union operation of spatial overlay analysis to merge the expanded contours into one contour, and contract the merged contour by a third preset distance to serve as the contour range of the connected region.
An embodiment of the present application provides an electronic device, which includes a memory and a processor, with at least one program stored in the memory for execution by the processor. When executed by the processor, the program implements the following: the amount of LBS data of the grids in the preset geographic area is determined. Because LBS data is generated based on entity information, the amount of LBS data reflects how frequently entity information occurs at the corresponding geographic location. Grids whose amount of LBS data is greater than the current positioning behavior quantity threshold are therefore taken as positioning heat grids, that is, as areas where entity information is more prominent, namely hot spot areas. Verification shows that, although no satellite remote sensing data is used, the obtained contour range closely fits the built-up area range on the satellite remote sensing image, which indicates that the hot spot area mining result reflects the actual situation with high accuracy.
In an optional embodiment, an electronic device is provided. As shown in fig. 11, the electronic device 4000 includes a processor 4001 and a memory 4003. The processor 4001 is coupled to the memory 4003, for example via a bus 4002. Optionally, the electronic device 4000 may further include a transceiver 4004. It should be noted that, in practical applications, the number of transceivers 4004 is not limited to one, and the structure of the electronic device 4000 does not constitute a limitation on the embodiments of the present application.
The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. It may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure. The processor 4001 may also be a combination that performs a computational function, for example, a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 4002 may include a path that carries information between the aforementioned components. The bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 4002 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 11, but this is not intended to represent only one bus or type of bus.
The memory 4003 may be a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical disc storage (including compact disc, laser disc, optical disc, digital versatile disc, Blu-ray disc, etc.), a magnetic disk storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
The memory 4003 is used for storing application codes for executing the scheme of the present application, and the execution is controlled by the processor 4001. Processor 4001 is configured to execute application code stored in memory 4003 to implement what is shown in the foregoing method embodiments.
An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored; when the computer program runs on a computer, the computer is enabled to execute the corresponding content of the foregoing method embodiments. Compared with the prior art, the amount of LBS data of the grids in the preset geographic area is determined. Because LBS data is generated based on entity information, the amount of LBS data reflects how frequently entity information occurs at the corresponding geographic location. Grids whose amount of LBS data is greater than the current positioning behavior quantity threshold are therefore taken as positioning heat grids, that is, as areas where entity information is more prominent, namely hot spot areas. Verification shows that, although no satellite remote sensing data is used, the obtained contour range closely fits the built-up area range on the satellite remote sensing image, which indicates that the hot spot area mining result reflects the actual situation with high accuracy.
An embodiment of the present application provides a computer program, which includes computer instructions stored in a computer-readable storage medium. When a processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, the computer device executes the content shown in the foregoing method embodiments. Compared with the prior art, the amount of LBS data of the grids in the preset geographic area is determined. Because LBS data is generated based on entity information, the amount of LBS data reflects how frequently entity information occurs at the corresponding geographic location. Grids whose amount of LBS data is greater than the current positioning behavior quantity threshold are therefore taken as positioning heat grids, that is, as areas where entity information is more prominent, namely hot spot areas. Verification shows that, although no satellite remote sensing data is used, the obtained contour range closely fits the built-up area range on the satellite remote sensing image, which indicates that the hot spot area mining result reflects the actual situation with high accuracy.
It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The foregoing describes only some embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and refinements without departing from the principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (14)

1. A method of data processing, comprising:
determining the quantity of location-based service (LBS) data respectively corresponding to at least two grids in a preset geographic area, and taking the grid of which the quantity of the LBS data is greater than a current positioning behavior quantity threshold value as a positioning heat grid;
and acquiring a hot spot area in the preset geographic area according to the outline range of the positioning heat grid in the preset geographic area.
2. The data processing method according to claim 1, wherein the obtaining of the hotspot region in the preset geographic region according to the contour range of the positioning heat grid in the preset geographic region comprises:
determining at least one connected region, wherein each connected region comprises at least one positioning heat grid, and the distance between any positioning heat grid in a connected region and any positioning heat grid in any other connected region is greater than a preset distance;
and determining the administrative division level corresponding to each connected region, and if the difference between the total area of the connected regions corresponding to an administrative division of a preset level and the built-up area in the preset geographic area meets a preset condition, acquiring the contour range of each connected region as the hot spot area.
3. The data processing method of claim 2, wherein determining the administrative division level corresponding to each connected region further comprises:
if the difference between the total area of the connected regions corresponding to the administrative division of the preset level and the built-up area in the preset geographic area does not meet the preset condition, executing the updating step of the positioning behavior quantity threshold until the difference obtained according to the updated positioning behavior quantity threshold meets the preset condition;
wherein the step of updating the positioning behavior amount threshold comprises:
and obtaining an updated positioning behavior quantity threshold according to the difference, the current positioning behavior quantity threshold and the maximum value and the minimum value of the LBS data in all grids, and taking the updated positioning behavior quantity threshold as the current positioning behavior quantity threshold.
4. The data processing method according to claim 3, wherein the obtaining the updated positioning activity amount threshold according to the difference, the current positioning activity amount threshold, and the maximum and minimum values of LBS data in all grids comprises:
determining the value range of the updated positioning behavior amount threshold according to the size relationship between the difference value and a preset value and by combining the current positioning behavior amount threshold, and the maximum value and the minimum value of LBS data in all grids;
and obtaining an updated positioning behavior quantity threshold according to the value range.
5. The data processing method according to claim 4, wherein the determining, according to the magnitude relationship between the difference and a preset value, the value range of the updated positioning behavior amount threshold in combination with the current positioning behavior amount threshold, and the maximum and minimum values of the LBS data in all grids comprises:
if the difference is greater than a first preset value, taking the current positioning behavior quantity threshold as the right boundary of the value range, and taking the minimum value of the LBS data in all grids as the left boundary of the value range;
and if the difference is smaller than the first preset value, taking the current positioning behavior quantity threshold as the left boundary of the value range, and taking the maximum value of the LBS data in all grids as the right boundary of the value range.
6. The data processing method of claim 1, wherein the determining at least one connected region comprises:
determining at least one initial connected region through a preset search algorithm, wherein each initial connected region comprises at least one positioning heat grid, and when an initial connected region comprises a plurality of positioning heat grids, each positioning heat grid in that region is adjacent to at least one other positioning heat grid in the same region;
and taking the initial connected regions whose number of positioning heat grids is smaller than a preset number as debris regions, traversing all the debris regions, and if the distance between any debris region and any positioning heat grid in another initial connected region is smaller than a first preset distance, merging that debris region with that other initial connected region.
7. The data processing method of claim 6, wherein the search algorithm comprises any one of breadth-first search, depth-first search, and union-find (disjoint-set).
8. The data processing method according to claim 6 or 7, wherein the determining at least one initial connected region comprises:
for any positioning heat grid that is not marked as visited, adding the positioning heat grid to a pre-established queue as the head node, and taking the unique identifier of the head node grid as the unique identifier of the current connected region;
if the queue is not empty, taking a positioning heat grid out of the queue, marking it as visited, executing a search operation on the visited grid, and checking again whether the queue is empty after the search operation;
if the queue is empty, reselecting a positioning heat grid that is not marked as visited, until all positioning heat grids are marked as visited;
wherein the executing a search operation on the visited grid comprises:
recording the unique identifier of the current connected region as the identifier of the connected region to which the positioning heat grid belongs, and recording the unique identifier of the positioning heat grid into the grid set of the current connected region;
and traversing the adjacent grids of the positioning heat grid, and for any adjacent grid that is a positioning heat grid and is not marked as visited, adding the adjacent grid to the queue, until the traversal is completed.
9. The data processing method of claim 6, wherein the traversing all the debris regions, and if the distance between any debris region and any positioning heat grid in another initial connected region is smaller than a first preset distance, merging that debris region with that other initial connected region comprises:
for any grid in any debris region, if it is determined that positioning heat grids belonging to other initial connected regions exist within the N × N grid range centered on the grid, determining the shortest distance between each such initial connected region and the debris region, wherein N is a positive integer;
and updating the connected region to which the positioning heat grids in the debris region belong to a target connected region, wherein the target connected region is the initial connected region with the shortest distance to the debris region.
10. The data processing method of claim 9, wherein said determining the shortest distance of the other initially connected region from the debris region further comprises:
determining a shortest path between the other initially connected regions and the debris region;
and updating all grids in the shortest path corresponding to the target connected region into positioning heat grids.
11. The data processing method according to claim 1, wherein the obtaining the outline range of each connected region comprises:
if the number of grids in the connected region is not higher than a second preset value, taking the contour formed by the grids in the region as the contour range of the connected region;
if the number of grids in the connected region is higher than the second preset value, expanding the contour of each grid in the connected region outward by a second preset distance, performing the Union operation of spatial overlay analysis to merge the expanded contours into one contour, and contracting the merged contour by a third preset distance to serve as the contour range of the connected region.
12. A data processing apparatus, comprising:
the positioning heat grid determining module is used for determining the quantity of LBS data respectively corresponding to at least two grids in a preset geographic area, and taking the grid of which the quantity of the LBS data is greater than the current positioning behavior quantity threshold value as a positioning heat grid;
and the mining result determining module is used for obtaining the hot spot area in the preset geographic area according to the outline range of the positioning heat grid in the preset geographic area.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the steps of the data processing method according to any of claims 1 to 11 are implemented when the computer program is executed by the processor.
14. A computer-readable storage medium storing computer instructions for causing a computer to perform the steps of the data processing method according to any one of claims 1 to 11.
CN202111006953.5A 2021-08-30 2021-08-30 Data processing method and device, electronic equipment and storage medium Pending CN115730023A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111006953.5A CN115730023A (en) 2021-08-30 2021-08-30 Data processing method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111006953.5A CN115730023A (en) 2021-08-30 2021-08-30 Data processing method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115730023A true CN115730023A (en) 2023-03-03

Family

ID=85291052

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111006953.5A Pending CN115730023A (en) 2021-08-30 2021-08-30 Data processing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115730023A (en)

Similar Documents

Publication Publication Date Title
EP2849117B1 (en) Methods, apparatuses and computer program products for automatic, non-parametric, non-iterative three dimensional geographic modeling
CN109284449B (en) Interest point recommendation method and device
CN114440916B (en) Navigation method, device, equipment and storage medium
CN111090712A (en) Data processing method, device and equipment and computer storage medium
CN109741209B (en) Multi-source data fusion method, system and storage medium for power distribution network under typhoon disaster
Bartie et al. Incorporating vegetation into visual exposure modelling in urban environments
CN110428386B (en) Map grid merging method and device, storage medium and electronic device
CN111291776A (en) Channel information extraction method based on crowd-sourced trajectory data
CN111429560A (en) Three-dimensional terrain service fusion method and device and server
Brinkhoff Open street map data as source for built-up and urban areas on global scale
Hofmann et al. Usage of fuzzy spatial theory for modelling of terrain passability
US11082802B2 (en) Application of data structures to geo-fencing applications
Basaraner et al. A structure recognition technique in contextual generalisation of buildings and built-up areas
CN112215864B (en) Contour processing method and device of electronic map and electronic equipment
CN112381078B (en) Elevated-based road identification method, elevated-based road identification device, computer equipment and storage medium
Azri et al. Review of spatial indexing techniques for large urban data management
CN111080080B (en) Village geological disaster risk prediction method and system
CN112700073B (en) Bus route planning method and device
CN116036604B (en) Data processing method, device, computer and readable storage medium
CN115271564B (en) Highway slope disaster space danger zoning method and terminal
CN115730023A (en) Data processing method and device, electronic equipment and storage medium
Li et al. gsstSIM: A high‐performance and synchronized similarity analysis method of spatiotemporal trajectory based on grid model representation
He et al. The fractal or scaling perspective on progressively generated intra-urban clusters from street junctions
Kumar et al. Spatial data mining: recent trends and techniques
CN114419357B (en) Data processing method, data processing device, computer and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40083153

Country of ref document: HK