CN115438138B - Employment center identification method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN115438138B
Authority
CN
China
Prior art keywords
employment
center
clustering
employment center
target area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211395497.2A
Other languages
Chinese (zh)
Other versions
CN115438138A (en
Inventor
张晓东
王良
许丹丹
梁弘
张兴华
崔鹤
陈猛
胡腾云
孙道胜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Chengyuan Digital Technology Co ltd
Beijing Municipal Institute Of City Planning & Design
Original Assignee
Beijing Chengyuan Digital Technology Co ltd
Beijing Municipal Institute Of City Planning & Design
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Chengyuan Digital Technology Co ltd, Beijing Municipal Institute Of City Planning & Design filed Critical Beijing Chengyuan Digital Technology Co ltd
Priority to CN202211395497.2A priority Critical patent/CN115438138B/en
Publication of CN115438138A publication Critical patent/CN115438138A/en
Application granted granted Critical
Publication of CN115438138B publication Critical patent/CN115438138B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29Geographical information databases
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9537Spatial or temporal dependent retrieval, e.g. spatiotemporal queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an employment center identification method, a device, electronic equipment and a storage medium, which relate to the technical field of data processing. The method comprises the following steps: acquiring mobile phone positioning data of the personnel in the target area, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment place of the personnel in the target area output by the clustering model; acquiring enterprise registration data of the target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model; and obtaining an employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place; wherein the clustering model is obtained by training based on a density clustering algorithm with noise. The invention can identify the spatial distribution of employment posts and reflect the direction of the spatial linkage between residence and employment.

Description

Employment center identification method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and an apparatus for identifying an employment center, an electronic device, and a storage medium.
Background
The spatial distribution of the population within a city is one of the main subjects of research on urban internal spatial structure, and is embodied in living space and employment space. In choosing an urban land development mode, planners often emphasize organizing residential and employment space through functional zoning, and from the perspective of industrial function a city center is the primary requirement. In recent years, as land resources have become increasingly scarce, it has become important to adjust the spatial structure and improve the multi-center system of the central city (for example, how employment sites are distributed, how large the employment centers are, and where commuters come from).
City planning can only implement the concept of work-residence balance at the level of land use, whereas housing and employment posts are allocated by the market. The market cannot ensure that local residents obtain local employment posts, nor that people working locally can purchase local housing, so the balance is achieved only in terms of planned land use. The conventional method is based on census data: it can only judge how the resident population aggregates within a city, and it can neither identify the spatial distribution of employment posts nor reflect the direction of the spatial linkage between residence and employment (the work-residence relationship).
Disclosure of Invention
The invention provides an employment center identification method, a device, electronic equipment and a storage medium, which are used for identifying the spatial distribution of employment posts and reflecting the direction of the spatial linkage between residence and employment.
The invention provides an employment center identification method, which comprises the following steps:
acquiring mobile phone positioning data of the personnel in the target area, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment place of the personnel in the target area output by the clustering model;
acquiring enterprise registration data of a target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model;
obtaining an employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place;
and the clustering model is obtained by training based on a density clustering algorithm with noise.
The employment center identification method provided by the invention further comprises the following steps:
acquiring a plurality of different enterprise categories of the employment center, and determining the industrial diversity measurement of the employment center based on the different enterprise categories;
obtaining public transportation data of the employment center, and determining the measure of the radiation range of the employment center based on the public transportation data;
classifying the employment center of the target region based on the employment center industry diversity metric and the employment center radiation range metric.
According to the employment center identification method provided by the invention, the industrial diversity measure of the employment center is calculated based on the following formula:
$$H = -\sum_{i=1}^{S} p_i \ln p_i$$
wherein $H$ is the employment center industry diversity measure, $S$ is the number of enterprise categories, and $p_i$ is the proportion of the $i$-th enterprise category in the target region.
According to the employment center identification method provided by the invention, the public transportation data comprises public transportation platform positions and the commuting time of the personnel in the target area.
According to the employment center identification method provided by the present invention, classifying the employment center of the target area based on the employment center industry diversity measure and the employment center radiation range measure includes:
sequentially selecting a plurality of groups of employment center categories based on the employment center industry diversity measurement, wherein each group of employment center category comprises a plurality of types of employment centers;
and determining the employment center radiation range measurement mean value corresponding to each group of employment center categories based on the employment center radiation range measurement, and taking the next group of employment center categories as the employment center target categories under the condition that the difference value between the employment center radiation range measurement mean value corresponding to the next group of employment center categories and the employment center radiation range mean value corresponding to the previous group of employment center categories is smaller than a preset threshold value.
According to the employment center identification method provided by the invention, the clustering model is used for:
analyzing the data input into the clustering model to obtain a spatial point data set;
calculating the k-distance of each spatial point in the spatial point data set;
displaying the k-distance of each space point in the space point data set by using a scatter diagram, and determining the neighborhood radius based on the scatter diagram;
determining a core point set based on a preset initial minimum point number and the neighborhood radius; the core point in the core point set is a spatial point which takes the core point as a center and has the neighborhood inner spatial point with the neighborhood radius as the radius not less than the initial minimum point number;
and clustering connectable core point groups in the core point set and spatial points with the distance to the connectable core point groups smaller than the neighborhood radius based on a density clustering algorithm with noise to obtain employment places of the target region personnel or the enterprise registration gathering places.
According to the employment center identification method provided by the invention, the noise-based density clustering algorithm is used for clustering connectable core point groups in the core point set and spatial points with the distance to the connectable core point groups smaller than the neighborhood radius to obtain employment places of the target region personnel or the enterprise registration gathering places, and the method comprises the following steps:
clustering a connectable core point group in the core point set and a spatial point with a distance to the connectable core point group smaller than the neighborhood radius based on a density clustering algorithm with noise to obtain a cluster corresponding to the connectable core point group;
clustering clusters corresponding to the connectable core point groups to obtain spatial clustering clusters;
and selecting the clustering cluster with the largest area from the spatial clustering clusters corresponding to different neighborhood radii, and determining the employment place of the personnel in the target area or the enterprise registration gathering place based on the arithmetic mean of the spatial point coordinates in the clustering cluster with the largest area.
The invention also provides an employment center identification apparatus, comprising:
the first clustering module is used for acquiring mobile phone positioning data of the target area personnel, sending the mobile phone positioning data to a clustering model for clustering, and obtaining the employment places of the target area personnel output by the clustering model;
the second clustering module is used for acquiring enterprise registration data of a target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model;
the employment center identification module is used for obtaining the employment center of the target area based on the employment places of the personnel in the target area and the enterprise registration gathering places;
and the clustering model is obtained by training based on a density clustering algorithm with noise.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the employment center identification method.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the employment center identification method as described in any one of the above.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the employment center identification method as described in any one of the above.
The employment center identification method, the device, the electronic equipment and the storage medium provided by the invention have the advantages that the mobile phone positioning data of the personnel in the target area is sent to the clustering model for clustering to obtain the employment places of the personnel in the target area, the enterprise registration data of the target area is input to the clustering model for clustering to obtain the enterprise registration gathering places, and the employment center of the target area is obtained by combining the employment places of the personnel in the target area and the enterprise registration gathering places.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic flow chart of an employment center identification method provided by the present invention;
FIG. 2 is a schematic diagram of a training process of a clustering model provided by the present invention;
FIG. 3 is a schematic view of the employment center classification provided by the present invention;
FIG. 4 is a first schematic diagram of the employment center clustering results provided by the present invention;
FIG. 5 is a second schematic diagram of the employment center clustering results provided by the present invention;
FIG. 6 is a third schematic diagram of the employment center clustering results provided by the present invention;
FIG. 7 is a fourth schematic diagram of the employment center clustering results provided by the present invention;
FIG. 8 is a schematic structural diagram of an employment center identification apparatus provided by the present invention;
FIG. 9 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of the present invention without any creative work belong to the protection scope of the present invention.
The employment center identification method, apparatus, electronic device and storage medium of the present invention are described below with reference to fig. 1 to 9.
As shown in fig. 1, the present invention provides an employment center identification method, which includes:
Step 110, acquiring mobile phone positioning data of the target area personnel, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment places of the target area personnel output by the clustering model; the clustering model is obtained by training based on a density clustering algorithm with noise (DBSCAN, Density-Based Spatial Clustering of Applications with Noise).
It can be understood that a subset of the mobile phone users in the target area is randomly selected, and the mobile phone positioning data of that subset is obtained.
The density clustering algorithm with noise utilizes the concept of clustering based on density, namely, the number of objects (points or other space objects) contained in a certain area in a clustering space is required to be not less than a given threshold value, and the algorithm can find clusters (Cluster) with any shapes from a data set with noise so that the areas with sufficient density are divided in the same Cluster, thereby achieving the purpose of clustering. The density clustering algorithm with noise has two important input parameters: neighborhood radius (Eps) and number of minimum points (MinPts).
Because the density clustering algorithm with noise is sensitive to its parameters, its two parameters, the neighborhood radius and the minimum number of points, are calibrated before the algorithm is used for cluster analysis.
The training process of the clustering model is shown in fig. 2: original data is first obtained and cleaned to obtain valid data; the valid data is screened based on a deletion rule to obtain usable data; data simplification and rule extraction are performed on the usable data to obtain training data; and the parameters of the original model (namely, the initial model of the density clustering algorithm with noise) are calibrated based on the training data to obtain the clustering model.
Step 120, acquiring enterprise registration data of the target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model.
It can be understood that the enterprise registration data of the target area can be obtained through a network interface of a third-party service platform providing enterprise business data, and the enterprise registration data comprises a registration place and an office address of the enterprise, and business and industry of the enterprise.
Step 130, obtaining the employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place.
It can be understood that the employment places of the personnel in the target area and the enterprise registration gathering places are overlaid, and relevant indicators such as an employment center heat map and contour lines are obtained through analysis, thereby determining the employment center of the target area.
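The overlay in step 130 is not specified in detail. As a minimal sketch, assuming both inputs are lists of (x, y) coordinates and that a regular grid is an acceptable stand-in for the heat map, the two layers can be counted per grid cell. The names `grid_cell`, `overlay_heat`, `hottest_cells` and the `cell_size` parameter are hypothetical and do not come from the patent.

```python
from collections import Counter


def grid_cell(point, cell_size):
    """Map an (x, y) coordinate to the integer index of its grid cell."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def overlay_heat(employment_places, registration_places, cell_size):
    """Count the points of both layers per grid cell (a simple heat surface)."""
    heat = Counter()
    for p in employment_places:
        heat[grid_cell(p, cell_size)] += 1
    for p in registration_places:
        heat[grid_cell(p, cell_size)] += 1
    return heat


def hottest_cells(heat, min_count):
    """Return the cells whose combined count reaches min_count (assumed cut-off)."""
    return sorted(cell for cell, n in heat.items() if n >= min_count)
```

Cells returned by `hottest_cells` would be candidate employment centers under this assumption; contour lines could then be traced around them with any plotting tool.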
In some embodiments, the employment center identification method further comprises:
acquiring a plurality of different enterprise categories of the employment center, and determining the industrial diversity measurement of the employment center based on the different enterprise categories;
obtaining public transportation data of the employment center, and determining the measure of the radiation range of the employment center based on the public transportation data;
classifying the employment centers of the target region based on the employment center industry diversity metric and the employment center radiation range metric.
It can be understood that a plurality of different enterprise categories of the target area are obtained, whether a dominant category of enterprise exists is judged, the industries inside the employment center of the target area are analyzed, and a rose chart is drawn according to the proportion of each category to highlight the dominant industry, thereby obtaining the employment center industry diversity measure.
On the basis of identifying the boundary of the employment center, the public transportation data of the employment center is screened, and the employment center radiation range measure is determined based on the screened public transportation data. The public transportation data of the employment center may be data based on a public transportation Integrated Circuit (IC) card.
In some embodiments, the employment center identification method calculates the employment center industry diversity measure based on the following formula:
$$H = -\sum_{i=1}^{S} p_i \ln p_i$$
wherein $H$ is the employment center industry diversity measure, $H$ has a maximum value when $p_i = 1/S$, $S$ is the number of enterprise categories, and $p_i$ is the proportion of the $i$-th enterprise category in the target region.
It can be understood that the present embodiment introduces diversity index in biology, calculating the industrial diversity within employment centers. The diversity index refers to the measurement of species diversity, and well describes the biodiversity data in the region. The diversity index is a comprehensive index reflecting the richness and uniformity. When the diversity index is applied, communities with low abundance and high uniformity and areas with high abundance and low uniformity may have the same diversity index.
Further, in this embodiment, the employment center industry diversity measure is determined according to the Shannon index (Shannon-Wiener diversity index), which can reflect whether the employment center has a dominant industry.
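As an illustration of the Shannon index, commonly written H = -Σ p_i ln p_i, the following sketch computes the measure from a list of enterprise category labels. The natural logarithm and the function name `shannon_index` are assumptions made here; the patent does not state the logarithm base.

```python
import math
from collections import Counter


def shannon_index(categories):
    """Shannon diversity H = -sum(p_i * ln(p_i)) over enterprise categories."""
    counts = Counter(categories)
    total = sum(counts.values())
    h = 0.0
    for n in counts.values():
        p = n / total          # proportion of this enterprise category
        h -= p * math.log(p)   # natural log is an assumption
    return h
```

With four equally represented categories the result equals ln 4, the maximum for S = 4, while a center containing a single category scores 0. A low value therefore suggests a dominant industry.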
In some embodiments, the public transportation data includes public transportation platform locations and commute times of the target area personnel.
It can be understood that the positions of the subway stations and the bus stations of the employment center are screened out on the basis of identifying the employment center and the boundary thereof. And (3) selecting commuters with the time period of 7.
In some embodiments, said classifying the employment center of the target area based on the employment center industry diversity metric and the employment center radiation range metric comprises:
sequentially selecting a plurality of groups of employment center categories based on the employment center industry diversity measurement, wherein each group of employment center category comprises a plurality of types of employment centers;
and determining the employment center radiation range measurement mean value corresponding to each group of employment center categories based on the employment center radiation range measurement, and taking the next group of employment center categories as the employment center target categories under the condition that the difference value between the employment center radiation range measurement mean value corresponding to the next group of employment center categories and the employment center radiation range mean value corresponding to the previous group of employment center categories is smaller than a preset threshold value.
It is understood that employment center classification of the target area may be achieved based on the K-Means algorithm. The K-means algorithm is a hard clustering algorithm, is an objective function clustering method based on a prototype, takes a certain distance from a data point to the prototype as an optimized objective function, and obtains an adjustment rule of iterative operation by using a function extremum solving method. The K-means algorithm takes Euclidean distance as similarity measure, and solves the optimal classification of a corresponding initial clustering center vector V, so that the evaluation index J is minimum. The K-means algorithm uses a sum of squared errors criterion function as a clustering criterion function.
The K-means algorithm proceeds as follows:
1. randomly selecting K documents from the N documents as centroids;
2. measuring the distance to each centroid for each document remaining and categorizing it to the closest centroid;
3. recalculating the centroid of each obtained class;
4. repeating steps 2 and 3 until the new centroids are equal to the previous centroids or the change is smaller than a specified threshold, at which point the algorithm ends.
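The four steps above can be sketched in plain Python as follows. This is a generic K-means illustration, not the patent's implementation; the helper names, the `seed`, `max_iter` and `tol` parameters, and the choice to keep the centroid of an empty cluster unchanged are assumptions.

```python
import math
import random


def random_centroids(points, k, seed=0):
    """Step 1: pick k of the input points at random as initial centroids."""
    return random.Random(seed).sample(points, k)


def kmeans(points, centroids, max_iter=100, tol=1e-9):
    """Steps 2-4: assign, recompute, repeat until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Step 2: assign every point to its nearest centroid (Euclidean distance).
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Step 3: recompute each centroid as the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(c) / len(members) for c in zip(*members))
                )
            else:
                new_centroids.append(old)  # assumption: keep an empty cluster's centroid
        # Step 4: stop once no centroid moved by more than tol.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return labels, centroids
```

A typical call would be `kmeans(points, random_centroids(points, k))`; passing explicit starting centroids instead makes a run reproducible.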
In this embodiment, the K-Means algorithm includes the following specific steps:
1. determining a group of employment center categories, wherein the group of employment center categories comprise a plurality of employment centers of different categories, for example, the group of employment center categories comprise K employment centers of different categories, each type of employment center corresponds to one feature vector, namely the group of employment center categories corresponds to K feature vectors;
2. calculating the employment center radiation range measurement corresponding to each kind of employment center in a group of employment center categories, wherein the employment center radiation range measurement is also a new characteristic vector, and further calculating the employment center radiation range measurement mean value corresponding to a plurality of kinds of employment centers in a group of employment center categories, namely the employment center radiation range measurement mean value corresponding to the group of employment center categories;
3. repeating steps 1 and 2 until the difference between the employment center radiation range measurement mean values corresponding to two adjacent groups of employment center categories no longer changes or the iteration upper limit is reached, thereby obtaining the employment center target category. The classification result of employment centers based on the K-Means algorithm is shown in FIG. 3.
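The stopping rule described above can be read as a simple comparison of successive group means. The sketch below assumes that each candidate group of categories is represented only by the radiation range measures of its categories, tried in order; the function name `select_target_grouping` and this data layout are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def select_target_grouping(groupings, threshold):
    """Return the index of the first grouping whose mean radiation range
    differs from the previous grouping's mean by less than threshold.

    groupings: candidate groups in the order they are tried; each is a list
    of radiation range measures, one per employment center category.
    Returns None if no pair of adjacent groups is close enough.
    """
    previous_mean = None
    for index, metrics in enumerate(groupings):
        current_mean = mean(metrics)
        if previous_mean is not None and abs(current_mean - previous_mean) < threshold:
            return index
        previous_mean = current_mean
    return None
```

A caller would still need to apply the iteration upper limit mentioned in the text, for example by truncating `groupings` before the call.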
In some embodiments, the clustering model is to:
analyzing the data input into the clustering model to obtain a spatial point data set;
calculating the k-distance of each spatial point in the spatial point data set;
displaying the k-distance of each space point in the space point data set by using a scatter diagram, and determining a neighborhood radius based on the scatter diagram;
determining a core point set based on a preset initial minimum point number and the neighborhood radius; the core point in the core point set is a spatial point which takes the core point as a center and has the neighborhood inner spatial point with the neighborhood radius as the radius not less than the initial minimum point number;
based on a density clustering algorithm with noise, clustering connectable core point groups in the core point set and spatial points with the distance to the connectable core point groups smaller than the neighborhood radius to obtain employment places of the target regional personnel or the enterprise registration gathering places.
It will be appreciated that the data input to the clustering model is parsed to form a spatial point data set P = {p(i); i = 1, 2, …, N}, where p(i) represents one item of input data, i.e. a spatial point of the set, and N represents the total number of input data.
The k-distance of each spatial point p(i) in the spatial point data set P is calculated. For any point p(i) belonging to P, the distances between p(i) and all elements of the subset S = {p(1), p(2), …, p(i-1), p(i+1), …, p(N)} of P are calculated and sorted in ascending order. If the sorted distance set is D = {d(1), d(2), …, d(k-1), d(k), d(k+1), …, d(N-1)}, then d(k) is referred to as the k-distance of p(i). That is, the k-distance is the k-th smallest distance from the point p(i) to all other points. The k-distance is calculated for each point p(i) in P, and the k-distances of all elements of P are expressed as the set E = {e(1), e(2), …, e(N)}.
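The k-distance definition above translates directly into code. The sketch below assumes planar coordinates and Euclidean distance, which the patent does not state explicitly; the function name `k_distances` is chosen here for illustration.

```python
import math


def k_distances(points, k):
    """Return, for every point, the distance to its k-th nearest other point."""
    if not 1 <= k <= len(points) - 1:
        raise ValueError("k must be between 1 and len(points) - 1")
    result = []
    for i, p in enumerate(points):
        # Distances from p to every other point, sorted in ascending order.
        others = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(others[k - 1])
    return result
```

This brute-force version compares every pair of points, so it suits small samples; a spatial index would be needed for city-scale data.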
The k-distance values of all elements p(i) in the set P are displayed in a scatter diagram, and the value of the neighborhood radius is determined according to the scatter diagram.
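The patent reads the neighborhood radius off the scatter diagram by eye. One common way to automate that reading, offered here only as an assumed heuristic, is to sort the k-distances and take the value at the "knee": the point farthest from the straight line joining the two ends of the sorted curve. The function name `knee_value` is hypothetical.

```python
def knee_value(k_dists):
    """Pick the sorted k-distance farthest from the chord joining the
    first and last points of the sorted curve (a simple knee heuristic)."""
    ys = sorted(k_dists)
    n = len(ys)
    if n < 3:
        return ys[-1]
    x0, y0, x1, y1 = 0, ys[0], n - 1, ys[-1]
    best_index, best_gap = 0, -1.0
    for x, y in enumerate(ys):
        # Unnormalised perpendicular distance from (x, y) to the chord.
        gap = abs((y1 - y0) * x - (x1 - x0) * y + x1 * y0 - y1 * x0)
        if gap > best_gap:
            best_index, best_gap = x, gap
    return ys[best_index]
```

The returned value is only a starting candidate for the neighborhood radius and would normally be checked against the plotted curve.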
Calculating all core points according to the given initial minimum point number (for example, 4) and the values of the neighborhood radius in the previous step to obtain a core point set; that is, the points with the point p as the center and the number of the points in the neighborhood radius not less than the number of the initial minimum points are used as the core points, and the mapping between the core points and the points with the distance to the core points less than the neighborhood radius is established.
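A minimal sketch of the core point step follows. It assumes Euclidean distance, uses the strict "less than" comparison stated in the text, and counts the point itself toward the minimum number of points, as the textbook DBSCAN formulation does; the patent does not say whether the center point is counted. The name `core_point_map` is hypothetical.

```python
import math


def core_point_map(points, eps, min_pts):
    """Map each core point index to the indices of the other points
    lying closer than eps to it."""
    mapping = {}
    for i, p in enumerate(points):
        # Neighborhood of p, including p itself (distance 0).
        hood = [j for j, q in enumerate(points) if math.dist(p, q) < eps]
        if len(hood) >= min_pts:
            mapping[i] = [j for j in hood if j != i]
    return mapping
```

Points that never appear as a key or inside any value of the returned mapping are the noise candidates.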
And calculating a connectable core point group according to the obtained core point set and the values of the neighborhood radiuses to obtain the noise points. The set of connectable core points comprises at least two connectable core points.
Based on a density clustering algorithm with noise, clustering connectable core point groups in the core point set and spatial points with the distance to the connectable core point groups smaller than the neighborhood radius to obtain employment places or enterprise registration gathering places of personnel in the target area.
In some embodiments, the clustering based on the density with noise clustering algorithm to cluster the connectable core point group and the spatial point whose distance to the connectable core point group is smaller than the neighborhood radius to obtain the employment place of the target regional personnel or the business registration gathering place includes:
clustering a connectable core point group in the core point set and a spatial point with a distance to the connectable core point group smaller than the neighborhood radius based on a density clustering algorithm with noise to obtain a cluster corresponding to the connectable core point group;
clustering clusters corresponding to the connectable core point groups to obtain spatial clustering clusters;
and selecting the clustering cluster with the largest area from the spatial clustering clusters corresponding to different neighborhood radii, and determining the employment place of the personnel in the target area or the enterprise registration gathering place based on the arithmetic mean of the spatial point coordinates in the clustering cluster with the largest area.
It will be appreciated that each connectable core point group, together with the points whose distance to that group is less than the neighborhood radius, forms one cluster.
Different neighborhood radii are selected; for each radius, the density clustering algorithm with noise yields a set of clusters and their noise points, and the clustering effects are compared using scatter diagrams. The parameters Eps = 0.0003 and MinPts = 10 are thereby determined as the model parameters of the density clustering algorithm with noise. The results calculated by the algorithm are shown in Figs. 4, 5, 6 and 7, in which solid black points are abnormal noise data.
The arithmetic mean of the coordinates of the spatial points in the cluster with the largest area is then calculated and taken as the employment place (namely, the enterprise registration gathering place) or the place of residence of the personnel in the target area.
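By way of example only, the clustering and averaging step may be sketched in Python as follows. The sketch uses a plain implementation of the density clustering algorithm with noise (commonly known as DBSCAN) with a single neighborhood radius. The function names, the Euclidean distance, and in particular the use of the bounding-box area as the measure of "largest area" (with the point count as a tie-breaker) are assumptions of this sketch; the embodiments do not state how the area of a cluster is measured.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise."""
    near = [[j for j, q in enumerate(points) if dist(p, q) <= eps] for p in points]
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        if len(near[i]) < min_pts:
            labels[i] = -1  # provisional noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(near[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(near[j]) >= min_pts:  # j is itself a core point: keep expanding
                queue.extend(near[j])
    return labels


def bbox_area(pts):
    """Bounding-box area of a cluster (assumed proxy for its area)."""
    xs, ys = zip(*pts)
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def representative_place(points, eps, min_pts):
    """Arithmetic mean of the coordinates in the largest cluster, or None."""
    clusters = {}
    for p, label in zip(points, dbscan(points, eps, min_pts)):
        if label != -1:
            clusters.setdefault(label, []).append(p)
    if not clusters:
        return None
    largest = max(clusters.values(), key=lambda c: (bbox_area(c), len(c)))
    n = len(largest)
    return (sum(x for x, _ in largest) / n, sum(y for _, y in largest) / n)
```

With the model parameters given above, a call would take the form `representative_place(coords, eps=0.0003, min_pts=10)`, where `coords` is a hypothetical list of coordinate pairs in the same units as Eps.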
The density clustering algorithm with noise has the clear advantages that noise data can be removed effectively and that spatial clusters of arbitrary shape can be found quickly and efficiently. The method clusters the valid data of each person and takes the arithmetic mean of the coordinate points in the largest cluster, so that two valid cluster points are obtained for each person: the daytime cluster point serves as the employment place and the nighttime cluster point serves as the place of residence.
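Purely as an illustration of the daytime and nighttime distinction, the positioning records of one person may be divided into two point sets before clustering, for example as sketched below. The record layout `(hour, lon, lat)`, the function name `split_day_night`, and the hour windows are hypothetical choices for this sketch and are not taken from the embodiments.

```python
def split_day_night(records, day=(9, 17), night=(21, 6)):
    """Split (hour, lon, lat) records into daytime and nighttime point lists.

    Assumed windows: daytime is day[0] <= hour < day[1]; nighttime wraps
    around midnight (hour >= night[0] or hour < night[1]). Records outside
    both windows are discarded.
    """
    day_points, night_points = [], []
    for hour, lon, lat in records:
        if day[0] <= hour < day[1]:
            day_points.append((lon, lat))
        elif hour >= night[0] or hour < night[1]:
            night_points.append((lon, lat))
    return day_points, night_points
```

Each of the two returned lists would then be clustered separately, the daytime list yielding the employment place and the nighttime list yielding the place of residence.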
In summary, the employment center identification method provided by the present invention includes: acquiring mobile phone positioning data of the personnel in the target area, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment places of the personnel in the target area output by the clustering model; acquiring enterprise registration data of the target area, and inputting the enterprise registration data into the clustering model for clustering to obtain the enterprise registration gathering places output by the clustering model; and obtaining an employment center of the target area based on the employment places of the personnel in the target area and the enterprise registration gathering places; wherein the clustering model is obtained by training based on a density clustering algorithm with noise.
In the employment center identification method provided by the invention, the mobile phone positioning data of the personnel in the target area is sent to the clustering model for clustering to obtain the employment places of the personnel in the target area, the enterprise registration data of the target area is input to the clustering model for clustering to obtain the enterprise registration gathering places, and the employment center of the target area is obtained by combining the employment places of the personnel in the target area and the enterprise registration gathering places.
The employment center identification apparatus provided by the present invention is described below, and the employment center identification apparatus described below and the employment center identification method described above can be referred to in correspondence with each other.
As shown in Fig. 8, the present invention provides an employment center identification apparatus 800, including:
the first clustering module 810, configured to obtain mobile phone positioning data of the personnel in the target area, send the mobile phone positioning data to a clustering model for clustering, and obtain the employment place of the personnel in the target area output by the clustering model;
a second clustering module 820, configured to obtain enterprise registration data of the target area, input the enterprise registration data into the clustering model for clustering, and obtain an enterprise registration gathering place output by the clustering model;
the employment center identification module 830, configured to obtain an employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place;
and the clustering model is obtained by training based on a density clustering algorithm with noise.
The electronic device, the computer program product and the storage medium provided by the present invention are described below, and the electronic device, the computer program product and the storage medium described below and the employment center identification method described above may be referred to in correspondence with each other.
Fig. 9 illustrates a physical structure diagram of an electronic device. As shown in Fig. 9, the electronic device may include: a processor 910, a communications interface 920, a memory 930, and a communication bus 940, wherein the processor 910, the communications interface 920, and the memory 930 communicate with each other via the communication bus 940. The processor 910 may invoke logic instructions in the memory 930 to perform an employment center identification method, the method comprising:
acquiring mobile phone positioning data of the personnel in the target area, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment place of the personnel in the target area output by the clustering model;
acquiring enterprise registration data of a target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model;
obtaining an employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place;
and the clustering model is obtained by training based on a density clustering algorithm with noise.
Furthermore, the logic instructions in the memory 930 may be implemented in software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program, the computer program being storable on a non-transitory computer-readable storage medium, wherein the computer program, when executed by a processor, is capable of executing the employment center identification method provided above, the method comprising:
acquiring mobile phone positioning data of the personnel in the target area, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment places of the personnel in the target area output by the clustering model;
acquiring enterprise registration data of a target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model;
obtaining an employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place;
and the clustering model is obtained by training based on a density clustering algorithm with noise.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the employment center identification method provided above, the method comprising:
acquiring mobile phone positioning data of the personnel in the target area, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment place of the personnel in the target area output by the clustering model;
acquiring enterprise registration data of a target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model;
obtaining an employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place;
and the clustering model is obtained by training based on a density clustering algorithm with noise.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art will understand and implement the present invention without inventive effort.
Through the above description of the embodiments, it will be clear to those skilled in the art that the embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware. Based on this understanding, the above technical solutions, in essence or in the part contributing over the prior art, may be embodied in the form of a software product, which may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods of the various embodiments or of some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, and not to limit it; although the invention has been described in detail with reference to the foregoing embodiments, it will be appreciated by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (8)

1. An employment center identification method, comprising:
acquiring mobile phone positioning data of the personnel in the target area, and sending the mobile phone positioning data to a clustering model for clustering to obtain the employment place of the personnel in the target area output by the clustering model;
acquiring enterprise registration data of a target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model;
obtaining an employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place;
the clustering model is obtained by training based on a density clustering algorithm with noise;
the method further comprises the following steps:
acquiring a plurality of different enterprise categories of the employment center, and determining the industrial diversity measurement of the employment center based on the plurality of different enterprise categories;
acquiring public traffic data of the employment center, and determining the radiation range measurement of the employment center based on the public traffic data;
classifying the employment center of the target area based on the employment center industry diversity metric and the employment center radiation range metric;
classifying the employment center of the target area based on the employment center industry diversity metric and the employment center radiation range metric, comprising:
sequentially selecting a plurality of groups of employment center categories based on the employment center industry diversity measurement, wherein each group of employment center category comprises a plurality of types of employment centers;
and determining the employment center radiation range measurement mean value corresponding to each group of employment center categories based on the employment center radiation range measurement, and taking the next group of employment center categories as the employment center target categories under the condition that the difference value between the employment center radiation range measurement mean value corresponding to the next group of employment center categories and the employment center radiation range mean value corresponding to the previous group of employment center categories is smaller than a preset threshold value.
2. The employment center identification method of claim 1 wherein the employment center industry diversity measure is calculated based on the following formula:
[Formula image DEST_PATH_IMAGE001]
wherein the symbol shown in image DEST_PATH_IMAGE002 is the employment center industry diversity metric, s is the number of enterprise categories, and p_i is the proportion of enterprises of the i-th category in the target area.
3. The employment center identification method of claim 1 wherein the public transportation data includes public transportation platform locations and commute times of personnel in the target area.
4. The employment center identification method according to any one of claims 1-3, wherein the clustering model is used for:
analyzing the data input into the clustering model to obtain a spatial point data set;
calculating the k-distance of each spatial point in the spatial point data set;
displaying the k-distance of each space point in the space point data set by using a scatter diagram, and determining a neighborhood radius based on the scatter diagram;
determining a core point set based on a preset initial minimum point number and the neighborhood radius; a core point in the core point set is a spatial point for which the number of spatial points within the neighborhood centered on that point, with the neighborhood radius as its radius, is not less than the initial minimum point number;
and clustering connectable core point groups in the core point set and spatial points with the distance to the connectable core point groups smaller than the neighborhood radius based on a density clustering algorithm with noise to obtain employment places of the target region personnel or the enterprise registration gathering places.
5. The employment center identification method according to claim 4, wherein the clustering based on the density clustering algorithm with noise for the connectable core point group in the core point set and the spatial point whose distance to the connectable core point group is smaller than the neighborhood radius to obtain the employment location of the target regional personnel or the business registration gathering location comprises:
clustering a connectable core point group in the core point set and a spatial point with a distance to the connectable core point group smaller than the neighborhood radius based on a density clustering algorithm with noise to obtain a cluster corresponding to the connectable core point group;
clustering clusters corresponding to the connectable core point groups to obtain spatial clustering clusters;
and selecting the clustering cluster with the largest area from the spatial clustering clusters corresponding to different neighborhood radii, and determining the employment place of the personnel in the target area or the enterprise registration gathering place based on the arithmetic mean of the spatial point coordinates in the clustering cluster with the largest area.
6. An employment center identification apparatus, comprising:
the first clustering module is used for acquiring mobile phone positioning data of the target area personnel, sending the mobile phone positioning data to a clustering model for clustering, and obtaining the employment places of the target area personnel output by the clustering model;
the second clustering module is used for acquiring enterprise registration data of a target area, inputting the enterprise registration data into the clustering model for clustering, and obtaining an enterprise registration gathering place output by the clustering model;
the employment center identification module is used for obtaining the employment center of the target area based on the employment place of the personnel in the target area and the enterprise registration gathering place;
the clustering model is obtained by training based on a density clustering algorithm with noise;
the employment center identification module is also used for acquiring a plurality of different enterprise categories of the employment center and determining the industrial diversity measurement of the employment center based on the different enterprise categories; acquiring public traffic data of the employment center, and determining the radiation range measurement of the employment center based on the public traffic data; classifying the employment center of the target area based on the employment center industry diversity metric and the employment center radiation range metric; the employment center identification module is also used for sequentially selecting a plurality of groups of employment center categories based on the employment center industry diversity measurement, and each group of employment center categories comprise a plurality of types of employment centers; and determining the employment center radiation range measurement mean value corresponding to each group of employment center categories based on the employment center radiation range measurement, and taking the next group of employment center categories as the employment center target categories under the condition that the difference value between the employment center radiation range measurement mean value corresponding to the next group of employment center categories and the employment center radiation range mean value corresponding to the previous group of employment center categories is smaller than a preset threshold value.
7. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the employment center identification method of any one of claims 1 through 5.
8. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the employment center identification method according to any one of claims 1 through 5.
CN202211395497.2A 2022-11-09 2022-11-09 Employment center identification method and device, electronic equipment and storage medium Active CN115438138B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211395497.2A CN115438138B (en) 2022-11-09 2022-11-09 Employment center identification method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211395497.2A CN115438138B (en) 2022-11-09 2022-11-09 Employment center identification method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN115438138A CN115438138A (en) 2022-12-06
CN115438138B true CN115438138B (en) 2023-04-07

Family

ID=84252174

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211395497.2A Active CN115438138B (en) 2022-11-09 2022-11-09 Employment center identification method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115438138B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110222959A (en) * 2019-05-23 2019-09-10 河南大学 A kind of urban employment accessibility measuring method and system based on big data
CN112566029A (en) * 2020-12-09 2021-03-26 深圳市城市规划设计研究院有限公司 Urban employment center identification method and device based on mobile phone positioning data
CN112613530A (en) * 2020-11-23 2021-04-06 北京思特奇信息技术股份有限公司 Cell resident identification method and system based on adaptive density clustering algorithm
CN114219023A (en) * 2021-12-14 2022-03-22 中国平安财产保险股份有限公司 Data clustering method and device, electronic equipment and readable storage medium

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180018615A1 (en) * 2016-07-15 2018-01-18 Stafficiency Inc. System and Method for Management of Variable Staffing and Productivity
US20190102742A1 (en) * 2017-09-29 2019-04-04 Oracle International Corporation Diversity impact monitoring techniques
CN111210269B (en) * 2020-01-02 2020-09-18 平安科技(深圳)有限公司 Object identification method based on big data, electronic device and storage medium
CN113836373B (en) * 2021-01-20 2022-12-13 国义招标股份有限公司 Bidding information processing method and device based on density clustering and storage medium
CN112800165B (en) * 2021-04-06 2021-08-27 北京智源人工智能研究院 Industrial cluster positioning method and device based on clustering algorithm and electronic equipment
CN113722617A (en) * 2021-09-30 2021-11-30 京东城市(北京)数字科技有限公司 Method and device for identifying actual office address of enterprise and electronic equipment

Also Published As

Publication number Publication date
CN115438138A (en) 2022-12-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant