WO2019179173A1 - Method and device for determining a business circle - Google Patents

Method and device for determining a business circle

Info

Publication number
WO2019179173A1
WO2019179173A1 PCT/CN2018/119319 CN2018119319W WO2019179173A1 WO 2019179173 A1 WO2019179173 A1 WO 2019179173A1 CN 2018119319 W CN2018119319 W CN 2018119319W WO 2019179173 A1 WO2019179173 A1 WO 2019179173A1
Authority
WO
WIPO (PCT)
Prior art keywords
stores
business circle
store
value
distance
Prior art date
Application number
PCT/CN2018/119319
Other languages
English (en)
Chinese (zh)
Inventor
黄凯
钟蛵雩
贾全慧
余泉
Original Assignee
阿里巴巴集团控股有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司
Publication of WO2019179173A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 - Market modelling; Market analysis; Collecting market data
    • G06Q30/0204 - Market segmentation
    • G06Q30/0205 - Location or geographical consideration
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06Q - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 - Commerce
    • G06Q30/02 - Marketing; Price estimation or determination; Fundraising
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/23 - Clustering techniques

Definitions

  • the embodiments of the present specification relate to the field of machine learning, and more particularly, to a method and apparatus for training a business circle determination model, a method and apparatus for determining a business circle, and a method and apparatus for updating a business circle determination.
  • Offline stores have developed vigorously. Unlike online merchants, offline merchants operate physical stores, and those stores tend to cluster: they are geographically concentrated and can therefore be divided into business circles. Business circle information deepens the understanding of a store: it helps identify the offline industry market and assists in judging the store's operating status.
  • Existing business circle information includes business circle information marked by offline BD, and business circle information taken mainly from the business circle results of a popular public review platform. All of this information is obtained by manual labeling. There is therefore a need for a more effective solution for determining a business circle.
  • the embodiments of the present specification aim to provide a more effective solution for determining a business circle to solve the deficiencies in the prior art.
  • An aspect of the present specification provides a method for training a business circle determination model, comprising: acquiring respective location information of a plurality of stores within a predetermined geographical range and respective business circle annotation information of the plurality of stores; according to a CFSFDP clustering algorithm, using the location information, calculating for each store a value of a local density ρ, a value of a minimum distance δ to a store of higher density, and a value of their product γ; acquiring business circle determination information of each store according to respective current thresholds of ρ, δ, and γ; calculating the similarity of all the business circle determination information with respect to all the business circle annotation information by using the business circle determination information and the business circle annotation information of the plurality of stores; and adjusting the respective thresholds of ρ, δ, and γ so that the similarity increases.
  • a value of the local density ρ of a store i in the plurality of stores is ρ_i, wherein
  • d_c is a radius threshold
  • d_ij is the distance between the store i and the store j in the plurality of stores
  • i and j are natural numbers less than or equal to the total number of stores of the plurality of stores, and i ≠ j.
  • a value of the local density ρ of a store i in the plurality of stores is ρ_i, wherein
  • d_c is a radius threshold
  • d_ij is the distance between the store i and the store j in the plurality of stores
  • i and j are natural numbers less than or equal to the total number of stores of the plurality of stores, and i ≠ j.
  • the location information of the store i and the store j is expressed as longitude and latitude (Lon_i, Lat_i) and (Lon_j, Lat_j), respectively, and the distance d_ij is calculated as follows:
  • R is the radius of the Earth.
  • the similarity is represented by a parameter WFS, wherein
  • i is an integer from 0 to A
  • j is an integer from 0 to B
  • A is the number of labeled business circles
  • B is the number of determined business circles
  • N_i is the number of stores in the i-th labeled business circle
  • N is the total number of stores of the plurality of stores
  • P_ij is the precision for the i-th labeled business circle and the j-th determined business circle
  • R_ij is the recall for the i-th labeled business circle and the j-th determined business circle. The set of labeled scattered stores is taken as the 0th labeled business circle, and the set of determined scattered stores is taken as the 0th determined business circle, wherein a labeled scattered store is a labeled store that does not belong to any labeled business circle, and a determined scattered store is a determined store that does not belong to any determined business circle.
  • adjusting the respective thresholds of ρ, δ, and γ so that the similarity increases includes adjusting the respective thresholds of ρ, δ, and γ so that the similarity is maximized.
  • Another aspect of the present specification provides a method for determining a business circle, comprising: acquiring respective location information of a plurality of stores within a predetermined geographical range; according to a CFSFDP clustering algorithm, using the location information, calculating for each store a value of a local density ρ, a value of a minimum distance δ to a store of higher density, and a value of their product γ; and determining a business circle for the plurality of stores by means of the adjusted thresholds of ρ, δ, and γ obtained by the above method of training the business circle determination model.
  • the predetermined geographic range is a predetermined city.
  • Another aspect of the present specification provides a method for updating a business circle determination, comprising: acquiring respective first location information of a plurality of first stores within a predetermined geographic range and first distances between the first stores; acquiring respective second location information of at least one second store within the predetermined geographic range; using the first location information and the second location information, calculating second distances between the second stores and third distances between any one of the second stores and any one of the first stores; according to the CFSFDP clustering algorithm, calculating, based on the first distances, the second distances, and the third distances, a value of a local density ρ, a value of a minimum distance δ to a store of higher density, and a value of their product γ for each of the plurality of first stores and the at least one second store; and determining a business circle for the plurality of first stores and the at least one second store by means of the adjusted thresholds of ρ, δ, and γ obtained by the above method of training the business circle determination model.
  • the method of updating the business circle determination is performed every predetermined time period.
  • Another aspect of the present specification provides an apparatus for training a business circle determination model, including: a first acquisition unit configured to acquire respective location information of a plurality of stores within a predetermined geographic range and respective business circle annotation information of the plurality of stores; a first calculation unit configured to calculate, according to the CFSFDP clustering algorithm and using the location information, a value of a local density ρ of each store, a value of a minimum distance δ to a store of higher density, and a value of their product γ; a second acquisition unit configured to acquire the business circle determination information of each store according to the respective current thresholds of ρ, δ, and γ; a second calculation unit configured to calculate, by using the respective business circle determination information and business circle annotation information of the plurality of stores, the similarity of all the business circle determination information with respect to all the business circle annotation information; and a threshold adjustment unit configured to adjust the respective thresholds of ρ, δ, and γ so that the similarity increases.
  • Another aspect of the present specification provides an apparatus for determining a business circle, comprising: an obtaining unit configured to acquire respective location information of a plurality of stores within a predetermined geographical range; a calculating unit configured to calculate, according to the CFSFDP clustering algorithm and using the location information, a value of a local density ρ of each store, a value of a minimum distance δ to a store of higher density, and a value of their product γ; and a determining unit configured to determine a business circle for the plurality of stores by means of the adjusted thresholds of ρ, δ, and γ obtained by the above method of training the business circle determination model.
  • Another aspect of the present specification provides an apparatus for updating a business circle determination, including: a first obtaining unit configured to acquire respective first location information of a plurality of first stores within a predetermined geographic range and first distances between the first stores; a second obtaining unit configured to acquire respective second location information of at least one second store within the predetermined geographic range; a first calculating unit configured to calculate, using the first location information and the second location information, second distances between the second stores and third distances between any one of the second stores and any one of the first stores; a second calculating unit configured to calculate, according to the CFSFDP clustering algorithm and based on the first distances, the second distances, and the third distances, a value of a local density ρ of each of the plurality of first stores and the at least one second store, a value of a minimum distance δ to a store of higher density, and a value of their product γ; and a determining unit configured to determine a business circle for the plurality of first stores and the at least one second store by means of the adjusted thresholds of ρ, δ, and γ obtained by the above method of training the business circle determination model.
  • the business circle can be quickly and accurately determined, and the stability of the determination result can be ensured.
  • the embodiments of the present specification also effectively reduce the computational complexity and optimize the calculation time.
  • FIG. 1 shows a schematic diagram of a system 100 for determining a business circle in accordance with an embodiment of the present specification
  • FIG. 2 is a flow chart showing a method of training a business circle determination model according to an embodiment of the present specification
  • Figure 3 schematically shows an example of a ρ-δ distribution map
  • Figure 4 schematically shows an example of a γ distribution map
  • FIG. 5 is a flowchart showing a method of determining a business circle according to an embodiment of the present specification
  • FIG. 6 shows a flow chart of a method for updating a business circle determination according to an embodiment of the present specification
  • Figure 7 illustrates an apparatus 700 for training a business circle decision model in accordance with an embodiment of the present specification
  • Figure 8 illustrates an apparatus 800 for determining a business circle in accordance with an embodiment of the present specification
  • FIG. 9 illustrates an apparatus 900 for updating a business circle determination in accordance with an embodiment of the present specification.
  • FIG. 1 shows a schematic diagram of a system 100 for determining a business circle in accordance with an embodiment of the present specification.
  • system 100 includes a clustering module 11, an evaluation module 12, and a threshold adjustment module 13.
  • the training samples are input to the clustering module 11.
  • the training sample includes the location information of the store and the respective business circle labeling information of the store.
  • The clustering module 11 calculates, according to the CFSFDP clustering algorithm and from the location information, the value of the local density ρ of each store, the value of the minimum distance δ to a store of higher density, and the value of their product γ, and obtains the business circle determination information of each store according to the respective current thresholds of ρ, δ, and γ.
  • The clustering module 11 transmits the business circle determination information to the evaluation module 12.
  • The evaluation module 12 uses the respective business circle determination information and business circle annotation information of the plurality of stores to calculate the similarity of all the business circle determination information with respect to all the business circle annotation information as an evaluation score, and transmits the evaluation score to the threshold adjustment module 13.
  • The threshold adjustment module 13 adjusts the thresholds of the parameters ρ, δ, and γ in the clustering module 11 according to the evaluation score so as to increase the evaluation score and, after a number of adjustments, to maximize it.
  • Once the thresholds are fixed, the full set of store information can be clustered by the clustering module 11 to obtain the business circle determination result.
  • As shown in FIG. 2, the method of training a business circle determination model includes: in step S21, acquiring respective location information of a plurality of stores within a predetermined geographical range and respective business circle annotation information of the plurality of stores; in step S22, according to a CFSFDP clustering algorithm, using the location information, calculating for each store the value of the local density ρ, the value of the minimum distance δ to a store of higher density, and the value of their product γ; in step S23, acquiring the business circle determination information of each store according to the respective current thresholds of ρ, δ, and γ; in step S24, calculating the similarity of all the business circle determination information with respect to all the business circle annotation information by using the business circle determination information and the business circle annotation information of each of the plurality of stores; and in step S25, adjusting the respective thresholds of ρ, δ, and γ so that the similarity increases.
  • In step S21, the location information of each of a plurality of stores within a predetermined geographical range and the business circle annotation information of each of the plurality of stores are acquired.
  • the predetermined geographic extent may be, for example, a geographic extent including more than 100 business districts, such as a district, county, etc. of the city.
  • the number of stores in a plurality of stores may be, for example, on the order of several thousand, for example, 3,000.
  • The business circles covered by the plurality of stores exhibit various positional relationships; for example, business circles may be adjacent to, intersect with, or be separated from one another.
  • the location information of the store may be expressed in various known forms.
  • the location information of the store may be the latitude and longitude of the store, or the location information of the store may be the city coordinates or the like.
  • The business circle annotation information of a store indicates whether the store belongs to a business circle and, if so, which one.
  • The business circle annotation information of a store may be carried in a labeled business circle field. When the field is 0, the store is a scattered store that does not belong to any business circle. When the field is a natural number, the store belongs to the business circle identified by that number.
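The field encoding described above can be illustrated with a minimal sketch. The function name `group_by_labeled_circle` and the sample labels are hypothetical and only show how a 0 field separates scattered stores from stores that belong to a numbered business circle.

```python
from collections import defaultdict


def group_by_labeled_circle(labels):
    """Group store indices by their labeled business circle field.

    A field of 0 marks a scattered store; a positive integer names a circle.
    """
    circles = defaultdict(list)
    for store_idx, field in enumerate(labels):
        if field < 0:
            raise ValueError("labeled business circle field must be >= 0")
        circles[field].append(store_idx)
    return dict(circles)


# Hypothetical labels for seven stores.
labels = [1, 1, 0, 2, 2, 2, 0]
groups = group_by_labeled_circle(labels)
scattered = groups.get(0, [])  # stores that belong to no business circle
```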
  • In step S22, according to the CFSFDP clustering algorithm, the value of the local density ρ of each store, the value of the minimum distance δ to a store of higher density, and the value of their product γ are calculated using the location information.
  • The value of the local density ρ of a store i in the plurality of stores is ρ_i, wherein ρ_i is calculated by the following formula (1):
  • The CFSFDP clustering algorithm is used to determine the business circles, that is, to cluster the store points. Since the form of a business circle is generally fixed, d_c is set to 0.2 km (i.e., 200 m). With this setting, the stability of the clustering result is greatly improved.
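Formula (1) does not survive in this text. The sketch below assumes it is the cutoff-kernel density commonly used with CFSFDP, in which ρ_i counts the other stores lying closer than d_c to store i. The function name `local_density_cutoff` and the sample positions are illustrative only.

```python
def local_density_cutoff(dist, d_c=0.2):
    """Count, for each store i, the other stores j with d_ij < d_c.

    dist is a symmetric matrix (list of lists) of distances in km.
    The cutoff form is an assumption, since formula (1) is not shown.
    """
    n = len(dist)
    return [
        sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
        for i in range(n)
    ]


# Four hypothetical stores on a line at 0, 0.1, 0.15 and 5 km.
positions = [0.0, 0.1, 0.15, 5.0]
dist = [[abs(a - b) for b in positions] for a in positions]
rho = local_density_cutoff(dist)
```

With d_c = 0.2 km, the three stores that sit within 150 m of each other each see two neighbours, while the isolated store sees none.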
  • The distance d_ij between the store i and the store j may be calculated with different formulas depending on the form of the store location information. For example, when the location information of the store i and the store j is expressed as longitude and latitude (Lon_i, Lat_i) and (Lon_j, Lat_j), respectively, the distance d_ij is calculated by the following formula (2):
  • The d_ij calculated by formula (2) is the distance between two points on a spherical surface, where R is the radius of the Earth, with an average value of 6371 km.
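The exact form of formula (2) is not shown in this text. As a minimal sketch of a distance between two points on a sphere, the block below uses the haversine form, which is one standard choice and is an assumption here. The function name `great_circle_km` is hypothetical; R = 6371 km is the value given above.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius given in the text


def great_circle_km(lon_i, lat_i, lon_j, lat_j, radius=EARTH_RADIUS_KM):
    """Great-circle distance in km between two points given in degrees.

    Haversine form (an assumption; the source formula is not reproduced).
    """
    phi_i, phi_j = math.radians(lat_i), math.radians(lat_j)
    d_phi = phi_j - phi_i
    d_lambda = math.radians(lon_j - lon_i)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi_i) * math.cos(phi_j) * math.sin(d_lambda / 2) ** 2)
    # min() guards against rounding pushing sqrt(a) slightly above 1.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))
```

The haversine form stays numerically stable for the short distances (hundreds of metres) that matter when d_c is 0.2 km.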
  • d_ij can also be a Euclidean distance, a Minkowski distance, a Manhattan distance, and the like.
  • For example, when the location information of the store i and the store j is represented as three-dimensional or two-dimensional coordinates in a city coordinate system, the Euclidean distance between the store i and the store j can be calculated as d_ij.
  • Each d_ij can be calculated by, for example, the above formula (2), thereby obtaining a distance matrix.
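A distance matrix can be assembled from any pairwise metric. The sketch below is illustrative: `distance_matrix` and `euclidean` are hypothetical names, and the default metric assumes planar city coordinates in km. A spherical metric for longitude and latitude pairs could be passed in instead.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two coordinate tuples (Python 3.8+)."""
    return math.dist(p, q)


def distance_matrix(points, metric=euclidean):
    """Return the symmetric matrix of pairwise distances d_ij."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Each unordered pair is computed once and mirrored.
            dist[i][j] = dist[j][i] = metric(points[i], points[j])
    return dist


# Three hypothetical stores in city coordinates (km).
stores = [(0.0, 0.0), (3.0, 4.0), (0.0, 1.0)]
dist = distance_matrix(stores)
```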
  • A calculation formula based on a Gaussian kernel function of the distance is introduced, and ρ_i is calculated by the following formula (3), wherein
  • d_c is a radius threshold
  • d_ij is the distance between the store i and the store j in the plurality of stores
  • i and j are natural numbers less than or equal to the total number of stores of the plurality of stores, and i ≠ j.
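Formula (3) and the definitions of δ and γ are not recoverable from this text. The sketch below therefore assumes the usual CFSFDP forms: a Gaussian kernel density ρ_i = Σ_{j≠i} exp(−(d_ij/d_c)²), δ_i as the distance to the nearest store of strictly higher density, and γ_i = ρ_i·δ_i. For the densest store, which has no denser neighbour, δ falls back to its largest distance; that fallback is a convention of the original CFSFDP algorithm and an assumption here. All function names are hypothetical.

```python
import math


def gaussian_density(dist, d_c=0.2):
    """rho_i = sum over j != i of exp(-(d_ij / d_c)^2) (assumed form)."""
    n = len(dist)
    return [
        sum(math.exp(-(dist[i][j] / d_c) ** 2) for j in range(n) if j != i)
        for i in range(n)
    ]


def min_distance_to_denser(dist, rho):
    """delta_i = distance to the nearest store with strictly higher rho.

    Stores with no denser neighbour get their largest distance (assumption).
    """
    n = len(dist)
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    return delta


def gamma_values(rho, delta):
    """gamma_i = rho_i * delta_i."""
    return [r * d for r, d in zip(rho, delta)]


# Five hypothetical stores on a line (km): a tight triple and a pair.
positions = [0.0, 0.1, 0.2, 3.0, 3.1]
dist = [[abs(a - b) for b in positions] for a in positions]
rho = gaussian_density(dist)
delta = min_distance_to_denser(dist, rho)
gamma = gamma_values(rho, delta)
```

In this toy layout the middle store of the triple has the highest density and the largest γ, which matches the intuition that a business circle center is both dense and far from any denser point.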
  • The center of a business circle can be determined by plotting the ρ-δ distribution map.
  • Fig. 3 shows an example of a ρ-δ distribution map.
  • The abscissa of the ρ-δ distribution map is ρ and the ordinate is δ.
  • A business circle center has both a high local density ρ and a large distance δ to any store of higher density; therefore, the points located in the upper-right portion of the ρ-δ distribution map may be cluster centers.
  • In the map, the points drawn in shades other than black may be business circle centers.
  • Fig. 4 schematically shows an example of a γ distribution map. As shown in FIG. 4, the ordinate is γ, and the abscissa is the serial number of the store i after sorting by the size of γ_i, each serial number corresponding to one store. A larger γ means a larger ρ·δ, that is, the store corresponding to the point is more likely to be a business circle center.
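The data behind a γ distribution map can be prepared with a simple sort. In the sketch below, the function name `gamma_ranking` is hypothetical and the descending order is an assumption, since the text only says the stores are sorted by the size of γ_i.

```python
def gamma_ranking(gamma):
    """Return (serial_number, store_index, gamma) sorted by decreasing gamma."""
    order = sorted(range(len(gamma)), key=lambda i: gamma[i], reverse=True)
    return [(rank, i, gamma[i]) for rank, i in enumerate(order, start=1)]


# Hypothetical gamma values for three stores.
ranking = gamma_ranking([0.2, 4.7, 2.3])
```

Plotting the third element against the first gives a curve in which candidate centers appear as the few points with markedly larger γ.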
  • In step S23, the business circle determination information of each store is acquired based on the current thresholds of ρ, δ, and γ.
  • The stores that become business circle centers are determined based on the set thresholds of ρ, δ, and γ.
  • In the distribution maps, the dashed line perpendicular to the ρ axis represents the threshold of ρ, the dashed line perpendicular to the δ axis represents the threshold of δ, and the dashed line perpendicular to the γ axis represents the threshold of γ.
  • The points to the upper right of the intersection of the ρ threshold line and the δ threshold line are first business circle centers, and the points above the γ threshold line are second business circle centers.
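The threshold rule can be sketched as two filters. The function name `select_centers` and the sample values are hypothetical. How the first and second center sets are combined is not stated in the text, so the sketch returns them separately.

```python
def select_centers(rho, delta, gamma, rho_t, delta_t, gamma_t):
    """Pick candidate business circle centers from the three thresholds.

    first:  stores to the upper right of the rho/delta threshold lines.
    second: stores above the gamma threshold line.
    """
    n = len(rho)
    first = [i for i in range(n) if rho[i] > rho_t and delta[i] > delta_t]
    second = [i for i in range(n) if gamma[i] > gamma_t]
    return first, second


# Hypothetical values for four stores.
rho = [5.0, 1.0, 4.0, 0.5]
delta = [3.0, 0.1, 2.5, 4.0]
gamma = [r * d for r, d in zip(rho, delta)]  # [15.0, 0.1, 10.0, 2.0]
first, second = select_centers(rho, delta, gamma,
                               rho_t=3.0, delta_t=1.0, gamma_t=1.5)
```

Here store 3 is isolated (large δ, low ρ): it fails the ρ threshold but passes the γ threshold, which shows why the two rules can disagree.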
  • After the business circle centers are determined, each of the plurality of stores is clustered, that is, assigned to a business circle.
  • In one approach, each store is assigned to the business circle of the store that is closest to it among the stores of higher density.
  • In another approach, each store is assigned to the business circle of the nearest business circle center. When a store is too far from every store of higher density, or too far from every business circle center, for example more than 2 km away, the store may be regarded as a scattered store that does not belong to any business circle, or equivalently as belonging to a business circle whose identification number is 0. The business circle determination information of each store is thereby obtained.
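A minimal sketch of the first assignment rule (nearest store of higher density) combined with the 2 km cutoff for scattered stores follows. The function name `assign_circles`, the tie handling, and the sample data are assumptions made for illustration.

```python
def assign_circles(dist, rho, centers, max_km=2.0):
    """Label each store with a circle id (1..len(centers)) or 0 if scattered.

    Stores are visited in decreasing density so that the nearest denser
    store is always labeled before the store that inherits its label.
    Ties in density are broken by list order (an assumption).
    """
    n = len(dist)
    label = [None] * n
    for circle_id, center in enumerate(centers, start=1):
        label[center] = circle_id
    order = sorted(range(n), key=lambda i: rho[i], reverse=True)
    for pos, i in enumerate(order):
        if label[i] is not None:
            continue
        denser = order[:pos]  # stores visited earlier, i.e. at least as dense
        if not denser:
            label[i] = 0
            continue
        nearest = min(denser, key=lambda j: dist[i][j])
        label[i] = label[nearest] if dist[i][nearest] <= max_km else 0
    return label


# Six hypothetical stores on a line (km) with precomputed densities.
positions = [0.0, 0.1, 0.2, 3.0, 3.1, 9.0]
dist = [[abs(a - b) for b in positions] for a in positions]
rho = [1.15, 1.56, 1.15, 0.78, 0.78, 0.0]
labels = assign_circles(dist, rho, centers=[1, 3])
```

The last store lies 5.9 km from its nearest denser neighbour, so it is marked 0 (scattered) rather than joined to the second circle.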
  • In step S24, the similarity of all the business circle determination information with respect to all the business circle annotation information is calculated by using the business circle determination information and the business circle annotation information of the plurality of stores.
  • the similarity is a degree indicating that the business circle determination information is similar to the business circle annotation information.
  • the similarity can be calculated in various forms to evaluate the judgment result. For example, Precision, Recall, AUC score, log loss, Accuracy, etc. can all be used to indicate similarity.
  • In one embodiment, the similarity is expressed by a WFS score, wherein the WFS is calculated by formula (5), in which
  • i is an integer from 0 to A
  • j is an integer from 0 to B
  • A is the number of labeled business circles
  • B is the number of determined business circles
  • N_i is the number of stores in the i-th labeled business circle
  • N is the total number of stores of the plurality of stores
  • P_ij is the precision for the i-th labeled business circle and the j-th determined business circle
  • R_ij is the recall for the i-th labeled business circle and the j-th determined business circle. The set of labeled scattered stores is taken as the 0th labeled business circle, and the set of determined scattered stores is taken as the 0th determined business circle, wherein a labeled scattered store is a labeled store that does not belong to any labeled business circle, and a determined scattered store is a determined store that does not belong to any determined business circle.
  • P_ij can be calculated by the following formula (7)
  • R_ij can be calculated by the following formula (8):
  • i is an integer from 0 to A
  • j is an integer from 0 to B
  • A is the number of labeled business circles
  • B is the number of determined business circles.
  • x_ij is the number of stores in the i-th labeled business circle that are assigned to the j-th determined business circle.
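Formulas (5), (7), and (8) are missing from this text, so the following is only a sketch under stated assumptions: P_ij = x_ij divided by the size of the j-th determined circle, R_ij = x_ij / N_i, F_ij as the harmonic mean of P_ij and R_ij, and WFS = Σ_i (N_i/N)·max_j F_ij. The function name `weighted_f_score` and the exact weighting are hypothetical and may differ from the patent's definition.

```python
def weighted_f_score(labeled, determined):
    """Weighted F-score between labeled and determined circle ids.

    labeled[k] and determined[k] are the circle ids of store k; 0 = scattered.
    Assumed form: sum_i (N_i / N) * max_j F_ij, F_ij = 2PR / (P + R).
    """
    n = len(labeled)
    if n == 0 or n != len(determined):
        raise ValueError("inputs must be non-empty and of equal length")
    x = {}                # x[(i, j)]: stores of labeled circle i put in j
    size_labeled = {}     # N_i
    size_determined = {}  # size of determined circle j
    for i, j in zip(labeled, determined):
        x[(i, j)] = x.get((i, j), 0) + 1
        size_labeled[i] = size_labeled.get(i, 0) + 1
        size_determined[j] = size_determined.get(j, 0) + 1
    best_f = {i: 0.0 for i in size_labeled}
    for (i, j), x_ij in x.items():
        precision = x_ij / size_determined[j]
        recall = x_ij / size_labeled[i]
        f = 2 * precision * recall / (precision + recall)
        best_f[i] = max(best_f[i], f)
    return sum(size_labeled[i] / n * best_f[i] for i in size_labeled)


# Same partition under different ids scores 1; a split circle scores lower.
perfect = weighted_f_score([1, 1, 2, 2, 0], [5, 5, 7, 7, 0])
split = weighted_f_score([1, 1, 1, 1], [1, 1, 2, 2])
```

Because each labeled circle is matched to its best determined circle, the score does not depend on how the determined circles happen to be numbered.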
  • In step S25, the respective thresholds of ρ, δ, and γ are adjusted so that the similarity increases.
  • At least one of the three threshold lines in FIGS. 3 and 4 may be shifted in either direction about its current position to obtain an adjusted threshold.
  • The adjusted business circle determination information of each store is then acquired based on the adjusted thresholds of ρ, δ, and γ, and the adjusted similarity is calculated using the business circle annotation information and the adjusted business circle determination information.
  • According to the change in the similarity, the threshold line may be moved repeatedly in the direction in which the similarity becomes larger, so that the respective thresholds of ρ, δ, and γ are adjusted continuously and the similarity keeps increasing.
  • In one embodiment, the respective thresholds of ρ, δ, and γ are adjusted until the similarity is maximized, thereby obtaining the final thresholds of ρ, δ, and γ.
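Moving threshold lines in whichever direction raises the similarity resembles a coordinate-wise hill climb. The sketch below illustrates that idea and is not the patent's procedure: `hill_climb_thresholds` is a hypothetical name, `score` stands for any function that re-clusters with the given thresholds and returns the similarity, and the fixed step sizes are an assumption.

```python
def hill_climb_thresholds(score, start, steps, max_rounds=100):
    """Greedy coordinate search over the (rho, delta, gamma) thresholds.

    Each round tries moving every threshold one step up and one step down
    and keeps any move that increases score(*thresholds).
    """
    current = list(start)
    best = score(*current)
    for _ in range(max_rounds):
        improved = False
        for k in range(len(current)):
            for direction in (+1, -1):
                trial = list(current)
                trial[k] += direction * steps[k]
                s = score(*trial)
                if s > best:
                    current, best, improved = trial, s, True
        if not improved:
            break  # no single move helps: a (possibly local) maximum
    return tuple(current), best


# Toy score with a single peak at thresholds (3, 1, 2).
def toy_score(rho_t, delta_t, gamma_t):
    return -((rho_t - 3) ** 2 + (delta_t - 1) ** 2 + (gamma_t - 2) ** 2)


thresholds, value = hill_climb_thresholds(toy_score, (0, 0, 0), (1, 1, 1))
```

A greedy search of this kind can stop at a local maximum, so in practice it might be restarted from several starting thresholds.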
  • FIG. 5 is a flowchart of a method for determining a business circle according to an embodiment of the present specification, including the steps of: in step S51, acquiring respective location information of a plurality of stores within a predetermined geographic range; in step S52, according to a CFSFDP clustering algorithm, using the location information, calculating for each store the value of the local density ρ, the value of the minimum distance δ to a store of higher density, and the value of their product γ; and, in step S53, determining a business circle for the plurality of stores by means of the adjusted thresholds of ρ, δ, and γ obtained by the above method of training a business circle determination model.
  • the predetermined geographic range may be a predetermined city range, that is, the business circle is determined in units of cities.
  • the process of the step S52 is substantially the same as the step S22 of FIG. 2, and the process of the step S53 is substantially the same as the step S23 of FIG. 2, and details are not described herein again.
  • FIG. 6 shows a flowchart of a method for updating a business circle determination, including: in step S61, acquiring respective first location information of a plurality of first stores within a predetermined geographic range and first distances between the first stores; in step S62, acquiring respective second location information of at least one second store within the predetermined geographic range; in step S63, using the first location information and the second location information, calculating second distances between the second stores and third distances between any one of the second stores and any one of the first stores; in step S64, according to the CFSFDP clustering algorithm, calculating, based on the first distances, the second distances, and the third distances, the value of the local density ρ of each of the plurality of first stores and the at least one second store, the value of the minimum distance δ to a store of higher density, and the value of their product γ; and in step S65, determining a business circle for the plurality of first stores and the at least one second store.
  • The method shown in FIG. 6 is an incremental, iterative approach. As offline merchants keep growing and the number of stores keeps increasing, directly calculating the full distance matrix faces a computational complexity of O(N²). The method shown in FIG. 6 therefore reduces the amount of calculation to speed up the computation.
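The saving can be made concrete by counting pairwise distances. The symbols n_old (number of first stores) and n_new (number of second stores) are introduced here only for illustration. A full recomputation evaluates every unordered pair, whereas the incremental update reuses the n_old(n_old − 1)/2 first distances and evaluates only the pairs that involve a second store:

```latex
\begin{align*}
\text{full recomputation:} \quad
  & \frac{(n_{\mathrm{old}} + n_{\mathrm{new}})(n_{\mathrm{old}} + n_{\mathrm{new}} - 1)}{2} \\
\text{incremental update:} \quad
  & \underbrace{n_{\mathrm{new}}\, n_{\mathrm{old}}}_{\text{third distances}}
    + \underbrace{\frac{n_{\mathrm{new}}(n_{\mathrm{new}} - 1)}{2}}_{\text{second distances}}
\end{align*}
```

The difference between the two counts is exactly n_old(n_old − 1)/2, so when n_new is much smaller than n_old the update cost grows roughly linearly in n_old rather than quadratically.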
  • In step S61, the first location information of each of the plurality of first stores within the predetermined geographic range and the first distances between the respective first stores are acquired.
  • The predetermined geographic range may be, for example, a predetermined city.
  • For example, location information of a plurality of stores may be acquired in an initial month M_0, and the distances between the respective stores may be calculated as described above, thereby obtaining a distance matrix N_0.
  • In month M_1, at least one store is newly added, or the location information of at least one of the existing stores has changed; such a store is a second store, denoted x_new.
  • A store among the plurality of stores that is not a second store is a first store, denoted x_old.
  • When the second store is a newly added store, the first stores are all the stores acquired in month M_0.
  • When the second store is an existing store whose location has changed, the first stores are the stores that remain after the stores whose locations changed are removed from all the stores acquired in month M_0.
  • The first location information of each first store and the first distances between the first stores can be obtained from the store location information acquired in month M_0 and from the distance matrix N_0 calculated in month M_0.
  • The first distance is the distance between one store x_old and another store x_old.
  • In step S62, the second location information of each of the at least one second store within the predetermined geographic range is acquired.
  • When the second store is a newly added store, the location information of the newly added second store is acquired.
  • When the second store is a store whose location information changed in month M_1, the changed location information of the second store is acquired.
  • In step S63, using the first location information and the second location information, the second distances between the second stores and the third distances between any one of the second stores and any one of the first stores are calculated. That is, the second distance between x_new and x_new and the third distance between x_new and x_old are calculated.
  • In step S64, according to the CFSFDP clustering algorithm, the value of the local density ρ of each of the plurality of first stores and the at least one second store, the value of the minimum distance δ to a store of higher density, and the value of their product γ are calculated based on the first distances, the second distances, and the third distances.
  • The first distances, the second distances, and the third distances together form a new distance matrix, so that the value of the local density ρ, the value of the minimum distance δ to a store of higher density, and the value of the product γ can be calculated as described above for all stores, including the first stores and the second stores.
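The new distance matrix can be assembled as sketched below, reusing the old block and computing only the pairs that involve a second store. The function name `extend_distance_matrix`, the argument layout, and the use of Euclidean city coordinates are assumptions made for illustration.

```python
import math


def extend_distance_matrix(old_dist, old_points, new_points, metric=math.dist):
    """Build the distance matrix for old + new stores without recomputing
    the old-old block.

    old_dist:   matrix of first distances (x_old to x_old), reused as is.
    new-new:    second distances, computed here.
    new-old:    third distances, computed here.
    """
    n_old, n_new = len(old_points), len(new_points)
    n = n_old + n_new
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n_old):
        dist[i][:n_old] = old_dist[i]              # first distances (copied)
    for a in range(n_new):
        for i in range(n_old):                     # third distances
            d = metric(new_points[a], old_points[i])
            dist[n_old + a][i] = dist[i][n_old + a] = d
        for b in range(a + 1, n_new):              # second distances
            d = metric(new_points[a], new_points[b])
            dist[n_old + a][n_old + b] = dist[n_old + b][n_old + a] = d
    return dist


# Two hypothetical first stores and two second stores in city coordinates.
old_points = [(0.0, 0.0), (3.0, 4.0)]
old_dist = [[0.0, 5.0], [5.0, 0.0]]
new_points = [(0.0, 1.0), (0.0, 3.0)]
dist = extend_distance_matrix(old_dist, old_points, new_points)
```

If a second store is an existing store that moved, its old row and column would first be dropped from `old_dist` and `old_points`, in line with the definition of the first stores above.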
  • In step S65, a business circle is determined for the plurality of first stores and the at least one second store by means of the adjusted thresholds of ρ, δ, and γ obtained by the method of training the business circle determination model.
  • This step is basically the same as step S23 in FIG. 2 and step S53 in FIG. 5, and details are not described herein again.
  • The above method of updating the business circle determination may be performed once every predetermined time period, for example once a month, so that the business circle determination is updated periodically. Moreover, the update method reduces the computational complexity by at least two orders of magnitude.
  • FIG. 7 shows an apparatus 700 for training a business circle determination model according to an embodiment of the present specification, including:
  • a first acquisition unit 71 configured to acquire respective location information of a plurality of stores within a predetermined geographic range and respective business circle label information of the plurality of stores;
  • a first calculating unit 72 configured to calculate, according to the CFSFDP clustering algorithm and using the location information, the value of the local density ρ of each store, the value of its minimum distance δ to a store of higher density, and the value of their product γ;
  • a second acquisition unit 73 configured to acquire business circle determination information of each store according to the current thresholds of ρ, δ, and γ;
  • a second calculating unit 74 configured to calculate, using the business circle determination information and the business circle label information of the plurality of stores, the similarity of all the business circle determination information with respect to all the business circle label information;
  • and a threshold adjustment unit 75 configured to adjust the respective thresholds of ρ, δ, and γ so that the similarity increases.
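  • A minimal sketch of the threshold adjustment follows, under stated assumptions. The specification does not fix a particular similarity measure or search strategy in this passage, so the sketch assumes a pair-counting Rand index as the similarity and an exhaustive grid search as the adjustment. The names `rand_index`, `tune_thresholds`, and `toy_determine` are hypothetical; `toy_determine` merely stands in for a real business circle determination.

```python
from itertools import product


def rand_index(pred, truth):
    """Fraction of store pairs on which two labelings agree
    (both place the pair in the same circle, or both in different circles)."""
    n = len(pred)
    agree = total = 0
    for i in range(n):
        for j in range(i + 1, n):
            total += 1
            if (pred[i] == pred[j]) == (truth[i] == truth[j]):
                agree += 1
    return agree / total if total else 1.0


def tune_thresholds(determine, truth, rho_grid, delta_grid, gamma_grid):
    """Try every threshold triple and keep the one with the highest similarity."""
    best, best_score = None, -1.0
    for thresholds in product(rho_grid, delta_grid, gamma_grid):
        score = rand_index(determine(*thresholds), truth)
        if score > best_score:
            best, best_score = thresholds, score
    return best, best_score


def toy_determine(rho_min, delta_min, gamma_min):
    # Hypothetical stand-in: a low delta threshold yields two circles,
    # a high one merges all stores into a single circle.
    return [0, 0, 1, 1] if delta_min < 5 else [0, 0, 0, 0]


truth = ["A", "A", "B", "B"]          # labeled business circles
best, score = tune_thresholds(toy_determine, truth, [1], [2, 8], [10])
```

  • In this toy run the triple with the lower δ threshold reproduces the labels exactly, so it is the one retained.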
  • FIG. 8 shows an apparatus 800 for determining a business circle according to an embodiment of the present specification, comprising: an obtaining unit 81 configured to acquire respective location information of a plurality of stores within a predetermined geographic range; a calculating unit 82 configured to calculate, according to the CFSFDP clustering algorithm, the value of the local density ρ of each store, the value of its minimum distance δ to a store of higher density, and the value of their product γ; and a determining unit 83 configured to determine the business circle for the plurality of stores by means of the adjusted thresholds of ρ, δ, and γ obtained by the method for training the business circle determination model.
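  • The calculation of ρ, δ, and γ and the threshold-based determination can be sketched as below. This is an illustrative sketch rather than the claimed implementation: it assumes a cutoff-kernel density with a hypothetical cutoff distance `d_c`, planar Euclidean distances, index-based tie-breaking among stores of equal density, and strict "greater than" comparisons against the thresholds `rho_min`, `delta_min`, and `gamma_min`. Stores that cannot be attached to any center keep the label -1.

```python
import math


def cfsfdp(points, d_c, rho_min, delta_min, gamma_min):
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density rho: number of other stores closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    # Visit stores from densest to sparsest; ties are broken by index.
    order = sorted(range(n), key=lambda i: (-rho[i], i))
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])  # densest store: largest distance to any store
            continue
        # Nearest store among those ranked as denser.
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j
    gamma = [r * d for r, d in zip(rho, delta)]
    # Business circle centers: stores exceeding all three thresholds.
    centers = [i for i in order
               if rho[i] > rho_min and delta[i] > delta_min and gamma[i] > gamma_min]
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    # Each remaining store inherits the circle of its nearest denser store.
    for i in order:
        if labels[i] == -1 and parent[i] != -1:
            labels[i] = labels[parent[i]]
    return rho, delta, gamma, labels


stores = [(0, 0), (0, 1), (1, 0),
          (10, 10), (10, 11), (11, 10), (11, 11)]
rho, delta, gamma, labels = cfsfdp(stores, d_c=2.0,
                                   rho_min=1, delta_min=5.0, gamma_min=10.0)
```

  • In this toy data set the two groups of stores each yield one center, because only one store per group combines a high density with a large distance to any denser store.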
  • FIG. 9 illustrates an apparatus 900 for updating a business circle determination, including: a first obtaining unit 91 configured to acquire first location information of each of a plurality of first stores within a predetermined geographic range and a first distance between the respective first stores;
  • a second obtaining unit 92 configured to acquire second location information of each of at least one second store within the predetermined geographic range;
  • a first calculating unit 93 configured to calculate, using the first location information and the second location information, a second distance between the second stores and a third distance between any one of the second stores and any one of the first stores;
  • a second calculating unit 94 configured to calculate, according to the CFSFDP clustering algorithm and based on the first distance, the second distance, and the third distance, the value of the local density ρ of each of the plurality of first stores and the at least one second store, the value of its minimum distance δ to a store of higher density, and the value of their product γ;
  • and a determining unit 95 configured to determine the business circle for the plurality of first stores and the at least one second store by means of the adjusted thresholds of ρ, δ, and γ obtained by the method for training the business circle determination model.
  • The business circle determination model according to an embodiment of the present specification can be evaluated by calculating a silhouette (SIL) score.
  • The SIL score can be calculated by the following formulas (9)-(11):
  • where c_k denotes the set of stores in the k-th clustering result,
  • a(i) denotes the average distance from point i to all points in its own business circle, and
  • b(i) denotes the average distance from point i to all points in the closest other business circle p.
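  • For illustration, a silhouette score consistent with the definitions of a(i) and b(i) above can be computed as follows. The sketch assumes the standard silhouette definition, s(i) = (b(i) − a(i)) / max(a(i), b(i)), averaged over all stores; whether this matches formulas (9)-(11) term by term is an assumption. It also assumes planar Euclidean distances, excludes point i itself from a(i), and assigns s(i) = 0 to a store that is alone in its circle. The function name `silhouette` is hypothetical.

```python
import math


def silhouette(points, labels):
    """Mean silhouette value over all points (standard definition, assumed)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two business circles")
    scores = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            scores.append(0.0)  # singleton circle: conventional value
            continue
        # a(i): mean distance to the other points of the same circle.
        a = sum(math.dist(points[i], points[j]) for j in own) / len(own)
        # b(i): smallest mean distance to the points of any other circle.
        b = min(
            sum(math.dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


pts = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = silhouette(pts, [0, 0, 1, 1])   # compact, well-separated circles
bad = silhouette(pts, [0, 1, 0, 1])    # circles that mix distant stores
```

  • A score close to 1 indicates compact, well-separated business circles, while a negative score indicates that stores are on average closer to another circle than to their own.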
  • The method of the embodiment of the present specification only needs the geographic location information of the full set of stores as input, and can determine business circles for them without manual one-by-one determination.
  • In actual measurement, the business circle coverage of the method can reach 92.5%; the stores not covered are basically isolated points or dirty data points.
  • The CFSFDP algorithm used in the method of the embodiment of the present specification does not require the business circles to be specified in advance; instead, it directly obtains the business circle centers by threshold definition.
  • The method of the embodiment of the present specification enhances stability in two respects: first, the optimal parameters are pre-trained using known, high-accuracy business circle label information, which ensures the stability of the parameters; second, with stable parameters, obtaining the business circle centers by threshold definition ensures that business circle discovery yields stable results when the data is unchanged or changes only slightly.
  • The method of the embodiment of the present specification introduces a distance matrix based on city partitioning and incremental iteration and exploits the temporal evolution of stores, which effectively reduces the computational complexity; in actual measurement, the calculation time was reduced by a factor of about 10.
  • The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented in hardware, in a software module executed by a processor, or in a combination of the two.
  • The software module can be placed in random access memory (RAM), memory, read only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Strategic Management (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present invention relate to a method and device for training a business circle determination model, a method and device for determining a business circle, and a method and device for updating a business circle determination. The method for training a business circle determination model comprises: acquiring location information of a plurality of stores within a predetermined geographic range and business circle label information of the plurality of stores; calculating, according to the CFSFDP clustering algorithm and using the location information, the value of the local density ρ of each store, the value of the minimum distance δ to a store of higher density, and the value of their product γ; acquiring business circle determination information of each store according to the current thresholds of ρ, δ, and γ; calculating, using the business circle determination information and the business circle label information of the plurality of stores, the similarity of all the business circle determination information with respect to all the business circle label information; and adjusting the thresholds of ρ, δ, and γ so that the similarity increases.
PCT/CN2018/119319 2018-03-20 2018-12-05 Method and device for determining a business circle WO2019179173A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810231483.4A CN108596648B (zh) 2018-03-20 2018-03-20 一种商圈判定方法和装置
CN201810231483.4 2018-03-20

Publications (1)

Publication Number Publication Date
WO2019179173A1 true WO2019179173A1 (fr) 2019-09-26

Family

ID=63626938

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/119319 WO2019179173A1 (fr) 2018-03-20 2018-12-05 Method and device for determining a business circle

Country Status (3)

Country Link
CN (1) CN108596648B (fr)
TW (1) TWI711983B (fr)
WO (1) WO2019179173A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111369284A (zh) * 2020-03-03 2020-07-03 浙江网商银行股份有限公司 目标对象类型确定方法及装置
CN111815361A (zh) * 2020-07-10 2020-10-23 北京思特奇信息技术股份有限公司 区域边界计算方法、装置、电子设备及存储介质
CN112783963A (zh) * 2021-03-17 2021-05-11 上海数喆数据科技有限公司 基于商圈划分的企业线下与线上多源数据整合方法及装置
CN116308501A (zh) * 2023-05-24 2023-06-23 北京骑胜科技有限公司 用于管理共享车辆的运营区域的方法、装置、设备和介质

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596648B (zh) * 2018-03-20 2020-07-17 阿里巴巴集团控股有限公司 一种商圈判定方法和装置
CN110175865A (zh) * 2019-04-23 2019-08-27 国网浙江省电力有限公司湖州供电公司 基于泛在感知技术的电动汽车充电实时定价方法
CN111091417B (zh) * 2019-12-12 2023-10-31 拉扎斯网络科技(上海)有限公司 选址方法及装置
CN111210269B (zh) * 2020-01-02 2020-09-18 平安科技(深圳)有限公司 基于大数据的对象识别方法、电子装置及存储介质
CN111932318B (zh) * 2020-09-21 2021-01-19 腾讯科技(深圳)有限公司 区域划分方法、装置、电子设备及计算机可读存储介质
CN112016326A (zh) * 2020-09-25 2020-12-01 北京百度网讯科技有限公司 一种地图区域词识别方法、装置、电子设备和存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105574014A (zh) * 2014-10-13 2016-05-11 北京明略软件系统有限公司 一种商圈划分方法及系统
CN107657474A (zh) * 2017-07-31 2018-02-02 石河子大学 一种商圈边界的确定方法及服务端
CN108596648A (zh) * 2018-03-20 2018-09-28 阿里巴巴集团控股有限公司 一种商圈判定方法和装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104111946B (zh) * 2013-04-19 2018-08-07 腾讯科技(深圳)有限公司 基于用户兴趣的聚类方法和装置
CN106649331B (zh) * 2015-10-29 2020-09-11 阿里巴巴集团控股有限公司 商圈识别方法及设备
US20170308929A1 (en) * 2016-04-25 2017-10-26 Chian Chiu Li Social Network Based Advertisement
CN106339416B (zh) * 2016-08-15 2019-11-08 常熟理工学院 基于网格快速搜寻密度峰值的教育数据聚类方法
CN106777984B (zh) * 2016-12-19 2019-02-22 福州大学 一种基于密度聚类算法实现光伏阵列工作状态分析与故障诊断的方法
CN106649877A (zh) * 2017-01-06 2017-05-10 广东工业大学 一种基于密度峰值的大数据挖掘方法及装置
CN107563789A (zh) * 2017-07-31 2018-01-09 石河子大学 数据处理方法、系统、终端及计算机可读存储介质

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111369284A (zh) * 2020-03-03 2020-07-03 浙江网商银行股份有限公司 目标对象类型确定方法及装置
CN111369284B (zh) * 2020-03-03 2023-08-15 浙江网商银行股份有限公司 目标对象类型确定方法及装置
CN111815361A (zh) * 2020-07-10 2020-10-23 北京思特奇信息技术股份有限公司 区域边界计算方法、装置、电子设备及存储介质
CN112783963A (zh) * 2021-03-17 2021-05-11 上海数喆数据科技有限公司 基于商圈划分的企业线下与线上多源数据整合方法及装置
CN112783963B (zh) * 2021-03-17 2023-04-28 上海数喆数据科技有限公司 基于商圈划分的企业线下与线上多源数据整合方法及装置
CN116308501A (zh) * 2023-05-24 2023-06-23 北京骑胜科技有限公司 用于管理共享车辆的运营区域的方法、装置、设备和介质
CN116308501B (zh) * 2023-05-24 2023-10-17 北京骑胜科技有限公司 用于管理共享车辆的运营区域的方法、装置、设备和介质

Also Published As

Publication number Publication date
TWI711983B (zh) 2020-12-01
CN108596648B (zh) 2020-07-17
TW201941116A (zh) 2019-10-16
CN108596648A (zh) 2018-09-28

Similar Documents

Publication Publication Date Title
WO2019179173A1 (fr) Method and device for determining a business circle
US10496678B1 (en) Systems and methods for generating and implementing knowledge graphs for knowledge representation and analysis
Chen et al. Probabilistic modeling of traffic lanes from GPS traces
US11118921B2 (en) Vehicle routing guidance to an authoritative location for a point of interest
CN106649331A (zh) 商圈识别方法及设备
CN106919957B (zh) 处理数据的方法及装置
WO2021109775A1 (fr) Procédés et dispositifs pour générer un échantillon d'entraînement, entraîner un modèle et reconnaître un caractère
CN101127049A (zh) 结构化数据的聚类
CN107492120B (zh) 点云配准方法
CN109033170A (zh) 停车场的数据修补方法、装置、设备及存储介质
CN116008671A (zh) 一种基于时差和聚类的闪电定位方法
CN113379269B (zh) 多因素空间聚类的城市商业功能区划方法、装置及介质
CN107507176A (zh) 一种图像检测方法及系统
CN109919227A (zh) 一种面向混合属性数据集的密度峰值聚类方法
WO2019087552A1 (fr) Dispositif de mappage de caractéristiques de style de transaction financière et procédé de génération de carte de caractéristiques de style de transaction
CN111369284B (zh) 目标对象类型确定方法及装置
CN112445976A (zh) 一种基于拥堵指数图谱的城市地址定位方法
US11810001B1 (en) Systems and methods for generating and implementing knowledge graphs for knowledge representation and analysis
CN112001384A (zh) 商圈识别的方法及设备
CN115691140B (zh) 一种汽车充电需求时空分布的分析与预测方法
CN108647189B (zh) 一种识别用户人群属性的方法及装置
CN111428510B (zh) 一种基于口碑的p2p平台风险分析方法
Hasanah et al. Data mining using K-means clustering algorithm for grouping countries of origin of foreign tourist
JP2005267025A (ja) 解析モデルの領域抽出システム、方法、プログラム、およびプログラム媒体
CN111914751A (zh) 一种图像人群密度识别检测方法及系统

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application. Ref document number: 18911316; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 18911316; Country of ref document: EP; Kind code of ref document: A1