CN112862514A - Data processing method and device, electronic equipment and computer readable storage medium - Google Patents
- Publication number
- CN112862514A
- Authority
- CN
- China
- Prior art keywords
- business
- core
- circle
- merchant
- feature vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
- G06Q30/0204—Market segmentation
- G06Q30/0205—Location or geographical consideration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/02—Marketing; Price estimation or determination; Fundraising
- G06Q30/0201—Market modelling; Market analysis; Collecting market data
Landscapes
- Business, Economics & Management (AREA)
- Strategic Management (AREA)
- Engineering & Computer Science (AREA)
- Accounting & Taxation (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Entrepreneurship & Innovation (AREA)
- Game Theory and Decision Science (AREA)
- Data Mining & Analysis (AREA)
- Economics (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the present disclosure provides a data processing method, including: for each of a plurality of business circles, performing the following operations to determine a feature vector for each business circle: determining at least one core merchant located within a current business circle; acquiring feature information of each core merchant of the at least one core merchant; generating a feature vector for each core merchant according to the feature information of each core merchant; and determining a feature vector for the current business circle according to the generated feature vector for each core merchant. The method further includes performing cluster analysis on the plurality of business circles according to the determined feature vector for each business circle, so as to classify the plurality of business circles. The embodiments of the present disclosure also provide a data processing device, an electronic device, and a computer-readable storage medium.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
Electronic commerce is developing rapidly, and large-scale marketing activities organized by administrative divisions such as provinces and cities can no longer meet the diversified demands of different customer groups within the same administrative region. In the course of implementing the inventive concept disclosed herein, the inventors have discovered that more and more marketing campaigns, offline promotions, new retail site selection, and real estate related services require personalized campaign design and promotion with the business circle as the dimension.
The range of a business circle is determined by an industrial cluster, and its geographic extent is well defined. However, each city has many business circles. It is estimated that each city has 10 business circles on average, so even if only 700 cities in the country are considered, there are 7000 business circles nationwide. If the business circles are not clustered in a scientific way, an overall marketing strategy that merchants make for 7000 business circles will lack feasibility. Meanwhile, if the business circles are classified manually, the classifying personnel must be familiar with the local situation of each city, which consumes a large amount of manpower and material resources and is therefore of poor feasibility.
Disclosure of Invention
In view of this, the embodiments of the present disclosure provide a data processing method and apparatus, so that business circle clustering is not affected by human experience and is not limited by geographic location.
One aspect of the embodiments of the present disclosure provides a data processing method, including: for each of a plurality of business circles, performing the following operations to determine a feature vector for each business circle: determining at least one core merchant located within a current business circle; acquiring feature information of each core merchant of the at least one core merchant; generating a feature vector for each core merchant according to the feature information of each core merchant; and determining a feature vector for the current business circle according to the generated feature vector for each core merchant. The method further includes performing cluster analysis on the plurality of business circles according to the determined feature vector for each business circle, so as to classify the plurality of business circles.
According to an embodiment of the present disclosure, the generating a feature vector for each core merchant according to the feature information of each core merchant includes: for each core merchant, determining at least one dimensional feature used for representing the commercial value of the current core merchant according to the feature information of the current core merchant; and calculating the feature value of each dimensional feature of the at least one dimensional feature.
According to an embodiment of the present disclosure, the determining a feature vector for the current business circle according to the generated feature vector for each core merchant includes: averaging, on each dimension, the feature values of the feature vectors of all the core merchants in the current business circle, to obtain the feature value of the feature vector of the current business circle on the corresponding dimension.
According to an embodiment of the present disclosure, the performing cluster analysis on the plurality of business circles according to the determined feature vector for each business circle includes: selecting a plurality of central business circles from the plurality of business circles; and, for each central business circle of the plurality of central business circles, performing the following operations to determine a first-level business circle cluster centered on each central business circle: calculating the similarity between the feature vector of each business circle of the plurality of business circles and the feature vector of the current central business circle; and determining, according to the similarity calculation result, the first-level business circle cluster centered on the current central business circle.
According to an embodiment of the present disclosure, the data processing method further includes: after determining the first-level business circle cluster centered on each central business circle, performing cluster analysis on the plurality of first-level business circle clusters.
According to an embodiment of the present disclosure, performing cluster analysis on the plurality of first-level business circle clusters includes: selecting a plurality of central business circle clusters from the plurality of first-level business circle clusters; and, for each first-level business circle cluster of the plurality of first-level business circle clusters, performing the following operations to determine a second-level business circle cluster centered on each central business circle cluster of the plurality of central business circle clusters: calculating the similarity between the feature vector of each first-level business circle cluster and the feature vector of the current central business circle cluster; and determining, according to the similarity calculation result, a second-level business circle cluster centered on the current central business circle cluster.
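The second-level step above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how the feature vector of a first-level cluster is formed, so the sketch assumes the mean of its members' vectors, uses cosine similarity, and attaches each non-central cluster to its most similar central cluster when a hypothetical threshold is exceeded. The function names are invented for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean_vector(vectors):
    """Dimension-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def second_level_clusters(first_level, vectors, center_cluster_ids, threshold):
    # Assumption: a first-level cluster is represented by the mean of its members.
    cluster_vecs = {cid: mean_vector([vectors[m] for m in members])
                    for cid, members in first_level.items()}
    result = {c: [c] for c in center_cluster_ids}
    for cid, vec in cluster_vecs.items():
        if cid in result:
            continue  # central clusters already seed their own second-level cluster
        best = max(center_cluster_ids, key=lambda c: cosine(vec, cluster_vecs[c]))
        if cosine(vec, cluster_vecs[best]) > threshold:
            result[best].append(cid)
    return result
```

First-level clusters that clear no threshold are simply left out of the result here; an implementation could equally keep them as singleton clusters.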
According to the embodiment of the disclosure, the plurality of central business circles are not adjacent to each other in geographic position.
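One way to pick central business circles that are not geographically adjacent is a greedy filter, sketched below. The text does not define "adjacent", so the sketch assumes a hypothetical minimum gap in kilometers and assumes coordinates already projected to planar kilometers; `pick_non_adjacent_centers` and its parameters are illustrative names only.

```python
import math

def pick_non_adjacent_centers(coords, k, min_gap_km):
    """Greedily pick up to k circle ids whose centers are at least min_gap_km apart.

    coords maps a business circle id to an (x_km, y_km) planar position (assumed).
    """
    chosen = []
    for cid, xy in coords.items():
        if all(math.dist(xy, coords[c]) >= min_gap_km for c in chosen):
            chosen.append(cid)
        if len(chosen) == k:
            break
    return chosen
```

The result depends on iteration order, so a real system might first sort candidates, for example by the strength of their core merchants.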
Another aspect of the embodiments of the present disclosure provides a data processing apparatus, including: a processing module, configured to determine a feature vector for each business circle of a plurality of business circles; and a clustering module, configured to perform cluster analysis on the plurality of business circles according to the determined feature vector for each business circle, so as to classify the plurality of business circles. The processing module includes: a first determining submodule, configured to determine at least one core merchant in the current business circle; an acquisition submodule, configured to acquire the feature information of each core merchant of the at least one core merchant; a generating submodule, configured to generate a feature vector for each core merchant according to the feature information of each core merchant; and a second determining submodule, configured to determine, according to the generated feature vector for each core merchant, a feature vector for the current business circle.
Another aspect of the embodiments of the present disclosure provides an electronic device, which includes one or more processors and a storage device, where the storage device is configured to store executable instructions, and the executable instructions, when executed by the processors, implement the method of the embodiments of the present disclosure.
Another aspect of the embodiments of the present disclosure provides a computer-readable storage medium storing computer-executable instructions, which when executed by a processor, are used to implement the above-mentioned method of the embodiments of the present disclosure.
Another aspect of the present disclosure provides a computer program comprising computer executable instructions for implementing the above method of an embodiment of the present disclosure when executed.
According to the embodiments of the present disclosure, business circles are clustered according to their feature vectors. This can at least partially solve the technical problems in the related art that business circle clustering consumes a large amount of manpower and material resources and that the clustering result is easily limited by geographic location, and can achieve the technical effect of automatically clustering similar business circles that are far apart.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments of the present disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically shows a system architecture that may be applied to a data processing method according to an embodiment of the present disclosure;
FIG. 2 schematically shows a flow chart of a data processing method according to an embodiment of the present disclosure;
FIG. 3 schematically shows a flow diagram for generating feature vectors for core merchants, in accordance with an embodiment of the present disclosure;
FIG. 4 schematically illustrates a flow diagram for cluster analysis of multiple business circles in accordance with an embodiment of the present disclosure;
FIG. 5 schematically illustrates a flow diagram for determining a first-level business circle cluster centered on each central business circle, according to an embodiment of the disclosure;
FIG. 6 schematically illustrates a flow diagram for cluster analysis of the first-level business circle clusters formed by clustering, in accordance with an embodiment of the present disclosure;
FIG. 7 schematically illustrates a detailed flow chart of cluster analysis of the first-level business circle clusters formed by clustering according to an embodiment of the present disclosure;
FIG. 8 schematically illustrates a detailed flow chart for determining a second-level business circle cluster centered on each central business circle cluster according to an embodiment of the present disclosure;
FIG. 9 schematically shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure; and
FIG. 10 schematically shows a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is illustrative only and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It is noted that the terms used herein should be interpreted as having a meaning that is consistent with the context of this specification and should not be interpreted in an idealized or overly formal sense.
Where a convention analogous to "at least one of A, B, and C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, and C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). Where a convention analogous to "at least one of A, B, or C, etc." is used, in general such a construction is intended in the sense one having skill in the art would understand the convention (e.g., "a system having at least one of A, B, or C" would include but not be limited to systems that have A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B, and C together, etc.). It will be further understood by those within the art that virtually any disjunctive word and/or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase "A or B" should be understood to include the possibility of "A" or "B", or "A and B".
The terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, features defined as "first" or "second" may explicitly or implicitly include one or more of the described features.
The embodiments of the present disclosure address the following problems in the related art. First, business circles are classified manually, city by city, relying on personal business experience and knowledge of each city's business circles, so a uniform classification standard is difficult to achieve, the classification results may be one-sided, and a large amount of human resources is consumed. Second, manual clustering is based on the distance between business circles, so homogeneous business circles that are relatively far apart are difficult to place in the same category. Based on the feature information of the core merchants in a business circle, the embodiments of the present disclosure process the feature information of the core merchants in each business circle into multidimensional feature vectors using an area_to_vector technique similar to the word2vector technique, and then obtain the multidimensional feature vector of each business circle from the multidimensional feature vectors of its core merchants.
An embodiment of the present disclosure provides a data processing method for classifying a plurality of business circles, including the following operations. For each of the plurality of business circles, for example, the following operations may be performed to determine a feature vector for each business circle, that is, determine at least one core merchant located in the current business circle, obtain feature information of each core merchant in the at least one core merchant, generate a feature vector for each core merchant according to the feature information of each core merchant, and determine a feature vector for the current business circle according to the generated feature vector for each core merchant. And performing cluster analysis on a plurality of business circles according to the determined feature vector for each business circle so as to classify the plurality of business circles.
Fig. 1 schematically shows a system architecture 100 that may be applied to a data processing method according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a background management server that provides support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and perform other processing on the received data such as the user request, and feed back a processing result (e.g., a webpage, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that the data processing method provided by the embodiment of the present disclosure may be generally executed by the server 105. Accordingly, the data processing apparatus provided by the embodiments of the present disclosure may be generally disposed in the server 105. The data processing method provided by the embodiment of the present disclosure may also be executed by a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the data processing apparatus provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Alternatively, the data processing method provided by the embodiment of the present disclosure may also be executed by the terminal devices 101, 102, 103. Accordingly, the data processing apparatus provided by the embodiment of the present disclosure may also be disposed in the terminal devices 101, 102, 103. The data processing method provided by the embodiments of the present disclosure may also be executed by a terminal device or a group of terminal devices that is different from the terminal devices 101, 102, 103 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the data processing apparatus provided by the embodiment of the present disclosure may also be disposed in a terminal device or a terminal device group different from the terminal devices 101, 102, 103 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Fig. 2 schematically shows a flow chart of a data processing method according to an embodiment of the present disclosure.
As shown in fig. 2, the data processing method for classifying a plurality of business circles may include the following operations S201 to S206.
In operation S201, for each of a plurality of business circles, the following operations S202 to S205 are performed in order to determine a feature vector for each business circle.
It should be noted that, in the embodiments of the present disclosure, a business circle may be the radiation range within which one or more core merchants can attract customers, obtained by taking the coordinates of their locations as a center and extending outward over a certain direction and distance; that is, the area in which the customers who consume at the core merchants are located. The business circle may take the form of a circular area centered on one or more core merchants. For example, a business circle may be an area with a radius of 0.5 kilometers centered on one core merchant. A business circle may also be an area with a radius of 1 kilometer centered on the center position of a plurality of core merchants. The center position may be the physical position corresponding to the geometric center of the positions of the core merchants, or the position of one of the core merchants. A core merchant may be the consumption center of a certain industry in a business circle and can attract residents in the business circle to consume there.
Next, at operation S202, at least one core merchant located within the current business circle is determined.
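The circular-area definition above can be illustrated with a minimal sketch. It assumes coordinates are (latitude, longitude) pairs in degrees, approximates the geometric center by averaging coordinates (reasonable only for merchants a few kilometers apart), and uses the haversine formula for distance. All function names are hypothetical and not taken from the disclosure.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def circle_center(merchant_coords):
    """Approximate geometric center of several core merchants (mean lat, mean lon)."""
    lats = [c[0] for c in merchant_coords]
    lons = [c[1] for c in merchant_coords]
    return (sum(lats) / len(lats), sum(lons) / len(lons))

def in_business_circle(point, center, radius_km):
    """True if the point lies within radius_km of the business circle center."""
    return haversine_km(point, center) <= radius_km
```

For instance, with two core merchants the center would be `circle_center([...])` and a 1 kilometer circle would be tested with `in_business_circle(point, center, 1.0)`.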
In embodiments of the present disclosure, the core merchants within a business circle may be determined from one or more aspects of the industry. For example, the core merchants within a business circle may be determined based on at least one of turnover, customer traffic, total profit, and the like, and may also be determined based on at least one of cash flow, sales volume, and the like for a certain product or service. The core merchant may be in a position to dominate the business circle in which it is located in at least one of the aspects referred to above. The number of core merchants in a business circle can be 1-10.
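As a rough illustration of the selection described above, core merchants could be chosen by ranking on a single business metric and keeping at most ten. The record layout (dictionaries with a `turnover` key) and the function name are assumptions made for this sketch; the disclosure does not prescribe a ranking rule.

```python
def select_core_merchants(merchants, metric="turnover", max_core=10):
    """Return up to max_core merchants with the highest value of the given metric."""
    ranked = sorted(merchants, key=lambda m: m[metric], reverse=True)
    return ranked[:max_core]
```

A combined score over several metrics (customer traffic, total profit, and so on) could replace the single `metric` key without changing the overall shape.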
In embodiments of the present disclosure, the core merchants within a business circle may also be referred to as an industry cluster, which is also known as an "industrial cluster," a "competitive cluster," or a "Porter cluster." An industry cluster can be understood as the phenomenon in which competing enterprises within a certain industry, together with the cooperating enterprises, specialized suppliers, service suppliers, manufacturers in related industries, and related organizations (such as universities, scientific research organizations, standards-making organizations, industry associations, and the like) associated with them, gather in a certain region and thereby form a business circle. For example, information technology enterprises and related manufacturers and organizations are gathered in Silicon Valley in the United States and may form a business circle related to information technology, and film and television media companies and related production companies and organizations are gathered in Hollywood in the United States and may form a business circle related to film and television media.
Next, in operation S203, characteristic information of each of the at least one core merchant is obtained.
In the embodiments of the present disclosure, text information about a core merchant, such as description information, comment information, and recommendation information published by various public media, self-media, websites, and specific organizations (e.g., an industry association, a specific research organization, a relevant national authority, etc.), may reflect the characteristic attributes of the core merchant from multiple angles. Therefore, a web data crawling tool may be used to crawl the text information corresponding to each core merchant, and corresponding feature information is then extracted from the crawled text as the feature information of that core merchant. In particular, a Python tool may be used to crawl the text information corresponding to each core merchant from related websites.
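The disclosure does not name a specific crawling library. As a hedged sketch of the step that follows crawling, the snippet below extracts visible text from an already-downloaded HTML page using only Python's standard `html.parser` module; the class and function names are illustrative, and fetching the page is deliberately left out.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects visible text from HTML, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def extract_text(html):
    """Return the visible text of an HTML document as one space-joined string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

The resulting plain text is what a later step would tokenize to derive feature information for a core merchant.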
Then, in operation S204, a feature vector for each core merchant is generated according to the feature information of each core merchant.
In the embodiment of the disclosure, the word2vector technology can be used for processing the feature information of each core merchant to obtain the feature vector of the corresponding core merchant.
In the embodiments of the present disclosure, each core merchant has a plurality of dimensional features, and each dimensional feature has corresponding feature information. Processing the feature information corresponding to a dimensional feature with the word2vector technique yields the feature value of that dimensional feature, and the feature values of the plurality of dimensional features together form the feature vector of the core merchant.
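The text does not specify how a feature value is computed from word vectors. One plausible reading, shown here purely as an assumption, is to average the word vectors of the tokens in a merchant's text and score each dimensional feature by the cosine similarity between that average and the word vector of the dimension's keyword. The `embeddings` dictionary stands in for a trained word-vector model, and all names are hypothetical.

```python
import math

def mean_vector(vectors):
    """Dimension-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def merchant_feature_vector(tokens, embeddings, dimensions):
    """One feature value per dimension keyword; zeros if no token has an embedding."""
    known = [embeddings[t] for t in tokens if t in embeddings]
    if not known:
        return [0.0] * len(dimensions)
    text_vec = mean_vector(known)
    return [cosine(text_vec, embeddings[d]) for d in dimensions]
```

With toy two-dimensional embeddings, a merchant whose text is mostly about shopping would score higher on a "mall" dimension than on a "parking" dimension.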
Next, in operation S205, a feature vector for the current business circle is determined according to the generated feature vector for each core merchant.
In the embodiment of the present disclosure, the feature vectors of all core merchants in the current business circle may be averaged, and the feature vector obtained thereby may be used as the feature vector of the current business circle.
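The averaging step can be written in a few lines. The sketch assumes every core merchant vector has the same length and the same dimension order; the function name is illustrative.

```python
def business_circle_vector(merchant_vectors):
    """Dimension-wise mean of the core merchants' feature vectors."""
    if not merchant_vectors:
        raise ValueError("a business circle needs at least one core merchant")
    n = len(merchant_vectors)
    return [sum(col) / n for col in zip(*merchant_vectors)]
```

For two core merchants with vectors `[1.0, 0.0]` and `[0.0, 1.0]`, the business circle vector is `[0.5, 0.5]`.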
Next, in operation S206, a plurality of business circles are subjected to cluster analysis according to the determined feature vector for each business circle, so as to classify the plurality of business circles.
In the embodiments of the present disclosure, various distance and similarity calculation methods may be used to calculate the similarity between any two business circles based on their feature vectors, and business circles whose similarity is higher than a similarity threshold may be treated as belonging to the same class, thereby classifying the business circles. For example, a plurality of central business circles are selected from the plurality of business circles, and each business circle whose similarity to a central business circle is higher than the similarity threshold is placed in the business circle cluster centered on that central business circle.
With the embodiments of the present disclosure, business circles do not need to be clustered by relying on manpower and personal experience; instead, they can be clustered and analyzed automatically according to their feature vectors, which saves manpower and makes the classification result more scientific and accurate. In addition, because classifying business circles by their feature vectors is not limited by the regions where the business circles are located, business circles that are relatively far apart can be clustered together. Furthermore, the embodiments of the present disclosure generate the feature vector of a business circle from the feature vectors of the core merchants in the business circle, so that the commercial value of the business circle can be reflected more comprehensively and accurately.
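A minimal sketch of the center-based example above follows. It assumes cosine similarity as the similarity measure and, as a tie-breaking simplification not stated in the text, assigns each business circle only to its single most similar central business circle. The function names and the threshold value are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def cluster_by_centers(vectors, center_ids, threshold):
    """Group business circles around the given central business circles.

    vectors maps a business circle id to its feature vector.
    Returns (clusters, unassigned), where clusters maps center id -> member ids.
    """
    clusters = {c: [c] for c in center_ids}
    unassigned = []
    for cid, vec in vectors.items():
        if cid in clusters:
            continue  # a central business circle seeds its own cluster
        best = max(center_ids, key=lambda c: cosine(vec, vectors[c]))
        if cosine(vec, vectors[best]) > threshold:
            clusters[best].append(cid)
        else:
            unassigned.append(cid)
    return clusters, unassigned
```

Business circles that clear no threshold are returned separately so that a caller can decide whether to lower the threshold or promote them to new centers.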
Fig. 3 schematically illustrates a flow diagram for generating feature vectors for core merchants, in accordance with an embodiment of the disclosure.
As shown in fig. 3, for each core merchant, the operation S204 of generating a feature vector for each core merchant according to the feature information of each core merchant may include the following operations S2041 and S2042.
In operation S2041, at least one dimension characteristic for characterizing the commercial value of the current core merchant is determined according to the characteristic information of the current core merchant.
For example, the commercial value of a core merchant may include the value embodied in functions such as shopping, entertainment, socializing, transportation, and education. The value embodied by each function can be described by one or more feature dimensions. For example, the commercial value embodied by the shopping function may be described by feature dimensions such as marketplace, supermarket, luxury goods and apparel. As another example, the commercial value embodied by the entertainment function may be described by feature dimensions such as bar, nightclub, midnight, movie theater, and party. As another example, the commercial value embodied by the transportation function can be described by feature dimensions such as traffic congestion and parking lot. As another example, the commercial value embodied by the education function can be described by a feature dimension such as training. It can be seen that the aforementioned marketplace, supermarket, luxury goods, bar, nightclub, white collar, traffic congestion, midnight, movie theater, parking lot, party, training, apparel, etc. can all be dimensional features of a core merchant.
Suppose the feature vector of the current business circle is calculated from core merchant 1, core merchant 2 and core merchant 3 of the current business circle. The feature vector of each of core merchant 1, core merchant 2 and core merchant 3 may be denoted by Z, and Z may include 13 dimensional features Z1 to Z13. Specifically, the 13 dimensional features of core merchant 1, core merchant 2 and core merchant 3 may be as shown in Table 1.
TABLE 1
Next, in operation S2042, a feature value of each of the at least one dimensional feature is calculated.
In the embodiment of the present disclosure, each dimension feature has corresponding feature information, and the feature information corresponding to each dimension feature in table 1 may be processed by using a deep learning Word2Vector algorithm to obtain a feature value corresponding to each dimension feature.
For example, the feature information corresponding to each dimensional feature of core merchant 1, core merchant 2, and core merchant 3 in Table 1 may be processed by using a relevant algorithm to obtain the feature value of the corresponding dimensional feature of each core merchant. For example, the feature values obtained by the processing are shown in Table 2.
TABLE 2
In this embodiment of the present disclosure, operation S205 of determining a feature vector for the current business circle according to the generated feature vector for each core merchant may include the following operation:
averaging the feature values, in each dimension, of the feature vectors of all the core merchants located in the current business circle, and using the result as the feature value of the feature vector of the current business circle in the corresponding dimension.
In this embodiment of the present disclosure, the feature values of the feature vectors of core merchant 1, core merchant 2, and core merchant 3 in Table 2 are averaged to obtain the feature value of the feature vector of the current business circle in the corresponding dimension. The resulting feature values of the feature vector of the current business circle are shown in the last row of Table 2.
In the embodiment of the present disclosure, the feature value of the feature vector of the current business circle in each dimension may be obtained by mean aggregation; alternatively, it may be obtained by maximum aggregation, minimum aggregation, and the like. For example, the maximum feature value in each dimension among the feature vectors of all core merchants located in the current business circle is used as the feature value of the feature vector of the current business circle in the corresponding dimension. For example, the feature values of the feature vector of the current business circle obtained by maximum aggregation are shown in the last row of Table 3.
TABLE 3
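The mean, maximum and minimum aggregation described above can be sketched as follows. This is only an illustrative sketch: the function name `aggregate` and the three merchant vectors are hypothetical and are not taken from Tables 1 to 3.

```python
from typing import Callable, List, Sequence


def aggregate(vectors: Sequence[Sequence[float]],
              reducer: Callable[[Sequence[float]], float]) -> List[float]:
    """Combine core-merchant feature vectors dimension by dimension."""
    if not vectors:
        raise ValueError("a business circle needs at least one core merchant")
    # zip(*vectors) yields, for each dimension, the values of all merchants.
    return [reducer(column) for column in zip(*vectors)]


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


# Hypothetical feature values of three core merchants in three dimensions.
merchants = [
    [0.2, 0.8, 0.5],
    [0.4, 0.6, 0.1],
    [0.6, 0.1, 0.3],
]

circle_mean = aggregate(merchants, mean)  # mean aggregation
circle_max = aggregate(merchants, max)    # maximum aggregation
circle_min = aggregate(merchants, min)    # minimum aggregation
```

Passing the reducer as a parameter keeps the three aggregation modes on one code path; which mode suits a given business circle is left open by the text.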
FIG. 4 schematically illustrates a flow diagram for cluster analysis of multiple business circles according to an embodiment of the present disclosure.
Specifically, as an alternative embodiment, as shown in fig. 4, operation S206 of performing cluster analysis on a plurality of business circles according to the determined feature vector for each business circle may include the following operations S2061 and S2062.
In operation S2061, a plurality of central business circles are selected from the plurality of business circles.
In embodiments of the present disclosure, the plurality of central business circles are not geographically adjacent to each other. The central business circles are not adjacent to each other geographically, so that the influence on effective clustering caused by excessive similar characteristics between the adjacent geographic business circles can be prevented, and the scientificity and accuracy of clustering results can be improved.
Next, in operation S2062, for each of the central business circles, a first-level business circle cluster centered on that central business circle is determined.
In the embodiment of the disclosure, the similarity between each business circle in a plurality of business circles and each central business circle may be calculated, and the plurality of business circles are aggregated into a first-level business circle cluster taking each central business circle as a center according to the calculated similarity.
FIG. 5 schematically shows a flow chart for determining a first level business turn cluster centered at each center business turn according to an embodiment of the present disclosure.
Specifically, as an alternative embodiment, as shown in fig. 5, operation S2062 may include the following operations S20621 and S20622.
In operation S20621, a similarity between the feature vector of each of the plurality of business circles and the feature vector of the current central business circle is calculated.
In the embodiment of the present disclosure, a similarity calculation method may be adopted to calculate the similarity between the feature vector of each business circle and the feature vector of the current central business circle. For example, the similarity may be calculated using a cosine similarity calculation method or a standard calculation function.
In the embodiment of the disclosure, it is assumed that the plurality of business circles include business circles A to J, and the feature vectors of business circles A to J include the following dimensional features: marketplace, supermarket, luxury goods, bar, party, and training. The feature values of business circles A to J in the respective dimensions are shown in Table 4 below.
TABLE 4
Business circle | Marketplace | Supermarket | Luxury goods | Bar | Party | Training |
Business circle A | 0.2 | 0.9 | 0.17 | 0.25 | 0.33 | 0.41 |
Business circle B | 0.23 | 0.4 | 0.16 | 0.12 | 0.12 | 0.31 |
Business circle C | 0.38 | 0.5 | 0.45 | 0.11 | 0.24 | 0.71 |
Business circle D | 0.52 | -0.12 | -0.292 | -0.464 | -0.2 | -0.808 |
Business circle E | 0.66 | -0.19 | -0.446 | -0.702 | -0.3 | -0.1214 |
Business circle F | -0.166 | 0.94 | -0.3 | -0.16 | -0.623 | -0.1526 |
Business circle G | -0.94 | 0.67 | -0.6 | -0.53 | -0.4356 | -0.1523 |
Business circle H | 0.108 | -0.4 | -0.908 | -0.1416 | -0.12 | -0.2432 |
Business circle I | 0.122 | -0.67 | -0.9 | -0.3 | -0.21 | -0.2838 |
Business circle J | 0.136 | 0.99 | 0.23 | 0.12 | 0.22 | 0.31 |
As can be seen from Table 4 above, the feature vector of business circle A can be represented as (0.2, 0.9, 0.17, 0.25, 0.33, 0.41), the feature vector of business circle B can be represented as (0.23, 0.4, 0.16, 0.12, 0.12, 0.31), the feature vector of business circle C can be represented as (0.38, 0.5, 0.45, 0.11, 0.24, 0.71), and so on.
In the embodiment of the present disclosure, the cosine similarity calculation formula may be:

$$\cos(a, b) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\,\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$

In the above formula, a represents the feature vector of business circle x, b represents the feature vector of business circle y, $x_i$ represents the feature value of the i-th dimensional feature of business circle x, $y_i$ represents the feature value of the i-th dimensional feature of business circle y, and n represents the number of dimensions of the feature vector of a business circle.
For example, calculating the similarity between business circle A and business circle B with the cosine similarity calculation formula gives 94.66%. Similarly, the similarity between any business circle and each of the other business circles can be calculated in the same way, giving the results shown in Table 5, where the similarity values are percentages.
TABLE 5
Business circle | A | B | C | D | E | F | G | H | I | J |
A | 100 | 94.66 | 83.76 | -45.43 | -35.32 | 34.27 | -1.57 | -57.93 | -73.68 | 97.74 |
B | 94.66 | 100 | 95.29 | -44.79 | -22.60 | 20.55 | -23.25 | -61.53 | -72.49 | 90.62 |
C | 83.76 | 95.29 | 100 | -54.49 | -23.01 | -0.27 | -35.20 | -69.47 | -74.13 | 78.65 |
D | -45.43 | -44.79 | -54.49 | 100 | 76.98 | 15.53 | 3.71 | 55.54 | 59.24 | -37.42 |
E | -35.32 | -22.60 | -23.01 | 76.98 | 100 | 12.10 | 2.47 | 61.01 | 66.76 | -31.38 |
F | 34.27 | 20.55 | -0.27 | 15.53 | 12.10 | 100 | 75.87 | 1.04 | -10.79 | 48.16 |
G | -1.57 | -23.25 | -35.20 | 3.71 | 2.47 | 75.87 | 100 | 22.00 | 14.97 | 11.73 |
H | -57.93 | -61.53 | -69.47 | 55.54 | 61.01 | 1.04 | 22.00 | 100 | 96.98 | -61.73 |
I | -73.68 | -72.49 | -74.13 | 59.24 | 66.76 | -10.79 | 14.97 | 96.98 | 100 | -76.28 |
J | 97.74 | 90.62 | 78.65 | -37.42 | -31.38 | 48.16 | 11.73 | -61.73 | -76.28 | 100 |
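As a minimal sketch of the cosine similarity calculation, the following code recomputes the similarity between business circle A and business circle B from their Table 4 feature values. The function name `cosine_similarity` is an assumption made for illustration; the disclosure does not name any particular implementation.

```python
import math
from typing import Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("feature vectors must have the same number of dimensions")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Feature values of business circles A and B as listed in Table 4.
circle_a = [0.2, 0.9, 0.17, 0.25, 0.33, 0.41]
circle_b = [0.23, 0.4, 0.16, 0.12, 0.12, 0.31]

similarity_ab = cosine_similarity(circle_a, circle_b)
print(f"{similarity_ab:.2%}")  # 94.66%
```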
Next, in operation S20622, a plurality of first-level business district clusters centered around the current central business district are determined according to the similarity calculation result.
In the embodiment of the present disclosure, suppose business circles A, D, F and H are selected as the central business circles. It can be specified that two business circles whose similarity is greater than 70% are called similar (or homogeneous) business circles, and similar business circles can be aggregated into the same business circle cluster. With the similarity threshold set to 70%, the business circles similar to each central business circle can be found from the pairwise similarities in Table 5 above. That is, the business circles similar to business circle A include business circle B, business circle C, and business circle J; the business circles similar to business circle D include business circle E; the business circles similar to business circle F include business circle G; and the business circles similar to business circle H include business circle I. Therefore, business circles A, B, C and J can be aggregated into one business circle cluster, business circles D and E into another, business circles F and G into another, and business circles H and I into another.
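The first-level clustering step can be sketched as follows, using the Table 4 feature values, the central business circles A, D, F and H, and the 70% threshold. Two details are assumptions of this sketch rather than statements of the disclosure: each non-central business circle is attached only to its most similar central business circle, and a business circle that is not similar enough to any center is left unassigned. The function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def first_level_clusters(vectors: Dict[str, List[float]],
                         centers: List[str],
                         threshold: float) -> Dict[str, List[str]]:
    """Attach each non-central circle to its most similar center above the threshold."""
    clusters = {center: [center] for center in centers}
    for name, vector in vectors.items():
        if name in clusters:
            continue  # central business circles already seed their own cluster
        best = max(centers, key=lambda c: cosine_similarity(vector, vectors[c]))
        if cosine_similarity(vector, vectors[best]) > threshold:
            clusters[best].append(name)
    return clusters


# Feature values of business circles A to J from Table 4.
table4 = {
    "A": [0.2, 0.9, 0.17, 0.25, 0.33, 0.41],
    "B": [0.23, 0.4, 0.16, 0.12, 0.12, 0.31],
    "C": [0.38, 0.5, 0.45, 0.11, 0.24, 0.71],
    "D": [0.52, -0.12, -0.292, -0.464, -0.2, -0.808],
    "E": [0.66, -0.19, -0.446, -0.702, -0.3, -0.1214],
    "F": [-0.166, 0.94, -0.3, -0.16, -0.623, -0.1526],
    "G": [-0.94, 0.67, -0.6, -0.53, -0.4356, -0.1523],
    "H": [0.108, -0.4, -0.908, -0.1416, -0.12, -0.2432],
    "I": [0.122, -0.67, -0.9, -0.3, -0.21, -0.2838],
    "J": [0.136, 0.99, 0.23, 0.12, 0.22, 0.31],
}

clusters = first_level_clusters(table4, centers=["A", "D", "F", "H"], threshold=0.70)
```

With these inputs the sketch groups A, B, C and J together, D with E, F with G, and H with I, which matches the grouping described above.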
FIG. 6 schematically shows a flow chart of cluster analysis of first-level business district clusters formed by clustering according to an embodiment of the present disclosure.
As an alternative embodiment, as shown in fig. 6, after operation S2062, the data processing method may further include the following operation S207, for example.
In operation S207, a cluster analysis is performed on the plurality of first-level business district clusters.
In the embodiment of the disclosure, after the cluster analysis is performed on the plurality of business circles, cluster analysis is further performed on the plurality of first-level business district clusters, so that business circles in the same cluster share more similar characteristics and business circles in different clusters differ more from each other.
Fig. 7 schematically shows a specific flowchart of cluster analysis on the first-level business district clusters formed by clustering according to the embodiment of the present disclosure.
Specifically, as an alternative embodiment, as shown in fig. 7, operation S207 may include the following operations S2071 and S2072.
In operation S2071, a plurality of central business district clusters are selected from the plurality of first-level business district clusters.
In the embodiments of the present disclosure, the similarity between the first-level business district clusters may be calculated by using a standard measurement function or a cosine similarity calculation formula. If the similarity between the first-level business district clusters is larger than the set value, the two first-level business district clusters are determined to be similar, and then a plurality of center business district clusters can be selected from the first-level business district clusters according to the set value and the calculated similarity.
Next, in operation S2072, for each of the plurality of first-level business district clusters, a second-level business district cluster centered around each of the plurality of central business district clusters is determined.
In the embodiment of the disclosure, according to the similarity between the business district clusters, each non-central business district cluster among the plurality of first-level business district clusters may be merged into a determined central business district cluster to form a second-level business district cluster.
Fig. 8 schematically illustrates a specific flow diagram for determining a second level business district cluster centered around each central business district cluster according to an embodiment of the present disclosure.
Specifically, as an alternative embodiment, as shown in FIG. 8, the ellipses omit operations S202-S204 for the sake of space. Operation S2072 may include the following operations S20721 and S20722.
In operation S20721, a similarity between the feature vector of each first-level business district cluster and the feature vector of the current central business district cluster is calculated.
In the embodiment of the present disclosure, the feature vector for the current first-level business district cluster may be determined according to the feature vector of each business circle in the first-level business district cluster. Specifically, the feature values, in each dimension, of the feature vectors of all the business circles located in the first-level business district cluster may be averaged to serve as the feature value of the feature vector of the current first-level business district cluster in the corresponding dimension.
In the embodiment of the disclosure, a business circle A, a business circle B, a business circle C and a business circle J are set to form a first-level business circle cluster 1, a business circle D and a business circle E are set to form a first-level business circle cluster 2, a business circle F and a business circle G are set to form a first-level business circle cluster 3, and a business circle H and a business circle I are set to form a first-level business circle cluster 4. The characteristic values of the characteristic vectors of the first-level business district cluster 1, the first-level business district cluster 2, the first-level business district cluster 3 and the first-level business district cluster 4 in the corresponding dimensions are as shown in the following table 6.
TABLE 6
First-level cluster | Marketplace | Supermarket | Luxury goods | Bar | Party | Training |
First-level business district cluster 1 | 0.217 | 0.305 | -0.015 | 0.0125 | 0.0925 | 0.26155 |
First-level business district cluster 2 | 0.59 | -0.155 | -0.369 | -0.583 | -0.25 | -0.4647 |
First-level business district cluster 3 | -0.553 | 0.805 | -0.45 | -0.345 | -0.5293 | -0.15245 |
First-level business district cluster 4 | 0.115 | -0.535 | -0.904 | -0.2208 | -0.165 | -0.2635 |
As can be seen from Table 6 above, the feature vector of first-level business district cluster 1 can be represented as (0.217, 0.305, -0.015, 0.0125, 0.0925, 0.26155), the feature vector of first-level business district cluster 2 can be represented as (0.59, -0.155, -0.369, -0.583, -0.25, -0.4647), and so on.
In the embodiment of the present disclosure, the similarity between the first-level business district clusters may be calculated by using the above cosine similarity calculation formula, as shown in table 7 below.
TABLE 7
Next, in operation S20722, a second-level business district cluster centered on the current central business district cluster is determined according to the similarity calculation result.
In the embodiment of the disclosure, the standard measurement and calculation function can be used to compare the differences between the first-level business district clusters with the differences between the business circles within each first-level business district cluster, and the central business district clusters and their number are determined accordingly. Alternatively, the central business district clusters and their number can be determined by setting a similarity threshold. For example, with the similarity threshold set to 60%, if the similarity between two first-level business district clusters is greater than 60%, the two clusters are determined to be similar and therefore cannot both be determined as central business district clusters.
In the embodiment of the present disclosure, as can be seen from table 7, the similarity between the first-level business district cluster 2 and the first-level business district cluster 4 is 65.14%, which is greater than the similarity threshold value 60%, and therefore, the first-level business district cluster 2 and the first-level business district cluster 4 cannot be determined as the center business district cluster at the same time. Therefore, the first-level business circle cluster 1, the first-level business circle cluster 2 and the first-level business circle cluster 3 may be selected as the central business circle cluster, and the first-level business circle cluster 1, the first-level business circle cluster 3 and the first-level business circle cluster 4 may also be considered as the central business circle cluster.
In an embodiment of the present disclosure, if the similarity between two first-level business district clusters is greater than 60%, the two clusters are referred to as similar business district clusters. If first-level business district cluster 1, first-level business district cluster 2 and first-level business district cluster 3 are selected as central business district clusters, and cluster analysis is carried out with the selected central business district clusters as centers, second-level business district cluster I, second-level business district cluster II and second-level business district cluster III can be obtained. Second-level business district cluster I can comprise first-level business district cluster 1, second-level business district cluster II can comprise first-level business district cluster 2 and first-level business district cluster 4, and second-level business district cluster III can comprise first-level business district cluster 3.
In the embodiment of the present disclosure, first-level business district cluster 1 includes business circles A, B, C and J, first-level business district cluster 2 includes business circles D and E, first-level business district cluster 3 includes business circles F and G, and first-level business district cluster 4 includes business circles H and I. Therefore, second-level business district cluster I comprises business circles A, B, C and J, second-level business district cluster II comprises business circles D, E, H and I, and second-level business district cluster III comprises business circles F and G, which completes the classification and aggregation of business circles A to J. The results of the business circle cluster analysis are shown in Table 8 below.
TABLE 8
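The second-level step can be sketched in the same way. The sketch below rests on assumptions that the disclosure does not fix: central clusters are picked greedily in listing order, so that no two centers are more than 60% similar, and each remaining cluster is merged into its most similar center. The function names are hypothetical. The vectors follow Table 6, except that the last component of cluster 4 is taken as -0.2635, the mean of business circles H and I in Table 4, which is the value that reproduces the stated 65.14% similarity between clusters 2 and 4.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def select_centers(vectors: Dict[int, List[float]], threshold: float) -> List[int]:
    """Greedy choice (an assumption): keep a cluster as a center only if it is
    not similar to any center chosen so far."""
    centers: List[int] = []
    for name, vector in vectors.items():
        if all(cosine_similarity(vector, vectors[c]) <= threshold for c in centers):
            centers.append(name)
    return centers


def merge_into_centers(vectors: Dict[int, List[float]], centers: List[int],
                       threshold: float) -> Dict[int, List[int]]:
    """Merge every non-central cluster into its most similar central cluster."""
    groups = {c: [c] for c in centers}
    for name, vector in vectors.items():
        if name in groups:
            continue
        best = max(centers, key=lambda c: cosine_similarity(vector, vectors[c]))
        if cosine_similarity(vector, vectors[best]) > threshold:
            groups[best].append(name)
    return groups


first_level = {
    1: [0.217, 0.305, -0.015, 0.0125, 0.0925, 0.26155],
    2: [0.59, -0.155, -0.369, -0.583, -0.25, -0.4647],
    3: [-0.553, 0.805, -0.45, -0.345, -0.5293, -0.15245],
    4: [0.115, -0.535, -0.904, -0.2208, -0.165, -0.2635],  # mean of H and I (Table 4)
}

centers = select_centers(first_level, threshold=0.60)
second_level = merge_into_centers(first_level, centers, threshold=0.60)
```

Here clusters 1, 2 and 3 become the centers and cluster 4 is merged into cluster 2, which corresponds to second-level clusters I, II and III in the text. Listing the clusters in a different order could instead yield centers 1, 3 and 4, the other choice mentioned above.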
Through the embodiment of the disclosure, the business circles are classified and aggregated into business circle clusters, which is of significance to services that are carried out region by region. For example, the clusters can serve as the basis for services such as personalized regional marketing, offline site selection, regional purchasing power prediction, and regional brand promotion strategy making.
Fig. 9 schematically shows a block diagram of a data processing apparatus according to an embodiment of the present disclosure.
The apparatus shown in fig. 9 may be used to implement the methods described in the above embodiments. The data processing apparatus 900 may include a processing module 910 and a clustering module 920.
The processing module 910 may include a first determining submodule 911, an obtaining submodule 912, a generating submodule 913, and a second determining submodule 914.
In particular, the processing module 910 is configured to determine a feature vector for each of a plurality of business circles.
The clustering module 920 is configured to perform cluster analysis on the plurality of business circles according to the determined feature vector for each business circle, so as to classify the plurality of business circles.
In an embodiment of the present disclosure, the first determining submodule 911 is configured to determine at least one core merchant located within the current business circle.
The obtaining submodule 912 is configured to obtain feature information of each of the at least one core merchant.
The generating submodule 913 is configured to generate a feature vector for each core merchant according to the feature information of each core merchant.
The second determining submodule 914 is configured to determine a feature vector for the current business circle from the generated feature vector for each core merchant.
It should be noted that, in the embodiment of the present disclosure, the embodiment of the apparatus portion is the same as or similar to the embodiment of the method portion, and is not described herein again.
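One possible software arrangement of these modules is sketched below. It is purely illustrative: the class and field names are hypothetical, and each submodule is modeled as an injected callable so that the sketch makes no assumption about how core merchants are found or how feature information is vectorized.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Sequence

Vector = List[float]


@dataclass
class ProcessingModule:
    """Software counterpart of processing module 910 (hypothetical layout)."""
    find_core_merchants: Callable[[str], Sequence[str]]   # first determining submodule 911
    get_feature_info: Callable[[str], object]             # obtaining submodule 912
    to_feature_vector: Callable[[object], Vector]         # generating submodule 913
    aggregate: Callable[[Sequence[Vector]], Vector]       # second determining submodule 914

    def circle_vector(self, circle_id: str) -> Vector:
        merchants = self.find_core_merchants(circle_id)
        vectors = [self.to_feature_vector(self.get_feature_info(m)) for m in merchants]
        return self.aggregate(vectors)


@dataclass
class DataProcessingApparatus:
    """Software counterpart of data processing apparatus 900 (hypothetical layout)."""
    processing: ProcessingModule
    cluster: Callable[[Dict[str, Vector]], Dict[str, List[str]]]  # clustering module 920

    def classify(self, circle_ids: Iterable[str]) -> Dict[str, List[str]]:
        vectors = {c: self.processing.circle_vector(c) for c in circle_ids}
        return self.cluster(vectors)
```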
Any number of the modules and sub-modules according to embodiments of the present disclosure, or at least part of the functionality of any number of them, may be implemented in one module. Any one or more of the modules and sub-modules according to the embodiments of the present disclosure may be implemented by being split into a plurality of modules. Any one or more of the modules and sub-modules according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, or an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or in any one of the three implementation manners of software, hardware and firmware, or in any suitable combination of them. Alternatively, one or more of the modules and sub-modules according to embodiments of the disclosure may be implemented at least partly as computer program modules which, when executed, may perform corresponding functions.
For example, any number of the first determining submodule 911, the obtaining submodule 912, the generating submodule 913 and the second determining submodule 914 may be combined and implemented in one module, or any one of the modules may be split into a plurality of modules. Alternatively, at least part of the functionality of one or more of these modules may be combined with at least part of the functionality of the other modules and implemented in one module. According to an embodiment of the present disclosure, at least one of the first determining submodule 911, the obtaining submodule 912, the generating submodule 913 and the second determining submodule 914 may be implemented at least partially as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a chip, a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented by hardware or firmware in any other reasonable manner of integrating or packaging a circuit, or implemented by any one of three implementations of software, hardware and firmware, or implemented by a suitable combination of any of them. Alternatively, at least one of the first determining submodule 911, the obtaining submodule 912, the generating submodule 913 and the second determining submodule 914 may be at least partly implemented as a computer program module, which, when executed, may perform a corresponding function.
Fig. 10 schematically shows a block diagram of an electronic device adapted to implement a data processing method according to an embodiment of the present disclosure. The electronic device shown in fig. 10 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present disclosure.
As shown in fig. 10, an electronic device 1000 according to an embodiment of the present disclosure includes a processor 1001 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 1002 or a program loaded from a storage section 1008 into a Random Access Memory (RAM) 1003. Processor 1001 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), among others. The processor 1001 may also include onboard memory for caching purposes. The processor 1001 may include a single processing unit or multiple processing units for performing different actions of a method flow according to embodiments of the present disclosure.
In the RAM 1003, various programs and data necessary for the operation of the electronic apparatus 1000 are stored. The processor 1001, ROM 1002, and RAM 1003 are connected to each other by a bus 1004. The processor 1001 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 1002 and/or the RAM 1003. Note that the program may also be stored in one or more memories other than the ROM 1002 and the RAM 1003. The processor 1001 may also perform various operations of method flows according to embodiments of the present disclosure by executing programs stored in one or more memories.
According to embodiments of the present disclosure, method flows according to embodiments of the present disclosure may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable storage medium, the computer program containing program code for performing the method illustrated by the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication part 1009 and/or installed from the removable medium 1011. The computer program performs the above-described functions defined in the system of the embodiment of the present disclosure when executed by the processor 1001. The systems, devices, apparatuses, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the present disclosure.
The present disclosure also provides a computer-readable storage medium, which may be contained in the apparatus/device/system described in the above embodiments; or may exist separately and not be assembled into the device/apparatus/system. The computer-readable storage medium carries one or more programs which, when executed, implement the method according to an embodiment of the disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, a computer-readable storage medium may include the ROM 1002 and/or the RAM 1003 described above and/or one or more memories other than the ROM 1002 and the RAM 1003.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments and/or claims of the present disclosure can be combined and/or incorporated in various ways, even if such combinations or incorporations are not expressly recited in the present disclosure. In particular, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or incorporated in various ways without departing from the spirit or teaching of the present disclosure. All such combinations and/or incorporations are within the scope of the present disclosure.
The embodiments of the present disclosure have been described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used in advantageous combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be devised by those skilled in the art without departing from the scope of the present disclosure, and such alternatives and modifications are intended to be within the scope of the present disclosure.
Claims (10)
1. A data processing method for classifying a plurality of business circles, the method comprising:
for each business circle of the plurality of business circles, performing the following operations to determine a feature vector for that business circle:
determining at least one core merchant located within a current business circle;
acquiring feature information of each core merchant in the at least one core merchant;
generating a feature vector for each core merchant according to the feature information of that core merchant; and
determining a feature vector for the current business circle according to the generated feature vector for each core merchant; and
performing cluster analysis on the plurality of business circles according to the determined feature vector for each business circle so as to classify the plurality of business circles.
2. The method of claim 1, wherein the generating a feature vector for each core merchant according to the feature information of that core merchant comprises, for each core merchant and according to the feature information of the current core merchant:
determining at least one dimensional feature for characterizing the commercial value of the current core merchant; and
calculating a feature value for each of the at least one dimensional feature.
3. The method of claim 2, wherein the determining a feature vector for the current business circle according to the generated feature vector for each core merchant comprises:
averaging, for each dimension, the feature values of the feature vectors of all core merchants in the current business circle to obtain the feature value of the feature vector of the current business circle in the corresponding dimension.
4. The method of claim 1, wherein the performing cluster analysis on the plurality of business circles according to the determined feature vector for each business circle comprises:
selecting a plurality of central business circles from the plurality of business circles; and
for each central business circle of the plurality of central business circles, performing the following operations to determine a first-level business circle cluster centered on that central business circle:
calculating the similarity between the feature vector of each business circle in the plurality of business circles and the feature vector of the current central business circle; and
determining, according to the similarity calculation result, a first-level business circle cluster centered on the current central business circle.
5. The method of claim 4, wherein the method further comprises:
after determining the first-level business circle clusters taking each central business circle as the center, performing cluster analysis on the plurality of first-level business circle clusters.
6. The method of claim 5, wherein the performing cluster analysis on the plurality of first-level business circle clusters comprises:
selecting a plurality of central business circle clusters from the plurality of first-level business circle clusters; and
for each central business circle cluster of the plurality of central business circle clusters, performing the following operations to determine a second-level business circle cluster centered on that central business circle cluster:
calculating the similarity between the feature vector of each first-level business circle cluster and the feature vector of the current central business circle cluster; and
determining, according to the similarity calculation result, a second-level business circle cluster centered on the current central business circle cluster.
7. The method of claim 4, wherein the plurality of central business circles are not geographically adjacent to each other.
8. A data processing apparatus for classifying a plurality of business circles, the apparatus comprising:
a processing module, configured to determine a feature vector for each of the plurality of business circles, the processing module comprising:
a first determining submodule, configured to determine at least one core merchant located within a current business circle;
an acquisition submodule, configured to acquire feature information of each core merchant in the at least one core merchant;
a generating submodule, configured to generate a feature vector for each core merchant according to the feature information of that core merchant; and
a second determining submodule, configured to determine a feature vector for the current business circle according to the generated feature vector for each core merchant; and
a clustering module, configured to perform cluster analysis on the plurality of business circles according to the determined feature vector for each business circle, so as to classify the plurality of business circles.
9. An electronic device, comprising:
one or more processors; and
storage means for storing executable instructions which, when executed by the one or more processors, implement the method of any one of claims 1 to 7.
10. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, implement a method according to any one of claims 1 to 7.
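The flow recited in claims 1 to 4 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the sample data, the use of cosine similarity, and the rule of assigning each business circle to its most similar central business circle are all assumptions made for illustration; the claims only require that a similarity be calculated and that clusters be determined from the result.

```python
import math


def circle_vector(merchant_vectors):
    """Average core-merchant feature vectors dimension by dimension (claim 3)."""
    n = len(merchant_vectors)
    return [sum(dim) / n for dim in zip(*merchant_vectors)]


def cosine_similarity(a, b):
    """Cosine similarity; an assumed measure, the claims only say 'similarity'."""
    dot = sum(x * y for x, y in zip(a, b))
    denom = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / denom if denom else 0.0


def cluster_around_centers(vectors, center_ids):
    """Group every item with its most similar center (assumed assignment rule)."""
    clusters = {c: [] for c in center_ids}
    for item_id, vec in vectors.items():
        best = max(center_ids, key=lambda c: cosine_similarity(vec, vectors[c]))
        clusters[best].append(item_id)
    return clusters


# Hypothetical data: each business circle maps to the feature vectors of its
# core merchants, with two feature dimensions per merchant.
core_merchants = {
    "A": [[10, 2], [8, 4]],
    "B": [[9, 3]],
    "C": [[1, 8], [3, 10]],
    "D": [[2, 7]],
}

# One feature vector per business circle, obtained by averaging.
circle_vectors = {cid: circle_vector(v) for cid, v in core_merchants.items()}

# First-level clusters centered on the chosen central business circles.
first_level = cluster_around_centers(circle_vectors, center_ids=["A", "C"])
print(first_level)
```

The second-level clustering of claims 5 and 6 could reuse `cluster_around_centers` if each first-level cluster is given its own feature vector, for example the mean of its members' vectors. That choice is likewise an assumption and is not stated in the claims.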
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911195632.7A CN112862514B (en) | 2019-11-27 | 2019-11-27 | Data processing method and device, electronic equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911195632.7A CN112862514B (en) | 2019-11-27 | 2019-11-27 | Data processing method and device, electronic equipment and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112862514A (en) | 2021-05-28 |
CN112862514B (en) | 2024-06-18 |
Family
ID=75995980
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201911195632.7A Active CN112862514B (en) | 2019-11-27 | 2019-11-27 | Data processing method and device, electronic equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112862514B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115860810A (en) * | 2023-02-07 | 2023-03-28 | 广州数说故事信息科技有限公司 | Dynamic monitoring method and system for industry brand city store opening strategy |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050050027A1 (en) * | 2003-09-03 | 2005-03-03 | Leslie Yeh | Determining and/or using location information in an ad system |
CN105574014A (en) * | 2014-10-13 | 2016-05-11 | 北京明略软件系统有限公司 | Commercial district division method and system |
CN106649331A (en) * | 2015-10-29 | 2017-05-10 | 阿里巴巴集团控股有限公司 | Business district recognition method and equipment |
CN107067293A (en) * | 2017-03-07 | 2017-08-18 | 北京三快在线科技有限公司 | Merchant category method, device and electronic equipment |
US10140623B1 (en) * | 2014-10-27 | 2018-11-27 | Square, Inc. | Detection and explanation of lifts in merchant data |
CN109101989A (en) * | 2018-06-29 | 2018-12-28 | 阿里巴巴集团控股有限公司 | A kind of Merchant Category model construction and Merchant Category method, device and equipment |
CN109102334A (en) * | 2018-08-07 | 2018-12-28 | 长沙市到家悠享家政服务有限公司 | Market area partition method, apparatus and electronic equipment |
CN109697637A (en) * | 2018-12-27 | 2019-04-30 | 拉扎斯网络科技(上海)有限公司 | Object type determination method and device, electronic equipment and computer storage medium |
CN110070380A (en) * | 2018-01-24 | 2019-07-30 | 北京京东尚科信息技术有限公司 | Information generating method and device |
CN110322287A (en) * | 2019-06-18 | 2019-10-11 | 平安普惠企业管理有限公司 | A kind of service area screening technique and device |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050050027A1 (en) * | 2003-09-03 | 2005-03-03 | Leslie Yeh | Determining and/or using location information in an ad system |
CN105574014A (en) * | 2014-10-13 | 2016-05-11 | 北京明略软件系统有限公司 | Commercial district division method and system |
US10140623B1 (en) * | 2014-10-27 | 2018-11-27 | Square, Inc. | Detection and explanation of lifts in merchant data |
CN106649331A (en) * | 2015-10-29 | 2017-05-10 | 阿里巴巴集团控股有限公司 | Business district recognition method and equipment |
CN107067293A (en) * | 2017-03-07 | 2017-08-18 | 北京三快在线科技有限公司 | Merchant category method, device and electronic equipment |
CN110070380A (en) * | 2018-01-24 | 2019-07-30 | 北京京东尚科信息技术有限公司 | Information generating method and device |
CN109101989A (en) * | 2018-06-29 | 2018-12-28 | 阿里巴巴集团控股有限公司 | A kind of Merchant Category model construction and Merchant Category method, device and equipment |
CN109102334A (en) * | 2018-08-07 | 2018-12-28 | 长沙市到家悠享家政服务有限公司 | Market area partition method, apparatus and electronic equipment |
CN109697637A (en) * | 2018-12-27 | 2019-04-30 | 拉扎斯网络科技(上海)有限公司 | Object type determination method and device, electronic equipment and computer storage medium |
CN110322287A (en) * | 2019-06-18 | 2019-10-11 | 平安普惠企业管理有限公司 | A kind of service area screening technique and device |
Non-Patent Citations (2)
Title |
---|
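Hao Bin; Dong Shuo; Hu Yincui; Liu Xue; Gao Yujian; Zhang Yadong: "Urban business circle division method based on multi-dimensional feature fusion" (多维特征融合的城市商圈划分方法), Geography and Geo-Information Science (地理与地理信息科学), no. 05 * |
Min Fang: "Simulation research on feature data detection of business information in a cloud computing environment" (云计算环境下商业信息特征数据检测仿真研究), Computer Simulation (计算机仿真), no. 12 * |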
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115860810A (en) * | 2023-02-07 | 2023-03-28 | 广州数说故事信息科技有限公司 | Dynamic monitoring method and system for industry brand city store opening strategy |
CN115860810B (en) * | 2023-02-07 | 2023-06-06 | 广州数说故事信息科技有限公司 | Dynamic monitoring method and system for industry brand city shop opening strategy |
Also Published As
Publication number | Publication date |
---|---|
CN112862514B (en) | 2024-06-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11049142B2 (en) | Smart geo-fencing using location sensitive product affinity | |
US10902443B2 (en) | Detecting differing categorical features when comparing segments | |
US8909771B2 (en) | System and method for using global location information, 2D and 3D mapping, social media, and user behavior and information for a consumer feedback social media analytics platform for providing analytic measurements data of online consumer feedback for global brand products or services of past, present or future customers, users, and/or target markets | |
US20130226711A1 (en) | Monetizing images in publishing networks | |
US9858610B2 (en) | Product recommendation based on geographic location and user activities | |
US8682714B2 (en) | Location analytics systems and methods | |
US20120239590A1 (en) | Managing customer communications among a plurality of channels | |
JP7285521B2 (en) | System and method for predicting similar mobile devices | |
Widaningrum et al. | Discovering spatial patterns of fast-food restaurants in Jakarta, Indonesia | |
CN113360792B (en) | Information recommendation method, device, electronic equipment and storage medium | |
Smith | Metrics, locations, and lift: Mobile location analytics and the production of second-order geodemographics | |
CN112749323B (en) | Method and device for constructing user portrait | |
JP6994602B2 (en) | Advertising control device and advertising control system | |
CN110720099A (en) | System and method for providing recommendation based on seed supervised learning | |
CN112862514B (en) | Data processing method and device, electronic equipment and computer readable storage medium | |
Kamala | Development of an effective method of data collection for advertising and marketing on the internet | |
CN110555745B (en) | Information pushing method and system, computer system and computer readable storage medium | |
US20170372358A1 (en) | Budgeting for campaigns associated with locations | |
US10863316B1 (en) | Predicting a physical location of an online system user from multiple candidate physical locations based on a geographic location of a client device associated with the user | |
CN110033336B (en) | Method and device for determining address | |
US10498838B2 (en) | Determining online system user eligibility for receiving content using a polygon representing a physical location associated with the content | |
US11620588B2 (en) | Methods and systems for determining alternative plans | |
CN112825176B (en) | Advertisement putting method and device | |
US20160063514A1 (en) | Marketing platform that provides anonymous and comparative performance information related to vendors | |
Mohamed et al. | Implementation of a Geographical Information System (GIS) for E-commerce |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||