CN111931077B - Data processing method, device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN111931077B
Authority
CN
China
Prior art keywords
poi
order
positioning
positioning position
historical
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010615622.0A
Other languages
Chinese (zh)
Other versions
CN111931077A (en
Inventor
张雷
段航
杨凯
苏哲
胡渭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hanhai Information Technology Shanghai Co Ltd
Original Assignee
Hanhai Information Technology Shanghai Co Ltd
Application filed by Hanhai Information Technology Shanghai Co Ltd
Priority to CN202010615622.0A
Publication of CN111931077A
Application granted
Publication of CN111931077B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0633 Lists, e.g. purchase orders, compilation or processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Remote Sensing (AREA)
  • Development Economics (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method and device. The method comprises the following steps: based on the order information of each of a plurality of historical orders, determining a point of interest (POI) name corresponding to each historical order and a third positioning position to which each historical order belongs, wherein the order information comprises a first positioning position when the order is placed and a second positioning position when the order is handed over; identifying, from the plurality of historical orders, target historical orders that have the same POI name and whose third positioning positions meet a preset condition, the preset condition comprising that the distance between the third positioning positions of any two target historical orders is smaller than a first preset threshold; and determining POI coordinates corresponding to the same POI name based on the first positioning positions and the second positioning positions of the target historical orders with the same POI name. The method and the device can improve the accuracy and coverage of the mined POI coordinates and reduce the limitations of mining.

Description

Data processing method, device, electronic equipment and storage medium
Technical Field
Embodiments of the present invention relate to the field of data processing technologies, and in particular, to a data processing method, a data processing device, an electronic device, and a computer readable storage medium.
Background
Points of interest (POIs) emerged to meet users' personalized service needs once geographic information systems had developed to a certain stage. POI information mainly includes names, categories, coordinates, classifications, and the like. Comprehensive POI information is a prerequisite for enriching a navigation map: timely POI data can alert the user to road branches and to details of surrounding buildings, and makes it easy to find the places the user needs during navigation, so that the most convenient and unobstructed route can be chosen in path planning. POI coordinates are therefore particularly important.
In the related art, POI coordinates are mainly obtained by geocoding. Geocoding involves a correspondence between addresses or place names and coordinates, so the coordinates of a POI can be looked up from the geocode by the POI's address attribute.
However, geocoding requires fairly complete map data support, and the accuracy of current self-built geocoding is low, so the POI coordinates obtained in this way have low accuracy. In addition, the address information of some POIs is of poor quality, which makes it difficult to obtain POI coordinates from the geocode according to the POI address, so the scheme also suffers from low coverage. Furthermore, for POIs with similar addresses (such as two shops on the same street), geocoding easily yields the same POI coordinates, so the scheme is significantly limited.
Therefore, schemes for mining POI coordinates in the related art generally suffer from low accuracy, low coverage and significant limitations.
Disclosure of Invention
The embodiment of the invention provides a data processing method to solve the problems of low accuracy, low coverage rate and large limitation of POI coordinates in the scheme for mining the POI coordinates in the related technology.
In order to solve the above problems, in a first aspect, an embodiment of the present invention provides a data processing method, including:
based on order information of each historical order in a plurality of historical orders, determining a POI name corresponding to each historical order and a third positioning position to which each historical order belongs;
the order information comprises a first positioning position when an order is placed and a second positioning position when the order is handed over;
identifying target historical orders with the same POI names and the third positioning positions meeting preset conditions from the historical orders, wherein the third positioning positions meeting the preset conditions comprise that the distance between the third positioning positions of any two target historical orders is smaller than a first preset threshold;
and determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name.
In a second aspect, an embodiment of the present invention provides a data processing apparatus, including:
the first determining module is used for respectively determining the POI names of the points of interest corresponding to each historical order and the third positioning position of each historical order based on the order information of each historical order in the plurality of historical orders;
the order information comprises a first positioning position when an order is placed and a second positioning position when the order is handed over;
the first identifying module is used for identifying target historical orders with the same POI names and the third positioning positions meeting the preset conditions from the plurality of historical orders, wherein the third positioning positions meeting the preset conditions comprise that the distance between the third positioning positions of any two target historical orders is smaller than a first preset threshold;
and the second determining module is used for determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name.
In a third aspect, the embodiment of the present invention further discloses an electronic device, including a memory, a processor, and a computer program stored in the memory and capable of running on the processor, where the processor implements the data processing method according to the embodiment of the present invention when executing the computer program.
In a fourth aspect, embodiments of the present invention provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the data processing method disclosed in the embodiments of the present invention.
In the embodiment of the invention, target historical orders that correspond to the same POI name and whose third positioning positions are pairwise close to each other are determined from the plurality of historical orders, so that the determined target historical orders point to the same POI. The POI coordinates of that POI are then determined from the first positioning positions (when the orders were placed) and the second positioning positions (when the orders were handed over) of the plurality of target historical orders pointing to it, which improves the accuracy of the mined POI coordinates. In addition, because historical orders cover a wide range of geographic positions, the POI coordinates of any position where historical orders exist can be mined with the technical scheme provided by the embodiment of the invention, which further improves the coverage of POI coordinate mining. Furthermore, because the POI coordinates of a POI name are mined by combining the first and second positioning positions of a plurality of target historical orders with the same POI name, two geographic positions that are close together but have different POI names are not mined to the same POI coordinates; instead, different POI coordinates can be mined for them based on their POI names, which reduces the limitations of the mined POI coordinates. Finally, because order information is timely, POI coordinates can be mined more promptly based on the order information of historical orders.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings used in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of the steps of a data processing method of one embodiment of the present invention;
FIG. 2 is a schematic map view of one embodiment of the present invention;
FIG. 3 is a block diagram of a data processing apparatus according to one embodiment of the present invention;
FIG. 4 schematically illustrates a block diagram of a computing processing device for performing a method according to the present disclosure; and
FIG. 5 schematically illustrates a storage unit for holding or carrying program code implementing a method according to the present disclosure.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
An embodiment of the present invention provides a data processing method, as shown in FIG. 1, where the method may include the following steps:
Step 101, based on order information of each historical order in a plurality of historical orders, determining a POI name corresponding to each historical order and a third positioning position to which each historical order belongs;
the order information comprises a first positioning position when an order is placed and a second positioning position when the order is handed over;
optionally, the order information further includes an order address.
A historical order may be any type of order whose order information includes an order address, a first positioning position when the order is placed, a second positioning position when the order is handed over, and the like, such as a take-out order, a taxi order or an express delivery order.
For example, for a take-out order, the order address may be the delivery address set by the ordering user, the first positioning position may be the user's positioning position when the order is placed, and the second positioning position may be the user's positioning position when the user receives the goods delivered by the courier.
For another example, for a taxi order, the order address may be the pick-up address set by the ordering user, the first positioning position may be the user's positioning position when the order is placed, and the second positioning position may be the user's positioning position when the order is handed over, for example when the user boards the vehicle.
In a possible implementation, when the order information of a historical order includes the order address but not the first and second positioning positions, the historical order can be associated with the user's positioning coordinates at the time the order was placed and at the time the order was handed over. The resulting association between the historical order and this pair of positioning coordinates then forms part of the order information, so that the assembled order information includes not only the order address but also the first positioning position and the second positioning position.
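The association described above can be pictured with a minimal sketch. The record layout, the field names and the two lookup tables keyed by order ID are all assumptions made for illustration; the source does not specify how the positioning coordinates are stored.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

LatLng = Tuple[float, float]  # (latitude, longitude) in degrees


@dataclass(frozen=True)
class OrderInfo:
    """Hypothetical order record; field names are illustrative only."""
    order_id: str
    order_address: str
    first_position: Optional[LatLng] = None   # positioning when the order is placed
    second_position: Optional[LatLng] = None  # positioning when the order is handed over


def attach_positions(order: OrderInfo,
                     placed_fixes: Dict[str, LatLng],
                     handover_fixes: Dict[str, LatLng]) -> OrderInfo:
    """Return a copy of `order` whose missing positions are filled from
    positioning logs keyed by order ID (the logs are assumed inputs)."""
    first = order.first_position
    if first is None:
        first = placed_fixes.get(order.order_id)
    second = order.second_position
    if second is None:
        second = handover_fixes.get(order.order_id)
    return OrderInfo(order.order_id, order.order_address, first, second)


# Usage: an order that only carries an address gains both positions.
bare = OrderInfo("o1", "some order address")
full = attach_positions(bare, {"o1": (39.90, 116.30)}, {"o1": (39.91, 116.31)})
```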
In this step, the POI name of each historical order and the position to which the historical order belongs are mined, mainly based on the order information of the historical order.
As for the position to which a historical order belongs: although the order information may include at least one positioning position, a single third positioning position is mined for each historical order so that the order's position can be expressed conveniently.
Step 102, identifying target historical orders with the same POI names and the third positioning position meeting preset conditions from the historical orders;
Through the process of step 101, for each historical order, a POI name and a third location may be mined.
Since the number of the historical orders is plural, it is necessary to mine a target historical order which corresponds to the same POI name and whose third location satisfies a predetermined condition from the plurality of historical orders.
The number of target historical orders mined here is a plurality.
The third positioning positions meet the preset condition when the distance between the third positioning positions of any two target historical orders is smaller than a first preset threshold.
That is, this step extracts from the plurality of historical orders a plurality of target historical orders that point to the same POI: their POI names are the same, and the third positioning positions of any two of them are close to each other, meaning that the distance between them is smaller than the first preset threshold.
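A minimal sketch of this selection follows, under stated assumptions. The source fixes only the condition (same POI name, pairwise distance below the first preset threshold); the greedy grouping used here to find such a set, the haversine distance, the tuple layout and the function names are illustrative choices, not the patented procedure.

```python
import math
from collections import defaultdict

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, metres


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lng) points in degrees."""
    lat1, lng1 = math.radians(a[0]), math.radians(a[1])
    lat2, lng2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(h)))


def find_target_orders(orders, threshold_m):
    """orders: iterable of (order_id, poi_name, third_position).

    Returns a list of (poi_name, [order_id, ...]) groups in which every pair
    of third positions is closer than threshold_m. Grouping is greedy
    (an assumption): an order joins the first group it is pairwise-close to.
    """
    by_name = defaultdict(list)
    for order_id, poi_name, pos in orders:
        by_name[poi_name].append((order_id, pos))

    result = []
    for name, members in by_name.items():
        groups = []
        for order_id, pos in members:
            for group in groups:
                if all(haversine_m(pos, other) < threshold_m for _, other in group):
                    group.append((order_id, pos))
                    break
            else:  # no existing group accepted this order
                groups.append([(order_id, pos)])
        # The target historical orders are plural, so keep groups of two or more.
        result.extend((name, [oid for oid, _ in g]) for g in groups if len(g) >= 2)
    return result
```

With a 200 m threshold, two orders named "A" about 14 m apart form one group, while a third "A" order in another city and a lone "B" order are left out.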
Step 103, determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name.
Wherein, the POI names respectively extracted for a plurality of target historical orders are the same. And each target historical order has a first positioning position and a second positioning position, if the first positioning position and the second positioning position of an order are regarded as a positioning position combination, a plurality of target historical orders with the same POI name can correspond to a plurality of groups of positioning position combinations, and the POI coordinates corresponding to the POI name can be mined by utilizing the plurality of groups of positioning position combinations.
The first positioning position may be a latitude and longitude coordinate, the second positioning position may be a latitude and longitude coordinate, and correspondingly, the POI coordinate may also be a latitude and longitude coordinate.
Optionally, the order information further includes an order address, and when step 101 is performed, it may be implemented through S201 and S202:
S201, based on the order address of each historical order in a plurality of historical orders, determining a POI name corresponding to each historical order respectively;
the POI name corresponding to each historical order can be extracted from the order address by performing word segmentation processing and the like on the order address of each historical order.
S202, determining a third positioning position of each historical order based on the first positioning position and the second positioning position of each historical order.
When determining the third positioning position of a historical order, either the first positioning position or the second positioning position of the order can be used as the third positioning position. Alternatively, an intermediate position between the first and second positioning positions can be used as the third positioning position to which the historical order belongs.
Determining the position of a historical order from its order-placement position and its handover position can make that position more accurate.
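The three options for the third positioning position can be sketched as follows. The function name and the `mode` parameter are hypothetical, and taking the arithmetic mean of latitude and longitude as the "intermediate position" is an assumption that is reasonable only for points that are close together and not near the antimeridian.

```python
def third_position(first, second, mode="midpoint"):
    """Pick the third positioning position for one order.

    first, second: (lat, lng) tuples in degrees.
    mode: "first", "second", or "midpoint" (illustrative option names).
    """
    if mode == "first":
        return first
    if mode == "second":
        return second
    if mode == "midpoint":
        # Simple average of the two fixes; adequate for nearby points.
        return ((first[0] + second[0]) / 2, (first[1] + second[1]) / 2)
    raise ValueError(f"unknown mode: {mode!r}")
```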
The execution sequence of S201 and S202 is not limited in the present invention.
In the embodiment of the invention, the POI name corresponding to each historical order can be respectively determined based on the order address of each historical order in a plurality of historical orders, and the accuracy of the mined POI name can be improved because the order address of the historical order is more accurate; in addition, the third positioning position of each historical order can be determined based on the first positioning position and the second positioning position of each historical order, and the ordering positioning and the handover positioning of the historical order are accurate, so that the positioning of the historical order determined based on the first positioning position and the second positioning position is objective and accurate.
Optionally, when executing the step S201, a word segmentation may be performed for the order address of each of the plurality of historical orders to obtain a word segmentation result, and a POI name corresponding to each of the historical orders may be extracted from the word segmentation result.
The order information of each historical order can include an order address, a first positioning position when the order is placed, and a second positioning position when the order is handed over. Of course, the order information may also include information such as time of order, payment method, order remarks, etc.
And aiming at the order address of each historical order, respectively performing word segmentation on the order address to obtain a word segmentation result. After word segmentation, the order address can be segmented into a plurality of word segments, and attribute labels of the word segments are obtained. According to the attribute tags of the individual segmentations, POI names which possibly represent one POI can be extracted from the segmentations obtained by segmentation. Wherein a POI may be a shop, a mall, a bus stop, an office building, a park, a cell, etc.
In an alternative embodiment, a word segmentation model for structurally segmenting an order address may be pre-trained. In the training process, a large amount of first sample data is acquired, each item of which comprises a sample word segment and the labeled attribute tag of that segment. The word segmentation model to be trained is then trained on this data with a machine learning algorithm: the sample word segments are taken as input, a loss value is calculated from the model's output and the labeled attribute tags, and training is determined to be complete when the loss value falls within a preset range; the trained model is used as the word segmentation model. In implementation, the word segmentation model may adopt a model structure such as BiLSTM-CRF (Bidirectional Long Short-Term Memory with a Conditional Random Field).
Optionally, when the order address is segmented to obtain a segmentation result, the order address may be input into a pre-trained segmentation model to obtain each segmented word and an attribute tag of each segmented word output by the segmentation model, and the segmented words and the attribute tags of each segmented word are used as the segmentation result. The word segmentation model is trained according to a plurality of first sample data, wherein the first sample data comprises sample word segmentation and labeling attribute labels of the sample word segmentation.
For example, suppose an order address is "6th Floor, Block A, A Building, Guangcun Road, Haidian District, Beijing". After structural word segmentation, the segments "Beijing", "Haidian District", "Guangcun Road", "A Building", "Block A" and "6th Floor" are obtained. The attribute tag of "Beijing" is "city", that of "Haidian District" is "region", that of "Guangcun Road" is "street", that of "A Building" is "POI", that of "Block A" is "building", and that of "6th Floor" is "floor".
In the embodiment of the invention, the word segmentation model is trained based on a large number of sample word segmentation and labeling attribute labels of the sample word segmentation, so that the word segmentation model is utilized to more accurately and more rapidly segment the order address.
Optionally, the order information further includes a manual address type, and in the POI name extraction process, POI name extraction may be performed according to the manual address type.
In an alternative embodiment, the correspondence between the address type and the attribute tag may be preset.
For example, when the address type is office building, the corresponding attribute tags may be "POI" and "building"; when the address type is a cell, the corresponding attribute tags may be "POI" and "building number"; when the address type is a shop, the corresponding attribute tag may be "POI"; when the address type is a mall, the corresponding attribute tag may be "POI"; when the address type is park, the corresponding attribute tag may be "POI"; when the address type is a bus stop, the corresponding attribute tag may be "POI", and so on.
When the POI name corresponding to each historical order is extracted from the word segmentation result, the target attribute label corresponding to the manual address type of each historical order can be queried from the corresponding relation between the preset address type and the attribute label; extracting the word segmentation with the attribute tag being the target attribute tag from the word segmentation result, and taking the extracted word segmentation as the POI name corresponding to each historical order.
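The lookup-then-extract procedure can be sketched as below. The table reproduces the correspondences given as examples in the text; joining the matching segments with a space, falling back to the "POI" tag for an unknown address type, and the sample tokens are assumptions made for illustration.

```python
# Correspondence between address type and attribute tags, as listed in the text.
ADDRESS_TYPE_TO_TAGS = {
    "office building": ("POI", "building"),
    "cell": ("POI", "building number"),
    "shop": ("POI",),
    "mall": ("POI",),
    "park": ("POI",),
    "bus stop": ("POI",),
}


def extract_poi_name(segmentation, address_type, sep=" "):
    """segmentation: list of (segment, attribute_tag) pairs from the segmenter.

    Keeps the segments whose tag is a target tag for this address type and
    joins them in their original order (joining is an assumption).
    """
    target_tags = ADDRESS_TYPE_TO_TAGS.get(address_type, ("POI",))
    return sep.join(seg for seg, tag in segmentation if tag in target_tags)


# Illustrative segmentation result for one order address.
segments = [
    ("Beijing", "city"),
    ("A Building", "POI"),
    ("Block A", "building"),
    ("6th Floor", "floor"),
]
office_name = extract_poi_name(segments, "office building")  # "A Building Block A"
shop_name = extract_poi_name(segments, "shop")               # "A Building"
```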
For example, in practical applications, the order address may take one of three forms: a hand-selected POI form, a hand-selected POI + handwritten content form, and a handwritten content form. In the hand-selected POI form, the user only selects the address by hand; the order address may then be composed in three ways: POI (i.e., the extracted POI name), POI + unit/floor/room number, or POI + supplemental information (e.g., remark information). In the hand-selected POI + handwritten content form, the user selects part of the address by hand and handwrites the rest; the order address may then be composed in two ways: POI + building + others, or POI + sub-description + others. In the handwritten content form, the user handwrites the entire address; the order address may then be composed in two ways: POI + true appeal, or POI + other information.
In the embodiment of the invention, the POI names corresponding to different address types possibly contain the segmentation corresponding to different attribute tags, so that the corresponding relation between the address types and the attribute tags is set based on actual conditions, the POI names are extracted according to the corresponding relation, the extraction process is simpler and more convenient, and the extraction result is more accurate.
Optionally, when executing step 103, abnormal positioning positions among the first and second positioning positions corresponding to the same POI name may be filtered out, and the POI coordinates corresponding to that POI name may then be determined based on the remaining first and second positioning positions. This may specifically be implemented through S301 to S304:
S301, converting the first positioning position and the second positioning position of each target historical order with the same POI name into geographic position indexes respectively, and generating an association relation between the positioning positions and the geographic position indexes;
the first positioning position and the second positioning position can be longitude and latitude coordinates, so that one longitude and latitude coordinate can be converted into one geographic position index.
A geographic index algorithm can be used to convert any longitude and latitude coordinate into a geographic position index; such algorithms include, but are not limited to, GeoHash, H3 and S2.
A geographic position index encodes longitude and latitude coordinates as a short string of letters and digits, and this string can be used to index and express a coordinate point or area on the map. Nearby points on the map can be converted into geographic position indexes with the same prefix (for example, if nearby positions 1 and 2 have the geographic position indexes abc123 and abc124 respectively, they share the prefix abc12).
Moreover, a geographic position index can represent geographic coordinates at any precision, provided its string is long enough: the higher the precision of the index, the longer its string and the smaller and more precise the geographic area it expresses. When the codes (i.e., strings) of two geographic position indexes are used to judge how close two places are, a longer matching prefix indicates that the two places are closer.
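To make the prefix property concrete, here is a compact sketch of the publicly documented GeoHash encoding, one of the algorithms named above. The function names and the default precision are illustrative, and the patent does not prescribe this particular implementation. Note that two nearby points that fall on opposite sides of a cell boundary can still share only a short prefix, so prefix length is an indication of closeness rather than a guarantee.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # GeoHash alphabet (no a, i, l, o)


def geohash_encode(lat, lng, precision=8):
    """Encode (lat, lng) in degrees as a GeoHash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lng = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits, lng_lo = (bits << 1) | 1, mid
            else:
                bits, lng_hi = bits << 1, mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits, lat_lo = (bits << 1) | 1, mid
            else:
                bits, lat_hi = bits << 1, mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def common_prefix_len(a, b):
    """Number of leading characters two index strings share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

Increasing `precision` lengthens the string and shrinks the cell it denotes, which matches the precision trade-off described above.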
In one example, a GeoHash algorithm may be employed to convert each first location and each second location of each target historical order with the same POI name into a hash index, i.e., encoding the spatial location into a string.
Thus, each first location of each target historical order is converted into a hash index respectively, and each second location of each target historical order is also converted into a hash index respectively.
Thus, an association relationship between the location position and the geographical position index is generated for the same POI name.
Because the positioning positions of different target historical orders (for example, two first positioning positions, two second positioning positions, or a first positioning position and a second positioning position) may be close to each other, different positioning positions may be associated with the same geographic position index; the same positioning position, however, is never associated with different geographic position indexes.
In one example, for 5 target historical orders (thus corresponding to a total of 10 location positions, i.e., coordinates, to be converted) with the same POI name (e.g., "XX building 11") through the above index conversion, the following association is generated:
index 1 associates coordinate 1, coordinate 2, coordinate 3, coordinate 4, coordinate 5;
index 2 associates coordinate 6, coordinate 7, coordinate 8;
index 3 associates coordinate 9 with coordinate 10.
S302, identifying a target geographic position index associated with the most positioning positions in a plurality of geographic position indexes corresponding to the same POI name;
In the above example, the POI name "XX building No. 11" corresponds to the above three indexes, and the target index associated with the largest number of coordinates, that is, index 1, can be identified.
S303, filtering the positioning positions which are not associated with the target geographic position index in the first positioning position and the second positioning position corresponding to the same POI name;
In the above example, the coordinates 6, 7, 8, 9, and 10 that are not associated with index 1 may be filtered out of the 10 coordinates (i.e., coordinates 1 to 10) corresponding to the POI name "XX building No. 11", so that the POI name "XX building No. 11" corresponds only to the coordinates 1, 2, 3, 4, and 5.
S304, determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position which correspond to the same POI name and are subjected to filtering processing.
In the above example, the POI coordinates of the POI name "XX building No. 11" may be determined based on the filtered coordinates 1, 2, 3, 4, and 5 corresponding to the POI name.
The coordinates 6 to 10 are all the filtered abnormal positioning positions.
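Steps S301 to S303 can be sketched as follows. This is only an illustration under stated assumptions: `cell_index` is a hypothetical fixed-grid stand-in for a real geographic position index such as GeoHash, H3 or S2, and the function names and the cell size are not part of the embodiment.

```python
import math
from collections import defaultdict


def cell_index(lat, lon, cell_deg=0.001):
    """Hypothetical stand-in for a geographic position index: a fixed grid cell id."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def filter_abnormal_positions(positions, index_fn=cell_index):
    """Keep only the positions associated with the most populated index."""
    if not positions:
        return []
    # S301: build the association between each index and its positioning positions.
    buckets = defaultdict(list)
    for lat, lon in positions:
        buckets[index_fn(lat, lon)].append((lat, lon))
    # S302: the target index is the one associated with the most positions.
    target = max(buckets, key=lambda idx: len(buckets[idx]))
    # S303: positions not associated with the target index are filtered out.
    return buckets[target]


# Ten illustrative coordinates distributed 5 / 3 / 2 over three grid cells.
coords = (
    [(31.2301 + i * 0.0001, 121.4731 + i * 0.0001) for i in range(5)]
    + [(31.2401 + i * 0.0001, 121.4831) for i in range(3)]
    + [(31.3001, 121.5001), (31.3002, 121.5002)]
)
kept = filter_abnormal_positions(coords)
print(len(kept))
```

The positions returned here would then be the input of S304, for example the clustering described in steps A1 to A3.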
In some application scenarios, a filtered abnormal positioning position may be of either of the following types: the first positioning position of an order placed from a remote location (i.e., the first positioning position at order placement and the order address, or the first positioning position at order placement and the second positioning position at handover, lie in two entirely different geographic locations, in which case the filtered abnormal positioning position is the first positioning position); or the second positioning position of an irregularly handed-over order (i.e., the rider hands over the order far from the order address, in which case the filtered abnormal positioning position is the second positioning position).
In the embodiment of the invention, the first positioning positions and the second positioning positions are each converted into geographic position indexes. Because geographic position indexes make it easy to judge how close geographic positions are to each other, abnormal first positioning positions and/or abnormal second positioning positions can be accurately filtered out based on the indexes. The filtered first positioning positions and second positioning positions corresponding to the same POI name are then used to mine the POI coordinates corresponding to that POI name, which avoids the error that abnormal positioning positions would otherwise introduce and further improves the accuracy of the mined POI coordinates.
Alternatively, in executing step 103 or S304 described above, it may be realized by steps A1, A2, and A3 in this order:
Step A1, performing density clustering on the first positioning position and the second positioning position of the target historical order with the same POI name to obtain at least one first cluster;
The objects of the density clustering may be the first positioning positions and the second positioning positions corresponding to the same POI name either before or after the filtering process.
One historical order has one order address, and the order addresses may be the same or different from one historical order to another. The order address of a historical order may extract a POI name, with a historical order corresponding to a first location and a second location. For the same POI name, it may correspond to multiple historical orders, and thus the same POI name may correspond to multiple first location locations and multiple second location locations.
In an implementation, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) may be used to perform density clustering on the first positioning positions and the second positioning positions (i.e., multiple first positioning positions and multiple second positioning positions) corresponding to the target historical orders with the same POI name.
In one example, as shown in fig. 2, the range in which DBSCAN clustering is performed in this step is range 11, and each dot (except dot 12) in range 11 is a first positioning position or a second positioning position that has the same POI name (e.g. "XX building No. 11") and needs to be clustered after the filtering process (e.g., the three dots 13 outside range 11 in fig. 2 are filtered abnormal positioning positions).
DBSCAN is a density-based clustering algorithm. Unlike partitioning and hierarchical clustering methods, it defines a cluster as a maximal set of density-connected points, so it can partition regions of sufficiently high density into clusters and find clusters of arbitrary shape in a noisy spatial database.
Several of the DBSCAN definitions are as follows:
ε-neighborhood: the region within radius ε of a given object is called the ε-neighborhood of that object.
Core object: if the number of sample points in the ε-neighborhood of a given object is greater than or equal to MinPts, the object is called a core object.
Directly density-reachable: for a sample set D, if sample point q is within the ε-neighborhood of p and p is a core object, then object q is directly density-reachable from object p.
Density-reachable: for a sample set D, given a series of sample points p1, p2, …, pn with p = p1 and q = pn, object q is density-reachable from object p if each object pi is directly density-reachable from pi-1.
Density-connected: if there is a point o in the sample set D such that both object p and object q are density-reachable from o, then p and q are density-connected.
The DBSCAN clustering process is described generally as follows:
for a given neighborhood distance ε and minimum number of neighborhood sample points MinPts:
(1) Traversing all samples, and finding the set of all core objects, i.e., the samples whose ε-neighborhood contains at least MinPts sample points;
(2) Randomly selecting a core object, and finding all samples that are density-reachable from it to generate a cluster;
(3) Removing the density-reachable samples found in (2) from the set of remaining core objects;
(4) Repeating steps (2) - (3) on the updated core object set until every core object has been traversed or removed.
Corresponding to the embodiment of the invention, the first positioning position and the second positioning position of the target historical order with the same POI name form a sample set, wherein one first positioning position is a sample, one second positioning position is also a sample, and the first positioning position and the second positioning position can not be distinguished in the sample set, and are all samples.
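The clustering process described above can be sketched in plain Python as follows. This is a minimal illustration, not the implementation of the embodiment: it visits samples in list order rather than choosing a core object at random (the resulting clusters are the same up to the assignment of border points), it uses planar Euclidean distance via `math.dist` (Python 3.8+), and the names `region_query`, `dbscan`, `eps` and `min_pts` are assumptions. For real longitude and latitude data, a geodesic distance such as the haversine distance would be a more natural choice.

```python
import math

NOISE = -1  # label given to samples that belong to no cluster


def region_query(points, i, eps):
    """Indices of all samples inside the eps-neighborhood of sample i (itself included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Return one cluster label per sample; NOISE marks outliers."""
    labels = [None] * len(points)
    cluster_id = 0
    for i in range(len(points)):
        if labels[i] is not None:
            continue  # already assigned to a cluster or marked as noise
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # not a core object; may later become a border point
            continue
        # Sample i is a core object: grow a new cluster from everything
        # that is density-reachable from it.
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core object
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is also a core object
        cluster_id += 1
    return labels


# Illustrative samples: first and second positioning positions are not
# distinguished, each one is simply a sample.
samples = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10),
           (50, 50)]
print(dbscan(samples, eps=1.5, min_pts=3))
```

The sample labeled `NOISE` plays the role of an abnormal positioning position that belongs to no first cluster.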
Step A2, selecting the first cluster with the largest magnitude, and carrying out K-Means clustering on the selected first cluster to obtain at least one second cluster;
The first cluster with the largest magnitude can be selected from the first clusters obtained after density clustering, where the largest magnitude means the largest number of sample points in the cluster. K-Means clustering is then carried out on this first cluster.
K-Means is a distance-based clustering algorithm. Distance is used as the measure of similarity, that is, the closer two objects are, the more similar they are. The algorithm regards a cluster as a group of objects that are close to each other, and therefore takes compact and well-separated clusters as its final goal.
The K-Means clustering process is approximately as follows:
(1) K samples are randomly selected from all samples as the initial centroids.
(2) For each remaining sample, the distance to each centroid is measured, and the sample is assigned to the class of the nearest centroid.
(3) The centroid of each class obtained is recalculated.
(4) Steps (2) - (3) are iterated until each new centroid is equal to the original centroid or the distance between the new centroid and the original centroid is smaller than a specified threshold, at which point the algorithm ends.
Corresponding to the embodiment of the invention, the first cluster with the largest magnitude is selected to form a sample set, wherein one first positioning position is a sample, and one second positioning position is also a sample.
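The four K-Means steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `k_means`, the optional `initial_centroids` argument (added so that a run can be made reproducible), the tolerance `tol`, the iteration cap `max_iter` and the planar Euclidean distance are all illustrative choices and are not specified by the embodiment.

```python
import math
import random


def k_means(points, k, initial_centroids=None, tol=1e-9, max_iter=100, seed=0):
    """Return (centroids, labels) for `points`, a list of coordinate tuples."""
    # (1) Pick K samples as the initial centroids (randomly unless supplied).
    if initial_centroids is None:
        centroids = random.Random(seed).sample(points, k)
    else:
        centroids = list(initial_centroids)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # (2) Assign every sample to the class of its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # (3) Recompute the centroid of each class.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty class in place
        # (4) Stop once no centroid has moved by more than the threshold.
        shift = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift <= tol:
            break
    return centroids, labels


# Illustrative samples taken from one first cluster.
samples = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(samples, k=2, initial_centroids=[(0, 0), (10, 10)])
print(centroids, labels)
```

How K is chosen is not stated in this part of the description, so the value 2 above is purely illustrative.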
Step A3, selecting the second cluster with the largest magnitude, and taking the centroid of the selected second cluster as the POI coordinate corresponding to the same POI name.
The second cluster with the largest magnitude is selected from the second clusters obtained after K-Means clustering, where the largest magnitude means the largest number of sample points in the cluster. The centroid of this second cluster is taken as the POI coordinate corresponding to the same POI name.
In one example, as shown in FIG. 2, the centroid is the dot 12 within the range 11, i.e., the coordinates of the dot 12 are the POI coordinates corresponding to the POI name (e.g., "XX building 11").
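The selection performed in steps A2 and A3 (pick the cluster with the most sample points, then take its centroid) can be sketched as below. The helper name `largest_cluster_centroid` and the convention that the label -1 marks noise are assumptions for illustration. The centroid is computed as a plain arithmetic mean of the coordinates, which is a reasonable approximation only when the samples span a small area.

```python
from collections import Counter


def largest_cluster_centroid(points, labels, noise_label=-1):
    """Centroid of the cluster that contains the most samples, or None if there is none."""
    counts = Counter(label for label in labels if label != noise_label)
    if not counts:
        return None
    biggest = counts.most_common(1)[0][0]
    members = [p for p, label in zip(points, labels) if label == biggest]
    # Arithmetic mean per axis (e.g. latitude and longitude).
    return tuple(sum(axis) / len(members) for axis in zip(*members))


# Illustrative samples with cluster labels produced by an earlier clustering step.
samples = [(0, 0), (2, 0), (1, 3), (10, 10), (50, 50)]
labels = [0, 0, 0, 1, -1]
print(largest_cluster_centroid(samples, labels))  # (1.0, 1.0)
```

In the terms of fig. 2, the returned tuple would correspond to the dot 12.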
For a POI name, if the POI position corresponding to the POI name is determined only according to the first positioning position and the second positioning position corresponding to a historical order to which the POI name belongs, the accuracy of the obtained POI coordinates may be lower due to inaccurate first positioning position or second positioning position of the historical order. Therefore, in the embodiment of the invention, the first positioning positions and the second positioning positions corresponding to the historical orders with the same POI name can be combined for processing, and the plurality of first positioning positions and the plurality of second positioning positions corresponding to the same POI name are clustered to determine the POI coordinate with higher confidence coefficient, so that inaccurate influence of some positioning positions on the obtained POI coordinate is weakened. In addition, the defect of a single clustering mode can be overcome by combining density clustering and K-Means clustering, and the accuracy of a clustering result is further improved.
Optionally, after step 103, the method according to an embodiment of the present invention may further include:
Step 104, obtaining candidate POI information of POI coordinates to be corrected, wherein the candidate POI information comprises candidate POI names, candidate POI coordinates and candidate POI categories;
In some application scenarios, there is a large amount of high-value POI information, some of which has a coordinate problem (for example, the POI information contains no POI coordinates, or the POI coordinates it contains are inaccurate), so that this existing POI information cannot be used online. In the embodiment of the invention, the coordinates of the candidate POI information with a coordinate problem can be updated, so that the existing POI information can be used by various applications.
Thus, the candidate POI information herein is at least one POI information having a problem of coordinates (i.e., POI coordinates to be corrected).
For the POI information in which the POI coordinates do not exist in the existing POI information, an initial POI coordinate can be generated based on the address of the POI information, wherein the initial POI coordinate is the POI coordinate to be corrected of the POI information.
Step 105, identifying target POI information in the candidate POI information, wherein the candidate POI name is the same as the same POI name, and the distance between the candidate POI coordinates and the POI coordinates corresponding to the same POI name is greater than a second preset threshold, and the second preset threshold is a threshold matched with the candidate POI category in the target POI information;
The candidate POI information is a large amount of existing POI information needing to correct POI coordinates.
The corresponding relation between the POI names and the POI coordinates can be obtained through the steps 101 to 103, wherein the POI names are mined from the historical orders, so that different historical orders can mine different POI names, and the corresponding relation obtained through the step 103 can be multiple groups, for example, the POI name 1 corresponds to the POI coordinates 1; POI name 2 corresponds to POI coordinate 2.
Taking POI name 1 as an example, this step needs to determine which POI information (i.e., target POI information) in the candidate POI information is corrected by using POI coordinate 1.
Specifically, the candidate POI names of the candidate POI information can be compared with POI name 1 (for example, by comparing the text character by character, or by semantic similarity matching), and the target POI information, whose candidate POI name is the same as POI name 1 and whose candidate POI coordinates are farther from POI coordinate 1 than the second preset threshold, is found from the candidate POI information.
The criterion for two POI names being the same may be that the names are completely identical, for example, "A building" and "A building" are the same POI name; or that the names are the same after normalization (unified letter case, unified number format, etc.), for example, "A building, block A" and "a building, block a" are the same POI name.
Therefore, the step can find out the target POI information with the same name as the POI name 1 and with the coordinates far away from the POI coordinates 1 from the candidate POI information.
In addition, the second preset threshold used for the distance comparison is determined as follows: according to a preset correspondence between POI categories and thresholds, the threshold corresponding to the candidate POI category in the target POI information is looked up and used as the second preset threshold for that candidate POI category.
The reason is that buildings of different categories are distributed with different densities. For example, when the address category is a bus station, the distance between different bus stations is generally not more than 1 km, so the threshold corresponding to addresses of the bus station category is 1 km;
for another example, when the address category is a chain supermarket, a chain fast food restaurant, etc., the distance between different branches with the same store name is not more than 2 km, so chain supermarkets and chain fast food restaurants correspond to a threshold of 2 km;
as another example, when the address category is office buildings, then the distance between different office buildings of the same name is at least 10 km, and thus the office buildings correspond to a threshold of 10 km.
And step 106, updating the candidate POI coordinates in the target POI information into POI coordinates corresponding to the same POI name.
Since the candidate POI coordinates in the target POI information are far away from the newly generated POI coordinates 1 with high confidence, it is indicated that the original POI coordinates in the target POI information may have a coordinate error, and therefore, the candidate POI coordinates in the target POI information can be corrected and specifically updated to the POI coordinates 1.
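Steps 104 to 106 can be sketched as follows, assuming exact name matching and a haversine great-circle distance. The threshold values come from the examples above (1 km, 2 km, 10 km); the dictionary keys, the field names `name`, `coord` and `category`, and the function names are hypothetical and introduced only for illustration.

```python
import math

# Category-specific second preset thresholds, in kilometres (values from the examples above).
CATEGORY_THRESHOLD_KM = {
    "bus_station": 1.0,
    "chain_store": 2.0,
    "office_building": 10.0,
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def correct_candidates(candidates, mined_coords):
    """Update candidate coordinates in place; return the names that were corrected."""
    updated = []
    for poi in candidates:
        new_coord = mined_coords.get(poi["name"])            # step 105: same POI name
        threshold = CATEGORY_THRESHOLD_KM.get(poi["category"])
        if new_coord is None or threshold is None:
            continue
        if haversine_km(poi["coord"], new_coord) > threshold:  # step 105: too far apart
            poi["coord"] = new_coord                           # step 106: update
            updated.append(poi["name"])
    return updated


mined = {"Stop A": (31.2300, 121.4700), "Tower B": (31.2300, 121.4700)}
candidates = [
    {"name": "Stop A", "coord": (31.2500, 121.4700), "category": "bus_station"},
    {"name": "Tower B", "coord": (31.2500, 121.4700), "category": "office_building"},
]
print(correct_candidates(candidates, mined))
```

Both candidates lie roughly 2.2 km from the mined coordinates, so only the bus station (threshold 1 km) is corrected, while the office building (threshold 10 km) is left unchanged.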
In the embodiment of the invention, after the POI coordinates corresponding to a certain POI name are mined, candidate POI information with the same name can be found from the existing candidate POI information by comparing POI names, and the distance between the candidate POI coordinates in that candidate POI information and the newly mined POI coordinates is evaluated. If the distance is large, the original POI coordinates in the candidate POI information with the same name may be wrong, and the more accurate POI coordinates determined by the embodiment of the invention can be used to update them, so that the existing POI information no longer has a coordinate problem and a large amount of high-value POI information that previously lacked usable coordinates can be used online once its POI coordinates are corrected. In addition, when evaluating whether the POI coordinates in the original POI information are wrong, the distance between the two POI coordinates is evaluated against the second preset threshold corresponding to the POI category in the target POI information, so that the criterion for judging whether the POI coordinates are wrong is more reasonable and the POI information with wrong coordinates can be accurately located.
The present embodiment discloses a data processing apparatus, as shown in fig. 3, including:
the first determining module 31 is configured to determine, based on order information of each of the plurality of historical orders, a POI name corresponding to each of the historical orders, and a third location to which each of the historical orders belongs, respectively;
the order information comprises a first positioning position when an order is placed and a second positioning position when the order is handed over;
a first identifying module 32, configured to identify, from the plurality of historical orders, a target historical order that has the same POI name and the third positioning location meets a preset condition, where the third positioning location meets the preset condition and includes a distance between third positioning locations of any two of the target historical orders being less than a first preset threshold;
a second determining module 33, configured to determine POI coordinates corresponding to the same POI name based on the first positioning location and the second positioning location of the target historical order with the same POI name.
Optionally, the order information further includes an order address, and the first determining module 31 includes:
the first determining submodule is used for respectively determining the POI name corresponding to each historical order based on the order address of each historical order in the plurality of historical orders;
And the second determining submodule is used for determining a third positioning position of each historical order based on the first positioning position and the second positioning position of each historical order.
Optionally, the second determining module 33 includes:
the conversion sub-module is used for respectively converting the first positioning position and the second positioning position of each target historical order with the same POI name into geographic position indexes and generating an association relation between the positioning positions and the geographic position indexes;
the first identification sub-module is used for identifying a target geographic position index associated with the most positioning positions in a plurality of geographic position indexes corresponding to the same POI name;
the filtering sub-module is used for filtering the positioning positions which are not associated with the target geographic position index in the first positioning position and the second positioning position corresponding to the same POI name;
and the third determining submodule is used for determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position which correspond to the same POI name and are subjected to filtering processing.
Optionally, the second determining module 33 includes:
the first clustering sub-module is used for carrying out density clustering on the first positioning position and the second positioning position of the target historical order with the same POI name to obtain at least one first clustering cluster;
the second clustering sub-module is used for selecting a first cluster with the largest magnitude, and carrying out K-Means clustering on the selected first cluster to obtain at least one second cluster;
and the second recognition sub-module is used for selecting a second cluster with the largest magnitude, and taking the centroid of the selected second cluster as the POI coordinate corresponding to the same POI name.
Optionally, the apparatus further comprises:
the acquisition module is used for acquiring candidate POI information of the POI coordinates to be corrected, wherein the candidate POI information comprises candidate POI names, candidate POI coordinates and candidate POI categories;
the second identifying module is used for identifying target POI information, in which the candidate POI name is the same as the same POI name, and the distance between the candidate POI coordinates and the POI coordinates corresponding to the same POI name is greater than a second preset threshold, wherein the second preset threshold is a threshold matched with the candidate POI category in the target POI information;
And the updating module is used for updating the candidate POI coordinates in the target POI information into POI coordinates corresponding to the same POI name.
Optionally, the first determining submodule is configured to perform word segmentation on the order address of each of the plurality of historical orders to obtain a word segmentation result, and extract a POI name corresponding to each historical order from the word segmentation result.
Optionally, the first determining submodule includes:
the input unit is used for inputting the order address into a pre-trained word segmentation model to obtain each word segmentation output by the word segmentation model and attribute tags of each word segmentation, and taking each word segmentation and the attribute tags of each word segmentation as the word segmentation result;
the word segmentation model is trained according to a plurality of first sample data, wherein the first sample data comprises sample word segmentation and labeling attribute labels of the sample word segmentation.
Optionally, the first determining submodule includes:
the inquiry unit is used for inquiring the target attribute label corresponding to the hand-selected address type of each historical order from the corresponding relation between the preset address type and the attribute label;
and the determining unit is used for extracting the segmented words with the attribute labels being the target attribute labels from the word segmentation result, and taking the extracted segmented words as the POI names corresponding to each historical order.
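The query unit and the determining unit described above can be sketched as a lookup followed by a filter. The attribute tags (`CITY`, `BUILDING`, `COMMUNITY`, `FLOOR`), the address types and the mapping between them are hypothetical placeholders, since the actual tag set of the word segmentation model is not given in this part of the description; the word segmentation result is assumed to be a list of (word, attribute tag) pairs.

```python
# Hypothetical correspondence between address types and attribute tags.
ADDRESS_TYPE_TO_TAG = {
    "office": "BUILDING",
    "residential": "COMMUNITY",
}


def extract_poi_name(segments, address_type, mapping=ADDRESS_TYPE_TO_TAG):
    """Return the segmented words whose tag matches the address type, or None."""
    target_tag = mapping.get(address_type)  # query unit: look up the target tag
    if target_tag is None:
        return None
    # Determining unit: keep the words carrying the target attribute tag.
    words = [word for word, tag in segments if tag == target_tag]
    return " ".join(words) if words else None


# Illustrative word segmentation result for one order address.
segments = [("Shanghai", "CITY"), ("XX Building", "BUILDING"), ("Floor 3", "FLOOR")]
print(extract_poi_name(segments, "office"))  # XX Building
```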
The data processing device disclosed in the embodiments of the present invention is configured to implement each step of the data processing method described in each of the foregoing embodiments of the present invention, and specific implementation manners of each module of the device refer to corresponding steps, which are not described herein again.
With the data processing device disclosed in the embodiments of the present invention, target historical orders that have the same POI name and in which the third positioning positions of any two orders are relatively close to each other are determined from the historical orders, so that the determined target historical orders point to the same POI; the POI coordinates of that POI are then determined using the first positioning positions at order placement and the second positioning positions at order handover of the plurality of target historical orders. In addition, because historical orders cover a wide geographic area, the POI coordinates of any location for which historical orders exist can be mined by means of the technical solution provided by the embodiments of the present invention, which further improves the mining coverage of POI coordinates. Furthermore, when mining POI coordinates, the method of the embodiments of the present invention combines the first positioning positions and the second positioning positions of a plurality of target historical orders with the same POI name to mine the POI coordinates for that POI name. Even if two geographic positions are close to each other, as long as their POI names differ they are not merged into the same POI coordinates; instead, different POI coordinates can be mined for them based on their POI names, which reduces the limitations of the mined POI coordinates. Finally, because order information is timely, POI coordinates can be mined more promptly based on the order information of historical orders.
Correspondingly, the invention also discloses electronic equipment, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the data processing method according to any one of the embodiments of the invention when executing the computer program. The electronic device may be a PC, a mobile terminal, a personal digital assistant, a tablet computer, etc.
The invention also discloses a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of the data processing method according to any of the above embodiments of the invention.
In this specification, the embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and for the parts that are identical or similar, the embodiments may be referred to one another. Because the device embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant points can be found in the description of the method embodiments.
The foregoing has described in detail a data processing method and apparatus provided by the present invention, and specific examples have been used herein to illustrate the principles and embodiments of the present invention; the above examples are provided only to assist in understanding the method and core idea of the present invention. Meanwhile, those skilled in the art may, in accordance with the ideas of the present invention, make changes to the specific embodiments and the scope of application; in view of the above, the content of this description should not be construed as limiting the present invention.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
Various component embodiments of the present disclosure may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that some or all of the functions of some or all of the components in a computing processing device according to embodiments of the present disclosure may be implemented in practice using a microprocessor or Digital Signal Processor (DSP). The present disclosure may also be embodied as a device or apparatus program (e.g., computer program and computer program product) for performing a portion or all of the methods described herein. Such a program embodying the present disclosure may be stored on a computer readable medium, or may have the form of one or more signals. Such signals may be downloaded from an internet website, provided on a carrier signal, or provided in any other form.
For example, FIG. 4 illustrates a computing processing device that may implement methods according to the present disclosure. The computing processing device conventionally includes a processor 1010 and a computer program product in the form of a memory 1020 or a computer readable medium. The memory 1020 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Memory 1020 has storage space 1030 for program code 1031 for performing any of the method steps described above. For example, the storage space 1030 for the program code may include respective program code 1031 for implementing the various steps in the above method, respectively. The program code can be read from or written to one or more computer program products. These computer program products comprise a program code carrier such as a hard disk, a Compact Disc (CD), a memory card or a floppy disk. Such a computer program product is typically a portable or fixed storage unit as described with reference to fig. 5. The memory unit may have memory segments, memory spaces, etc. arranged similarly to the memory 1020 in the computing processing device of fig. 4. The program code may be compressed, for example, in a suitable form. In general, the storage unit includes computer readable code 1031', i.e., code that can be read by a processor such as 1010, for example, which when executed by a computing processing device causes the computing processing device to perform the steps in the method described above.
Reference herein to "one embodiment," "an embodiment," or "one or more embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Furthermore, it is noted that instances of the phrase "in one embodiment" herein do not necessarily all refer to the same embodiment.
In the description provided herein, numerous specific details are set forth. However, it is understood that embodiments of the disclosure may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The disclosure may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In the unit claims enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, etc. does not denote any order. These words may be interpreted as names.
Finally, it should be noted that the above embodiments merely illustrate the technical solutions of the present disclosure and do not limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general hardware platform, or may be implemented by hardware. Based on this understanding, the foregoing technical solution, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., and which includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.
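As one illustration of such a software implementation, the geographic-index filtering recited in claim 3 can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation of the disclosure: the geographic position index is a simple latitude/longitude grid cell (a stand-in for an index such as a geohash), the cell size of 0.001 degrees is arbitrary, the final coordinate is taken as the mean of the retained positions (the claim does not say how it is derived), and the names cell_index and estimate_poi_coordinate are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees


def cell_index(pos: Position, cell_deg: float = 0.001) -> Tuple[int, int]:
    """Map a positioning position to a grid-cell index.

    Assumption: a plain lat/lon grid stands in for the geographic
    position index; the claim does not name a specific index scheme.
    """
    lat, lon = pos
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def estimate_poi_coordinate(
    positions: List[Position], cell_deg: float = 0.001
) -> Tuple[Position, List[Position]]:
    """Filter positions by the most populated cell, then average them.

    `positions` holds the first and second positioning positions of all
    target historical orders that share one POI name.
    """
    if not positions:
        raise ValueError("at least one positioning position is required")

    # Association between each positioning position and its index.
    index_of = [cell_index(p, cell_deg) for p in positions]

    # Target index: the one associated with the most positions.
    target, _count = Counter(index_of).most_common(1)[0]

    # Filter out positions not associated with the target index.
    kept = [p for p, idx in zip(positions, index_of) if idx == target]

    # Assumption: the POI coordinate is the mean of the kept positions.
    lat = sum(p[0] for p in kept) / len(kept)
    lon = sum(p[1] for p in kept) / len(kept)
    return (lat, lon), kept
```

Calling estimate_poi_coordinate with three positions that fall in one cell and one distant outlier returns the mean of the three and discards the outlier. A real index would also need to handle positions that straddle a cell boundary, which this sketch ignores.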

Claims (10)

1. A method of data processing, comprising:
based on order information of each historical order in a plurality of historical orders, determining a point of interest (POI) name corresponding to each historical order and a third positioning position to which each historical order belongs;
the order information comprises a first positioning position when an order is placed and a second positioning position when the order is handed over;
identifying target historical orders with the same POI names and the third positioning positions meeting preset conditions from the historical orders, wherein the third positioning positions meeting the preset conditions comprise that the distance between the third positioning positions of any two target historical orders is smaller than a first preset threshold;
and determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name.
2. The method of claim 1, wherein the order information further includes an order address, and the determining, based on the order information of each historical order in the plurality of historical orders, of the POI name corresponding to each historical order and the third positioning position to which each historical order belongs comprises:
based on the order address of each historical order in the plurality of historical orders, respectively determining a POI name corresponding to each historical order;
and determining a third positioning position of each historical order based on the first positioning position and the second positioning position of each historical order.
3. The method of claim 1, wherein the determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name comprises:
converting the first positioning position and the second positioning position of each target historical order with the same POI name into geographic position indexes respectively, and generating an association relation between the positioning positions and the geographic position indexes;
identifying, among a plurality of geographic position indexes corresponding to the same POI name, a target geographic position index associated with the largest number of positioning positions;
filtering out the positioning positions which are not associated with the target geographic position index from the first positioning position and the second positioning position corresponding to the same POI name;
and determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position which correspond to the same POI name and are subjected to filtering processing.
4. The method of claim 1, wherein the determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name comprises:
performing density clustering on the first positioning position and the second positioning position of the target historical order with the same POI name to obtain at least one first cluster;
selecting a first cluster with the largest magnitude, and carrying out K-Means clustering on the selected first cluster to obtain at least one second cluster;
and selecting a second cluster with the largest magnitude, and taking the centroid of the selected second cluster as the POI coordinate corresponding to the same POI name.
5. The method of claim 1, wherein after determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name, the method further comprises:
obtaining candidate POI information of POI coordinates to be corrected, wherein the candidate POI information comprises candidate POI names, candidate POI coordinates and candidate POI categories;
identifying target POI information, in which the candidate POI names are the same as the same POI names, and the distance between the candidate POI coordinates and the POI coordinates corresponding to the same POI names is greater than a second preset threshold, wherein the second preset threshold is a threshold matched with the candidate POI category in the target POI information;
and updating the candidate POI coordinates in the target POI information into POI coordinates corresponding to the same POI name.
6. A data processing apparatus, comprising:
the first determining module is used for determining, based on the order information of each historical order in a plurality of historical orders, a point of interest (POI) name corresponding to each historical order and a third positioning position to which each historical order belongs;
the order information comprises a first positioning position when an order is placed and a second positioning position when the order is handed over;
the first identifying module is used for identifying target historical orders with the same POI names and the third positioning positions meeting the preset conditions from the plurality of historical orders, wherein the third positioning positions meeting the preset conditions comprise that the distance between the third positioning positions of any two target historical orders is smaller than a first preset threshold;
and the second determining module is used for determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position of the target historical order with the same POI name.
7. The apparatus of claim 6, wherein the order information further comprises an order address, the first determination module comprising:
the first determining submodule is used for respectively determining the POI name corresponding to each historical order based on the order address of each historical order in the plurality of historical orders;
and the second determining submodule is used for determining a third positioning position of each historical order based on the first positioning position and the second positioning position of each historical order.
8. The apparatus of claim 6, wherein the second determining module comprises:
the conversion sub-module is used for respectively converting the first positioning position and the second positioning position of each target historical order with the same POI name into geographic position indexes and generating an association relation between the positioning positions and the geographic position indexes;
the first identification sub-module is used for identifying a target geographic position index associated with the most positioning positions in a plurality of geographic position indexes corresponding to the same POI name;
the filtering sub-module is used for filtering out the positioning positions which are not associated with the target geographic position index from the first positioning position and the second positioning position corresponding to the same POI name;
and the third determining submodule is used for determining POI coordinates corresponding to the same POI name based on the first positioning position and the second positioning position which correspond to the same POI name and are subjected to filtering processing.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the data processing method of any of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the data processing method of any one of claims 1 to 5.
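The two-stage clustering of claim 4 can be sketched as follows. This is a hedged illustration rather than the claimed implementation. It assumes the positioning positions have already been projected to planar (x, y) coordinates in metres, uses a DBSCAN-style procedure as the density clustering (the claim names no specific algorithm), reads "largest magnitude" as "most members", and seeds K-Means deterministically with the first k points. The parameters eps, min_pts and k and the function names dbscan, kmeans and poi_coordinate are all hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]  # planar (x, y) in metres (assumed projection)


def dbscan(points: Sequence[Point], eps: float, min_pts: int) -> List[List[int]]:
    """Density clustering; returns clusters as lists of point indices."""
    labels: List[Optional[int]] = [None] * len(points)  # None=unvisited, -1=noise

    def neighbours(i: int) -> List[int]:
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    clusters: List[List[int]] = []
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cid = len(clusters)
        labels[i] = cid
        members = [i]
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:  # noise reached from a core point: border point
                labels[j] = cid
                members.append(j)
                continue
            if labels[j] is not None:  # already assigned to a cluster
                continue
            labels[j] = cid
            members.append(j)
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nbrs)
        clusters.append(members)
    return clusters


def kmeans(points: Sequence[Point], k: int, iters: int = 20):
    """Lloyd's algorithm with deterministic seeding (first k points)."""
    centroids = list(points[:k])
    groups: List[List[Point]] = [[] for _ in centroids]
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            groups[nearest].append(p)
        new = [
            (sum(p[0] for p in g) / len(g), sum(p[1] for p in g) / len(g))
            if g else centroids[c]
            for c, g in enumerate(groups)
        ]
        if new == centroids:  # converged
            break
        centroids = new
    return centroids, groups


def poi_coordinate(points: Sequence[Point], eps: float = 30.0,
                   min_pts: int = 3, k: int = 2) -> Optional[Point]:
    """Largest density cluster -> K-Means -> centroid of largest sub-cluster."""
    clusters = dbscan(points, eps, min_pts)
    if not clusters:
        return None  # every point was treated as noise
    first = [points[i] for i in max(clusters, key=len)]
    centroids, groups = kmeans(first, k)
    best = max(range(len(groups)), key=lambda c: len(groups[c]))
    return centroids[best]
```

The density stage discards isolated fixes and small secondary groups, and the K-Means stage then picks the densest part of the surviving cluster, so a few drifted positions near the POI do not pull the final centroid. On real latitude/longitude data a geodesic distance or a local projection would replace math.dist.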

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010615622.0A CN111931077B (en) 2020-06-30 2020-06-30 Data processing method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010615622.0A CN111931077B (en) 2020-06-30 2020-06-30 Data processing method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111931077A (en) 2020-11-13
CN111931077B (en) 2023-12-12

Family

ID=73316877

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010615622.0A Active CN111931077B (en) 2020-06-30 2020-06-30 Data processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111931077B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112837128B (en) * 2021-02-19 2023-04-28 拉扎斯网络科技(上海)有限公司 Order assignment method, order assignment device, computer equipment and computer readable storage medium
CN113836252B (en) * 2021-09-17 2023-09-26 北京京东振世信息技术有限公司 Method and device for determining geographic coordinates
CN114089101A (en) * 2021-11-11 2022-02-25 广东电网有限责任公司广州供电局 Low-voltage power grid fault transformer area judgment method and device
CN116383326B (en) * 2023-04-06 2024-07-30 腾讯科技(深圳)有限公司 Method and device for updating position information of interest point data and computer equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102015207047A1 (en) * 2014-04-25 2015-10-29 Xerox Corporation Method and system automated sequencing of vehicles in side-by-side transit configurations via image-based classification
CN110335115A (en) * 2019-07-01 2019-10-15 阿里巴巴集团控股有限公司 A kind of service order processing method and processing device
CN110503353A (en) * 2018-05-16 2019-11-26 北京三快在线科技有限公司 A kind of dispatching Zonal expression method and device
WO2019228391A1 (en) * 2018-05-31 2019-12-05 Beijing Didi Infinity Technology And Development Co., Ltd. Systems and methods for online to offline services

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
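A collaborative ranking POI recommendation model fusing geographical and textual information based on the local low-rank assumption of the rating matrix; Sun Lin; Luo Baoshan; Gao Rong; Application Research of Computers (10); full text *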

Also Published As

Publication number Publication date
CN111931077A (en) 2020-11-13

Similar Documents

Publication Publication Date Title
CN111931077B (en) Data processing method, device, electronic equipment and storage medium
EP3971731B1 (en) Fence address-based coordinate data processing method and apparatus, and computer device
CN109145169B (en) Address matching method based on statistical word segmentation
US11698261B2 (en) Method, apparatus, computer device and storage medium for determining POI alias
CN108304423B (en) Information identification method and device
CN108628811B (en) Address text matching method and device
CN109145281B (en) Speech recognition method, apparatus and storage medium
CN111382212B (en) Associated address acquisition method and device, electronic equipment and storage medium
CN104080054B (en) A kind of acquisition methods and device of exception point of interest
CN111782741A (en) Interest point mining method and device, electronic equipment and storage medium
CN102693266A (en) Method of searching a data base, navigation device and method of generating an index structure
CN110598917B (en) Destination prediction method, system and storage medium based on path track
CN107203526A (en) A kind of query string semantic requirement analysis method and device
CN111141301A (en) Navigation end point determining method, device, storage medium and computer equipment
CN110674208B (en) Method and device for determining position information of user
CN111190988A (en) Address resolution method, device, equipment and computer readable storage medium
CN116414823A (en) Address positioning method and device based on word segmentation model
CN111382138A (en) POI data processing method, device, equipment and medium
CN114036414A (en) Method and device for processing interest points, electronic equipment, medium and program product
Xi et al. Improved dynamic time warping algorithm for bus route trajectory curve fitting
CN115525841A (en) Method for acquiring point of interest information, electronic device and storage medium
CN110609874B (en) Address entity coreference resolution method based on density clustering algorithm
CN111858787B (en) POI information acquisition method and device
CN111597277B (en) Site aggregation method, device, computer equipment and medium in electronic map
CN111412925B (en) POI position error correction method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant