CN116956218A - Multi-source building data fusion method and device - Google Patents

Info

Publication number
CN116956218A
Authority
CN
China
Prior art keywords
compared
building data
building
data
feature similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310906869.1A
Other languages
Chinese (zh)
Inventor
赵京
赵友标
邢佳敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai 100me Network Technology Co ltd
Original Assignee
Shanghai 100me Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai 100me Network Technology Co ltd
Priority to CN202310906869.1A
Publication of CN116956218A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/251 Fusion techniques of input or preprocessed data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/083 Shipping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Economics (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Development Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Quality & Reliability (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A multi-source building data fusion method and device. The method includes: for any order, establishing a correspondence between the order and an object to be compared, where the object to be compared comprises building data and the delivery tracks of the orders that correspond to the object, and the building data is the delivery building indicated by the delivery address of the order; determining the text feature similarity between first building data of a first object to be compared and second building data of a second object to be compared; determining the spatial feature similarity between a first delivery track of the first object to be compared and a second delivery track of the second object to be compared, where the first object to be compared and the second object to be compared are any two objects to be compared; and, if the text feature similarity and the spatial feature similarity meet the data fusion requirement, fusing the first object to be compared and the second object to be compared into the same object to be compared.

Description

Multi-source building data fusion method and device
Technical Field
The application relates to the field of data processing, in particular to a multi-source building data fusion method and device.
Background
Instant delivery provides a to-the-door logistics service and therefore requires building data that is as detailed as possible. Building data comes from various sources: it may be obtained from map service providers, mined from rider behavior data and user-filled data, or actively labeled by crowdsourcing service providers under some incentive scheme.
However, multi-source building data may have the following problems. First, the building data provided by map service providers may not be comprehensive enough, and much of it, especially text information such as street names and building numbers, fails to keep up with actual changes. Second, mining building data from rider delivery behavior data and user-filled data mainly identifies the locations of buildings in a cell from the rider's position when the rider taps to confirm delivery and from the text information filled in by the user, but this mined information is easily affected by user input errors or abnormal tapping behavior of the rider. Third, crowdsourcing services are generally implemented by having crowdsourcing personnel upload the longitude and latitude of building locations. However, the equipment they use to collect building locations may be affected by multipath interference caused by dense buildings, and the collection itself may not follow a consistent standard.
Therefore, a scheme is needed at present for performing mutual check and fusion on building data from different sources, so as to obtain richer and more accurate building data.
Disclosure of Invention
The application provides a multi-source building data fusion method and device, which are used for carrying out mutual verification and fusion on building data of different sources, so as to obtain richer and more accurate building data.
In a first aspect, the present application provides a multi-source building data fusion method, the method comprising: establishing a corresponding relation between an order and an object to be compared aiming at any order; the object to be compared comprises building data and distribution tracks of orders with corresponding relation with the object to be compared; the building data is a distribution building indicated by the distribution address of the order; determining the text feature similarity of first building data of a first object to be compared and second building data of a second object to be compared; determining the spatial feature similarity of a first distribution track of the first object to be compared and a second distribution track of the second object to be compared; the first object to be compared and the second object to be compared are any two objects to be compared; and if the text feature similarity and the space feature similarity meet the data fusion requirement, fusing the first object to be compared and the second object to be compared into the same object to be compared.
In the above technical scheme, for the service scenario of instant delivery to the home, the accuracy of building positions directly affects delivery efficiency and service quality. Building data comes from various sources, and data from different sources may be inaccurate, duplicated, or low in coverage. By comparing the text feature similarity and the spatial feature similarity between building data, the building data that correspond to the same real-world building are identified and fused, so that richer and more accurate building data can be obtained. This improves the efficiency with which a rider finds the corresponding building, and allows the system to more accurately estimate the expected delivery time, schedule orders, and detect behaviors such as marking an order as delivered in advance.
In one possible design, the text feature similarity and the spatial feature similarity satisfy a data fusion requirement, comprising: the text feature similarity is greater than a first set threshold and the spatial feature similarity is greater than a second set threshold, so that the data fusion requirement is met; the method further comprises: if the text feature similarity is greater than the first set threshold and the spatial feature similarity is not greater than the second set threshold, determining whether the cell to which the first building data belongs and the cell to which the second building data belongs are the same cell; if they are different cells, determining that the first building data and the second building data are abnormal data.
In the above technical scheme, when both the text feature similarity and the spatial feature similarity of the two building data are high, the two building data correspond to the same real-world building and the data fusion requirement is met. When the text feature similarity of the two building data is high but the spatial feature similarity is low, the two buildings need further analysis: whether the two building data are abnormal is judged according to whether the cells to which they belong are the same. That is, when only one feature is similar, the building data are treated as suspected abnormal buildings and analyzed specifically, which improves the accuracy of abnormal building identification.
In one possible design, the method further comprises: if the text feature similarity is not greater than the first set threshold and the spatial feature similarity is greater than the second set threshold, comparing the text feature similarity between the first building data and each building data in the cell to which the first building data belongs with a third set threshold; if the text feature similarity between the first building data and each building data in the cell to which the first building data belongs is not greater than the third set threshold, determining that the first building data is abnormal data.
In the above technical solution, when the spatial feature similarity of two building data is high but the text feature similarity is low, the two buildings need further analysis: each of the two building data is compared with the text features of the other buildings in the cell to which it belongs, and if it is dissimilar to all of them, it is considered an abnormal building. Analyzing suspected abnormal buildings specifically in this way improves the accuracy of abnormal building identification.
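The three threshold cases described in the designs above can be sketched as a small decision function. This is only an illustrative sketch: the names `Outcome`, `decide`, and `is_text_outlier`, and the default threshold values, are assumptions and do not appear in the application. The case "text similar, spatially dissimilar, same cell" is not specified in the designs above, so the sketch simply takes no action there.

```python
from enum import Enum


class Outcome(Enum):
    FUSE = "fuse"                              # both similarities exceed their thresholds
    ABNORMAL_PAIR = "abnormal_pair"            # text similar, tracks differ, different cells
    CHECK_TEXT_IN_CELL = "check_text_in_cell"  # tracks similar, text differs
    NO_ACTION = "no_action"                    # nothing prescribed by the designs above


def decide(text_sim, spatial_sim, same_cell, t1=0.8, t2=0.5):
    """Classify a pair of objects to be compared.

    t1 and t2 stand for the first and second set thresholds; the
    default values are placeholders, not values from the application.
    """
    text_high = text_sim > t1
    spatial_high = spatial_sim > t2
    if text_high and spatial_high:
        return Outcome.FUSE
    if text_high:
        # Spatial similarity is not greater than t2: compare the cells.
        return Outcome.NO_ACTION if same_cell else Outcome.ABNORMAL_PAIR
    if spatial_high:
        return Outcome.CHECK_TEXT_IN_CELL
    return Outcome.NO_ACTION


def is_text_outlier(sims_to_cell_buildings, t3=0.6):
    """True if no building in the same cell is textually similar (third threshold).

    Note: this is vacuously true for an empty list; callers may want
    to treat a cell without other buildings separately.
    """
    return all(sim <= t3 for sim in sims_to_cell_buildings)
```

When `decide` returns `CHECK_TEXT_IN_CELL`, the caller would compute the text similarity of the building data against every building in its cell and pass those values to `is_text_outlier`.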
In one possible design, determining the spatial feature similarity of the first delivery track of the first object to be compared and the second delivery track of the second object to be compared includes: acquiring a first concave hull formed by the first delivery track of the first object to be compared and a second concave hull formed by the second delivery track of the second object to be compared; and determining the spatial feature similarity of the first delivery track and the second delivery track according to the overlap area of the first concave hull and the second concave hull.
In the above technical scheme, in addition to the text feature similarity of the building data, the spatial features of the building data are obtained from the delivery tracks before and after the rider's arrival, and the spatial feature similarity of the delivery tracks is determined by comparing the overlap area of the concave hulls of the delivery tracks. Because the spatial features are also considered when comparing building data, similar building data can be identified more accurately.
In one possible design, the first object to be compared is the object to be compared corresponding to any newly added order in a set time period, and the second object to be compared is the object to be compared corresponding to any building in the building database; or the first object to be compared and the second object to be compared are the objects to be compared corresponding to any two buildings in the building database. The delivery track of any building in the building database is obtained by fusing the delivery tracks of the orders that correspond to the building; the delivery track corresponding to any newly added order is obtained from the track points of the order within a set time before and after the order is delivered.
According to the technical scheme, the building data corresponding to the newly added order are compared and fused with the building data in the building database, and the building data in the building database are also compared and fused at regular time, so that the building data stored in the building database are richer and more accurate, and the efficiency of a rider for searching the corresponding building is improved.
In one possible design, the fusing the first object to be compared and the second object to be compared into the same object to be compared includes: combining the first building data with the second building data to obtain third building data; combining the first distribution track with the second distribution track to obtain a third distribution track; and updating the building database according to the third building data and the third distribution track.
According to the technical scheme, building data corresponding to the same building in reality are fused, so that richer and more accurate building data can be obtained. And when building data are fused, not only text data are fused, but also delivery tracks are combined, so that the fused building delivery tracks are more accurate. And further, the efficiency of searching the corresponding building by the rider according to the distribution track is improved.
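As a rough illustration of the fusion step, the sketch below merges the text data and the track points of two objects to be compared and writes the result back to an in-memory database. The `ComparedObject` class, its fields, and the choice to keep the first object's identifier are assumptions made for the example; the application does not prescribe a data layout.

```python
from dataclasses import dataclass


@dataclass
class ComparedObject:
    building_id: str
    names: set    # all known textual names of the building
    track: list   # (lon, lat) track points of the associated orders


def fuse(first, second):
    """Combine building text and delivery tracks into a third object."""
    # dict.fromkeys removes duplicate points while preserving order.
    merged_track = list(dict.fromkeys(first.track + second.track))
    return ComparedObject(first.building_id, first.names | second.names, merged_track)


def update_database(db, first, second):
    """Replace the two fused entries in `db` (a dict keyed by id) with the merged one."""
    third = fuse(first, second)
    db.pop(second.building_id, None)
    db[third.building_id] = third
    return third
```

Keeping every alternative name in `names` means later text comparisons can match against any of the names the building has been known by.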
In one possible design, the establishing the correspondence between the order and the object to be compared includes: acquiring address information in an order; the address information is selected from building data recommended by a system by a user, and a corresponding relation between the order and the building data selected by the user is established; or the address information is filled in by a user, the address information filled in by the user is matched with building data in the building database, and if the matched building data exist, the order is associated with the matched building data; otherwise, building data are newly added, and a corresponding relation between the order and the building data is established.
According to the technical scheme, the order and the building data are associated through two modes of user selection or address matching, so that the space feature similarity of the distribution tracks of different building data can be compared conveniently according to the distribution tracks of the order.
In a second aspect, an embodiment of the present application provides a multi-source building data fusion apparatus, including:
the association module is used for establishing a corresponding relation between an order and an object to be compared aiming at any order; the object to be compared comprises building data and distribution tracks of orders with corresponding relation with the object to be compared; the building data is a distribution building indicated by the distribution address of the order;
the comparison module is used for determining the text feature similarity of the first building data of the first object to be compared and the second building data of the second object to be compared; determining the spatial feature similarity of a first distribution track of the first object to be compared and a second distribution track of the second object to be compared; the first object to be compared and the second object to be compared are any two objects to be compared;
and the fusion module is used for fusing the first object to be compared and the second object to be compared into the same object to be compared if the text feature similarity and the space feature similarity meet the data fusion requirement.
In one possible design, the text feature similarity and the spatial feature similarity satisfy a data fusion requirement, comprising: the text feature similarity is greater than a first set threshold and the spatial feature similarity is greater than a second set threshold, so that the data fusion requirement is met; the comparison module is further configured to determine whether the cell to which the first building data belongs and the cell to which the second building data belongs are the same cell if the text feature similarity is greater than the first set threshold and the spatial feature similarity is not greater than the second set threshold; if they are different cells, determining that the first building data and the second building data are abnormal data.
In one possible design, the comparing module is further configured to compare the magnitude relation between the text feature similarity of the first building data and each building data in the cell to which the first building data belongs and a third set threshold if the text feature similarity is not greater than the first set threshold and the spatial feature similarity is greater than the second set threshold; if the text feature similarity between the first building data and each building data in the cell to which the first building data belongs is not greater than the third set threshold, determining that the first building data is abnormal data.
In one possible design, the comparing module is further configured to obtain a first concave hull formed by the first delivery track of the first object to be compared and a second concave hull formed by the second delivery track of the second object to be compared; and determine the spatial feature similarity of the first delivery track and the second delivery track according to the overlap area of the first concave hull and the second concave hull.
In one possible design, the first object to be compared is the object to be compared corresponding to any newly added order in a set time period, and the second object to be compared is the object to be compared corresponding to any building in the building database; or the first object to be compared and the second object to be compared are the objects to be compared corresponding to any two buildings in the building database. The delivery track of any building in the building database is obtained by fusing the delivery tracks of the orders that correspond to the building; the delivery track corresponding to any newly added order is obtained from the track points of the order within a set time before and after the order is delivered.
In one possible design, the fusion module is further configured to combine the first building data with the second building data to obtain third building data; combining the first distribution track with the second distribution track to obtain a third distribution track; and updating the building database according to the third building data and the third distribution track.
In one possible design, the association module is further configured to obtain address information in the order; the address information is selected from building data recommended by a system by a user, and a corresponding relation between the order and the building data selected by the user is established; or the address information is filled in by a user, the address information filled in by the user is matched with building data in the building database, and if the matched building data exist, the order is associated with the matched building data; otherwise, building data are newly added, and a corresponding relation between the order and the building data is established.
In a third aspect, embodiments of the present application also provide a computing device, comprising:
a memory for storing program instructions;
a processor for invoking program instructions stored in said memory and executing the method as described in any of the possible designs of the first aspect in accordance with the obtained program instructions.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium, in which computer-readable instructions are stored, which, when read and executed by a computer, cause the method described in any one of the possible designs of the first aspect to be implemented.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a multi-source building data fusion method according to an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a multi-source building data fusion device according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a computing device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
In embodiments of the present application, "a plurality" refers to two or more. The words "first," "second," and the like are used merely to distinguish between the items described and are not to be construed as indicating or implying relative importance or order.
Fig. 1 schematically shows a flow chart of a multi-source building data fusion method according to an embodiment of the present application, as shown in fig. 1, the method includes the following steps:
Step 101, establishing a corresponding relation between an order and an object to be compared for any order.
In the embodiment of the application, the object to be compared comprises building data and a distribution track of an order with a corresponding relation with the object to be compared, wherein the building data is a distribution building indicated by a distribution address of the order.
Specifically, in step 101, establishing a correspondence between an order and an object to be compared includes acquiring the address information in the order. The address information may be selected by a user from building data recommended by the system, in which case a correspondence between the order and the building data selected by the user is established. Alternatively, the address information may be filled in by the user, in which case the address information is matched against building data in the building database: if matching building data exists, the order is associated with it; otherwise, building data is newly added and a correspondence between the order and the new building data is established.
That is, when the user creates a new address, the address recommending module recommends a batch of building data from the building database according to the distance from the position selected by the user or the keywords entered by the user. If the user selects building data from the recommendations, all future orders sent by the user to that address are associated with that building data. If the user does not select any of the recommended building data, the user can create an address by filling in the address information directly, and the new address can then be associated with building data by address matching. Specifically, the address analysis module can decompose the address information filled in by the user into layered information. For example, if the user fills in "Midsummer Road 500, An Daguang Stone Science and Technology Park, Building 6, Pudong New Area, Shanghai", the address can be decomposed into "city: Shanghai; district: Pudong New Area; road number: 500; cell name: An Daguang Stone Science and Technology Park; building number: Building 6". In the embodiment of the application, address resolution can be implemented with text labeling (such as BIO labeling and BIOES labeling), conditional random fields (CRF), and LSTM sequence encoding. The address and building data matching module then matches the layered information against building data in the building database, either with rules (such as regular expressions) or with machine learning (such as xgboost).
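To make the BIO labeling mentioned above concrete, the following sketch turns a token sequence with BIO tags into layered address fields. It covers only the decoding step; producing the tags (for example with a CRF or LSTM tagger) is outside the sketch. The tag names such as `city` and `building_number`, and the function name `decode_bio`, are hypothetical.

```python
def decode_bio(tokens, tags, sep=" "):
    """Group BIO-tagged tokens into {field: text}.

    "B-x" starts field x, "I-x" continues it, and "O" (or any
    inconsistent tag) closes the current field. Use sep="" for
    character-level Chinese tokens.
    """
    fields, current = {}, None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            current = tag[2:]
            fields[current] = token
        elif tag.startswith("I-") and tag[2:] == current:
            fields[current] += sep + token
        else:
            current = None
    return fields


tokens = ["Shanghai", "Pudong", "New", "Area", ",", "Building", "6"]
tags = ["B-city", "B-district", "I-district", "I-district", "O",
        "B-building_number", "I-building_number"]
layered = decode_bio(tokens, tags)
```

The resulting dictionary is the kind of layered information that a rule-based or learned matcher could then compare field by field against building data.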
If the address filled by the user can be matched with the corresponding building data, associating the order sent to the address with the matched building data; if the address filled by the user does not match the corresponding building data, building data is added and the order is associated with the new building data.
According to the technical scheme, the order and the building data are associated through two modes of user selection or address matching, so that the space feature similarity of the distribution tracks of different building data can be compared conveniently according to the distribution tracks of the order.
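The two association paths of step 101 can be sketched as follows. The order and database layouts, the field names (`selected_building_id`, `address_text`), the `new-` identifier scheme, and the exact-match helper are all assumptions for illustration; in practice the matcher would be the rule-based or machine-learning matching described above.

```python
def exact_match(address_text, db):
    """Toy matcher (assumption): returns the id whose text equals the address."""
    return next((bid for bid, b in db.items() if b["text"] == address_text), None)


def associate(order, db, match=exact_match):
    """Associate an order with building data and return the building id.

    Path 1: the user selected recommended building data (assumed to exist in db).
    Path 2: the user typed an address, which is matched against the database;
            if nothing matches, new building data is added.
    """
    building_id = order.get("selected_building_id")
    if building_id is None:
        building_id = match(order["address_text"], db)
        if building_id is None:
            # Hypothetical id scheme; assumes order ids are unique.
            building_id = f"new-{order['order_id']}"
            db[building_id] = {"text": order["address_text"], "orders": []}
    db[building_id]["orders"].append(order["order_id"])
    return building_id
```

Recording the order ids per building is what later allows the delivery tracks of those orders to be collected as the building's delivery track.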
Step 102, determining the text feature similarity of the first building data of the first object to be compared and the second building data of the second object to be compared; and determining the spatial feature similarity of the first distribution track of the first object to be compared and the second distribution track of the second object to be compared.
In the embodiment of the application, the first object to be compared and the second object to be compared are any two objects to be compared. The first object to be compared can be the object to be compared corresponding to any newly added order in a set time period, and the second object to be compared can be the object to be compared corresponding to any building in the building database. Or the first object to be compared and the second object to be compared are the objects to be compared corresponding to any two buildings in the building database.
In step 102, the building feature extraction module first extracts features from the building data along two dimensions: text features and the spatial features of the delivery track. When the text features of building data are extracted, the text features of the cell to which the building data belongs need to be extracted as well, because the text of a single building may be incomplete depending on its source. For example, "Midsummer Road 500, An Daguang Stone Science and Technology Park, Building 6, Pudong New Area, Shanghai" may be reduced to just "Midsummer Road 500, Building 6" in the building data of a certain source. In addition, the same building may go by different names in building data from different sources; for example, "An Daguang Stone Science and Technology Park Building 6" is sometimes also called "Dingdong Vegetable Buying Building 6". Therefore, the building data can be completed according to the text features of the cell to which it belongs (including the area name, the street name, the cell name, and all possible names of the cell), and the possible alternative names of the building can be inferred, so as to generate enhanced text features for the building data.
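A minimal sketch of the text enhancement idea: the building text is expanded with every known name of its cell, and two buildings are compared by the best-matching pair of variants. The application does not specify how text feature similarity is computed; `difflib.SequenceMatcher` from the Python standard library is used here purely as a stand-in, and the function names are hypothetical.

```python
from difflib import SequenceMatcher


def enhanced_texts(building_text, cell_names, street=""):
    """Return the set of plausible full names of a building.

    cell_names holds all known names of the cell the building belongs to.
    """
    variants = {building_text}
    for name in cell_names:
        variants.add(f"{name} {building_text}")
        variants.add(f"{street} {name} {building_text}".strip())
    return variants


def text_similarity(variants_a, variants_b):
    """Best similarity over all variant pairs (stand-in metric, range 0..1)."""
    return max(
        SequenceMatcher(None, a, b).ratio()
        for a in variants_a
        for b in variants_b
    )
```

With this expansion, a source that only stores "Building 6" can still match a source that stores the cell name together with the building number, provided both are known to belong to the same cell.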
The delivery track of any building in the building database is obtained by fusing the delivery tracks of the orders corresponding to the building, and the delivery track corresponding to any newly added order is obtained from the track points of the order within a set time before and after the order is delivered.
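Extracting the track of a newly added order can be sketched as a simple time-window filter. The point layout `(timestamp, lon, lat)`, the function name, and the window lengths are assumptions; the application only says that a set time before and after delivery is used.

```python
def delivery_window_track(track_points, delivered_at, before_s=120, after_s=60):
    """Keep the rider's points within [delivered_at - before_s, delivered_at + after_s].

    track_points: iterable of (timestamp_seconds, lon, lat).
    Returns (lon, lat) pairs ordered by timestamp.
    """
    start, end = delivered_at - before_s, delivered_at + after_s
    return [(lon, lat) for ts, lon, lat in sorted(track_points) if start <= ts <= end]
```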
Specifically, determining the spatial feature similarity between the first delivery track of the first object to be compared and the second delivery track of the second object to be compared in step 102 includes: acquiring a first concave hull formed by the first delivery track of the first object to be compared and a second concave hull formed by the second delivery track of the second object to be compared; and determining the spatial feature similarity of the first delivery track and the second delivery track according to the overlap area of the first concave hull and the second concave hull.
In addition to text features, building data is closely associated with the rider's delivery track in instant delivery. The geographic characteristics of building data can be obtained by analyzing the delivery tracks of riders as they arrive at the building to deliver orders. The longitude and latitude tracks of all orders for the building are extracted and denoised (for example, with Kalman filtering), the spatial distribution features of the building's tracks are extracted, and the spatial feature similarity of the delivery tracks of two building data is then compared.
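For the denoising step, a minimal sketch of a one-dimensional Kalman filter with a constant-position model, applied independently to longitude and latitude, is shown below. The noise variances and function names are assumptions; a production system would more likely use a motion model that also tracks velocity.

```python
def kalman_1d(values, process_var=1e-5, meas_var=1e-3):
    """Smooth a non-empty sequence with a scalar constant-position Kalman filter."""
    estimate, error = values[0], 1.0
    smoothed = [estimate]
    for z in values[1:]:
        error += process_var               # predict: uncertainty grows
        gain = error / (error + meas_var)  # Kalman gain
        estimate += gain * (z - estimate)  # update with measurement z
        error *= 1.0 - gain
        smoothed.append(estimate)
    return smoothed


def denoise_track(track):
    """Denoise a list of (lon, lat) points, filtering each axis separately."""
    if not track:
        return []
    lons = kalman_1d([p[0] for p in track])
    lats = kalman_1d([p[1] for p in track])
    return list(zip(lons, lats))
```

Isolated jumps, such as those caused by multipath interference near dense buildings, are pulled back toward the running estimate instead of being kept as raw outliers.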
Specifically, the spatial feature similarity of the delivery tracks of two building data may be compared as follows. The track points of riders during order delivery are extracted, and the concave hull of the track points and the area of the concave hull are determined. Comparing the similarity of the rider tracks corresponding to the first building data and the second building data in the building database then includes: determining the overlap region of the concave hulls of the rider track points corresponding to the first building data and the second building data; calculating a first ratio of the area of the overlap region to the area of the concave hull of the track points corresponding to the first building data, and a second ratio of the area of the overlap region to the area of the concave hull of the track points corresponding to the second building data; and comparing the maximum of the first ratio and the second ratio with the second set threshold. If the maximum of the first ratio and the second ratio is greater than the second set threshold, the spatial feature similarity meets the data fusion requirement.
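The overlap-ratio computation can be sketched in pure Python. Two simplifications are made and should be read as assumptions: the convex hull is used as a stand-in for the concave hull (a geometry library would be needed for a true concave hull), and longitude and latitude are treated as planar coordinates, which is only reasonable over the short distances around a building. All function names are hypothetical.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def area(poly):
    """Shoelace formula; 0.0 for degenerate polygons."""
    edges = zip(poly, poly[1:] + poly[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2.0


def clip(subject, clipper):
    """Sutherland-Hodgman: intersect `subject` with a convex CCW `clipper`."""
    def inside(p, a, b):
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):
        if not output:
            break
        current, output = output, []
        for p, q in zip(current, current[1:] + current[:1]):
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(intersect(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(intersect(p, q, a, b))
    return output


def spatial_similarity(track_a, track_b):
    """Maximum of the two overlap-to-hull area ratios described above."""
    hull_a, hull_b = convex_hull(track_a), convex_hull(track_b)
    area_a, area_b = area(hull_a), area(hull_b)
    if area_a == 0 or area_b == 0:
        return 0.0  # a degenerate track has no area to compare
    overlap = area(clip(hull_a, hull_b))
    return max(overlap / area_a, overlap / area_b)
```

Taking the maximum of the two ratios means a small hull that lies entirely inside a large one scores 1.0, which suits the case where one building has far fewer orders, and therefore a much smaller track footprint, than the other.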
In one possible embodiment, the comparison of text feature similarity and spatial feature similarity may be restricted to first building data and second building data within a set range, where the set range may be a pre-divided area or an administrative area.
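Restricting comparisons to a set range can be sketched as a grouping step that only pairs building records sharing the same area key. The record layout (dictionaries with `id` and `area` fields) and the function name `candidate_pairs` are assumptions made for this illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterator, List, Tuple


def candidate_pairs(buildings: List[Dict[str, str]]) -> Iterator[Tuple[str, str]]:
    """Yield pairs of building ids that fall in the same area.

    Each building is a dict with hypothetical keys "id" and "area",
    where "area" is a pre-divided region or administrative area code.
    """
    by_area: Dict[str, List[str]] = defaultdict(list)
    for building in buildings:
        by_area[building["area"]].append(building["id"])
    for ids in by_area.values():
        # only buildings inside the same area are ever compared
        yield from combinations(sorted(ids), 2)
```

Grouping first keeps the number of comparisons proportional to the sum of squared area sizes rather than the square of the whole database size.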
Step 103, if the text feature similarity and the spatial feature similarity meet the data fusion requirement, fusing the first object to be compared and the second object to be compared into the same object to be compared.
Illustratively, the text feature similarity and the spatial feature similarity meeting the data fusion requirement in step 103 includes: if the text feature similarity is greater than a first set threshold and the spatial feature similarity is greater than a second set threshold, the data fusion requirement is met.
That is, if the text feature similarity and the spatial feature similarity of two building data are both greater than their respective thresholds, the two building data are considered, with high probability, to describe the same building, and they need to be fused.
If the text feature similarity is greater than the first set threshold and the spatial feature similarity is not greater than the second set threshold, determining whether the cell (that is, the residential community) to which the first building data belongs and the cell to which the second building data belongs are the same cell; if they are not the same cell, determining that the first building data and the second building data are abnormal data.
That is, when the text feature similarity of the two building data is high but the spatial feature similarity is low, whether the cells to which the two building data belong are the same cell is compared. If the cells to which the first building data and the second building data belong are cells in different places with similar names, the first building data and the second building data are considered normal data; otherwise, the first building data and the second building data are considered abnormal building data.
If the text feature similarity is not greater than the first set threshold and the spatial feature similarity is greater than the second set threshold, the text feature similarity between the first building data and each building data in the cell to which the first building data belongs is compared with a third set threshold; if the text feature similarity between the first building data and every building data in that cell is not greater than the third set threshold, the first building data is determined to be abnormal data.
That is, when the text feature similarity of the two building data is low but the spatial feature similarity is high, each of the two building data is compared, in terms of text feature similarity, with every building data in the cell to which it belongs. If the text feature similarity between the first building data and every building data in its cell is very low, the first building data is considered abnormal building data; otherwise, the first building data is considered normal data.
If the text feature similarity is not greater than the first set threshold and the spatial feature similarity is not greater than the second set threshold, the two building data are normal data.
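The four threshold cases above can be summarized in a small decision function. This is a sketch under stated assumptions: the function name, the returned labels, and the default threshold values are hypothetical. For the case of high text similarity and low spatial similarity it follows the rule as stated in the step itself (different cells are flagged as abnormal); the text does not state an outcome when the cells are the same, so the sketch returns a placeholder label there.

```python
def classify_pair(text_sim: float,
                  spatial_sim: float,
                  same_cell: bool,
                  max_text_sim_in_cell: float,
                  t1: float = 0.8,
                  t2: float = 0.5,
                  t3: float = 0.6) -> str:
    """Return a label for a pair of building records.

    t1, t2, t3 stand in for the first, second, and third set thresholds;
    the default values are illustrative assumptions only.
    max_text_sim_in_cell is the highest text similarity between the first
    building record and any record in its own cell.
    """
    text_high = text_sim > t1
    spatial_high = spatial_sim > t2
    if text_high and spatial_high:
        return "fuse"
    if text_high and not spatial_high:
        # outcome for the same-cell case is not stated in the text
        return "undetermined" if same_cell else "abnormal_pair"
    if spatial_high:
        # "every similarity <= t3" is equivalent to "maximum <= t3"
        return "abnormal_first" if max_text_sim_in_cell <= t3 else "normal"
    return "normal"
```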
In a possible implementation manner, fusing the first object to be compared with the second object to be compared into the same object to be compared in step 103 includes: combining the first building data with the second building data to obtain third building data; combining the first distribution track and the second distribution track to obtain a third distribution track; and updating the building database according to the third building data and the third distribution track.
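The fusion step can be sketched as a merge of two records followed by a database update. The record layout (`ComparedObject` with an id, a set of names, and a list of track points), the choice to keep the first record's id, and the in-memory dictionary standing in for the building database are all assumptions for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class ComparedObject:
    """Hypothetical record: building data plus its distribution track."""
    building_id: str
    names: Set[str] = field(default_factory=set)
    track: List[Tuple[float, float]] = field(default_factory=list)


def fuse(first: ComparedObject,
         second: ComparedObject,
         database: Dict[str, ComparedObject]) -> ComparedObject:
    """Merge two records into a third one and update the database."""
    merged = ComparedObject(
        building_id=first.building_id,        # assumption: keep the first id
        names=first.names | second.names,     # third building data
        track=first.track + second.track,     # third distribution track
    )
    database.pop(second.building_id, None)    # drop the absorbed record
    database[merged.building_id] = merged
    return merged
```

Keeping every name as an alias preserves the address variants users typed, so later orders written either way can still match the fused record.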
The embodiment of the application provides a multi-source building data fusion method that compares the text feature similarity and the spatial feature similarity of building data from different sources to determine which building data correspond to the same real-world building, and fuses those building data. Richer and more accurate building data are thereby obtained, which further improves the efficiency with which a rider finds the corresponding building and allows the system to more accurately estimate expected delivery time, schedule orders, judge early delivery, and the like.
Based on the same technical concept, fig. 2 schematically illustrates a structural diagram of a multi-source building data fusion device according to an embodiment of the present application, as shown in fig. 2, the device 200 includes:
the association module 201 is configured to establish, for any order, a correspondence between the order and an object to be compared; the object to be compared comprises building data and distribution tracks of orders with corresponding relation with the object to be compared; the building data is a distribution building indicated by the distribution address of the order;
a comparison module 202, configured to determine a text feature similarity of first building data of a first object to be compared and second building data of a second object to be compared; determining the spatial feature similarity of a first distribution track of the first object to be compared and a second distribution track of the second object to be compared; the first object to be compared and the second object to be compared are any two objects to be compared;
and the fusion module 203 is configured to fuse the first object to be compared and the second object to be compared into the same object to be compared if the text feature similarity and the spatial feature similarity meet a data fusion requirement.
In one possible design, the text feature similarity and the spatial feature similarity meeting the data fusion requirement includes: the text feature similarity is greater than a first set threshold and the spatial feature similarity is greater than a second set threshold; the comparison module 202 is further configured to determine whether the cell to which the first building data belongs and the cell to which the second building data belongs are the same cell if the text feature similarity is greater than the first set threshold and the spatial feature similarity is not greater than the second set threshold; and, if they are different cells, to determine that the first building data and the second building data are abnormal data.
In one possible design, the comparison module 202 is further configured to compare the text feature similarity between the first building data and each building data in the cell to which the first building data belongs with a third set threshold if the text feature similarity is not greater than the first set threshold and the spatial feature similarity is greater than the second set threshold; and, if the text feature similarity between the first building data and every building data in that cell is not greater than the third set threshold, to determine that the first building data is abnormal data.
In one possible design, the comparison module 202 is further configured to obtain a first concave hull formed by the first distribution track of the first object to be compared and a second concave hull formed by the second distribution track of the second object to be compared; and to determine the spatial feature similarity of the first distribution track and the second distribution track according to the overlapping area of the first concave hull and the second concave hull.
In one possible design, the first object to be compared is an object to be compared corresponding to any new order in a set time period, and the second object to be compared is an object to be compared corresponding to any building in the building database; or the first object to be compared and the second object to be compared are objects to be compared corresponding to any two buildings in the building database. The distribution track of any building in the building database is obtained by fusing the distribution tracks of the orders that have a correspondence with the building; the distribution track corresponding to any new order is obtained from the track points of the order within a set time before and after the order is delivered.
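Obtaining the distribution track of a new order from the track points within a set time before and after delivery can be sketched as a simple time-window filter. The tuple layout `(timestamp_seconds, lat, lon)`, the function name, and the default window length are assumptions for illustration.

```python
from typing import List, Tuple

TrackPoint = Tuple[float, float, float]  # (timestamp in seconds, lat, lon)


def delivery_track(track_points: List[TrackPoint],
                   delivered_at: float,
                   window_s: float = 120.0) -> List[Tuple[float, float]]:
    """Keep the (lat, lon) of points within window_s seconds of delivery.

    window_s stands in for the "set time"; 120 seconds is an assumed value.
    """
    return [
        (lat, lon)
        for ts, lat, lon in track_points
        if abs(ts - delivered_at) <= window_s
    ]
```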
In one possible design, the fusion module 203 is further configured to combine the first building data with the second building data to obtain third building data; combining the first distribution track with the second distribution track to obtain a third distribution track; and updating the building database according to the third building data and the third distribution track.
In one possible design, the association module 201 is further configured to obtain address information in the order; if the address information is selected by a user from building data recommended by the system, to establish a correspondence between the order and the building data selected by the user; or, if the address information is filled in by a user, to match the address information filled in by the user against building data in the building database, and, if matching building data exists, to associate the order with the matching building data; otherwise, to add new building data and establish a correspondence between the order and the new building data.
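The association logic can be sketched as below. The order and database layouts, the field names, and the use of a normalized exact-string match are assumptions; the application does not specify how user-entered addresses are matched against the building database.

```python
import uuid
from typing import Dict


def normalize(text: str) -> str:
    """Lower-case and strip all whitespace (assumed matching rule)."""
    return "".join(text.lower().split())


def associate_order(order: Dict[str, str],
                    database: Dict[str, Dict[str, str]]) -> str:
    """Return the id of the building record the order is associated with."""
    # Case 1: the user picked a system-recommended building.
    selected = order.get("selected_building_id")
    if selected is not None:
        return selected
    # Case 2: the user typed the address; try to match an existing record.
    key = normalize(order["address_text"])
    for building_id, building in database.items():
        if normalize(building["name"]) == key:
            return building_id
    # No match: add new building data and associate the order with it.
    new_id = uuid.uuid4().hex
    database[new_id] = {"name": order["address_text"]}
    return new_id
```

Records created in the last branch are the ones most likely to duplicate an existing building, which is why they are later candidates for the similarity comparison and fusion.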
Based on the same technical concept, an embodiment of the present application provides a computing device, as shown in fig. 3, including at least one processor 301 and a memory 302 connected to the at least one processor. The embodiment of the present application does not limit the specific connection medium between the processor 301 and the memory 302; in fig. 3, the processor 301 and the memory 302 are connected by a bus, by way of example. Buses may be divided into address buses, data buses, control buses, and so on.
In the embodiment of the present application, the memory 302 stores instructions executable by the at least one processor 301, and the at least one processor 301 may execute the multi-source building data fusion method by executing the instructions stored in the memory 302.
The processor 301 is the control center of the computing device. It may use various interfaces and lines to connect the various parts of the computing device, and it performs resource setting by running or executing the instructions stored in the memory 302 and invoking the data stored in the memory 302.
Alternatively, the processor 301 may include one or more processing units, and the processor 301 may integrate an application processor and a modem processor, wherein the application processor primarily processes operating systems, user interfaces, application programs, etc., and the modem processor primarily processes wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 301. In some embodiments, processor 301 and memory 302 may be implemented on the same chip, and in some embodiments they may be implemented separately on separate chips.
The processor 301 may be a general purpose processor such as a Central Processing Unit (CPU), digital signal processor, application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the present application. The general purpose processor may be a microprocessor or any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in the processor for execution.
The memory 302, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 302 may include at least one type of storage medium, for example flash memory, hard disk, multimedia card, card-type memory, random access memory (RAM), static random access memory (SRAM), programmable read-only memory (PROM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), magnetic memory, magnetic disk, optical disk, and the like. The memory 302 may also be, but is not limited to, any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory 302 in embodiments of the present application may also be circuitry or any other device capable of performing a storage function, for storing program instructions and/or data.
Based on the same technical concept, the embodiment of the application also provides a computer-readable storage medium storing a computer-executable program, where the computer-executable program is used to cause a computer to execute the multi-source building data fusion method described in any of the above implementations.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (10)

1. A multi-source building data fusion method, the method comprising:
establishing a corresponding relation between an order and an object to be compared aiming at any order; the object to be compared comprises building data and distribution tracks of orders with corresponding relation with the object to be compared; the building data is a distribution building indicated by the distribution address of the order;
determining the text feature similarity of first building data of a first object to be compared and second building data of a second object to be compared; determining the spatial feature similarity of a first distribution track of the first object to be compared and a second distribution track of the second object to be compared; the first object to be compared and the second object to be compared are any two objects to be compared;
and if the text feature similarity and the spatial feature similarity meet the data fusion requirement, fusing the first object to be compared and the second object to be compared into the same object to be compared.
2. The method of claim 1, wherein the text feature similarity and the spatial feature similarity satisfy a data fusion requirement, comprising:
the text feature similarity is greater than a first set threshold and the spatial feature similarity is greater than a second set threshold, so that the data fusion requirement is met;
the method further comprises: if the text feature similarity is greater than the first set threshold and the spatial feature similarity is not greater than the second set threshold, determining whether the cell to which the first building data belongs and the cell to which the second building data belongs are the same cell;
if they are different cells, determining that the first building data and the second building data are abnormal data.
3. The method according to claim 2, wherein the method further comprises:
if the text feature similarity is not greater than the first set threshold and the spatial feature similarity is greater than the second set threshold, comparing the text feature similarity between the first building data and each building data in the cell to which the first building data belongs with a third set threshold;
if the text feature similarity between the first building data and every building data in the cell to which the first building data belongs is not greater than the third set threshold, determining that the first building data is abnormal data.
4. The method of claim 1, wherein determining the spatial feature similarity of the first delivery trajectory of the first object to be compared to the second delivery trajectory of the second object to be compared comprises:
acquiring a first concave hull formed by the first distribution track of the first object to be compared and a second concave hull formed by the second distribution track of the second object to be compared;
and determining the spatial feature similarity of the first distribution track and the second distribution track according to the overlapping area of the first concave hull and the second concave hull.
5. The method according to any one of claims 1 to 4, wherein the first object to be compared is an object to be compared corresponding to any new order in a set period of time, and the second object to be compared is an object to be compared corresponding to any one building in a building database; or
the first object to be compared and the second object to be compared are objects to be compared corresponding to any two buildings in the building database;
the distribution track of any building in the building database is obtained by fusing the distribution tracks of the orders that have a correspondence with the building; the distribution track corresponding to any new order is obtained from the track points of the order within a set time before and after the order is delivered.
6. The method of claim 5, wherein fusing the first object to be compared and the second object to be compared into the same object to be compared comprises:
combining the first building data with the second building data to obtain third building data;
combining the first distribution track with the second distribution track to obtain a third distribution track;
and updating the building database according to the third building data and the third distribution track.
7. The method of claim 5, wherein establishing the correspondence of the order to the object to be compared comprises:
acquiring address information in an order;
if the address information is selected by a user from building data recommended by a system, establishing a correspondence between the order and the building data selected by the user;
or, if the address information is filled in by a user, matching the address information filled in by the user against building data in the building database, and, if matching building data exists, associating the order with the matching building data; otherwise, adding new building data and establishing a correspondence between the order and the new building data.
8. A multi-source building data fusion apparatus, comprising:
the association module is used for establishing a corresponding relation between an order and an object to be compared aiming at any order; the object to be compared comprises building data and distribution tracks of orders with corresponding relation with the object to be compared; the building data is a distribution building indicated by the distribution address of the order;
the comparison module is used for determining the text feature similarity of the first building data of the first object to be compared and the second building data of the second object to be compared; determining the spatial feature similarity of a first distribution track of the first object to be compared and a second distribution track of the second object to be compared; the first object to be compared and the second object to be compared are any two objects to be compared;
and the fusion module is used for fusing the first object to be compared and the second object to be compared into the same object to be compared if the text feature similarity and the spatial feature similarity meet the data fusion requirement.
9. A computing device, comprising:
a memory for storing program instructions;
a processor for invoking program instructions stored in the memory and performing the method according to any of claims 1-7 in accordance with the obtained program instructions.
10. A computer readable storage medium comprising computer readable instructions which, when read and executed by a computer, cause the method of any one of claims 1 to 7 to be implemented.
CN202310906869.1A 2023-07-21 2023-07-21 Multi-source building data fusion method and device Pending CN116956218A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310906869.1A CN116956218A (en) 2023-07-21 2023-07-21 Multi-source building data fusion method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310906869.1A CN116956218A (en) 2023-07-21 2023-07-21 Multi-source building data fusion method and device

Publications (1)

Publication Number Publication Date
CN116956218A true CN116956218A (en) 2023-10-27

Family

ID=88452367

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310906869.1A Pending CN116956218A (en) 2023-07-21 2023-07-21 Multi-source building data fusion method and device

Country Status (1)

Country Link
CN (1) CN116956218A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination