CN106934015A - Address data processing method and apparatus - Google Patents

Publication number
CN106934015A
Authority
CN
China
Prior art keywords
coordinate
interval
density
maximum
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710141485.XA
Other languages
Chinese (zh)
Inventor
龙准
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingbangda Trade Co Ltd
Beijing Jingdong Zhenshi Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd and Beijing Jingdong Shangke Information Technology Co Ltd
Priority to CN201710141485.XA
Publication of CN106934015A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29: Geographical information databases

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Remote Sensing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses an address data processing method and apparatus and relates to the field of Internet technology. The method includes: obtaining the latitude and longitude coordinates corresponding to the address data in a predetermined region; dividing the predetermined region into multiple intervals according to those coordinates; identifying a coordinate-dense region according to the coordinate density of each interval; and taking the coordinates in the coordinate-dense region as standard address coordinates. The invention makes the screening of addresses and coordinates more accurate and faster, and so provides a better data basis for address services.

Description

Address data processing method and apparatus
Technical field
The present invention relates to the field of Internet technology, and more particularly to an address data processing method and apparatus.
Background
With the rapid growth and continued expansion of shopping and delivery companies, data volumes of all kinds are also growing explosively, most visibly order data and successful-delivery data. As the channel-sinking strategy is carried out, delivery addresses now reach townships and even rural areas. These precise, massive address data have become a very valuable resource. It is therefore particularly necessary to mine these data effectively and apply them precisely to address-related services such as pre-sorting systems, order tracking systems, user shipping-address systems, high-precision maps and intelligent vehicle dispatch.
In the prior art, after address data are obtained, the addresses are usually first segmented into words and then screened according to their frequency of occurrence and a threshold; pre-sorting, goods allocation, vehicle dispatch and similar services are then carried out on the screened addresses. However, addresses screened in this way are not very precise and cannot meet some requirements.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is to provide an address data processing method and apparatus that can improve the accuracy of address screening.
According to one embodiment of the present invention, an address data processing method is proposed, including: obtaining the latitude and longitude coordinates corresponding to the address data in a predetermined region; dividing the predetermined region into multiple intervals according to those coordinates; identifying a coordinate-dense region according to the coordinate density of each interval; and taking the coordinates in the coordinate-dense region as standard address coordinates.
Further, the coordinates in the coordinate-dense region are stored according to the structure of a standard address library.
Further, dividing the predetermined region into multiple intervals according to the latitude and longitude coordinates corresponding to the address data in the predetermined region includes: determining the maximum and minimum longitude and the maximum and minimum latitude corresponding to the address data in the predetermined region; and dividing the region bounded by the maximum and minimum longitude and the maximum and minimum latitude into multiple intervals.
Further, identifying the coordinate-dense region according to the coordinate density of each interval includes: determining the coordinate density of each interval according to the number of coordinates in that interval; and taking the interval with the maximum coordinate density as the coordinate-dense region.
Further, identifying the coordinate-dense region according to the coordinate density of each interval includes: determining the coordinate density of each interval according to the number of coordinates in that interval; and, if the interval with the maximum coordinate density contains a predetermined proportion of the coordinate points in the predetermined region, taking that interval as the coordinate-dense region.
Further, the method also includes: recursively obtaining, among the intervals adjacent to the interval with the maximum coordinate density, the interval with the maximum coordinate density; judging whether a predetermined proportion of the coordinate points in the predetermined region falls within a predetermined area; and, if so, taking the set formed by the interval with the maximum coordinate density and the maximum-density intervals among its adjacent intervals as the coordinate-dense region.
Further, the method also includes: preprocessing the latitude and longitude coordinates corresponding to the address data in the predetermined region to remove invalid latitude and longitude coordinates.
Further, the method also includes: processing the address data in the predetermined region based on the MapReduce programming framework.
According to another embodiment of the present invention, an address data processing apparatus is also proposed, including: an address data coordinate acquiring unit, configured to obtain the latitude and longitude coordinates corresponding to the address data in a predetermined region; a region dividing unit, configured to divide the predetermined region into multiple intervals according to those coordinates; a coordinate-dense region determining unit, configured to identify a coordinate-dense region according to the coordinate density of each interval; and a standard address coordinate determining unit, configured to take the coordinates in the coordinate-dense region as standard address coordinates.
Further, the apparatus also includes a standard address storage unit, configured to store the coordinates in the coordinate-dense region according to the structure of a standard address library.
Further, the apparatus also includes a longitude and latitude extreme value determining unit, configured to determine the maximum and minimum longitude and the maximum and minimum latitude corresponding to the address data in the predetermined region; the region dividing unit is configured to divide the region bounded by the maximum and minimum longitude and the maximum and minimum latitude into multiple intervals.
Further, the apparatus also includes an interval coordinate density determining unit, configured to determine the coordinate density of each interval according to the number of coordinates in that interval; the coordinate-dense region determining unit is configured to take the interval with the maximum coordinate density as the coordinate-dense region.
Further, the apparatus also includes an interval coordinate density determining unit, configured to determine the coordinate density of each interval according to the number of coordinates in that interval; the coordinate-dense region determining unit is also configured to take the interval with the maximum coordinate density as the coordinate-dense region if that interval contains a predetermined proportion of the coordinate points in the predetermined region.
Further, the apparatus also includes an interval set determining unit, configured to recursively obtain, among the intervals adjacent to the interval with the maximum coordinate density, the interval with the maximum coordinate density; the coordinate-dense region determining unit is also configured to take the set formed by the interval with the maximum coordinate density and the maximum-density intervals among its adjacent intervals as the coordinate-dense region if a predetermined proportion of the coordinate points in the predetermined region falls within a predetermined area.
Further, the apparatus also includes a coordinate preprocessing unit, configured to preprocess the latitude and longitude coordinates corresponding to the address data in the predetermined region and remove invalid latitude and longitude coordinates.
Further, the address data coordinate acquiring unit, the coordinate-dense region determining unit and the standard address coordinate determining unit process the address data in the predetermined region based on the MapReduce programming framework.
According to another embodiment of the present invention, an address data processing apparatus is also proposed, including: a memory; and a processor coupled to the memory, the processor being configured to perform the method described above based on instructions stored in the memory.
According to still another embodiment of the present invention, a computer-readable storage medium is also proposed, on which computer program instructions are stored; when executed by a processor, the instructions implement the steps of the method described above.
Compared with the prior art, the embodiments of the present invention obtain the latitude and longitude coordinates corresponding to the address data in a predetermined region, divide the predetermined region into multiple intervals according to those coordinates, identify a coordinate-dense region according to the coordinate density of each interval, and take the coordinates in the coordinate-dense region as standard address coordinates. This makes the screening of addresses and coordinates more accurate and faster, and provides a better data basis for address services.
Further features and advantages of the present invention will become apparent from the following detailed description of exemplary embodiments with reference to the accompanying drawings.
Brief description of the drawings
The accompanying drawings, which form a part of the specification, illustrate embodiments of the present invention and, together with the description, serve to explain the principles of the invention.
The present invention can be understood more clearly from the following detailed description with reference to the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of one embodiment of the address data processing method of the present invention.
Fig. 2 is a schematic flowchart of another embodiment of the address data processing method of the present invention.
Fig. 3 is a schematic flowchart of a further embodiment of the address data processing method of the present invention.
Fig. 4 is a schematic flowchart of yet another embodiment of the address data processing method of the present invention.
Fig. 5 is a schematic flowchart of a specific embodiment of the address data processing method of the present invention.
Fig. 6 is a schematic structural diagram of one embodiment of the address data processing apparatus of the present invention.
Fig. 7 is a schematic structural diagram of another embodiment of the address data processing apparatus of the present invention.
Fig. 8 is a schematic structural diagram of a further embodiment of the address data processing apparatus of the present invention.
Fig. 9 is a schematic structural diagram of yet another embodiment of the address data processing apparatus of the present invention.
Fig. 10 is a schematic structural diagram of still another embodiment of the address data processing apparatus of the present invention.
Detailed description
Various exemplary embodiments of the present invention will now be described in detail with reference to the accompanying drawings. It should be noted that, unless specifically stated otherwise, the relative arrangement of components and steps, the numerical expressions and the numerical values set forth in these embodiments do not limit the scope of the invention.
It should also be understood that, for ease of description, the sizes of the various parts shown in the drawings are not drawn to actual scale.
The following description of at least one exemplary embodiment is merely illustrative and in no way limits the invention, its application or its uses.
Techniques, methods and devices known to a person of ordinary skill in the relevant art may not be discussed in detail, but, where appropriate, such techniques, methods and devices should be regarded as part of the granted specification.
In all examples shown and discussed here, any specific value should be construed as merely exemplary and not as a limitation. Other examples of the exemplary embodiments may therefore have different values.
It should be noted that similar reference numerals and letters denote similar items in the following drawings; once an item has been defined in one drawing, it need not be discussed further in subsequent drawings.
To make the objects, technical solutions and advantages of the present invention clearer, the invention is described in more detail below with reference to specific embodiments and the accompanying drawings.
Fig. 1 is a schematic flowchart of one embodiment of the address data processing method of the present invention. The method includes the following steps:
In step 110, the latitude and longitude coordinates corresponding to the address data in a predetermined region are obtained. For example, the successfully delivered order addresses of a certain street and the latitude and longitude coordinates of each address are obtained.
In step 120, the predetermined region is divided into multiple intervals according to the latitude and longitude coordinates corresponding to the address data in the predetermined region. For example, once the latitude and longitude coordinates of each address in a certain street are known, the region is divided into multiple intervals, either equally or arbitrarily.
In step 130, a coordinate-dense region is identified according to the coordinate density of each interval. For example, the coordinate density of each interval can be determined from the number of coordinates in that interval, and the interval with the maximum coordinate density is taken as the coordinate-dense region. It can also be judged further whether the interval with the maximum coordinate density contains a predetermined proportion of the coordinate points in the predetermined region, for example whether the interval contains at least 60% of the coordinate points on a certain street; if so, that interval is taken as the coordinate-dense region. Alternatively, the set formed by the interval with the maximum coordinate density and the adjacent intervals with the maximum coordinate density can be taken as the coordinate-dense region. Which of these ways is used to determine the coordinate-dense region can be chosen according to the required accuracy.
In step 140, the coordinates in the coordinate-dense region are taken as standard address coordinates. After the coordinate-dense region is obtained, all coordinate points in it can also be stored on a big data platform according to the structure of a standard address library.
In this embodiment, a coordinate-dense region is identified from the latitude and longitude coordinates corresponding to the address data in the predetermined region, and the coordinates in the coordinate-dense region are taken as standard address coordinates. This makes the screening of addresses and coordinates more accurate and faster, and provides a better data basis for address services. For example, if a certain building on a certain street is identified as the coordinate-dense region, the coordinates of the building are taken as standard address coordinates and can be saved in the standard address library; later, when deciding where to set up a drop-off point, this region can be used as the drop-off point.
Fig. 2 is a schematic flowchart of another embodiment of the address data processing method of the present invention. The method includes the following steps:
In step 210, the latitude and longitude coordinates corresponding to the address data in a predetermined region are obtained.
In step 220, the maximum and minimum longitude and the maximum and minimum latitude corresponding to the address data in the predetermined region are determined.
In step 230, the region bounded by the maximum and minimum longitude and the maximum and minimum latitude is divided into multiple intervals. For example, with a cell size of 100 m × 100 m, the longitude range is divided into (Xmax - Xmin)/100 equal parts and the latitude range into (Ymax - Ymin)/100 equal parts, where Xmax is the maximum longitude, Xmin is the minimum longitude, Ymax is the maximum latitude and Ymin is the minimum latitude. A person skilled in the art can choose the cell size according to the actual situation, for example 200 m × 200 m or 500 m × 500 m.
In step 240, the coordinate density of each interval is determined according to the number of coordinates in that interval. The longitude and latitude of each coordinate point determine which interval the point falls in, which gives the number of coordinates in each interval and hence the coordinate density of each interval.
In step 250, the interval with the maximum coordinate density is taken as the coordinate-dense region. For example, if the interval with the maximum coordinate density covers 500 m × 500 m and exactly one drop-off point needs to be set up within a 500 m × 500 m area, that interval can be taken as the coordinate-dense region.
In step 260, the coordinates in the coordinate-dense region are stored on a big data platform according to the structure of a standard address library.
The above process can be called coordinate aggregation. Coordinate aggregation determines the coordinate-dense region, and the coordinates in that region are stored on the big data platform according to the structure of the standard address library, so that subsequent screening of addresses and coordinates is more accurate and faster.
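The coordinate aggregation of steps 220 to 250 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function names `cell_of` and `densest_cell` are hypothetical, and the conversion from degrees to meters (about 111,320 m per degree of latitude, with longitude scaled by the cosine of the latitude) is an approximation introduced here so that a 100 m cell size can be applied to latitude and longitude values. Because all cells have the same area, the cell with the most points is also the cell with the highest density.

```python
import math
from collections import Counter

# Assumption: about 111,320 m per degree of latitude; a degree of longitude
# shrinks by cos(latitude). Adequate only for street-sized regions.
METERS_PER_DEGREE = 111_320.0


def cell_of(lon, lat, min_lon, min_lat, cell_m=100.0):
    """Return the (column, row) grid index of one point."""
    m_per_deg_lon = METERS_PER_DEGREE * math.cos(math.radians(min_lat))
    col = int((lon - min_lon) * m_per_deg_lon // cell_m)
    row = int((lat - min_lat) * METERS_PER_DEGREE // cell_m)
    return col, row


def densest_cell(points, cell_m=100.0):
    """Count points per cell; return (densest cell, its count, all counts)."""
    min_lon = min(lon for lon, _ in points)  # step 220: extreme values
    min_lat = min(lat for _, lat in points)
    counts = Counter(                        # steps 230 and 240: grid and counts
        cell_of(lon, lat, min_lon, min_lat, cell_m) for lon, lat in points
    )
    cell, count = counts.most_common(1)[0]   # step 250: maximum density
    return cell, count, counts


# Hypothetical sample: three deliveries to one building, one outlier.
sample = [(116.4000, 39.9000), (116.4001, 39.9000),
          (116.4002, 39.9001), (116.4100, 39.9100)]
cell, count, counts = densest_cell(sample)
```

In this sample the three nearby points share one cell, so that cell is selected and the outlier is left out of the standard address coordinates.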
In another embodiment of the present invention, as shown in Fig. 3, steps 310 to 340 are the same as steps 210 to 240, respectively.
In step 350, it is judged whether the interval with the maximum coordinate density contains a predetermined proportion of the coordinate points in the predetermined region. If so, step 360 is performed; otherwise, step 361 is performed. For example, it is judged whether 60% of the coordinate points in the predetermined region lie in the interval with the maximum coordinate density.
In step 360, the interval with the maximum coordinate density is taken as the coordinate-dense region.
In step 361, the interval with the maximum coordinate density is not taken as the coordinate-dense region.
In step 370, the coordinates in the coordinate-dense region are stored on the big data platform according to the structure of the standard address library.
In this embodiment, if the interval with the maximum coordinate density contains a predetermined proportion of the coordinate points in the predetermined region, that interval is taken as the coordinate-dense region and the coordinate points in it are taken as standard address coordinate points, so that subsequent screening of addresses and coordinates is more accurate and faster.
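The proportion check of steps 350 to 361 can be illustrated with a short sketch. The function name `dense_cell_if_dominant` and the dictionary of per-cell counts are assumptions made for illustration; the 60% default follows the example given in the text.

```python
def dense_cell_if_dominant(counts, min_share=0.6):
    """Return the densest cell if it holds at least min_share of all points.

    counts maps a cell index such as (column, row) to its number of points.
    Returns None when no cell qualifies (step 361).
    """
    total = sum(counts.values())
    if total == 0:
        return None
    cell, count = max(counts.items(), key=lambda item: item[1])
    return cell if count / total >= min_share else None


accepted = dense_cell_if_dominant({(0, 0): 7, (1, 0): 3})  # 70% in one cell
rejected = dense_cell_if_dominant({(0, 0): 5, (1, 0): 5})  # only 50%
```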
Fig. 4 is a schematic flowchart of yet another embodiment of the address data processing method of the present invention. The method includes the following steps:
In step 410, the latitude and longitude coordinates corresponding to the address data in a predetermined region are obtained.
In step 420, the latitude and longitude coordinates corresponding to the address data in the predetermined region are preprocessed to remove invalid latitude and longitude coordinates. For example, records whose address obviously does not match its longitude and latitude can be removed.
In step 430, the maximum and minimum longitude and the maximum and minimum latitude corresponding to the valid address data in the predetermined region are determined.
In step 440, the region bounded by the maximum and minimum longitude and the maximum and minimum latitude is divided into multiple intervals.
In step 450, the coordinate density of each interval is determined according to the number of coordinates in that interval.
In step 460, the interval with the maximum coordinate density is obtained, and the interval with the maximum coordinate density among the intervals adjacent to it is obtained recursively.
In step 470, it is judged whether a predetermined proportion of the coordinate points in the predetermined region falls within a predetermined area. For example, it is judged whether 60% of the coordinate points of a certain street lie within an area of 200 m × 200 m.
In step 480, if the predetermined proportion of the coordinate points in the predetermined region falls within the predetermined area, the set formed by the interval with the maximum coordinate density and the maximum-density intervals among its adjacent intervals is taken as the coordinate-dense region.
In step 490, the coordinates in the coordinate-dense region are stored on the big data platform according to the structure of the standard address library. For example, valid data whose coordinates deviate from the actual address by no more than 200 meters are found in the mass of data and stored on the big data platform according to the structure of the standard address library. The standard address library can include a master table and a slave table; the master table is shown in Table 1 and the slave table in Table 2.
Table 1
Table 2
In the above embodiments, the coordinate-dense region is obtained by the coordinate aggregation algorithm, and the coordinates of the coordinate-dense region are stored on the big data platform according to the structure of the standard address library, which provides a data basis for address services faster and more accurately. For example, when goods are delivered, the coordinates of the delivery address can be found quickly by querying the master table and the slave table of the standard address library.
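Steps 460 to 480 leave the details of the recursive neighbor search open, so the following is only one possible reading, written as a greedy loop rather than a literal recursion. The assumptions are: neighbors are the eight surrounding cells, the densest occupied neighbor is added at each step, and the "predetermined area" is expressed as a maximum number of cells (`max_cells=4` corresponds to 200 m × 200 m with 100 m cells). The function name `grow_dense_region` is hypothetical.

```python
def grow_dense_region(counts, min_share=0.6, max_cells=4):
    """Grow a region from the densest cell until it covers min_share of points.

    counts maps (column, row) to a point count. Returns the set of cells, or
    None if the share cannot be reached within max_cells adjacent cells.
    """
    total = sum(counts.values())
    if total == 0:
        return None
    start = max(counts, key=counts.get)      # step 460: densest cell
    region = {start}
    covered = counts[start]
    while covered / total < min_share:       # step 470: proportion check
        # Occupied cells that touch the region and are not yet part of it.
        neighbours = {
            (c + dc, r + dr)
            for c, r in region
            for dc in (-1, 0, 1)
            for dr in (-1, 0, 1)
        } - region
        candidates = [cell for cell in neighbours if counts.get(cell, 0) > 0]
        if not candidates or len(region) >= max_cells:
            return None
        best = max(candidates, key=lambda cell: counts[cell])
        region.add(best)                     # step 480: extend the set
        covered += counts[best]
    return region


merged = grow_dense_region({(0, 0): 4, (1, 0): 3, (5, 5): 3})
scattered = grow_dense_region({(0, 0): 4, (5, 5): 3, (9, 9): 3})
```

In the first call the densest cell holds only 40% of the points, but together with its neighbor it holds 70%, so both cells form the dense region. In the second call the points are scattered and no region is returned.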
The above embodiments of the present invention can be implemented based on the MapReduce programming framework. A specific embodiment is described below as an example, as shown in Fig. 5. When computing with MapReduce, the fields of the inputs and outputs are all separated by \t (tab). xcopies is the x-axis coordinate of a grid cell, ycopies is the y-axis coordinate of a grid cell, xsec is the number of equal parts of the longitude range, ysec is the number of equal parts of the latitude range, mr is short for MapReduce, and pointnum is the number of coordinates.
In step 510, based on MapReduce, the maximum and minimum longitude and latitude under each index are calculated and, with a cell size of 100 × 100, the longitude and latitude ranges are each divided into (max - min)/100 equal parts to obtain the number of grid cells. Each index refers to a predetermined region, for example a certain street.
Input: index id, longitude, latitude
Output: index id, xsec, ysec, min longitude, min latitude
In step 520, using the min longitude and min latitude output by the first mr, the grid cell of each coordinate under each index is calculated as (X - minx)/100 and (Y - miny)/100, together with the number of coordinates in each cell.
Input: index id, longitude and latitude, address
Output 1: index id, xsec, ysec, { longitude, latitude, address<br>longitude, latitude, address ... }, pointnum
Output 2: index id, xsec, ysec, xcopies, ycopies, pointnum
In step 530, using output 2 of the second mr, all pointnum values are counted along xsec and along ysec respectively and multiplied by the share that each point contributes, giving a longitude density and a latitude density. The longitude density is multiplied by the latitude density to obtain the cell density. The cell with the maximum density is found, and the maximum-density cells adjacent to it are searched recursively to obtain the dense region.
Input: index id, xsec, ysec, xcopies, ycopies, pointnum
Output: xsec, ysec, density, total number of coordinates in the dense region
In step 540, the coordinate set of the dense region is obtained from output 1 of the second mr and the output of the third mr.
Input 1: index id, xsec, ysec, { longitude, latitude, address<br>longitude, latitude, address }, pointnum
Input 2: xsec, ysec, density, total number of coordinates in the dense region
Output: index id, address, longitude and latitude, confidence
In step 550, the coordinate set of the dense region is stored on the big data platform according to the structure of the standard address library.
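The first two jobs (steps 510 and 520) can be imitated in plain Python to show the data flow. This sketch does not use Hadoop and is not the code of the invention: the function names `job1_extremes` and `job2_cell_counts` are hypothetical, the grouping dictionary stands in for the shuffle phase, and coordinates are assumed to be already expressed in planar units so that a cell size of 100 applies directly. Rounding the number of parts up and clamping the cell index are also assumptions, added so that points on the upper edge remain inside the grid.

```python
import math
from collections import Counter, defaultdict

CELL = 100.0  # assumed cell edge, in the same planar units as x and y


def job1_extremes(records):
    """First job: per index id, emit (min_x, min_y, xsec, ysec)."""
    groups = defaultdict(list)
    for index_id, x, y in records:        # map: key every point by index id
        groups[index_id].append((x, y))
    result = {}
    for index_id, pts in groups.items():  # reduce: extremes and part counts
        xs = [x for x, _ in pts]
        ys = [y for _, y in pts]
        xsec = max(1, math.ceil((max(xs) - min(xs)) / CELL))
        ysec = max(1, math.ceil((max(ys) - min(ys)) / CELL))
        result[index_id] = (min(xs), min(ys), xsec, ysec)
    return result


def job2_cell_counts(records, extremes):
    """Second job: emit pointnum for every (index id, xcopies, ycopies)."""
    counts = Counter()
    for index_id, x, y in records:
        min_x, min_y, xsec, ysec = extremes[index_id]
        # Clamp so that points on the upper edge stay inside the grid.
        xcopies = min(int((x - min_x) // CELL), xsec - 1)
        ycopies = min(int((y - min_y) // CELL), ysec - 1)
        counts[(index_id, xcopies, ycopies)] += 1
    return counts


# Hypothetical records: (index id, x, y)
records = [("street-1", 0.0, 0.0), ("street-1", 10.0, 20.0),
           ("street-1", 50.0, 60.0), ("street-1", 250.0, 130.0)]
extremes = job1_extremes(records)
cell_counts = job2_cell_counts(records, extremes)
```

The result of the second function corresponds to output 2 of step 520 and is the input from which a third job would derive cell densities.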
In this embodiment, MapReduce can be optimized. For example, the input and output files and the intermediate map results are compressed with LZO. Compression effectively reduces the amount of data transmitted and stored, which improves the efficiency and performance of data transmission.
The numbers of map tasks and reduce tasks are set as follows. The number of map tasks is determined by the splits, and splits are closely related to the blocks (data blocks) of HDFS (Hadoop Distributed File System).
splitSize = max("mapred.min.split.size", min("mapred.max.split.size", blockSize)). If the input files are compressed, an index must be built for the compressed files before an input file can be divided into multiple splits. The number of reduce tasks is set by the client.
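The split size formula can be checked with a few numbers. The sketch below simply evaluates the formula; the function names and the sample sizes are illustrative assumptions, and the task estimate ignores the small tolerance Hadoop applies to the last split.

```python
import math

MB = 1024 * 1024


def compute_split_size(min_size, max_size, block_size):
    """splitSize = max(minSize, min(maxSize, blockSize)), all in bytes."""
    return max(min_size, min(max_size, block_size))


def estimate_map_tasks(file_size, split_size):
    """Rough number of splits, and so map tasks, for one splittable file."""
    return max(1, math.ceil(file_size / split_size))


default_split = compute_split_size(1, 256 * MB, 128 * MB)        # block size wins
smaller_split = compute_split_size(1, 64 * MB, 128 * MB)         # max size caps it
larger_split = compute_split_size(256 * MB, 512 * MB, 128 * MB)  # min size raises it
tasks = estimate_map_tasks(1024 * MB, default_split)
```

Lowering the maximum split size yields more, smaller map tasks, and raising the minimum split size yields fewer, larger ones.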
Suitable Writable types should be used. For example, IntWritable, LongWritable and VIntWritable should be used as key types wherever possible.
The number of threads with which a reduce task pulls map results can also be adjusted. By default, a reduce task starts 5 threads that copy data from the map side; appropriately increasing the number of threads that pull map results lets the shuffle phase run faster.
The buffer used for reading and writing files can also be set. io.sort.mb is given in MB and defaults to 100 MB, which is generally too small. This option defines the size of the in-memory buffer that holds map output. When the buffer reaches a certain threshold, a background thread is started to sort the contents of the buffer and write them to local disk (a spill file). The buffer size can be adjusted appropriately according to the volume of map output.
In the above embodiments, the coordinate aggregation algorithm based on MapReduce can screen out accurate addresses and coordinates in a short time, and can therefore provide a data basis for address services faster and more accurately. In addition, optimizing MapReduce also improves the efficiency of data cleaning.
Fig. 6 is a schematic structural diagram of one embodiment of the address data processing apparatus of the present invention. The apparatus includes an address data coordinate acquiring unit 610, a region dividing unit 620, a coordinate-dense region determining unit 630 and a standard address coordinate determining unit 640.
The address data coordinate acquiring unit 610 is configured to obtain the latitude and longitude coordinates corresponding to the address data in a predetermined region. For example, the successfully delivered order addresses of a certain street and the latitude and longitude coordinates of each address are obtained.
The region dividing unit 620 is configured to divide the predetermined region into multiple intervals according to the latitude and longitude coordinates corresponding to the address data in the predetermined region.
The coordinate-dense region determining unit 630 is configured to identify a coordinate-dense region according to the coordinate density of each interval. For example, the coordinate density of each interval can be determined from the number of coordinates in that interval, and the interval with the maximum coordinate density is taken as the coordinate-dense region. It can also be judged further whether the interval with the maximum coordinate density contains a predetermined proportion of the coordinate points in the predetermined region, for example whether the interval contains at least 60% of the coordinate points on a certain street; if so, that interval is taken as the coordinate-dense region. Alternatively, the set formed by the interval with the maximum coordinate density and the adjacent intervals with the maximum coordinate density can be taken as the coordinate-dense region. Which of these ways is used to determine the coordinate-dense region can be chosen according to the required accuracy.
The standard address coordinate determining unit 640 is configured to take the coordinates in the coordinate-dense region as standard address coordinates.
In this embodiment, the predetermined region is divided into multiple intervals, a coordinate-dense region is identified according to the coordinate density of each interval, and the coordinates in the coordinate-dense region are taken as standard address coordinates. This makes the screening of addresses and coordinates more accurate and faster, and provides a better data basis for address services.
Fig. 7 is a structural schematic diagram of another embodiment of the address data processing device of the present invention. The device includes an address data coordinate acquiring unit 710, a longitude and latitude extreme value determining unit 720, an area division unit 730, an interval coordinate density determining unit 740, a coordinate-dense region determining unit 750, a standard address coordinate determining unit 760 and a standard address storage unit 770.
The address data coordinate acquiring unit 710 is configured to obtain the latitude and longitude coordinates corresponding to the address data in the predetermined area.
The longitude and latitude extreme value determining unit 720 is configured to determine the maximum and minimum longitude and the maximum and minimum latitude corresponding to the address data in the predetermined area.
The area division unit 730 is configured to divide the region bounded by the maximum and minimum longitude and the maximum and minimum latitude into multiple intervals. For example, the region is divided into intervals of 100 m × 100 m, 200 m × 200 m, or the like.
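The division of the bounding region into fixed-size intervals can be sketched as follows. This is only an illustrative sketch, not the implementation of the embodiment: the names `Grid` and `build_grid`, the constant `METERS_PER_DEG_LAT`, and the approximate conversion from meters to degrees are all assumptions introduced here.

```python
import math
from dataclasses import dataclass

# Approximate length of one degree of latitude in meters (assumption of this sketch).
METERS_PER_DEG_LAT = 111_320.0


@dataclass(frozen=True)
class Grid:
    min_lon: float
    min_lat: float
    lon_step: float  # interval width in degrees of longitude
    lat_step: float  # interval height in degrees of latitude
    cols: int
    rows: int


def build_grid(points, cell_m=100.0):
    """Divide the bounding box of (lon, lat) points into intervals of roughly cell_m x cell_m meters."""
    lons = [p[0] for p in points]
    lats = [p[1] for p in points]
    min_lon, max_lon = min(lons), max(lons)
    min_lat, max_lat = min(lats), max(lats)
    # A degree of longitude shrinks with latitude, so scale by the cosine of the mid latitude.
    mid_lat = math.radians((min_lat + max_lat) / 2.0)
    lat_step = cell_m / METERS_PER_DEG_LAT
    lon_step = cell_m / (METERS_PER_DEG_LAT * math.cos(mid_lat))
    cols = max(1, math.ceil((max_lon - min_lon) / lon_step))
    rows = max(1, math.ceil((max_lat - min_lat) / lat_step))
    return Grid(min_lon, min_lat, lon_step, lat_step, cols, rows)
```

Calling `build_grid(points, cell_m=200.0)` would give the 200 m × 200 m variant mentioned above.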
The interval coordinate density determining unit 740 is configured to determine the coordinate density of each interval according to the number of coordinates corresponding to each interval. The longitude and latitude of each coordinate point determine which interval the point falls in; counting the points gives the number of coordinates in each interval, from which the coordinate density of each interval is obtained.
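A minimal sketch of this counting step is given below. The function names `cell_of` and `count_per_cell` and their parameters are hypothetical. Because all intervals are assumed to have the same size, the count per interval is used directly as a stand-in for its density.

```python
from collections import Counter


def cell_of(lon, lat, min_lon, min_lat, lon_step, lat_step, cols, rows):
    """Return the (col, row) index of the interval that contains the point."""
    # Points on the maximum edge are clamped into the last interval.
    col = min(int((lon - min_lon) / lon_step), cols - 1)
    row = min(int((lat - min_lat) / lat_step), rows - 1)
    return col, row


def count_per_cell(points, min_lon, min_lat, lon_step, lat_step, cols, rows):
    """Count how many (lon, lat) points fall in each interval."""
    counts = Counter()
    for lon, lat in points:
        counts[cell_of(lon, lat, min_lon, min_lat, lon_step, lat_step, cols, rows)] += 1
    return counts
```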
The coordinate-dense region determining unit 750 is configured to take the interval with the maximum coordinate density as the coordinate-dense region. For example, if the interval with the maximum coordinate density covers 500 m × 500 m, and exactly one placement point needs to be set within a range of 500 m × 500 m, that region can be taken as the coordinate-dense region.
The standard address coordinate determining unit 760 is configured to take the coordinates in the coordinate-dense region as standard address coordinates.
The standard address storage unit 770 stores the coordinates in the coordinate-dense region to a big data platform according to the structure of a standard address library.
In another embodiment of the present invention, the coordinate-dense region determining unit 750 can also judge whether the interval with the maximum coordinate density contains a predetermined ratio of the coordinate points in the predetermined area; if it does, the interval with the maximum coordinate density is taken as the coordinate-dense region. For example, if 60% of the coordinate points in the predetermined area fall in the interval with the maximum coordinate density, that interval is taken as the coordinate-dense region.
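The two selection rules above (maximum density alone, and maximum density subject to a predetermined ratio) can be illustrated as follows. The function names and the default ratio of 0.6 are assumptions made for this sketch; `counts` is assumed to map each interval to its number of coordinates.

```python
def densest_interval(counts):
    """Return (interval, count) for the interval holding the most coordinates."""
    return max(counts.items(), key=lambda item: item[1])


def dense_interval_if_ratio_met(counts, ratio=0.6):
    """Return the densest interval only if it holds at least `ratio` of all coordinates, else None."""
    total = sum(counts.values())
    if total == 0:
        return None
    interval, n = densest_interval(counts)
    return interval if n / total >= ratio else None
```

With 7 of 10 points in one interval, the interval is accepted at a ratio of 0.6 and rejected at a ratio of 0.8.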
In the above embodiment, the coordinate-dense region can be determined by aggregating coordinates, and the coordinates in the coordinate-dense region are stored to the big data platform according to the structure of the standard address library, so that subsequent screening of addresses and coordinates is more accurate and faster.
In another embodiment of the present invention, as shown in Fig. 8, the device includes an address data coordinate acquiring unit 810, a coordinate preprocessing unit 820, a longitude and latitude extreme value determining unit 830, an area division unit 840, an interval coordinate density determining unit 850, an interval set determining unit 860, a coordinate-dense region determining unit 870, a standard address coordinate determining unit 880 and a standard address storage unit 890. Each unit can process the address data in the predetermined area based on the MapReduce programming framework; that is, MapReduce is used to improve the efficiency of coordinate aggregation and data cleansing.
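How interval counting fits the map and reduce pattern can be illustrated with a small in-memory simulation. This sketch does not use any Hadoop or MapReduce library API; `map_phase`, `reduce_phase` and `run_job` are hypothetical names, and the single `step` in degrees is a simplification.

```python
from collections import defaultdict


def map_phase(record, min_lon, min_lat, step):
    """Map: emit ((col, row), 1) for one (lon, lat) record."""
    lon, lat = record
    key = (int((lon - min_lon) / step), int((lat - min_lat) / step))
    yield key, 1


def reduce_phase(key, values):
    """Reduce: sum the counts emitted for one interval."""
    return key, sum(values)


def run_job(records, min_lon, min_lat, step):
    """Run map, group by key (standing in for the shuffle), then reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in map_phase(record, min_lon, min_lat, step):
            groups[key].append(value)
    return dict(reduce_phase(k, v) for k, v in groups.items())
```

Because each record is mapped independently and each interval is reduced independently, both phases could in principle be distributed across workers.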
The address data coordinate acquiring unit 810 is configured to obtain the latitude and longitude coordinates corresponding to the address data in the predetermined area.
The coordinate preprocessing unit 820 is configured to preprocess the latitude and longitude coordinates corresponding to the address data in the predetermined area and remove invalid latitude and longitude coordinates. For example, data in which the address obviously does not match the longitude and latitude can be removed.
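The embodiment does not specify how invalid coordinates are detected. As one possible sketch, coordinates can be dropped when they are not finite, lie outside the valid longitude and latitude ranges, or fall outside an optional bounding box of the predetermined area. The function `remove_invalid` and its `bbox` parameter are assumptions of this example.

```python
import math


def remove_invalid(points, bbox=None):
    """Drop (lon, lat) points that are non-finite, out of range, or outside an optional bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat); it is an assumption of this sketch.
    """
    valid = []
    for lon, lat in points:
        if not (math.isfinite(lon) and math.isfinite(lat)):
            continue
        if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
            continue
        if bbox is not None:
            min_lon, min_lat, max_lon, max_lat = bbox
            if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
                continue
        valid.append((lon, lat))
    return valid
```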
The longitude and latitude extreme value determining unit 830 is configured to determine the maximum and minimum longitude and the maximum and minimum latitude corresponding to the valid address data in the predetermined area.
The area division unit 840 is configured to divide the region bounded by the maximum and minimum longitude and the maximum and minimum latitude into multiple intervals.
The interval coordinate density determining unit 850 is configured to determine the coordinate density of each interval according to the number of coordinates corresponding to each interval.
The interval set determining unit 860 is configured to obtain the interval with the maximum coordinate density, and to recursively obtain, among the intervals adjacent to it, the interval with the maximum coordinate density.
The coordinate-dense region determining unit 870 is configured to take the set consisting of the interval with the maximum coordinate density and the adjacent interval with the maximum coordinate density as the coordinate-dense region if a predetermined ratio of the coordinate points in the predetermined area fall within a predetermined area range. For example, if 60% of the coordinate points corresponding to a certain street fall within a range of 200 m × 200 m, the set consisting of the interval with the maximum coordinate density and the adjacent interval with the maximum coordinate density is taken as the coordinate-dense region.
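One way to read the recursive step is as a greedy region-growing procedure: start from the densest interval, repeatedly add the densest neighboring interval, and stop once the set holds the predetermined ratio of points or exceeds the predetermined area. The sketch below follows that reading, which is an interpretation rather than the embodiment's stated algorithm. It uses a loop instead of recursion, treats the eight surrounding intervals as adjacent, and expresses the predetermined area as a hypothetical `max_cells` limit.

```python
def grow_dense_region(counts, ratio=0.6, max_cells=4):
    """Grow a set of intervals from the densest one; return it, or None if the ratio is not reached."""
    total = sum(counts.values())
    if total == 0:
        return None
    start = max(counts, key=counts.get)
    region = {start}
    covered = counts[start]
    while covered / total < ratio:
        if len(region) >= max_cells:
            return None  # the predetermined area would be exceeded
        # Candidate intervals: the 8-neighbors of the current set that hold coordinates.
        neighbors = {
            (c + dc, r + dr)
            for (c, r) in region
            for dc in (-1, 0, 1)
            for dr in (-1, 0, 1)
        } - region
        frontier = sorted(cell for cell in neighbors if counts.get(cell, 0) > 0)
        if not frontier:
            return None  # no adjacent interval left to add
        best = max(frontier, key=lambda cell: counts[cell])
        region.add(best)
        covered += counts[best]
    return region
```

With 200 m × 200 m as the predetermined area and 100 m × 100 m intervals, `max_cells=4` would be a natural choice, although a cell count does not by itself constrain the shape of the set.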
The standard address coordinate determining unit 880 is configured to take the coordinates in the coordinate-dense region as standard address coordinates.
The standard address storage unit 890 is configured to store the coordinates in the coordinate-dense region to the big data platform according to the structure of the standard address library. For example, valid data whose coordinate error relative to the actual address is within 200 meters is found in the mass data and stored to the big data platform according to the structure of the standard address library.
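The 200-meter error criterion can be illustrated with a great-circle distance check. The embodiment does not state how the error is measured or where the reference coordinate comes from, so the haversine formula, the `reference` argument and the function names below are assumptions of this sketch.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in meters


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (lon, lat) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_error(points, reference, max_error_m=200.0):
    """Keep the points that lie within max_error_m meters of the reference coordinate."""
    ref_lon, ref_lat = reference
    return [p for p in points if haversine_m(p[0], p[1], ref_lon, ref_lat) <= max_error_m]
```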
In the above embodiment, the coordinate-dense region is obtained by a coordinate aggregation algorithm, and the coordinates of the coordinate-dense region are stored to the big data platform according to the structure of the standard address library, which provides a data basis for address services faster and more accurately.
Fig. 9 is a structural schematic diagram of another embodiment of the address data processing device of the present invention. The device includes a memory 910 and a processor 920.
The memory 910 can be a magnetic disk, a flash memory or any other non-volatile storage medium. The memory is used to store the instructions of the embodiments corresponding to Figs. 1-5.
The processor 920 is coupled to the memory 910 and can be implemented as one or more integrated circuits, such as a microprocessor or a microcontroller. The processor 920 is used to execute the instructions stored in the memory. Because the coordinates in the coordinate-dense region are stored according to the structure of the standard address library, screening addresses and coordinates is more accurate and faster, which provides a better data basis for address services.
In one embodiment, as shown in Fig. 10, the address data processing device 1000 includes a memory 1010 and a processor 1020. The processor 1020 is coupled to the memory 1010 through a bus 1030. The address data processing device 1000 can also be connected to an external storage device 1050 through a storage interface 1040 in order to call external data, and can be connected to a network or another computer system (not shown) through a network interface 1060. These are not described in detail here.
In this embodiment, the coordinate-dense region is identified according to the latitude and longitude coordinates corresponding to the address data in the predetermined area, and the coordinates in the coordinate-dense region are stored according to the structure of the standard address library, so that screening addresses and coordinates is more accurate and faster, which provides a better data basis for address services.
In another embodiment, a computer-readable storage medium stores computer program instructions which, when executed by a processor, implement the steps of the method in the embodiments corresponding to Figs. 1-5.
Those skilled in the art should understand that embodiments of the present invention can be provided as a method, a device or a computer program product. Therefore, the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention can take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operation steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The present invention has now been described in detail. To avoid obscuring the concept of the present invention, some details well known in the art are not described. From the above description, those skilled in the art can fully understand how to implement the technical solutions disclosed herein.
The method and device of the present invention may be implemented in many ways, for example by software, hardware, firmware, or any combination of software, hardware and firmware. The above order of the steps of the method is for illustration only, and the steps of the method of the present invention are not limited to the order specifically described above unless otherwise specified. In addition, in some embodiments, the present invention can also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the method according to the present invention. Thus, the present invention also covers a recording medium storing a program for executing the method according to the present invention.
Although some specific embodiments of the present invention have been described in detail by way of example, those skilled in the art should understand that the above examples are for illustration only and are not intended to limit the scope of the present invention. Those skilled in the art should understand that the above embodiments can be modified without departing from the scope and spirit of the present invention. The scope of the present invention is defined by the appended claims.

Claims (18)

1. An address data processing method, comprising:
obtaining latitude and longitude coordinates corresponding to address data in a predetermined area;
dividing the predetermined area into multiple intervals according to the latitude and longitude coordinates corresponding to the address data in the predetermined area;
identifying a coordinate-dense region according to the coordinate density of each interval; and
taking the coordinates in the coordinate-dense region as standard address coordinates.
2. The method according to claim 1, further comprising:
storing the coordinates in the coordinate-dense region according to the structure of a standard address library.
3. The method according to claim 2, wherein dividing the predetermined area into multiple intervals according to the latitude and longitude coordinates corresponding to the address data in the predetermined area comprises:
determining the maximum and minimum longitude and the maximum and minimum latitude corresponding to the address data in the predetermined area; and
dividing the region bounded by the maximum and minimum longitude and the maximum and minimum latitude into multiple intervals.
4. The method according to claim 3, wherein identifying the coordinate-dense region according to the coordinate density of each interval comprises:
determining the coordinate density of each interval according to the number of coordinates corresponding to each interval; and
taking the interval with the maximum coordinate density as the coordinate-dense region.
5. The method according to claim 3, wherein identifying the coordinate-dense region according to the coordinate density of each interval comprises:
determining the coordinate density of each interval according to the number of coordinates corresponding to each interval; and
if the interval with the maximum coordinate density contains a predetermined ratio of the coordinate points in the predetermined area, taking the interval with the maximum coordinate density as the coordinate-dense region.
6. The method according to claim 4 or 5, further comprising:
recursively obtaining, among the intervals adjacent to the interval with the maximum coordinate density, the interval with the maximum coordinate density;
judging whether a predetermined ratio of the coordinate points in the predetermined area fall within a predetermined area range; and
if the predetermined ratio of the coordinate points in the predetermined area fall within the predetermined area range, taking the set consisting of the interval with the maximum coordinate density and the adjacent interval with the maximum coordinate density as the coordinate-dense region.
7. The method according to any one of claims 1-5, further comprising:
preprocessing the latitude and longitude coordinates corresponding to the address data in the predetermined area, and removing invalid latitude and longitude coordinates.
8. The method according to any one of claims 1-5, further comprising:
processing the address data in the predetermined area based on the MapReduce programming framework.
9. An address data processing device, comprising:
an address data coordinate acquiring unit, configured to obtain latitude and longitude coordinates corresponding to address data in a predetermined area;
an area division unit, configured to divide the predetermined area into multiple intervals according to the latitude and longitude coordinates corresponding to the address data in the predetermined area;
a coordinate-dense region determining unit, configured to identify a coordinate-dense region according to the coordinate density of each interval; and
a standard address coordinate determining unit, configured to take the coordinates in the coordinate-dense region as standard address coordinates.
10. The device according to claim 9, further comprising a standard address storage unit, the standard address storage unit being configured to store the coordinates in the coordinate-dense region according to the structure of a standard address library.
11. The device according to claim 10, further comprising a longitude and latitude extreme value determining unit, the longitude and latitude extreme value determining unit being configured to determine the maximum and minimum longitude and the maximum and minimum latitude corresponding to the address data in the predetermined area;
wherein the area division unit is configured to divide the region bounded by the maximum and minimum longitude and the maximum and minimum latitude into multiple intervals.
12. The device according to claim 11, further comprising an interval coordinate density determining unit, the interval coordinate density determining unit being configured to determine the coordinate density of each interval according to the number of coordinates corresponding to each interval;
wherein the coordinate-dense region determining unit is configured to take the interval with the maximum coordinate density as the coordinate-dense region.
13. The device according to claim 11, further comprising an interval coordinate density determining unit, the interval coordinate density determining unit being configured to determine the coordinate density of each interval according to the number of coordinates corresponding to each interval;
wherein the coordinate-dense region determining unit is further configured to take the interval with the maximum coordinate density as the coordinate-dense region if the interval with the maximum coordinate density contains a predetermined ratio of the coordinate points in the predetermined area.
14. The device according to claim 12 or 13, further comprising an interval set determining unit, the interval set determining unit being configured to recursively obtain, among the intervals adjacent to the interval with the maximum coordinate density, the interval with the maximum coordinate density;
wherein the coordinate-dense region determining unit is further configured to take the set consisting of the interval with the maximum coordinate density and the adjacent interval with the maximum coordinate density as the coordinate-dense region if a predetermined ratio of the coordinate points in the predetermined area fall within a predetermined area range.
15. The device according to any one of claims 9-13, further comprising a coordinate preprocessing unit, the coordinate preprocessing unit being configured to preprocess the latitude and longitude coordinates corresponding to the address data in the predetermined area and remove invalid latitude and longitude coordinates.
16. The device according to any one of claims 9-13, wherein the address data coordinate acquiring unit, the coordinate-dense region determining unit and the standard address coordinate determining unit process the address data in the predetermined area based on the MapReduce programming framework.
17. An address data processing device, comprising:
a memory; and
a processor coupled to the memory, the processor being configured to perform the method according to any one of claims 1 to 8 based on instructions stored in the memory.
18. A computer-readable storage medium storing computer program instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 8.
CN201710141485.XA 2017-03-10 2017-03-10 Address date treating method and apparatus Pending CN106934015A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710141485.XA CN106934015A (en) 2017-03-10 2017-03-10 Address date treating method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710141485.XA CN106934015A (en) 2017-03-10 2017-03-10 Address date treating method and apparatus

Publications (1)

Publication Number Publication Date
CN106934015A true CN106934015A (en) 2017-07-07

Family

ID=59432017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710141485.XA Pending CN106934015A (en) 2017-03-10 2017-03-10 Address date treating method and apparatus

Country Status (1)

Country Link
CN (1) CN106934015A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635063A (en) * 2018-12-06 2019-04-16 拉扎斯网络科技(上海)有限公司 Information processing method and device for address library, electronic equipment and storage medium
CN110046343A (en) * 2019-03-01 2019-07-23 江苏横云智慧科技有限公司 Non-standard address conversion is the method that canonical address and canonical address encode
CN111581471A (en) * 2020-05-09 2020-08-25 北京京东振世信息技术有限公司 Regional vehicle checking method, device, server and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103403734A (en) * 2011-03-21 2013-11-20 亚马逊技术股份有限公司 Courier management
CN104463516A (en) * 2013-09-18 2015-03-25 Sap欧洲公司 Order/vehicle distribution based on order density
CN105160021A (en) * 2015-09-29 2015-12-16 滴滴(中国)科技有限公司 Destination preference based order distribution method and apparatus
CN105654143A (en) * 2016-01-28 2016-06-08 北京京东尚科信息技术有限公司 Method and device for identifying point density

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
叶海波: "《城市地址编码的技术及应用》", 《中国优秀硕士学位论文全文数据库 基础科学辑》 *
檀竹隔: "《快递自提柜投放选址问题研究》", 《中国优秀硕士学位论文全文数据库 经济与管理科学辑》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109635063A (en) * 2018-12-06 2019-04-16 拉扎斯网络科技(上海)有限公司 Information processing method and device for address library, electronic equipment and storage medium
CN110046343A (en) * 2019-03-01 2019-07-23 江苏横云智慧科技有限公司 Non-standard address conversion is the method that canonical address and canonical address encode
CN111581471A (en) * 2020-05-09 2020-08-25 北京京东振世信息技术有限公司 Regional vehicle checking method, device, server and medium
CN111581471B (en) * 2020-05-09 2023-11-10 北京京东振世信息技术有限公司 Regional vehicle checking method, device, server and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20190708

Address after: 100086 6th Floor, Zhichun Road, Haidian District, Beijing

Applicant after: Beijing Jingdong Zhenshi Information Technology Co.,Ltd.

Address before: 100080 First Floor 101, No. 2 Building, No. 20 Courtyard, Suzhou Street, Haidian District, Beijing

Applicant before: Beijing Jingbangda Trading Co.,Ltd.

Effective date of registration: 20190708

Address after: 100080 First Floor 101, No. 2 Building, No. 20 Courtyard, Suzhou Street, Haidian District, Beijing

Applicant after: Beijing Jingbangda Trading Co.,Ltd.

Address before: 100195 Beijing Haidian Xingshikou Road 65 West Cedar Creative Garden 4 District 11 Building East 1-4 Floor West 1-4 Floor

Applicant before: BEIJING JINGDONG SHANGKE INFORMATION TECHNOLOGY Co.,Ltd.

Applicant before: BEIJING JINGDONG CENTURY TRADING Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170707