Detailed Description of the Embodiments
It should be noted that, where no conflict arises, the embodiments of the present application and the features in those embodiments may be combined with one another. The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
An embodiment of the present invention provides a processing device for IP data sources. The device implements its functions by means of computer equipment.
Fig. 1 is a schematic structural diagram of a processing device for IP data sources according to a first embodiment of the present invention. As shown in Fig. 1, the processing device comprises a first acquiring unit 10, a generation unit 20, a mapping unit 30 and a splitting unit 40.
The first acquiring unit 10 is configured to obtain a plurality of IP data sources, each of which comprises a plurality of first IP sections. An IP data source may also be called an IP database. Given an IP address, an IP database can be used to determine the attribution information corresponding to that address, which comprises the region or the operator to which the address belongs. For example, once an IP address A has been obtained, its geographic location can be determined from an IP data source or IP database. It should be noted that a first IP section here is any IP section in an IP data source and does not refer to the IP section that comes first in a particular IP data source. The word "first" serves only to distinguish these sections from the IP sections obtained after the IP data sources are repartitioned, and does not unduly limit the present invention.
Any IP data source comprises a plurality of first IP sections; that is, each IP data source comprises several first IP sections, and each first IP section comprises a contiguous range of IP addresses. An IP data source may be obtained by web crawling or purchased from an operator. There may be two IP data sources, or more than two.
The generation unit 20 is configured to generate an IP number axis that contains all IP addresses. IP addresses form a countable set ranging from 0.0.0.0 to 255.255.255.255. As shown in Fig. 2, an IP number axis is generated from 0.0.0.0 to 255.255.255.255; the arrow in the figure indicates the direction in which the IP addresses are ordered, and all IP addresses in the set are arranged on the axis in that order. Because the set is countable, the IP number axis can be presented as a data table in a database. For example, each IP address is converted into integer data, and the integer data are inserted into a data table in order and without duplicates; that data table can then be called the IP number axis. It should be noted that the IP number axis may also be called an IP lookup table, an IP data table, or any other name denoting an ordered collection of all IP addresses. The term "IP number axis" does not unduly limit the present invention.
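As a minimal sketch of the integer conversion described above, the following Python snippet uses the standard-library `ipaddress` module to place IPv4 addresses on an integer axis. The helper names `ip_to_int` and `int_to_ip` and the constants `AXIS_START` and `AXIS_END` are illustrative assumptions, not names from this document, and the full axis is represented only by its two end values rather than materialized.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its integer position on the axis."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value: int) -> str:
    """Convert an integer position on the axis back to dotted-quad form."""
    return str(ipaddress.IPv4Address(value))


# End values of the axis (hypothetical constant names).
AXIS_START = ip_to_int("0.0.0.0")
AXIS_END = ip_to_int("255.255.255.255")

# A set removes duplicates; sorting the integers orders the addresses on the axis.
addresses = ["10.0.0.5", "1.0.0.0", "10.0.0.5"]
axis_positions = sorted({ip_to_int(a) for a in addresses})
print([int_to_ip(v) for v in axis_positions])  # ['1.0.0.0', '10.0.0.5']
```

Working with integers means that ordering and comparison of addresses reduce to ordinary integer ordering and comparison.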
The mapping unit 30 is configured to map the start point and the end point of each first IP section in the plurality of IP data sources onto the IP number axis. Because the IP number axis contains all IP addresses, every first IP section has a corresponding section on the axis. The start point and the end point of each first IP section in each IP data source are mapped to the corresponding IP addresses on the axis; when the IP number axis is an IP data table containing all IP addresses, they are mapped to the corresponding IP addresses in that table. Since the axis contains all IP addresses, the start point or end point of any first IP section in any IP data source always has a corresponding IP address on the axis.
The splitting unit 40 is configured to divide the IP data sources into a plurality of second IP sections according to the start points and end points on the IP number axis. A second IP section is an IP section obtained after the IP data sources are repartitioned. It should be noted that "second" plays a role similar to "first" in "first IP section" above: it distinguishes these sections from the IP sections that existed before repartitioning. A second IP section may be called a new IP section and a first IP section an initial IP section; the word "second" does not unduly limit the present invention. Dividing the IP data sources into a plurality of second IP sections means repartitioning the IP sections in all of the obtained IP data sources so that every IP data source is divided in the same way. The first IP sections of the plurality of IP data sources are all mapped onto the IP number axis, and each first IP section comprises a start point, an end point and the IP addresses between them, so the IP data sources can be repartitioned according to the start points and end points of the first IP sections. For example, IP data source A comprises three IP sections and IP data source B comprises two, each section having one start point and one end point. If no start point or end point is repeated within IP data source A, within IP data source B, or between the two, then taking all start points and end points as boundary points yields 10 boundary points in total, and IP data source A and IP data source B are repartitioned according to these 10 boundary points. This division may also be called splitting, that is, splitting the IP data sources into IP sections that differ from the previous ones.
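The count in the example above can be checked with a short Python sketch. The integer positions below are made-up stand-ins for IP addresses on the axis, and `boundary_points` is a hypothetical helper name; the sketch only assumes that each IP section is given as a (start, end) pair.

```python
def boundary_points(*sources):
    """Collect every start and end point; the set merges repeated points."""
    points = set()
    for sections in sources:
        for start, end in sections:
            points.update((start, end))
    return sorted(points)


# Hypothetical integer positions on the axis, one (start, end) pair per section.
source_a = [(10, 20), (30, 40), (50, 60)]  # three sections
source_b = [(15, 25), (45, 55)]            # two sections

points = boundary_points(source_a, source_b)
print(len(points))  # 10
```

Three sections contribute six points and two sections contribute four, so ten boundary points result when none of them coincide.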
According to this embodiment of the present invention, an IP number axis is generated, the IP sections in the plurality of IP data sources are mapped onto the corresponding sections of the IP number axis, and the original IP data sources are repartitioned according to the IP sections on the axis. Because the same division standard is applied to every IP data source on the IP number axis, the repartitioned IP data sources are divided into IP sections consistently. This solves the prior-art problem that different IP data sources divide their IP sections inconsistently, and makes it convenient to compare IP sections across IP data sources so that the attribution of an identical IP section can be determined from several different IP data sources.
Fig. 3 is a schematic structural diagram of a processing device for IP data sources according to a second embodiment of the present invention. The device of this embodiment may serve as a preferred implementation of the device of the above embodiment. As shown in Fig. 3, the processing device comprises a first acquiring unit 10, a generation unit 20, a mapping unit 30 and a splitting unit 40, wherein the splitting unit 40 comprises a first generation module 401 and a first splitting module 402. The first acquiring unit 10, generation unit 20 and mapping unit 30 of this embodiment have the same functions as those shown in Fig. 1 and are not described again here.
The first generation module 401 is configured to take the start points and end points mapped onto the IP number axis as boundary points. The boundary points are used to divide the IP data sources. Because the IP addresses of all start points and end points have been mapped to the corresponding IP addresses on the IP number axis, the axis carries the corresponding start points and end points, and these are taken as boundary points. If start points coincide with one another, end points coincide with one another, or a start point coincides with an end point, all coinciding points are treated as a single boundary point.
The first splitting module 402 is configured to take the IP address field between each two adjacent boundary points on the IP number axis as a second IP section, so as to divide the IP data sources into a plurality of second IP sections. Each pair of adjacent boundary points encloses an IP address field, that is, an IP section; taking this IP address field as a second IP section, that is, a new IP section, repartitions each IP data source.
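A small Python sketch of pairing adjacent boundary points follows. The text does not state whether the boundary points themselves belong to the field on their left or on their right, so the sketch simply represents each IP address field by its two boundary points; the function name `adjacent_fields` and the integer positions are assumptions made for illustration.

```python
def adjacent_fields(points):
    """Pair each boundary point with its successor on the axis."""
    ordered = sorted(set(points))  # coinciding points count as one boundary point
    return list(zip(ordered, ordered[1:]))


print(adjacent_fields([10, 30, 20, 20, 40]))
# [(10, 20), (20, 30), (30, 40)]
```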
Preferably, the first splitting module comprises a comparison submodule, a judging submodule and a generation submodule.
The comparison submodule is configured to compare the IP address fields with the first IP sections. The first IP sections in the IP data sources do not necessarily cover all IP addresses, so an IP address field obtained by dividing at the boundary points may not be present in any IP data source. Each IP address field between two adjacent boundary points is therefore compared with each first IP section in the original IP data sources in order to judge whether the IP address field is present in the original IP data sources.
The judging submodule is configured to judge whether an IP address field on the IP number axis is present in any first IP section. Based on the result of comparing the IP address field with the first IP sections, it judges whether the IP address field is present in any first IP section of any IP data source. This judgment traces the split IP address field back to its origin and confirms whether it is present in the original IP data sources. For example, if two adjacent boundary points are the end point of one first IP section and the start point of another first IP section in the original IP data sources, and the IP address field between them does not belong to any IP data source, that IP address field cannot be taken as a second IP section, that is, as a new IP section produced by the split.
The generation submodule is configured to take the IP address field on the IP number axis as a second IP section if the IP address field is present in any first IP section. Only when the IP address field is present in a first IP section is it taken as a second IP section, that is, a new IP section. If the IP address field is not present in any first IP section, the original IP data sources do not contain that range of IP addresses; the IP address field is then not taken as a second IP section, and the judgment continues with the next IP address field.
According to this embodiment of the present invention, the IP address fields obtained by the division are compared with the IP sections in the IP data sources, and IP address fields that are not present in the original IP data sources are rejected, so that the repartitioned IP data sources contain the same overall set of IP addresses as the IP data sources before the division.
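The compare, judge and generate steps can be sketched in Python as below. This is only one possible reading: an IP address field is taken to be "present in" a first IP section when both of its boundary points lie within that section, and the names `in_any_first_section` and `second_sections` as well as the integer positions are hypothetical.

```python
def in_any_first_section(field, sources):
    """Judge whether the field lies inside some first IP section of some source."""
    lo, hi = field
    return any(start <= lo and hi <= end
               for sections in sources
               for start, end in sections)


def second_sections(sources):
    """Split at all boundary points, then keep only fields found in a source."""
    points = sorted({p for sections in sources
                     for section in sections
                     for p in section})
    fields = zip(points, points[1:])
    return [f for f in fields if in_any_first_section(f, sources)]


source_a = [(10, 20), (30, 40)]  # leaves a gap between 20 and 30
source_b = [(12, 18)]
print(second_sections([source_a, source_b]))
# [(10, 12), (12, 18), (18, 20), (30, 40)]
```

The field (20, 30) lies between the end point of one first IP section and the start point of another and is covered by no source, so it is not kept as a second IP section.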
Preferably, the processing device for IP data sources further comprises a second acquiring unit and a determining unit.
The second acquiring unit is configured to obtain, after the IP address field between two adjacent boundary points on the IP number axis has been taken as a second IP section, the attribution information that the IP data source associates with each first IP section. The attribution information comprises geographic information and operator information. A first IP section is any IP section in any IP data source, and attribution information is obtained for each of them; it may be geographic information or operator information. An operator may also be called an ISP (Internet Service Provider). For example, the geographic information corresponding to a first IP section B in a certain IP data source is Beijing.
The determining unit is configured to determine the attribution information of the second IP sections according to the attribution information of the first IP sections. Because the second IP sections are obtained by repartitioning the first IP sections of the plurality of IP data sources, the attribution information of a second IP section can be determined from that of the first IP sections. In addition, a second IP section may be present in several IP data sources, so a second IP section may carry several different items of geographic information or operator information. For example, the geographic information corresponding to first IP section C is Tianjin and that corresponding to first IP section D is Tangshan City, where C and D belong to different IP data sources and overlap each other. After processing by the method of this embodiment, first IP section C is split into second IP section X and second IP section Y, and first IP section D is split into second IP section X and second IP section Z. Second IP section X is thus present in both first IP section C and first IP section D, so its geographic information may be Tianjin or may be Tangshan City; all candidate geographic information can be marked on it so that its attribution can be determined further. The geographic information corresponding to second IP section Y is Tianjin, and that corresponding to second IP section Z is Tangshan City.
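The Tianjin and Tangshan City example can be reproduced with a brief Python sketch. The integer positions chosen for sections C and D are invented for illustration, `attribute_second_sections` is a hypothetical name, and the sketch merely lists every candidate for each second IP section; how the final attribution is chosen among candidates is not described here and is not modelled.

```python
def attribute_second_sections(second_sections, sources):
    """List, for each second IP section, the information of every first
    IP section that contains it."""
    labelled = {}
    for lo, hi in second_sections:
        candidates = []
        for sections in sources:
            for start, end, info in sections:
                if start <= lo and hi <= end and info not in candidates:
                    candidates.append(info)
        labelled[(lo, hi)] = candidates
    return labelled


source_1 = [(10, 30, "Tianjin")]         # first IP section C
source_2 = [(20, 40, "Tangshan City")]   # first IP section D, overlapping C
second = [(10, 20), (20, 30), (30, 40)]  # second IP sections Y, X, Z

result = attribute_second_sections(second, [source_1, source_2])
print(result[(20, 30)])  # ['Tianjin', 'Tangshan City']
```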
Fig. 4 is a schematic structural diagram of a processing device for IP data sources according to a third embodiment of the present invention. The device of this embodiment may serve as a preferred implementation of the device of the above embodiment. As shown in Fig. 4, the processing device comprises a first acquiring unit 10, a generation unit 20, a mapping unit 30 and a splitting unit 40, wherein the generation unit 20 comprises a conversion module 201, an insertion module 202, a sorting module 203 and a second generation module 204; the mapping unit 30 comprises an acquisition module 301 and a mapping module 302; and the splitting unit 40 comprises a third generation module 401 and a second splitting module 402. In this embodiment the first acquiring unit 10 has the same function as the first acquiring unit 10 shown in Fig. 1 and is not described again here.
The conversion module 201 is configured to convert IP addresses into integer data, where the IP addresses comprise all IP addresses. Each IP address may be converted into integer data, for example data of the Bigint type (a system data type).
The insertion module 202 is configured to insert the converted IP addresses, without duplicates, into a data table generated in advance. The data table may be created in advance with the database's own functions. The data obtained by converting the IP addresses are inserted into it without duplicates, so that each IP address corresponds to exactly one entry in the data table.
The sorting module 203 is configured to sort the IP addresses in the data table to obtain a split-basis table, which is used to divide the IP data sources. Because the database provides sorting, the data corresponding to the IP addresses in the data table can be sorted with, for example, an Order by statement (a database statement for sorting data). The split-basis table is the sorted data table and can be used to divide the IP data sources.
The second generation module 204 is configured to take the split-basis table as the IP number axis. Because the split-basis table can be used to divide the IP data sources, it is taken as the IP number axis.
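The convert, insert and sort sequence can be illustrated with SQLite through Python's standard `sqlite3` module. This is a sketch under several assumptions: the text names no particular database, the table name `ip_axis` and column name `ip_value` are hypothetical, and only a handful of sample addresses are inserted rather than all IP addresses.

```python
import ipaddress
import sqlite3

conn = sqlite3.connect(":memory:")
# The primary key rejects duplicate values, so each address is stored once.
conn.execute("CREATE TABLE ip_axis (ip_value INTEGER PRIMARY KEY)")

# Convert each address to integer data before inserting it.
addresses = ["10.0.0.5", "1.0.0.0", "10.0.0.5", "192.168.1.1"]
rows = [(int(ipaddress.IPv4Address(a)),) for a in addresses]
conn.executemany("INSERT OR IGNORE INTO ip_axis (ip_value) VALUES (?)", rows)

# ORDER BY yields the sorted table used as the basis for splitting.
split_basis = [row[0] for row in
               conn.execute("SELECT ip_value FROM ip_axis ORDER BY ip_value")]
print(split_basis)  # [16777216, 167772165, 3232235777]
```

SQLite's `INTEGER` type holds 64-bit values, which is sufficient for the 32-bit range of IPv4 addresses.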
The acquisition module 301 is configured to obtain the start point and the end point of each first IP section in the plurality of IP data sources. Each first IP section has one start point and one end point, and the points obtained are those of every first IP section in every IP data source. A first IP section comprises a plurality of IP addresses, so the start point and the end point of each first IP section each correspond to one IP address.
The mapping module 302 is configured to map the start points and end points onto the split-basis table. That is, the IP addresses of all start points and end points are mapped to the data that correspond to those IP addresses in the split-basis table.
The third generation module 401 is configured to take the start points and end points mapped onto the split-basis table as boundary points. The boundary points are used to divide the IP data sources. Because the IP addresses of all start points and end points have been mapped to the corresponding IP addresses in the split-basis table, the table carries the corresponding start points and end points, and these are taken as boundary points. If start points coincide with one another, end points coincide with one another, or a start point coincides with an end point, all coinciding points are treated as a single boundary point.
The second splitting module 402 is configured to take the IP address field between each two adjacent boundary points on the split-basis table as a second IP section, so as to divide the IP data sources into a plurality of second IP sections. Each pair of adjacent boundary points encloses an IP address field, that is, an IP section; taking this IP address field as a second IP section, that is, a new IP section, repartitions each IP data source.
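When the first IP sections are themselves stored in a database, the boundary points can be gathered with a single query. The following sketch again uses SQLite via Python's `sqlite3` module as an assumed stand-in for whatever database is actually used; the table `first_sections`, its columns and the sample values are hypothetical. `UNION` removes duplicate rows, which corresponds to treating coinciding points as one boundary point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE first_sections "
    "(source TEXT, start_value INTEGER, end_value INTEGER)"
)
conn.executemany(
    "INSERT INTO first_sections VALUES (?, ?, ?)",
    [("A", 10, 30), ("B", 20, 40), ("B", 40, 50)],
)

# UNION (unlike UNION ALL) discards duplicates, so 40 appears only once.
query = """
    SELECT start_value AS point FROM first_sections
    UNION
    SELECT end_value FROM first_sections
    ORDER BY point
"""
points = [row[0] for row in conn.execute(query)]
fields = list(zip(points, points[1:]))  # adjacent boundary points
print(points)  # [10, 20, 30, 40, 50]
```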
An embodiment of the present invention further provides a processing method for IP data sources. The method may run on computer equipment. It should be noted that the processing method of the embodiments of the present invention may be performed by the processing device provided by the embodiments of the present invention, and that the processing device may likewise be used to perform the processing method provided by the embodiments of the present invention.
Fig. 5 is a flowchart of a processing method for IP data sources according to a first embodiment of the present invention. As shown in Fig. 5, the method comprises the following steps:
Step S101, obtaining a plurality of IP data sources, each of which comprises a plurality of first IP sections. An IP data source may also be called an IP database. Given an IP address, an IP database can be used to determine the attribution information corresponding to that address, which comprises the region or the operator to which the address belongs. For example, once an IP address A has been obtained, its geographic location can be determined from an IP data source or IP database. It should be noted that a first IP section here is any IP section in an IP data source and does not refer to the IP section that comes first in a particular IP data source. The word "first" serves only to distinguish these sections from the IP sections obtained after the IP data sources are repartitioned, and does not unduly limit the present invention.
Any IP data source comprises a plurality of first IP sections; that is, each IP data source comprises several first IP sections, and each first IP section comprises a contiguous range of IP addresses. An IP data source may be obtained by web crawling or purchased from an operator. There may be two IP data sources, or more than two.
Step S102, generating an IP number axis that contains all IP addresses. IP addresses form a countable set ranging from 0.0.0.0 to 255.255.255.255. As shown in Fig. 2, an IP number axis is generated from 0.0.0.0 to 255.255.255.255; the arrow in the figure indicates the direction in which the IP addresses are ordered, and all IP addresses in the set are arranged on the axis in that order. Because the set is countable, the IP number axis can be presented as a data table in a database. For example, each IP address is converted into integer data, and the integer data are inserted into a data table in order and without duplicates; that data table can then be called the IP number axis. It should be noted that the IP number axis may also be called an IP lookup table, an IP data table, or any other name denoting an ordered collection of all IP addresses. The term "IP number axis" does not unduly limit the present invention.
Step S103, mapping the start point and the end point of each first IP section in the plurality of IP data sources onto the corresponding section of the IP number axis. Because the IP number axis contains all IP addresses, every first IP section has a corresponding section on the axis. The start point and the end point of each first IP section in each IP data source are mapped to the corresponding IP addresses on the axis; when the IP number axis is an IP data table containing all IP addresses, they are mapped to the corresponding IP addresses in that table. Since the axis contains all IP addresses, the start point or end point of any first IP section in any IP data source always has a corresponding IP address on the axis.
Step S104, dividing the IP data sources into a plurality of second IP sections according to the start points and end points on the IP number axis. A second IP section is an IP section obtained after the IP data sources are repartitioned. It should be noted that "second" plays a role similar to "first" in "first IP section" above: it distinguishes these sections from the IP sections that existed before repartitioning. A second IP section may be called a new IP section and a first IP section an initial IP section; the word "second" does not unduly limit the present invention. Dividing the IP data sources into a plurality of second IP sections means repartitioning the IP sections in all of the obtained IP data sources so that every IP data source is divided in the same way. The first IP sections of the plurality of IP data sources are all mapped onto the IP number axis, and each first IP section comprises a start point, an end point and the IP addresses between them, so the IP data sources can be repartitioned according to the start points and end points of the first IP sections. For example, IP data source A comprises three IP sections and IP data source B comprises two, each section having one start point and one end point. If no start point or end point is repeated within IP data source A, within IP data source B, or between the two, then taking all start points and end points as boundary points yields 10 boundary points in total, and IP data source A and IP data source B are repartitioned according to these 10 boundary points. This division may also be called splitting, that is, splitting the IP data sources into IP sections that differ from the previous ones.
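Steps S101 to S104 can be strung together in a compact Python sketch that works directly on dotted-quad addresses. It is an illustration under stated assumptions rather than the claimed method itself: the function name `repartition`, the dictionary layout of the sources and the sample addresses are invented, and each resulting section is represented by its two boundary addresses.

```python
import ipaddress


def to_int(ip):
    return int(ipaddress.IPv4Address(ip))


def to_ip(value):
    return str(ipaddress.IPv4Address(value))


def repartition(sources):
    """sources maps a source name to a list of (start_ip, end_ip) sections."""
    # S103: map every start and end point to its integer position on the axis.
    mapped = {name: [(to_int(s), to_int(e)) for s, e in sections]
              for name, sections in sources.items()}
    # Boundary points shared by all sources; the set merges repeated points.
    points = sorted({p for sections in mapped.values()
                     for section in sections for p in section})
    fields = list(zip(points, points[1:]))
    # S104: each source keeps the fields that lie inside one of its sections.
    return {name: [(to_ip(lo), to_ip(hi)) for lo, hi in fields
                   if any(s <= lo and hi <= e for s, e in sections)]
            for name, sections in mapped.items()}


# S101: two hypothetical IP data sources with overlapping sections.
sources = {
    "A": [("1.0.0.0", "1.0.0.255")],
    "B": [("1.0.0.128", "1.0.1.255")],
}
result = repartition(sources)
print(result["A"])  # [('1.0.0.0', '1.0.0.128'), ('1.0.0.128', '1.0.0.255')]
print(result["B"])  # [('1.0.0.128', '1.0.0.255'), ('1.0.0.255', '1.0.1.255')]
```

After repartitioning, the overlapping part appears in both sources as the identical section ('1.0.0.128', '1.0.0.255'), which is what makes a section-by-section comparison between sources possible.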
According to this embodiment of the present invention, an IP number axis is generated, the start points and end points of the IP sections in the plurality of IP data sources are mapped to the corresponding IP addresses on the IP number axis, and the original IP data sources are repartitioned according to the start points and end points on the axis. Because the same division standard is applied to every IP data source on the IP number axis, the repartitioned IP data sources are divided into IP sections consistently. This solves the prior-art problem that different IP data sources divide their IP sections inconsistently, and makes it convenient to compare IP sections across IP data sources so that the attribution of an identical IP section can be determined from several IP data sources.
Fig. 6 is a flowchart of a processing method for IP data sources according to a second embodiment of the present invention. The method of this embodiment may serve as a preferred implementation of the method of the above embodiment. As shown in Fig. 6, the method comprises the following steps:
Step S201, obtaining a plurality of IP data sources, each of which comprises a plurality of first IP sections. An IP data source may also be called an IP database. Given an IP address, an IP database can be used to determine the attribution information corresponding to that address, which comprises the region or the operator to which the address belongs. For example, once an IP address A has been obtained, its geographic location can be determined from an IP data source or IP database. It should be noted that a first IP section here is any IP section in an IP data source and does not refer to the IP section that comes first in a particular IP data source. The word "first" serves only to distinguish these sections from the IP sections obtained after the IP data sources are repartitioned, and does not unduly limit the present invention.
Any IP data source comprises a plurality of first IP sections; that is, each IP data source comprises several first IP sections, and each first IP section comprises a contiguous range of IP addresses. An IP data source may be obtained by web crawling or purchased from an operator. There may be two IP data sources, or more than two.
Step S202, generating an IP number axis that contains all IP addresses. IP addresses form a countable set ranging from 0.0.0.0 to 255.255.255.255. As shown in Fig. 2, an IP number axis is generated from 0.0.0.0 to 255.255.255.255; the arrow in the figure indicates the direction in which the IP addresses are ordered, and all IP addresses in the set are arranged on the axis in that order. Because the set is countable, the IP number axis can be presented as a data table in a database. For example, each IP address is converted into integer data, and the integer data are inserted into a data table in order and without duplicates; that data table can then be called the IP number axis. It should be noted that the IP number axis may also be called an IP lookup table, an IP data table, or any other name denoting an ordered collection of all IP addresses. The term "IP number axis" does not unduly limit the present invention.
Step S203, mapping the start point and the end point of each first IP section in the plurality of IP data sources onto the IP number axis. Because the IP number axis contains all IP addresses, every first IP section has a corresponding section on the axis. Each first IP section has one start point and one end point, and the points obtained are those of every first IP section in every IP data source. A first IP section comprises a plurality of IP addresses, so the start point and the end point of each first IP section each correspond to one IP address. The start point and the end point of each first IP section in each IP data source are mapped to the corresponding IP addresses on the axis; when the IP number axis is an IP data table containing all IP addresses, they are mapped to the corresponding IP addresses in that table. Since the axis contains all IP addresses, the start point or end point of any first IP section in any IP data source always has a corresponding IP address on the axis.
Step S204, taking the start points and end points mapped onto the IP number axis as boundary points. The boundary points are used to divide the IP data sources. Because the IP addresses of all start points and end points have been mapped to the corresponding IP addresses on the IP number axis, the axis carries the corresponding start points and end points, and these are taken as boundary points. If start points coincide with one another, end points coincide with one another, or a start point coincides with an end point, all coinciding points are treated as a single boundary point.
Step S205, taking the IP address field between each two adjacent boundary points on the IP number axis as a second IP section, so as to divide the IP data sources into a plurality of second IP sections. Each pair of adjacent boundary points encloses an IP address field, that is, an IP section; taking this IP address field as a second IP section, that is, a new IP section, repartitions each IP data source.
Preferably, step S205 can comprise the following steps:
Step S2051, comparing the IP address field with the first IP sections. Because the IP sections in the IP data sources do not necessarily cover all IP addresses, an IP address field obtained by dividing at the boundary points may not be present in any IP data source. Each IP address field between two adjacent boundary points is therefore compared with each IP section in the original plurality of IP data sources, so as to judge whether this IP address field is present in the original IP data sources.
Step S2052, judging whether the IP address field on the IP number axis is present in any first IP section. According to the result of comparing the IP address field with the first IP sections, it is judged whether this IP address field is present in any first IP section of any IP data source. This judgment traces the split IP address field back to its origin and confirms whether it is present in the original IP data sources. For example, if two adjacent boundary points are the end point of one IP section in the original IP data sources and the starting point of another IP section, the IP address field between this end point and this starting point does not belong to any IP data source. This IP address field cannot be taken as a second IP section, that is, it cannot be taken as a new IP section after splitting.
Step S2053, if the IP address field is present in any first IP section, taking the IP address field on the IP number axis as a second IP section. Only when the IP address field is present in a first IP section is it taken as a second IP section, that is, a new IP section. If the IP address field is not present in any first IP section, this indicates that this section of IP addresses does not exist in the original plurality of IP data sources. In that case the IP address field is not taken as a second IP section, and the judgment continues with whether the next IP address field is present in any first IP section.
According to the embodiment of the present invention, the IP address fields obtained by division are compared with the IP sections in the IP data sources, and the IP address fields that are not present in the original IP data sources are rejected, so that the IP data sources after repartitioning contain the same total set of IP addresses as the IP data sources before division.
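The comparison and rejection described in steps S2051 to S2053 can be illustrated with a minimal Python sketch. It rests on several assumptions that are not in the text: IP addresses are already represented as integers, each section is an inclusive (start, end) pair, and the names keep_existing_fields, boundary_points and first_sections are hypothetical. As in the description, adjacent IP address fields share their common boundary point.

```python
from typing import List, Tuple

Section = Tuple[int, int]  # (start, end) as integers, both inclusive (assumed)


def keep_existing_fields(boundary_points: List[int],
                         first_sections: List[Section]) -> List[Section]:
    """Return the fields between adjacent boundary points that lie inside
    at least one original (first) IP section."""
    points = sorted(set(boundary_points))
    second_sections = []
    for low, high in zip(points, points[1:]):
        # Steps S2051/S2052: compare the field with every first IP section.
        exists = any(start <= low and high <= end
                     for start, end in first_sections)
        # Step S2053: keep the field only if some first section contains it.
        if exists:
            second_sections.append((low, high))
    return second_sections


# Hypothetical data: no original section covers the gap between 5 and 10.
first_sections = [(1, 5), (10, 20), (15, 30)]
boundary_points = [1, 5, 10, 20, 15, 30]
result = keep_existing_fields(boundary_points, first_sections)
print(result)
```

In this example the field between 5 and 10 lies between the end point of one section and the starting point of another, so it is rejected, while the remaining fields are kept as second IP sections.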
Preferably, after step S205 is executed, the processing method of the IP data source further comprises the following steps:
Step S6, obtaining the attaching information of the IP data source corresponding to the first IP section, the attaching information comprising geographic information and operator information. The first IP section is each IP section in each IP data source, and the attaching information of each IP section is obtained. This attaching information may be geographic information or operator information; an operator may also be called an ISP (Internet Service Provider). For example, the geographic information corresponding to a first IP section B in a certain IP data source is Beijing.
Step S7, determining the attaching information of the second IP section according to the attaching information of the first IP section. Because the second IP sections are obtained by repartitioning the plurality of IP sections in the plurality of IP data sources, the attaching information of a second IP section can be determined from the attaching information of the first IP sections. In addition, a second IP section may be present in several IP data sources, so a second IP section obtained by division may carry several different pieces of geographic information or operator information. For example, the geographic information corresponding to a first IP section C is Tianjin and the geographic information corresponding to a first IP section D is Tangshan City, where the first IP section C and the first IP section D belong to different IP data sources and intersect each other. After processing by the processing method of the IP data source in the embodiment of the present invention, the first IP section C is split into a second IP section X and a second IP section Y, and the first IP section D is split into the second IP section X and a second IP section Z. The second IP section X is thus present both in the first IP section C and in the first IP section D, so the geographic information corresponding to the second IP section X may be Tianjin or may be Tangshan City. All the geographic information that may correspond to it can be marked here, so that its attribution can be determined further. The geographic information corresponding to the second IP section Y is Tianjin, and the geographic information corresponding to the second IP section Z is Tangshan City.
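The Tianjin and Tangshan City example can be sketched in Python as follows. This is only an illustration under stated assumptions: the integer ranges chosen for sections C and D are invented for the example, sections are inclusive (start, end) pairs, and the function name collect_attaching_info is hypothetical. Each second IP section is marked with the information of every first IP section that contains it.

```python
from typing import Dict, List, Set, Tuple

Section = Tuple[int, int]  # (start, end) as integers, both inclusive (assumed)


def collect_attaching_info(
        second_sections: List[Section],
        first_sections: List[Tuple[Section, str]]) -> Dict[Section, Set[str]]:
    """Mark each second IP section with the information of every first
    IP section that contains it."""
    marked: Dict[Section, Set[str]] = {}
    for low, high in second_sections:
        marked[(low, high)] = {
            info
            for (start, end), info in first_sections
            if start <= low and high <= end
        }
    return marked


# Hypothetical ranges: C and D come from different data sources and intersect.
section_c = ((10, 30), "Tianjin")
section_d = ((20, 40), "Tangshan City")
# Y, X and Z as obtained after repartitioning.
section_y, section_x, section_z = (10, 20), (20, 30), (30, 40)

marked = collect_attaching_info([section_y, section_x, section_z],
                                [section_c, section_d])
for section, infos in marked.items():
    print(section, sorted(infos))
```

Section X ends up with both candidate locations, which leaves the final attribution to a later decision, as the description suggests.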
Fig. 7 is a flow chart of the processing method of the IP data source according to the third embodiment of the present invention. The processing method of the IP data source provided by this embodiment can serve as a preferred implementation of the processing method of the IP data source of the above embodiments. As shown in Fig. 7, the processing method of the IP data source comprises the following steps:
Step S301, obtaining a plurality of IP data sources, each IP data source comprising a plurality of first IP sections. An IP data source may also be called an IP database. An IP database can be used to determine, for a given IP address, the attaching information corresponding to this IP address, and this attaching information comprises the region or the operator to which the IP address belongs. For example, given an IP address A, the geographic position of the IP address A is determined through the IP data source or IP database. Each IP data source comprises a plurality of first IP sections. It should be noted that a first IP section here is any IP section and does not refer to the first IP section in a given IP data source; "first" is used only to distinguish it from the IP sections obtained after the IP data sources are repartitioned, and does not unduly limit the present invention.
Each IP data source comprises a plurality of IP sections, and each IP section comprises one section of IP addresses. An IP data source can be obtained by crawling the network or by purchasing it from an operator. There may be two IP data sources, or more than two.
Step S302, converting the IP addresses into integer-type data, the IP addresses comprising all IP addresses. Each IP address can be converted into integer data, for example data of the Bigint type (a system data type). The IP addresses here comprise all IP addresses.
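The conversion in step S302 can be illustrated with a short Python sketch. The use of the standard ipaddress module and the helper names ip_to_int and int_to_ip are assumptions made for illustration; the description itself only requires that each IP address become integer data.

```python
import ipaddress


def ip_to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its 32-bit integer value."""
    return int(ipaddress.IPv4Address(ip))


def int_to_ip(value: int) -> str:
    """Convert a 32-bit integer value back to a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


print(ip_to_int("0.0.0.0"))          # lowest address on the IP number axis
print(ip_to_int("255.255.255.255"))  # highest address on the IP number axis
print(int_to_ip(ip_to_int("1.2.3.4")))
```

The largest value, 4294967295, does not fit in a signed 32-bit integer, which is consistent with the description choosing a wider type such as Bigint.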
Step S303, inserting the converted IP addresses into a pre-generated data table without repetition. The data table may be a data table created in advance by a database function. The data obtained by converting the IP addresses is inserted into this data table without repetition, so that each IP address corresponds to exactly one piece of data in the data table.
Step S304, sorting the IP addresses in the data table to obtain a split basis table. Because a database has a sorting function, the data corresponding to the IP addresses in the data table can be sorted using, for example, the Order by statement (a database statement for sorting data). The resulting split basis table is the sorted data table and can be used for dividing the IP data sources.
Step S305, taking the split basis table as the IP number axis. Because the split basis table, that is, the sorted data table, can be used for dividing the IP data sources, it is taken as the IP number axis.
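Steps S303 to S305 can be sketched with an in-memory database. The choice of SQLite through Python's sqlite3 module, the table name ip_axis, the column name ip_value and the function build_axis are all assumptions for illustration; the description names only a data table, non-repeated insertion and an Order by statement. Only a handful of sample addresses is inserted here rather than the full address range.

```python
import ipaddress
import sqlite3
from typing import Iterable, List


def build_axis(ip_strings: Iterable[str]) -> List[int]:
    """Insert converted addresses without repetition and return them sorted."""
    conn = sqlite3.connect(":memory:")
    # The primary key guarantees that each integer value is stored only once.
    conn.execute("CREATE TABLE ip_axis (ip_value INTEGER PRIMARY KEY)")
    values = [(int(ipaddress.IPv4Address(ip)),) for ip in ip_strings]
    # INSERT OR IGNORE silently skips values that are already present.
    conn.executemany(
        "INSERT OR IGNORE INTO ip_axis (ip_value) VALUES (?)", values)
    rows = conn.execute(
        "SELECT ip_value FROM ip_axis ORDER BY ip_value").fetchall()
    conn.close()
    return [row[0] for row in rows]


# Hypothetical sample with one repeated address.
axis = build_axis(["10.0.0.2", "10.0.0.1", "10.0.0.2", "9.255.255.255"])
print(axis)
```

The sorted, duplicate-free result plays the role of the sorted data table that is then used as the IP number axis.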
Step S306, obtaining the starting point and the end point of each first IP section in the plurality of IP data sources. Each IP section has one starting point and one end point, and the starting points and end points obtained are those of each IP section in each IP data source. A first IP section comprises a plurality of IP addresses, so the starting point and the end point of each IP section each correspond to one IP address.
Step S307, mapping the starting points and the end points onto the split basis table respectively. That is, the IP addresses corresponding to all the starting points and end points are mapped onto the data corresponding to the IP addresses in the split basis table.
Step S308, taking the starting points and end points mapped onto the split basis table as boundary points. The boundary points are used for dividing the IP data sources. Because the IP addresses corresponding to all the starting points and end points have been mapped onto the corresponding IP addresses in the split basis table, the split basis table carries the corresponding starting points and end points, and these are taken as boundary points. If several starting points coincide, several end points coincide, or a starting point coincides with an end point, all the coinciding points are taken as one boundary point.
Step S309, dividing the IP data sources into a plurality of second IP sections, with the IP address field between each two adjacent boundary points on the split basis table taken as a second IP section. Each pair of adjacent boundary points encloses one IP address field, that is, an IP section. This IP address field is taken as a second IP section, that is, a new IP section, so that each IP data source is repartitioned.
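The boundary-point handling of steps S306 to S309 can be sketched in Python as follows. The sample sections and the names to_int, boundary_points and adjacent_fields are hypothetical, and a sorted list of integers stands in for the sorted data table. Coinciding starting points and end points collapse into a single boundary point because they are collected in a set.

```python
import ipaddress
from typing import List, Tuple


def to_int(ip: str) -> int:
    """Convert a dotted-quad IPv4 address to its integer value."""
    return int(ipaddress.IPv4Address(ip))


def boundary_points(sections: List[Tuple[str, str]]) -> List[int]:
    """Collect every starting point and end point once, in ascending order."""
    points = set()
    for start, end in sections:
        points.add(to_int(start))
        points.add(to_int(end))
    return sorted(points)


def adjacent_fields(points: List[int]) -> List[Tuple[int, int]]:
    """Pair each boundary point with the next one to form address fields."""
    return list(zip(points, points[1:]))


# Hypothetical sections taken from two different IP data sources.
sections = [("1.0.0.0", "1.0.0.255"), ("1.0.0.128", "1.0.1.255")]
points = boundary_points(sections)
fields = adjacent_fields(points)
for low, high in fields:
    print(ipaddress.IPv4Address(low), ipaddress.IPv4Address(high))
```

Two intersecting sections produce four boundary points and therefore three address fields, each of which is a candidate second IP section.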
Obviously, those skilled in the art should understand that each module or each step of the present invention described above can be realized with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network formed by a plurality of computing devices. Alternatively, they can be realized with program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; or they can each be made into an individual integrated circuit module, or a plurality of the modules or steps can be made into a single integrated circuit module. Thus, the present invention is not restricted to any specific combination of hardware and software.
It should be noted that the steps shown in the flow charts of the accompanying drawings can be executed in a computer system, for example as a set of computer-executable instructions. Moreover, although a logical order is shown in the flow charts, in some cases the steps shown or described can be performed in an order different from the one given here.
The foregoing are only preferred embodiments of the present invention and are not intended to limit the present invention. For a person skilled in the art, the present invention can have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.