KR101447526B1

KR101447526B1 - Method and apparatus for sorting personal information database based on an address and for grouping information from the sorted database

Info

Publication number: KR101447526B1
Application number: KR1020130011879A
Authority: KR
Inventors: 정연일
Original assignee: (주)수지원넷소프트
Priority date: 2013-02-01
Filing date: 2013-02-01
Publication date: 2014-10-08
Also published as: KR20140099083A

Abstract

The present invention sorts and processes a database based on addresses. An apparatus according to the present invention includes a classification unit configured to classify the entries of a database into a plurality of mutually different sub-databases according to a classification level designated in terms of address elements, and an organization unit configured to add, for each of the plurality of sub-databases, an entry obtained from the database to the corresponding sub-database in such a manner that the entries of that sub-database are arranged in a specified order based on the address information of each entry. As a result of the classification, each sub-database consists of entries having the same information for the designated address element, and different sub-databases consist of entries having different information for the designated address element.

Description

Field of the Invention The present invention relates to a method and apparatus for sorting a personal information database based on addresses, and to an information grouping method using the sorted database.

The present invention is directed to sorting personal information databases obtained through various methods, particularly large databases, based on their address fields, and to using the sorted databases.

Companies that sell products or provide services to customers collect their customers' personal information through various channels for marketing or publicity. For example, whenever a company comes into contact with a customer, such as when the customer purchases a product, receives a service, or receives after-sales service for a purchased product, the company actively collects personal information such as the customer's home or work address and telephone number. This is because personal information such as a customer's address is expected to contribute greatly to the company's business activities, or to play a positive role in future corporate profits.

Against this background, companies gather personal information separately and indiscriminately whenever it is available and store it in a database, so the personal information database becomes quite large. For example, the number of personal information records collected can range from millions to tens of millions. A company can therefore reduce its expenses by using the collected large-capacity personal information database more efficiently. For example, if the personal information of members of the same family listed in the database can be identified, the company need not send its promotional material or event notices to each of them individually but can send them to a single representative, for example the head of the household or the housewife. In this way, mailing costs can be reduced while achieving almost the same promotional effect.

However, as mentioned above, the personal information database collected by companies and the like is built by registering the addresses, names, telephone numbers, ages, and so on acquired from individuals at the time they are acquired, so its entries are not sorted by any specific field of the database, for example in order of address (i.e., in alphabetical order). An address element is an element of the address system, such as "province/city", "city/county/district", or "town/township/neighborhood". Herein, the elements "province/city", "city/county/district", and "town/township/neighborhood" are collectively referred to as the "upper address", and the elements below them are referred to as the "lower address". Typically, when entries holding specific information for a particular address element of the address field are requested from a personal information database, the matching entries are obtained by searching all entries. For example, for a query designating "Seoul" for the address field, all entries of the personal information database are searched to obtain the personal information entries that include "Seoul" in the address field.

In this situation, finding the personal information presumed to belong to a particular group, for example individuals belonging to the same household, takes a considerable amount of time spent searching the addresses listed in the database. For example, for a personal information database with N listed entries built as illustrated in FIG. 1, finding the entries presumed to belong to the same household as the individual of an arbitrary entry 11 requires comparing part of the address registered in the address field of entry 11, for example the lower address, with each of the other N-1 personal information entries. The next entry 12 must then be compared with the remaining N-2 entries. Only after this comparison has been performed approximately N(N-1)/2 times are the groups of personal information with approximately the same address known. As mentioned above, when N reaches millions or tens of millions, this search requires enormous computing power. Moreover, the comparison between information elements listed in such a personal information database is commonly performed by a text-based script, which is relatively slow.
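The N(N-1)/2 figure can be checked with a short sketch. The function below is purely illustrative (its name is hypothetical and it is not part of the patent); it only counts how many comparisons the exhaustive approach makes when every entry is compared with every later entry.

```python
def count_naive_comparisons(n: int) -> int:
    """Count comparisons when each entry is compared with every later entry."""
    comparisons = 0
    for i in range(n):
        # Entry i is compared with entries i+1 .. n-1.
        for _ in range(i + 1, n):
            comparisons += 1
    return comparisons
```

For N = 10,000,000 the closed form N(N-1)/2 gives roughly 5 × 10^13 comparisons, which is why the exhaustive search is impractical at the database sizes mentioned above.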

Therefore, according to the above-described method, it takes a great amount of time to extract entries belonging to a specific group, for example, the same household, from the personal information database.

To extract entries belonging to a specific group by the above-described method, the personal information database must be loaded into the memory of the computing device. However, the amount of information is very large, and execution objects other than the database, such as the script for database processing, the computer operating system (OS), and basic processes, also occupy memory resources, so the entire personal information database usually cannot be loaded into memory. In this case, only the portion of the personal information database being compared with a particular entry is loaded into memory and the rest is kept on the hard disk, which causes data to be swapped between memory and the hard disk. This further increases the time it takes to identify the entries of a particular group.

An object of the present invention is to provide a method and an apparatus for sorting a database based on an address field, so that entries corresponding to a specific group can be identified in a collected large-capacity personal information database faster than before.

It is another object of the present invention to provide a method and apparatus for classifying the entries of a personal information database adaptively to the memory resources of the computing device used to identify entries corresponding to a specific group.

It is still another object of the present invention to provide a method and apparatus that, in a personal information database whose entries are sorted according to the above object, can identify entries corresponding to a specific group on the basis of the address field under various grouping conditions.

It is a further object of the present invention to provide a device and method that allow a user to freely designate and extract desired information from an input personal information database, and to identify a desired specific group in the extracted information based on freely specified conditions.

It is to be understood that the object of the present invention is not limited to the explicitly stated objects, but, of course, it is an object of the present invention to achieve the effect which can be derived from the following specific and exemplary description of the present invention.

According to an aspect of the present invention, an apparatus for sorting and processing a database on the basis of addresses includes a classification unit configured to classify the entries of the database into a plurality of mutually different sub-databases according to a classification level designated in terms of address elements, and an organization unit configured to add, for each of the plurality of sub-databases, an entry obtained from the database to the corresponding sub-database in such a manner that the entries of that sub-database are arranged in a specified order based on the address information of each entry. With this apparatus, each sub-database consists of entries having the same information for the designated address element, and different sub-databases consist of entries having different information for the designated address element.

In one embodiment of the present invention, the classification level designates at least one address element starting from the highest address element in the address system through an external input or an information file.

In one embodiment according to the present invention, the address elements include the elements "province/city", "city/county/district", and "town/township/neighborhood", and the designated address element includes at least the "province/city" element.

In one embodiment of the present invention, the address information of each entry of each sub-database may be an address code converted from the address data of the corresponding entry in the database. In this case, the address code includes a numeric string uniquely assigned to the upper address of the address data.

According to an embodiment of the present invention, the apparatus further includes a processing unit that extracts information from each entry of the input database in accordance with a specified format and arranges the extracted information in a designated order to constitute each personal information entry. In this embodiment, the classification unit performs the classification on the personal information entries constituted by the processing unit. Here, the specified format and order may be defined by variables of arbitrary names that designate the corresponding information in an entry of the input database, and by the order in which the variables are listed.

According to an embodiment of the present invention, the apparatus further includes a grouping unit that compares the entries within each sub-database with one another to identify entries that satisfy a set grouping requirement. In this embodiment, when comparing entries, the grouping unit compares an entry with the consecutive entries that follow it, starting from the next entry and stopping at the first entry that fails to match in the address element that the grouping requirement specifies must match.
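The bounded comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the entries are assumed to be already sorted by address, and the function name, the field names `address` and `phone`, and the sample data are all hypothetical.

```python
from typing import Callable, Dict, List

Entry = Dict[str, str]


def group_adjacent(
    entries: List[Entry],
    address_matches: Callable[[Entry, Entry], bool],
    requirement_met: Callable[[Entry, Entry], bool],
) -> List[List[int]]:
    """Group entries of an address-sorted list by scanning only adjacent entries."""
    groups: List[List[int]] = []
    grouped = set()
    for i, entry in enumerate(entries):
        if i in grouped:
            continue
        group = [i]
        for j in range(i + 1, len(entries)):
            candidate = entries[j]
            # The list is sorted by address, so the first address mismatch
            # means no later entry can match either: stop scanning here.
            if not address_matches(entry, candidate):
                break
            if j not in grouped and requirement_met(entry, candidate):
                group.append(j)
        if len(group) > 1:
            groups.append(group)
            grouped.update(group)
    return groups


def same_address(a: Entry, b: Entry) -> bool:
    return a["address"] == b["address"]


def same_phone(a: Entry, b: Entry) -> bool:
    return a["phone"] == b["phone"]


# Hypothetical sample data, already sorted by address.
SAMPLE = [
    {"address": "A-1", "phone": "111"},
    {"address": "A-1", "phone": "222"},
    {"address": "A-1", "phone": "111"},
    {"address": "B-2", "phone": "111"},
]
```

With the sample data, `group_adjacent(SAMPLE, same_address, same_phone)` groups entries 0 and 2; entry 3 shares the phone number but is never compared with entry 0 because the scan stops at the address change.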

In an embodiment that includes the grouping unit, the grouping unit may add, to the entries identified according to the set grouping requirement, indication information showing the result of applying that requirement. In this embodiment, the grouping requirement is defined by variables of arbitrary names that designate the corresponding information in an entry of the sub-database and by logical functions applied to those variables, and includes at least a matching requirement for a telephone number. The grouping requirement may be specified through external input or through an information file.

According to an embodiment of the present invention, the apparatus further comprises a merging unit for merging the plurality of sub-databases into a single database. The merging unit may select an arbitrary entry from each of the sub-databases, determine the order relationship among the sub-databases from the selected entries, and link the sub-databases to one another in that order so that the single database is constructed.

According to another aspect of the present invention, a method of processing a database by address comprises a first step of identifying, for an arbitrary entry of the database, the information corresponding to the address element designated as the classification level and classifying the entry based on the identified information, and a second step of registering the entry in the sub-database designated for that classification, at a position in the sub-database determined by the address information of the entry. In this method, each of the sub-databases into which the entries of the database are classified and registered by the two steps consists of entries having the same information for the designated address element.

In an embodiment according to the present invention, the method further comprises selecting an entry in one of the sub-databases; checking, for the selected entry, whether the consecutive entries starting from the next entry satisfy a set grouping requirement; and ending the check for the selected entry at the first entry that does not match in the address element that the grouping requirement specifies must match.

Further, in an embodiment according to the present invention, the method further comprises selecting an arbitrary entry in each of the sub-databases, comparing the address information of the selected entries with one another to determine the relative order of the sub-databases, and configuring the sub-databases into a single database by arranging them sequentially according to the determined order.
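A minimal sketch of this merging idea follows. It assumes, for illustration only, that each sub-database is a list of address tuples already sorted internally and that all entries of one sub-database share the same leading (classification-level) address elements; the function name and data layout are hypothetical.

```python
from typing import List, Tuple

Address = Tuple[str, ...]


def merge_sub_databases(sub_dbs: List[List[Address]]) -> List[Address]:
    """Concatenate sorted sub-databases in the order of one representative entry each."""
    non_empty = [db for db in sub_dbs if db]
    # Entries within a sub-database share the classification-level elements,
    # so any single entry (here the first) is enough to order the sub-databases.
    ordered = sorted(non_empty, key=lambda db: db[0])
    merged: List[Address] = []
    for db in ordered:
        merged.extend(db)
    return merged
```

Because only one entry per sub-database is compared, the cost of ordering depends on the number of sub-databases rather than on the total number of entries.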

According to another aspect of the present invention, a storage medium has recorded on it an address processing program for sorting and processing the entries of a database by address. When loaded into and executed by a computing device capable of reading it, the address processing program classifies the entries of the database into a plurality of mutually different sub-databases based on a classification level designated in terms of address elements, so that each sub-database consists of entries having the same information for the designated address element, and, for each of the sub-databases, adds an entry obtained from the database to the corresponding sub-database in such a manner that the entries of that sub-database are arranged in a specified order based on the address information of each entry.

At least one embodiment of the present invention, described above or described in detail below with reference to the accompanying drawings, classifies the entries of a large database based on their address information and thereby allows each resulting database to be sized appropriately. Accordingly, when the computing device provided for database processing has limited memory resources, each database of classified entries can be given a size suited to the capacity of those resources, which removes the restriction on database processing or eliminates the swapping of data. In the latter case, the processing time is shortened.

In addition, according to the present invention, the entries of each partitioned database are sorted in order of address, so that when entries belonging to a specific group, for example the same household, are to be identified, the search for any entry can be limited to the entries adjacent to it, and the time required for the search is significantly shortened. Moreover, if grouping the entries of the database under the specified grouping requirement does not give a satisfactory result, that is, if the number of grouped entries is too large or too small, the grouping can be repeated under a changed requirement. The time needed to obtain the same result is reduced by a factor of several tens compared with operating on a database into which personal information has been inserted in arbitrary order.

FIG. 1 is a simplified illustration of an example of a general database built from collected personal information,
FIG. 2 is a block diagram illustrating the configuration of an apparatus that sorts a personal information database by address and performs an information grouping method using the sorted database, according to an embodiment of the present invention,
FIG. 3 is a flowchart of a method for sorting a personal information database based on addresses, according to an embodiment of the present invention,
FIG. 4A is an example definition of a field input/output format that allows a user to freely select the required field information from an existing database and configure a database holding information suited to the intended use, according to an embodiment of the present invention,
FIG. 4B schematically illustrates the construction of a new entry by arranging the information of an arbitrary entry of an input existing database in a specified order according to a prescribed field input/output format, according to an embodiment of the present invention,
FIG. 5A illustrates an example of designating the classification depth in terms of address elements when classifying the entries of a database, according to an embodiment of the present invention,
FIG. 5B schematically shows how a classification reference table is operated and referenced so that the database can be divided based on the classification level, according to an embodiment of the present invention,
FIG. 6 shows an example in which a unique code for classifying an input database entry is added to the entry according to the embodiment of FIG. 5B,
FIG. 7 illustrates an example in which an existing database is divided into a plurality of sub-databases based on a classification level, according to an embodiment of the present invention,
FIGS. 8A and 8B illustrate a method of configuring a sub-database so that its registered entries can be searched by binary search and kept in address order, according to an embodiment of the present invention,
FIGS. 9A to 9C are various examples of specifying a grouping condition for grouping the entries of a database that belong to a specific group, according to embodiments of the present invention,
FIG. 9D shows an example in which field information can be freely selected for a database whose entries are to be grouped, and the grouping condition can be freely defined on the selected information according to an agreed syntax, according to an embodiment of the present invention,
FIG. 10 illustrates an example of a database in which entries are arranged in order of address, according to an embodiment of the present invention,
FIG. 11 shows an example of converting the text-format address data of personal information into a corresponding unique numeric code, according to another embodiment of the present invention,
FIG. 12 is an example of a database obtained by applying the conversion of the embodiment of FIG. 11 to the information in the database illustrated in FIG. 10,
FIG. 13 shows, together with the components that perform it, that the partitioned sub-databases can be merged in address order, according to an embodiment of the present invention,
FIG. 14 schematically illustrates the process of merging each sub-database in an order-based manner in accordance with the embodiment of FIG.

Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 2 is a block diagram illustrating an apparatus that sorts a personal information list by address and performs an information grouping method using the sorted list, according to an embodiment of the present invention. Referring to FIG. 2, the apparatus includes: a preprocessing unit 21 that recognizes the data field structure of the input personal information database 1a in accordance with a field input/output format 2a, extracts the information of the corresponding fields according to the input/output format 2a, and constructs personal information entries from it; an address parser 22 that decomposes the information of the address field (hereinafter referred to as "address data") among the extracted field information into address elements; a classification unit 23 that classifies the entries of the personal information database according to a classification level 2b given externally or in the form of a separate information file; a DB organizer 24 that, for each entry input from the classification unit 23, locates the position of the entry in the personal information sub-database (hereinafter referred to as "sub-database") to which the classified entry belongs and registers it there, thereby constructing the sub-databases 200_k (k = 1, 2, ..., N); and a grouping unit 25 that, referring to the sub-databases 200 constructed by the DB organizer 24, extracts entries satisfying a grouping condition 3a given externally or in the form of a separate information file, for example entries determined to belong to the same household.

Each of the components 21, 22, 23, 24, and 25 of the device illustrated in FIG. 2, or any partial combination of them, may be configured as hardware executing software or firmware, or as one or more application programs executed on a computing device. Of course, hardware and software may also be combined. The personal information database 1a to be processed by the method according to the present invention and the information 2a, 2b, and 3a designated for database processing are supplied through input means included in or connected to the device of FIG. 2 or the computing device (such as an optical recording medium reader, a mass-storage hard disk, a portable storage device, or a keyboard). When configured as software, an application program implementing the functions of all or some of the illustrated components 21, 22, 23, 24, and 25 may be recorded on a storage medium and read by a device capable of reading that medium, or transferred between devices over a communication network. In addition, the device illustrated in FIG. 2 may also be configured as a single integrated server.

In an embodiment of the present invention, the grouping unit 25 may be constituted as a separate apparatus or application program. In this case, the grouping unit 25 obtains each sub-database constructed by the DB organizer 24, in which entries are arranged in order of the address field, through a separate portable storage medium, a mutually shared storage space (for example, a memory or a hard disk), or communication. The preprocessing unit 21 and the address parser 22 may likewise be configured as separate devices or application programs. In this case as well, the classification unit 23 acquires the personal information database whose address data has been divided into address elements through a separate portable storage medium, a mutually shared storage space, or communication, and processes it. If necessary, the classification unit 23 may itself divide the address data of each entry of the input personal information database into address elements.

FIG. 3 is a flowchart of a method for sorting a personal information list based on addresses, in accordance with an embodiment of the present invention. Hereinafter, the operation by which the apparatus configured as illustrated in FIG. 2 automatically sorts the personal information list according to given conditions is described in detail with reference to the flowchart of FIG. 3.

The preprocessing unit 21 interprets the field input/output format 2a and, for the existing personal information database 1a input by any of various methods (for example, through a reading device included in or connected to the system for a disc recording medium or a portable storage device, or through communication with a remote terminal), extracts the information corresponding to each field as a suitable information field (S310). This is explained in more detail below.

The field input/output format 2a may be given as illustrated in FIG. 4A. The variable names in the variable group 41 on the left are designated arbitrarily by the user, and the right-hand side of each is an example of a processing function that designates the desired information in accordance with the field format of the input database 1a; in the illustrated example, the function specifies the information at a given position among fields separated by a delimiter (e.g., '|'), the length of the information being variable. In addition, the field input/output format 2a may define a new variable using variables of arbitrary names designated by the user (42). The preprocessing unit 21 interprets each statement of the input/output format 2a according to the agreed syntax and binds each variable name to the information of the field specified by its processing function. It then checks the output format 43 specified in the field input/output format 2a. The output format 43, too, is freely defined by the user in order and form using the user-defined variable names 41. The output format 43 illustrated in FIG. 4A is an extremely simple example; following the same principle, it may be specified in various ways to output a list of personal information suited to the desired use. According to the designated output format 43, the preprocessing unit 21 extracts the corresponding information from each entry 411 of the input existing database 1a and arranges it, as illustrated in the drawing, to constitute a new entry 412. Because the following description assumes that database entries are classified and sorted on the basis of addresses, the fields of information other than the address field 412a are shown in the drawing as a single field 412b.

Meanwhile, when the field input/output format 2a contains information that allows the address field of the entry 412 constructed according to the given output format 43 to be recognized, the data in the address field may be validated (for example, by verifying that its length is 4 bytes or more and that its first character is a Hangul character). Such validation keeps erroneous data out of the address field and thereby prevents the later processing from being burdened unnecessarily by input errors. A detected error may also be corrected automatically, or displayed on a screen so that correct data can be entered.
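The two example checks can be sketched as below. This is an illustrative sketch only: the function name is hypothetical, and the text encoding used to measure the byte length is an assumption (the source does not state one), so it is exposed as a parameter.

```python
def is_plausible_address(address: str, encoding: str = "utf-8") -> bool:
    """Apply the two example checks: minimum byte length and a leading Hangul character."""
    data = address.strip()
    # Check 1: at least 4 bytes in the assumed encoding (an empty string fails here).
    if len(data.encode(encoding)) < 4:
        return False
    # Check 2: the first character lies in the Unicode Hangul Syllables block
    # (U+AC00 to U+D7A3).
    return "\uac00" <= data[0] <= "\ud7a3"
```

An entry failing either check could then be routed to automatic correction or shown to an operator, as the text describes.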

In another embodiment according to the present invention, the pre-processing unit 21 may not be included in the apparatus for sorting and processing the database based on the address according to the present invention. In other words, the inputted existing personal information database 1a may be directly input to the address parser 22 and processed.

The address parser 22 extracts the data of the address field 412a from each personal information entry 412 constructed by the preprocessing unit 21, or from each entry of the directly input existing personal information database 1a, and separates it into address elements based on a specific delimiter (e.g., a special character such as a space or tab) (S311). For example, if the address data of the input entry is "Mullae Jai Apartment, Building X, Unit Y, 54 Mullae-dong, Yeongdeungpo-gu, Seoul", it is decomposed into "province/city" = "Seoul", "city/county/district" = "Yeongdeungpo-gu", "town/township/neighborhood" = "Mullae-dong", "lot number" = "54", "building name" = "Mullae Jai Apartment", "building" = "X", and "unit" = "Y". Depending on the input address data, an address element may be replaced by one with a different name, for example "ri", "block name", or "sub-address". In addition, the address parser 22 preferably ensures that the data for each address element has a predetermined fixed length; for this purpose, a padding character such as a space or "0" may be added to the data of the address element. In another embodiment according to the present invention, the text information of the address may be converted into a fixed-length numeric code, as described later.
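The splitting and padding step might look like the following sketch. It makes several assumptions that are not in the source: the element names, the fixed width of 20 characters, and an input in which the elements appear from the highest to the lowest level with exactly one whitespace-separated token per element.

```python
from typing import Dict

# Hypothetical element names, ordered from the highest address element down.
ELEMENT_NAMES = [
    "province_city",
    "city_county_district",
    "town_township_neighborhood",
    "lot_number",
]
FIELD_WIDTH = 20  # assumed fixed length per address element


def parse_address(address: str, width: int = FIELD_WIDTH) -> Dict[str, str]:
    """Split address data on whitespace and pad each element to a fixed length."""
    tokens = address.split()  # splits on spaces and tabs
    parsed: Dict[str, str] = {}
    for position, name in enumerate(ELEMENT_NAMES):
        token = tokens[position] if position < len(tokens) else ""
        # Pad with spaces; tokens longer than the width are truncated in this sketch.
        parsed[name] = token.ljust(width)[:width]
    return parsed
```

Fixed-length elements make plain string comparison of two addresses behave as an element-by-element comparison, which is what later ordering steps rely on.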

The classification unit 23 receives or obtains from the address parser 22 one personal information entry whose address field holds address data divided into address elements (S312). The classification unit 23 receives the database classification level 2b from the user through a separate input/output device, or reads an information file created by the user from storage and checks the classification level 2b specified in the file. The classification level 2b is a value designated so that the entries of the database built from the input personal information database 1a by the preprocessing unit 21 are divided into groups sharing the same address, thereby reducing the database size; it corresponds to a depth specified over the address elements mentioned above. FIG. 5A illustrates this: designating only the highest address element (in the above example, "province/city") corresponds to level 1, and designating the address elements down to the second or third level (in the above example, "province/city" - "city/county/district" - "town/township/neighborhood") corresponds to level 2 or level 3, respectively. Of course, instead of using the term "level", the address elements on which the classification is based may be indicated directly, either by the user or through an information file. Once the level for classification, that is, the level for dividing the database, is specified, the classification unit 23 checks the address data of each personal information entry received or obtained (hereinafter referred to as "input") from the address parser 22 up to the address element(s) corresponding to the specified level, and classifies the entry based on the checked data (S313). This is described in detail below.

When a personal information entry is input, the classification unit 23 extracts from the address data 51 of the entry 50 only the data 51a of the address elements up to the designated classification level 2b, and compares it with each item of a classification reference table 500 that it manages itself (S313-1). The classification unit 23 builds the classification reference table 500 by registering the address data up to the designated classification level 2b extracted from the first entry as the first item 501; for each subsequent entry, it compares the data up to the specified level extracted from that entry's address data with the item(s) already listed in the table 500 and, if no identical item exists, registers the data in the table 500. When a new item is listed in the classification reference table 500, a serial number following the order of listing is assigned as its classification code 510 and recorded with it. Operated in this way, the table assigns different classification codes to input entries whose data for the address elements up to the specified level differ. Thus the classification unit 23 searches the classification reference table 500 for the extracted address data 51a and, if the same data is registered, obtains the classification code assigned to it. If no item holds the same data, the address data 51a is registered as a new item of the table 500, the number following the last assigned serial number ("L+1" if the last is "L") is recorded in that item as its classification code, and that number is applied as the classification code for the address data 51a. Once the classification code for the input personal information entry 50 has been determined in this way, the input personal information 60 is passed to the DB organizer 24 together with the classification code 61 determined for it.
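The operation of the classification reference table can be sketched as follows. The class and method names are hypothetical, and a dictionary lookup is used here in place of the item-by-item comparison the text describes; the assigned codes are the same either way.

```python
from typing import Dict, Sequence, Tuple


class ClassificationTable:
    """Assigns serial classification codes to distinct upper-level address data."""

    def __init__(self, level: int) -> None:
        self.level = level  # number of leading address elements to compare
        self._codes: Dict[Tuple[str, ...], int] = {}

    def classify(self, address_elements: Sequence[str]) -> int:
        # Keep only the address elements up to the designated classification level.
        key = tuple(address_elements[: self.level])
        code = self._codes.get(key)
        if code is None:
            # Not yet listed: register it with the next serial number (L + 1).
            code = len(self._codes) + 1
            self._codes[key] = code
        return code
```

With `level=1`, "Seoul, Yeongdeungpo-gu" and "Seoul, Gangnam-gu" receive the same code; with `level=2` they receive different codes, which yields more, smaller sub-databases.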

When a personal information entry with a classification code is input from the classification unit 23, the DB organizer 24 checks whether a sub-database designated for that classification code exists (S320). The DB organizer 24 builds a sub-database for each classification code out of the personal information entries carrying that code; FIG. 7 shows the sub-databases 200 built in this way. As shown in the figure, a classification code (code 1, code 2, ...) is attached to each sub-database 200_k (k = 1, 2, 3, ...); the sub-database for a personal information entry is identified by the entry's classification code, and each sub-database 200_k is built up by registering the input entries in the sub-database so identified. Therefore, if the check (S320) finds that no sub-database has yet been built for the classification code of the currently input entry, the DB organizer 24 creates a new sub-database, for example by defining the field structure of its entries and allocating the necessary memory resources. The currently input personal information entry is then registered as the first entry of the created sub-database, and the input classification code is attached to the sub-database to distinguish it from the other existing sub-databases (S322).
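The check-then-create behavior can be sketched as below. The function name and the dictionary layout are assumptions made for illustration, and placing the entry in address order is deliberately left out here so that the sketch covers only the existence check and the creation of a new sub-database.

```python
from typing import Dict, List

SubDatabase = Dict[str, object]


def register_entry(sub_databases: Dict[int, SubDatabase], code: int, entry: dict) -> None:
    """Add an entry to the sub-database for its classification code, creating it if needed."""
    if code not in sub_databases:
        # No sub-database exists yet for this code (the S320 check fails):
        # create one and tag it with the classification code.
        sub_databases[code] = {"code": code, "entries": []}
    entries: List[dict] = sub_databases[code]["entries"]  # type: ignore[assignment]
    entries.append(entry)
```

Because each sub-database only ever holds entries with one classification code, its size can be kept within the memory available on the processing device by choosing a deeper classification level.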

If the check in step S320 confirms that a sub-database has been built for the classification code of the currently input personal information entry, the DB organization unit 24 registers the input entry in that sub-database. In doing so, the entry itself, or the information indexing it, is placed at the position that the input entry occupies in the sub-database when ordered by address data (S323). (The term "orderly listing" is used here to mean listing the entry itself, or the index information of the entry, in order based on the address data.) To support orderly listing, each sub-database 200_k (k = 1, 2, 3, ...) is built with a structure from which the address-field order of the listed entries can be determined. FIG. 8A is an example of such a structure. The example of FIG. 8A is a structure that enables binary search over the listed entries; depending on the embodiment, a different structure from which the order of a specific field of the listed entries can be determined, for example a structure of chained entries, may be selected and used instead. To determine the position for orderly listing quickly, it is desirable to build each sub-database with a structure that supports binary search.

FIG. 8A shows an exemplary structure. To build each sub-database with this structure, the DB organization unit 24 stores, in addition to the registered personal information entries 800, an index value 801 for each entry, which is a serial number following the order of registration, and creates an index table 810 in which those index values are arranged according to the order of the address data of the entries. That is, the address data ("address 3" in the drawing) of the third entry 800₃ of the sub-database, which is specified by the index value ("3" in the example of the drawing) of the item 810₁ located first in the index table 810, comes first in the ordering (e.g., in alphabetical order). For a newly input personal information entry, the DB organization unit 24 performs the operations described below to list it in order in a sub-database built as shown in FIG. 8A.

First, as shown in FIG. 8B, the DB organization unit 24 registers the newly input personal information entry at the end of the corresponding sub-database and assigns the next serial number ("N + 1" in the example of the figure) as the index value 821 of the registered entry 820. It then determines, by binary search, the position that the address data 820a of the entry 820 occupies in the ordering. More specifically, it selects the central item of the index table 810 of the sub-database and compares the address data of the entry specified by the index value of that item with the address data 820a of the currently registered entry 820. If the address data 820a has the smaller value, the DB organization unit 24 selects the central item of the upper part of the index table 810 relative to the previously selected central item; otherwise, it selects the central item of the lower part. It then compares the address data of the listed entry in the same way as before. When this process finds two adjacent items of the index table 810 (e.g., the items with index values "1" and "4") that specify listed entries whose address data is respectively smaller and larger than the address data 820a, the index table 810 is updated (830) by inserting between them an item 831 holding the index value 821 assigned to the currently registered entry 820. With the insertion of the item 831, the items after it are physically moved (84). In this embodiment, the items indexing the entries are moved in order, rather than the stored entries themselves, because the amount of data that must be moved in the physical storage space is smaller, which reduces the time required.
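The append-then-index procedure can be sketched as follows. This is an illustrative reading of FIGS. 8A and 8B, not the patent's own code: the function name `insert_in_order`, the dictionary entries, and the one-letter addresses are assumptions.

```python
def insert_in_order(entries, index_table, new_entry):
    """Append new_entry and insert its index value into the index table.

    entries     -- entries in arrival order (index value = position + 1)
    index_table -- index values ordered by the entries' address data
    """
    entries.append(new_entry)
    index_value = len(entries)              # serial number "N + 1"

    lo, hi = 0, len(index_table)
    while lo < hi:
        mid = (lo + hi) // 2                # central item of the current range
        mid_address = entries[index_table[mid] - 1]["address"]
        if new_entry["address"] < mid_address:
            hi = mid                        # continue in the upper part
        else:
            lo = mid + 1                    # continue in the lower part

    # Only index values move; the stored entries stay where they are.
    index_table.insert(lo, index_value)
    return index_value


entries, index_table = [], []
for address in ["C", "A", "D", "B"]:
    insert_in_order(entries, index_table, {"address": address})
ordered = [entries[i - 1]["address"] for i in index_table]
```

Reading the entries through the index table yields them in address order, while the entries themselves remain in arrival order.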

When the processing (S321, S322, or S323) of the input personal information entry is completed as described above, the DB organization unit 24 notifies the classification unit 23 of this fact, and then receives and processes the next classified personal information entry.

The classification of personal information entries according to the designated classification level 2b, and the listing of each entry in the corresponding sub-database according to that classification, are performed by the classification unit 23 and the DB organization unit 24 until the last entry of the personal information database 1a has been processed (S330).

According to an embodiment of the present invention, when the construction of the sub-databases having index tables is completed as described above, the entries may be rearranged in the order indicated by the items of the index table, so that the entries themselves are arranged in the order of their address data. In that embodiment, of course, the index table and the information for indexing the entries are no longer needed and are removed from the sub-database.
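Rearranging the entries by the index table amounts to a single pass over the table. The sketch below assumes the index values are 1-based serial numbers, as in the figures; the function name `rearrange_by_index` is hypothetical.

```python
def rearrange_by_index(entries, index_table):
    """Return the entries in the order given by the index table.

    The index table is not needed afterwards, because the returned list
    is itself ordered by address data.
    """
    return [entries[i - 1] for i in index_table]


entries = ["address C", "address A", "address D", "address B"]
index_table = [2, 4, 1, 3]
sorted_entries = rearrange_by_index(entries, index_table)
```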

When all of the above processes are completed, each sub-database 200_k (k = 1, 2, 3, ...) built by the DB organization unit 24 consists only of personal information entries that have the same data up to the classification level 2b designated by the user, and each entry is placed in the order of its address data (or has information on its position in that order).

Each of the sub-databases 200 built by the DB organization unit 24 is smaller than the originally input personal information database 1a. If the classification level 2b is raised further (by designating the address to a deeper level), each sub-database becomes smaller still. Thus, by setting the classification level 2b appropriately, the user can obtain sub-databases of a size suited to the resources of the computing device. Even when the provided personal information database 1a is divided according to the designated classification level 2b and the resulting smaller sub-databases are used instead of the entire personal information database 1a, and grouping based on the address or the like is performed on each sub-database, there is almost no possibility that an entry belonging to a group lies in another sub-database. Dividing the personal information database in this way therefore makes no difference to the results obtained by grouping. Accordingly, under the concept of the present invention, in which the personal information database is adaptively divided according to the classification level designated by the user, the benefit of making suitable use of the resources of the computing device that executes the grouping is obtained without any other adverse effect.

Meanwhile, once the sub-databases 200 have been built by the DB organization unit 24 as described above, the grouping unit 25 searches the address field of the sub-databases 200 according to the set grouping condition 3a and groups the entries belonging to a specific group, for example, entries belonging to the same household.

The grouping condition 3a may be set with various grouping requirements. This is because the information recorded in the personal information may not be completely accurate. For example, there may be errors or mistakes in information that an individual entered in writing or online; an abbreviated name or nickname of a building may have been written in the address instead of its correct name; or some detailed address elements may have been omitted. Besides these examples, there are many factors that prevent individuals in the same household from having identical address data. Thus, for various reasons, the members of a particular group, e.g., individuals belonging to the same household, may not have recorded addresses that all match. In view of this, the grouping condition 3a may designate various requirements, and also designates, for each requirement, type information indicating that the grouped entries were grouped under that requirement. This type information is assigned to the entries that the grouping unit 25 identifies and groups according to the designated requirement. Hereinafter, the process by which the grouping unit 25 distinguishes entries belonging to a specific group in the built sub-databases 200, according to the requirements designated by the grouping condition 3a, is described in detail by way of example.

FIGS. 9A to 9C are some examples of variously designated grouping conditions 3a. FIG. 9A is an example of a requirement under which entries are grouped into the same household when the address data written in the personal information match in full (the written address data may not describe all the address elements that constitute an address) and the four-digit number of the telephone number, excluding the exchange number, also matches. The telephone number is checked even when the addresses match in full because an old address may remain in the personal information database after a person has moved, in which case that person could otherwise be grouped into the same household as other persons listed at that address. FIG. 9B is an example of a requirement under which the address elements up to the "address" in the address data are used as matching elements, and entries are grouped into the same household when they match in those matching elements and their telephone numbers match in full. FIG. 9C is a further example of a requirement set for grouping into the same household, under which entries are grouped when the items it designates coincide with each other. FIGS. 9A to 9C are only some examples of possible requirements for the same household; as mentioned above, various other requirements under which entries can be regarded as the same household can be set through the grouping condition 3a.

The requirements of the grouping condition 3a, defined by expressions as illustrated in FIGS. 9A to 9C, can be freely defined through variables arbitrarily defined by the user. FIG. 9D is an example written to specify the requirement of FIG. 9A. As illustrated in FIG. 9D, functions 910 that designate the corresponding information in an entry of the sub-database to be used for grouping, and variables 920 that denote the information designated by those functions, are written in an agreed syntax. The functions in the illustrated rule assume that each field of a sub-database entry holds fixed-length information. For example, the address is assumed to be written in the first through thirtieth positions of the entry. When the user uses the variables 920 so designated and appropriate logic functions to set the grouping condition 3a that defines the desired grouping requirement, i.e., the householding requirement 930, the grouping unit 25 performs the entry grouping operation according to the set grouping condition 3a.
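The agreed syntax of FIG. 9D is not reproduced in this text, so the following is only a rough Python analogue of the idea: functions pick fixed-length fields out of an entry, variables name the extracted values, and a logical expression over the variables states the householding requirement. The address columns 1 to 30 follow the example above; the telephone columns 31 to 43, the variable names `ADDR` and `TEL4`, and the sample records are assumptions.

```python
def field(record, start, end):
    """Return columns start..end (1-based, inclusive) of a fixed-length record."""
    return record[start - 1:end].strip()


def variables(record):
    """Name the pieces of information used by the requirement."""
    return {
        "ADDR": field(record, 1, 30),        # address in columns 1-30
        "TEL4": field(record, 31, 43)[-4:],  # last four digits (assumed columns)
    }


def same_household(a, b):
    """Requirement in the spirit of FIG. 9A: full address and last four digits match."""
    va, vb = variables(a), variables(b)
    return va["ADDR"] == vb["ADDR"] and va["TEL4"] == vb["TEL4"]


rec_a = f"{'Yeoksam-dong 123-4':<30}{'02-555-1234':<13}"
rec_b = f"{'Yeoksam-dong 123-4':<30}{'02-777-1234':<13}"
rec_c = f"{'Yeoksam-dong 123-5':<30}{'02-555-1234':<13}"
```

Here `rec_a` and `rec_b` satisfy the requirement although their exchange numbers differ, while `rec_c` fails on the address.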

That is, the grouping unit 25 interprets the grouping requirement written in the grouping condition 3a according to the agreed syntax. It then loads the sub-databases, in each of which the entries are sorted in order based on the address, into memory one at a time; checks, based on the interpreted content, whether the information obtained from each entry for the variables mutually satisfies the "householding requirement"; and groups the entries that match each other. Hereinafter, a method of performing this process on one sub-database is described in detail.

The grouping unit 25 reads the first entry 1001 of the currently selected sub-database, which has been loaded into memory and is configured as illustrated in FIG. 10, and examines the following entries to find entries belonging to the same group as that entry. Whether entries belong to the same group is, of course, determined by whether they satisfy the requirement currently designated in the grouping condition 3a. This check proceeds only up to the first entry that does not match in the address matching elements designated in the grouping condition 3a. For example, under the grouping condition of FIG. 9A, the search for entries belonging to the same group as the first entry 1001 stops when the comparison with the third entry of the sub-database illustrated in FIG. 10 is made (C101). Under the grouping condition of FIG. 9B, the search stops when the comparison with the fourth entry is made (C102). No further entries need to be searched once the address matching elements fail to match, because the entries of the sub-database being searched are all sorted by address, so no entry after that point can match in the address matching elements. The time spent searching the database for entries belonging to the same group is therefore significantly reduced compared with the conventional method.
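The early-stopping scan can be sketched as below, assuming the entries are already sorted by address. The function name `find_group`, the entry layout, and the sample data are hypothetical; the requirement used is a simplified analogue of FIG. 9A.

```python
def find_group(entries, start, address_key, requirement):
    """Return the positions of the entries grouped with entries[start].

    address_key -- extracts the address matching element from an entry
    requirement -- checks the remaining conditions for two entries
    """
    base = entries[start]
    group = [start]
    for i in range(start + 1, len(entries)):
        if address_key(entries[i]) != address_key(base):
            break  # sorted by address: no later entry can match either
        if requirement(base, entries[i]):
            group.append(i)
    return group


entries = [
    {"address": "A-dong 1", "tel4": "1111"},
    {"address": "A-dong 1", "tel4": "1111"},
    {"address": "A-dong 2", "tel4": "1111"},
    {"address": "A-dong 2", "tel4": "2222"},
]
group = find_group(
    entries,
    0,
    address_key=lambda e: e["address"],
    requirement=lambda a, b: a["tel4"] == b["tel4"],
)
```

The scan for the first entry ends at the third entry, whose address differs, so the fourth entry is never compared.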

When the search for the first entry 1001 is completed, the grouping unit 25 searches in the same way for entries belonging to the same group as the next entry (the first entry not designated as belonging to the same group as the first entry). In the example of FIG. 10, when the grouping condition of FIG. 9A is applied, the first and second entries are grouped, and the operation of finding entries in the same group as the third entry is performed next. During this process on the currently loaded sub-database, the grouping unit 25 either sets a separate mark (for example, the type information designated for the currently set grouping condition 3a) on the entries belonging to the same group, or extracts those entries to generate a separate entry block 211, assigns the type information to the block, and composes the remaining entries into a block 212 to be grouped under a subsequent grouping requirement. If the grouping unit 25 marks the grouped entries with type information within the sub-database, then when another requirement is designated in the grouping condition 3a and the sub-database is searched again, the entries already marked with type information are skipped without any comparison. In the example of FIG. 10, when the grouping requirement of FIG. 9A is set first, the first and second entries are grouped and the type information "0" is marked in each; when the grouping requirement of FIG. 9B is set second, the fifth and sixth entries 1021 are grouped and marked with type information "1". When the grouping requirement of FIG. 9C is then set third, the seventh and eighth entries 1022 are grouped, and type information "2" is marked in each of those entries.
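The marking variant, in which successive requirements are applied to the same sub-database and already-marked entries are skipped, might be sketched as follows. The function name `run_pass`, the entry fields, and the two requirements (loose analogues of FIGS. 9A and 9B) are assumptions for illustration only.

```python
def run_pass(entries, type_info, address_key, requirement):
    """Mark groups found under one requirement; skip entries already marked."""
    n = len(entries)
    for i in range(n):
        if entries[i]["type"] is not None:
            continue                      # grouped by an earlier requirement
        members = [i]
        j = i + 1
        # Stop at the first entry whose address matching element differs.
        while j < n and address_key(entries[j]) == address_key(entries[i]):
            if entries[j]["type"] is None and requirement(entries[i], entries[j]):
                members.append(j)
            j += 1
        if len(members) > 1:
            for m in members:
                entries[m]["type"] = type_info


def entry(lot, unit, tel):
    return {"lot": lot, "unit": unit, "tel": tel, "type": None}


# Entries sorted by (lot, unit).
entries = [
    entry("10", "101", "02-555-1234"),
    entry("10", "101", "02-777-1234"),
    entry("10", "102", "02-555-9999"),
    entry("20", "", "02-333-4444"),
    entry("20", "301", "02-333-4444"),
]
# First requirement: full address and last four telephone digits match.
run_pass(entries, 0, lambda e: (e["lot"], e["unit"]),
         lambda a, b: a["tel"][-4:] == b["tel"][-4:])
# Second requirement: address up to the lot number and full telephone number match.
run_pass(entries, 1, lambda e: e["lot"], lambda a, b: a["tel"] == b["tel"])
types = [e["type"] for e in entries]
```

The fourth entry lacks a unit number, so it is missed by the first requirement and picked up by the looser second one.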

In the embodiments of the present invention described so far, the text of the address field of the pre-built database 1a is used as it is. In another embodiment of the present invention, the address data of a personal information entry can be converted into a code consisting of a series of digits and used in that form. When the address data is used as a numeric code according to this embodiment, the speed of the following operations is significantly improved: the operation in which the classification unit 23 compares part of the address data of an input entry with each item of the classification reference table 500; the operation in which the DB organization unit 24 compares the address data of an input entry with the address data of the entries of the corresponding sub-database to find the input entry's position in the ordering; and the operation in which the grouping unit 25 compares registered entries to group them according to a given grouping condition 3a.

In this embodiment, the classification unit 23 converts the address data 51 of an input personal information entry 50 having the structure illustrated in FIG. 5B into a unique numeric code corresponding to it, by the method illustrated in FIG. 11, and replaces the address data with that code. In the method illustrated in FIG. 11, for example, a 10-digit numeric string 1101 designated by the administrative standard code is assigned to the upper address of a personal information entry; a 9-digit numeric string 1102, including four digits each for the "main address" and the "sub-address", is assigned to the "address"; and an 8-digit numeric string 1103, four digits each for the building ("dong") and the unit ("ho"), is assigned to the "dong/ho". Of course, schemes that assign numbers or digit counts differently may be applied. A code that uniquely identifies the number assignment scheme, e.g., a character string, may also be inserted into the converted code.
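A fixed-width address code along these lines could be assembled as in the sketch below. The 10, 9, and 8 digit widths follow the example above; what the ninth digit of the middle field holds is not clear from this text, so it is treated here as a placeholder `flag` digit, which is an assumption. The function names and the sample 10-digit value are likewise illustrative.

```python
def address_code(admin_code, main_no, sub_no, dong, ho, flag=0):
    """Build a 27-digit address code: 10 + 9 + 8 digits."""
    if len(admin_code) != 10 or not admin_code.isdigit():
        raise ValueError("admin_code must be a 10-digit string")
    lot = f"{flag:01d}{main_no:04d}{sub_no:04d}"  # 9 digits (flag is assumed)
    unit = f"{dong:04d}{ho:04d}"                  # 8 digits
    return admin_code + lot + unit


def same_prefix(code_a, code_b, digits):
    """Compare two address codes on their leftmost `digits` digits."""
    return code_a[:digits] == code_b[:digits]


a = address_code("1168010100", 123, 4, 101, 1502)
b = address_code("1168010100", 123, 5, 101, 1502)
```

Because every field has a fixed width, comparing the codes as strings orders them the same way as comparing the fields one by one, and a partial address match reduces to comparing a fixed number of leading digits.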

According to this embodiment, after the classification unit 23 has replaced the address data of a personal information entry with the corresponding numeric code (hereinafter also called the "address code"), the DB organization unit 24 and the grouping unit 25 perform the operations described above, at higher speed, on personal information entries whose address field holds the address code. For example, the grouping unit 25 marks or extracts entries belonging to the same group, as in the example of FIG. 10, on a sub-database built in this way. In this embodiment, when the grouping condition 3a designates part of the address as a matching element, it defines the matching element as a number of digits. For example, in the example of FIG. 11, when the matching element is set up to the "address", the matching element is designated as the leftmost 17 digits of the address code.

In an embodiment of the present invention, the apparatus that performs the sorting and grouping method for the personal information database may further include, as illustrated in FIG. 13, a merging unit 26 that merges the sub-databases 200 into a single personal information list. Like the other components 21, 22, 23, 24, and 25 described above, the merging unit 26 may be configured as hardware that executes software or firmware, may be configured as one or more application programs implemented on a computing device, or may be a combination of hardware and software.

The merging unit 26 merges the sub-databases 200 so that, in the final merged personal information database 300, the entries are arranged in order according to the address information of each entry (address data in text form, or address codes). To this end, it first determines the order in which each sub-database 200_k (k = 1, 2, 3, ...) is to be placed in the merged database 300. As illustrated in FIG. 14, it builds a table 1410 that lists, for each sub-database, the address information 1411 of the first entry and information 1412 identifying the sub-database. It then compares the address information 1411 of the items of the table 1410 with one another and rearranges the items according to character order or value magnitude (P141) to create a reference table 1420. Then, following the order of the items in the reference table 1420, it reads the sub-database identification information of each item and places the personal information entries of the sub-database so identified (P142), so that all personal information entries (or index items corresponding one-to-one to personal information entries) are arranged in order according to the address information.
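The two merging steps (P141 and P142) can be sketched as follows, assuming each sub-database is already sorted by address and shares a common upper address. The function name `merge_sub_databases` and the sample addresses are hypothetical.

```python
def merge_sub_databases(sub_databases):
    """Merge address-sorted sub-databases into one address-sorted list.

    sub_databases -- dict mapping an identifier to a list of address strings
    """
    # Table of (address of first entry, sub-database identifier).
    table = [(entries[0], db_id)
             for db_id, entries in sub_databases.items() if entries]
    reference_table = sorted(table)          # P141: order the sub-databases
    merged = []
    for _, db_id in reference_table:         # P142: lay them out in that order
        merged.extend(sub_databases[db_id])
    return merged


sub_databases = {
    1: ["Seoul Gangnam 1", "Seoul Gangnam 9"],
    2: ["Busan Haeundae 3"],
    3: ["Seoul Gangbuk 2"],
}
merged = merge_sub_databases(sub_databases)
```

Only one comparison key per sub-database is needed, because all entries of a sub-database share the same data up to the classification level and therefore occupy a contiguous range in the overall order.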

In the embodiment described above, the first entry of each sub-database is selected and its address information is used in the mutual comparison that determines the placement order of the sub-databases. In another embodiment of the present invention, however, an entry at any position in the sub-database may be selected. Of course, the selected position need not be the same for every sub-database.

As described above, a personal information database in which all entries are arranged in order by address can be used for the grouping described above as well as for other purposes. In those cases too, the address-based ordering ensures fast searching.

The embodiments of the present invention described above can be combined and implemented together if they are not mutually incompatible.

It will be apparent to those skilled in the art that various modifications, alterations, substitutions, additions, and the like can be made to the present invention without departing from the spirit or scope of the invention as defined in the appended claims.

21: pre-processor 22: address parser
23: classification unit 24: DB organization unit
25: grouping unit 26: merging unit
200_k: sub-database

Claims

An apparatus for processing a database including address information, comprising:
a classification unit configured to classify the entries of the database so that they are divided into a plurality of mutually different sub-databases according to a classification level designated by address element;
an organization unit configured to add, for each of the plurality of sub-databases, the entries obtained from the database to the sub-database in such a manner that the entries of the sub-database are arranged in a specified order based on the address information of each entry; and
a merging unit configured to merge the plurality of sub-databases into a single database,
wherein each of the plurality of sub-databases consists of entries having the same information for the designated address element, and
wherein the merging unit selects an arbitrary entry in each of the sub-databases and, according to the mutual relationship among the plurality of sub-databases determined by comparing the address information of the selected entries with one another, arranges the plurality of sub-databases in order.

The apparatus according to claim 1,
wherein the classification level designates at least one address element, starting from the highest address element in an address scheme, and is designated through an external input or an information file.

The apparatus according to claim 1,
The address element includes elements of "city / city "," city / district ", and "east / A device that sorts and processes a database according to an address that contains an element of "time".

The apparatus according to claim 1,
wherein the address information of each entry of each sub-database is an address code converted from the address data of the corresponding entry in the database, and the address code includes a numeric string uniquely allocated to the upper address of the address data.

The apparatus according to claim 1,
further comprising a processing unit configured to extract, from the information of each entry of the input database, the information corresponding to a specified format, and to arrange the extracted information in a specified order so as to form individual personal information entries,
wherein the classification unit performs the classification on each of the personal information entries formed by the processing unit.

The apparatus according to claim 5,
wherein the specified format and order are defined by variables of arbitrary names that designate the corresponding information in an entry of the input database, and by the order in which the variables are enumerated.

The apparatus according to claim 1,
further comprising a grouping unit configured to compare the entries of any one of the sub-databases with one another to distinguish entries that satisfy a set grouping requirement,
wherein, in comparing the entries with one another, the grouping unit compares an arbitrary entry with the consecutive entries from the next entry up to the entry that is inconsistent in the address matching element designated by the grouping requirement.

The apparatus according to claim 7,
wherein the grouping unit is further configured to add, to the entries distinguished according to the set grouping requirement, indication information indicating the result according to the set grouping requirement.

The apparatus according to claim 7,
wherein the grouping requirement is defined by variables of arbitrary names that designate the corresponding information in an entry of the sub-database, and by logical functions applied to the variables.

The apparatus according to claim 7,
wherein the grouping requirement includes at least a matching requirement for the address and a matching requirement for the telephone number, and is designated through an external input or an information file.

(Deleted)

A method of processing a database containing address information by a computing device, comprising:
a first step of identifying, for an arbitrary entry in the database, the information corresponding to the address element designated as a classification level;
a second step of classifying the arbitrary entry based on the identified information, and registering the arbitrary entry in the sub-database designated for that classification among a plurality of sub-databases, at the position corresponding to its order in the sub-database; and
a third step of merging the plurality of sub-databases into a single database,
wherein each of the plurality of sub-databases into which the entries of the database are classified by the second step consists of entries having the same information for the designated address element, and
wherein, in the third step, an arbitrary entry is selected from each of the sub-databases and, according to the mutual relationship among the plurality of sub-databases determined by comparing the address information of the selected entries with one another, the plurality of sub-databases are arranged in order to form the single database.

The method according to claim 12, further comprising:
selecting an arbitrary entry in one of the sub-databases; and
checking, sequentially from the next entry, whether the entries satisfy the set grouping requirement with respect to the selected entry, and ending the check for the selected entry at an entry that is inconsistent in the address matching element designated by the grouping requirement.

(Deleted)

A storage medium on which data is recorded,
wherein an address processing program for sorting and processing the entries of a database based on address is recorded, and
wherein, when the address processing program is loaded into and executed by a computing device capable of reading the program, the program performs:
a function of classifying the entries of the input database so that they are divided into a plurality of mutually different sub-databases according to a classification level designated by address element, so that each of the sub-databases finally consists of entries having the same information for the designated address element;
a function of adding, for each of the plurality of sub-databases, the entries obtained from the database to the sub-database in such a manner that the entries of the sub-database are arranged in a specified order based on the address information of each entry; and
a function of selecting an arbitrary entry in each of the plurality of sub-databases and forming a single database by arranging the plurality of sub-databases in order, according to the mutual relationship among them determined by comparing the address information of the selected entries with one another, and merging them.