MXPA96005656A - Method to group related data, multi dimension

Method to group related data, multi dimension


Publication number
MXPA96005656A
Authority
MX
Mexico
Prior art keywords
vertices
angle
vertex
score
angles
Application number
MXPA/A/1996/005656A
Other languages
Spanish (es)
Other versions
MX9605656A (en
Inventor
F Poppen Richard
E Smartt Brian
A Dunn Linnea
J Derose Frank
Original Assignee
J Derose Frank
A Dunn Linnea
F Poppen Richard
E Smartt Brian
Priority claimed from US08/245,690 external-priority patent/US5706503A/en
Application filed by J Derose Frank, A Dunn Linnea, F Poppen Richard, E Smartt Brian filed Critical J Derose Frank
Publication of MX9605656A publication Critical patent/MX9605656A/en
Publication of MXPA96005656A publication Critical patent/MXPA96005656A/en


Abstract

The present invention relates to a method for grouping multi-dimensional related data. Characteristics are identified from a collection of data, and each characteristic is represented by a vertex. Pairs of characteristics that are desired to be grouped together are selected, and the pair of vertices representing each selected pair of characteristics is connected by an angle. A score is assigned to each angle according to a predetermined formula, and the angle with the highest score is selected. A new vertex is created by joining the vertices connected by the selected angle, new angles are created between the new vertex and the vertices previously connected to the joined vertices, and the procedure is repeated until each angle has a predetermined score.

Description

METHOD TO GROUP RELATED, MULTI-DIMENSIONAL DATA.
BACKGROUND OF THE INVENTION
FIELD OF THE INVENTION
The present invention relates in general to methods for storing data in a database and, in particular, to a method for grouping multi-dimensional related data, such as geographic map data, in a database in a manner that is efficient in terms of the resources and space required for grouping and that reduces the time required to retrieve selected files or data from the database.
DESCRIPTION OF THE RELATED ART
Since their invention, computers have been used to search through files or other collections of data that are too large to be held in their entirety in main memory at one time. Programs therefore often deal with large masses of data by reading the data items in a file sequentially, a part at a time. When a file is processed by handling each record separately, in sequence, the organization of the file is not very important. When a file is used by repeatedly looking for a specific item of interest and operating on it, the organization of the file can be very important to making that operation efficient. For that purpose, records in a file are often stored in a specific order, so that a particular record can be found by searching for its position in that order. For example, a bank can sort a file of information about its depositors by account number. The record for a particular account can then easily be found by its number. A file can be sorted according to more than one sequence, so that records can be found by more than one attribute. There is a very extensive body of literature dealing with the sorting and searching of data. In most methods of organizing files, the data is sorted according to some attribute, such as the name of the client. Ties in the sort order can be broken using another ordering criterion, but the sort is still intrinsically one-dimensional.
The organization is such that, for any two records, one comes before the other. This is convenient for storing data in a computer, because computer memory devices are also intrinsically one-dimensional. The records come in a fixed sequence and, for many storage devices, it takes significant time to move from one point in the sequence to another. (Even though the surface of a disk is two-dimensional, the data on the disk is stored in a one-dimensional sequence.) When the data to be stored is intrinsically two-dimensional (or of some dimension greater than 1), such as geographic map data, this organization is not convenient. Often one wants to place records for objects that are close to each other in the appropriate multi-dimensional space close to each other in the file, because the file will be used in such a way that one usually goes from the record of one object to the record of a nearby object. However, organizing the file so that this can be done is not easy. If, for example, one has a list of 5 places chosen arbitrarily in a city, there is no method of ordering them that necessarily places the records of points that are close to each other on the ground close to each other in the file. The best one can hope for is to place as many related records as possible close to each other in the file. Certainly, sorting by one coordinate and breaking ties with the other is not a very good approach. Imagine the points in a city sorted by longitude, with ties broken by sorting by latitude.
Then, for example, the records of points of interest that are close to each other on the ground can be separated in the file by the records of many points that lie between the points of interest in longitude but that are at the other end of the city in latitude. Several techniques have been created for grouping data records that represent objects from an intrinsically multi-dimensional space. Many of these are simply geographic (or geometric) and collect together records of objects that are close to each other but not related. Many others are not efficient in terms of either the resources used in grouping or the use of space in the data file. The present invention relates to a new method for grouping related data from a multi-dimensional space. There are many methods in the literature for grouping multi-dimensional data, especially two-dimensional data such as map data. Many are "top-down" methods, such as the quadtree ("four-tree") method, which successively subdivide the data into groups until the resulting groups are as small as necessary. These methods tend to produce groupings with significant variation in size, or subdivisions with different numbers of levels in different regions of the data, or both. Other methods are "bottom-up" and tend to operate locally, filling one group before moving on to build the next. This also tends to result in significant imbalances in the sizes of the groupings and in irregular shapes. The method of the present invention is "bottom-up" but global, resulting in well-balanced and well-formed groupings throughout the database.
SUMMARY OF THE INVENTION
In view of the foregoing, a main objective of the present invention is a method for grouping multi-dimensional related data, such as geographic map data, in a computer database that is faster and more efficient in terms of data retrieval and storage than previously known methods.
In a preferred embodiment, the method of the present invention comprises the following steps:
a. establishing a vertex in a graph for each object among a collection of objects of interest in a computer database, so as to provide a graph comprising a plurality of vertices;
b. connecting selected pairs of said plurality of vertices by means of an angle;
c. providing each of the angles connecting non-suppressed vertices with a score that is a measure of how desirable it is to combine the vertices associated with it;
d. selecting the two vertices connected by the angle that has the best score;
e. combining said two vertices;
f. creating a new joined vertex and angles that are provided with a score; and
g. repeating steps (d) to (f) until a predetermined termination condition has been reached.
In another preferred embodiment of the present invention, the method described above further comprises the step of suppressing those vertices and associated angles that represent objects for which it is not desired to maintain data in the database, so as to leave a plurality of vertices that represent characteristics for which data is present in the database.
In another embodiment of the present invention, the method described above further comprises the step of generating and maintaining a list of vertices in which, for each vertex, the score of the best angle connected to that vertex and the other vertex to which that best angle connects are listed.
In yet another embodiment of the present invention, the step of generating and maintaining a list of vertices comprises the step of sorting said list by best-angle score.
To facilitate the methods described above, the following procedures are sometimes, if not always, used:
a. the list of vertices and their accompanying scores is compiled with the highest scores at the top of the list;
b. groups are themselves grouped in an iterative process, with successively larger limits placed on their combined sizes after each iteration, so that a certain degree of uniformity in the size of the data records can be maintained.
In another embodiment of the present invention, a method is provided for combining groupings in which the score of each newly created angle corresponds to the number of angles replaced by the new angle. Starting with the combination of the groupings connected by the angle having the highest value, the combination continues until the combined groupings reach a predetermined size.
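By way of illustration only, steps (a) to (g) above can be sketched in Python as follows. The function name `cluster` and the caller-supplied `score` and `can_combine` callbacks are assumptions introduced for this sketch and do not appear in the original. The "angles" of the text are treated as the edges of an undirected graph, and every angle is rescored on each pass for simplicity rather than speed.

```python
import math


def cluster(features, angles, score, can_combine):
    """Greedy sketch: repeatedly join the two groups linked by the best angle."""
    index = {f: i for i, f in enumerate(features)}
    groups = {i: frozenset([f]) for f, i in index.items()}   # step (a)
    adj = {i: set() for i in groups}
    for a, b in angles:                                      # step (b)
        adj[index[a]].add(index[b])
        adj[index[b]].add(index[a])
    next_id = len(groups)

    def angle_score(u, v):                                   # step (c)
        if not can_combine(groups[u], groups[v]):
            return -math.inf          # this pair may never be combined
        return score(groups[u], groups[v])

    while True:
        candidates = [(angle_score(u, v), u, v)
                      for u in adj for v in adj[u] if u < v]
        best = max(candidates, key=lambda t: t[0],
                   default=(-math.inf, None, None))          # step (d)
        if best[0] == -math.inf:                             # step (g)
            break
        _, u, v = best
        x, next_id = next_id, next_id + 1
        groups[x] = groups.pop(u) | groups.pop(v)            # step (e)
        adj[x] = (adj.pop(u) | adj.pop(v)) - {u, v}          # step (f)
        for n in adj[x]:
            adj[n] = (adj[n] - {u, v}) | {x}
    return list(groups.values())
```

In this sketch the termination condition is reached when no angle has a score above minus infinity, which matches the stopping rule described later in the text.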
BRIEF DESCRIPTION OF THE DRAWINGS
The above objects, features and advantages of the present invention will become apparent from the following detailed description and the accompanying drawings, in which:
Figure 1 is a representation of a section of a map of a city;
Figure 2 is a graph showing the vertices for each area and line segment corresponding to the characteristics of the map of Figure 1;
Figure 3 is a reproduction of the graph of Figure 2 with the addition of angles that connect the vertex for each area to the vertex for each line segment that adjoins the area;
Figures 4 to 6 are a series of drawings illustrating the suppression of a selected vertex;
Figure 7 is a reproduction of the graph of Figure 3 after the selected vertices have been deleted;
Figure 8 is a reproduction of Figure 7 in which another vertex has been deleted;
Figure 9 is a diagram illustrating the bounding boxes of an area and a line segment according to the present invention;
Figure 10 is a diagram illustrating the bounding boxes of two line segments according to the present invention;
Figure 11 is a diagram illustrating the bounding box of a diagonal line segment according to the present invention;
Figure 12 is a reproduction of the graph of Figure 8 with scores added to selected angles according to the present invention;
Figure 13 is a diagram illustrating a joining of two vertices of Figure 12 according to the present invention;
Figure 14 is a graph containing a plurality of vertices with arbitrary scores adjacent to the various angles connecting pairs of the vertices;
Figure 15 is a reproduction of the graph of Figure 14 in which two of the vertices of Figure 14 have been joined and new angles associated with them have been created;
Figure 16 is a grouping graph according to another embodiment of the present invention;
Figure 17 is a grouping graph that is a reproduction of the graph of Figure 16 after the joining of a pair of vertices and their associated angles;
Figure 18 is a reproduction of the graph of Figure 17 after the joining of a pair of vertices and their associated angles; and
Figure 19 is a reproduction of the graph of Figure 18 after the joining of a pair of vertices and their associated angles.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
As used here, the term "characteristic" means a single indivisible object described by a database, e.g., a section of a river, a block of a street, a lake, a segment of a political boundary, etc. The terms "group" and "vertex" each mean a collection of one or more characteristics. The term "vertex" is used when the collection is treated as an element of a graph. The term "group" is used when the collection is treated as a set of characteristics that are to be stored together in a data file. As used here, there is otherwise no distinction between a group and a vertex. Referring to Figure 1, a section of a two-dimensional surface is shown, e.g., a map of a city, generally designated (100), comprising a plurality of line segments (1-16) representing roads or streets and five designated areas (17-21) that are bordered by the line segments (1-16). Some or all of the line segments could represent features to be represented in the database for subsequent retrieval, e.g., streets, political boundaries, rivers, etc. In this example, each line segment represents a segment of a street. Likewise, some or all of the areas could represent characteristics to be represented in the database for subsequent retrieval, for example, a park, a lake, an office complex, etc. In this example, area (18) represents a park, while the remaining areas (17), (19), (20) and (21) do not represent characteristics to be stored in the database. Referring to Figure 2, a graph generally designated (110) for the objects on the map (100) is shown. For example, the objects may comprise a section or length of a street, a building, a park, a lake, the boundary of an area, etc.
In the graph (110) a vertex, represented by a point, is provided for each object, that is, for each line segment (1-16) and each area (17-21). For example, for the line segments (8), (5), (1) and (4) that bound the area (17) there are vertices (31-34), respectively, and for the area (17) there is a vertex (35). It should be noted that the horizontal and vertical lines (8), (5), (1) and (4) and the like shown in the graph (110) and in subsequent graphs are construction lines. As distinguished from the angles described below, they have no meaning and are provided only to facilitate the relative placement of the vertices in the graph (110). Referring to Figure 3, the vertex for each area is connected to, that is, associated with, the vertex for each line segment that adjoins the area. The connection is called an "angle" (an edge of the graph). For example, counting clockwise, there are four angles (40-43) that connect the vertex (35) for the area (17) to the vertices (31-34) for the line segments (8), (5), (1) and (4) of Figure 1, respectively. The next step in the data grouping method according to the present invention is to suppress the vertex for each area and each line segment that is to be omitted from the database, i.e., that does not correspond to data that one wishes to retrieve subsequently from the database. Thus, recalling that areas (17), (19), (20) and (21) are not characteristics to be stored in the database, the vertices corresponding to those areas are deleted. The angles that connect the vertex for each such area to its neighboring vertices, for the line segments that adjoin the area, are removed and replaced by angles interconnecting those neighboring vertices, so that the non-suppressed neighboring vertices that were connected to the suppressed vertex by adjacent suppressed angles are now connected to one another by a newly constructed angle.
For example, as seen in Figures 3 and 4, the vertex (35) is connected by angles (40-43) to neighboring vertices (31-34). As shown in Figure 5 by means of broken lines and a hollow point, the vertex (35) and the angles (40-43) of Figures 3 and 4 are suppressed by removing them and connecting the vertices (31-34) by angles (50-53), respectively, resulting in the graph fragment shown in Figure 6. The graph that results from suppressing the vertices for the other areas (19), (20) and (21) using the same technique is shown in Figure 7. Note that some angles are drawn as curved lines and others as straight lines. The drawing serves only to help visualize the connections between the vertices; the actual path of the drawn lines is of no consequence. With reference to Figure 7, a vertex (64) representing the area (18) is connected by angles (54-58) to vertices (62), (63), (32), (60) and (61), respectively. The vertex corresponding to the area (19) has been deleted, but the line segments bounding the area (19) are represented by vertices (62), (65), (66) and (67). The vertex (62) is shown connected to its neighboring vertices (64), (65) and (67) by angles (54), (70) and (69), respectively. Referring to Figure 8, if it is desired to delete the vertex (62) and its associated angles (54), (69) and (70) shown in Figure 7, the vertex (62) and the angles (54), (69) and (70) are removed, and the neighboring vertices (64), (65) and (67) to which they were connected are now connected by angles (71), (72) and (73). It will now be recognized that each feature on the map that one may subsequently wish to retrieve from the database has a vertex, and each vertex represents a feature on the map. In addition, characteristics that are adjacent on the map are adjacent in the graph, that is, they are connected by an angle.
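The suppression step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the name `suppress_vertex` and the dictionary-of-lists representation are hypothetical, the neighbour list of the suppressed vertex is assumed to be given in clockwise order, and the sketch does not try to preserve the clockwise order at the surviving vertices.

```python
def suppress_vertex(neighbours, v):
    """Remove v and link its former neighbours to one another in ring order.

    neighbours maps each vertex to a list of adjacent vertices; the list for
    v is assumed to be in clockwise order, as in Figure 3.
    """
    ring = neighbours.pop(v)
    # Drop the suppressed angles that led to v.
    for n in ring:
        neighbours[n] = [m for m in neighbours[n] if m != v]
    # Connect each former neighbour to the next one around the ring.
    for i, a in enumerate(ring):
        b = ring[(i + 1) % len(ring)]
        if a != b and b not in neighbours[a]:
            neighbours[a].append(b)
            neighbours[b].append(a)
    return neighbours


# Fragment of Figure 3: area vertex 35 and street vertices 31-34.
fragment = {35: [31, 32, 33, 34], 31: [35], 32: [35], 33: [35], 34: [35]}
suppress_vertex(fragment, 35)
```

Applied to the fragment, the four angles to vertex 35 are replaced by four angles joining 31-32, 32-33, 33-34 and 34-31, which corresponds to angles (50-53) of Figure 6.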
In the following discussion of the present invention, each vertex represents a group comprising one or more features. Initially, before any groups are combined, each group contains only one feature. Whenever consideration is given to combining two groups, that is, a first and a second group, a score (a figure of merit or other value, not necessarily numerical) is assigned to the combination. The score is a measure of how desirable it is to combine the groups under consideration. In a first embodiment of the present invention, where the data is geographic data, the score equals the area of the bounding box of the first group plus the area of the bounding box of the second group minus the area of the bounding box of the combined groups. In this embodiment, the area of the bounding box of a group is the area of the smallest rectangle, with edges running north-south and east-west, that encloses the group, i.e., the actual characteristic or characteristics represented by a vertex. Referring to Figure 9, and considering a joining of the vertex/group for the park with the vertex/group for the street on the east side of the park (north is at the top of the page), it can be seen that the area of the bounding box of the park group is equal to AB, while the area of the bounding box of the group for the street east of the park is zero. The area of the bounding box of the street group is zero because, while the length of the bounding box equals the length of the street segment, i.e., B, the width of the bounding box is zero. Thus, applying the above equation, the score for the joining is AB + 0 - AB = 0. It should be noted that the dotted lines in the figure defining the bounding boxes are displaced slightly from their true positions for visibility. With reference to Figure 10, a value corresponding to its length is placed adjacent to each of the line segments.
For example, the line segment C, east of the park, is assigned a value of 2, and a line segment D, which extends from it, is assigned a value of 4. If one considers combining the groups associated with line segment C and line segment D, applying the equation described above shows that the area of the bounding box of the first group, designated C for convenience, is zero, and that the area of the bounding box of the second group, designated D for convenience, is zero, but that the area of the bounding box enclosing both groups is CD.
Recalling that the score for the angle joining two groups is the area of the bounding box of the first group plus the area of the bounding box of the second group minus the area of the bounding box of the combined groups, the score for joining the groups C and D corresponding to line segments C and D is calculated as follows: 0 + 0 - CD = -CD, or 0 + 0 - (2 x 4) = -8. It should be noted that the bounding box area of a group for a line segment is not always zero. For example, with reference to Figure 11, a diagonal segment (80) having a vertex (81) is shown. Since the bounding box has a non-zero extent in the north-south direction as well as in the east-west direction, the area of the bounding box of the vertex (81) is GH. As shown in Figure 12, using the technique described above, a score is calculated for each angle that joins a pair of groups in the graph. For example, the score for the angle connecting the group W and the group M1, which corresponds to the joining of groups C and D, is -8, as calculated above. Once each of the angles in the graph has been scored, the angle with the highest score is selected. Among the angles illustrated in Figure 12, the highest score is zero; all the others are negative, that is, less than zero. If more than one angle has the same highest score, any of them can be selected arbitrarily. In this embodiment, the scoring method is such that the best angle, that is, the angle connecting the groups that are most desirable to combine, has the highest score. It is possible to use other scoring methods, such as one in which the best angle has the lowest score. After an angle with the highest score has been selected, the task of combining the groups connected by that angle begins. Combining the groups comprises joining the vertices and combining the lists of characteristics associated with each of the vertices.
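The bounding-box score used in this embodiment can be illustrated with a short Python sketch. The names `Box`, `union` and `score` are hypothetical, and the coordinates chosen for the park and for segments C and D are assumptions made only so that the dimensions match the worked examples of Figures 9 and 10 (a park of 3 by 5 units stands in for A by B).

```python
from typing import NamedTuple


class Box(NamedTuple):
    """Axis-aligned bounding box with edges running N-S and E-W."""
    west: float
    south: float
    east: float
    north: float

    def area(self):
        return (self.east - self.west) * (self.north - self.south)


def union(a, b):
    """Smallest box enclosing both a and b."""
    return Box(min(a.west, b.west), min(a.south, b.south),
               max(a.east, b.east), max(a.north, b.north))


def score(a, b):
    """Area of a plus area of b minus area of the combined bounding box."""
    return a.area() + b.area() - union(a, b).area()


park = Box(0, 0, 3, 5)          # A = 3, B = 5 (illustrative values)
east_street = Box(3, 0, 3, 5)   # zero width, length B
segment_c = Box(0, 0, 0, 2)     # length 2, running north-south
segment_d = Box(0, 2, 4, 2)     # length 4, running east-west from the end of C

print(score(park, east_street))    # AB + 0 - AB
print(score(segment_c, segment_d)) # 0 + 0 - (2 x 4)
```

A score of zero means the joined group occupies no more bounding-box area than its parts, while a negative score measures the empty area that the joining would enclose.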
For example, again with reference to Figure 12, assume that it is desired to join the groups designated V and W, which are connected by the angle designated (90). For convenience, the vertices connected to the group V are designated, in clockwise order, N1, N2, N3 and N4. The vertices connected to the vertex W are designated, in clockwise order, V, M1 and M2. Figure 13 shows the result of combining the groups, that is, of joining the vertices V and W: a new vertex X is created, and all the angles that connected the vertices N1-N4 to the vertex V and all the angles that connected the vertices M1 and M2 to the vertex W are now connected to the newly created vertex X. Note that the angles at the new vertex X are connected in the same order as the previous angles at V and W. Reading clockwise, the angles connect first to the vertices that were connected to V, in the same order, namely N1, N2, N3 and N4, and then to the vertices that were connected to W, in the same order, namely M1 and M2. All other angles and vertices in the graph remain undisturbed. After the two vertices have been joined as described above, the newly created angles are scored. In this regard, only the angles connected to the new vertex X need to be rescored. After the angles have been scored, one of the angles with the highest score is again selected. The groups connected by it are combined and their respective vertices are joined, as described above with respect to Figures 12 and 13. The combining of groups and joining of vertices continues in this way until it is no longer possible to join two groups.
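The order-preserving joining of two vertices can be sketched as follows. The name `join_vertices` and the dictionary of clockwise neighbour lists are assumptions of this sketch. A common neighbour of the two joined vertices is kept only once, which is a simplification that the text does not discuss at this point.

```python
def join_vertices(order, v, w, x):
    """Join adjacent vertices v and w into x, keeping clockwise angle order.

    order maps each vertex to its neighbours listed clockwise.
    """
    def rest_after(ring, pivot):
        # Neighbours in clockwise order, starting just after the pivot.
        i = ring.index(pivot)
        return ring[i + 1:] + ring[:i]

    ring = rest_after(order.pop(v), w) + rest_after(order.pop(w), v)
    order[x] = list(dict.fromkeys(ring))        # drop repeated neighbours
    for n in order[x]:
        renamed = [x if m in (v, w) else m for m in order[n]]
        order[n] = list(dict.fromkeys(renamed))
    return order


# Fragment of Figure 12 around V and W.
figure_12 = {
    "V": ["N1", "N2", "N3", "N4", "W"],
    "W": ["V", "M1", "M2"],
    "N1": ["V"], "N2": ["V"], "N3": ["V"], "N4": ["V"],
    "M1": ["W"], "M2": ["W"],
}
join_vertices(figure_12, "V", "W", "X")
print(figure_12["X"])   # N1, N2, N3, N4, then M1, M2, as in Figure 13
```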
This occurs, for example, when the list of characteristics associated with the groups would become too large for a record, for example, when the record containing the characteristics of the combined groups would exceed a predetermined size, for example, 8192 bits, and/or when the combined groups would exceed a predetermined geographic area, number of data items, or the like. When it has been determined that the groups connected by an angle cannot be combined, the angle is given a score of minus infinity (-∞) instead of a score computed by the technique described above. Minus infinity is used because it is the lowest possible score. Thus, the combining of groups is complete when every angle has a score of -∞. In practice, scoring all the angles and searching for the highest-scoring angle is slow. To limit rescoring and thereby speed up the process, a list of vertices is generated. For each vertex, the list records a vertex score (that is, the score of the highest-scoring angle associated with the vertex) and the neighboring vertex to which that angle connects. If more than one angle has the same highest score, one neighboring vertex is chosen arbitrarily. The list of vertices is sorted by vertex score. For example, Figure 14 shows a graph of a plurality of vertices A-H with arbitrary scores placed adjacent to the various angles connecting pairs of vertices. The highest-scoring angle of vertex C has a score of 2. There are two such angles: one connects to vertex A and the other to vertex D. In the list below, vertex D is chosen. The vertices are then sorted, with the vertices having the highest scores placed at the top of the list and the vertices with the lowest scores at the bottom.
Vertex    Best angle leads to this vertex    Score of best angle
E         F                                  9
F         E                                  9
D         E                                  7
C         D                                  2
A         C                                  2
B         E                                  2
G         C                                  1
H         F                                  -8
With the list sorted so that the vertex with the highest score is at the top, it can be seen that the next joining will involve the vertices E and F. After the vertices E and F have been joined to create a new vertex Z, as shown in Figure 15, the newly created angles are scored. Note that the scores for the newly created angles connected to the vertex Z are arbitrary for purposes of this example. Again, the scores of the unaffected angles remain the same, and only the angles connected to the vertex Z need to be scored. For purposes of illustration, assume that the scores for the angles connected to the new vertex Z are as follows:
B - Z    3
D - Z    2
H - Z    -8
In the list of vertices described above, the entries for E and F are removed, since the vertices E and F no longer exist. Next, an entry is added for Z, and its position and the positions of the neighboring vertices of Z, that is, B, D and H, are adjusted as necessary according to their scores. The new list is as follows:
Vertex    Best angle leads to this vertex    Score of best angle
B         Z                                  3
Z         B                                  3
D         Z                                  2
C         D                                  2
A         C                                  2
G         C                                  1
H         Z                                  -8
In some applications the intended use of the grouped data is such that one wishes to aggregate sets of groups to form larger groups in a hierarchical manner. That is, one wishes to form one set of groups and then form another set of groups such that the set of elements of each group in the second grouping is the union of the sets of elements of some collection of groups from the first grouping.
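The sorted list of vertices shown in the two tables above can be sketched in Python as follows. The helper names are hypothetical. Because Figure 14 is not reproduced here, the dictionary `ASSUMED_FIGURE_14` contains only the angles implied by the first table, which is an assumption. Ties are broken by whichever angle is encountered first, so rows with equal scores may appear in a different order than in the tables.

```python
def angle(u, v):
    return frozenset((u, v))


def best_row(vertex, scores):
    """(vertex, best neighbour, best score) over the angles touching vertex."""
    touching = [(s, next(iter(pair - {vertex})))
                for pair, s in scores.items() if vertex in pair]
    s, neighbour = max(touching, key=lambda t: t[0])
    return (vertex, neighbour, s)


def build_list(scores):
    vertices = set().union(*scores)
    rows = [best_row(v, scores) for v in sorted(vertices)]
    return sorted(rows, key=lambda r: r[2], reverse=True)


def update_after_join(rows, scores, joined, new_vertex):
    """Rescan only the new vertex and its neighbours, then re-sort."""
    touched = {v for pair in scores if new_vertex in pair for v in pair}
    kept = [r for r in rows if r[0] not in joined and r[0] not in touched]
    fresh = [best_row(v, scores) for v in sorted(touched)]
    return sorted(kept + fresh, key=lambda r: r[2], reverse=True)


ASSUMED_FIGURE_14 = {
    angle("E", "F"): 9, angle("D", "E"): 7, angle("C", "D"): 2,
    angle("A", "C"): 2, angle("B", "E"): 2, angle("C", "G"): 1,
    angle("F", "H"): -8,
}
rows = build_list(ASSUMED_FIGURE_14)

# Join E and F into Z and score the new angles as in the text.
after = {p: s for p, s in ASSUMED_FIGURE_14.items() if not p & {"E", "F"}}
after.update({angle("B", "Z"): 3, angle("D", "Z"): 2, angle("H", "Z"): -8})
rows = update_after_join(rows, after, {"E", "F"}, "Z")
```

A production implementation would more likely keep the rows in a priority queue than re-sort a list, but the re-sort keeps the sketch short.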
In this case, one begins the second grouping operation with the graph in the state in which it was left by the first grouping operation, instead of with one data element per vertex, and rebuilds the priority queue of angles as necessary if the angle-scoring mechanism has been modified. Alternatively, in some applications the intended use of the grouped data is such that one wishes to form higher-level groups whose elements are themselves lower-level groups. In this case, one simply applies the present invention again. In the second grouping process, the data elements are the groups produced by the first grouping process, instead of the data elements that were grouped in the first grouping process.
In other applications the nature of the data being grouped is such that the method described above can produce groups that vary greatly in size. It may then be useful to apply the technique iteratively, with successively larger limits. For example, suppose that a group can contain at most N bits. One would group the data once, limiting the size of each group to N/16 bits; then aggregate these groups to form larger groups with size limited to N/4 bits; and then aggregate the larger groups to form still larger groups with size limited to N bits. Experience has shown that this iterative process results in groups of more uniform size. As discussed above, the determination of whether two groups can be combined often depends on whether the result would be too large. In these cases, the format in which the data is stored is often such that combining two groups results in a new group whose size differs from the sum of the sizes of the two original groups. It can be computationally expensive to compute the exact size of the combined group but easier to compute an estimated upper bound on that size. In these cases, one can store with each vertex an estimated upper bound on the size of its group, which one then uses to evaluate the combinability of groups conservatively, sometimes rejecting the combination of two groups because the estimated size erroneously indicates that the result would be too large. This can result in a significant saving of computational effort. One can combine this approach with occasionally computing the exact size of each group, for example, once at the beginning of each iterative or hierarchical stage, or at randomly chosen times during the grouping process. In still other applications the nature of the data being grouped is such that one discovers, in the course of the grouping process, groups whose combination is desirable.
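The iterative use of successively larger size limits (N/16, then N/4, then N) described above can be sketched as follows. The names `merge_pass` and `staged_grouping` are hypothetical, and the rule of joining the adjacent pair with the smallest combined size first is an assumption standing in for whatever scoring formula is actually in use. Group size is taken to be simply additive, which the text notes is not always the case.

```python
def merge_pass(sizes, adj, limit):
    """Join adjacent groups, smallest combined size first, up to limit."""
    next_id = max(sizes) + 1
    while True:
        fits = [(sizes[u] + sizes[v], u, v)
                for u in adj for v in adj[u]
                if u < v and sizes[u] + sizes[v] <= limit]
        if not fits:
            return sizes, adj
        _, u, v = min(fits)
        x, next_id = next_id, next_id + 1
        sizes[x] = sizes.pop(u) + sizes.pop(v)
        adj[x] = (adj.pop(u) | adj.pop(v)) - {u, v}
        for n in adj[x]:
            adj[n] = (adj[n] - {u, v}) | {x}


def staged_grouping(sizes, adj, n):
    """Run three passes with limits N/16, N/4 and N, each starting from
    the graph as the previous pass left it."""
    for limit in (n // 16, n // 4, n):
        sizes, adj = merge_pass(sizes, adj, limit)
    return sizes


def path_graph(count):
    """Illustrative input: unit-size items connected in a single chain."""
    sizes = {i: 1 for i in range(count)}
    adj = {i: {j for j in (i - 1, i + 1) if 0 <= j < count}
           for i in range(count)}
    return sizes, adj
```

Each pass continues from the state left by the previous one, in the manner described for hierarchical grouping.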
For example, if the data being grouped is geographic and one wishes to minimize the extent of overlap between groups, one may find that, after some aggregation has been carried out, there are two groups whose vertices are not connected by an angle but such that the geographic extent of one entirely contains the geographic extent of the other. One then wishes to add an angle between the vertices of those groups. One can add a stage in which every pair of groups is examined. Unfortunately, examining every pair of vertices in the whole grouping graph can be computationally expensive. However, if one maintains the grouping graph as a planar graph, one can find many such pairs of vertices quickly by examining each face of the graph (a face of a planar graph is a region of the plane bounded by angles of the graph) and checking each pair of non-adjacent vertices on the face. In addition, if one then adds an angle between such a pair, the planarity of the grouping graph is preserved. In another embodiment of the invention, the data to be grouped represent the vertices of a graph themselves. The vertices that one wishes to group together are only those connected by an angle in the original graph. In this case, the original graph itself can be used as the initial state of the grouping graph. In this embodiment, the goal of grouping is to minimize the total number of angles of the original graph that connect vertices in different groups. To accomplish this, each angle in the grouping graph has associated with it the number of angles in the original graph represented by that angle. This number is called the "weight" of the angle. The means of storing the grouping graph is modified so that a weight can be stored with each angle. The goal is then to minimize the total weight of all the angles in the grouping graph when grouping has been completed.
First, the score of each angle in the grouping graph is made equal to its weight. Next, two groups are combined. The angles of the original graph that previously ran between the two groups now lie within a single group. No other angle between groups is affected. When two groups that have a common neighbor are combined, the weight of the angle from the new vertex to that neighbor becomes equal to the sum of the weights of the two previous angles. As a result, at each stage, the weights of the angles and the numbers of original angles between the groups remain synchronized. Because in this embodiment there is no notion of geographic compactness, there is no need to maintain the grouping graph as a planar graph.
With reference to Figures 16-19, a grouping graph is shown comprising six vertices (1-6) that are interconnected by a plurality of horizontal, vertical and diagonal lines, i.e., angles. For purposes of illustration, the vertices (1) and (2) will be joined, as shown by the broken lines surrounding the vertices (1) and (2). Referring to Figure 17, after the vertices (1) and (2) are joined, a new vertex designated (1),(2) is created together with the new angles resulting from the joining. In this embodiment, the score of each newly created angle is equal to the sum of the scores of the angles it replaces. To determine this sum, it is necessary to refer to the previous graph. There one can see that the newly created angles running between (1),(2) and (5) and between (1),(2) and (6) have scores of 2 and 1, respectively. The reason the angle running between (1),(2) and (5) has a score of 2 can be seen from Figure 16, where one angle with a score of 1 ran from (2) to (5) and another angle, also with a score of 1, ran from (1) to (5), the sum being 2. In the other cases, namely (1),(2) to (4) and (1),(2) to (3), only one angle, with a score of 1, ran between the respective vertices.
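The joining of vertices (1) and (2) can be sketched in Python as follows. The figures are not reproduced here, so the edge set below is an assumption: a two-by-three grid with two diagonals, chosen only because it is consistent with the scores quoted in the text. The names `ASSUMED_FIGURE_16`, `initial_weights` and `merge` are likewise hypothetical.

```python
# Assumed edge set for Figure 16 (the figure itself is not available):
# vertices 1-3 on the top row, 4-6 on the bottom row, plus two diagonals.
ASSUMED_FIGURE_16 = [(1, 2), (2, 3), (4, 5), (5, 6),  # horizontal
                     (1, 4), (2, 5), (3, 6),          # vertical
                     (1, 5), (2, 6)]                  # diagonal


def initial_weights(edges):
    """Every original angle starts with a weight (and score) of 1.
    A vertex is a tuple of the original labels it contains."""
    return {frozenset(((u,), (v,))): 1 for u, v in edges}


def merge(weights, a, b):
    """Join vertices a and b, summing the weights of angles that end up
    running to the same neighbor."""
    merged = tuple(sorted(a + b))
    new_weights = {}
    for edge, w in weights.items():
        if edge == {a, b}:
            continue  # this angle now lies inside the combined group
        if a in edge or b in edge:
            (other,) = edge - {a, b}
            edge = frozenset((merged, other))
        new_weights[edge] = new_weights.get(edge, 0) + w
    return merged, new_weights


v12, figure_17 = merge(initial_weights(ASSUMED_FIGURE_16), (1,), (2,))
for neighbor in (3, 4, 5, 6):
    print(neighbor, figure_17[frozenset((v12, (neighbor,)))])
# prints: 3 1 / 4 1 / 5 2 / 6 1 (one pair per line)
```

Under this assumed edge set, the angle from (1),(2) to (5) receives a score of 2 and the angles to (3), (4) and (6) each receive a score of 1, matching the description of Figure 17.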
The next step in the method of this embodiment is again to join the groups interconnected by the angle having the highest score, that is, the angle connecting (1),(2) and (5). With reference to Figure 18, joining the vertices (1),(2) and (5) creates a new vertex designated (1),(2),(5), which results in the creation of a new angle between (1),(2),(5) and the vertex (6). Referring back to Figure 17 to calculate the new scores for the angles in the grouping graph of Figure 18, it will be observed that the angle between the vertex (1),(2),(5) and the vertex (4) has a score of 2. This is because, before the joining, two angles connected the vertex (4) with the vertices (1),(2) and (5). Similarly, the newly created angle between the vertex (1),(2),(5) and the vertex (6) has a score of 2 because, as shown in Figure 17, the vertex (6) was connected to (5) by an angle having a score of 1 and to the vertex (1),(2) by another angle having a score of 1. The scores of the angle connecting the vertex (1),(2),(5) and the vertex (3) and of the angle connecting the vertex (3) and the vertex (6) remain equal to 1.
It is assumed for purposes of illustration that the combined vertex (1),(2),(5) forms the largest group possible, so that no other vertex can be combined with the group (1),(2),(5). In that case, as described above, each of the angles connecting the vertex (1),(2),(5) to other vertices is given a score of minus infinity (-∞) to reflect that fact. In this case, the only vertices that can still be combined are the vertices (3) and (6). With reference to Figure 19, joining the vertices (3) and (6) results in three vertices: the vertex (1),(2),(5), the vertex (4) and the vertex (3),(6). As indicated above, because the vertex (1),(2),(5) comprises a group of maximum size, no further combinations are allowed.
Thus, each of the angles extending from the vertex (1),(2),(5) is given a score of -∞, ending the grouping of the data.
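The whole sequence of Figures 16-19 can be sketched as a short greedy loop in Python. Several things here are assumptions rather than details taken from the patent: the edge set (the figures are not reproduced, so a grid with two diagonals consistent with the quoted scores is used), the size measure (number of original vertices per group, with a maximum of 3), the tie-breaking rule (lowest-numbered pair first, which happens to select vertices (1) and (2) as in the illustration), and the names `cluster` and `score`.

```python
import math

# Assumed edge set for Figure 16; the figure itself is not available.
ASSUMED_FIGURE_16 = [(1, 2), (2, 3), (4, 5), (5, 6), (1, 4),
                     (2, 5), (3, 6), (1, 5), (2, 6)]


def score(weights, edge, max_size):
    """Score equals weight, or -infinity if the combined group is too big."""
    u, v = edge
    return weights[edge] if len(u) + len(v) <= max_size else -math.inf


def cluster(edges, max_size):
    # A vertex is a tuple of the original labels it contains; an angle is
    # a sorted pair of vertices mapped to its weight.
    weights = {tuple(sorted(((u,), (v,)))): 1 for u, v in edges}
    vertices = {(x,) for e in edges for x in e}
    while weights:
        # Best score first; ties go to the lowest-numbered pair (assumption).
        best = min(weights, key=lambda e: (-score(weights, e, max_size), e))
        if score(weights, best, max_size) == -math.inf:
            break  # every remaining angle is forbidden: grouping ends
        a, b = best
        merged = tuple(sorted(a + b))
        vertices -= {a, b}
        vertices.add(merged)
        new_weights = {}
        for (u, v), w in weights.items():
            if (u, v) == best:
                continue  # now internal to the combined group
            u2 = merged if u in best else u
            v2 = merged if v in best else v
            key = tuple(sorted((u2, v2)))
            new_weights[key] = new_weights.get(key, 0) + w
        weights = new_weights
    return sorted(vertices), weights


groups, remaining = cluster(ASSUMED_FIGURE_16, max_size=3)
print(groups)     # [(1, 2, 5), (3, 6), (4,)]
print(remaining)  # {((1, 2, 5), (3, 6)): 3, ((1, 2, 5), (4,)): 2}
```

Under these assumptions the run joins (1) and (2), then (1),(2) and (5), then (3) and (6), and stops because both remaining angles touch the full group (1),(2),(5). The remaining weights sum to 5, meaning five of the nine assumed original angles connect vertices in different groups.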
Although preferred embodiments of the present invention are described above, it is contemplated that numerous modifications may be made thereto for particular applications without departing from the spirit and scope of the present invention. For example, the present invention is not limited to the grouping of map features but can be used to group anything distributed in two or more dimensions, such as components on a printed circuit board, nodes of a graph, people of common interest, stars in space, etc. Accordingly, it is intended that the described embodiments be considered only as illustrative of the present invention and that the scope thereof not be limited to them but be determined by reference to the claims that follow.

Claims (16)

  1. A method for grouping multi-dimensional related data in a computer database, comprising the following steps: a. establishing a vertex in a graph for each object among a collection of objects of interest in a computer database, so as to provide a graph comprising a plurality of vertices; b. connecting selected pairs of said plurality of vertices by means of an angle; c. providing each of the angles that connect the vertices with a score, which is a measure of how desirable it is to combine the vertices associated with it; d. selecting the two vertices connected by the angle that has the best score; e. combining said two vertices; f. creating a new combined vertex and new angles, which are provided with a score; and g. repeating steps (d) to (f) until a predetermined completion condition is met.
  2. A method according to claim 1, wherein said steps of identifying objects and connecting selected pairs of vertices by means of an angle comprise the following steps: a) identifying features corresponding to areas and line segments on a two-dimensional surface; b) creating a separate vertex for each identified area and line segment; and c) creating an angle from the vertex corresponding to each identified area to the vertex corresponding to each identified line segment that bounds the area.
  3. A method according to claim 1, wherein said steps of identifying objects and connecting selected pairs of vertices by means of an angle comprise the step of: a) identifying features corresponding to line segments and areas on a two-dimensional surface; and wherein said step of assigning a score to each angle, according to a predetermined formula, comprises the step of selectively assigning a score that is: i) equal to the area of a bounding box that bounds the first group plus the area of a bounding box that bounds the second group minus the area of a bounding box that bounds the combined groups, if the record size of the combined groups is less than a predetermined maximum size, or ii) equal to negative infinity (-∞), if the record size of the combined groups is equal to or greater than the predetermined maximum size.
  4. A method according to claim 1, wherein said step of assigning a score to each angle, according to a predetermined formula, comprises the step of selectively assigning a score that is equal to the number of angles that a newly created angle replaces.
  5. A method according to claim 1, comprising the following step: generating and maintaining a list of vertices in which, for each vertex, the score of the best angle connected to that vertex and the other vertex to which that best angle connects it are listed.
  6. A method according to claim 5, wherein the step of generating and maintaining a list of vertices comprises the step of sorting said list by best angle score.
  7. A method according to claim 1, comprising the step of repeating steps (c) to (g).
  8. A method according to claim 7, wherein each repetition of said steps (c) and (f) comprises the step of using a different angle-scoring method.
  9. A method according to claim 8, wherein said step (g) comprises a different and greater limit with each repetition of steps (c) to (g).
  10. A method according to claim 1, comprising the steps of iterating steps (a) to (g) at least twice using the results of the previous iteration in the development of each subsequent iteration.
  11. A method according to claim 1, wherein step (g) comprises the step of using an estimated upper limit instead of an exact value to determine whether said predetermined condition has been met.
  12. A method according to claim 1, comprising the step of identifying pairs of vertices that are not yet connected by an angle and that it is desired to combine, and connecting each such pair of vertices with an angle, during each repetition of steps (d) to (f).
  13. A method according to claim 12, wherein said step of identifying pairs of vertices that are not yet connected by an angle and that it is desired to combine is limited to considering vertices on the same face of the graph.
  14. A method according to claim 1, wherein the data being grouped is geographic map data.
  15. A method according to claim 1, comprising the step of suppressing those vertices and associated angles representing objects for which it is not desired to maintain data in the database, so that there remains a plurality of vertices representing features for which data is present in the database.
  16. A method according to claim 1, wherein said predetermined completion condition of step (g) is reached at the point at which any other combination of groups would exceed certain limits.
MXPA/A/1996/005656A 1994-05-18 1996-11-18 Method to group related data, multi dimension MXPA96005656A (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US08/245,690 US5706503A (en) 1994-05-18 1994-05-18 Method of clustering multi-dimensional related data in a computer database by combining the two verticles of a graph connected by an edge having the highest score
US08245690 1994-05-18
PCT/US1995/006150 WO1995031788A1 (en) 1994-05-18 1995-05-17 Method of clustering multi-dimensional related data

Publications (2)

Publication Number Publication Date
MX9605656A MX9605656A (en) 1998-05-31
MXPA96005656A MXPA96005656A (en) 1998-10-23
