DE60223546T2

DE60223546T2 - METHOD AND SYSTEM FOR REORGANIZING A "TABLESPACE" IN A DATABASE

Info

Publication number: DE60223546T2
Application number: DE60223546T
Authority: DE
Inventors: Amando B. Jr. Islandia ISIP; Stephen Islandia WEAVER; Jospeh Islandia ZELENKA
Original assignee: Computer Associates Think Inc
Current assignee: CA Inc
Priority date: 2001-07-19
Filing date: 2002-07-18
Publication date: 2008-09-18
Anticipated expiration: 2022-07-19
Also published as: CA2453174A1; CN1533540A; BR0211216A; ATE378641T1; WO2003009180A2; IL159633A0; EP1410260A2; KR20040017321A; JP2004536408A; EP1410260B1; DE60223546D1; WO2003009180A3

Abstract

A method of reorganizing a tablespace in a database may include reading a row of data from the database, analyzing the row of data read out from the database, determining whether to eliminate or retain the row of data based on at least one predetermined rule, reloading the row of data into the database when it is determined that the row of data complies with the at least one predetermined rule, eliminating the row of data when it is determined that the row of data does not comply with the at least one predetermined rule, and rebuilding an index related to the database to include keys that correspond to the reloaded row of data. The determining, reloading, eliminating and rebuilding steps are repeated for each row of data in the database.

Description

BACKGROUND

Field of the invention

The present invention relates to databases and, more particularly, to a method and system for reorganizing a "tablespace" in a database.

Prior art

The data in a database can be organized as tables made up of columns and rows of data, as shown in FIG. 1. In this example, a "product table" comprises several columns (product name, part number, expiration date) for storing rows of data belonging to different products (product 1, product 2, product 3, etc.). An index comprising multiple index keys may be provided to allow fast access to the data in the database. An index key is a minimal set of attributes that uniquely identifies each row in the database. For example, in the database shown in FIG. 1, "product name" may be the key if, for simplicity, each product is assumed to have a unique product name. In other words, the name of a product can be used to uniquely identify the row in the database in which the data for that product is stored.

The data in each row of the database or relation should be analyzed to ensure that it satisfies certain check constraints and preserves referential integrity. A constraint is a rule that restricts the values in a database or table. For example, in the database shown in FIG. 1, a constraint might be expiration date < May 16, 2001. That is, every value listed in the expiration date column should be earlier than May 16, 2001.
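The expiration date constraint can be illustrated with a minimal Python sketch. The row values, the dictionary layout, and the function name `satisfies_check_constraint` are assumptions made for illustration only; the source does not prescribe any data format.

```python
from datetime import date

# Hypothetical rows of the product table; the values are invented for illustration.
rows = [
    {"product_name": "Product 1", "part_no": 101, "expiration_date": date(2001, 3, 1)},
    {"product_name": "Product 2", "part_no": 102, "expiration_date": date(2001, 6, 30)},
]

# Cutoff taken from the example constraint: expiration date < May 16, 2001.
CUTOFF = date(2001, 5, 16)


def satisfies_check_constraint(row):
    """Return True when the row's expiration date is earlier than the cutoff."""
    return row["expiration_date"] < CUTOFF


# Names of the rows that violate the constraint.
violations = [r["product_name"] for r in rows if not satisfies_check_constraint(r)]
```

Here only "Product 2" violates the constraint, because its invented expiration date falls after the cutoff.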

Referential integrity requires that every non-null foreign key correspond to an actual key in some relation. A foreign key may be an attribute or set of attributes in one table that forms a key in some other table. Foreign keys are used to indicate logical links between relations. For example, in the database shown in FIG. 1, the part number foreign key (part no.) may relate the product table to a part table (not shown). Referential integrity ensures that the part number attribute remains a key in the part table, so that the relationship between the product table and the part table (not shown) remains valid. In general, check constraints and referential integrity requirements are specified by a database administrator and may vary depending on the applications that use the data in the database.
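A referential integrity test of this kind might look as follows in Python. This is a sketch under stated assumptions: the function name, the `part_no` column name, and the set of parent keys are hypothetical, and a null foreign key is modeled as `None`.

```python
def preserves_referential_integrity(row, parent_keys, fk_column="part_no"):
    """Return True when the row's foreign key is null or matches a key in the parent table."""
    value = row.get(fk_column)
    if value is None:
        # Only non-null foreign keys have to match a parent key.
        return True
    return value in parent_keys


# Hypothetical keys of the part table (not shown in the figures).
part_table_keys = {101, 102}

ok_row = {"product_name": "Product 1", "part_no": 101}
orphan_row = {"product_name": "Product 3", "part_no": 999}
null_row = {"product_name": "Product 4", "part_no": None}
```

With these invented values, `ok_row` and `null_row` pass the test, while `orphan_row` refers to a part number that is not a key in the part table.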

If the data in a row of a database or relation does not satisfy the constraints or does not preserve referential integrity, the data may be deleted. In addition, the index keys corresponding to the rows from which data is deleted may be deleted from the index. After deletion, both the database and the index may contain "holes", e.g. rows without data and/or places where keys were deleted. To maximize the effective use of space in the database and in the index, these holes should be removed.

Currently, a check utility can perform the checking and deletion of data that does not comply with the constraints or with referential integrity. A separate reorganization utility can then be used to reorganize the remaining rows of data so as to rebuild the database, eliminating rows without data. The reorganization utility can also rebuild the index against the reorganized table to eliminate gaps left behind by deleted keys.

The operation of a check utility that checks data for constraint compliance and referential integrity is shown in FIG. 2. In step S20, the check utility reads a row of data from the database. In step S22, the data read from the database is analyzed to ensure that it satisfies predetermined check constraints and preserves referential integrity. If the data from a row does not meet these requirements, the check utility may delete the data in that row. In step S24, the index keys corresponding to the deleted row may also be deleted from an index associated with the database. In step S26, the database and the index are rewritten with the spaces left by the deleted data and the deleted keys.

A reorganization utility can then be invoked to eliminate the holes (e.g. spaces left by the deleted data and keys) in the database and the index in the manner shown in FIG. 3. In step S30, the reorganization utility reads a row from the database. In step S32, the row is reloaded into the database if the data in the row has not been deleted by the check utility. In step S34, the index space is rebuilt to include only index keys corresponding to the row reloaded into the database by the reorganization utility in step S32.
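The two-utility process of FIG. 2 and FIG. 3 can be sketched in Python as two separate passes over the table. This is an illustrative model only: the table is assumed to be a list of rows, a hole is modeled as `None`, the index as a dictionary from key to row position, and all function and parameter names are hypothetical.

```python
def check_utility(table, index, is_valid, key_of):
    """First pass (FIG. 2): delete nonconforming rows in place, leaving holes."""
    for pos, row in enumerate(table):                # S20: read a row
        if row is not None and not is_valid(row):    # S22: analyze the row
            index.pop(key_of(row), None)             # S24: delete its index key
            table[pos] = None                        # the deleted row leaves a hole
    return table, index                              # S26: rewritten with holes


def reorg_utility(table, key_of):
    """Second pass (FIG. 3): reload surviving rows and rebuild the index."""
    new_table, new_index = [], {}
    for row in table:                                # S30: read a row again
        if row is not None:                          # S32: reload if not deleted
            new_table.append(row)
            new_index[key_of(row)] = len(new_table) - 1  # S34: rebuild the index
    return new_table, new_index


# Invented example data: row "b" is nonconforming.
table = [{"name": "a", "ok": True}, {"name": "b", "ok": False}, {"name": "c", "ok": True}]
index = {"a": 0, "b": 1, "c": 2}

checked_table, checked_index = check_utility(
    table, index, lambda r: r["ok"], lambda r: r["name"])
holes_after_check = checked_table.count(None)
final_table, final_index = reorg_utility(checked_table, lambda r: r["name"])
```

In this model every row is visited twice, once by each utility, which is the repetition the surrounding text describes.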

Although such methods of checking and reorganizing data in a database work, some of their operational characteristics can still be improved. For example, various input/output operations are used to carry out these methods. Each row is read by the check utility, and then every page of the database and of the index is rewritten after the nonconforming data has been deleted. The reorganization utility then reads every row of the database again and reloads into the database those rows from which data has not been deleted. The reorganization utility then rebuilds the index space so that it corresponds to the reloaded database. The input/output operations are repeated unnecessarily and increase the likelihood that errors are introduced into the data in the table.

The repetitive nature of these methods also costs time. First, the check utility runs to completion, reading and rewriting the data in the database and the index. Then the reorganization utility reads each row of the database together with the corresponding index keys in the index, reloads the database, and rebuilds the index space. While these two utilities are running, the data in the table is unavailable to user applications and online transactions.

It would therefore be desirable to provide a method and system for checking and reorganizing data in a database or relation more efficiently, so that the data in the table is unavailable for only a relatively short period of time.

SOCKUT, G. H., et al., "Database Reorganization – Principles and Practice" (ACM Computing Surveys, Vol. 11, No. 4, December 1979 (1979-12), pp. 371 to 395) mention a database management system ("SYSTEM 2000") that can unload records and indexes in logical order and then reload them in physical order. The reload command can be directed to reload only those records that meet certain criteria.

BRUNI, P., et al., "DB2 UDB for OS/390 Version 6 Performance Topics" (IBM, 1999) describe a "REORG DISCARD" command that allows rows to be discarded during the reorganization of a relational database. The rows to be discarded are specified by a WHEN keyword, which does not allow column-to-column comparisons. Index rebuilding is performed separately from the reorganization by multiple sort/build task pairs that are processed in parallel with one another.

Summary of the invention

According to the invention, the following are provided: a method according to claim 1; a system according to claim 6; and a computer recording medium comprising computer-readable code according to claim 11.

Brief description of the drawings

A more complete understanding of the present invention and many of its attendant advantages will be readily obtained as the same becomes better understood by reference to the following detailed description when considered in connection with the accompanying drawings, wherein:

FIG. 1 is a representation of a database in which data is stored;

FIG. 2 is a flow chart illustrating the operation of a check utility;

FIG. 3 is a flow chart illustrating the operation of a reorganization utility;

FIG. 4 is a block diagram illustrating a computer system for implementing a method and system according to the present invention;

FIG. 5 is a representation of a database in which data is stored;

FIG. 6 is a flow chart illustrating a method of reorganizing a database according to an embodiment of the present invention;

FIG. 7 is a flow chart illustrating a method of organizing a database according to another embodiment of the present invention.

Detailed description

In describing the preferred embodiments of the present invention illustrated in the drawings, specific terminology is employed for the sake of clarity. However, the present invention is not intended to be limited to the specific terminology so selected, and it is to be understood that each specific element includes all technical equivalents that operate in a similar manner.

The method and system of the present invention provide for reading a row from a database and checking the data in the row for constraint compliance and referential integrity. If the data in the row meets these requirements, the row of data is reloaded into the database; otherwise, the row of data is eliminated. An index associated with the database is then rebuilt so that it includes only keys belonging to the rows of data that were reloaded into the database. In a single seamless process, the row of data is read, analyzed, and reloaded or eliminated, and the index associated with the database is rebuilt.

The system and method may be implemented as a software application running on a computer system, e.g. a mainframe such as OS/390, a personal computer, a handheld computer, a server, etc. The computer system may be connected to a database. The connection may be made, for example, via a direct connection such as a direct hardwired or wireless connection, via a network connection such as a local area network, or via the Internet.

An example of a computer system in which the system and method of the present invention may be implemented is shown in FIG. 4. The computer system, referred to generally as system 400, may include a central processing unit (CPU) 402, a memory 404, a printer interface 406, a display unit 408, a LAN (local area network) data transmission controller 410, a LAN interface 412, a network controller 414, an internal bus 416, and one or more input devices 418 such as a keyboard, a mouse, etc. As shown, the system 400 may be connected to a database 420 via a link 422.

FIG. 5 illustrates a database or relation, designated the part number table, that includes columns representing a part number (part no.), a product name, and an expiration date. The data in the database should satisfy certain constraints and should preserve referential integrity, as mentioned above. If a row of data does not meet such requirements, the data in the row should be deleted, and the keys belonging to the row are deleted from an index associated with the database.

The present invention is directed to providing a method of reorganizing a database that both checks that the data in each row of the database satisfies predetermined constraints and referential integrity requirements and reorganizes the database and its associated index to eliminate any holes that may have been left by the deletion of data during the checking operation.

The method provides for reading each row of data in a database and analyzing the data according to predetermined rules. The row of data is either retained or deleted according to the predetermined rules. A retained row of data is reloaded into the database. An index associated with the database is rebuilt so that it includes keys corresponding to the retained row of data. This process is repeated for each row of data in the database.

A method of reorganizing a database according to an embodiment of the present invention is explained with reference to FIG. 6. In step S60, a row of data is read from the database. The row of data is analyzed according to predetermined rules in step S62. The row of data is eliminated or retained according to the predetermined rules. The predetermined rules may include check constraints and referential integrity requirements. These rules may be specified by a database administrator, who may also modify them when appropriate. If a row of data is not to be retained (No, step S63), the row of data is eliminated (step S65). If the row of data is to be retained (Yes, step S63), the row of data is reloaded into the database (step S64). In step S66, an index associated with the database is rebuilt with index keys corresponding to the retained row of data, if the row of data was reloaded into the database in step S64. In step S68, a determination is made as to whether there is another or next row in the index. If there is no next row (No, step S68), the method ends. If there is a next row (Yes, step S68), the method returns to step S60, where the next row is read from the database. The method is repeated for each row in the database.
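The flow of FIG. 6 can be sketched as a single loop in Python. This is a minimal sketch under stated assumptions, not the patented implementation: the database is modeled as a list of rows, the index as a dictionary from key to row position, and the names `reorganize`, `is_valid`, and `key_of` are hypothetical.

```python
def reorganize(rows, is_valid, key_of):
    """Single pass over the table, annotated with the step numbers of FIG. 6."""
    new_table = []   # stands in for the reloaded database
    new_index = {}   # maps each key to the position of its row in new_table
    eliminated = 0
    for row in rows:                                     # S60: read; S68: next row?
        if is_valid(row):                                # S62: analyze; S63: retain?
            new_table.append(row)                        # S64: reload the row
            new_index[key_of(row)] = len(new_table) - 1  # S66: rebuild the index
        else:
            eliminated += 1                              # S65: eliminate the row
    return new_table, new_index, eliminated


# Invented example data: the row with key 2 breaks the predetermined rules.
sample = [{"key": 1, "ok": True}, {"key": 2, "ok": False}, {"key": 3, "ok": True}]
table, index, dropped = reorganize(sample, lambda r: r["ok"], lambda r: r["key"])
```

Each row is read once and written at most once, and the index is built as rows are reloaded rather than in a separate pass.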

As noted above, if data in a row does not satisfy a constraint or does not meet the referential integrity requirements, the data is eliminated from the database. The data may be eliminated in a deletion step, or may simply not be reloaded into the database. If data in a row satisfies the constraints and the referential integrity requirements, the data may be retained and reloaded into the next empty row of the database. An empty row is a row in which no data is currently stored. Alternatively, the retained row of data may be loaded into a new database, in the next open row of the new database. In this way, no empty spaces remain in the database. The resulting reloaded database or new database includes only rows of data that satisfy the constraints and the referential integrity requirements. In addition, the index associated with the database can be rebuilt on the fly and includes keys belonging to rows that exist in the reloaded database.
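Reloading retained rows into the next empty row of the same database can be pictured as an in-place compaction. The following Python sketch assumes the table is a mutable list and the index a dictionary from key to row position; the function name `reload_in_place` and the sample data are hypothetical.

```python
def reload_in_place(table, is_valid, key_of):
    """Move each retained row to the next empty position and build the index on the fly."""
    index = {}
    next_empty = 0                        # position of the next empty row
    for read_pos in range(len(table)):
        row = table[read_pos]
        if is_valid(row):
            table[next_empty] = row       # reload into the next empty row
            index[key_of(row)] = next_empty
            next_empty += 1
        # A nonconforming row is simply not reloaded.
    del table[next_empty:]                # drop the leftover rows at the end
    return index


# Invented example data: the row with key 2 is nonconforming.
data = [{"key": 1, "ok": True}, {"key": 2, "ok": False}, {"key": 3, "ok": True}]
data_index = reload_in_place(data, lambda r: r["ok"], lambda r: r["key"])
```

Because the write position never runs ahead of the read position, no retained row is overwritten before it has been read.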

Using the method of the present invention, each row of data may be read and rewritten only once, so that the likelihood of an error occurring during the input and output steps of the method can be reduced. In addition, only one utility need be run to both check and reorganize the data of the rows of the table, which reduces the time needed to reorganize "tablespaces" in a database. The downtime during which the table is unavailable to user applications and online transactions can thus be shortened.

According to another embodiment of the present invention, a method of reorganizing a database is provided in which the database is divided into a plurality of partitions.

The method includes a step of separating the database and an associated index into a plurality of partitions. One of the plurality of partitions of the database is selected together with a corresponding partition of the associated index, and a row of data is selected from that partition. The row of data is analyzed according to predetermined rules and, according to those rules, is either retained or eliminated. A retained row of data is reloaded into the selected partition. The corresponding partition of the associated index is rebuilt so that it includes keys corresponding to the retained row of data loaded into the selected partition of the database. Every row of data in the partition is read, and every partition is analyzed. Only one of the plurality of partitions is analyzed at a time.

As mentioned above, the data in the database is unavailable to applications and e-commerce while the reorganization method runs. Although shortening the time needed to perform the reorganization and checking functions shortens the period during which the data is unavailable, the data is still completely unavailable for some period of time. Partitioning the database into a plurality of partitions and reorganizing each partition independently allows the other partitions of the database to remain available to user applications and e-commerce. In this way, at least some of the data in the database is always available.

The method is further described with reference to FIG. 7. In step S70, a database and an associated index are each divided into partitions. The partitions of the associated index correspond to those of the database. The number of partitions may depend on the relative size of the database and may be set by a user or the database administrator. This flexibility allows the method to be adapted for use with many different types of databases. In step S71, one partition of the plurality of partitions of the database is selected for reorganization, together with a corresponding partition of the associated index. The selected partition of the database may be any of the plurality of partitions. A user or the database administrator may determine which of the partitions is to be reorganized first, based on factors such as frequency of use, or perhaps based on the type of data contained in the partition. It should be noted that the user or database administrator also sets the conditions and the referential integrity requirements and is therefore likely to be in the best position to determine the best order in which the partitions are to be organized. Steps S72 to S78 proceed substantially as steps S62 to S68 described above with reference to FIG. 6, except that the rows of data are read from and reloaded into a selected partition of the database and a corresponding partition of the index is rebuilt. If there is no next row (No, step S78), the method may proceed to step S79, where a determination may be made as to whether another or next partition of the database exists. If there is no next partition (No, step S79), the method ends. If there is a next partition (Yes, step S79), the method may return to step S71, where the next partition is selected. If in step S78 there is a next row (Yes, step S78), the method may return to step S72, and the next row of data is read from the selected partition of the database.
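The control flow of FIG. 7 can be outlined in a short Python sketch. This is an illustrative assumption rather than the patented implementation: the round-robin split in `partition_rows`, the in-memory lists and dictionaries, and the names `reorganize_partitioned`, `index_parts` and `order` are all hypothetical.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, object]
Rule = Callable[[Row], bool]


def partition_rows(rows: List[Row], n: int) -> List[List[Row]]:
    """Step S70: split the rows into n partitions (round-robin is an arbitrary choice)."""
    parts: List[List[Row]] = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts


def reorganize_partitioned(
    partitions: List[List[Row]],
    index_parts: List[Dict[object, int]],
    key: str,
    rules: List[Rule],
    order: Optional[List[int]] = None,
) -> None:
    """Reorganize one partition at a time; untouched partitions stay as they are."""
    if order is None:
        order = list(range(len(partitions)))  # the administrator may supply another order
    for p in order:                     # step S71: select a partition (loop exit: step S79)
        kept: List[Row] = []
        new_index: Dict[object, int] = {}
        for row in partitions[p]:       # step S72: read a row (loop exit: step S78)
            if all(rule(row) for rule in rules):
                new_index[row[key]] = len(kept)  # key for the retained row
                kept.append(row)                 # reload into the next open row
            # a row that fails a rule is eliminated by not reloading it
        partitions[p] = kept            # only partition p was modified in this pass
        index_parts[p] = new_index      # matching index partition rebuilt


# Usage with a hypothetical condition: the quantity must be positive.
data = [{"name": f"Product {i}", "qty": q} for i, q in enumerate([1, 0, 0, 3, 2], start=1)]
parts = partition_rows(data, 2)
idx_parts: List[Dict[object, int]] = [{} for _ in parts]
reorganize_partitioned(parts, idx_parts, "name", [lambda r: r["qty"] > 0], order=[1, 0])
```

Because each pass of the outer loop touches a single partition and its index partition, a real system could keep the remaining partitions available while that pass runs; the sketch does not model locking or concurrent access.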

As noted above, when the data in a row fails to satisfy a condition or the referential integrity requirements, the data is eliminated from the database. The data may be eliminated in a delete step or may simply not be reloaded into the database. When the data in a row satisfies the conditions and the referential integrity requirements, the data may be retained and reloaded into the next empty row of the selected partition of the database. Alternatively, the retained row of data may be loaded into a partition of a new database, into the next open row of the new database. In this way, no gaps are left in either the selected partition of the database or the partition of the new database. The resulting reloaded partition of the database or partition of the new database includes only rows of data that satisfy the conditions and the referential integrity requirements. In addition, the corresponding partition of the index associated with the database or the new database may be rebuilt on the fly and includes keys that correspond to rows existing in the reloaded partition of the database or the partition of the new database.

Although the method and system described above are generally applicable to databases, one specific example of such a database is a database constructed in a DB2 environment.

The present invention may be conveniently implemented using one or more conventional general-purpose digital computers and/or servers programmed according to the teachings of the present specification. Appropriate software coding can readily be prepared by skilled programmers based on the teachings of the present invention. The present invention may also be implemented by preparing application-specific integrated circuits or by interconnecting an appropriate network of conventional components.

Numerous additional modifications and variations of the present invention are possible in view of the above teachings. It is therefore to be understood that, within the scope of the appended claims, the present invention may be practiced otherwise than as specifically described herein.

Claims

A method of reorganizing a database, the database having an associated index, the method comprising: reading (S60; S72) a row of data from the database; analyzing (S62; S73) the row of data read from the database according to at least one predetermined rule, the rule being a referential integrity requirement; determining (S63; S74) whether to eliminate or retain the row of data based on the at least one predetermined rule; and (a) if it is determined that the row of data satisfies the at least one predetermined rule, retaining (S64; S76) the row of data by loading the row of data into the database, into the first row of the database that contains no data, and rebuilding (S66; S77) the index so that it includes keys corresponding to the retained row of data; or (b) if it is determined that the row of data does not satisfy the at least one predetermined rule, eliminating (S65; S75) the row of data; and, if a next row of data exists, performing the preceding steps for the next row of data.

The method of claim 1, comprising: partitioning (S70) the database and an associated index into a plurality of partitions; and selecting (S71) one partition of the plurality of partitions of the database and a corresponding partition of the associated index; wherein the reading, analyzing, determining, retaining and eliminating steps are applied to the selected partition of the database and the rebuilding step is applied to the corresponding partition of the associated index; and wherein the method further comprises performing the selecting, reading, analyzing, determining, retaining, eliminating and rebuilding for each partition in the database.

The method of claim 2, wherein the retaining step (S64; S76) comprises: loading the row of data into the selected partition of the database, into a first open row of the selected partition of the database, the first open row being a first row in the selected partition of the database that contains no data.

The method of any one of the preceding claims, wherein the row of data is eliminated by deleting the row of data.

The method of any one of the preceding claims 1 to 4, wherein the referential integrity requirement is a rule requiring that all non-null foreign keys in the database correspond to a current key in another database.

A system for reorganizing a database, the database having an associated index, the system comprising: a reading device adapted to read a row of data from the database; an analyzing device for analyzing the row of data according to at least one predetermined rule, the rule being a referential integrity requirement; a loading device adapted to retain the row of data, if it is determined that the row of data satisfies the at least one predetermined rule, by loading the row of data into the database, into the first row of the database that contains no data; an eliminating device adapted to eliminate the row of data if it is determined that the row of data does not satisfy the at least one predetermined rule; and a rebuilding device adapted to rebuild the index to include keys corresponding to the retained row of data; wherein the devices are adapted to perform their respective functions for a next row of data if a next row of data exists.

The system of claim 6, comprising: a partitioning device adapted to partition the database and an associated index into a plurality of partitions; and a partition selecting device adapted to select one partition of the plurality of partitions of the database and a corresponding partition of the associated index; wherein the reading device is adapted to read a row of data from the selected partition of the database; the analyzing device is adapted to analyze the row of data read from the selected partition; the loading device is adapted to load the row of data into the selected partition of the database; and the rebuilding device is adapted to rebuild the corresponding partition of the associated index; and wherein the system is adapted to reorganize each row of the selected partition of the database and each partition of the plurality of partitions of the database.

The system of claim 7, wherein the loading device is adapted to load the row of data into the selected partition of the database, into a first open row of the database, the first open row being a first row in the selected partition of the database that contains no data.

The system of any one of claims 6 to 8, wherein the eliminating device is adapted to eliminate the row of data by deleting the row of data.

The system of any one of claims 6 to 9, wherein the referential integrity requirement is a rule requiring that all non-null foreign keys in the database correspond to a current key in another database.

A computer recording medium including computer-executable code for reorganizing a database, the database having an associated index, the computer-executable code comprising: reading code for reading (S60; S72) a row of data from the database; analyzing code for analyzing (S62; S73) the row of data read from the database according to at least one predetermined rule, the rule being a referential integrity requirement; determining code for determining (S63; S74) whether to eliminate or retain the row of data based on the at least one predetermined rule; loading code for retaining (S64; S76) the row of data, if it is determined that the row of data satisfies the at least one predetermined rule, by loading the row of data into the database, into the first row of the database that contains no data; eliminating code for eliminating (S65; S75) the row of data if it is determined that the row of data does not satisfy the at least one predetermined rule; rebuilding code for rebuilding (S66; S77) the index so that it includes keys corresponding to the retained row of data; and repeating code for performing the reading, analyzing, determining, retaining, eliminating and rebuilding for a next row of data if a next row of data exists.

The computer recording medium including the computer-executable code of claim 11, further comprising: partitioning code for partitioning (S70) the database and an associated index into a plurality of partitions; and partition selecting code for selecting (S71) one partition of the plurality of partitions of the database and a corresponding partition of the associated index; wherein the reading code is adapted to read a row of data from the selected partition; the analyzing code is adapted to analyze the row of data read from the selected partition; the loading code is adapted to load the row of data into the selected partition of the database; and the rebuilding code is adapted to rebuild the corresponding partition of the associated index; and wherein the repeating code comprises: row repeating code for repeating the reading, analyzing, determining, retaining, eliminating and rebuilding for each row in the selected partition of the database; and partition repeating code for repeating the selecting, reading, analyzing, determining, retaining, eliminating and rebuilding for each partition in the database.

The computer recording medium of claim 12, wherein the loading code is adapted to load the row of data into the selected partition of the database, into a first open row of the selected partition of the database, the first open row being a first row in the selected partition of the database that contains no data.

The computer recording medium of any one of claims 11 to 13, wherein the eliminating code comprises deleting code for deleting the row of data.

The computer recording medium of any one of claims 11 to 14, wherein the referential integrity requirement is a rule requiring that all non-null foreign keys in the database correspond to a current key in another database.