DE3881392T2

DE3881392T2 - System and method for automatic segmentation.

Info

Publication number: DE3881392T2
Application number: DE88201987T
Authority: DE
Inventors: Jacob Albert Westdijk
Original assignee: Oce Nederland BV
Current assignee: Canon Production Printing Netherlands BV
Priority date: 1988-09-12
Filing date: 1988-09-12
Publication date: 1993-10-21
Anticipated expiration: 2008-09-13
Also published as: DE3881392D1; EP0358815B1; US5073953A; JP2818448B2; JPH02105978A; EP0358815A1

Description

The invention relates to a system and method for automatically segmenting a scanned document in an electronic document processing device.

A document to be processed in an electronic document processing device, such as an electronic copier, an optical character recognition system or a data compression system, may contain different types of information that must be processed in different ways to achieve sufficient copy quality and compression and/or to enable image manipulation. For example, a document may contain continuous-tone photographs, rastered or dithered images (hereinafter referred to as halftone images), and black/white information such as text or graphics. When the data obtained by scanning such a document are processed, stored and printed in an electronic copier, the information representing text or graphics is usually thresholded to obtain a binary image, while periodic information (rasters) is subjected to gray-level thresholding and continuous-tone information is dithered. Consequently, it is necessary to locate and identify the areas or segments of the document that contain the different types of information. This process is called "segmentation".

An example of a conventional automatic segmentation system capable of separating text information from halftone image information is described in EP-A2-0 202 425. In this system, the scanned image of the document is divided into a matrix of blocks or sub-images of 4 x 4 pixels. Each of these blocks is then labelled either as TEXT or as IMAGE. Since the labelling of such relatively small blocks is subject to statistical fluctuations, the resulting label matrix will often contain short chains of text blocks in an area where image blocks predominate, and vice versa. In a final step of the segmentation process, the label matrix is smoothed by eliminating these short chains. In other words, a context rule is applied which requires that the labels of isolated blocks or isolated short chains of blocks be switched to match the label that prevails in their surroundings.

In general, an automatic segmentation system has to meet two conflicting requirements. On the one hand, it should be fast enough that the document can be processed at high speed. On the other hand, it should be robust enough to handle documents in which the different types of information are difficult to distinguish, for example because the characters of a text are printed in light colors on a dark background or because photographs contain light areas. To improve the robustness of a conventional segmentation system, a larger number of criteria must be checked in the labelling step and/or in the smoothing step, so that the processing time increases.

It is an object of the present invention to provide an automatic document segmentation system and method having improved robustness and speed.

This object is achieved by the system specified in claim 1 and the method specified in claim 17.

According to the invention, the number of different labels that can be chosen in the initial labelling step is larger than the number of different types of information between which a decision must ultimately be made. Consequently, it is not necessary to identify the type of information with certainty in the initial labelling step. For this reason, the initial labelling step can be carried out in a relatively short time, even if the sub-images are chosen to be relatively large in order to reduce statistical fluctuations. The type of information indicated by the initial labels is finally determined in the smoothing step on the basis of context rules. It has been found that context rules suitable for this purpose do not require much computing time, so that a net time saving is achieved. Furthermore, since the initial labels provide a differentiated classification, certain errors made in the initial labelling step can be corrected in the smoothing step, even if these errors concern relatively long chains of sub-images. This contributes to an improved robustness of the system.

Useful details and further improvements of the system according to the invention are specified in the dependent claims.

Preferred embodiments of the invention are explained in more detail below with reference to the drawings, in which:

Fig. 1 is a block diagram illustrating the general structure of an automatic segmentation system;

Fig. 2 is a graphical representation of a classification tree used in the initial labelling step;

Fig. 3(A) - 3(E) are diagrams of context rules used in the smoothing step;

Fig. 4(A) and 4(B) show an example of an initial label matrix and the smoothed label matrix obtained from it;

Fig. 5 is a block diagram of a possible hardware implementation of the system according to the invention; and

Fig. 6 and 7 are block diagrams of modified examples of the segmentation system.

As shown in Fig. 1, a document segmentation system comprises an initial labelling module 10 and a smoothing module 12. A signal representing the scanned image of the entire document is transmitted from a document scanner (not shown) to the initial labelling module 10. The scanned image is treated as a matrix of sub-images. In a representative example, an A4 document is scanned at a resolution of 500 dpi and each sub-image measures 64 x 64 pixels. The gray level of each pixel is given by an 8-bit word, corresponding to one of 256 gray levels.
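As a rough check of the numbers in this example, the size of the resulting label matrix can be estimated as sketched below. The A4 dimensions of 210 x 297 mm are standard, but the function name `label_matrix_shape` and the decision to count partial sub-images at the page edges are assumptions made for illustration and are not taken from the patent.

```python
import math

MM_PER_INCH = 25.4


def label_matrix_shape(width_mm, height_mm, dpi, block):
    """Return (rows, cols) of the sub-image matrix for a scanned page.

    Assumption: sub-images that only partly cover the page at the right
    and bottom edges are still counted (hence the ceiling division).
    """
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return math.ceil(height_px / block), math.ceil(width_px / block)


# A4 page, 500 dpi, 64 x 64 pixel sub-images, as in the example above.
rows, cols = label_matrix_shape(210, 297, dpi=500, block=64)
print(rows, cols)
```

Under these assumptions the label matrix has 92 rows and 65 columns, that is, roughly 6000 labels instead of some 24 million pixels.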

In the initial labelling module 10, each individual sub-image is analyzed by a classifier comprising a number of routines that extract characteristic features from the sub-image. On the basis of the extracted features, a specific initial label is assigned to the sub-image. The result is an initial label matrix that represents the entire document and whose elements are the labels of the individual sub-images.

The initial label matrix is further processed in the smoothing module 12. A number of context rules are applied to change the initial labels depending on the labels assigned to the neighboring sub-images. The context rules are designed such that some of the initial labels are completely eliminated in the course of the smoothing process.

The result is a smoothed label matrix which, in this embodiment, consists of only two different labels. These labels form segments corresponding to continuous-tone areas (e.g. photographs) and black/white areas (e.g. text or graphics) of the scanned document, respectively.

The classifier used in the initial labelling module may be based on conventional image analysis methods such as histogram evaluation, spatial analysis, differential operators, spectral analysis, or a combination of these methods. The classifier may be a classification tree, in which the check routine to be applied at each stage depends on the result of the preceding check, or alternatively a one-step classifier may be used. As a preferred embodiment, Fig. 2 shows a classification tree that evaluates a gray-level histogram of the sub-image. Each branch point of the tree diagram shown in Fig. 2 corresponds to a specific criterion against which the histogram data are checked. For example, the following criteria may be considered:

- The abscissa position of the highest peak of the histogram, i.e. the gray level that occurs most frequently. This criterion provides a rough measure of the overall brightness of the sub-image.

- The number of peaks in the histogram. In particular, a histogram with two separate peaks can indicate text or graphics.

- The height difference between two peaks. For text or graphics information, there will in most cases be a large height difference between the highest and the second highest peak.

- The gray-level difference between two peaks. In black/white images, this difference can be large.

- The height of the minimum between two dominant peaks. In a continuous-tone image this value will be high.

- The height difference between the highest peak and the minima on one or both sides of it, as a kind of "signal-to-noise ratio".

- The number of pixels below the minimum value of the valley between the two main peaks. This number will be large in halftone images.

- The widths of the highest peak or of the two highest peaks. Narrow peaks may indicate text or graphics.
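Several of the criteria listed above can be computed directly from the gray-level histogram of a sub-image. The following is a minimal sketch in plain Python. The function names, the feature names and the simple local-maximum definition of a "peak" are assumptions made for illustration; the patent does not specify how peaks are detected, and a practical implementation would probably smooth the histogram first.

```python
def gray_histogram(pixels, levels=256):
    """Count how often each gray level occurs in a flat sequence of pixels."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    return hist


def find_peaks(hist, min_height=1):
    """Return the gray levels of local maxima (a plateau counts once)."""
    peaks, n, i = [], len(hist), 0
    while i < n:
        j = i
        while j + 1 < n and hist[j + 1] == hist[i]:
            j += 1  # extend over a plateau of equal values
        left = hist[i - 1] if i > 0 else -1
        right = hist[j + 1] if j + 1 < n else -1
        if hist[i] >= min_height and hist[i] > left and hist[i] > right:
            peaks.append(i)
        i = j + 1
    return peaks


def histogram_features(hist):
    """Compute a few of the listed criteria (hypothetical feature names)."""
    peaks = sorted(find_peaks(hist), key=lambda g: hist[g], reverse=True)
    feats = {
        "num_peaks": len(peaks),
        "main_peak_gray": peaks[0] if peaks else None,
    }
    if len(peaks) >= 2:
        p1, p2 = peaks[0], peaks[1]
        lo, hi = min(p1, p2), max(p1, p2)
        feats["peak_height_diff"] = hist[p1] - hist[p2]
        feats["peak_gray_diff"] = hi - lo
        feats["valley_height"] = min(hist[lo:hi + 1])
    return feats


# Toy sub-image with a dark and a light cluster of gray levels.
pixels = [10] * 50 + [11] * 5 + [199] * 3 + [200] * 20
print(histogram_features(gray_histogram(pixels)))
```

A classification tree such as the one in Fig. 2 would then compare these values against thresholds at its branch points.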

If the criteria used in the classifier have a wide range of possible results, the results are thresholded to obtain a manageable number of branches. The structure of the tree, the criteria used in it and the thresholds for the results can be optimized by fitting them to statistical results obtained from a number of calibration documents. The greater the variety of the calibration documents, the greater the robustness, but also the required complexity, of the classifier.

In the example shown in Fig. 2, the classification tree yields four different labels as possible classification results, designated BW, BIM, BG and U. In this example, the features of the sub-images marked with these labels can be described as follows:

BW: Two dominant gray levels with high contrast; a candidate for text or graphics (BW stands for "black/white"),

BIM: A sub-image that also has two dominant gray levels but, judged by other criteria, is not a strong candidate for text or graphics (BIM stands for "bimodal"),

BG: A typical background area; relatively bright and with low contrast; can occur in text or graphics segments, but also in halftone segments,

U: An area with a diffuse gray-level distribution (U stands for "undefined"); a candidate for continuous-tone images.

An example of an initial label matrix obtained with the classifier described above is shown in Fig. 4(A). It can be seen that this initial label matrix still contains a number of fluctuations that must be eliminated in the smoothing step.

The context rules used in the smoothing process are described below with reference to Fig. 3.

To compare the matrix elements with their respective neighbors, the matrix elements are grouped into arrays A, A' of 3 x 3 elements. Four context rules, illustrated in Fig. 3(A) to 3(D), are applied to the individual 3 x 3 arrays.

A so-called "LOCAL" context rule, shown in Fig. 3(A), serves to eliminate isolated labels in a uniform environment. This rule can be formulated as follows:

If a label X is surrounded by upper, lower, right and left neighbors with the label Y, then change X to Y. In this rule, X and Y stand for any of the initial labels BW, BIM, BG, U.
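The LOCAL rule can be sketched as a single pass over the label matrix. This is an illustration under stated assumptions, not the implementation of the patent: the function name `apply_local` is hypothetical, elements on the border of the matrix (which lack one of the four neighbors) are left unchanged, and the neighbors are always read from the input matrix so that a change made earlier in the pass does not influence later elements.

```python
def apply_local(matrix):
    """Replace each isolated label by the label of its four neighbors.

    Assumptions: border elements are left untouched, and neighbors are
    read from the unmodified input matrix.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = [row[:] for row in matrix]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            up, down = matrix[r - 1][c], matrix[r + 1][c]
            left, right = matrix[r][c - 1], matrix[r][c + 1]
            # All four neighbors agree on Y and the center X differs from Y.
            if up == down == left == right != matrix[r][c]:
                out[r][c] = up
    return out


labels = [
    ["BG", "BW", "BG"],
    ["BW", "U", "BW"],
    ["BG", "BW", "BG"],
]
print(apply_local(labels)[1][1])
```

In this small example the isolated U in the center is changed to BW, while the corner elements play no role because only the upper, lower, left and right neighbors are examined.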

The context rules shown in Fig. 3(B) and 3(C) may be called "weak expansion" rules and have the following structure:

If at least one element in a 3 x 3 array A' has the label BW (the weakly expanding label) and the array contains no labels from a certain group, then expand the label BW over the entire array.

The expansion rule shown in Fig. 3(B) converts combinations of BW and BG into BW and can be written briefly as BW/BG --> BW. In this rule, the "certain group" of labels that must not be contained in the array consists of the labels BIM and U. If either of these labels is contained in the array, the array is left unchanged by this context rule.

In Fig. 3(C), the "certain group" of forbidden labels comprises the labels BG and U. Consequently, this context rule converts only arrays that consist of combinations of BW and BIM, and it can be written as BW/BIM --> BW.

Further context rules with the same structure can be set up by defining other groups of labels that must not be contained in the array. For example, the entire array may be converted to BW in a single step if it consists of a combination of BW, BG and BIM.

Furthermore, these context rules can be modified by requiring that the array contain at least two, three or more elements with the label BW.
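The weak expansion rules and their variants differ only in the group of forbidden labels and in the minimum number of BW elements, so they can be sketched as one parameterized function. The sketch below uses a fixed grid of 3 x 3 arrays; the function name `weak_expand`, its parameter names and the handling of incomplete arrays at the matrix edges (they are processed in truncated form) are assumptions made for illustration.

```python
def weak_expand(matrix, forbidden, label="BW", min_count=1, size=3):
    """Expand `label` over each array that contains it and nothing forbidden.

    Assumptions: the matrix is cut into a fixed grid of size x size arrays,
    and arrays at the right and bottom edges may be smaller than size x size.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = [row[:] for row in matrix]
    for r0 in range(0, rows, size):
        for c0 in range(0, cols, size):
            cells = [(r, c)
                     for r in range(r0, min(r0 + size, rows))
                     for c in range(c0, min(c0 + size, cols))]
            found = [matrix[r][c] for r, c in cells]
            if found.count(label) >= min_count and not forbidden.intersection(found):
                for r, c in cells:
                    out[r][c] = label
    return out


array = [
    ["BG", "BG", "BG"],
    ["BG", "BW", "BG"],
    ["BG", "BG", "BG"],
]
# BW/BG --> BW: forbidden labels are BIM and U.
print(weak_expand(array, forbidden={"BIM", "U"}))
# BW/BIM --> BW: forbidden labels are BG and U, so this array is unchanged.
print(weak_expand(array, forbidden={"BG", "U"}))
```

Passing `forbidden={"U"}` would give the one-step variant for combinations of BW, BG and BIM, and `min_count=2` or higher gives the stricter variants mentioned above.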

A context rule EXPAND, shown in Fig. 3(D), requires the following:

If the array A' contains at least one element with the label U (the strongly expanding label), then the label U is expanded over the entire array.

This rule places no restrictions on the other labels that occur in the original array.

Fig. 3(E) illustrates a context rule called FILL, which is not restricted to the 3 x 3 arrays. This rule can be defined as follows:

1) Where the label U forms intersecting vertical and horizontal chains 14, 16, fill the entire rectangle 18 spanned by these chains with the label U (the term "chain" denotes an uninterrupted sequence of labels U in a row or a column of the matrix);

2) Check all combinations of horizontal and vertical chains in order to maximize the area filled with U;

3) If the height of the maximized area is less than four elements or its width is less than four elements, change all labels in this area to BW.

According to an extension of the FILL context rule, the spanned rectangles are filled with the label U only if they contain a number of U labels that exceeds a predetermined ratio (Umin/U), and/or if the shape of the rectangle satisfies certain conditions, for example that it is larger than a predetermined minimum or smaller than a predetermined maximum.
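One possible reading of the basic FILL rule (steps 1 to 3, without the extension) is sketched below. It is an interpretation, not the procedure of the patent: the rectangle spanned by the horizontal and the vertical chain through each U element is filled repeatedly until nothing changes, which is taken here as the "maximized" area, and the resulting rectangles are then tested against the minimum size of four elements. The names `run_bounds` and `fill` and all parameters are hypothetical.

```python
def run_bounds(line, i, label):
    """Return (start, end) of the unbroken run of `label` through index i."""
    lo = hi = i
    while lo > 0 and line[lo - 1] == label:
        lo -= 1
    while hi + 1 < len(line) and line[hi + 1] == label:
        hi += 1
    return lo, hi


def spanned_rectangle(m, r, c, label):
    """Rows and columns spanned by the two chains crossing at (r, c)."""
    c1, c2 = run_bounds(m[r], c, label)
    r1, r2 = run_bounds([row[c] for row in m], r, label)
    return r1, r2, c1, c2


def fill(matrix, label="U", small="BW", min_size=4):
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    # Steps 1 and 2: fill spanned rectangles until the matrix is stable.
    changed = True
    while changed:
        changed = False
        for r in range(rows):
            for c in range(cols):
                if m[r][c] != label:
                    continue
                r1, r2, c1, c2 = spanned_rectangle(m, r, c, label)
                for rr in range(r1, r2 + 1):
                    for cc in range(c1, c2 + 1):
                        if m[rr][cc] != label:
                            m[rr][cc] = label
                            changed = True
    # Step 3: once stable, every connected U region is a full rectangle;
    # regions narrower or lower than min_size are relabelled.
    seen = set()
    for r in range(rows):
        for c in range(cols):
            if m[r][c] != label or (r, c) in seen:
                continue
            r1, r2, c1, c2 = spanned_rectangle(m, r, c, label)
            cells = [(rr, cc) for rr in range(r1, r2 + 1)
                     for cc in range(c1, c2 + 1)]
            seen.update(cells)
            if r2 - r1 + 1 < min_size or c2 - c1 + 1 < min_size:
                for rr, cc in cells:
                    m[rr][cc] = small
    return m


# A cross of U labels: row 2 and column 1, each five elements long.
grid = [["BW"] * 6 for _ in range(6)]
for k in range(5):
    grid[2][k] = "U"
    grid[k][1] = "U"
result = fill(grid)
print(sum(row.count("U") for row in result))
```

In this example the two chains span a 5 x 5 rectangle, which is kept because both its height and its width are at least four elements. The extension with the ratio Umin/U and the shape conditions is not covered by this sketch.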

The smoothing module 12 is programmed to apply the context rules illustrated in Fig. 3 in the order listed below.

1) LOCAL

2) BW/BG --> BW

3) LOCAL

4) BW/BIM --> BW

5) LOCAL

6) BW/BG --> BW

7) LOCAL

8) EXPAND

9) LOCAL

10) FILL
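The order listed above can be expressed as a simple schedule that is run over the label matrix step by step. In the sketch below, the names `SCHEDULE` and `smooth` are hypothetical, and the rule functions themselves are assumed to be supplied by the caller as a mapping from rule name to a function that takes a label matrix and returns a new one.

```python
SCHEDULE = [
    "LOCAL",
    "BW/BG --> BW",
    "LOCAL",
    "BW/BIM --> BW",
    "LOCAL",
    "BW/BG --> BW",
    "LOCAL",
    "EXPAND",
    "LOCAL",
    "FILL",
]


def smooth(matrix, rules, schedule=SCHEDULE):
    """Apply each context rule to the whole matrix, in schedule order.

    `rules` maps a rule name to a function matrix -> matrix (assumed to be
    provided by the caller). Each step works on the result of the previous one.
    """
    for name in schedule:
        matrix = rules[name](matrix)
    return matrix


# Demonstration with placeholder rules that only record the call order.
trace = []


def recording_rule(name):
    def rule(matrix):
        trace.append(name)
        return matrix
    return rule


rules = {name: recording_rule(name) for name in set(SCHEDULE)}
smooth([["BW"]], rules)
print(trace == SCHEDULE)
```

Keeping the order in a data structure rather than in code makes it easy to experiment with other sequences of the same rules.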

In each of these steps, the context rule is applied to the entire matrix before the next step is executed. In the case of the LOCAL context rule, the entire matrix is scanned with a 3 x 3 window in steps of one element, so that each element is considered once as the central element of the 3 x 3 array A.

The same procedure can be used in steps 2), 4) and 6). Alternatively, the matrix can be divided into a fixed grid of 3 x 3 arrays A'.

In step 8), a fixed grid of 3 x 3 arrays A' is used. Alternatively, the floating-array method can be used, but it should then be required that each array contain at least two labels U, since otherwise the expanded area would become too large.
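The floating-array alternative for step 8) can be sketched as follows, with the requirement of at least two U labels per window as the default. The function name `expand_sliding` and its parameters are hypothetical. As a further assumption, the U labels are counted in the input matrix only, so that an expansion made at one window position does not trigger further expansions within the same pass.

```python
def expand_sliding(matrix, label="U", min_count=2, size=3):
    """Expand `label` over every size x size window holding enough of it.

    Assumption: counts are taken from the unmodified input matrix, so the
    expansion does not cascade from window to window within one pass.
    """
    rows, cols = len(matrix), len(matrix[0])
    out = [row[:] for row in matrix]
    for r0 in range(rows - size + 1):
        for c0 in range(cols - size + 1):
            window = [(r, c)
                      for r in range(r0, r0 + size)
                      for c in range(c0, c0 + size)]
            count = sum(matrix[r][c] == label for r, c in window)
            if count >= min_count:
                for r, c in window:
                    out[r][c] = label
    return out


grid = [["BW"] * 4 for _ in range(4)]
grid[0][0] = "U"
# A single U does not expand when two are required per window.
print(expand_sliding(grid) == grid)
```

With `min_count=1`, every window that touches a U element would be filled, which illustrates why the expanded area grows too quickly without the stricter requirement.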

Step 1) starts from the initial label matrix generated in module 10. Each subsequent step is applied to the modified matrix obtained as the result of the preceding step. It should be noted that the context rule LOCAL is executed several times, alternating with the other context rules. The rule BW/BG --> BW is applied in step 2) and again in step 6).

At the end of step 7), the labels BG and BIM will have been largely eliminated, and the matrix will contain areas that are homogeneously filled with the label BW, while other areas contain the label U in combination with other labels. In these areas, the label U is expanded in steps 8) and 10), so that at the end of step 10) the entire matrix is composed of rectangular areas that are homogeneously filled with either BW or U. However, the FILL rule prescribes that areas filled with U are converted to BW if they are too small.

The identifier matrix obtained at the end of step 10) thus consists of the identifiers BW and U, which form large rectangular segments representing the black/white areas and the continuous-tone areas of the scanned document, respectively. This matrix is the desired smoothed identifier matrix. To distinguish it from the initial identifier matrix, the identifiers BW and U are renamed to the smoothed identifiers T (for "TEXT") and P (for "PHOTO"), respectively.

Fig. 4(B) shows the smoothed identifier matrix obtained from the initial identifier matrix shown in Fig. 4(A). These figures reflect experimental results obtained by applying the segmentation method described above to a test document containing a text area with various text formats and two photograph areas. The actual edges of the photograph areas of the document are indicated by dashed lines 20.

It can be seen that the P segments in Fig. 4(B) match the actual boundaries of the photograph areas within the resolution of the matrix of sub-images.

As shown in Fig. 4(A), the photograph areas contain comparatively large contiguous regions filled with the identifiers BW, BIM and BG, which could just as well have been interpreted as text areas. In the smoothing process, these ambiguities have been successfully removed by means of the context rules.

Fig. 5 shows a possible hardware implementation of the automatic segmentation system.

The document is scanned in a scanner 22, and the digital values indicating the gray levels of the individual pixels are stored in a bitmap. These values are also transmitted to a histogram unit 24, which builds histograms for the individual sub-images of the document. The histogram data are evaluated in a classifier 26, which corresponds to the initial labelling module 10 in Fig. 1. The classifier 26 contains a feature extraction unit 28 for checking features of the histogram and a classification tree 30, which selects the features to be checked and finally assigns one of the initial identifiers to the sub-image under examination.
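The task of the histogram unit can be illustrated with a short sketch that divides a gray-level image into square sub-images and counts gray levels per sub-image. The function name `subimage_histograms`, the list-of-lists image representation, the 256 gray levels and the decision to drop incomplete sub-images at the right and bottom borders are assumptions of this example; the default tile size of 64 follows the preferred sub-image size named in the claims.

```python
def subimage_histograms(image, tile=64, levels=256):
    """Return a matrix of gray-level histograms, one per sub-image.

    image:  list of rows of integer gray levels in range(levels).
    tile:   edge length of a square sub-image in pixels.
    """
    rows, cols = len(image), len(image[0])
    hists = []
    # Only complete sub-images are processed (an assumption).
    for r0 in range(0, rows - tile + 1, tile):
        hist_row = []
        for c0 in range(0, cols - tile + 1, tile):
            hist = [0] * levels
            for r in range(r0, r0 + tile):
                for c in range(c0, c0 + tile):
                    hist[image[r][c]] += 1
            hist_row.append(hist)
        hists.append(hist_row)
    return hists


tiny = [[0, 0, 3, 3],
        [0, 1, 3, 3]]
tiny_hists = subimage_histograms(tiny, tile=2, levels=4)
```

A classifier in the sense of the text would then derive features from each histogram and walk a decision tree to pick one of the initial identifiers; that part is not sketched here because the features are not specified in this section.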

The initial identifiers are processed further in a context processor 32, which corresponds to the smoothing module 12 in Fig. 1. The context processor contains processing modules 34 for applying the context rules in succession (steps 1 to 10) as well as buffers 36 for storing the initial identifier matrix, the intermediate results and the smoothed identifier matrix.

In a modified hardware implementation, several histogram units 24 and classifiers 26 may be provided, so that several sub-images can be processed in parallel in the initial labelling phase.

In the embodiment shown in Figs. 1 to 4, the segmentation system distinguishes only between two types of information, namely black/white information (identifier T) and continuous-tone information (identifier P). The photograph segments may contain continuous-tone information as well as periodic information such as rasters or dithered images. The solution according to the invention is also applicable to segmentation systems that further distinguish between continuous-tone information and periodic information. This can be achieved, for example, by modifying the segmentation system in the manner shown in Fig. 6 or Fig. 7.

In both figures, halftone indicators are used to detect halftone information. Halftone information can be detected by applying the following criteria to each sub-image:

- The distance between the first peak in the spectrum that does not correspond to zero frequency and the origin of the spectrum.

- The ratio between the peak corresponding to zero frequency and the first peak in the spectrum that does not correspond to zero frequency.
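The two criteria above can be sketched for a single small sub-image as follows. This is an illustration under stated assumptions: the spectrum is taken to be the magnitude of a 2-D discrete Fourier transform, and "the first peak that does not correspond to zero frequency" is approximated by the strongest non-zero-frequency component, which is a simplification. The function names are hypothetical, and the direct DFT used here is only practical for small tiles.

```python
import cmath
import math


def dft2_magnitude(tile):
    """Magnitude of the 2-D DFT of a small tile (direct evaluation)."""
    h, w = len(tile), len(tile[0])
    mag = [[0.0] * w for _ in range(h)]
    for ky in range(h):
        for kx in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    phase = -2j * math.pi * (ky * y / h + kx * x / w)
                    acc += tile[y][x] * cmath.exp(phase)
            mag[ky][kx] = abs(acc)
    return mag


def halftone_indicators(tile):
    """Return (distance of the strongest non-zero-frequency peak from
    the origin, ratio of zero-frequency peak to that peak)."""
    mag = dft2_magnitude(tile)
    h, w = len(mag), len(mag[0])
    best = None
    for ky in range(h):
        for kx in range(w):
            if ky == 0 and kx == 0:
                continue  # skip the zero-frequency component
            if best is None or mag[ky][kx] > best[0]:
                best = (mag[ky][kx], ky, kx)
    peak, ky, kx = best
    # Fold indices so that high indices map to negative frequencies.
    fy, fx = min(ky, h - ky), min(kx, w - kx)
    distance = math.hypot(fy, fx)
    ratio = mag[0][0] / peak if peak > 0 else math.inf
    return distance, ratio


# Vertical stripes with a period of 4 pixels on an 8 x 8 tile.
stripes = [[1, 1, 0, 0, 1, 1, 0, 0] for _ in range(8)]
dist, ratio = halftone_indicators(stripes)
```

For a periodic pattern the non-zero peak is strong and lies at a clear distance from the origin, so the ratio stays small; how the two values are thresholded is not specified in the text and is left open here.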

In Fig. 6, the initial labelling process and the smoothing process are carried out in the same way as in Fig. 1, and the photograph areas (identifier P) are then analyzed further in order to distinguish between continuous-tone information and periodic information. This can be achieved by checking one of the halftone indicators mentioned above, which is done in the periodicity module 38. The system shown in Fig. 6 has the advantage that the time-consuming check for periodic information is limited to the segments that have been identified as photograph areas.

Alternatively, the check for periodic information can be carried out in the initial labelling phase, as shown in Fig. 7. In this case, the initial identifiers include at least one identifier indicating a strong candidate for raster images, and the context rules in the smoothing module will include rules for expanding this identifier, so that the smoothed identifier matrix contains three different identifiers corresponding to continuous-tone information, periodic information and black/white information.

The context rules for finding areas that contain halftone information can be similar to the context rules FILL and EXPAND described above (Fig. 3), which are provided for finding continuous-tone areas. Instead of the target identifier U, the identifier indicating halftone information is then used.

In the example shown in Fig. 7, an edge analysis module 40 is additionally provided in order to improve the congruence between the segments (P and T in Fig. 4(B)) and the actual edges 20 of the photograph areas of the document.

The edge analysis can be accomplished, for example, by shifting the grid of sub-images in the vertical and horizontal directions by a certain fraction of the sub-image size (e.g. 1/4, 1/2, 3/4) and repeating the initial labelling and smoothing procedures for the shifted grids. A comparison of the different results then yields more detailed information about the actual position of the edges of the photograph area.
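The grid shifting can be illustrated by a small helper that lists the top-left pixel coordinates of all complete sub-images for a given vertical and horizontal offset. The function name `grid_origins` and the handling of incomplete sub-images (they are dropped) are assumptions of this sketch; the offsets of 1/4, 1/2 and 3/4 of the sub-image size are the examples given in the text.

```python
def grid_origins(height, width, tile, dy=0, dx=0):
    """Top-left corners (row, column) of all complete sub-images of a
    grid shifted by dy pixels vertically and dx pixels horizontally."""
    return [(r, c)
            for r in range(dy, height - tile + 1, tile)
            for c in range(dx, width - tile + 1, tile)]


tile_size = 64
# Offsets of 1/4, 1/2 and 3/4 of the sub-image size.
offsets = [tile_size * k // 4 for k in (1, 2, 3)]
shifted_grids = {
    (dy, dx): grid_origins(256, 256, tile_size, dy, dx)
    for dy in [0] + offsets
    for dx in [0] + offsets
}
small_example = grid_origins(8, 8, 4, dy=2, dx=0)
```

Each shifted grid would be labelled and smoothed separately, and the positions at which the smoothed identifier changes between grids narrow down where the actual edge lies; the comparison step itself is not sketched here.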

Optionally, the edge analysis for vertical and horizontal edges can be restricted to those parts of the document in which vertical or horizontal edges of the photograph area, respectively, are to be expected.

In an alternative approach, the edge analysis can be carried out by examining more closely certain target regions that are centered on the coordinates of the identifier transitions in the smoothed identifier matrix. For example, the target regions are divided into sub-windows that provide a higher resolution than was used in the initial labelling, and each sub-window can then be classified as an edge sub-window or a non-edge sub-window.

The analysis of the target regions can be restricted to isolated locations on the transition lines in the smoothed identifier matrix. Once the edge has been located exactly within these target regions, the exact position of the entire edge can be found by extrapolation.

While specific embodiments of the invention have been described above, a person skilled in the art will be able to conceive of various modifications, all of which fall within the inventive concept set out in the claims.

For example, the context rules mentioned in connection with Fig. 3 can be carried out with arrays that are larger than the described size of 3 x 3 sub-images.

Claims

1. System for automatically segmenting a scanned document in an electronic document processing device to separate document areas (T, P) containing image information of different types, such as black and white images, continuous tone images and the like, comprising

- matrix generating means for dividing the scanned image representing the entire document into a matrix of partial images,

- identification means (10) for analyzing the information contained in each partial image and for assigning an initial identifier (BW, BIM, BG, U) to the partial image so as to obtain an initial identifier matrix (Fig. 4(A)), and

- smoothing means (12) for smoothing the initial identifier matrix by changing the identifiers according to context rules in order to obtain a pattern of uniformly labelled segments representing the different document areas,

characterized in that the identification means (10) are arranged to select the initial identifiers from a first set of identifiers (BW, BIM, BG, U), and the smoothing means are arranged to convert the initial identifiers into smoothed identifiers (T, P), which smoothed identifiers are selected from a second set of identifiers which is numerically smaller than the first set.

2. System according to claim 1, wherein the context rules implemented in the smoothing means (12) include rules for expanding some of the initial identifiers (BW, U) and for eliminating other of the initial identifiers (BIM, BG) and the expanded identifiers are ultimately identified with the smoothed identifiers (T, P).

3. System according to claim 2, wherein the context rules contain at least one rule having one of the following structures:

a) "if in a given array (A') of matrix elements in the initial identifier matrix at least n elements have the identifier BW and this array contains no identifiers from the group G, then change all identifiers in this array to BW";

where n is a given number, BW is a given initial identifier and G is a given subset of the set of initial identifiers,

b) "if in a given array (A') of matrix elements in the initial identifier matrix at least m elements have the identifier U, then change all identifiers in this array to U";

where m is a given number and U is a given initial identifier.

c) "c1) where the identifier U forms intersecting vertical and horizontal chains (14, 16), fill the entire rectangle (18) spanned by these chains with the identifier U;

c2) check all combinations of horizontal and vertical chains to maximize the area to be filled with the identifier U;

c3) if the height of the maximized area is less than hmin elements or the width is less than wmin elements, change all identifiers within this area to BW";

where U and BW are given initial identifiers and hmin and wmin are given numbers.

4. System according to claim 3, characterized in that the context rules further contain a rule with the following structure:

c4) "if the rectangle mentioned in c1 contains a number of identifiers U that is greater than a given number Umin, fill the entire rectangle with the identifier U",

c5) "if the shape of the rectangle mentioned in c1 satisfies the conditions: width/height > min and width/height < max, then fill the entire rectangle with the identifier U",

where min and max are given numbers.

5. A system according to claim 3 or 4, wherein the predetermined fields (A') to which the context rules (a) and (b) are applied have a size of 3 x 3 sub-images.

6. System according to claim 3, 4 or 5, wherein the smoothing means comprise several stages (34) for gradually modifying the initial identifier matrix by applying each context rule at least once according to a predetermined order.

7. The system of claim 6, wherein the context rule with the structure (a) is applied before the context rule with the structure (b) and the context rule with the structure (b) is applied before the context rule with the structure (c).

8. A system according to claim 6 or 7, wherein the context rules include a local rule requiring that

"if a given matrix element is surrounded by upper, lower, right and left immediate neighbors, all of which have the same identifier X, then the identifier of the given matrix element is also changed to X",

where the local rule is applied immediately before any of the rules with structure (a), (b) or (c).

9. System according to one of the preceding claims, in which the size of each partial image is at least 16 x 16 pixels, preferably 64 x 64 pixels and the scan resolution is greater than 100 dpi.

10. System according to claim 9, wherein the identification means (10) include a histogram unit (24) for generating gray value histograms of each partial image.

11. System according to one of the preceding claims, in which the identification means (10) comprise a tree classifier for determining the identifier to be assigned to a given partial image by checking features of the input data according to a tree structure.

12. System according to one of the preceding claims, in which the identification means contain a device (38) for detecting raster or dithered information.

13. System according to one of claims 1 to 11, with context rules for finding halftone areas.

14. System according to one of claims 1 to 11, with a device (38) for detecting rasters or dithered information only in photograph areas which have been segmented by the smoothing means (12).

15. System according to one of the preceding claims, with several identification means (10) for parallel processing of the data from several partial images.

16. A system according to any preceding claim, comprising edge line analysis means (40) responsive to the output values of the smoothing means (12) for locating the edges of the segmented regions with greater resolution.

17. A method for automatically segmenting a scanned document in an electronic document processing device to separate document areas (T, P) containing different types of image information, such as black and white images, continuous tone images and the like, wherein

- the scanned image representing the entire document is divided into a matrix of partial images,

- the information contained in each partial image is analyzed to assign an initial identifier (BW, BIM, BG, U) to the partial image in order to obtain an initial identifier matrix, and

- the initial identifier matrix is smoothed by changing the identifiers of individual matrix elements according to context rules in order to obtain a pattern of uniformly labelled segments corresponding to the different document regions, characterized in that the initial identifiers are selected from a first set of identifiers (BW, BIM, BG, U) and that in the smoothing step the initial identifiers are converted into smoothed identifiers (T, P) selected from a second set of identifiers which is numerically smaller than the first set.