RU2116671C1

RU2116671C1 - Method for processing of object image

Info

Publication number: RU2116671C1
Application number: RU95100895A
Authority: RU
Inventors: В.В. Кондратьев; В.А. Утробин
Original assignee: Нижегородский государственный технический университет
Priority date: 1995-01-23
Filing date: 1995-01-23
Publication date: 1998-07-27
Also published as: RU95100895A

Abstract

FIELD: visual information processing, in particular transformations for extracting the most informative data about the properties of depicted objects and processes. SUBSTANCE: the method involves generating an image pyramid by making copies of the source image, and generating a feature pyramid. Generating both pyramids involves dividing the image at least once and averaging the brightness of the image and of the regions produced by the division, which yields a set of structural elements from the set of copies of the source image. Structural relations between the structural elements are detected using binary relations. EFFECT: simplified implementation, increased speed of pyramid generation.

Description

The invention relates to methods for processing visual information, in particular to transformation systems that extract the most informative data about the properties of the depicted objects or processes.

A method for extracting an object in an image, and a device for implementing it, are known.

That method analyzes the image at the pixel level, provided that criteria for levels of differing brightness are specified a priori, by reducing the signal values by a first predetermined amount.

A method is also known [3] that likewise analyzes the image at the pixel level and uses median filtering, which is effective when the specifics of the noise in the image are known a priori [4].

The closest to the proposed method is pyramidal image processing [1], which converts the image information in three stages:
construction of the image pyramid: successive copies of the source image are built from the bottom up; obtaining the l-th copy ($G_l$) requires two operations: convolution of the copy ($G_{l-1}$) with the weight matrix $W$, which implements a low-pass filter for the Gaussian pyramid or a band-pass filter for the Laplacian pyramid, $G_l = W \ast G_{l-1}$; and decimation of the result by a factor of 2 at each level of the pyramid, $G_l = [W \ast G_{l-1}]_{\downarrow 2}$. The result is a Gaussian pyramid, the set $\{G_l\}$, or a Laplacian pyramid, the set $\{L_l\}$, where $L_l = G_l - W \ast G_l$;
construction of the feature pyramid: the image at each level of the image pyramid is convolved with a selective filter $F$ whose weight function is tuned to detect a specific feature in the image;
image processing and construction of the description pyramid: the image at each level of the feature pyramid undergoes the necessary nonlinear processing, for example squaring of the brightness values (for each pixel at that level's resolution) to increase the difference between the detected feature and the background; after that, a Gaussian pyramid is built from the bottom up for each level (the procedure is the same as in the first stage).
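The first stage of the prototype can be sketched as follows. This is only an illustration of the bottom-up scheme (low-pass filtering followed by decimation by 2), not the implementation from [1]: the 5-tap binomial kernel standing in for the weight matrix W, the edge clamping, and all function names are assumptions made here.

```python
# Sketch of the prototype's bottom-up pyramid: G_l = [W * G_{l-1}] decimated by 2.
# The binomial kernel and the edge clamping are assumptions, not from the source.
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]


def smooth(image):
    """Convolve a 2-D list with the separable low-pass kernel (edges clamped)."""
    rows, cols = len(image), len(image[0])

    def clamp(i, n):
        return max(0, min(n - 1, i))

    # Horizontal pass, then vertical pass (the kernel is separable).
    horiz = [[sum(w * image[r][clamp(c + d - 2, cols)] for d, w in enumerate(KERNEL))
              for c in range(cols)] for r in range(rows)]
    return [[sum(w * horiz[clamp(r + d - 2, rows)][c] for d, w in enumerate(KERNEL))
             for c in range(cols)] for r in range(rows)]


def reduce_level(image):
    """One pyramid step: low-pass filter, then keep every second row and column."""
    return [row[::2] for row in smooth(image)[::2]]


def gaussian_pyramid(image, levels):
    """Return [G_0, G_1, ..., G_levels], with G_0 the source image."""
    pyramid = [image]
    for _ in range(levels):
        pyramid.append(reduce_level(pyramid[-1]))
    return pyramid


def laplacian_level(g):
    """Band-pass level L_l = G_l - W * G_l."""
    blurred = smooth(g)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(g, blurred)]
```

Each level depends on the one below it, so the levels have to be computed one after another; the proposed method instead derives every copy directly from the source image.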

This transformation is necessary: it removes the likely noise introduced by the differentiation operation at the first stage (in the case of the Laplacian pyramid) and by the amplification operation at the second.

The result is an image description pyramid that depends entirely on the feature filter F used.

The drawbacks of the prototype are that the method is rather complex and insufficiently fast.

The prototype's information conversion has the following features: the pyramids are built from the bottom up; nonlinear processing blocks and an additional Gaussian pyramid at the output are mandatory; the filter F is specific to the application problem being solved; and there is no potential for detecting structural relations between image elements when describing the image.

The proposed solution eliminates these drawbacks.

The object of the invention is to improve the known method.

The technical result is a simpler and faster implementation of the method, achieved by constructing the pyramids differently from the prototype and by eliminating specific filters.

This technical result is achieved in that, in a method for processing an image of an object that includes constructing an image pyramid by creating copies of the source image and constructing a feature pyramid, the image is divided one or more times when constructing the image pyramid and the feature pyramid, and then both the image itself and the subregions resulting from the divisions are averaged in brightness, yielding, over the set of copies of the source image, a set of their structural elements, between which structural relations are detected by means of binary relations.

The proposed method is as follows. The image information is converted in two stages.

Construction of the image pyramid. Parallel copies of the source image are built from the top down. Obtaining the l-th copy of the image ($I_l$) requires two operations:
dividing the image region $I$ ($G$) into $(2^{2l} \cdot 2^{2})$ subregions $\{G^{l}_{ij}\}$ of equal area $\sigma_l$ and size $(2^{2} \cdot 2^{2})$, with a total of $2^{2l}$ subregions ($l = 0, 1, \ldots$);
averaging the image brightness over the elements of each subregion [formula not reproduced], where $\mu_{n,m}$ is the brightness of the source-image pixel with coordinates $(n, m)$ belonging to the subregion $G^{l}_{ij}$.

This operation is equivalent to the scalar product of the image elements (pixels), taken over the bounded subregion $G^{l}_{ij}$, with the weight matrix $W^{l}$, all of whose elements equal one (both represented as vectors); that is, complete smoothing over $G^{l}_{ij}$ is performed. This provides: maximum stability (regularization) of the image reconstruction process under uncertainty; invariance of the resulting features to possible transformations; no need for additional image processing operations, as in the prototype; and simplicity (minimal computational cost) of implementation compared with the prototype.
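A minimal sketch of this first stage, under stated assumptions: the image is square, its side is divisible by $2^{l}$, and the division yields $2^{l}$ subregions per axis (so $2^{2l}$ in total, as the text states). The function name `average_level` is hypothetical, and the plain arithmetic mean is assumed for the averaging formula that is not reproduced in the text.

```python
def average_level(image, level):
    """Return the level-l copy: a 2^l x 2^l grid of subregion mean brightnesses.

    Assumes a square image whose side is divisible by 2**level.
    """
    side = len(image)
    parts = 2 ** level          # subregions per axis, 2^(2l) in total
    size = side // parts        # side of one subregion in pixels
    area = size * size          # area of one subregion (sigma_l)
    copy = []
    for i in range(parts):
        row = []
        for j in range(parts):
            # Sum with an all-ones weight matrix, then normalise by the area.
            total = sum(image[i * size + n][j * size + m]
                        for n in range(size) for m in range(size))
            row.append(total / area)
        copy.append(row)
    return copy


image = [[1, 1, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 2, 2],
         [0, 0, 2, 2]]
top = average_level(image, 0)    # one subregion covering the whole image
next_ = average_level(image, 1)  # four subregions
```

Every copy is computed directly from the source image, so the levels are independent of one another and can be produced in parallel.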

Construction of the feature pyramid, which is at the same time the image description pyramid.

The pyramid of height R is built from the top down. This requires three operations:
dividing each of the $2^{2l}$ subregions $\{G^{l}_{ij}\} \in I_l$ of area $\sigma$ along the coordinate axes x, y into $N_x = 2^{n}$, $N_y = 2^{m}$ ($n, m = 0, 1, \ldots$) sections, which form a set of new non-overlapping subregions for each k-th division variant [formula not reproduced];
averaging over each subregion $G^{lk}_{sr}$ of the set [formula not reproduced];
detecting a binary relation of strict partial ordering between the elements of two disjoint subsets [formula not reproduced], each of which unites an equal number of subregions $G^{lk}_{sr}$.

If some pair of elements $A_1$, $A_2$ belongs to $G^{l}_{ij}$, then [formula not reproduced] holds, where $u(\cdot)$ is a real-valued function on $G^{l}_{ij}$ that represents the weight matrix $W^{k}$, all of whose elements equal $(-1)$ for the union of the non-overlapping subregions $G^{lk}_{sr}$ belonging to [formula not reproduced] and $(+1)$ for the subregions belonging to [formula not reproduced]. In this case the binary relation can be detected as follows:
$\nabla^{k}_{l} = m^{l,k}_{1} - m^{l,k}_{2}$,
where [formula not reproduced], and the summation is performed over the subregions belonging to the i-th subset [formula not reproduced].
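The detection of the binary relation can be sketched as below. The formulas that define the two subsets are not reproduced in the text, so the sketch takes them as explicit lists of subregion indices; the function names, the tolerance `tol`, and the labels `"equiv"`, `"less"` and `"greater"` are assumptions introduced for illustration.

```python
def subset_feature(means, members):
    """m_i: sum of the subregion means that belong to subset i."""
    return sum(means[idx] for idx in members)


def binary_relation(means, subset_1, subset_2, tol=1e-9):
    """Return (nabla, relation) for two disjoint subsets of equal size.

    nabla = m_1 - m_2; relation is "equiv" when nabla is zero within `tol`,
    otherwise "less" or "greater" (a strict order between the two subsets).
    """
    if len(subset_1) != len(subset_2) or set(subset_1) & set(subset_2):
        raise ValueError("subsets must be disjoint and of equal size")
    nabla = subset_feature(means, subset_1) - subset_feature(means, subset_2)
    if abs(nabla) <= tol:
        return nabla, "equiv"
    return nabla, ("less" if nabla < 0 else "greater")


# Mean brightness of four subregions (made-up values for illustration).
means = [0.5, 0.5, 0.25, 0.75]
same = binary_relation(means, [0], [1])      # equal means: equivalence
ordered = binary_relation(means, [2], [3])   # different means: strict order
```

Subtracting the two sums is the same as applying weights of +1 to one subset and -1 to the other, which is how the text describes the weight matrix $W^{k}$.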

The first two operations identify the structural elements of image $I$ at level $l$, and the last one identifies the structural relations between these elements.

As a result, description levels are formed as structures (graphs, patterns) of any arbitrary image, in the absence of a priori information about it, using a universal system of uniform features $\{m^{lk}_{i}\}$ and a system of uniform rules (binary relations of strict partial ordering): the equivalence relation ≈ and the strict order <. The description is built and analyzed for the image recognition task from the top down, from the general (the whole) to the particular, which is characteristic of the visual perception system and provides high throughput and reliable image recognition.

Example implementation of the method.

Example 1. Construction of the description pyramid (feature pyramid) of the distorted symbol "□".

Let the 8×8 source image matrix have the form [matrix not reproduced].

1. Build the first (top) level of the image pyramid.

To do this, divide the image region into [formula not reproduced] equal-area subregions $\{G^{0}_{ij}\}$. For $l = 0$ one subregion is obtained (the division is marked by a dashed line), of size $2^{2} \cdot 2^{2}$, and averaging is performed over the selected elements of the subregion. This gives the image copy $I_0$ [formula not reproduced], where [formula not reproduced].

Build the feature pyramid for $I_0$. To do this, form the first (top) level of the pyramid. For k = 0, n = 0, m = 0, [formula not reproduced] consists of one subregion $G^{00}_{sr} = G^{0}_{ij} = I_0$. Then [formula not reproduced].

Thus, at the first level one structural element $m^{00}_{1}$ is detected (graph: a vertex with a loop), which carries the information that an image is present;
the next two levels of the feature pyramid are formed:
k = 1, n = 0, m = 1 and k = 2, n = 1, m = 0 [formula not reproduced], where [formula not reproduced].

To detect the structural relations, the specific values of $\mu_{ij}$ must be known. Let $\mu_{ij} = 1$ for all $i, j$ where $\mu_{ij} \neq 0$. Then [formula not reproduced].

Thus, at the second and third levels two structural elements each are detected, and the binary relations between them correspond to the equivalence relation, i.e. the image is homogeneous at these description levels;
the following description levels are formed:
k = 3, n = 0, m = 2 and k = 4, n = 2, m = 0.

Taking the values of $\mu_{ij}$ into account, we obtain [formula not reproduced].

Then [formula not reproduced], and at levels 3 and 4 a strict-order relation is detected: the image is localized on the periphery of the image region along the x and y axes, with a hole in the center of the image region. The graph along the coordinate axes has the form: [figure not reproduced].

So, to describe the source image it was sufficient to build one level of the image pyramid ($I_0$) and 4 levels of the feature pyramid. To reveal a finer structure of the source image, one proceeds to the next level of the image pyramid ($I_1$): set $l = 1$ and repeat the construction described above.
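The levels of Example 1 can be retraced with a small sketch. The source matrix is not reproduced in the text, so an undistorted 8×8 square outline with unit brightness is used here as a stand-in. The grouping of the four strips into outer and inner pairs is also an assumption, as are the plain arithmetic mean and the name `strip_means`.

```python
# Hypothetical 8x8 outline of a square; a stand-in for the missing source matrix.
N = 8
image = [[1 if r in (0, N - 1) or c in (0, N - 1) else 0 for c in range(N)]
         for r in range(N)]


def strip_means(img, parts, axis):
    """Mean brightness of `parts` equal strips: column strips (axis=1) or row strips (axis=0)."""
    n = len(img)
    width = n // parts
    means = []
    for p in range(parts):
        lo, hi = p * width, (p + 1) * width
        if axis == 1:
            cells = [img[r][c] for r in range(n) for c in range(lo, hi)]
        else:
            cells = [img[r][c] for r in range(lo, hi) for c in range(n)]
        means.append(sum(cells) / len(cells))
    return means


presence = strip_means(image, 1, 1)[0]   # level 1: the whole region
halves_x = strip_means(image, 2, 1)      # levels 2 and 3: two halves per axis
halves_y = strip_means(image, 2, 0)
quarters_x = strip_means(image, 4, 1)    # levels 3 and 4 of the example: four strips
quarters_y = strip_means(image, 4, 0)

# Two halves compared with each other; outer strips compared with inner strips.
nabla_halves_x = halves_x[0] - halves_x[1]
nabla_halves_y = halves_y[0] - halves_y[1]
nabla_x = (quarters_x[0] + quarters_x[3]) - (quarters_x[1] + quarters_x[2])
nabla_y = (quarters_y[0] + quarters_y[3]) - (quarters_y[1] + quarters_y[2])
```

For this stand-in the two halves are equal along both axes (equivalence), while the outer strips are brighter than the inner ones along both axes (strict order), which matches the description of a figure concentrated on the periphery with a hole in the center.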

Example 2. Detection of watermark-type spots on the surface of industrial products, where the contrast of the spot against the surrounding background is below 2%.

For simplicity, let the source image be 4 × 4 pixels. All pixels have uniform brightness except one, whose brightness is 2% lower [matrix not reproduced], where a = 0.98.

Since $I = I_0$, the feature pyramid is built:
level 1: k = n = m = 0, [formula not reproduced]; since $m^{00}_{1} < 1$, a break in homogeneity has been detected;
level 2: k = m = 1, n = 0, [formula not reproduced], i.e. the break in homogeneity is localized in the left half of the image;
level 3: k = 2, n = 1, m = 0, [formula not reproduced], i.e. the break in homogeneity is localized in the lower half of the image.

Combining the image description results from levels 2 and 3 (the analysis task) gives the localization: the 3rd quadrant of the image plane.
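Example 2 can be checked numerically with the sketch below. The source matrix is not reproduced, so the position of the darker pixel (bottom-left corner) is an assumption chosen to agree with the stated result; the background brightness of 1 and the helper `mean` are likewise assumed.

```python
A = 0.98                                   # brightness of the darker pixel (a = 0.98)
image = [[1.0] * 4 for _ in range(4)]
image[3][0] = A                            # assumed position: bottom-left pixel


def mean(cells):
    """Arithmetic mean of an iterable of brightness values."""
    cells = list(cells)
    return sum(cells) / len(cells)


# Level 1: a whole-image mean below 1 signals a break in homogeneity.
whole = mean(v for row in image for v in row)

# Level 2: left half against right half.
left = mean(row[c] for row in image for c in range(0, 2))
right = mean(row[c] for row in image for c in range(2, 4))

# Level 3: top half against bottom half.
top = mean(v for row in image[:2] for v in row)
bottom = mean(v for row in image[2:] for v in row)

detected = whole < 1.0
horizontal = "left" if left < right else "right"
vertical = "bottom" if bottom < top else "top"
print(detected, horizontal, vertical)
```

Three averages and two comparisons are enough to place the defect in the lower-left quadrant, without any filter tuned to the defect.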

Experiments show that the proposed method is simpler to implement and 10 times faster.

The main fields of application of the proposed method are: machine vision systems; processing of aerial photography results; image analysis in customs control systems; systems for analyzing and classifying trademarks, labels and stamps; and systems for recognizing characters and signs of arbitrary configuration, for example identification of seals, signatures, postal codes and the like.

The analysis confirms that the solution considered meets the criteria of novelty, inventive step and industrial applicability.

Claims

A method of processing an image of an object, including constructing an image pyramid by creating copies of the source image and constructing a feature pyramid, characterized in that, when constructing the image pyramid and the feature pyramid, the image is divided one or more times, and then both the image itself and the subregions resulting from the divisions are averaged in brightness, obtaining, over the set of copies of the source image, a set of their structural elements, between which structural relations are detected by means of binary relations.