CN102663361A - Face image reversible geometric normalization method facing overall characteristics analysis


Info

Publication number
CN102663361A
Authority
CN
China
Prior art keywords
image
mouth
Prior art date
Legal status
Granted
Application number
CN2012100965956A
Other languages
Chinese (zh)
Other versions
CN102663361B (en)
Inventor
李晓光
夏青
卓力
李风慧
Current Assignee
Shandong Wangyuan Information Technology Co ltd
Original Assignee
Beijing University of Technology
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN201210096595.6A patent/CN102663361B/en
Publication of CN102663361A patent/CN102663361A/en
Application granted
Publication of CN102663361B patent/CN102663361B/en
Expired - Fee Related


Landscapes

  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a face image reversible geometric normalization method facing overall characteristics analysis, based on a secondary (two-pass) affine transformation. First, a global affine transformation is applied: according to the standard structure definition of the face, the face image is globally rotated, scaled and cropped so that the coordinates of the two eyes lie on the same horizontal line, the distance between the eye centers is normalized according to the parameter d of the standard structure, and the horizontal and vertical proportions of the face are consistent with the standard structure; the image is cropped to a size of 2d×2d. Then a local affine transformation is applied to normalize the geometric position of the mouth region. Finally, image edge fusion is used to form the standard geometrically normalized face image. The inverse normalization process recovers, according to the saved transformation parameters and through the inverse of the two affine transformations, the geometric structure the face image had before normalization, so that the personalized topological structure of the feature points of different faces is preserved. The geometrically normalized face image can be used for face analysis based on overall features, such as super-resolution reconstruction of face images, face recognition, and the like.

Description

Face image reversible geometric normalization method oriented to integral feature analysis
Technical Field
The invention relates to a geometric normalization method for a face image, in particular to a reversible geometric normalization method for the face image based on secondary affine transformation, which belongs to the field of face image processing.
Background
With the development of computer and multimedia technologies and the continuous progress of society, fields such as security monitoring, visual communication, medical imaging, satellite remote sensing, electronic entertainment and computer vision have created a broad demand for high-quality images and videos; high-quality images/videos provide richer information and a more realistic visual experience and are the basis of many practical applications. Spatial resolution refers to the amount of information stored in an image and is an important index of the detail expressiveness of the image. The continuous real-world scene itself carries rich information, but, affected by factors such as the imaging equipment, the imaging environment and noise, the digital image acquired by the imaging equipment is generally of low resolution, which makes it difficult to meet continuously rising practical requirements. In particular, images/videos from handheld devices, vehicle-mounted video capture devices, wireless video sensor networks, and video monitoring devices operating under harsh natural conditions are often unsatisfactory in resolution. Therefore, Super-Resolution (SR) technology has become a research focus in the image processing field in recent years.
The human face is an important object in image processing and analysis, and super-resolution restoration of face images has attracted wide attention. For face images, methods based on whole-face feature analysis have obtained good experimental results. Such methods perform feature analysis on the face image as a whole. A sample library of paired high-resolution face images and corresponding low-resolution face images is analyzed with PCA (Principal Component Analysis) or ICA (Independent Component Analysis). PCA decomposes the whole face into a linear combination of different eigenfaces: after PCA decomposition, the sample library is expressed as a set of low-resolution eigenfaces and the corresponding high-resolution eigenfaces. Super-resolution restoration then decomposes the low-resolution face into a linear combination of the low-resolution eigenfaces to obtain the combination coefficients, and applies this set of coefficients directly to the corresponding high-resolution eigenfaces to obtain a new high-resolution face.
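As an illustration of this eigenface-based restoration scheme (not part of the normalization method claimed below), a minimal Python/NumPy sketch is given here; the array layout, the plain least-squares projection and the function names are assumptions made for exposition.

```python
import numpy as np

def fit_pca(faces, k):
    """faces: (n_samples, n_pixels) matrix of vectorized, geometrically
    normalized face images; returns the mean face and the top-k eigenfaces."""
    mean = faces.mean(axis=0)
    # eigenfaces = leading right singular vectors of the centered sample matrix
    _, _, vt = np.linalg.svd(faces - mean, full_matrices=False)
    return mean, vt[:k]                      # shapes (n_pixels,), (k, n_pixels)

def hallucinate(lr_face, lr_mean, lr_eig, hr_mean, hr_eig):
    """Decompose a low-resolution face into low-resolution eigenfaces and apply
    the same combination coefficients to the corresponding high-resolution
    eigenfaces, as described above."""
    coeffs, *_ = np.linalg.lstsq(lr_eig.T, lr_face - lr_mean, rcond=None)
    return hr_mean + hr_eig.T @ coeffs       # estimated high-resolution face
```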
Such methods can obtain results of higher resolution, but the quality of the reconstructed face depends on whether the images in the sample library are well geometrically normalized. In reality, face sizes differ and the positions of the facial features vary from person to person, so geometric normalization of face images oriented to overall features plays a key role and is an important factor determining the quality of the super-resolution restoration result. Existing methods are typically implemented by normalizing only the positions of the two eyes, i.e. the eye coordinates are fixed by global rotation, scaling and cropping of the face image. With this approach, misaligned noses and mouths still leave significant artifacts after restoration, and the problem is not fundamentally solved. Moreover, these methods generally do not consider the change that geometric normalization causes to the personalized proportions of the face: the original personalized scale characteristics are destroyed once the input image has been modified.
Disclosure of Invention
The invention aims to realize reversible geometric normalization of a human face image by a method based on secondary affine transformation, and respectively adopts three methods of global affine transformation, local affine transformation and image edge fusion according to the definition of a human face standard geometric structure.
The invention is realized by adopting the following technical means:
a human face geometric normalization method based on secondary affine transformation comprises the following: a video acquisition device acquires a face image, converts the optical signal of the target image into a digital image signal, and stores it in the storage of the acquisition device; a computer reads in the image through a standard USB or infrared interface, and the geometric normalization of the face based on the secondary affine transformation is carried out in the processor; the normalization result can be stored in database form directly on a local hard disk, on a network storage device, or over a network; the normalization method mainly comprises the steps of face standard geometric structure definition, global affine transformation, local affine transformation and image edge fusion.
In the global affine transformation step, after the face image is marked according to the standard structure definition of the face (FIG. 1), global rotation, scaling and cropping are carried out so that in the input face image the coordinates of the two eyes lie on the same horizontal line, the distance between the eye centers is normalized according to the parameter d of the standard structure, and the horizontal and vertical proportions of the face are consistent with the standard structure;
in the local affine transformation step, geometric position normalization is performed for the mouth region; the main steps are mouth region division, local rotation and scaling, and translation of the transformed mouth region;
in the image edge fusion step, edge fusion is carried out on the global transformation image and the local transformation image so as to form a standard geometric normalization human face image;
the image normalization method comprises the following steps:
reading in the collected face image from the database;
describing the face image by 6 feature point coordinates and their relative positions: P1~P6 (with corresponding coordinates (x1, y1)~(x6, y6)) denote the left eye center, the right eye center, the nose tip, the left mouth corner, the mouth center and the right mouth corner, respectively;
calculating the relative position relationship F = (d, r1, r2, r3) between the feature points, where

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \qquad (1)$$

$$r_1 = \frac{y_3 - y_1}{d} \qquad (2)$$

$$r_2 = \frac{y_5 - y_3}{d} \qquad (3)$$

$$r_3 = \frac{x_5 - x_4}{d} = \frac{x_6 - x_5}{d} \qquad (4)$$

i.e. d is the Euclidean distance between the centers of the two eyes, r1 is the ratio of the vertical distance from the eye line to the nose tip to d, r2 is the ratio of the distance from the nose tip to the mouth center to d, and r3 is the ratio of the distance from the mouth center to a mouth corner to d. The length and width of the face image are defined as 2d, and the nearest image edge is at a distance of 0.5d from the eye centers;
marking the feature points of a face image sample library and taking the statistical average over the different face images to obtain the standard structure parameters F;
carrying out global affine transformation according to the standard structure definition of the face: global rotation, scaling and cropping are applied so that in the input face image the coordinates of the two eyes lie on the same horizontal line, the distance between the eye centers is normalized according to the parameter d of the standard structure, and the horizontal and vertical proportions of the face are consistent with the standard structure;
and carrying out local affine transformation and carrying out geometric position normalization on the mouth region. The method mainly comprises the steps of mouth area division, local rotation and scaling, and edge fusion of the mouth area after transformation and an original image.
The edges of the mouth local image and the global image are fused to complete the geometric normalization of the face image;
storing the face image after geometric normalization into a database, and carrying out operations such as local storage or remote transmission through network storage equipment;
the inverse normalization process is the reverse of the geometric normalization process, starting with the local inverse affine transformation and then applying the global inverse affine transformation;
the foregoing global affine transformation step is as follows:
for the input face image, the feature points (P1~P6) are marked manually; the global image rotation angle is calculated from the coordinates of P1 and P2, as shown in equation (5):

$$\alpha = \arctan\!\left(\frac{y_2 - y_1}{x_2 - x_1}\right) \qquad (5)$$

where α is the angle between the P1-P2 line and the horizontal. The image is rotated by the angle θ = -α. The coordinates before and after rotation satisfy equation (6):

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (6)$$
defining the upper left corner of the rotated image as a new coordinate system origin, translating the coordinate origin relative to the original image origin, and setting the width and height of the original image as W and H respectively, wherein the coordinate transformation relation of the image rotating counterclockwise is shown as formula (7):
$$\begin{cases} x' = x\cos\theta + y\sin\theta \\ y' = -x\sin\theta + y\cos\theta + W\sin\theta \end{cases} \qquad (7)$$

where (x, y) are the original image coordinates and (x', y') are the rotated image coordinates. Similarly, the coordinate relationship between the clockwise-rotated image and the original image is:

$$\begin{cases} x' = x\cos\theta + y\sin\theta - H\sin\theta \\ y' = -x\sin\theta + y\cos\theta \end{cases} \qquad (8)$$
the feature points of the two eyes of the rotated image are located on the same horizontal line. And calculating the horizontal and vertical scaling coefficients, and cutting to make the coordinates of the eyes and the nose tip consistent with the standard geometric structure of the human face. The transverse and longitudinal scaling factors are respectively shown in formulas (9) and (10):
$$S_h = (x'_2 - x'_1)/d \qquad (9)$$

$$S_v = (y'_3 - y'_1)/(r_1 d) \qquad (10)$$

where x'1, x'2, y'1 and y'3 are the feature point coordinates after rotation, and d and r1 are the standard structure parameters defined by equations (1) and (2), i.e. d is the Euclidean distance between the centers of the two eyes and r1 is the ratio of the vertical distance from the eye line to the nose tip to d. To ensure the reversibility of the normalization, the parameter α and the ratio S = Sh/Sv of the horizontal to the vertical scaling factor are saved as the inverse transform parameters.
The specific steps of the local affine transformation aiming at the geometric normalization of the mouth region are as follows:
a rectangular mouth area is defined, and the length and the height of the rectangle are respectively shown as formulas (11) and (12):
$$w = a_1 (x_6 - x_4) \qquad (11)$$

$$h = a_2 (y_5 - y_3) \qquad (12)$$

the coordinates of the top-left vertex of the rectangle are given by equation (13):

$$(x_o, y_o) = (x_5 - w/2,\ (y_5 + y_3)/2) \qquad (13)$$

the triangular areas at the lower-left and lower-right corners of the rectangle are excluded from the mouth region; the two right-angle sides of each triangular area are w/4 and h/4, respectively. Testing on the CAS-PEAL-R1 database led to the choice a1 = 1.7, a2 = 1.1;
Calculating the local rotation angle of the mouth area through the coordinates of the characteristic points P4-P6 as shown in formula (14):
$$\beta = \arctan\!\left(\frac{y_6 - y_4}{x_6 - x_4}\right) \qquad (14)$$

the local scaling factor is calculated as shown in equation (15):

$$S_m = S_p = S_q = \frac{2 r_3 d}{x'_6 - x'_4} \qquad (15)$$

where x'4 and x'6 are the horizontal coordinates of the feature points P4 and P6 in the local image after the rotation transformation. The local scaling of the mouth uses equal horizontal and vertical scaling factors;
the rotated and scaled mouth image is translated so that the transformed P5 coordinate coincides with the P5 coordinate of the face standard structure; the translation offset is denoted (Δx, Δy);
all the parameters for the inverse transformation are thus obtained: Finv = [α, S, β, Sm, Δx, Δy];
The edge fusion step for the global transformation image and the local transformation image is as follows:
a label image Label of size 2d×2d is established;
each pixel of the image is marked with a value from 0 to 3. All label values are initialized to 0; after the mouth region is delimited, the pixels at the corresponding positions of the mouth region are given the label value 1; after local rotation, scaling and translation, the label values of the region covered by the new mouth region are increased by 2. The specific meaning of the label image is: positions with value 0 are pixel areas not involved in the local transformation; positions with value 1 belong to the original mouth region but are not covered by the new mouth region; positions with value 2 belong to the new mouth region and do not overlap the original mouth region; positions with value 3 are where the new mouth region overlaps the original mouth region;
for the pixels in the 0 area, directly adopting the pixels in the global image as final normalized image pixels;
for the pixels in the area 2, the new mouth area covers the original pixel points after local affine transformation, so the pixels in the new mouth area directly replace the covered original pixel points;
the region with value 3 is the overlap of the new mouth region and the original mouth region; a weighted average is used to obtain the fusion result and avoid obvious edge artifacts in the image. The weight is defined as follows: a two-dimensional Gaussian function centered at the locally affine-transformed P5 (x5, y5) is constructed as shown in equation (16):

$$\mathrm{Weight}(i,j) = e^{-\left(\frac{i^2}{\sigma_x^2} + \frac{j^2}{\sigma_y^2}\right)} \qquad (16)$$

where i = x - x5 and j = y - y5 are image coordinates in the local image;
[the definitions of σx and σy are given as an embedded formula image in the original document and are not reproduced here]
wm = w × Sm and hm = h × Sm are the width and the height of the mouth image after the local affine transformation, respectively;
the synthesized pixel of the overlap region at (i, j) is given by equation (17):

$$p(i,j) = \mathrm{Weight}(i,j)\,p_G + (1 - \mathrm{Weight}(i,j))\,p_L \qquad (17)$$

where pG is the pixel value of the global-image pixel in the overlap region, and pL is the pixel value of the mouth-region pixel that overlaps the global image;
the area 1 pixels are areas which belong to the original mouth area but are not covered by the new mouth area, and the pixels in the area 1 in the fused image are filled through one-dimensional linear interpolation;
traversing the Label map: if the current label is 1, count, starting from the current point, the run lengths RunWidth and RunHeight of consecutive 1-labels in the horizontal and vertical directions, respectively;
selecting a direction with a smaller continuous length to ensure that the interpolation pixel points have good continuity, and performing linear interpolation according to the pixel values of the 0 regions at the two ends of the 1 region in the fusion image to obtain the fusion pixel value of the 1 region;
if RunHeight is less than RunWidth, fusing pixel points:
$$Q(i,j) = Q_G(i_1, j) \times \frac{i_2 - i}{\mathrm{RunHeight}} + Q_L(i_2, j) \times \frac{i - i_1}{\mathrm{RunHeight}} \qquad (18)$$

where QG(i1, j) is the first pixel above the interpolated pixel, in the same column, that does not belong to region 1, and QL(i2, j) is the first pixel below the interpolated pixel, in the same column, that does not belong to region 1.
If RunWidth is less than RunHeight, fusing pixel points:
$$Q(i,j) = Q_M(i, j_1) \times \frac{j_2 - j}{\mathrm{RunWidth}} + Q_N(i, j_2) \times \frac{j - j_1}{\mathrm{RunWidth}} \qquad (19)$$

where QM(i, j1) is the first pixel to the left of the interpolated pixel, in the same row, that does not belong to region 1, and QN(i, j2) is the first pixel to the right of the interpolated pixel, in the same row, that does not belong to region 1.
the label value 1 of each interpolated pixel in the Label image is then changed to 0, which assists the linear interpolation of the remaining unfused region-1 pixels;
compared with the prior art, the invention has the following obvious advantages and beneficial effects:
the face image is processed according to the standard structure definition of the face: a global affine transformation rotates, scales and crops the image so that the coordinates of the two eyes of the input face image lie on the same horizontal line, the distance between the eye centers is normalized according to the parameter d of the standard structure, and the horizontal and vertical proportions of the face are consistent with the standard structure. A local affine transformation is then applied to normalize the geometric position of the mouth region, comprising mouth region division, local rotation and scaling, and translation of the transformed mouth region. Finally, image edge fusion merges the edges of the globally transformed image and the locally transformed image to form a complete geometrically normalized face image;
the method makes full use of the fact that facial feature positions are approximately similar across face images, and applies the reversible geometric normalization based on secondary affine transformation uniformly to the images in the database before super-resolution restoration based on global-feature PCA; this clearly improves the super-resolution restoration result, while the reversible normalization preserves the original characteristics of the image.
The invention has the characteristics that:
1. a 6-point coordinate description of the topological structure of the main facial features is adopted, which is simple and clear; a statistics-based method for determining the standard 6-point face structure is provided;
2. the global affine transformation and the local affine transformation are utilized to respectively carry out geometric normalization on the global structure and the local structure of the mouth of the human face, the principle is simple, and the implementation is easy;
3. based on the marked edge region fusion technology, image artifacts introduced by local structure adjustment can be effectively removed;
4. the face geometric normalization process is parameterized, so that the inverse transformation of the normalization process can be realized; this solves the problem that the changes the normalization makes to the face structure would otherwise affect subsequent face processing results.
Description of the drawings:
FIG. 1, Standard Structure definition of human faces;
FIG. 2, (a) coordinate mapping after counterclockwise rotation, scaling and clipping; (b) coordinate corresponding relation after clockwise rotation;
FIG. 3, (a) a mouth region division diagram; (b) partially transforming the labeled graph;
FIG. 4, (a) a diagram of generation of a training sample library; (b) the invention relates to a human face super-resolution restoration block diagram.
FIG. 5, (a) face image reconstruction result without normalization processing; (b) face image reconstruction result after normalization processing.
The specific implementation mode is as follows:
the embodiments of the invention are described in detail below with reference to the accompanying drawings:
a video acquisition device acquires the face image, converts the optical signal of the target image into a digital image signal and stores it in the storage of the acquisition device; a computer reads in the image through a standard USB or infrared interface, and the geometric normalization of the face based on the secondary affine transformation is carried out in the processor; the normalization result can be stored in database form directly on a local hard disk, on a network storage device, or over a network; the normalization method mainly comprises the steps of face standard geometric structure definition, global affine transformation, local affine transformation and image edge fusion.
Definition of standard geometric structure of human face
As shown in FIG. 1, the face image is described by 6 feature point coordinates and their relative positions: P1~P6 (with corresponding coordinates (x1, y1)~(x6, y6)) denote the left eye center, the right eye center, the nose tip, the left mouth corner, the mouth center and the right mouth corner, respectively;
the relative position relationship F = (d, r1, r2, r3) between the feature points is calculated, where

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \qquad (1)$$

$$r_1 = \frac{y_3 - y_1}{d} \qquad (2)$$

$$r_2 = \frac{y_5 - y_3}{d} \qquad (3)$$

$$r_3 = \frac{x_5 - x_4}{d} = \frac{x_6 - x_5}{d} \qquad (4)$$

i.e. d is the Euclidean distance between the centers of the two eyes, r1 is the ratio of the vertical distance from the eye line to the nose tip to d, r2 is the ratio of the distance from the nose tip to the mouth center to d, and r3 is the ratio of the distance from the mouth center to a mouth corner to d. The length and width of the face image are defined as 2d, and the nearest image edge is at a distance of 0.5d from the eye centers;
the feature points of a face image sample library are marked and the statistical average over the different face images is taken to obtain the standard structure parameters F;
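A minimal NumPy sketch of this computation follows; the six points are assumed to be supplied as an array ordered P1..P6, and equation (4) is evaluated as the average of its two equivalent expressions.

```python
import numpy as np

def structure_params(pts):
    """pts: (6, 2) array of (x, y) coordinates for P1..P6 (left eye center,
    right eye center, nose tip, left mouth corner, mouth center, right mouth
    corner).  Returns F = (d, r1, r2, r3) per equations (1)-(4)."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5), (x6, y6) = pts
    d = np.hypot(x1 - x2, y1 - y2)            # eye-center distance, eq. (1)
    r1 = (y3 - y1) / d                        # eye line -> nose tip, eq. (2)
    r2 = (y5 - y3) / d                        # nose tip -> mouth center, eq. (3)
    r3 = ((x5 - x4) + (x6 - x5)) / (2 * d)    # mouth center -> corner, eq. (4)
    return np.array([d, r1, r2, r3])

def standard_structure(sample_landmarks):
    """sample_landmarks: iterable of (6, 2) landmark arrays from the sample
    library; the standard structure F is their statistical average."""
    return np.mean([structure_params(p) for p in sample_landmarks], axis=0)
```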
(II) Global affine transformation
The global affine transformation step is as follows:
for the input face image, the feature points (P1~P6) are marked manually; the global image rotation angle is calculated from the coordinates of P1 and P2, as shown in equation (5):

$$\alpha = \arctan\!\left(\frac{y_2 - y_1}{x_2 - x_1}\right) \qquad (5)$$

where α is the angle between the P1-P2 line and the horizontal. The image is rotated by the angle θ = -α. The coordinates before and after rotation satisfy equation (6):

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (6)$$
defining the upper left corner of the rotated image as a new coordinate system origin, then the origin of coordinates is translated with respect to the origin of the original image, and if the width and height of the original image are W and H, respectively, as shown in fig. 2, then the coordinate transformation relation of the image rotating counterclockwise is as shown in equation (7):
$$\begin{cases} x' = x\cos\theta + y\sin\theta \\ y' = -x\sin\theta + y\cos\theta + W\sin\theta \end{cases} \qquad (7)$$

where (x, y) are the original image coordinates and (x', y') are the rotated image coordinates. Similarly, the coordinate relationship between the clockwise-rotated image and the original image is:

$$\begin{cases} x' = x\cos\theta + y\sin\theta - H\sin\theta \\ y' = -x\sin\theta + y\cos\theta \end{cases} \qquad (8)$$
the feature points of the two eyes of the rotated image are located on the same horizontal line. And calculating the horizontal and vertical scaling coefficients, and cutting to make the coordinates of the eyes and the nose tip consistent with the standard geometric structure of the human face. The transverse and longitudinal scaling factors are respectively shown in formulas (9) and (10):
$$S_h = (x'_2 - x'_1)/d \qquad (9)$$

$$S_v = (y'_3 - y'_1)/(r_1 d) \qquad (10)$$

where x'1, x'2, y'1 and y'3 are the feature point coordinates after rotation, and d and r1 are the standard structure parameters defined in section (I). To ensure the reversibility of the normalization, the parameter α and the ratio S = Sh/Sv of the horizontal to the vertical scaling factor are stored as the inverse transform parameters. At the same time, the coordinates of the face feature points P1~P6 are updated according to the rotation, scaling and cropping parameters;
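A sketch of this global affine step is given below, using the widely available OpenCV (cv2) warp; the target left-eye position (0.5d, 0.5d) inside the 2d×2d crop and the use of a single composed warp are assumptions consistent with the structure definition above, not a literal transcription of the patented procedure.

```python
import numpy as np
import cv2

def global_affine(img, pts, F):
    """img: grayscale face image; pts: (6, 2) landmarks P1..P6; F = (d, r1, r2, r3)
    standard structure.  Returns the 2d x 2d normalized crop, the updated
    landmarks and the saved inverse-transform parameters (alpha, S = Sh/Sv)."""
    d, r1 = F[0], F[1]
    p1, p2 = pts[0], pts[1]

    alpha = np.arctan2(p2[1] - p1[1], p2[0] - p1[0])   # eq. (5), in radians
    c, s = np.cos(-alpha), np.sin(-alpha)
    R = np.array([[c, -s], [s, c]])                    # rotation by theta = -alpha

    rp = (pts - p1) @ R.T                              # rotated landmarks, P1 at origin
    Sh = (rp[1, 0] - rp[0, 0]) / d                     # eq. (9)
    Sv = (rp[2, 1] - rp[0, 1]) / (r1 * d)              # eq. (10)
    A = np.diag([1.0 / Sh, 1.0 / Sv]) @ R              # rotate, then anisotropic scale

    target_p1 = np.array([0.5 * d, 0.5 * d])           # assumed standard left-eye position
    t = target_p1 - A @ p1                             # translation placing P1 there

    M = np.hstack([A, t[:, None]])                     # 2x3 affine matrix (source -> target)
    size = int(round(2 * d))
    out = cv2.warpAffine(img, M, (size, size))         # rotate, scale and crop to 2d x 2d
    new_pts = pts @ A.T + t                            # updated feature point coordinates
    return out, new_pts, alpha, Sh / Sv                # S = Sh/Sv saved for the inverse
```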
(III) local affine transformations
Local affine transformation, aiming at the geometric normalization of a mouth region, the specific steps are as follows:
the rectangular mouth area is defined as shown in fig. 3(a), and the length and height of the rectangle are respectively shown in formulas (11) and (12):
$$w = a_1 (x_6 - x_4) \qquad (11)$$

$$h = a_2 (y_5 - y_3) \qquad (12)$$

the coordinates of the top-left vertex of the rectangle are given by equation (13):

$$(x_o, y_o) = (x_5 - w/2,\ (y_5 + y_3)/2) \qquad (13)$$

the triangular areas at the lower-left and lower-right corners of the rectangle are excluded from the mouth region; the two right-angle sides of each triangular area are w/4 and h/4, respectively. Testing on the CAS-PEAL-R1 database led to the choice a1 = 1.7, a2 = 1.1;
Calculating the local rotation angle of the mouth area through the coordinates of the characteristic points P4-P6 as shown in formula (14):
$$\beta = \arctan\!\left(\frac{y_6 - y_4}{x_6 - x_4}\right) \qquad (14)$$

the local scaling factor is calculated as shown in equation (15):

$$S_m = S_p = S_q = \frac{2 r_3 d}{x'_6 - x'_4} \qquad (15)$$

where x'4 and x'6 are the horizontal coordinates of the feature points P4 and P6 in the local image after the rotation transformation. The local scaling of the mouth uses equal horizontal and vertical scaling factors;
the rotated and scaled mouth image is translated so that the transformed P5 coordinate coincides with the P5 coordinate of the face standard structure; the translation offset is denoted (Δx, Δy);
all the parameters for the inverse transformation are thus obtained: Finv = [α, S, β, Sm, Δx, Δy];
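A sketch of the corresponding parameter computation is given below; it assumes the local rotation and scaling are taken about P5 itself (so that the translation offset is simply the difference between P5 and its standard position), and the standard P5 position (d, 0.5d + (r1 + r2)d) is derived from the structure definition rather than stated explicitly in the text.

```python
import numpy as np

A1, A2 = 1.7, 1.1   # rectangle factors chosen by testing on the CAS-PEAL-R1 database

def mouth_affine_params(pts, F):
    """pts: (6, 2) landmarks P1..P6 in the globally normalized image;
    F = (d, r1, r2, r3) standard structure.  Returns the mouth rectangle
    (x_o, y_o, w, h), the local rotation angle beta, the local scale Sm and the
    translation offset (dx, dy) of equations (11)-(15)."""
    d, r1, r2, r3 = F
    p3, p4, p5, p6 = pts[2], pts[3], pts[4], pts[5]

    w = A1 * (p6[0] - p4[0])                              # eq. (11)
    h = A2 * (p5[1] - p3[1])                              # eq. (12)
    x_o, y_o = p5[0] - w / 2.0, (p5[1] + p3[1]) / 2.0     # eq. (13), top-left vertex

    beta = np.arctan2(p6[1] - p4[1], p6[0] - p4[0])       # eq. (14)
    # rotating by -beta levels the mouth corners, so x'6 - x'4 equals the
    # Euclidean distance between the two corners
    Sm = (2.0 * r3 * d) / np.hypot(p6[0] - p4[0], p6[1] - p4[1])   # eq. (15)

    std_p5 = np.array([d, 0.5 * d + (r1 + r2) * d])       # assumed standard mouth center
    dx, dy = std_p5 - p5                                  # translation offset
    return (x_o, y_o, w, h), beta, Sm, (dx, dy)
```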
(IV) image edge blending
The edge fusion step for the global transformation image and the local transformation image is as follows:
a label image Label of size 2d×2d is established;
each pixel of the image is marked with a value from 0 to 3. All label values are initialized to 0; after the mouth region is delimited, the pixels at the corresponding positions of the mouth region are given the label value 1; after local rotation, scaling and translation, the label values of the region covered by the new mouth region are increased by 2. The specific meaning of the label image is: positions with value 0 are pixel areas not involved in the local transformation; positions with value 1 belong to the original mouth region but are not covered by the new mouth region; positions with value 2 belong to the new mouth region and do not overlap the original mouth region; positions with value 3 are where the new mouth region overlaps the original mouth region;
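A short sketch of the label map just described; mouth_mask_old and mouth_mask_new are assumed to be boolean masks of the original and the transformed mouth regions (the rectangle minus its two lower corner triangles) in the 2d×2d image.

```python
import numpy as np

def build_label_map(size, mouth_mask_old, mouth_mask_new):
    """size: side length of the normalized image (2d); the masks are boolean
    arrays of shape (size, size).  Label semantics: 0 = untouched by the local
    transform, 1 = original mouth only, 2 = new mouth only, 3 = overlap."""
    label = np.zeros((size, size), dtype=np.uint8)
    label[mouth_mask_old] += 1        # original mouth region
    label[mouth_mask_new] += 2        # region covered by the transformed mouth
    return label
```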
for the pixels in the 0 area, directly adopting the pixels in the global image as final normalized image pixels;
for the pixels in the area 2, the new mouth area covers the original pixel points after local affine transformation, so the pixels in the new mouth area directly replace the original pixel points covered by the new mouth area;
the region with value 3 is the overlap of the new mouth region and the original mouth region; a weighted average is used to obtain the fusion result and avoid obvious edge artifacts in the image. The weight is defined as follows: a two-dimensional Gaussian function centered at the locally affine-transformed P5 (x5, y5) is constructed as shown in equation (16):

$$\mathrm{Weight}(i,j) = e^{-\left(\frac{i^2}{\sigma_x^2} + \frac{j^2}{\sigma_y^2}\right)} \qquad (16)$$

where i = x - x5 and j = y - y5 are image coordinates in the local image;
[the definitions of σx and σy are given as an embedded formula image in the original document and are not reproduced here]
wm = w × Sm and hm = h × Sm are the width and the height of the mouth image after the local affine transformation, respectively;
the synthesized pixel of the overlap region at (i, j) is given by equation (17):

$$p(i,j) = \mathrm{Weight}(i,j)\,p_G + (1 - \mathrm{Weight}(i,j))\,p_L \qquad (17)$$

where pG is the pixel value of the global-image pixel in the overlap region, and pL is the pixel value of the mouth-region pixel that overlaps the global image;
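A sketch of the weighted fusion of equations (16)-(17) driven by this label map; because the definitions of σx and σy are not reproduced in this text, they are passed in as parameters, and a single-channel (grayscale) image is assumed.

```python
import numpy as np

def fuse_overlap(global_img, local_img, label, p5, sigma_x, sigma_y):
    """global_img, local_img: 2-D float arrays of the same size, local_img
    holding the transformed mouth patch already placed at its new position;
    label: map from build_label_map; p5 = (x5, y5), the transformed mouth
    center.  Region 0 keeps the global pixel, region 2 takes the new mouth
    pixel, region 3 is blended with the Gaussian weight of eq. (16)/(17)."""
    out = global_img.astype(np.float64).copy()
    ys, xs = np.nonzero(label >= 2)                    # regions 2 and 3
    i, j = xs - p5[0], ys - p5[1]                      # offsets from P5, eq. (16)
    weight = np.exp(-(i ** 2 / sigma_x ** 2 + j ** 2 / sigma_y ** 2))
    weight = np.where(label[ys, xs] == 3, weight, 0.0) # region 2: replace outright
    out[ys, xs] = weight * global_img[ys, xs] + (1.0 - weight) * local_img[ys, xs]
    return out
```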
the area 1 pixels are areas which belong to the original mouth area but are not covered by the new mouth area, and the pixels in the area 1 in the fused image are filled through one-dimensional linear interpolation;
traversing the Label map: if the current label is 1, count, starting from the current point, the run lengths RunWidth and RunHeight of consecutive 1-labels in the horizontal and vertical directions, respectively;
selecting a direction with a smaller continuous length to ensure that the interpolation pixel points have good continuity, and performing linear interpolation according to the pixel values of the 0 regions at the two ends of the 1 region in the fusion image to obtain the fusion pixel value of the 1 region;
if RunHeight is less than RunWidth, fusing pixel points:
$$Q(i,j) = Q_G(i_1, j) \times \frac{i_2 - i}{\mathrm{RunHeight}} + Q_L(i_2, j) \times \frac{i - i_1}{\mathrm{RunHeight}} \qquad (18)$$

where QG(i1, j) is the first pixel above the interpolated pixel, in the same column, that does not belong to region 1, and QL(i2, j) is the first pixel below the interpolated pixel, in the same column, that does not belong to region 1.
If RunWidth is less than RunHeight, fusing pixel points:
$$Q(i,j) = Q_M(i, j_1) \times \frac{j_2 - j}{\mathrm{RunWidth}} + Q_N(i, j_2) \times \frac{j - j_1}{\mathrm{RunWidth}} \qquad (19)$$

where QM(i, j1) is the first pixel to the left of the interpolated pixel, in the same row, that does not belong to region 1, and QN(i, j2) is the first pixel to the right of the interpolated pixel, in the same row, that does not belong to region 1.
the label value 1 of each interpolated pixel in the Label image is then changed to 0, which assists the linear interpolation of the remaining unfused region-1 pixels;
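A sketch of this region-1 fill is given below; the boundary pixels i1, i2 (or j1, j2) are located by scanning from the current pixel, the denominators are taken as i2 - i1 (resp. j2 - j1) so that the two weights in equations (18)/(19) sum to one, and runs are assumed not to touch the image border.

```python
import numpy as np

def fill_region_one(fused, label):
    """fused: image after regions 0/2/3 have been handled; label: label map.
    Pixels with label 1 (original mouth not covered by the new mouth) are
    filled by 1-D linear interpolation along the direction with the shorter run."""
    out = fused.astype(np.float64).copy()
    for y, x in zip(*np.nonzero(label == 1)):
        i1 = y
        while label[i1, x] == 1:          # first non-1 pixel above
            i1 -= 1
        i2 = y
        while label[i2, x] == 1:          # first non-1 pixel below
            i2 += 1
        j1 = x
        while label[y, j1] == 1:          # first non-1 pixel to the left
            j1 -= 1
        j2 = x
        while label[y, j2] == 1:          # first non-1 pixel to the right
            j2 += 1
        if i2 - i1 <= j2 - j1:            # shorter vertical run: eq. (18)
            out[y, x] = (out[i1, x] * (i2 - y) + out[i2, x] * (y - i1)) / (i2 - i1)
        else:                             # shorter horizontal run: eq. (19)
            out[y, x] = (out[y, j1] * (j2 - x) + out[y, j2] * (x - j1)) / (j2 - j1)
    return out
```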
storing the face image after geometric normalization into a database, and carrying out operations such as local storage or remote transmission through network storage equipment; the normalized face image library can be used for PCA analysis, establishing a sample library of face image analysis, and being used for face recognition or face super-resolution reconstruction and the like.
(V) inverse normalization
According to the standard structure definition of the face, the local mouth region is delimited; using the local translation parameters (Δx, Δy), the local scaling factor Sm and the local rotation angle β from the Finv parameters, the mouth image is translated, scaled and rotated back;
and fusing the edge of the image subjected to local affine transformation and the edge of the global image.
The fused image is then scaled and rotated according to S and α among the saved reversible parameters to obtain the face image after inverse normalization. S is the ratio of the horizontal to the vertical scaling coefficient of the global affine transformation used in the normalization; in the inverse normalization, the height of the target image can be computed from S and the chosen target image width, so that inverse scaling with different proportions in the horizontal and vertical directions is realized.
Finally, the whole image is rotated back according to the global angle parameter α.
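A sketch of the global part of the inverse transform is given below; since only α and S (not the absolute scale or the crop offsets) are stored, this version restores proportions and orientation for a caller-chosen output width, which is the behaviour described above.

```python
import numpy as np
import cv2

def inverse_normalize_global(norm_img, alpha, S, out_w):
    """norm_img: the image after the local inverse affine and edge fusion;
    alpha: saved global rotation angle in radians; S = Sh/Sv: saved ratio of the
    horizontal to the vertical scaling factor; out_w: chosen output width, from
    which the output height is derived via S."""
    out_h = int(round(out_w / S))                      # height from width and S
    stretched = cv2.resize(norm_img, (out_w, out_h))   # inverse anisotropic scaling
    c, s = np.cos(alpha), np.sin(alpha)                # rotate back by +alpha
    cx, cy = out_w / 2.0, out_h / 2.0
    M = np.array([[c, -s, cx - c * cx + s * cy],
                  [s,  c, cy - s * cx - c * cy]])      # rotation about the image center
    return cv2.warpAffine(stretched, M, (out_w, out_h))
```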
FIG. 4 illustrates a block diagram of a system applying the method of the present invention to super-resolution reconstruction of face images. FIG. 4(a) shows the generation of the face image sample library: the high-resolution face image library, geometrically normalized by the present invention, is downsampled to generate the corresponding low-resolution images; PCA analysis is then carried out on the high-resolution and low-resolution face libraries respectively to obtain the corresponding feature vector groups, which serve as the sample library for face image super-resolution reconstruction. FIG. 4(b) shows the super-resolution reconstruction process for a face image based on global feature analysis. First, the low-resolution face image is geometrically normalized according to the present invention, yielding the inverse transformation parameters Finv; the normalized face image is then linearly represented by the PCA feature vector group of the low-resolution images in the sample library, giving the linear combination coefficients C. The coefficients C are linearly combined with the PCA feature vector group of the high-resolution faces to obtain an initial estimate of the high-resolution face image, which is inversely normalized according to the Finv parameters to obtain the high-resolution face image reconstruction result. FIG. 5 illustrates that applying the method of the present invention in the face image super-resolution reconstruction process effectively improves the quality of the reconstructed image: FIG. 5(a) shows the super-resolution reconstruction results without normalization processing and FIG. 5(b) the results after normalization processing. Table 1 below gives the mean PSNR, for objective image quality evaluation, of the 2× magnification results over 300 test sample images with and without normalization; the normalized samples give clearly better results than the un-normalized ones. In a face image super-resolution reconstruction scheme, the method thus preserves the personalized geometric characteristics of the original face image while improving the effectiveness of whole-face analysis.
Table 1. Mean PSNR of the image normalization test on the CAS-PEAL-R1 database (2× magnification)

                                    PSNR (dB)
With normalization                  29.026
Without normalization               27.421
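For reference, the objective quality measure used for the averages in Table 1 is the peak signal-to-noise ratio; a minimal sketch for 8-bit images is given below (the averaging over the 300 test samples is straightforward and omitted).

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference face image and its
    super-resolution reconstruction."""
    err = np.asarray(reference, float) - np.asarray(reconstructed, float)
    mse = np.mean(err ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```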

Claims (4)

1. A face image reversible geometric normalization method facing integral feature analysis comprises the steps that a video acquisition device acquires a face image, an optical signal of a target image is converted into a digital image signal, and the digital image signal is stored in a storage device of the acquisition device; reading in images through the existing USB and infrared interfaces by a computer, and carrying out geometric normalization of the human face based on secondary affine transformation in a processor; the normalization result can be stored in a database form and directly stored in a local hard disk, or stored in a network storage device or stored in a network; the normalization method mainly comprises the steps of face standard geometric structure definition, global affine transformation, local affine transformation and image edge fusion;
in the global affine transformation step, after a face image is defined according to a standard structure of a face, global rotation, scaling and cutting are carried out, so that the input face image is enabled to have the two-eye coordinates in the same horizontal line, the center distance of the two eyes is normalized according to a parameter d of the standard structure, and the transverse and longitudinal proportions of the face are consistent with the standard structure;
in the local affine transformation step, geometric position normalization is carried out on a mouth region, and the main steps include mouth region division, local rotation and scaling and the mouth region after transformation;
in the image edge fusion step, edge fusion is carried out on the global transformation image and the local transformation image so as to form a standard geometric normalization human face image;
the image normalization algorithm comprises the following steps:
reading in the collected face image from the database;
describing the face image by 6 feature point coordinates and their relative positions: P1~P6 (with corresponding coordinates (x1, y1)~(x6, y6)) denote the left eye center, the right eye center, the nose tip, the left mouth corner, the mouth center and the right mouth corner, respectively;
calculating the relative position relationship F = (d, r1, r2, r3) between the feature points, where

$$d = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2} \qquad (1)$$

$$r_1 = \frac{y_3 - y_1}{d} \qquad (2)$$

$$r_2 = \frac{y_5 - y_3}{d} \qquad (3)$$

$$r_3 = \frac{x_5 - x_4}{d} = \frac{x_6 - x_5}{d} \qquad (4)$$

i.e. d is the Euclidean distance between the centers of the two eyes, r1 is the ratio of the vertical distance from the eye line to the nose tip to d, r2 is the ratio of the distance from the nose tip to the mouth center to d, and r3 is the ratio of the distance from the mouth center to a mouth corner to d; the length and width of the face image are defined as 2d, and the nearest image edge is at a distance of 0.5d from the eye centers;
marking characteristic points of a face image sample library and solving statistical average to obtain standard structure parameters F aiming at different human face images;
carrying out global affine transformation on the image defined by the standard structure of the face, and carrying out global rotation, scaling and cutting to enable the input face image to be in the same horizontal line with the coordinates of two eyes, and the central distance of the two eyes to be normalized according to the parameter d of the standard structure, so that the transverse and longitudinal proportions of the face are consistent with the standard structure;
carrying out local affine transformation and carrying out geometric position normalization on the mouth region; dividing a mouth region, locally rotating and scaling, and fusing the transformed mouth region and the edge of an original image;
the edges of the mouth local image and the global image are fused to complete the geometric normalization of the face image;
storing the face image after geometric normalization into a database, and carrying out operations such as local storage or remote transmission through network storage equipment;
the inverse normalization process is the inverse of the geometric normalization process described above, starting with the local inverse affine and going to the global inverse affine.
2. The reversible geometric normalization method for human face images oriented to integral feature analysis according to claim 1, characterized in that: the global affine transformation comprises the following specific steps:
2.1: for the input face image, the feature points (P1~P6) are marked manually; the global image rotation angle is calculated from the coordinates of P1 and P2, as shown in equation (5):

$$\alpha = \arctan\!\left(\frac{y_2 - y_1}{x_2 - x_1}\right) \qquad (5)$$

2.2: where α is the angle between the P1-P2 line and the horizontal; the image is rotated by the angle θ = -α; the coordinates before and after rotation satisfy equation (6):

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (6)$$

2.3: defining the upper-left corner of the rotated image as the origin of a new coordinate system, the coordinate origin is translated relative to the original image origin; with the width and height of the original image denoted W and H respectively, the coordinate transformation relation for counterclockwise rotation is given by equation (7):

$$\begin{cases} x' = x\cos\theta + y\sin\theta \\ y' = -x\sin\theta + y\cos\theta + W\sin\theta \end{cases} \qquad (7)$$

2.4: where (x, y) are the original image coordinates and (x', y') are the rotated image coordinates; similarly, the coordinate relationship between the clockwise-rotated image and the original image is:

$$\begin{cases} x' = x\cos\theta + y\sin\theta - H\sin\theta \\ y' = -x\sin\theta + y\cos\theta \end{cases} \qquad (8)$$

2.5: the feature points of the two eyes of the rotated image lie on the same horizontal line; the horizontal and vertical scaling coefficients are calculated and the image is cropped so that the coordinates of the eyes and the nose tip are consistent with the standard geometric structure of the face; the horizontal and vertical scaling factors are given by equations (9) and (10), respectively:

$$S_h = (x'_2 - x'_1)/d \qquad (9)$$

$$S_v = (y'_3 - y'_1)/(r_1 d) \qquad (10)$$

2.6: where x'1, x'2, y'1 and y'3 are the feature point coordinates after rotation, d is the Euclidean distance between the centers of the two eyes, r1 is the ratio of the vertical distance from the eye line to the nose tip to d, and the parameter α and the ratio S = Sh/Sv of the horizontal to the vertical scaling factor are stored as the inverse transform parameters.
3. The reversible geometric normalization method for human face images oriented to integral feature analysis according to claim 1, characterized in that: the local affine transformation specifically comprises the following steps aiming at geometric normalization of a mouth region:
3.1: a rectangular mouth area is defined, and the length and the height of the rectangle are respectively shown as formulas (11) and (12):
$$w = a_1 (x_6 - x_4) \qquad (11)$$

$$h = a_2 (y_5 - y_3) \qquad (12)$$

the coordinates of the top-left vertex of the rectangle are given by equation (13):

$$(x_o, y_o) = (x_5 - w/2,\ (y_5 + y_3)/2) \qquad (13)$$

the triangular areas at the lower-left and lower-right corners of the rectangle are excluded from the mouth region; the two right-angle sides of each triangular area are w/4 and h/4, respectively; testing on the CAS-PEAL-R1 database led to the choice a1 = 1.7, a2 = 1.1;
3.2: calculating the local rotation angle of the mouth area through the coordinates of the characteristic points P4-P6 as shown in formula (14):
$$\beta = \arctan\!\left(\frac{y_6 - y_4}{x_6 - x_4}\right) \qquad (14)$$

the local scaling factor is calculated as shown in equation (15):

$$S_m = S_p = S_q = \frac{2 r_3 d}{x'_6 - x'_4} \qquad (15)$$

where x'4 and x'6 are the horizontal coordinates of the feature points P4 and P6 in the local image after the rotation transformation; the local scaling of the mouth uses equal horizontal and vertical scaling factors;
the rotated and scaled mouth image is translated so that the transformed P5 coordinate coincides with the P5 coordinate of the face standard structure; the translation offset is denoted (Δx, Δy);
all the parameters for the inverse transformation are thus obtained: Finv = [α, S, β, Sm, Δx, Δy].
4. The reversible geometric normalization method for human face images oriented to integral feature analysis according to claim 1, characterized in that: the method is characterized in that the global transformation image and the local transformation image are subjected to edge fusion, and the method specifically comprises the following steps:
4.1: a label image Label of size 2d×2d is established;
4.2: each pixel of the image is marked with a value from 0 to 3; all label values are initialized to 0; after the mouth region is delimited, the pixels at the corresponding positions of the mouth region are given the label value 1; after local rotation, scaling and translation, the label values of the region covered by the new mouth region are increased by 2; the specific meaning of the label image is: positions with value 0 are pixel areas not involved in the local transformation; positions with value 1 belong to the original mouth region but are not covered by the new mouth region; positions with value 2 belong to the new mouth region and do not overlap the original mouth region; positions with value 3 are where the new mouth region overlaps the original mouth region;
for pixels in region 0, the pixels of the global image are used directly as the final normalized image pixels;
for pixels in region 2, the new mouth region covers the original pixels after the local affine transformation, so the new mouth region pixels directly replace the covered original pixels;
region 3 is where the new mouth region is superimposed on the original image; a weighted-average fusion is used there to avoid obvious edge artifacts in the image; the weight is defined by a two-dimensional Gaussian function centred on the locally transformed P5(x5, y5), as given in equation (16):
Weight(i, j) = exp(-(i²/σx² + j²/σy²))    (16)
wherein i = x - x5 and j = y - y5, with (x, y) the image coordinates in the local image;
σx and σy are defined by formulas given as equation images in the original document (FDA0000150152730000042 and FDA0000150152730000043);
wm = w × Sm and hm = h × Sm are respectively the width and the height of the mouth image after the local affine transformation;
the synthesized pixel at position (i, j) of the overlap region is given by formula (17):
p(i, j) = Weight(i, j)·pG + (1 - Weight(i, j))·pL    (17)
where pG is the pixel value of the point in the overlap region of the global image and pL is the pixel value at the corresponding position of the mouth region image;
4.3: region 1 pixels belong to the original mouth region but are not covered by the new mouth region; these pixels are filled in the fused image by interpolation (see the second sketch following this claim);
the Label image is traversed; if the current label is 1, the lengths RunWidth and RunHeight of the horizontal and vertical runs of consecutive 1-labelled pixels starting from the current point are counted;
the direction with the smaller run length is selected to ensure good continuity of the interpolated pixels, and linear interpolation between the pixel values of the region-0 pixels at the two ends of the run in the fused image gives the fused pixel value of the region-1 pixel;
if RunHeight is less than RunWidth, the fused pixel is:
Q(i, j) = QG(i1, j) × ((i2 - i)/RunHeight) + QL(i2, j) × ((i - i1)/RunHeight)    (18)
where QG(i1, j) is the first pixel above the interpolated pixel, in the same column, that does not belong to region 1, and QL(i2, j) is the first pixel below the interpolated pixel, in the same column, that does not belong to region 1;
if RunWidth is less than RunHeight, the fused pixel is:
Q(i, j) = QM(i, j1) × ((j2 - j)/RunWidth) + QN(i, j2) × ((j - j1)/RunWidth)    (19)
where QM(i, j1) is the first pixel to the left of the interpolated pixel, in the same row, that does not belong to region 1, and QN(i, j2) is the first pixel to the right of the interpolated pixel, in the same row, that does not belong to region 1;
the label value 1 at each interpolated pixel is then changed to 0 in the Label image, so that the interpolated pixel can serve as an end point for the linear interpolation of the remaining unfused region-1 pixels.
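For illustration, a minimal Python sketch of step 4.2 for a single-channel image: building the label image and fusing the region-3 overlap with the Gaussian weight of formulas (16)–(17). The widths σx and σy appear only as equation images in the original text, so they are assumed here to be proportional to the transformed mouth size wm × hm; the local image is assumed to be the transformed mouth already resampled onto the global 2d × 2d grid.

```python
import numpy as np

def build_label_image(old_mouth_mask, new_mouth_mask):
    """Label values: 0 outside both regions, 1 original mouth only,
    2 new mouth only, 3 overlap of original and new mouth regions."""
    label = np.zeros(old_mouth_mask.shape, dtype=np.uint8)
    label[old_mouth_mask] += 1        # original mouth region contributes 1
    label[new_mouth_mask] += 2        # transformed mouth region contributes 2
    return label

def fuse_overlap(global_img, local_img, label, p5, wm, hm):
    """Blend region-3 pixels of the global and local images per (16)-(17)."""
    out = global_img.astype(np.float64).copy()
    sigma_x, sigma_y = wm / 2.0, hm / 2.0             # assumed forms of sigma_x and sigma_y
    ys, xs = np.nonzero(label == 3)
    i = xs - p5[0]                                    # i = x - x5
    j = ys - p5[1]                                    # j = y - y5
    weight = np.exp(-(i ** 2 / sigma_x ** 2 + j ** 2 / sigma_y ** 2))               # formula (16)
    out[ys, xs] = weight * global_img[ys, xs] + (1.0 - weight) * local_img[ys, xs]  # formula (17)
    return out
```

Region-0 pixels are taken directly from the global image and region-2 pixels directly from the local image, as stated in the claim, so only region 3 needs the weighted blend here.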
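And a sketch of the region-1 filling of step 4.3. The interpolation weights are normalized to sum to one, a slight simplification of the literal denominators in formulas (18)–(19); interpolated pixels are relabelled 0 so that later region-1 pixels can use them as end points, as in the last step of the claim.

```python
import numpy as np

def fill_region_1(fused, label):
    """Fill pixels still labelled 1 by linear interpolation along the
    shorter of their horizontal/vertical runs of 1-labels (a sketch)."""
    img = fused.astype(np.float64).copy()
    lab = label.copy()
    H, W = lab.shape
    for y in range(H):
        for x in range(W):
            if lab[y, x] != 1:
                continue
            up = down = left = right = 0          # run lengths of consecutive 1-labels
            while y - up - 1 >= 0 and lab[y - up - 1, x] == 1: up += 1
            while y + down + 1 < H and lab[y + down + 1, x] == 1: down += 1
            while x - left - 1 >= 0 and lab[y, x - left - 1] == 1: left += 1
            while x + right + 1 < W and lab[y, x + right + 1] == 1: right += 1
            run_h, run_w = up + down + 1, left + right + 1
            if run_h <= run_w and 0 <= y - up - 1 and y + down + 1 < H:
                i1, i2 = y - up - 1, y + down + 1          # vertical interpolation, cf. (18)
                t = (y - i1) / float(i2 - i1)
                img[y, x] = (1.0 - t) * img[i1, x] + t * img[i2, x]
            elif 0 <= x - left - 1 and x + right + 1 < W:
                j1, j2 = x - left - 1, x + right + 1       # horizontal interpolation, cf. (19)
                t = (x - j1) / float(j2 - j1)
                img[y, x] = (1.0 - t) * img[y, j1] + t * img[y, j2]
            else:
                continue                                    # run touches the image border; skip
            lab[y, x] = 0                                   # relabel as fused (region 0)
    return img
```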
CN201210096595.6A 2012-04-01 2012-04-01 Face image reversible geometric normalization method facing overall characteristics analysis Expired - Fee Related CN102663361B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210096595.6A CN102663361B (en) 2012-04-01 2012-04-01 Face image reversible geometric normalization method facing overall characteristics analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210096595.6A CN102663361B (en) 2012-04-01 2012-04-01 Face image reversible geometric normalization method facing overall characteristics analysis

Publications (2)

Publication Number Publication Date
CN102663361A true CN102663361A (en) 2012-09-12
CN102663361B CN102663361B (en) 2014-01-01

Family

ID=46772845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210096595.6A Expired - Fee Related CN102663361B (en) 2012-04-01 2012-04-01 Face image reversible geometric normalization method facing overall characteristics analysis

Country Status (1)

Country Link
CN (1) CN102663361B (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103475876A (en) * 2013-08-27 2013-12-25 北京工业大学 Learning-based low-bit-rate compression image super-resolution reconstruction method
CN103914676A (en) * 2012-12-30 2014-07-09 杭州朗和科技有限公司 Method and apparatus for use in face recognition
CN106778577A (en) * 2016-12-06 2017-05-31 浙江水马环保科技有限公司 Water purifier user's personal identification method
CN106778578A (en) * 2016-12-06 2017-05-31 浙江水马环保科技有限公司 Water purifier method for identifying ID
CN107182218A (en) * 2015-12-31 2017-09-19 深圳先进技术研究院 A kind of authentication method and device
CN107506697A (en) * 2017-07-29 2017-12-22 广东欧珀移动通信有限公司 Anti-fake processing method and related product
CN107945102A (en) * 2017-10-23 2018-04-20 深圳市朗形网络科技有限公司 A kind of picture synthetic method and device
CN108492344A (en) * 2018-03-30 2018-09-04 中国科学院半导体研究所 A kind of portrait-cartoon generation method
CN108615014A (en) * 2018-04-27 2018-10-02 京东方科技集团股份有限公司 A kind of detection method of eye state, device, equipment and medium
CN108875556A (en) * 2018-04-25 2018-11-23 北京旷视科技有限公司 Method, apparatus, system and the computer storage medium veritified for the testimony of a witness
CN109855739A (en) * 2019-01-04 2019-06-07 三峡大学 Power equipment infrared measurement of temperature method and device based on affine transformation
CN109948555A (en) * 2019-03-21 2019-06-28 于建岗 Human face super-resolution recognition methods based on video flowing
CN111028251A (en) * 2019-12-27 2020-04-17 四川大学 Dental picture cutting method, system, equipment and storage medium
CN111028354A (en) * 2018-10-10 2020-04-17 成都理工大学 Image sequence-based model deformation human face three-dimensional reconstruction scheme
CN111259711A (en) * 2018-12-03 2020-06-09 北京嘀嘀无限科技发展有限公司 Lip movement identification method and system
CN112184593A (en) * 2020-10-14 2021-01-05 北京字跳网络技术有限公司 Key point determination method, device, equipment and computer readable medium
CN112308043A (en) * 2020-11-26 2021-02-02 腾讯科技(深圳)有限公司 Image processing method, image processing apparatus, and computer-readable storage medium
CN112464699A (en) * 2019-09-06 2021-03-09 富士通株式会社 Image normalization method, system and readable medium for face analysis
CN113643187A (en) * 2020-04-27 2021-11-12 京东方科技集团股份有限公司 Image restoration method and device
CN114972008A (en) * 2021-11-04 2022-08-30 华为技术有限公司 Coordinate restoration method and device and related equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1786980A (en) * 2005-12-08 2006-06-14 上海交通大学 Melthod for realizing searching new position of person's face feature point by tow-dimensional profile
US7778490B2 (en) * 2003-01-13 2010-08-17 Koninklijke Philips Electronics N.V. Method of image registration and medical image data processing apparatus

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7778490B2 (en) * 2003-01-13 2010-08-17 Koninklijke Philips Electronics N.V. Method of image registration and medical image data processing apparatus
CN1786980A (en) * 2005-12-08 2006-06-14 上海交通大学 Melthod for realizing searching new position of person's face feature point by tow-dimensional profile

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103914676B (en) * 2012-12-30 2017-08-25 杭州朗和科技有限公司 A kind of method and apparatus used in recognition of face
CN103914676A (en) * 2012-12-30 2014-07-09 杭州朗和科技有限公司 Method and apparatus for use in face recognition
CN103475876B (en) * 2013-08-27 2016-06-22 北京工业大学 A kind of low bit rate compression image super-resolution rebuilding method based on study
CN103475876A (en) * 2013-08-27 2013-12-25 北京工业大学 Learning-based low-bit-rate compression image super-resolution reconstruction method
CN107182218A (en) * 2015-12-31 2017-09-19 深圳先进技术研究院 A kind of authentication method and device
CN106778578A (en) * 2016-12-06 2017-05-31 浙江水马环保科技有限公司 Water purifier method for identifying ID
CN106778577A (en) * 2016-12-06 2017-05-31 浙江水马环保科技有限公司 Water purifier user's personal identification method
CN107506697A (en) * 2017-07-29 2017-12-22 广东欧珀移动通信有限公司 Anti-fake processing method and related product
US11151398B2 (en) 2017-07-29 2021-10-19 Guangdong Oppo Mobile Telecommunications Corp., Ltd. Anti-counterfeiting processing method, electronic device, and non-transitory computer-readable storage medium
CN107945102A (en) * 2017-10-23 2018-04-20 深圳市朗形网络科技有限公司 A kind of picture synthetic method and device
CN108492344A (en) * 2018-03-30 2018-09-04 中国科学院半导体研究所 A kind of portrait-cartoon generation method
CN108875556A (en) * 2018-04-25 2018-11-23 北京旷视科技有限公司 Method, apparatus, system and the computer storage medium veritified for the testimony of a witness
CN108615014A (en) * 2018-04-27 2018-10-02 京东方科技集团股份有限公司 A kind of detection method of eye state, device, equipment and medium
CN108615014B (en) * 2018-04-27 2022-06-21 京东方科技集团股份有限公司 Eye state detection method, device, equipment and medium
CN111028354A (en) * 2018-10-10 2020-04-17 成都理工大学 Image sequence-based model deformation human face three-dimensional reconstruction scheme
CN111259711A (en) * 2018-12-03 2020-06-09 北京嘀嘀无限科技发展有限公司 Lip movement identification method and system
CN109855739A (en) * 2019-01-04 2019-06-07 三峡大学 Power equipment infrared measurement of temperature method and device based on affine transformation
CN109948555B (en) * 2019-03-21 2020-11-06 于建岗 Face super-resolution identification method based on video stream
CN109948555A (en) * 2019-03-21 2019-06-28 于建岗 Human face super-resolution recognition methods based on video flowing
CN112464699A (en) * 2019-09-06 2021-03-09 富士通株式会社 Image normalization method, system and readable medium for face analysis
CN111028251A (en) * 2019-12-27 2020-04-17 四川大学 Dental picture cutting method, system, equipment and storage medium
CN113643187A (en) * 2020-04-27 2021-11-12 京东方科技集团股份有限公司 Image restoration method and device
CN113643187B (en) * 2020-04-27 2024-08-13 京东方科技集团股份有限公司 Image restoration method and device
CN112184593A (en) * 2020-10-14 2021-01-05 北京字跳网络技术有限公司 Key point determination method, device, equipment and computer readable medium
CN112308043A (en) * 2020-11-26 2021-02-02 腾讯科技(深圳)有限公司 Image processing method, image processing apparatus, and computer-readable storage medium
CN114972008A (en) * 2021-11-04 2022-08-30 华为技术有限公司 Coordinate restoration method and device and related equipment

Also Published As

Publication number Publication date
CN102663361B (en) 2014-01-01

Similar Documents

Publication Publication Date Title
CN102663361A (en) Face image reversible geometric normalization method facing overall characteristics analysis
CN107154023B (en) Based on the face super-resolution reconstruction method for generating confrontation network and sub-pix convolution
CN111047548B (en) Attitude transformation data processing method and device, computer equipment and storage medium
CN105741252B (en) Video image grade reconstruction method based on rarefaction representation and dictionary learning
CN101877143B (en) Three-dimensional scene reconstruction method of two-dimensional image group
CN103345736B (en) A kind of virtual viewpoint rendering method
KR101028628B1 (en) Image texture filtering method, storage medium of storing program for executing the same and apparatus performing the same
Kasem et al. Spatial transformer generative adversarial network for robust image super-resolution
CN112906675B (en) Method and system for detecting non-supervision human body key points in fixed scene
CN102402784A (en) Human face image super-resolution method based on nearest feature line manifold learning
CN116958437A (en) Multi-view reconstruction method and system integrating attention mechanism
Zhang et al. Color-guided depth image recovery with adaptive data fidelity and transferred graph Laplacian regularization
CN107491747A (en) Face Forecasting Methodology based on regression analysis and wavelet transformation
Mishra et al. Self-FuseNet: data free unsupervised remote sensing image super-resolution
Ma et al. Recovering realistic details for magnification-arbitrary image super-resolution
Almasri et al. Rgb guided thermal super-resolution enhancement
CN110310269B (en) Light field image quality evaluation method based on polar plane multi-scale Gabor characteristic similarity
Cheng et al. Face super-resolution through dual-identity constraint
CN109559278B (en) Super resolution image reconstruction method and system based on multiple features study
CN107481196B (en) Feature transformation face super-resolution reconstruction method based on nearest feature line
Zhuang et al. Dimensional transformation mixer for ultra-high-definition industrial camera dehazing
Li et al. Point-Based Neural Scene Rendering for Street Views
Wang Single image super-resolution with u-net generative adversarial networks
CN115272450A (en) Target positioning method based on panoramic segmentation
CN105096352A (en) Significance-driven depth image compression method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211020

Address after: 261021 Room 201, 2 / F, 68 creative space, 200m West Road, intersection of Fushou West Street and Heping Road, Weicheng District, Weifang City, Shandong Province

Patentee after: Shandong Wangyuan Information Technology Co.,Ltd.

Address before: 100124 No. 100 Chaoyang District Ping Tian Park, Beijing

Patentee before: Beijing University of Technology

TR01 Transfer of patent right
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20140101

CF01 Termination of patent right due to non-payment of annual fee