CN108345881B - Document quality detection method based on computer vision - Google Patents


Info

Publication number
CN108345881B
Authority
CN
China
Prior art keywords
image
document
point
detected
edge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810101325.7A
Other languages
Chinese (zh)
Other versions
CN108345881A (en)
Inventor
郭文忠
张融
柯逍
陈羽中
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuzhou University
Original Assignee
Fuzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuzhou University
Priority to CN201810101325.7A
Publication of CN108345881A
Application granted
Publication of CN108345881B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G06V 10/24 Aligning, centring, orientation detection or correction of the image
    • G06V 10/243 Aligning, centring, orientation detection or correction of the image by compensating for image skew or non-uniform image deformations
    • G06V 10/34 Smoothing or thinning of the pattern; Morphological operations; Skeletonisation
    • G06V 10/40 Extraction of image or video features
    • G06V 10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V 10/48 Extraction of image or video features by mapping characteristic values of the pattern into a parameter space, e.g. Hough transformation
    • G06V 10/98 Detection or correction of errors, e.g. by rescanning the pattern or by human intervention; Evaluation of the quality of the acquired patterns
    • G06V 10/993 Evaluation of the quality of the acquired pattern
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/40 Document-oriented image-based pattern recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a document quality detection method based on computer vision, proposed to address the difficulty, low efficiency, poor reliability, and strong subjectivity of traditional manual visual inspection. To detect document quality accurately, the method first extracts static frames from a high-speed video of document printing; second, applies suitable computer-vision preprocessing to the document image; third, performs accurate contour detection and extraction on the preprocessed document image; then applies tilt correction to the extracted document contour image so that the distorted image becomes a normally flattened document image to be inspected; and finally evaluates the image with PSNR and MSE quality metrics, comparing it against a template to obtain the document quality result. The method is efficient, reliable, continuous, and flexible, and has strong practical applicability.

Description

Document quality detection method based on computer vision
Technical Field
The invention relates to the field of computer vision and digital image processing, and in particular to a document quality detection method based on computer vision.
Background
With the rapid development of society, industry, and the service sector in the twenty-first century, modern printing has become highly automated. As competition among enterprises intensifies, requirements on the printed appearance quality of documents keep rising, the volume of documents grows, and the sorting and organization of huge numbers of documents becomes burdensome. Under the time and cost pressure facing modern printing enterprises, traditional visual inspection can only describe printing quality qualitatively; it cannot quantify it rapidly and at scale, nor classify documents quickly and accurately according to given characteristics. Under these circumstances, intelligent systems that detect and classify documents efficiently and accurately are urgently needed.
In recent years, computer vision technology has developed rapidly, and detection technologies based on it are widely used in many practical application systems. Computer vision replaces the human observer with industrial cameras and computers, identifying, tracking, and detecting targets without human intervention; in short, machines process and analyze images in place of the human eye, with the advantages of efficiency, reliability, continuity, and flexibility. In batch document printing for modern industry and services, limited printer precision, substandard printing materials, complicated printing processes, and environmental factors mean that printed documents such as insurance policies may suffer a wide variety of quality defects, for example ink smearing, pinholes, wrinkling, blurred text, missing text, blemishes, scratches, ghosting, misalignment, and color misregistration. Such defects become a development bottleneck for industrial enterprises and weaken their competitiveness in the service industry. Meanwhile, because manual inspection is far less efficient than a machine, rapid automatic detection and classification of documents has become a necessary trend.
Disclosure of Invention
The invention aims to provide a computer-vision-based document quality detection method that overcomes the defects of existing manual inspection.
In order to achieve the purpose, the technical scheme of the invention is as follows: a document quality detection method based on computer vision is realized according to the following steps:
step S1: extracting static frames from an input high-speed video sequence to obtain the document images that need detection and processing;
step S2: preprocessing the extracted static frame with noise reduction and edge reinforcement, eliminating environmental interference;
step S3: performing document contour detection on the image preprocessed in step S2 and extracting the document image region according to the contour;
step S4: applying a spatial-transformation tilt correction to the document image so that it becomes a flat, vertically viewed document image;
step S5: comparing the document image identified and located in step S4 against the template with a quality evaluation algorithm to detect the quality level of the document.
In an embodiment of the present invention, in the step S1, the document image to be detected is extracted by:
step S11: for the input high-speed video captured under the printer, select a preset frame interval I and extract one image every I frames, converting the input video into an image stream;
step S12: perform still-frame screening on the image stream: subtract each pair of consecutive images so that identical regions become pixel value 255 and differing regions become 0;
step S13: count the number N of 0-valued pixels in the subtracted image; if N is smaller than a preset threshold Y, the current frame is a static frame; otherwise discard the frame and continue with step S12.
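Steps S11 to S13 can be sketched as follows. This is a minimal pure-NumPy illustration, not the patented implementation: the default frame interval and the thresholds standing in for I and Y are arbitrary assumptions, and the comparison counts changed pixels directly rather than producing a 255/0 difference image (the N < Y test is equivalent).

```python
import numpy as np

def is_static_pair(prev_gray, cur_gray, pixel_tol=10, count_threshold=500):
    """Steps S12-S13: difference two sampled frames; the pair is 'static'
    when the number N of changed pixels stays below the threshold Y."""
    diff = np.abs(prev_gray.astype(np.int32) - cur_gray.astype(np.int32))
    n_changed = int(np.count_nonzero(diff > pixel_tol))   # N in the patent
    return n_changed < count_threshold                    # N < Y -> static

def static_frames(frames, interval=5, **kw):
    """Step S11: sample every `interval`-th frame, keep the static ones."""
    sampled = frames[::interval]
    return [cur for prev, cur in zip(sampled, sampled[1:])
            if is_static_pair(prev, cur, **kw)]
```

In practice the frames would come from a video capture loop (for example OpenCV's VideoCapture) rather than an in-memory list.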
In an embodiment of the present invention, in the step S2, the still frame is preprocessed by the following steps:
step S21: carrying out image smoothing treatment on an image to be detected through a Mean Shift algorithm, wherein the Mean Shift algorithm comprises the following steps:
s211: firstly, randomly selecting a pixel point in a picture as a central point;
s212: calculate the mean shift vector M(x) of the centre point:
M(x) = (1/k) · Σ_{xi ∈ Sh} (xi − x)
where x is the centre point (given by its abscissa and ordinate), Sh is the high-dimensional sphere region of radius h around x, and k is the number of the n sample points xi that fall inside Sh;
s213: move the point to its offset mean: x_{t+1} = x_t + M_t, where M_t is the offset mean obtained in state t, x_t is the centre in state t, and x_{t+1} is the centre in the next state; then take the moved point as the new starting point and continue moving until a convergence condition is met;
step S22: performing morphological operation on an image to be detected, namely performing the process of firstly corroding and then expanding the image, wherein the mathematical expression of the process is as follows:
dst=open(src,element)=dilate(erode(src,element))
where src is the input image, element is the defined kernel, erode is the erosion operation, and dilate is the dilation operation;
step S23: binarize the image to be detected, selecting a threshold to convert the grayscale image into a binary image.
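The Mean Shift iteration of step S21 (the offset mean M(x) and the update x_{t+1} = x_t + M_t) can be sketched on a generic point set. In practice image smoothing would use a dedicated routine such as OpenCV's pyrMeanShiftFiltering; this pure-NumPy version only illustrates the formula, with a flat (unweighted) kernel and arbitrary convergence tolerances assumed.

```python
import numpy as np

def mean_shift_offset(x, samples, h):
    """M(x) = (1/k) * sum over x_i in S_h of (x_i - x): the average offset
    of the k samples that fall inside the radius-h sphere S_h around x."""
    x = np.asarray(x, dtype=float)
    samples = np.asarray(samples, dtype=float)
    inside = np.linalg.norm(samples - x, axis=1) <= h   # membership in S_h
    k = int(np.count_nonzero(inside))
    if k == 0:
        return np.zeros_like(x)
    return (samples[inside] - x).sum(axis=0) / k

def mean_shift(x0, samples, h, tol=1e-3, max_iter=100):
    """Step S213: move the centre to x + M(x) until the shift is tiny."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        m = mean_shift_offset(x, samples, h)
        x = x + m                      # x_{t+1} = x_t + M_t
        if np.linalg.norm(m) < tol:
            break
    return x
```

Starting from any point near a cluster, the iteration converges to the cluster's local density maximum, which for a single tight cluster is its mean.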
In an embodiment of the present invention, in the step S3, the specific implementation steps of performing the document contour detection on the image preprocessed in the step S2 are as follows:
step S31: perform contour detection on the image to be detected with the Canny operator; the Canny convolution operators are
Sx = [[-1, 1], [-1, 1]], Sy = [[1, 1], [-1, -1]]
where Sx is the template for calculating the partial derivative in the x direction and Sy the template for the y direction; the first-order partial derivative matrices in the x-axis and y-axis directions, the gradient magnitude, and the gradient direction have the mathematical expressions:
P[i,j] = (f[i,j+1] - f[i,j] + f[i+1,j+1] - f[i+1,j]) / 2
Q[i,j] = (f[i,j] - f[i+1,j] + f[i,j+1] - f[i+1,j+1]) / 2
M[i,j] = sqrt(P[i,j]^2 + Q[i,j]^2)
θ[i,j] = arctan(Q[i,j] / P[i,j])
where i and j are array subscripts, f is the image gray value, P is the gradient component in the X-axis direction, Q the gradient component in the Y-axis direction, M the magnitude at the point, and θ the gradient direction (an angle);
then, according to the computed gradient magnitudes, perform non-maximum suppression, and then detect and connect edges with a double-threshold algorithm;
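The 2x2 difference templates and the magnitude/direction formulas of step S31 translate directly into NumPy array slicing. This sketch computes only the P, Q, M, θ fields; the subsequent non-maximum suppression and double thresholding are omitted.

```python
import numpy as np

def canny_gradients(f):
    """First-order differences over 2x2 neighbourhoods, as in the text:
    P, Q are the x/y gradient components, M the magnitude, theta the angle.
    Output arrays are one pixel smaller than f in each direction."""
    f = np.asarray(f, dtype=float)
    # P[i,j] = (f[i,j+1] - f[i,j] + f[i+1,j+1] - f[i+1,j]) / 2
    P = (f[:-1, 1:] - f[:-1, :-1] + f[1:, 1:] - f[1:, :-1]) / 2.0
    # Q[i,j] = (f[i,j] - f[i+1,j] + f[i,j+1] - f[i+1,j+1]) / 2
    Q = (f[:-1, :-1] - f[1:, :-1] + f[:-1, 1:] - f[1:, 1:]) / 2.0
    M = np.hypot(P, Q)                 # M = sqrt(P^2 + Q^2)
    theta = np.arctan2(Q, P)           # theta = arctan(Q / P), sign-aware
    return P, Q, M, theta
```

A vertical step edge (constant columns, values jumping from 0 to 1) yields a pure x gradient: P = 1, Q = 0, M = 1, θ = 0.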
step S32: extract all contour edges from the binary image with a contour search function, and apply a deletion algorithm: compute the areas of all connected components, then delete those whose area is below a preset threshold, leaving only the contour map of the document;
step S33: detect straight line segments in the contour map using the progressive probabilistic Hough transform, as follows:
s331: map the image coordinates to polar coordinates; the line expression is
r = α·cos λ + β·sin λ
where r is the distance from the origin to the line, α is the X-axis coordinate, β is the Y-axis coordinate, and λ is the angle of the line's normal with the X axis;
s332: randomly extract a feature point (an edge point) from the image; if it is already marked as a point on some line, continue extracting randomly from the remaining edge points until all edge points are exhausted;
s333: apply the Hough transform to the point and update the accumulator;
s334: select the cell with the maximum value in Hough space; if it is larger than a threshold, go to step S335, otherwise return to step S332;
s335: starting from the point corresponding to the maximum obtained by the Hough transform, move along the line direction to find the two end points of the line;
s336: compute the length of the line segment; if it is greater than a certain threshold, record the segment as a good edge segment, place it in the Lines array, and return to step S332.
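The voting scheme behind steps S331 to S334 can be illustrated with a tiny standard (non-probabilistic) Hough accumulator. The bin counts and radius bound here are arbitrary assumptions; a real implementation would use the probabilistic variant (for example OpenCV's HoughLinesP), which additionally samples edge points at random and drops points already assigned to a detected line.

```python
import numpy as np

def hough_peak(edge_points, angle_bins=180, r_max=50):
    """Vote each edge point (alpha, beta) into (r, lambda) space using
    r = alpha*cos(lambda) + beta*sin(lambda), then return the strongest line."""
    thetas = np.linspace(0.0, np.pi, angle_bins, endpoint=False)
    acc = np.zeros((2 * r_max, angle_bins), dtype=int)   # rows cover r in [-r_max, r_max)
    for alpha, beta in edge_points:
        r = alpha * np.cos(thetas) + beta * np.sin(thetas)
        r_idx = np.round(r).astype(int) + r_max          # shift so negative r fits
        ok = (r_idx >= 0) & (r_idx < 2 * r_max)
        acc[r_idx[ok], np.arange(angle_bins)[ok]] += 1   # one vote per angle bin
    r_idx, t_idx = np.unravel_index(np.argmax(acc), acc.shape)
    return r_idx - r_max, thetas[t_idx], int(acc.max())
```

Twenty collinear points on the horizontal line y = 5 all vote into the same (r = 5, λ ≈ π/2) cell, so the peak collects all twenty votes.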
In an embodiment of the present invention, in the step S4, the specific implementation steps of performing the tilt correction on the document image in the spatial transformation are as follows:
step S51: the edge of the document image is a quadrilateral divided into upper, lower, left, and right edges A, B, C, D respectively; a straight line is determined by its two end points; using the two end-point coordinates, classify all the straight line segments in the Lines array approximately onto the four edges A, B, C, D. For the corner point a at the upper left of the document image, take the point closest to the upper-left corner among the left end point of A and the upper end point of C; for point b at the upper right, take the point closest to the upper-right corner among the right end point of A and the upper end point of D; for point c at the lower left, take the point closest to the lower-left corner among the lower end point of C and the left end point of B; for point d at the lower right, take the point closest to the lower-right corner among the lower end point of D and the right end point of B; this yields the four corner points a, b, c, d of the document image;
step S52: then use a perspective spatial transformation to correct the tilt of the distorted document image: from the four corner points found in step S51, obtain through a perspective-transform matrix function the perspective transformation matrix that maps the distorted quadrilateral to a rectangle, and then perform the spatial transformation through a perspective warping function to complete the tilt correction.
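The perspective matrix of step S52 can be obtained by solving the standard eight-equation linear system for a homography with its bottom-right entry fixed to 1. This sketch assumes the four source/destination corner correspondences are already known and non-degenerate; in OpenCV the same roles are played by getPerspectiveTransform and warpPerspective.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve the 8x8 linear system for the homography H (h33 = 1) that maps
    the four source corners a, b, c, d onto the four destination corners."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two rows of the system A h = b.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.asarray(A, float), np.asarray(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def warp_point(H, x, y):
    """Apply H to a single point with homogeneous normalisation."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]
```

Mapping a unit square to a square of side 2 gives a pure scaling homography, so the centre (0.5, 0.5) lands on (1, 1).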
In an embodiment of the present invention, the specific implementation steps of step S5 are as follows:
step S61: measure the quality level of the document image with the peak signal-to-noise ratio (PSNR), applied between the document image to be detected and the template image; the larger the PSNR between the two images, the more similar they are; a threshold ps is preset as the measuring reference. The PSNR equation is:
PSNR = 10 · log10(MAX^2 / MSE)
where MAX is the maximum value of the image color; for 8-bit samples MAX = 255, so MAX^2 = 255 * 255;
MSE is the mean squared error between the current image P1 and the reference image P2:
MSE = (1 / (m·n)) · Σ_{I=0}^{m-1} Σ_{J=0}^{n-1} [K(I,J) - L(I,J)]^2
where m and n are the height and width of the image, I indexes the current row and J the current column, K(I,J) is the pixel of the current image P1 at row I and column J, and L(I,J) the pixel of the reference image P2 at row I and column J;
step S62: substitute the image array to be detected as K(I,J) and the template image array as L(I,J); compute the PSNR value pr of the document image to be detected; if pr is larger than the threshold ps, the quality of the document to be detected meets the requirement; otherwise it is unqualified.
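The MSE and PSNR formulas of steps S61 and S62 in NumPy. The pass threshold of 30 dB used in the helper is an arbitrary assumption, since the patent leaves ps as a preset value.

```python
import numpy as np

def mse(k, l):
    """Mean squared error between the image under test K and template L."""
    k = np.asarray(k, dtype=float)
    l = np.asarray(l, dtype=float)
    return float(np.mean((k - l) ** 2))

def psnr(k, l, max_val=255.0):
    """PSNR = 10 * log10(MAX^2 / MSE); infinite for identical images."""
    e = mse(k, l)
    return float('inf') if e == 0 else 10.0 * np.log10(max_val ** 2 / e)

def document_ok(image, template, ps=30.0):
    """Step S62: pass the document when its PSNR pr exceeds threshold ps."""
    return psnr(image, template) > ps
```

Two flat images differing by 10 gray levels give MSE = 100 and PSNR = 10·log10(65025/100) ≈ 28.13 dB, which fails a 30 dB threshold.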
Compared with the prior art, the invention has the following beneficial effects: the method comprehensively accounts for interference from environmental factors, improving the accuracy and reliability of document localization through noise-reduction preprocessing and the deletion of small-area connected components. The document contour is detected and extracted with the Canny operator and a contour search function. Straight line segments are then extracted by the progressive probabilistic Hough transform, and the coordinates of the four corner points of the document are located by the designed algorithm. The deformed document image is subsequently stretched into a normal document image by perspective-transform tilt correction. Finally, document quality is assessed with the PSNR and MSE evaluation method. The patent applies and optimizes computer-vision algorithms to accurately locate, identify, and quality-check documents. The method is simple, flexible to implement, and highly practical.
Drawings
FIG. 1 is a flow chart of a computer vision-based document quality detection method according to the present invention.
Detailed Description
The technical scheme of the invention is specifically explained below with reference to the accompanying drawings.
As shown in FIG. 1, the present invention provides a document quality detection method based on computer vision, proposed to address the difficulty, low efficiency, poor reliability, and strong subjectivity of traditional manual visual inspection. To detect document quality accurately, the method first extracts static frames from a high-speed video of document printing; second, applies suitable computer-vision preprocessing to the document image; third, performs accurate contour detection and extraction on the preprocessed document image; then applies tilt correction to the extracted document contour image so that the distorted image becomes a normally flattened document image to be inspected; and finally evaluates the image with PSNR and MSE quality metrics against the template to obtain the document quality result. The method comprises the following specific steps:
step S1: extracting static frames from an input high-speed video sequence to obtain the document images that need detection and processing;
step S2: preprocessing the extracted static frame with noise reduction and edge reinforcement, eliminating environmental interference;
step S3: performing document contour detection on the image preprocessed in step S2 and extracting the document image region according to the contour;
step S4: applying a spatial-transformation tilt correction to the document image so that it becomes a flat, vertically viewed document image;
step S5: comparing the document image identified and located in step S4 against the template with a quality evaluation algorithm to detect the quality level of the document.
In the step S1, a document image to be detected is extracted by:
step S11: for the input high-speed video captured under the printer, select a preset frame interval I and extract one image every I frames, converting the input video into an image stream;
step S12: perform still-frame screening on the image stream: subtract each pair of consecutive images so that identical regions become pixel value 255 (white) and differing regions become 0 (black);
step S13: count the number N of 0-valued (black) pixels in the subtracted image; if N is smaller than a preset threshold Y, the current frame is a static frame; otherwise discard the frame and continue with step S12.
In the step S2, the still frame is preprocessed by:
step S21: carrying out image smoothing treatment on an image to be detected through a Mean Shift algorithm, wherein the Mean Shift algorithm comprises the following steps:
s211: firstly, randomly selecting a pixel point in a picture as a central point;
s212: calculate the mean shift vector M(x) of the centre point:
M(x) = (1/k) · Σ_{xi ∈ Sh} (xi − x)
where x is the centre point (given by its abscissa and ordinate), Sh is the high-dimensional sphere region of radius h around x, and k is the number of the n sample points xi that fall inside Sh;
s213: move the point to its offset mean: x_{t+1} = x_t + M_t, where M_t is the offset mean obtained in state t, x_t is the centre in state t, and x_{t+1} is the centre in the next state; then take the moved point as the new starting point and continue moving until a convergence condition is met;
step S22: performing morphological operation on an image to be detected, namely performing the process of firstly corroding and then expanding the image, wherein the mathematical expression of the process is as follows:
dst=open(src,element)=dilate(erode(src,element))
where src is the input image, element is the defined kernel, erode is the erosion operation, and dilate is the dilation operation;
step S23: binarize the image to be detected, selecting a threshold to convert the grayscale image into a binary image.
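The opening operation of step S22 (erode, then dilate) can be sketched for binary images with plain array shifts. The kernel size of 3 is an assumption; real code would typically call OpenCV's morphologyEx with MORPH_OPEN instead of this pure-NumPy illustration.

```python
import numpy as np

def erode(img, k=3):
    """Binary erosion with a k x k square kernel: a pixel survives only if
    its entire neighbourhood is foreground."""
    pad = k // 2
    p = np.pad(img.astype(bool), pad, constant_values=False)
    out = np.ones_like(img, dtype=bool)
    for di in range(k):
        for dj in range(k):
            out &= p[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out

def dilate(img, k=3):
    """Binary dilation: a pixel turns on if any neighbour is foreground."""
    pad = k // 2
    p = np.pad(img.astype(bool), pad, constant_values=False)
    out = np.zeros_like(img, dtype=bool)
    for di in range(k):
        for dj in range(k):
            out |= p[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out

def opening(img, k=3):
    """dst = dilate(erode(src)): removes specks smaller than the kernel."""
    return dilate(erode(img, k), k)
```

An isolated noise pixel is erased by the opening, while a solid 5x5 block survives unchanged, which is exactly why the patent applies it before contour detection.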
In step S3, the specific implementation steps of detecting the document contour of the image preprocessed in step S2 are as follows:
step S31: perform contour detection on the image to be detected with the Canny operator; the Canny convolution operators are
Sx = [[-1, 1], [-1, 1]], Sy = [[1, 1], [-1, -1]]
where Sx is the template for calculating the partial derivative in the x direction and Sy the template for the y direction; the first-order partial derivative matrices in the x-axis and y-axis directions, the gradient magnitude, and the gradient direction have the mathematical expressions:
P[i,j] = (f[i,j+1] - f[i,j] + f[i+1,j+1] - f[i+1,j]) / 2
Q[i,j] = (f[i,j] - f[i+1,j] + f[i,j+1] - f[i+1,j+1]) / 2
M[i,j] = sqrt(P[i,j]^2 + Q[i,j]^2)
θ[i,j] = arctan(Q[i,j] / P[i,j])
where i and j are array subscripts, f is the image gray value, P is the gradient component in the X-axis direction, Q the gradient component in the Y-axis direction, M the magnitude at the point, and θ the gradient direction (an angle);
then, according to the computed gradient magnitudes, perform non-maximum suppression, and then detect and connect edges with a double-threshold algorithm;
step S32: extract all contour edges from the binary image with a contour search function, and apply a deletion algorithm: compute the areas of all connected components, then delete those whose area is below a preset threshold, leaving only the contour map of the document;
step S33: detect straight line segments in the contour map using the progressive probabilistic Hough transform, as follows:
s331: map the image coordinates to polar coordinates; the line expression is
r = α·cos λ + β·sin λ
where r is the distance from the origin to the line, α is the X-axis coordinate, β is the Y-axis coordinate, and λ is the angle of the line's normal with the X axis;
s332: randomly extract a feature point (an edge point) from the image; if it is already marked as a point on some line, continue extracting randomly from the remaining edge points until all edge points are exhausted;
s333: apply the Hough transform to the point and update the accumulator;
s334: select the cell with the maximum value in Hough space; if it is larger than a threshold, go to step S335, otherwise return to step S332;
s335: starting from the point corresponding to the maximum obtained by the Hough transform, move along the line direction to find the two end points of the line;
s336: compute the length of the line segment; if it is greater than a certain threshold, record the segment as a good edge segment, place it in the Lines array, and return to step S332.
In step S4, the specific implementation steps of performing the tilt correction on the document image in the spatial transformation are as follows:
step S51: the edge of the document image is a quadrilateral divided into upper, lower, left, and right edges A, B, C, D respectively; a straight line is determined by its two end points; using the two end-point coordinates, classify all the straight line segments in the Lines array approximately onto the four edges A, B, C, D. For the corner point a at the upper left of the document image, take the point closest to the upper-left corner among the left end point of A and the upper end point of C; for point b at the upper right, take the point closest to the upper-right corner among the right end point of A and the upper end point of D; for point c at the lower left, take the point closest to the lower-left corner among the lower end point of C and the left end point of B; for point d at the lower right, take the point closest to the lower-right corner among the lower end point of D and the right end point of B; this yields the four corner points a, b, c, d of the document image;
step S52: then use a perspective spatial transformation to correct the tilt of the distorted document image: from the four corner points found in step S51, obtain through a perspective-transform matrix function the perspective transformation matrix that maps the distorted quadrilateral to a rectangle, and then perform the spatial transformation through a perspective warping function to complete the tilt correction.
The specific implementation steps of step S5 are as follows:
step S61: measure the quality level of the document image with the peak signal-to-noise ratio (PSNR), applied between the document image to be detected and the template image; the larger the PSNR between the two images, the more similar they are; a threshold ps is preset as the measuring reference. The PSNR equation is:
PSNR = 10 · log10(MAX^2 / MSE)
where MAX is the maximum value of the image color; for 8-bit samples MAX = 255, so MAX^2 = 255 * 255;
MSE is the mean squared error between the current image P1 and the reference image P2:
MSE = (1 / (m·n)) · Σ_{I=0}^{m-1} Σ_{J=0}^{n-1} [K(I,J) - L(I,J)]^2
where m and n are the height and width of the image, I indexes the current row and J the current column, K(I,J) is the pixel of the current image P1 at row I and column J, and L(I,J) the pixel of the reference image P2 at row I and column J;
step S62: substitute the image array to be detected as K(I,J) and the template image array as L(I,J); compute the PSNR value pr of the document image to be detected; if pr is larger than the threshold ps, the quality of the document to be detected meets the requirement; otherwise it is unqualified.
The above are preferred embodiments of the present invention; all equivalent changes and modifications made according to the technical scheme of the present invention, provided they do not exceed the scope of the technical scheme, fall within the protection scope of the present invention.

Claims (3)

1. A document quality detection method based on computer vision is characterized by comprising the following steps:
step S1: extracting static frames from an input high-speed video sequence captured under a printer to obtain the document images needing detection and processing;
step S2: preprocessing the extracted static frame with noise reduction and edge reinforcement, eliminating environmental interference;
step S3: performing document contour detection on the image preprocessed in step S2 and extracting the document image region according to the contour;
step S4: applying a spatial-transformation tilt correction to the document image so that it becomes a flat, vertically viewed document image;
step S5: comparing the document image identified and located in step S4 against the template with a quality evaluation algorithm to detect the quality level of the document;
in the step S2, the still frame is preprocessed by:
step S21: carrying out image smoothing treatment on an image to be detected through a Mean Shift algorithm, wherein the Mean Shift algorithm comprises the following steps:
s211: firstly, randomly selecting a pixel point in a picture as a central point;
s212: calculating the offset mean M(x0) of the center point:

M(x0) = (1/k) * Σ_{xi ∈ Sh} (xi − x0)

where x0 represents the coordinates of the center point, Sh is the high-dimensional sphere region of radius h, and k is the number of the n sample points that fall inside the ball Sh;
s213: moving the point to its offset mean: x_{t+1} = M_t + x_t, where M_t is the offset mean obtained in state t, x_t is the center in state t, and x_{t+1} is the center in the next state; then, taking the shifted point as the new starting point, the movement continues until a convergence condition is met;
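Steps S211–S213 amount to the iteration sketched below on synthetic 2-D points. This is a hedged illustration of the shift formula only; in practice the image smoothing of step S21 would be done per pixel in joint spatial-color space, e.g. via OpenCV's pyrMeanShiftFiltering.

```python
import numpy as np

def mean_shift(points, x0, h, max_iter=50, tol=1e-3):
    """Iterate x_{t+1} = x_t + M_t until the shift M_t is negligible (S212-S213).
    points: (n, d) sample points; x0: starting center; h: radius of the ball S_h."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        in_ball = np.linalg.norm(points - x, axis=1) <= h  # the k samples inside S_h
        if not in_ball.any():
            break
        M = (points[in_ball] - x).mean(axis=0)             # offset mean M(x)
        x = x + M                                          # move the center by its shift
        if np.linalg.norm(M) < tol:                        # convergence condition met
            break
    return x

# a tight synthetic cluster around (5, 5); the center should drift into it
rng = np.random.default_rng(0)
pts = rng.normal(loc=[5.0, 5.0], scale=0.3, size=(200, 2))
centre = mean_shift(pts, x0=[4.0, 4.0], h=2.0)
```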
step S22: performing a morphological opening operation on the image to be detected, that is, first eroding and then dilating the image; the mathematical expression of this process is as follows:
dst=open(src,element)=dilate(erode(src,element))
where src is the input image, element is the defined kernel, erode is the erosion operation, and dilate is the dilation operation;
step S23: carrying out binarization processing on the image to be detected: a threshold value is selected to convert the grayscale image into a binary image;
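A minimal NumPy sketch of the opening (erode-then-dilate, step S22) and thresholding (step S23). For the demo the toy image is binarized first so the effect of opening on an isolated speck is visible; a production pipeline would normally call OpenCV's cv2.threshold and cv2.morphologyEx(src, cv2.MORPH_OPEN, element) instead.

```python
import numpy as np

def erode(img, k=3):
    """Binary erosion: a pixel survives only if its whole k x k neighborhood is 1."""
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    out = np.ones_like(img)
    for di in range(k):
        for dj in range(k):
            out &= padded[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out

def dilate(img, k=3):
    """Binary dilation: a pixel becomes 1 if any neighbor in the k x k window is 1."""
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    out = np.zeros_like(img)
    for di in range(k):
        for dj in range(k):
            out |= padded[di:di + img.shape[0], dj:dj + img.shape[1]]
    return out

def open_(img, k=3):
    # dst = open(src, element) = dilate(erode(src, element))
    return dilate(erode(img, k), k)

# toy grey image with one dark speck; threshold at 127 (illustrative value)
grey = np.array([[200, 200, 200, 200],
                 [200,  40, 200, 200],
                 [200, 200, 200, 200],
                 [200, 200, 200, 200]], dtype=np.uint8)
binary = (grey > 127).astype(np.uint8)   # step S23: 0/1 binary image
opened = open_(1 - binary)               # opening removes the isolated 1-pixel speck
```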
in step S3, the specific implementation steps of detecting the document contour of the image preprocessed in step S2 are as follows:
step S31: contour detection is carried out on the image to be detected by adopting the Canny operator; the first-order convolution operators of Canny over a 2×2 neighbourhood are

Sx = (1/2) [[-1, 1], [-1, 1]],  Sy = (1/2) [[1, 1], [-1, -1]]

where Sx is the template for calculating the partial derivative in the x direction and Sy is the template for the partial derivative in the y direction, i.e. the first-order partial derivative matrices in the x-axis and y-axis directions; the mathematical expressions of the gradient amplitude and gradient direction are:
P[i,j]=(f[i,j+1]-f[i,j]+f[i+1,j+1]-f[i+1,j])/2
Q[i,j]=(f[i,j]-f[i+1,j]+f[i,j+1]-f[i+1,j+1])/2
M[i, j] = sqrt(P[i, j]^2 + Q[i, j]^2)
θ[i,j]=arctan(Q[i,j]/P[i,j])
wherein i and j are array subscripts, f is the image gray value, P represents the X-axis gradient amplitude, Q represents the Y-axis gradient amplitude, M is the gradient magnitude at the point, and θ is the gradient direction, namely an angle;
then, according to the calculated gradient amplitude, non-maximum value inhibition is carried out, and then a double-threshold algorithm is used for detecting and connecting edges;
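The 2×2 finite-difference formulas for P, Q, M and θ in step S31 translate directly to NumPy, as sketched below on a synthetic step edge. Non-maximum suppression and the double-threshold edge linking are omitted; cv2.Canny would perform the full pipeline.

```python
import numpy as np

def gradients(f):
    """Finite-difference gradients over 2x2 neighborhoods, as in step S31.
    Returns P (x amplitude), Q (y amplitude), M (magnitude), theta (direction)."""
    f = f.astype(np.float64)
    # P[i,j] = (f[i,j+1] - f[i,j] + f[i+1,j+1] - f[i+1,j]) / 2
    P = (f[:-1, 1:] - f[:-1, :-1] + f[1:, 1:] - f[1:, :-1]) / 2.0
    # Q[i,j] = (f[i,j] - f[i+1,j] + f[i,j+1] - f[i+1,j+1]) / 2
    Q = (f[:-1, :-1] - f[1:, :-1] + f[:-1, 1:] - f[1:, 1:]) / 2.0
    M = np.hypot(P, Q)             # sqrt(P^2 + Q^2)
    theta = np.arctan2(Q, P)       # arctan(Q / P), quadrant-aware
    return P, Q, M, theta

# a vertical step edge: left half 0, right half 100
img = np.zeros((4, 4))
img[:, 2:] = 100.0
P, Q, M, theta = gradients(img)    # M peaks on the edge, theta = 0 (pure x gradient)
```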
step S32: extracting all contour edges from the binary image through a contour search function, and applying a deletion algorithm: the areas of all connected domains are calculated, and connected domains whose area is smaller than a preset threshold are deleted, leaving only the outline of the document;
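The deletion algorithm of step S32 — compute connected-domain areas and drop the small ones — can be sketched with a plain BFS flood fill. The 6×6 mask and the min_area value are illustrative; with OpenCV, cv2.connectedComponentsWithStats does the same job.

```python
import numpy as np
from collections import deque

def filter_small_components(binary, min_area):
    """Keep only 4-connected components whose area reaches min_area (step S32)."""
    h, w = binary.shape
    seen = np.zeros_like(binary, dtype=bool)
    out = np.zeros_like(binary)
    for si in range(h):
        for sj in range(w):
            if binary[si, sj] and not seen[si, sj]:
                comp, q = [], deque([(si, sj)])   # BFS flood fill of one component
                seen[si, sj] = True
                while q:
                    i, j = q.popleft()
                    comp.append((i, j))
                    for ni, nj in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)):
                        if 0 <= ni < h and 0 <= nj < w and binary[ni, nj] and not seen[ni, nj]:
                            seen[ni, nj] = True
                            q.append((ni, nj))
                if len(comp) >= min_area:         # delete small domains, keep the outline
                    for i, j in comp:
                        out[i, j] = 1
    return out

mask = np.zeros((6, 6), dtype=np.uint8)
mask[0, 0] = 1                  # 1-pixel noise blob
mask[2:5, 2:5] = 1              # 9-pixel "document outline" blob
cleaned = filter_small_components(mask, min_area=5)
```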
step S33: detecting straight line segments in the contour map, and adopting cumulative probability Hough transform, wherein the steps are as follows:
s331: mapping the image coordinates to polar coordinates; the expression of a straight line is

r = α * cos λ + β * sin λ

where r is the distance from the origin to the line, α is the X-axis coordinate, β is the Y-axis coordinate, and λ is the angle between the line's normal and the X-axis;
s332: randomly extracting a feature point, namely an edge point, from the image; if the point has already been marked as belonging to a detected straight line, another edge point is randomly extracted from the remaining edge points, until all edge points have been extracted;
s333: performing the Hough transform on the point and updating the accumulator;
s334: selecting the point with the maximum value in the Hough space; if this maximum is larger than a threshold, performing step S335, otherwise returning to step S332;
s335: starting from the maximum obtained by the Hough transform, moving along the direction of the straight line, thereby finding the two end points of the line;
s336: calculating the length of the line; if the length is greater than a certain threshold, the segment is regarded as a good edge line segment and recorded in the Lines array, and the process returns to step S332;
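As a simplified illustration of steps S331–S334, the accumulator below votes in (r, λ) space for r = α·cos λ + β·sin λ on synthetic edge points and returns the strongest bin. This is the plain (non-probabilistic) transform, not the cumulative probability variant described above; the latter corresponds to OpenCV's cv2.HoughLinesP.

```python
import numpy as np

def hough_peak(points, n_angles=180):
    """Accumulate votes for r = x*cos(lam) + y*sin(lam) over a grid of angles
    and return the (r, lam) bin with the most votes (sketch of S331-S334)."""
    angles = np.linspace(0.0, np.pi, n_angles, endpoint=False)
    votes = {}
    for x, y in points:
        for k, lam in enumerate(angles):
            r = int(round(x * np.cos(lam) + y * np.sin(lam)))
            votes[(r, k)] = votes.get((r, k), 0) + 1     # accumulate in Hough space
    (r, k), best = max(votes.items(), key=lambda kv: kv[1])
    return r, angles[k], best

# synthetic edge points lying on the horizontal line y = 3
pts = [(x, 3) for x in range(10)]
r, lam, score = hough_peak(pts)      # all 10 points vote for the same bin
```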
in step S4, the specific implementation steps of performing tilt correction on the document image by spatial transformation are as follows:
step S51: the edge of the document image is a quadrangle, divided into an upper edge, a lower edge, a left edge and a right edge, denoted A, B, C, D respectively; a straight line is determined by its two end points; all straight line segments in the Lines array are classified onto the four edges A, B, C, D according to the coordinates of their two end points; the point a at the upper left corner of the document image is the point closest to the upper left corner among the left end point of A and the upper end point of C; the point b at the upper right corner is the point closest to the upper right corner among the right end point of A and the upper end point of D; the point c at the lower left corner is the point closest to the lower left corner among the lower end point of C and the left end point of B; the point d at the lower right corner is the point closest to the lower right corner among the lower end point of D and the right end point of B; this yields the four corner points a, b, c, d of the document image;
step S52: perspective transformation is then used to correct the skewed document image: from the four corner points found in step S51, the perspective transformation matrix that maps the distorted quadrangle to a rectangle is obtained by the perspective-transformation-matrix function, and the spatial transformation is then performed by the perspective transformation function to complete the tilt correction.
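Step S52's perspective-transformation matrix can be sketched by solving the 8×8 linear system for the homography directly. The corner coordinates and the 200×100 target rectangle below are made-up values; with OpenCV one would call cv2.getPerspectiveTransform and cv2.warpPerspective instead.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 homography H mapping the four document corners a, b, c, d
    (src) onto a rectangle (dst), as the matrix function of step S52 would."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)   # fix H[2,2] = 1

def warp_point(H, x, y):
    """Apply the homography to one point (homogeneous divide)."""
    p = H @ np.array([x, y, 1.0])
    return p[0] / p[2], p[1] / p[2]

# skewed document corners a, b, c, d -> upright 200 x 100 rectangle
corners = [(10, 12), (190, 25), (15, 110), (205, 118)]
target  = [(0, 0), (200, 0), (0, 100), (200, 100)]
H = perspective_matrix(corners, target)
u, v = warp_point(H, 10, 12)         # corner a lands at the origin
u2, v2 = warp_point(H, 205, 118)     # corner d lands at (200, 100)
```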
2. A computer vision-based document quality detection method according to claim 1, wherein in said step S1, the document image to be detected is extracted by:
step S11: for the input high-speed video captured under the printer, a frame interval I is selected and an image is extracted every I frames (the interval being a preset number of frames), converting the input video into an image stream;
step S12: performing still-frame processing on the image stream: two successive images in the stream are subtracted by an image algorithm, so that identical parts become pixel value 255 and differing parts become 0;
step S13: counting the number N of 0-valued pixels in the subtracted image; if N is smaller than a preset threshold Y0, the current frame is a still frame; otherwise, the frame is discarded and the process returns to step S12.
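Claim 2's frame-differencing test (steps S12–S13) is a few lines of NumPy. The 8×8 frames and the threshold y0 are illustrative values.

```python
import numpy as np

def is_still_frame(prev, curr, y0):
    """Subtract consecutive frames (S12) and count differing pixels (S13):
    equal pixels map to 255, differing pixels to 0, and the frame is 'still'
    when the count N of 0-pixels stays below the threshold y0."""
    diff = np.where(prev == curr, 255, 0)
    n = int((diff == 0).sum())
    return n < y0

f1 = np.full((8, 8), 90, dtype=np.uint8)
f2 = f1.copy()
f2[0, :3] = 91                     # only 3 pixels changed between frames
still = is_still_frame(f1, f2, y0=5)
```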
3. The method for detecting document quality based on computer vision of claim 1, wherein the step S5 is implemented by the following steps:
step S61: the quality level of the document image is measured by the PSNR (peak signal-to-noise ratio); the PSNR is computed between the document image to be detected and the template image, and the larger the PSNR value between the 2 images, the more similar they are; a threshold ps is preset as the measuring reference; the PSNR equation is as follows:
PSNR = 10 * log10(MAX^2 / MSE)
where MAX represents the maximum value of the image color; for 8-bit sample points this is 255, so MAX^2 = 255 * 255;
MSE represents the mean squared error between the current image P1 and the reference image P2; the MSE equation is:
MSE = (1 / (m * n)) * Σ_{i=0}^{m-1} Σ_{j=0}^{n-1} [K(i, j) - L(i, j)]^2
where m and n represent the height and width of the image frame, i indexes the rows and j the columns, K(i, j) denotes the pixel value of the current image P1 at row i, column j, and L(i, j) denotes the pixel value of the reference image P2 at row i, column j;
step S62: substituting the image array to be detected for K(i, j) and the template image array for L(i, j), the PSNR value pr of the document image to be detected is calculated; if pr is larger than the threshold ps, the quality level of the document to be detected is qualified; otherwise it is unqualified.
CN201810101325.7A 2018-02-01 2018-02-01 Document quality detection method based on computer vision Active CN108345881B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810101325.7A CN108345881B (en) 2018-02-01 2018-02-01 Document quality detection method based on computer vision

Publications (2)

Publication Number Publication Date
CN108345881A CN108345881A (en) 2018-07-31
CN108345881B true CN108345881B (en) 2021-12-21

Family

ID=62958831

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810101325.7A Active CN108345881B (en) 2018-02-01 2018-02-01 Document quality detection method based on computer vision

Country Status (1)

Country Link
CN (1) CN108345881B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109242787A (en) * 2018-08-15 2019-01-18 南京光辉互动网络科技股份有限公司 It paints in a kind of assessment of middle and primary schools' art input method
CN109389121B (en) * 2018-10-30 2021-11-09 金现代信息产业股份有限公司 Nameplate identification method and system based on deep learning
CN111435531B (en) * 2018-12-25 2023-08-08 Tcl科技集团股份有限公司 Image deflection angle detection method, intelligent device and storage medium
CN109871844A (en) * 2019-01-09 2019-06-11 东南大学 A kind of correction of shooting receipt image text and extracting method
CN109785367B (en) * 2019-01-21 2019-12-20 视辰信息科技(上海)有限公司 Method and device for filtering foreign points in three-dimensional model tracking
CN110033471B (en) * 2019-04-19 2022-09-13 福州大学 Frame line detection method based on connected domain analysis and morphological operation
CN110032989B (en) * 2019-04-23 2022-07-08 福州大学 Table document image classification method based on frame line characteristics and pixel distribution
US11145064B2 (en) 2019-11-27 2021-10-12 Cimpress Schweiz Gmbh Technologies for detecting crop marks in electronic documents
CN111108515B (en) * 2019-12-27 2021-11-12 威创集团股份有限公司 Picture target point correcting method, device and equipment and storage medium
CN111488931B (en) * 2020-04-10 2023-04-07 腾讯科技(深圳)有限公司 Article quality evaluation method, article recommendation method and corresponding devices
CN112131841A (en) * 2020-08-27 2020-12-25 北京云动智效网络科技有限公司 Document quality evaluation method and system
CN112200789B (en) * 2020-10-16 2023-11-21 中国铁道科学研究院集团有限公司 Image recognition method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004109586A1 (en) * 2003-06-05 2004-12-16 Aware, Inc. Image quality control techniques
JP2011239066A (en) * 2010-05-07 2011-11-24 Sony Corp Device and method of processing image
CN102496018A (en) * 2011-12-08 2012-06-13 方正国际软件有限公司 Document skew detection method and system
CN103839058A (en) * 2012-11-21 2014-06-04 方正国际软件(北京)有限公司 Information locating method for document image based on standard template

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Mean Squared Error: Love It or Leave It?; Zhou Wang et al.; IEEE Signal Processing Magazine; 2009-01-31; p. 99 *
Vehicle detection method based on mean shift clustering; Linhui Li et al.; Journal of Intelligent & Fuzzy Systems; 2016-12-31; pp. 1359-1360 *
Vehicle Recognition Algorithm Based on Intelligent Detection of Parking Space Information; Wang Jingdan et al.; Computer Science and Application; 2013-12-31; pp. 97-99 *
OMR Image Skew Correction Based on Edge Detection; Gao Xiaobo et al.; Journal of Changchun University of Science and Technology (Natural Science Edition); 2011-03-31; Vol. 34, No. 1; pp. 101-103 *
Research on Visual Detection Algorithms for the Relative Position of Moving Objects; Shang Mengmeng; China Masters' Theses Full-text Database, Information Science and Technology; 2017-05-15; No. 05; pp. 9-11, 22-25 *

Similar Documents

Publication Publication Date Title
CN108345881B (en) Document quality detection method based on computer vision
CN112508826B (en) Printed matter defect detection method
CN111986190B (en) Printed matter defect detection method and device based on artifact elimination
CN107543828B (en) Workpiece surface defect detection method and system
CN111062940B (en) Screw positioning and identifying method based on machine vision
CN111489337B (en) Automatic optical detection pseudo defect removal method and system
CN111127417B (en) Printing defect detection method based on SIFT feature matching and SSD algorithm improvement
CN112991305A (en) Visual inspection method for surface defects of paint spraying panel
CN111667475B (en) Machine vision-based Chinese date grading detection method
CN110648330B (en) Defect detection method for camera glass
CN104331695A (en) Robust round identifier shape quality detection method
CN113516619B (en) Product surface flaw identification method based on image processing technology
CN111487192A (en) Machine vision surface defect detection device and method based on artificial intelligence
CN113034488A (en) Visual detection method of ink-jet printed matter
CN117541588A (en) Printing defect detection method for paper product
CN112419260A (en) PCB character area defect detection method
CN116012292A (en) Wafer appearance defect detection method based on machine vision
CN116128873A (en) Bearing retainer detection method, device and medium based on image recognition
Wankhede et al. A low cost surface strain measurement system using image processing for sheet metal forming applications
CN117333467B (en) Image processing-based glass bottle body flaw identification and detection method and system
CN112069974B (en) Image recognition method and system for recognizing defects of components
CN111862178B (en) Improved LBP feature extraction method
CN113916893A (en) Method for detecting die-cutting product defects
CN107609482B (en) Chinese text image inversion discrimination method based on Chinese character stroke characteristics
CN111028215A (en) Method for detecting end surface defects of steel coil based on machine vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant