CN116883461B - Method for acquiring clear document image and terminal device thereof

Publication number
CN116883461B
CN116883461B CN202310567449.5A CN202310567449A CN116883461B CN 116883461 B CN116883461 B CN 116883461B CN 202310567449 A CN202310567449 A CN 202310567449A CN 116883461 B CN116883461 B CN 116883461B
Authority
CN
China
Prior art keywords
image
document
document image
transformation matrix
image set
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310567449.5A
Other languages
Chinese (zh)
Other versions
CN116883461A (en
Inventor
Name not published at the inventor's request
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Yike Intelligent Technology Co ltd
Zhuhai Xinye Electronic Technology Co Ltd
Original Assignee
Zhuhai Yike Intelligent Technology Co ltd
Zhuhai Xinye Electronic Technology Co Ltd
Application filed by Zhuhai Yike Intelligent Technology Co ltd and Zhuhai Xinye Electronic Technology Co Ltd
Priority to CN202310567449.5A
Publication of CN116883461A
Application granted
Publication of CN116883461B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/337 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/19007 Matching; Proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30176 Document
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention provides a method for acquiring a clear document image and a terminal device thereof. The method comprises: acquiring an original document image set to be processed; registering the original document image set by matching two or more document images acquired at different angles or under different focusing conditions, so that points corresponding to the same spatial position are matched one to one, and, once the matched key points are obtained, calculating the transformation relation of the original document image set to obtain a registered document image set; and then extracting, position by position, the feature values of the same area on each image of the registered set and fusing the images according to the extracted feature values to obtain a clear document image. In the invention, several multi-focus images of the same document content are captured and fused according to the sharpness of each document position in the different images, yielding an all-in-focus, clear scanned document image.

Description

Method for acquiring clear document image and terminal device thereof
Technical Field
The invention relates to the technical field of image processing, in particular to a method for acquiring a clear document image and a terminal device applying the method.
Background
With the spread of smart devices and of digital learning and office work, converting paper documents into high-quality digital documents quickly and conveniently with a digital camera is becoming increasingly important. Today people often scan and save document images with a mobile phone camera, in daily life or at work. The general flow is: photograph the document with the phone, then apply image processing methods such as contrast enhancement to obtain a clearer document image. However, when the photographed document is relatively large, or when there is an angle between the camera and the document plane, part of the captured document image may be out of focus and blurred. In that case, conventional image processing cannot sharpen the blurred area from a single image alone, and deep-learning-based methods also struggle to do better: on the one hand, when the text is very blurred, deep learning cannot determine the text content from its features and restore it; on the other hand, deep learning requires large amounts of computing resources and running time and is not suited to direct deployment on the portable smart devices used to photograph documents.
To address this problem, several photographs usually have to be taken at different focus positions. Even then the document remains inconvenient to read, because the reader must switch back and forth among the photographs, and keeping many photographs of one document wastes storage space.
In addition, when a seated user photographs a paper document laid flat on a desk, or photographs a blackboard or projection in a lecture room, the imaging plane of the camera is usually not exactly parallel to the plane of the subject, so the captured document is often locally out of focus, and the characters in the defocused region are hard to recognize when the document is consulted later. Furthermore, focus is usually set on the center of the document during scanning; when the image is enlarged for reading, the edges turn out to be blurred because of a slight angle between the imaging plane and the document plane, or because of lens shake caused by the act of taking the photograph, which makes the document harder to read.
Disclosure of Invention
In view of the problems that a photographed document may be large, that parts of it may be out of focus and blurred, and that multiple photographs occupy considerable storage space, the invention provides a method for acquiring a clear document image and a terminal device thereof.
In order to solve the problems, the technical scheme adopted by the invention is as follows:
a method for acquiring a clear document image, comprising the steps of:
acquiring an original document image set to be processed;
registering the original document image set, matching two or more document images acquired under different angles or different focusing conditions, so that points corresponding to the same position in space in the original document image set are matched one by one, and after matched key points are acquired, calculating the transformation relation of the original document image set to obtain a registered document image set;
after the registered document image set is obtained, extracting the characteristic values of the same area on each image from the document image set position by position, and carrying out image fusion according to the extracted characteristic values to obtain a clear document image.
According to the method for acquiring a clear document image provided by the invention, after the clear document image is obtained, the following may further be performed:
post-processing the document image: edge detection is carried out on the fused document image and the document image set before fusion, the detected document edges are fused in combination with the image set registration relationship, four vertex coordinates of the document on the fused image are calculated according to the detected document edges, the effective area of the scanned document is segmented from the document image, the document image is corrected, and therefore a clear scanned document image is obtained.
According to the method for acquiring the clear document image provided by the invention, the acquisition of the document image set to be processed comprises the following steps:
selecting, from a storage unit, a plurality of differently focused images of the same document;
or directly acquiring, by the image acquisition unit, a plurality of document images while the focal length is changing;
wherein the image set used in one processing run is typically a set of images of the same document taken at different focus distances.
According to the method for acquiring clear document images provided by the invention, the document image set is registered, and the method comprises the following steps:
performing key point detection and feature description on each document image by using a feature extraction algorithm;
after key points and feature descriptors have been obtained for each document image in the set, measuring the distance between each pair of key point descriptors with a matcher, and retaining the correct matches with a ratio filter so as to complete feature matching;
after the matched key points are obtained, further removing mismatched points with the random sample consensus (RANSAC) algorithm to obtain an initial perspective transformation matrix between the matching points;
and, based on the matched key points and the initial perspective transformation matrix, refining the result with a nonlinear optimization algorithm to obtain the final transformation matrix.
According to the method for acquiring a clear document image provided by the invention, when the transformation relation of the document image set is calculated, the following is further performed in order to reduce computation time:
the original document image set is reduced according to a preset proportion;
calculating a transformation matrix between the reduced image sets;
and performing corresponding inverse scaling operation on the transformation matrix to obtain the transformation matrix suitable for the original image size.
According to the method for acquiring the clear document image provided by the invention, the corresponding inverse scaling operation is performed on the transformation matrix, and the method comprises the following steps:
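setting the transformation matrix between the reduced first image and the reduced second image as formula (1):
$$H_s = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix} \tag{1}$$
the transformation matrix $H$ after the inverse scaling operation, corresponding to the first image and the second image, is formula (2):
$$H = \begin{bmatrix} h_{11} & h_{12} & h_{13}/\mathrm{Scale} \\ h_{21} & h_{22} & h_{23}/\mathrm{Scale} \\ h_{31}\cdot\mathrm{Scale} & h_{32}\cdot\mathrm{Scale} & h_{33} \end{bmatrix} \tag{2}$$
wherein Scale is the scaling factor of the image;
registering the original image set according to the transformation matrix $H$.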
According to the method for acquiring the clear document image, the image fusion is carried out according to the extracted characteristic values, and the method comprises the following steps:
performing image fusion according to the extracted characteristic values based on pixel-level image fusion;
calculating the characteristic value of each region by using a window sliding mode;
and merging the document image sets into a clear image according to the characteristic values of the same area on different images.
According to the method for acquiring the clear document image provided by the invention, the document region extraction and deformation correction are carried out on the fused document image, and the method comprises the following steps:
edge detection of a document area is carried out on the fused document image;
optionally, edge detection of a document area is carried out on the document image before fusion, the information of the document image after fusion corresponding to each edge is calculated according to the mapping relation of the image set, and all the edge information is fused in a summation mode;
performing edge straight line fitting on the detected edge;
calculating four vertexes of the document image area according to the edge straight line;
and calculating a transformation matrix by using the four vertexes, and applying the transformation matrix to the fused document image to obtain a corrected document image.
A terminal apparatus for acquiring a clear document image, comprising:
a memory for storing image data and instructions executable by the processor;
a processor for processing data, executing instructions, and performing operations;
the image acquisition unit is used for acquiring an original scanned document image to be processed;
and an image output unit for displaying or printing the processed document image.
Therefore, compared with the prior art, the method for acquiring the clear document image provided by the invention acquires the full-focus document image by utilizing a plurality of differently focused document images to perform image fusion, and combines a proper edge detection algorithm and a document correction algorithm, so that the standard clear scanned document can be acquired, and the storage and the subsequent review are convenient. When a high quality printer is provided, the scanned document photo can be used to print out clear documents directly. In addition, the calculated amount of the method is far lower than that of the deep learning method, the method can be conveniently and rapidly deployed on mobile terminal equipment, clear document images are processed and synthesized on the equipment locally immediately after the camera acquires the document images, and meanwhile, the problem of information security possibly brought by document data in network transmission is avoided.
Furthermore, the invention can detect the document edge of the original document image to be processed through multiple dimensions to obtain the document edge, so that the document edge can keep global consistency, the false edge interference existing in the background and the image is eliminated, the accuracy and precision of edge detection are improved, the reliability of straight line fitting is improved for the subsequent steps, and the vertex positioning can be accurately realized.
The invention is described in further detail below with reference to the drawings and the detailed description.
Drawings
Fig. 1 is a first flowchart of an embodiment of a method of the present invention for acquiring a clear document image.
Fig. 2 is a second flowchart of an embodiment of a method of the present invention for capturing a sharp document image.
Fig. 3 is a schematic diagram of an embodiment of a terminal apparatus for acquiring a clear document image according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1 and 2, a method for acquiring a clear document image according to the present embodiment includes the following steps:
Step S1, acquiring an original document image set to be processed, which specifically comprises: selecting, from a storage unit, a plurality of differently focused images of the same document; or directly acquiring, by the image acquisition unit, a plurality of document images while the focal length is changing. The image set used in one processing run is usually a set of images of the same document at different focusing distances; for example, a left-focused image, a center-focused image and a right-focused image of the same document can serve as one valid document image set to be processed. For best results, every part of the document should generally be in focus and clear in at least one image of the set. In addition, considering the resolution and depth of field of current common image acquisition units together with the image processing time, this embodiment fuses 2 to 5 images at a time.
Step S2, registering an original document image set, matching two or more document images acquired under different angles or different focusing conditions, so that points corresponding to the same position in space in the original document image set are matched one by one, and after the matched key points are acquired, calculating the transformation relation of the original document image set to obtain a registered document image set;
and S3, after the registered document image set is obtained, extracting the characteristic values of the same area on each image from the document image set position by position, and carrying out image fusion according to the extracted characteristic values to obtain a clear document image.
In the present embodiment, after obtaining a clear document image, it is also possible to execute:
post-processing the document image: edge detection is carried out on the fused document image and the document image set before fusion, the detected document edges are fused in combination with the image set registration relationship, four vertex coordinates of the document on the fused image are calculated according to the detected document edges, the effective area of the scanned document is segmented from the document image, the document image is corrected, and therefore a clear scanned document image is obtained.
In this embodiment, calculating four vertex coordinates of a document from detected edges of the document includes: edge detection is carried out on the fused document image and the document image set before fusion, the detected document edges are fused in combination with the image set registration relationship, line segment fitting is carried out on the detected document edges, then screening and collecting are carried out on the fitted line segments, edge straight lines of the effective area of the document are obtained, and four vertex coordinates are calculated according to the edge straight lines;
in this embodiment, correcting a document image includes: and correcting the document image through perspective transformation according to the four vertex coordinates to generate a final scanned document image.
In the above step S2, registering the document image set includes:
performing key point detection and feature description on each document image by using a feature extraction algorithm;
after key points and feature descriptors have been obtained for each document image in the set, measuring the distance between each pair of key point descriptors with a matcher, and retaining the correct matches with a ratio filter so as to complete feature matching;
after the matched key points are obtained, further removing mismatched points with the random sample consensus (RANSAC) algorithm to obtain an initial perspective transformation matrix between the matching points;
and, based on the matched key points and the initial perspective transformation matrix, refining the result with a nonlinear optimization algorithm to obtain the final transformation matrix.
In this embodiment, when the transformation relation of the document image set is calculated, the following is further performed in order to reduce computation time:
reducing the original document image set according to a preset proportion;
calculating a transformation matrix between the reduced image sets;
and performing corresponding inverse scaling operation on the transformation matrix to obtain the transformation matrix suitable for the original image size.
Specifically, in step S2 the image set is registered. The goal is to match two or more images collected at different angles or under different focusing conditions, so that points corresponding to the same spatial position are matched one to one across the image set, which allows the subsequent operations to fuse the image information. The processing used in this embodiment comprises:
extracting features from the document images one by one, matching the features between images, and calculating the transformation relation between the images of the set.
This embodiment uses a feature extraction algorithm to perform key point detection and feature description on each image. Usable features include Harris corners, the scale-invariant feature transform (SIFT), speeded-up robust features (SURF), local binary patterns (LBP), the histogram of oriented gradients (HOG), ORB (Oriented FAST and Rotated BRIEF, i.e. oriented features from accelerated segment test combined with rotated binary robust independent elementary features), and the like.
After key points and feature descriptors have been obtained for each document image in the set, a matcher measures the distance between each pair of key point descriptors, and a ratio filter then retains the correct matches to complete feature matching. The matcher may be a brute-force matcher (Brute Force Matcher) or a FLANN-based approximate nearest-neighbor matcher (Flann Based Matcher).
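The matching and ratio-filter step can be sketched as follows. This is a minimal, dependency-free illustration and not the patented implementation: descriptors are assumed to be plain numeric tuples compared by Euclidean distance, and the function name `ratio_test_matches` and the ratio value 0.75 are hypothetical choices that the source does not specify.

```python
import math

def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Brute-force match descriptors of image A against image B.

    A match (i, j) is kept only if the best distance is clearly smaller
    than the second-best one (best < ratio * second_best).
    The ratio 0.75 is an assumed value, not taken from the source.
    """
    matches = []
    for i, da in enumerate(desc_a):
        # Distance from this descriptor to every candidate in the other image.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the ratio test needs two nearest neighbours
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches

# Toy 2-D descriptors: the third descriptor of A has no distinctive partner.
desc_a = [(0.0, 0.0), (5.0, 5.0), (9.0, 1.0)]
desc_b = [(0.1, 0.0), (5.0, 5.2), (2.5, 2.5), (2.6, 2.5)]
print(ratio_test_matches(desc_a, desc_b))  # → [(0, 0), (1, 1)]
```

The ambiguous third descriptor is rejected because its two nearest candidates are almost equally far away, which is exactly the situation the ratio filter is meant to screen out.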
After the matched key points are obtained, the random sample consensus algorithm (RANSAC) is used to further remove mismatched points and to obtain an initial perspective transformation matrix between the matching points.
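To illustrate how RANSAC separates correct matches from mismatches, the sketch below applies it to a deliberately simplified motion model: a pure translation, which needs only one point pair per hypothesis. The embodiment estimates a perspective transformation instead (four pairs per hypothesis); the translation model, the function name `ransac_translation`, the threshold, the iteration count and the fixed seed are all assumptions made to keep the example short.

```python
import random

def ransac_translation(pairs, threshold=2.0, iterations=100, seed=0):
    """RANSAC for a translation-only model (simplified stand-in for a homography).

    pairs: non-empty list of ((x, y), (u, v)) matched points.
    Returns the refitted shift (dx, dy) and the list of inlier pairs.
    """
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(iterations):
        # Minimal sample: one pair fully determines a translation.
        (x, y), (u, v) = rng.choice(pairs)
        dx, dy = u - x, v - y
        # Consensus set: pairs that agree with this hypothesis.
        inliers = [(p, q) for p, q in pairs
                   if abs(q[0] - p[0] - dx) <= threshold
                   and abs(q[1] - p[1] - dy) <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the model on the largest consensus set.
    n = len(best_inliers)
    dx = sum(q[0] - p[0] for p, q in best_inliers) / n
    dy = sum(q[1] - p[1] for p, q in best_inliers) / n
    return (dx, dy), best_inliers

pairs = [((10, 10), (13, 8)), ((20, 5), (23, 3)), ((7, 30), (10, 28)),
         ((40, 40), (43, 38)), ((0, 0), (3, -2)),
         ((15, 15), (60, 2)), ((25, 8), (1, 50))]   # last two are mismatches
shift, inliers = ransac_translation(pairs)
print(shift, len(inliers))
```

The same loop structure carries over to the perspective case: only the minimal sample size and the model-fitting step change.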
Because the lens distortion model is nonlinear, the matched key points and the initial perspective transformation matrix are then refined with the Levenberg-Marquardt nonlinear optimization algorithm to obtain the final transformation matrix.
Of course, other image set registration methods may be used in this step, such as mutual information registration (Mutual Information), normalized mutual information registration (Normalized Mutual Information), entropy correlation coefficient registration (Entropy Correlation Coefficient), Horn-Schunck optical flow registration, Lucas-Kanade optical flow registration, and deep-learning-based registration such as the voxel deformation network VoxelMorph.
In addition, with the rapid iteration of technology, modern portable smart devices such as mobile phones are fitted with higher-resolution image sensors, for example 12-megapixel and 50-megapixel cameras. Higher resolution means more document image detail, but image processing time also grows several-fold as resolution increases. To reduce the time spent when calculating the transformation relation of an image set, the image set can first be reduced, the transformation matrices between the reduced images calculated, and a corresponding inverse scaling operation then applied to obtain the transformation matrices for the original image size. This is implemented as follows:
reducing the original image set by a common factor Scale; for example, if the images are reduced to 1/4 of their original size, then Scale = 1/4;
calculating the transformation matrices between the reduced images;
inverse-scaling the transformation matrix to obtain the transformation matrix at the original size. Let the transformation matrix $H_s$ between reduced image 1 and reduced image 2 be:
$$H_s = \begin{bmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{bmatrix}$$
The transformation matrix $H$ between original image 1 and original image 2 is then:
$$H = \begin{bmatrix} h_{11} & h_{12} & h_{13}/\mathrm{Scale} \\ h_{21} & h_{22} & h_{23}/\mathrm{Scale} \\ h_{31}\cdot\mathrm{Scale} & h_{32}\cdot\mathrm{Scale} & h_{33} \end{bmatrix}$$
registering the original image set according to the new transformation matrix $H$. In addition, if the reduction is too aggressive, the transformation matrix may be fine-tuned using the original image set together with $H$.
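The inverse scaling can be checked numerically. The sketch below assumes that the transformation is a 3×3 perspective matrix acting on homogeneous pixel coordinates and that reduced coordinates equal original coordinates multiplied by Scale; under those assumptions the full-size matrix is $S^{-1} H_s S$ with $S = \mathrm{diag}(\mathrm{Scale}, \mathrm{Scale}, 1)$. The function names are hypothetical.

```python
def rescale_homography(h_small, scale):
    """Convert a 3x3 homography estimated on images reduced by `scale`
    (e.g. 0.25) into the equivalent homography for the full-size images.

    Equivalent to S^-1 * H_small * S with S = diag(scale, scale, 1).
    """
    (a, b, c), (d, e, f), (g, h, i) = h_small
    return [
        [a, b, c / scale],          # translation terms grow by 1/scale
        [d, e, f / scale],
        [g * scale, h * scale, i],  # perspective terms shrink by scale
    ]

def apply_homography(h, x, y):
    """Map point (x, y) through homography h (with perspective division)."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)

scale = 0.25
h_small = [[1.0, 0.1, 2.0], [0.0, 1.0, -1.0], [0.001, 0.0, 1.0]]
h_full = rescale_homography(h_small, scale)

# Route 1: shrink the point, map it in the reduced image, enlarge the result.
xs, ys = apply_homography(h_small, 400 * scale, 200 * scale)
via_small = (xs / scale, ys / scale)
# Route 2: map the full-size point directly with the rescaled matrix.
direct = apply_homography(h_full, 400, 200)
print(via_small, direct)  # the two routes agree up to rounding
```

Only the translation and perspective entries change; the upper-left 2×2 block (rotation, scale, shear) is unaffected by a uniform resize.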
Specifically, in step S3, after the registered document image set is obtained, the image set is fused using an image fusion technique. Image fusion techniques generally fall into three types: pixel-level, feature-level and decision-level fusion. Pixel-level fusion usually preserves detail best, so this embodiment uses it: the feature values of the same area on each image are extracted from the document image set position by position, and the images are fused according to the extracted feature values to obtain a clear document image, as follows:
calculating a characteristic value of each region by using a window sliding mode, wherein the characteristic value can be local variance, local image entropy, common convolution characteristics (such as Sobel characteristics and Laplacian characteristics) and the like;
the image sets are fused into a clear image according to the feature values of the same area on different images, and methods which can be adopted are a weighted average method based on features, a multi-band mixer (Multiband Blender) and the like.
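As a rough illustration of the two fusion steps above, the sketch below uses local variance as the sliding-window feature value and, for simplicity, copies each pixel from the image with the highest value. This winner-takes-all rule is a swapped-in simplification, not the weighted average or multi-band blending named in the text; the function names and the 3×3 window are assumptions, and images are assumed to be registered grayscale arrays given as lists of rows.

```python
def local_variance(img, r, c, radius=1):
    """Variance of the grey values in a (2*radius+1)^2 window centred on
    (r, c); the window is clipped at the image border."""
    rows, cols = len(img), len(img[0])
    vals = [img[y][x]
            for y in range(max(0, r - radius), min(rows, r + radius + 1))
            for x in range(max(0, c - radius), min(cols, c + radius + 1))]
    mean = sum(vals) / len(vals)
    return sum((v - mean) ** 2 for v in vals) / len(vals)

def fuse_by_sharpness(images, radius=1):
    """Pixel-level fusion: at each position take the pixel from the
    registered image whose local variance (a sharpness proxy) is highest."""
    rows, cols = len(images[0]), len(images[0][0])
    fused = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            best = max(images, key=lambda im: local_variance(im, r, c, radius))
            fused[r][c] = best[r][c]
    return fused

# Image A is sharp on the left and flat (blurred) on the right; B is the reverse.
img_a = [[0, 255, 0, 128, 128, 128] for _ in range(3)]
img_b = [[128, 128, 128, 0, 255, 0] for _ in range(3)]
for row in fuse_by_sharpness([img_a, img_b]):
    print(row)  # each row → [0, 255, 0, 0, 255, 0]
```

A hard per-pixel selection like this can leave visible seams where the chosen source image changes, which is one reason smoother schemes such as weighted averaging or multi-band blending are preferable in practice.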
In addition, the registered image sets can be fused by combining the feature pyramid, so that ringing phenomenon can be effectively reduced, and the quality of the fused image is improved.
In order to further improve the function, after obtaining a clear document image, referring to fig. 2, post-processing operation may be added to perform edge detection on the document image after fusion and the document image set before fusion, calculate four vertex coordinates of the document according to the detected document edge, and segment an effective area of the scanned document from the image according to the four vertex coordinates, and correct the effective area to obtain a clear scanned document image, which is implemented as follows:
the edge detection is carried out on the fused document image and the document image set before fusion, and the method comprises the following steps:
using an edge detection model to detect the document edges in the fused document image, and using the same edge detection model to detect the document edges in the document image set before fusion; the known registration information is then used to superimpose all document edge information to obtain a first document edge. In this embodiment the edge detection model is trained on a combination of pixel-level and semantic information; the document image is input into the edge detection model for edge prediction, yielding a document edge probability map, which is taken as the first document edge. In the document edge probability map, the value of each pixel is the probability that the pixel at the corresponding position in the document image to be processed belongs to a document edge, with values in the range [0.0, 1.0].
Binarizing the first document edge to obtain a second document edge. The binarization used here is threshold binarization: the edge probability of every pixel in the document edge probability map represented by the first document edge is traversed; if the probability of a pixel is greater than or equal to the set threshold p, its edge probability is reassigned to 1, and if it is smaller than the threshold, it is reassigned to 0.
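The thresholding rule is small enough to state directly in code. In this sketch the probability map is assumed to be a list of rows of floats, and the default threshold p = 0.5 is a hypothetical value, since the source leaves p unspecified.

```python
def binarize(prob_map, p=0.5):
    """Threshold binarization: 1 where probability >= p, otherwise 0.
    The default p = 0.5 is an assumed value."""
    return [[1 if v >= p else 0 for v in row] for row in prob_map]

edge_prob = [[0.10, 0.50, 0.92],
             [0.49, 0.77, 0.03]]
print(binarize(edge_prob))  # → [[0, 1, 1], [0, 1, 0]]
```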
Filtering the second document edge to obtain a third document edge. The second document edge may contain edge blocks in the background or inside the document area that are not document edges but interference, so filtering is needed to improve accuracy. The filtering proceeds as follows: all connected edges in the second document edge are found with a connected-component algorithm and the area of each is calculated; connected edges whose area is smaller than the preset threshold are discarded, and only those whose area is larger than the preset threshold are kept. The result is the third document edge.
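A minimal sketch of this area filter is given below, under the assumptions that the binary edge map is a list of rows of 0/1 values, that pixels are 8-connected, and that a component whose area equals the threshold is kept (the text does not settle the equality case). The function name and the threshold value in the example are hypothetical.

```python
from collections import deque

def filter_small_components(mask, min_area):
    """Keep only 8-connected foreground components with at least
    `min_area` pixels; everything else is set to 0."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r0 in range(rows):
        for c0 in range(cols):
            if mask[r0][c0] != 1 or seen[r0][c0]:
                continue
            # Breadth-first search collects one connected component.
            component, queue = [], deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        nr, nc = r + dr, c + dc
                        if (0 <= nr < rows and 0 <= nc < cols
                                and mask[nr][nc] == 1 and not seen[nr][nc]):
                            seen[nr][nc] = True
                            queue.append((nr, nc))
            if len(component) >= min_area:
                for r, c in component:
                    out[r][c] = 1
    return out

mask = [[1, 1, 1, 1, 0, 0],
        [0, 0, 0, 0, 0, 1],   # isolated speck
        [0, 0, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0]]   # isolated speck
for row in filter_small_components(mask, min_area=3):
    print(row)  # only the 4-pixel run in the first row survives
```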
Thinning the third document edge to obtain a fourth document edge, which is used as the document edge. A thinning algorithm yields the skeleton of the image; the Zhang-Suen thinning algorithm may be used here.
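For reference, the Zhang-Suen algorithm can be sketched in a few dozen lines. The version below assumes a binary mask given as a list of rows of 0/1 values and pads it with zeros internally; it follows the standard two-sub-iteration formulation and is offered as an illustration rather than as the embodiment's code.

```python
def zhang_suen_thinning(mask):
    """Thin a binary mask (rows of 0/1) towards a one-pixel-wide skeleton."""
    img = [[0] + list(row) + [0] for row in mask]   # zero padding left/right
    width = len(img[0])
    img = [[0] * width] + img + [[0] * width]       # zero padding top/bottom

    def neighbours(r, c):
        # P2..P9: clockwise, starting from the pixel directly above.
        return [img[r - 1][c], img[r - 1][c + 1], img[r][c + 1], img[r + 1][c + 1],
                img[r + 1][c], img[r + 1][c - 1], img[r][c - 1], img[r - 1][c - 1]]

    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_clear = []
            for r in range(1, len(img) - 1):
                for c in range(1, width - 1):
                    if img[r][c] != 1:
                        continue
                    p = neighbours(r, c)
                    b = sum(p)  # number of foreground neighbours
                    # Number of 0 -> 1 transitions around the pixel.
                    a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                    p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                    if step == 0:
                        ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                    else:
                        ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                    if 2 <= b <= 6 and a == 1 and ok:
                        to_clear.append((r, c))
            # Pixels are removed only after the whole sub-iteration is scanned.
            for r, c in to_clear:
                img[r][c] = 0
            changed = changed or bool(to_clear)
    return [row[1:-1] for row in img[1:-1]]

# A 3-pixel-thick horizontal bar collapses onto its middle row.
bar = [[0] * 9] + [[0] + [1] * 7 + [0] for _ in range(3)] + [[0] * 9]
for row in zhang_suen_thinning(bar):
    print(row)
```

Thinning the edge band to its centerline gives the subsequent line fitting a well-defined set of points instead of a thick stroke.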
The fitted line segments are then screened and assembled to obtain the edge lines of the effective area of the document, and finally the coordinates of the 4 vertices are calculated from the edge lines. Alternatively, in this step the document area can be identified with a deep learning method to obtain the four vertex coordinates of the document; the document image is then corrected by perspective transformation according to the four vertex coordinates, generating the final document image.
Specifically, a straight line set based on the document edge frame can be obtained by performing straight-line detection on the detected edges; the set of rectangular frames formed by any four straight lines in the straight line set is then obtained, and the vertex coordinates of the largest rectangular frame are used as the vertex coordinates of the document in the document image.
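The selection of the largest frame from a set of detected lines can be sketched as below. The sketch makes simplifying assumptions that are not stated in the embodiment: each line is given as coefficients (a, b, c) of a*x + b*y + c = 0, lines are split into near-horizontal and near-vertical groups, each candidate frame uses two lines from each group, and the area is computed with the shoelace formula. All function names are hypothetical.

```python
from itertools import combinations


def intersect(l1, l2):
    """Intersection of two lines given as (a, b, c) with a*x + b*y + c = 0."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        return None  # parallel lines do not intersect
    return ((b1 * c2 - b2 * c1) / det, (a2 * c1 - a1 * c2) / det)


def shoelace_area(pts):
    """Area of a polygon whose vertices are listed in order."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def largest_quad(lines):
    """Return (area, vertices) of the largest frame bounded by two
    near-horizontal and two near-vertical lines."""
    horizontal = [l for l in lines if abs(l[1]) >= abs(l[0])]
    vertical = [l for l in lines if abs(l[1]) < abs(l[0])]
    best = (0.0, None)
    for h1, h2 in combinations(horizontal, 2):
        for v1, v2 in combinations(vertical, 2):
            # Walk the corners in order around the frame.
            corners = [intersect(h1, v1), intersect(h1, v2),
                       intersect(h2, v2), intersect(h2, v1)]
            if any(p is None for p in corners):
                continue
            area = shoelace_area(corners)
            if area > best[0]:
                best = (area, corners)
    return best


# y = 0, y = 10, a spurious line y = 4, and x = 0, x = 8.
lines = [(0, 1, 0), (0, 1, -10), (0, 1, -4), (1, 0, 0), (1, 0, -8)]
area, vertices = largest_quad(lines)
```

In this toy case the spurious line y = 4 only produces smaller frames, so the outer frame bounded by y = 0, y = 10, x = 0 and x = 8 is selected.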
A perspective transformation matrix is then calculated from the obtained vertex coordinates, and perspective correction is performed on the fused document image according to this matrix to obtain the final document image, namely a front-view document image with the distortion repaired. This facilitates reading and document archiving by the user and improves the user experience.
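Computing a perspective transformation matrix from four vertex correspondences amounts to solving an 8 x 8 linear system. The following self-contained sketch shows one way to do this; the corner coordinates, the 300 x 400 target size and the function names are illustrative assumptions, and a production system would more likely rely on an image processing library.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def perspective_matrix(src, dst):
    """3x3 matrix H (with h33 = 1) mapping each src corner to its dst corner."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11 x + h12 y + h13) / (h31 x + h32 y + 1), likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_h(h, point):
    """Apply H to a 2-D point using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical detected document corners and the upright target rectangle.
src = [(12, 20), (310, 35), (330, 410), (5, 390)]
dst = [(0, 0), (300, 0), (300, 400), (0, 400)]
H = perspective_matrix(src, dst)
mapped = [apply_h(H, p) for p in src]
```

Each detected corner is mapped onto the corresponding corner of the upright rectangle; warping every pixel of the fused image with the same matrix gives the corrected front-view image.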
Therefore, compared with the prior art, the method for acquiring a clear document image provided by the invention obtains an all-in-focus document image by fusing a plurality of differently focused document images and combines this with a suitable edge detection algorithm and document correction algorithm, so that a standard, clear scanned document can be obtained that is convenient to store and review later. With a high-quality printer, the scanned document photo can be used to print clear documents directly. In addition, the computational cost of the method is far lower than that of deep learning methods, so it can be deployed conveniently and quickly on mobile terminal devices: clear document images are processed and synthesized locally on the device immediately after the camera captures the document images, which also avoids the information security problems that transmitting document data over a network may introduce.
Furthermore, the document edge is obtained by detecting the document edge of the original document image to be processed in multiple dimensions, so that the document edge remains globally consistent and false edges in the background and within the image are eliminated. This improves the accuracy and precision of edge detection, makes the straight-line fitting in the subsequent steps more reliable, and allows the vertices to be located accurately.
Further, the invention performs straight-line fitting on the document edges to determine a straight line set, determines the four vertices of the document in the document image to be processed from the straight line set, and applies a perspective transformation to the fused document image using the four vertices to obtain the document correction result.
Terminal device embodiment for acquiring clear document image
Referring to Fig. 3, the terminal device for acquiring a clear document image provided in this embodiment includes:
a memory for storing image data and instructions executable by the processor;
a processor for processing data, executing instructions, and performing operations;
an image acquisition unit for acquiring the original scanned document image to be processed;
an image output unit for displaying or printing the processed document image, where the image output unit may also be a third-party output device, such as an external display or printer.
The terminal may be a portable device with a standard operating system, such as a smartphone or a tablet computer.
In this embodiment, the document image set to be processed may be obtained by selecting a plurality of document images from local memory, by acquiring a plurality of document images directly from the image acquisition unit, or from a short video of the document or an Apple Live Photo. The obtained image set is usually a set of images of the same document at different focusing distances; for example, a left-focused image, a middle-focused image and a right-focused image of the same document may be used as a valid document image set to be processed.
The memory in this embodiment refers to a digital electronic semiconductor device for storing program instructions and various data. Memory is generally divided into internal memory (main memory) and external memory (external storage): programs and data are generally stored in external storage, and when a program needs to be executed, its instructions and the related data are loaded into main memory for execution.
The processor in this embodiment is a microprocessor that interprets computer instructions and processes program data. It is typically a central processing unit (CPU), which may be a complex instruction set computing (CISC) microprocessor or a reduced instruction set computing (RISC) microprocessor.
The image acquisition unit in this embodiment is usually a camera, a video camera or a scanner, and may also be an intelligent terminal device with a photographing function, such as a smartphone or a tablet computer. If the image acquisition unit is separate from the processing terminal, the image data must be transferred to the processing terminal over an additional data link.
The image output unit in this embodiment is mainly used for displaying the processing result in the form of an image, and may be a printer, a projector, a display screen, or the like. The output device may be a display integrated on the terminal or may be a third party output device connected by wired or wireless data transmission.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The above embodiments are only preferred embodiments of the present invention, and the scope of the present invention is not limited thereto, but any insubstantial changes and substitutions made by those skilled in the art on the basis of the present invention are intended to be within the scope of the present invention as claimed.

Claims (6)

1. A method for acquiring a clear document image, comprising the steps of:
acquiring an original document image set to be processed;
registering the original document image set by matching two or more document images acquired at different angles or under different focusing conditions, so that points corresponding to the same position in space are matched one to one across the original document image set, and, after the matched key points are acquired, calculating the transformation relation of the original document image set to obtain a registered document image set;
after the registered document image set is obtained, extracting, position by position, the characteristic values of the same area on each image of the document image set, and carrying out image fusion according to the extracted characteristic values to obtain a clear document image;
wherein after obtaining the clear document image, optionally performing a post-processing operation:
performing edge detection of the document area on the fused document image;
performing edge detection of the document area on the document images before fusion, computing, according to the mapping relation of the image set, the information in the fused document image that corresponds to each edge, and fusing all the edge information by summation;
performing edge straight line fitting on the detected edge;
calculating four vertexes of the document image area according to the edge straight line;
and calculating a transformation matrix by using the four vertexes, and applying the transformation matrix to the fused document image to obtain a corrected document image.
2. The method according to claim 1, characterized in that:
the acquiring the document image set to be processed comprises the following steps:
selecting a plurality of differently focused images of the same document from a storage unit;
or directly acquiring, by an image acquisition unit, a plurality of document images while the focal length is changing;
wherein the image set used in one processing run is typically a set of images of the same document at different focus distances.
3. The method of claim 1, wherein registering the set of document images comprises:
performing key point detection and feature description on each document image by using a feature extraction algorithm;
after key points and feature descriptions have been identified for each document image in the document image set, measuring the distance between each pair of key point descriptors by using a matcher, and retaining the correct matches by using a ratio filter, so as to complete feature matching;
after the matched key points are obtained, further removing mismatched points by using a random sample consensus (RANSAC) algorithm to obtain an initial perspective transformation matrix between the matched points;
and based on the matched key points and the obtained initial perspective transformation matrix, carrying out optimization by combining a nonlinear optimization algorithm to obtain a final transformation matrix.
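The matcher and ratio filter recited in this claim can be illustrated with a brute-force nearest-neighbour search followed by a ratio test. This is a sketch under stated assumptions: descriptors are plain lists of floats compared by Euclidean distance, the ratio 0.75 is an assumed value that the claim does not specify, and the function name ratio_test_matches is hypothetical. The RANSAC and nonlinear optimization stages are not shown.

```python
import math


def ratio_test_matches(desc_a, desc_b, ratio=0.75):
    """Return (index_a, index_b) pairs whose best match in desc_b is clearly
    closer than the second-best match (ratio test)."""
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i to every descriptor of the other image.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue
        (best, j), (second, _) = dists[0], dists[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Toy 2-D descriptors: the third one in desc_a is equally close to two
# candidates in desc_b, so it is ambiguous and gets rejected.
desc_a = [[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]]
desc_b = [[0.1, 0.0], [10.0, 10.2], [4.0, 6.0], [6.0, 4.0]]
matches = ratio_test_matches(desc_a, desc_b)
```

Only unambiguous correspondences survive the filter, which reduces the number of outliers that the subsequent RANSAC stage has to reject.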
4. The method according to claim 1, characterized in that:
in calculating the transformation relation of the document image set, to reduce calculation time consumption, further performing:
the original document image set is reduced according to a preset proportion;
calculating a transformation matrix between the reduced image sets;
and performing corresponding inverse scaling operation on the transformation matrix to obtain the transformation matrix suitable for the original image size.
5. The method of claim 4, wherein said performing a corresponding inverse scaling operation on the transformation matrix comprises:
setting the transformation matrix between the reduced first image and the reduced second image as formula (1):
(1)
the transformation matrix, after the inverse scaling operation, corresponding to the first image and the second image is formula (2):
(2)
wherein Scale is the scaling factor of the image;
registering the original image set according to the transformation matrix obtained after the inverse scaling operation.
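As an illustration of the inverse scaling operation, the sketch below uses the standard change-of-coordinates relation H_full = S^-1 * H_small * S with S = diag(s, s, 1). It assumes that the reduced images are obtained by multiplying original coordinates by a factor s; whether this s coincides with the Scale of formula (2) cannot be confirmed from the text, and the matrix values and function names are hypothetical.

```python
def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rescale_homography(h_small, s):
    """Convert a matrix estimated on images reduced by factor s
    (small = s * original) into one valid at the original size."""
    scale_m = [[s, 0, 0], [0, s, 0], [0, 0, 1]]
    scale_inv = [[1 / s, 0, 0], [0, 1 / s, 0], [0, 0, 1]]
    return matmul3(scale_inv, matmul3(h_small, scale_m))


def apply_h(h, point):
    """Apply a 3x3 matrix to a 2-D point using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical matrix estimated on half-size images.
s = 0.5
h_small = [[1.0, 0.02, 5.0], [-0.01, 1.0, -3.0], [1e-4, 2e-4, 1.0]]
h_full = rescale_homography(h_small, s)

# Consistency check: mapping at full size equals mapping at reduced size,
# scaled back up by 1 / s.
p_full = (100.0, 200.0)
q_small = apply_h(h_small, (p_full[0] * s, p_full[1] * s))
q_full = apply_h(h_full, p_full)
```

Under this assumption only the translation entries (divided by s) and the perspective entries (multiplied by s) change, while the upper-left 2 x 2 block and the last entry are unchanged.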
6. The method of claim 1, wherein the image fusion based on the extracted feature values comprises:
performing image fusion according to the extracted characteristic values based on pixel-level image fusion;
calculating the characteristic value of each region by using a window sliding mode;
and merging the document image sets into a clear image according to the characteristic values of the same area on different images.
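The pixel-level, sliding-window fusion recited in this claim can be sketched as follows. The characteristic value is assumed here to be the local grey-level variance inside the window, which is one common focus measure and is not necessarily the one used by the invention; images are assumed to be registered greyscale arrays stored as nested lists, and the function names are hypothetical.

```python
def local_variance(img, y, x, radius):
    """Variance of grey levels in the window centred on (y, x), clipped at the image borders."""
    rows, cols = len(img), len(img[0])
    values = [img[j][i]
              for j in range(max(0, y - radius), min(rows, y + radius + 1))
              for i in range(max(0, x - radius), min(cols, x + radius + 1))]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def fuse_by_sharpness(images, radius=1):
    """At each position, copy the pixel from the image whose window has the
    largest characteristic value (here: local variance)."""
    rows, cols = len(images[0]), len(images[0][0])
    fused = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            best = max(images, key=lambda im: local_variance(im, y, x, radius))
            fused[y][x] = best[y][x]
    return fused


# Ground truth: a checkerboard. Image A is sharp on the left half and flat
# (defocused) on the right half; image B is the opposite.
sharp = [[255 if (x + y) % 2 else 0 for x in range(6)] for y in range(4)]
image_a = [[sharp[y][x] if x < 3 else 128 for x in range(6)] for y in range(4)]
image_b = [[sharp[y][x] if x >= 3 else 128 for x in range(6)] for y in range(4)]
fused = fuse_by_sharpness([image_a, image_b], radius=1)
```

In this toy example the fused result recovers the full checkerboard, because each position takes its value from whichever input is locally in focus.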
CN202310567449.5A 2023-05-18 2023-05-18 Method for acquiring clear document image and terminal device thereof Active CN116883461B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310567449.5A CN116883461B (en) 2023-05-18 2023-05-18 Method for acquiring clear document image and terminal device thereof

Publications (2)

Publication Number Publication Date
CN116883461A CN116883461A (en) 2023-10-13
CN116883461B true CN116883461B (en) 2024-03-01

Family

ID=88257445

Country Status (1)

Country Link
CN (1) CN116883461B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231218

Address after: 519000 307-2, ZhongLiXin building, No. 4, Xingguo street, Xiangzhou District, Zhuhai City, Guangdong Province

Applicant after: Zhuhai Yike Intelligent Technology Co.,Ltd.

Applicant after: Zhuhai Xinye Electronic Technology Co.,Ltd.

Address before: 519000 307-2, ZhongLiXin building, No. 4, Xingguo street, Xiangzhou District, Zhuhai City, Guangdong Province

Applicant before: Zhuhai Yike Intelligent Technology Co.,Ltd.

GR01 Patent grant