CN104091319B - Shredded paper picture splicing method for constructing an energy function based on a Monte Carlo algorithm - Google Patents


Info

Publication number
CN104091319B
Authority
CN
China
Prior art keywords
picture
image
energy function
monte carlo
splicing
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410298442.9A
Other languages
Chinese (zh)
Other versions
CN104091319A (en)
Inventor
王晓峰
苏盈盈
王洪珂
孙宝光
白翔文
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University of Science and Technology
Original Assignee
Chongqing University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-06-26
Filing date
2014-06-26
Publication date
2017-07-11

Landscapes

  • Editing Of Facsimile Originals (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a shredded paper picture splicing method that constructs an energy function based on the Monte Carlo algorithm. It mainly addresses the splicing and recovery of double-sided printed documents. Because there are usually many pictures and a large amount of information, the task is generally a nonlinear optimization problem: an accurate model is hard to establish, the solution is hard to compute, and the error may be large. The invention therefore treats the image as a whole and fills it selectively using the Monte Carlo algorithm, which is based on random sampling. It considers how scraps produced by a given shredder from the same page of a printed text document can be spliced together, covering three cases: longitudinal cuts only, both longitudinal and transverse cuts, and double-sided printed documents with both longitudinal and transverse cuts. The scraps may contain Chinese or English text. With the picture splicing algorithm, the scraps are spliced automatically to restore the picture, which reduces manpower and material consumption and improves splicing and recovery efficiency.

Description

Shredded paper picture splicing method for constructing energy function based on Monte Carlo algorithm
Technical Field
The invention belongs to the field of information technology and relates to a shredded paper picture splicing method that constructs an energy function based on a Monte Carlo algorithm, in particular to an automatic picture splicing technique for fragmented documents (shredded paper) that improves splicing and recovery efficiency and accuracy.
Background
Splicing fragmented documents has important applications in document repair, recovery and identification of judicial evidence, restoration of historical literature, acquisition of military intelligence and similar fields, and many fragment splicing problems can be reduced to or approximated by the two-dimensional fragment splicing problem. Shredded paper splicing is a typical two-dimensional fragment image splicing problem. Traditionally, splicing and recovery are done manually, which is accurate but inefficient. When the number of fragments is large, manual splicing consumes a great deal of manpower and material resources, cannot be completed quickly and accurately in a short time, and may damage the fragments.
With the development of computer technology, scraps can be spliced automatically by means of computer programming and picture splicing algorithms, which restores the picture while reducing manpower and material consumption and improving splicing and recovery efficiency.
Disclosure of Invention
The invention aims to provide a shredded paper picture splicing method that constructs an energy function based on a Monte Carlo algorithm, in order to solve the problems in the prior art that splicing is NP-hard and that splicing and recovery efficiency is low.
The invention is realized in such a way that a shredded paper picture splicing method for constructing an energy function based on a Monte Carlo algorithm comprises the following steps:
S1, scanning the paper scraps into two-dimensional grayscale pictures to obtain m × n pieces, and reading the picture information into matrices using Matlab;
S2, based on the Monte Carlo algorithm, generating a random, non-repeating two-dimensional arrangement of the m × n image fragments using the Matlab random function randperm;
S3, taking the two-dimensional arrangement of the m × n fragments as the document picture to be generated, calculating, based on the root mean square error (RMSE), the energy function between each image in the picture and its adjacent images (upper, lower, left and right), and then summing the energy functions of all the pictures;
S4, repeating step S2 10000 times and comparing the energy function value each time to obtain the minimum value of the energy function;
S5, taking the fragment positions corresponding to the minimum value as the optimal arrangement;
and S6, letting the minimum value converge gradually over multiple iterations to obtain the optimal value, i.e. the optimal splicing result.
The invention overcomes the defects of the prior art and provides a shredded paper picture splicing method that constructs an energy function based on a Monte Carlo algorithm. It mainly addresses the splicing and recovery of double-sided printed documents. Because there are usually many pictures and a large amount of information, the task is generally a nonlinear optimization problem: an accurate model is hard to establish, the solution is hard to compute, and the error is large. The invention therefore treats the image as a whole and fills it selectively using the Monte Carlo algorithm, which is based on random sampling. It considers how scraps produced by a given shredder from the same page of a printed text document can be spliced together; the scraps may contain Chinese or English text. Three cases are analyzed:
1) for a document cut only longitudinally, the 2 constraints between each fragment and its left and right neighbors are considered simultaneously;
2) for a document cut both transversely and longitudinally, the 4 constraints between each fragment and its upper, lower, left and right neighbors are considered simultaneously;
3) for a double-sided document, the 8 constraints covering the upper, lower, left and right neighbors on both the front and the reverse side of each fragment are considered simultaneously.
An energy function is then established; in general, the smaller the energy function, the better the splicing result. The minimum is searched for using the random nature of the Monte Carlo method, and finally the method is programmed in Matlab to obtain and verify an optimal solution.
Drawings
FIG. 1 is a flow chart of the steps of the shredded paper image splicing method for constructing an energy function based on the Monte Carlo algorithm;
FIG. 2 is a histogram of a 5.000.bmp image in an embodiment of the invention;
FIG. 3 is a diagram of a truncated horizontal stroke in an embodiment of the present invention;
FIG. 4 is a diagram of a truncated dot stroke in an embodiment of the present invention;
FIG. 5 is a diagram of a truncated vertical stroke in an embodiment of the present invention;
FIG. 6 is a schematic diagram of the upper, lower, left and right relationships between spliced pictures in an embodiment of the present invention;
FIG. 7 is a schematic diagram of the energy variation over 1000 iterations in an embodiment of the present invention;
FIG. 8 is a schematic diagram of the energy variation over 10000 iterations in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
Examples
A shredded paper picture splicing method that constructs an energy function based on a Monte Carlo algorithm, as shown in FIG. 1, comprises the following steps:
S1, scanning the shredded paper into two-dimensional grayscale pictures (m × n pieces) and reading the picture information into matrices using Matlab;
More specifically, step S1 includes:
1) Image pre-processing
A raw image generally cannot be used directly: noise, differing gray levels and similar artifacts introduce errors that lead to incorrect results or wrong splices, so the image must be preprocessed:
a) Denoising
A given image may contain noise introduced by the imaging process, the camera or the computer, and noise easily causes data processing errors, for example in the top row of pixels. If noise flips a pixel from 0 to 1, the final statistics will contain an error, so the image is denoised with the commonly used Gaussian filter, whose two-dimensional kernel has the standard form:
$$G(x,y)=\frac{1}{2\pi\sigma^{2}}\exp\left(-\frac{x^{2}+y^{2}}{2\sigma^{2}}\right)$$
where $\sigma$ is the standard deviation of the kernel.
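The denoising step could be sketched in Python roughly as follows (the patent itself works in Matlab). The 3 × 3 kernel size, the value of sigma and the edge-replicating border handling are assumptions chosen for illustration; the patent does not state them, and the function names are hypothetical.

```python
import math

def gaussian_kernel(radius=1, sigma=1.0):
    """Return a normalized (2*radius+1) x (2*radius+1) Gaussian kernel."""
    size = 2 * radius + 1
    kernel = [
        [math.exp(-((x - radius) ** 2 + (y - radius) ** 2) / (2 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]

def gaussian_filter(image, radius=1, sigma=1.0):
    """Smooth a grayscale image given as a list of rows.

    Border pixels are handled by replicating the nearest edge pixel
    (an assumption; the patent does not specify border handling).
    """
    rows, cols = len(image), len(image[0])
    kernel = gaussian_kernel(radius, sigma)
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    rr = min(max(r + dr, 0), rows - 1)  # clamp to the image
                    cc = min(max(c + dc, 0), cols - 1)
                    acc += kernel[dr + radius][dc + radius] * image[rr][cc]
            out[r][c] = acc
    return out
```

A single dark pixel in an otherwise white patch is pulled back toward its neighbors, which is the effect the denoising step relies on before thresholding.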
b) Binarization
The given image is a grayscale image with values in the range 0 to 255. A grayscale image is usually 8-bit and costly to process, so binarization is commonly applied to reduce the amount of computation and speed up execution.
First, in a typical scanned image of printed paper, which is a grayscale image (0 to 255), the pixel values are not limited to 0 and 255: many intermediate values occur, but 0 and 255 are by far the most probable. The histogram was plotted with the imhist function in Matlab; because 0 and 255 dominate and other values are sparse, the line width was increased by a factor of 5.0, as shown in FIG. 2. FIG. 2 shows that the gray values are concentrated mainly at 0 and 255, so the threshold is set empirically to 180. Gray values from 0 to 180 become 0 after binarization; gray values greater than 180 and up to 255 become 1. Namely:
$$b(x,y)=\begin{cases}0, & 0\le g(x,y)\le 180\\ 1, & 180<g(x,y)\le 255\end{cases}$$
where $g(x,y)$ is the gray value of a pixel and $b(x,y)$ is its binarized value.
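The fixed-threshold rule described above (gray values 0 to 180 map to 0, values above 180 map to 1) can be written as a one-line Python function. This is a minimal sketch: the function name and the list-of-rows image representation are assumptions, and only the threshold of 180 comes from the text.

```python
def binarize(image, threshold=180):
    """Map gray values 0..threshold to 0 and values above threshold to 1."""
    return [[0 if pixel <= threshold else 1 for pixel in row] for row in image]
```

For example, `binarize([[0, 180, 181, 255]])` returns `[[0, 0, 1, 1]]`.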
2) Picture feature analysis
Analysis shows that, for image splicing, whether text is present in the middle of a picture has no effect on how the picture is spliced, so only the text features at the left and right boundaries of the picture are considered. Of the n gray-value matrices produced when the pictures are imported into Matlab, only the elements of the first and last columns are needed for the data similarity analysis; the elements of the other columns can be ignored.
The method collects features at the boundaries and applies gray-level-based fixed-threshold binarization together with mean-square-error statistical matching. The errors this approach may produce are checked as follows:
Common strokes in Chinese characters are generally: horizontal (一), vertical (丨), left-falling (丿), right-falling (㇏), etc. After truncation they fall into two categories:
a) One or a few adjacent points
1. A horizontal stroke (一) cut vertically at the middle arrow usually leaves 3 points (the same holds for left-falling and right-falling strokes), as shown in FIG. 3;
2. A dot stroke cut vertically at the middle arrow leaves the shape shown in FIG. 4.
b) Many adjacent points
3. A vertical stroke (丨) cut vertically at the middle arrow leaves the shape shown in FIG. 5.
In summary, a survey of common strokes shows that when a character is cut through the middle, the pixels on the adjoining edges of the two pictures generally have similar gray levels, so the mean square error method is feasible in theory. Collecting boundary features in Matlab and applying gray-level-based fixed-threshold binarization and mean-square-error statistical matching is therefore feasible; the error should be small and a correct result can be obtained.
S2, based on the Monte Carlo algorithm, generating a non-repeating arrangement of the m × n image fragments using the Matlab random function randperm;
S3, taking the m × n fragments as the document picture to be generated, calculating the energy functions between each image in the picture and its upper, lower, left and right neighbors, and summing the energy functions of all the pictures;
Step S3 further includes:
1) Gray-level-based mean square error statistical matching algorithm
Because whether text is present in the middle of a picture is irrelevant to splicing, the features at the left, right, upper and lower boundaries of the picture are selected for matching. However, the gray values at two boundaries may be similar even though the pictures do not match. Gray-level-based mean square error statistical matching is therefore applied to the features collected at the boundaries. Typical error measures include:
a) Standard deviation
The standard deviation reflects the dispersion of the image gray levels relative to the mean gray level and is defined as:
$$\sigma=\sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(F(i,j)-\bar{F}\right)^{2}}$$
where $\bar{F}$ is the mean of image F, defined as:
$$\bar{F}=\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}F(i,j)$$
The standard deviation can also be used to evaluate image contrast. A large standard deviation means the gray levels are widely dispersed, the contrast is high and more information is visible. A small standard deviation means the contrast is low, the tone is flat and uniform, and little information is visible.
b) Root mean square error (RMSE)
The root mean square error between the fused image F and the standard reference image R is defined as:
$$\mathrm{RMSE}=\sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(R(i,j)-F(i,j)\right)^{2}}$$
where M and N are the numbers of rows and columns of the image, respectively. In consideration of real-time performance, the root mean square error is preferred here.
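The two error measures above can be sketched in Python as follows. This is only an illustration under assumptions: images are represented as lists of rows, the function names are hypothetical, and the standard deviation uses the population form (division by M × N), which matches the usual definition of these image measures.

```python
import math

def mean_gray(image):
    """Mean gray level of an image given as a list of rows."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)

def standard_deviation(image):
    """Dispersion of the gray levels around the mean gray level."""
    mu = mean_gray(image)
    pixels = [p for row in image for p in row]
    return math.sqrt(sum((p - mu) ** 2 for p in pixels) / len(pixels))

def rmse(image_f, image_r):
    """Root mean square error between two images of identical shape."""
    if len(image_f) != len(image_r) or len(image_f[0]) != len(image_r[0]):
        raise ValueError("images must have the same shape")
    m, n = len(image_f), len(image_f[0])  # M rows, N columns
    total = sum(
        (image_r[i][j] - image_f[i][j]) ** 2
        for i in range(m)
        for j in range(n)
    )
    return math.sqrt(total / (m * n))
```

For example, `rmse([[0, 0], [0, 0]], [[3, 3], [3, 3]])` returns `3.0`, and a constant image has a standard deviation of `0.0`.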
2) Specific algorithm
The element-wise differences between the binarized gray values of the first (last) column of any one matrix and the last (first) column of every other matrix are computed. The higher the proportion of 0 elements among the differences, the smaller the gray-value difference, the more similar the boundaries of the corresponding pictures, and the more likely the two pictures are to be adjacent. With this approach, however, two candidates can have the same proportion of 0 elements while their gray values differ and the pictures do not match. The plain subtraction criterion is therefore abandoned and the mean square error is used for comparison instead: the pair of matrices with the smallest mean square error has the smallest gray-value difference and the highest degree of matching, so those 2 pictures can be spliced and restored.
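The pairwise comparison described here can be illustrated with a short Python sketch that compares the last column of one fragment with the first column of every other fragment and keeps the candidate with the smallest root mean square error. The function names and the list-of-rows representation are assumptions for illustration; in the patent this measure feeds the energy function rather than a greedy pairing.

```python
import math

def column(matrix, index):
    """Extract one column of a matrix given as a list of rows."""
    return [row[index] for row in matrix]

def edge_rmse(edge_a, edge_b):
    """Root mean square error between two edges of equal length."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(edge_a, edge_b)) / len(edge_a)
    )

def best_right_neighbor(fragments, index):
    """Return (j, score) for the fragment j whose first column best matches
    the last column of fragments[index] (smaller score is better)."""
    right_edge = column(fragments[index], -1)
    best = None
    for j, candidate in enumerate(fragments):
        if j == index:
            continue  # a fragment cannot be its own neighbor
        score = edge_rmse(right_edge, column(candidate, 0))
        if best is None or score < best[1]:
            best = (j, score)
    return best
```

With binarized fragments, a score of 0 means the two edges agree on every pixel.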
The double-sided case with Chinese and English text is generally the most difficult, so it is analyzed in detail here; the other cases are simplified forms of it. A document printed on both sides yields 2 × m × n pictures in total, and each picture has a height of M pixels.
The solution builds an energy function (cost function) from the data, applies the Monte Carlo algorithm based on random sampling, and is implemented in Matlab; the result is then analyzed and verified.
a) Algorithm analysis
Assume the printed page has a front side and a reverse side and is cut into m × n pieces. If the arrangement is correct, each piece should match its upper, lower, left and right neighbors well and its mean square error with them should be minimal, so the mean square error summed over all pieces should be small or minimal (allowing for possible errors). On this basis an energy function is established from the mean-square-error matching relationships between each picture and the pictures connected to it above, below, left and right, as shown in FIG. 6. Both the front side and the reverse side are included, so there are 8 constraints in total.
b) Establishing the energy function
Based on the image matching idea and the analysis above, an energy function is constructed from the upper, lower, left and right relations of each image. Let the pixel sets of the images be $X_{up}$, $X_{center}$, $X_{down}$, $X_{left}$ and $X_{right}$. The energy function for one picture is then
$$f(x_1,x_2,x_3,x_4,x_5)=\Psi(x_1,x_2)+\Psi(x_1,x_3)+\Psi(x_1,x_4)+\Psi(x_1,x_5)$$
where $x_1$ is the center picture and $x_2,\dots,x_5$ are its four neighbors. Likewise, a global energy function is obtained by summing over all pictures:
$$F(x_1,x_2,x_3,x_4,x_5)=\sum\left(\Psi(x_1,x_2)+\Psi(x_1,x_3)+\Psi(x_1,x_4)+\Psi(x_1,x_5)\right)$$
So the global energy function for the whole picture is
$$F(X)=\sum_{q}\sum_{p\in N(q)}\Psi(x_q,x_p)$$
where $\Psi(\cdot)$ is the mean-square-error measure between neighboring pictures and $N(q)$ is the neighborhood of $q$.
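A minimal Python sketch of such a global energy for a single-sided page is given below. Several choices are assumptions made for illustration: an arrangement is a grid of fragment indices, Ψ is taken as the root mean square error between the two facing edges, each adjacent pair is counted once from each side (as the sum over every picture and its neighborhood implies), and the reverse-side constraints of the double-sided case are omitted. All names are hypothetical.

```python
import math

def rmse(edge_a, edge_b):
    """Root mean square error between two edges of equal length."""
    return math.sqrt(
        sum((a - b) ** 2 for a, b in zip(edge_a, edge_b)) / len(edge_a)
    )

def edge(fragment, side):
    """Return the pixels along one side of a fragment (list of rows)."""
    if side == "up":
        return fragment[0]
    if side == "down":
        return fragment[-1]
    if side == "left":
        return [row[0] for row in fragment]
    if side == "right":
        return [row[-1] for row in fragment]
    raise ValueError(side)

# Neighbor offset -> (edge of the center picture, facing edge of the neighbor).
NEIGHBORS = {
    (-1, 0): ("up", "down"),
    (1, 0): ("down", "up"),
    (0, -1): ("left", "right"),
    (0, 1): ("right", "left"),
}

def global_energy(grid, fragments):
    """Sum Psi over every picture and each of its existing neighbors."""
    rows, cols = len(grid), len(grid[0])
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            center = fragments[grid[r][c]]
            for (dr, dc), (own, facing) in NEIGHBORS.items():
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    neighbor = fragments[grid[rr][cc]]
                    total += rmse(edge(center, own), edge(neighbor, facing))
    return total
```

For two 2 × 2 fragments `A = [[0, 0], [0, 0]]` and `B = [[0, 1], [0, 1]]`, the arrangement `[[0, 1]]` has energy `0.0` because the touching edges agree, while `[[1, 0]]` has energy `2.0`.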
S4, repeating step S2 10000 times, comparing the energy function value each time, and taking the minimum;
S5, taking the fragment positions corresponding to the minimum value as the optimal arrangement;
In step S5, the energy function is large when the fragments are mismatched or disordered and small when they match well, so the matching problem is transformed into an energy minimization problem, i.e. the search for an optimal solution. Namely:
$$X^{*}=\arg\min_{X}F(X)$$
where $X$ ranges over the candidate arrangements of the fragments and $F$ is the global energy function.
and S6, gradually converging the minimum value through a plurality of iterations to obtain the optimal value, namely the optimal jigsaw effect.
Taking one picture as an example of restoration and splicing, FIGS. 7 and 8 show how the lowest energy evolves as the iterations proceed: its initial value is 341.1, it is 332.2 after 1000 runs and 328.3 after 10000 runs. The figures show the minimum energy function value gradually converging to a stable value, which is taken as the optimal solution.
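Steps S2 to S5 can be sketched in Python as a plain random search. This is an illustration under assumptions rather than the patent's Matlab program: `random.Random.sample` stands in for Matlab's randperm, the energy is passed in as a callable, and `toy_energy` is a hypothetical stand-in whose unique minimum is the ascending order. The returned history of the best energy so far is non-increasing, which is the kind of curve FIGS. 7 and 8 describe.

```python
import random

def monte_carlo_search(num_fragments, rows, cols, energy,
                       iterations=10000, seed=None):
    """Draw random non-repeating arrangements and keep the lowest-energy one."""
    if rows * cols != num_fragments:
        raise ValueError("rows * cols must equal num_fragments")
    rng = random.Random(seed)
    best_grid, best_energy, history = None, float("inf"), []
    for _ in range(iterations):
        # S2: random permutation of the fragment indices, no repeats.
        order = rng.sample(range(num_fragments), num_fragments)
        grid = [order[r * cols:(r + 1) * cols] for r in range(rows)]
        # S3: energy of this candidate arrangement.
        value = energy(grid)
        # S4/S5: remember the minimum and the arrangement that produced it.
        if value < best_energy:
            best_grid, best_energy = grid, value
        history.append(best_energy)
    return best_grid, best_energy, history

def toy_energy(grid):
    """Hypothetical energy: zero only when each row is consecutive ascending."""
    return sum(abs(b - a - 1) for row in grid for a, b in zip(row, row[1:]))
```

A call such as `monte_carlo_search(4, 1, 4, toy_energy, iterations=2000, seed=0)` searches the 24 possible orderings of four strips. Because the number of arrangements grows factorially with the number of fragments, pure random sampling of this kind is only practical for small pages or as a starting point.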
Compared with the prior art, the invention has the following beneficial effect: with the picture splicing algorithm, shredded paper is spliced automatically to restore the picture, which reduces manpower and material consumption and improves splicing and recovery efficiency.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (1)

1. A shredded paper picture splicing method for constructing an energy function based on a Monte Carlo algorithm, characterized by comprising the following steps:
S1, scanning the paper scraps into two-dimensional grayscale pictures to obtain m × n pieces, and reading the picture information into matrices using Matlab;
S2, based on the Monte Carlo algorithm, generating a random, non-repeating two-dimensional arrangement of the m × n image fragments using the Matlab random function randperm;
S3, taking the two-dimensional arrangement of the m × n fragments as the document picture to be generated, calculating, based on the root mean square error (RMSE), the energy function between each image in the picture and its adjacent images (upper, lower, left and right), and then summing the energy functions of all the pictures;
S4, repeating step S2 10000 times and comparing the energy function value each time to obtain the minimum value of the energy function;
S5, taking the fragment positions corresponding to the minimum value as the optimal arrangement;
and S6, letting the minimum value converge gradually over multiple iterations to obtain the optimal value, i.e. the optimal splicing result.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201410298442.9A, CN104091319B (en) | 2014-06-26 | 2014-06-26 | Shredded paper picture splicing method for constructing an energy function based on a Monte Carlo algorithm

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201410298442.9A, CN104091319B (en) | 2014-06-26 | 2014-06-26 | Shredded paper picture splicing method for constructing an energy function based on a Monte Carlo algorithm

Publications (2)

Publication Number | Publication Date
CN104091319A (en) | 2014-10-08
CN104091319B (en) | 2017-07-11

Family

ID=51639034

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201410298442.9A (Expired - Fee Related), CN104091319B (en) | Shredded paper picture splicing method for constructing an energy function based on a Monte Carlo algorithm | 2014-06-26 | 2014-06-26

Country Status (1)

Country Link
CN (1) CN104091319B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108805811B (en) * 2018-05-30 2022-06-24 山东师范大学 Natural image intelligent picture splicing method and system based on non-convex quadratic programming
CN116485658B (en) * 2023-06-19 2023-09-12 旷智中科(北京)技术有限公司 Multichannel image stitching method based on Monte Carlo tree search

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
US8233740B2 (en) * 2006-12-26 2012-07-31 Alan Steven Roth System and method for constructing photorealistic mosaics
CN101901481B (en) * 2010-08-11 2012-11-21 深圳市蓝韵实业有限公司 Image mosaic method
CN103177431B (en) * 2012-12-26 2015-10-14 中国科学院遥感与数字地球研究所 A kind of RS data space-time fusion method
CN103632338B (en) * 2013-12-05 2016-08-31 鲁东大学 A kind of image registration Evaluation Method based on match curve feature

Non-Patent Citations (2)

Title
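刘孟娟 (Liu Mengjuan). Splicing and restoration of fragmented documents based on cluster analysis and gray-value matching. 《价值工程》 (Value Engineering). 2010. *
宋强 (Song Qiang) et al. A real-time systematic error registration algorithm for a single sensor. 《海空航空工程学院学报》. 2010, Vol. 25 (No. 6), pp. 617-620. *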

Also Published As

Publication number Publication date
CN104091319A (en) 2014-10-08


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20170711