CN111815507A - Personal library construction method and device - Google Patents


Info

Publication number
CN111815507A
Authority
CN
China
Prior art keywords
text
text image
image
library
recognized
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010697101.4A
Other languages
Chinese (zh)
Other versions
CN111815507B (en
Inventor
何章鸣
李国盛
马正芳
张子轩
吕东辉
魏居辉
邢尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN202010697101.4A priority Critical patent/CN111815507B/en
Publication of CN111815507A publication Critical patent/CN111815507A/en
Application granted granted Critical
Publication of CN111815507B publication Critical patent/CN111815507B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06T3/147
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/247 Aligning, centring, orientation detection or correction of the image by affine transforms, e.g. correction due to perspective effects; Quadrilaterals, e.g. trapezoids
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition

Abstract

An embodiment of the invention provides a personal library construction method and device. The method comprises: first, setting up a hardware platform to obtain a high-quality text image; second, applying morphological preprocessing to the image so that target information is highlighted and interference is reduced; third, correcting image distortion and cropping the image edges using algorithms based on affine transformation and projective geometry; and finally, performing OCR on the processed image and performing personalized retrieval on the recognized text by keyword or document title. The method allows paper text to be digitized at any time and in any place, improves the recognition rate of existing character recognition technology, and allows a personal library to be constructed quickly.

Description

Personal library construction method and device
Technical Field
The invention relates to a construction scheme of a digital library, in particular to a personal library construction method and a personal library construction device.
Background
At present, most of the books and documents commonly encountered in daily life are paper products, and paper resources still suit people's search and reading habits. However, with the development of computers and the internet, literature is moving towards modernization, digitization and networking. The construction of a personal digital library can be divided into two parts: digitization and information retrieval.
Digitizing paper documents has required professional scanning equipment and skilled operation, which is too demanding for ordinary users. Because the threshold for digitization is so high, existing e-books are limited to popular titles; digitization is carried out solely by publishers and cannot be targeted or personalized. Scanning devices on the market are divided into large and small ones. After extensive market research, we found that large integrated scanning equipment is very expensive and very heavy, and the scanned files usually cannot be edited. Small scanning devices are cheaper and lighter by comparison, but their character recognition is slow and inaccurate.
Existing document retrieval software (such as Endnote, Note First, My Little Library, and Howept "personal digital Library") can build a general-purpose library, but the construction modes they provide all have limitations: for example, the search function is not accurate enough, the construction mode is not personalized, and resource quality is uneven. More importantly, because of copyright issues, the repositories of such software are not comprehensive. How to provide a more personalized and convenient way of constructing a personal digital library on the basis of such software is a long-standing problem that has not been well solved.
In fact, existing optical character recognition technology is mature; when the recognition rate is low, this is mainly due to a poor shooting environment and image distortion. Our main work is therefore to improve the quality of the acquired text image, correct image distortion, and perform personalized retrieval on the acquired text.
In summary, the two key points in constructing a personal digital library are high-quality acquisition and scanning of text images and personalized retrieval of the generated resources.
Disclosure of Invention
Aiming at the defects in the prior art, the technical problem to be solved by the invention is to provide a series of methods for improving the quality of the obtained text image, correcting the image distortion and carrying out personalized retrieval on the obtained text so as to improve the retrieval and scanning efficiency and reduce the error rate of character recognition.
In one aspect, an embodiment of the present invention provides a method for constructing a personal library, where the method includes:
acquiring a text image;
correcting perspective deformation and bending deformation of the text image to obtain a corrected text image;
selecting four vertices of the quadrilateral region to be recognized in the corrected text image as the boundary, cropping away the part outside the boundary, and keeping the part of the text image to be recognized inside the boundary;
performing morphological preprocessing on the part of the text image to be recognized inside the boundary to obtain a preprocessed text image;
performing optical character recognition (OCR) on the preprocessed text image to obtain a recognized text library;
and performing traversal retrieval on the text library so as to carry out the digital construction of the personal library.
In another aspect, an embodiment of the present invention provides a personal library building apparatus, where the apparatus includes:
an acquisition unit configured to acquire a text image;
a correction unit configured to correct perspective deformation and bending deformation of the text image to obtain a corrected text image;
a cropping unit configured to select four vertices of the quadrilateral region to be recognized in the corrected text image as the boundary, crop away the part outside the boundary, and keep the part of the text image to be recognized inside the boundary;
a morphological preprocessing unit configured to perform morphological preprocessing on the part of the text image to be recognized inside the boundary to obtain a preprocessed text image;
an OCR unit configured to perform optical character recognition (OCR) on the preprocessed text image to obtain a recognized text library;
and a construction unit configured to perform traversal retrieval on the text library so as to carry out the digital construction of the personal library.
In summary, the invention has the following advantages: the quality of the text image is improved at the source and the degree of distortion is reduced; the affine transformation correction method corrects perspective deformation well and markedly improves the character recognition rate; bending deformation, in particular barrel deformation, is corrected efficiently. The method can therefore markedly improve the quality of the acquired text image and the character recognition rate, and allows a personal digital library to be constructed quickly, at any time and in any place, using a mobile phone.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flowchart of a method for constructing a personal library according to an embodiment of the present invention;
FIG. 2 is a flowchart of a method for quickly building a personal library according to an embodiment of the present invention;
FIG. 3 is a flowchart of the iterative design approach for the high-quality image acquisition hardware platform according to an embodiment of the present invention;
FIG. 4 is a flowchart of high-quality image acquisition in an embodiment of the present invention;
FIG. 5 is a diagram of the high-quality image acquisition hardware platform in an embodiment of the present invention, in which a Bluetooth switch and a photographing bracket are used to reduce camera shake, a page presser is used to reduce text distortion, and a steplessly dimmable desk lamp is used to provide a high-quality background light source;
FIG. 6 is a diagram of the effect of affine-transformation-based perspective deformation correction according to an embodiment of the present invention, in which four-point positioning of the image is performed first, followed by affine transformation correction and image edge cropping;
FIG. 7 is a flowchart of image morphological preprocessing according to an embodiment of the present invention;
FIG. 8 is a diagram of the effect of image morphological preprocessing according to an embodiment of the present invention;
FIG. 9 is a flowchart of the grid-template-based bending deformation correction algorithm in an embodiment of the present invention;
FIG. 10 is a schematic diagram comparing a conventional photograph with a photograph taken on the high-quality image acquisition hardware platform according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of a distorted image and the corresponding corrected image according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of a personal library construction device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of a method for constructing a personal library according to an embodiment of the present invention, where the method includes:
101. acquiring a text image;
102. correcting perspective deformation and bending deformation of the text image to obtain a corrected text image;
103. selecting four vertices of the quadrilateral region to be recognized in the corrected text image as the boundary, cropping away the part outside the boundary, and keeping the part of the text image to be recognized inside the boundary;
104. performing morphological preprocessing on the part of the text image to be recognized inside the boundary to obtain a preprocessed text image;
105. performing optical character recognition (OCR) on the preprocessed text image to obtain a recognized text library;
106. performing traversal retrieval on the text library so as to carry out the digital construction of the personal library.
Preferably, the acquiring the text image specifically includes:
selecting a camera-equipped mobile phone as the capture device;
selecting a stable, strong light source with adjustable intensity as the background light source;
using a photographing bracket to fix the shooting angle so that the photograph is taken from directly above, with the lens positioned over the center of the object being captured;
flattening the text with a page presser;
and triggering the shutter with a Bluetooth switch so as to eliminate shake of the mobile phone.
Preferably, correcting the perspective deformation of the text image specifically includes:
constructing a transformation matrix for the affine transformation from a point $A(x, y)$ in the original image to a point $A'(x', y')$ in the transformed image:
Figure BDA0002591541360000041
where
Figure BDA0002591541360000042
are all constant matrices;
calculating the affine transformation matrix from the four vertex coordinates of the quadrilateral:
Figure BDA0002591541360000043
constructing a system of linear equations: the coordinates of the four vertices of the original quadrilateral are $[x_i, y_i]^T$, $i = 1, 2, 3, 4$, and they are to be transformed into the four points $[x'_i, y'_i]^T$, $i = 1, 2, 3, 4$, of the corrected quadrilateral; the system of equations is constructed as follows:
Figure BDA0002591541360000044
writing the system in matrix form, where the unknown vector, the constant vector and the constant matrix are respectively:
$x = [a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}]^T$
$b = [x'_1, x'_2, x'_3, x'_4, y'_1, y'_2, y'_3, y'_4]^T$
Figure BDA0002591541360000051
where $x$ is the unknown vector to be solved, $b$ is a constant vector, and $A$ is a constant matrix;
the constructed system of linear equations is:
$Ax = b$;
solving the system gives
$x = A^{-1} b$.
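The four-point solve described above can be sketched in Python with NumPy. Because the equation images are not reproduced in this text, the layout of $A$ used here is an assumption: it is the standard eight-parameter form (row-vector convention, last matrix entry normalized to 1) that matches the ordering of the unknown vector $x$ and the constant vector $b$ given above. Eight unknowns correspond to what is usually called a projective (perspective) transformation, although the source calls it an affine transformation. The function names `solve_projective` and `apply_projective` and the sample coordinates are hypothetical.

```python
import numpy as np

def solve_projective(src, dst):
    """Return x = [a11, a21, a31, a12, a22, a32, a13, a23] from four
    vertex pairs src[i] -> dst[i] (assumed normalization: a33 = 1)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A = np.zeros((8, 8))
    # b = [x'1, x'2, x'3, x'4, y'1, y'2, y'3, y'4]^T, as in the text
    b = np.concatenate([dst[:, 0], dst[:, 1]])
    for i, ((x, y), (xp, yp)) in enumerate(zip(src, dst)):
        A[i] = [x, y, 1, 0, 0, 0, -x * xp, -y * xp]      # equation for x'_i
        A[i + 4] = [0, 0, 0, x, y, 1, -x * yp, -y * yp]  # equation for y'_i
    # Solving Ax = b directly is numerically preferable to forming A^-1.
    return np.linalg.solve(A, b)

def apply_projective(params, point):
    """Map one point (x, y) with the eight solved parameters."""
    a11, a21, a31, a12, a22, a32, a13, a23 = params
    x, y = point
    w = a13 * x + a23 * y + 1.0
    return ((a11 * x + a21 * y + a31) / w, (a12 * x + a22 * y + a32) / w)

if __name__ == "__main__":
    # Hypothetical page corners in a photo, mapped to an A4-ratio rectangle.
    corners = [(30, 20), (420, 45), (400, 610), (10, 580)]
    target = [(0, 0), (210, 0), (210, 297), (0, 297)]
    params = solve_projective(corners, target)
    print([apply_projective(params, c) for c in corners])
```

The system has a unique solution as long as no three of the four vertices are collinear, which holds for any genuine page outline.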
Preferably, correcting the bending deformation of the text image specifically includes:
specifying the grid template, with a selected point in the two images denoted by $(x, y)$ and $(u, v)$ respectively:
Figure BDA0002591541360000052
where $a_{ij}$ and $b_{ij}$ are constant coefficients;
least-squares fitting: the coefficients in the above formula are unknown and the fitting error is required to be minimal, so the least-squares formulation is:
Figure BDA0002591541360000053
where the two error terms, subscripted $x$ and $y$, are the sums of the squared deviations at the control points. Requiring both to be minimal gives:
Figure BDA0002591541360000061
so that:
Figure BDA0002591541360000062
Figure BDA0002591541360000063
selecting control points: the number of control point pairs is denoted by $L$, with $s = 0, 1, 2, \ldots, n$ and $t = 0, 1, 2, \ldots, n - s$;
traversal mapping calculation: setting the order to 3, solving for the coefficients of the polynomial equations, and traversing the image by computer to complete the correction.
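A minimal sketch of the grid-template fit and traversal, under stated assumptions: since the formula images are missing, the code assumes the usual bivariate polynomial model in which the deformed-image coordinates $(x, y)$ are polynomials of order $n$ in the reference coordinates $(u, v)$, with terms $u^s v^t$ for $s = 0..n$, $t = 0..n-s$. The function names, the use of `numpy.linalg.lstsq` instead of explicit normal equations, and nearest-neighbour sampling are illustrative choices, not taken from the source.

```python
import numpy as np

def poly_terms(u, v, n=3):
    """Design matrix with one column u**s * v**t per (s, t), t <= n - s."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cols = [u**s * v**t for s in range(n + 1) for t in range(n + 1 - s)]
    return np.stack(cols, axis=-1)

def fit_warp(uv, xy, n=3):
    """Least-squares fit of coefficients a (for x) and b (for y) from
    L control point pairs; order 3 needs at least 10 pairs."""
    uv = np.asarray(uv, dtype=float)
    xy = np.asarray(xy, dtype=float)
    M = poly_terms(uv[:, 0], uv[:, 1], n)
    a, *_ = np.linalg.lstsq(M, xy[:, 0], rcond=None)
    b, *_ = np.linalg.lstsq(M, xy[:, 1], rcond=None)
    return a, b

def map_points(uv, a, b, n=3):
    """Evaluate the fitted polynomials at reference points (u, v)."""
    uv = np.asarray(uv, dtype=float)
    M = poly_terms(uv[:, 0], uv[:, 1], n)
    return np.stack([M @ a, M @ b], axis=-1)

def correct_image(distorted, a, b, n=3):
    """Traverse every output pixel (u = column, v = row) and sample the
    distorted image at the mapped position, nearest-neighbour."""
    h, w = distorted.shape[:2]
    vv, uu = np.mgrid[0:h, 0:w]
    xy = map_points(np.column_stack([uu.ravel(), vv.ravel()]), a, b, n)
    xs = np.clip(np.rint(xy[:, 0]), 0, w - 1).astype(int)
    ys = np.clip(np.rint(xy[:, 1]), 0, h - 1).astype(int)
    return distorted[ys, xs].reshape(distorted.shape)
```

In use, the control point pairs would come from matching grid intersections in the deformed and reference templates; `fit_warp` is called once per camera setup and `correct_image` once per photograph.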
Preferably, the method specifically comprises:
performing morphological preprocessing on the part of the text image to be recognized inside the boundary, comprising grayscale conversion, binarization, and erosion and dilation, to obtain a preprocessed text image;
performing optical character recognition (OCR) on the preprocessed text image using the OCR software CamScanner (扫描全能王) to obtain a recognized text library;
performing keyword-based traversal retrieval on the text library so as to carry out the digital construction of the personal library; the keywords include: title, author name, content type.
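The keyword-based traversal retrieval can be illustrated with a short sketch. The source does not specify a data model or matching rule, so the `Document` record, its field names, and the case-insensitive substring match below are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Document:
    """Hypothetical record for one recognized text in the library."""
    title: str
    author: str
    content_type: str
    text: str

def search(library, keyword):
    """Traverse the whole library and return every document whose title,
    author, content type or recognized text contains the keyword."""
    key = keyword.casefold()
    return [
        doc
        for doc in library
        if any(
            key in field.casefold()
            for field in (doc.title, doc.author, doc.content_type, doc.text)
        )
    ]
```

A linear traversal like this is adequate for a small personal library; a larger collection would more likely use an inverted index.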
The steps of the method for quickly building a personal library based on mobile phone photography are described in detail below with reference to the flowchart shown in FIG. 2; FIGS. 3 to 11 are as listed in the Drawings section.
The images processed in this embodiment are all captured with a mobile phone. Owing to the shooting characteristics of conventional mobile phones, problems such as camera shake, insufficient lighting and image distortion are difficult to avoid; meanwhile, most existing recognition software has no image correction function, in particular no bending deformation correction, so character recognition performs poorly on deformed text images.
Step one: acquiring a high-quality text image.
Place the photographing bracket and the base column of the desk lamp on the central axis of the text page, and ensure that the bracket and the base column are parallel to the horizontal plane; distribute the light of the desk lamp over the text page as evenly as possible; fix the mobile phone with the photographing bracket so that its lens is as close as possible to the center of the text; flatten a thick text as far as possible and fix it at the bottom with a page presser; take the photograph using the Bluetooth switch.
Step two: correcting perspective deformation.
S2.1, setting the aspect ratio of the corrected quadrilateral $A'B'C'D'$ equal to that of A4 paper, and calculating the affine transformation matrix from the four vertex coordinates of the quadrilateral:
Figure BDA0002591541360000071
S2.2, given that the vertex coordinates of the original quadrilateral are $[x_i, y_i]^T$, $i = 1, 2, 3, 4$, and that they are transformed into $[x'_i, y'_i]^T$, $i = 1, 2, 3, 4$, constructing 8 equations to solve for the 8 parameters:
Figure BDA0002591541360000072
S2.3, constructing the system of linear equations $Ax = b$.
S2.4, solving by the inverse operation $x = A^{-1} b$.
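The equation images for steps S2.1 to S2.4 are not reproduced in this text. One form that is consistent with the eight parameters $a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}$ and with the ordering of $b$ stated in the method is the standard eight-parameter mapping below. The row-vector convention and the normalization $a_{33} = 1$ are assumptions, not taken from the source.

```latex
% Assumed mapping (row-vector convention, a_{33} = 1)
\[
x'_i = \frac{a_{11} x_i + a_{21} y_i + a_{31}}{a_{13} x_i + a_{23} y_i + 1},
\qquad
y'_i = \frac{a_{12} x_i + a_{22} y_i + a_{32}}{a_{13} x_i + a_{23} y_i + 1},
\qquad i = 1, 2, 3, 4.
\]
% Clearing the denominators gives two linear equations per vertex, 8 in total:
\[
a_{11} x_i + a_{21} y_i + a_{31} - a_{13} x_i x'_i - a_{23} y_i x'_i = x'_i,
\]
\[
a_{12} x_i + a_{22} y_i + a_{32} - a_{13} x_i y'_i - a_{23} y_i y'_i = y'_i.
\]
% Rows i and i+4 of A for x = [a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}]^T:
\[
A_{i} = [\, x_i,\; y_i,\; 1,\; 0,\; 0,\; 0,\; -x_i x'_i,\; -y_i x'_i \,],
\qquad
A_{i+4} = [\, 0,\; 0,\; 0,\; x_i,\; y_i,\; 1,\; -x_i y'_i,\; -y_i y'_i \,].
\]
```

Under this form $A$ is an $8 \times 8$ matrix, and it is invertible whenever no three of the four vertices are collinear, so step S2.4 is well defined for any genuine page outline.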
Step three: correcting bending deformation. Bending is another major distortion affecting character recognition accuracy; it can be corrected well using a grid template, which improves the recognition rate. The grid-template-based bending correction comprises grid template specification, least-squares fitting, control point selection and traversal mapping calculation. The specific operations are as follows:
S3.1, specifying the grid template: a template is specified in the deformed image and in the reference image, and a selected point in the two images is denoted by $(x, y)$ and $(u, v)$ respectively:
Figure BDA0002591541360000073
where the order of the formula is $n$.
S3.2, obtaining the relations among the coefficients by the least-squares method:
Figure BDA0002591541360000081
and:
Figure BDA0002591541360000082
S3.3, the number of selected control point pairs is denoted by $L$; the coefficient matrices of $a$ and $b$ can be obtained once the number of mapped point pairs is large enough.
S3.4, setting the order to 3, solving for the polynomial coefficients, and performing the traversal correction.
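The formulas for S3.1 and S3.2 are not reproduced in this text. A common polynomial-warp formulation that agrees with the surviving index ranges ($s = 0, \ldots, n$; $t = 0, \ldots, n - s$) is sketched below. The symbols $\varepsilon_x$, $\varepsilon_y$ and the control point index $l$ are assumed names, and the whole block is a reconstruction under that assumption rather than the source's own formulas.

```latex
% Assumed polynomial model of order n
\[
x = \sum_{i=0}^{n} \sum_{j=0}^{n-i} a_{ij}\, u^{i} v^{j},
\qquad
y = \sum_{i=0}^{n} \sum_{j=0}^{n-i} b_{ij}\, u^{i} v^{j}.
\]
% Squared fitting error over the L control point pairs (x_l, y_l) <-> (u_l, v_l)
\[
\varepsilon_x = \sum_{l=1}^{L} \Bigl( x_l - \sum_{i=0}^{n} \sum_{j=0}^{n-i} a_{ij}\, u_l^{i} v_l^{j} \Bigr)^{2},
\qquad
\varepsilon_y = \sum_{l=1}^{L} \Bigl( y_l - \sum_{i=0}^{n} \sum_{j=0}^{n-i} b_{ij}\, u_l^{i} v_l^{j} \Bigr)^{2}.
\]
% Setting the partial derivatives with respect to a_{st} and b_{st} to zero
\[
\sum_{l=1}^{L} \Bigl( \sum_{i=0}^{n} \sum_{j=0}^{n-i} a_{ij}\, u_l^{i} v_l^{j} \Bigr) u_l^{s} v_l^{t}
= \sum_{l=1}^{L} x_l\, u_l^{s} v_l^{t},
\]
\[
\sum_{l=1}^{L} \Bigl( \sum_{i=0}^{n} \sum_{j=0}^{n-i} b_{ij}\, u_l^{i} v_l^{j} \Bigr) u_l^{s} v_l^{t}
= \sum_{l=1}^{L} y_l\, u_l^{s} v_l^{t},
\qquad s = 0, \ldots, n;\; t = 0, \ldots, n - s.
\]
```

With this model there are $(n+1)(n+2)/2$ coefficients per coordinate, so order $n = 3$ gives 10 unknowns for $a$ and 10 for $b$, and at least $L = 10$ control point pairs are needed. This is one way to read the requirement in S3.3 that the number of point pairs reach "a certain number".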
Step four: image processing and document retrieval.
S4.1, cutting the image; selecting four vertexes as boundaries in a quadrilateral area to be identified of the text image, and cutting out the parts outside the boundaries;
S4.2, morphological preprocessing; the morphological preprocessing mainly comprises grayscale conversion, binarization, and erosion and dilation;
S4.3, character recognition; existing character recognition technology is mature, and the character recognition task can be completed with existing OCR software.
S4.4, traversing the constructed small text library by document title or keyword to complete retrieval.
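The preprocessing chain in S4.2 (grayscale, binarization, erosion, dilation) can be sketched with NumPy alone. The fixed threshold of 128, the 3x3 square structuring element, the luminance weights, and all function names are assumptions chosen for illustration; the source does not state which parameters it uses.

```python
import numpy as np

def to_gray(rgb):
    """Weighted luminance of an H x W x 3 image (common 0.299/0.587/0.114 weights)."""
    return np.asarray(rgb, dtype=float) @ np.array([0.299, 0.587, 0.114])

def binarize(gray, threshold=128):
    """Dark pixels (text) become True, light pixels (paper) become False."""
    return np.asarray(gray) < threshold

def _neighbourhood(mask, k, pad_value):
    """Stack the k*k shifted copies of the mask that cover each pixel's window."""
    r = k // 2  # k is assumed odd
    padded = np.pad(mask, r, constant_values=pad_value)
    h, w = mask.shape
    return np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(k) for dx in range(k)]
    )

def erode(mask, k=3):
    """A pixel survives only if its whole k x k window is foreground."""
    return _neighbourhood(mask, k, True).all(axis=0)

def dilate(mask, k=3):
    """A pixel is set if any pixel in its k x k window is foreground."""
    return _neighbourhood(mask, k, False).any(axis=0)

def open_mask(mask, k=3):
    """Erosion followed by dilation: removes specks smaller than the window."""
    return dilate(erode(mask, k), k)
```

Erosion followed by dilation (a morphological opening) removes isolated noise pixels while restoring the thickness of strokes that survive the erosion, which is one plausible reading of why the two operations are paired here.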
The practical results of this embodiment are a high-quality text image and a corrected text image; their recognition results, together with those of the unprocessed text images, are shown in Table 1. A comparison between a conventional photograph and a photograph taken on the high-quality image acquisition hardware platform is shown in FIG. 10; the distorted image and the corrected image are shown in FIG. 11. According to the test results, the deformation of images acquired on the hardware platform is within the range that the OCR software can recognize; the character recognition rate of hardware platform images is more than 40 percentage points higher than that of conventional hand-held photographs (99.69% versus 56.66%), approaching complete recognition; in addition, the correction algorithm proposed herein raises the character recognition rate of distorted images from 82.35% to 95.05%. This embodiment shows that the method can improve the quality of conventional text photographs and the accuracy of character recognition, which lays a foundation for constructing a personal digital library at any time and in any place.
Table 1. Statistics of experimental results (total number of characters: 323)
Image                      Average number of recognized characters   Recognition rate (%)
Conventional photograph    183                                       56.66
Platform photograph        322                                       99.69
Distorted image            266                                       82.35
Corrected image            307                                       95.05
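As a quick arithmetic check, the recognition rates in Table 1 follow from dividing each average recognized-character count by the total of 323 characters. The short sketch below recomputes them; the dictionary keys are descriptive labels chosen here, not identifiers from the source.

```python
TOTAL_CHARS = 323  # total number of characters, from Table 1

# Average number of recognized characters per image type, from Table 1
recognized = {
    "conventional photograph": 183,
    "platform photograph": 322,
    "distorted image": 266,
    "corrected image": 307,
}

# Recognition rate in percent, rounded to two decimals as in the table
rates = {name: round(100 * n / TOTAL_CHARS, 2) for name, n in recognized.items()}

# Gains in percentage points: platform vs. conventional, corrected vs. distorted
platform_gain = rates["platform photograph"] - rates["conventional photograph"]
correction_gain = rates["corrected image"] - rates["distorted image"]

if __name__ == "__main__":
    for name, rate in rates.items():
        print(f"{name}: {rate:.2f}%")
    print(f"platform gain: {platform_gain:.2f} points")
    print(f"correction gain: {correction_gain:.2f} points")
```

The platform gain of about 43 percentage points is the figure behind the statement that accuracy improves by more than 40%.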
Corresponding to the above method embodiment, FIG. 12 shows a schematic structural diagram of a personal library construction device according to an embodiment of the present invention. The device includes:
an acquisition unit configured to acquire a text image;
a correction unit configured to correct perspective deformation and bending deformation of the text image to obtain a corrected text image;
a cropping unit configured to select four vertices of the quadrilateral region to be recognized in the corrected text image as the boundary, crop away the part outside the boundary, and keep the part of the text image to be recognized inside the boundary;
a morphological preprocessing unit configured to perform morphological preprocessing on the part of the text image to be recognized inside the boundary to obtain a preprocessed text image;
an OCR unit configured to perform optical character recognition (OCR) on the preprocessed text image to obtain a recognized text library;
and a construction unit configured to perform traversal retrieval on the text library so as to carry out the digital construction of the personal library.
Preferably, the acquisition unit is configured to acquire a text image, which specifically includes: selecting a camera-equipped mobile phone as the capture device; selecting a stable, strong light source with adjustable intensity as the background light source; using a photographing bracket to fix the shooting angle so that the photograph is taken from directly above, with the lens positioned over the center of the object being captured; flattening the text with a page presser; and triggering the shutter with a Bluetooth switch so as to eliminate shake of the mobile phone.
Preferably, the correction unit is configured to correct the perspective deformation of the text image, which specifically includes:
constructing a transformation matrix for the affine transformation from $A(x, y)$ to $A'(x', y')$:
Figure BDA0002591541360000091
calculating the affine transformation matrix from the four vertex coordinates of the quadrilateral:
Figure BDA0002591541360000092
constructing a system of linear equations: the vertex coordinates of the original quadrilateral are $[x_i, y_i]^T$, $i = 1, 2, 3, 4$, and they are to be transformed into $[x'_i, y'_i]^T$, $i = 1, 2, 3, 4$; the system of equations is constructed as follows:
Figure BDA0002591541360000093
writing the system in matrix form, where the unknown vector, the constant vector and the constant matrix are respectively:
$x = [a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}]^T$
$b = [x'_1, x'_2, x'_3, x'_4, y'_1, y'_2, y'_3, y'_4]^T$
Figure BDA0002591541360000101
the constructed system of linear equations is:
$Ax = b$,
solving the system gives
$x = A^{-1} b$.
Preferably, the correction unit is configured to correct the bending deformation of the text image, which specifically includes: specifying the grid template, with a selected point in the two images denoted by $(x, y)$ and $(u, v)$ respectively:
Figure BDA0002591541360000102
least-squares fitting: the coefficients in the above formula are unknown and the fitting error is required to be minimal, so the least-squares formulation is:
Figure BDA0002591541360000103
requiring the two error terms, subscripted $x$ and $y$, to be minimal gives:
Figure BDA0002591541360000104
so that:
Figure BDA0002591541360000111
Figure BDA0002591541360000112
selecting control points: the number of control point pairs is denoted by $L$, with $s = 0, 1, 2, \ldots, n$ and $t = 0, 1, 2, \ldots, n - s$;
traversal mapping calculation: setting the order to 3, solving for the coefficients of the polynomial equations, and traversing the image by computer to complete the correction.
Preferably, the morphological preprocessing unit is specifically configured to perform morphological preprocessing on the part of the text image to be recognized inside the boundary, comprising grayscale conversion, binarization, and erosion and dilation, to obtain a preprocessed text image;
the OCR unit is specifically configured to perform optical character recognition (OCR) on the preprocessed text image using the OCR software CamScanner (扫描全能王) to obtain a recognized text library;
the construction unit is specifically configured to perform keyword-based traversal retrieval on the text library so as to carry out the digital construction of the personal library; the keywords include: title, author name, content type.
In summary, the invention has the following advantages: the quality of the text image is improved at the source and the degree of distortion is reduced; the affine transformation correction method corrects perspective deformation well and markedly improves the character recognition rate; bending deformation, in particular barrel deformation, is corrected efficiently. The method can therefore markedly improve the quality of the acquired text image and the character recognition rate, and allows a personal digital library to be constructed quickly, at any time and in any place, using a mobile phone.
It should be understood that the specific order or hierarchy of steps in the processes disclosed is an example of exemplary approaches. Based upon design preferences, it is understood that the specific order or hierarchy of steps in the processes may be rearranged without departing from the scope of the present disclosure. The accompanying method claims present elements of the various steps in a sample order, and are not intended to be limited to the specific order or hierarchy presented.
In the foregoing detailed description, various features are grouped together in a single embodiment for the purpose of streamlining the disclosure. This method of disclosure is not to be interpreted as reflecting an intention that the claimed embodiments of the subject matter require more features than are expressly recited in each claim. Rather, as the following claims reflect, invention lies in less than all features of a single disclosed embodiment. Thus, the following claims are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate preferred embodiment of the invention.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
What has been described above includes examples of one or more embodiments. It is, of course, not possible to describe every conceivable combination of components or methodologies for purposes of describing the aforementioned embodiments, but one of ordinary skill in the art may recognize that many further combinations and permutations of various embodiments are possible. Accordingly, the embodiments described herein are intended to embrace all such alterations, modifications and variations that fall within the scope of the appended claims. Furthermore, to the extent that the term "includes" is used in either the detailed description or the claims, such term is intended to be inclusive in a manner similar to the term "comprising" as "comprising" is interpreted when employed as a transitional word in a claim. Furthermore, any use of the term "or" in the specification of the claims is intended to mean a "non-exclusive or".
Those of skill in the art will further appreciate that the various illustrative logical blocks, units, and steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate the interchangeability of hardware and software, various illustrative components, elements, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design requirements of the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The various illustrative logical blocks, or elements, described in connection with the embodiments disclosed herein may be implemented or performed with a general purpose processor, a digital signal processor, an Application Specific Integrated Circuit (ASIC), a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, e.g., a digital signal processor and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a digital signal processor core, or any other similar configuration.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may be stored in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. For example, a storage medium may be coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC, which may be located in a user terminal. In the alternative, the processor and the storage medium may reside in different components in a user terminal.
In one or more exemplary designs, the functions described above in connection with the embodiments of the invention may be implemented in hardware, software, firmware, or any combination of the three. If implemented in software, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media that facilitate transfer of a computer program from one place to another. Storage media may be any available media that can be accessed by a general purpose or special purpose computer. For example, such computer-readable media can include, but is not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to carry or store program code in the form of instructions or data structures and which can be read by a general-purpose or special-purpose computer, or a general-purpose or special-purpose processor. Additionally, any connection is properly termed a computer-readable medium, and, thus, is included if the software is transmitted from a website, server, or other remote source via a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wirelessly, e.g., infrared, radio, and microwave. Disks and discs, as used herein, include compact discs, laser discs, optical discs, DVDs, floppy disks and Blu-ray discs, where disks usually reproduce data magnetically, while discs usually reproduce data optically with lasers. Combinations of the above may also be included in the computer-readable medium.
The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are merely exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (10)

1. A method of personal library construction, the method comprising:
acquiring a text image;
correcting transmission deformation and bending deformation of the text image to obtain a corrected text image;
selecting four vertices as the boundary of the quadrilateral area to be recognized in the corrected text image, cutting away the part outside the boundary, and acquiring the part of the text image to be recognized within the boundary;
performing morphological preprocessing on the text image part to be recognized in the boundary to obtain a preprocessed text image;
performing Optical Character Recognition (OCR) on the preprocessed text image to acquire a recognized text library;
and performing traversal retrieval on the text library so as to perform digital construction of the personal library.
2. The method for building a personal library as claimed in claim 1, wherein said obtaining a text image specifically comprises:
selecting a mobile phone with a lens as a sampling instrument;
selecting a stable strong light source with adjustable light intensity as a background light source;
fixing the shooting angle with a shooting bracket so that the picture is taken from directly above, with the lens positioned over the center of the sampled object;
flattening the text with a page presser;
and taking the picture with a Bluetooth switch so as to eliminate shake of the mobile phone.
3. The method for building a personal library as claimed in claim 1, wherein the correcting the transmission distortion of the text image specifically comprises:
constructing a transformation matrix for the affine transformation from a point A(x, y) in the original image to a point A′(x′, y′) in the transformed image:
Figure FDA0002591541350000011
in the formula,
Figure FDA0002591541350000012
are all constant matrices;
an affine transformation matrix is calculated by using four vertex coordinates of the quadrangle:
Figure FDA0002591541350000013
constructing a linear equation set, wherein the coordinates of the four vertices of the original quadrilateral are $[x_i, y_i]^T$, $i = 1, 2, 3, 4$, which are to be transformed into the four points $[x'_i, y'_i]^T$, $i = 1, 2, 3, 4$, in the correction matrix; the system of equations is constructed as follows:
Figure FDA0002591541350000021
expressing the system of equations in matrix form, wherein the corresponding unknown vector, constant vector and constant matrix are respectively:
$x = [a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}]^T$
$b = [x'_1, x'_2, x'_3, x'_4, y'_1, y'_2, y'_3, y'_4]^T$
Figure FDA0002591541350000022
wherein x is an unknown vector to be solved, b is a constant vector, and A is a constant matrix;
the constructed linear system of equations is:
Ax=b;
solving the equation set to obtain
$x = A^{-1} b$.
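As an illustration of the linear system Ax = b described in claim 3, the following minimal Python sketch builds the 8×8 system from four vertex correspondences and solves it. The row layout is an assumption: the figures that define the transformation are not reproduced in this text, so the sketch assumes the common row-vector convention [x′, y′, w] = [x, y, 1]·M with a33 = 1, which is consistent with the stated ordering of the unknown vector x and the constant vector b. The function names are hypothetical, and Gaussian elimination is used in place of forming the inverse of A explicitly.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest entry in this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; are three vertices collinear?")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def perspective_coefficients(src, dst):
    """Return [a11, a21, a31, a12, a22, a32, a13, a23] (assumed layout).

    src and dst are lists of four (x, y) vertex pairs. The first four rows
    carry the x' equations and the last four the y' equations, matching
    b = [x'1..x'4, y'1..y'4].
    """
    rows, rhs = [], []
    for (x, y), (xp, _) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        rhs.append(xp)
    for (x, y), (_, yp) in zip(src, dst):
        rows.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        rhs.append(yp)
    return solve_linear_system(rows, rhs)


def warp_point(coef, x, y):
    """Map a source point (x, y) to the corrected image."""
    a11, a21, a31, a12, a22, a32, a13, a23 = coef
    w = a13 * x + a23 * y + 1.0
    return ((a11 * x + a21 * y + a31) / w, (a12 * x + a22 * y + a32) / w)
```

Calling `perspective_coefficients` with the four corners of a photographed page and the four corners of the target rectangle gives coefficients with which `warp_point` sends each source corner to its target corner.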
4. The method for building a personal library as claimed in claim 1, wherein the step of correcting the text image for warping comprises:
specifying the grid template, with (x, y) and (u, v) denoting a selected point in the two images respectively:
Figure FDA0002591541350000023
where $a_{ij}$ and $b_{ij}$ are constant coefficients;
least-squares fitting: the coefficients in the above equation are unknown and are chosen so that the fitting error is minimal, which gives the least-squares criterion:
Figure FDA0002591541350000031
where the two error terms, subscripted x and y, are the sums of the variances over the control points. Since both error terms are required to be minimal:
Figure FDA0002591541350000032
then, there are:
Figure FDA0002591541350000033
Figure FDA0002591541350000034
selecting control points: the number of control point pairs is denoted by L; s = 0, 1, 2, ...; t = 0, 1, 2, ..., n − s;
traversal mapping calculation: setting the order to 3, solving for the coefficients of the polynomial equation, and traversing the image by computer to complete the correction.
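The least-squares step of claim 4 can be sketched in Python as follows. The fitted form is an assumption, because the figures are not reproduced in this text: the sketch uses a bivariate polynomial u = Σ a_ij·x^i·y^j and v = Σ b_ij·x^i·y^j with i + j ≤ n, and solves the normal equations obtained by setting the derivative of the squared error with respect to each coefficient to zero. The function names are hypothetical, and coordinates are assumed to be normalized to roughly [0, 1] so that the normal equations stay well conditioned.

```python
def monomials(order):
    """Exponent pairs (i, j) with i + j <= order; 10 pairs for order 3."""
    return [(i, j) for i in range(order + 1) for j in range(order + 1 - i)]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("control points do not determine the polynomial")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial_map(src, dst, order=3):
    """Fit (x, y) -> (u, v) from control point pairs by least squares."""
    terms = monomials(order)
    # One row of monomial values per control point.
    phi = [[x ** i * y ** j for i, j in terms] for x, y in src]
    k = len(terms)
    # Normal equations: (phi^T phi) coeffs = phi^T target.
    gram = [[sum(row[s] * row[t] for row in phi) for t in range(k)]
            for s in range(k)]
    coeffs = []
    for axis in (0, 1):
        rhs = [sum(row[s] * d[axis] for row, d in zip(phi, dst))
               for s in range(k)]
        coeffs.append(solve(gram, rhs))
    return terms, coeffs[0], coeffs[1]


def apply_polynomial_map(terms, a, b, x, y):
    """Evaluate the fitted mapping at one point."""
    basis = [x ** i * y ** j for i, j in terms]
    u = sum(c * p for c, p in zip(a, basis))
    v = sum(c * p for c, p in zip(b, basis))
    return u, v
```

With order 3 there are 10 coefficients per coordinate, so at least 10 well-spread control point pairs are needed; in practice a library routine such as a QR-based least-squares solver would usually be preferred over hand-written normal equations.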
5. The method of building a personal library as claimed in claim 1, wherein the method specifically comprises:
the morphological preprocessing of the text image part to be recognized within the boundary comprises: grayscale processing, binarization, and erosion and dilation, to obtain the preprocessed text image;
performing Optical Character Recognition (OCR) on the preprocessed text image using the OCR software CamScanner to acquire the recognized text library;
performing keyword-based traversal retrieval on the text library so as to perform digital construction of the personal library; the keywords include: title, author name, and content type.
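The preprocessing chain named in claim 5 (grayscale, binarization, erosion, dilation) can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the patented implementation: the luminance weights are the usual ITU-R BT.601 values, the fixed threshold of 128 and the 3×3 structuring element are assumptions, and all function names are hypothetical. Dark ink is treated as foreground (value 1).

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to luminance values."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row]
            for row in rgb]


def binarize(gray, threshold=128):
    """Mark dark pixels (ink) as 1 and light pixels (paper) as 0."""
    return [[1 if p < threshold else 0 for p in row] for row in gray]


def _morph(img, combine):
    """Apply min or max over each pixel's 3x3 neighborhood.

    The window is clipped at the image border, so pixels outside the
    image are simply ignored.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(r - 1, 0), min(r + 2, h))
                      for cc in range(max(c - 1, 0), min(c + 2, w))]
            out[r][c] = combine(window)
    return out


def erode(img):
    """Keep a foreground pixel only if its whole neighborhood is foreground."""
    return _morph(img, min)


def dilate(img):
    """Set a pixel to foreground if any neighbor is foreground."""
    return _morph(img, max)


def preprocess(rgb, threshold=128):
    """Grayscale, binarize, then erode and dilate to drop isolated specks."""
    return dilate(erode(binarize(to_gray(rgb), threshold)))
```

Erosion followed by dilation removes specks smaller than the structuring element while restoring the size of larger strokes; whether that order, or the reverse, suits a given scan depends on the noise in the image.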
6. A personal library building apparatus, the apparatus comprising:
an acquisition unit configured to acquire a text image;
a correction unit configured to correct transmission deformation and bending deformation of the text image to obtain a corrected text image;
a cutting unit configured to select four vertices as the boundary of the quadrilateral area to be recognized in the corrected text image, cut away the part outside the boundary, and acquire the part of the text image to be recognized within the boundary;
a morphological preprocessing unit configured to perform morphological preprocessing on the text image part to be recognized within the boundary to obtain a preprocessed text image;
an OCR recognition unit configured to perform Optical Character Recognition (OCR) on the preprocessed text image to acquire a recognized text library;
and a construction unit configured to perform traversal retrieval on the text library so as to perform digital construction of a personal library.
7. The personal library building apparatus of claim 6, wherein the acquisition unit is configured to acquire a text image, specifically including: selecting a mobile phone with a lens as a sampling instrument; selecting a stable strong light source with adjustable light intensity as a background light source; fixing the shooting angle with a shooting bracket so that the picture is taken from directly above, with the lens positioned over the center of the sampled object; flattening the text with a page presser; and taking the picture with a Bluetooth switch so as to eliminate shake of the mobile phone.
8. The personal library building apparatus of claim 6, wherein the correcting unit is configured to correct the transmission distortion of the text image, and specifically comprises:
constructing a transformation matrix for the affine transformation from A(x, y) to A′(x′, y′):
Figure FDA0002591541350000041
an affine transformation matrix is calculated by using four vertex coordinates of the quadrangle:
Figure FDA0002591541350000042
constructing a linear equation set, wherein the coordinates of the vertices of the original quadrilateral are $[x_i, y_i]^T$, $i = 1, 2, 3, 4$, which are to be transformed into $[x'_i, y'_i]^T$, $i = 1, 2, 3, 4$; the system of equations is constructed as follows:
Figure FDA0002591541350000043
expressing the system of equations in matrix form, wherein the corresponding unknown vector, constant vector and constant matrix are respectively:
$x = [a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}]^T$
$b = [x'_1, x'_2, x'_3, x'_4, y'_1, y'_2, y'_3, y'_4]^T$
Figure FDA0002591541350000051
the constructed linear system of equations is:
Ax=b,
solving the equation set to obtain
$x = A^{-1} b$.
9. The personal library building apparatus of claim 6, wherein the correction unit is configured to correct the text image for warping, and specifically includes:
specifying the grid template, with (x, y) and (u, v) denoting a selected point in the two images respectively:
Figure FDA0002591541350000052
least-squares fitting: the coefficients in the above equation are unknown and are chosen so that the fitting error is minimal, which gives the least-squares criterion:
Figure FDA0002591541350000053
since the error terms for x and y are required to be minimal:
Figure FDA0002591541350000054
then, there are:
Figure FDA0002591541350000055
Figure FDA0002591541350000061
selecting control points: the number of control point pairs is denoted by L; s = 0, 1, 2, ...; t = 0, 1, 2, ..., n − s;
traversal mapping calculation: setting the order to 3, solving for the coefficients of the polynomial equation, and traversing the image by computer to complete the correction.
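The "traversal" in the last step of claim 9 can be read as visiting every pixel of the corrected image and looking up its source location through the fitted mapping. The Python sketch below shows that loop under assumptions not stated in the claims: the mapping is applied from output coordinates to source coordinates (inverse mapping), sampling is nearest-neighbor, and pixels that map outside the source are filled with a background value. The function name `remap` and the callable `mapping` are hypothetical.

```python
def remap(image, mapping, fill=255):
    """Resample an image by traversing every output pixel.

    image   -- list of rows of pixel values
    mapping -- callable taking output (x, y) and returning source (u, v),
               e.g. a fitted third-order polynomial
    fill    -- value used where the source location falls outside the image
    """
    h, w = len(image), len(image[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            u, v = mapping(x, y)
            # Nearest-neighbor sampling of the source image.
            col, row = int(round(u)), int(round(v))
            if 0 <= row < h and 0 <= col < w:
                out[y][x] = image[row][col]
    return out
```

Traversing the output image rather than the source avoids holes in the result; bilinear interpolation could replace the rounding step where smoother text edges are wanted.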
10. The personal library building apparatus of claim 6,
the morphological preprocessing unit is specifically configured to perform morphological preprocessing on the text image part to be recognized within the boundary, including: grayscale processing, binarization, and erosion and dilation, to obtain the preprocessed text image;
the OCR recognition unit is specifically configured to perform Optical Character Recognition (OCR) on the preprocessed text image using the OCR software CamScanner to acquire the recognized text library;
the construction unit is specifically configured to perform keyword-based traversal retrieval on the text library so as to perform digital construction of the personal library; the keywords include: title, author name, and content type.
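The keyword-based traversal retrieval performed by the construction unit can be illustrated with a short Python sketch. The record layout is hypothetical: the claims name only the keyword kinds (title, author name, content type), so the `Record` fields and the `search` function below are assumptions made for illustration, and matching is a simple case-insensitive substring test.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One recognized document in the text library (hypothetical schema)."""
    title: str
    author: str
    content_type: str
    text: str


def search(library, keyword, fields=("title", "author", "content_type")):
    """Traverse the library and return records whose fields contain keyword."""
    needle = keyword.casefold()
    return [rec for rec in library
            if any(needle in getattr(rec, name).casefold() for name in fields)]
```

Adding `"text"` to `fields` would extend the same traversal to the full OCR output; for a large library an inverted index would likely replace the linear scan.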
CN202010697101.4A 2020-07-20 2020-07-20 Personal library construction method and device Active CN111815507B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010697101.4A CN111815507B (en) 2020-07-20 2020-07-20 Personal library construction method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010697101.4A CN111815507B (en) 2020-07-20 2020-07-20 Personal library construction method and device

Publications (2)

Publication Number Publication Date
CN111815507A true CN111815507A (en) 2020-10-23
CN111815507B CN111815507B (en) 2023-06-20

Family

ID=72865670

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010697101.4A Active CN111815507B (en) 2020-07-20 2020-07-20 Personal library construction method and device

Country Status (1)

Country Link
CN (1) CN111815507B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101458770A (en) * 2008-12-24 2009-06-17 北京文通科技有限公司 Character recognition method and system
CN106202184A (en) * 2016-06-27 2016-12-07 华中科技大学 A kind of books personalized recommendation method towards libraries of the universities and system
CN111199297A (en) * 2018-11-20 2020-05-26 龙卫兵 Library self-service method

Also Published As

Publication number Publication date
CN111815507B (en) 2023-06-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant