CN113989484A - Ancient book character recognition method and device, computer equipment and storage medium - Google Patents

Ancient book character recognition method and device, computer equipment and storage medium

Info

Publication number
CN113989484A
CN113989484A (application CN202111286038.6A)
Authority
CN
China
Prior art keywords
image
ancient book
training
model
positioning
Prior art date
Legal status
Pending
Application number
CN202111286038.6A
Other languages
Chinese (zh)
Inventor
程瑞雪
Current Assignee
Gulian Beijing Media Tech Co ltd
Original Assignee
Gulian Beijing Media Tech Co ltd
Priority date
Filing date
Publication date
Application filed by Gulian Beijing Media Tech Co ltd filed Critical Gulian Beijing Media Tech Co ltd
Priority to CN202111286038.6A priority Critical patent/CN113989484A/en
Publication of CN113989484A publication Critical patent/CN113989484A/en
Pending legal-status Critical Current

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/044 — Recurrent networks, e.g. Hopfield networks

Abstract

The invention discloses an ancient book character recognition method and device, computer equipment and a storage medium, wherein the method comprises the following steps: acquiring and preprocessing an ancient book image to be identified; inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image, the positioning frames being used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form; inputting the area image determined by each positioning frame into a character recognition model to obtain the characters contained in the area image; and determining the character recognition result of the ancient book image according to the characters contained in all the area images. The invention positions and recognizes the characters in ancient book document pictures without manual intervention and converts the characters in the pictures into digital text, thereby improving the efficiency of ancient book entry and accelerating ancient book collation.

Description

Ancient book character recognition method and device, computer equipment and storage medium
Technical Field
The invention relates to the field of character recognition, and in particular to an ancient book character recognition method and device, computer equipment and a storage medium.
Background
Chinese culture has a long history, and a large number of ancient books were produced over that long historical period. Ancient books and documents are a common way for modern people to understand and study ancient culture. To facilitate reading and understanding by a wider modern audience, ancient books are collated by professionals (including, but not limited to, the provision of relevant annotations). As an important means of giving ancient books regenerative protection, digitization (converting ancient documents into digital text, databases and the like) is a key part of ancient book collation. However, given China's long history, a great many ancient books survive, and manual entry is time-consuming and labor-intensive; likewise, manual annotation of data consumes considerable labor and places high demands on the literacy and entry quality of the personnel involved.
In the prior art, with the development and popularization of OCR (Optical Character Recognition) technology, characters can be extracted from pictures in different scenes. Chinese patent application No. 202110735014.8 discloses a text recognition method, apparatus, computer device and computer-readable storage medium based on probability calibration. The method obtains an initial recognition image; inputs it into a preset DARTS model, which performs character recognition to obtain a calibration parameter for the characters contained in the image; inputs the image into a preset OCR model, which performs character recognition to obtain a logits probability vector corresponding to those characters; and performs probability calibration on the logits vector according to the calibration parameter, followed by normalization, to obtain the character recognition result. Adding calibration of the character recognition probability addresses the calibration of the recognition error rate and improves the accuracy of character prediction.
Although this probability-calibration-based character recognition method can improve recognition accuracy in different scenes, it does not meet the requirements of ancient book digitization: it cannot learn from annotated ancient document pictures without manual intervention, which hinders the electronic output and storage of ancient documents.
Therefore, to solve the problems in the prior art, a technology that reduces labor and effectively improves ancient book collation efficiency is urgently needed, and recognizing the text content of ancient books with an AI algorithm is of great importance.
Disclosure of Invention
In view of the above, the present invention provides an ancient book character recognition method that reduces labor and effectively improves ancient book collation efficiency. Further, the invention can rely on AI algorithms to recognize the text content of ancient books. The invention also aims to provide an ancient book character recognition apparatus. Further objects of the invention are to provide a computing device and a computer-readable storage medium. The following technical scheme is adopted to achieve these purposes.
According to one aspect of the disclosure, a method for recognizing ancient books by characters is provided, which comprises the following steps:
acquiring and preprocessing an ancient book image to be identified;
inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image; the positioning frame is used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form;
inputting the area image determined by each positioning frame into a character recognition model to acquire characters contained in the area image;
and determining the character recognition result of the ancient book image according to characters contained in all the area images.
According to the ancient book character recognition method of the present invention, preferably, the step of acquiring and preprocessing the image of the ancient book to be recognized comprises:
converting the ancient book image into a black and white image;
and carrying out inclination correction on the black-and-white image.
According to the ancient book character recognition method of the present invention, preferably, the step of inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image comprises:
inputting the ancient book image into the target detection model to obtain one or more groups of vertex coordinate sets; each set of vertex coordinate set comprises four pairs of position coordinates, and each pair of position coordinates is used for representing the position of one vertex in the positioning frame in the ancient book image;
generating the one or more positioning boxes from the one or more sets of vertex coordinates.
According to the ancient book character recognition method of the present invention, preferably, after the step of inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image, the method further comprises:
determining the area image contained in each positioning frame;
storing the area images into a memory mapping database according to a preset sequence; wherein the preset order is determined according to the vertex position of the positioning frame.
According to the ancient book character recognition method of the present invention, preferably, the step of inputting the area image determined by each positioning frame into the character recognition model to obtain the characters contained in the area image comprises:
acquiring the region images from the memory mapping database according to the preset sequence;
inputting the regional image into a depth residual error network model to output corresponding extracted character regional characteristics;
inputting the extracted character region characteristics into an LSTM model to output corresponding sequence relation information;
inputting the sequence relation information into a CTC model or an attention model to output characters contained in the region image.
According to the ancient book character recognition method of the present invention, preferably, the step of inputting the sequence relation information into a CTC model or an attention model to output characters contained in the region image includes:
determining the number of characters contained in the region image;
when the number of characters is larger than a first threshold value, inputting the sequence relation information into the CTC model; otherwise, inputting the sequence relation information into the attention model.
Preferably, the training step of the target detection model includes:
constructing a training picture set, wherein the training picture comprises single-row Chinese characters and/or double-row Chinese character mixtures which are composed of traditional Chinese characters and have different font forms;
marking one or more corresponding positioning training frames for the training pictures, wherein the positioning training frames are used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form;
taking the training pictures as input data, taking one or more groups of vertex coordinate training sets corresponding to the one or more positioning training frames as output data, and training a DBNet model or a CRAFT model; each group of vertex coordinate training set comprises four pairs of position training coordinates, and each pair of position training coordinates is used for representing the position of one vertex in the positioning training frame in the training picture;
and when the error function is smaller than a preset second threshold value, finishing the training.
According to another aspect of the present invention, there is provided an ancient book character recognition apparatus, comprising:
the image acquisition module is suitable for acquiring an ancient book image to be identified;
the target detection module is suitable for inputting the ancient book image into a target detection model so as to obtain one or more positioning frames contained in the ancient book image; the positioning frame is used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form;
the character recognition module is suitable for inputting the area image determined by each positioning frame into a character recognition model so as to obtain characters contained in the area image;
and the integration module is suitable for determining the character recognition result of the ancient book image according to characters contained in all the area images.
According to yet another aspect of the present disclosure, a computing device is provided, which includes a memory, a processor, and computer instructions stored in the memory and executable on the processor, wherein the processor, when executing the instructions, implements the steps of the above ancient book character recognition method.
According to yet another aspect of the present disclosure, a computer-readable storage medium is provided, which stores computer instructions that, when executed by a processor, implement the steps of the above ancient book character recognition method.
According to the invention, a set of training pictures is generated to perform machine learning training on the constructed character positioning model and character recognition model respectively, so that the characters in ancient book document pictures are positioned and recognized without manual intervention and the characters in the pictures are converted into digital text, improving ancient book entry efficiency and accelerating ancient book collation. According to a preferred technical scheme of the invention, the training atlas does not need to be labeled manually, which greatly saves labor input cost.
Drawings
The invention may be better understood by describing exemplary embodiments of the present disclosure in conjunction with the following drawings, in which:
FIG. 1 is a schematic block diagram of a computing device in accordance with a disclosed embodiment of the invention;
FIG. 2 is a schematic flow chart of a method for character recognition of ancient books according to an embodiment of the disclosure;
FIG. 3 is a schematic illustration of an ancient book image according to an embodiment of the disclosure;
FIG. 4 is a schematic view of a location box for target detection of ancient book images, consistent with an embodiment of the present disclosure;
FIG. 5 is a schematic illustration of a region image according to a disclosed embodiment of the invention;
FIG. 6 is a result of text recognition on a region image according to a disclosed embodiment of the invention;
FIG. 7 is a graph of the text recognition results for an ancient book image according to the disclosed embodiment of the invention;
FIG. 8 is a schematic diagram of a training process for a target detection model according to a disclosed embodiment of the invention;
fig. 9 is a schematic diagram of a device for recognizing ancient books by characters according to an embodiment of the disclosure.
Detailed Description
While specific embodiments of the invention are described below, it should be noted that, for the sake of a concise description, not all features of an actual implementation are described in detail. It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Unless otherwise defined, technical or scientific terms used in the claims and the specification have the ordinary meaning understood by those of ordinary skill in the art to which the invention belongs. The terms "first", "second" and similar terms in the description and claims do not denote any order, quantity, or importance, but merely distinguish one element from another. "A", "an" and the like do not denote a limitation of quantity but rather the presence of at least one. "Comprise", "include" and the like mean that the element or item preceding the word covers the elements or items listed after it and their equivalents, without excluding other elements or items. "Connected", "coupled" and the like are not restricted to physical or mechanical connections, nor to direct or indirect connections.
FIG. 1 shows a block diagram of a computing device 100, according to an embodiment of the present description. The components of the computing device 100 include, but are not limited to, memory 110 and processor 120. The processor 120 is coupled to the memory 110 via a bus 130 and a database 150 is used to store data.
Computing device 100 also includes access device 140, access device 140 enabling computing device 100 to communicate via one or more networks 160. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the internet. Access device 140 may include one or more of any type of network interface (e.g., a Network Interface Card (NIC)) whether wired or wireless, such as an IEEE802.11 Wireless Local Area Network (WLAN) wireless interface, a worldwide interoperability for microwave access (Wi-MAX) interface, an ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a bluetooth interface, a Near Field Communication (NFC) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 100 and other components not shown in FIG. 1 may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device architecture shown in FIG. 1 is for purposes of example only and is not limiting as to the scope of the description. Those skilled in the art may add or replace other components as desired.
Computing device 100 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), a mobile phone (e.g., smartphone), a wearable computing device (e.g., smartwatch, smartglasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or PC. Computing device 100 may also be a mobile or stationary server.
Wherein the processor 120 may perform the steps of the method shown in fig. 2.
Fig. 2 shows a schematic flow chart of a method for character recognition of ancient books according to an embodiment of the present application, including steps S1 to S4.
And S1, acquiring and preprocessing the ancient book image to be identified.
The ancient book image includes woodblock-engraved, manuscript (handwritten), stele-rubbing and lead-type (letterpress) pictures containing ancient document content, for example as shown in fig. 3. Preprocessing the ancient book image includes, but is not limited to, binarizing the image to convert it into a black-and-white image, and performing tilt correction on the black-and-white image. Here, "tilt correction" means rotationally adjusting an image photographed at a tilted angle so that its margins are horizontal or vertical.
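The preprocessing step above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it uses a fixed global threshold for binarization (a real system would more likely use Otsu's or adaptive thresholding, e.g. via OpenCV) and shows only the scoring function a simple deskew search would maximize; the function names are hypothetical.

```python
import numpy as np

def binarize(gray, threshold=128):
    """Binarize a grayscale page image (2-D uint8 array) to black and white.

    A fixed global threshold keeps the sketch short; a production system
    would typically use Otsu's method or adaptive thresholding instead.
    """
    return np.where(gray > threshold, 255, 0).astype(np.uint8)

def projection_variance(binary):
    """Deskew score: the variance of per-row ink counts peaks when text
    rows align with the image axes, so a deskew routine can rotate the
    page by small trial angles and keep the angle maximizing this score."""
    ink_per_row = (binary == 0).sum(axis=1)  # black pixels per row
    return float(ink_per_row.var())
```

A deskew loop would then rotate the binarized page over a small angle range (e.g. ±5°) and retain the rotation with the highest `projection_variance`.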
S2, inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image; the positioning frame is used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form.
The purpose of target detection on the ancient book image is to partition the image into positioned areas so that character recognition can later be performed area by area. The target detection model can be obtained by training any suitable existing machine learning model, the aim being to obtain the image information of the different areas by classifying the ancient book image. In this embodiment, the partitioned areas are displayed as positioning frames, and fig. 4 shows a schematic diagram of the positioning frames in an ancient book image. As those skilled in the art will appreciate, ancient book documents are usually typeset vertically, so the positioning frames in this embodiment detect the text areas in each column to obtain continuously arranged text areas having the same font form. The font form may include any of font format, font size and font color. Taking fig. 4 as an example, 3 positioning frames are output for the first column after passing through the target detection model, corresponding to areas with different column counts and different font sizes.
In one example, the output of the object detection model may be one or more sets of vertex coordinates; each set comprises four pairs of position coordinates, and each pair represents the position of one vertex of the positioning frame in the ancient book image. For example, the positioning frame ABCD includes four vertices A, B, C and D, with position coordinates (X_A, Y_A), (X_B, Y_B), (X_C, Y_C) and (X_D, Y_D). The vertex coordinate set corresponding to the positioning frame may then be (X_A, Y_A, X_B, Y_B, X_C, Y_C, X_D, Y_D). It is understood that upon obtaining the set of vertex coordinates, the corresponding positioning frame may be generated directly.
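Generating a positioning frame from a vertex coordinate set can be sketched as below; the helper name is hypothetical and the axis-aligned bounding box is one convenient frame representation, not necessarily the one used in the patent.

```python
def vertex_set_to_box(coords):
    """Turn one vertex coordinate set (X_A, Y_A, X_B, Y_B, X_C, Y_C, X_D, Y_D)
    into the four vertices of the positioning frame plus an axis-aligned
    bounding box (x_min, y_min, x_max, y_max) for cropping the area image."""
    xs, ys = coords[0::2], coords[1::2]   # even slots are X, odd slots are Y
    vertices = list(zip(xs, ys))
    bbox = (min(xs), min(ys), max(xs), max(ys))
    return vertices, bbox
```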
In this embodiment, the image delimited inside each positioning frame is referred to as an area image. One ancient book image includes a plurality of area images, and across many ancient book images the number of area images multiplies accordingly. To increase the reading speed of the area images, this embodiment stores them in a Lightning Memory-Mapped Database (LMDB) according to a preset sequence. The preset sequence is determined by the vertex positions of the positioning frames: for example, based on ancient book reading habits, the frames may be ordered from top to bottom and from right to left and stored sequentially.
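The right-to-left, top-to-bottom ordering described above can be sketched as a sort key over the frame bounding boxes; the column-bucketing tolerance and function name are illustrative assumptions, and the sorted result would then be written to LMDB under sequential keys.

```python
def reading_order(boxes, column_tol=20):
    """Order positioning boxes for storage: columns right-to-left, then
    top-to-bottom within a column, matching classical vertical reading
    order. Each box is (x_min, y_min, x_max, y_max); x_min values within
    roughly column_tol of each other are bucketed into the same column."""
    def key(box):
        column = -round(box[0] / column_tol)  # coarse column index, rightmost first
        return (column, box[1])               # then top edge, top-to-bottom
    return sorted(boxes, key=key)
```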
And S3, inputting the area image determined by each positioning frame into a character recognition model to acquire characters contained in the area image.
The character recognition model can be obtained by training an existing arbitrary feature extraction network model and aims to recognize characters in each regional image. In order to increase the accuracy of the character recognition, the character recognition model in this embodiment may include a plurality of feature extraction models, so as to extract the character features in the region image in a targeted manner. Fig. 5 and 6 show an area image and a recognized text result according to a first embodiment of the present invention, respectively.
In one example, the number of characters in the region image is greater than a first threshold, e.g., 32. In this case, step S3 may include: acquiring the region images from the memory-mapped database according to the preset sequence; inputting each region image into a deep residual network model to output the extracted character region features; inputting the extracted features into an LSTM model to output the corresponding sequence relation information; and inputting the sequence relation information into a CTC model to output the characters contained in the region image. The deep residual network ResNet expresses the output character region features as a linear superposition of the input and a nonlinear transformation of the input; the LSTM model addresses the long-term dependency problem in the character sequence; and the CTC (connectionist temporal classification) model performs neural-network-based sequence classification, which suits long rows or columns with many characters.
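The final CTC step maps per-frame label predictions to an output character sequence. A minimal sketch of standard CTC greedy decoding (collapse consecutive repeats, then drop the blank label) is shown below; this illustrates the general technique, not the patent's exact decoder.

```python
def ctc_greedy_decode(frame_labels, blank=0):
    """Collapse a per-frame argmax label sequence into output labels:
    merge consecutive repeats, then drop blanks (standard CTC decoding).
    A repeated character is kept only when separated by a blank frame."""
    out, prev = [], None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out
```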
In another example, the number of characters in the region image is less than or equal to the first threshold, e.g., 32. In this case, step S3 may include: acquiring the region images from the memory-mapped database according to the preset sequence; inputting each region image into the deep residual network model to output the extracted character region features; inputting the extracted features into the LSTM model to output the corresponding sequence relation information; and inputting the sequence relation information into an attention model to output the characters contained in the region image. Replacing the CTC model with an attention model here avoids the information-loss bottleneck of compressing a long sequence into a fixed-length vector, making it more suitable for text recognition in short rows or columns.
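The two examples above together amount to routing each region by its character count. A one-line sketch of that dispatch follows; the threshold value 32 is the example figure from the description, and the function name is hypothetical.

```python
FIRST_THRESHOLD = 32  # example threshold from the description

def choose_decoder(num_chars, threshold=FIRST_THRESHOLD):
    """Route long columns (many characters) to the CTC head and short
    columns to the attention head, per the two examples in the text."""
    return "ctc" if num_chars > threshold else "attention"
```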
And S4, determining the character recognition result of the ancient book image according to the characters contained in all the area images.
It is understood that, in the case where the ancient book image includes a plurality of area images, the recognition result of each area image is arranged in turn according to the above preset sequence, and the recognition result of the whole ancient book image can be obtained, as shown in fig. 7.
Through the steps, the embodiment can quickly and accurately convert the characters in the picture into the characters of the digital edition, so that the labor input cost is saved. Furthermore, the ancient book input efficiency can be improved, and the ancient book arrangement work is accelerated.
Fig. 8 is a schematic diagram of a training process of a target detection model according to a first embodiment of the present invention. As shown in fig. 8, the target detection model is obtained by training through the following steps:
s810, constructing a training picture set, wherein the training picture comprises single-row Chinese characters and/or double-row Chinese character mixtures which are composed of traditional Chinese characters and have different font forms.
The font models used in the training pictures can be obtained through style-transfer training on a Song-typeface corpus (e.g., Zhonghua Book Company editions), and the transformed models can have different font formats and typesetting formats. The font format may include options such as font style, font size and font aspect ratio, and the typesetting format may include options such as character spacing, column spacing, and single- or double-column settings.
The text file to be generated is input into the page model; after the page generation parameters are determined, the page model automatically combines characters into mixed single- and double-column arrangements and combines the mixed columns into training pictures. A plurality of training pictures forms the training picture set. Each training picture is associated with corresponding original text position information for model verification. The original text position information comprises coordinate information and character content information. In the coordinate information, a specific position in the training picture is represented in an X-Y coordinate system, and the position of the original-text rectangular block formed by each character string is determined by the coordinates of the four corner points of its rectangle.
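Synthesizing ground-truth position information alongside the picture can be sketched as the column-layout step below. All dimensions and names here are illustrative assumptions; a real page model would also render glyphs, vary fonts, and emit double-column blocks.

```python
def layout_columns(texts, page_w=800, char_h=40, col_w=48, margin=20):
    """Place text strings as vertical columns from right to left and return,
    for each column, the text plus its ground-truth rectangle
    (x_min, y_min, x_max, y_max) - i.e. the 'original text position
    information' used later for model verification."""
    records = []
    x = page_w - margin - col_w  # start at the rightmost column
    for text in texts:
        y0 = margin
        y1 = y0 + char_h * len(text)  # one character cell per glyph
        records.append({"text": text, "box": (x, y0, x + col_w, y1)})
        x -= col_w  # next column to the left
    return records
```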
And S820, marking one or more corresponding positioning training frames for the training pictures, wherein the positioning training frames are used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form.
By segmenting and positioning the training picture, it is divided into a plurality of columns, i.e., a plurality of positioning training frames, and the position information of each positioning training frame, comprising the coordinates of its four corner points, is recorded.
S830, taking the training pictures as input data, taking one or more groups of vertex coordinate training sets corresponding to the one or more positioning training frames as output data, and training a DBNet model or a CRAFT model; and each group of vertex coordinate training set comprises four pairs of position training coordinates, and each pair of position training coordinates is used for representing the position of one vertex in the positioning training frame in the training picture.
The character positioning model is trained by comparing each identified positioning training frame (comprising the coordinates of its four corner points and the relative coordinates of each character) with the corresponding original text position information (comprising the coordinates of the four corner points of the original rectangular text block and the relative coordinates of each character). Through repeated training and learning, the character positioning model automatically adjusts and refines the parameters of the positioning module, achieving autonomous artificial-intelligence learning.
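Comparing a predicted frame with its ground-truth rectangle is commonly done with intersection-over-union (IoU), which could also back the error function of step S840; the patent does not name its metric, so this is a standard stand-in sketch.

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x0, y0, x1, y1);
    1.0 means the predicted positioning frame matches ground truth exactly,
    0.0 means no overlap."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```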
And S840, finishing the training when the error function is smaller than a preset second threshold value.
The structure of the text positioning model in this embodiment may include a DB (Differentiable Binarization, i.e., DBNet) network model and a CRAFT (Character Region Awareness For Text detection) model, which segment and position the picture and output the segmented picture blocks and positioning coordinates. The CRAFT model directly uses a pre-trained model; it overcomes sensitivity to character shape and uses a Gaussian heatmap to represent the link relation between the characters' position information. The DB network model is trained on the training picture set and the corresponding text position information to determine its specific parameters.
In this embodiment, the training picture set is automatically synthesized through the page model, which avoids the heavy effort of manually labeling real data, improves model training efficiency, and effectively saves labor cost.
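Automatic synthesis of a labeled training picture set can be sketched as below: dark column blocks stand in for rendered columns of traditional characters, and the four vertices of each pasted block are recorded as its positioning-frame label, so no manual annotation is needed. The page layout and the glyph stand-ins are assumptions for illustration only.

```python
# Sketch: synthesize a white "page" with vertical column blocks and record the
# four-vertex quad of each block as its positioning-frame label.
import random

def synthesize_page(width=200, height=300, columns=3, rng=None):
    rng = rng or random.Random(0)
    page = [[255] * width for _ in range(height)]   # white page, 0..255 grayscale
    labels = []                                     # one 4-vertex quad per column
    col_w = width // (columns + 1)
    for c in range(columns):
        x = col_w * (c + 1) - col_w // 4            # column left edge
        y, h, w = 10, rng.randint(150, height - 20), col_w // 2
        for yy in range(y, y + h):                  # darken the column region
            for xx in range(x, x + w):
                page[yy][xx] = 0
        # Four (x, y) vertices, clockwise from top-left, matching the
        # "four pairs of position training coordinates" of step S830.
        labels.append([(x, y), (x + w, y), (x + w, y + h), (x, y + h)])
    return page, labels
```

A real generator would render traditional-Chinese glyphs with period fonts and layouts; the point is that the labels come for free from the synthesis step.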
Corresponding to the above method, the present specification further provides an embodiment of an ancient book character recognition apparatus, and fig. 9 shows a schematic diagram of an ancient book character recognition apparatus 90 according to an embodiment of the disclosure. As shown in fig. 9, the ancient book character recognition apparatus 90 comprises: an image acquisition module, a target detection module, a character recognition module and an integration module.
The image acquisition module 91 is suitable for acquiring an ancient book image to be identified;
a target detection module 92, adapted to input the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image; the positioning frame is used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form;
a character recognition module 93, adapted to input the area image determined by each of the positioning boxes into a character recognition model, so as to obtain characters included in the area image;
an integrating module 94, adapted to determine a text recognition result of the ancient book image according to the texts contained in all the area images.
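How the four modules (91 to 94) hand data to one another can be sketched as a single pipeline function. The three callables are hypothetical placeholders standing in for the trained target detection and character recognition models; they are not the patent's actual interfaces.

```python
# Sketch of the apparatus data flow:
# image acquisition (91) -> target detection (92) -> character recognition (93)
# -> integration (94).
from typing import Callable, List, Tuple

Quad = List[Tuple[int, int]]  # four (x, y) vertices of one positioning frame

def recognize_ancient_book(image,
                           detect: Callable[[object], List[Quad]],
                           crop: Callable[[object, Quad], object],
                           recognize: Callable[[object], str]) -> str:
    boxes = detect(image)                               # target detection module
    texts = [recognize(crop(image, box)) for box in boxes]  # recognition module
    return "".join(texts)                               # integration module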
An embodiment of the present application further provides a computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the ancient book character recognition method described above.
The above is an illustrative scheme of the computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium belongs to the same concept as the technical solution of the ancient book character recognition method; for details not described in the storage-medium solution, reference may be made to the description of the method.
The foregoing description has been directed to specific embodiments of this disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The computer instructions comprise computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be appropriately increased or decreased as required by legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, legislation and patent practice dictate that computer-readable media do not include electrical carrier signals and telecommunications signals.
In summary, according to the exemplary embodiment, a training picture set is generated to perform machine learning training on the constructed character positioning model and character recognition model respectively, so that the characters in an ancient book document picture are positioned and recognized without manual intervention and converted into digital text. This improves ancient book entry efficiency and accelerates ancient book arrangement work; and since no manual labeling of the training atlas is needed, manual input cost is greatly reduced.
It is to be noted that in the apparatus and method disclosed herein, the individual components or steps may evidently be decomposed and/or recombined. Such decomposition and/or recombination should be considered equivalents of the present disclosure. Also, the steps of the series of processes described above may naturally be executed chronologically in the order described, but need not be; some steps may be performed in parallel or independently of each other.
The above detailed description should not be construed as limiting the scope of the disclosure. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of the present disclosure.

Claims (10)

1. A character recognition method for ancient books is characterized by comprising the following steps:
acquiring and preprocessing an ancient book image to be identified;
inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image; the positioning frame is used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form;
inputting the area image determined by each positioning frame into a character recognition model to acquire characters contained in the area image;
and determining the character recognition result of the ancient book image according to characters contained in all the area images.
2. The character recognition method of claim 1, wherein the step of obtaining and pre-processing the image of the ancient book to be recognized comprises:
converting the ancient book image into a black-and-white image;
and carrying out inclination correction on the black-and-white image.
3. The method of claim 1, wherein the step of inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image comprises:
inputting the ancient book image into the target detection model to obtain one or more groups of vertex coordinate sets; each set of vertex coordinate set comprises four pairs of position coordinates, and each pair of position coordinates is used for representing the position of one vertex in the positioning frame in the ancient book image;
generating the one or more positioning boxes from the one or more sets of vertex coordinates.
4. The method of claim 3, wherein the step of inputting the ancient book image into a target detection model to obtain one or more positioning frames contained in the ancient book image further comprises:
determining the area image contained in each positioning frame;
storing the area images into a memory mapping database according to a preset sequence; wherein the preset order is determined according to the vertex position of the positioning frame.
5. The character recognition method of claim 4, wherein the step of inputting the area image determined by each positioning box into a character recognition model to obtain the characters contained in the area image comprises:
acquiring the region images from the memory mapping database according to the preset sequence;
inputting the regional image into a deep residual network model to output the corresponding extracted character region features;
inputting the extracted character region characteristics into an LSTM model to output corresponding sequence relation information;
inputting the sequence relation information into a CTC model or an attention model to output characters contained in the region image.
6. The character recognition method of claim 5, wherein the step of inputting the sequence relation information into a CTC model or an attention model to output the characters contained in the region image comprises:
determining the number of characters contained in the region image;
when the number of characters is larger than a first threshold value, inputting the sequence relation information into the CTC model; otherwise, inputting the sequence relation information into the attention model.
7. The character recognition method of any one of claims 1-6, wherein the training step of the target detection model comprises:
constructing a training picture set, wherein the training picture comprises single-row Chinese characters and/or double-row Chinese character mixtures which are composed of traditional Chinese characters and have different font forms;
marking one or more corresponding positioning training frames for the training pictures, wherein the positioning training frames are used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form;
taking the training pictures as input data, taking one or more groups of vertex coordinate training sets corresponding to the one or more positioning training frames as output data, and training a DBNet model or a CRAFT model; each group of vertex coordinate training set comprises four pairs of position training coordinates, and each pair of position training coordinates is used for representing the position of one vertex in the positioning training frame in the training picture;
and when the error function is smaller than a preset second threshold value, finishing the training.
8. An ancient book character recognition device, comprising:
the image acquisition module is suitable for acquiring an ancient book image to be identified;
the target detection module is suitable for inputting the ancient book image into a target detection model so as to obtain one or more positioning frames contained in the ancient book image; the positioning frame is used for representing Chinese character areas which are longitudinally and continuously arranged and have the same font form;
the character recognition module is suitable for inputting the area image determined by each positioning frame into a character recognition model so as to obtain characters contained in the area image;
and the integration module is suitable for determining the character recognition result of the ancient book image according to characters contained in all the area images.
9. A computing device comprising a memory, a processor and computer instructions stored on the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the instructions.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 7.
CN202111286038.6A 2021-11-02 2021-11-02 Ancient book character recognition method and device, computer equipment and storage medium Pending CN113989484A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111286038.6A CN113989484A (en) 2021-11-02 2021-11-02 Ancient book character recognition method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113989484A true CN113989484A (en) 2022-01-28

Family

ID=79745572

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111286038.6A Pending CN113989484A (en) 2021-11-02 2021-11-02 Ancient book character recognition method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113989484A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114758339A (en) * 2022-06-15 2022-07-15 深圳思谋信息科技有限公司 Method and device for acquiring character recognition model, computer equipment and storage medium
CN115147852A (en) * 2022-03-16 2022-10-04 北京有竹居网络技术有限公司 Ancient book identification method, ancient book identification device, ancient book storage medium and ancient book storage equipment
CN115410216A (en) * 2022-10-31 2022-11-29 天津恒达文博科技股份有限公司 Ancient book text informatization processing method and system, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination