CN116704515A - Method, device and storage medium for identifying and retrieving Chinese characters in calligraphy works - Google Patents


Info

Publication number
CN116704515A
CN116704515A
Authority
CN
China
Prior art keywords
image
vector data
character
font
calligraphy
Prior art date
Legal status
Pending
Application number
CN202310534872.5A
Other languages
Chinese (zh)
Inventor
陈映庭
陈勇平
郑倩萍
Current Assignee
Guangzhou Huiyi Culture Technology Co ltd
Original Assignee
Guangzhou Huiyi Culture Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Huiyi Culture Technology Co ltd filed Critical Guangzhou Huiyi Culture Technology Co ltd
Priority to CN202310534872.5A
Publication of CN116704515A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/0464: Convolutional networks [CNN, ConvNet]
    • G06N 3/08: Learning methods
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82: Arrangements for image or video recognition or understanding using neural networks
    • G06V 30/00: Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10: Character recognition
    • G06V 30/14: Image acquisition
    • G06V 30/148: Segmentation of character regions
    • G06V 30/153: Segmentation of character regions using recognition of characters or words
    • G06V 30/16: Image preprocessing
    • G06V 30/22: Character recognition characterised by the type of writing
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a method, a device and a storage medium for identifying and retrieving the Chinese characters in a calligraphy work. The method comprises the following steps: preprocessing a calligraphy work image to be processed, inputting it into a pre-trained calligraphy character object recognition model, frame-selecting each font area in the calligraphy work image, and performing cutting and normalization to obtain single-font images; inputting the single-character images into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image; comparing the single-character vector data with a preset handwriting character vector database to obtain a vector data index; and acquiring font information from a preset resource library according to the vector data index. The invention effectively solves the problems of single function, low efficiency and incomplete fonts and content that exist in the prior art when recognizing Chinese calligraphy characters.

Description

Method, device and storage medium for identifying and retrieving Chinese characters in calligraphy works
Technical Field
The invention relates to the technical field of data processing, in particular to a method and a device for identifying and retrieving Chinese characters in a calligraphy work and a storage medium.
Background
Chinese calligraphy is a cultural treasure shared by all. Protecting and passing on Chinese calligraphy is not only a responsibility toward national culture and art, but also part of preserving, developing and handing down an outstanding form of human culture.
Chinese calligraphic works through the dynasties include, but are not limited to, seal script, clerical script, cursive script and running script, covering roughly ten thousand commonly used simplified and traditional Chinese characters. From the perspective of appreciation, especially when appreciating works in ancient scripts, seal script and cursive script, modern readers find them difficult to understand, and learning them is harder still. At the stage of learning calligraphy, it is difficult to look up how the ancients wrote an unfamiliar character, particularly one that an input method cannot produce.
The Chinese calligraphy character recognition and scoring systems disclosed in the prior art suffer from limited functionality: they fail to recognize cursive script and ancient characters, cannot recognize the characters of calligraphy works in batches, are inefficient, and are not combined with a basic Chinese character library, a Kangxi Dictionary library and a Shuowen Jiezi character-decomposition library to comprehensively display the related information of each recognized character.
Therefore, there is a need for a batch recognition and retrieval method for the Chinese characters in calligraphy works that is highly accurate, well evidenced, comprehensive in its references and easy to use, so as to ease the difficulty ordinary people and calligraphy enthusiasts face in appreciating and learning calligraphy works.
Disclosure of Invention
The embodiment of the invention provides a method, a device and a storage medium for identifying and retrieving the Chinese characters in a calligraphy work, which are used to solve the problems of single function, low efficiency and incomplete fonts and content that exist in the prior art when recognizing Chinese calligraphy characters.
A method for identifying and retrieving Chinese characters in a calligraphy work, the method comprising:
acquiring a to-be-processed calligraphy work image, and preprocessing the calligraphy work image;
inputting the preprocessed calligraphic work image into a pre-trained calligraphic word object recognition model, and selecting each font area in the calligraphic work image by a frame;
cutting and normalizing according to each font area to obtain a single font image;
inputting the single-character images into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image;
comparing the single-word vector data with a preset handwriting word vector database to obtain a vector data index corresponding to the single-word vector data;
and acquiring font information from a preset resource library according to the vector data index.
Optionally, the preprocessing the calligraphy work image includes:
adjusting the size of the calligraphic work image so that the image pixels are below a preset pixel threshold;
filtering red of the calligraphic work image with the adjusted size, and converting the calligraphic work image into a gray level image;
and performing open operation and close operation on the gray level image to remove noise points and obtain a binarized image.
Optionally, the calligraphy character object recognition model adopts a target object recognition neural network, and the training process comprises the following steps:
obtaining inscription images of calligraphy Chinese characters;
marking calligraphy character labels on the inscription images with a labelling tool, wherein the calligraphy character labels comprise seal script, clerical script, cursive script, running script and regular script;
converting the marked inscription image into an XML data format to obtain a model learning data set;
dividing the model learning data set into a training set, a verification set and a test set according to a preset proportion;
and transmitting the training set, the verification set and the test set into the target object recognition neural network for training to obtain a handwriting word object recognition model.
Optionally, the clipping and normalizing processing is performed according to each font area, and obtaining the single-font image includes:
cutting each font area selected by the frame to obtain an area image;
performing binarization processing and denoising processing on the regional image;
acquiring angular point information of single fonts in the area image, and cutting a font image from the area image according to the angular point information;
correcting the font image into a square image according to the length information or the width information of the font image;
and stretching or shrinking the square image to a first preset size, and shrinking the square image with the first preset size to a second preset size to obtain the single character image.
Optionally, the corner information includes upper left-most corner information and lower right-most corner information of the font;
the obtaining the corner information of the single font in the area image, and the cutting the font image from the area image according to the corner information comprises the following steps:
performing corner detection and corner identification on the area image;
traversing each pixel in the area image to obtain pixels with corner marks and coordinate information thereof;
acquiring an X-axis minimum value and a maximum value and a Y-axis minimum value and a maximum value from the coordinate information of the pixel with the corner mark;
the X-axis minimum value and the Y-axis maximum value form the leftmost upper corner information, and the X-axis maximum value and the Y-axis minimum value form the rightmost lower corner information.
Optionally, the multidimensional multi-stream handwriting character recognition model adopts a convolutional neural network as its backbone network structure and adds multi-stream tensor outputs, wherein the first tensor output is used for model training and the second tensor output provides an output of shorter length that is easy to store.
Optionally, the comparing the single-word vector data with a preset handwriting vector database, and obtaining a vector data index corresponding to the single-word vector data includes:
comparing the single-word vector data with each handwriting vector in a preset handwriting vector database, and calculating a Euclidean distance;
acquiring a handwriting word vector corresponding to the minimum Euclidean distance value and a vector data index thereof, and taking the handwriting word vector and the vector data index thereof as a vector data index corresponding to the single word vector data;
the handwriting vectors in the preset handwriting vector database are vector data which are output by the second path tensor after the handwriting images pass through the multidimensional multi-stream handwriting recognition model, and the vector data indexes are relative storage paths of the handwriting images in the preset resource library.
Optionally, the preset resource library comprises a basic Chinese character library, a dictionary library and a Shuowen Jiezi (character decomposition) library;
the font information comprises characters, pictures, similarity values, pinyin, basic definitions, detailed definitions, dictionary entries and Shuowen Jiezi character-decomposition content.
A device for identifying and retrieving the Chinese characters in a calligraphy work, the device comprising:
the pretreatment module is used for acquiring a to-be-treated calligraphy work image and carrying out pretreatment on the calligraphy work image;
the frame selection module is used for inputting the preprocessed calligraphic work image into a pre-trained calligraphic word object recognition model, and selecting each font area in the calligraphic work image in a frame mode;
the clipping module is used for clipping and normalizing according to each font area to obtain a single font image;
the recognition module is used for inputting the single-character images into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image;
the comparison module is used for comparing the single-word vector data with a preset handwriting vector database to obtain a vector data index corresponding to the single-word vector data;
and the acquisition module is used for acquiring font information from a preset resource library according to the vector data index.
A computer readable storage medium storing a computer program which, when executed by a processor, implements the method for identifying and retrieving the Chinese characters in a calligraphy work as described above.
According to the embodiment of the invention, a calligraphy work image to be processed is acquired and preprocessed; the preprocessed calligraphy work image is input into a pre-trained calligraphy character object recognition model, and each font area in the calligraphy work image is frame-selected; cutting and normalization are performed on each font area to obtain single-font images; the single-character images are input into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image; the single-character vector data are compared with a preset handwriting character vector database to obtain a vector data index corresponding to the single-character vector data; and font information is acquired from a preset resource library according to the vector data index. The Chinese characters of calligraphy works are thus recognized in batches, which effectively improves recognition efficiency; by linking to multiple resource libraries, the recognition functions are enriched, recognition accuracy is improved, and more comprehensive character information is provided to users, solving the difficulty that ordinary people and calligraphy enthusiasts face in learning and appreciating calligraphy works.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for identifying and retrieving Chinese characters in a calligraphy work according to an embodiment of the invention;
FIG. 2 is a schematic diagram of a training process of a calligraphy object recognition model according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of clipping and normalizing process according to an embodiment of the present invention;
FIG. 4 is a training schematic diagram of a multi-dimensional multi-stream handwriting recognition model according to an embodiment of the present invention;
FIG. 5 is a diagram showing the result of identifying and retrieving Chinese characters in a calligraphy work according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a device for identifying and retrieving Chinese characters in a calligraphy work according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a computer device in accordance with an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
According to the method for identifying and retrieving the Chinese characters in calligraphy works provided by the embodiment of the invention, the calligraphy work image to be processed is preprocessed; the preprocessed calligraphy work image is input into a pre-trained calligraphy character object recognition model, and each font area in the calligraphy work image is frame-selected; cutting and normalization are performed on each font area to obtain single-font images; the single-character images are input into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image; the single-character vector data are compared with a preset handwriting character vector database to obtain a vector data index corresponding to the single-character vector data; and font information is acquired from a preset resource library according to the vector data index. The Chinese characters of calligraphy works are thus recognized in batches, which effectively improves recognition efficiency; by linking to multiple resource libraries, the recognition functions are enriched, recognition accuracy is improved, and more comprehensive character information can be provided, giving ordinary people and calligraphy enthusiasts assistance when appreciating and learning calligraphy works and solving the difficulty of recognizing characters.
The following describes in detail the method for identifying and searching Chinese characters in the calligraphy works provided in this embodiment, as shown in fig. 1, the method for identifying and searching Chinese characters in the calligraphy works includes:
in step S101, a calligraphy work image to be processed is acquired, and the calligraphy work image is preprocessed.
Here, the preprocessing includes resizing and binarizing the calligraphic work image. Optionally, preprocessing the image of the calligraphic work in step S101 includes:
in step S1011, the size of the calligraphic work image is adjusted so that the image pixels are below a preset pixel threshold.
For a calligraphy work image with an oversized pixel count, the size is adjusted so that the image falls below a preset pixel threshold, for example 5000 pixels, in order to save computational resources.
In step S1012, red filtering is performed on the resized calligraphy work image, and the image is converted into a grayscale image.
In this embodiment, red filtering refers to removing the red pixels in the calligraphy work image. A calligraphy enthusiast typically writes on a red rice-character practice grid, so a small number of red pixels appear in the calligraphy work image. These red pixels therefore need to be removed, that is, red filtering is performed, before the image is binarized into a grayscale image. Optionally, as a preferred example of the present invention, the red filtering can be implemented as follows:
First, the red channel is obtained by splitting the image; then cv2.threshold is called with the threshold set to 0 and the additional flag cv2.THRESH_OTSU, so that the optimal threshold is found automatically by the Otsu algorithm; the threshold is then scaled to 95% of its value, which was found to give the best effect in practice:
blue_c, green_c, red_c = cv2.split(image)  # take the red channel
thresh, ret = cv2.threshold(red_c, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # thresh is the Otsu threshold value
filter_condition = int(thresh * 0.95)  # use 95% of the Otsu threshold
_, red_thresh = cv2.threshold(red_c, filter_condition, 255, cv2.THRESH_BINARY)
Finally, the image is expanded back to 3 channels:
result_img = np.expand_dims(red_thresh, axis=2)
result_img = np.concatenate((result_img, result_img, result_img), axis=-1)
in step S1013, an open operation and a close operation are performed on the gray scale map to remove noise and obtain a binary image.
In this embodiment, the opening operation erodes and then dilates the grayscale image, which eliminates small objects and smooths the boundaries of larger objects; the closing operation dilates and then erodes, which eliminates small holes, bridges narrow breaks and gaps, and fills broken contour lines. Noise in the grayscale image is thereby removed and a binarized image is obtained.
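For illustration only, the following minimal sketch (not taken from the patent; the kernel size and function names are assumptions) shows how the opening and closing operations described above could be applied with OpenCV:
import cv2
import numpy as np

def binarize_and_denoise(gray):
    # Otsu binarization of the grayscale image (white background, black strokes)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    kernel = np.ones((3, 3), np.uint8)  # small structuring element, assumed size
    # Opening (erode then dilate): removes small noise specks
    opened = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
    # Closing (dilate then erode): fills small holes and bridges narrow breaks
    return cv2.morphologyEx(opened, cv2.MORPH_CLOSE, kernel)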
In step S102, the preprocessed calligraphic work image is input to a pre-trained calligraphic word object recognition model, and each font area in the calligraphic work image is selected in a frame.
Optionally, the calligraphy character object recognition model adopts a target object recognition neural network. As a preferred example of the present invention, the training process of the calligraphy object recognition model includes:
obtaining a inscription image of a calligraphy Chinese character;
marking a calligraphy character label of the inscription image by a label tool, wherein the calligraphy character label comprises seal characters, clerks, grasses, lines, regular script characters;
converting the marked inscription image into an XML data format to obtain a model learning data set;
dividing the model learning data set into a training set, a verification set and a test set according to a preset proportion;
and transmitting the training set, the verification set and the test set into the target object recognition neural network for training to obtain a handwriting word object recognition model.
The target object recognition neural network includes, but is not limited to, the YOLOv5 object detection network and similar detection networks. Taking the YOLOv5 object detection network as an example, for the calligraphy Chinese character inscription images this embodiment defines a custom label library, defining seal script, clerical script, running script, regular script and ancient script as the calligraphy character labels, annotates the inscription images with a labelling tool, and generates the xml data format required by the calligraphy character object recognition model, yielding a standardized YOLOv5 model learning data set. The data set is then randomly split into a training set, a validation set and a test set at a ratio of 8:1:1, and the correct image paths are configured. The configuration parameters of each set are passed into the YOLOv5 object detection network for training, and the calligraphy character object recognition model is obtained. After testing the pre-trained weights for the network structure, yolov5m.pt is selected; the pre-training model parameter file yolov5m_mask.yaml is modified with nc=1, and training uses the SGD optimizer. For ease of understanding, fig. 2 is a schematic diagram of the training flow of the calligraphy character object recognition model according to an embodiment of the present invention.
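As an illustrative sketch of the 8:1:1 split described above (the directory layout and file names are assumptions, not values given in the patent), the image list could be partitioned as follows; training would then typically be launched with YOLOv5's train.py pointing at a data YAML that references these lists, with the yolov5m.pt weights and nc=1 as described:
import random
from pathlib import Path

image_paths = sorted(Path("datasets/calligraphy/images").glob("*.jpg"))  # assumed layout
random.seed(0)
random.shuffle(image_paths)

n = len(image_paths)
splits = {
    "train": image_paths[:int(0.8 * n)],              # 8 parts
    "val":   image_paths[int(0.8 * n):int(0.9 * n)],  # 1 part
    "test":  image_paths[int(0.9 * n):],              # 1 part
}
for name, paths in splits.items():
    # YOLOv5 accepts plain text files listing one image path per line
    Path(f"{name}.txt").write_text("\n".join(str(p) for p in paths))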
According to the embodiment, through the trained calligraphy character object recognition model, each font area in the calligraphy work image is selected through a frame.
In step S103, clipping and normalization processing are performed according to each font area, resulting in a single font image.
In this embodiment, the single-character image areas are cropped according to each frame-selected font area to obtain single-character images, and the image sizes are normalized. Optionally, as a preferred example of the present invention, the clipping and normalization according to the font areas in step S103, to obtain a single-font image, includes:
in step S1031, clipping each font area selected by the frame to obtain an area image;
in step S1032, binarizing and denoising the region image;
in step S1033, obtaining corner information of a single font in the area image, and cutting a font image from the area image according to the corner information;
in step S1034, correcting the font image into a square image according to the length information or the width information of the font image;
in step S1035, the square image is stretched or reduced to a first preset size, and then the square image with the first preset size is reduced to a second preset size, so as to obtain a single character image.
Firstly, carrying out binarization processing and denoising processing on a region image obtained by clipping, and then obtaining the corner information of a single character body in the region image through the corner calculation of the cornerHarris. When the corner information of a single font in the region image is searched, the corner information comprises the leftmost upper corner information and the rightmost lower corner information of the font. The step S1033 of obtaining the corner information of the single font in the area image, and the step of cutting the font image from the area image according to the corner information includes:
in step S301, performing corner detection and corner identification on the area image;
in step S302, each pixel in the area image is traversed, and a pixel with a corner mark and its coordinate information are obtained;
in step S303, an X-axis minimum value and a maximum value, and a Y-axis minimum value and a maximum value are obtained from the coordinate information of the pixel with the corner mark;
in step S304, the X-axis minimum value and the Y-axis maximum value are combined into upper left-most corner information, and the X-axis maximum value and the Y-axis minimum value are combined into lower right-most corner information.
Here, the corner mark is preferably a red mark. In this embodiment, the length and the height of the area image are acquired first, and the area image is subjected to corner detection and corner identification. And then all pixels in the region image are cycled, all pixels with corner marks are obtained, X-axis minimum values, X-axis maximum values, Y-axis minimum values and Y-axis maximum values in the coordinate information of the pixels with the corner marks are selected to form diagonal coordinates, and leftmost upper corner information (X-axis minimum values, Y-axis maximum values) and rightmost lower corner information (X-axis maximum values, Y-axis minimum values) are obtained.
After the positions of the leftmost upper corner and the rightmost lower corner in the region image are obtained through cornerHarris corner calculation, the character is cropped out individually, and its length and width are compared so that a square image is generated with either the width or the length as the side. The first preset size is 360×360 and the second preset size is 224×224. The square image is stretched or shrunk to 360×360 and then shrunk to 224×224, yielding a standardized black-on-white single-font image. Optionally, for ease of understanding, fig. 3 is a schematic flow chart of the clipping and normalization process provided in an embodiment of the present invention.
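A minimal sketch of this cropping and normalization step, using Harris corner detection and the bounding box of the marked corner pixels (the threshold values, padding choice and function names are illustrative assumptions rather than the patent's exact implementation):
import cv2
import numpy as np

def crop_and_normalize(region_img):
    gray = cv2.cvtColor(region_img, cv2.COLOR_BGR2GRAY)
    corners = cv2.cornerHarris(np.float32(gray), blockSize=2, ksize=3, k=0.04)
    ys, xs = np.where(corners > 0.01 * corners.max())  # pixels marked as corners
    x_min, x_max = xs.min(), xs.max()
    y_min, y_max = ys.min(), ys.max()
    char = region_img[y_min:y_max + 1, x_min:x_max + 1]  # crop the single character

    # Pad to a square whose side is the larger of width and height (white background)
    h, w = char.shape[:2]
    side = max(h, w)
    square = np.full((side, side, 3), 255, dtype=np.uint8)
    y0, x0 = (side - h) // 2, (side - w) // 2
    square[y0:y0 + h, x0:x0 + w] = char

    # First preset size 360x360, then second preset size 224x224
    square = cv2.resize(square, (360, 360))
    return cv2.resize(square, (224, 224))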
In step S104, the single-character images are input to a pre-trained multidimensional multi-stream handwriting word recognition model, so as to obtain single-character vector data corresponding to each single-character image.
In this embodiment, each single-word body image is transmitted into a pre-trained multidimensional multi-stream handwriting word recognition model, and single-word vector data corresponding to the single-word body image is generated.
The multidimensional multi-stream handwriting character recognition model adopts a convolutional neural network as its backbone network structure and adds multi-stream tensor outputs, wherein the first tensor output is used for model training and the second tensor output provides an output of shorter length that is easy to store. Optionally, the convolutional neural network includes, but is not limited to, the MobileNetV3_Large lightweight neural network, the ResNet residual network and the Inception classical neural network.
Illustratively, taking the MobileNetV3_Large lightweight neural network as an example, when training the multidimensional multi-stream handwriting character recognition model, ancient-script characters are grouped under seal script, and the data are organized by seal, clerical, cursive, running and regular script. Calligraphy inscriptions and rubbings from successive dynasties are collected and standardized into single-character images. Since the data amount to as many as 6 million samples, this embodiment extracts about 2 million clearer single-character samples as sample data. The sample data are divided into a training set and a validation set at a ratio of 9:1, and the MobileNetV3 network structure is trained. In this embodiment, the MobileNetV3_Large backbone network structure mainly adds multi-stream tensor outputs: the first tensor stream supports model training, while the second tensor stream provides a shorter, easily stored output. Model training on the modified network structure yields the multidimensional multi-stream handwriting character recognition model. For ease of understanding, fig. 4 is a training schematic diagram of the multidimensional multi-stream handwriting character recognition model according to an embodiment of the present invention.
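The two-stream output described above can be sketched in PyTorch as follows; this is an illustrative assumption, since the embedding dimension, class count and head layout are not specified by the patent:
import torch
import torch.nn as nn
from torchvision import models

class MultiStreamRecognizer(nn.Module):
    def __init__(self, num_classes=10000, embed_dim=128):
        super().__init__()
        backbone = models.mobilenet_v3_large(weights=None)
        self.features = backbone.features                    # convolutional backbone
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.embed_head = nn.Linear(960, embed_dim)           # second stream: short, easily stored vector
        self.class_head = nn.Linear(embed_dim, num_classes)   # first stream: logits used for training

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)            # (N, 960) for MobileNetV3-Large
        embedding = self.embed_head(x)
        logits = self.class_head(embedding)
        return logits, embedding                              # (training stream, storage stream)
During training the classification loss would be applied to the logits; at retrieval time only the embedding is kept and written into the vector database.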
In step S105, the single-word vector data is compared with a preset handwriting vector database, and a vector data index corresponding to the single-word vector data is obtained.
Here, the embodiment of the invention presets a handwriting vector database for storing vector data indexes. Optionally, as a preferred example of the present invention, comparing the single-word vector data with a preset handwriting vector database in step S105, and obtaining a vector data index corresponding to the single-word vector data includes:
in step S1051, the single word vector data is compared with each of the handwriting vectors in the preset handwriting vector database, and the euclidean distance is calculated.
In step S1052, the handwriting word vector corresponding to the minimum value of the euclidean distance and the vector data index thereof are acquired as the vector data index corresponding to the single word vector data.
The handwriting vectors in the preset handwriting vector database are vector data which are output by the second path tensor after the handwriting images pass through the multidimensional multi-stream handwriting recognition model, and the vector data indexes are relative storage paths of the handwriting images in the preset resource library.
In the embodiment of the invention, the single-character vector data are compared with each calligraphy character vector in the calligraphy character vector database by Euclidean distance, and the calligraphy character vector with the smallest Euclidean distance, together with its vector data index, is returned. To build the database, this embodiment takes the standardized cropped images of all seal, clerical, cursive, running and regular script characters, passes them through the multidimensional multi-stream handwriting character recognition model, takes the vector data output by the second tensor stream, and stores the vectors in a vector database such as Milvus, obtaining the calligraphy character vector database. The vector data index is the relative storage path of the calligraphy character image in the preset resource library, such as a relative path of the form script type\single character\image.
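A brute-force version of this Euclidean-distance comparison is sketched below in NumPy (in practice a vector database such as Milvus performs the search; the array and variable names are assumptions):
import numpy as np

def nearest_index(query_vec, db_vectors, db_paths):
    # db_vectors: (N, D) array of second-stream outputs for all stored calligraphy characters
    # db_paths:   list of N relative storage paths (the vector data indexes)
    dists = np.linalg.norm(db_vectors - query_vec, axis=1)  # Euclidean distance to every stored vector
    best = int(np.argmin(dists))
    return db_paths[best], float(dists[best])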
In step S106, font information is obtained from a preset resource library according to the vector data index.
In the embodiment of the invention, the preset resource library includes, but is not limited to, a basic Chinese character library, a dictionary library and a Shuowen Jiezi (character decomposition) library; the font information includes, but is not limited to, characters, pictures, similarity values, pinyin, basic definitions, detailed definitions, dictionary entries and Shuowen Jiezi character-decomposition content. The dictionary library includes, but is not limited to, a Kangxi Dictionary library. For ease of understanding, fig. 5 is a schematic diagram showing the result of identifying and retrieving the Chinese characters in a calligraphy work according to an embodiment of the present invention.
In this embodiment, the vector data index is combined with the basic Chinese character library, the Kangxi Dictionary library and the Shuowen Jiezi character-decomposition library to find the font information corresponding to the single-character image, and the font information, including but not limited to characters, pictures, similarity values, pinyin, basic definitions, detailed definitions, Kangxi Dictionary entries and Shuowen Jiezi decomposition content, is returned and displayed to the user. Batch recognition of the Chinese characters in calligraphy works is thereby achieved, which effectively improves recognition efficiency; linking to multiple resource libraries enriches the recognition functions, improves recognition accuracy and provides more comprehensive character information, easing the difficulty ordinary people and calligraphy enthusiasts face in recognizing characters when appreciating and learning calligraphy works.
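Purely as an illustration of how the relative-path index might drive the lookup in step S106 (the path layout, library structures and field names below are assumptions; the patent does not specify them):
from pathlib import Path

# Illustrative in-memory stand-ins for the three resource libraries
basic_char_library = {"永": {"pinyin": "yǒng", "meanings": ["forever", "always"]}}
kangxi_library = {"永": "Kangxi Dictionary entry text ..."}
shuowen_library = {"永": "Shuowen Jiezi decomposition ..."}

def lookup_font_info(index_path, resource_root="resources"):
    # index_path is the relative path returned from the vector database,
    # e.g. "regular_script/永/0001.png" (hypothetical layout)
    script_type, character, image_name = Path(index_path).parts
    entry = basic_char_library.get(character, {})
    return {
        "character": character,
        "script_type": script_type,
        "image": str(Path(resource_root) / index_path),
        "pinyin": entry.get("pinyin"),
        "definitions": entry.get("meanings"),
        "kangxi": kangxi_library.get(character),
        "shuowen": shuowen_library.get(character),
    }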
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.
In an embodiment, the invention further provides a device for identifying and searching the Chinese characters of the calligraphy works, and the device for identifying and searching the Chinese characters of the calligraphy works is in one-to-one correspondence with the method for identifying and searching the Chinese characters of the calligraphy works in the embodiment. As shown in fig. 6, the device for identifying and retrieving Chinese characters in a calligraphy work comprises a preprocessing module 61, a frame selection module 62, a clipping module 63, an identification module 64, a comparison module 65 and an acquisition module 66. The functional modules are described in detail as follows:
a preprocessing module 61, configured to acquire a calligraphy work image to be processed, and perform preprocessing on the calligraphy work image;
the frame selection module 62 is configured to input the preprocessed calligraphic work image into a trained calligraphic word object recognition model, and frame-select each font area in the calligraphic work image;
a clipping module 63, configured to clip and normalize according to each font area to obtain a single font image;
the recognition module 64 is configured to input the single-character image into a pre-trained multidimensional multiflow handwriting recognition model, so as to obtain single-character vector data corresponding to each single-character image;
the comparison module 65 is configured to compare the single-word vector data with a preset handwriting vector database, and obtain a vector data index corresponding to the single-word vector data;
and the obtaining module 66 is used for obtaining the font information from a preset resource library according to the vector data index.
Optionally, the preprocessing module 61 includes:
the pixel adjusting unit is used for adjusting the size of the calligraphic work image so that the image pixel is below a preset pixel threshold value;
the red filtering unit is used for filtering red of the calligraphic work image with the adjusted size and converting the calligraphic work image into a gray level image;
and the denoising unit is used for carrying out open operation and close operation on the gray level image so as to remove the noise and obtain a binarized image.
Optionally, the calligraphy character object recognition model adopts a target object recognition neural network, and the training process comprises the following steps:
obtaining inscription images of calligraphy Chinese characters;
marking calligraphy character labels on the inscription images with a labelling tool, wherein the calligraphy character labels comprise seal script, clerical script, cursive script, running script and regular script;
converting the marked inscription image into an XML data format to obtain a model learning data set;
dividing the model learning data set into a training set, a verification set and a test set according to a preset proportion;
and transmitting the training set, the verification set and the test set into the target object recognition neural network for training to obtain a handwriting word object recognition model.
Optionally, the clipping module 63 includes:
the clipping unit is used for clipping each font area selected by the frame to obtain an area image;
the binarization unit is used for carrying out binarization processing and denoising processing on the regional image;
the cutting unit is used for obtaining the corner information of the single font in the area image and cutting the font image from the area image according to the corner information;
a correction unit for correcting the font image into a square image according to the length information or the width information of the font image;
the size adjusting unit is used for stretching or shrinking the square image to a first preset size, and shrinking the square image with the first preset size to a second preset size to obtain a single character image.
Optionally, the corner information includes upper left-most corner information and lower right-most corner information of the font;
the cutting unit comprises:
the angular point identification subunit is used for carrying out angular point detection and angular point identification on the area image;
the coordinate acquisition subunit is used for traversing each pixel in the area image and acquiring the pixel with the corner mark and the coordinate information thereof;
a maximum and minimum value obtaining subunit, configured to obtain an X-axis minimum value and a maximum value, and a Y-axis minimum value and a maximum value from the coordinate information of the pixel with the corner mark;
and the combining subunit is used for combining the X-axis minimum value and the Y-axis maximum value into the leftmost upper corner information and combining the X-axis maximum value and the Y-axis minimum value into the rightmost lower corner information.
Optionally, the multidimensional multi-stream handwriting character recognition model adopts a convolutional neural network as its backbone network structure and adds multi-stream tensor outputs, wherein the first tensor output is used for model training and the second tensor output provides an output of shorter length that is easy to store.
Optionally, the comparison module 65 includes:
the comparison unit is used for comparing the single-word vector data with each calligraphic word vector in a preset calligraphic word vector database and calculating the Euclidean distance;
the index acquisition unit is used for acquiring the handwriting word vector corresponding to the minimum Euclidean distance value and the vector data index thereof, and the handwriting word vector and the vector data index are used as the vector data index corresponding to the single word vector data;
the handwriting vectors in the preset handwriting vector database are vector data which are output by the second path tensor after the handwriting images pass through the multidimensional multi-stream handwriting recognition model, and the vector data indexes are relative storage paths of the handwriting images in the preset resource library.
Optionally, the preset resource library comprises a basic Chinese character library, a Kangxi Dictionary library and a Shuowen Jiezi (character decomposition) library;
the font information comprises characters, pictures, similarity values, pinyin, basic definitions, detailed definitions, Kangxi Dictionary entries and Shuowen Jiezi character-decomposition content.
The specific limitation of the device for identifying and searching the Chinese characters of the calligraphy work can be referred to the limitation of the method for identifying and searching the Chinese characters of the calligraphy work, and the description is omitted here. All or part of the modules in the handwriting Chinese character recognition and retrieval device can be realized by software, hardware and a combination thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to realize a method for identifying and searching Chinese characters in the calligraphy works.
In one embodiment, a computer device is provided comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of when executing the computer program:
acquiring a to-be-processed calligraphy work image, and preprocessing the calligraphy work image;
inputting the preprocessed calligraphic work image into a pre-trained calligraphic word object recognition model, and selecting each font area in the calligraphic work image by a frame;
cutting and normalizing according to each font area to obtain a single font image;
inputting the single-character images into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image;
comparing the single-word vector data with a preset handwriting word vector database to obtain a vector data index corresponding to the single-word vector data;
and acquiring font information from a preset resource library according to the vector data index.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. A method for identifying and retrieving Chinese characters in a calligraphy work is characterized by comprising the following steps:
acquiring a to-be-processed calligraphy work image, and preprocessing the calligraphy work image;
inputting the preprocessed calligraphic work image into a pre-trained calligraphic word object recognition model, and selecting each font area in the calligraphic work image by a frame;
cutting and normalizing according to each font area to obtain a single font image;
inputting the single-character images into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image;
comparing the single-word vector data with a preset handwriting word vector database to obtain a vector data index corresponding to the single-word vector data;
and acquiring font information from a preset resource library according to the vector data index.
2. The method for identifying and retrieving Chinese characters in a calligraphy work according to claim 1, wherein said preprocessing said image of the calligraphy work comprises:
adjusting the size of the calligraphic work image so that the image pixels are below a preset pixel threshold;
filtering red of the calligraphic work image with the adjusted size, and converting the calligraphic work image into a gray level image;
and performing open operation and close operation on the gray level image to remove noise points and obtain a binarized image.
3. The method for identifying and retrieving Chinese characters in a calligraphy work according to claim 1, wherein the calligraphy character object identification model adopts a target object identification neural network, and the training process comprises:
obtaining inscription images of calligraphy Chinese characters;
marking calligraphy character labels on the inscription images with a labelling tool, wherein the calligraphy character labels comprise seal script, clerical script, cursive script, running script and regular script;
converting the marked inscription image into an XML data format to obtain a model learning data set;
dividing the model learning data set into a training set, a verification set and a test set according to a preset proportion;
and transmitting the training set, the verification set and the test set into the target object recognition neural network for training to obtain a handwriting word object recognition model.
4. The method for identifying and retrieving Chinese characters in a calligraphy work according to claim 1, wherein said performing clipping and normalization processing according to each font area to obtain a single-font image comprises:
cutting each font area selected by the frame to obtain an area image;
performing binarization processing and denoising processing on the regional image;
acquiring angular point information of single fonts in the area image, and cutting a font image from the area image according to the angular point information;
correcting the font image into a square image according to the length information or the width information of the font image;
and stretching or shrinking the square image to a first preset size, and shrinking the square image with the first preset size to a second preset size to obtain the single character image.
5. The method for recognizing and retrieving Chinese characters in a calligraphy work according to claim 4, wherein the corner information includes upper left-most corner information and lower right-most corner information of a font;
the obtaining the corner information of the single font in the area image, and the cutting the font image from the area image according to the corner information comprises the following steps:
performing corner detection and corner identification on the area image;
traversing each pixel in the area image to obtain pixels with corner marks and coordinate information thereof;
acquiring an X-axis minimum value and a maximum value and a Y-axis minimum value and a maximum value from the coordinate information of the pixel with the corner mark;
the X-axis minimum value and the Y-axis maximum value form the leftmost upper corner information, and the X-axis maximum value and the Y-axis minimum value form the rightmost lower corner information.
6. The method for identifying and retrieving Chinese characters in a calligraphy work according to claim 1, wherein the multidimensional multi-stream calligraphy character identification model adopts a convolutional neural network as its backbone network structure, with multi-stream tensor outputs added, wherein a first tensor output is used for model training and a second tensor output provides an output which is shorter in length and easy to store.
7. The method for identifying and retrieving Chinese characters in a calligraphy work according to claim 1, wherein the comparing the single-word vector data with a preset calligraphy word vector database to obtain a vector data index corresponding to the single-word vector data comprises:
comparing the single-word vector data with each handwriting vector in a preset handwriting vector database, and calculating a Euclidean distance;
acquiring a handwriting word vector corresponding to the minimum Euclidean distance value and a vector data index thereof, and taking the handwriting word vector and the vector data index thereof as a vector data index corresponding to the single word vector data;
the handwriting vectors in the preset handwriting vector database are vector data which are output by the second path tensor after the handwriting images pass through the multidimensional multi-stream handwriting recognition model, and the vector data indexes are relative storage paths of the handwriting images in the preset resource library.
8. The method for identifying and retrieving Chinese characters in a calligraphy work according to claim 1, wherein the preset resource library comprises a basic Chinese character library, a dictionary library and a Shuowen Jiezi (character decomposition) library;
the font information comprises characters, pictures, similarity values, pinyin, basic definitions, detailed definitions, dictionary entries and Shuowen Jiezi character-decomposition content.
9. A device for identifying and retrieving Chinese characters in a calligraphy work, the device comprising:
the pretreatment module is used for acquiring a to-be-treated calligraphy work image and carrying out pretreatment on the calligraphy work image;
the frame selection module is used for inputting the preprocessed calligraphic work image into a pre-trained calligraphic word object recognition model, and selecting each font area in the calligraphic work image in a frame mode;
the clipping module is used for clipping and normalizing according to each font area to obtain a single font image;
the recognition module is used for inputting the single-character images into a pre-trained multidimensional multi-stream handwriting character recognition model to obtain single-character vector data corresponding to each single-character image;
the comparison module is used for comparing the single-word vector data with a preset handwriting vector database to obtain a vector data index corresponding to the single-word vector data;
and the acquisition module is used for acquiring font information from a preset resource library according to the vector data index.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method for identifying and retrieving chinese characters in a calligraphy work according to any one of claims 1 to 8.
CN202310534872.5A 2023-05-11 2023-05-11 Method, device and storage medium for identifying and retrieving Chinese characters in calligraphy works Pending CN116704515A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310534872.5A CN116704515A (en) 2023-05-11 2023-05-11 Method, device and storage medium for identifying and retrieving Chinese characters in calligraphy works

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310534872.5A CN116704515A (en) 2023-05-11 2023-05-11 Method, device and storage medium for identifying and retrieving Chinese characters in calligraphy works

Publications (1)

Publication Number Publication Date
CN116704515A true CN116704515A (en) 2023-09-05

Family

ID=87836455

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310534872.5A Pending CN116704515A (en) 2023-05-11 2023-05-11 Method, device and storage medium for identifying and retrieving Chinese characters in calligraphy works

Country Status (1)

Country Link
CN (1) CN116704515A (en)

Similar Documents

Publication Publication Date Title
CN110569830B (en) Multilingual text recognition method, device, computer equipment and storage medium
US10846553B2 (en) Recognizing typewritten and handwritten characters using end-to-end deep learning
CN109670504B (en) Handwritten answer recognition and correction method and device
US10915788B2 (en) Optical character recognition using end-to-end deep learning
CN110647885B (en) Test paper splitting method, device, equipment and medium based on picture identification
CN111507330B (en) Problem recognition method and device, electronic equipment and storage medium
CN113255583B (en) Data annotation method and device, computer equipment and storage medium
CN111666937A (en) Method and system for recognizing text in image
CN114005126A (en) Table reconstruction method and device, computer equipment and readable storage medium
US20120281919A1 (en) Method and system for text segmentation
CN117115823A (en) Tamper identification method and device, computer equipment and storage medium
CN111832551A (en) Text image processing method and device, electronic scanning equipment and storage medium
US10685222B2 (en) Computerized writing evaluation and training method
CN111259888A (en) Image-based information comparison method and device and computer-readable storage medium
CN116225956A (en) Automated testing method, apparatus, computer device and storage medium
CN116704515A (en) Method, device and storage medium for identifying and retrieving Chinese characters in calligraphy works
CN115311666A (en) Image-text recognition method and device, computer equipment and storage medium
CN111931018B (en) Test question matching and splitting method and device and computer storage medium
JP2008027133A (en) Form processor, form processing method, program for executing form processing method, and recording medium
CN112418217A (en) Method, apparatus, device and medium for recognizing characters
CN113343967B (en) Optical character rapid identification method and system
CN112329744B (en) Picture character recognition method and device
CN115410200A (en) Text recognition method and device
Rasa et al. Handwriting Classification of Numbers and Writing Data using the Convolutional Neural Network Model (CNN)
CN115100672A (en) Character detection and identification method, device and equipment and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination