CN113591743A - Calligraphy video identification method, system, storage medium and computing device - Google Patents


Info

Publication number
CN113591743A
Authority
CN
China
Prior art keywords
video
cursive
calligraphy
vector
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110895033.7A
Other languages
Chinese (zh)
Other versions
CN113591743B (en)
Inventor
梁循
吴佳辰
黄伟兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Original Assignee
Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China
Priority to CN202110895033.7A
Publication of CN113591743A
Application granted
Publication of CN113591743B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/088 Non-supervised learning, e.g. competitive learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)
  • Character Discrimination (AREA)

Abstract

The invention relates to a calligraphy video identification method, system, storage medium and computing device. The method comprises the following steps: acquiring and processing initial calligraphy video data; collecting prior knowledge of the stroke order and cursive symbols of running-script and cursive-script calligraphy; extracting key frame pictures from the initial calligraphy video data by combining the prior knowledge, and converting the key frame pictures into texts; vectorizing the text of each picture to obtain a multi-dimensional vector for each text, and splicing the vectors generated from the pictures in time-sequence order to form a video vector; and performing dimension reduction and visualization on the video vectors to complete classification and identification. The method can improve the accuracy of identifying running-script and cursive-script calligraphy videos, and can be widely applied in the technical field of video data identification.

Description

Calligraphy video identification method, system, storage medium and computing device
Technical Field
The invention relates to the technical field of video data identification, and in particular to a calligraphy video identification method, system, storage medium and computing device for running script and cursive script.
Background
Running script and cursive script arose from the need to write quickly and continuously, so connected strokes, deformation, simplification and hasty writing occur during writing. The resulting characters differ from ordinary simplified regular script, which makes handwritten calligraphy difficult to recognize. However, the simplification is not arbitrary but follows certain rules. The simplification rules of cursive writing were not fixed from the outset; through gradual evolution and mutual constraint, each character or structure came to have a fixed cursive form. Over this evolution cursive script developed its own technical system, the most important parts of which are the simplified stroke order and the cursive symbols.
Adjusting the stroke order makes cursive script more natural and convenient to write continuously. For example, the vertical-heart radical (忄) can be changed from writing a left dot first, a right dot next and the vertical stroke last into writing a short vertical first, writing a short horizontal with a folded pen and then writing a long vertical with a reversed pen and left-up with a right pen. Cursive symbols are concise signs written in place of the radicals of regular script; these cursive components are rules summarized by calligraphers of past generations and have continued to develop and evolve. Yu Youren (于右任) and his generation summarized the writing of cursive components into standard cursive symbols, and a later book on cursive character analysis gave 71 radical cursive symbols and 355 radical cursive symbols. Even a human reader needs some prior knowledge to read running-script and cursive-script calligraphy, so introducing stroke-order and cursive-symbol information is important for calligraphy recognition, and especially for recognizing calligraphy videos. Compared with ordinary image features, video features include the temporal change of image features, so a calligraphy writing video better reflects the stroke-order information of cursive writing. Many modeling methods already exist in the fields of video action recognition and video classification. In recent years, neural networks have achieved results approaching or exceeding human performance in computer vision tasks such as image recognition and object detection, and researchers increasingly use neural networks in video tasks, for example networks based on three-dimensional convolution and two-stream networks.
However, research in the field of calligraphy video recognition remains scarce.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a method, a system, a storage medium, and a computing device for identifying calligraphy video, which improve the accuracy of identifying cursive calligraphy video.
In order to achieve this purpose, the invention adopts the following technical scheme: a calligraphy video identification method, comprising: acquiring and processing initial calligraphy video data; collecting prior knowledge of the stroke order and cursive symbols of running-script and cursive-script calligraphy; extracting key frame pictures from the initial calligraphy video data by combining the prior knowledge, and converting the key frame pictures into texts; vectorizing the text of each picture to obtain a multi-dimensional vector for each text, and splicing the vectors generated from the pictures in time-sequence order to form a video vector; and performing dimension reduction and visualization on the video vectors to complete classification and identification.
Further, the acquiring and processing of the initial calligraphy video data includes: crawling initial calligraphy video data with a crawler; selecting videos that are clear and in which the written characters are not occluded beyond a preset range; and cutting single-character videos out of the selected videos.
Further, the prior knowledge includes: stroke-order information in running script and cursive script that differs from the regular-script way of writing.
Further, the extracting of key frame pictures from the initial calligraphy video data by combining the prior knowledge comprises: calling the OpenCV package, capturing video frames at a preset interval, and storing them as pictures; obtaining the approximate progress positions of the key frames in the video according to the stroke-order information and cursive symbols of running script and cursive script; and automatically screening each calligraphy video according to the key frames to obtain a fixed number of key frame pictures.
Further, the converting of the key frame pictures into texts includes: converting the pixel feature information of each picture into a text for storage; normalizing the picture to a fixed length and width, and converting it to grayscale; and extracting the image numerical matrix of the picture, generating its transpose, and splicing the image numerical matrix with its transpose to obtain the text of the picture.
Further, the forming of the video vector comprises: calling the gensim package, vectorizing the picture texts with the Doc2Vec document embedding model, and presetting the text vector length and window parameters; traversing the vector dimensions and window parameters to determine the optimal parameters for the Doc2Vec document embedding model; and splicing the vectors generated from the pictures of the same video in time-sequence order to form the video vector.
Further, the performing of dimension reduction and visualization on the video vectors includes: performing manifold learning on the video vectors for dimension-reduction visualization, converting the high-dimensional matrix into a set of two-dimensional vectors, treating each document as a scatter point, and plotting a graph; on the graph of the dimension-reduction result, vectors of the same character gather at nearby positions, and classification and identification are completed according to the resulting graph.
A calligraphy video identification system, comprising: an initial data acquisition module, a prior knowledge collection module, a text conversion module, a vectorization module and an identification module; the initial data acquisition module is used for acquiring and processing initial calligraphy video data; the prior knowledge collection module is used for collecting prior knowledge of the stroke order and cursive symbols of running-script and cursive-script calligraphy; the text conversion module extracts key frame pictures from the initial calligraphy video data by combining the prior knowledge and converts the key frame pictures into texts; the vectorization module is used for vectorizing the text of each picture to obtain a multi-dimensional vector for each text, and splicing the vectors generated from the pictures in time-sequence order to form a video vector; and the identification module is used for performing dimension reduction and visualization on the video vectors to complete classification and identification.
A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the above methods.
A computing device, comprising: one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the above-described methods.
Due to the adoption of the technical scheme, the invention has the following advantages:
1. By means of dynamic stroke-order prior knowledge, the invention extends artificial-intelligence-based static character recognition to the recognition of running script and cursive script.
2. The invention introduces prior knowledge such as the stroke order and cursive symbols of running-script and cursive-script calligraphy.
3. The invention uses an unsupervised algorithm for the embedding training of videos and images, which facilitates wider application of the invention.
Drawings
FIG. 1 is a flow chart illustrating a calligraphy video identification method according to an embodiment of the present invention;
FIG. 2 is a flow chart illustrating a method for identifying running script and cursive script in an embodiment of the invention;
FIG. 3 is a schematic diagram of storing crawled videos to a local disk in accordance with an embodiment of the invention;
FIG. 4 is a schematic diagram of a cursive symbol prior knowledge in one embodiment of the invention;
FIG. 5 is a schematic diagram of a computing device in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings of the embodiments of the present invention. It is to be understood that the embodiments described are only a few embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the described embodiments of the invention, are within the scope of the invention.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and/or "comprising", when used in this specification, specify the presence of the stated features, steps, operations, devices, components, and/or combinations thereof.
The invention discloses a method for identifying running-script and cursive-script calligraphy videos by applying video embedding based on prior knowledge of stroke order and cursive symbols, and involves techniques such as network video acquisition, video processing, video embedding and picture feature extraction. The invention concerns only the extraction of time-sequence visual information from the video; it must ignore changes in the video background as well as the occlusion, shaking and viewpoint changes caused by shooting, and accurately identify the character-writing information in the calligraphy video. Aimed at the recognition of calligraphy videos, the method extends artificial-intelligence-based static character recognition to running-script and cursive-script recognition by means of dynamic stroke-order prior knowledge. Because single-character calligraphy videos must first be cut out and building a dataset by hand is time-consuming, the invention adopts an unsupervised method suited to a small-scale data set.
In an embodiment of the present invention, as shown in fig. 1, a calligraphy video identification method is provided. This embodiment is illustrated by applying the method to a terminal; it is to be understood that the method may also be applied to a server, or to a system including a terminal and a server, in which case it is implemented through interaction between the terminal and the server. The identification method provided by this embodiment can be used to identify cursive-script calligraphy videos, and can also be applied in other fields to identify other video data; for example, the method can also be used for identifying the outline calligraphy video. In this embodiment, the method includes the steps of:
step 1, acquiring and processing initial calligraphy video data;
step 2, collecting prior knowledge of the stroke order and cursive symbols of running-script and cursive-script calligraphy;
step 3, extracting key frame pictures from the initial calligraphy video data by combining the prior knowledge, and converting the key frame pictures into texts;
step 4, vectorizing the text of each picture to obtain a multi-dimensional vector for each text, and splicing the vectors generated from the pictures in time-sequence order to form a video vector;
step 5, performing dimension reduction and visualization on the video vectors to complete classification and identification.
In a preferred embodiment, the step 1 of acquiring and processing the initial calligraphy video data comprises the following steps:
step 11, crawling initial calligraphy video data by using a crawler;
calligraphy writing videos are crawled from short-video websites and cached to local files; part of the result is shown in FIG. 3.
step 12, selecting, for training, videos that are clear and in which the written characters are not occluded beyond a preset range; in this embodiment, the preset range is preferably 25% occlusion;
step 13, cutting single-character videos out of the selected videos so that each video contains the writing process of only one Chinese character, naming the videos, and cutting off the user watermarks at the beginning and end of each video.
Specifically: in this embodiment a crawler is used to crawl the initial calligraphy video data, which is then processed. Today's short-video websites host a large number of calligraphy videos, but the footage on these sites is often unsteady, and every downloaded video ends with a few seconds of user watermark. The videos therefore need to be processed after download: videos that are clear and show no significant occlusion of the written content are selected by manual screening, and the watermark in the last few seconds is uniformly removed. Because some videos contain the writing process of several characters while this embodiment processes and recognizes single-character videos, these videos are cut into single-character clips.
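The trimming and screening rules of step 1 can be illustrated with a minimal sketch. It assumes the occluded fraction of each video has already been estimated during manual screening; the function names, the `tail_seconds` parameter and the example values are hypothetical and are not part of the described embodiment. Only the 25% threshold comes from the text.

```python
import math


def frames_without_tail_watermark(total_frames: int, fps: float,
                                  tail_seconds: float) -> range:
    """Return the frame indices kept after dropping the trailing watermark.

    tail_seconds is a hypothetical parameter: the text only says the
    watermark lasts "a few seconds" at the end of each video.
    """
    if total_frames < 0 or fps <= 0 or tail_seconds < 0:
        raise ValueError("total_frames and tail_seconds must be >= 0, fps > 0")
    tail_frames = math.ceil(fps * tail_seconds)
    return range(max(total_frames - tail_frames, 0))


def passes_occlusion_check(occluded_fraction: float,
                           max_occluded: float = 0.25) -> bool:
    """Keep a video only if the written area is occluded by at most 25%."""
    return occluded_fraction <= max_occluded


if __name__ == "__main__":
    kept = frames_without_tail_watermark(total_frames=300, fps=30.0,
                                         tail_seconds=3.0)
    print(len(kept), passes_occlusion_check(0.2))
```

Returning a `range` rather than a list keeps the sketch cheap for long videos; the actual cutting would be done by whatever video tool is in use.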
In a preferred embodiment, step 2 collects prior knowledge of the stroke order and cursive symbols of running-script and cursive-script calligraphy, where the prior knowledge comprises stroke-order information in running script and cursive script that differs from the regular-script way of writing.
Specifically: the commonly used stroke-order information and cursive symbols that differ from regular-script writing are collected by consulting reference works such as books on cursive radicals, collections of cursive writing methods, Standard Cursive Script, Liu Dongqin's analysis of cursive character methods, and Sun Baowen's practical writing dictionary.
Standard references, in order to avoid ambiguity in use, prefer symbols with a unique correspondence when defining cursive symbol standards, but this does not hold in actual use. The invention therefore takes everyday usage into account and summarizes the cursive symbols common in running script and cursive script, together with the radicals they represent and the characters in which they are used. Since the invention focuses on the idea of the solution, only 35 groups of common cursive symbols are collected as an example.
The method can not only identify single-character videos of cursive-script calligraphy, but can also distinguish, for a given character, regular-script calligraphy videos from cursive-script calligraphy videos.
In a preferred embodiment, the step 3 of extracting video key frame video pictures in the initial calligraphic video data in combination with a priori knowledge comprises the following steps:
step 311, calling the OpenCV package, capturing video frames at a preset interval, and storing them as pictures;
step 312, obtaining the approximate progress positions of the key frames in the video according to the stroke-order information and cursive symbols of running script and cursive script;
although video lengths and the numbers of extracted pictures differ, the writing speed of each stroke is similar when a person writes. For the group of writing videos of each character to be recognized, the progress positions of a series of key frames in the video are set according to the stroke-order rules and cursive symbols of running script and cursive script.
step 313, automatically screening each calligraphy video according to the key frames to obtain a fixed number of key frame pictures.
Specifically: because training time makes it impractical to extract and train on every frame of a video, this embodiment extracts the key frames of each single-character calligraphy video by combining the prior knowledge. Each calligraphy video is then screened automatically at the key-frame positions set for its character, yielding a fixed number of key frame pictures.
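The key-frame selection can be sketched as index arithmetic, independent of the video library. The sketch below assumes that the prior knowledge has already been turned into a list of progress positions between 0 and 1 for a given character; the function names and the example positions are hypothetical. Reading and saving the frames themselves (done with OpenCV in the embodiment) is left out.

```python
from typing import List, Sequence


def sample_frame_indices(total_frames: int, interval: int) -> List[int]:
    """Indices of the frames captured at a preset interval (step 311)."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return list(range(0, total_frames, interval))


def select_key_frames(sampled: Sequence[int],
                      progress_positions: Sequence[float]) -> List[int]:
    """Pick, for each progress position in [0, 1], the sampled frame
    closest to that fraction of the video (steps 312 and 313).

    The positions are assumed to be derived by hand from stroke-order
    and cursive-symbol prior knowledge for one character.
    """
    if not sampled:
        raise ValueError("no sampled frames")
    last = len(sampled) - 1
    picks = []
    for p in progress_positions:
        if not 0.0 <= p <= 1.0:
            raise ValueError("progress positions must lie in [0, 1]")
        picks.append(sampled[round(p * last)])
    return picks


if __name__ == "__main__":
    sampled = sample_frame_indices(total_frames=120, interval=10)
    # Hypothetical key-frame positions for one character.
    print(select_key_frames(sampled, [0.25, 0.5, 0.75]))
```

Because the positions are fractions of the video rather than absolute frame numbers, videos of different lengths yield the same number of key frames, which is what the later fixed-length video vector requires.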
In a preferred embodiment, the step 3 of converting the key frame picture into text includes the following steps:
step 321, converting the pixel feature information of each picture into a text for storage;
step 322, normalizing the picture to a fixed length and width, and converting it to grayscale;
step 323, extracting the image numerical matrix (namely, the pixel matrix) of the picture, generating its transpose, and splicing the image numerical matrix with its transpose to obtain the text of the picture.
In this embodiment the Doc2Vec algorithm, an unsupervised learning method, is used, so the key frame pictures are converted into texts. First the picture is converted to grayscale, so that the image contains only brightness information and no redundant color information: white points have the value 255, black points have the value 0, and values in between are shades of gray. The image numerical matrix of the picture is then extracted and its transpose generated, and the two are spliced so that the horizontal and vertical features of the picture are captured at the same time. The resulting picture text is saved locally in a txt file.
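A minimal pure-Python sketch of the picture-to-text conversion follows. Several details are assumptions, since the text does not fix them: the grayscale weights (the common 0.299/0.587/0.114 luminance formula), nearest-neighbour resizing, splicing by appending the rows of the transpose below the rows of the matrix, and writing each pixel value as a space-separated token. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple

Gray = List[List[int]]


def to_gray(rgb_rows: Sequence[Sequence[Tuple[int, int, int]]]) -> Gray:
    """Convert RGB pixels to 0-255 brightness (0 = black, 255 = white)."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_rows]


def resize_nearest(gray: Gray, width: int, height: int) -> Gray:
    """Normalize to a fixed width and height by nearest-neighbour sampling."""
    src_h, src_w = len(gray), len(gray[0])
    return [[gray[y * src_h // height][x * src_w // width]
             for x in range(width)]
            for y in range(height)]


def picture_to_text(gray: Gray) -> str:
    """Splice the pixel matrix with its transpose, one row per text line."""
    transposed = [list(col) for col in zip(*gray)]
    rows = gray + transposed  # horizontal features, then vertical features
    return "\n".join(" ".join(str(v) for v in row) for row in rows)


if __name__ == "__main__":
    tiny = to_gray([[(255, 255, 255), (0, 0, 0)],
                    [(0, 0, 0), (255, 255, 255)]])
    print(picture_to_text(resize_nearest(tiny, 4, 4)))
```

In this form each line of the text is one row or one column of the image, so a document embedding model sees both reading directions of the same picture.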
In a preferred embodiment, the combining in step 4 forms a video vector, comprising the steps of:
step 41, calling the gensim package, vectorizing the picture texts with the Doc2Vec document embedding model, and presetting the text vector length and window parameters;
the Doc2Vec model can create a fixed-length vector representation of a document regardless of the document's length. The Doc2Vec function in the gensim package is used: the text representation of each picture is input into the function, and parameters such as the document vector length and the window are preset.
step 42, traversing the vector dimensions and window parameters to determine the optimal parameters for the Doc2Vec document embedding model;
step 43, splicing the vectors generated from the pictures of the same video in time-sequence order to form the video vector.
Specifically: a Doc2Vec model is trained on the picture texts, with the text representation of each picture as input and parameters such as the document vector length and the window preset. The PV-DM model in Doc2Vec is used to train the vector-space representation, and the model outputs a multi-dimensional vector for each text. Each dimension represents a latent feature of the image that the text represents, and together these features summarize the horizontal and vertical features of the calligraphy image.
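The vectorization and splicing steps might look like the following sketch. It assumes gensim 4.x is installed (the import is done lazily inside the training function so the splicing helper works without it) and that each picture text is tokenized by whitespace. The parameter values, the picture identifiers and both function names are hypothetical; the text only states that the vector length and window are preset and then tuned by traversal.

```python
from typing import Dict, List, Sequence


def train_picture_embeddings(picture_texts: Dict[str, str],
                             vector_size: int = 50,
                             window: int = 5,
                             epochs: int = 40) -> Dict[str, List[float]]:
    """Train a PV-DM Doc2Vec model and return one vector per picture."""
    # Third-party dependency, assumed to be gensim >= 4.0.
    from gensim.models.doc2vec import Doc2Vec, TaggedDocument

    corpus = [TaggedDocument(words=text.split(), tags=[pic_id])
              for pic_id, text in picture_texts.items()]
    model = Doc2Vec(corpus, vector_size=vector_size, window=window,
                    min_count=1, dm=1,  # dm=1 selects PV-DM
                    epochs=epochs, workers=1, seed=0)
    return {pic_id: model.dv[pic_id].tolist() for pic_id in picture_texts}


def build_video_vector(frame_ids_in_time_order: Sequence[str],
                       picture_vectors: Dict[str, List[float]]) -> List[float]:
    """Concatenate the picture vectors of one video in time order."""
    video_vector: List[float] = []
    for frame_id in frame_ids_in_time_order:
        video_vector.extend(picture_vectors[frame_id])
    return video_vector


if __name__ == "__main__":
    # Hand-made vectors stand in for trained ones in this demonstration.
    vectors = {"v1_f2": [3.0, 4.0], "v1_f1": [1.0, 2.0]}
    print(build_video_vector(["v1_f1", "v1_f2"], vectors))
```

With a fixed number of key frames per video and a fixed `vector_size`, every video vector has the same length (number of key frames times `vector_size`), so the vectors can be stacked into one matrix for the dimension-reduction step.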
In a preferred embodiment, the vector dimension reduction visualization process is performed on the video in step 5, and includes the following steps:
step 51, performing manifold learning on the video vectors for dimension-reduction visualization, converting the high-dimensional matrix into a set of two-dimensional vectors, treating each document as a scatter point, and plotting a graph;
in this embodiment, the t-SNE method is used for the dimension-reduction visualization.
step 52, on the graph of the dimension-reduction result, vectors of the same character gather at nearby positions, and classification and identification are completed according to the resulting graph.
Specifically: after the single-character calligraphy videos have been represented as vectors by unsupervised learning, manifold learning is performed on the generated video vectors. The t-SNE method is used for dimension-reduction visualization: the high-dimensional matrix is converted into a set of two-dimensional vectors, each document is treated as a scatter point, and a graph is plotted. On the graph of the dimension-reduction result, the vectors of the same character can be seen to gather at nearby positions.
With this vector visualization method, classification experiments can be carried out on labelled single-character videos, and the method can also be used to identify videos of unidentified single calligraphy characters.
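The dimension-reduction step can be sketched with scikit-learn's `TSNE`. Two points are assumptions rather than part of the text: NumPy and scikit-learn are installed (imported lazily here), and an unlabelled video is assigned the label of its nearest labelled neighbour in the two-dimensional plot, which is only one possible way to read "classification according to the graph". Since `TSNE` offers no transform for new points, the unlabelled video has to be embedded together with the labelled ones.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def reduce_to_2d(video_vectors: Sequence[Sequence[float]],
                 perplexity: float = 5.0, seed: int = 0) -> List[Point]:
    """Project the video vectors to 2-D with t-SNE.

    perplexity must be smaller than the number of videos.
    """
    # Third-party dependencies, assumed to be installed.
    import numpy as np
    from sklearn.manifold import TSNE

    matrix = np.asarray(video_vectors, dtype=float)
    embedded = TSNE(n_components=2, perplexity=perplexity,
                    random_state=seed).fit_transform(matrix)
    return [(float(x), float(y)) for x, y in embedded]


def nearest_label(point: Point,
                  labelled_points: Sequence[Tuple[Point, str]]) -> str:
    """Label of the closest labelled scatter point (assumed decision rule)."""
    if not labelled_points:
        raise ValueError("no labelled points")
    best_label, best_dist = labelled_points[0][1], math.inf
    for (x, y), label in labelled_points:
        dist = math.hypot(point[0] - x, point[1] - y)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


if __name__ == "__main__":
    labelled = [((0.0, 0.0), "yong"), ((10.0, 10.0), "he")]
    print(nearest_label((1.0, 2.0), labelled))
```

The scatter plot itself can be drawn from the returned points with any plotting library, colouring each point by its character label.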
In one embodiment of the present invention, there is provided a calligraphy video recognition system comprising: the system comprises an initial data acquisition module, a priori knowledge collection module, a text conversion module, a vectorization module and an identification module;
the initial data acquisition module is used for acquiring and processing initial calligraphy video data;
the prior knowledge collection module is used for collecting prior knowledge of the stroke order and cursive symbols of running-script and cursive-script calligraphy;
the text conversion module extracts video key frame video pictures from the initial calligraphy video data by combining the prior knowledge and converts the key frame pictures into texts;
the vectorization module is used for vectorizing the texts of the pictures to obtain a multi-dimensional vector of each text, splicing the multi-dimensional vectors generated by each picture according to a time sequence and combining the multi-dimensional vectors to form a video vector;
and the identification module is used for performing vector dimension reduction visualization processing on the video to finish classification identification.
The system provided in this embodiment is used to execute the above method embodiments; for the detailed process, reference is made to the above embodiments, which are not repeated here.
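The five-module structure can be sketched as a small composition of callables. This is only an illustration of how the modules hand data to each other; the class name, the field names and the callable signatures are hypothetical, and each module is left as a placeholder to be supplied by an actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class CalligraphyVideoRecognizer:
    """Chains the five modules of the system in the order of steps 1 to 5."""
    acquire: Callable[[], Any]           # initial data acquisition module
    collect_prior: Callable[[], Any]     # prior knowledge collection module
    to_text: Callable[[Any, Any], Any]   # text conversion module
    vectorize: Callable[[Any], Any]      # vectorization module
    identify: Callable[[Any], Any]       # identification module

    def run(self) -> Any:
        videos = self.acquire()
        prior = self.collect_prior()
        texts = self.to_text(videos, prior)
        video_vectors = self.vectorize(texts)
        return self.identify(video_vectors)


if __name__ == "__main__":
    # Stub modules that only show the data flow.
    demo = CalligraphyVideoRecognizer(
        acquire=lambda: ["video_a"],
        collect_prior=lambda: {"positions": [0.5]},
        to_text=lambda videos, prior: [f"{v}:text" for v in videos],
        vectorize=lambda texts: [[float(len(t))] for t in texts],
        identify=lambda vectors: ["label"] * len(vectors),
    )
    print(demo.run())
```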
Fig. 5 is a schematic structural diagram of a computing device provided in an embodiment of the present invention. The computing device may be a terminal and may include: a processor, a communication interface, a memory, a display screen and an input device. The processor, the communication interface and the memory communicate with each other through a communication bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system and a computer program that, when executed by the processor, implements the identification method, and the internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The communication interface is used for wired or wireless communication with an external terminal; the wireless communication can be realized through WIFI, a carrier network, NFC (near field communication) or other technologies. The display screen can be a liquid crystal display or an electronic ink display, and the input device can be a touch layer covering the display screen, a key, trackball or touch pad arranged on the housing of the computing device, or an external keyboard, touch pad or mouse. The processor may call logic instructions in the memory to perform the following method:
acquiring and processing initial calligraphy video data; collecting prior knowledge of the stroke order and cursive symbols of running-script and cursive-script calligraphy; extracting key frame pictures from the initial calligraphy video data by combining the prior knowledge, and converting the key frame pictures into texts; vectorizing the text of each picture to obtain a multi-dimensional vector for each text, and splicing the vectors generated from the pictures in time-sequence order to form a video vector; and performing dimension reduction and visualization on the video vectors to complete classification and identification.
In addition, the logic instructions in the memory may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Those skilled in the art will appreciate that the architecture shown in fig. 5 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects may be applied, and that a particular computing device may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
In one embodiment of the invention, a computer program product is provided, the computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, enable the computer to perform the methods provided by the above-described method embodiments, for example, comprising: acquiring and processing initial calligraphy video data; collecting prior knowledge of the stroke order and cursive symbols of running script and cursive script; extracting key-frame pictures from the initial calligraphy video data in combination with the prior knowledge, and converting the key-frame pictures into texts; vectorizing the texts of the pictures to obtain a multi-dimensional vector for each text, splicing the multi-dimensional vectors generated from the pictures in time order, and combining them to form a video vector; and performing dimensionality-reduction visualization on the video vector to complete classification and identification.
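As a non-limiting illustration of converting a key-frame picture into a text, the following Python sketch standardizes a picture to a fixed size, converts it to grayscale, and splices the resulting numerical matrix with its transposed matrix. Several details are assumptions rather than features of the embodiments: the helper names, the nearest-neighbour resizing, the ITU-R BT.601 gray weights, and the choice to write each matrix row as one line of space-separated gray levels.

```python
def resize_nearest(matrix, height, width):
    """Resize a 2-D list to height x width using nearest-neighbour sampling."""
    src_h, src_w = len(matrix), len(matrix[0])
    return [
        [matrix[row * src_h // height][col * src_w // width] for col in range(width)]
        for row in range(height)
    ]


def to_gray(pixel):
    """Convert an (R, G, B) tuple to one integer gray level (BT.601 weights)."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def picture_to_text(rgb_picture, height=4, width=4):
    """Return the picture as text lines: matrix rows, then transposed rows."""
    # Standardize the picture to a fixed length and width.
    fixed = resize_nearest(rgb_picture, height, width)
    # Graying treatment: one gray level per pixel.
    gray = [[to_gray(pixel) for pixel in row] for row in fixed]
    # The transposed matrix lists the same gray levels column by column.
    transposed = [list(column) for column in zip(*gray)]
    # Splice the matrix and its transpose; each row becomes one text line.
    return [" ".join(str(value) for value in row) for row in gray + transposed]
```

Under these assumptions, each picture becomes a short document whose tokens are gray levels, read once row by row and once column by column, which is a form a document embedding model can consume.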
In one embodiment of the invention, a non-transitory computer-readable storage medium is provided, which stores computer instructions that cause a computer to perform the methods provided by the above embodiments, for example, including: acquiring and processing initial calligraphy video data; collecting prior knowledge of the stroke order and cursive symbols of running script and cursive script; extracting key-frame pictures from the initial calligraphy video data in combination with the prior knowledge, and converting the key-frame pictures into texts; vectorizing the texts of the pictures to obtain a multi-dimensional vector for each text, splicing the multi-dimensional vectors generated from the pictures in time order, and combining them to form a video vector; and performing dimensionality-reduction visualization on the video vector to complete classification and identification.
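As a non-limiting illustration of forming a video vector, the following Python sketch splices the vectors of the pictures of one video in time order. The per-picture vectors are taken as given plain lists (in the method they would come from the document embedding step), and the function name `build_video_vectors` and the `(video_id, frame_index, vector)` input format are hypothetical choices made for this sketch.

```python
def build_video_vectors(frame_vectors):
    """Splice per-picture vectors into one vector per video, in time order.

    frame_vectors: iterable of (video_id, frame_index, vector) tuples, where
        frame_index gives the temporal position of the picture in its video.
    Returns a dict mapping each video_id to its spliced video vector.
    """
    grouped = {}
    for video_id, frame_index, vector in frame_vectors:
        grouped.setdefault(video_id, []).append((frame_index, list(vector)))

    video_vectors = {}
    for video_id, frames in grouped.items():
        # Restore the time order before splicing.
        frames.sort(key=lambda item: item[0])
        spliced = []
        for _, vector in frames:
            spliced.extend(vector)
        video_vectors[video_id] = spliced
    return video_vectors
```

Because every video contributes the same fixed number of key frames and every picture vector has the same preset length, the spliced video vectors all have equal length and can be stacked into one matrix for the subsequent dimensionality reduction.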
The implementation principle and technical effect of the computer-readable storage medium provided by the above embodiments are similar to those of the above method embodiments, and are not described herein again.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A calligraphy video identification method, characterized by comprising the following steps:
acquiring and processing initial calligraphy video data;
collecting prior knowledge of the stroke order and cursive symbols of running script and cursive script;
extracting key-frame pictures from the initial calligraphy video data in combination with the prior knowledge, and converting the key-frame pictures into texts;
vectorizing the texts of the pictures to obtain a multi-dimensional vector for each text, splicing the multi-dimensional vectors generated from the pictures in time order, and combining them to form a video vector;
and performing dimensionality-reduction visualization on the video vector to complete classification and identification.
2. The identification method according to claim 1, wherein the acquiring and processing of initial calligraphy video data comprises:
crawling initial calligraphy video data with a crawler;
selecting videos that are visually clear and in which occlusion of the written text does not exceed a preset range;
and clipping single-character videos from the selected videos.
3. The identification method according to claim 1, wherein the prior knowledge comprises: stroke-order information in running script and cursive script that differs from the way regular script is written.
4. The identification method according to claim 1, wherein the extracting key-frame pictures from the initial calligraphy video data in combination with the prior knowledge comprises:
calling the opencv package, capturing video frames at a preset interval, and storing the video frames as pictures;
obtaining the approximate progress position of each key frame in the video according to the stroke-order information and cursive symbols of running script and cursive script;
and automatically screening each calligraphy video according to the key frames to obtain a fixed number of key-frame pictures.
5. The identification method according to claim 1, wherein the converting the key-frame pictures into texts comprises:
converting the feature information of the pixels of each picture into a text for storage;
standardizing each picture to a fixed length and a fixed width, and converting it to grayscale;
and extracting the image numerical matrix of the picture, generating its transposed matrix, and splicing the image numerical matrix and the transposed matrix to obtain the text of the picture.
6. The identification method according to claim 1, wherein the combining to form a video vector comprises:
calling the gensim package, vectorizing the picture texts with a Doc2Vec document embedding model, and presetting the text vector length and the window parameter;
traversing vector dimensions and window parameters to determine the optimal parameters for the Doc2Vec document embedding model;
and splicing the vectors generated from the pictures of the same video in time order, and combining them to form a video vector.
7. The identification method according to claim 1, wherein the performing dimensionality-reduction visualization on the video vector comprises:
performing manifold learning on the video vectors for dimensionality-reduction visualization, converting the high-dimensional matrix into a set of two-dimensional vectors, treating each document as a scatter point, and drawing a graph;
and, the vectors of the same character gathering at nearby positions on the graph of the dimensionality-reduction result, completing the classification and identification according to the obtained graph.
8. A calligraphy video identification system, comprising: an initial data acquisition module, a prior knowledge collection module, a text conversion module, a vectorization module and an identification module;
the initial data acquisition module is used for acquiring and processing initial calligraphy video data;
the prior knowledge collection module is used for collecting prior knowledge of the stroke order and cursive symbols of running script and cursive script;
the text conversion module is used for extracting key-frame pictures from the initial calligraphy video data in combination with the prior knowledge and converting the key-frame pictures into texts;
the vectorization module is used for vectorizing the texts of the pictures to obtain a multi-dimensional vector for each text, splicing the multi-dimensional vectors generated from the pictures in time order, and combining them to form a video vector;
and the identification module is used for performing dimensionality-reduction visualization on the video vector to complete classification and identification.
9. A computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-7.
10. A computing device, comprising: one or more processors, memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any of the methods of claims 1-7.
CN202110895033.7A 2021-08-04 2021-08-04 Handwriting video identification method, system, storage medium and computing device Active CN113591743B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110895033.7A CN113591743B (en) 2021-08-04 2021-08-04 Handwriting video identification method, system, storage medium and computing device

Publications (2)

Publication Number Publication Date
CN113591743A true CN113591743A (en) 2021-11-02
CN113591743B CN113591743B (en) 2023-11-24

Family

ID=78255306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110895033.7A Active CN113591743B (en) 2021-08-04 2021-08-04 Handwriting video identification method, system, storage medium and computing device

Country Status (1)

Country Link
CN (1) CN113591743B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170083805A (en) * 2016-01-11 2017-07-19 경북대학교 산학협력단 Distinction method and system for characters written in caoshu characters or cursive characters
CN108932508A (en) * 2018-08-13 2018-12-04 杭州大拿科技股份有限公司 A kind of topic intelligent recognition, the method and system corrected
CN110019817A (en) * 2018-12-04 2019-07-16 阿里巴巴集团控股有限公司 A kind of detection method, device and the electronic equipment of text in video information
CN110580352A (en) * 2017-07-04 2019-12-17 艾朝君 Chinese character and line book intercommunication mutual identification technical method
US20200134444A1 (en) * 2018-10-31 2020-04-30 Sony Interactive Entertainment Inc. Systems and methods for domain adaptation in neural networks
CN111436005A (en) * 2019-01-15 2020-07-21 北京字节跳动网络技术有限公司 Method and apparatus for displaying image
CN111881310A (en) * 2019-12-07 2020-11-03 杭州华冬人工智能有限公司 Chinese character hard-stroke writing intelligent guidance and scoring method and guidance scoring system
CN112015955A (en) * 2020-09-01 2020-12-01 清华大学 Multi-mode data association method and device
CN112036522A (en) * 2020-07-20 2020-12-04 上海卓希智能科技有限公司 Calligraphy individual character evaluation method, system and terminal based on machine learning
CN112183335A (en) * 2020-09-28 2021-01-05 中国人民大学 Handwritten image recognition method and system based on unsupervised learning
CN112766080A (en) * 2020-12-31 2021-05-07 北京搜狗科技发展有限公司 Handwriting recognition method and device, electronic equipment and medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
MICHAEL BLUMENSTEIN等: "An investigation of the modified direction feature for cursive character recognition", 《PATTERN RECOGNITION》, vol. 40, no. 2, pages 376 - 388, XP005837117, DOI: 10.1016/j.patcog.2006.05.017 *
唐锋 等: "长文本武侠小说外号识别研究", 《中文信息学报》, vol. 33, no. 8, pages 132 - 142 *
孙巍巍: "基于深度学习的手写汉字识别技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》, no. 5, pages 138 - 1130 *
张树业: "深度模型及其在视觉文字分析中的应用", 《中国博士学位论文全文数据库信息科技辑》, no. 2, pages 138 - 179 *
薛扬 等: "镜像图灵测试:古诗的机器识别", 《计算机学报》, vol. 44, no. 7, pages 1398 - 1413 *

Also Published As

Publication number Publication date
CN113591743B (en) 2023-11-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant