CN116912856A - Image identification method and device of intelligent scanning pen and intelligent scanning pen - Google Patents


Info

Publication number: CN116912856A
Application number: CN202311182898.4A
Authority: CN (China)
Prior art keywords: image, text recognition, test, model, text
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Other languages: Chinese (zh)
Inventor: 车建波
Current Assignee: Shenzhen Bepal Intelligent Technology Co ltd (the listed assignees may be inaccurate)
Original Assignee: Shenzhen Bepal Intelligent Technology Co ltd
Application filed by: Shenzhen Bepal Intelligent Technology Co ltd
Priority to: CN202311182898.4A
Publication of: CN116912856A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19127 Extracting features by transforming the feature space, e.g. multidimensional scaling; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/14 Image acquisition
    • G06V30/142 Image acquisition using hand-held instruments; Constructional details of the instruments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19147 Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/191 Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1916 Validation; Performance evaluation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Character Discrimination (AREA)

Abstract

The application provides an image recognition method for an intelligent scanning pen, relating to the technical field of artificial intelligence. The method comprises the following steps: S0: constructing, training and testing an image text recognition model; S1: image acquisition; S2: image preprocessing; S4: performing text recognition on the preprocessed image with the image text recognition model, obtaining the language information and visual features in the image, and outputting the text content contained in the preprocessed image. The application also provides a corresponding device and an intelligent scanning pen. Compared with the prior art, the image recognition method requires no extra language model: it converts the original two-step recognition of image text content into one-step recognition, adaptively considers visual and language information in a unified structure, and shows stronger language capability while remaining efficient.

Description

Image identification method and device of intelligent scanning pen and intelligent scanning pen
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to intelligent scanning pens.
Background
As a portable electronic device, a scanning pen converts handwritten or printed text into a digital format after scanning, for further operations such as display, editing, TTS voice playback, translation and sharing.
Its working process mainly comprises image acquisition, image processing and image recognition. First, the user holds the scanning pen and aims it at the text to be scanned; an optical sensor arranged at the pen tip then collects images of the text under the tip. The collected images are processed inside the scanning pen and finally sent to an image recognition model for text recognition, yielding the text information in the image, which is then displayed, translated or otherwise processed according to the user's needs.
Image recognition algorithms applied to scanning pens in the prior art are generally two-step methods: after the processed image is fed into the image recognition model, the model first makes a visual prediction of the image based on a visual analysis method to obtain a prediction result, and then corrects that result with a text model to obtain the final text recognition result. With this approach, the image recognition model builds a visual prediction module and a language correction module separately, and the two modules consider the visual information and the text information of the input image independently. Chinese patent application No. 202310604317.5, for example, records a text recognition method and device for a dictionary pen: the word content in a text image acquired during scanning is recognized; when the word content cannot be matched with information in the dictionary pen's word stock, character information is compared and the character state is judged; if the characters are complete, a text error is determined; if they are incomplete, the recognized complete character content is matched against the word stock, the handling of the incomplete characters is decided (discarding them or determining their keywords), and the remaining characters are matched against the word stock based on the semantic association between the keywords and the remaining characters. Incomplete characters at the start and end of the recognized text are further discarded or completed, which better suits recognition of dense text and improves the accuracy and efficiency of text recognition.
When this approach, which separates the visual prediction module from the text language correction module and performs visual prediction on the input image before language correction, is applied to an actual scanning pen, the additional processing makes it slow. Moreover, because the visual prediction module and the text language correction module work independently, sketch marks, background textures, stain occlusions and the like in the text image captured at the pen tip easily distort the final text recognition result, so the accuracy of the scanning pen is not ideal.
Disclosure of Invention
To solve the above problems, the present application provides an image recognition method for an intelligent scanning pen that treats the visual and language models as a whole and obtains language information and visual features from an image jointly, requiring no additional language model. Applied to a scanning pen, it achieves higher accuracy and faster text recognition.
The application also provides an image recognition device of the intelligent scanning pen and the intelligent scanning pen so as to apply the method.
The technical scheme provided by the application is as follows:
an image recognition method of an intelligent scanning pen, the method comprising the following steps:
S0: constructing, training and testing to obtain an image text recognition model;
S1: image acquisition: setting up an image collector and scanning to obtain an original image of the target area;
S2: image preprocessing: preprocessing the original image to obtain a preprocessed image;
S4: performing text recognition on the preprocessed image with the image text recognition model, obtaining the language information and visual features in the image, and outputting the text content contained in the preprocessed image.
Further, S0: constructing, training and testing the image text recognition model comprises the following substeps:
S01: constructing the model;
S02: preparing a training data set;
S03: training the model;
S04: testing the model.
S01: the model construction method comprises the following substeps:
further, S01: the construction model specifically comprises the following steps: s011: respectively constructing a feature extraction network, a mask language perception module and a visual reasoning module;
s012: the feature extraction network, the mask language sensing module and the visual reasoning module are sequentially connected to construct the to-be-trained image text recognition model.
Further, S02 specifically comprises:
S021: selecting a plurality of training images;
S022: making a word-level label for each training image;
S023: for each word-labelled training image in the data set, randomly generating a character index according to the length of the word labelled on the current image;
S024: collecting all training images with word-level labels and character indices to form the training data set.
Further, S03: training the model comprises the following substeps:
S031: selecting an image from the training data set and inputting it into the image text recognition model to be trained;
S032: the feature extraction network performs feature extraction on the currently input image to obtain a two-dimensional feature vector of the current image;
S033: the mask language perception module performs character mask mapping on the two-dimensional feature vector of the current image, the word-level label carried by the current image and the character index carried by the current image to obtain an occlusion feature map;
S034: combining the current image, its two-dimensional feature vector and the occlusion feature map to obtain an image with occluded characters;
S035: inputting the image with occluded characters into the visual reasoning module, which captures long-distance information in the visual space and predicts the corresponding text line recognition content;
S036: repeating S031-S035 until all images in the training data set have been input into the image text recognition model to be trained, obtaining the image text recognition model to be tested.
Further, S04 includes the following substeps:
S041A: selecting a plurality of test images to assemble a first test data set;
S042A: removing the mask language perception module from the image text recognition model to be tested and connecting the feature extraction network directly to the visual reasoning module;
S043A: sequentially inputting each test image in the test data set into the feature extraction network, which extracts features of the currently input test image to obtain its two-dimensional feature vector; inputting this vector directly into the visual reasoning module, which predicts the corresponding text line recognition content to obtain the text recognition result of the current test image;
S044A: repeating S041A-S043A until all test images in the test data set have been input into the image text recognition model to be tested, obtaining a first image text recognition model to be evaluated.
Further, S04 further includes the following substeps:
S041B: selecting a plurality of test images, making a word-level label for each, and randomly generating a character index according to the length of the word labelled on the current test image; collecting all test images carrying word-level labels and character indices into a second test data set;
S042B: sequentially inputting each test image in the test data set into the image text recognition model to be tested to obtain the text recognition result of the current test image;
S043B: repeating S041B-S042B until all test images in the test data set have been input into the image text recognition model to be tested, obtaining a second image text recognition model to be evaluated.
Further, S04 further includes: evaluating the first and second image text recognition models to be evaluated and selecting the better one as the final image text recognition model.
The application also provides an image recognition device for an intelligent scanning pen, which stores a program implementing the above image recognition method and controls its execution.
The application also provides an intelligent scanning pen employing the above image recognition device.
The beneficial effects of the application are as follows: compared with the prior art, the image recognition method requires no extra language model, converts the original two-step recognition of image text content into one-step recognition, adaptively considers visual and language information in a unified structure, and shows stronger language capability while remaining efficient.
Drawings
Fig. 1 is a flowchart of an image recognition method of an intelligent scanning pen implemented in a specific embodiment.
Fig. 2 is a flowchart of the substeps of S03 (training the model) in the image recognition method of the intelligent scanning pen of the embodiment.
The realization, functional characteristics and advantages of the present application are further described with reference to the accompanying drawings in combination with the embodiments.
Description of the embodiments
In order to more clearly and completely describe the technical scheme of the application, the application is further described below with reference to the accompanying drawings.
Please refer to fig. 1-2.
In this embodiment, an image recognition method of an intelligent scanning pen is provided, and the method includes the following steps:
S0: constructing, training and testing to obtain an image text recognition model;
S1: image acquisition: setting up an image collector and scanning to obtain an original image of the target area;
S2: image preprocessing: preprocessing the original image to obtain a preprocessed image;
S4: performing text recognition on the preprocessed image with the image text recognition model, obtaining the language information and visual features in the image, and outputting the text content contained in the preprocessed image.
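The patent does not fix the concrete operations of the S2 preprocessing step. As an illustration only (not part of the disclosure), a minimal sketch assuming grayscale conversion followed by global binarization, two common choices for scan-pen text images, could look as follows; all names are illustrative.

```python
import numpy as np

def preprocess(rgb: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Toy S2 preprocessing: grayscale conversion followed by global
    binarization. The patent does not specify these steps; they are
    assumed here for illustration."""
    # Luma-weighted grayscale conversion of an (H, W, 3) image.
    gray = rgb[..., 0] * 0.299 + rgb[..., 1] * 0.587 + rgb[..., 2] * 0.114
    # Global threshold: dark text pixels -> 0, background -> 255.
    return np.where(gray < threshold, 0, 255).astype(np.uint8)
```

In practice the threshold would more likely be chosen adaptively per image, since pen-tip illumination varies from scan to scan.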
Further, in this embodiment, S0: constructing, training and testing the image text recognition model comprises the following substeps:
S01: constructing the model;
S02: preparing a training data set;
S03: training the model;
S04: testing the model.
Further, in this embodiment, S01: constructing the model specifically comprises:
S011: constructing a feature extraction network, a mask language perception module and a visual reasoning module respectively;
S012: connecting the feature extraction network, the mask language perception module and the visual reasoning module in sequence to construct the image text recognition model to be trained.
Further, in this embodiment, S02 specifically comprises:
S021: selecting a plurality of training images;
S022: making a word-level label for each training image;
S023: for each word-labelled training image in the data set, randomly generating a character index according to the length of the word labelled on the current image; the character index denotes the index of the character to be occluded, and for each input word image an index is drawn at random over the word's length;
S024: collecting all training images with word-level labels and character indices to form the training data set.
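Steps S021-S024 above can be sketched as follows; the record layout and field names are assumptions for illustration, not taken from the patent.

```python
import random

def make_training_record(image_id: str, label: str, rng: random.Random):
    """S023 sketch: pair a word-labelled training image with a randomly
    drawn character index (the position of the character the mask module
    will later occlude). Field names are illustrative."""
    char_index = rng.randrange(len(label))  # one random index per word length
    return {"image": image_id, "word_label": label, "char_index": char_index}

def build_dataset(samples, seed=0):
    """S024 sketch: collect all (image, word label, character index)
    records into the training data set."""
    rng = random.Random(seed)
    return [make_training_record(img, lab, rng) for img, lab in samples]
```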
Further, in this embodiment, S03: training the model comprises the following substeps:
S031: selecting an image from the training data set and inputting it into the image text recognition model to be trained;
S032: the feature extraction network performs feature extraction on the currently input image to obtain a two-dimensional feature vector of the current image;
S033: the mask language perception module performs character mask mapping on the two-dimensional feature vector of the current image, the word-level label carried by the current image and the character index carried by the current image to obtain an occlusion feature map. The mask language perception module takes the two-dimensional feature vector and the character index of the current image as input; in a specific implementation, a technician can build it from Transformer units and a sigmoid layer: the Transformer units improve the feature representation capability, and, combined with the character index information, the sigmoid layer produces the character mask mapping, from which the occlusion feature map is generated.
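A minimal stand-in for the last stage of the mask language perception module might look as follows. Two loud assumptions: characters are laid out left to right across the feature map, and a fixed positional prior replaces the scores that the Transformer units would actually produce; only the sigmoid step matches the construction described above.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def character_mask_map(feature_map: np.ndarray, char_index: int,
                       num_chars: int) -> np.ndarray:
    """Toy character mask mapping: a soft (h, w) occlusion map whose
    values lie in (0, 1), high over the band of the indexed character.
    The positional prior stands in for Transformer-produced scores."""
    h, w = feature_map.shape[:2]
    cols = np.arange(w)
    # Assume the i-th of num_chars characters occupies the i-th vertical
    # band of the map (an assumption, not the patent's construction).
    band = (cols * num_chars) // max(w, 1) == char_index
    scores = np.where(band, 4.0, -4.0)        # high score inside the band
    return np.tile(sigmoid(scores), (h, 1))   # broadcast to all rows
```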
S034: combining the current image, its two-dimensional feature vector and the occlusion feature map to obtain an image with occluded characters. To guide the learning of the occlusion feature map, two parallel branches based on weakly supervised complementary learning can be designed when building the mask language perception module.
Weakly supervised complementary learning aims to guide the occlusion feature map to cover more of the occluded character's region, thereby complementarily making the complementary feature map contain more of the other characters' regions. In the first branch, the element-wise product of the current image's two-dimensional feature vector and the occlusion feature map generates a feature map containing the visual semantics of the occluded character. Correspondingly, in the second branch, the element-wise product of the two-dimensional feature vector and the complement of the occlusion feature map generates a feature map containing the visual semantics of the other characters. In this way, the complementary learning of the two branches guides the occlusion feature map to cover only the character at the corresponding position, without overlapping other characters.
The weights of the Transformer units and the prediction layer are shared between the two parallel branches, which serves both feature representation enhancement and semantic guidance. The supervision for weakly supervised complementary learning is obtained automatically from the original word-level labels and the randomly generated character indices, so the mask language perception module can generate accurate character mask mappings without additional labels.
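The two complementary branches described above reduce to a pair of element-wise products that partition the feature map; a sketch, with illustrative names:

```python
import numpy as np

def complementary_branches(features: np.ndarray, occlusion_map: np.ndarray):
    """Weakly supervised complementary learning sketch: branch one keeps
    the occluded character's visual semantics, branch two keeps every
    other character's. Because the two products sum back to the original
    features, training pushes the occlusion map to cover exactly the
    indexed character and nothing else."""
    occluded_branch = features * occlusion_map            # indexed character
    remaining_branch = features * (1.0 - occlusion_map)   # all other characters
    return occluded_branch, remaining_branch
```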
S035: inputting the image with occluded characters into the visual reasoning module, which captures long-distance information in the visual space and predicts the corresponding text line recognition content.
The visual reasoning module comprises a visual semantic reasoning layer and a parallel prediction layer. The visual semantic reasoning layer consists of several Transformer units and can effectively model long-term dependencies regardless of word length.
Because the mask language perception module accurately occludes character information during the training stage, the visual semantic reasoning layer is guided to predict the dependencies between the visual features of characters and thereby infer the semantics of the occluded character. Under word-level supervision, the layer thus learns to actively model the language information in the visual context to assist recognition. Thanks to this design, in the subsequent model test stage it can adaptively draw on language information to enhance visual features whenever the current visual semantics are confused (e.g. by stain occlusion or sketch marks). Visualizing the feature maps produced by the visual semantic reasoning layer helps one understand how the learned language information improves recognition performance.
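How a Transformer unit models dependencies independent of word length can be illustrated with a weight-free, single-head self-attention over a (sequence length, dimension) feature matrix. This is a toy stand-in for one unit of the visual semantic reasoning layer, not the patent's construction: every output position attends to every input position, which is what makes the captured dependencies long-range.

```python
import numpy as np

def self_attention(x: np.ndarray) -> np.ndarray:
    """Weight-free single-head self-attention over (seq_len, dim)
    character features; real Transformer units would add learned
    query/key/value projections, multiple heads and a feed-forward
    sublayer."""
    scores = x @ x.T / np.sqrt(x.shape[1])        # pairwise similarities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over each row
    return weights @ x                            # context-mixed features
```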
S036: repeating S031-S035 until all images in the training data set have been input into the image text recognition model to be trained, obtaining the image text recognition model to be tested.
Further, in this embodiment, S04 includes the following substeps:
S041A: selecting a plurality of test images to assemble a first test data set;
S042A: removing the mask language perception module from the image text recognition model to be tested and connecting the feature extraction network directly to the visual reasoning module;
S043A: sequentially inputting each test image in the test data set into the feature extraction network, which extracts features of the currently input test image to obtain its two-dimensional feature vector; inputting this vector directly into the visual reasoning module, which predicts the corresponding text line recognition content to obtain the text recognition result of the current test image;
S044A: repeating S041A-S043A until all test images in the test data set have been input into the image text recognition model to be tested, obtaining a first image text recognition model to be evaluated.
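Test procedure A amounts to bypassing the mask language perception module at test time. Assuming the model stages are plain callables (an illustrative simplification of S042A), the surgery is:

```python
def build_test_pipeline(feature_extractor, visual_reasoner):
    """Procedure-A sketch: the mask language perception module, which is
    only needed during training, is dropped, and features flow straight
    from the extraction network to the visual reasoning module. Both
    arguments are arbitrary callables; the names are illustrative."""
    def pipeline(image):
        return visual_reasoner(feature_extractor(image))
    return pipeline
```

With toy stand-ins for the two stages, `build_test_pipeline(f, g)(x)` simply computes `g(f(x))`, i.e. recognition without any mask step in between.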
Further, in this embodiment, S04 further includes the following substeps:
S041B: selecting a plurality of test images, making a word-level label for each, and randomly generating a character index according to the length of the word labelled on the current test image; collecting all test images carrying word-level labels and character indices into a second test data set;
S042B: sequentially inputting each test image in the test data set into the image text recognition model to be tested to obtain the text recognition result of the current test image;
S043B: repeating S041B-S042B until all test images in the test data set have been input into the image text recognition model to be tested, obtaining a second image text recognition model to be evaluated.
Further, in this embodiment, S04 further includes: evaluating the first and second image text recognition models to be evaluated and selecting the better one as the final image text recognition model.
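The final selection step of S04 can be sketched as follows, assuming exact-match accuracy as the evaluation metric (the patent does not specify one):

```python
def select_final_model(candidates, test_set):
    """Score each candidate recognition model on (image, ground truth)
    pairs and keep the most accurate one. `candidates` maps a name to a
    callable model; names and metric are assumptions for illustration."""
    def accuracy(model):
        hits = sum(model(img) == truth for img, truth in test_set)
        return hits / len(test_set)
    return max(candidates, key=lambda name: accuracy(candidates[name]))
```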
In this embodiment, an image recognition device of an intelligent scanning pen is also provided; the device stores a program implementing the image recognition method described above and controls its execution.
In this embodiment, an intelligent scanning pen is also provided, which employs the image recognition device described above.
Of course, the present application can be implemented in various other embodiments; based on this embodiment, those skilled in the art can derive other embodiments without any inventive effort, all of which fall within the scope of the present application.

Claims (10)

1. An image recognition method of an intelligent scanning pen is characterized by comprising the following steps:
S0: constructing, training and testing to obtain an image text recognition model;
S1: image acquisition: setting up an image collector and scanning to obtain an original image of the target area;
S2: image preprocessing: preprocessing the original image to obtain a preprocessed image;
S4: performing text recognition on the preprocessed image with the image text recognition model, obtaining the language information and visual features in the image, and outputting the text content contained in the preprocessed image.
2. The image recognition method of an intelligent scanning pen according to claim 1, wherein S0: constructing, training and testing the image text recognition model comprises the following substeps:
S01: constructing the model;
S02: preparing a training data set;
S03: training the model;
S04: testing the model.
3. The image recognition method of an intelligent scanning pen according to claim 2, wherein S01: constructing the model specifically comprises:
S011: constructing a feature extraction network, a mask language perception module and a visual reasoning module respectively;
S012: connecting the feature extraction network, the mask language perception module and the visual reasoning module in sequence to construct the image text recognition model to be trained.
4. The image recognition method of an intelligent scanning pen according to claim 3, wherein S02 specifically comprises:
S021: selecting a plurality of training images;
S022: making a word-level label for each training image;
S023: for each word-labelled training image in the data set, randomly generating a character index according to the length of the word labelled on the current image;
S024: collecting all training images with word-level labels and character indices to form the training data set.
5. The image recognition method of an intelligent scanning pen according to claim 4, wherein S03: training the model comprises the following substeps:
S031: selecting an image from the training data set and inputting it into the image text recognition model to be trained;
S032: the feature extraction network performs feature extraction on the currently input image to obtain a two-dimensional feature vector of the current image;
S033: the mask language perception module performs character mask mapping on the two-dimensional feature vector of the current image, the word-level label carried by the current image and the character index carried by the current image to obtain an occlusion feature map;
S034: combining the current image, its two-dimensional feature vector and the occlusion feature map to obtain an image with occluded characters;
S035: inputting the image with occluded characters into the visual reasoning module, which captures long-distance information in the visual space and predicts the corresponding text line recognition content;
S036: repeating S031-S035 until all images in the training data set have been input into the image text recognition model to be trained, obtaining the image text recognition model to be tested.
6. The image recognition method of an intelligent scanning pen according to claim 5, wherein S04 comprises the following substeps:
S041A: selecting a plurality of test images to assemble a first test data set;
S042A: removing the mask language perception module from the image text recognition model to be tested and connecting the feature extraction network directly to the visual reasoning module;
S043A: sequentially inputting each test image in the test data set into the feature extraction network, which extracts features of the currently input test image to obtain its two-dimensional feature vector; inputting this vector directly into the visual reasoning module, which predicts the corresponding text line recognition content to obtain the text recognition result of the current test image;
S044A: repeating S041A-S043A until all test images in the test data set have been input into the image text recognition model to be tested, obtaining a first image text recognition model to be evaluated.
7. The image recognition method of an intelligent scanning pen according to claim 6, wherein S04 further comprises the sub-steps of:
S041B: selecting a plurality of test images, making word-level labels for each test image, and randomly generating character indexes according to the lengths of the words labeled on the current test image; collecting all the test images carrying word-level labels and character indexes into a second test data set;
S042B: sequentially inputting each test image in the test data set into a text recognition model of the image to be tested to obtain a text recognition result of the current test image;
S043B, circulating S041B-S042B until all the test images in the test data set are input into the image text recognition model to be trained, and obtaining a second image text recognition model to be evaluated.
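Building the second test data set (S041B) amounts to pairing each image with a word-level label and a character index drawn uniformly from the label's length. A minimal sketch, with an assumed record format:

```python
import random

def make_second_test_set(word_labels, seed=0):
    # For each labeled test image, randomly generate a character index
    # bounded by the length of its word-level label (S041B).
    rng = random.Random(seed)  # seeded for reproducibility
    return [{"label": lbl, "char_index": rng.randrange(len(lbl))}
            for lbl in word_labels]

second_test_set = make_second_test_set(["hello", "pen", "scanner"])
```

Each record would then be fed through the full model (masking stage included) in S042B, so this path exercises the masked-language perception module that the first path removes.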
8. The image recognition method of an intelligent scanning pen according to claim 6, wherein S04 further comprises: evaluating the first image text recognition model to be evaluated and the second image text recognition model to be evaluated, and selecting the better of the two as the final image text recognition model.
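The selection step in claim 8 reduces to scoring both candidate models on held-out samples and keeping the higher-scoring one. The metric below (exact-match accuracy) is an assumption for illustration; the patent does not specify the evaluation criterion.

```python
def accuracy(model, samples):
    # Fraction of (image, label) pairs the model recognizes exactly.
    return sum(model(image) == label for image, label in samples) / len(samples)

def select_final_model(model_a, model_b, samples):
    # Keep whichever candidate scores higher on the evaluation samples.
    return model_a if accuracy(model_a, samples) >= accuracy(model_b, samples) else model_b

# Toy stand-ins for the two candidate models:
eval_samples = [("abc", "abc"), ("def", "deg")]
model_first = lambda s: s           # returns input unchanged: 1/2 correct
model_second = lambda s: s.upper()  # returns uppercase: 0/2 correct
final_model = select_final_model(model_first, model_second, eval_samples)
```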
9. An image recognition apparatus of an intelligent scanning pen, characterized in that the apparatus stores, and is controlled to execute, the image recognition method of an intelligent scanning pen as claimed in any one of claims 1 to 8.
10. An intelligent scanning pen, characterized in that it employs the image recognition apparatus of an intelligent scanning pen as claimed in claim 9.
CN202311182898.4A 2023-09-14 2023-09-14 Image identification method and device of intelligent scanning pen and intelligent scanning pen Pending CN116912856A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311182898.4A CN116912856A (en) 2023-09-14 2023-09-14 Image identification method and device of intelligent scanning pen and intelligent scanning pen

Publications (1)

Publication Number Publication Date
CN116912856A true CN116912856A (en) 2023-10-20

Family

ID=88360690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311182898.4A Pending CN116912856A (en) 2023-09-14 2023-09-14 Image identification method and device of intelligent scanning pen and intelligent scanning pen

Country Status (1)

Country Link
CN (1) CN116912856A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112541501A (en) * 2020-12-18 2021-03-23 北京中科研究院 Scene character recognition method based on visual language modeling network
CN112733768A (en) * 2021-01-15 2021-04-30 中国科学技术大学 Natural scene text recognition method and device based on bidirectional characteristic language model
US20220415072A1 (en) * 2022-02-25 2022-12-29 Beijing Baidu Netcom Science Technology Co., Ltd. Image processing method, text recognition method and apparatus
CN116704519A (en) * 2023-06-07 2023-09-05 深圳如布科技有限公司 Character recognition method, character recognition device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Deli Yu et al.: "Towards Accurate Scene Text Recognition with Semantic Reasoning Networks", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12110-12119 *
Li Lin et al.: "Classification of defaced and occluded license plates based on convolutional neural networks", Computer Science, no. 1, pages 213-219 *

Similar Documents

Publication Publication Date Title
KR102266529B1 (en) Method, apparatus, device and readable storage medium for image-based data processing
CN111027562B (en) Optical character recognition method based on multiscale CNN and RNN combined with attention mechanism
CN109670494B (en) Text detection method and system with recognition confidence
CN113254654B (en) Model training method, text recognition method, device, equipment and medium
CN110096987B (en) Dual-path 3DCNN model-based mute action recognition method
CN110796018A (en) Hand motion recognition method based on depth image and color image
CN111738169A (en) Handwriting formula recognition method based on end-to-end network model
CN112712069A (en) Question judging method and device, electronic equipment and storage medium
CN112836702B (en) Text recognition method based on multi-scale feature extraction
CN117746078B (en) Object detection method and system based on user-defined category
CN113592807A (en) Training method, image quality determination method and device, and electronic equipment
CN117010500A (en) Visual knowledge reasoning question-answering method based on multi-source heterogeneous knowledge joint enhancement
CN116564355A (en) Multi-mode emotion recognition method, system, equipment and medium based on self-attention mechanism fusion
CN117892140B (en) Visual question and answer and model training method and device thereof, electronic equipment and storage medium
CN115759293A (en) Model training method, image retrieval device and electronic equipment
CN114462418B (en) Event detection method, system, intelligent terminal and computer readable storage medium
CN111914822A (en) Text image labeling method and device, computer readable storage medium and equipment
Sanmitra et al. Machine Learning Based Real Time Sign Language Detection
KR20190121593A (en) Sign language recognition system
CN116912856A (en) Image identification method and device of intelligent scanning pen and intelligent scanning pen
CN116935411A (en) Radical-level ancient character recognition method based on character decomposition and reconstruction
CN116311269A (en) Formula picture identification question judging system
CN111753618A (en) Image recognition method and device, computer equipment and computer readable storage medium
CN113657092B (en) Method, device, equipment and medium for identifying tag
CN113761209B (en) Text splicing method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination