CN116912856A - Image identification method and device of intelligent scanning pen and intelligent scanning pen
- Publication number: CN116912856A
- Application number: CN202311182898.4A
- Authority: CN (China)
- Prior art keywords: image, text recognition, test, model, text
- Prior art date: 2023-09-14
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06V30/19127 — Character recognition; recognition using electronic means; extracting features by transforming the feature space, e.g. multidimensional scaling; mappings, e.g. subspace methods
- G06V10/82 — Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
- G06V30/142 — Character recognition; image acquisition using hand-held instruments; constructional details of the instruments
- G06V30/19147 — Obtaining sets of training patterns; bootstrap methods, e.g. bagging or boosting
- G06V30/1916 — Validation; performance evaluation
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The application provides an image recognition method for an intelligent scanning pen, in the technical field of artificial intelligence. The method comprises the following steps: S0: constructing, training and testing an image text recognition model; S1: image acquisition; S2: image preprocessing; S4: performing text recognition on the preprocessed image with the image text recognition model, jointly acquiring the language information and visual features in the image, and outputting the text content contained in the preprocessed image. The application also provides a corresponding device and an intelligent scanning pen. Compared with the prior art, the image recognition method provided by the application needs no extra language model: it converts the original two-step recognition of image text content into one-step recognition, adaptively considers visual and language information in a unified structure, and exhibits stronger language capability while remaining efficient.
Description
Technical Field
The application relates to the technical field of artificial intelligence, in particular to an image recognition method and device for an intelligent scanning pen, and to the scanning pen itself.
Background
A scanning pen is a portable electronic device that converts handwritten or printed text into digital form after scanning, so that the text can then be displayed, edited, played back via TTS, translated, shared and so on.
Its working process mainly comprises image acquisition, image processing and image recognition. First, the user holds the scanning pen and aims it at the text to be scanned; an optical sensor mounted at the pen tip then captures images of the text under the tip. The acquired images are processed inside the pen and finally fed into an image recognition model for text recognition, yielding the text information in the image, which is then displayed, translated or otherwise processed according to the user's needs.
Image recognition algorithms applied to scanning pens in the prior art generally use a two-step method: after the processed image is fed into the image recognition model, the model first performs visual prediction on the image using a visual analysis method to obtain a prediction result, and then corrects that result with a text model to obtain the final text recognition result. With this approach the image recognition model builds a visual prediction module and a language correction module separately, and the two modules consider the visual information and the text information of the input image independently. Chinese patent application No. 202310604317.5, for example, describes a text recognition method and device for a dictionary pen: the word content in a text image acquired while the dictionary pen scans is recognized; when the word content cannot be matched against the dictionary pen's word stock, character information is compared and the character state is judged; if the characters are complete, a text error is determined; if the characters are incomplete, the recognized complete characters are matched against the word stock, the handling of the incomplete characters is decided (discarding them or determining their keyword), and the remaining characters are matched and recognized against the word stock based on the semantic association between the keyword and the remaining characters. By discarding or completing incomplete characters at the start and end points of text recognition, the method is better suited to dense text and improves the accuracy and efficiency of text recognition.
When such a method, with separate visual prediction and text language correction modules performing image prediction followed by language correction, is applied to an actual scanning pen, the extra processing makes it slow. Moreover, because the visual prediction module and the language correction module work independently, sketch marks, background textures, stain occlusion and the like in the text image captured at the pen tip easily disturb the final text recognition result, so the accuracy of the scanning pen is not ideal.
Disclosure of Invention
To solve the above problems, the present application provides an image recognition method for an intelligent scanning pen that treats the visual and language models as a whole and obtains language information and visual features from the image jointly, without requiring an additional language model; applied to a scanning pen, it achieves higher accuracy and faster text recognition.
The application also provides an image recognition device for the intelligent scanning pen, and an intelligent scanning pen, to which the method is applied.
The technical scheme provided by the application is as follows:
an image recognition method of an intelligent scanning pen, the method comprising the following steps:
S0: constructing, training and testing an image text recognition model;
S1: image acquisition: setting up an image collector and scanning to obtain an original image of the target area;
S2: image preprocessing: preprocessing the original image to obtain a preprocessed image;
S4: performing text recognition on the preprocessed image with the image text recognition model, jointly acquiring the language information and visual features in the image, and outputting the text content contained in the preprocessed image.
Further, S0: constructing, training and testing the image text recognition model comprises the following substeps:
S01: constructing the model;
S02: preparing a training data set;
S03: training the model;
S04: testing the model.
S01: the model construction method comprises the following substeps:
further, S01: the construction model specifically comprises the following steps: s011: respectively constructing a feature extraction network, a mask language perception module and a visual reasoning module;
s012: the feature extraction network, the mask language sensing module and the visual reasoning module are sequentially connected to construct the to-be-trained image text recognition model.
Further, S02 specifically comprises:
S021: selecting a plurality of training images;
S022: making a word-level label for each training image;
S023: for each word-labelled training image in the data set, randomly generating a character index according to the length of the word labelled on the current image;
S024: collecting all training images carrying word-level labels and character indices to form the training data set.
Further, S03: training the model comprises the following substeps:
S031: selecting an image from the training data set and inputting it into the image text recognition model to be trained;
S032: the feature extraction network performs feature extraction on the currently input image to obtain a two-dimensional feature vector of the current image;
S033: the mask language perception module performs character mask mapping processing on the two-dimensional feature vector of the current image, the word-level label carried by the current image and the character index carried by the current image, to obtain an occlusion feature map;
S034: combining the current image, its two-dimensional feature vector and the occlusion feature map to obtain an image with occluded characters;
S035: inputting the image with occluded characters into the visual reasoning module, which captures long-range information in the visual space and predicts the corresponding text line content;
S036: repeating S031-S035 until all images in the training data set have been input into the image text recognition model to be trained, yielding the image text recognition model to be tested.
Further, S04 comprises the following substeps:
S041A: selecting a plurality of test images and assembling them into a first test data set;
S042A: removing the mask language perception module from the image text recognition model to be tested, and connecting the feature extraction network directly to the visual reasoning module;
S043A: inputting each test image in the first test data set into the feature extraction network in turn; the feature extraction network extracts features of the currently input test image to obtain its two-dimensional feature vector, which is input directly into the visual reasoning module; the visual reasoning module predicts the corresponding text line content, giving the text recognition result of the current test image;
S044A: repeating S041A-S043A until all test images in the first test data set have been processed, yielding a first image text recognition model to be evaluated.
Further, S04 further comprises the following substeps:
S041B: selecting a plurality of test images, making a word-level label for each test image, and randomly generating a character index according to the length of the word labelled on the current test image; collecting all test images carrying word-level labels and character indices into a second test data set;
S042B: inputting each test image in the second test data set into the image text recognition model to be tested in turn, obtaining the text recognition result of the current test image;
S043B: repeating S041B-S042B until all test images in the second test data set have been processed, yielding a second image text recognition model to be evaluated.
Further, S04 further comprises: evaluating the first and the second image text recognition models to be evaluated, and selecting the better one as the final image text recognition model.
The application also provides an image recognition device of the intelligent scanning pen, which stores the above image recognition method of the intelligent scanning pen and controls its execution.
The application also provides an intelligent scanning pen, which employs the above image recognition device of the intelligent scanning pen.
The beneficial effects of the application are as follows: compared with the prior art, the image recognition method provided by the application needs no extra language model; it converts the original two-step recognition of image text content into one-step recognition, adaptively considers visual and language information in a unified structure, and exhibits stronger language capability while remaining efficient.
Drawings
Fig. 1 is a flowchart of an image recognition method of an intelligent scanning pen implemented in a specific embodiment.
Fig. 2 is a flowchart of the substeps of S03 (training the model) in the image recognition method of the intelligent scanning pen according to the embodiment.
The realization, functional characteristics and advantages of the present application are further described below with reference to the accompanying drawings and in combination with the embodiments.
Description of the embodiments
In order to describe the technical scheme of the application more clearly and completely, the application is further described below with reference to the accompanying drawings.
Please refer to fig. 1-2.
In this embodiment, an image recognition method of an intelligent scanning pen is provided, the method comprising the following steps:
S0: constructing, training and testing an image text recognition model;
S1: image acquisition: setting up an image collector and scanning to obtain an original image of the target area;
S2: image preprocessing: preprocessing the original image to obtain a preprocessed image;
S4: performing text recognition on the preprocessed image with the image text recognition model, jointly acquiring the language information and visual features in the image, and outputting the text content contained in the preprocessed image.
Further, in this embodiment, S0: constructing, training and testing the image text recognition model comprises the following substeps:
S01: constructing the model;
S02: preparing a training data set;
S03: training the model;
S04: testing the model.
Further, in this embodiment, S01: constructing the model specifically comprises:
S011: constructing a feature extraction network, a mask language perception module and a visual reasoning module respectively;
S012: connecting the feature extraction network, the mask language perception module and the visual reasoning module in sequence to construct the image text recognition model to be trained.
Further, in this embodiment, S02 specifically comprises:
S021: selecting a plurality of training images;
S022: making a word-level label for each training image;
S023: for each word-labelled training image in the data set, randomly generating a character index according to the length of the word labelled on the current image, where the character index denotes the index of the character to be occluded and, for an input word image of any length, the index of one character is sampled at random;
S024: collecting all training images carrying word-level labels and character indices to form the training data set.
Further, in this embodiment, S03: training the model comprises the following substeps:
S031: selecting an image from the training data set and inputting it into the image text recognition model to be trained;
S032: the feature extraction network performs feature extraction on the currently input image to obtain a two-dimensional feature vector of the current image;
S033: the mask language perception module performs character mask mapping processing on the two-dimensional feature vector of the current image, the word-level label carried by the current image and the character index carried by the current image, to obtain an occlusion feature map. The mask language perception module takes the two-dimensional feature vector and the character index of the current image as input; in a specific implementation, a technician can build the module from a Transformer unit and a sigmoid layer, using the Transformer unit to improve the feature representation capability and, combining the character index information, obtaining the character mask mapping from the sigmoid layer, from which the occlusion feature map is generated.
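Such a module might be sketched in PyTorch as below; the feature dimensions, the embedding of the character index as a query token, and the single encoder layer are assumptions of this sketch — the application fixes only the Transformer-unit-plus-sigmoid structure:

```python
import torch
import torch.nn as nn

class MaskLanguagePerception(nn.Module):
    """Sketch of S033: a Transformer unit refines the flattened 2D features,
    the character index enters as a learned query embedding, and a sigmoid
    layer yields the character mask mapping, i.e. the occlusion feature map."""
    def __init__(self, dim=256, max_chars=32, nhead=8):
        super().__init__()
        self.index_embed = nn.Embedding(max_chars, dim)
        self.transformer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead,
                                                      batch_first=True)
        self.to_mask = nn.Sequential(nn.Linear(dim, 1), nn.Sigmoid())

    def forward(self, feats, char_index):
        # feats: (B, L, dim) flattened 2D feature vector; char_index: (B,)
        query = self.index_embed(char_index).unsqueeze(1)   # (B, 1, dim)
        refined = self.transformer(torch.cat([query, feats], dim=1))
        return self.to_mask(refined[:, 1:, :])              # (B, L, 1), values in [0, 1]
```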
S034: combining the current image, the two-dimensional feature vector of the current image and the shielding feature map to obtain an image with shielding characters; in order to guide the learning process of the shielding characteristic mapping, two parallel branches can be designed on the basis of weak supervision complementary learning when the mask language perception module is built.
The weakly supervised complementary learning is intended to guide the occlusion feature map to cover areas of more occluded characters, thereby complementarily causing the occlusion feature map to contain areas of more other characters. In the first branch, the element-wise product between the two-dimensional feature vector of the current image and the occlusion feature map is multiplied to generate a feature map containing the visual semantics of the occluded character. Correspondingly, in a second branch, the elemental product between the two-dimensional feature vector of the current image and the occlusion feature map is used to generate a feature map containing the visual semantics of the other characters. In this way, the first and second branch complementary learning processes will direct the occlusion feature map to cover only the characters in the corresponding location, and not overlap with other characters.
The weights of the transducer elements and the prediction layer are shared between two parallel arranged branches, which can be used for feature representation enhancement and semantic guidance. The supervision of weakly supervised complementary learning is automatically obtained using the original word level labels and randomly generated character indices, so that the mask language perception module can automatically generate accurate character mask mappings without additional labels.
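The two branches could be sketched as follows; the exact losses attached to each branch are not specified by the application and are omitted, and `shared_head` stands in for the weight-shared Transformer unit and prediction layer:

```python
def complementary_branches(feats, occlusion_map, shared_head):
    """Sketch of the weakly supervised complementary learning: branch one
    keeps the occluded character's region, branch two keeps its complement;
    both pass through the same weight-shared prediction head."""
    occluded_char = shared_head(feats * occlusion_map)          # semantics of the occluded character
    other_chars = shared_head(feats * (1.0 - occlusion_map))    # semantics of the other characters
    return occluded_char, other_chars
```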
S035: inputting the image with the shielding characters into a visual reasoning module, capturing long-distance information in a visual space by the visual reasoning module, and predicting the corresponding text line identification content;
the visual reasoning module comprises a visual semantic reasoning layer and a parallel prediction layer. The visual semantic reasoning layer consists of a plurality of transformer units, and can effectively model long-term dependency relationship without being influenced by word length.
Because the mask language perception module accurately blocks character information in a training stage, the visual semantic reasoning layer is guided to predict the dependency relationship between the visual characteristics of the characters, so that the semantics of the blocked characters are deduced. Therefore, under word-level supervision, the visual semantic reasoning layer learns to actively model language information in the visual context to assist in recognition, and because of the design, the visual semantic reasoning layer can adaptively consider the language information for visual feature enhancement under the condition of current visual semantic confusion (such as stain shielding, trace sketching and the like) in a subsequent model test stage. After the feature mapping generated by the visual semantic reasoning layer is visualized, how the learned language information improves the recognition performance can be better understood.
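A minimal sketch of the visual reasoning module follows; the layer count, width and vocabulary size are assumptions of this sketch:

```python
import torch.nn as nn

class VisualReasoning(nn.Module):
    """Sketch of S035: a stack of Transformer units (the visual semantic
    reasoning layer) models long-range dependencies regardless of word
    length; a parallel prediction layer classifies every position at once."""
    def __init__(self, dim=256, nhead=8, num_layers=3, vocab_size=97):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead,
                                           batch_first=True)
        self.reasoning = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.predict = nn.Linear(dim, vocab_size)  # parallel prediction layer

    def forward(self, feats):
        # feats: (B, L, dim) -> per-position character logits, predicted in parallel
        return self.predict(self.reasoning(feats))
```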
S036: and (3) circulating S031-S035 until all images in the training data set are input into the image text recognition model to be trained, and obtaining the image text recognition model to be tested.
Further, in this embodiment, S04 comprises the following substeps:
S041A: selecting a plurality of test images and assembling them into a first test data set;
S042A: removing the mask language perception module from the image text recognition model to be tested, and connecting the feature extraction network directly to the visual reasoning module;
S043A: inputting each test image in the first test data set into the feature extraction network in turn; the feature extraction network extracts features of the currently input test image to obtain its two-dimensional feature vector, which is input directly into the visual reasoning module; the visual reasoning module predicts the corresponding text line content, giving the text recognition result of the current test image;
S044A: repeating S041A-S043A until all test images in the first test data set have been processed, yielding a first image text recognition model to be evaluated.
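Since the sketched forward pass above already skips the mask branch outside training, S042A-S043A reduce to the following (batch layout assumed):

```python
import torch

@torch.no_grad()
def test_without_mask_module(model, test_loader):
    """Sketch of S042A-S043A: at test time the mask language perception module
    is bypassed, so features flow straight from the feature extraction network
    into the visual reasoning module."""
    model.eval()
    predictions = []
    for batch in test_loader:
        feats = model.feature_net(batch["image"])
        logits = model.reasoning_module(feats)
        predictions.append(logits.argmax(dim=-1))  # per-position character ids
    return predictions
```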
Further, in this embodiment, S04 further comprises the following substeps:
S041B: selecting a plurality of test images, making a word-level label for each test image, and randomly generating a character index according to the length of the word labelled on the current test image; collecting all test images carrying word-level labels and character indices into a second test data set;
S042B: inputting each test image in the second test data set into the image text recognition model to be tested in turn, obtaining the text recognition result of the current test image;
S043B: repeating S041B-S042B until all test images in the second test data set have been processed, yielding a second image text recognition model to be evaluated.
Further, in this embodiment, S04 further comprises: evaluating the first and the second image text recognition models to be evaluated, and selecting the better one as the final image text recognition model.
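The final selection could be sketched as a straightforward comparison; word-level accuracy is an assumed metric here, as the application does not fix the evaluation criterion:

```python
import torch

@torch.no_grad()
def accuracy(model, loader):
    """Sketch of the evaluation at the end of S04: fraction of test images
    whose every character is recognized correctly."""
    model.eval()
    correct = total = 0
    for batch in loader:
        pred = model(batch["image"]).argmax(dim=-1)   # (B, L)
        correct += (pred == batch["target"]).all(dim=-1).sum().item()
        total += pred.size(0)
    return correct / total

# keep whichever candidate scores better as the final model
# final = model_a if accuracy(model_a, loader_a) >= accuracy(model_b, loader_b) else model_b
```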
In this embodiment, an image recognition apparatus of an intelligent scanning pen is also provided; the apparatus stores the image recognition method of the intelligent scanning pen described above and controls its execution.
In this embodiment, an intelligent scanning pen is also provided, which employs the image recognition apparatus of the intelligent scanning pen described above.
Of course, the present application can be implemented in various other embodiments; based on this embodiment, those skilled in the art can derive other embodiments without inventive effort, all of which fall within the scope of protection of the present application.
Claims (10)
1. An image recognition method of an intelligent scanning pen, characterized by comprising the following steps:
S0: constructing, training and testing an image text recognition model;
S1: image acquisition: setting up an image collector and scanning to obtain an original image of the target area;
S2: image preprocessing: preprocessing the original image to obtain a preprocessed image;
S4: performing text recognition on the preprocessed image with the image text recognition model, jointly acquiring the language information and visual features in the image, and outputting the text content contained in the preprocessed image.
2. The image recognition method of an intelligent scanning pen according to claim 1, wherein S0: constructing, training and testing the image text recognition model comprises the following substeps:
S01: constructing the model;
S02: preparing a training data set;
S03: training the model;
S04: testing the model.
3. The image recognition method of an intelligent scanning pen according to claim 2, wherein S01: constructing the model specifically comprises:
S011: constructing a feature extraction network, a mask language perception module and a visual reasoning module respectively;
S012: connecting the feature extraction network, the mask language perception module and the visual reasoning module in sequence to construct the image text recognition model to be trained.
4. The image recognition method of an intelligent scanning pen according to claim 3, wherein S02 specifically comprises:
S021: selecting a plurality of training images;
S022: making a word-level label for each training image;
S023: for each word-labelled training image in the data set, randomly generating a character index according to the length of the word labelled on the current image;
S024: collecting all training images carrying word-level labels and character indices to form the training data set.
5. The image recognition method of an intelligent scanning pen according to claim 4, wherein S03: training the model comprises the following substeps:
S031: selecting an image from the training data set and inputting it into the image text recognition model to be trained;
S032: the feature extraction network performs feature extraction on the currently input image to obtain a two-dimensional feature vector of the current image;
S033: the mask language perception module performs character mask mapping processing on the two-dimensional feature vector of the current image, the word-level label carried by the current image and the character index carried by the current image, to obtain an occlusion feature map;
S034: combining the current image, its two-dimensional feature vector and the occlusion feature map to obtain an image with occluded characters;
S035: inputting the image with occluded characters into the visual reasoning module, which captures long-range information in the visual space and predicts the corresponding text line content;
S036: repeating S031-S035 until all images in the training data set have been input into the image text recognition model to be trained, yielding the image text recognition model to be tested.
6. The image recognition method of an intelligent scanning pen according to claim 5, wherein S04 comprises the following substeps:
S041A: selecting a plurality of test images and assembling them into a first test data set;
S042A: removing the mask language perception module from the image text recognition model to be tested, and connecting the feature extraction network directly to the visual reasoning module;
S043A: inputting each test image in the first test data set into the feature extraction network in turn; the feature extraction network extracts features of the currently input test image to obtain its two-dimensional feature vector, which is input directly into the visual reasoning module; the visual reasoning module predicts the corresponding text line content, giving the text recognition result of the current test image;
S044A: repeating S041A-S043A until all test images in the first test data set have been processed, yielding a first image text recognition model to be evaluated.
7. The image recognition method of an intelligent scanning pen according to claim 6, wherein S04 further comprises the following substeps:
S041B: selecting a plurality of test images, making a word-level label for each test image, and randomly generating a character index according to the length of the word labelled on the current test image; collecting all test images carrying word-level labels and character indices into a second test data set;
S042B: inputting each test image in the second test data set into the image text recognition model to be tested in turn, obtaining the text recognition result of the current test image;
S043B: repeating S041B-S042B until all test images in the second test data set have been processed, yielding a second image text recognition model to be evaluated.
8. The image recognition method of an intelligent scanning pen according to claim 6, wherein S04 further comprises: evaluating the first and the second image text recognition models to be evaluated, and selecting the better one as the final image text recognition model.
9. An image recognition apparatus of an intelligent scanning pen, characterized in that the apparatus stores the image recognition method of an intelligent scanning pen according to any one of claims 1 to 8 and controls its execution.
10. An intelligent scanning pen, characterized in that it employs the image recognition apparatus of an intelligent scanning pen according to claim 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311182898.4A | 2023-09-14 | 2023-09-14 | Image identification method and device of intelligent scanning pen and intelligent scanning pen |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116912856A | 2023-10-20 |

Family
ID=88360690

Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311182898.4A | Image identification method and device of intelligent scanning pen and intelligent scanning pen | 2023-09-14 | 2023-09-14 |

Country Status (1)
Country | Link |
---|---|
CN | CN116912856A (Pending) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112541501A * | 2020-12-18 | 2021-03-23 | 北京中科研究院 | Scene character recognition method based on visual language modeling network |
CN112733768A * | 2021-01-15 | 2021-04-30 | 中国科学技术大学 | Natural scene text recognition method and device based on bidirectional characteristic language model |
US20220415072A1 * | 2022-02-25 | 2022-12-29 | Beijing Baidu Netcom Science Technology Co., Ltd. | Image processing method, text recognition method and apparatus |
CN116704519A * | 2023-06-07 | 2023-09-05 | 深圳如布科技有限公司 | Character recognition method, character recognition device, electronic equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
DELI YU ET AL.: "Towards Accurate Scene Text Recognition with Semantic Reasoning Networks", 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 12110-12119 * |
LI LIN ET AL.: "基于卷积神经网络的污损遮挡号牌分类" (Classification of defaced and occluded license plates based on convolutional neural networks), Computer Science (计算机科学), no. 1, pages 213-219 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |