CN113221885B - Hierarchical modeling method and system based on whole words and radicals - Google Patents
- Publication number
- CN113221885B (application CN202110523430.1A)
- Authority
- CN
- China
- Prior art keywords
- decoding
- whole word
- whole
- radical
- text line
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06V20/62—Text, e.g. of license plates, overlay texts or captions on TV images
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/047—Probabilistic or stochastic networks
- G06N3/08—Learning methods
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06V30/10—Character recognition
Abstract
The invention relates to a hierarchical modeling method and system based on whole words and radicals, wherein the method comprises the following steps. S1: pass the text line image through a convolutional neural network and a recurrent neural network to obtain the sequence features of the text line image. S2: input the sequence features of the text line image into a whole-word decoding module with an attention mechanism to obtain the context feature vector of the whole word and the decoding result of the whole word. S3: input the context feature vector of the whole word into a radical decoding module to obtain the decoding result of each radical at the whole-word level. S4: fuse the decoding confidences of the whole word and of each radical using a confidence-score fusion strategy to obtain the recognition result of the whole word. The method realizes recognition of both the whole word and its radicals at each decoding moment; by fusing the two decoding confidences, it improves the recognition of low-frequency characters while preserving, as far as possible, the recognition of non-low-frequency characters.
Description
Technical Field
The invention relates to the technical field of electronic information, in particular to a hierarchical modeling method and system based on whole words and radicals.
Background
In daily life, text is an indispensable source of visual information. Compared with other content in images and videos, text often carries stronger semantic information, so extracting and recognizing text in images is of great significance. With the rapid development of deep learning, deep learning models have been widely applied to the field of text recognition. However, deep learning models require large amounts of training data; with few training samples it is difficult to train a model well. In particular, for languages with very large character sets, such as Chinese, recognizing low-frequency characters is difficult.
Existing schemes for recognizing low-frequency characters fall into two categories. The first employs a language model: a language model is trained on a larger text corpus and assists in recognizing low-frequency characters. The second employs radical modeling, i.e. characters are split according to their radicals; for example, the character 科 ("subject") is split into ⿰, 禾 ("rice") and 斗 ("bucket"), where ⿰ indicates a left-right structure.
For the language-model scheme, the recognition of low-frequency characters relies excessively on the language model, and the choice of training corpus strongly affects the recognition of low-frequency characters. For the radical-modeling scheme, whole characters are split too finely; for example, the character 朋 is split into 月 and 月, and each individual component can itself be regarded as a whole character, which increases the recognition difficulty.
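The radical-splitting idea above can be illustrated with a small sketch that expands characters into radical sequences. The decomposition table and function names here are hypothetical stand-ins; a real system would use a full ideographic-description decomposition dictionary covering the Chinese character set:

```python
# Hypothetical radical decomposition table (ideographic-description style).
# A real system would cover the full Chinese character set.
DECOMP = {
    "科": ["⿰", "禾", "斗"],  # left-right structure marker, then the components
    "朋": ["⿰", "月", "月"],
    "明": ["⿰", "日", "月"],
}

def to_radical_sequence(text):
    """Expand each character into its radical sequence when a decomposition
    exists; characters without an entry are kept as whole characters."""
    out = []
    for ch in text:
        out.extend(DECOMP.get(ch, [ch]))
    return out
```

Note how 朋 expands into two copies of 月, illustrating the problem described above: purely radical-level decoding blurs the boundary between components and whole characters.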
Disclosure of Invention
In order to solve the technical problems, the invention provides a hierarchical modeling method and system based on whole words and radicals.
The technical solution of the invention is as follows: a hierarchical modeling method based on whole words and radicals comprises the following steps:
step S1: the method comprises the steps that a text line image passes through a convolutional neural network and a cyclic neural network to obtain sequence characteristics of the text line image;
step S2: inputting the sequence characteristics of the text line images into a whole word decoding module with an attention mechanism to obtain a context characteristic vector of a whole word and a decoding result of the whole word;
step S3: inputting the context feature vector of the whole word into a radical decoding module to obtain the decoding result of each radical under the whole word level;
step S4: and respectively calculating the confidence coefficient of the decoding result of the whole word and the confidence coefficient of the decoding result of each radical by using a confidence coefficient score fusion strategy, and fusing to obtain the final recognition result of the whole word.
Compared with the prior art, the invention has the following advantages:
the invention provides a hierarchical modeling based on whole words and partial radicals, which uses the idea of partial radical modeling for reference, but is different from the existing partial radical modeling method.
Drawings
FIG. 1 is a flowchart of the hierarchical modeling method based on whole words and radicals according to an embodiment of the present invention;
FIG. 2 is a flowchart of step S1 of the method (passing the text line image through a convolutional neural network and a recurrent neural network to obtain the sequence features of the text line image);
FIG. 3 is a flowchart of step S2 of the method (inputting the sequence features of the text line image into a whole-word decoding module with attention mechanism to obtain the context feature vector of the whole word and the decoding result of the whole word);
FIG. 4 is a flowchart of step S3 of the method (inputting the context feature vector of the whole word into a radical decoding module to obtain the decoding result of each radical at the whole-word level);
FIG. 5 is a flowchart of step S4 of the method (calculating the confidence of the whole-word decoding result and the confidence of the radical decoding results with a confidence-score fusion strategy, and fusing them to obtain the final recognition result of the whole word);
FIG. 6 is a schematic diagram of the modeling structure of radicals at the whole-word level according to an embodiment of the present invention;
FIG. 7 is a block diagram of the hierarchical modeling system based on whole words and radicals in an embodiment of the present invention.
Detailed Description
The invention provides a hierarchical modeling method based on whole words and radicals. It adopts the strategy of adding a radical-modeling branch under the whole-word modeling level, so that it realizes not only the recognition of the whole word but also the recognition of its radicals at the same moment. Finally, by fusing the whole-word modeling confidence with the radical-modeling confidence, it improves the recognition of low-frequency characters while preserving, as far as possible, the recognition of non-low-frequency characters.
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings.
Example one
As shown in fig. 1, a hierarchical modeling method based on whole words and radicals according to an embodiment of the present invention includes the following steps:
step S1: the text line image is subjected to a convolutional neural network and a cyclic neural network to obtain sequence characteristics of the text line image;
step S2: inputting the sequence characteristics of the text line images into a whole word decoding module with an attention mechanism to obtain context characteristic vectors of whole words and decoding results of the whole words;
step S3: inputting the context feature vector of the whole word into a radical decoding module to obtain the decoding result of each radical under the whole word level;
step S4: and respectively calculating the confidence coefficient of the decoding result of the whole word and the confidence coefficient of the decoding result of each radical by using a confidence coefficient score fusion strategy, and fusing to obtain the final recognition result of the whole word.
As shown in fig. 2, in one embodiment, the step S1 of passing the text line image through a convolutional neural network and a recurrent neural network to obtain the sequence features of the text line image specifically comprises the following steps:
Step S11: normalize the text line image to obtain a normalized text line image;
In the embodiment of the invention, the text line image is scaled to a height of 64 pixels, and the pixel values are normalized to [-1, 1].
Step S12: inputting the normalized text line image into a convolutional neural network to obtain a feature vector of the text line image;
in this step, the normalized text line image obtained in step S11 is input to a convolutional neural network for feature extraction, but the embodiment of the present invention uses a Resnet29 neural network, the height direction of the image is downsampled 6 times, that is, reduced by 64 times, the width direction of the image is downsampled 3 times, that is, reduced by 8 times, and the size of the obtained text line image feature map is [ H, l, d ], because in the embodiment of the present invention, the image height is 64 pixels, after passing through the Resnet29S neural network, H denotes that the height H of the feature map is 1, l denotes the length of the feature map, and d denotes the number of channels of the feature map. And carrying out slicing operation on the obtained feature map in length so as to obtain feature vectors with the dimension of l being d.
Step S13: and inputting the feature vector into a recurrent neural network to obtain the sequence features of the text line image.
In this step, the l feature vectors obtained in step S12 are input into the recurrent neural network. The embodiment of the invention uses two layers of bidirectional LSTM as the recurrent neural network to output the sequence features of the text line image; the length of the output sequence features is l.
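The tensor shapes in steps S12 and S13 can be checked with the following sketch. The ResNet-29 CNN and the two-layer bidirectional LSTM are replaced by random stand-ins, so only the shapes are meaningful here, not the feature values:

```python
import numpy as np

def encode_text_line(img, d=512):
    """Step S1 shape sketch: a 64-pixel-high image is downsampled 64x in
    height and 8x in width, giving a feature map [1, l, d] that is sliced
    along its length into l vectors of dimension d, then passed through a
    stand-in for the recurrent network."""
    h, w = img.shape                     # h == 64 after normalization
    l = w // 8                           # width downsampled 3 times (2^3 = 8)
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((h // 64, l, d))  # stand-in CNN output
    frames = feature_map[0]              # slicing: l vectors of dimension d
    W_rnn = rng.standard_normal((d, d)) * 0.01  # stand-in for the BiLSTM
    return frames @ W_rnn                # sequence features of length l
```

For a 64x256 input this yields l = 32 sequence frames of dimension d = 512.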
As shown in fig. 3, in one embodiment, the step S2 of inputting the sequence features of the text line image into a whole-word decoding module with attention mechanism to obtain the context feature vector of the whole word and the decoding result of the whole word specifically comprises the following steps:
Step S21: input the sequence features of the text line image into the whole-word decoding module with attention mechanism given by formulas (1) to (3) below, to obtain the context feature vector c_t of the whole word;
e_ti = o(s_{t-1}, h_i)    (1)
α_ti = exp(e_ti) / Σ_{j=1}^{l} exp(e_tj)    (2)
c_t = Σ_{i=1}^{l} α_ti · h_i    (3)
wherein s_{t-1} is the hidden state at the previous moment, h_i is the i-th frame of the sequence features, and o denotes the dot-product operation; α_ti is the attention-mechanism weight and l is the number of feature vectors; c_t is the context feature vector of the whole word;
the whole word decoding module in the embodiment of the invention adopts a layer of unidirectional LSTM.
Step S22: output y of last moment t-1 And context feature vector c t After the operation of the cascade layer, the whole word decoding result y at the current moment is obtained through the classification layer t ;
The classification layer in the embodiment of the invention adopts a Softmax function.
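One decoding step of formulas (1) to (3) followed by the Softmax classifier can be sketched as below. The LSTM state update is omitted, and the weight matrix and embeddings are caller-supplied stand-ins rather than trained parameters:

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax."""
    e = np.exp(x - x.max())
    return e / e.sum()

def whole_word_decode_step(s_prev, H, y_prev_emb, W_cls):
    """s_prev: previous hidden state s_{t-1}, shape (d,);
    H: the l sequence-feature frames h_i, shape (l, d);
    y_prev_emb: embedding of the previous output y_{t-1};
    W_cls: classification-layer weights over the concatenated vector."""
    e = H @ s_prev                    # (1): e_ti = o(s_{t-1}, h_i), dot product
    alpha = softmax(e)                # (2): attention weights alpha_ti
    c_t = alpha @ H                   # (3): context feature vector c_t
    logits = W_cls @ np.concatenate([y_prev_emb, c_t])  # cascade + classify
    return c_t, softmax(logits)       # c_t and the whole-word distribution y_t
```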
As shown in fig. 4, in one embodiment, the step S3 of inputting the context feature vector of the whole word into a radical decoding module to obtain the decoding result of each radical at the whole-word level specifically comprises the following steps:
Step S31: the context feature vector c_t is input into the radical decoding module, whose output at moment t is r_t;
The radical decoding module in the embodiment of the invention also adopts one layer of unidirectional LSTM.
Step S32: r_t passes through a classification layer to obtain the decoding result of the radicals of the whole word;
the classification layer in this step also uses a Softmax function.
Step S33: and counting the number of the split radicals corresponding to each whole word in a batch, and taking the obtained maximum number as the maximum decoding length of the radicals in the batch.
During training, the radical decoding module counts, over all whole words in a batch, the number of radicals each whole word splits into, and the maximum count obtained is used as the maximum decoding length for radical decoding in that batch.
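The batch-level length computation of step S33 can be sketched as follows, with a hypothetical `decomp` table mapping each character to its radical sequence:

```python
def max_radical_length(batch_labels, decomp):
    """Step S33 sketch: over all whole words (characters) in the batch labels,
    count how many radicals each splits into and return the maximum; this
    maximum is used as the radical-decoding length for the batch."""
    return max(
        len(decomp.get(ch, [ch]))  # a character with no entry counts as 1
        for line in batch_labels
        for ch in line
    )
```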
As shown in fig. 5, in one embodiment, the step S4 of calculating the confidence of the whole-word decoding result and the confidence of the radical decoding results respectively by using a confidence-score fusion strategy and fusing them to obtain the final recognition result of the whole word specifically comprises the following steps:
Step S41: judge whether the whole-word decoding result y_t is a Chinese character; if not, take y_t as the final decoding result; if yes, calculate the confidence of the whole-word decoding according to formula (4) and the confidence of the radical decoding according to formula (5), and go to step S42;
-log p_i    (4)
-(1/l_i) Σ_{j=1}^{l_i} log p_i^j    (5)
wherein in formula (4) p_i denotes the recognition probability of the i-th decoded character; in formula (5), l_i denotes the number of radicals the i-th character splits into, and p_i^j denotes the recognition probability of the j-th radical of the i-th decoded character;
Step S42: compare the confidence of the whole-word decoding with the confidence of the radical decoding, and take the result with the smaller value as the final decoding result at this moment;
Step S43: repeat steps S41-S42 for the decoding at each moment until the maximum decoding length is reached or an end symbol is encountered.
Fig. 6 is a schematic diagram of a modeling structure of radicals at the whole word level according to an embodiment of the present invention.
The invention provides a hierarchical modeling method based on whole words and radicals. It adopts a hierarchical structural design in which a radical-modeling branch is added under the whole-word modeling level, with the context feature vector at each moment serving as the input of the radical modeling under the whole word. The method therefore realizes not only the recognition of the whole word but also the recognition of its radicals at the same moment. Finally, through the strategy of fusing the whole-word modeling confidence with the radical-modeling confidence, it improves the recognition of low-frequency characters while preserving, as far as possible, the recognition of non-low-frequency characters.
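The fusion strategy of formulas (4) and (5) can be sketched as follows. Confidences here are negative log-probabilities, so the branch with the smaller value is the more probable one; the function and argument names are illustrative:

```python
import math

def fuse_confidences(word_prob, radical_probs):
    """Pick the whole-word or the radical decoding result for one character:
    formula (4) gives -log p_i for the whole word, formula (5) the averaged
    radical confidence -(1/l_i) * sum_j log p_i^j; per step S42, the smaller
    value wins."""
    word_conf = -math.log(word_prob)
    radical_conf = -sum(math.log(p) for p in radical_probs) / len(radical_probs)
    return "word" if word_conf <= radical_conf else "radical"
```

A confidently decoded whole word (high probability) thus keeps its whole-word result, while an uncertain one falls back to the radical decoding.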
Example two
As shown in fig. 7, an embodiment of the present invention provides a hierarchical modeling system based on whole words and radicals, including the following modules:
a sequence feature obtaining module 51, configured to pass the text line image through a convolutional neural network and a recurrent neural network to obtain the sequence features of the text line image;
a whole word context feature vector and decoding result obtaining module 52, configured to input the sequence features of the text line image into a whole word decoding module with attention mechanism, so as to obtain a whole word context feature vector and a whole word decoding result;
a decoding result module 53 for obtaining each radical, configured to input the context feature vector of the whole word into the radical decoding module, and obtain a decoding result of each radical in the whole word level;
and the recognition result obtaining module 54 is configured to calculate the confidence of the decoding result of the whole word and the confidence of the decoding result of each radical respectively by using a confidence score fusion policy, and perform fusion to obtain a final recognition result of the whole word.
The above examples are provided for the purpose of describing the present invention only and are not intended to limit the scope of the present invention. The scope of the invention is defined by the appended claims. Various equivalent substitutions and modifications can be made without departing from the spirit and principles of the invention, and are intended to be within the scope of the invention.
Claims (4)
1. A hierarchical modeling method based on whole words and radicals is characterized by comprising the following steps:
step S1: the text line image passes through a convolutional neural network and a recurrent neural network to obtain the sequence features of the text line image;
step S2: input the sequence features of the text line image into a whole-word decoding module with attention mechanism to obtain the context feature vector of the whole word and the decoding result of the whole word, specifically comprising:
step S21: input the sequence features of the text line image into the whole-word decoding module with attention mechanism given by formulas (1) to (3) below, to obtain the context feature vector c_t of the whole word;
e_ti = o(s_{t-1}, h_i)    (1)
α_ti = exp(e_ti) / Σ_{j=1}^{l} exp(e_tj)    (2)
c_t = Σ_{i=1}^{l} α_ti · h_i    (3)
wherein s_{t-1} is the hidden state at the previous moment, h_i is the i-th frame of the sequence features, and o denotes the dot-product operation; α_ti is the attention-mechanism weight and l is the number of feature vectors; c_t is the context feature vector of the whole word;
step S22: the output y_{t-1} at the previous moment and the context feature vector c_t are concatenated by the cascade layer and then passed through the classification layer to obtain the whole-word decoding result y_t at the current moment;
step S3: input the context feature vector of the whole word into a radical decoding module to obtain the decoding result of each radical at the whole-word level, specifically comprising:
step S31: the context feature vector c_t is input into the radical decoding module, whose output at moment t is r_t;
step S32: r_t passes through a classification layer to obtain the decoding result of the radicals of the whole word;
step S33: count the number of radicals each whole word in a batch splits into, and take the maximum count as the maximum radical decoding length for the batch;
step S4: calculate the confidence of the whole-word decoding result and the confidence of the radical decoding results respectively by using a confidence-score fusion strategy, and fuse them to obtain the final recognition result of the whole word.
2. The hierarchical modeling method based on whole words and radicals according to claim 1, characterized in that the step S1 of passing the text line image through a convolutional neural network and a recurrent neural network to obtain the sequence features of the text line image specifically includes the following steps:
step S11: normalize the text line image to obtain a normalized text line image;
step S12: input the normalized text line image into the convolutional neural network to obtain the feature vectors of the text line image;
step S13: input the feature vectors into the recurrent neural network to obtain the sequence features of the text line image.
3. The hierarchical modeling method based on whole words and radicals according to claim 1, characterized in that the step S4 of calculating the confidence of the whole-word decoding result and the confidence of the radical decoding results respectively by using a confidence-score fusion strategy and fusing them to obtain the final recognition result of the whole word specifically includes the following steps:
step S41: judge whether the whole-word decoding result y_t is a Chinese character; if not, take y_t as the final decoding result; if yes, calculate the confidence of the whole-word decoding according to formula (4) and the confidence of the radical decoding according to formula (5), and go to step S42;
-log p_i    (4)
-(1/l_i) Σ_{j=1}^{l_i} log p_i^j    (5)
wherein in formula (4) p_i denotes the recognition probability of the i-th decoded character; in formula (5), l_i denotes the number of radicals the i-th character splits into, and p_i^j denotes the recognition probability of the j-th radical of the i-th decoded character;
step S42: compare the confidence of the whole-word decoding with the confidence of the radical decoding, and take the result with the smaller value as the final decoding result at moment t;
step S43: repeat steps S41-S42 for the decoding at each moment until the maximum decoding length is reached or an end symbol is encountered.
4. A hierarchical modeling system based on whole words and radicals is characterized by comprising the following modules:
the sequence feature module, configured to pass the text line image through a convolutional neural network and a recurrent neural network to obtain the sequence features of the text line image;
the whole-word context feature vector and decoding result obtaining module, configured to input the sequence features of the text line image into a whole-word decoding module with attention mechanism to obtain the context feature vector of the whole word and the decoding result of the whole word, wherein the module specifically performs:
step S21: input the sequence features of the text line image into the whole-word decoding module with attention mechanism given by formulas (1) to (3) below, to obtain the context feature vector c_t of the whole word;
e_ti = o(s_{t-1}, h_i)    (1)
α_ti = exp(e_ti) / Σ_{j=1}^{l} exp(e_tj)    (2)
c_t = Σ_{i=1}^{l} α_ti · h_i    (3)
wherein s_{t-1} is the hidden state at the previous moment, h_i is the i-th frame of the sequence features, and o denotes the dot-product operation; α_ti is the attention-mechanism weight and l is the number of feature vectors; c_t is the context feature vector of the whole word;
step S22: the output y_{t-1} at the previous moment and the context feature vector c_t are concatenated by the cascade layer and then passed through the classification layer to obtain the whole-word decoding result y_t at the current moment;
the radical decoding result obtaining module, configured to input the context feature vector of the whole word into the radical decoding module to obtain the decoding result of each radical at the whole-word level, wherein the module specifically performs:
step S31: the context feature vector c_t is input into the radical decoding module, whose output at moment t is r_t;
step S32: r_t passes through a classification layer to obtain the decoding result of the radicals of the whole word;
step S33: count the number of radicals each whole word in a batch splits into, and take the maximum count as the maximum radical decoding length for the batch;
the recognition result obtaining module, configured to calculate the confidence of the whole-word decoding result and the confidence of the radical decoding results respectively by using a confidence-score fusion strategy, and fuse them to obtain the final recognition result of the whole word.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110523430.1A CN113221885B (en) | 2021-05-13 | 2021-05-13 | Hierarchical modeling method and system based on whole words and radicals |
Publications (2)
Publication Number | Publication Date
---|---
CN113221885A | 2021-08-06
CN113221885B | 2022-09-06
Family
ID=77095683
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110523430.1A Active CN113221885B (en) | 2021-05-13 | 2021-05-13 | Hierarchical modeling method and system based on whole words and radicals |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113221885B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115187997B (en) * | 2022-07-13 | 2023-07-28 | 厦门理工学院 | Zero-sample Chinese character recognition method based on key component analysis |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107305630A (en) * | 2016-04-25 | 2017-10-31 | 腾讯科技(深圳)有限公司 | Text sequence recognition methods and device |
CN107797992A (en) * | 2017-11-10 | 2018-03-13 | 北京百分点信息科技有限公司 | Name entity recognition method and device |
CN109389091A (en) * | 2018-10-22 | 2019-02-26 | 重庆邮电大学 | The character identification system and method combined based on neural network and attention mechanism |
CN110097049A (en) * | 2019-04-03 | 2019-08-06 | 中国科学院计算技术研究所 | A kind of natural scene Method for text detection and system |
CN111126410A (en) * | 2019-12-31 | 2020-05-08 | 讯飞智元信息科技有限公司 | Character recognition method, device, equipment and readable storage medium |
CN111401268A (en) * | 2020-03-19 | 2020-07-10 | 内蒙古工业大学 | Multi-mode emotion recognition method and device for open environment |
CN111553349A (en) * | 2020-04-26 | 2020-08-18 | 佛山市南海区广工大数控装备协同创新研究院 | Scene text positioning and identifying method based on full convolution network |
Non-Patent Citations (2)
- Guo-lin Zhang et al., "Fused Confidence for Scene Text Detection via Intersection-over-Union," 2019 IEEE 19th International Conference on Communication Technology (ICCT), pp. 1540-1543.
- Guo Xuchao et al., "Named Entity Recognition of Crop Diseases and Pests Based on Radical Embedding and Attention Mechanism," Transactions of the Chinese Society for Agricultural Machinery, 2020, vol. 51, pp. 335-343.
Also Published As
Publication number | Publication date |
---|---|
CN113221885A (en) | 2021-08-06 |
Legal Events
Date | Code | Title
---|---|---
| PB01 | Publication
| SE01 | Entry into force of request for substantive examination
| GR01 | Patent grant