CN112163508A - Character recognition method and system based on real scene and OCR terminal - Google Patents

Character recognition method and system based on real scene and OCR terminal

Info

Publication number
CN112163508A
CN112163508A
Authority
CN
China
Prior art keywords
model
training
character
densenet
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011023019.XA
Other languages
Chinese (zh)
Inventor
张昊博
杨军
王滨
周娜
乔彩丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CETC 15 Research Institute
Original Assignee
CETC 15 Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by CETC 15 Research Institute
Priority to CN202011023019.XA
Publication of CN112163508A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/35 Categorising the entire scene, e.g. birthday party or wedding scene
    • G06V20/36 Indoor scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/28 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
    • G06V30/287 Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a character recognition method and system based on a real scene, and an OCR terminal. The method comprises the following steps: acquiring picture data from a real office scene; carrying out binarization processing on the picture data; training a CTPN model with the binarized picture data; detecting character regions with the trained CTPN model; training a DenseNet + CTC model on the detected character regions; and performing character recognition with the trained DenseNet + CTC model. By acquiring picture data from a real office scene, constructing and training effective character detection and character recognition models, and building an OCR tool terminal around the trained deep learning models, the method lets users define recognition areas themselves, improving working efficiency and, at the same time, the recognition accuracy of the models.

Description

Character recognition method and system based on real scene and OCR terminal
Technical Field
The invention relates to the field of character recognition, and in particular to a character recognition method and system based on a real scene, and to an OCR terminal.
Background
With the rapid development of computers and information technology, image recognition, which simulates the way the human eye and the neurons of the brain process visual signals into signals the brain can understand, has gradually expanded into many fields, playing an increasingly important role in areas such as biometric recognition, image-text recognition and object recognition. In general, image recognition technology uses a computer to process pictures captured at the front end of a system according to a set target. In the field of artificial intelligence, neural networks are the technique most widely applied to image recognition: the vector or raster encoding of an image is converted into a feature vector that represents the characteristics of the object. A neural network model first simplifies the image and extracts its most important information, then organizes the data through feature extraction and classification, and finally decides, by classification, prediction or other algorithms, which class an image belongs to or how best to describe it. Neural network models support several major applications such as face recognition, image detection, image retrieval, target tracking and style transfer. Among these, face recognition, image classification and character recognition have achieved excellent results through long-term iterative development.
Optical Character Recognition (OCR) gave rise to many conventional algorithms before the advent of neural networks. Character recognition mainly comprises two parts, text detection and text recognition, and its accuracy has improved as deep neural networks have become widely applied. OpenCV is a computer vision toolkit that provides interfaces on all major platforms; it is dedicated to real-time applications in the real world and offers a wide application range and strong execution capability. Tesseract-OCR is a recognition engine originally developed by the HP laboratory and now maintained by Google; its character libraries cover almost all of the world's mainstream languages to date. OCR technology has been applied to various scenes in the office field. Overall, OCR technology can solve the common existing tasks, but for some specific needs (e.g. recognizing certain fixed areas in a picture) its character recognition is not yet perfect.
Disclosure of Invention
Based on this, the invention aims to provide a character recognition method and system based on a real scene, and an OCR terminal, so as to improve the level of character detection and the accuracy of character recognition.
In order to achieve the purpose, the invention provides the following scheme:
a character recognition method based on real scenes comprises the following steps:
acquiring picture data in a real office scene;
carrying out binarization processing on the picture data;
training a CTPN model through the picture data after binarization processing;
carrying out character region detection by using the trained CTPN model;
training a DenseNet + CTC model through the detected character area;
and performing character recognition by using the trained DenseNet + CTC model.
Further, the CTPN model includes a CNN model and an LSTM model.
Further, training the DenseNet + CTC model on the detected character regions specifically includes:
training a DenseNet model through the detected character area;
extracting the characteristics of the character area through a trained DenseNet model;
training a CTC model through the characteristics of the character region; the trained CTC model is used for character recognition.
The invention also provides a character recognition system based on the real scene, which comprises the following components:
the image data acquisition module is used for acquiring image data in a real office scene;
the processing module is used for carrying out binarization processing on the picture data;
the first training module is used for training a CTPN model through the picture data after binarization processing;
the character region detection module is used for detecting character regions by utilizing the trained CTPN model;
the second training module is used for training a DenseNet + CTC model through the detected character area;
and the character recognition module is used for carrying out character recognition by utilizing the trained DenseNet + CTC model.
Further, the CTPN model includes a CNN model and an LSTM model.
Further, the second training module specifically includes:
a first training unit for training a DenseNet model through the detected text region;
the feature extraction unit is used for extracting the features of the character region through the trained DenseNet model;
the second training unit is used for training the CTC model through the characteristics of the character area; the trained CTC model is used for character recognition.
The invention also provides an OCR terminal applying the above character recognition method based on a real scene, comprising:
the picture uploading module is used for uploading picture data in a real office scene;
and the area selection module is used for carrying out area selection on the picture data in the real office scene.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
according to the method, the picture data in the real office scene is acquired, an effective character detection and character recognition model is constructed for training, the trained deep learning model is used as a tool, an OCR tool terminal is built, a user can define a recognition area by himself, the working efficiency is improved, and meanwhile the recognition accuracy of the model is improved.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings can be obtained from them by those skilled in the art without inventive effort.
FIG. 1 is a flow chart of a text recognition method based on a real scene according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the CTPN model according to the embodiment of the present invention;
FIG. 3 is a schematic diagram of the operation of the bidirectional recurrent neural network (BiLSTM) according to the embodiment of the present invention;
FIG. 4 is a diagram of the DenseNet model structure according to an embodiment of the present invention;
FIG. 5 is a block diagram of a real scene-based text recognition system according to an embodiment of the present invention;
FIG. 6 is a flowchart of the OCR terminal.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a character recognition method and system based on a real scene, and an OCR terminal, to improve the level of character detection and the accuracy of character recognition.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1, a method for recognizing a character based on a real scene includes:
step 101: and acquiring picture data in a real office scene.
Picture data generated in the office field, such as project publicity pictures, notification announcements and form bills, is acquired. In addition, because the method targets character recognition in real scenes, existing public data sets covering product descriptions, real street-view photographs and online advertisements are also used for model training.
Step 102: carrying out binarization processing on the picture data.
Area binarization is performed on the picture using the OpenCV toolkit. Binarizing an image means setting the gray value of every pixel to 0 or 255, giving the whole image an obvious black-and-white appearance. In digital image processing, lighting and other problems make the brightness of an image uneven; binarization greatly reduces the amount of data in the image, so that the contour of the target can be highlighted. Specifically, in the basic scheme a fixed threshold is set in advance and each pixel value of the picture is compared against it: pixels above the threshold are set to 255 and pixels below it are set to 0. Such a one-size-fits-all operation, however, inevitably produces errors on complex pictures. To improve on it, local binarization is used, which determines the binarization threshold at each pixel location from the distribution of pixel values in a neighborhood block around that pixel. The benefit is that the threshold at each pixel location is not fixed but determined by the distribution of its surrounding neighborhood pixels: image areas with higher brightness generally receive a higher binarization threshold, while darker areas receive a correspondingly lower one, so local regions of different brightness, contrast and texture each get an appropriate local threshold.
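As an illustration of the local binarization step, the following sketch uses OpenCV's threshold and adaptiveThreshold functions; the neighborhood block size and offset constant are illustrative assumptions, not values fixed by the method:

```python
import cv2

# Load the office-scene picture and convert it to single-channel
# grayscale, since thresholding compares pixel intensities.
img = cv2.imread("office_scene.png")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Global ("one-cut") binarization: one fixed threshold for the whole
# image, which produces errors under uneven lighting.
_, global_bin = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)

# Local binarization: the threshold at each pixel is derived from the
# mean of its neighborhood block, so brighter regions get higher
# thresholds and darker regions lower ones.
local_bin = cv2.adaptiveThreshold(
    gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY,
    blockSize=25, C=10)  # block size and offset chosen for illustration

cv2.imwrite("office_scene_bin.png", local_bin)
```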
Step 103: training a CTPN model with the binarized picture data. The CTPN model comprises a CNN model and an LSTM model.
The binarized images are fed into the deep learning model for training. CTPN combines CNN and LSTM deep networks and can effectively detect horizontally distributed text in complex scenes. It divides each text line into slices and presets several anchors of different scales to locate the position of the characters; the bidirectional LSTM layer it adopts, with its temporal characteristics, improves detection accuracy.
As shown in fig. 2, in the complete CTPN model the VGG16 network is first used to process the input picture. The VGG16 network is essentially a convolutional neural network consisting of 13 convolutional layers, 5 max pooling layers and 3 fully-connected layers. The picture content can be regarded as matrix data made up of three channels of pixel values. In the convolutional layers, two-dimensional convolution kernels w ∈ R^(3×3) extract features from the picture P to obtain the feature matrix C_n of the picture:

C_n = f( Σ_{i=1}^{m} w_i ⊗ C_{n-1} + b )

where n denotes the number of convolution operations, m the number of convolution kernels, i the index of the acquired feature matrix, f the nonlinear activation function, ⊗ the operation of the shared convolution-kernel weights on the corresponding feature matrix, w the weights of the convolution kernels, and b the bias value.
At the pooling layers, max pooling is used; the feature-extraction expression is:

p_u = Max_{2×2}[C_n]

where u denotes the number of pooling operations and Max_{2×2} denotes max pooling over 2 × 2 windows.
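In code, the two expressions above amount to a 3 × 3 convolution followed by the activation f and a 2 × 2 max pooling; a minimal PyTorch sketch (the channel counts are illustrative, not taken from the patent):

```python
import torch
import torch.nn as nn

# f( sum_i w_i ⊗ C_{n-1} + b ): m = 64 kernels of size 3x3 applied to a
# 3-channel input, followed by the nonlinear activation f (here ReLU).
conv = nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding=1)
act = nn.ReLU()

# p_u = Max_{2x2}[C_n]: max pooling over non-overlapping 2x2 windows.
pool = nn.MaxPool2d(kernel_size=2)

picture = torch.randn(1, 3, 32, 256)   # batch, channels, height, width
features = pool(act(conv(picture)))    # -> shape (1, 64, 16, 128)
print(features.shape)
```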
As shown in fig. 3, after several convolution and pooling operations, the Reshape-processed data stream is input into a bidirectional LSTM model to obtain feature vectors with a temporal attribute, expressed by the formulas:

s_t = f(U x_t + W s_{t-1})
s_t' = f(U' x_t + W' s_{t+1}')
o_t = g(V s_t + V' s_t')

where s_t denotes the output of the forward pass at time t and s_t' the output of the backward pass at time t; U x_t denotes the input of the forward pass and U' x_t the input of the backward pass; W s_{t-1} carries the forward state of the previous time step and W' s_{t+1}' carries the backward state of the next time step; and o_t denotes the output at time t. After feature extraction by this temporal model, the feature vectors, now carrying both spatial and sequential information, are input into the RPN network and passed through two branches. One branch covers text content of different heights across the whole image with a group of 10 anchors of fixed width 16 and varying heights, and classifies with a softmax activation into positive and negative to judge whether each anchor contains text; the other performs the bounding-box regression task on the anchors, computing offsets to obtain accurate proposals. Applying a loss function to the results of the two branches filters and position-shifts the anchors to determine the final proposals.
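The detection branch just described can be condensed into the following skeleton; this is a sketch under stated assumptions (a torchvision VGG16 backbone, a BiLSTM run along the width of each feature row, k = 10 anchors per position), not the complete CTPN implementation:

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class CTPNSketch(nn.Module):
    """Simplified CTPN: VGG16 features -> BiLSTM -> per-anchor outputs."""
    def __init__(self, k=10, hidden=128):
        super().__init__()
        # VGG16 convolutional layers up to conv5 (512 output channels).
        self.backbone = vgg16(weights=None).features[:-1]
        self.rnn = nn.LSTM(512, hidden, bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * hidden, 512)
        self.cls = nn.Linear(512, 2 * k)  # text / non-text score per anchor
        self.reg = nn.Linear(512, 2 * k)  # vertical center + height offsets

    def forward(self, x):
        feat = self.backbone(x)                      # (B, 512, H, W)
        b, c, h, w = feat.shape
        # Each feature-map row becomes a sequence along the width, so the
        # BiLSTM captures horizontal context between text slices.
        seq = feat.permute(0, 2, 3, 1).reshape(b * h, w, c)
        seq, _ = self.rnn(seq)
        seq = torch.relu(self.fc(seq))
        return self.cls(seq), self.reg(seq)          # softmax / regression heads

scores, offsets = CTPNSketch()(torch.randn(1, 3, 256, 512))
```

In training, a softmax cross-entropy loss on the classification head and a regression loss on the offset head would play the role of the loss function mentioned above.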
Step 104: detecting the character regions using the trained CTPN model.
Step 105: training the DenseNet + CTC model on the detected character regions. This specifically includes: training a DenseNet model on the detected character regions; extracting the features of the character regions with the trained DenseNet model; and training a CTC model on the features of the character regions. The trained CTC model is used for character recognition.
After the character positions in the images have been accurately located, the character images selected by the localization are sent, in the form of feature matrices, to the DenseNet + CTC model to train it to recognize the character content.
As shown in fig. 4, DenseNet is a convolutional deep neural network model that builds on the residual network and is composed of dense blocks and transition layers. The dense blocks define the connection relations between inputs and outputs, while the transition layers control the number of channels. DenseNet alleviates the discontinuous flow of information between different layers in earlier models by directly connecting all layers, on the premise of maximizing information transmission between the layers of the network. Expressed as a formula:
x_l = H_l([x_0, x_1, ..., x_{l-1}])
where x_l denotes the output of each layer and H_l denotes the combination of the ReLU activation function, a 3 × 3 convolution kernel and the regularized optimization applied at each layer. After multiple rounds of convolution operations, a reshape operation turns the two-dimensional feature matrix into the latent feature vectors of the image, which are fed into a CTC model. Given an input sequence, this model treats the network output as a probability distribution over all possible label sequences, solving the training problem for converting unsegmented sequence data:
LER(h, S) = (1/|S|) Σ_{(x,z)∈S} ED(h(x), z) / |z|
where S denotes a sample set whose individual samples are pairs (x, z). x denotes the original, unconverted sequence of the sample; it is a sequence of m-dimensional vectors, and the set X to which it belongs is called the input space. z denotes the converted sequence of the sample; it belongs to the target space L*, the set of sequences composed of a finite alphabet L, and the length of z must not exceed the length of x. CTC trains a mapping h from X to L*, and ED(h(x), z) denotes the edit distance between the predicted and target sequences; the smaller the label error rate (LER), the more accurate the mapping. The method builds on the DenseNet model and trains the network with the CTC loss, thereby achieving character recognition.
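A minimal sketch of the recognition stage, assuming torchvision's densenet121 as the stack of dense blocks and transition layers and PyTorch's built-in nn.CTCLoss; the alphabet size, input resolution and dummy targets are illustrative only:

```python
import torch
import torch.nn as nn
from torchvision.models import densenet121

NUM_CLASSES = 5000  # illustrative character-set size; index 0 = CTC blank

backbone = densenet121(weights=None).features  # dense blocks + transitions
proj = nn.Linear(1024, NUM_CLASSES)
ctc = nn.CTCLoss(blank=0, zero_infinity=True)

imgs = torch.randn(2, 3, 32, 256)       # cropped text-line images
feat = backbone(imgs)                   # (B, 1024, 1, 8)
feat = feat.mean(dim=2)                 # collapse height -> (B, 1024, T)
seq = feat.permute(2, 0, 1)             # (T, B, 1024), time-major for CTC

# Network output as a probability distribution over label sequences.
log_probs = proj(seq).log_softmax(dim=2)

targets = torch.randint(1, NUM_CLASSES, (2, 5))   # dummy label sequences z
input_lens = torch.full((2,), log_probs.size(0), dtype=torch.long)
target_lens = torch.full((2,), 5, dtype=torch.long)
loss = ctc(log_probs, targets, input_lens, target_lens)
loss.backward()  # trains the DenseNet features through the CTC loss
```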
Step 106: performing character recognition using the trained DenseNet + CTC model.
As shown in fig. 5, the present invention further provides a real scene-based text recognition system, which includes:
a picture data obtaining module 501, configured to obtain picture data in a real office scene.
A processing module 502, configured to perform binarization processing on the picture data.
The first training module 503 is configured to train a CTPN model through the binarized picture data. The CTPN model comprises a CNN model and an LSTM model.
And a text region detection module 504, configured to perform text region detection using the trained CTPN model.
And a second training module 505, configured to train the DenseNet + CTC model through the detected text region.
The second training module 505 specifically includes:
a first training unit for training a DenseNet model through the detected text region;
the feature extraction unit is used for extracting the features of the character region through the trained DenseNet model;
the second training unit is used for training the CTC model through the characteristics of the character area; the trained CTC model is used for character recognition.
And a character recognition module 506, configured to perform character recognition by using the trained DenseNet + CTC model.
The invention also provides an OCR terminal applying the character recognition method based on a real scene described above, comprising:
the picture uploading module is used for uploading picture data in a real office scene;
and the area selection module is used for carrying out area selection on the picture data in the real office scene.
The workflow of the OCR terminal is shown in fig. 6. Based on the character recognition method above and the actual needs of users, the terminal provides a character recognition tool with a customizable recognition area. The tool lets users upload pictures and offers a region-selection function built on a canvas plug-in, with which a user can select several areas of the picture for character recognition. This interactive design makes it easy for users to pinpoint exactly the characters to be recognized and also helps the model locate the character content accurately, improving both accuracy and working efficiency.
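On the server side, the selected regions only need to be cropped out of the uploaded picture before recognition; a sketch, assuming the canvas plug-in delivers the boxes as (x, y, width, height) pixel tuples and that recognize() is a hypothetical wrapper around the trained CTPN + DenseNet + CTC pipeline described above:

```python
import cv2

def recognize_regions(image_path, boxes, recognize):
    """Crop each user-selected box and run the OCR pipeline on it.

    boxes     -- list of (x, y, w, h) tuples from the canvas selection
    recognize -- callable wrapping the trained detection/recognition models
    """
    img = cv2.imread(image_path)
    results = []
    for (x, y, w, h) in boxes:
        crop = img[y:y + h, x:x + w]  # restrict recognition to this region
        results.append(recognize(crop))
    return results
```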
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. For the system disclosed by the embodiment, the description is relatively simple because the system corresponds to the method disclosed by the embodiment, and the relevant points can be referred to the method part for description.
The principles and embodiments of the present invention have been described herein using specific examples, which are provided only to help understand the method and the core concept of the present invention; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (7)

1. A character recognition method based on real scenes is characterized by comprising the following steps:
acquiring picture data in a real office scene;
carrying out binarization processing on the picture data;
training a CTPN model through the picture data after binarization processing;
carrying out character region detection by using the trained CTPN model;
training a DenseNet + CTC model through the detected character area;
and performing character recognition by using the trained DenseNet + CTC model.
2. The method of real scene-based word recognition according to claim 1, wherein the CTPN model comprises a CNN model and an LSTM model.
3. The method for recognizing characters based on real scenes as claimed in claim 1, wherein said training of DenseNet + CTC model through the detected character region specifically comprises:
training a DenseNet model through the detected character area;
extracting the characteristics of the character area through a trained DenseNet model;
training a CTC model through the characteristics of the character region; the trained CTC model is used for character recognition.
4. A real scene based word recognition system, comprising:
the image data acquisition module is used for acquiring image data in a real office scene;
the processing module is used for carrying out binarization processing on the picture data;
the first training module is used for training a CTPN model through the picture data after binarization processing;
the character region detection module is used for detecting character regions by utilizing the trained CTPN model;
the second training module is used for training a DenseNet + CTC model through the detected character area;
and the character recognition module is used for carrying out character recognition by utilizing the trained DenseNet + CTC model.
5. The real scene-based word recognition system of claim 4, wherein the CTPN model comprises a CNN model and an LSTM model.
6. The real scene-based word recognition system of claim 4, wherein the second training module specifically comprises:
a first training unit for training a DenseNet model through the detected text region;
the feature extraction unit is used for extracting the features of the character region through the trained DenseNet model;
the second training unit is used for training the CTC model through the characteristics of the character area; the trained CTC model is used for character recognition.
7. An OCR terminal applying the real scene-based character recognition method according to any one of claims 1 to 3, comprising:
the picture uploading module is used for uploading picture data in a real office scene;
and the area selection module is used for carrying out area selection on the picture data in the real office scene.
CN202011023019.XA 2020-09-25 2020-09-25 Character recognition method and system based on real scene and OCR terminal Pending CN112163508A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011023019.XA CN112163508A (en) 2020-09-25 2020-09-25 Character recognition method and system based on real scene and OCR terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011023019.XA CN112163508A (en) 2020-09-25 2020-09-25 Character recognition method and system based on real scene and OCR terminal

Publications (1)

Publication Number Publication Date
CN112163508A 2021-01-01

Family

ID=73863850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011023019.XA Pending CN112163508A (en) 2020-09-25 2020-09-25 Character recognition method and system based on real scene and OCR terminal

Country Status (1)

Country Link
CN (1) CN112163508A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108710866A (en) * 2018-06-04 2018-10-26 平安科技(深圳)有限公司 Chinese mold training method, Chinese characters recognition method, device, equipment and medium
CN110110585A (en) * 2019-03-15 2019-08-09 西安电子科技大学 Intelligently reading realization method and system based on deep learning, computer program
CN110889402A (en) * 2019-11-04 2020-03-17 广州丰石科技有限公司 Business license content identification method and system based on deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
故沉: "[Learning and Understanding]: The CTPN Algorithm", https://blog.csdn.net/jesmine_gu/article/details/88524433 *
静悟生慧: "Text Detection: CTPN", https://www.cnblogs.com/allen-rg/p/9700095.html *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560866A (en) * 2021-02-25 2021-03-26 江苏东大集成电路系统工程技术有限公司 OCR recognition method based on background suppression
WO2023083280A1 (en) * 2021-11-12 2023-05-19 虹软科技股份有限公司 Scene text recognition method and device
CN114049641A (en) * 2022-01-13 2022-02-15 中国电子科技集团公司第十五研究所 Character recognition method and system based on deep learning
CN114049641B (en) * 2022-01-13 2022-03-15 中国电子科技集团公司第十五研究所 Character recognition method and system based on deep learning
GB2626249A (en) * 2023-01-11 2024-07-17 Skyworks Solutions Inc A circuit board processing system using local threshold value image analysis
CN116128717A (en) * 2023-04-17 2023-05-16 四川观想科技股份有限公司 Image style migration method based on neural network
CN116128717B (en) * 2023-04-17 2023-06-23 四川观想科技股份有限公司 Image style migration method based on neural network

Similar Documents

Publication Publication Date Title
CN110598610B (en) Target significance detection method based on neural selection attention
CN109241982B (en) Target detection method based on deep and shallow layer convolutional neural network
CN112163508A (en) Character recognition method and system based on real scene and OCR terminal
CN107256246B (en) printed fabric image retrieval method based on convolutional neural network
CN111401384B (en) Transformer equipment defect image matching method
CN110516536B (en) Weak supervision video behavior detection method based on time sequence class activation graph complementation
US20210118144A1 (en) Image processing method, electronic device, and storage medium
CN108062525B (en) Deep learning hand detection method based on hand region prediction
CN109800817B (en) Image classification method based on fusion semantic neural network
CN111626993A (en) Image automatic detection counting method and system based on embedded FEFnet network
CN111582095B (en) Light-weight rapid detection method for abnormal behaviors of pedestrians
CN110796018A (en) Hand motion recognition method based on depth image and color image
CN110969171A (en) Image classification model, method and application based on improved convolutional neural network
CN110956059B (en) Dynamic gesture recognition method and device and electronic equipment
CN112037239B (en) Text guidance image segmentation method based on multi-level explicit relation selection
CN110852199A (en) Foreground extraction method based on double-frame coding and decoding model
Sun et al. IRDCLNet: Instance segmentation of ship images based on interference reduction and dynamic contour learning in foggy scenes
CN114926826A (en) Scene text detection system
CN114120359A (en) Method for measuring body size of group-fed pigs based on stacked hourglass network
Zhao et al. Ocean ship detection and recognition algorithm based on aerial image
CN116912673A (en) Target detection method based on underwater optical image
Hu et al. Two-stage insulator self-explosion defect detection method based on Mask R-CNN
CN110633666A (en) Gesture track recognition method based on finger color patches
CN116469172A (en) Bone behavior recognition video frame extraction method and system under multiple time scales
CN110796650A (en) Image quality evaluation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination