CN110674811A - Image recognition method and device - Google Patents

Image recognition method and device

Info

Publication number
CN110674811A
CN110674811A CN201910831740.2A
Authority
CN
China
Prior art keywords
recognized
image
text
character
size
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910831740.2A
Other languages
Chinese (zh)
Other versions
CN110674811B (en)
Inventor
刘学文 (Liu Xuewen)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Inspur Smart Computing Technology Co Ltd
Original Assignee
Guangdong Inspur Big Data Research Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Inspur Big Data Research Co Ltd filed Critical Guangdong Inspur Big Data Research Co Ltd
Priority to CN201910831740.2A
Publication of CN110674811A
Application granted
Publication of CN110674811B
Active legal status
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/14Image acquisition
    • G06V30/148Segmentation of character regions
    • G06V30/158Segmentation of character regions using character size, text spacings or pitch estimation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Character Discrimination (AREA)
  • Character Input (AREA)

Abstract

The invention provides an image recognition method and device, wherein the method comprises the following steps: acquiring an image to be recognized, on which a line of text to be recognized is displayed; processing the image to be recognized with a target detection algorithm model to obtain the position information of each character forming the text to be recognized; segmenting the image to be recognized according to the position information of each character to obtain a plurality of sub-images, each displaying one character; processing each sub-image with a character recognition convolutional neural network model to recognize the character in each sub-image; and arranging the recognized characters according to their position information to obtain the recognition result of the image to be recognized. The method improves the accuracy of segmenting the characters in the image, and thereby the accuracy of character recognition.

Description

Image recognition method and device
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and an apparatus for image recognition.
Background
Character recognition in images is in wide demand across many fields; related applications include ID card recognition, license plate number recognition, express waybill recognition, and bank card number recognition. Recognizing characters in an image usually requires segmenting each character out of the image first. The traditional method splits the image horizontally into lines, vertically projects each line to find the left and right boundary of every character, cuts out the single characters, and then recognizes the segmented character images with a designed model.
However, because Chinese characters may be tightly spaced and many of them have a left-right radical structure, vertical segmentation after vertical projection easily over-segments the characters. As a result, the accuracy of the subsequent character recognition with the designed model is not high.
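The traditional projection-based segmentation described above can be sketched as follows. This is an illustrative reconstruction, not code from the patent; the function name and the binary-image convention are assumptions. A left-right structured character would produce two separate spans here, which is exactly the over-segmentation problem the invention addresses:

```python
import numpy as np

def vertical_projection_segments(line_img):
    """Find (start, end) column spans of ink in a binarized line image.

    line_img: 2D numpy array where nonzero pixels are ink. Each maximal run
    of nonzero columns is treated as one character, so a character with a
    left-right radical structure may wrongly yield two spans.
    """
    profile = (line_img != 0).sum(axis=0)  # ink pixels per column
    spans, start = [], None
    for x, count in enumerate(profile):
        if count > 0 and start is None:
            start = x                      # entering an ink run
        elif count == 0 and start is not None:
            spans.append((start, x))       # leaving an ink run
            start = None
    if start is not None:                  # run extends to the right edge
        spans.append((start, line_img.shape[1]))
    return spans
```
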
Disclosure of Invention
In view of this, embodiments of the present invention provide an image recognition method and apparatus, which are used to improve the accuracy of segmenting characters in a graph, so as to improve the accuracy of character recognition.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
a method of image recognition, comprising:
acquiring an image to be recognized; wherein a line of text to be recognized is displayed on the image to be recognized;
processing the image to be recognized by using a target detection algorithm model to obtain the position information of each character forming the text to be recognized;
segmenting the image to be recognized according to the position information of each character of the text to be recognized to obtain a plurality of sub-images; wherein each sub-image displays one character;
processing each sub-image by using a character recognition-convolution neural network model to recognize characters in each sub-image;
and arranging each character obtained by identification according to the position information of the character to obtain the identification result of the image to be identified.
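Taken together, the five steps above can be sketched as one pipeline. This is a hypothetical illustration: `detect_boxes` and `recognize_char` are stand-ins for the target detection model and the recognition network, and the box format follows the (xmin, ymin, xmax, ymax) convention used later in the description:

```python
def recognize_image(image, detect_boxes, recognize_char):
    """Sketch of the claimed pipeline; the two callables are assumptions
    standing in for the trained detector and recognition CNN."""
    # Steps 1-2: detect one (xmin, ymin, xmax, ymax) box per character.
    boxes = detect_boxes(image)
    # Step 3: segment the image into one sub-image per box.
    subs = [(box, image[box[1]:box[3], box[0]:box[2]]) for box in boxes]
    # Step 4: recognize the character in each sub-image.
    chars = [(box, recognize_char(sub)) for box, sub in subs]
    # Step 5: arrange characters by position (left to right for one line).
    chars.sort(key=lambda bc: bc[0][0])
    return "".join(c for _, c in chars)
```
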
Optionally, before processing the image to be recognized by using the target detection algorithm model to obtain the position information of each character forming the text to be recognized, the method further includes:
judging whether the text to be recognized displayed by the image to be recognized is a plurality of lines;
and if the text to be recognized displayed by the image to be recognized is judged to be a plurality of lines, finding the upper limit and the lower limit of each line, and horizontally cutting to obtain a plurality of sub-texts to be recognized.
Optionally, processing the image to be recognized by using the target detection algorithm model to obtain the position information of each character forming the text to be recognized includes:
judging whether the size of the text to be recognized accords with a preset size or not;
if the size of the text to be recognized is judged to be not in accordance with the preset size, changing the size of the text to be recognized into the preset size;
recording the position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized after the text to be recognized is changed to the preset size; wherein, (xmin, ymin) and (xmax, ymax) are the coordinates of the upper left corner and the lower right corner of the Chinese character, respectively.
Optionally, after recording the position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized once it has been changed to the preset size, the method further includes:
identifying the size of each Chinese character in the text to be identified by using a preset anchor, and confirming the size of each Chinese character in the text to be identified; wherein the preset anchors are of sizes (10,10), (20,20), (30,30), (40,40), (50,50) and (60, 60).
Optionally, before processing each sub-image by using the character recognition convolutional neural network model to recognize the characters in each sub-image, the method further includes:
and adjusting the size of each sub-image according to the size of a preset single character.
An apparatus for image recognition, comprising:
the device comprises an acquisition unit, a recognition unit and a processing unit, wherein the acquisition unit is used for acquiring an image to be recognized; the image to be recognized is displayed with a line of text to be recognized;
the first processing unit is used for processing the image to be recognized by utilizing a target detection algorithm model to obtain the position information of each character forming the text to be recognized;
the segmentation unit is used for segmenting the image to be recognized according to the position information of each character of the text to be recognized to obtain a plurality of sub-images; wherein each sub-image displays one character;
the second processing unit is used for processing each sub-image by utilizing a character recognition-convolution neural network model to recognize characters in each sub-image;
and the arranging unit is used for arranging each character obtained by identification according to the position information of the character to obtain the identification result of the image to be identified.
Optionally, the image recognition apparatus further includes:
the first judging unit is used for judging whether the text to be recognized displayed by the image to be recognized is a plurality of lines;
and the cutting unit is used for finding the upper limit and the lower limit of each line and horizontally cutting to obtain a plurality of sub texts to be identified if the first judging unit judges that the texts to be identified displayed by the images to be identified are in a plurality of lines.
Optionally, the first processing unit includes:
the second judging unit is used for judging whether the size of the text to be recognized accords with the preset size or not;
a changing unit, configured to change the size of the text to be recognized to the preset size if the second determining unit determines that the size of the text to be recognized does not conform to the preset size;
a recording unit, configured to record a position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized after the text to be recognized is changed to the preset size; wherein, (xmin, ymin) and (xmax, ymax) are the coordinates of the upper left corner and the lower right corner of the Chinese character, respectively.
Optionally, the image recognition apparatus further includes:
the identification unit is used for identifying the size of each Chinese character in the text to be identified by using a preset anchor and confirming the size of each Chinese character in the text to be identified; wherein the preset anchors are of sizes (10,10), (20,20), (30,30), (40,40), (50,50) and (60, 60).
Optionally, the image recognition apparatus further includes:
and the adjusting unit is used for adjusting the size of each sub-image according to the size of a preset single character.
According to the above scheme, in the image recognition method and device provided by the invention, an image to be recognized is acquired, on which a line of text to be recognized is displayed; the image is processed with a target detection algorithm model to obtain the position information of each character forming the text to be recognized; the image is segmented according to that position information into a plurality of sub-images, each displaying one character; each sub-image is processed with a character recognition convolutional neural network model to recognize the character in it; and the recognized characters are arranged according to their position information to obtain the recognition result of the image to be recognized. This improves the accuracy of segmenting the characters in the image and thereby the accuracy of character recognition.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
Fig. 1 is a detailed flowchart of a method for image recognition according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating an image recognition method according to another embodiment of the present invention;
FIG. 3 is a schematic diagram of a horizontal projection method according to another embodiment of the present invention;
fig. 4 is a schematic diagram of an image recognition apparatus according to another embodiment of the present invention;
fig. 5 is a schematic diagram of an image recognition apparatus according to another embodiment of the present invention;
fig. 6 is a schematic diagram of an image recognition apparatus according to another embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the present application provides an image recognition method, as shown in fig. 1, the method includes the following steps:
and S101, acquiring an image to be identified.
A line of text to be recognized is displayed on the image to be recognized.
It should be noted that the text to be recognized may include digits, English letters, Chinese characters, and the like. Typical applications are ID card recognition, license plate number recognition, express waybill recognition, bank card number recognition, and so on.
S102, processing the image to be recognized by using the target detection algorithm model to obtain the position information of each character forming the text to be recognized.
The target detection algorithm model may be the currently common YOLOv3 model, which is a neural network model.
It should be noted that a neural network model is usually not optimal when first constructed, so the prediction results it produces are not optimal either. The model must be trained on a large number of training samples: samples with known ground-truth results are fed into the network to obtain prediction results, and the parameters of the model are then continuously optimized according to the difference between the predictions and the ground truth, so that the network's output agrees with the actual result as closely as possible. The optimized neural network model can then be used directly to process the image to be recognized and obtain the position information of each character forming the text to be recognized.
Optionally, in another embodiment of the present invention, before step S102, the method further includes:
and judging whether the text to be recognized displayed by the image to be recognized is a plurality of lines.
It should be noted that the YOLOv3 model here detects only a single line of text; therefore, when the text to be recognized spans multiple lines, the multi-line text must first be divided into single lines.
It should be noted that dividing multi-line text into single lines generally uses the horizontal projection method. As shown in fig. 3, horizontal projection sums the pixels of each row of the text image and plots the row statistics, from which the starting and ending row of each line can be determined.
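A minimal sketch of this horizontal projection step, assuming a binarized numpy image where nonzero pixels are ink (the function name is illustrative, not from the patent):

```python
import numpy as np

def split_lines(text_img):
    """Horizontal projection: count ink pixels in each row; a text line is a
    maximal run of rows containing ink. Returns the (top, bottom) bounds of
    each line, along which the image can then be cut horizontally."""
    profile = (text_img != 0).sum(axis=1)  # ink pixels per row
    bounds, top = [], None
    for y, count in enumerate(profile):
        if count > 0 and top is None:
            top = y                        # line starts
        elif count == 0 and top is not None:
            bounds.append((top, y))        # line ends
            top = None
    if top is not None:                    # line extends to the bottom edge
        bounds.append((top, text_img.shape[0]))
    return bounds
```
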
Specifically, if the text to be recognized displayed by the image to be recognized is judged to be multiple lines, the upper limit and the lower limit of each line are found, and horizontal cutting is performed to obtain multiple sub-texts to be recognized.
Optionally, in another embodiment of the present invention, an implementation manner of step S102, as shown in fig. 2, includes:
s201, judging whether the size of the text to be recognized accords with a preset size.
Because the YOLOv3 model requires the width and height of an input image to be no less than 320, during model construction 3500 commonly used Chinese characters are used to generate the Chinese-character image samples; a single-line character image sample is usually set to a width of 320 and a height of 320, with at most 20 characters per line. During training of the YOLOv3 model, the training set generally contains 30000 images and the test set 10000 images, although the sizes of the training and test sets are not limited here.
Specifically, if the step S201 determines that the size of the text to be recognized matches the preset size, the step S203 is directly executed; if the step S201 determines that the size of the text to be recognized does not conform to the preset size, the step S202 is executed and then the step S203 is executed.
S202, changing the size of the text to be recognized into a preset size.
Specifically, if the size of the text to be recognized is smaller than the preset size, a blank region is added below the text and the image sample is then adjusted to the preset size; if it is larger than the preset size, the portion that contains no characters is cut away.
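This size normalization can be sketched as follows, assuming grayscale numpy arrays. Note two assumptions beyond the text: the sketch pads blank space to the right as well as below, and the crop branch assumes the discarded region contains no characters:

```python
import numpy as np

def to_preset_size(img, preset=(320, 320)):
    """Pad an undersized image with blank (zero) pixels below and to the
    right, or crop an oversized one, so the result matches the preset size."""
    H, W = preset
    out = np.zeros((H, W), dtype=img.dtype)   # blank canvas at preset size
    h = min(img.shape[0], H)
    w = min(img.shape[1], W)
    out[:h, :w] = img[:h, :w]                 # copy (and crop if needed)
    return out
```
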
And S203, recording the position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized after the text to be recognized is changed to the preset size.
Wherein, (xmin, ymin) and (xmax, ymax) are the coordinates of the upper left corner and the lower right corner of the Chinese character, respectively.
Optionally, in another embodiment of the present invention, after step S203, the method further includes:
and identifying the size of each Chinese character in the text to be identified by using a preset anchor, and confirming the size of each Chinese character in the text to be identified.
Wherein the preset anchor may have a size of (10,10), (20,20), (30,30), (40,40), (50,50) and (60, 60).
It should be noted that these anchor sizes correspond to common font sizes. In practical applications the aspect ratio of Chinese characters may vary, so other sizes such as (15,15) or (28,28) may be used, and anchors with unequal width and height, such as (15,20) or (18,14), may also be set.
It should be further noted that when the preset anchors are used to determine the size of each Chinese character in the text to be recognized, the number of object categories an anchor may detect must be set to 1; this prevents one anchor from capturing two Chinese characters, which would make the subsequent segmentation inaccurate. For example, suppose the characters in the text to be recognized are 15×15 in size and include 木 ("wood"), 羊 ("sheep"), and 旦 ("dawn"). If the (60,60) anchor is not limited to one detected object, it can easily capture several characters at once: capturing 木 and 羊 together would very likely cause the later segmentation step to treat them as the single character 样 ("sample"); likewise, capturing 木 and 旦 together would likely yield the single character 查 ("search"). Either case would seriously harm the accuracy of the subsequent character recognition.
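The detector settings discussed above can be collected into one configuration block. The structure and key names are hypothetical; the values come from the description:

```python
# Illustrative configuration for the character detector; the dict layout is
# an assumption, but the values are taken from the patent description.
DETECTOR_CONFIG = {
    "min_input_size": (320, 320),  # YOLOv3 width/height floor per the text
    "anchors": [(10, 10), (20, 20), (30, 30), (40, 40), (50, 50), (60, 60)],
    "num_classes": 1,              # one class ("character"): keeps a large
                                   # anchor, e.g. (60, 60), from merging two
                                   # small characters into one detection
    "max_chars_per_line": 20,      # single-line sample limit per the text
}
```
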
S103, segmenting the image to be recognized according to the position information of each character of the text to be recognized to obtain a plurality of sub-images.
Wherein each sub-image displays one character.
It should be noted that, when the image to be recognized is segmented according to the position information of each character, it is necessary to record and ensure that a plurality of sub-images obtained after segmentation can be recombined back to the image to be recognized.
And S104, processing each sub-image by using a character recognition-convolution neural network model to recognize characters in each sub-image.
The character recognition-convolution neural network model is obtained by utilizing the training and construction of a convolution neural network.
In a specific implementation process of this embodiment, a training process of the character recognition-convolutional neural network model may be to establish an initial neural network according to preset initial parameters, and determine the initial neural network as a current neural network; recognizing characters in the sample image by using a current neural network; the sample image is each image in a preset sample image set; comparing the characters identified by the current neural network with the characters of the pre-labeled sample image to obtain a comparison result; judging whether the identification precision of the current neural network meets the precision requirement or not according to the comparison result; if the identification precision of the current neural network does not meet the precision requirement, updating the parameters in the current neural network to obtain an updated neural network; taking the updated neural network as a current neural network, and returning to execute the recognition of characters in the sample image by using the current neural network; and if the identification precision of the current neural network meets the precision requirement, determining the current neural network as a character identification-convolution neural network model.
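The iterative train-evaluate-update scheme described above can be sketched generically. Here `update` and `evaluate` stand in for a parameter-update step (e.g. backpropagation) and recognition accuracy on the labeled sample set, and `max_rounds` is an added safeguard not in the original description:

```python
def train_until_accurate(init_params, update, evaluate, target_acc, max_rounds=100):
    """Evaluate the current network; if its recognition accuracy does not
    meet the requirement, update the parameters and try again, returning
    the parameters once the accuracy requirement is met."""
    params = init_params
    for _ in range(max_rounds):
        if evaluate(params) >= target_acc:   # precision requirement met
            return params
        params = update(params)              # otherwise refine and retry
    return params                            # best effort after max_rounds
```
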
Optionally, in another embodiment of the present invention, before step S104, the method further includes:
and adjusting the size of each sub-image according to the preset size of the single character.
It should be noted that the character recognition convolutional neural network model generally adopts ResNet-50, and the model here only recognizes 32×32 images; therefore the sub-images must be resized before being input into the character recognition convolutional neural network model, so that they can be recognized by it.
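A self-contained sketch of this resize step using nearest-neighbour sampling in plain numpy (a real pipeline would more likely use `cv2.resize` or PIL; the 32×32 target comes from the text above):

```python
import numpy as np

def resize_nn(img, size=(32, 32)):
    """Nearest-neighbour resize of a 2D array to the model's expected input
    size, implemented with integer index mapping to stay dependency-free."""
    H, W = size
    h, w = img.shape[:2]
    rows = np.arange(H) * h // H   # source row for each output row
    cols = np.arange(W) * w // W   # source column for each output column
    return img[rows][:, cols]
```
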
And S105, arranging each character obtained by recognition according to the position information of the character to obtain the recognition result of the image to be recognized.
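Step S105 reduces to a sort on the recorded positions; for a single line of text, ordering by the left edge (xmin) restores reading order. The function name and pair layout are illustrative:

```python
def arrange_characters(recognized):
    """recognized: list of (character, (xmin, ymin, xmax, ymax)) pairs.
    Each line is handled separately upstream, so sorting by xmin restores
    the reading order within the line."""
    ordered = sorted(recognized, key=lambda item: item[1][0])
    return "".join(ch for ch, _ in ordered)
```
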
According to the above scheme, in the image recognition method provided by the application, an image to be recognized is acquired, on which a line of text to be recognized is displayed; the image is processed with a target detection algorithm model to obtain the position information of each character forming the text to be recognized; the image is segmented according to that position information into a plurality of sub-images, each displaying one character; each sub-image is processed with a character recognition convolutional neural network model to recognize the character in it; and the recognized characters are arranged according to their position information to obtain the recognition result of the image to be recognized. This improves the accuracy of segmenting the characters in the image and thereby the accuracy of character recognition.
An embodiment of the present application provides an image recognition apparatus, as shown in fig. 4, including:
an obtaining unit 401 is configured to obtain an image to be recognized.
A line of text to be recognized is displayed on the image to be recognized.
The first processing unit 402 is configured to process the image to be recognized by using the target detection algorithm model, so as to obtain position information of each character forming the text to be recognized.
The target detection algorithm model may be the currently common YOLOv3 model, which is a neural network model.
Optionally, in another embodiment of the present invention, an implementation manner of the first processing unit 402, as shown in fig. 5, includes:
the second determining unit 501 is configured to determine whether the size of the text to be recognized matches a preset size.
Because the YOLOv3 model requires the width and height of an input image to be no less than 320, during model construction 3500 commonly used Chinese characters are used to generate the Chinese-character image samples; a single-line character image sample is usually set to a width of 320 and a height of 320, with at most 20 characters per line. During training of the YOLOv3 model, the training set generally contains 30000 images and the test set 10000 images, although the sizes of the training and test sets are not limited here.
A changing unit 502, configured to change the size of the text to be recognized to a preset size if the second determining unit 501 determines that the size of the text to be recognized does not conform to the preset size.
The recording unit 503 is configured to record the position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized after the text to be recognized is changed to the preset size.
Wherein, (xmin, ymin) and (xmax, ymax) are the coordinates of the upper left corner and the lower right corner of the Chinese character, respectively.
For the specific working process of the unit disclosed in the above embodiment of the present invention, reference may be made to the content of the corresponding method embodiment, as shown in fig. 2, which is not described herein again.
The segmenting unit 403 is configured to segment the image to be recognized according to the position information of each character of the text to be recognized, so as to obtain a plurality of sub-images.
Wherein each sub-image displays one character.
It should be noted that, when the image to be recognized is segmented according to the position information of each character, it is necessary to record and ensure that a plurality of sub-images obtained after segmentation can be recombined back to the image to be recognized.
And the second processing unit 404 is configured to process each sub-image by using the character recognition-convolutional neural network model, so as to recognize characters in each sub-image.
The arranging unit 405 is configured to arrange each character obtained through recognition according to the position information of the character, so as to obtain a recognition result of the image to be recognized.
For the specific working process of the unit disclosed in the above embodiment of the present invention, reference may be made to the content of the corresponding method embodiment, as shown in fig. 1, which is not described herein again.
Optionally, in another embodiment of the present invention, as shown in fig. 6, the image recognition apparatus further includes:
a first judging unit 601, configured to judge whether the text to be recognized displayed by the image to be recognized is multiple lines.
It should be noted that the YOLOv3 model here detects only a single line of text; therefore, when the text to be recognized spans multiple lines, the multi-line text must first be divided into single lines.
It should be noted that dividing multi-line text into single lines generally uses the horizontal projection method. As shown in fig. 3, horizontal projection sums the pixels of each row of the text image and plots the row statistics, from which the starting and ending row of each line can be determined.
The cutting unit 602 is configured to find an upper limit and a lower limit of each line if the first determining unit 601 determines that the text to be recognized displayed in the image to be recognized is multiple lines, and perform horizontal cutting to obtain multiple sub-texts to be recognized.
For the specific working process of the unit disclosed in the above embodiment of the present invention, reference may be made to the content of the corresponding method embodiment, which is not described herein again.
Optionally, in another embodiment of the present invention, as shown in fig. 6, the image recognition apparatus further includes:
and the identification unit is used for identifying the size of each Chinese character in the text to be identified by using a preset anchor and confirming the size of each Chinese character in the text to be identified.
Wherein the preset anchors are (10,10), (20,20), (30,30), (40,40), (50,50) and (60,60) in size.
It should be noted that these anchor sizes correspond to common font sizes. In practical applications the aspect ratio of Chinese characters may vary, so other sizes such as (15,15) or (28,28) may be used, and anchors with unequal width and height, such as (15,20) or (18,14), may also be set.
Optionally, in another embodiment of the present invention, as shown in fig. 6, the image recognition apparatus further includes:
an adjusting unit, configured to adjust the size of each sub-image according to a preset single-character size.
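The adjustment performed by this unit amounts to resizing every character sub-image to one fixed input size before recognition. The sketch below uses nearest-neighbour resampling on a NumPy array; both the 32x32 target and the interpolation method are assumptions, since the patent only requires a preset single-character size.

```python
import numpy as np

TARGET_H, TARGET_W = 32, 32  # hypothetical preset single-character size

def normalize_char_image(sub_image):
    """Nearest-neighbour resize of a single-character sub-image to the
    preset size expected by the recognition network."""
    h, w = sub_image.shape[:2]
    rows = np.arange(TARGET_H) * h // TARGET_H  # source row for each output row
    cols = np.arange(TARGET_W) * w // TARGET_W  # source column for each output column
    return sub_image[rows][:, cols]
```

Fixing the input size this way lets the downstream convolutional network use a single static input shape regardless of the original character's size in the image.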
According to the above scheme, in the image recognition apparatus provided by the present application, the acquisition unit 401 acquires an image to be recognized, in which a line of text to be recognized is displayed; the first processing unit 402 processes the image to be recognized with a target detection algorithm model to obtain the position information of each character constituting the text to be recognized; the segmentation unit 403 segments the image to be recognized according to the position information of each character to obtain a plurality of sub-images, each displaying one character; the second processing unit 404 processes each sub-image with a character recognition convolutional neural network model to recognize the character in each sub-image; finally, the arranging unit 405 arranges the recognized characters according to their position information to obtain the recognition result of the image to be recognized. This improves the accuracy of character segmentation in the image and thereby the accuracy of character recognition.
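The final arranging step can be sketched as follows. It assumes the single-line case described in the text, so ordering by the xmin of each recorded (xmin, ymin, xmax, ymax) box suffices; the pair format is an illustrative assumption.

```python
def arrange_result(chars):
    """Order recognized characters by their detected positions.
    `chars` is a list of (character, (xmin, ymin, xmax, ymax)) pairs;
    for a single line of text, sorting by xmin gives reading order."""
    return "".join(c for c, box in sorted(chars, key=lambda p: p[1][0]))
```

For multi-line input, the same idea extends by first grouping boxes by their line (e.g. by ymin) and then sorting each group by xmin.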
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method of image recognition, comprising:
acquiring an image to be identified; the image to be recognized is displayed with a line of text to be recognized;
processing the image to be recognized by using a target detection algorithm model to obtain the position information of each character forming the text to be recognized;
segmenting the image to be recognized according to the position information of each character of the text to be recognized to obtain a plurality of sub-images; wherein each sub-image displays one character;
processing each sub-image by using a character recognition convolutional neural network model to recognize characters in each sub-image;
and arranging each character obtained by identification according to the position information of the character to obtain the identification result of the image to be identified.
2. The method according to claim 1, wherein before processing the image to be recognized by using the target detection algorithm model to obtain the position information of each character constituting the text to be recognized, the method further comprises:
judging whether the text to be recognized displayed by the image to be recognized is a plurality of lines;
and if the text to be recognized displayed by the image to be recognized is judged to be a plurality of lines, finding the upper limit and the lower limit of each line, and horizontally cutting to obtain a plurality of sub-texts to be recognized.
3. The method according to claim 1, wherein the processing the image to be recognized by using the target detection algorithm model to obtain the position information of each character constituting the text to be recognized comprises:
judging whether the size of the text to be recognized accords with a preset size or not;
if the size of the text to be recognized is judged to be not in accordance with the preset size, changing the size of the text to be recognized into the preset size;
recording the position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized after the text to be recognized is changed to the preset size; wherein, (xmin, ymin) and (xmax, ymax) are the coordinates of the upper left corner and the lower right corner of the Chinese character, respectively.
4. The method according to claim 3, wherein after recording the position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized after the text to be recognized is changed to the preset size, the method further comprises:
identifying the size of each Chinese character in the text to be identified by using a preset anchor, and confirming the size of each Chinese character in the text to be identified; wherein the preset anchors are of sizes (10,10), (20,20), (30,30), (40,40), (50,50) and (60, 60).
5. The method of claim 1, wherein before processing each of the sub-images using a character recognition convolutional neural network model to identify the text in each of the sub-images, the method further comprises:
and adjusting the size of each sub-image according to the size of a preset single character.
6. An apparatus for image recognition, comprising:
an acquisition unit, configured to acquire an image to be recognized; wherein a line of text to be recognized is displayed in the image to be recognized;
the first processing unit is used for processing the image to be recognized by utilizing a target detection algorithm model to obtain the position information of each character forming the text to be recognized;
the segmentation unit is used for segmenting the image to be recognized according to the position information of each character of the text to be recognized to obtain a plurality of sub-images; wherein each sub-image displays one character;
the second processing unit is used for processing each sub-image by utilizing a character recognition convolutional neural network model to recognize characters in each sub-image;
and the arranging unit is used for arranging each character obtained by identification according to the position information of the character to obtain the identification result of the image to be identified.
7. The apparatus of claim 6, further comprising:
the first judging unit is used for judging whether the text to be recognized displayed by the image to be recognized is a plurality of lines;
and the cutting unit is used for finding the upper limit and the lower limit of each line and horizontally cutting to obtain a plurality of sub texts to be identified if the first judging unit judges that the texts to be identified displayed by the images to be identified are in a plurality of lines.
8. The apparatus of claim 6, wherein the first processing unit comprises:
the second judging unit is used for judging whether the size of the text to be recognized accords with the preset size or not;
a changing unit, configured to change the size of the text to be recognized to the preset size if the second determining unit determines that the size of the text to be recognized does not conform to the preset size;
a recording unit, configured to record a position (xmin, ymin, xmax, ymax) of each Chinese character in the text to be recognized after the text to be recognized is changed to the preset size; wherein, (xmin, ymin) and (xmax, ymax) are the coordinates of the upper left corner and the lower right corner of the Chinese character, respectively.
9. The apparatus of claim 8, further comprising:
the identification unit is used for identifying the size of each Chinese character in the text to be identified by using a preset anchor and confirming the size of each Chinese character in the text to be identified; wherein the preset anchors are of sizes (10,10), (20,20), (30,30), (40,40), (50,50) and (60, 60).
10. The apparatus of claim 6, further comprising:
and the adjusting unit is used for adjusting the size of each sub-image according to the size of a preset single character.
CN201910831740.2A 2019-09-04 2019-09-04 Image recognition method and device Active CN110674811B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910831740.2A CN110674811B (en) 2019-09-04 2019-09-04 Image recognition method and device


Publications (2)

Publication Number Publication Date
CN110674811A true CN110674811A (en) 2020-01-10
CN110674811B CN110674811B (en) 2022-04-29

Family

ID=69075949

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910831740.2A Active CN110674811B (en) 2019-09-04 2019-09-04 Image recognition method and device

Country Status (1)

Country Link
CN (1) CN110674811B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508109A (en) * 2020-12-10 2021-03-16 锐捷网络股份有限公司 Training method and device for image recognition model
CN112668576A (en) * 2020-12-30 2021-04-16 广东电网有限责任公司电力调度控制中心 Electric power iron tower identification method and device based on character symbols

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102314608A (en) * 2010-06-30 2012-01-11 汉王科技股份有限公司 Method and device for extracting rows from character image
CN108427950A (en) * 2018-02-01 2018-08-21 北京捷通华声科技股份有限公司 A kind of literal line detection method and device
CN108885699A (en) * 2018-07-11 2018-11-23 深圳前海达闼云端智能科技有限公司 Character identifying method, device, storage medium and electronic equipment
CN109241974A (en) * 2018-08-23 2019-01-18 苏州研途教育科技有限公司 A kind of recognition methods and system of text image
CN109255356A (en) * 2018-07-24 2019-01-22 阿里巴巴集团控股有限公司 A kind of character recognition method, device and computer readable storage medium
CN109685055A (en) * 2018-12-26 2019-04-26 北京金山数字娱乐科技有限公司 Text filed detection method and device in a kind of image
US20190139449A1 (en) * 2017-11-03 2019-05-09 Neusoft Corporation Method, computer readable storage medium and electronic equipment for analyzing driving behavior
CN109902622A (en) * 2019-02-26 2019-06-18 中国科学院重庆绿色智能技术研究院 A kind of text detection recognition methods for boarding pass information verifying
CN109977935A (en) * 2019-02-27 2019-07-05 平安科技(深圳)有限公司 A kind of text recognition method and device
CN110020615A (en) * 2019-03-20 2019-07-16 阿里巴巴集团控股有限公司 The method and system of Word Input and content recognition is carried out to picture


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112508109A (en) * 2020-12-10 2021-03-16 锐捷网络股份有限公司 Training method and device for image recognition model
CN112508109B (en) * 2020-12-10 2023-05-19 锐捷网络股份有限公司 Training method and device for image recognition model
CN112668576A (en) * 2020-12-30 2021-04-16 广东电网有限责任公司电力调度控制中心 Electric power iron tower identification method and device based on character symbols
CN112668576B (en) * 2020-12-30 2022-02-15 广东电网有限责任公司电力调度控制中心 Electric power iron tower identification method and device based on character symbols

Also Published As

Publication number Publication date
CN110674811B (en) 2022-04-29

Similar Documents

Publication Publication Date Title
CN109685055B (en) Method and device for detecting text area in image
CN112818812B (en) Identification method and device for table information in image, electronic equipment and storage medium
CN102567300B (en) Picture document processing method and device
CN110569830A (en) Multi-language text recognition method and device, computer equipment and storage medium
WO2017020723A1 (en) Character segmentation method and device and electronic device
CN113158808B (en) Method, medium and equipment for Chinese ancient book character recognition, paragraph grouping and layout reconstruction
CN103034848B (en) A kind of recognition methods of form types
JP7033208B2 (en) Certification document recognition methods and devices, electronic devices and computer-readable storage media
CN111259878A (en) Method and equipment for detecting text
CN113486828B (en) Image processing method, device, equipment and storage medium
CN110942004A (en) Handwriting recognition method and device based on neural network model and electronic equipment
CN110321837B (en) Test question score identification method, device, terminal and storage medium
CN109598185B (en) Image recognition translation method, device and equipment and readable storage medium
CN110674811B (en) Image recognition method and device
CN112001406A (en) Text region detection method and device
CN112906695B (en) Form recognition method adapting to multi-class OCR recognition interface and related equipment
CN107330430A (en) Tibetan character recognition apparatus and method
CN114663904A (en) PDF document layout detection method, device, equipment and medium
CN108734161B (en) Method, device and equipment for identifying prefix number area and storage medium
CN112560847A (en) Image text region positioning method and device, storage medium and electronic equipment
CN115546809A (en) Table structure identification method based on cell constraint and application thereof
Papandreou et al. Slant estimation and core-region detection for handwritten Latin words
CN109508716B (en) Image character positioning method and device
US9152876B1 (en) Methods and systems for efficient handwritten character segmentation
CN110895849A (en) Method and device for cutting and positioning crown word number, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant