CN111626249B - Method and device for identifying geometric figure in topic image and computer storage medium - Google Patents

Method and device for identifying geometric figure in topic image and computer storage medium


Publication number
CN111626249B
Authority
CN
China
Prior art keywords
image
candidate block
detection
block
geometric
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010486507.8A
Other languages
Chinese (zh)
Other versions
CN111626249A (en)
Inventor
兴百桥
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yizhen Xuesi Education Technology Co Ltd
Original Assignee
Beijing Yizhen Xuesi Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yizhen Xuesi Education Technology Co Ltd
Priority to CN202010486507.8A
Publication of CN111626249A
Application granted
Publication of CN111626249B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/413 Classification of content, e.g. text, photographs or tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G06V10/267 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the invention provides a method and a device for identifying geometric figures in a topic image, and a computer storage medium. The method comprises: detecting the topic image by using a target detection network, determining a first candidate block of the geometric figure, and obtaining a first detection image identifying the first candidate block; performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block; fusing the first detection image and the second detection image to obtain a block fusion image; and recognizing the block fusion image to obtain a recognition image comprising geometric figures. The embodiment avoids missed detections of geometric figures caused by relying on a single geometric figure detection method, and improves the comprehensiveness and accuracy of geometric figure recognition in the topic image.

Description

Method and device for identifying geometric figure in topic image and computer storage medium
Technical Field
The embodiment of the invention relates to the technical field of image processing, in particular to a method and a device for identifying geometric figures in a topic image and a computer storage medium.
Background
In homework correction systems and topic recording systems, there are many scenarios in which geometric figures in a topic image must be detected and recognized. For example, in a homework correction system, a teacher assigns a drawing question, and a student uploads the drawn figure to the system as an image. During correction, the figure drawn by the student must be detected and recognized in the uploaded image in order to judge whether it is correct. In a topic recording system, for geometry questions, the geometric figures in the question must be detected and recorded together with the question text. Correcting and recording geometric figures in this way requires the participation of a teacher and increases the teacher's workload.
Therefore, to reduce the workload of teachers and improve the efficiency of homework correction systems and topic recording systems, image recognition methods are used to recognize geometric figures in images automatically. However, a topic image differs from images of other scenes: interference patterns such as printed text, handwritten text, and mixed printed and handwritten text make the image scene complicated. As a result, when geometric figures in such images are detected and recognized, missed detections occur easily, which lowers the accuracy of geometric figure recognition.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a method, an apparatus and a computer storage medium for identifying geometric figures in a topic image, so as to address the low accuracy of geometric figure recognition.
According to a first aspect of the embodiments of the present invention, there is provided a method for identifying geometric figures in a topic image, including: detecting the topic image by using a target detection network, determining a first candidate block of the geometric figure, and obtaining a first detection image identifying the first candidate block; performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block; fusing the first detection image and the second detection image to obtain a block fusion image; and recognizing the block fusion image to obtain a recognition image comprising geometric figures.
Optionally, in an embodiment of the present invention, detecting the topic image by using the target detection network, determining a first candidate block of the geometric figure, and obtaining a first detection image identifying the first candidate block, and performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block, includes: preprocessing the topic image to obtain a preprocessed image; detecting the preprocessed image by using the target detection network, determining the first candidate block of the geometric figure, and obtaining the first detection image; and performing connected domain detection on the preprocessed image, determining the second candidate block of the geometric figure, and obtaining the second detection image.
Optionally, in an embodiment of the present invention, preprocessing the topic image to obtain the preprocessed image includes: performing image graying processing on the topic image to obtain a grayscale image; and performing histogram equalization processing on the grayscale image to obtain the preprocessed image.
Optionally, in an embodiment of the present invention, detecting the topic image by using the target detection network, determining a first candidate block of the geometric figure, and obtaining a first detection image identifying the first candidate block includes: standardizing the topic image to obtain a standard detection image meeting the detection condition of the target detection network; and detecting the standard detection image by using the target detection network, determining the first candidate block of the geometric figure in the standard detection image, and obtaining the first detection image.
Optionally, in an embodiment of the present invention, performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block includes: performing binarization processing on the topic image to obtain a binarized image; performing connected domain detection on the binarized image, and determining at least one connected block in the binarized image; merging the at least one connected block to obtain at least one merged block; and screening the second candidate block from the at least one merged block, and obtaining the second detection image.
Optionally, in an embodiment of the present invention, screening the second candidate block from the at least one merged block includes: removing, from the at least one merged block, the merged blocks whose length and width are less than 5 times the average length and width and the merged blocks whose area is less than 8 times the average area, to obtain the second candidate block.
Optionally, in an embodiment of the present invention, fusing the first detection image and the second detection image to obtain the block fusion image includes: determining the intersection-over-union (IoU) between the first candidate block and the second candidate block; when the IoU between the first candidate block and the second candidate block is greater than or equal to a preset value, deleting the second candidate block and cutting out the part of the topic image corresponding to the first candidate block to obtain the block fusion image; and when the IoU between the first candidate block and the second candidate block is smaller than the preset value, cutting out the parts of the topic image corresponding to the first candidate block and the second candidate block to obtain the block fusion image.
Optionally, in an embodiment of the present invention, recognizing the block fusion image to obtain the recognition image comprising geometric figures includes: scaling the block fusion image to a preset size and normalizing its gray values to obtain a standard recognition image meeting the recognition condition of a target recognition network; and performing geometric figure recognition on the standard recognition image by using the target recognition network to obtain the recognition image.
Optionally, in an embodiment of the present invention, the output layer of the target recognition network is given by the following formula:
[Formula image BDA0002519415780000031; not reproduced in this text]
where X is the output of the neurons in the layer preceding the output layer of the target recognition network; W is the connection weight between the neurons in that preceding layer and the neurons in the output layer; sum(W) is the sum of these connection weights; and b is the bias coefficient of the output-layer neurons.
According to a second aspect of an embodiment of the present invention, there is provided a computer storage medium having stored therein instructions for executing the steps of the method for recognizing a geometric figure in a topic image according to any one of the above-described embodiments.
According to a third aspect of the embodiments of the present invention, there is provided an apparatus for recognizing a geometric figure in a topic image, the apparatus including: an image detection module, configured to detect the topic image by using a target detection network, determine a first candidate block of the geometric figure, and obtain a first detection image identifying the first candidate block, and to perform connected domain detection on the topic image, determine a second candidate block of the geometric figure, and obtain a second detection image identifying the second candidate block; an image fusion module, configured to fuse the first detection image and the second detection image to obtain a block fusion image; and an image recognition module, configured to recognize the block fusion image to obtain a recognition image comprising a geometric figure.
According to the above technical solution, the topic image may include handwritten content and/or printed content, and a single geometric figure detection method may achieve a high recognition success rate on only one of the two. In the embodiments of the invention, a first candidate block in a first detection image is therefore determined through a deep learning network, a second candidate block in a second detection image is obtained through connected domain detection, the first detection image and the second detection image are fused to obtain a block fusion image, and the block fusion image is recognized to obtain a recognition image comprising the geometric figure. The embodiments thus avoid missed detections of geometric figures caused by relying on a single geometric figure detection method, and improve the comprehensiveness and accuracy of geometric figure recognition in the topic image.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some of the embodiments of the present invention, and a person skilled in the art may obtain other drawings based on them.
FIG. 1 is a flowchart of a method for identifying geometric figures in a topic image according to a first embodiment of the present invention;
FIG. 2 is a flowchart of a method for identifying geometric figures in a topic image according to a second embodiment of the present invention;
FIG. 3 is a schematic diagram of the output-layer network connections in a target recognition network according to a second embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an apparatus for recognizing a geometric figure in a topic image according to a fourth embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention should fall within the scope of the protection of the embodiments of the present invention.
The following further describes specific implementation of the embodiments of the present invention with reference to the drawings.
Example one
Fig. 1 is a flowchart illustrating a method for identifying geometric patterns in a topic image according to a first embodiment of the present invention. As shown in fig. 1, a method for identifying geometric patterns in a topic image according to a first embodiment of the present invention includes the following steps:
Step S101, detecting a topic image by using a target detection network, determining a first candidate block of a geometric figure, and obtaining a first detection image identifying the first candidate block; and performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block.
In this embodiment, the topic image is an image obtained by capturing a test question, and the way in which the topic image is captured is not limited. The topic image may include one or more test questions, and the test question content may include text content and geometric figure content. The test question content may be handwritten, printed, or both.
In this embodiment, the target detection network is a deep learning network, and the network is at least used for recognizing the geometric figure included in the image. The type of the target detection network is not limited, and the target detection network can be selected according to actual application requirements. For example, the target detection network may be selected from a deep learning network such as an SSD network, a YOLO V3 network, a Faster R-CNN network, and the like.
After the topic image is detected by using the target detection network, all or part of geometric figures included in the topic image can be identified, so that the region where the identified geometric figures are located can be determined as a first candidate block, and the first candidate block is identified in the formed first detection image.
Optionally, the target detection network is preferably an SSD network. SSD (Single Shot MultiBox Detector) is a target detection algorithm proposed by Wei Liu et al. at ECCV 2016. It runs fast and detects accurately, so choosing the SSD network as the target detection network can improve both the efficiency and the precision of geometric figure recognition.
In this embodiment, the connected domain is an image domain composed of foreground pixels having the same pixel value and adjacent positions in the image. After the connected domain detection is carried out on the topic image, all or part of geometric figures included in the topic image can be identified, so that the area where the identified geometric figures are located can be determined as a second candidate block, and the second candidate block is identified in the formed second detection image.
Step S102, fusing the first detection image and the second detection image to obtain a block fusion image.
In this embodiment, if the topic image is detected by only a single geometric figure detection method, geometric figures may be missed. Therefore, to improve the comprehensiveness of geometric figure detection, the first detection image and the second detection image may be fused to determine the regions containing geometric figures that are found jointly by target detection network detection and connected domain detection, and a block fusion image is generated from those regions.
The number of the block fusion images is not limited. For example, after the first detection image and the second detection image are fused, if the determined region where the geometric figure is located is two unconnected regions, two block fusion images may be generated.
Optionally, one of the topic image, the first detection image, and the second detection image may be cropped according to the position information and shape information of the first candidate block and the second candidate block, removing the parts of the image that lie outside the first candidate block and the second candidate block, so as to obtain the block fusion image.
Step S103, recognizing the block fusion image to obtain a recognition image comprising a geometric figure.
In this embodiment, image recognition is a technique of processing, analyzing, and understanding an image with a computer in order to recognize targets and objects of various patterns. Recognizing the block fusion image further determines whether it contains geometric figures; if so, the block fusion image may be used directly as the recognition image, or it may be processed further to generate the recognition image.
If the block fusion image is further processed, the image processing mode is not limited, and the determination can be performed according to the actual application requirements. For example, the block fusion image may be enlarged or reduced to a predetermined size, or geometric figures in the fusion image may be classified according to a predetermined condition, or the like.
As can be seen from the above embodiments of the present invention, because the topic image may include handwritten content and/or print content, and a single geometric figure detection method may only have a high recognition success rate for one of the handwritten content and the print content, in the embodiments of the present invention, a first candidate block in a first detection image is determined through a deep learning network, a second candidate block in a second detection image is obtained through connected domain detection, and the first detection image and the second detection image are fused to obtain a block fusion image, so that the block fusion image is recognized to obtain a recognition image including geometric figures. Therefore, according to the embodiment of the invention, the situation that the geometric figure in the topic image is missed to be detected due to the use of a single geometric figure detection method can be avoided, and the comprehensiveness and the accuracy of the geometric figure identification in the topic image are improved.
Example two
FIG. 2 is a flowchart illustrating a method for identifying geometric patterns in a topic image according to a second embodiment of the present invention. As shown in fig. 2, a method for identifying a geometric figure in a topic image according to a second embodiment of the present invention includes the following steps:
Step S201, detecting the topic image by using a target detection network, determining a first candidate block of a geometric figure, and obtaining a first detection image identifying the first candidate block; and performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block.
In this embodiment, a topic image obtained directly by image capture may have defects such as low brightness. To improve image clarity and facilitate subsequent processing, step S201 may include:
Sub-step S201a, preprocessing the topic image to obtain a preprocessed image.
Sub-step S201b, detecting the preprocessed image by using the target detection network, determining a first candidate block of the geometric figure, and obtaining a first detection image identifying the first candidate block; and performing connected domain detection on the preprocessed image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block.
The method used to preprocess the topic image in sub-step S201a is not limited and may be chosen according to actual application requirements. For example, preprocessing the topic image may include at least one of graying, histogram equalization, contrast stretching, and Gamma correction.
Optionally, in order to obtain better processing effect and reduce data processing amount, the sub-step S201a further includes the following sub-steps:
Sub-step a1, performing image graying processing on the topic image to obtain a grayscale image.
When the topic image is a color image, it can be converted to grayscale, and working on the grayscale image in subsequent processing reduces the amount of data to be processed. The graying method is not limited; for example, it may be one of the component method, the maximum value method, the average value method, and the weighted average method.
Sub-step a2, performing histogram equalization processing on the grayscale image to obtain a preprocessed image.
The histogram equalization processing is to enhance the image contrast by stretching the pixel intensity distribution range, so that the histogram equalization processing is performed on the gray-scale image, and the image pixel gray-scale value distribution of the gray-scale image can be changed into approximately uniform distribution, thereby obtaining a preprocessed image with better contrast.
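The two preprocessing sub-steps can be sketched in plain Python as follows. This is a minimal illustration, not the patented implementation: images are assumed to be nested lists, the function names `to_gray` and `equalize_hist` are hypothetical, and the weighted-average weights are the common ITU-R BT.601 luma weights, which the text does not specify.

```python
def to_gray(rgb):
    """Weighted-average graying; rgb is a list of rows of (r, g, b) tuples."""
    # Assumption: BT.601 luma weights. The text only names the
    # "weighted average method" without fixing the weights.
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb]


def equalize_hist(gray, levels=256):
    """Histogram equalization of a grayscale image (list of rows of ints)."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    # Cumulative distribution of the gray levels.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: nothing to stretch
        return [row[:] for row in gray]
    scale = (levels - 1) / (total - cdf_min)
    # Look-up table that spreads the occupied levels over the full range.
    lut = [max(0, int(round((c - cdf_min) * scale))) for c in cdf]
    return [[lut[v] for v in row] for row in gray]
```

On a low-contrast image whose gray values occupy only 50 to 150, the darkest occupied level maps to 0 and the brightest to 255, which is the contrast stretch the paragraph above describes.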
In this embodiment, different deep learning networks place different requirements on the images they detect. Before the target detection network is used for detection, the topic image therefore needs to be standardized to obtain a standard detection image that meets the detection condition of the target detection network, that is, the topic image is adjusted to be consistent with the training images used when the target detection network was trained. Correspondingly, step S201 may include:
Sub-step S201c, standardizing the topic image to obtain a standard detection image meeting the detection condition of the target detection network.
If sub-step S201a is executed, sub-step S201c is: standardizing the preprocessed image to obtain a standard detection image meeting the detection condition of the target detection network.
Optionally, standardizing the topic image may include scaling the topic image to a preset size and/or normalizing the gray values of the topic image. For example, the topic image may be scaled to 300 × 300 pixels, 512 × 512 pixels, or the like.
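A minimal sketch of this standardization step, assuming a grayscale image stored as nested lists with values in 0 to 255. The function name `standardize`, the nearest-neighbour resampling, and the division by 255 are illustrative choices; the text only requires a preset size and normalized gray values.

```python
def standardize(gray, size=(300, 300)):
    """Resize by nearest-neighbour sampling, then map gray values to [0, 1]."""
    out_h, out_w = size
    in_h, in_w = len(gray), len(gray[0])
    # Each output pixel copies the source pixel at the proportional position.
    resized = [
        [gray[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]
    # Assumption: normalization means scaling 0..255 to 0.0..1.0.
    return [[v / 255.0 for v in row] for row in resized]
```

In practice the target size and the normalization would have to match whatever was applied to the training images of the chosen detection network.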
Sub-step S201d, detecting the standard detection image by using the target detection network, determining a first candidate block of the geometric figure in the standard detection image, and obtaining a first detection image identifying the first candidate block.
After the standard detection image is detected by using the target detection network, all or part of geometric figures included in the standard detection image can be identified, the area where the identified geometric figures are located is determined as a first candidate block, and the first candidate block is identified in the formed first detection image.
In this embodiment, if the topic image contains a wide range of color or brightness values, the result of connected domain detection may be affected. To improve connected domain detection, the topic image may therefore be binarized before connected domain detection is performed. Correspondingly, step S201 may further include:
Sub-step S201e, performing binarization processing on the topic image to obtain a binarized image.
If sub-step S201a is executed, sub-step S201e is: performing binarization processing on the preprocessed image to obtain a binarized image.
The gray value of each pixel in the binarized image is either 0 (black) or 255 (white), so binarizing the topic image yields an image that contains only black and white.
Optionally, the topic image may be binarized using the maximum inter-class variance method. The maximum inter-class variance method (Otsu's method) determines a threshold automatically and is suited to images with a bimodal gray-level histogram. It divides an image into a background part and a target part according to the gray-level characteristics of the image. The larger the inter-class variance between background and target, the greater the difference between the two parts that make up the image; when part of the target is misclassified as background, or part of the background as target, that difference shrinks. Therefore, taking the geometric figure in the topic image as the target and choosing the threshold that maximizes the inter-class variance between the background and the geometric figure minimizes the probability of misclassification and gives a better processing result.
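The threshold search described above can be sketched as follows. This is an illustrative pure-Python version of Otsu's method on a nested-list grayscale image; the names `otsu_threshold` and `binarize` are hypothetical, and the quantity maximized is the unnormalized between-class variance w0 · w1 · (mu0 − mu1)^2.

```python
def otsu_threshold(gray, levels=256):
    """Return the threshold t that maximizes the between-class variance."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels with value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray):
    """Map pixels above the Otsu threshold to 255 (white), the rest to 0 (black)."""
    t = otsu_threshold(gray)
    return [[255 if v > t else 0 for v in row] for row in gray]
```

For an image with dark strokes near gray level 10 and bright paper near 200, any threshold between the two clusters gives the same split, and this sketch returns the lowest such value.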
In the sub-step S201f, connected domain detection is performed on the binarized image to determine at least one connected block in the binarized image.
The connected block is an image domain which is composed of foreground pixel points with the same pixel value and adjacent positions in the binary image.
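A minimal sketch of connected domain detection on a binarized image follows. It assumes dark strokes are the foreground (value 0) and uses 8-connectivity; the text only says "adjacent positions", so both choices are assumptions, as is the hypothetical function name `connected_blocks`. Each connected block is reported as an inclusive bounding box `(x0, y0, x1, y1)`.

```python
from collections import deque


def connected_blocks(binary, foreground=0):
    """Return bounding boxes of 8-connected foreground regions."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != foreground or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and not seen[ny][nx]
                                and binary[ny][nx] == foreground):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes
```

Boxes are returned in row-major order of the first pixel found in each region.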
In the sub-step S201g, at least one connected block is merged to obtain at least one merged block.
If the connected blocks have an overlapping relationship or an inclusion relationship, it indicates that the content included in the corresponding region of the connected blocks in the topic image has a certain relevance, such as a segment of text or a geometric figure, so that the connected blocks having an overlapping relationship or an inclusion relationship can be merged to obtain a merged block.
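The merging rule can be sketched on bounding boxes as below. This is an assumption-laden illustration: connected blocks are represented by inclusive boxes `(x0, y0, x1, y1)`, two boxes are merged into their common bounding box whenever they overlap or one contains the other, and the names `overlaps` and `merge_blocks` are hypothetical.

```python
def overlaps(a, b):
    """True if two inclusive boxes overlap or one contains the other."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_blocks(boxes):
    """Repeatedly merge overlapping boxes until no two boxes overlap."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        result = []
        for box in merged:
            for i, other in enumerate(result):
                if overlaps(box, other):
                    # Replace the pair by the box that encloses both.
                    result[i] = (min(box[0], other[0]), min(box[1], other[1]),
                                 max(box[2], other[2]), max(box[3], other[3]))
                    changed = True
                    break
            else:
                result.append(box)
        merged = result
    return merged
```

The outer loop matters because enlarging one box can make it overlap a box that was previously separate.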
In the sub-step S201h, a second candidate block is screened from the at least one merged block, and a second detected image is obtained.
Since the region of the topic image corresponding to a merged block may contain text content, geometric figure content, or no content, the merged blocks need to be screened according to a preset rule to select the merged blocks that include a geometric figure, i.e., the second candidate blocks.
Optionally, considering that a geometric figure generally occupies a larger area in the topic image than text content does, the merged blocks may be screened according to area to determine the second candidate block.
Optionally, to determine the second candidate block more accurately, and based on an analysis of a large number of topic images, the sub-step S201h may include: removing, from the at least one merged block, the merged blocks whose length and width are less than 5 times the average length and width and the merged blocks whose area is less than 8 times the average area, to obtain the second candidate block.
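The screening rule can be sketched as follows. The factors 5 and 8 come from the text; reading "length and width" as "both sides below the limit", taking the averages over all merged blocks, and the name `screen_candidates` are assumptions.

```python
def screen_candidates(blocks, size_factor=5.0, area_factor=8.0):
    """Keep only merged blocks that are large relative to the average block.

    `blocks` holds boxes (x1, y1, x2, y2). A block is removed if both its
    width and height are below size_factor times the average width and
    height (assumed reading), or if its area is below area_factor times
    the average area.
    """
    if not blocks:
        return []
    widths = [x2 - x1 for x1, y1, x2, y2 in blocks]
    heights = [y2 - y1 for x1, y1, x2, y2 in blocks]
    areas = [w * h for w, h in zip(widths, heights)]
    count = len(blocks)
    avg_w = sum(widths) / count
    avg_h = sum(heights) / count
    avg_area = sum(areas) / count

    kept = []
    for block, w, h, area in zip(blocks, widths, heights, areas):
        small_sides = w < size_factor * avg_w and h < size_factor * avg_h
        small_area = area < area_factor * avg_area
        if not (small_sides or small_area):
            kept.append(block)
    return kept
```

With thresholds this high, a block survives only when it is much larger than the many small text blocks on the page, which matches the stated observation that a geometric figure usually occupies a larger area than text.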
Step S202, carrying out fusion processing on the first detection image and the second detection image to obtain a block fusion image.
In this embodiment, in order to improve the accuracy of geometric figure recognition and reduce the processing amount of subsequent image recognition, the step S202 may include the following steps:
In sub-step S202a, the intersection ratio between the first candidate block and the second candidate block is determined.
The intersection ratio (Intersection-over-Union, IoU) between the first candidate block and the second candidate block is the ratio of the area of their intersection to the area of their union, and ranges from 0 to 1. When the intersection ratio is 1, the first candidate block and the second candidate block coincide completely, that is, both target detection network detection and connected domain detection have detected that the topic image includes a geometric figure in this region. When the intersection ratio is 0, the first candidate block and the second candidate block do not overlap at all, that is, the two detection modes give completely different results as to whether the topic image includes a geometric figure in the regions corresponding to the first candidate block and the second candidate block.
Since target detection network detection and connected domain detection may each yield zero or more candidate blocks, the intersection ratio between every first candidate block and every second candidate block needs to be calculated in sub-step S202a.
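The intersection ratio of two axis-aligned blocks, and the table of ratios between all first and second candidate blocks, can be computed as in the sketch below. The box format (x1, y1, x2, y2) and the names `iou` and `iou_matrix` are assumptions for illustration.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_matrix(first_blocks, second_blocks):
    """Return ious[i][j], the IoU of first_blocks[i] and second_blocks[j]."""
    return [[iou(a, b) for b in second_blocks] for a in first_blocks]
```

Either list may be empty, in which case the matrix has no rows or its rows have no entries.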
In sub-step S202b, when the intersection ratio of the first candidate block and the second candidate block is greater than or equal to a preset value, the second candidate block is deleted and the portion of the topic image corresponding to the first candidate block is cropped out to obtain the block fusion image; when the intersection ratio of the first candidate block and the second candidate block is smaller than the preset value, the portions of the topic image corresponding to the first candidate block and the second candidate block are both cropped out to obtain the block fusion image.
The target detection network is a deep learning network, and its probability of detection error is usually smaller than that of connected domain detection. Therefore, when the intersection ratio of the first candidate block and the second candidate block is greater than or equal to the preset value, both detection modes have detected a geometric figure in or near the region where the first candidate block is located, so only the first candidate block needs to be retained and the second candidate block is deleted; the partial image of the topic image corresponding to the region of the first candidate block is then extracted to obtain the block fusion image. The portion of the topic image corresponding to the first candidate block is the portion having the same position and shape as the first candidate block.
When the intersection ratio of the first candidate block and the second candidate block is smaller than the preset value, the two detection modes disagree to some extent as to whether the first candidate block and the second candidate block include a geometric figure. In order to detect geometric figures as comprehensively as possible, the first candidate block and the second candidate block can both be retained in this case, and the portions of the topic image corresponding to the regions of the first candidate block and the second candidate block are extracted to obtain the block fusion image. The portion of the topic image corresponding to the second candidate block is the portion having the same position and shape as the second candidate block.
Optionally, in order to reduce the probability of missed detection of geometric figures and ensure the accuracy of geometric figure recognition, the preset value is preferably one third.
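The fusion rule can be sketched as below, taking the intersection ratios as an already computed table. Treating a second candidate block as confirmed when it reaches the preset value with any first candidate block is one possible reading of the rule for several blocks; the names `fuse_candidates` and `crop` and the integer box format are assumptions.

```python
def fuse_candidates(first_blocks, second_blocks, ious, preset=1 / 3):
    """Return the blocks to crop from the topic image.

    ious[i][j] is the intersection ratio of first_blocks[i] and
    second_blocks[j]. All first candidate blocks are kept. A second
    candidate block is deleted when its ratio with some first candidate
    block is >= preset (assumed reading), otherwise it is kept as well.
    """
    fused = list(first_blocks)
    for j, block in enumerate(second_blocks):
        confirmed = any(ious[i][j] >= preset for i in range(len(first_blocks)))
        if not confirmed:
            fused.append(block)
    return fused


def crop(image, box):
    """Cut the region box = (x1, y1, x2, y2) out of a 2-D list image,
    with x2 and y2 exclusive (assumed convention)."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in image[y1:y2]]
```

Each block returned by `fuse_candidates` would then be passed to `crop` to produce one block fusion image.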
Step S203, identifying the block fusion image to obtain an identification image comprising a geometric figure.
In this embodiment, since the sizes of the different geometric figures are different, the sizes of the block fusion images obtained in the foregoing steps are also different, and in order to further identify and confirm whether the block fusion image includes the geometric figures by using the target identification network and improve the accuracy of geometric figure identification, the fusion image needs to be correspondingly processed. Correspondingly, step S203 may comprise the following sub-steps:
In sub-step S203a, the block fusion image is scaled to a preset size and gray-level normalization is performed to obtain a standard recognition image that satisfies the recognition condition of the target recognition network.
The preset size can be set according to the recognition condition of the selected target recognition network. For example, the block fusion image may be scaled to 64 × 64 pixels, 128 × 128 pixels, 224 × 224 pixels, etc.
In addition, in order to avoid interference caused by insufficient image contrast to the image recognition processing, the block fusion image can be further subjected to gray level normalization processing, so that a standard recognition image meeting the recognition condition of the target recognition network is obtained. The specific manner of the gray scale normalization process is not limited, and for example, the gray scale normalization process may be a mean variance normalization process or a gray scale transformation normalization process.
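A minimal sketch of sub-step S203a follows. Nearest-neighbour scaling and the nested-list image format are assumptions chosen to keep the example dependency-free; mean-variance normalization is one of the two options named above. The names `resize_nearest` and `normalize_mean_variance` are hypothetical.

```python
from statistics import mean, pstdev


def resize_nearest(image, size):
    """Scale a 2-D list image to size x size by nearest-neighbour sampling."""
    src_h, src_w = len(image), len(image[0])
    return [[image[(y * src_h) // size][(x * src_w) // size]
             for x in range(size)]
            for y in range(size)]


def normalize_mean_variance(image):
    """Shift the gray levels to zero mean and scale them to unit variance."""
    pixels = [value for row in image for value in row]
    mu = mean(pixels)
    sigma = pstdev(pixels)
    if sigma == 0:
        sigma = 1.0  # a uniform image is only shifted, not rescaled
    return [[(value - mu) / sigma for value in row] for row in image]
```

A call such as `normalize_mean_variance(resize_nearest(block_image, 224))` would yield an input of the size expected by a network trained on 224 × 224 images.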
In sub-step S203b, geometric figure recognition is performed on the standard recognition image by using the target recognition network to obtain the recognition image.
The target recognition network is at least operable to determine whether the standard recognition image includes a geometric figure. Geometric figure recognition is performed on the standard recognition images by using the target recognition network to determine which standard recognition images include a geometric figure. If a standard recognition image includes a geometric figure, the corresponding block fusion image may be directly determined as the recognition image, or the block fusion image may be further processed to generate the recognition image.
Optionally, in order to improve recognition accuracy, the target recognition network may also be a deep learning network. The type of the target recognition network is not limited and can be selected according to actual application requirements. For example, the target recognition network may be a deep learning network such as a VGG16 network, an Inception V4 network, or a ResNet network.
Optionally, the target recognition network is preferably a VGG16 network. The VGG16 network is a convolutional deep learning network with 16 weight layers; it is simple and practical, and performs very well in image classification and target detection tasks. Using a VGG16 network as the target recognition network may improve the recognition accuracy for geometric figures.
Optionally, when the target recognition network is a deep learning network, each neuron of the output layer is connected to each neuron of the layer preceding the output layer, and the output of the output layer can be used to determine whether the standard recognition image includes a geometric figure. In order to improve the accuracy of the target recognition network in recognizing geometric figures, all the connection weights between the output layer and the neurons of the preceding layer can be taken into account together, that is, the output of an output-layer neuron can be calculated not by the conventional "Y = f(W * X + b)" but by:
Y = f(W / (sum(W) - W) * X + b)
wherein Y is the output of an output-layer neuron; X is the output of a neuron in the layer preceding the output layer in the target recognition network; W is the connection weight between that preceding-layer neuron and the output-layer neuron; sum(W) is the sum of the connection weights between that preceding-layer neuron and all the output-layer neurons; b is the bias coefficient of the output-layer neuron.
For example, fig. 3 shows the connection relationship between the output layer and the preceding layer in the target recognition network: the output layer may include a first neuron y1 and a second neuron y2, and the layer preceding the output layer includes a third neuron x1 and a fourth neuron x2.
According to the conventional calculation method, the output calculation formula of the first neuron y1 of the output layer is:
y1=f(w11*x1+w21*x2+b1)
the output of the second neuron y2 of the output layer is calculated as:
y2=f(w12*x1+w22*x2+b2)
according to the adjusted calculation method in the embodiment, in the target recognition network, the output calculation formula of the first neuron y1 of the output layer is:
y1=f(w11/(sum(w11,w12)-w11)*x1+w21/(sum(w21,w22)-w21)*x2+b1)
the output of the second neuron y2 of the output layer is calculated as:
y2=f(w12/(sum(w11,w12)-w12)*x1+w22/(sum(w21,w22)-w22)*x2+b2)
wherein w11 is the connection weight between the third neuron and the first neuron; w12 is the connection weight between the third neuron and the second neuron; w21 is the connection weight between the fourth neuron and the first neuron; w22 is the connection weight between the fourth neuron and the second neuron; b1 is the bias coefficient for the first neuron; b2 is the bias coefficient for the second neuron.
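The two calculation methods for y1 and y2 can be compared numerically with the sketch below. The activation function f is not specified in this passage, so a sigmoid default is an assumption, as are the function names and the layout weights[i][j] (weight from preceding-layer neuron i to output neuron j).

```python
import math


def sigmoid(z):
    """Assumed activation function; the text only names it f."""
    return 1.0 / (1.0 + math.exp(-z))


def conventional_output(weights, x, bias, f=sigmoid):
    """y_j = f(sum_i w_ij * x_i + b_j)."""
    return [f(sum(weights[i][j] * x[i] for i in range(len(x))) + bias[j])
            for j in range(len(bias))]


def adjusted_output(weights, x, bias, f=sigmoid):
    """y_j = f(sum_i w_ij / (sum(w_i*) - w_ij) * x_i + b_j).

    sum(w_i*) adds the weights from preceding-layer neuron i to all output
    neurons. The denominator must be non-zero, so this form needs at least
    two output neurons and weights that do not cancel out.
    """
    outputs = []
    for j in range(len(bias)):
        z = bias[j]
        for i in range(len(x)):
            rest = sum(weights[i]) - weights[i][j]
            z += weights[i][j] / rest * x[i]
        outputs.append(f(z))
    return outputs
```

With two output neurons the denominator reduces to the weight of the other output neuron, so y1 becomes f(w11/w12*x1 + w21/w22*x2 + b1): each input is weighted by how strongly it favours one output over the other.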
Optionally, the target recognition network may further classify the standard recognition images according to the recognition result. For example, according to the kind of figure contained in a standard recognition image, the standard recognition image may be classified into categories such as geometric figure, other figure, and invalid figure, wherein geometric figures may include triangles, rectangles, parallelograms, trapezoids, circles, and the like, and other figures may include tables, coordinate systems, solid geometry figures, and the like.
As can be seen from the above embodiment of the present invention, preprocessing the topic image yields a preprocessed image with higher definition, and performing the subsequent processing on the preprocessed image improves the comprehensiveness and accuracy of geometric figure recognition; performing connected domain detection after the topic image has been processed into a binarized image improves the accuracy of identifying the second candidate block; calculating the intersection ratio between the first candidate block and the second candidate block and then obtaining the block fusion image according to the intersection ratio reduces the processing amount of the subsequent recognition of the block fusion image; and selecting a deep learning network as the target recognition network improves the accuracy of recognizing whether the block fusion image includes a geometric figure.
Example three
A third embodiment of the present invention provides a computer storage medium storing instructions for executing the steps of the method for identifying geometric figures in a topic image of the first embodiment or the second embodiment. The computer storage medium has the beneficial effects of the corresponding method embodiments, and details are not repeated herein.
Example four
Referring to fig. 4, a fourth embodiment of the present invention provides an apparatus for identifying geometric figures in a topic image. The apparatus includes an image detection module 401, an image fusion module 402, and an image recognition module 403, where:
The image detection module 401 is configured to detect the topic image by using a target detection network, determine a first candidate block of the geometric figure, and obtain a first detection image identifying the first candidate block; and to perform connected domain detection on the topic image, determine a second candidate block of the geometric figure, and obtain a second detection image identifying the second candidate block.
The image fusion module 402 is configured to perform fusion processing on the first detection image and the second detection image to obtain a block fusion image.
The image recognition module 403 is configured to recognize the block fusion image to obtain a recognition image including a geometric figure.
Optionally, the image detection module 401 is further configured to preprocess the topic image to obtain a preprocessed image; detect the preprocessed image by using the target detection network, determine a first candidate block of the geometric figure, and obtain the first detection image; and perform connected domain detection on the preprocessed image, determine a second candidate block of the geometric figure, and obtain the second detection image.
Optionally, preprocessing the topic image to obtain the preprocessed image includes: performing image graying on the topic image to obtain a grayscale image; and performing histogram equalization on the grayscale image to obtain the preprocessed image.
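The two preprocessing operations can be sketched as follows. The luminance weights 0.299, 0.587, and 0.114 are a common convention and an assumption here, as are the nested-list image format and the names `to_gray` and `equalize_histogram`.

```python
def to_gray(rgb):
    """Convert a 2-D list of (r, g, b) tuples to gray levels in 0..255,
    using assumed luminance weights."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb]


def equalize_histogram(gray):
    """Spread the gray levels of a 2-D list image over the range 0..255."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)

    # Cumulative distribution of the gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:
        return [row[:] for row in gray]  # only one gray level is present

    # Look-up table mapping each old gray level to its equalized value.
    lut = [max(0, round((c - cdf_min) / (total - cdf_min) * 255)) for c in cdf]
    return [[lut[value] for value in row] for row in gray]
```

After equalization the darkest gray level present maps to 0 and the brightest to 255, which increases the contrast between strokes and paper before detection.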
Optionally, the image detection module 401 is further configured to perform normalization processing on the topic image to obtain a standard detection image meeting the detection condition of the target detection network; and detect the standard detection image by using the target detection network, determine the first candidate block of the geometric figure in the standard detection image, and obtain the first detection image.
Optionally, the image detection module 401 is further configured to: binarize the topic image to obtain a binarized image; perform connected domain detection on the binarized image to determine at least one connected block in the binarized image; merge the at least one connected block to obtain at least one merged block; and screen the second candidate block from the at least one merged block to obtain the second detection image.
Screening the second candidate block from the at least one merged block and obtaining the second detection image includes: removing, from the at least one merged block, the merged blocks whose length and width are less than 5 times the average length and width and the merged blocks whose area is less than 8 times the average area, to obtain the second candidate block.
Optionally, the image fusion module 402 is further configured to determine the intersection ratio between the first candidate block and the second candidate block; when the intersection ratio of the first candidate block and the second candidate block is greater than or equal to a preset value, delete the second candidate block and crop out the portion of the topic image corresponding to the first candidate block to obtain the block fusion image; and when the intersection ratio of the first candidate block and the second candidate block is smaller than the preset value, crop out the portions of the topic image corresponding to the first candidate block and the second candidate block.
Optionally, the image recognition module 403 is further configured to scale the block fusion image to a preset size, and perform gray-scale normalization processing to obtain a standard recognition image meeting the recognition condition of the target recognition network. And carrying out geometric figure recognition on the standard recognition image by using a target recognition network to obtain a recognition image.
Optionally, the output of an output-layer neuron of the target recognition network is
Y = f(W / (sum(W) - W) * X + b)
wherein Y is the output of an output-layer neuron; X is the output of a neuron in the layer preceding the output layer in the target recognition network; W is the connection weight between that preceding-layer neuron and the output-layer neuron; sum(W) is the sum of the connection weights between that preceding-layer neuron and all the output-layer neurons; b is the bias coefficient of the output-layer neuron.
The apparatus for identifying geometric figures in a topic image of this embodiment can also be used to implement other steps in the method for identifying geometric figures in a topic image of the first embodiment or the second embodiment, and has the beneficial effects of corresponding method step embodiments, which are not described herein again.
It should be noted that, according to the implementation requirement, each component/step described in the embodiment of the present invention may be divided into more components/steps, and two or more components/steps or partial operations of the components/steps may also be combined into a new component/step to achieve the purpose of the embodiment of the present invention.
The above-described method according to an embodiment of the present invention may be implemented in hardware or firmware, or as software or computer code that can be stored in a recording medium such as a CD-ROM, a RAM, a floppy disk, a hard disk, or a magneto-optical disk, or as computer code that is originally stored in a remote recording medium or a non-transitory machine-readable medium, downloaded through a network, and stored in a local recording medium, so that the method described herein can be processed by such software stored on a recording medium using a general-purpose computer, a dedicated processor, or programmable or dedicated hardware such as an ASIC or an FPGA. It will be appreciated that the computer, processor, microprocessor controller, or programmable hardware includes memory components (e.g., RAM, ROM, flash memory, etc.) that can store or receive software or computer code which, when accessed and executed by the computer, processor, or hardware, implements the method for identifying geometric figures in a topic image described herein. Further, when a general-purpose computer accesses code for implementing the method for identifying geometric figures in a topic image shown herein, execution of the code converts the general-purpose computer into a special-purpose computer for executing that method.
Those of ordinary skill in the art will appreciate that the various illustrative elements and method steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present embodiments.
The above embodiments are only for illustrating the embodiments of the present invention and not for limiting the embodiments of the present invention, and those skilled in the art can make various changes and modifications without departing from the spirit and scope of the embodiments of the present invention, so that all equivalent technical solutions also belong to the scope of the embodiments of the present invention, and the scope of patent protection of the embodiments of the present invention should be defined by the claims.

Claims (10)

1. A method for identifying geometric figures in a topic image, which is characterized by comprising the following steps:
detecting the topic image by using a target detection network, determining a first candidate block of a geometric figure, and obtaining a first detection image identifying the first candidate block; performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block;
fusing the first detection image and the second detection image to obtain a block fusion image;
identifying the block fusion image to obtain an identification image comprising the geometric figure;
the fusing the first detection image and the second detection image to obtain a block fusion image comprises: determining an intersection ratio between the first candidate block and the second candidate block; when the intersection ratio of the first candidate block and the second candidate block is greater than or equal to a preset value, deleting the second candidate block, and cropping out a part corresponding to the first candidate block in the topic image to obtain the block fusion image; and when the intersection ratio of the first candidate block and the second candidate block is smaller than the preset value, cropping out the parts of the topic image corresponding to the first candidate block and the second candidate block.
2. The method of claim 1, wherein the detecting the topic image by using a target detection network, determining a first candidate block of a geometric figure, and obtaining a first detection image identifying the first candidate block; and performing connected domain detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block comprises:
preprocessing the topic image to obtain a preprocessed image;
detecting the preprocessed image by using the target detection network, determining the first candidate block of the geometric figure, and obtaining the first detection image; and performing connected domain detection on the preprocessed image, determining the second candidate block of the geometric figure, and obtaining the second detection image.
3. The method of claim 2, wherein the pre-processing the topic image to obtain a pre-processed image comprises:
performing image graying processing on the topic image to obtain a grayscale image;
and carrying out histogram equalization processing on the gray level image to obtain the preprocessed image.
4. The method of claim 1, wherein the detecting the topic image using the target detection network, determining a first candidate block of a geometric figure, and obtaining a first detected image identifying the first candidate block comprises:
carrying out standardization processing on the topic image to obtain a standard detection image meeting the detection condition of the target detection network;
and detecting the standard detection image by using a target detection network, determining the first candidate block of the geometric figure in the standard detection image, and obtaining the first detection image.
5. The method of claim 1, wherein the performing connected component detection on the topic image, determining a second candidate block of the geometric figure, and obtaining a second detection image identifying the second candidate block comprises:
performing binarization processing on the topic image to obtain a binarized image;
carrying out connected domain detection on the binarized image, and determining at least one connected block in the binarized image;
merging the at least one connected block to obtain at least one merged block;
and screening the second candidate block from the at least one merged block, and obtaining the second detection image.
6. The method of claim 5, wherein the filtering the second candidate block from the at least one merged block comprises:
and removing the merged block with the length and width being less than 5 times the average length and width and the merged block with the area being less than 8 times the average area from the at least one merged block to obtain the second candidate block.
7. The method of claim 1, wherein the identifying the block-fused image, and obtaining the identified image including the geometric figure comprises:
scaling the block fusion image to a preset size, and carrying out gray value normalization processing to obtain a standard recognition image meeting the recognition condition of a target recognition network;
and carrying out geometric figure recognition on the standard recognition image by using the target recognition network to obtain the recognition image.
8. The method of claim 7, wherein the output of an output-layer neuron of the target recognition network is
Y = f(W / (sum(W) - W) * X + b)
wherein Y is the output of an output-layer neuron; X is the output of a neuron in the layer preceding the output layer in the target recognition network; W is the connection weight between that preceding-layer neuron and the output-layer neuron; sum(W) is the sum of the connection weights between that preceding-layer neuron and all the output-layer neurons; b is the bias coefficient of the output-layer neuron.
9. A computer storage medium characterized in that instructions for executing each of the steps of the method for recognizing a geometric figure in a topic image according to any one of claims 1 to 8 are stored in the computer storage medium.
10. An apparatus for recognizing a geometric figure in a topic image, the apparatus comprising:
an image detection module, configured to detect the topic image by using a target detection network, determine a first candidate block of a geometric figure, and obtain a first detection image identifying the first candidate block; and to perform connected domain detection on the topic image, determine a second candidate block of the geometric figure, and obtain a second detection image identifying the second candidate block;
an image fusion module, configured to perform fusion processing on the first detection image and the second detection image to obtain a block fusion image, where the performing fusion processing on the first detection image and the second detection image to obtain the block fusion image includes: determining an intersection ratio between the first candidate block and the second candidate block; when the intersection ratio of the first candidate block and the second candidate block is greater than or equal to a preset value, deleting the second candidate block, and cropping out a part corresponding to the first candidate block in the topic image to obtain the block fusion image; when the intersection ratio of the first candidate block and the second candidate block is smaller than the preset value, cropping out the parts of the topic image corresponding to the first candidate block and the second candidate block;
and the image identification module is used for identifying the block fusion image to obtain an identification image comprising the geometric figure.
CN202010486507.8A 2020-06-01 2020-06-01 Method and device for identifying geometric figure in topic image and computer storage medium Active CN111626249B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010486507.8A CN111626249B (en) 2020-06-01 2020-06-01 Method and device for identifying geometric figure in topic image and computer storage medium

Publications (2)

Publication Number Publication Date
CN111626249A (en) 2020-09-04
CN111626249B (en) 2022-05-13

Family

ID=72272019

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010486507.8A Active CN111626249B (en) 2020-06-01 2020-06-01 Method and device for identifying geometric figure in topic image and computer storage medium

Country Status (1)

Country Link
CN (1) CN111626249B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112308946B (en) * 2020-11-09 2023-08-18 电子科技大学中山学院 Question generation method and device, electronic equipment and readable storage medium
CN112200167B (en) * 2020-12-07 2021-03-09 北京易真学思教育科技有限公司 Image recognition method, device, equipment and storage medium
CN112883795B (en) * 2021-01-19 2023-01-31 贵州电网有限责任公司 Rapid and automatic table extraction method based on deep neural network
CN114627730A (en) * 2022-03-31 2022-06-14 北京科技大学 Braille electronic book

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105096347B (en) * 2014-04-24 2017-09-08 富士通株式会社 Image processing apparatus and method
CN108108761B (en) * 2017-12-21 2020-05-01 西北工业大学 Rapid traffic signal lamp detection method based on deep feature learning
CN109753962B (en) * 2019-01-13 2023-06-09 南京邮电大学盐城大数据研究院有限公司 Method for processing text region in natural scene image based on hybrid network

Also Published As

Publication number Publication date
CN111626249A (en) 2020-09-04

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant