WO2022056875A1 - 一种铭牌图像的分割方法、装置和计算机可读存储介质 - Google Patents

一种铭牌图像的分割方法、装置和计算机可读存储介质 Download PDF

Info

Publication number
WO2022056875A1
WO2022056875A1 · PCT/CN2020/116313 · CN2020116313W
Authority
WO
WIPO (PCT)
Prior art keywords
image
nameplate
channel
pixel
coordinates
Prior art date
Application number
PCT/CN2020/116313
Other languages
English (en)
French (fr)
Inventor
王丹
李晶
刘浩
华文韬
李昂
张鹏飞
Original Assignee
西门子股份公司
西门子(中国)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 西门子股份公司, 西门子(中国)有限公司 filed Critical 西门子股份公司
Priority to PCT/CN2020/116313 priority Critical patent/WO2022056875A1/zh
Priority to CN202080105178.6A priority patent/CN116134481A/zh
Publication of WO2022056875A1 publication Critical patent/WO2022056875A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition

Definitions

  • The present invention relates to the technical field of image processing, and in particular to a method, an apparatus and a computer-readable storage medium for segmenting a nameplate image.
  • A nameplate, also known as a sign plate, mainly records the technical data specified by the equipment manufacturer for the rated operating conditions, so that the equipment can be used correctly without being damaged.
  • The materials used to make nameplates usually include metals and non-metals: metals such as zinc alloy, copper, iron, aluminum or stainless steel; non-metals such as plastic, acrylic board, PVC, PC or paper.
  • Electrical and electronic equipment usually carries a nameplate recording various attributes of the equipment. For example, the nameplate attached to a transformer usually records many of the transformer's electrical properties.
  • The nameplate can be photographed to obtain a nameplate image, the content of the nameplate image can then be extracted automatically with Optical Character Recognition (OCR) technology, and that content can be used for data analysis (for example, forecasting electricity-consumption volumes) or related modeling (for example, an equipment health model).
  • However, many nameplates contain both tables and text.
  • Text in the nameplate image often lies close to a table, so during OCR processing text and table that are close to each other are easily confused as the same object, which degrades the OCR result.
  • Embodiments of the present invention provide a method, an apparatus and a computer-readable storage medium for segmenting a nameplate image.
  • A method for segmenting a nameplate image comprises: converting a nameplate image containing a nameplate into a binary image; detecting a text area in the binary image; setting the pixel value of every pixel in the text area to a predetermined same value; performing edge detection on the binary image to determine a table area in the binary image; and segmenting the nameplate image based on the text area and the table area.
  • In the embodiments of the present invention, the text area in the binary image is detected first and the pixel value of every pixel in that area is then set to the same value, so that when edge detection is subsequently performed on the binary image the table area can be determined accurately. The nameplate image can then be segmented based on the text area and the table area, separating the text area from the table area in the nameplate image.
  • In one embodiment, when the nameplate image containing the nameplate is an RGB image, converting the nameplate image into a binary image comprises converting the RGB image into a grayscale image and converting the grayscale image into the binary image; or, when the nameplate image containing the nameplate is a grayscale image, converting the nameplate image into a binary image comprises converting the grayscale image into the binary image.
  • Hence the nameplate image can be an RGB image or a grayscale image, so the method has a wide range of applications.
  • In one embodiment, detecting the text area in the binary image comprises detecting the text area using the maximally stable extremal regions (MSER) method.
  • Embodiments of the present invention can therefore identify the text region accurately based on the MSER method.
  • In one embodiment, setting the pixel value of each pixel in the text area to a predetermined same value comprises setting the pixel value of each pixel in the text area to 1, or setting the pixel value of each pixel in the text area to 0.
  • Setting the pixel value of every pixel in the text area to 1 or to 0 turns the text area into a uniform white or black region, which prevents it from interfering with the detection of the table area.
  • In one embodiment, performing edge detection on the binary image to determine the table area comprises performing edge detection on the binary image to determine N table areas, where N is a positive integer greater than or equal to 1; and segmenting the nameplate image based on the text area and the table area comprises dividing the nameplate image into a first sub-image and N second sub-images, where the first sub-image contains the text area and each second sub-image contains a corresponding table area.
  • the embodiments of the present invention can generate sub-images corresponding to the table area and the text area.
  • In one embodiment, before converting the nameplate image containing the nameplate into a binary image, the method further comprises: converting an original image containing the nameplate into a grayscale image; performing edge detection on the grayscale image to determine the edge of the nameplate; determining a perspective transformation matrix based on the vertex coordinates of a quadrilateral enclosing the edge and the vertex coordinates of the nameplate image; and generating the nameplate image based on the perspective transformation matrix.
  • Embodiments of the present invention thus determine a perspective transformation matrix from the nameplate edge found by edge detection and use that matrix to generate a corrected image of the original image (namely the nameplate image). The nameplate figure remains undistorted after the perspective transformation, which overcomes the distortion drawback of the Hough transform and improves the accuracy of image correction.
  • In one embodiment, generating the nameplate image based on the perspective transformation matrix comprises either transforming the coordinates of every pixel inside the quadrilateral, or separating the original image into R, G and B channels and transforming the coordinates of every pixel in each channel before merging the corrected channels.
  • By transforming the coordinates of each pixel in the quadrilateral enclosing the nameplate edge, a grayscale rectified image corresponding to the quadrilateral can be generated; by transforming the coordinates of each pixel in the R channel, G channel and B channel of the original image, a rectified image with the RGB colors of the original image can be generated.
  • a device for segmenting a nameplate image comprising:
  • a text area detection module for detecting the text area in the binary image
  • a setting module for setting the pixel value of each pixel in the text area to a predetermined same value
  • a table area detection module for performing edge detection on the binary image to determine a table area in the binary image
  • a segmentation module configured to segment the nameplate image based on the text area and the table area.
  • In the apparatus, the text area in the binary image is detected first and the pixel value of every pixel in that area is set to the same value, so that edge detection on the binary image can accurately determine the table area; the nameplate image can then be segmented based on the text area and the table area, separating the text area from the table area in the nameplate image.
  • In one embodiment, the conversion module is configured to convert the RGB image into a grayscale image and the grayscale image into a binary image when the nameplate image containing the nameplate is an RGB image, and to convert the grayscale image into a binary image when the nameplate image containing the nameplate is a grayscale image.
  • Hence the nameplate image can be an RGB image or a grayscale image, so the apparatus has a wide range of applications.
  • In one embodiment, the text area detection module is configured to detect the text area in the binary image using the maximally stable extremal regions (MSER) method.
  • Embodiments of the present invention can therefore identify the text region accurately based on the MSER method.
  • the setting module is configured to set the pixel value of each pixel in the text area to 1, or set the pixel value of each pixel in the text area to 0.
  • Setting the pixel value of every pixel in the text area to 1 or to 0 turns the text area into a uniform white or black region, which prevents it from interfering with the detection of the table area.
  • In one embodiment, the table area detection module is configured to perform edge detection on the binary image to determine N table areas, where N is a positive integer greater than or equal to 1; and the segmentation module is configured to divide the nameplate image into a first sub-image and N second sub-images, where the first sub-image contains the text area and each second sub-image contains a corresponding table area.
  • the embodiments of the present invention can generate sub-images corresponding to the table area and the text area.
  • the apparatus further includes:
  • a correction module configured, before the conversion module converts the nameplate image containing the nameplate into a binary image, to: convert an original image containing the nameplate into a grayscale image; perform edge detection on the grayscale image to determine the edge of the nameplate; determine a perspective transformation matrix based on the vertex coordinates of a quadrilateral enclosing the edge and the vertex coordinates of the nameplate image; and generate the nameplate image based on the perspective transformation matrix.
  • The correction module thus determines a perspective transformation matrix from the nameplate edge found by edge detection and uses that matrix to generate a corrected image of the original image (namely the nameplate image). The nameplate figure remains undistorted after the perspective transformation, which overcomes the distortion drawback of the Hough transform and improves the accuracy of image correction.
  • In one embodiment, the correction module is configured to determine the coordinates of each pixel in the quadrilateral, determine the transformed coordinates of each pixel from the product of its coordinates and the perspective transformation matrix, and copy each pixel to its transformed coordinates to generate the nameplate image; or to determine the coordinates of each pixel in the original image, determine the transformed coordinates of each pixel from the product of its coordinates and the perspective transformation matrix, separate the original image into an R channel, a G channel and a B channel, copy each pixel of the R channel, the G channel and the B channel to its transformed coordinates to generate a corrected R channel, a corrected G channel and a corrected B channel respectively, and merge the corrected R channel, the corrected G channel and the corrected B channel into the nameplate image.
  • By transforming the coordinates of each pixel in the quadrilateral enclosing the nameplate edge, a grayscale rectified image corresponding to the quadrilateral can be generated; by transforming the coordinates of each pixel in the R channel, G channel and B channel of the original image, a rectified image with the RGB colors of the original image can be generated.
  • a device for segmenting a nameplate image comprising: a processor and a memory;
  • An application program executable by the processor is stored in the memory, so as to cause the processor to execute the method for segmenting a nameplate image as described above.
  • the embodiment of the present invention also proposes a nameplate image segmentation device with a memory-processor architecture, so as to realize the separation of the text area and the table area in the nameplate image.
  • a computer-readable storage medium storing computer-readable instructions for performing the method for segmenting a nameplate image as described in any one of the above.
  • the embodiments of the present invention also provide a computer-readable storage medium containing computer-readable instructions, so as to realize the separation of the text area and the table area in the nameplate image.
  • FIG. 1 is a flowchart of a method for segmenting a nameplate image according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of a nameplate image including a text area and a table area according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an exemplary segmentation of a nameplate image including a text area and a table area according to an embodiment of the present invention.
  • FIG. 4 is an exemplary schematic diagram of rectifying an original image to generate a nameplate image according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of an original image containing a transformer nameplate according to an embodiment of the present invention.
  • FIG. 6 is a schematic diagram of an original image including a transformer nameplate after correction according to an embodiment of the present invention.
  • FIG. 7 is a configuration diagram of an apparatus for dividing a nameplate image according to an embodiment of the present invention.
  • FIG. 8 is a structural diagram of an apparatus for segmenting a nameplate image with a memory-processor architecture according to an embodiment of the present invention.
  • Considering that text lying close to a table in a nameplate image degrades the OCR result, the applicant proposes a technical solution for segmenting the nameplate image: dividing the nameplate image into a text area and a table area helps to improve the OCR result.
  • FIG. 1 is a flowchart of a method for segmenting a nameplate image according to an embodiment of the present invention.
  • the method includes:
  • Step 101 Convert the nameplate image including the nameplate into a binary image (Binary Image).
  • the nameplate image is a captured image of the device nameplate or a processed image of the captured image.
  • the nameplate usually records the technical data determined by the equipment manufacturer under the rated working conditions of the equipment.
  • For example, the nameplate image may be an image of the nameplate of a piece of electrical or electronic equipment taken on site, or a historical nameplate image of the same type of equipment obtained from a database (such as a local database or a cloud database) or from a third-party storage medium. The nameplate image contains the nameplate as the photographed subject.
  • In one embodiment, when the nameplate image containing the nameplate is an RGB image, converting the nameplate image into a binary image in step 101 comprises converting the RGB image into a grayscale image and then converting the grayscale image into a binary image.
  • In one embodiment, when the nameplate image containing the nameplate is a grayscale image, converting the nameplate image into a binary image in step 101 comprises converting the grayscale image into a binary image.
  • The RGB image can be converted into a grayscale image using, for example, the floating-point method, the integer method, the shift method, the average method, the green-only method or a Gamma-correction algorithm. A grayscale image represents each image point by a shade between black and white.
  • If the color of a point in the RGB image is (R, G, B), typical conversions to a gray value are: floating-point method Gray = R*0.3 + G*0.59 + B*0.11; integer method Gray = (R*30 + G*59 + B*11)/100; shift method Gray = (R*77 + G*151 + B*28) >> 8; average method Gray = (R + G + B)/3; green-only method Gray = G; or a Gamma-correction-based weighting.
  • A binary image is represented only by black (0) and white (1). Transforming a grayscale image with gray values from 0 to 255 into a binary image with pixel values of 0 or 1 is called binarization. The principle is to set a threshold, for example 128, traverse every pixel of the grayscale image and set the pixel to white (1) if its gray value is greater than the threshold and to black (0) otherwise. A minimal sketch of these two steps is given below.
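  • For illustration only (the patent does not prescribe any particular library), the following Python/OpenCV sketch performs the two conversions just described: the weighted floating-point grayscale formula followed by a fixed-threshold binarization. The file name and the threshold of 128 are assumptions for the example, not requirements of the method.

```python
import cv2
import numpy as np

# Load a nameplate photo (the path is a placeholder for this example).
bgr = cv2.imread("nameplate.jpg")                  # OpenCV loads color images as B, G, R
b, g, r = cv2.split(bgr.astype(np.float32))

# Floating-point method: Gray = R*0.3 + G*0.59 + B*0.11
gray = (0.3 * r + 0.59 * g + 0.11 * b).astype(np.uint8)

# Binarization: pixels brighter than the threshold become white (1), the rest black (0).
threshold = 128
binary = (gray > threshold).astype(np.uint8)       # values in {0, 1}

# Equivalent using OpenCV's own thresholding (scaled to {0, 255} for display):
# _, binary_255 = cv2.threshold(gray, threshold, 255, cv2.THRESH_BINARY)
cv2.imwrite("binary.png", binary * 255)
```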
  • Step 102 Detect the text area in the binary image.
  • In one embodiment, detecting the text region in the binary image comprises detecting the text region using the maximally stable extremal regions (MSER) method. MSER can be used to roughly locate text regions in an image.
  • However, the MSER algorithm alone may produce multiple rectangles that contain one another. Preferably, the text region is therefore detected with a combination of MSER and non-maximum suppression (NMS). NMS is an algorithm that frequently accompanies region detection in images; it removes duplicate regions and suppresses boxes that are not the largest box, that is, it removes small rectangles contained in larger rectangles. A sketch of this combination follows.
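  • The patent does not tie the MSER/NMS idea to a specific implementation; the sketch below uses OpenCV's MSER detector (an assumed, concrete choice) and a simple containment-style suppression that discards boxes nested inside larger boxes.

```python
import cv2

def detect_text_boxes(binary_img):
    """Roughly locate text regions with MSER, then drop boxes nested inside larger ones."""
    mser = cv2.MSER_create()
    # detectRegions returns the region point sets and their bounding boxes (x, y, w, h).
    _, boxes = mser.detectRegions(binary_img * 255)      # MSER expects an 8-bit intensity image

    def contains(outer, inner):
        ox, oy, ow, oh = outer
        ix, iy, iw, ih = inner
        return ox <= ix and oy <= iy and ix + iw <= ox + ow and iy + ih <= oy + oh

    # NMS-style suppression: keep a box only if no other box fully contains it.
    kept = []
    for i, box in enumerate(boxes):
        nested = any(i != j and contains(other, box) for j, other in enumerate(boxes))
        if not nested:
            kept.append(tuple(box))
    return kept
```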
  • Step 103 Set the pixel value of each pixel in the text area to a predetermined same value.
  • In one embodiment, setting the pixel value of the text area to a predetermined same value comprises setting the pixel value of the text area to 1 or to 0. The text area thereby becomes a uniform white or black region and does not interfere with the detection of the table area.
  • Step 104: Perform edge detection on the binary image to determine the table area.
  • Because the pixel value of every pixel in the text area of the binary image was set to the same predetermined value in step 103, the text area (now a uniform white or black region) does not interfere with the detection of the table area; a small sketch of this masking step follows.
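  • A minimal sketch of this masking step, assuming the text boxes come from a detector such as the one above and the binary image uses values in {0, 1}:

```python
def mask_text_regions(binary_img, text_boxes, fill_value=1):
    """Overwrite every pixel inside each detected text box with one uniform value."""
    masked = binary_img.copy()
    for x, y, w, h in text_boxes:
        masked[y:y + h, x:x + w] = fill_value    # 1 turns the area white, 0 turns it black
    return masked
```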
  • the performing edge detection on the binary image to determine the table region includes: performing edge detection on the binary image to determine N table regions, where N is a positive integer greater than or equal to 1;
  • the dividing the nameplate image based on the text area and the table area includes: dividing the nameplate image into a first sub-image containing the text area and N second sub-images, wherein each second sub-image contains the corresponding table area.
  • The purpose of edge detection is to identify points in an image where the brightness changes markedly; significant changes in image properties usually reflect important events and changes in those properties.
  • By performing edge detection on the binary image, the edges of the table contained in the binary image can be determined.
  • Specifically, an edge is a set of pixels around which the gray level changes sharply. Edges exist between objects, backgrounds and regions, so edges are the basis on which image segmentation relies. After edge detection is performed on the binary image, the edges of the table area are returned.
  • There are many methods for edge detection, which can be roughly divided into two categories: search-based and zero-crossing-based.
  • In a search-based edge detection method, the edge strength is first computed, usually expressed by a first-order derivative such as the gradient magnitude; the local direction of the edge is then estimated, usually as the gradient direction, and this direction is used to find the local maximum of the gradient magnitude.
  • In a zero-crossing-based method, the zero crossings of a second derivative computed from the image are found to locate the edges, usually the zero crossings of the Laplacian or of a nonlinear differential expression.
  • Commonly used edge detection operators include the Laplacian, Roberts, Sobel, LoG (Laplacian of Gaussian), Kirsch and Prewitt operators, among others. A sketch of one possible table detection step built on edge detection and contour extraction follows.
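  • As one concrete possibility (not the only one compatible with the description), table regions can be recovered as the bounding boxes of large contours found in a Canny edge map; the minimum-area constant is an arbitrary example value, and the two-value return of findContours assumes OpenCV 4.

```python
import cv2

def detect_table_boxes(masked_binary, min_area=5000):
    """Find table regions as bounding boxes of large external contours."""
    img = masked_binary * 255                        # back to 8-bit for OpenCV
    edges = cv2.Canny(img, 50, 150)                  # example thresholds
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    boxes = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        if w * h >= min_area:                        # keep only table-sized regions
            boxes.append((x, y, w, h))
    return boxes
```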
  • Step 105 Segment the nameplate image based on the text area and the table area.
  • the nameplate image is segmented according to the range of the text area determined in step 102 and the range of the table area determined in step 104 .
  • the nameplate image in step 105 is the nameplate image before being converted into a binary image in step 101 , or a duplicate image of the nameplate image before being converted into a binary image in step 101 .
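  • Continuing the hypothetical sketches above (this is not the patent's mandated procedure), step 105 can be realized by cropping the original, pre-binarization nameplate image: each table box yields one second sub-image, and the bounding box around all text boxes is one simple choice for the first sub-image.

```python
def split_nameplate(nameplate_img, text_boxes, table_boxes):
    """Return (first_sub_image, list_of_second_sub_images) cut out of the original image."""
    # Second sub-images: one crop per detected table region.
    tables = [nameplate_img[y:y + h, x:x + w] for x, y, w, h in table_boxes]

    # First sub-image: the bounding box around all detected text boxes.
    xs = [x for x, _, w, _ in text_boxes] + [x + w for x, _, w, _ in text_boxes]
    ys = [y for _, y, _, h in text_boxes] + [y + h for _, y, _, h in text_boxes]
    first = nameplate_img[min(ys):max(ys), min(xs):max(xs)]
    return first, tables
```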
  • FIG. 2 is a schematic diagram of a nameplate image including a text area and a table area according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an exemplary segmentation of a nameplate image including a text area and a table area according to an embodiment of the present invention.
  • the nameplate image 35 includes a text area 30 and a table area 40 .
  • the image segmentation process shown in FIG. 1 is performed on the nameplate image 35 , and the first sub-image 50 and the second sub-image 60 can be obtained.
  • When the nameplate image 35 contains a plurality of tables, a plurality of second sub-images 60 may be generated, each second sub-image 60 containing its own corresponding table.
  • OCR processing may be performed on the first sub-image 50 and the second sub-image 60 respectively. Since the text and the table are no longer confused as the same object, the recognition accuracy of performing the OCR processing on the first sub-image 50 and the second sub-image 60 respectively at this time is significantly better than the recognition accuracy of performing the OCR processing on the nameplate image 35 .
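  • If a concrete OCR engine is wanted for experimentation, pytesseract (a wrapper that assumes the Tesseract engine is installed) is one option; the patent itself does not name an OCR tool, and the language setting here is only an example.

```python
import pytesseract

def ocr_sub_images(first_sub_image, second_sub_images, lang="eng"):
    """Run OCR separately on the text sub-image and on each table sub-image."""
    text_result = pytesseract.image_to_string(first_sub_image, lang=lang)
    table_results = [pytesseract.image_to_string(t, lang=lang) for t in second_sub_images]
    return text_result, table_results
```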
  • The applicant has also found that when the nameplate is photographed at a tilted angle, the nameplate in the captured original image is correspondingly tilted, and OCR then has difficulty extracting the nameplate content accurately. At present, the Hough transform is usually used to determine the rotation angle of the nameplate in the nameplate image, and the nameplate is then transformed to a suitable position based on that rotation angle in order to correct the nameplate image.
  • However, during correction the Hough transform can only determine the direction of a straight line and loses the length information of the line segment, so the image is easily distorted and the correction result is poor.
  • In one embodiment, before converting the nameplate image containing the nameplate into a binary image, the method further comprises: converting the original image containing the nameplate (that is, the nameplate image before correction) into a grayscale image; performing edge detection on that grayscale image to determine the edge of the nameplate; determining a perspective transformation matrix based on the vertex coordinates of a quadrilateral enclosing the edge and the vertex coordinates of the nameplate image; and generating the nameplate image (that is, the corrected nameplate image) based on the perspective transformation matrix. The method flow shown in FIG. 1 can then be applied to the corrected nameplate image to perform image segmentation.
  • Preferably, the method also includes a process of determining the quadrilateral enclosing the edge: among the set of all quadrilaterals that enclose the edge, the quadrilateral with the shortest perimeter is selected.
  • The perspective transformation matrix is then determined based on the vertex coordinates of this shortest-perimeter quadrilateral and the vertex coordinates of the nameplate image; a sketch of one way to approximate this quadrilateral is given below.
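  • The patent asks for the enclosing quadrilateral with the shortest perimeter; the sketch below is only a common practical approximation of that step (largest contour of the edge map, reduced to four vertices), not the patented selection rule itself.

```python
import cv2
import numpy as np

def find_nameplate_quad(gray_img):
    """Approximate the nameplate outline with a four-vertex polygon."""
    edges = cv2.Canny(gray_img, 50, 150)
    contours, _ = cv2.findContours(edges, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contour = max(contours, key=cv2.contourArea)               # assume the nameplate dominates the scene
    perimeter = cv2.arcLength(contour, True)
    quad = cv2.approxPolyDP(contour, 0.02 * perimeter, True)   # 4 points if the outline is clean
    if len(quad) != 4:                                         # fall back to the minimum-area bounding box
        quad = cv2.boxPoints(cv2.minAreaRect(contour))
    return np.asarray(quad, dtype=np.float32).reshape(4, 2)
```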
  • Perspective transformation uses the condition that the perspective center, the image point and the target point are collinear and, according to the law of perspective rotation, rotates the image-bearing plane (the perspective plane) around the trace line (the perspective axis) by a certain angle. This destroys the original bundle of projection rays while keeping the projected geometry on the image-bearing plane unchanged.
  • In the perspective transformation, [x, y] are the two-dimensional coordinates of a pixel in the corrected nameplate image; [u, v, w] are the three-dimensional (homogeneous) coordinates of the pixel in the original image before correction, where w is usually equal to 1; and the three-dimensional coordinates of the pixel in the corrected nameplate image can be written as [x, y, 1].
  • The corrected nameplate image is usually a rectangle, and the coordinates of its four vertices are known, for example (0, 0, 1), (0, h, 1), (w, h, 1) and (w, 0, 1), where w is the width and h the height of the nameplate image.
  • From the four (known) vertex coordinates of the quadrilateral enclosing the edge and the four (known) vertex coordinates of the nameplate image, eight equations can be constructed, from which the values of a11, a12, a13, a21, a22, a23, a31 and a32 of the perspective transformation matrix are calculated. With a33 set to 1, the perspective transformation matrix is then uniquely determined; a sketch of this computation follows.
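  • For illustration only, the 3×3 matrix can be obtained from the four point correspondences with OpenCV, or by solving the eight linear equations directly. Note that OpenCV works with the column-vector convention M·[u, v, 1]^T, whereas the description above is written in the row-vector form; the two matrices are transposes of each other.

```python
import cv2
import numpy as np

def perspective_matrix(quad_pts, width, height):
    """3x3 transform mapping the nameplate quadrilateral onto a width x height rectangle."""
    src = np.float32(quad_pts)                                             # J, K, M, N in a consistent order
    dst = np.float32([[0, 0], [0, height], [width, height], [width, 0]])   # matching corner order
    return cv2.getPerspectiveTransform(src, dst)                           # a33 is normalized to 1
```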
  • Generating the nameplate image based on the perspective transformation matrix can be done in one of the following two ways.
  • Mode (1): determine the coordinates of each pixel in the quadrilateral (three-dimensional coordinates, with the w value set to 1); determine the transformed coordinates of each pixel from the product of its coordinates and the perspective transformation matrix; and copy each pixel to its transformed coordinates to generate the nameplate image.
  • By transforming the coordinates of each pixel in the quadrilateral enclosing the nameplate edge in this way, a grayscale rectified image corresponding to the quadrilateral is generated, so embodiments of the present invention also realize a corrected nameplate image in grayscale form. A literal sketch of this pixel-copying step follows.
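  • A deliberately literal sketch of mode (1): multiply each pixel's homogeneous coordinate by the matrix (OpenCV's column-vector convention, matching the helper above), normalize, and copy the gray value to the rounded target position. cv2.warpPerspective performs the same mapping far more efficiently and with proper interpolation; this loop only mirrors the description.

```python
import numpy as np

def warp_gray_by_copying(gray_img, matrix, out_w, out_h):
    """Copy every source pixel to its transformed coordinates (nearest neighbour, no interpolation)."""
    out = np.zeros((out_h, out_w), dtype=gray_img.dtype)
    rows, cols = gray_img.shape
    for v in range(rows):                    # v: row (y) in the source image
        for u in range(cols):                # u: column (x) in the source image
            x, y, s = matrix @ np.array([u, v, 1.0])     # homogeneous coordinate product
            x, y = int(round(x / s)), int(round(y / s))
            if 0 <= x < out_w and 0 <= y < out_h:
                out[y, x] = gray_img[v, u]
    return out
```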
  • Mode (2): determine the coordinates of each pixel in the original image; determine the transformed coordinates of each pixel from the product of its coordinates and the perspective transformation matrix; separate the original image into an R channel, a G channel and a B channel; copy each pixel of the R channel to its transformed coordinates to generate a corrected R channel, copy each pixel of the G channel to its transformed coordinates to generate a corrected G channel, and copy each pixel of the B channel to its transformed coordinates to generate a corrected B channel; and merge the corrected R channel, the corrected G channel and the corrected B channel into the rectified image.
  • Pixels at the same position in the R channel, the G channel and the B channel have the same transformed coordinates. A sketch of this channel-wise warping is given below.
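  • Mode (2) sketched the same way, but using cv2.warpPerspective for the per-channel warp instead of the literal pixel-by-pixel copy above (either would do): split the color image into its B, G and R channels, warp each channel with the same matrix, and merge the corrected channels back into one corrected image.

```python
import cv2

def warp_color_by_channels(bgr_img, matrix, out_w, out_h):
    """Split into B, G, R channels, warp each channel with the same matrix, then merge."""
    b, g, r = cv2.split(bgr_img)                       # OpenCV stores color images as B, G, R
    corrected = [cv2.warpPerspective(c, matrix, (out_w, out_h)) for c in (b, g, r)]
    return cv2.merge(corrected)                        # corrected B, G, R merged back together
```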
  • As an example, suppose a color original image A needs to be rectified. First, the transformed coordinates of every pixel in original image A are determined from the product of each pixel's coordinates and the perspective transformation matrix. Suppose original image A contains 100 pixels, where the coordinates of pixel 1 correspond to transformed coordinates K1, the coordinates of pixel 2 correspond to transformed coordinates K2, the coordinates of pixel 3 correspond to transformed coordinates K3, ..., and the coordinates of pixel 100 correspond to transformed coordinates K100.
  • Original image A is then separated into three channels: the R channel, the G channel and the B channel of original image A.
  • Next, every pixel of the R channel of original image A is copied to its transformed coordinates to generate the corrected R channel: pixel 1 of the R channel is copied to transformed coordinates K1 in the corrected R channel, pixel 2 to K2, pixel 3 to K3, ..., and pixel 100 to K100, thereby forming the corrected R channel.
  • Every pixel of the G channel of original image A is copied to its transformed coordinates to generate the corrected G channel: pixel 1 of the G channel is copied to transformed coordinates K1 in the corrected G channel, pixel 2 to K2, pixel 3 to K3, ..., and pixel 100 to K100, thereby forming the corrected G channel.
  • Every pixel of the B channel of original image A is copied to its transformed coordinates to generate the corrected B channel: pixel 1 of the B channel is copied to transformed coordinates K1 in the corrected B channel, pixel 2 to K2, pixel 3 to K3, ..., and pixel 100 to K100, thereby forming the corrected B channel.
  • Finally, the corrected R channel, the corrected G channel and the corrected B channel are merged into the corrected nameplate image.
  • By transforming the coordinates of each pixel in the R channel, G channel and B channel of the original image in this way, a rectified image with the RGB colors of the original image is generated, so embodiments of the present invention also realize a corrected nameplate image in RGB color form.
  • In one embodiment, between converting the original image containing the nameplate into a grayscale image and performing edge detection on the grayscale image to determine the edge of the nameplate, the method further comprises: increasing the contrast of the grayscale image; and performing noise reduction on the contrast-enhanced grayscale image.
  • Specifically, an image enhancement method based on histogram equalization can be used to increase the contrast of the grayscale image; its basic idea is to remap the gray levels so that the gray-level distribution of the whole image becomes roughly uniform. One possible realization of these two steps is sketched below.
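  • One possible realization of these two optional steps (among many): histogram equalization for the contrast increase, followed by a mild Gaussian blur as an example denoiser.

```python
import cv2

def enhance_gray(gray_img):
    """Increase contrast with histogram equalization, then reduce noise slightly."""
    equalized = cv2.equalizeHist(gray_img)           # remaps gray levels toward a uniform distribution
    return cv2.GaussianBlur(equalized, (3, 3), 0)    # light smoothing; cv2.fastNlMeansDenoising also works
```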
  • FIG. 4 is an exemplary schematic diagram of rectifying an original image to generate a nameplate image according to an embodiment of the present invention.
  • After the nameplate outline 20 has been determined, the quadrilateral with the shortest perimeter among the set of quadrilaterals enclosing edge 20 is determined; assume it is the quadrilateral JKMN (usually an irregular quadrilateral). Once JKMN is determined, the coordinates of its four vertices J, K, M and N are known.
  • The nameplate image obtained after correction (which subsequently enters the segmentation flow shown in FIG. 1) is a rectangle of a predetermined size, and the coordinates of its four vertices A, B, C and D are known.
  • Therefore, based on the correspondence between the coordinates of J, K, M, N and the coordinates of A, B, C, D, the perspective transformation matrix can be calculated. Using this matrix, every pixel in the quadrilateral JKMN can be transformed to the corresponding coordinates in the nameplate image ABCD, thereby achieving the correction.
  • FIG. 5 is a schematic diagram of an original image containing a transformer nameplate according to an embodiment of the present invention, and FIG. 6 is a schematic diagram of that original image after correction. The transformer nameplate image in FIG. 5 is tilted and includes a background pattern from the photograph; in FIG. 6 the tilt has been corrected and the background pattern is no longer included, which facilitates the subsequent OCR operations.
  • an embodiment of the present invention also proposes an apparatus for segmenting a nameplate image.
  • FIG. 7 is a block diagram of an apparatus for dividing a nameplate image according to an embodiment of the present invention.
  • the segmentation device 700 of the nameplate image includes:
  • a conversion module 702 configured to convert the nameplate image including the nameplate into a binary image
  • a text area detection module 703, configured to detect the text area in the binary image
  • a setting module 704 configured to set the pixel value of each pixel in the text area to a predetermined same value
  • a table area detection module 705, configured to perform edge detection on the binary image to determine the table area
  • a segmentation module 706, configured to segment the nameplate image based on the text area and the table area.
  • the conversion module 702 is configured to convert the RGB image into a grayscale image when the nameplate image including the nameplate is an RGB image; convert the grayscale image into a binary image; When the nameplate image including the nameplate is a grayscale image, the grayscale image is converted into a binary image.
  • the text area detection module 703 is configured to detect the text area in the binary image by adopting the maximum stable extreme value area method.
  • the setting module 704 is configured to set the pixel value of each pixel in the text area to 1, or set the pixel value of each pixel in the text area to 0.
  • In one embodiment, the table area detection module 705 is configured to perform edge detection on the binary image to determine N table areas, where N is a positive integer greater than or equal to 1, and the segmentation module 706 is configured to divide the nameplate image into a first sub-image containing the text area and N second sub-images, each second sub-image containing a corresponding table area.
  • the apparatus 700 further includes:
  • a correction module 701 configured, before the conversion module 702 converts the nameplate image containing the nameplate into a binary image, to convert the original image containing the transformer nameplate into a grayscale image, perform edge detection on the grayscale image to determine the edge of the nameplate, determine a perspective transformation matrix based on the vertex coordinates of a quadrilateral enclosing the edge and the vertex coordinates of the nameplate image, and generate the nameplate image based on the perspective transformation matrix.
  • In one embodiment, the correction module 701 is configured to determine the coordinates of each pixel in the quadrilateral, determine the transformed coordinates of each pixel from the product of its coordinates and the perspective transformation matrix, and copy each pixel to its transformed coordinates to generate the nameplate image; or to determine the coordinates of each pixel in the original image, determine the transformed coordinates of each pixel from the product of its coordinates and the perspective transformation matrix, separate the original image into an R channel, a G channel and a B channel, copy each pixel of the R channel, the G channel and the B channel to its transformed coordinates to generate a corrected R channel, a corrected G channel and a corrected B channel respectively, and merge the corrected R channel, the corrected G channel and the corrected B channel into the nameplate image.
  • an embodiment of the present invention also proposes an apparatus for segmenting a nameplate image with a memory-processor architecture.
  • FIG. 8 is a block diagram of an apparatus for segmenting a nameplate image with a memory-processor architecture according to an embodiment of the present invention.
  • An apparatus 800 for segmenting a nameplate image includes a processor 801, a memory 802 and a computer program stored in the memory 802 and executable on the processor 801; when the computer program is executed by the processor 801, it implements any one of the nameplate image segmentation methods described above.
  • The memory 802 may specifically be implemented as any of various storage media such as an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Flash memory or a Programmable Read-Only Memory (PROM).
  • the processor 801 may be implemented to include one or more central processing units or one or more field programmable gate arrays, wherein the field programmable gate arrays integrate one or more central processing unit cores.
  • the central processing unit or the central processing unit core may be implemented as a CPU or an MCU or a DSP or the like.
  • the hardware modules in various embodiments may be implemented mechanically or electronically.
  • a hardware module may include specially designed permanent circuits or logic devices (eg, special purpose processors, such as FPGAs or ASICs) for performing specific operations.
  • Hardware modules may also include programmable logic devices or circuits (eg, including general-purpose processors or other programmable processors) temporarily configured by software for performing particular operations.
  • the present invention also provides a machine-readable storage medium storing instructions for causing a machine to perform a method as described herein.
  • A system or apparatus equipped with a storage medium may be provided, on which software program code realizing the functions of any of the above embodiments is stored, and the computer (or the CPU or MPU) of the system or apparatus reads out and executes the program code stored in the storage medium.
  • In addition, part or all of the actual operations may be completed by an operating system or the like running on the computer, based on the instructions of the program code.
  • The program code read from the storage medium may also be written into a memory provided in an expansion board inserted into the computer or in an expansion unit connected to the computer, and the instructions of the program code then cause a CPU or the like installed on the expansion board or expansion unit to perform part or all of the actual operations, thereby realizing the functions of any of the above embodiments.
  • Embodiments of storage media for providing program code include floppy disks, hard disks, magneto-optical disks, optical disks (eg, CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, DVD+RW), Magnetic tapes, non-volatile memory cards and ROMs.
  • the program code may be downloaded from a server computer or cloud over a communications network.

Abstract

A method and apparatus for segmenting a nameplate image, and a computer-readable storage medium. The method comprises: converting a nameplate image containing a nameplate into a binary image (101); detecting a text area in the binary image (102); setting the pixel value of each pixel in the text area to a predetermined same value (103); performing edge detection on the binary image to determine a table area in the binary image (104); and segmenting the nameplate image on the basis of the text area and the table area (105). The method can segment a nameplate image into a text area and a table area, thereby improving the accuracy of subsequent optical character recognition, and can also rectify the nameplate image containing the nameplate, improving the rectification accuracy.

Description

一种铭牌图像的分割方法、装置和计算机可读存储介质 技术领域
本发明涉及图像处理技术领域,特别是涉及一种铭牌图像的分割方法、装置和计算机可读存储介质。
背景技术
铭牌(nameplate)又称标牌,主要用来记载设备生产厂家及额定工作情况下的技术数据,以供正确使用而不致损坏设备。制作铭牌的材料通常包括金属类和非金属类,其中金属类有锌合金、铜、铁、铝或不锈钢等;非金属类有塑料、亚克力有机板、PVC、PC或纸等。电子电气设备上通常附着有记录设备的各种属性信息的铭牌。比如,附加到变压器上的变压器铭牌通常记录有变压器的诸多电属性。
可以拍摄铭牌以获取铭牌图像,然后利用光学字符识别(Optical Character Recognition,OCR)技术自动提取铭牌图像中的内容,并利用这些内容执行各自数据分析(比如,用电数据量预测)或相关建模(比如,设备的健康度模型)。
然而,很多铭牌同时包含表格和文字。铭牌图像中的文字与表格靠近,导致OCR处理时容易将相互靠近的文字和表格混淆为同一个物体,从而影响OCR效果。
发明内容
本发明实施方式提出一种铭牌图像的分割方法、装置和计算机可读存储介质。
本发明实施方式的技术方案如下:
一种铭牌图像的分割方法,该方法包括:
将包含铭牌的铭牌图像转换为二值图像;
检测所述二值图像中的文本区域;
将所述文本区域中的每个像素点的像素值设置为预定的相同值;
对所述二值图像执行边缘检测以确定所述二值图像中的表格区域;
基于所述文本区域和所述表格区域分割所述铭牌图像。
可见,在本发明实施方式中,首先检测二值图像中的文本区域,再将文本区域中的每个像素点的像素值设置为相同值,从而对二值图像执行边缘检测时可以准确地确定出表格区域,然后可以基于文本区域和表格区域分割铭牌图像,实现分离铭牌图像中的文字区域与表格区域。
在一个实施方式中,当所述包含铭牌的铭牌图像为RGB图像时,所述将包含铭牌的铭牌图像转换为二值图像包括:将所述RGB图像转换为灰度图像;将所述灰度图像转换为所述二值图像;或,当所述包含铭牌的铭牌图像为灰度图像时,所述将包含铭牌的铭牌图像转换为二值图像包括:将所述灰度图像转换为所述二值图像。
可见,在本发明实施方式中,铭牌图像可以为RGB图像或灰度图像,适用范围广泛。
在一个实施方式中,所述检测所述二值图像中的文本区域包括:采用最大稳定极值区域方式检测所述二值图像中的文本区域。
因此,本发明实施方式基于最大稳定极值区域方式可以准确识别出文本区域。
在一个实施方式中,所述将文本区域中的每个像素点的像素值设置为预定的相同值包括:将所述文本区域中的每个像素点的像素值设置为1,或将所述文本区域中的每个像素点的像素值设置为0。
因此,本发明实施方式将文本区域中的每个像素点的像素值设置为1,实现将文本区域设置为白色区域或黑色区域,避免对表格区域的检测过程造成干扰。
在一个实施方式中,所述对二值图像执行边缘检测以确定所述二值图像中的表格区域包括:对所述二值图像执行边缘检测以确定出N个表格区域,其中N为大于等于1的正整数;所述基于文本区域和表格区域分割所述铭牌图像包括:将所述铭牌图像分割为第一子图像和N个第二子图像,其中所述第一子图像包含文本区域,每个第二子图像中分别包含对应的表格区域。
因此,本发明实施方式可以生成对应于表格区域和文字区域的子图像。
在一个实施方式中,在将包含铭牌的铭牌图像转换为二值图像之前,该方法还包括:
将包含铭牌的原始图像转换为灰度图像;
对所述灰度图像执行边缘检测以确定所述铭牌的边缘;
基于包围所述边缘的四边形的顶点坐标和所述铭牌图像的顶点坐标确定透视变换转换矩阵;
基于所述透视变换转换矩阵生成所述铭牌图像。
可见,本发明实施方式基于边缘检测所确定的铭牌边缘确定透视变换转换矩阵,并利用透视变换转换矩阵生成原始图像的矫正图像(即铭牌图像),透视变换后的铭牌图像中的铭牌图形不变,克服了霍夫变换的失真缺陷,可以提高图像的矫正准确度。
在一个实施方式中,所述基于所述透视变换转换矩阵生成所述铭牌图像包括:
确定所述四边形中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将每个像素点复制到各自的转换后坐标处以生成所述铭牌图像;或
确定所述原始图像中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将所述原始图像分离为R通道、G通道和B通道;确定R通道中的每个像素点复制到各自的转换后坐标处所生成的矫正R通道、G通道中的每个像素点复制到各自的转换后坐标处所生成的矫正G通道和B通道中的每个像素点复制到各自的转换后坐标处所生成的矫正B通道;将所述矫正R通道、所述矫正G通道以及所述矫正B通道合并为所述铭牌图像。
可见,在本发明实施方式中,通过对包围铭牌边缘的四边形中的每个像素点的坐标转换,可以生成对 应于该四边形的、具有灰度的矫正图像。而且,通过对铭牌图像的R通道、G通道和B通道中的每个像素点的坐标转换,可以生成对应于原始图像的、具有RGB色彩的矫正图像。
一种铭牌图像的分割装置,该装置包括:
转换模块,用于将包含铭牌的铭牌图像转换为二值图像;
文本区域检测模块,用于检测所述二值图像中的文本区域;
设置模块,用于将所述文本区域中的每个像素点的像素值设置为预定的相同值;
表格区域检测模块,用于对所述二值图像执行边缘检测以确定所述二值图像中的表格区域;
分割模块,用于基于所述文本区域和所述表格区域分割所述铭牌图像。
可见,在本发明实施方式中,首先检测二值图像中的文本区域,再将文本区域中的每个像素点的像素值设置为相同值,从而对二值图像执行边缘检测时可以准确地确定出表格区域,然后可以基于文本区域和表格区域分割铭牌图像,实现分离铭牌图像中的文字区域与表格区域。
在一个实施方式中,转换模块,用于当所述包含铭牌的铭牌图像为RGB图像时,将所述RGB图像转换为灰度图像;将所述灰度图像转换为二值图像;当所述包含铭牌的铭牌图像为灰度图像时,将所述灰度图像转换为二值图像。
可见,在本发明实施方式中,铭牌图像可以为RGB图像或灰度图像,适用范围广泛。
在一个实施方式中,文本区域检测模块,用于采用最大稳定极值区域方式检测所述二值图像中的文本区域。
因此,本发明实施方式基于最大稳定极值区域方式可以准确识别出文本区域。
在一个实施方式中,设置模块,用于将所述文本区域中的每个像素点的像素值设置为1,或将所述文本区域中的每个像素点的像素值设置为0。
因此,本发明实施方式将文本区域中的每个像素点的像素值设置为1,实现将文本区域设置为白色区域或黑色区域,避免对表格区域的检测过程造成干扰。
在一个实施方式中,表格区域检测模块,用于对所述二值图像执行边缘检测以确定出N个表格区域,其中N为大于等于1的正整数;分割模块,用于将所述铭牌图像分割为第一子图像和N个第二子图像,其中所述第一子图像包含文本区域,每个第二子图像中分别包含对应的表格区域。
因此,本发明实施方式可以生成对应于表格区域和文字区域的子图像。
在一个实施方式中,该装置还包括:
矫正模块,用于在转换模块将包含铭牌的铭牌图像转换为二值图像之前,将包含铭牌的原始图像转换为灰度图像;对所述灰度图像执行边缘检测以确定所述铭牌的边缘;基于包围所述边缘的四边形的顶点坐标和所述铭牌图像的顶点坐标确定透视变换转换矩阵;基于所述透视变换转换矩阵生成所述铭牌图像。
可见,本发明实施方式基于边缘检测所确定的铭牌边缘确定透视变换转换矩阵,并利用透视变换转换 矩阵生成原始图像的矫正图像(即铭牌图像),透视变换后的铭牌图像中的铭牌图形不变,克服了霍夫变换的失真缺陷,可以提高图像的矫正准确度。
在一个实施方式中,矫正模块,用于确定所述四边形中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将每个像素点复制到各自的转换后坐标处以生成所述铭牌图像;或确定所述原始图像中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将所述原始图像分离为R通道、G通道和B通道;确定R通道中的每个像素点复制到各自的转换后坐标处所生成的矫正R通道、G通道中的每个像素点复制到各自的转换后坐标处所生成的矫正G通道和B通道中的每个像素点复制到各自的转换后坐标处所生成的矫正B通道;将所述矫正R通道、所述矫正G通道以及所述矫正B通道合并为所述铭牌图像。
可见,在本发明实施方式中,通过对包围铭牌边缘的四边形中的每个像素点的坐标转换,可以生成对应于该四边形的、具有灰度的矫正图像。而且,通过对铭牌图像的R通道、G通道和B通道中的每个像素点的坐标转换,可以生成对应于原始图像的、具有RGB色彩的矫正图像。
一种铭牌图像的分割装置,包括:处理器和存储器;
其中所述存储器中存储有可被所述处理器执行的应用程序,用于使得所述处理器执行如上所述的铭牌图像的分割方法。
可见,本发明实施方式还提出了具有存储器-处理器架构的铭牌图像的分割装置,实现分离铭牌图像中的文字区域与表格区域。
一种计算机可读存储介质,其中存储有计算机可读指令,该计算机可读指令用于执行如上任一项所述的铭牌图像的分割方法。
可见,本发明实施方式还提出了包含计算机可读指令的计算机可读存储介质,实现分离铭牌图像中的文字区域与表格区域。
附图说明
图1为本发明实施方式的铭牌图像的分割方法的流程图。
图2为本发明实施方式包含文字区域和表格区域的铭牌图像的示意图。
图3为本发明实施方式包含文字区域和表格区域的铭牌图像的示范性分割示意图。
图4为本发明实施方式的对原始图像进行矫正以生成铭牌图像的示范性示意图。
图5为本发明实施方式包含变压器铭牌的原始图像的示意图。
图6为本发明实施方式包含变压器铭牌的原始图像矫正后的示意图。
图7为本发明实施方式的铭牌图像的分割装置的结构图。
图8为本发明实施方式具有存储器-处理器架构的、铭牌图像的分割装置的结构图。
其中,附图标记如下:
标号 含义
100 铭牌图像的分割方法
101~105 步骤
20 铭牌边缘
30 文本区域
35 铭牌图像
40 表格区域
50 第一子图像
60 第二子图像
700 铭牌图像的分割装置
701 矫正模块
702 转换模块
703 文本区域检测模块
704 设置模块
705 表格区域检测模块
706 分割模块
800 铭牌图像的分割装置
801 处理器
802 存储器
具体实施方式
为了使本发明的技术方案及优点更加清楚明白,以下结合附图及实施方式,对本发明进行进一步详细说明。应当理解,此处所描述的具体实施方式仅仅用以阐述性说明本发明,并不用于限定本发明的保护范围。
为了描述上的简洁和直观,下文通过描述若干代表性的实施方式来对本发明的方案进行阐述。实施方式中大量的细节仅用于帮助理解本发明的方案。但是很明显,本发明的技术方案实现时可以不局限于这些细节。为了避免不必要地模糊了本发明的方案,一些实施方式没有进行细致地描述,而是仅给出了框架。下文中,“包括”是指“包括但不限于”,“根据……”是指“至少根据……,但不限于仅根据……”。由于汉语的语言习惯,下文中没有特别指出一个成分的数量时,意味着该成分可以是一个也可以是多个,或可理解为至少一个。
考虑到铭牌图像中的文字与表格靠近会影响OCR效果,申请人提出一种分割铭牌图像的技术方案, 通过将铭牌图像分割为文字区域和表格区域,有利于提高OCR效果。
图1为本发明实施方式的铭牌图像的分割方法的流程图。
如图1所示,该方法包括:
步骤101:将包含铭牌的铭牌图像转换为二值图像(Binary Image)。
在这里,铭牌图像为针对设备铭牌的拍摄图像或拍摄图像的处理图像。铭牌中通常记载设备生产厂家所确定的、设备额定工作情况下的技术数据。比如,铭牌图像可以为在电气电子设备现场针对电气电子设备的铭牌的现场拍摄图像,或者从数据库(比如本地数据库或位于云端的云数据库)或第三方存储介质所获取的同类型电气电子设备的历史铭牌图像。铭牌图像中包含作为被拍摄对象的铭牌。
在一个实施方式中,当所述包含铭牌的铭牌图像为RGB图像时,步骤101中将包含铭牌的铭牌图像转换为二值图像包括:将所述RGB图像转换为灰度图像;将所述灰度图像转换为二值图像。
在一个实施方式中,当所述包含铭牌的铭牌图像为灰度图像时,步骤101中将包含铭牌的铭牌图像转换为二值图像包括:将所述灰度图像转换为二值图像。
在这里,可以采用浮点法、整数法、移位法、平均值法、仅取绿色法或Gamma校正算法等方式,将RGB图像转换为灰度图像。灰度图像是用不同饱和度的黑色来表示每个图像点。
假如RGB彩色图像中某点的颜色为RGB(R,G,B),可以通过下面的示范性方法,将其转换为灰度(Gray)。
(1)、浮点法:Gray=R*0.3+G*0.59+B*0.11;
(2)、整数法:Gray=(R*30+G*59+B*11)/100;
(3)、移位法:Gray=(R*77+G*151+B*28)>>8;
(4)、平均值法:Gray=(R+G+B)/3;
(5)仅取绿色法:Gray=G;
(6)、Gamma校正算法:
（Gamma校正灰度转换公式，原文以附图公式形式给出）
以上示范性描述了将RGB图像转换为灰度图像的典型方法,本领域技术人员可以意识到,这种描述仅是示范性的,并不用于限定本发明实施方式的保护范围。
二值图像只有黑色(0)和白色(1)两种颜色表示。灰度值0~255的灰度图像变到像素值0-1的二值图像,这个过程称为二值化。实现原理为设定一个阈值,假如为128,接下来遍历0~255灰度图像的每一个像素,如果像素灰度值大于128,那么置为白色(1),否则置为黑色(0)。
步骤102:检测所述二值图像中的文本区域。
在一个实施方式中,所述检测所述二值图像中的文本区域包括:采用最大稳定极值区域(MSER)方式检测所述二值图像中的文本区域。MSER可以用来粗略地寻找图像中的文字区域。不过,单独的MSER算法可能产生多个互相包含的矩形框。优选地,采用MSER与非极大值抑制(non maximum suppression, NMS)相结合的方式检测文本区域,其中NMS是经常伴随图像区域检测的算法,作用是去除重复的区域,抑制不是最大框的框,也就是去除大矩形框中包含的小矩形框。
以上示范性描述了检测二值图像中的文本区域的典型方式,本领域技术人员可以意识到,这种描述仅是示范性的,并不用于限定本发明实施方式的保护范围。
步骤103:将所述文本区域中的每个像素点的像素值设置为预定的相同值。
在一个实施方式中,所述将文本区域的像素值设置为预定的相同值包括:将所述文本区域的像素值设置为1或0。因此,实现将文本区域设置为白色区域或黑色区域,避免对表格区域的检测过程造成干扰。
步骤104:对所述二值图像执行边缘检测以确定所述表格区域。
此时的二值图像的文本区域中的每个像素点的像素值已经在步骤103中被设置为预定的相同值,因此文本区域(已经转变为白色区域或黑色区域)不会对针对表格区域的检测过程造成干扰。
在一个实施方式中,所述对二值图像执行边缘检测以确定所述表格区域包括:对所述二值图像执行边缘检测以确定出N个表格区域,其中N为大于等于1的正整数;所述基于所述文本区域和所述表格区域分割所述铭牌图像包括:将铭牌图像分割为包含文本区域的第一子图像和N个第二子图像,其中每个第二子图像中分别包含对应的表格区域。
边缘检测的目的是标识图像中亮度变化明显的点。图像属性中的显著变化通常反映了属性的重要事件和变化。通过对二值图像执行边缘检测,可以确定包含在二值图像中的表格的边缘。具体地,边缘是指其周围像素灰度急剧变化的那些象素的集合。边缘存在于目标、背景和区域之间,所以,边缘是图像分割所依赖的依据。对二值图像执行边缘检测后,可以返回表格区域的边缘。
目前,存在有许多用于边缘检测的方法,大致可分为两类:基于搜索和基于零交叉。在基于搜索的边缘检测方法中,首先计算边缘强度,通常用一阶导数表示,例如梯度模;然后,用计算估计边缘的局部方向,通常采用梯度的方向,并利用此方向找到局部梯度模的最大值。在基于零交叉的方法中,找到由图像得到的二阶导数的零交叉点来定位边缘。通常用拉普拉斯算子或非线性微分方程的零交叉点。目前,常用的边缘检测模板有Laplacian算子、Roberts算子、Sobel算子、log(Laplacian-Gauss)算子、Kirsch算子和Prewitt算子,等等。
以上示范性描述了执行边缘检测的典型方法,本领域技术人员可以意识到,这种描述仅是示范性的,并不用于限定本发明实施方式的保护范围。
步骤105:基于所述文本区域和所述表格区域分割所述铭牌图像。
在这里,按照步骤102中确定的文本区域的范围和步骤104中确定的表格区域的范围,分割铭牌图像。其中,步骤105中的铭牌图像为步骤101中被转换为二值图像前的铭牌图像,或步骤101中被转换为二值图像前的铭牌图像的复制图像。
图2为本发明实施方式包含文字区域和表格区域的铭牌图像的示意图。图3为本发明实施方式包含文 字区域和表格区域的铭牌图像的示范性分割示意图。
由图2可见,在铭牌图像35中,包含有文字区域30和表格区域40。针对该铭牌图像35执行如图1所示的图像分割流程,可以得到第一子图像50和第二子图像60。其中,当铭牌图像35中包含多个表格时,可以生成多个第二子图像60,其中每个第二子图像60包含各自的一张对应表格。
后续处理中,可以分别对第一子图像50和第二子图像60执行OCR处理。由于文字和表格不再混淆为同一个物体,因此此时分别对第一子图像50和第二子图像60执行OCR处理的识别准确度,显著优于针对铭牌图像35执行OCR处理的识别准确度。
申请人还发现:当拍摄铭牌的拍摄角度发生倾斜时,拍摄得到的原始铭牌图像中的铭牌相应具有倾斜角度,此时OCR技术难以准确提取铭牌内容。目前,通常采用霍夫变换(Hough transform)确定铭牌图像中铭牌的旋转角度,再基于旋转角度将铭牌变换到合适的位置,从而矫正铭牌图像。然而,采用霍夫变换在矫正过程中只能确定直线方向,丢失了线段的长度信息,因此容易图像失真,矫正效果不佳。
在一个实施方式中,在将包含铭牌的铭牌图像转换为二值图像之前,该方法还包括:将包含铭牌的原始图像(即矫正前的铭牌图像)转换为灰度图像;对包含铭牌的原始图像所转换出的灰度图像执行边缘检测以确定铭牌的边缘;基于包围所述边缘的四边形的顶点坐标和所述铭牌图像的顶点坐标确定透视变换转换矩阵;基于所述透视变换转换矩阵生成铭牌图像(即矫正后的铭牌图像)。然后,可以针对矫正后的铭牌图像实施图1所示的方法流程,以执行图像分割。
优选地,还包括确定包围边缘的四边形的过程。其中,在所有包围所述边缘的四边形集合(包含包围该边缘的全部四边形)中,将周长最短的四边形确定为该四边形。而且,基于该周长最短的四边形的顶点坐标和铭牌图像的顶点坐标确定透视变换转换矩阵。
下面对透视变换(Perspective Transformation)进行说明。
透视变换是指利用透视中心、像点、目标点三点共线的条件,按透视旋转定律使得承影面(透视面)绕迹线(透视轴)旋转某一角度,破坏原有的投影光线束,仍能保持承影面上投影几何图形不变的变换。
在透视变换中,具有如下公式:
公式(1)：[x', y', w'] = [u, v, w] · A
公式(2)：x = x'/w'，y = y'/w'
公式(3)：x = (a11·u + a21·v + a31)/(a13·u + a23·v + a33)，y = (a12·u + a22·v + a32)/(a13·u + a23·v + a33)
其中:
[x,y]是像素点在矫正后的铭牌图像中的二维坐标;[u,v,w]是像素点在矫正前的原始图像的三维坐标,w 通常等于1;像素点在矫正后的铭牌图像中的三维坐标可以定义为[x,y,1]。
A =
| a11  a12  a13 |
| a21  a22  a23 |
| a31  a32  a33 |
即为透视变换转换矩阵，其中 a33 为 1。
矫正后的铭牌图像通常为长方形。而且,该铭牌图像的4个顶点坐标为已知,比如分别为(0,0,1)、(0,h,1)、(w,h,1)和(w,0,1),其中w为铭牌图像的宽度,h为铭牌图像的高度。
因此，基于包围边缘的四边形的四个顶点坐标（已知）和铭牌图像的4个顶点坐标（已知），根据公式(3)可以构建出8个方程，从而计算出a11、a12、a13、a21、a22、a23、a31和a32的值。当计算出这些值后，即可唯一地确定出透视变换转换矩阵A，其中a33为1。
优选地,所述基于所述透视变换转换矩阵生成所述铭牌图像包括:
方式(1):确定所述四边形中的每个像素点的坐标(三维坐标,其中w值设置为1);基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将每个像素点复制到各自的转换后坐标处以生成所述铭牌图像。
可见,在本发明实施方式中,通过对包围铭牌边缘的四边形中的每个像素点的坐标转换,可以生成对应于该四边形的、具有灰度的已矫正图像。因此,本发明实施方式还实现了一种灰度图形式的已矫正铭牌图像。
方式(2):确定所述原始图像中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将所述原始图像分离为R通道、G通道和B通道;确定R通道中的每个像素点复制到各自的转换后坐标处所生成的矫正R通道、G通道中的每个像素点复制到各自的转换后坐标处所生成的矫正G通道和B通道中的每个像素点复制到各自的转换后坐标处所生成的矫正B通道;将所述矫正R通道、所述矫正G通道以及所述矫正B通道合并为所述矫正图像。具体地,首先基于原始图像中的每个像素点的坐标与透视变换转换矩阵的乘积,确定每个像素点的坐标的转换后坐标。然后,将原始图像分离为R通道、G通道和B通道,并且将R通道中的每个像素点复制到各自的转换后坐标处以生成矫正R通道,将G通道中的每个像素点复制到各自的转换后坐标处以生成矫正G通道,将B通道中的每个像素点复制到各自的转换后坐标处以生成矫正B通道。接着,将矫正R通道、矫正G通道以及矫正B通道合并为矫正图像。其中,R通道、G通道和B通道的相同位置处的像素点,分别具有相同的转换后坐标。
举例,假定有彩色的原始图像A需要被矫正。首先,基于原始图像A中的每个像素点的坐标与透视变换转换矩阵的乘积,确定原始图像A中的每个像素点的坐标的转换后坐标。比如,原始图像A包含100个像素点,其中像素点1的坐标对应于转换后坐标K1、像素点2的坐标对应于转换后坐标K1、像素点3 的坐标对应于转换后坐标K3……像素点100的坐标对应于转换后坐标K100。
然后,将原始图像A分离为三个通道,分别为原始图像A的R通道、原始图像A的G通道和原始图像A的B通道。
接着,将原始图像A的R通道中的每个像素点,复制到矫正的R通道中的各自的转换后坐标处以生成矫正的R通道。具体地,将原始图像A的R通道中的像素点1复制到矫正的R通道中的转换后坐标K1处,将原始图像A的R通道中的像素点2复制到矫正的R通道中的转换后坐标K2处,将原始图像A的R通道中的像素点3复制到矫正的R通道中的转换后坐标K3处……将原始图像A的R通道中的像素点100复制到矫正的R通道中的转换后坐标K100处,从而形成矫正的R通道。
将原始图像A的G通道中的每个像素点,复制到矫正的G通道中的各自的转换后坐标处以生成矫正的G通道。具体地,将原始图像A的G通道中的像素点1复制到矫正的G通道中的转换后坐标K1处,将原始图像A的G通道中的像素点2复制到矫正的G通道中的转换后坐标K2处,将原始图像A的G通道中的像素点3复制到矫正的G通道中的转换后坐标K3处……将原始图像A的G通道中的像素点100复制到矫正的G通道中的转换后坐标K100处,从而形成矫正的G通道。
将原始图像A的B通道中的每个像素点,复制到矫正的G通道中的各自的转换后坐标处以生成矫正的B通道。具体地,将原始图像A的B通道中的像素点1复制到矫正的B通道中的转换后坐标K1处,将原始图像A的B通道中的像素点2复制到矫正的B通道中的转换后坐标K2处,将原始图像A的B通道中的像素点3复制到矫正的B通道中的转换后坐标K3处……将铭牌图像A的B通道中的像素点100复制到矫正的B通道中的转换后坐标K100处,从而形成矫正的B通道。
最后,将所述矫正R通道、所述矫正G通道以及所述矫正B通道合并为矫正后的铭牌图像。
可见,在本发明实施方式中,通过对原始图像的R通道、G通道和B通道中的每个像素点的坐标转换,可以生成对应于原始图像的、具有RGB色彩的矫正图像。因此,本发明实施方式还实现了一种RGB色彩形式的矫正后的铭牌图像。
在一个实施方式中,在将包含铭牌的原始图像转换为灰度图像与对灰度图像执行边缘检测以确定所述铭牌的边缘之间,该方法还包括:增加灰度图像的对比度;对增加对比度后的灰度图像执行降噪处理。具体地,可以采用基于直方图均衡化的图像增强方式增加灰度图像的对比度,其基本思想是对于图像中的灰度点做映射,使得整体图像的灰度大致符合均匀分布。
图4为本发明实施方式的对原始图像进行矫正以生成铭牌图像的示范性示意图。
在铭牌的轮廓20被确定后,在包围边缘20的四边形集合(该四边形集合包含所有包围边缘20的四边形)中,确定出周长最短的四边形,假定为四边形JKMN(通常为不规则四边形)。四边形JKMN被确定后,4个顶点J、K、M、N的坐标即确定。矫正后得到的铭牌图像(后续参与图1所示流程的铭牌图像的分割方法)为预定大小的长方形。铭牌图像的四个顶点A、B、C和D的坐标是已确定的。因此,基于 J、K、M、N的坐标与A、B、C和D的坐标之间的对应关系,可以计算出透视变换转换矩阵。然后,利用该透视变换转换矩阵,可以将四边形JKMN中的每个像素点转换到铭牌图像ABCD的对应坐标处,从而实现矫正。
图5为本发明实施方式包含变压器铭牌的原始图像的示意图。图6为本发明实施方式包含变压器铭牌的原始图像矫正后的示意图。可见,图5的变压器铭牌图像具有倾斜角度且带有拍摄背景图案;图6中的变压器铭牌图像的倾斜角度得到矫正且不再包含拍摄背景图案,因此便于后续的OCR操作。
基于上述描述,本发明实施方式还提出了铭牌图像的分割装置。
图7为本发明实施方式的铭牌图像的分割装置的方框图。
如图7所示,铭牌图像的分割装置700包括:
转换模块702,用于将包含铭牌的铭牌图像转换为二值图像;
文本区域检测模块703,用于检测所述二值图像中的文本区域;
设置模块704,用于将所述文本区域中的每个像素点的像素值设置为预定的相同值;
表格区域检测模块705,用于对所述二值图像执行边缘检测以确定所述表格区域;
分割模块706,用于基于所述文本区域和所述表格区域分割所述铭牌图像。
在一个实施方式中,转换模块702,用于当所述包含铭牌的铭牌图像为RGB图像时,将所述RGB图像转换为灰度图像;将所述灰度图像转换为二值图像;当所述包含铭牌的铭牌图像为灰度图像时,将所述灰度图像转换为二值图像。
在一个实施方式中,文本区域检测模块703,用于采用最大稳定极值区域方式检测所述二值图像中的文本区域。
在一个实施方式中,设置模块704,用于将所述文本区域中的每个像素点的像素值设置为1,或将所述文本区域中的每个像素点的像素值设置为0。
在一个实施方式中,表格区域检测模块705,用于对所述二值图像执行边缘检测以确定出N个表格区域,其中N为大于等于1的正整数;分割模块706,用于将铭牌图像分割为包含文本区域的第一子图像和N个第二子图像,其中每个第二子图像中分别包含对应的表格区域。
在一个实施方式中,该装置700还包括:
矫正模块701,用于在转换模块702将包含铭牌的铭牌图像转换为二值图像之前,将包含变压器铭牌的原始图像转换为灰度图像;对所述灰度图像执行边缘检测以确定所述铭牌的边缘;基于包围所述边缘的四边形的顶点坐标和所述铭牌图像的顶点坐标确定透视变换转换矩阵;基于所述透视变换转换矩阵生成所述铭牌图像。
在一个实施方式中,矫正模块701,用于确定所述四边形中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将每个像素点复制到各 自的转换后坐标处以生成所述铭牌图像;或确定所述原始图像中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将所述原始图像分离为R通道、G通道和B通道;确定R通道中的每个像素点复制到各自的转换后坐标处所生成的矫正R通道、G通道中的每个像素点复制到各自的转换后坐标处所生成的矫正G通道和B通道中的每个像素点复制到各自的转换后坐标处所生成的矫正B通道;将所述矫正R通道、所述矫正G通道以及所述矫正B通道合并为所述铭牌图像。
基于上述描述,本发明实施方式还提出有存储器-处理器架构的、铭牌图像的分割装置。
图8为本发明实施方式具有存储器-处理器架构的、铭牌图像的分割装置的方框图。
如图8所示,铭牌图像的分割装置800包括处理器801、存储器802及存储在存储器802上并可在处理器801上运行的计算机程序,计算机程序被处理器801执行时实现如上任一项的铭牌图像的分割方法。
其中,存储器802具体可以实施为电可擦可编程只读存储器(EEPROM)、快闪存储器(Flash memory)、可编程程序只读存储器(PROM)等多种存储介质。处理器801可以实施为包括一或多个中央处理器或一或多个现场可编程门阵列,其中现场可编程门阵列集成一或多个中央处理器核。具体地,中央处理器或中央处理器核可以实施为CPU或MCU或DSP等等。
需要说明的是,上述各流程和各结构图中不是所有的步骤和模块都是必须的,可以根据实际的需要忽略某些步骤或模块。各步骤的执行顺序不是固定的,可以根据需要进行调整。各模块的划分仅仅是为了便于描述采用的功能上的划分,实际实现时,一个模块可以分由多个模块实现,多个模块的功能也可以由同一个模块实现,这些模块可以位于同一个设备中,也可以位于不同的设备中。
各实施方式中的硬件模块可以以机械方式或电子方式实现。例如,一个硬件模块可以包括专门设计的永久性电路或逻辑器件(如专用处理器,如FPGA或ASIC)用于完成特定的操作。硬件模块也可以包括由软件临时配置的可编程逻辑器件或电路(如包括通用处理器或其它可编程处理器)用于执行特定操作。至于具体采用机械方式,或是采用专用的永久性电路,或是采用临时配置的电路(如由软件进行配置)来实现硬件模块,可以根据成本和时间上的考虑来决定。
本发明还提供了一种机器可读的存储介质,存储用于使一机器执行如本文所述方法的指令。具体地,可以提供配有存储介质的系统或者装置,在该存储介质上存储着实现上述实施例中任一实施方式的功能的软件程序代码,且使该系统或者装置的计算机(或CPU或MPU)读出并执行存储在存储介质中的程序代码。此外,还可以通过基于程序代码的指令使计算机上操作的操作系统等来完成部分或者全部的实际操作。还可以将从存储介质读出的程序代码写到插入计算机内的扩展板中所设置的存储器中或者写到与计算机相连接的扩展单元中设置的存储器中,随后基于程序代码的指令使安装在扩展板或者扩展单元上的CPU等来执行部分和全部实际操作,从而实现上述实施方式中任一实施方式的功能。用于提供程序代码的存储介质实施方式包括软盘、硬盘、磁光盘、光盘(如CD-ROM、CD-R、CD-RW、DVD-ROM、DVD-RAM、 DVD-RW、DVD+RW)、磁带、非易失性存储卡和ROM。可选择地,可以由通信网络从服务器计算机或云上下载程序代码。
以上所述,仅为本发明的较佳实施方式而已,并非用于限定本发明的保护范围。凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。

Claims (16)

  1. 一种铭牌图像的分割方法(100),其特征在于,该方法(100)包括:
    将包含铭牌的铭牌图像转换为二值图像(101);
    检测所述二值图像中的文本区域(102);
    将所述文本区域中的每个像素点的像素值设置为预定的相同值(103);
    对所述二值图像执行边缘检测以确定所述二值图像中的表格区域(104);
    基于所述文本区域和所述表格区域分割所述铭牌图像(105)。
  2. 根据权利要求1所述的铭牌图像的分割方法(100),其特征在于,
    当所述包含铭牌的铭牌图像为RGB图像时,所述将包含铭牌的铭牌图像转换为二值图像(101)包括:将所述RGB图像转换为灰度图像;将所述灰度图像转换为所述二值图像;或
    当所述包含铭牌的铭牌图像为灰度图像时,所述将包含铭牌的铭牌图像转换为二值图像(101)包括:将所述灰度图像转换为所述二值图像。
  3. 根据权利要求1所述的铭牌图像的分割方法(100),其特征在于,所述检测所述二值图像中的文本区域(102)包括:采用最大稳定极值区域方式检测所述二值图像中的文本区域。
  4. 根据权利要求1所述的铭牌图像的分割方法(100),其特征在于,所述将文本区域中的每个像素点的像素值设置为预定的相同值(103)包括:将所述文本区域中的每个像素点的像素值设置为1,或将所述文本区域中的每个像素点的像素值设置为0。
  5. 根据权利要求1所述的铭牌图像的分割方法(100),其特征在于,
    所述对二值图像执行边缘检测以确定所述二值图像中的表格区域(104)包括:对所述二值图像执行边缘检测以确定出N个表格区域,其中N为大于等于1的正整数;
    所述基于文本区域和表格区域分割所述铭牌图像(105)包括:将所述铭牌图像分割为第一子图像和N个第二子图像,其中所述第一子图像包含文本区域,每个第二子图像中分别包含对应的表格区域。
  6. 根据权利要求1所述的铭牌图像的分割方法(100),其特征在于,在将包含铭牌的铭牌图像转换为二值图像(101)之前,该方法(100)还包括:
    将包含铭牌的原始图像转换为灰度图像;
    对所述灰度图像执行边缘检测以确定所述铭牌的边缘;
    基于包围所述边缘的四边形的顶点坐标和所述铭牌图像的顶点坐标确定透视变换转换矩阵;
    基于所述透视变换转换矩阵生成所述铭牌图像。
  7. 根据权利要求6所述的铭牌图像的分割方法(100),其特征在于,
    所述基于所述透视变换转换矩阵生成所述铭牌图像包括:
    确定所述四边形中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将每个像素点复制到各自的转换后坐标处以生成所述铭牌图像; 或
    确定所述原始图像中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将所述原始图像分离为R通道、G通道和B通道;确定R通道中的每个像素点复制到各自的转换后坐标处所生成的矫正R通道、G通道中的每个像素点复制到各自的转换后坐标处所生成的矫正G通道和B通道中的每个像素点复制到各自的转换后坐标处所生成的矫正B通道;将所述矫正R通道、所述矫正G通道以及所述矫正B通道合并为所述铭牌图像。
  8. 一种铭牌图像的分割装置(700),其特征在于,该装置(700)包括:
    转换模块(702),用于将包含铭牌的铭牌图像转换为二值图像;
    文本区域检测模块(703),用于检测所述二值图像中的文本区域;
    设置模块(704),用于将所述文本区域中的每个像素点的像素值设置为预定的相同值;
    表格区域检测模块(705),用于对所述二值图像执行边缘检测以确定所述二值图像中的表格区域;
    分割模块(706),用于基于所述文本区域和所述表格区域分割所述铭牌图像。
  9. 根据权利要求8所述的铭牌图像的分割装置(700),其特征在于,
    转换模块(702),用于当所述包含铭牌的铭牌图像为RGB图像时,将所述RGB图像转换为灰度图像;将所述灰度图像转换为二值图像;当所述包含铭牌的铭牌图像为灰度图像时,将所述灰度图像转换为二值图像。
  10. 根据权利要求8所述的铭牌图像的分割装置(700),其特征在于,
    文本区域检测模块(703),用于采用最大稳定极值区域方式检测所述二值图像中的文本区域。
  11. 根据权利要求8所述的铭牌图像的分割装置(700),其特征在于,
    设置模块(704),用于将所述文本区域中的每个像素点的像素值设置为1,或将所述文本区域中的每个像素点的像素值设置为0。
  12. 根据权利要求8所述的铭牌图像的分割装置(700),其特征在于,
    表格区域检测模块(705),用于对所述二值图像执行边缘检测以确定出N个表格区域,其中N为大于等于1的正整数;
    分割模块(706),用于将所述铭牌图像分割为第一子图像和N个第二子图像,其中所述第一子图像包含文本区域,每个第二子图像中分别包含对应的表格区域。
  13. 根据权利要求8所述的铭牌图像的分割装置(700),其特征在于,该装置(700)还包括:
    矫正模块(701),用于在转换模块(702)将包含铭牌的铭牌图像转换为二值图像之前,将包含铭牌的原始图像转换为灰度图像;对所述灰度图像执行边缘检测以确定所述铭牌的边缘;基于包围所述边缘的四边形的顶点坐标和所述铭牌图像的顶点坐标确定透视变换转换矩阵;基于所述透视变换转换矩阵生成所述铭牌图像。
  14. 根据权利要求13所述的铭牌图像的分割装置(700),其特征在于,
    矫正模块(701),用于确定所述四边形中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将每个像素点复制到各自的转换后坐标处以生成所述铭牌图像;或确定所述原始图像中的每个像素点的坐标;基于每个像素点的坐标与所述透视变换转换矩阵的乘积,确定所述每个像素点的坐标的转换后坐标;将所述原始图像分离为R通道、G通道和B通道;确定R通道中的每个像素点复制到各自的转换后坐标处所生成的矫正R通道、G通道中的每个像素点复制到各自的转换后坐标处所生成的矫正G通道和B通道中的每个像素点复制到各自的转换后坐标处所生成的矫正B通道;将所述矫正R通道、所述矫正G通道以及所述矫正B通道合并为所述铭牌图像。
  15. 一种铭牌图像的分割装置(800),其特征在于,包括:处理器(801)和存储器(802);
    其中所述存储器(802)中存储有可被所述处理器(801)执行的应用程序,用于使得所述处理器(801)执行如权利要求1至7中任一项所述的铭牌图像的分割方法(100)。
  16. 一种计算机可读存储介质,其特征在于,其中存储有计算机可读指令,该计算机可读指令用于执行如权利要求1至6中任一项所述的铭牌图像的分割方法(100)。
PCT/CN2020/116313 2020-09-18 2020-09-18 一种铭牌图像的分割方法、装置和计算机可读存储介质 WO2022056875A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2020/116313 WO2022056875A1 (zh) 2020-09-18 2020-09-18 一种铭牌图像的分割方法、装置和计算机可读存储介质
CN202080105178.6A CN116134481A (zh) 2020-09-18 2020-09-18 一种铭牌图像的分割方法、装置和计算机可读存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2020/116313 WO2022056875A1 (zh) 2020-09-18 2020-09-18 一种铭牌图像的分割方法、装置和计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2022056875A1 true WO2022056875A1 (zh) 2022-03-24

Family

ID=80777418

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/116313 WO2022056875A1 (zh) 2020-09-18 2020-09-18 一种铭牌图像的分割方法、装置和计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN116134481A (zh)
WO (1) WO2022056875A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115689994A (zh) * 2022-09-14 2023-02-03 优层智能科技(上海)有限公司 一种铭牌和条码缺陷检测方法、设备和存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101447017A (zh) * 2008-11-27 2009-06-03 浙江工业大学 一种基于版面分析的选票快速识别统计方法及系统
CN106156761A (zh) * 2016-08-10 2016-11-23 北京交通大学 面向移动终端拍摄的图像表格检测与识别方法
CN106326921A (zh) * 2016-08-18 2017-01-11 宁波傲视智绘光电科技有限公司 文本检测方法和装置
US9715624B1 (en) * 2016-03-29 2017-07-25 Konica Minolta Laboratory U.S.A., Inc. Document image segmentation based on pixel classification
CN107301418A (zh) * 2017-06-28 2017-10-27 江南大学 光学字符识别中的版面分析
CN110516208A (zh) * 2019-08-12 2019-11-29 深圳智能思创科技有限公司 一种针对pdf文档表格提取的系统及方法

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101447017A (zh) * 2008-11-27 2009-06-03 浙江工业大学 一种基于版面分析的选票快速识别统计方法及系统
US9715624B1 (en) * 2016-03-29 2017-07-25 Konica Minolta Laboratory U.S.A., Inc. Document image segmentation based on pixel classification
CN106156761A (zh) * 2016-08-10 2016-11-23 北京交通大学 面向移动终端拍摄的图像表格检测与识别方法
CN106326921A (zh) * 2016-08-18 2017-01-11 宁波傲视智绘光电科技有限公司 文本检测方法和装置
CN107301418A (zh) * 2017-06-28 2017-10-27 江南大学 光学字符识别中的版面分析
CN110516208A (zh) * 2019-08-12 2019-11-29 深圳智能思创科技有限公司 一种针对pdf文档表格提取的系统及方法

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115689994A (zh) * 2022-09-14 2023-02-03 优层智能科技(上海)有限公司 一种铭牌和条码缺陷检测方法、设备和存储介质
CN115689994B (zh) * 2022-09-14 2023-08-04 优层智能科技(上海)有限公司 一种铭牌和条码缺陷检测方法、设备和存储介质

Also Published As

Publication number Publication date
CN116134481A (zh) 2023-05-16

Similar Documents

Publication Publication Date Title
US7894689B2 (en) Image stitching
US8494297B2 (en) Automatic detection and mapping of symmetries in an image
US8457403B2 (en) Method of detecting and correcting digital images of books in the book spine area
CN104751142B (zh) 一种基于笔划特征的自然场景文本检测方法
US8401333B2 (en) Image processing method and apparatus for multi-resolution feature based image registration
CN109190623B (zh) 一种识别投影仪品牌和型号的方法
JP2010171976A (ja) 歪み文書画像を補正する方法及びシステム
CN111353961B (zh) 一种文档曲面校正方法及装置
US20180253852A1 (en) Method and device for locating image edge in natural background
Lelore et al. Super-resolved binarization of text based on the fair algorithm
JP5541679B2 (ja) 画像処理装置及び方法、並びに、プログラム
CN109741273A (zh) 一种手机拍照低质图像的自动处理与评分方法
EP2536123B1 (en) Image processing method and image processing apparatus
JP3814353B2 (ja) 画像分割方法および画像分割装置
CN112419207A (zh) 一种图像矫正方法及装置、系统
US9094617B2 (en) Methods and systems for real-time image-capture feedback
WO2022056875A1 (zh) 一种铭牌图像的分割方法、装置和计算机可读存储介质
CN110827189A (zh) 一种数字图像或视频的水印去除方法及系统
CN110610163B (zh) 一种自然场景下基于椭圆拟合的表格提取方法及系统
CN115410191B (zh) 文本图像识别方法、装置、设备和存储介质
JP2012060452A (ja) 画像処理装置、その方法およびプログラム
WO2022056872A1 (zh) 一种铭牌图像的矫正方法、装置和计算机可读存储介质
Bhaskar et al. Implementing optical character recognition on the android operating system for business cards
WO2022062417A1 (zh) 视频中嵌入图像的方法、平面预测模型获取方法和装置
WO2022056876A1 (zh) 一种电机铭牌的识别方法、装置和计算机可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20953733

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20953733

Country of ref document: EP

Kind code of ref document: A1