CN113392848A - Deep learning-based reading method and device for OCR on cylinder - Google Patents


Info

Publication number
CN113392848A
CN113392848A
Authority
CN
China
Prior art keywords
ocr
camera
characters
character
cylinder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110948821.8A
Other languages
Chinese (zh)
Inventor
施晨涛
任世强
吴潘
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hitery Tianjin Technology Co ltd
Original Assignee
Hitery Tianjin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hitery Tianjin Technology Co ltd filed Critical Hitery Tianjin Technology Co ltd
Priority to CN202110948821.8A priority Critical patent/CN113392848A/en
Publication of CN113392848A publication Critical patent/CN113392848A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Character Input (AREA)
  • Character Discrimination (AREA)

Abstract

The invention discloses a deep learning-based method and device for reading OCR on a cylinder, which solve the problem that the arc surface of a cylinder cannot be imaged stably enough for OCR reading, and which are compatible with OCR characters of different colors and fonts. The invention is practical, achieves a stable effect in a short time, requires no manual participation, and lays a good foundation for digital product management and tracing. The method can be flexibly applied to the transformation of existing production lines with few hardware changes and without designing a complex industrial vision system, and it offers useful reference for solutions to related problems.

Description

Deep learning-based reading method and device for OCR on cylinder
Technical Field
The invention relates to the technical field of image recognition, in particular to a deep learning-based reading method and device for OCR (optical character recognition) on a cylinder.
Background
OCR recognition and detection is an important application of computer image processing in machine vision. Its main function is to process and characterize images in order to extract character information and the corresponding position information, then integrate that position information according to the arrangement order of the characters to form logically coherent text, providing a basis for the subsequent digital archiving of objects and product tracing management. OCR reading is widely applied in industry and is an indispensable link in a digital factory. Planar OCR characters have relatively mature applications and solutions, but in some special industries, such as the bicycle industry, OCR characters are printed on arc surfaces, which occur widely in industrial production and design. Because lighting and camera imaging are difficult in this setting, reading and calibration of OCR information are currently completed mainly by hand, with the following main problems: (1) labor cost is too high; (2) efficiency is low; (3) accuracy cannot be guaranteed; (4) character problems arising in the production and manufacturing process cannot be resolved, preventing the digital factory from forming a closed loop.
Existing curved-surface OCR recognition methods fall mainly into four types: (1) X-ray imaging methods, which read the image information formed by X-ray irradiation; (2) line-laser 3D imaging methods, which read the three-dimensional image information formed by calibrated shooting with a point laser and an area camera; (3) line-scan camera imaging methods, in which the camera or the recognized object moves to form the image for reading; (4) multi-angle light imaging methods, which control the camera to shoot several times under light sources at different angles and synthesize a 2.5D image for reading.
The X-ray imaging method causes a certain amount of damage to the surface of the measured object and cannot be applied to reading fragile products. The line-laser 3D imaging method requires the object or the camera to move, so it cannot be applied at a fixed detection position; its imaging of low-reflectivity materials is not ideal, which affects the reading result; and its high cost makes it difficult to deploy. The line-scan camera has difficulty accommodating detection objects with different depths of field, and the object and camera must move relative to each other, so it cannot be applied to fixed scenes. In the multi-angle light imaging method, the product image depends on the synthesis algorithm and the shooting position; without solving the parameter problems of light-source angle and synthesis algorithm, OCR text at a non-fixed position cannot be imaged stably, and because forming the image takes time, the method is slow compared with ordinary detection and unsuitable for high-speed detection scenes.
The main existing difficulties of curved-surface OCR reading are: (1) imaging is hard to stabilize, since the reflection angle on a curved surface is large and the imaging effect worsens as the field of view grows; (2) the OCR characters cannot be stably imaged into a single picture for processing; (3) the OCR appearance varies widely across products of different materials and colors; (4) the size and position of the OCR characters change, and the shape of the characters changes as well.
Existing methods cannot achieve stable imaging and reading on a cylinder. By the surface-integral principle, the part of the cylindrical surface where the OCR characters lie can be intercepted as a surface patch; as long as the height from the plane of the patch's cross-section to the highest point of the patch is smaller than the depth of field of the selected camera, this patch can be treated approximately as a plane. Stable reading can then be achieved with multiple cameras, combining the multi-camera setup with a highly robust algorithm so as to be compatible with OCR characters at different positions, of different colors, and at different angles.
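The depth-of-field condition above can be checked numerically. The sketch below computes the sagitta of a cylindrical patch, i.e. the height from the chord plane to the top of the arc; the radius, angle, and depth-of-field values in the example are illustrative assumptions, not figures from the patent.

```python
import math

def sagitta(radius_mm: float, theta_rad: float) -> float:
    """Height from the chord plane to the highest point of an arc patch
    that subtends angle theta on a cylinder of the given radius."""
    return radius_mm * (1.0 - math.cos(theta_rad / 2.0))

def patch_is_planar(radius_mm: float, theta_rad: float, dof_mm: float) -> bool:
    """True if the patch's sagitta is within the camera's depth of field,
    so the patch can be treated approximately as a plane."""
    return sagitta(radius_mm, theta_rad) < dof_mm

def max_patch_angle(radius_mm: float, dof_mm: float) -> float:
    """Largest arc angle (radians) whose sagitta stays within the depth of
    field; a purely geometric bound obtained by inverting the sagitta formula."""
    ratio = max(min(1.0 - dof_mm / radius_mm, 1.0), -1.0)
    return 2.0 * math.acos(ratio)

# Example with assumed values: 25 mm radius cylinder, 60-degree patch, 5 mm depth of field.
ok = patch_is_planar(25.0, math.radians(60.0), 5.0)  # sagitta is about 3.35 mm, under 5 mm
```

Each camera then only needs to cover a patch narrow enough to satisfy this bound, which is why two or more cameras with overlapping fields of view suffice.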
Disclosure of Invention
To overcome the defects of the background art, the invention provides a deep learning-based method and device for reading OCR on a cylinder, solving the problems of difficult imaging and difficult OCR recognition on the curved surface of a cylinder by recognizing the cylinder's OCR characters in a visible-light image. Based on deep learning processing, the method can perform OCR recognition on cylinder surfaces of various colors. It identifies accurately, takes little time, requires no manual participation, and lays a good foundation for subsequent digital factory management.
In order to achieve the above object, one aspect of the present invention provides the following embodiment: a deep learning-based method for reading OCR on a cylinder, comprising the following steps:
Step one, selecting at least two cameras for photographing and acquisition, and choosing suitable camera installation angles according to the OCR character size, so that the overlapping field of view of two adjacent cameras is not less than 1/3 of each camera's field of view;
Step two, setting the camera parameters and photographing for acquisition;
Step three, building a basic object-detection model with YOLOv3 and a back-end classification model with a ResNet34 model to obtain a deep learning model;
Step four, labeling the picture data samples collected by the cameras to generate corresponding sample label files;
Step five, training the deep learning model with the sample label files to obtain an OCR model;
Step six, taking pictures with the cameras to acquire picture data;
Step seven, inputting the picture data collected in step six into the OCR model from step five for picture-data recognition and character-labeling processing to obtain character data;
Step eight, processing and integrating the character data obtained in step seven to realize OCR reading.
Further, in step five, the deep learning model includes a cbr convolution module, a crc convolution module and a Deep convolution module: the cbr module is formed by connecting a convolution layer (conv), a batch normalization layer (bn) and a ReLU activation function in series; the crc module is formed by connecting a convolution layer, a ReLU activation function and a convolution layer in series; and the Deep module is formed by connecting two cbr modules in series. The cbr, crc and Deep modules extract features from the OCR picture to form the OCR detection model.
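The series composition of these modules can be sketched structurally. The following is a minimal layout sketch in plain Python: the module names (cbr, crc, Deep) follow the text, while kernel sizes, channel counts, and the deep learning framework are unspecified in the patent and therefore omitted here.

```python
from dataclasses import dataclass

@dataclass
class Layer:
    kind: str  # "conv", "bn", or "relu"

def cbr() -> list:
    """cbr module: convolution layer (conv) -> batch normalization (bn) -> ReLU, in series."""
    return [Layer("conv"), Layer("bn"), Layer("relu")]

def crc() -> list:
    """crc module: conv -> ReLU -> conv, in series."""
    return [Layer("conv"), Layer("relu"), Layer("conv")]

def deep() -> list:
    """Deep module: two cbr modules connected in series."""
    return cbr() + cbr()
```

In a real framework each `Layer` descriptor would map to the corresponding convolution, batch-normalization, or activation operator, composed in the same order.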
Further, step eight includes three parts: removing repeated character data, longitudinally arranging the character data, and segmenting and integrating the character set according to the actual situation.
Further, in step eight, the repeated character data are removed: the characteristic positions and the maximum position range of the OCR text are detected in each single picture; the deep learning model maps the character at each position to a plurality of classes, each class carrying a similarity score between 0 and 1 for that character; and repeated character data are removed by keeping only the highest-scoring result at the same position.
Further, in step eight, the character data are arranged longitudinally: the detected characters are sorted in ascending order of their y-coordinate values $y_i$. For a total of $M$ characters arranged in $N$ lines, the line number of each character is

$n_k = \left\lceil \dfrac{N\,(y_i - y_1)}{y_2 - y_1} \right\rceil$, taking $n_k = 1$ when $y_i = y_1$,

where $n_k$ is the number of the line where the character lies, $M$ is the total number of characters, $N$ is the number of lines, $i$ is the character index with value range 0 to $M-1$, $y_1$ is the minimum of the y-coordinates of all characters, $y_2$ is the maximum of the y-coordinates of all characters, and $y_j$ is the average of the minimum and maximum y-coordinates over all characters.
Further, in step eight, the character set is segmented and integrated according to the actual situation: for a single line of characters from either camera Cam_L or Cam_R, the characters over the entire field of view are taken so that the left camera contributes its leftmost $\lceil L/2 \rceil$ characters and the right camera contributes its rightmost $L - \lceil L/2 \rceil$ characters, where the effective total number of characters $L$ satisfies

$L \le L_L + L_R$,

where $L_L$ is the number of characters seen in the left camera and $L_R$ is the number of characters seen in the right camera; the left camera takes characters starting from the left side of its image, and the right camera starts from the right side of its image. Performing the above operation on each line of characters yields all the characters on the photographed arc surface.
Another aspect of the invention provides the following examples: an on-cylinder OCR reading device comprising:
at least two cameras;
a light source;
a processor connected with the cameras and the light source, the processor comprising picture fusion and recognition software; and
a controller for implementing the above deep learning-based method for reading OCR on a cylinder;
the controller is connected to the cameras, the light source and the processor respectively; when the light source illuminates the OCR text on a cylindrical surface, the controller controls the cameras to collect images at preset times, obtaining several partial planar images of the OCR text, and controls the processor to run the picture fusion and recognition software, which fuses the partial planar images into a complete planar image of the OCR text and then processes and recognizes that image.
Further, the camera is a CCD image sensing camera or a CMOS image sensing camera.
Further, the device also comprises a rotatable cylindrical workpiece supporting mechanism.
Compared with the prior art, the invention has the following beneficial effects:
the invention fully uses the principle of the surface integral of the curved surface of the cylinder for reference, solves the problem that the arc surface of the cylinder cannot be imaged stably to read the OCR, and can be compatible with OCR characters with different colors and fonts; the invention is feasible, obtains stable effect, has short required time, does not need manual participation, and can lay a good foundation for product digital management and tracing; the method can be flexibly used for the transformation of the existing production line, the required hardware change is little, a complex industrial vision system is not required to be designed, and certain reference significance is designed for the scheme of the related problems.
Drawings
Fig. 1 is a flowchart illustrating a deep learning-based reading method for OCR on a cylinder according to the present invention.
Fig. 2 is a camera detection primitive diagram in embodiment 1 of the present invention.
Fig. 3 shows the reading result of a single camera in embodiment 1 of the present invention.
FIG. 4 is a diagram of the integrated final read effect in embodiment 1 of the present invention.
FIG. 5 is a schematic structural diagram of a reading apparatus for OCR on a cylinder according to the present invention.
Figure 6 is a top view of an OCR on cylinder reading apparatus of the present invention.
In the figure: 1. a camera; 2. a light source; 3. a processor; 4. and a controller.
Detailed Description
To make the technical means, creative features, objectives and effects of the invention easy to understand, the invention is further described below with reference to specific embodiments.
Example 1
Referring to fig. 1, the present invention provides a method for reading an OCR on a cylinder based on deep learning, including the following steps:
the following takes two cameras as an example:
Step one, selecting at least two cameras for photographing and image detection, and choosing each camera's installation angle according to the maximum OCR character size, so that the two cameras each cover 2/3 of the length of the cylindrical OCR text and the overlapping field of view of two adjacent cameras is not less than 1/3 of each camera's field of view;
Specifically, as seen in fig. 5, two cameras are used for photographing and image detection, and the two installation angles are chosen according to the maximum OCR size so that cameras Cam_L and Cam_R each cover 2/3 of the length of the cylindrical OCR text; the overlapping field of view of the two adjacent cameras is not less than 1/3 of each camera's field of view, so OCR text at different positions and of different sizes can be imaged clearly.
As shown in fig. 2, step two, setting appropriate camera parameters and photographing for acquisition;
Specifically, appropriate camera parameters are set and a large number of OCR pictures of different colors and different states are collected to form a data set;
Step three, to guarantee reading speed and stability, building a basic object-detection model with YOLOv3 and a back-end classification model with a ResNet34 model to obtain a deep learning model.
Specifically, DFAPI is selected as the basic deep learning development interface, with secondary development on the open-source framework PaddleX, chosen mainly for its performance and for the domestic deep learning ecosystem in China; the framework follows the Apache License, which is friendly to commercial use. To guarantee reading speed and stability, the basic object-detection model is YOLOv3, and recognition uses a ResNet34 back-end classification model.
Step four, labeling the pictures acquired by the cameras to generate corresponding sample label files;
Specifically, full-image data labeling of the two cameras' pictures is done with the self-developed software Dolphin Focus (DF): characteristic characters are selected and labeled in each picture, generating a corresponding sample label data file. Alternatively, a previously trained OCR model labels the existing picture data automatically and the labels are then verified manually; or software labeling is used with a set character size, and low-score characters that the software cannot label are labeled manually.
More specifically, the deep learning model maps the character at each position to a plurality of classes, each class carrying a similarity score between 0 and 1 for that character; characters whose software-assigned score is below 0.5 must be manually re-labeled.
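The 0.5 score threshold suggests a simple triage of software-generated labels. A minimal sketch follows; the (character, score) pair format is an assumption, while the threshold itself comes from the text.

```python
SCORE_THRESHOLD = 0.5  # characters scoring below this must be re-labeled manually

def split_labels(detections):
    """Split auto-generated labels into accepted ones and ones needing manual review.

    `detections` is an assumed format: a list of (character, score) pairs
    produced by the existing OCR model, with scores in [0, 1].
    """
    accepted, needs_manual = [], []
    for char, score in detections:
        (accepted if score >= SCORE_THRESHOLD else needs_manual).append((char, score))
    return accepted, needs_manual
```

The accepted labels go straight into the sample label file, while the low-score remainder is queued for manual labeling.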
Step five, training the deep learning model by using a sample marking file to obtain an OCR model;
specifically, the deep learning model is trained and tested through the marked data to obtain an OCR model, and the OCR model is used for performing OCR detection on an image to be detected containing characters to form a detection result.
Further specifically, the Deep learning model comprises an cbr convolution module, a crc convolution module and a Deep convolution module, the cbr convolution module is formed by mutually connecting convolution layer conv, batch normalization layer bn and Relu activation functions in series, the crc convolution module is formed by connecting convolution layer conv, Relu activation function and convolution layer conv in series, the Deep convolution module is formed by connecting two cbr convolution modules in series, and the cbr convolution module, the crc convolution module and the Deep convolution module are used for extracting features of the OCR picture to obtain the OCR detection model.
Meanwhile, the light source and camera can be controlled to be compatible with different product backgrounds, enabling OCR recognition of objects with different background colors and materials. The camera background can be adjusted by changing the camera exposure; the exposure parameters are bound to the product number, and different exposures are used for photographing and image detection under different detection schemes.
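Binding exposure parameters to product numbers can be as simple as a lookup table; the product numbers and exposure values below are purely illustrative assumptions, not values from the patent.

```python
# Hypothetical product-number -> exposure (microseconds) binding; values are illustrative.
EXPOSURE_BY_PRODUCT = {
    "P-1001": 800,  # e.g. a dark, low-reflectivity background needs a longer exposure
    "P-1002": 300,  # e.g. a bright background needs a shorter exposure
}

def exposure_for(product_number: str, default_us: int = 500) -> int:
    """Look up the exposure bound to a product number, falling back to a default."""
    return EXPOSURE_BY_PRODUCT.get(product_number, default_us)
```

Before each detection run, the controller would look up the current product number and push the returned exposure to the camera.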
As shown in figs. 3-4, step six, taking pictures with the cameras to acquire picture data;
Step seven, inputting the picture data collected in step six into the OCR model from step five for picture-data recognition and character-labeling processing to obtain character data;
Step eight, processing and integrating the character data obtained in step seven to realize OCR reading.
Further, step eight comprises three parts: removing repeated character data, longitudinally arranging the character data, and segmenting and integrating the character set according to the actual situation.
Specifically, in step eight, the repeated character data are removed: the characteristic positions and the maximum position range of the OCR text are detected in each single picture; the deep learning model maps the character at each position to a plurality of classes, each class carrying a similarity score between 0 and 1 for that character; and repeated character data are removed by keeping only the highest-scoring result at the same position.
specifically, in step eight, the character data are arranged longitudinally: by means of the coordinate y of the output point in the y directioniThe values are arranged in ascending order, and the character processing mode for the total number of M and the number of lines of N is that
The number of lines thereof
Figure 696795DEST_PATH_IMAGE004
Wherein n iskIs the number of lines where the character is located, M is the total number of characters, N is the number of lines of the character, i is the index of the character, the value range is 0 to M-1, y1Is the minimum value of the y coordinates of all characters, y2Is the maximum value of the y-coordinate of all characters, yjIs the average of the minimum and maximum values of the y-coordinate in all characters.
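The longitudinal arrangement can be sketched as follows, under the assumption that the line-number rule amounts to bucketing characters into N equal y-intervals between the minimum and maximum y-coordinate; the (character, y) input format is also an assumption.

```python
def assign_lines(chars, n_lines):
    """Group detected characters into text lines by their y-coordinates.

    `chars` is a list of (character, y) pairs. Characters are sorted by
    ascending y and bucketed into `n_lines` equal y-intervals between the
    minimum and maximum y-coordinate over all characters.
    """
    ys = [y for _, y in chars]
    y1, y2 = min(ys), max(ys)
    span = (y2 - y1) or 1.0  # avoid division by zero when all characters share one line
    lines = [[] for _ in range(n_lines)]
    for ch, y in sorted(chars, key=lambda t: t[1]):
        k = min(int((y - y1) / span * n_lines), n_lines - 1)  # 0-based line index
        lines[k].append(ch)
    return lines
```

Within each returned line, characters can then be sorted by x-coordinate before the segmentation-and-integration step.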
Specifically, in step eight, the character set is segmented and integrated according to the actual situation: for a single line of characters from either camera Cam_L or Cam_R, the characters over the entire field of view are taken so that the left camera contributes its leftmost $\lceil L/2 \rceil$ characters and the right camera contributes its rightmost $L - \lceil L/2 \rceil$ characters, where the effective total number of characters $L$ satisfies

$L \le L_L + L_R$,

where $L_L$ is the number of characters seen in the left camera and $L_R$ is the number of characters seen in the right camera; the left camera takes characters starting from the left side of its image, and the right camera starts from the right side of its image. Performing the above operation on each line of characters yields all the characters on the photographed arc surface.
An illustrative example:
There are 2 lines of characters; the first line is "12345678" and the second line is "abcdefgh". There are 2 cameras: in the field of view of the 1st camera, 2 lines of characters are visible, line 1 reading "12345" and line 2 reading "abcde"; in the field of view of the 2nd camera, 2 lines are visible, line 1 reading "45678" and line 2 reading "defgh".
First, the repeated characters are removed. The same character may produce multiple recognition results: for example, the character "6" may also be recognized as "8", but if the score for "6" is 0.8 and the score for "8" is 0.1, the final recognition result of each character is determined by the maximum score.
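The max-score de-duplication can be sketched as follows; the (position, character, score) candidate format is an assumption, while the keep-the-highest-score rule comes from the text.

```python
def dedup_by_score(candidates):
    """Keep, for each character position, only the highest-scoring candidate.

    `candidates` is a list of (position, character, score) tuples; the same
    position may appear several times with different candidate characters.
    """
    best = {}
    for pos, char, score in candidates:
        if pos not in best or score > best[pos][1]:
            best[pos] = (char, score)
    return {pos: char for pos, (char, _) in best.items()}

# The "6"-versus-"8" case from the example above:
result = dedup_by_score([(0, "8", 0.1), (0, "6", 0.8)])  # keeps "6" at position 0
```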
Second, the longitudinal arrangement: after the character content is recognized it is out of order in the vertical direction and must be arranged longitudinally. The number of the line where each character lies is calculated by the formula, giving line 1 or line 2: the 1st camera's line-1 content is "12345" and line-2 content is "abcde"; the 2nd camera's line-1 content is "45678" and line-2 content is "defgh".
Third, segmentation and integration: it is known that line 1 has 8 characters and line 2 has 8 characters, so in the 1st camera line 1 takes the left 4 characters "1234" and line 2 takes the left 4 characters "abcd"; in the 2nd camera line 1 takes the right 4 characters "5678" and line 2 takes the right 4 characters "efgh". Integration finally yields the line-1 character content "12345678" and the line-2 character content "abcdefgh".
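The segmentation and integration of this example can be sketched as follows; taking the left camera's leftmost ceil(total/2) characters and the right camera's remaining rightmost characters is one reading of the split rule, consistent with the worked example above.

```python
import math

def merge_line(left_chars: str, right_chars: str, total: int) -> str:
    """Merge one line of characters seen by the left and right cameras.

    The left camera contributes its leftmost ceil(total/2) characters and the
    right camera contributes the remaining rightmost characters.
    """
    n_left = math.ceil(total / 2)
    n_right = total - n_left
    return left_chars[:n_left] + (right_chars[-n_right:] if n_right else "")

# Reproducing the example: each camera sees 5 of the 8 characters per line.
line1 = merge_line("12345", "45678", 8)
line2 = merge_line("abcde", "defgh", 8)
```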
Example 2
For the detailed structure of the invention, refer to fig. 5. A reading apparatus for OCR on a cylinder comprises:
at least two cameras 1;
Light source 2: specifically, the light source is adjustable. To accommodate OCR recognition objects of different background colors and materials, this scheme also adjusts the camera background by adjusting the camera exposure; the camera exposure parameters are bound to the product number, and different exposures are used for photographing and image detection under different detection schemes.
A processor connected with the camera 1 and the light source 2, the processor comprising picture fusion and recognition software; and
a controller for implementing the above deep learning-based method for reading OCR on a cylinder;
the controller is connected to the cameras, the light source and the processor respectively; when the light source illuminates the OCR text on a cylindrical surface, the controller controls the cameras to collect images at preset times to obtain several partial planar images of the OCR text, and controls the processor to run the picture fusion and recognition software to fuse the partial planar images, generating a complete planar image of the OCR text, which is then processed and recognized.
Further, the camera 1 is a CCD image sensing camera or a CMOS image sensing camera.
Specifically, the camera 1 may use a CCD or CMOS image sensor, preferably with a resolution of not less than 300,000 pixels, and can output a digital image signal so that the microprocessor can run the image fusion and recognition software. As a newer type of photoelectric converter, the CCD camera is widely used in video capture, image acquisition, scanners, and industrial measurement, and offers small size, light weight, high resolution, high sensitivity, wide dynamic range, low power consumption, good shock and impact resistance, and high reliability. The CMOS camera offers advantages such as random window readout, radiation resistance, and high reliability.
Further, a rotatable cylindrical workpiece support mechanism 3 is included.
Specifically, as shown in fig. 6, the cylindrical workpiece support mechanism 3 may be formed by a support mounting frame having a rotary driving member.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
While there have been shown and described what are at present considered the fundamental principles and essential features of the invention and its advantages, it will be apparent to those skilled in the art that the invention is not limited to the details of the foregoing exemplary embodiments, but is capable of other specific forms without departing from the spirit or essential characteristics thereof. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Furthermore, it should be understood that although the present description refers to embodiments, not every embodiment contains only a single technical solution; this manner of description is adopted for clarity only. Those skilled in the art should take the description as a whole, and the embodiments may be combined as appropriate to form other embodiments understandable to those skilled in the art.

Claims (9)

1. A deep learning-based reading method for OCR on a cylinder, characterized by comprising the following steps:
step one, selecting at least two cameras for image acquisition, and choosing a suitable camera installation angle according to the size of the OCR (optical character recognition) characters, so that the overlapping view area of two adjacent cameras is not less than 1/3 of each camera's view area;
step two, setting the camera parameters and acquiring images;
step three, building a target detection base model with YOLOv3 and a back-end classification model with ResNet34 to obtain a deep learning model;
step four, labeling the picture data samples collected by the cameras to generate corresponding sample label files;
step five, training the deep learning model with the sample label files to obtain an OCR model;
step six, taking pictures with the cameras to acquire picture data;
step seven, inputting the picture data collected in step six into the OCR model of step five for picture recognition and character labeling to obtain character data;
step eight, processing and integrating the character data obtained in step seven to realize OCR reading.
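The 1/3-overlap requirement of step one can be checked numerically before mounting the cameras. A minimal sketch of such a check, approximating the fields of view as 1-D intervals along the cylinder arc; the field-of-view width and camera spacing are hypothetical parameters for illustration, not values from the claim:

```python
def overlap_fraction(fov_width: float, camera_spacing: float) -> float:
    """Fraction of one camera's field of view covered by the overlap
    with an adjacent camera (1-D approximation along the arc)."""
    overlap = max(0.0, fov_width - camera_spacing)
    return overlap / fov_width

def spacing_is_valid(fov_width: float, camera_spacing: float) -> bool:
    # Step one requires the overlap to be at least 1/3 of each camera's view.
    return overlap_fraction(fov_width, camera_spacing) >= 1.0 / 3.0
```

For example, cameras with a 90 mm view spaced 60 mm apart overlap by exactly one third and pass; spaced 70 mm apart they overlap by only 2/9 and fail.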
2. The deep learning-based reading method for OCR on a cylinder as claimed in claim 1, wherein: in the fifth step, the deep learning model comprises a cbr convolution module, a crc convolution module and a Deep convolution module; the cbr convolution module is formed by connecting a convolution layer (conv), a batch normalization layer (bn) and a ReLU activation function in series; the crc convolution module is formed by connecting a convolution layer, a ReLU activation function and a second convolution layer in series; the Deep convolution module is formed by connecting two cbr convolution modules in series; the cbr, crc and Deep convolution modules are used to extract features of the OCR picture to form the OCR detection model.
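The three modules named in claim 2 can be sketched in PyTorch as below. The kernel sizes, padding, and channel counts are illustrative assumptions; the claim only specifies the ordering of the layers in each module:

```python
import torch
import torch.nn as nn

def cbr(in_ch: int, out_ch: int) -> nn.Sequential:
    # cbr module: convolution -> batch normalization -> ReLU, in series
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def crc(in_ch: int, out_ch: int) -> nn.Sequential:
    # crc module: convolution -> ReLU -> convolution, in series
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
    )

def deep(in_ch: int, out_ch: int) -> nn.Sequential:
    # Deep module: two cbr modules connected in series
    return nn.Sequential(cbr(in_ch, out_ch), cbr(out_ch, out_ch))
```

With 3x3 convolutions and padding 1, each module preserves spatial resolution and changes only the channel count, which keeps the modules freely stackable for feature extraction.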
3. The deep learning-based reading method for OCR on a cylinder as claimed in claim 1, wherein step eight comprises: removing repeated character data, arranging the character data by line, and segmenting and integrating the character set.
4. The deep learning-based on-cylinder OCR reading method as claimed in claim 3, wherein:
step eight removes repeated character data as follows: the characteristic position and maximum position range of the OCR characters are detected from a single picture; the deep learning model maps the character data at each position to a plurality of classes, each class corresponding to a similarity score between the character and that class, with values ranging from 0 to 1; repeated character data are removed by keeping only the highest-scoring result at each position.
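The duplicate-removal rule of claim 4 (keep, per detected position, only the highest-scoring class) can be sketched as follows; the `(position, char, score)` tuple layout is an assumption for illustration, not a format specified by the claim:

```python
def deduplicate(detections):
    """detections: iterable of (position, char, score), score in [0, 1].
    Keeps, for each position, the character with the highest score."""
    best = {}  # position -> (char, score)
    for pos, char, score in detections:
        if pos not in best or score > best[pos][1]:
            best[pos] = (char, score)
    # Drop the scores, returning position -> winning character.
    return {pos: char for pos, (char, _score) in best.items()}

# Two detections at position 0 compete; 'A' (0.9) beats '4' (0.4).
result = deduplicate([(0, 'A', 0.9), (0, '4', 0.4), (1, 'B', 0.8)])
```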
5. The deep learning based on-cylinder OCR reading method as claimed in claim 3, wherein:
in the eighth step, the character data are arranged by line: the y-direction coordinates y_i of the output points are sorted in ascending order, and for a total of M characters in N lines, each character's line number n_k is given by

[formula image: DEST_PATH_IMAGE001]

where n_k is the line number of the character, M is the total number of characters, N is the number of lines, i is the character index ranging from 0 to M-1, y_1 is the minimum y-coordinate over all characters, y_2 is the maximum y-coordinate over all characters, and y_j is the mean of the minimum and maximum y-coordinates over all characters.
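Since claim 5's formula is published only as an image, the exact line-number rule is not recoverable from this text. The following sketch uses a plausible substitute rule as an assumption: each character is assigned to one of N equal-height bands between the minimum and maximum y-coordinates:

```python
def assign_lines(chars, n_lines):
    """chars: list of (character, y_coordinate) pairs.
    Returns the characters grouped into n_lines rows by y-coordinate.
    The equal-band rule below is an assumption; the claim's exact
    formula is published only as an image."""
    ys = [y for _, y in chars]
    y1, y2 = min(ys), max(ys)           # y_1, y_2 from the claim
    band = (y2 - y1) / n_lines or 1.0   # avoid division by zero
    rows = [[] for _ in range(n_lines)]
    for ch, y in chars:
        k = min(int((y - y1) / band), n_lines - 1)
        rows[k].append(ch)
    return rows
```

Characters near y_1 land in the first row and characters near y_2 in the last, matching the ascending-order arrangement described in the claim.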
6. The deep learning based on-cylinder OCR reading method as claimed in claim 3, wherein:
in the eighth step, the character set is segmented and integrated: for a single line of characters seen by either of the cameras Cam_L and Cam_R, the characters over the entire field of view are taken according to

[formula image: DEST_PATH_IMAGE002]

and the effective total number of characters satisfies

[formula image: DEST_PATH_IMAGE003]

where L_L is the number of characters in the left camera and L_R is the number of characters in the right camera; the left camera takes characters starting from the left side of its image and the right camera starting from the right side of its image; performing this operation on each line of characters yields all characters on the photographed arc surface.
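Claim 6's integration step (left camera contributes characters from its image's left edge, right camera from its right edge, up to the effective total) can be sketched as below. Because the claimed formulas are published only as images, the exact split between the two cameras is an assumption here; the sketch takes as many characters as possible from the left camera and the remainder from the right end of the right camera's reading:

```python
def merge_line(left_chars, right_chars, total):
    """Merge one text line seen by two overlapping cameras.
    left_chars / right_chars: characters read left-to-right by each camera.
    total: effective number of distinct characters on the arc
    (assumed known; the claim's formula for it is an image).
    """
    n_left = min(len(left_chars), total)   # taken from the left edge
    n_right = total - n_left               # remainder, from the right edge
    right_part = right_chars[len(right_chars) - n_right:] if n_right else []
    return list(left_chars[:n_left]) + list(right_part)

# Overlap 'CD' is seen by both cameras; the merged line has 6 characters.
merged = merge_line('ABCD', 'CDEF', 6)
```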
7. An on-cylinder OCR reading apparatus, characterized by comprising:
at least two cameras;
a light source;
a processor connected with the cameras and the light source, the processor comprising picture fusion and recognition software; and
a controller for implementing a deep learning-based on-cylinder OCR reading method as claimed in any one of claims 1-5;
the controller is respectively connected with the cameras, the light source and the processor; when the light source illuminates the OCR characters on a cylindrical surface, the controller controls the cameras to collect images at preset times to obtain a plurality of partial plane images of the OCR characters, and controls the processor to run the picture fusion and recognition software, which fuses the partial plane images into a complete plane image of the OCR characters and then processes and recognizes that complete image.
8. An on-cylinder OCR reading apparatus as claimed in claim 7 wherein the camera is a CCD image sensing camera or a CMOS image sensing camera.
9. An on-cylinder OCR reading apparatus according to claim 7 further comprising a rotatable cylinder workpiece support mechanism.
CN202110948821.8A 2021-08-18 2021-08-18 Deep learning-based reading method and device for OCR on cylinder Pending CN113392848A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110948821.8A CN113392848A (en) 2021-08-18 2021-08-18 Deep learning-based reading method and device for OCR on cylinder

Publications (1)

Publication Number Publication Date
CN113392848A true CN113392848A (en) 2021-09-14

Family

ID=77622921

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110948821.8A Pending CN113392848A (en) 2021-08-18 2021-08-18 Deep learning-based reading method and device for OCR on cylinder

Country Status (1)

Country Link
CN (1) CN113392848A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105488507A (en) * 2016-01-22 2016-04-13 吉林大学 Cylindrical surface character recognition system and method
CN110175603A (en) * 2019-04-01 2019-08-27 佛山缔乐视觉科技有限公司 A kind of engraving character recognition methods, system and storage medium
CN110569341A (en) * 2019-07-25 2019-12-13 深圳壹账通智能科技有限公司 method and device for configuring chat robot, computer equipment and storage medium
CN112150354A (en) * 2019-06-26 2020-12-29 四川大学 Single image super-resolution method combining contour enhancement and denoising statistical prior
CN112528998A (en) * 2021-02-18 2021-03-19 成都新希望金融信息有限公司 Certificate image processing method and device, electronic equipment and readable storage medium
CN112699860A (en) * 2021-03-24 2021-04-23 成都新希望金融信息有限公司 Method for automatically extracting and sorting effective information in personal tax APP operation video


Similar Documents

Publication Publication Date Title
CN111855664B (en) Adjustable three-dimensional tunnel defect detection system
CN109100741B (en) Target detection method based on 3D laser radar and image data
CN109993086B (en) Face detection method, device and system and terminal equipment
EP0669593B1 (en) Two-dimensional code recognition method
CN100380393C (en) Precise location method of QR code image symbol region at complex background
CN110400315B (en) Defect detection method, device and system
WO2022121283A1 (en) Vehicle key point information detection and vehicle control
CN110598743A (en) Target object labeling method and device
JP6305171B2 (en) How to detect objects in a scene
CN110619279B (en) Road traffic sign instance segmentation method based on tracking
JP2014511772A (en) Method to invalidate sensor measurement value after picking motion in robot system
CN110766758A (en) Calibration method, device, system and storage device
CN108961262B (en) Bar code positioning method in complex scene
CN111724445A (en) Method and system for identifying large-view small-size identification code
CN113569679B (en) Method, device and system for measuring elongation at break
Li et al. Face detection based on depth information using HOG-LBP
CN106645045A (en) Bi-directional scanning imaging method based on TDI-CCD (time delay integration-charge coupled device) in fluorescent optical micro-imaging
CN117314986A (en) Unmanned aerial vehicle cross-mode power distribution equipment inspection image registration method based on semantic segmentation
CN113392848A (en) Deep learning-based reading method and device for OCR on cylinder
CN112329893A (en) Data-driven heterogeneous multi-target intelligent detection method and system
CN116883483A (en) Fish body measuring method based on laser camera system
CN116012712A (en) Object general feature-based target detection method, device, equipment and medium
CN115760860A (en) Multi-type workpiece dimension visual measurement method based on DXF file import
CN116125489A (en) Indoor object three-dimensional detection method, computer equipment and storage medium
CN113971799A (en) Vehicle nameplate information position detection method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210914