CN114495106A - A deep learning MOCR method applied to DFB laser chips - Google Patents
A deep learning MOCR method applied to DFB laser chips
- Publication number
- CN114495106A (application CN202210401938.9A)
- Authority
- CN
- China
- Prior art keywords
- character
- dfb laser
- recognition
- feature
- image
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Character Discrimination (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a deep learning MOCR method applied to DFB laser chips, belonging to the field of image processing. The method comprises: Step 1: acquire an image of the DFB chip online; Step 2: preprocess the image; Step 3: correct the orientation of the characters on the chip with a character region detection network; Step 4: recognize the character content on the chip with a character recognition network; Step 5: judge whether the recognized character content is correct; if correct, store it in the associated database; if not, reject the chip or perform manual re-inspection. The invention can recognize the character information on DFB laser chips efficiently and accurately without changing the original production process flow. Compared with traditional recognition methods, the recognition accuracy is greatly improved and can reach 98.8%.
Description
Technical Field
The invention relates to the field of microscopic visual inspection of optoelectronic semiconductors, and in particular to a deep learning MOCR method applied to DFB laser chips.
Background Art
Distributed feedback (DFB) laser diodes are widely used in optical fiber communication systems because of their high side-mode suppression ratio and ultra-narrow spectral width. As communication capacity and bandwidth keep growing, laser chips are becoming smaller and smaller so that they can be integrated and packaged into optical communication devices; with the rapid development of 5G communication, FTTR and data centers, the required communication rates are also rising. As the most critical core chip in optical fiber communication systems, the reliability and stability of the performance and quality of DFB lasers will become increasingly important.
The continuous development of intelligent manufacturing and artificial intelligence has brought powerful capabilities to the informatization, intelligence and traceability of manufacturing. Because of their importance in both technology and cost, DFB laser chips require quality traceability and process control throughout the manufacturing of optical devices. The key parameters of each DFB laser chip, such as optical power, extinction ratio, slope efficiency, center wavelength, crossing point and side-mode suppression ratio, need to be associated one-to-one with that chip. Since the character string on a DFB laser chip is the chip's unique identifier, recognizing it accurately and efficiently is particularly important.
However, the characters on a DFB laser chip differ considerably from ordinary printed characters. First, they are extremely small, roughly less than 100 um (the character region occupies about 10% of the chip area), so microscopic imaging is required to image them clearly. Second, there is no contextual semantics between the characters, so semantic recognition methods cannot be used. Third, because the chip is tiny, the characters are not printed in a very standard font, which also makes recognition harder. In addition, interference from the industrial environment and from dirt, defects and dust on the chip further increases the difficulty. All of these issues make character recognition on DFB laser chips unique and difficult.
With traditional character recognition methods the recognition accuracy is only about 60%, so a dedicated engineer is needed at every recognition station to judge unrecognized characters in real time. This cannot meet the requirements of a real-time production line for optical communication devices. It wastes manpower and efficiency, and because operators are prone to fatigue and mood swings, it also introduces loopholes and risks in quality control.
Summary of the Invention
The purpose of the present invention is to overcome the deficiencies of the prior art and to provide a deep learning method for automatic character region localization and character recognition of the microscopic characters on DFB laser chips during optical device packaging. The method can recognize the character information on DFB laser chips efficiently and accurately without changing the original production process flow. Compared with traditional recognition methods, the recognition accuracy is greatly improved and can reach 98.8%.
The object of the invention is achieved through the following technical solution:
A deep learning MOCR method applied to DFB laser chips, comprising the following steps:
Step 1: acquire an image of the DFB laser chip online;
Step 2: preprocess the image;
Step 3: correct the orientation of the characters on the chip with a character region detection network;
Step 3.1: downsample the input image four times in succession; sum the result of the fourth downsampling with the result of the third downsampling to obtain feature 1; sum feature 1 with the result of the second downsampling to obtain feature 2; sum feature 2 with the result of the first downsampling to obtain feature 3;
Step 3.2: fuse the result of the fourth downsampling with feature 1, feature 2 and feature 3 to obtain a fused feature;
Step 3.3: apply convolution and deconvolution to the fused feature to obtain a probability map and a threshold map;
Step 3.4: apply differentiable binarization to the probability map and the threshold map to obtain an approximate binarized image (steps 3.1 to 3.4 are illustrated in the sketch after step 5);
Step 3.5: output recognition boxes from the connected regions in the approximate binarized image, and rotate each recognition box to horizontal according to its aspect ratio;
Step 3.6: use a text direction classifier to determine whether the text is reversed; if the text direction is reversed, rotate the text by 180 degrees and then feed it into the character recognition network for recognition; if not, feed it into the character recognition network directly;
Step 4: recognize the character content on the chip with the character recognition network;
Step 5: judge whether the recognized character content is correct; if correct, store it in the associated database; if not, reject the chip or perform manual re-inspection.
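Steps 3.1 to 3.4 describe a DBNet-style detection head: top-down summation of four downsampled feature maps, fusion into a single feature, prediction of a probability map and a threshold map, and differentiable binarization. The PyTorch sketch below illustrates only that general structure; the 1x1 lateral convolutions, channel counts, upsampling method and the steepness factor k = 50 are illustrative assumptions and are not specified in the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DBHead(nn.Module):
    """Minimal DBNet-style head: fuse four backbone scales, predict probability
    and threshold maps, and combine them by differentiable binarization."""

    def __init__(self, in_channels=(64, 128, 256, 512), fuse_channels=64, k=50):
        super().__init__()
        self.k = k  # steepness of the differentiable binarization
        # 1x1 convs bringing every scale to the same channel count (assumption)
        self.lateral = nn.ModuleList(
            nn.Conv2d(c, fuse_channels, 1) for c in in_channels)
        # prediction branch: 3x3 conv followed by two stride-2 deconvolutions
        def pred_branch():
            return nn.Sequential(
                nn.Conv2d(4 * fuse_channels, fuse_channels, 3, padding=1),
                nn.BatchNorm2d(fuse_channels), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(fuse_channels, fuse_channels, 2, stride=2),
                nn.BatchNorm2d(fuse_channels), nn.ReLU(inplace=True),
                nn.ConvTranspose2d(fuse_channels, 1, 2, stride=2),
                nn.Sigmoid())
        self.prob_head = pred_branch()
        self.thresh_head = pred_branch()

    def forward(self, c2, c3, c4, c5):
        # top-down summation: feature 1 = c5 + c4, feature 2 = feature 1 + c3, ...
        p5 = self.lateral[3](c5)
        p4 = self.lateral[2](c4) + F.interpolate(p5, scale_factor=2)   # feature 1
        p3 = self.lateral[1](c3) + F.interpolate(p4, scale_factor=2)   # feature 2
        p2 = self.lateral[0](c2) + F.interpolate(p3, scale_factor=2)   # feature 3
        # feature fusion: upsample everything to the largest scale and concatenate
        size = p2.shape[2:]
        fused = torch.cat([
            F.interpolate(p5, size=size), F.interpolate(p4, size=size),
            F.interpolate(p3, size=size), p2], dim=1)
        prob = self.prob_head(fused)      # probability map
        thresh = self.thresh_head(fused)  # threshold map
        # differentiable binarization: B = 1 / (1 + exp(-k * (P - T)))
        approx_binary = torch.sigmoid(self.k * (prob - thresh))
        return prob, thresh, approx_binary
```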
Further, the dataset used to train the character region detection network is formed by manually annotating the original images and applying data augmentation.
Further, the character recognition network is a character-based fully convolutional neural network (CRNN), used to recognize the characters contained in each character box.
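The patent names the recognition network as a character-based CRNN but gives no layer-level detail. The sketch below is a generic CRNN-style recognizer for grayscale character crops; the channel sizes, the 37-class alphabet and the bidirectional LSTM stage (classic CRNN pairs convolutional features with a recurrent layer and CTC decoding) are assumptions made for illustration only.

```python
import torch.nn as nn

class CRNN(nn.Module):
    """Minimal CRNN-style recognizer: a convolutional feature extractor, a
    bidirectional LSTM over the width axis, and a per-timestep classifier
    trained with CTC loss."""

    def __init__(self, num_classes=37):  # e.g. 0-9, A-Z plus a CTC blank
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),                      # H/2, W/2
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2, 2),                      # H/4, W/4
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d((2, 1)),                    # halve height, keep width
            nn.Conv2d(256, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((1, None)))         # collapse height to 1
        self.rnn = nn.LSTM(256, 128, num_layers=2,
                           bidirectional=True, batch_first=True)
        self.fc = nn.Linear(2 * 128, num_classes)

    def forward(self, x):                        # x: (N, 1, H, W) grayscale crop
        feat = self.cnn(x)                       # (N, 256, 1, W')
        feat = feat.squeeze(2).permute(0, 2, 1)  # (N, W', 256): one step per column
        seq, _ = self.rnn(feat)
        return self.fc(seq)                      # (N, W', num_classes) logits for CTC

# Training would pair this with nn.CTCLoss; greedy decoding takes the argmax
# per column and collapses repeated symbols and blanks.
```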
Further, the dataset used to train the character recognition network is formed by manually annotating the original images and applying data augmentation.
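The augmentation operations themselves are not specified in the patent. The sketch below shows one plausible augmentation pass for character crops; every range (contrast, brightness, rotation angle, noise level) is an arbitrary choice made for illustration.

```python
import random
import cv2
import numpy as np

def augment_character_crop(img: np.ndarray) -> np.ndarray:
    """Apply a random brightness/contrast shift, a small rotation, and Gaussian
    noise to a grayscale character crop. All ranges are illustrative assumptions."""
    out = img.astype(np.float32)

    # brightness / contrast jitter
    alpha = random.uniform(0.8, 1.2)          # contrast
    beta = random.uniform(-20, 20)            # brightness
    out = np.clip(alpha * out + beta, 0, 255)

    # small in-plane rotation around the crop centre
    angle = random.uniform(-5, 5)
    h, w = out.shape[:2]
    m = cv2.getRotationMatrix2D((w / 2, h / 2), angle, 1.0)
    out = cv2.warpAffine(out, m, (w, h), borderMode=cv2.BORDER_REPLICATE)

    # mild Gaussian noise to mimic sensor noise and surface dirt
    out = out + np.random.normal(0, 5, out.shape)
    return np.clip(out, 0, 255).astype(np.uint8)
```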
Further, the equipment used in step 1 to acquire images of the DFB laser chip online comprises a light source, a CMOS camera and a telecentric lens, and an autofocus technique based on infrared ranging is used to capture the surface character images of the DFB laser chip.
Beneficial effects of the invention: the invention provides a deep learning method for automatic character region localization and character recognition of the microscopic characters on DFB laser chips during optical device packaging. It can recognize the character information on DFB laser chips efficiently and accurately without changing the original production process flow. Compared with traditional recognition methods, the recognition accuracy is greatly improved and can reach 98.8%.
Brief Description of the Drawings
Fig. 1 is a flow chart of character region localization and recognition in the image.
Fig. 2 is an architecture diagram of the character region detection network.
Fig. 3 is an architecture diagram of the character recognition network.
Detailed Description of the Embodiments
It should be understood that the specific embodiments described herein are only used to explain the present invention and are not intended to limit it.
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
In this embodiment, as shown in Fig. 1, a deep learning MOCR method applied to DFB laser chips comprises the following steps:
Step 1: acquire an image of the DFB laser chip online;
Step 2: preprocess the image;
Step 3: correct the orientation of the characters on the chip with a character region detection network;
Step 3.1: downsample the input image four times in succession; sum the result of the fourth downsampling with the result of the third downsampling to obtain feature 1; sum feature 1 with the result of the second downsampling to obtain feature 2; sum feature 2 with the result of the first downsampling to obtain feature 3;
Step 3.2: fuse the result of the fourth downsampling with feature 1, feature 2 and feature 3 to obtain a fused feature;
Step 3.3: apply convolution and deconvolution to the fused feature to obtain a probability map and a threshold map;
Step 3.4: apply differentiable binarization to the probability map and the threshold map to obtain an approximate binarized image;
Step 3.5: output recognition boxes from the connected regions in the approximate binarized image, and rotate each recognition box to horizontal according to its aspect ratio;
Step 3.6: use a text direction classifier to determine whether the text is reversed; if the text direction is reversed, rotate the text by 180 degrees and then feed it into the character recognition network for recognition; if not, feed it into the character recognition network directly;
Step 4: recognize the character content on the chip with the character recognition network;
Step 5: judge whether the recognized character content is correct; if correct, store it in the associated database; if not, reject the chip or perform manual re-inspection (a minimal sketch of this check follows).
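Step 5 is a plain accept/reject decision. The sketch below makes it concrete with a hypothetical chip-ID pattern and storage hooks; neither the regular expression nor the store/flag_for_recheck callables appear in the patent.

```python
import re
from typing import Callable

# Hypothetical chip-ID pattern: the real character format is not disclosed.
CHIP_ID_PATTERN = re.compile(r"^[A-Z0-9]{6,12}$")

def verify_and_route(recognized: str,
                     store: Callable[[str], None],
                     flag_for_recheck: Callable[[str], None]) -> bool:
    """Return True and store the ID when the recognized string looks valid;
    otherwise hand the chip over for rejection or manual re-inspection."""
    text = recognized.strip().upper()
    if CHIP_ID_PATTERN.fullmatch(text):
        store(text)               # write to the associated traceability database
        return True
    flag_for_recheck(text)        # reject the chip or queue manual re-check
    return False
```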
Here, the character region detection network is a DBNet deep learning network, optimized according to the character features of DFB laser chips. Specifically, the optimization is based on the latest DBNet model, which detects the character regions in the image and converts the text boxes to horizontal. A text direction classifier is then used to determine whether the text is reversed; if it is, the text must be rotated by 180 degrees before it can be recognized. Finally, the character recognition stage uses a fully convolutional, character-based recognition model to process the detected character boxes and recognize the characters they contain.
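As a concrete illustration of "converting the text box to horizontal and rotating reversed text by 180 degrees", the sketch below warps a detected quadrilateral to an axis-aligned crop and flips it when a direction classifier reports it as upside down. The is_reversed callable is a placeholder for that classifier and is not part of the patent.

```python
import cv2
import numpy as np

def rectify_text_box(image: np.ndarray, quad: np.ndarray, is_reversed) -> np.ndarray:
    """Warp a detected quadrilateral (4x2 points, clockwise from top-left) to a
    horizontal crop; if the direction classifier flags it as upside down,
    rotate the crop by 180 degrees before recognition."""
    # target width/height taken from the quad's edge lengths
    w = int(max(np.linalg.norm(quad[0] - quad[1]),
                np.linalg.norm(quad[2] - quad[3])))
    h = int(max(np.linalg.norm(quad[1] - quad[2]),
                np.linalg.norm(quad[3] - quad[0])))
    dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]],
                   dtype=np.float32)
    m = cv2.getPerspectiveTransform(quad.astype(np.float32), dst)
    crop = cv2.warpPerspective(image, m, (w, h))
    if h > w:                                  # long side should end up horizontal
        crop = cv2.rotate(crop, cv2.ROTATE_90_CLOCKWISE)
    if is_reversed(crop):                      # hypothetical 0/180 classifier
        crop = cv2.rotate(crop, cv2.ROTATE_180)
    return crop
```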
The dataset used to train the DBNet deep learning network is formed by manually annotating the original images and applying data augmentation.
The character recognition network is a character-based fully convolutional neural network (CRNN), used to recognize the characters contained in each character box.
The dataset used to train the character recognition network is formed by manually annotating the original images and applying data augmentation.
The MOCR method of the invention is divided into two stages: text detection and text recognition. The detection method is based on the latest DBNet deep learning model, optimized according to the character features of DFB laser chips. The optimized model detects and frames the character regions in the image and converts the text boxes to horizontal; the character region localization network model is shown in Fig. 2. A text direction classifier is then used to determine whether the text is reversed. In the figure, "1/2", "1/4" and "1/32" denote the scale relative to the input image, and "pred" consists of two deconvolution operators with stride 2 and one 3x3 convolution operator.
If the direction is reversed, the text must be rotated by 180 degrees before it can be recognized. Finally, the character recognition stage uses a fully convolutional, character-based recognition model to process the detected character boxes and recognize the characters they contain. The recognition network is shown in Fig. 3.
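The text direction classifier itself is named but not detailed in the patent. The sketch below is a generic two-class CNN standing in for it, with layer sizes chosen only to make the 0/180-degree decision concrete.

```python
import torch.nn as nn

class DirectionClassifier(nn.Module):
    """Tiny binary classifier predicting whether a rectified text crop is
    upright (class 0) or rotated by 180 degrees (class 1)."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(64, 2)

    def forward(self, x):                 # x: (N, 1, H, W) rectified crop
        f = self.features(x).flatten(1)   # (N, 64)
        return self.classifier(f)         # logits for {0 deg, 180 deg}

# Usage: if the argmax of the logits is 1, rotate the crop by 180 degrees
# before passing it to the recognition network.
```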
In this embodiment, the equipment used to acquire images of the DFB laser chip online comprises a light source, a CMOS camera and a telecentric lens, and an autofocus technique based on infrared ranging is used to capture the surface character images of the DFB laser chip. The CMOS camera is an area-array CMOS image sensor (MER-502-79U3C) produced by Daheng (resolution: 2448 x 2048, frame rate: 79 fps, pixel size: 3.45 um, C-mount optical interface, USB 3.0 data interface); the telecentric lens is a Canrui XF-T6X65D (magnification: 6.0X, object field: Φ1.3 mm, depth of field: 0.06, telecentricity: <0.02°).
The characters on the surface of the DFB laser chip measure 70 um x 80 um. Such characters, whose largest dimension is less than 100 microns, are referred to as micro-characters because they are almost invisible to the naked eye. At the same time, an autofocus technique based on infrared ranging is adopted to guarantee the real-time performance and efficiency of the production line and to minimize the blurring of defocused images caused by environmental and batch errors.
In this embodiment, the original image is 2448 x 2048 pixels, of which the character region occupies only a small part; therefore, a two-stage (detection and recognition) deep learning network training scheme is adopted. The original images are then annotated to obtain the images and corresponding labels required by the detection and recognition networks.
First, the character region in the original image is marked with a rectangular box, and the four corner points are recorded clockwise from the top-left corner. The coordinates of the rectangular box in the original image then serve as the labels required by the detection network. Next, the rectangular region is cropped from the original image and its text content is recorded to obtain the images and labels required by the recognition network. Finally, the text content of the recognition labels is replaced with 0 degrees and 180 degrees to obtain the labels required by the direction classifier.
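A minimal sketch of the labelling flow just described: only the logic (record four clockwise corner points, crop the region for the recognizer, derive 0-degree and 180-degree direction labels) comes from the text, while the JSON layout, file names and field names are assumptions made for illustration.

```python
import json
import cv2
import numpy as np

def build_labels(image_path: str, corners: list, text: str, out_prefix: str):
    """corners: four (x, y) points, clockwise from the top-left, marked by hand.
    Produces a detection label (polygon + text), a cropped image with its
    transcription for the recognizer, and 0/180-degree direction labels."""
    image = cv2.imread(image_path)
    quad = np.array(corners, dtype=np.float32)

    # detection label: the polygon in original-image coordinates
    det_label = {"image": image_path,
                 "polygon": quad.tolist(),
                 "transcription": text}

    # recognition label: crop the bounding rectangle and store its text
    x, y, w, h = cv2.boundingRect(quad.astype(np.int32))
    crop = image[y:y + h, x:x + w]
    cv2.imwrite(out_prefix + "_crop.png", crop)
    rec_label = {"image": out_prefix + "_crop.png", "transcription": text}

    # direction labels: the upright crop is class 0, its 180-degree flip is class 180
    cv2.imwrite(out_prefix + "_rot180.png", cv2.rotate(crop, cv2.ROTATE_180))
    dir_labels = [{"image": out_prefix + "_crop.png", "angle": 0},
                  {"image": out_prefix + "_rot180.png", "angle": 180}]

    with open(out_prefix + "_labels.json", "w") as f:
        json.dump({"detection": det_label, "recognition": rec_label,
                   "direction": dir_labels}, f, ensure_ascii=False, indent=2)
```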
The above describes the specific MOCR method for character recognition on DFB laser chips. The method has been successfully applied to a packaging and inspection production line for DFB optical devices. A large amount of data from actual use shows that the method has good robustness and recognition accuracy; the current recognition accuracy is about 98.8%. The first implementation and successful application of MOCR plays a key role in the intelligent manufacturing and quality traceability of DFB laser chips and devices in the optical communication field, and is also of great significance for further accelerating the development of the industrial internet.
It should be noted that, for the sake of brevity, the foregoing method embodiments are all described as a series of combined actions, but those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Secondly, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and units involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the relevant descriptions of other embodiments.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing the relevant hardware through a computer program. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a ROM, a RAM, or the like.
The above disclosure is only a preferred embodiment of the present invention and of course cannot limit the scope of the rights of the present invention; therefore, equivalent changes made according to the claims of the present invention still fall within the scope of the present invention.
Claims (5)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210401938.9A CN114495106A (en) | 2022-04-18 | 2022-04-18 | A deep learning MOCR method applied to DFB laser chips |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210401938.9A CN114495106A (en) | 2022-04-18 | 2022-04-18 | A deep learning MOCR method applied to DFB laser chips |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114495106A true CN114495106A (en) | 2022-05-13 |
Family
ID=81489366
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210401938.9A Pending CN114495106A (en) | 2022-04-18 | 2022-04-18 | A deep learning MOCR method applied to DFB laser chips |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114495106A (en) |
- 2022-04-18 CN CN202210401938.9A patent/CN114495106A/en active Pending
Patent Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020010547A1 (en) * | 2018-07-11 | 2020-01-16 | 深圳前海达闼云端智能科技有限公司 | Character identification method and apparatus, and storage medium and electronic device |
WO2020098250A1 (en) * | 2018-11-12 | 2020-05-22 | 平安科技(深圳)有限公司 | Character recognition method, server, and computer readable storage medium |
US20200401571A1 (en) * | 2019-06-24 | 2020-12-24 | Evolution Pathfinder Llc | Human Experiences Ontology Data Model and its Design Environment |
WO2021115091A1 (en) * | 2019-12-13 | 2021-06-17 | 华为技术有限公司 | Text recognition method and apparatus |
WO2021190171A1 (en) * | 2020-03-25 | 2021-09-30 | 腾讯科技(深圳)有限公司 | Image recognition method and apparatus, terminal, and storage medium |
WO2021196873A1 (en) * | 2020-03-30 | 2021-10-07 | 京东方科技集团股份有限公司 | License plate character recognition method and apparatus, electronic device, and storage medium |
WO2022017245A1 (en) * | 2020-07-24 | 2022-01-27 | 华为技术有限公司 | Text recognition network, neural network training method, and related device |
CN112115948A (en) * | 2020-09-15 | 2020-12-22 | 电子科技大学 | Chip surface character recognition method based on deep learning |
CN112580657A (en) * | 2020-12-23 | 2021-03-30 | 陕西天诚软件有限公司 | Self-learning character recognition method |
CN113221867A (en) * | 2021-05-11 | 2021-08-06 | 北京邮电大学 | Deep learning-based PCB image character detection method |
CN113221889A (en) * | 2021-05-25 | 2021-08-06 | 中科芯集成电路有限公司 | Anti-interference recognition method and device for chip characters |
CN113850157A (en) * | 2021-09-08 | 2021-12-28 | 精锐视觉智能科技(上海)有限公司 | Character recognition method based on neural network |
Non-Patent Citations (2)
Title |
---|
Xudong Wang et al.: "Intelligent Micron Optical Character Recognition of DFB Chip Using Deep Convolutional Neural Network", IEEE Transactions on Instrumentation and Measurement * |
Luo Yuetong et al.: "Chip surface character recognition method based on deep learning" (基于深度学习的芯片表面字符识别方法), Journal of Hefei University of Technology (Natural Science Edition) (合肥工业大学学报(自然科学版)) * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wu et al. | Squeezesegv2: Improved model structure and unsupervised domain adaptation for road-object segmentation from a lidar point cloud | |
CN113536999B (en) | Character emotion recognition method, system, medium and electronic device | |
CN115272204A (en) | Bearing surface scratch detection method based on machine vision | |
CN114580515A (en) | Neural network training method for intelligent detection of semiconductor desoldering | |
CN108573244B (en) | Vehicle detection method, device and system | |
Xia et al. | An efficient and robust target detection algorithm for identifying minor defects of printed circuit board based on PHFE and FL-RFCN | |
CN118570133A (en) | Optical communication device surface defect detection network structure and detection method based on twin architecture | |
Huang et al. | Deep learning object detection applied to defect recognition of memory modules | |
Zhang et al. | Small object detection using deep convolutional networks: applied to garbage detection system | |
CN114495106A (en) | A deep learning MOCR method applied to DFB laser chips | |
CN118762046A (en) | Image segmentation method, device, equipment, storage medium and product for steel plate workpiece | |
Baczmanski et al. | Detection-segmentation convolutional neural network for autonomous vehicle perception | |
CN112364687A (en) | Improved Faster R-CNN gas station electrostatic sign identification method and system | |
CN111951287A (en) | Two-dimensional code detection and recognition method | |
CN118229608A (en) | Optical fiber end face defect detection method and device, electronic equipment and storage medium | |
Hukkeri et al. | Machine Learning in OCR Technology: Performance Analysis of Different OCR Methods for Slide-to-Text Conversion in Lecture Videos | |
CN113870311B (en) | A single target tracking method based on deep learning | |
Zhang et al. | Enhanced object detection in low-visibility haze conditions with YOLOv9s | |
CN114373178A (en) | Picture character detection and identification method and system | |
CN114612864B (en) | Intelligent black smoke detection method and system based on deep learning segmentation strategy | |
CN119445125B (en) | Image component segmentation method and system based on dense semantic similarity | |
CN119068179B (en) | Aircraft detection method in remote sensing images based on fuzzy-guided region screening | |
Wang et al. | A Strategy for Real-Time Suppressing the Fluctuation of DFB Chip Microcharacter Spotting Accuracy in Optical Device Packaging | |
CN116862868B (en) | Intelligent detection method for apparent diseases of low-luminosity concrete bridge | |
CN114418001B (en) | Character recognition method and system based on parameter reconstruction network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20220513 |