CN115731550A - Deep learning-based automatic drug specification identification method and system and storage medium - Google Patents

Publication number
CN115731550A
Authority
CN
China
Prior art keywords
character
image
recognition
characters
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202211478211.7A
Other languages
Chinese (zh)
Inventor
阿仁宝力高
谢宇
邱璐璐
李淼
刘纯彤
李昀
刘晓凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Second Hospital of Dalian Medical University
Original Assignee
Second Hospital of Dalian Medical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Second Hospital of Dalian Medical University filed Critical Second Hospital of Dalian Medical University
Priority to CN202211478211.7A priority Critical patent/CN115731550A/en
Publication of CN115731550A publication Critical patent/CN115731550A/en
Withdrawn legal-status Critical Current

Landscapes

  • Character Discrimination (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a deep learning-based method, system, and storage medium for automatic recognition of drug package inserts. The method comprises: acquiring an image of a drug package insert, preprocessing the acquired image using image processing methods, and extracting the valid text regions; performing preliminary recognition of the text in the extracted regions using a staged recognition method; and refining the preliminary result using text information learned from high-frequency word training, to obtain an optimized character recognition result. With this method, key fields of the electronic package inserts uploaded by pharmaceutical companies, such as the drug name, approval number, and approval and revision dates, can be checked automatically for compliance with the standard. A hospital can therefore build a real-time package-insert query system that lets front-line medical staff look up complete, up-to-date scanned inserts at any time, reducing manpower and material costs while improving work efficiency.

Description

A deep learning-based method, system, and storage medium for automatic recognition of drug package inserts

Technical Field

The present invention relates to the technical field of image recognition, and in particular to a deep learning-based method, system, and storage medium for automatic recognition of drug package inserts.

Background

Optical character recognition (OCR) is the process by which an electronic device, such as a scanner or digital camera, examines characters printed on paper and a character recognition method then translates their shapes into computer text. The wide adoption of digital cameras and flatbed scanners has greatly advanced OCR technology. Existing techniques fall into two categories: those based on traditional image processing, and those based on learning with deep neural networks. For an automatic package-insert management system, existing methods have the following shortcomings:

(1) Functional shortcoming: existing techniques cannot automatically recognize key fields of a package insert such as the title, approval number, approval date, and revision date, so insert management depends on manual review;

(2) Technical shortcoming 1: scan quality is uneven. Some scans have unclear or deformed text, a tilted page, or an unpredictable scan orientation (landscape or portrait), and existing text recognition algorithms have not specifically addressed these problems;

(3) Technical shortcoming 2: the generic names and active ingredients of drugs are mostly transliterated from Western languages and are often distinguished by rare characters. These characters tend to have complex structures, which makes correct recognition harder, and they usually combine with other characters to name a fixed chemical structure. In addition, drug names commonly contain fixed terms such as 注射 (injection), 氧氟 (oxyfluoro-), 硅油 (silicone oil), and 复合 (compound). For both kinds of names, character-by-character recognition ignores the fixed collocations between characters and therefore tends to have a high error rate.

Summary of the Invention

In view of the technical problems described above, a deep learning-based method, system, and storage medium for automatic recognition of drug package inserts are provided. The invention mainly relies on a data-driven approach and deep learning to extract and recognize Chinese characters and digits with high accuracy.

The technical means adopted by the invention are as follows:

A deep learning-based method for automatic recognition of drug package inserts, comprising:

acquiring an image of a drug package insert, preprocessing the acquired image using image processing methods, and extracting the valid text regions;

performing preliminary recognition of the text in the extracted valid text regions using a staged recognition method;

refining the preliminarily recognized text using text information learned from high-frequency word training, to obtain an optimized character recognition result.

Further, the preprocessing of the acquired image using image processing methods specifically comprises: scanned-image enhancement, correction of the main orientation of the scan, image skew correction, text region localization, character region segmentation, individual character segmentation, and glyph correction.

Further, the scanned-image enhancement, main-orientation correction, image skew correction, text region localization, character region segmentation, individual character segmentation, and glyph correction specifically comprise:

The scanned-image enhancement comprises: converting the image to grayscale by the weighted average method, and applying a mean filter to the image as a linear filter.
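The enhancement step can be illustrated with a minimal NumPy sketch. The text does not give the grayscale weights or the filter window, so the weights (0.299, 0.587, 0.114) and the 3×3 window used here are assumptions, as are the function names `to_gray` and `mean_filter`.

```python
import numpy as np

def to_gray(rgb: np.ndarray) -> np.ndarray:
    """Weighted-average grayscale. The weights are commonly used luma
    coefficients and are an assumption, not taken from the source."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb[..., :3].astype(np.float64) @ weights

def mean_filter(gray: np.ndarray, k: int = 3) -> np.ndarray:
    """k x k mean (box) filter with edge replication; k is assumed odd."""
    pad = k // 2
    padded = np.pad(gray, pad, mode="edge")
    out = np.zeros(gray.shape, dtype=np.float64)
    h, w = gray.shape
    # Sum the k*k shifted copies of the image, then divide by the window area.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + h, dx:dx + w]
    return out / (k * k)
```

A mean filter is linear, so it smooths scanner noise at the cost of slightly blurring stroke edges; the window size trades one against the other.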

The main-orientation correction comprises: extracting the length and width features of the scan, projecting the image gray values onto the two axis directions to obtain projection features, and determining the main orientation of the scan by combining these with prior features of the main orientation.
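One way the projection features could separate the two orientations is sketched here. The source does not say how the projections and priors are combined, so the decision rule (compare the relative variance of the row and column profiles), the threshold, and the name `main_direction` are all assumptions made for illustration.

```python
import numpy as np

def main_direction(gray: np.ndarray, dark_threshold: float = 128.0) -> str:
    """Guess whether text lines run along image rows ("horizontal")
    or along image columns ("vertical")."""
    ink = (gray < dark_threshold).astype(np.float64)  # 1 where the pixel is dark
    row_profile = ink.sum(axis=1)  # ink per row
    col_profile = ink.sum(axis=0)  # ink per column

    def contrast(profile: np.ndarray) -> float:
        # Relative variance: large when the profile alternates peaks and valleys.
        m = profile.mean()
        return 0.0 if m == 0 else float(profile.var() / (m * m))

    # Text lines separated by blank gaps make the profile across the lines spiky.
    return "horizontal" if contrast(row_profile) >= contrast(col_profile) else "vertical"
```

The length and width features and the orientation priors mentioned in the text are not modelled in this sketch.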

The image skew correction comprises: estimating the skew angle of the image with the Radon transform, projecting image space into polar-coordinate space with the following formula:

(Formula image: Figure BDA0003960171790000021)

A point in polar-coordinate space corresponds to the straight line through two corresponding points in image space. The lines in image space are identified from the accumulation peaks of the point set in polar-coordinate space, and because the polar coordinates themselves include the angle θ, the skew angle is determined from the accumulation peak.
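The accumulation idea can be sketched as follows. The formula image is not available in this text, so the sketch assumes the standard normal-form line parameterization ρ = x·cos θ + y·sin θ; the scoring rule (sum of squared accumulator counts) and the name `estimate_skew` are also assumptions.

```python
import numpy as np

def estimate_skew(binary: np.ndarray, angles_deg) -> float:
    """Return the candidate skew angle (degrees) whose projection is sharpest.

    A text line following row = r0 + tan(a) * col has normal angle
    theta = 90 + a degrees, so all its pixels share nearly one rho value.
    """
    ys, xs = np.nonzero(binary)
    if xs.size == 0:
        return 0.0
    best_angle, best_score = 0.0, -1.0
    for a in angles_deg:
        theta = np.deg2rad(90.0 + a)
        rho = xs * np.cos(theta) + ys * np.sin(theta)
        bins = np.floor(rho - rho.min()).astype(int)   # 1-pixel-wide accumulator cells
        counts = np.bincount(bins).astype(np.float64)
        score = float(np.sum(counts ** 2))             # peaks dominate when lines align
        if score > best_score:
            best_angle, best_score = float(a), score
    return best_angle
```

Once the angle is known, the image would be rotated by its negative to level the text; the rotation itself is omitted here.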

The text region localization comprises: applying morphological dilation to the image to reduce the gaps between adjacent strokes and between adjacent characters; extracting the connected components of the image and merging regions of the same kind; computing a horizontal projection histogram by the projection method to obtain projection features; given that the drug name on a package insert is set in the largest font and always lies on a dark background, selecting the region with the largest font size and the largest color-block projection value as the drug-name image region; given that the approval date and revision date are at the top of the document where text is sparse, selecting the region at the top of the image whose color-block projection value is below a certain threshold as the approval-date and revision-date image region; and recognizing the square-bracket markers and the keywords inside the brackets in order to locate the region containing the approval number.
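The first and third operations of this step, dilation and the horizontal projection, can be sketched in NumPy. The 3×3 structuring element and the function names are assumptions; connected-component merging and the region-selection rules are not shown.

```python
import numpy as np

def dilate(binary: np.ndarray, iterations: int = 1) -> np.ndarray:
    """Binary dilation with a 3x3 square structuring element (assumed size)."""
    out = binary.astype(bool)
    h, w = out.shape
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(out)
        # A pixel is set if any pixel in its 3x3 neighbourhood was set.
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        out = grown
    return out

def horizontal_projection(binary: np.ndarray) -> np.ndarray:
    """Number of foreground pixels in each image row."""
    return binary.astype(int).sum(axis=1)
```

After dilation, strokes that were a pixel or two apart fuse into one blob, which is what lets a whole word or field be treated as a single region.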

The character region segmentation comprises: computing a horizontal projection histogram over the selected approval-date or approval-number image region, in which text lines appear as peaks and the gaps between lines appear as distinct valleys, and splitting at the valleys to obtain the separated approval number, approval date, and revision date.

The individual character segmentation comprises: computing a vertical projection histogram over each line of the approval date, revision date, and approval number and over the drug-name region, in which each character's pixels appear as a peak and the gaps between characters appear as distinct valleys, and splitting at the valleys to obtain the digits, Chinese characters, and symbols of the approval date, revision date, approval number, and drug name.
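Both segmentation steps apply the same valley-splitting rule, first to row sums and then to column sums. A minimal sketch, assuming a binarized region in which foreground pixels are 1 and a valley is any position whose projection does not exceed `valley_max` (a hypothetical parameter):

```python
import numpy as np

def split_at_valleys(profile, valley_max=0):
    """Return (start, end) pairs, end exclusive, of runs where profile > valley_max."""
    segments, start = [], None
    for i, v in enumerate(profile):
        if v > valley_max and start is None:
            start = i                      # a peak begins
        elif v <= valley_max and start is not None:
            segments.append((start, i))    # a valley ends the peak
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments

def segment_lines_and_chars(binary: np.ndarray):
    """Split a region into text lines, then each line into characters."""
    result = []
    for top, bottom in split_at_valleys(binary.sum(axis=1)):      # horizontal projection
        chars = split_at_valleys(binary[top:bottom].sum(axis=0))  # vertical projection
        result.append((top, bottom, chars))
    return result
```

Chinese characters with left and right components separated by a clear gap can be over-split by a zero threshold, so a real pipeline would likely also use the expected character width.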

The glyph correction comprises: because font deformation is local, correcting the glyphs of each line of characters separately; using the Hough transform to obtain the minimum enclosing quadrilateral of each line of text, computing the affine matrix H that transforms the quadrilateral into a rectangle, and multiplying each segmented individual character by H to obtain the corrected character image.
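Computing H from the quadrilateral corners can be sketched as a least-squares fit. The source does not say how H is solved, so the fitting method and the names `fit_affine` and `apply_affine` are assumptions. An affine map is exact only when the quadrilateral is a parallelogram; a general quadrilateral would need a projective transform, and the fit below is then only an approximation.

```python
import numpy as np

def fit_affine(src, dst) -> np.ndarray:
    """Least-squares 2x3 matrix H such that dst ~= H @ [x, y, 1]^T for each point."""
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    A = np.hstack([src, np.ones((len(src), 1))])   # N x 3 homogeneous coordinates
    X, *_ = np.linalg.lstsq(A, dst, rcond=None)    # 3 x 2 solution
    return X.T                                     # 2 x 3

def apply_affine(H: np.ndarray, pts) -> np.ndarray:
    """Map N x 2 points through the 2x3 affine matrix H."""
    pts = np.asarray(pts, dtype=np.float64)
    return pts @ H[:, :2].T + H[:, 2]
```

For example, the sheared quadrilateral (0,0), (10,0), (12,4), (2,4) maps onto the rectangle (0,0), (10,0), (10,4), (0,4) with H = [[1, -0.5, 0], [0, 1, 0]]. Warping the character pixels with H, including interpolation, is left out of the sketch.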

Further, the preliminary recognition of the text in the extracted valid text regions using the staged recognition method comprises: obtaining preliminary recognition results for the approval and revision dates, the approval number, and the drug name from single-character training, and extracting the collocation relationships between words with a convolutional recurrent neural network model to further refine the preliminary results.

Further, obtaining the preliminary recognition results from single-character training and refining them with the collocation relationships extracted by the convolutional recurrent neural network model specifically comprises:

Building the character training library: extracting the symbols that appear in the national drug catalog, including Chinese characters, digits, and the percent sign; generating images of these symbols in commonly used fonts; and slightly perturbing each image to add noise.

Splitting into training and validation sets: dividing the generated character training library into a training set and a validation set at a ratio of 5:1. The training set is used to train the deep model, and the validation set is used to select the model's hyperparameters.

Building the convolutional neural network model: the input is a character image of size 32×32; a convolution with six 5×5 kernels yields convolutional feature maps of size 6@28×28; average pooling (downsampling) with stride=2 yields pooled feature maps of size 6@14×14; a convolution with sixteen 5×5 kernels yields convolutional feature maps of size 16@10×10; average pooling (downsampling) with stride=2 yields pooled feature maps of size 16@5×5; one convolution with a 5×5 kernel and two convolutions with 1×1 kernels then rescale the features to obtain rich feature combinations; and finally a nonlinear mapping determines the output class.
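The feature-map sizes quoted above follow from the usual output-size formula, out = (in + 2·padding − kernel) / stride + 1. The short trace below checks them, assuming no padding and a 2×2 pooling window (the text states only stride=2). The channel counts of the last three layers are not given in the source and are left as `None`.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output side length of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_shapes(input_size: int = 32):
    """Return (layer, channels, side) per stage; channels is None where unstated."""
    layers = [
        ("conv 5x5", 6, 5, 1),
        ("avg pool", 6, 2, 2),      # 2x2 window is an assumption
        ("conv 5x5", 16, 5, 1),
        ("avg pool", 16, 2, 2),     # 2x2 window is an assumption
        ("conv 5x5", None, 5, 1),
        ("conv 1x1", None, 1, 1),
        ("conv 1x1", None, 1, 1),
    ]
    shapes, side = [], input_size
    for name, channels, kernel, stride in layers:
        side = conv_out(side, kernel, stride)
        shapes.append((name, channels, side))
    return shapes
```

The side lengths come out as 28, 14, 10, 5, 1, 1, 1, so the final 5×5 convolution collapses each 5×5 map to a single value and the two 1×1 convolutions act like fully connected layers.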

Optimizing the deep model: a reference sample is chosen arbitrarily, a sample drawn at random from the character library of the same class serves as the positive sample, and a sample drawn at random from a character library of a different class serves as the negative sample. A Siamese (twin) mechanism is used: within one iteration, the reference sample is fed to branch 1 and the positive and negative samples are fed in turn to branch 2, with the two branches sharing network parameters. The sample features of branch 1 and branch 2 are each classified with SoftMax under a cross-entropy loss. Branches 1 and 2 are jointly constrained by a contrastive loss so that the features of the reference and positive samples are as similar as possible while those of the reference and negative samples differ as much as possible. The network is then updated by backpropagation.
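The combined objective can be sketched in NumPy. The source names the two losses but gives no formulas, so the margin-based form of the contrastive loss, the margin value, the weighting factor, and the function names are all assumptions.

```python
import numpy as np

def softmax_cross_entropy(logits: np.ndarray, label: int) -> float:
    """Cross-entropy of SoftMax(logits) against an integer class label."""
    z = logits - np.max(logits)                   # shift for numerical stability
    log_probs = z - np.log(np.sum(np.exp(z)))
    return float(-log_probs[label])

def contrastive_loss(f1: np.ndarray, f2: np.ndarray, same_class: bool,
                     margin: float = 1.0) -> float:
    """Pull same-class features together; push different-class features
    at least `margin` apart (margin is an assumed hyperparameter)."""
    d = float(np.linalg.norm(f1 - f2))
    return d ** 2 if same_class else max(0.0, margin - d) ** 2

def siamese_loss(f_ref, f_other, same_class, logits_ref, logits_other,
                 y_ref, y_other, weight: float = 1.0) -> float:
    """Per-branch classification losses plus the joint contrastive term."""
    ce = softmax_cross_entropy(logits_ref, y_ref) + softmax_cross_entropy(logits_other, y_other)
    return ce + weight * contrastive_loss(f_ref, f_other, same_class)
```

In one iteration this would be evaluated twice for the same reference sample, once against the positive sample and once against the negative sample, before the shared weights are updated.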

Model evaluation: the network hyperparameters are updated, and the best hyperparameters are selected by monitoring the validation set.

Preliminary character classification: the character images produced by the image processing are fed into the trained convolutional neural network to obtain a preliminary result for each single character, and the single-character classification probabilities are retained.

Further, refining the preliminarily recognized text using text information learned from high-frequency word training, to obtain the optimized character recognition result, specifically comprises:

Building the high-frequency lexicon: the open-source JIEBA word segmentation system automatically segments the drug names in the national drug catalog, and difficult phrases are screened and corrected manually; the occurrence probability of every phrase is computed, and high-frequency phrases are selected to build the high-frequency lexicon; images of the lexicon entries are generated in commonly used fonts, and each image is slightly perturbed to add noise.
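The frequency-counting part of lexicon construction can be sketched with the standard library. The sketch takes drug names that are already segmented into word lists (for instance by JIEBA's `jieba.lcut`, which is not called here); the `min_count` cutoff and the function name are assumptions, since the source does not say how "high-frequency" is decided.

```python
from collections import Counter

def build_high_frequency_lexicon(segmented_names, min_count=2):
    """Map each phrase seen at least `min_count` times to its occurrence probability."""
    counts = Counter(tok for name in segmented_names for tok in name)
    total = sum(counts.values())
    return {tok: c / total for tok, c in counts.items() if c >= min_count}

# Hypothetical, hand-written segmentation used only to illustrate the call.
example_names = [
    ["注射", "用", "头孢"],
    ["氧氟", "沙星", "注射", "液"],
    ["复合", "维生素"],
    ["注射", "用", "青霉素"],
]
lexicon = build_high_frequency_lexicon(example_names)
```

With this toy input only 注射 and 用 pass the cutoff; the manual screening of difficult phrases described in the text happens outside this function.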

Splitting into training and validation sets: dividing the generated high-frequency lexicon into a training set and a validation set at a ratio of 5:1. The training set is used to train the deep model, and the validation set is used to select the model's hyperparameters.

Building the convolutional recurrent memory model: a convolutional sub-network extracts features from each character of a high-frequency word $X=\{x_t\}$, giving the character features $F=\{f_t\}$; at time step $t$ the recurrent sub-network takes an input $x_t$ and the hidden state $h_{t-1}$ from time step $t-1$ to compute the hidden state $h_t$, and uses ReLU to obtain the nonlinear relationship between the output $y_t$ at time $t$ and the input:

$$h_t = \tanh(w_{hh} h_{t-1} + w_{hx} x_t)$$

$$y_t = W_{hy} h_t$$

where $w_{hh}$, $w_{hx}$, and $W_{hy}$ are the network weights to be learned.
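The recurrence can be sketched as a forward pass in NumPy. The sketch follows the two equations as printed (tanh for the hidden state, a linear output); the ReLU mentioned in the prose does not appear in the printed formulas and is therefore not applied. The zero initial state and the name `rnn_forward` are assumptions.

```python
import numpy as np

def rnn_forward(xs, w_hh: np.ndarray, w_hx: np.ndarray, w_hy: np.ndarray, h0=None):
    """Run h_t = tanh(w_hh h_{t-1} + w_hx x_t), y_t = w_hy h_t over a sequence.

    xs is a sequence of per-character feature vectors; returns (outputs, last hidden state).
    """
    h = np.zeros(w_hh.shape[0]) if h0 is None else h0   # assumed zero initial state
    ys = []
    for x in xs:
        h = np.tanh(w_hh @ h + w_hx @ x)   # hidden state carries context from earlier characters
        ys.append(w_hy @ h)                # per-step class scores
    return np.stack(ys), h
```

Applying SoftMax to each row of the outputs would give the per-character probabilities used in the later correction step.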

Deep model optimization: the loss at each time step is a cross-entropy loss, and the overall loss is the sum of the per-time-step losses; the network is updated by backpropagation.

High-frequency word correction: the per-character probabilities produced by the convolutional recurrent neural network are multiplied by the single-character classification probabilities retained from the preliminary character classification, giving the optimized character recognition result.
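The correction step amounts to an element-wise product of two probability tables followed by picking the most likely class per character. A minimal sketch, in which the renormalization is an added convenience (it does not change the chosen class) and the name `fuse` is hypothetical:

```python
import numpy as np

def fuse(p_single, p_word):
    """Multiply single-character and word-model probabilities per character.

    Both inputs have shape (num_chars, num_classes). Returns the renormalized
    product and the index of the best class for each character.
    """
    fused = np.asarray(p_single, dtype=np.float64) * np.asarray(p_word, dtype=np.float64)
    fused = fused / fused.sum(axis=-1, keepdims=True)
    return fused, fused.argmax(axis=-1)
```

For instance, if the single-character model gives [0.5, 0.4, 0.1] for a character and the word model gives [0.2, 0.7, 0.1], the product favours class 1 even though the single-character model alone preferred class 0.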

The invention also provides a deep learning-based system for automatic recognition of drug package inserts, built on the above method and comprising:

a text information extraction module, configured to acquire an image of a drug package insert, preprocess the acquired image using image processing methods, and extract the valid text regions;

a preliminary text recognition module, configured to perform preliminary recognition of the text in the extracted valid text regions using the staged recognition method;

a text recognition refinement module, configured to refine the preliminarily recognized text using text information learned from high-frequency word training, to obtain an optimized character recognition result.

The invention also provides a storage medium comprising a stored program, wherein the program, when run, executes the above deep learning-based method for automatic recognition of drug package inserts.

Compared with the prior art, the invention has the following advantages:

1. The invention proposes an integrated recognition method for drug package inserts that reduces labor costs and greatly improves the efficiency and timeliness of package-insert management.

2. The invention proposes a series of image enhancement and text recognition methods that handle problems such as unclear or deformed text, tilted pages, and unpredictable scan orientation (landscape or portrait) well.

3. The invention proposes a staged recognition method: a convolutional neural network first performs preliminary recognition of single characters, and a convolutional recurrent memory network combined with high-frequency word training then refines the recognized text, so that the high-frequency words that commonly appear in package inserts improve the accuracy of name detection.

For these reasons, the invention can be widely applied in fields such as the recognition of drug package inserts.

Brief Description of the Drawings

To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed for describing them are briefly introduced here. The drawings described are some embodiments of the invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow chart of the method of the invention.

Fig. 2 is a schematic diagram of the convolutional neural network model provided by an embodiment of the invention.

Fig. 3 is a schematic diagram of the convolutional recurrent memory model provided by an embodiment of the invention.

Detailed Description

To help those skilled in the art better understand the solution of the invention, the technical solutions in the embodiments are described clearly and completely with reference to the drawings. The described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the invention.

It should be noted that the terms "first", "second", and the like in the description, the claims, and the drawings are used to distinguish similar objects and do not necessarily describe a particular order or sequence. Data so used are interchangeable where appropriate, so that the embodiments described here can be practiced in orders other than those illustrated or described. In addition, the terms "comprising" and "having" and any variants of them are intended to cover non-exclusive inclusion: a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to that process, method, product, or device.

As shown in Fig. 1, the invention provides a deep learning-based method for automatic recognition of drug package inserts, comprising:

S1. Acquiring an image of a drug package insert, preprocessing the acquired image using image processing methods, and extracting the valid text regions;

S2. Performing preliminary recognition of the text in the extracted valid text regions using a staged recognition method;

S3. Refining the preliminarily recognized text using text information learned from high-frequency word training, to obtain an optimized character recognition result.

In a specific implementation, as a preferred embodiment of the invention, the preprocessing of the acquired image in step S1 specifically comprises: scanned-image enhancement, correction of the main orientation of the scan, image skew correction, text region localization, character region segmentation, individual character segmentation, and glyph correction. The purpose is to orient and locate the key regions so as to extract clear, upright valid text regions, laying the foundation for higher accuracy in the subsequent text recognition.

In a specific implementation, as a preferred embodiment of the invention, the scanned-image enhancement, main-orientation correction, image skew correction, text region localization, character region segmentation, individual character segmentation, and glyph correction specifically comprise:

所述扫描图像增强处理,包括:采用加权平均法对图像进行灰度化;采用均值滤波对图像进行线性滤波;去除因扫描设备性能不佳导致的噪声;The scanning image enhancement process includes: graying the image by using a weighted average method; performing linear filtering on the image by using an average value filter; removing noise caused by poor performance of the scanning device;

所述扫描件主方向矫正处理,包括:提取扫描件长宽特征,并将图像灰度值分别投影到两个方向上,获得投影特征,结合主方向先验特征,判断扫描件主方向;The main direction correction processing of the scanned part includes: extracting the length and width characteristics of the scanned part, and projecting the gray value of the image into two directions respectively to obtain the projection features, and judging the main direction of the scanned part in combination with the prior characteristics of the main direction;

所述图像倾斜方向校正处理,包括:利用Radon变换估计图像的倾斜角,将图像空间利用如下公式投影到极坐标空间:The image inclination direction correction process includes: utilizing the Radon transform to estimate the inclination angle of the image, and projecting the image space into the polar coordinate space using the following formula:

Figure BDA0003960171790000071
Figure BDA0003960171790000071

极坐标中的点相当于图像空间中对应两点的直线,通过极坐标空间中的点集的累加峰值确定图像空间的对应线条,由于极坐标本身包含倾斜角θ,因此根据点集累加峰值确定倾斜角度;A point in polar coordinates is equivalent to a straight line corresponding to two points in the image space. The corresponding line in the image space is determined by the cumulative peak value of the point set in the polar coordinate space. Since the polar coordinate itself contains the inclination angle θ, it is determined according to the cumulative peak value of the point set slope;

所述文字信息区域定位处理,包括:将图像进行形态学膨胀运算,减小字符临近笔画和临近字符间的空隙;提取图像的连通域,将同类的区域进行合并;采用投影法,做横向投影直方图,获得投影特征;针对药品说明书药品名称部分字体最大且均处于深色背景区的特点,选取字码最大且色块投影值最大的区域为药品名称图像区域;针对药品说明书的核准日期和修改日期在文件顶部且文字稀疏的特点,选取图像顶部色块投影值小于在某一阈值内的为核准日期和修改日期图像区域;对中括号标记及括号内的关键词进行识别,从而对批准文号所在区域进行定位;The text information area positioning processing includes: performing morphological expansion operation on the image to reduce the gap between adjacent strokes and adjacent characters; extracting the connected domain of the image and merging similar areas; using projection method to perform horizontal projection Histogram to obtain the projection features; for the characteristics of the drug name part of the drug instructions with the largest font and all in the dark background area, select the area with the largest character code and the largest color block projection value as the drug name image area; for the approval date and The date of modification is at the top of the file and the text is sparse. Select the color block projection value at the top of the image that is less than a certain threshold as the image area of the approval date and modification date; identify the brackets and the keywords in the brackets, so as to approve Locate the area where the document number is located;

所述字符区域分割处理,包括:对已选定的核准日期或批准文号图像区域做横向投影直方图,行字符在直方图上呈现波峰,行间隔在直方图上呈现明显的波谷状,按照波谷处进行分割,得到划分后的批准文号、核准日期和修改日期;The character area segmentation processing includes: making a horizontal projection histogram on the selected approval date or approval number image area, the line characters present a peak on the histogram, and the line interval presents an obvious trough on the histogram, according to Divide at the trough to obtain the divided approval number, approval date and modification date;

所述独立字符分割处理,对核准日期和修改日期、批准文号的各行以及药品名称区域做纵向投影直方图,各字符点阵在直方图上呈现波峰,字符间隙在直方图上呈现明显的波谷状,按照波谷处进行分割,得到核准日期、修改日期、批准文号以及药品名称的数字、汉字及符号;In the independent character segmentation process, a longitudinal projection histogram is made for the approval date and modification date, each line of the approval number and the drug name area, and each character lattice presents a peak on the histogram, and the character gap presents an obvious trough on the histogram The form is divided according to the trough to obtain the approval date, modification date, approval number, and the numbers, Chinese characters and symbols of the drug name;

The glyph correction processing comprises: because glyph deformation is local, correcting each line of characters separately; using the Hough transform to obtain the minimum bounding quadrilateral of each line of text, computing the affine matrix H that maps the quadrilateral to a rectangle, and multiplying each segmented character by H to obtain the corrected character image.
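An affine matrix is fully determined by three point correspondences, so one way to sketch the computation of H is to solve for it from three corners of the bounding quadrilateral and their target rectangle corners. The example below assumes the quadrilateral is a parallelogram (pure shear), which an affine map can straighten exactly; a general quadrilateral would need a perspective transform instead. All function names and corner coordinates are illustrative assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("source points are collinear")
    result = []
    for col in range(3):
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = b[r]               # replace one column of a with b
        result.append(det3(m) / d)
    return result


def affine_from_points(src, dst):
    """2x3 affine matrix H with H @ [x, y, 1] = [u, v] for three point pairs."""
    a = [[x, y, 1.0] for x, y in src]
    row_u = solve3(a, [u for u, _ in dst])
    row_v = solve3(a, [v for _, v in dst])
    return [row_u, row_v]


def apply_affine(h, point):
    """Map one (x, y) point through the 2x3 affine matrix h."""
    x, y = point
    return (h[0][0] * x + h[0][1] * y + h[0][2],
            h[1][0] * x + h[1][1] * y + h[1][2])


# Three corners of a sheared text line and the upright rectangle corners.
src = [(2.0, 0.0), (12.0, 0.0), (0.0, 4.0)]
dst = [(0.0, 0.0), (10.0, 0.0), (0.0, 4.0)]
H = affine_from_points(src, dst)
fourth = apply_affine(H, (10.0, 4.0))      # remaining corner of the parallelogram
```

In practice each pixel coordinate of a segmented character would be mapped through H (or its inverse, with interpolation) rather than only the corners.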

In a specific implementation, as a preferred embodiment of the present invention, in step S2 the preliminary recognition of the text information by the segmented recognition method, based on the extracted effective font area, comprises: obtaining preliminary recognition results for the approval and modification dates, the approval number and the drug name through single-character training, and extracting the collocation relationships between words with a convolutional recurrent neural network model to further optimize the preliminary recognition results, specifically including:

Building the character training library: extract the symbols that occur in the national drug catalog, including Chinese characters, digits and the percent sign; generate images of these symbols in commonly used fonts; and slightly perturb each image to add noise, which improves the recognition robustness of the deep model. The perturbation operations include cropping and small rotations.

Dividing the training set and validation set: the generated character training library is split into a training set and a validation set at a ratio of 5:1. The training set is used to train the optimal deep model, and the validation set is used to select the optimal hyperparameters of the deep model, such as the batch size and the training step size.
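A 5:1 split means one sample in six goes to the validation set. A minimal sketch, assuming the library is an in-memory sequence and that a shuffled random split is acceptable (the text does not say how samples are assigned); `split_five_to_one` is a hypothetical helper name.

```python
import random


def split_five_to_one(samples, seed=0):
    """Shuffle and split samples into training and validation sets at 5:1."""
    items = list(samples)
    random.Random(seed).shuffle(items)     # fixed seed keeps the split repeatable
    n_val = len(items) // 6                # one part in six goes to validation
    return items[n_val:], items[:n_val]


train, val = split_five_to_one(range(600))
```

With 600 samples this yields 500 training and 100 validation samples.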

Building the convolutional neural network model: the input is a character image of dimension 32×32; a convolution with 6 kernels of size 5×5 yields convolutional feature maps of size 6@28×28; average pooling (downsampling) with stride=2 yields pooled feature maps of size 6@14×14; a convolution with 16 kernels of size 5×5 yields convolutional feature maps of size 16@10×10; average pooling (downsampling) with stride=2 yields pooled feature maps of size 16@5×5; one convolution with a 5×5 kernel and two convolutions with 1×1 kernels then rescale the features to obtain rich feature combinations; finally, a nonlinear mapping determines the output class. The constructed convolutional neural network model is shown in Figure 2.
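The feature-map sizes quoted above follow from the usual output-size formula for an unpadded window, floor((size - kernel) / stride) + 1. The sketch below traces them under three assumptions that the text does not state explicitly: the convolutions use no padding and stride 1, and the pooling window is 2×2.

```python
def conv_out(size, kernel, stride=1):
    """Output width of an unpadded convolution or pooling window."""
    return (size - kernel) // stride + 1


def trace_shapes(input_size=32):
    """Trace (channels, height, width) through the first four layers."""
    shapes = []
    size = conv_out(input_size, 5)         # 6 kernels of 5x5, stride 1
    shapes.append((6, size, size))
    size = conv_out(size, 2, stride=2)     # assumed 2x2 average pooling, stride 2
    shapes.append((6, size, size))
    size = conv_out(size, 5)               # 16 kernels of 5x5, stride 1
    shapes.append((16, size, size))
    size = conv_out(size, 2, stride=2)     # assumed 2x2 average pooling, stride 2
    shapes.append((16, size, size))
    return shapes


shapes = trace_shapes()
final_width = conv_out(shapes[-1][1], 5)   # the following 5x5 convolution
```

Under these assumptions the sizes match the 28, 14, 10 and 5 given in the text, and the subsequent 5×5 convolution collapses each 5×5 map to a single value.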

Optimizing the deep model: the goal is to train the network to extract discriminative features that represent each character. The network is trained jointly on a multi-class classification task and a same/different binary classification task to improve the model's accuracy. The specific steps are:

A. Select an arbitrary reference sample, randomly select a sample from the character library of the same class as the positive sample, and randomly select a sample from the character library of a different class as the negative sample;

B. Using a Siamese (twin) mechanism, in one iteration feed the reference sample into branch 1 and feed the positive sample and the negative sample in turn into branch 2; the two branches share network parameters;

C. Classify the sample features of branch 1 and branch 2 separately with SoftMax, constrained by a cross-entropy loss function;

D. Constrain branch 1 and branch 2 jointly with a contrastive loss function, so that the features of the reference sample and the positive sample are as similar as possible while the features of the reference sample and the negative sample differ as much as possible;

E. Backpropagate through the network and update it;
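Steps C and D combine a classification loss on each branch with a pairwise loss across branches. The sketch below uses one common form of the contrastive loss (squared distance for same-class pairs, squared hinge on a margin for different-class pairs). The margin, the equal weighting of the terms and all function names are assumptions, as the text does not give the exact formula.

```python
import math


def cross_entropy(probs, label):
    """Cross-entropy of one SoftMax output against an integer class label."""
    return -math.log(probs[label])


def contrastive_loss(f1, f2, same, margin=1.0):
    """Pull same-class features together; push others at least `margin` apart."""
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))
    if same:
        return dist ** 2
    return max(0.0, margin - dist) ** 2


def joint_loss(probs1, label1, probs2, label2, f1, f2, weight=1.0):
    """Classification loss of both branches plus the weighted pair loss."""
    same = label1 == label2
    return (cross_entropy(probs1, label1)
            + cross_entropy(probs2, label2)
            + weight * contrastive_loss(f1, f2, same))


ref_feat, pos_feat, neg_feat = [0.0, 0.0], [0.0, 0.0], [3.0, 4.0]
pos_pair = contrastive_loss(ref_feat, pos_feat, same=True)    # identical features
neg_pair = contrastive_loss(ref_feat, neg_feat, same=False)   # already beyond margin
```

Because both branches share parameters, the gradient of this combined loss updates a single set of weights in step E.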

Model evaluation: update the network hyperparameters and select the optimal ones by monitoring performance on the validation set;

Preliminary character discrimination: feed the character images produced by the image processing into the trained convolutional neural network to obtain a preliminary decision for each single character, and retain the single-character classification probabilities. These probabilities must be retained because, for characters and words with similar, easily confused glyphs, they are updated in the next step through high-frequency word training.

In a specific implementation, as a preferred embodiment of the present invention, in step S3 the optimized recognition of the preliminarily recognized text information, performed in combination with the text information from high-frequency word training to obtain the optimized character recognition result, specifically includes:

Building the high-frequency word bank: use the open-source JIEBA word segmentation system to automatically segment the drug names in the national drug catalog, and manually screen and correct some difficult phrases; count the occurrence probability of every phrase and select the high-frequency phrases to build the high-frequency word bank; generate images of the high-frequency words in commonly used fonts and slightly perturb each image to add noise, which improves the robustness of high-frequency word recognition. The perturbation operations include cropping, small rotations and the like.
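The counting step can be sketched as follows, assuming the drug names have already been segmented into phrases (for example by JIEBA followed by manual correction). The sample names, the probability threshold and the helper name `build_high_frequency_bank` are hypothetical.

```python
from collections import Counter


def build_high_frequency_bank(segmented_names, min_probability):
    """Return {phrase: probability} for phrases at or above min_probability."""
    counts = Counter(p for name in segmented_names for p in name)
    total = sum(counts.values())
    return {p: c / total for p, c in counts.items()
            if c / total >= min_probability}


# Hypothetical, already segmented drug names.
names = [
    ["复合", "维生素"],
    ["复合", "氨基酸", "注射", "液"],
    ["葡萄糖", "注射", "液"],
]
bank = build_high_frequency_bank(names, min_probability=0.2)
```

Here each of the three phrases that occur twice has probability 2/9 and is kept, while phrases that occur once fall below the threshold.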

Dividing the training set and validation set: the generated high-frequency word bank is split into a training set and a validation set at a ratio of 5:1; the training set is used to train the optimal deep model, and the validation set is used to select the optimal hyperparameters of the deep model;

Building the convolutional recurrent memory model: high-frequency word recognition uses a convolutional recurrent memory model, in which a convolutional neural sub-network extracts the spatial information of a high-frequency word and a recurrent memory sub-network extracts the correlation between its characters. The specific settings are as follows:

a. Use the convolutional neural sub-network to extract features from each character of the high-frequency word $X=\{x_t\}$, obtaining the character features $F=\{f_t\}$;

b. The recurrent neural sub-network takes an input $x_t$ at time step $t$ and the hidden state $h_{t-1}$ from time step $t-1$ to compute the hidden state $h_t$ at time step $t$, and uses ReLU to obtain the nonlinear relationship between the output $y_t$ at time $t$ and the input:

$$h_t = \tanh(w_{hh} h_{t-1} + w_{hx} x_t)$$

$$y_t = W_{hy} h_t$$

where $w_{hh}$, $w_{hx}$ and $W_{hy}$ are network weights to be learned. The constructed convolutional recurrent memory model is shown in Figure 3.
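The two recurrence equations can be traced with a few lines of plain Python. The sketch follows the equations exactly as written (tanh on the hidden state, a linear output) and therefore applies no ReLU; the weight values and the helper names `matvec`, `rnn_step` and `run_rnn` are assumptions for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (nested lists) by a vector (list)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def rnn_step(x_t, h_prev, w_hh, w_hx, w_hy):
    """One step: h_t = tanh(w_hh h_{t-1} + w_hx x_t), y_t = W_hy h_t."""
    pre = [a + b for a, b in zip(matvec(w_hh, h_prev), matvec(w_hx, x_t))]
    h_t = [math.tanh(p) for p in pre]
    y_t = matvec(w_hy, h_t)
    return h_t, y_t


def run_rnn(inputs, h0, w_hh, w_hx, w_hy):
    """Feed a sequence of per-character feature vectors through the cell."""
    h, outputs = h0, []
    for x_t in inputs:
        h, y = rnn_step(x_t, h, w_hh, w_hx, w_hy)
        outputs.append(y)
    return outputs


# Toy weights (assumed values): one hidden unit, one input feature.
h1, y1 = rnn_step([1.0], [0.0], [[0.5]], [[1.0]], [[2.0]])
outputs = run_rnn([[0.0], [0.0]], [0.0], [[1.0]], [[1.0]], [[2.0]])
```

In the full model the inputs would be the per-character features produced by the convolutional sub-network, and the outputs would pass through SoftMax to give per-character class probabilities.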

Deep model optimization: the loss at each time step is a cross-entropy loss, and the overall loss is the sum of the per-time-step losses; backpropagate through the network and update it;
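The overall loss is simply the per-step cross-entropy accumulated over the characters of a word. A minimal sketch, assuming each time step already yields a normalized class distribution; `sequence_loss` is an illustrative name.

```python
import math


def sequence_loss(step_probs, targets):
    """Sum of per-time-step cross-entropy losses.

    step_probs[t] is the predicted class distribution at time step t and
    targets[t] is the index of the correct character at that step.
    """
    return sum(-math.log(p[c]) for p, c in zip(step_probs, targets))


probs = [[0.5, 0.5], [0.25, 0.75]]
loss = sequence_loss(probs, [0, 1])        # -log(0.5) - log(0.75)
```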

High-frequency word correction: multiply the probability of each character obtained from the convolutional recurrent neural network by the single-character classification probability retained during preliminary character discrimination to obtain the optimized character recognition result.
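The correction step can be pictured as an element-wise product of two probability tables for the same character position. In the sketch below the renormalization, the fallback for an all-zero product and the example scores are assumptions; the text only specifies the multiplication.

```python
def fuse_probabilities(cnn_probs, crnn_probs):
    """Multiply two per-class probability dicts and renormalize."""
    fused = {c: cnn_probs[c] * crnn_probs.get(c, 0.0) for c in cnn_probs}
    total = sum(fused.values())
    if total == 0:
        return dict(cnn_probs)             # fall back to the single-character result
    return {c: p / total for c, p in fused.items()}


# Hypothetical scores for two visually similar characters.
cnn = {"末": 0.55, "未": 0.45}             # single-character classifier
crnn = {"末": 0.2, "未": 0.8}              # word-context model
fused = fuse_probabilities(cnn, crnn)
best = max(fused, key=fused.get)
```

In this example the single-character classifier slightly prefers the wrong glyph, and the word-context probability overturns that decision after fusion.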

Corresponding to the deep learning-based automatic recognition method for drug instructions of this application, this application also provides a deep learning-based automatic recognition system for drug instructions, comprising a text information extraction module, a text information preliminary recognition module and a text information optimized recognition module, wherein:

The text information extraction module is configured to acquire the drug instruction image, preprocess the acquired image based on an image processing method, and extract the effective font area;

The text information preliminary recognition module is configured to perform preliminary recognition of the text information by the segmented recognition method, based on the extracted effective font area. In this embodiment, the module operates on the extracted target character blocks and, using a data-driven approach and deep learning, achieves high-precision recognition of Chinese characters and digits. Its core principle is to build a high-precision discriminator by mining the latent features and mapping rules hidden in the Chinese character database. It mainly comprises sub-modules for training database construction, deep neural network model construction and deep model optimization. The names of chemical elements in drugs are mostly transliterated from Western languages and are often distinguished by rare characters. Such rare characters tend to have complex structures, which makes correct recognition harder, and they usually combine with other characters to express a fixed element name. In addition, drug names usually contain fixed vocabulary, such as 注射 (injection), 氧氟 (oxy-fluoro), 硅油 (silicone oil) and 复合 (compound). If these two kinds of names are recognized one character at a time, the error rate tends to be high because the fixed collocations between characters are ignored. This method therefore proposes a segmented recognition method: first, single-character training yields preliminary recognition results for the approval and modification dates, the approval number and the drug name; then a convolutional recurrent neural network model extracts the collocation relationships between words to further optimize the preliminary recognition results.

The text information optimized recognition module is configured to perform optimized recognition of the preliminarily recognized text information in combination with the text information from high-frequency word training, obtaining the optimized character recognition result.

Because this embodiment of the present invention corresponds to the embodiment above, it is described only briefly; for the similarities, refer to the description of the embodiment above, which is not repeated here.

An embodiment of this application also discloses a computer-readable storage medium that stores a set of computer instructions which, when executed by a processor, implements the deep learning-based automatic recognition method for drug instructions provided in any of the embodiments above.

In the several embodiments provided in this application, it should be understood that the disclosed technical content may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units may be a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, units or modules, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware or as a software functional unit.

If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part that contributes over the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk or an optical disc.

Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some or all of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A deep learning-based automatic recognition method for drug instructions, characterized by comprising the following steps:
acquiring a drug instruction image, preprocessing the acquired image based on an image processing method, and extracting an effective font area;
based on the extracted effective font area, performing preliminary recognition of the text information using a segmented recognition method;
and, in combination with the text information from high-frequency word training, performing optimized recognition of the preliminarily recognized text information to obtain an optimized recognition result for the characters.
2. The deep learning-based automatic recognition method for drug instructions according to claim 1, wherein preprocessing the acquired image based on the image processing method specifically comprises: scanned image enhancement processing, scanned document main direction correction processing, image tilt direction correction processing, text information area localization processing, character region segmentation processing, independent character segmentation processing, and glyph correction processing.
3. The deep learning-based automatic recognition method for drug instructions according to claim 2, wherein the scanned image enhancement processing, the scanned document main direction correction processing, the image tilt direction correction processing, the text information area localization processing, the character region segmentation processing, the independent character segmentation processing and the glyph correction processing specifically comprise:
the scanned image enhancement processing comprises: converting the image to grayscale using a weighted average method; and linearly filtering the image using mean filtering;
the scanned document main direction correction processing comprises: extracting the length and width features of the scanned document, projecting the gray values of the image in the two directions to obtain projection features, and determining the main direction of the scanned document in combination with prior features of the main direction;
the image tilt direction correction processing comprises: estimating the tilt angle of the image using the Radon transform, and projecting the image space into the polar coordinate space using the following formula:
Figure FDA0003960171780000011
a point in the polar coordinate space corresponds to a straight line through two corresponding points in the image space; the corresponding line in the image space is determined from the accumulated peak of the point set in the polar coordinate space, and because the polar coordinates themselves include the tilt angle θ, the tilt angle is determined from that accumulated peak;
the text information area localization processing comprises: applying morphological dilation to the image to reduce the gaps between adjacent strokes of a character and between adjacent characters; extracting the connected components of the image and merging regions of the same type; computing a horizontal projection histogram to obtain projection features; because the drug name in a set of drug instructions is printed in the largest font and lies entirely on a dark background, selecting the region with the largest font size and the largest color-block projection value as the drug name image region; because the approval date and modification date sit at the top of the document, where text is sparse, selecting the region at the top of the image whose color-block projection value is below a certain threshold as the approval date and modification date image region; and recognizing the square-bracket marks and the keywords inside them, thereby locating the region that contains the approval number;
the character region segmentation processing comprises: computing a horizontal projection histogram over the selected approval date or approval number image region, in which lines of characters appear as peaks and the spacing between lines appears as distinct troughs; and splitting at the troughs to obtain the separated approval date and modification date;
the independent character segmentation processing comprises: computing a vertical projection histogram for each line of the approval date, modification date and approval number and for the drug name region, in which each character's dot matrix appears as a peak and the gaps between characters appear as distinct troughs; and splitting at the troughs to obtain the digits, Chinese characters and symbols of the approval date, modification date, approval number and drug name;
the glyph correction processing comprises: because glyph deformation is local, correcting each line of characters separately; using the Hough transform to obtain the minimum bounding quadrilateral of each line of characters, computing the affine matrix H that maps the quadrilateral to a rectangle, and multiplying each segmented character by H to obtain the corrected character image.
4. The deep learning-based automatic recognition method for drug instructions according to claim 2, wherein performing preliminary recognition of the text information using the segmented recognition method based on the extracted effective font area comprises: obtaining preliminary recognition results for the approval and modification dates, the approval number and the drug name through single-character training, and extracting the collocation relationships between words with a convolutional recurrent neural network model to further optimize the preliminary recognition results.
5. The deep learning-based automatic recognition method for drug instructions according to claim 4, wherein obtaining preliminary recognition results for the approval and modification dates, the approval number and the drug name through single-character training, and extracting the collocation relationships between words with a convolutional recurrent neural network model to further optimize the preliminary recognition results, specifically comprises:
building a character training library: extracting the symbols in the national drug catalog, including Chinese characters, digits and the percent sign, generating images of the symbols in commonly used fonts, and slightly perturbing each image to add noise;
dividing a training set and a validation set: splitting the generated character training library into a training set and a validation set at a ratio of 5:1, wherein the training set is used to train the optimal deep model and the validation set is used to select the optimal hyperparameters of the deep model;
building a convolutional neural network model: inputting a character image of dimension 32×32, and performing a convolution with 6 kernels of size 5×5 to obtain convolutional feature maps of size 6@28×28; performing average pooling (downsampling) with stride=2 to obtain pooled feature maps of size 6@14×14; performing a convolution with 16 kernels of size 5×5 to obtain convolutional feature maps of size 16@10×10; performing average pooling (downsampling) with stride=2 to obtain pooled feature maps of size 16@5×5; rescaling the features with one convolution with a 5×5 kernel and two convolutions with 1×1 kernels to obtain rich feature combinations, and finally determining the output class through a nonlinear mapping;
optimizing the deep model: selecting an arbitrary reference sample, randomly selecting a sample from the character library of the same class as the positive sample, and randomly selecting a sample from the character library of a different class as the negative sample; using a Siamese (twin) mechanism, in one iteration feeding the reference sample into branch 1 and feeding the positive sample and the negative sample in turn into branch 2, the two branches sharing network parameters; classifying the sample features of branch 1 and branch 2 separately with SoftMax, constrained by a cross-entropy loss function; constraining branch 1 and branch 2 jointly with a contrastive loss function, so that the features of the reference sample and the positive sample are as similar as possible while the features of the reference sample and the negative sample differ as much as possible; backpropagating through the network and updating it;
model evaluation: updating the network hyperparameters and selecting the optimal ones by monitoring performance on the validation set;
preliminary character discrimination: feeding the character images produced by the image processing into the trained convolutional neural network to obtain a preliminary decision for each single character, and retaining the single-character classification probabilities.
6. The deep learning-based automatic recognition method for drug instructions according to claim 1, wherein performing optimized recognition of the preliminarily recognized text information in combination with the text information from high-frequency word training to obtain the optimized recognition result for the characters specifically comprises:
building a high-frequency word bank: automatically segmenting the drug names in the national drug catalog with the open-source JIEBA word segmentation system, and manually screening and correcting some difficult phrases; counting the occurrence probability of every phrase, selecting the high-frequency phrases to build the high-frequency word bank, generating images of the high-frequency words in commonly used fonts, and slightly perturbing each image to add noise;
dividing a training set and a validation set: splitting the generated high-frequency word bank into a training set and a validation set at a ratio of 5:1, wherein the training set is used to train the optimal deep model and the validation set is used to select the optimal hyperparameters of the deep model;
building a convolutional recurrent memory model: using a convolutional neural sub-network to extract features from each character of the high-frequency word $X=\{x_t\}$, obtaining the character features $F=\{f_t\}$; the recurrent neural sub-network takes an input $x_t$ at time step $t$ and the hidden state $h_{t-1}$ from time step $t-1$ to compute the hidden state $h_t$ at time step $t$, and uses ReLU to obtain the nonlinear relationship between the output $y_t$ at time $t$ and the input:
$$h_t = \tanh(w_{hh} h_{t-1} + w_{hx} x_t)$$
$$y_t = W_{hy} h_t$$
where $w_{hh}$, $w_{hx}$ and $W_{hy}$ are network weights to be learned;
optimizing the deep model: the loss at each time step is a cross-entropy loss, and the overall loss is the sum of the per-time-step losses; backpropagating through the network and updating it;
high-frequency word correction: multiplying the probability of each character obtained from the convolutional recurrent neural network by the single-character classification probability retained during preliminary character discrimination to obtain the optimized character recognition result.
7. A deep learning-based automatic recognition system for drug instructions, based on the deep learning-based automatic recognition method for drug instructions according to any one of claims 1 to 6, comprising:
a text information extraction module configured to acquire a drug instruction image, preprocess the acquired image based on an image processing method, and extract an effective font area;
a text information preliminary recognition module configured to perform preliminary recognition of the text information using a segmented recognition method, based on the extracted effective font area;
and a text information optimized recognition module configured to perform optimized recognition of the preliminarily recognized text information in combination with the text information from high-frequency word training, obtaining an optimized recognition result for the characters.
8. A storage medium comprising a stored program, wherein the program, when executed, performs the deep learning-based automatic recognition method for drug instructions according to any one of claims 1 to 6.
CN202211478211.7A 2022-11-23 2022-11-23 Deep learning-based automatic drug specification identification method and system and storage medium Withdrawn CN115731550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211478211.7A CN115731550A (en) 2022-11-23 2022-11-23 Deep learning-based automatic drug specification identification method and system and storage medium

Publications (1)

Publication Number Publication Date
CN115731550A true CN115731550A (en) 2023-03-03

Family

ID=85297787

Country Status (1)

Country Link
CN (1) CN115731550A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116912845A (en) * 2023-06-16 2023-10-20 广东电网有限责任公司佛山供电局 Intelligent content identification and analysis method and device based on NLP and AI
CN116912845B (en) * 2023-06-16 2024-03-19 广东电网有限责任公司佛山供电局 Intelligent content identification and analysis method and device based on NLP and AI

Similar Documents

Publication Publication Date Title
EP2166488B1 (en) Handwritten word spotter using synthesized typed queries
KR101122854B1 (en) Method and apparatus for populating electronic forms from scanned documents
US11790675B2 (en) Recognition of handwritten text via neural networks
Wilkinson et al. Neural Ctrl-F: segmentation-free query-by-string word spotting in handwritten manuscript collections
US12046067B2 (en) Optical character recognition systems and methods for personal data extraction
Park et al. Automatic detection and recognition of Korean text in outdoor signboard images
Kaundilya et al. Automated text extraction from images using OCR system
CN108664975A (en) Uyghur handwritten letter recognition method, system and electronic device
CN113901952A (en) Print form and handwritten form separated character recognition method based on deep learning
CN106991416A (en) Laboratory test report recognition method based on manual photographing
Meng et al. Ancient Asian character recognition for literature preservation and understanding
Rahman et al. Bn-htrd: A benchmark dataset for document level offline bangla handwritten text recognition (htr) and line segmentation
CN115731550A (en) Deep learning-based automatic drug specification identification method and system and storage medium
CN115311666A (en) Image-text recognition method and device, computer equipment and storage medium
Karthik et al. Segmentation and recognition of handwritten kannada text using relevance feedback and histogram of oriented gradients–a novel approach
CN117076455A (en) Intelligent identification-based policy structured storage method, medium and system
Al-Barhamtoshy et al. Arabic OCR segmented-based system
Sen et al. BYANJON: a ground truth preparation system for online handwritten Bangla documents
CN110298350A (en) Efficient printed Uyghur word segmentation algorithm
Tamirat Customers Identity Card Data Detection and Recognition Using Image Processing
Sarnacki et al. Character Recognition Based on Skeleton Analysis
Neumann Scene text localization and recognition in images and videos
Mishra Understanding Text in Scene Images
Mahmood Text Detection and Recognition from Natural Images
JP2906758B2 (en) Character reader

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20230303