CN115731550A - Deep learning-based automatic drug specification identification method and system and storage medium - Google Patents
- Publication number: CN115731550A (application CN202211478211.7A)
- Authority: CN (China)
- Prior art keywords: character, image, medicine, characters, training
- Prior art date: 2022-11-23
- Legal status: Withdrawn (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention provides a deep learning-based method, system, and storage medium for automatically identifying drug specifications. The method comprises the following steps: acquiring a drug specification image, preprocessing the acquired image with image processing methods, and extracting the valid character region; preliminarily recognizing the text information in the extracted valid character region using a segmented recognition method; and optimizing the preliminary recognition with text information from high-frequency word training to obtain the optimized character recognition result. With the deep learning-based automatic drug specification identification method provided by the invention, key feature information uploaded by each drug company, such as drug names, approval numbers, and approval and revision dates, can be automatically checked for compliance. A real-time in-hospital specification query system can thus be established, allowing front-line medical workers to query complete and up-to-date specification scans at any time, reducing labor and material costs and greatly improving working efficiency.
Description
Technical Field
The invention relates to the technical field of image recognition, and in particular to a deep learning-based automatic drug specification (package insert) identification method and system and a storage medium.
Background
Optical Character Recognition (OCR) refers to the process in which an electronic device (e.g., a scanner or a digital camera) examines characters printed on paper and translates their shapes into computer text using character recognition methods. The wide adoption of digital cameras and flatbed scanners has greatly driven the development of OCR technology. Existing approaches fall into two main categories: methods based on traditional image processing, and learning strategies based on deep neural networks. For an automatic drug specification management system, existing methods have the following defects:
(1) Functional defect: the prior art cannot automatically identify key feature information in the specification, such as the title, approval number, approval date, and revision date, so specification management must rely on manual review;
(2) First technical defect, uneven scan quality: some scans suffer from blurred or deformed characters, document skew, and inconsistent scanning orientation (horizontal or vertical), and existing character recognition algorithms do not specifically address these problems;
(3) Second technical defect: most generic drug names and active-ingredient names are transliterated from Western languages and are distinguished by rarely used characters. These rare characters are often structurally complex, which makes correct recognition harder, and they usually combine with other characters to express a fixed chemical-structure name. In addition, drug names often include fixed terms such as "injection", "oxyfluoride", "silicone oil", and "compound". For both types of names, recognizing characters one by one yields a high error rate because the fixed collocations between characters are ignored.
Disclosure of Invention
In view of the above technical problems, a deep learning-based automatic drug specification identification method, system, and storage medium are provided. The invention mainly uses a data-driven approach and deep learning technology to achieve high-precision extraction and recognition of Chinese characters and numbers.
The technical means adopted by the invention are as follows:
A deep learning-based automatic drug specification identification method comprises the following steps:
acquiring a drug specification image, preprocessing the acquired image with image processing methods, and extracting the valid character region;
preliminarily recognizing the text information in the extracted valid character region using a segmented recognition method;
and optimizing the preliminary recognition with text information from high-frequency word training to obtain the optimized character recognition result.
Further, preprocessing the acquired image with image processing methods specifically includes: scanned-image enhancement, scan principal-direction correction, image tilt correction, text information region positioning, character region segmentation, independent character segmentation, and font correction.
Further, the scanned-image enhancement, scan principal-direction correction, image tilt correction, text information region positioning, character region segmentation, independent character segmentation, and font correction specifically include:
the scanned-image enhancement includes: graying the image with a weighted-average method, and linearly filtering the image with a mean filter;
the scan principal-direction correction includes: extracting the length and width features of the scan, projecting the image gray values onto the two axes to obtain projection features, and judging the principal direction of the scan in combination with prior knowledge of the principal direction;
the image tilt correction includes: estimating the tilt angle of the image with the Radon transform, projecting the image space into the polar coordinate space with the following formula:
R(ρ, θ) = ∬ f(x, y) δ(ρ - x·cos θ - y·sin θ) dx dy
a point in polar coordinate space corresponds to a straight line in image space; the corresponding line in image space is determined from the accumulation peak of the point set in polar coordinate space, and since the polar coordinates include the tilt angle θ, the tilt angle is determined from the accumulation peak of the point set;
the text information region positioning includes: performing a morphological dilation operation on the image to reduce the gaps between adjacent strokes and adjacent characters; extracting connected components of the image and merging regions of the same type; making a horizontal projection histogram by the projection method to obtain projection features; given that the drug-name characters in a drug specification are the largest and all lie in a dark background region, selecting the region with the largest character size and largest color-block projection value as the drug-name image region; given that the approval date and revision date sit at the top of the document with sparse characters, selecting the top image region whose color-block projection value falls below a certain threshold as the approval and revision date image region; and identifying the parenthesis signs and the keywords inside them to locate the region containing the approval number;
the character region segmentation includes: making a horizontal projection histogram of the selected approval date or approval number image region, where text lines appear as peaks and line spacings appear as clear troughs on the histogram, and segmenting at the troughs to obtain the separated approval number, approval date, and revision date;
the independent character segmentation makes a vertical projection histogram of each line of the approval date, revision date, approval number, and drug-name regions, where each character lattice appears as a peak and inter-character gaps appear as clear troughs on the histogram, and segments at the trough positions to obtain the digits, Chinese characters, and symbols of the approval date, revision date, approval number, and drug name;
the font correction includes: correcting each line of characters separately, since font deformation is local; obtaining the minimal enclosing quadrilateral of each line of characters with the Hough transform, computing the affine matrix H that maps the quadrilateral to a rectangle, and multiplying each segmented independent character by H to obtain the corrected character image.
Further, preliminarily recognizing the text information with a segmented recognition method based on the extracted valid character region includes: obtaining preliminary recognition results for the approval and revision dates, the approval number, and the drug name with single-character training, and then extracting the collocation relationships between characters with a convolutional recurrent neural network model to further optimize the preliminary results.
Further, obtaining preliminary recognition results for the approval and revision dates, the approval number, and the drug name with single-character training, extracting the collocation relationships between characters with a convolutional recurrent neural network model, and further optimizing the preliminary results specifically comprises the following steps:
constructing a character training library: extracting the symbols appearing in the national drug catalog, including Chinese characters, digits, and percent signs, generating symbol pictures in common fonts, and slightly perturbing each picture to add noise;
dividing a training set and a verification set: splitting the generated character training library at a ratio of 5:1 into a training set and a verification set, where the training set is used to train the optimal depth model and the verification set is used to select the optimal depth-model hyperparameters;
constructing a convolutional neural network model: inputting a character picture of dimension 32×32; convolving with 6 kernels of size 5×5 to obtain a 6@28×28 convolution feature map; average pooling (downsampling) with stride = 2 to obtain a 6@14×14 pooling feature map; convolving with 16 kernels of size 5×5 to obtain a 16@10×10 convolution feature map; average pooling (downsampling) with stride = 2 to obtain a 16@5×5 pooling feature map; rescaling the features with one 5×5 convolution and two 1×1 convolutions to obtain rich feature combinations; and finally producing the class output through a nonlinear mapping;
optimizing the depth model: randomly selecting a reference sample, randomly selecting a sample of the same class as a positive sample, and randomly selecting a sample from a different class as a negative sample; adopting a twin (Siamese) mechanism in which, within one iteration, the reference sample is fed into branch 1 while the positive and negative samples are fed into branch 2 in turn, the two branches sharing network parameters; classifying the sample features of branch 1 and branch 2 with SoftMax and constraining them with a cross-entropy loss; jointly constraining branch 1 and branch 2 with a contrastive loss so that the features of the reference and positive samples are as similar as possible while the features of the reference and negative samples differ as much as possible; and back-propagating and updating the network;
model evaluation: updating the network hyperparameters and selecting the optimal hyperparameters by monitoring the verification set;
preliminary character discrimination: feeding the image-processed character pictures into the trained convolutional neural network to obtain a preliminary judgment for each single character, and retaining the single-character classification probabilities.
Further, optimizing the preliminarily recognized text information with the text information from high-frequency word training to obtain the optimized character recognition result specifically comprises the following steps:
constructing a high-frequency word library: automatically segmenting the drug names in the national drug catalog with the JIEBA open-source word segmentation system, and manually screening and correcting difficult phrases; counting the occurrence probability of all phrases, selecting the high-frequency phrases to build the high-frequency word library, generating pictures of the high-frequency words in common fonts, and slightly perturbing each picture to add noise;
dividing a training set and a verification set: splitting the generated high-frequency word library at a ratio of 5:1 into a training set and a verification set, where the training set is used to train the optimal depth model and the verification set is used to select the optimal depth-model hyperparameters;
constructing a convolutional recurrent memory model: using a convolutional neural subnetwork to extract features from each character of a high-frequency word X = {x_t}, obtaining the per-character features F = {f_t}; the recurrent neural subnetwork takes the input x_t at time step t and the hidden state h_{t-1} at time step t-1 to compute the hidden state h_t at time step t, and uses ReLU to model the nonlinear relationship between the output y_t at time t and the input:
h_t = tanh(W_hh·h_{t-1} + W_hx·x_t)
y_t = W_hy·h_t
where W_hh, W_hx, and W_hy are the network weights to be learned;
optimizing the depth model: the loss at each time step is a cross-entropy loss and the total loss is the sum over all time steps; the loss is back-propagated and the network is updated;
high-frequency word correction: multiplying the per-character probabilities produced by the convolutional recurrent neural network with the single-character classification probabilities retained from the preliminary character discrimination to obtain the optimized character recognition result.
Based on the above deep learning-based automatic drug specification identification method, the invention further provides a deep learning-based automatic drug specification identification system, comprising:
a text information extraction module, configured to acquire a drug specification image, preprocess the acquired image with image processing methods, and extract the valid character region;
a text information preliminary recognition module, configured to preliminarily recognize the text information with a segmented recognition method based on the extracted valid character region;
and a text information optimized recognition module, configured to optimize the preliminarily recognized text information with text information from high-frequency word training to obtain the optimized character recognition result.
The invention further provides a storage medium comprising a stored program, wherein the deep learning-based automatic drug specification identification method is performed when the program runs.
Compared with the prior art, the invention has the following advantages:
1. The invention provides an integrated identification method for drug specifications, which reduces labor costs and greatly improves the efficiency and timeliness of specification management;
2. The invention provides a family of image enhancement and character recognition methods that handle blurred or deformed characters, document skew, and uncertain scanning orientation (horizontal or vertical) well;
3. The invention provides a segmented recognition method: single characters are first recognized preliminarily with a convolutional neural network, and the text information is then further optimized with a convolutional recurrent memory network trained on high-frequency words, so that the high-frequency words appearing throughout drug specifications improve the accuracy of name detection.
For the above reasons, the invention can be widely applied in fields such as drug specification identification.
Drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below illustrate some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a schematic diagram of a convolutional neural network model according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a convolution cyclic memory model according to an embodiment of the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
As shown in FIG. 1, the invention provides a deep learning-based automatic drug specification identification method, comprising the following steps:
S1, acquiring a drug specification image, preprocessing the acquired image with image processing methods, and extracting the valid character region;
S2, preliminarily recognizing the text information with a segmented recognition method based on the extracted valid character region;
S3, optimizing the preliminarily recognized text information with text information from high-frequency word training to obtain the optimized character recognition result.
In a specific implementation, as a preferred embodiment of the present invention, the preprocessing of the acquired image in step S1 specifically includes: scanned-image enhancement, scan principal-direction correction, image tilt correction, text information region positioning, character region segmentation, independent character segmentation, and font correction. The aim is to orient the document and locate the key regions so as to extract a clear, correctly oriented valid character region, laying the foundation for accurate subsequent text recognition.
In a specific embodiment of the present invention, the scanned-image enhancement, scan principal-direction correction, image tilt correction, text information region positioning, character region segmentation, independent character segmentation, and font correction specifically include:
the scanned-image enhancement includes: graying the image with a weighted-average method, and linearly filtering the image with a mean filter to remove noise caused by poorly performing scanning equipment;
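As an illustration of the enhancement step above, a minimal sketch using OpenCV and NumPy follows; the 3×3 mean-filter window is an assumption, since the embodiment does not fix one.

```python
import cv2
import numpy as np

def enhance_scan(image_bgr: np.ndarray, ksize: int = 3) -> np.ndarray:
    """Weighted-average graying followed by linear mean filtering.

    The 0.299/0.587/0.114 weights are the standard luminance weights;
    the kernel size is an assumption."""
    b, g, r = cv2.split(image_bgr.astype(np.float32))
    gray = 0.299 * r + 0.587 * g + 0.114 * b      # weighted-average graying
    gray = np.clip(gray, 0, 255).astype(np.uint8)
    return cv2.blur(gray, (ksize, ksize))         # mean filtering removes scanner noise
```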
the scan principal-direction correction includes: extracting the length and width features of the scan, projecting the image gray values onto the two axes to obtain projection features, and judging the principal direction of the scan in combination with prior knowledge of the principal direction;
the image tilt correction includes: estimating the tilt angle of the image with the Radon transform, projecting the image space into the polar coordinate space with the following formula:
R(ρ, θ) = ∬ f(x, y) δ(ρ - x·cos θ - y·sin θ) dx dy
a point in polar coordinate space corresponds to a straight line in image space; the corresponding line in image space is determined from the accumulation peak of the point set in polar coordinate space, and since the polar coordinates include the tilt angle θ, the tilt angle is determined from the accumulation peak of the point set;
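A minimal sketch of the skew-angle estimation follows, assuming scikit-image's radon; the ±15° search window, the 0.5° step, and the variance-based peak criterion are assumptions.

```python
import numpy as np
from skimage.transform import radon

def estimate_skew_angle(gray: np.ndarray, search: float = 15.0) -> float:
    """Estimate the document tilt angle with the Radon transform.

    Candidate angles around the horizontal text direction are scanned and
    the angle whose projection shows the sharpest accumulation peak
    (largest variance) is taken as the skew."""
    img = gray.astype(np.float64)
    img -= img.mean()                                   # zero mean sharpens the peak
    angles = np.arange(-search, search, 0.5) + 90.0
    sinogram = radon(img, theta=angles, circle=False)   # one column per angle
    peakiness = sinogram.var(axis=0)                    # accumulation peak strength
    return float(angles[np.argmax(peakiness)] - 90.0)   # skew in degrees
```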
the text information region positioning includes: performing a morphological dilation operation on the image to reduce the gaps between adjacent strokes and adjacent characters; extracting connected components of the image and merging regions of the same type; making a horizontal projection histogram by the projection method to obtain projection features; given that the drug-name characters in a drug specification are the largest and all lie in a dark background region, selecting the region with the largest character size and largest color-block projection value as the drug-name image region; given that the approval date and revision date sit at the top of the document with sparse characters, selecting the top image region whose color-block projection value falls below a certain threshold as the approval and revision date image region; and identifying the parenthesis signs and the keywords inside them to locate the region containing the approval number;
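A minimal sketch of the positioning primitives above (dilation, connected components, horizontal projection) follows; the Otsu binarization and the 5×5 structuring element are assumptions.

```python
import cv2
import numpy as np

def locate_text_regions(gray: np.ndarray):
    """Dilate to close stroke/character gaps, extract connected components,
    and compute the horizontal projection histogram used for region selection."""
    _, binary = cv2.threshold(gray, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 5))
    dilated = cv2.dilate(binary, kernel)                    # merge nearby strokes
    n, labels, stats, _ = cv2.connectedComponentsWithStats(dilated)
    projection = dilated.sum(axis=1)                        # horizontal projection
    return stats[1:], projection                            # drop label 0 (background)
```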
the character region segmentation includes: making a horizontal projection histogram of the selected approval date or approval number image region, where text lines appear as peaks and line spacings appear as clear troughs on the histogram, and segmenting at the troughs to obtain the separated approval number, approval date, and revision date;
the independent character segmentation makes a vertical projection histogram of each line of the approval date, revision date, approval number, and drug-name regions, where each character lattice appears as a peak and inter-character gaps appear as clear troughs on the histogram, and segments at the trough positions to obtain the digits, Chinese characters, and symbols of the approval date, revision date, approval number, and drug name;
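A minimal sketch of the trough-based segmentation follows; the zero-projection trough criterion and the minimum run width are assumptions.

```python
import numpy as np

def split_at_troughs(binary_line: np.ndarray, min_width: int = 2):
    """Cut one binarized text line into characters at projection troughs.

    Columns with nonzero vertical projection form character peaks; zero
    columns are troughs (gaps). Runs narrower than `min_width` are dropped
    as noise."""
    proj = (binary_line > 0).sum(axis=0)          # vertical projection histogram
    chars, start = [], None
    for x, ink in enumerate(np.append(proj > 0, False)):  # sentinel trough at end
        if ink and start is None:
            start = x                             # a character run begins
        elif not ink and start is not None:
            if x - start >= min_width:
                chars.append((start, x))          # [start, end) of one character
            start = None
    return chars
```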
the font correction includes: correcting each line of characters separately, since font deformation is local; obtaining the minimal enclosing quadrilateral of each line of characters with the Hough transform, computing the affine matrix H that maps the quadrilateral to a rectangle, and multiplying each segmented independent character by H to obtain the corrected character image.
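A minimal sketch of the line-wise correction follows; cv2.minAreaRect stands in for the Hough-based quadrilateral fit, and because a general quadrilateral-to-rectangle mapping is perspective rather than affine, cv2.getPerspectiveTransform supplies the matrix H here.

```python
import cv2
import numpy as np

def correct_line(gray_line: np.ndarray) -> np.ndarray:
    """Fit the minimal enclosing quadrilateral of one text line and warp it
    into an upright rectangle."""
    _, binary = cv2.threshold(gray_line, 0, 255,
                              cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    quad = cv2.boxPoints(cv2.minAreaRect(cv2.findNonZero(binary)))
    s, d = quad.sum(axis=1), np.diff(quad, axis=1).ravel()
    tl, br = quad[np.argmin(s)], quad[np.argmax(s)]       # order the corners
    tr, bl = quad[np.argmin(d)], quad[np.argmax(d)]
    w = int(max(np.linalg.norm(tr - tl), np.linalg.norm(br - bl)))
    h = int(max(np.linalg.norm(bl - tl), np.linalg.norm(br - tr)))
    src = np.array([tl, tr, br, bl], dtype=np.float32)
    dst = np.array([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]],
                   dtype=np.float32)
    H = cv2.getPerspectiveTransform(src, dst)             # quadrilateral -> rectangle
    return cv2.warpPerspective(gray_line, H, (w, h))
```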
In a specific implementation, as a preferred embodiment of the present invention, step S2 preliminarily recognizes the text information with a segmented recognition method based on the extracted valid character region, which includes: obtaining preliminary recognition results for the approval and revision dates, the approval number, and the drug name with single-character training, and then extracting the collocation relationships between characters with a convolutional recurrent neural network model to further optimize the preliminary results, specifically comprising the following steps:
constructing a character training library: extracting the symbols appearing in the national drug catalog, including Chinese characters, digits, and percent signs, generating symbol pictures in common fonts, and slightly perturbing each picture to add noise, thereby enhancing the recognition robustness of the depth model; the perturbation operations include cropping and angular deflection.
Dividing a training set and a verification set: splitting the generated character training library at a ratio of 5:1 into a training set and a verification set, where the training set is used to train the optimal depth model and the verification set is used to select the optimal depth-model hyperparameters, such as the batch size and training step size.
Constructing a convolutional neural network model: inputting a character picture of dimension 32×32; convolving with 6 kernels of size 5×5 to obtain a 6@28×28 convolution feature map; average pooling (downsampling) with stride = 2 to obtain a 6@14×14 pooling feature map; convolving with 16 kernels of size 5×5 to obtain a 16@10×10 convolution feature map; average pooling (downsampling) with stride = 2 to obtain a 16@5×5 pooling feature map; rescaling the features with one 5×5 convolution and two 1×1 convolutions to obtain rich feature combinations; and finally producing the class output through a nonlinear mapping. The constructed convolutional neural network model is shown in FIG. 2.
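A PyTorch sketch matching the stated dimensions follows; the 120 and 84 intermediate channel widths are assumptions (the embodiment leaves them unstated).

```python
import torch
import torch.nn as nn

class CharCNN(nn.Module):
    """LeNet-style network following the text: 32x32 input -> 6@28x28 ->
    6@14x14 -> 16@10x10 -> 16@5x5, then one 5x5 and two 1x1 convolutions."""
    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, 5), nn.ReLU(),     # 6@28x28
            nn.AvgPool2d(2, stride=2),         # 6@14x14, average pooling
            nn.Conv2d(6, 16, 5), nn.ReLU(),    # 16@10x10
            nn.AvgPool2d(2, stride=2),         # 16@5x5
            nn.Conv2d(16, 120, 5), nn.ReLU(),  # one 5x5 kernel -> 120@1x1
            nn.Conv2d(120, 84, 1), nn.ReLU(),  # first 1x1 rescaling
            nn.Conv2d(84, num_classes, 1),     # second 1x1 -> class scores
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.features(x).flatten(1)     # logits; SoftMax lives in the loss
```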
Optimizing the depth model: the aim is to train the network to extract discriminative features that represent each character; the network is trained jointly in a multi-class mode and a same/different binary mode to improve model accuracy, specifically through the following steps (a code sketch follows the list):
A. randomly selecting a reference sample, randomly selecting a sample of the same class as a positive sample, and randomly selecting a sample from a different class as a negative sample;
B. adopting a twin (Siamese) mechanism in which, within one iteration, the reference sample is fed into branch 1 while the positive and negative samples are fed into branch 2 in turn, the two branches sharing network parameters;
C. classifying the sample features of branch 1 and branch 2 with SoftMax and constraining them with a cross-entropy loss;
D. jointly constraining branch 1 and branch 2 with a contrastive loss so that the features of the reference and positive samples are as similar as possible while the features of the reference and negative samples differ as much as possible;
E. back-propagating and updating the network.
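The code sketch referenced above, in PyTorch; the margin value and the use of the classifier outputs as the compared features are assumptions (a penultimate feature layer would typically be compared).

```python
import torch
import torch.nn.functional as F

def contrastive_loss(f1, f2, same, margin: float = 1.0):
    """Pull same-class features together; push different-class features
    beyond `margin` (standard contrastive loss)."""
    d = F.pairwise_distance(f1, f2)
    return (same * d.pow(2) + (1.0 - same) * F.relu(margin - d).pow(2)).mean()

def twin_step(model, ref, pos, neg, ref_y, pos_y, neg_y, optimizer):
    """One iteration of the twin scheme: reference through branch 1, positive
    then negative through branch 2; both branches share `model`'s weights."""
    optimizer.zero_grad()
    out_ref = model(ref)                                     # branch 1
    loss = torch.zeros((), device=ref.device)
    for other, other_y, same in ((pos, pos_y, 1.0), (neg, neg_y, 0.0)):
        out_other = model(other)                             # branch 2
        loss = loss + F.cross_entropy(out_ref, ref_y)        # SoftMax + CE, branch 1
        loss = loss + F.cross_entropy(out_other, other_y)    # SoftMax + CE, branch 2
        flag = torch.full((ref.size(0),), same, device=ref.device)
        loss = loss + contrastive_loss(out_ref, out_other, flag)
    loss.backward()                                          # back-propagate
    optimizer.step()                                         # update the network
    return loss.item()
```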
Model evaluation: updating the network hyperparameters and selecting the optimal hyperparameters by monitoring the verification set.
Preliminary character discrimination: feeding the image-processed character pictures into the trained convolutional neural network to obtain a preliminary judgment for each single character, and retaining the single-character classification probabilities. The probabilities are retained so that, for characters with similar glyphs, they can be updated in the next step using the high-frequency word training.
In a specific implementation, as a preferred embodiment of the present invention, step S3 optimizes the preliminarily recognized text information with the text information from high-frequency word training to obtain the optimized character recognition result, which specifically includes:
constructing a high-frequency word library: automatically segmenting the drug names in the national drug catalog with the JIEBA open-source word segmentation system, and manually screening and correcting difficult phrases; counting the occurrence probability of all phrases, selecting the high-frequency phrases to build the high-frequency word library, generating pictures of the high-frequency words in common fonts, and slightly perturbing each picture to add noise, thereby enhancing the robustness of high-frequency word recognition; the perturbation operations include cropping and angular deflection.
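A minimal sketch of the lexicon construction follows, assuming the JIEBA package; the top-k cutoff and the single-character filter are assumptions, and the manual screening step is not shown.

```python
from collections import Counter
import jieba

def build_high_freq_lexicon(drug_names, top_k: int = 2000):
    """Segment drug names with JIEBA, count phrase frequencies, and keep
    the most frequent phrases with their occurrence probabilities."""
    counts = Counter()
    for name in drug_names:
        counts.update(w for w in jieba.lcut(name) if len(w) > 1)  # drop single chars
    total = sum(counts.values())
    return {w: c / total for w, c in counts.most_common(top_k)}   # phrase -> probability
```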
Dividing a training set and a verification set: splitting the generated high-frequency word library at a ratio of 5:1 into a training set and a verification set, where the training set is used to train the optimal depth model and the verification set is used to select the optimal depth-model hyperparameters;
constructing a convolutional recurrent memory model: a convolutional recurrent memory model is adopted for high-frequency word recognition, i.e., a convolutional neural subnetwork extracts the spatial information of the high-frequency words and a recurrent memory subnetwork extracts the correlation information between the characters of the high-frequency words, set up as follows:
a. using the convolutional neural subnetwork to extract features from each character of a high-frequency word X = {x_t}, obtaining the per-character features F = {f_t};
b. the recurrent neural subnetwork takes the input x_t at time step t and the hidden state h_{t-1} at time step t-1 to compute the hidden state h_t at time step t, and uses ReLU to model the nonlinear relationship between the output y_t at time t and the input:
h_t = tanh(W_hh·h_{t-1} + W_hx·x_t)
y_t = W_hy·h_t
where W_hh, W_hx, and W_hy are the network weights to be learned. The constructed convolutional recurrent memory model is shown in FIG. 3.
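A NumPy transcription of the recurrence follows; the weight shapes are assumptions consistent with the equations.

```python
import numpy as np

def rnn_forward(x_seq, W_hh, W_hx, W_hy):
    """Run h_t = tanh(W_hh h_{t-1} + W_hx x_t), y_t = W_hy h_t over a
    sequence of per-character features."""
    h = np.zeros(W_hh.shape[0])
    ys = []
    for x_t in x_seq:                       # x_t: per-character feature from the CNN
        h = np.tanh(W_hh @ h + W_hx @ x_t)  # hidden state at time step t
        ys.append(W_hy @ h)                 # output y_t at time t
    return np.stack(ys), h
```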
Optimizing the depth model: the loss at each time step is a cross-entropy loss and the total loss is the sum over all time steps; the loss is back-propagated and the network is updated.
High-frequency word correction: multiplying the per-character probabilities produced by the convolutional recurrent neural network with the single-character classification probabilities retained from the preliminary character discrimination to obtain the optimized character recognition result.
Corresponding to the deep learning-based automatic drug specification identification method of the present application, the application further provides a deep learning-based automatic drug specification identification system, comprising a text information extraction module, a text information preliminary recognition module, and a text information optimized recognition module, wherein:
the text information extraction module is configured to acquire a drug specification image, preprocess the acquired image with image processing methods, and extract the valid character region;
the text information preliminary recognition module is configured to preliminarily recognize the text information with a segmented recognition method based on the extracted valid character region. In this embodiment, for the extracted target text blocks, the module achieves high-precision recognition of Chinese characters and numbers based on a data-driven approach and deep learning technology; the core principle is to build a high-precision discriminator by mining the latent features and mapping rules hidden in a Chinese character database. The module mainly comprises submodules for training-database construction, deep neural network model construction, and depth-model optimization. Many chemical-ingredient names in drugs are transliterated from Western languages and are distinguished by rare characters; these rare characters are structurally complex, which makes correct recognition harder, and they often combine with other characters to express a fixed ingredient name. In addition, drug names often include fixed terms such as "injection", "oxyfluoride", "silicone oil", and "compound". For both types of names, recognizing single characters in isolation yields a high error rate because the fixed collocations between characters are ignored. The method therefore adopts a segmented recognition approach: single-character training first yields preliminary recognition results for the approval and revision dates, the approval number, and the drug name, and a convolutional recurrent neural network model then extracts the collocation relationships between characters to further optimize the preliminary results;
and the text information optimized recognition module is configured to optimize the preliminarily recognized text information with text information from high-frequency word training to obtain the optimized character recognition result.
Since the system embodiment corresponds to the method embodiment described above, its description is brief; for related details, refer to the description of the method embodiment, which is not repeated here.
The embodiment of the application further discloses a computer-readable storage medium storing a set of computer instructions which, when executed by a processor, implement the deep learning-based automatic drug specification identification method provided by any of the above embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed technology can be implemented in other ways. The above-described apparatus embodiments are merely illustrative, and for example, the division of the units may be a logical division, and in actual implementation, there may be another division, for example, multiple units or components may be combined or may be integrated into another system, or some features may be omitted, or may not be executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, units or modules, and may be in an electrical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one position, or may be distributed on a plurality of units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may also be implemented in the form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, or a magnetic or optical disk.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.
Claims (8)
1. A deep learning-based automatic drug specification identification method, characterized by comprising the following steps:
acquiring a drug specification image, preprocessing the acquired image with image processing methods, and extracting the valid character region;
preliminarily recognizing the text information in the extracted valid character region using a segmented recognition method;
and optimizing the preliminary recognition with text information from high-frequency word training to obtain the optimized character recognition result.
2. The deep learning-based automatic drug specification identification method of claim 1, wherein preprocessing the acquired image with image processing methods specifically includes: scanned-image enhancement, scan principal-direction correction, image tilt correction, text information region positioning, character region segmentation, independent character segmentation, and font correction.
3. The deep learning-based automatic drug specification identification method of claim 2, wherein the scanned-image enhancement, scan principal-direction correction, image tilt correction, text information region positioning, character region segmentation, independent character segmentation, and font correction specifically include:
the scanned-image enhancement includes: graying the image with a weighted-average method, and linearly filtering the image with a mean filter;
the scan principal-direction correction includes: extracting the length and width features of the scan, projecting the image gray values onto the two axes to obtain projection features, and judging the principal direction of the scan in combination with prior knowledge of the principal direction;
the image tilt correction includes: estimating the tilt angle of the image with the Radon transform, projecting the image space into the polar coordinate space with the following formula:
R(ρ, θ) = ∬ f(x, y) δ(ρ - x·cos θ - y·sin θ) dx dy
a point in polar coordinate space corresponds to a straight line in image space; the corresponding line in image space is determined from the accumulation peak of the point set in polar coordinate space, and since the polar coordinates include the tilt angle θ, the tilt angle is determined from the accumulation peak of the point set;
the text information region positioning includes: performing a morphological dilation operation on the image to reduce the gaps between adjacent strokes and adjacent characters; extracting connected components of the image and merging regions of the same type; making a horizontal projection histogram by the projection method to obtain projection features; given that the drug-name characters in a drug specification are the largest and all lie in a dark background region, selecting the region with the largest character size and largest color-block projection value as the drug-name image region; given that the approval date and revision date sit at the top of the document with sparse characters, selecting the top image region whose color-block projection value falls below a certain threshold as the approval and revision date image region; and identifying the parenthesis signs and the keywords inside them to locate the region containing the approval number;
the character region segmentation includes: making a horizontal projection histogram of the selected approval date or approval number image region, where text lines appear as peaks and line spacings appear as clear troughs on the histogram, and segmenting at the troughs to obtain the separated approval date and revision date;
the independent character segmentation makes a vertical projection histogram of each line of the approval date, revision date, approval number, and drug-name regions, where each character lattice appears as a peak and inter-character gaps appear as clear troughs on the histogram, and segments at the trough positions to obtain the digits, Chinese characters, and symbols of the approval date, revision date, approval number, and drug name;
the font correction includes: correcting each line of characters separately, since font deformation is local; obtaining the minimal enclosing quadrilateral of each line of characters with the Hough transform, computing the affine matrix H that maps the quadrilateral to a rectangle, and multiplying each segmented independent character by H to obtain the corrected character image.
4. The deep learning-based automatic drug specification identification method of claim 2, wherein preliminarily recognizing the text information with a segmented recognition method based on the extracted valid character region includes: obtaining preliminary recognition results for the approval and revision dates, the approval number, and the drug name with single-character training, and extracting the collocation relationships between characters with a convolutional recurrent neural network model to further optimize the preliminary results.
5. The deep learning-based automatic drug specification identification method of claim 4, wherein obtaining preliminary recognition results for the approval and revision dates, the approval number, and the drug name with single-character training, extracting the collocation relationships between characters with a convolutional recurrent neural network model, and further optimizing the preliminary results specifically comprises:
constructing a character training library: extracting the symbols appearing in the national drug catalog, including Chinese characters, digits, and percent signs, generating symbol pictures in common fonts, and slightly perturbing each picture to add noise;
dividing a training set and a verification set: splitting the generated character training library at a ratio of 5:1 into a training set and a verification set, where the training set is used to train the optimal depth model and the verification set is used to select the optimal depth-model hyperparameters;
constructing a convolutional neural network model: inputting a character picture of dimension 32×32; convolving with 6 kernels of size 5×5 to obtain a 6@28×28 convolution feature map; average pooling (downsampling) with stride = 2 to obtain a 6@14×14 pooling feature map; convolving with 16 kernels of size 5×5 to obtain a 16@10×10 convolution feature map; average pooling (downsampling) with stride = 2 to obtain a 16@5×5 pooling feature map; rescaling the features with one 5×5 convolution and two 1×1 convolutions to obtain rich feature combinations; and finally producing the class output through a nonlinear mapping;
optimizing the depth model: randomly selecting a reference sample, randomly selecting a sample of the same class as a positive sample, and randomly selecting a sample from a different class as a negative sample; adopting a twin (Siamese) mechanism in which, within one iteration, the reference sample is fed into branch 1 while the positive and negative samples are fed into branch 2 in turn, the two branches sharing network parameters; classifying the sample features of branch 1 and branch 2 with SoftMax and constraining them with a cross-entropy loss; jointly constraining branch 1 and branch 2 with a contrastive loss so that the features of the reference and positive samples are as similar as possible while the features of the reference and negative samples differ as much as possible; and back-propagating and updating the network;
model evaluation: updating the network hyperparameters and selecting the optimal hyperparameters by monitoring the verification set;
preliminary character discrimination: feeding the image-processed character pictures into the trained convolutional neural network to obtain a preliminary judgment for each single character, and retaining the single-character classification probabilities.
6. The deep learning-based automatic drug specification identification method of claim 1, wherein optimizing the preliminarily recognized text information with the text information from high-frequency word training to obtain the optimized character recognition result specifically comprises:
constructing a high-frequency word library: automatically segmenting the drug names in the national drug catalog with the JIEBA open-source word segmentation system, and manually screening and correcting difficult phrases; counting the occurrence probability of all phrases, selecting the high-frequency phrases to build the high-frequency word library, generating pictures of the high-frequency words in common fonts, and slightly perturbing each picture to add noise;
dividing a training set and a verification set: splitting the generated high-frequency word library at a ratio of 5:1 into a training set and a verification set, where the training set is used to train the optimal depth model and the verification set is used to select the optimal depth-model hyperparameters;
constructing a convolutional recurrent memory model: using a convolutional neural subnetwork to extract features from each character of a high-frequency word X = {x_t}, obtaining the per-character features F = {f_t}; the recurrent neural subnetwork takes the input x_t at time step t and the hidden state h_{t-1} at time step t-1 to compute the hidden state h_t at time step t, and uses ReLU to model the nonlinear relationship between the output y_t at time t and the input:
h_t = tanh(W_hh·h_{t-1} + W_hx·x_t)
y_t = W_hy·h_t
where W_hh, W_hx, and W_hy are the network weights to be learned;
optimizing the depth model: the loss at each time step is a cross-entropy loss and the total loss is the sum over all time steps; the loss is back-propagated and the network is updated;
high-frequency word correction: multiplying the per-character probabilities produced by the convolutional recurrent neural network with the single-character classification probabilities retained from the preliminary character discrimination to obtain the optimized character recognition result.
7. A deep learning-based automatic drug specification identification system based on the deep learning-based automatic drug specification identification method of any one of claims 1 to 6, comprising:
a text information extraction module, configured to acquire a drug specification image, preprocess the acquired image with image processing methods, and extract the valid character region;
a text information preliminary recognition module, configured to preliminarily recognize the text information with a segmented recognition method based on the extracted valid character region;
and a text information optimized recognition module, configured to optimize the preliminarily recognized text information with text information from high-frequency word training to obtain the optimized character recognition result.
8. A storage medium comprising a stored program, wherein the deep learning-based automatic drug specification identification method of any one of claims 1 to 6 is performed when the program runs.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202211478211.7A | 2022-11-23 | 2022-11-23 | Deep learning-based automatic drug specification identification method and system and storage medium |

Applications Claiming Priority (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202211478211.7A | 2022-11-23 | 2022-11-23 | Deep learning-based automatic drug specification identification method and system and storage medium |
Publications (1)

| Publication Number | Publication Date |
| --- | --- |
| CN115731550A | 2023-03-03 |
Family

ID=85297787

Family Applications (1)

| Application Number | Title | Priority Date | Filing Date |
| --- | --- | --- | --- |
| CN202211478211.7A | Deep learning-based automatic drug specification identification method and system and storage medium | 2022-11-23 | 2022-11-23 |

Country Status (1)

| Country | Link |
| --- | --- |
| CN | CN115731550A |
Cited By (2)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN116912845A | 2023-06-16 | 2023-10-20 | 广东电网有限责任公司佛山供电局 | Intelligent content identification and analysis method and device based on NLP and AI |
| CN116912845B | 2023-06-16 | 2024-03-19 | 广东电网有限责任公司佛山供电局 | Intelligent content identification and analysis method and device based on NLP and AI |
Legal Events

| Date | Code | Title | Description |
| --- | --- | --- | --- |
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| 20230303 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20230303 |