WO2022227218A1 - Drug name recognition method, apparatus, computer device and storage medium - Google Patents

Drug name recognition method, apparatus, computer device and storage medium

Publication number
WO2022227218A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
drug name
text
medicine box
drug
Application number
PCT/CN2021/097413
Inventor
曾婵
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2022227218A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/412 Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables

Definitions

  • the present application relates to the field of artificial intelligence and digital medicine, and in particular, to a method, device, computer equipment and storage medium for identifying a drug name.
  • the existing method mainly uses OCR (Optical Character Recognition) to automatically recognize the text on the medicine box, but OCR technology is only suitable for simple scenes, and its text recognition stage relies mainly on feature extraction and template matching.
  • the method uses human experience to guide text feature extraction, and then matches against a feature library according to text similarity. The stability and effectiveness of this method are poor, and it has difficulty identifying the complex and diverse drug names on a medicine box.
  • the application provides a drug name recognition method, device, computer equipment and storage medium.
  • the trained drug name recognition model is suitable for recognizing medicine box images. It can effectively improve the accuracy of identifying the drug name in the scene of complex and diverse drug names; by performing fuzzy matching of the drug name on the drug name recognition result output by the drug name recognition model, the accuracy of identifying the drug name is further improved.
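  • the application provides a method for identifying a drug name, the method comprising:
  • acquiring image training data, the image training data being obtained by performing text extraction on sample medicine box images, and acquiring image expansion data;
  • inputting the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, to obtain a trained drug name recognition model;
  • acquiring a medicine box image on which drug name recognition is to be performed, and determining at least one image to be recognized corresponding to the medicine box image;
  • inputting each image to be recognized into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image;
  • performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result to obtain the drug name corresponding to the medicine box image.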
  • the present application also provides a device for identifying a drug name, the device comprising:
  • an image data acquisition module configured to acquire image training data, the image training data is obtained by text extraction of sample medicine box images, and acquire image expansion data;
  • a model training module for inputting the image training data and the image expansion data into a drug name recognition model for iterative training, until the drug name recognition model converges, and the trained drug name recognition model is obtained;
  • an image extraction module configured to acquire an image of a medicine box to be identified by the name of the medicine, and to determine at least one image to be identified corresponding to the image of the medicine box;
  • a medicine name recognition module configured to input each of the images to be recognized into the trained medicine name recognition model for medicine name recognition, and obtain a medicine name recognition result corresponding to the medicine box image;
  • the drug name fuzzy matching module is configured to perform a drug name fuzzy matching on the drug name recognition result based on a preset drug name information database, and obtain the drug name corresponding to the medicine box image.
  • the present application also provides a computer device, the computer device comprising a memory and a processor;
  • the memory is configured to store a computer program;
  • the processor is configured to execute the computer program and implement the above-mentioned drug name recognition method when the computer program is executed.
  • the present application also provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and when the computer program is executed by the processor, the processor realizes the above-mentioned drug name recognition method.
  • the present application discloses a drug name recognition method, device, computer equipment and storage medium. By acquiring image training data and image augmentation data, and inputting both into a drug name recognition model for iterative training, the trained drug name recognition model is made suitable for recognizing the complex and diverse drug names in medicine box images, which can effectively improve the accuracy of identifying the drug name;
  • by performing fuzzy matching of the drug name on the drug name recognition result, the drug name corresponding to the medicine box image is obtained, which further improves the accuracy of drug name recognition on the medicine box image.
  • FIG. 1 is a schematic flowchart of a method for identifying a drug name provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of training a drug name recognition model provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a sub-step of acquiring image augmentation data provided by an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a sub-step of determining an image to be recognized according to an embodiment of the present application
  • FIG. 5 is a schematic flowchart of a sub-step of text detection on a medicine box image provided by an embodiment of the present application
  • FIG. 6 is a schematic block diagram of a device for identifying a drug name provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural block diagram of a computer device provided by an embodiment of the present application.
  • Embodiments of the present application provide a drug name identification method, apparatus, computer device and storage medium.
  • the drug name recognition method can be applied to a server or a terminal.
  • the trained drug name recognition model is suitable for recognizing the complex and diverse drug names in the medicine box image. It can effectively improve the accuracy of identifying the drug name; by performing fuzzy matching of the drug name on the drug name recognition result output by the drug name recognition model, the accuracy of identifying the drug name is further improved.
  • the server may be an independent server or a server cluster.
  • Terminals can be electronic devices such as smart phones, tablet computers, notebook computers, and desktop computers.
  • the method for identifying a drug name includes steps S10 to S50.
  • Step S10 Acquire image training data, which is obtained by performing text extraction on sample medicine box images, and acquire image expansion data.
  • FIG. 2 is a schematic diagram of training a drug name recognition model provided by an embodiment of the present application.
  • the specific training process may include: acquiring image training data, which is obtained by text extraction of sample medicine box images, and acquiring image expansion data; inputting the image training data and the image expansion data into the initial drug name recognition model for iterative training until the drug name recognition model converges, to obtain a trained drug name recognition model.
  • text extraction may be performed on a preset number of sample pillbox images to obtain image training data.
  • text extraction can be performed on the sample medicine box image through a text detection model.
  • a preset amount of image augmentation data may be generated. It should be noted that the text on a medicine box image is generally complex, including text of different font types, sizes and directions. It is therefore necessary to add image expansion data when training the drug name recognition model, so that the trained model is suitable for recognizing complex and diverse drug names in medicine box images, thereby improving the accuracy of the drug name recognition model.
  • FIG. 3 is a schematic flowchart of a sub-step of acquiring image augmentation data provided by an embodiment of the present application, which may specifically include the following steps S101 to S103 .
  • Step S101 based on a preset drug knowledge base, acquire text data containing drug information, where the text data includes at least one of Chinese, English, numbers and symbols.
  • a preset amount of text data containing drug information may be extracted from the drug knowledge base.
  • the extracted text data is unformatted text; the content of the text data may include at least one of Chinese, English, numbers and symbols.
  • Step S102 adding a text attribute to the text data to obtain the text data after adding the text attribute, where the text attribute includes at least one of text length, text direction, font size and font style.
  • different text attributes can be added to the text data.
  • for example, the text direction of the text data is set to horizontal; for another example, the font size of the text data is set to size three; for another example, the font style of the text data is set to colored.
  • a plurality of text data after adding the text attribute can be obtained.
  • the text direction of the text data after adding the text attribute may be horizontal, and the font style may be bold.
  • the text direction of the text data after adding the text attribute may be vertical, and the font size may be three.
  • Step S103 adding the text data with the added text attributes to a preset image template to obtain the image augmentation data.
  • the preset image template may be a blank image.
  • the text data with the added text attributes can be rendered onto a blank image to obtain the image augmentation data.
  • By constructing image augmentation data, rich image training data can be generated automatically, which not only solves the problem of insufficient training samples but also improves the robustness and accuracy of the drug name recognition model.
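  • Steps S101 to S103 can be illustrated with a minimal Python sketch that only plans the augmentation samples: it pairs each text string with combinations of the text attributes named above (length, direction, font size, font style). The class name `TextSample`, the function `build_samples`, the example strings and the attribute values are all hypothetical and do not come from the application; rendering each planned sample onto a blank image would be done with an imaging library and is not shown.

```python
import itertools
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class TextSample:
    """One synthetic text line plus the attributes used to render it."""
    text: str
    direction: str   # "horizontal" or "vertical"
    font_size: int   # hypothetical size value
    font_style: str  # e.g. "regular", "bold", "colored"


def build_samples(texts, directions, font_sizes, font_styles,
                  max_length, limit, seed=0):
    """Pair every text with every attribute combination, then keep `limit` of them."""
    combos = [
        TextSample(text[:max_length], direction, size, style)  # enforce text length
        for text, direction, size, style in itertools.product(
            texts, directions, font_sizes, font_styles)
    ]
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    rng.shuffle(combos)
    return combos[:limit]


# Hypothetical strings standing in for entries of the drug knowledge base.
samples = build_samples(
    texts=["AB-C particle", "Drug-X 10 mg x 30 tablets"],
    directions=["horizontal", "vertical"],
    font_sizes=[12, 16, 24],
    font_styles=["regular", "bold", "colored"],
    max_length=12,
    limit=10,
)
for sample in samples[:3]:
    print(sample)
```

  • Separating the planning of attributes from the rendering step keeps the preset amount of augmentation data (`limit` here) easy to control.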
  • Step S20 Input the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, and the trained drug name recognition model is obtained.
  • the added image expansion data and the image training data are jointly input into the drug name recognition model for training, so that the trained drug name recognition model is suitable for recognizing the complex and diverse drug names in medicine box images, thereby improving the accuracy of the drug name recognition model.
  • the drug name recognition model is used to recognize the text information in the image, classify the text information by labels, and determine the label with the highest output probability as the drug name recognition result.
  • the drug name recognition model can include a convolutional neural network and a recurrent neural network; the convolutional neural network is used to convolve and pool the input image and output a feature image; the recurrent neural network is used to perform drug name classification prediction on the feature image and output the drug name recognition result.
  • the loss function value during model training can be calculated by using the CTC loss function, and the parameters of the model can be adjusted according to the loss function value.
  • the CTC (Connectionist Temporal Classification) loss function is used to solve the alignment problem of input features and output labels.
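  • The alignment problem can be made concrete with the collapsing rule that CTC uses to map per-frame predictions to an output label sequence: merge consecutive repeats, then drop the blank symbol. The sketch below shows greedy decoding only, not the loss computation itself; the function name, the alphabet and the probabilities are hypothetical illustration values.

```python
def ctc_greedy_decode(frame_probs, alphabet, blank=0):
    """Pick the most probable label per frame, merge repeats, drop blanks."""
    best = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    decoded = []
    previous = blank
    for index in best:
        # Emit a character only when it is not blank and differs from the previous frame.
        if index != blank and index != previous:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


# Index 0 is the blank symbol; eight frames produce a four-character output.
alphabet = ["-", "A", "B", "C"]
frame_probs = [
    [0.10, 0.80, 0.05, 0.05],  # A
    [0.20, 0.70, 0.05, 0.05],  # A (repeat, merged)
    [0.90, 0.05, 0.03, 0.02],  # blank
    [0.10, 0.10, 0.70, 0.10],  # B
    [0.10, 0.10, 0.70, 0.10],  # B (repeat, merged)
    [0.80, 0.10, 0.05, 0.05],  # blank separates the two B characters
    [0.10, 0.10, 0.70, 0.10],  # B
    [0.10, 0.10, 0.10, 0.70],  # C
]
print(ctc_greedy_decode(frame_probs, alphabet))  # ABBC
```

  • The blank between the repeated frames is what lets a genuinely doubled character survive the merge step.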
  • inputting the image training data and the image augmentation data into the drug name recognition model for iterative training until the drug name recognition model converges may include: determining the training sample images for each round of training according to the image training data and the image augmentation data; inputting the current round of training sample images into the convolutional neural network for convolution and pooling, and outputting feature images; inputting the feature images into the recurrent neural network for drug name classification prediction, and outputting the corresponding drug name recognition results; determining, based on the CTC loss function, the loss function value corresponding to the drug name recognition results; if the loss function value is greater than a preset loss value threshold, adjusting the parameters of the convolutional neural network and the recurrent neural network and performing the next round of training; when the loss function value is less than or equal to the loss value threshold, ending the training to obtain the trained drug name recognition model.
  • the preset loss value threshold may be set according to the actual situation, and the specific value is not limited herein.
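  • The stopping rule described above (keep adjusting parameters while the loss exceeds the threshold, stop once it is less than or equal to it) can be sketched on a deliberately tiny stand-in model. The one-parameter linear model, the mean squared error, the learning rate and the `max_rounds` safety cap are all assumptions made for illustration; they replace the convolutional and recurrent networks and the CTC loss of the application.

```python
def mean_squared_error(w, samples):
    """Loss of the toy model y = w * x over all samples."""
    return sum((w * x - y) ** 2 for x, y in samples) / len(samples)


def train_until_threshold(samples, loss_threshold,
                          learning_rate=0.05, max_rounds=1000):
    """Run training rounds until the loss is <= loss_threshold (or the cap is hit)."""
    w = 0.0
    for round_index in range(1, max_rounds + 1):
        loss = mean_squared_error(w, samples)
        if loss <= loss_threshold:
            # Converged: training ends and the current parameters are kept.
            return w, loss, round_index
        # Loss still above the threshold: adjust the parameter by gradient descent.
        gradient = sum(2 * (w * x - y) * x for x, y in samples) / len(samples)
        w -= learning_rate * gradient
    return w, mean_squared_error(w, samples), max_rounds


weight, final_loss, rounds = train_until_threshold(
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)], loss_threshold=1e-6)
print(weight, final_loss, rounds)
```

  • The cap on rounds is not part of the described method; it only guards the sketch against a threshold that can never be reached.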
  • the parameters of the convolutional neural network and the recurrent neural network can be adjusted based on the error back propagation algorithm, or according to other algorithms, such as a gradient descent algorithm.
  • the error back propagation (Error Back Propagation, BP) algorithm propagates the output error backward through the layers of a multi-layer network to obtain the gradient of the loss with respect to each parameter, which is then used to update the parameters.
  • the trained drug name recognition model can also be stored in a node of a blockchain.
  • when the trained drug name recognition model needs to be used, it can be retrieved from the node of the blockchain.
  • before inputting the current round of training sample images into the convolutional neural network for convolution and pooling and outputting the feature images, the method may further include: determining the image recognition direction corresponding to the drug name recognition model; determining the image direction of each training sample image in the current round according to its height and width; and adjusting the direction of the training sample images whose image direction differs from the image recognition direction to obtain adjusted training sample images. The adjusted training sample images and the training sample images that do not need direction adjustment are then input into the convolutional neural network for convolution and pooling to obtain the feature images corresponding to the current round of training sample images.
  • it should be noted that a single drug name recognition model is suitable for recognizing images in one fixed direction only. If the training sample images come in multiple directions, multiple drug name recognition models would otherwise need to be trained. Therefore, when training the drug name recognition model, it is necessary to first determine the image recognition direction of the model and then adjust the direction of the training sample images accordingly.
  • the image recognition direction corresponding to the drug name recognition model may be set; for example, the image recognition direction may be set as horizontal, or the image recognition direction may be set as vertical.
  • the image orientation corresponding to the current round of training sample images may be determined according to the height and width corresponding to the current round of training sample images. For example, when the width of the training sample image is greater than the height, the image orientation of the training sample image can be determined to be horizontal; when the width of the training sample image is less than or equal to the height, the image orientation of the training sample image can be determined to be vertical.
  • for example, a training sample image whose image direction differs from the image recognition direction can be rotated by 90°; it can also be rotated according to its actual image direction.
  • by adjusting the direction of the training sample images, one drug name recognition model can be trained on both horizontal and vertical images, without training one model for horizontal images and another for vertical images, which simplifies the training process.
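  • The direction check and adjustment can be sketched as follows, with an image represented as a plain list of pixel rows so that the example needs no imaging library. The function names are hypothetical, and the choice of a clockwise rotation is an assumption; the text only requires a 90° rotation.

```python
def image_orientation(image):
    """Horizontal when width > height, otherwise vertical (as in the description)."""
    height = len(image)
    width = len(image[0])
    return "horizontal" if width > height else "vertical"


def rotate_90_clockwise(image):
    """Rotate a row-major pixel grid by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def align_to_model(image, recognition_direction="horizontal"):
    """Return the image unchanged if it already matches the model's direction."""
    if image_orientation(image) == recognition_direction:
        return image
    return rotate_90_clockwise(image)


tall = [[1, 2],
        [3, 4],
        [5, 6]]          # 3 rows x 2 columns, so vertical
wide = align_to_model(tall)
print(wide)              # [[5, 3, 1], [6, 4, 2]]
```

  • A square image counts as vertical under the width-versus-height rule and stays square after rotation, so a real pipeline may need an extra cue for that case.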
  • Step S30 Acquire an image of a medicine box to be identified by a medicine name, and determine at least one image to be identified corresponding to the image of the medicine box.
  • the drug name recognition method provided in the embodiment of the present application can be applied to a complex scene of identifying complex and diverse drug names in a medicine box image.
  • the user can upload the image of the medicine box for which the medicine name recognition needs to be performed to the server or the terminal, wherein the server or the terminal is installed with a medicine name recognition system or a medicine name recognition application program.
  • a user can use a mobile phone to take a picture of a medicine box on which drug name recognition is to be performed, and then upload it to the drug name recognition system or drug name recognition application for drug name recognition.
  • the medicine box image uploaded by the user is determined as the medicine box image to be recognized by the medicine name according to the image uploading operation.
  • FIG. 4 is a schematic flowchart of a sub-step of determining an image to be recognized provided by an embodiment of the present application, which may specifically include the following steps S301 and S302.
  • Step S301 Input the medicine box image into a text detection model to perform text detection, and obtain text position information corresponding to the medicine box image.
  • the text detection model can be a DB-Net (Differentiable Binarization Network) model.
  • the main innovation of the DB-Net model is that each pixel is adaptively binarized: the binarization threshold is learned by the network, and the binarization process is included in network training, which can effectively enhance the robustness of the text detection model and improve its detection speed.
  • the backbone network of the DB-Net model adopts the ResNet (Residual Neural Network) structure.
  • the text detection model includes at least a feature extraction layer; the feature extraction layer is used to perform feature extraction on an image input to the text detection model to obtain a corresponding feature image.
  • the text detection model can also include a feature prediction layer, a binarization layer, and a text position prediction layer.
  • the feature prediction layer is used to predict the probability feature map and the threshold feature image corresponding to the feature image;
  • the binarization layer is used to perform binarization calculation on the probability feature map and the threshold feature image, and output the binarized feature image;
  • the text position prediction layer is used to identify the text region of the binarized feature image and output the text position information.
  • the text detection model may be a pre-trained text detection model.
  • a preset number of sample pillbox images may be acquired, and the sample pillbox images may be input into an initial text detection model for iterative training until the text detection model converges, and a trained text detection model is obtained.
  • the specific training process is not limited here.
  • the image of the medicine box to be recognized by the medicine name can be input into the trained text detection model for text detection, and the text position information corresponding to the image of the medicine box can be obtained.
  • FIG. 5 is a schematic flowchart of a sub-step of text detection on a medicine box image provided by an embodiment of the present application, which may specifically include the following steps S3011 to S3013.
  • Step S3011 input the medicine box image into the feature extraction layer to perform feature extraction, and obtain a feature image corresponding to the medicine box image.
  • the medicine box image may be input into the feature extraction layer, the feature extraction layer performs feature extraction, and outputs a feature image corresponding to the medicine box image.
  • the feature extraction layer may include an FPN (Feature Pyramid Network). It should be noted that FPN addresses the multi-scale problem in object detection: through simple changes to the network connections, it greatly improves small-object detection performance with essentially no increase in the computation of the original model.
  • an FPN may be used to upsample the image of the medicine box, so as to extract the feature image corresponding to the image of the medicine box.
  • Step S3012 Determine the binarized feature map corresponding to the feature image.
  • the feature image output by the feature extraction layer can be input into the feature prediction layer, and the feature prediction layer can predict the probability feature map and the threshold feature image corresponding to the feature image. Then, the probability feature map and the threshold feature image are input into the binarization layer for binarization calculation, and the binarized feature image is output.
  • the binarization layer may perform a binarization calculation on the probability feature map and the threshold feature image based on a differentiable binarization formula to obtain a binarized feature map.
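  • The application does not spell out the differentiable binarization formula. The form usually given for DB-Net is B = 1 / (1 + exp(-k (P - T))), where P is the probability map, T the threshold map and k an amplification factor; the sketch below assumes that form and a value of k = 50, both of which should be treated as assumptions rather than as the applicant's exact calculation.

```python
import math


def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Approximate binarization: a steep sigmoid of (probability - threshold)."""
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]


# Hypothetical 2x2 maps with values in [0, 1].
prob_map = [[0.9, 0.1],
            [0.3, 0.6]]
thresh_map = [[0.3, 0.3],
              [0.3, 0.3]]
binary_map = differentiable_binarization(prob_map, thresh_map)
for row in binary_map:
    print([round(value, 4) for value in row])
```

  • Because the sigmoid is smooth, gradients can flow through the binarization step during training, while values far from the threshold still land very close to 0 or 1.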
  • Step S3013 Determine the text area in the binarized feature map, and determine the text position information according to the text area.
  • the text region in the binarized feature map may be determined according to pixel values in the binarized feature map. For example, a pixel whose pixel value is greater than a preset pixel threshold is determined as a text area. A pixel whose pixel value is less than the preset pixel threshold is determined as a non-text area.
  • the preset pixel threshold can be set according to the actual situation, and the specific value is not limited here.
  • the coordinates of the upper left corner and the lower right corner of the text area can be determined as the text position information. Therefore, the text position information may include the coordinates of the upper left corner and the lower right corner of the text area.
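  • Step S3013 can be sketched as thresholding the binarized feature map and reporting the upper left and lower right corner of each text region. Grouping pixels into regions by 4-connected components is an assumption made here; the application does not state how neighbouring text pixels are grouped. The function name and the sample map are hypothetical.

```python
from collections import deque


def find_text_boxes(binary_map, pixel_threshold=0.5):
    """Return ((x0, y0), (x1, y1)) for each 4-connected region above the threshold."""
    height, width = len(binary_map), len(binary_map[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if seen[y][x] or binary_map[y][x] <= pixel_threshold:
                continue  # already visited, or a non-text pixel
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:  # breadth-first flood fill of one text region
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx]
                            and binary_map[ny][nx] > pixel_threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append(((x0, y0), (x1, y1)))
    return boxes


binary_map = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
]
boxes = find_text_boxes(binary_map)
print(boxes)  # [((1, 0), (2, 1)), ((4, 2), (4, 3))]
```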
  • the existing text detection methods based on image segmentation usually set a fixed threshold to convert the probability feature map generated by the segmentation network into a binary feature map. Since different thresholds have a greater impact on the performance of the model, the existing text detection methods based on image segmentation have low accuracy.
  • the DB-Net model in the embodiment of the present application can adaptively predict the threshold value of each position in the image by inserting the binarization operation into the segmentation network for joint optimization, so that the foreground and background of pixels can be completely distinguished. It not only improves the detection speed of the text detection model, but also enhances the robustness of the text detection model.
  • the text position information corresponding to the text in different formats in the medicine box image can be accurately determined.
  • Step S302 segment the medicine box image according to the text position information to obtain at least one image to be recognized.
  • the image area to be segmented can be determined according to the coordinates of the upper left corner and the lower right corner of the text area, and then a screenshot of the image area is taken to obtain the image to be recognized.
  • other segmentation methods may also be used, and the specific segmentation methods are not limited herein.
  • each text position information is correspondingly segmented to obtain an image to be recognized.
  • the to-be-recognized image containing the text can be accurately obtained, thereby improving the accuracy of the medicine name recognition on the to-be-recognized image.
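  • Step S302 amounts to cropping one sub-image per text region. The sketch below assumes inclusive corner coordinates in (x, y) order and an image stored as a list of pixel rows; both conventions, and the function names, are assumptions for illustration.

```python
def crop(image, top_left, bottom_right):
    """Cut out the rectangle between two inclusive (x, y) corners."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


def crop_text_regions(image, boxes):
    """Produce one image to be recognized per piece of text position information."""
    return [crop(image, top_left, bottom_right) for top_left, bottom_right in boxes]


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
]
regions = crop_text_regions(image, [((1, 0), (2, 1)), ((3, 2), (3, 2))])
print(regions)  # [[[1, 2], [5, 6]], [[11]]]
```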
  • Step S40 Input each of the images to be recognized into the trained drug name recognition model for drug name recognition, and obtain a drug name recognition result corresponding to the medicine box image.
  • each to-be-recognized image can be input into the trained drug name recognition model for drug name recognition, so that the drug name recognition result corresponding to the medicine box image can be obtained.
  • before inputting each to-be-recognized image into the trained drug name recognition model, the method may further include: determining the image recognition direction corresponding to the drug name recognition model; determining, according to the image recognition direction, the to-be-recognized images that require direction adjustment and the to-be-recognized images that do not; and performing direction adjustment on the to-be-recognized images that require it to obtain the adjusted to-be-recognized images.
  • the image recognition direction corresponding to the drug name recognition model has been determined during training. For example, if the image recognition direction corresponding to the drug name recognition model during training is horizontal, it can be determined that the image recognition direction corresponding to the drug name recognition model at this time is horizontal. For another example, if the image recognition direction corresponding to the drug name recognition model during training is vertical, it can be determined that the image recognition direction corresponding to the drug name recognition model at this time is vertical.
  • the to-be-recognized image to be adjusted and the to-be-recognized image that does not require direction adjustment may be determined according to the image recognition direction.
  • for example, when the image recognition direction is horizontal and the width of the image to be recognized is greater than its height, the image direction of the image to be recognized is horizontal, and it can be determined that the image does not require direction adjustment.
  • for another example, when the image recognition direction is horizontal and the width of the image to be recognized is smaller than its height, the image direction of the image to be recognized is vertical, and it can be determined that the image requires direction adjustment.
  • the to-be-recognized image whose orientation is to be adjusted may be rotated clockwise or counterclockwise until the image orientation of the to-be-recognized image is adjusted to be horizontal.
  • inputting each to-be-recognized image into the drug name recognition model for drug name recognition may include: inputting the adjusted to-be-recognized image and the to-be-recognized image that does not require orientation adjustment into the drug name recognition model for drug name recognition.
  • the adjusted to-be-recognized image and the to-be-recognized image that does not require orientation adjustment may be input into a drug name recognition model for drug name recognition, so as to obtain a drug name recognition result corresponding to the medicine box image.
  • by determining the image recognition direction corresponding to the drug name recognition model, it can be determined whether the direction of each image to be recognized needs to be adjusted, so that a single drug name recognition model can recognize both horizontal and vertical images without adding another drug name recognition model. This not only reduces the amount of computation and memory consumption, but also improves the accuracy of drug name recognition.
  • Step S50 based on a preset drug name information database, perform a drug name fuzzy matching on the drug name recognition result to obtain a drug name corresponding to the medicine box image.
  • since the drug name recognition results obtained by the drug name recognition model may contain some homophones or synonyms, the recognition results may not be accurate enough. Therefore, more accurate drug names can be obtained by performing fuzzy matching of drug names on the drug name recognition results.
  • the drug name information library includes a variety of standard drug name texts.
  • a variety of standard drug name texts may be collected in advance and stored in the drug name information database.
  • the above-mentioned drug name information database can be stored in a node of a blockchain.
  • performing drug name fuzzy matching on the drug name recognition result to obtain the drug name corresponding to the medicine box image may include: determining, based on a preset edit distance algorithm, the edit distance value between the drug name recognition result and each standard drug name text in the drug name information database; and determining the standard drug name text whose edit distance value is less than a preset edit distance threshold as the drug name corresponding to the medicine box image.
  • the edit distance (Levenshtein distance) between two strings is the minimum number of single-character editing operations (insertions, deletions and substitutions) required to convert one string into the other.
  • the edit distance value between the drug name recognition result and the standard drug name text in the drug name information database can be calculated based on the edit distance algorithm, and the standard drug name text whose edit distance value is smaller than the preset edit distance threshold is determined as the drug name corresponding to the medicine box image.
  • the preset edit distance threshold may be set according to the actual situation, and the specific value is not limited herein.
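  • The fuzzy matching of step S50 can be sketched with a standard dynamic-programming Levenshtein distance. Returning the closest name among those below the threshold is an assumption added here to handle several candidates; the application only requires the distance to be less than the threshold. The function names, the sample names and the threshold value are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance using one rolling row of the DP table."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def fuzzy_match(recognized, standard_names, distance_threshold):
    """Return the closest standard name with distance < threshold, else None."""
    best_name, best_distance = None, None
    for name in standard_names:
        distance = edit_distance(recognized, name)
        if distance < distance_threshold and (
                best_distance is None or distance < best_distance):
            best_name, best_distance = name, distance
    return best_name


# Hypothetical entries standing in for the drug name information database.
standard_names = ["aspirin", "ibuprofen"]
print(fuzzy_match("asprin", standard_names, distance_threshold=3))  # aspirin
print(fuzzy_match("xyz", standard_names, distance_threshold=3))     # None
```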
  • the medicine box image when recognizing complex and diverse drug names in the image of the medicine box, exemplarily, identifying the drug names composed of Chinese, English and symbols, the text direction is horizontal, and the font style is colored, for example, the drug name "AB" -C particle", first input the image of the medicine box into the text detection model for text detection, and obtain the text position information corresponding to the image of the medicine box. Secondly, the medicine box image is segmented according to the text position information to obtain at least one to-be-recognized image, and the to-be-recognized image may be an image containing the name of the drug or an image containing other text.
  • After the drug name corresponding to the medicine box image is obtained, the method may further include: determining, based on a preset correspondence between drug names and medication instruction information, the medication instruction information of the drug name corresponding to the medicine box image, and displaying the drug name and the medication instruction information.
  • The medication instruction information may include indication (main function) information, administration instructions, and the like.
  • The drug name and the medication instruction information may be displayed on the server or terminal to which the user uploaded the medicine box image.
  • For example, the drug name and the medication instruction information are displayed on the user's mobile phone.
  • After the drug name corresponding to the medicine box image is obtained, the method may further include: performing speech synthesis on the drug name to generate voice information corresponding to the drug name, and broadcasting the voice information.
  • A TTS (Text To Speech, speech synthesis) model can be used to perform speech synthesis on the drug name corresponding to the medicine box image to generate the voice information corresponding to the drug name.
  • The TTS model includes modules such as text analysis, an acoustic model, and audio synthesis, and converts a piece of text into a speech signal.
  • The voice information may be broadcast on the server or terminal.
  • For example, the voice information is broadcast through the user's mobile phone.
  • After the drug name corresponding to the medicine box image is obtained, the method may also include both operations: determining and displaying the medication instruction information of the drug name based on the preset correspondence, and performing speech synthesis on the drug name and broadcasting the resulting voice information.
  • By determining the medication instruction information of the drug name corresponding to the medicine box image and displaying both, the user can more conveniently obtain the drug name and the medication instruction information from the medicine box image, which is more user-friendly.
  • By performing speech synthesis on the drug name corresponding to the medicine box image and broadcasting the generated voice information, users with poor eyesight can also easily obtain the drug name in the medicine box image, which improves the user experience.
  • The drug name recognition method provided by the above embodiments constructs image expansion data so that rich image training data can be generated automatically, which not only solves the problem of insufficient training samples but also improves the robustness and accuracy of the drug name recognition model.
  • By inputting the added image expansion data together with the image training data into the drug name recognition model for training, the trained model is suited to scenes in which complex and diverse drug names in medicine box images are recognized, thereby improving the accuracy of the drug name recognition model.
  • By calculating the loss function value with the connectionist temporal classification (CTC) loss function and adjusting the parameters of the convolutional neural network and the recurrent neural network according to that value, the alignment problem between input features and output labels is solved, and the accuracy with which the trained drug name recognition model recognizes drug names is effectively improved.
  • By unifying the orientation of the training sample images, a single drug name recognition model can be trained on both horizontal and vertical images, instead of training one model on horizontal images and another on vertical images, which simplifies the training process.
  • By inputting the medicine box image on which drug name recognition is to be performed into the text detection model, the text position information of text in different formats in the medicine box image can be determined accurately; by segmenting the medicine box image according to the text position information, the to-be-recognized images containing text can be obtained accurately, which improves the accuracy of drug name recognition on those images.
  • By performing drug name fuzzy matching on the drug name recognition result based on the preset drug name information database, the accuracy of the recognized drug name can be further improved.
  • By determining and displaying the medication instruction information of the drug name, the user can more conveniently obtain the drug name and the medication instruction information, which is more user-friendly; by synthesizing and broadcasting the voice information corresponding to the drug name, users with poor eyesight can also easily obtain the drug name in the medicine box image, which improves the user experience.
  • FIG. 6 is a schematic block diagram of a drug name recognition apparatus 1000 further provided by an embodiment of this application; the apparatus is used to execute the aforementioned drug name recognition method.
  • The drug name recognition apparatus may be configured in a server or a terminal.
  • The drug name recognition apparatus 1000 includes: an image data acquisition module 1001, a model training module 1002, an image extraction module 1003, a drug name recognition module 1004 and a drug name fuzzy matching module 1005.
  • The image data acquisition module 1001 is configured to acquire image training data, which is obtained by performing text extraction on sample medicine box images, and to acquire image expansion data.
  • The model training module 1002 is configured to input the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, obtaining the trained drug name recognition model.
  • The image extraction module 1003 is configured to acquire a medicine box image on which drug name recognition is to be performed, and to determine at least one to-be-recognized image corresponding to the medicine box image.
  • The drug name recognition module 1004 is configured to input each to-be-recognized image into the trained drug name recognition model for drug name recognition, obtaining a drug name recognition result corresponding to the medicine box image.
  • The drug name fuzzy matching module 1005 is configured to perform drug name fuzzy matching on the drug name recognition result based on a preset drug name information database, obtaining the drug name corresponding to the medicine box image.
  • The above-mentioned apparatus can be implemented in the form of a computer program, and the computer program can run on a computer device as shown in FIG. 7.
  • FIG. 7 is a schematic structural block diagram of a computer device provided by an embodiment of this application.
  • The computer device can be a server or a terminal.
  • The computer device includes a processor and a memory connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
  • The processor provides computing and control capabilities and supports the operation of the entire computer device.
  • The internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when the computer program is executed by the processor, it causes the processor to execute any of the drug name recognition methods.
  • The processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.
  • The general-purpose processor can be a microprocessor or any conventional processor.
  • In one embodiment, the processor is configured to run a computer program stored in the memory to implement the following steps:
  • acquiring image training data, the image training data being obtained by performing text extraction on sample medicine box images, and acquiring image expansion data; inputting the image training data and the image expansion data into the drug name recognition model for iterative training until the drug name recognition model converges, to obtain the trained drug name recognition model; acquiring a medicine box image on which drug name recognition is to be performed, and determining at least one to-be-recognized image corresponding to the medicine box image; inputting each to-be-recognized image into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image; and performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, to obtain the drug name corresponding to the medicine box image.
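  • In one embodiment, when acquiring the image expansion data, the processor is configured to:
  • acquire, based on a preset drug knowledge base, text data containing drug information, the text data including at least one of Chinese, English, numbers and symbols; add text attributes to the text data to obtain the text data with text attributes added, the text attributes including at least one of text length, text direction, font size and font style; and add the text data with text attributes added to a preset image template to obtain the image expansion data.
  • In one embodiment, when determining at least one to-be-recognized image corresponding to the medicine box image, the processor is configured to:
  • input the medicine box image into a text detection model for text detection to obtain text position information corresponding to the medicine box image; and segment the medicine box image according to the text position information to obtain at least one to-be-recognized image.
  • In one embodiment, the text detection model includes at least a feature extraction layer; when inputting the medicine box image into the text detection model for text detection and obtaining the text position information corresponding to the medicine box image, the processor is configured to:
  • input the medicine box image into the feature extraction layer for feature extraction to obtain a feature image corresponding to the medicine box image; determine a binarized feature map corresponding to the feature image; and determine a text region in the binarized feature map and determine the text position information according to the text region.
  • In one embodiment, before inputting each to-be-recognized image into the drug name recognition model for drug name recognition, the processor is further configured to:
  • determine the image recognition direction corresponding to the drug name recognition model; determine, according to the image recognition direction, the to-be-recognized images that require direction adjustment and those that do not; and perform direction adjustment on the to-be-recognized images that require it to obtain adjusted to-be-recognized images.
  • In one embodiment, when inputting each to-be-recognized image into the drug name recognition model for drug name recognition, the processor is configured to:
  • input the adjusted to-be-recognized images and the to-be-recognized images that do not require direction adjustment into the drug name recognition model for drug name recognition.
  • In one embodiment, the drug name information database includes a plurality of standard drug name texts; when performing drug name fuzzy matching on the drug name recognition result based on the preset drug name information database and obtaining the drug name corresponding to the medicine box image, the processor is configured to:
  • determine, based on a preset edit distance algorithm, the edit distance value between the drug name recognition result and the standard drug name texts in the drug name information database; and determine the standard drug name text whose edit distance value is less than a preset edit distance threshold as the drug name corresponding to the medicine box image.
  • In one embodiment, after obtaining the drug name corresponding to the medicine box image, the processor is further configured to:
  • determine, based on a preset correspondence between drug names and medication instruction information, the medication instruction information of the drug name corresponding to the medicine box image, and display the drug name and the medication instruction information; and/or
  • perform speech synthesis on the drug name corresponding to the medicine box image to generate voice information corresponding to the drug name, and broadcast the voice information.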
  • The embodiments of this application further provide a computer-readable storage medium storing a computer program; the computer program includes program instructions, and a processor executes the program instructions to implement any of the drug name recognition methods provided in the embodiments of this application.
  • The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, for example, a hard disk or a memory of the computer device.
  • The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk equipped on the computer device, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc.
  • The computer-readable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function, and the like, and the data storage area may store data created according to the use of the blockchain node, and the like.
  • The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block.
  • The blockchain can include an underlying blockchain platform, a platform product service layer, and an application service layer.

Abstract

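A drug name recognition method, apparatus, device and medium, the method comprising: inputting image training data and image expansion data into a drug name recognition model for training to obtain a trained drug name recognition model; acquiring a medicine box image on which drug name recognition is to be performed, and determining to-be-recognized images of the medicine box image; inputting the to-be-recognized images into the drug name recognition model for drug name recognition to obtain a drug name recognition result; and performing drug name fuzzy matching on the drug name recognition result based on a drug name information database to obtain the drug name.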

Description

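Drug name recognition method and apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 202110486506.8, filed with the China Patent Office on April 30, 2021 and entitled "Drug name recognition method and apparatus, computer device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the fields of artificial intelligence and digital healthcare, and in particular to a drug name recognition method and apparatus, a computer device, and a storage medium.
Background
In daily life, middle-aged and elderly people frequently come into contact with various drugs, but the text on many medicine boxes is small, making it difficult for them to read the drug names on the boxes. The inventor found that existing methods mainly use OCR (Optical Character Recognition) to automatically recognize the text on medicine boxes. However, such OCR techniques are only suitable for simple scenes, and in the text recognition stage they mainly rely on feature extraction and template matching: human experience guides the extraction of character features, which are then matched against a feature library according to character similarity. This approach has poor stability and effectiveness and struggles to recognize the complex and diverse drug names found on medicine boxes.
Therefore, how to improve the accuracy of recognizing drug names on medicine boxes has become an urgent problem to be solved.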
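Summary
This application provides a drug name recognition method and apparatus, a computer device, and a storage medium. By training the drug name recognition model on image training data and image expansion data, the trained model is suited to scenes in which complex and diverse drug names in medicine box images are recognized, which can effectively improve the accuracy of drug name recognition; by performing drug name fuzzy matching on the drug name recognition result output by the model, the accuracy is further improved.
In a first aspect, this application provides a drug name recognition method, the method comprising:
acquiring image training data, the image training data being obtained by performing text extraction on sample medicine box images, and acquiring image expansion data;
inputting the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, to obtain the trained drug name recognition model;
acquiring a medicine box image on which drug name recognition is to be performed, and determining at least one to-be-recognized image corresponding to the medicine box image;
inputting each to-be-recognized image into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image;
performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, to obtain the drug name corresponding to the medicine box image.
In a second aspect, this application further provides a drug name recognition apparatus, the apparatus comprising:
an image data acquisition module, configured to acquire image training data, the image training data being obtained by performing text extraction on sample medicine box images, and to acquire image expansion data;
a model training module, configured to input the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, to obtain the trained drug name recognition model;
an image extraction module, configured to acquire a medicine box image on which drug name recognition is to be performed, and to determine at least one to-be-recognized image corresponding to the medicine box image;
a drug name recognition module, configured to input each to-be-recognized image into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image;
a drug name fuzzy matching module, configured to perform, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, to obtain the drug name corresponding to the medicine box image.
In a third aspect, this application further provides a computer device, the computer device comprising a memory and a processor;
the memory is configured to store a computer program;
the processor is configured to execute the computer program and, when executing the computer program, implement the drug name recognition method described above.
In a fourth aspect, this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the drug name recognition method described above.
This application discloses a drug name recognition method and apparatus, a computer device, and a storage medium. By acquiring image training data and image expansion data and inputting both into the drug name recognition model for iterative training, the trained model is suited to scenes in which complex and diverse drug names in medicine box images are recognized, which can effectively improve the accuracy of drug name recognition. By acquiring the medicine box image on which drug name recognition is to be performed and determining the to-be-recognized images, images of the text on the medicine box are obtained. By inputting each to-be-recognized image into the trained drug name recognition model, the accuracy of the drug name recognition result can be effectively improved. By performing drug name fuzzy matching on the drug name recognition result based on the preset drug name information database to obtain the drug name corresponding to the medicine box image, the accuracy of drug name recognition on medicine box images is further improved.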
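Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of this application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a drug name recognition method provided by an embodiment of this application;
FIG. 2 is a schematic diagram of training a drug name recognition model provided by an embodiment of this application;
FIG. 3 is a schematic flowchart of sub-steps of acquiring image expansion data provided by an embodiment of this application;
FIG. 4 is a schematic flowchart of sub-steps of determining to-be-recognized images provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of sub-steps of performing text detection on a medicine box image provided by an embodiment of this application;
FIG. 6 is a schematic block diagram of a drug name recognition apparatus provided by an embodiment of this application;
FIG. 7 is a schematic structural block diagram of a computer device provided by an embodiment of this application.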
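Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The flowcharts shown in the drawings are only examples; they need not include all contents and operations/steps, nor must they be executed in the order described. For example, some operations/steps can be decomposed, combined or partially merged, so the actual execution order may change according to the actual situation.
It should be understood that the terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit this application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.
The embodiments of this application provide a drug name recognition method and apparatus, a computer device, and a storage medium. The drug name recognition method can be applied in a server or a terminal. By training the drug name recognition model on training images and expansion images, the trained model is suited to scenes in which complex and diverse drug names in medicine box images are recognized, which can effectively improve the accuracy of drug name recognition; by performing drug name fuzzy matching on the drug name recognition result output by the model, the accuracy is further improved.
The server may be an independent server or a server cluster. The terminal may be an electronic device such as a smartphone, a tablet computer, a notebook computer, or a desktop computer.
Some embodiments of this application are described in detail below with reference to the accompanying drawings. Where no conflict arises, the following embodiments and the features in them may be combined with each other.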
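As shown in FIG. 1, the drug name recognition method includes steps S10 to S50.
Step S10: acquire image training data, the image training data being obtained by performing text extraction on sample medicine box images, and acquire image expansion data.
In the embodiments of this application, to improve the accuracy with which the drug name recognition model recognizes drug names, the initial drug name recognition model needs to be trained to convergence before it is used to recognize drug names in to-be-recognized images, yielding the trained drug name recognition model.
Referring to FIG. 2, FIG. 2 is a schematic diagram of training a drug name recognition model provided by an embodiment of this application. As shown in FIG. 2, the training process may include: acquiring image training data, obtained by performing text extraction on sample medicine box images, and acquiring image expansion data; and inputting the image training data and the image expansion data into the initial drug name recognition model for iterative training until the model converges, obtaining the trained drug name recognition model.
For example, text extraction may be performed on a preset number of sample medicine box images to obtain the image training data. The text extraction on the sample medicine box images may be performed by a text detection model.
For example, a preset number of image expansion data items may be generated. It should be noted that the text on medicine box images is generally complex, including different fonts, sizes and directions, so image expansion data needs to be added when training the drug name recognition model. This makes the trained model suited to scenes in which complex and diverse drug names in medicine box images are recognized, thereby improving the accuracy of the drug name recognition model.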
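Referring to FIG. 3, FIG. 3 is a schematic flowchart of sub-steps of acquiring image expansion data provided by an embodiment of this application, which may include the following steps S101 to S103.
Step S101: acquire, based on a preset drug knowledge base, text data containing drug information, the text data including at least one of Chinese, English, numbers and symbols.
For example, a preset number of text data items containing drug information may be extracted from the drug knowledge base. The extracted text data is unformatted text; its content may include at least one of Chinese, English, numbers and symbols.
Step S102: add text attributes to the text data to obtain the text data with text attributes added, the text attributes including at least one of text length, text direction, font size and font style.
For example, different text attributes may be added to the text data: the text direction may be set to horizontal, the font size may be set to size 3 (三号), or the font style may be set to colored. In this way, multiple text data items with text attributes added can be obtained.
For example, the text data with text attributes added may have a horizontal text direction and a bold font style.
For example, the text data with text attributes added may have a vertical text direction and a font size of size 3.
Step S103: add the text data with text attributes added to a preset image template to obtain the image expansion data.
It should be noted that the preset image template may be a blank image.
For example, the text data with text attributes added may be loaded onto a blank image to obtain the image expansion data.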
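Steps S101 to S103 can be sketched roughly as follows. This is a minimal illustration, not the applicant's implementation: the `TextSample` type, the attribute value lists, and the square-glyph size estimate in `canvas_size` are all assumptions made for the example, and the actual rendering of text onto the blank template (which would need an imaging library) is left out.

```python
import random
from dataclasses import dataclass


@dataclass
class TextSample:
    """Plain text plus the text attributes added in step S102 (hypothetical type)."""
    text: str
    direction: str   # "horizontal" or "vertical"
    font_size: int   # in pixels; values are illustrative only
    font_style: str  # e.g. "regular", "bold", "colored"


def build_expansion_specs(texts, per_text=3, seed=0):
    """Attach randomly chosen text attributes to each unformatted text string."""
    rng = random.Random(seed)
    directions = ["horizontal", "vertical"]
    sizes = [12, 16, 24]
    styles = ["regular", "bold", "colored"]
    specs = []
    for text in texts:
        for _ in range(per_text):
            specs.append(
                TextSample(text, rng.choice(directions), rng.choice(sizes), rng.choice(styles))
            )
    return specs


def canvas_size(sample, margin=4):
    """(width, height) of a blank template big enough for the text.

    Assumes every glyph occupies a font_size x font_size square.
    """
    length = len(sample.text) * sample.font_size + 2 * margin
    thickness = sample.font_size + 2 * margin
    if sample.direction == "horizontal":
        return (length, thickness)
    return (thickness, length)
```

Each `TextSample` would then be drawn onto a blank image of `canvas_size(sample)` to produce one image expansion sample; fixing the random seed keeps the generated set reproducible.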
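By constructing image expansion data, rich image training data can be generated automatically, which not only solves the problem of insufficient training samples but also improves the robustness and accuracy of the drug name recognition model.
Step S20: input the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, obtaining the trained drug name recognition model.
In the embodiments of this application, by inputting the added image expansion data together with the image training data into the drug name recognition model for training, the trained model is suited to scenes in which complex and diverse drug names in medicine box images are recognized, thereby improving the accuracy of the drug name recognition model.
It should be noted that the drug name recognition model is used to recognize the text information in an image and to classify that text information into labels, and the output label with the highest probability is determined as the drug name recognition result. The drug name recognition model may include a convolutional neural network and a recurrent neural network: the convolutional neural network performs convolution and pooling on the input image and outputs a feature image; the recurrent neural network performs drug name classification prediction on the feature image and outputs the drug name recognition result.
For example, when training the drug name recognition model, the loss function value can be calculated with the CTC loss function, and the parameters of the model adjusted according to that value. The CTC (Connectionist Temporal Classification) loss function is used to solve the alignment problem between input features and output labels.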
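To make the alignment problem concrete: a CTC-trained recognizer emits one label distribution per time step, with an extra "blank" class, and the output string is recovered by collapsing repeats and dropping blanks. The specification does not describe a decoding procedure, so the greedy decoder below is only an illustrative sketch; the function name, the score layout, and the choice of index 0 for the blank are assumptions.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Greedy CTC decoding.

    frame_scores: one list of class scores per time step.
    alphabet: alphabet[i] is the character for class i (entry `blank` is unused).
    """
    # Pick the highest-scoring class at every time step.
    best = [max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores]
    chars = []
    previous = None
    for index in best:
        # Emit a character only when the class changes and is not the blank.
        if index != previous and index != blank:
            chars.append(alphabet[index])
        previous = index
    return "".join(chars)
```

With per-step best classes `A, A, blank, A, B, B`, the decoder returns `"AAB"`: the blank between the second and third `A` is what lets a genuinely repeated character survive the collapse.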
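In some embodiments, inputting the image training data and the image expansion data into the drug name recognition model for iterative training until the model converges may include: determining the training sample images for each round of training from the image training data and the image expansion data; inputting the current round's training sample images into the convolutional neural network for convolution and pooling, and outputting feature images; inputting the feature images into the recurrent neural network for drug name classification prediction, and outputting the corresponding drug name recognition results; determining the loss function value corresponding to the drug name recognition results based on the connectionist temporal classification loss function; and, if the loss function value is greater than a preset loss threshold, adjusting the parameters of the convolutional neural network and the recurrent neural network and performing the next round of training, until the loss function value obtained is less than or equal to the loss threshold, at which point training ends and the trained drug name recognition model is obtained.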
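The stopping rule described above can be sketched as a plain loop. The four callables (`batches`, `forward`, `ctc_loss`, `update`) are hypothetical stand-ins for the sample selection, the CNN+RNN forward pass, the CTC loss, and the parameter adjustment; `max_rounds` is an added safety limit that the specification does not mention.

```python
def train_until_converged(batches, forward, ctc_loss, update, loss_threshold, max_rounds=1000):
    """Repeat training rounds until the loss is <= loss_threshold.

    Returns (round_number, final_loss). All callables are placeholders
    supplied by the caller; nothing here depends on a specific framework.
    """
    for round_no in range(1, max_rounds + 1):
        images, labels = batches(round_no)    # training samples for this round
        predictions = forward(images)         # CNN features followed by RNN prediction
        loss = ctc_loss(predictions, labels)  # loss value for this round
        if loss <= loss_threshold:
            return round_no, loss             # converged: stop training
        update(loss)                          # adjust CNN and RNN parameters
    raise RuntimeError("loss did not reach the threshold within max_rounds")
```

Checking the loss before calling `update` mirrors the text: parameters are adjusted only when the loss is still above the threshold.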
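For example, the preset loss threshold may be set according to the actual situation, and the specific value is not limited herein.
For example, the parameters of the convolutional neural network and the recurrent neural network may be adjusted based on the error back-propagation algorithm, or according to other algorithms such as gradient descent. The error back-propagation (BP) algorithm propagates the output error backward through the layers of the network to compute how each parameter should be adjusted; it is the usual way of training multilayer feed-forward neural networks.
To further ensure the privacy and security of the trained drug name recognition model, it may also be stored in a node of a blockchain. When the trained drug name recognition model is needed, it can be retrieved from the blockchain node.
By calculating the loss function value with the connectionist temporal classification loss function and adjusting the parameters of the convolutional neural network and the recurrent neural network according to that value, the alignment problem between input features and output labels is solved, and the accuracy with which the trained drug name recognition model recognizes drug names is effectively improved.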
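In some embodiments, before the current round's training sample images are input into the convolutional neural network for convolution and pooling to output feature images, the method may further include: determining the image recognition direction corresponding to the drug name recognition model; determining the image direction of each current-round training sample image according to its height and width; performing direction adjustment on the training sample images whose image direction differs from the image recognition direction, obtaining adjusted training sample images; and inputting the adjusted training sample images and the training sample images that do not require direction adjustment into the convolutional neural network for convolution and pooling, obtaining the feature images corresponding to the current round's training sample images.
It should be noted that, to ensure recognition accuracy, the drug name recognition model has a required image recognition direction: a given drug name recognition model is suited to recognizing images of one fixed direction. If training sample images of several directions were used as they are, several drug name recognition models would have to be trained. Therefore, when training the drug name recognition model, the image recognition direction of the model is determined first, and the training sample images are then adjusted to that direction.
For example, the image recognition direction corresponding to the drug name recognition model can be set to horizontal, or it can be set to vertical.
For example, the image direction of a current-round training sample image can be determined from its height and width: when the width of the training sample image is greater than its height, its image direction is determined to be horizontal; when the width is less than or equal to the height, its image direction is determined to be vertical.
For example, if the image recognition direction of the drug name recognition model is horizontal, direction adjustment is required for training sample images whose image direction is vertical; such an image can be rotated by 90°. Training sample images with other image directions can be rotated according to their actual image direction.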
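The width/height rule and the 90° rotation can be sketched on a plain list-of-rows image. The helper names are hypothetical, and the clockwise rotation is an arbitrary choice: the specification does not say which way to rotate, and in practice the direction that keeps the characters readable would have to be chosen.

```python
def image_direction(image):
    """Classify an image (list of pixel rows) by the rule in the text:
    width > height is horizontal, otherwise vertical."""
    height = len(image)
    width = len(image[0]) if height else 0
    return "horizontal" if width > height else "vertical"


def rotate_90_clockwise(image):
    """Rotate a list-of-rows image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def unify_direction(image, target="horizontal"):
    """Return the image unchanged if it already matches the target direction,
    otherwise rotate it once by 90 degrees."""
    if image_direction(image) == target:
        return image
    return rotate_90_clockwise(image)
```

Note that a square image counts as vertical under this rule and stays square after rotation, so it is rotated once and passed on as is.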
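By unifying the orientation of the training sample images, a single drug name recognition model can be trained on both horizontal and vertical images, instead of training one model on horizontal images and another on vertical images, which simplifies the training process.
Step S30: acquire a medicine box image on which drug name recognition is to be performed, and determine at least one to-be-recognized image corresponding to the medicine box image.
It should be noted that the drug name recognition method provided by the embodiments of this application can be applied to complex scenes in which complex and diverse drug names in medicine box images are recognized. The user can upload the medicine box image to a server or terminal on which a drug name recognition system or application is installed. For example, the user can photograph the medicine box with a mobile phone and then upload the image to the drug name recognition system or application for drug name recognition.
For example, when an image upload operation by the user is detected, the medicine box image uploaded by that operation is determined as the medicine box image on which drug name recognition is to be performed.
In the embodiments of this application, after the medicine box image is acquired, at least one to-be-recognized image corresponding to it can be determined. Referring to FIG. 4, FIG. 4 is a schematic flowchart of sub-steps of determining to-be-recognized images provided by an embodiment of this application, which may include the following steps S301 and S302.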
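Step S301: input the medicine box image into a text detection model for text detection, obtaining text position information corresponding to the medicine box image.
It should be noted that the text detection model may be a DB-Net (Differentiable Binarization Network) model. The main innovation of DB-Net is that it performs adaptive binarization at every pixel, with the binarization threshold learned by the network; the binarization step is thus included in the network and trained jointly, which can effectively enhance the robustness of the text detection model and improve its detection speed. In addition, the backbone of the DB-Net model adopts a ResNet (Residual Neural Network) structure. During training, after an input picture passes through feature extraction and upsampling fusion, a probability feature map and a threshold feature map are predicted from the resulting feature image, and the binarized feature map is calculated from these two maps.
In the embodiments of this application, the text detection model includes at least a feature extraction layer, which performs feature extraction on the image input into the text detection model to obtain the corresponding feature image. In addition to the feature extraction layer, the text detection model may include a feature prediction layer, a binarization layer and a text position prediction layer. The feature prediction layer predicts the probability feature map and the threshold feature map corresponding to the feature image; the binarization layer performs the binarization calculation on the probability feature map and the threshold feature map and outputs the binarized feature map; the text position prediction layer identifies text regions in the binarized feature map and outputs the text position information.
For example, the text detection model may be a pre-trained text detection model. In some embodiments, a preset number of sample medicine box images may be acquired and input into the initial text detection model for iterative training until the text detection model converges, obtaining the trained text detection model. The specific training process is not limited herein.
In some embodiments, the medicine box image on which drug name recognition is to be performed may be input into the trained text detection model for text detection, obtaining the text position information corresponding to the medicine box image.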
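Referring to FIG. 5, FIG. 5 is a schematic flowchart of sub-steps of performing text detection on a medicine box image provided by an embodiment of this application, which may include the following steps S3011 to S3013.
Step S3011: input the medicine box image into the feature extraction layer for feature extraction, obtaining the feature image corresponding to the medicine box image.
For example, the medicine box image may be input into the feature extraction layer, which performs feature extraction and outputs the feature image corresponding to the medicine box image. The feature extraction layer may include an FPN (Feature Pyramid Network). It should be noted that FPN addresses the multi-scale problem in object detection: through simple changes to the network connections, it greatly improves small-object detection performance with almost no increase in the computation of the original model.
For example, the medicine box image may be upsampled through the FPN to extract the feature image corresponding to the medicine box image.
Step S3012: determine the binarized feature map corresponding to the feature image.
For example, to determine the binarized feature map, the feature image output by the feature extraction layer is first input into the feature prediction layer, which predicts the probability feature map and the threshold feature map corresponding to the feature image. The probability feature map and the threshold feature map are then input into the binarization layer for the binarization calculation, which outputs the binarized feature map.
For example, the binarization layer may perform the binarization calculation on the probability feature map and the threshold feature map based on the differentiable binarization formula, obtaining the binarized feature map.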
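The specification does not write the differentiable binarization formula out. In the published DB-Net work it takes the following form, where $P_{i,j}$ is the probability feature map, $T_{i,j}$ is the threshold feature map, $\hat{B}_{i,j}$ is the approximate binarized feature map, and $k$ is an amplification factor (set to 50 in that work); treating this as the formula intended here is an assumption.

```latex
\hat{B}_{i,j} = \frac{1}{1 + e^{-k\,(P_{i,j} - T_{i,j})}}
```

Because this is a steep sigmoid rather than a hard step, it is differentiable everywhere, which is what allows the threshold map to be learned jointly with the rest of the network: pixels with $P_{i,j} > T_{i,j}$ are pushed toward 1 and pixels with $P_{i,j} < T_{i,j}$ toward 0.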
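Step S3013: determine the text region in the binarized feature map, and determine the text position information according to the text region.
For example, the text region in the binarized feature map can be determined from the pixel values of the map: pixels whose value is greater than a preset pixel threshold are determined to belong to the text region, and pixels whose value is less than the preset pixel threshold are determined to belong to the non-text region. The preset pixel threshold may be set according to the actual situation, and the specific value is not limited herein.
For example, the coordinates of the upper-left corner and the lower-right corner of the text region may be determined as the text position information, so that the text position information includes the upper-left and lower-right coordinates of the text region.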
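A minimal sketch of step S3013 follows, under a strong simplifying assumption: all pixels above the threshold are treated as one text region. A real detector would first separate connected components so that each text line gets its own box; the function name and the default threshold of 0.5 are illustrative.

```python
def text_region_bbox(binary_map, pixel_threshold=0.5):
    """Return ((left, top), (right, bottom)) enclosing every pixel whose value
    exceeds pixel_threshold, or None if no pixel does.

    binary_map is a list of rows; coordinates are (x, y) with inclusive bounds.
    """
    coords = [
        (x, y)
        for y, row in enumerate(binary_map)
        for x, value in enumerate(row)
        if value > pixel_threshold
    ]
    if not coords:
        return None
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys)), (max(xs), max(ys))
```

The returned pair of corner coordinates is the form of text position information described above.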
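It should be noted that existing text detection methods based on image segmentation usually set a fixed threshold to convert the probability feature map produced by the segmentation network into a binarized feature map. Because different thresholds strongly affect model performance, the accuracy of such methods is low. The DB-Net model in the embodiments of this application instead inserts the binarization operation into the segmentation network for joint optimization, so that the threshold at each position of the image can be predicted adaptively and foreground and background pixels can be fully distinguished. This not only improves the detection speed of the text detection model but also enhances its robustness.
By inputting the medicine box image on which drug name recognition is to be performed into the text detection model for text detection, the text position information corresponding to text in different formats in the medicine box image can be determined accurately.
Step S302: segment the medicine box image according to the text position information, obtaining at least one to-be-recognized image.
For example, the image region to be segmented can be determined from the upper-left and lower-right coordinates of the text region, and that region is then cropped to obtain a to-be-recognized image. Other segmentation methods may also be used, and the specific method is not limited herein. Each item of text position information yields one to-be-recognized image.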
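The cropping in step S302 can be sketched on a list-of-rows image as below. Both helper names are hypothetical, and treating the corner coordinates as inclusive is an assumption; the specification does not state the convention.

```python
def crop(image, top_left, bottom_right):
    """Cut the rectangle between two inclusive (x, y) corners out of an image
    stored as a list of pixel rows."""
    (x0, y0), (x1, y1) = top_left, bottom_right
    return [row[x0:x1 + 1] for row in image[y0:y1 + 1]]


def split_by_positions(image, positions):
    """Produce one to-be-recognized image per item of text position information."""
    return [crop(image, top_left, bottom_right) for top_left, bottom_right in positions]
```

Each entry in `positions` is one `(top_left, bottom_right)` pair, so the output list has exactly one cropped image per detected text region.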
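By segmenting the medicine box image according to the text position information, the to-be-recognized images containing text can be obtained accurately, which in turn improves the accuracy of drug name recognition on the to-be-recognized images.
Step S40: input each to-be-recognized image into the trained drug name recognition model for drug name recognition, obtaining the drug name recognition result corresponding to the medicine box image.
After the at least one to-be-recognized image corresponding to the medicine box image is determined, each to-be-recognized image can be input into the trained drug name recognition model for drug name recognition, obtaining the drug name recognition result corresponding to the medicine box image.
In some embodiments, before each to-be-recognized image is input into the drug name recognition model, the method may further include: determining the image recognition direction corresponding to the drug name recognition model; determining, according to the image recognition direction, the to-be-recognized images that require direction adjustment and those that do not; and performing direction adjustment on the to-be-recognized images that require it, obtaining adjusted to-be-recognized images.
It should be noted that the image recognition direction corresponding to the drug name recognition model is fixed at training time. For example, if the model was trained with a horizontal image recognition direction, its image recognition direction is horizontal; if it was trained with a vertical image recognition direction, its image recognition direction is vertical.
For example, after the image recognition direction of the model is determined, the to-be-recognized images that require direction adjustment and those that do not can be identified accordingly. When the image recognition direction is horizontal, a to-be-recognized image whose width is greater than its height has a horizontal image direction and does not require direction adjustment, whereas a to-be-recognized image whose width is less than its height has a vertical image direction and requires direction adjustment.
For example, a to-be-recognized image that requires direction adjustment can be rotated clockwise or counterclockwise until its image direction is horizontal.
In some embodiments, inputting each to-be-recognized image into the drug name recognition model for drug name recognition may include: inputting the adjusted to-be-recognized images and the to-be-recognized images that do not require direction adjustment into the drug name recognition model for drug name recognition.
For example, the adjusted to-be-recognized images and the to-be-recognized images that do not require direction adjustment can be input into the drug name recognition model, obtaining the drug name recognition result corresponding to the medicine box image.
By determining the image recognition direction corresponding to the drug name recognition model, it can be decided whether the direction of a to-be-recognized image needs to be adjusted, so that a single drug name recognition model can recognize both horizontal and vertical to-be-recognized images without adding a second model. This not only reduces computation and memory consumption but also improves the accuracy of drug name recognition.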
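Step S50: perform, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, obtaining the drug name corresponding to the medicine box image.
It should be noted that the drug name recognition result produced by the drug name recognition model may contain homophones or synonyms, which makes the result insufficiently accurate; performing drug name fuzzy matching on the recognition result therefore yields a more accurate drug name.
For example, the drug name information database includes a plurality of standard drug name texts. In the embodiments of this application, the standard drug name texts can be collected in advance and stored in the drug name information database. To further ensure the privacy and security of the drug name information database, it may be stored in a node of a blockchain.
In some embodiments, performing drug name fuzzy matching on the drug name recognition result based on the preset drug name information database to obtain the drug name corresponding to the medicine box image may include: determining, based on a preset edit distance algorithm, the edit distance value between the drug name recognition result and the standard drug name texts in the drug name information database; and determining the standard drug name text whose edit distance value is less than a preset edit distance threshold as the drug name corresponding to the medicine box image.
It should be noted that the Levenshtein distance (edit distance) between two strings is the minimum number of editing operations required to convert one string into the other.
For example, the edit distance value between the drug name recognition result and each standard drug name text in the drug name information database can be calculated with the edit distance algorithm, and the standard drug name text whose edit distance value is less than the preset edit distance threshold is determined as the drug name corresponding to the medicine box image.
The preset edit distance threshold may be set according to the actual situation, and the specific value is not limited herein.
By performing drug name fuzzy matching on the drug name recognition result based on the preset drug name information database, the accuracy of the recognized drug name can be further improved.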
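The fuzzy matching in step S50 can be sketched with a standard dynamic-programming Levenshtein distance. The function names are hypothetical, and one detail is an assumption: when several standard names fall below the threshold, this sketch returns the closest one, a tie-breaking rule the specification does not state. The sample strings are borrowed from the specification's own "AB-C颗粒" example.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def fuzzy_match(result, standard_names, threshold):
    """Return the standard name closest to `result` among those whose edit
    distance is strictly less than `threshold`, or None if there is none."""
    best_name, best_distance = None, threshold
    for name in standard_names:
        distance = levenshtein(result, name)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name
```

For the recognition result "AB_D颗粒" and the standard name "AB-C颗粒", two substitutions are needed, so the distance is 2; the match is accepted with a threshold of 3 and rejected with a threshold of 2.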
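In some embodiments, when recognizing complex and diverse drug names in a medicine box image, for example a drug name composed of Chinese, English and symbols with horizontal text direction and a colored font style, such as "AB-C颗粒" (AB-C granules), the medicine box image is first input into the text detection model for text detection, obtaining the text position information corresponding to the medicine box image. Second, the medicine box image is segmented according to the text position information to obtain at least one to-be-recognized image, which may contain the drug name or other text. Third, the to-be-recognized images that require it are adjusted according to the image recognition direction of the drug name recognition model, and the adjusted images and the images that do not require adjustment are input into the drug name recognition model, obtaining the drug name recognition result corresponding to the medicine box image, for example "AB_D颗粒". Finally, drug name fuzzy matching is performed on the result "AB_D颗粒" based on the preset drug name information database, obtaining the drug name "AB-C颗粒" corresponding to the medicine box image. In this way the drug name in the medicine box image is recognized accurately.
In some embodiments, after the drug name corresponding to the medicine box image is obtained, the method may further include: determining, based on a preset correspondence between drug names and medication instruction information, the medication instruction information of the drug name corresponding to the medicine box image, and displaying the drug name and the medication instruction information.
For example, the medication instruction information may include indication (main function) information, administration instructions, and so on.
For example, the drug name and the medication instruction information can be displayed on the server or terminal to which the user uploaded the medicine box image, such as the user's mobile phone.
In some embodiments, after the drug name corresponding to the medicine box image is obtained, the method may further include: performing speech synthesis on the drug name to generate voice information corresponding to the drug name, and broadcasting the voice information.
For example, a TTS (Text To Speech, speech synthesis) model can be used to perform speech synthesis on the drug name corresponding to the medicine box image to generate the corresponding voice information. It should be noted that the TTS model includes modules such as text analysis, an acoustic model and audio synthesis, and converts a piece of text into a speech signal.
For example, after the voice information corresponding to the drug name is generated, it can be broadcast on the server or terminal, such as through the user's mobile phone. In addition, speech synthesis can also be performed on the medication instruction information, which is then broadcast as well.
In some embodiments, after the drug name corresponding to the medicine box image is obtained, the method may include both operations: determining and displaying the medication instruction information of the drug name based on the preset correspondence, and performing speech synthesis on the drug name and broadcasting the resulting voice information.
By determining the medication instruction information of the drug name corresponding to the medicine box image and displaying both, the user can more conveniently obtain the drug name and the medication instruction information from the medicine box image, which is more user-friendly; by performing speech synthesis on the drug name and broadcasting the generated voice information, users with poor eyesight can also easily obtain the drug name in the medicine box image, which improves the user experience.
The drug name recognition method provided by the above embodiments constructs image expansion data so that rich image training data can be generated automatically, which not only solves the problem of insufficient training samples but also improves the robustness and accuracy of the drug name recognition model. By inputting the added image expansion data together with the image training data into the model for training, the trained model is suited to scenes in which complex and diverse drug names in medicine box images are recognized, thereby improving its accuracy. By calculating the loss function value with the connectionist temporal classification loss function and adjusting the parameters of the convolutional neural network and the recurrent neural network accordingly, the alignment problem between input features and output labels is solved and the accuracy of the trained model is effectively improved. By unifying the orientation of the training sample images, a single model can be trained on both horizontal and vertical images, instead of training one model on horizontal images and another on vertical images, which simplifies training. By inputting the medicine box image into the text detection model, the text position information of text in different formats can be determined accurately; by segmenting the medicine box image according to the text position information, the to-be-recognized images containing text can be obtained accurately, improving the accuracy of drug name recognition on them. By performing drug name fuzzy matching on the recognition result based on the preset drug name information database, the accuracy of the recognized drug name is further improved. By determining and displaying the medication instruction information of the drug name, the user can more conveniently obtain the drug name and the medication instruction information, which is more user-friendly; by synthesizing and broadcasting the voice information corresponding to the drug name, users with poor eyesight can also easily obtain the drug name in the medicine box image, which improves the user experience.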
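Referring to FIG. 6, FIG. 6 is a schematic block diagram of a drug name recognition apparatus 1000 further provided by an embodiment of this application; the apparatus is used to execute the aforementioned drug name recognition method. The drug name recognition apparatus may be configured in a server or a terminal.
As shown in FIG. 6, the drug name recognition apparatus 1000 includes: an image data acquisition module 1001, a model training module 1002, an image extraction module 1003, a drug name recognition module 1004 and a drug name fuzzy matching module 1005.
The image data acquisition module 1001 is configured to acquire image training data, the image training data being obtained by performing text extraction on sample medicine box images, and to acquire image expansion data.
The model training module 1002 is configured to input the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, obtaining the trained drug name recognition model.
The image extraction module 1003 is configured to acquire a medicine box image on which drug name recognition is to be performed, and to determine at least one to-be-recognized image corresponding to the medicine box image.
The drug name recognition module 1004 is configured to input each to-be-recognized image into the trained drug name recognition model for drug name recognition, obtaining the drug name recognition result corresponding to the medicine box image.
The drug name fuzzy matching module 1005 is configured to perform, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, obtaining the drug name corresponding to the medicine box image.
It should be noted that a person skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the apparatus and modules described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.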
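The above apparatus can be implemented in the form of a computer program, and the computer program can run on a computer device as shown in FIG. 7.
Referring to FIG. 7, FIG. 7 is a schematic structural block diagram of a computer device provided by an embodiment of this application. The computer device can be a server or a terminal.
Referring to FIG. 7, the computer device includes a processor and a memory connected through a system bus, where the memory may include a non-volatile storage medium and an internal memory.
The processor provides computing and control capabilities and supports the operation of the entire computer device.
The internal memory provides an environment for running the computer program stored in the non-volatile storage medium; when the computer program is executed by the processor, it causes the processor to execute any of the drug name recognition methods.
It should be understood that the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general-purpose processor can be a microprocessor or any conventional processor.
In one embodiment, the processor is configured to run a computer program stored in the memory to implement the following steps:
acquiring image training data, the image training data being obtained by performing text extraction on sample medicine box images, and acquiring image expansion data; inputting the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, to obtain the trained drug name recognition model; acquiring a medicine box image on which drug name recognition is to be performed, and determining at least one to-be-recognized image corresponding to the medicine box image; inputting each to-be-recognized image into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image; and performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, to obtain the drug name corresponding to the medicine box image.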
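In one embodiment, when acquiring the image expansion data, the processor is configured to:
acquire, based on a preset drug knowledge base, text data containing drug information, the text data including at least one of Chinese, English, numbers and symbols; add text attributes to the text data to obtain the text data with text attributes added, the text attributes including at least one of text length, text direction, font size and font style; and add the text data with text attributes added to a preset image template to obtain the image expansion data.
In one embodiment, when determining at least one to-be-recognized image corresponding to the medicine box image, the processor is configured to:
input the medicine box image into a text detection model for text detection to obtain text position information corresponding to the medicine box image; and segment the medicine box image according to the text position information to obtain at least one to-be-recognized image.
In one embodiment, the text detection model includes at least a feature extraction layer; when inputting the medicine box image into the text detection model for text detection and obtaining the text position information corresponding to the medicine box image, the processor is configured to:
input the medicine box image into the feature extraction layer for feature extraction to obtain a feature image corresponding to the medicine box image; determine a binarized feature map corresponding to the feature image; and determine a text region in the binarized feature map and determine the text position information according to the text region.
In one embodiment, before inputting each to-be-recognized image into the drug name recognition model for drug name recognition, the processor is further configured to:
determine the image recognition direction corresponding to the drug name recognition model; determine, according to the image recognition direction, the to-be-recognized images that require direction adjustment and those that do not; and perform direction adjustment on the to-be-recognized images that require it to obtain adjusted to-be-recognized images.
In one embodiment, when inputting each to-be-recognized image into the drug name recognition model for drug name recognition, the processor is configured to:
input the adjusted to-be-recognized images and the to-be-recognized images that do not require direction adjustment into the drug name recognition model for drug name recognition.
In one embodiment, the drug name information database includes a plurality of standard drug name texts; when performing drug name fuzzy matching on the drug name recognition result based on the preset drug name information database and obtaining the drug name corresponding to the medicine box image, the processor is configured to:
determine, based on a preset edit distance algorithm, the edit distance value between the drug name recognition result and the standard drug name texts in the drug name information database; and determine the standard drug name text whose edit distance value is less than a preset edit distance threshold as the drug name corresponding to the medicine box image.
In one embodiment, after obtaining the drug name corresponding to the medicine box image, the processor is further configured to:
determine, based on a preset correspondence between drug names and medication instruction information, the medication instruction information of the drug name corresponding to the medicine box image, and display the drug name and the medication instruction information; and/or perform speech synthesis on the drug name corresponding to the medicine box image to generate voice information corresponding to the drug name, and broadcast the voice information.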
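The embodiments of this application further provide a computer-readable storage medium storing a computer program; the computer program includes program instructions, and a processor executes the program instructions to implement any of the drug name recognition methods provided by the embodiments of this application.
The computer-readable storage medium may be an internal storage unit of the computer device described in the foregoing embodiments, for example a hard disk or a memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk equipped on the computer device, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc.
Further, the computer-readable storage medium may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function, and the like, and the data storage area may store data created according to the use of the blockchain node, and the like.
The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each data block contains a batch of network transaction information used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain can include an underlying blockchain platform, a platform product service layer, an application service layer, and so on.
The above are only specific embodiments of this application, but the protection scope of this application is not limited to them. Any person skilled in the art can readily conceive of various equivalent modifications or replacements within the technical scope disclosed in this application, and these modifications or replacements shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.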

Claims (20)

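  1. A drug name recognition method, comprising:
    acquiring image training data, the image training data being obtained by performing text extraction on sample medicine box images, and acquiring image expansion data;
    inputting the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, to obtain the trained drug name recognition model;
    acquiring a medicine box image on which drug name recognition is to be performed, and determining at least one to-be-recognized image corresponding to the medicine box image;
    inputting each to-be-recognized image into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image;
    performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, to obtain the drug name corresponding to the medicine box image.
  2. The drug name recognition method according to claim 1, wherein the acquiring image expansion data comprises:
    acquiring, based on a preset drug knowledge base, text data containing drug information, the text data including at least one of Chinese, English, numbers and symbols;
    adding text attributes to the text data to obtain the text data with text attributes added, the text attributes including at least one of text length, text direction, font size and font style;
    adding the text data with text attributes added to a preset image template to obtain the image expansion data.
  3. The drug name recognition method according to claim 1, wherein the determining at least one to-be-recognized image corresponding to the medicine box image comprises:
    inputting the medicine box image into a text detection model for text detection, to obtain text position information corresponding to the medicine box image;
    segmenting the medicine box image according to the text position information, to obtain at least one to-be-recognized image.
  4. The drug name recognition method according to claim 3, wherein the text detection model includes at least a feature extraction layer, and the inputting the medicine box image into a text detection model for text detection to obtain text position information corresponding to the medicine box image comprises:
    inputting the medicine box image into the feature extraction layer for feature extraction, to obtain a feature image corresponding to the medicine box image;
    determining a binarized feature map corresponding to the feature image;
    determining a text region in the binarized feature map, and determining the text position information according to the text region.
  5. The drug name recognition method according to claim 1, wherein before the inputting each to-be-recognized image into the drug name recognition model for drug name recognition, the method further comprises:
    determining the image recognition direction corresponding to the drug name recognition model;
    determining, according to the image recognition direction, the to-be-recognized images that require direction adjustment and the to-be-recognized images that do not require direction adjustment;
    performing direction adjustment on the to-be-recognized images that require direction adjustment, to obtain adjusted to-be-recognized images;
    and the inputting each to-be-recognized image into the drug name recognition model for drug name recognition comprises:
    inputting the adjusted to-be-recognized images and the to-be-recognized images that do not require direction adjustment into the drug name recognition model for drug name recognition.
  6. The drug name recognition method according to claim 1, wherein the drug name information database includes a plurality of standard drug name texts, and the performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result to obtain the drug name corresponding to the medicine box image comprises:
    determining, based on a preset edit distance algorithm, the edit distance value between the drug name recognition result and the standard drug name texts in the drug name information database;
    determining the standard drug name text whose edit distance value is less than a preset edit distance threshold as the drug name corresponding to the medicine box image.
  7. The drug name recognition method according to any one of claims 1-6, wherein after the obtaining the drug name corresponding to the medicine box image, the method further comprises:
    determining, based on a preset correspondence between drug names and medication instruction information, the medication instruction information of the drug name corresponding to the medicine box image, and displaying the drug name and the medication instruction information; and/or
    performing speech synthesis on the drug name corresponding to the medicine box image to generate voice information corresponding to the drug name, and broadcasting the voice information.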
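  8. A drug name recognition apparatus, comprising:
    an image data acquisition module, configured to acquire image training data, the image training data being obtained by performing text extraction on sample medicine box images, and to acquire image expansion data;
    a model training module, configured to input the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, to obtain the trained drug name recognition model;
    an image extraction module, configured to acquire a medicine box image on which drug name recognition is to be performed, and to determine at least one to-be-recognized image corresponding to the medicine box image;
    a drug name recognition module, configured to input each to-be-recognized image into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image;
    a drug name fuzzy matching module, configured to perform, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, to obtain the drug name corresponding to the medicine box image.
  9. A computer device, wherein the computer device comprises a memory and a processor;
    the memory is configured to store a computer program;
    the processor is configured to execute the computer program and, when executing the computer program, implement the following steps:
    acquiring image training data, the image training data being obtained by performing text extraction on sample medicine box images, and acquiring image expansion data;
    inputting the image training data and the image expansion data into a drug name recognition model for iterative training until the drug name recognition model converges, to obtain the trained drug name recognition model;
    acquiring a medicine box image on which drug name recognition is to be performed, and determining at least one to-be-recognized image corresponding to the medicine box image;
    inputting each to-be-recognized image into the trained drug name recognition model for drug name recognition, to obtain a drug name recognition result corresponding to the medicine box image;
    performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result, to obtain the drug name corresponding to the medicine box image.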
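  10. The computer device according to claim 9, wherein the step of acquiring image expansion data implemented by the processor comprises:
    acquiring, based on a preset drug knowledge base, text data containing drug information, the text data including at least one of Chinese, English, numbers and symbols;
    adding text attributes to the text data to obtain the text data with text attributes added, the text attributes including at least one of text length, text direction, font size and font style;
    adding the text data with text attributes added to a preset image template to obtain the image expansion data.
  11. The computer device according to claim 9, wherein the step of determining at least one to-be-recognized image corresponding to the medicine box image implemented by the processor comprises:
    inputting the medicine box image into a text detection model for text detection, to obtain text position information corresponding to the medicine box image;
    segmenting the medicine box image according to the text position information, to obtain at least one to-be-recognized image;
    wherein the text detection model includes at least a feature extraction layer, and the step of inputting the medicine box image into the text detection model for text detection to obtain text position information corresponding to the medicine box image implemented by the processor comprises:
    inputting the medicine box image into the feature extraction layer for feature extraction, to obtain a feature image corresponding to the medicine box image;
    determining a binarized feature map corresponding to the feature image;
    determining a text region in the binarized feature map, and determining the text position information according to the text region.
  12. The computer device according to claim 9, wherein before the step of inputting each to-be-recognized image into the drug name recognition model for drug name recognition, the processor further implements:
    determining the image recognition direction corresponding to the drug name recognition model;
    determining, according to the image recognition direction, the to-be-recognized images that require direction adjustment and the to-be-recognized images that do not require direction adjustment;
    performing direction adjustment on the to-be-recognized images that require direction adjustment, to obtain adjusted to-be-recognized images;
    and the inputting each to-be-recognized image into the drug name recognition model for drug name recognition comprises:
    inputting the adjusted to-be-recognized images and the to-be-recognized images that do not require direction adjustment into the drug name recognition model for drug name recognition.
  13. The computer device according to claim 9, wherein the drug name information database includes a plurality of standard drug name texts, and the step of performing, based on a preset drug name information database, drug name fuzzy matching on the drug name recognition result to obtain the drug name corresponding to the medicine box image implemented by the processor comprises:
    determining, based on a preset edit distance algorithm, the edit distance value between the drug name recognition result and the standard drug name texts in the drug name information database;
    determining the standard drug name text whose edit distance value is less than a preset edit distance threshold as the drug name corresponding to the medicine box image.
  14. The computer device according to any one of claims 9-13, wherein after the step of obtaining the drug name corresponding to the medicine box image, the processor further implements the following steps:
    determining, based on a preset correspondence between drug names and medication instruction information, the medication instruction information of the drug name corresponding to the medicine box image, and displaying the drug name and the medication instruction information; and/or
    performing speech synthesis on the drug name corresponding to the medicine box image to generate voice information corresponding to the drug name, and broadcasting the voice information.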
  15. 一种计算机可读存储介质,其中,所述计算机可读存储介质存储有计算机程序,所述计算机程序被处理器执行时使所述处理器实现如下步骤:
    获取图像训练数据,所述图像训练数据为对样本药盒图像进行文本提取得到的,以及获取图像扩充数据;
    将所述图像训练数据与所述图像扩充数据输入药名识别模型进行迭代训练,直至所述药名识别模型收敛,得到训练好的所述药名识别模型;
    获取待进行药名识别的药盒图像,并确定所述药盒图像对应的至少一个待识别图像;
    将每个所述待识别图像输入训练好的所述药名识别模型进行药名识别,获得所述药盒图像对应的药名识别结果;
    基于预设的药名信息库,对所述药名识别结果进行药名模糊匹配,获得所述药盒图像对应的药物名称。
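  16. The computer-readable storage medium according to claim 15, wherein the step, implemented by the processor, of acquiring image augmentation data comprises:
    acquiring, based on a preset drug knowledge base, text data containing drug information, the text data comprising at least one of Chinese characters, English characters, digits, and symbols;
    adding text attributes to the text data to obtain the text data with text attributes added, the text attributes comprising at least one of text length, text direction, font size, and font style;
    adding the text data with text attributes added to a preset image template to obtain the image augmentation data.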
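  17. The computer-readable storage medium according to claim 15, wherein the step, implemented by the processor, of determining at least one to-be-recognized image corresponding to the medicine box image comprises:
    inputting the medicine box image into a text detection model for text detection, to obtain text position information corresponding to the medicine box image;
    segmenting the medicine box image according to the text position information, to obtain the at least one to-be-recognized image;
    wherein the text detection model comprises at least a feature extraction layer, and the step, implemented by the processor, of inputting the medicine box image into the text detection model for text detection to obtain the text position information corresponding to the medicine box image comprises:
    inputting the medicine box image into the feature extraction layer for feature extraction, to obtain a feature image corresponding to the medicine box image;
    determining a binarized feature map corresponding to the feature image;
    determining a text region in the binarized feature map, and determining the text position information according to the text region.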
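  18. The computer-readable storage medium according to claim 15, wherein, before the step of inputting each to-be-recognized image into the drug name recognition model for drug name recognition, the processor further implements:
    determining an image recognition direction corresponding to the drug name recognition model;
    determining, according to the image recognition direction, the to-be-recognized images that require direction adjustment and the to-be-recognized images that do not require direction adjustment;
    performing direction adjustment on the to-be-recognized images that require direction adjustment, to obtain adjusted to-be-recognized images;
    and the inputting of each to-be-recognized image into the drug name recognition model for drug name recognition comprises:
    inputting the adjusted to-be-recognized images and the to-be-recognized images that do not require direction adjustment into the drug name recognition model for drug name recognition.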
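  19. The computer-readable storage medium according to claim 15, wherein the drug name information base comprises a plurality of standard drug name texts, and the step, implemented by the processor, of performing fuzzy drug name matching on the drug name recognition result based on the preset drug name information base to obtain the drug name corresponding to the medicine box image comprises:
    determining, based on a preset edit distance algorithm, an edit distance value between the drug name recognition result and the standard drug name texts in the drug name information base;
    determining the standard drug name text whose edit distance value is less than a preset edit distance threshold as the drug name corresponding to the medicine box image.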
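  20. The computer-readable storage medium according to any one of claims 15 to 19, wherein, after the step of obtaining the drug name corresponding to the medicine box image, the processor further implements the following steps:
    determining, based on a preset correspondence between drug names and medication instruction information, the medication instruction information for the drug name corresponding to the medicine box image, and displaying the drug name and the medication instruction information; and/or
    performing speech synthesis on the drug name corresponding to the medicine box image to generate voice information corresponding to the drug name, and broadcasting the voice information.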
PCT/CN2021/097413 2021-04-30 2021-05-31 Drug name recognition method and apparatus, computer device, and storage medium WO2022227218A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110486506.8A CN113205047A (zh) 2021-04-30 2021-04-30 Drug name recognition method and apparatus, computer device, and storage medium
CN202110486506.8 2021-04-30

Publications (1)

Publication Number Publication Date
WO2022227218A1 (zh) 2022-11-03

Family

ID=77028529

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/097413 WO2022227218A1 (zh) 2021-04-30 2021-05-31 Drug name recognition method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN113205047A (zh)
WO (1) WO2022227218A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116168410B (zh) * 2023-04-21 2023-06-30 江苏羲辕健康科技有限公司 Neural-network-based medicine box information recognition method and system
CN116740688B (zh) * 2023-08-11 2023-11-07 武汉市中西医结合医院(武汉市第一医院) Drug identification method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
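CN107818289A (zh) * 2016-09-13 2018-03-20 北京搜狗科技发展有限公司 Prescription recognition method and apparatus, and apparatus for prescription recognition
CN109993165A (zh) * 2019-03-28 2019-07-09 永康市几米电子科技有限公司 Method, apparatus, and system for recognizing drug names on tablet blister packs and acquiring blister pack information
CN111126140A (zh) * 2019-11-19 2020-05-08 腾讯科技(深圳)有限公司 Text recognition method and apparatus, electronic device, and storage medium
CN111488826A (zh) * 2020-04-10 2020-08-04 腾讯科技(深圳)有限公司 Text recognition method and apparatus, electronic device, and storage medium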

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019201141A1 (en) * 2018-04-18 2019-10-24 Beijing Didi Infinity Technology And Development Co., Ltd. Methods and systems for image processing
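CN109471944B (zh) * 2018-11-12 2021-07-16 中山大学 Training method and apparatus for text classification model, and readable storage medium
CN110110715A (zh) * 2019-04-30 2019-08-09 北京金山云网络技术有限公司 Text detection model training method, and text region and content determination method and apparatus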

Also Published As

Publication number Publication date
CN113205047A (zh) 2021-08-03

Similar Documents

Publication Publication Date Title
US10726304B2 (en) Refining synthetic data with a generative adversarial network using auxiliary inputs
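CN111476284B (zh) Image recognition model training and image recognition method and apparatus, and electronic device
CN111062871B (zh) Image processing method and apparatus, computer device, and readable storage medium
WO2018108129A1 (zh) Method and apparatus for identifying object category, and electronic device
WO2020233269A1 (zh) Method, apparatus, and device for reconstructing 3D model from 2D image, and storage medium
US20210295114A1 (en) Method and apparatus for extracting structured data from image, and device
CN111027563A (zh) Text detection method and apparatus, and recognition system
CN110516096A (zh) Compositing-aware digital image search
WO2020253127A1 (zh) Facial feature extraction model training method, facial feature extraction method, apparatus, device, and storage medium
WO2020223859A1 (zh) Method, apparatus, and device for detecting slanted text
WO2020244075A1 (zh) Sign language recognition method and apparatus, computer device, and storage medium
WO2022089170A1 (zh) Subtitle region recognition method, apparatus, device, and storage medium
CN112287914B (zh) PPT video segment extraction method, apparatus, device, and medium
WO2022227218A1 (zh) Drug name recognition method and apparatus, computer device, and storage medium
KR20200059993A (ko) Apparatus and method for generating storyboard for webtoon production
WO2023151237A1 (zh) Face pose estimation method and apparatus, electronic device, and storage medium
CN113901954A (zh) Document layout recognition method and apparatus, electronic device, and storage medium
JP2022160662A (ja) Character recognition method, apparatus, device, storage medium, smart dictionary pen, and computer program
WO2021237227A1 (en) Method and system for multi-language text recognition model with autonomous language classification
WO2020244076A1 (zh) Face recognition method and apparatus, electronic device, and storage medium
WO2023109086A1 (zh) Character recognition method, apparatus, device, and storage medium
WO2023035535A1 (zh) Training method, apparatus, and device for semantic segmentation network, and storage medium
CN113486171B (zh) Image processing method and apparatus, and electronic device
WO2022105120A1 (zh) Method and apparatus for detecting text in pictures, computer device, and storage medium
CN115858776A (zh) Variant text classification and recognition method and system, storage medium, and electronic device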

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application
Ref document number: 21938688
Country of ref document: EP
Kind code of ref document: A1

NENP Non-entry into the national phase
Ref country code: DE