TWI781856B - Method for identifying medicine image, computer device and storage medium

Info

Publication number: TWI781856B
Application number: TW110147307A
Authority: TW
Inventors: 王育任; 呂孟蘋
Original assignee: 新加坡商鴻運科股份有限公司
Priority date: 2021-12-16
Filing date: 2021-12-16
Publication date: 2022-10-21
Also published as: TW202326613A

Abstract

The present application relates to an image analysis technology, and the present application provides a method for identifying medicine image, a computer device and a storage medium. The method includes obtaining a plurality of medicine images, a test image and a pre-trained medicine detection model, which includes a location detection network, a text-recognition network and a category-identification network; obtaining a plurality of target images by inputting the medicine images into the location detection network, generating a plurality of text characteristic matrices by inputting the target images into the text-recognition network, and generating a plurality of image characteristic matrices by inputting the target images into the category-identification network; generating a reference matrix according to each of the image characteristic matrices and a corresponding text characteristic matrix; generating a test matrix according to the test image and the medicine detection model, generating an identification result according to a similarity between the test matrix and each reference matrix.

Description

Drug image recognition method, computer equipment and storage medium

The present application relates to the field of image processing, and in particular to a drug image recognition method, a computer device, and a storage medium.

Existing drug image recognition schemes require a great deal of manual effort to label drug data, and when too little training data is available the trained model recognizes drugs poorly. How to improve the accuracy of drug image recognition has therefore become a technical problem in urgent need of a solution.

In view of the above, it is necessary to provide a drug image recognition method, a computer device, and a storage medium that can solve the technical problem that drug images are difficult to recognize accurately and efficiently.

A drug image recognition method includes: acquiring a plurality of drug images and a drug image to be tested; acquiring a pre-trained drug detection model, the drug detection model including a position detection network, a text recognition network, and a category recognition network; inputting the plurality of drug images into the position detection network to obtain a plurality of target images, each target image containing an image of a single drug; generating a plurality of text feature matrices according to the plurality of target images and the text recognition network; inputting the plurality of target images into the category recognition network to obtain a plurality of image feature matrices; generating a reference matrix according to each image feature matrix and the corresponding text feature matrix; processing the drug image to be tested based on the drug detection model to obtain a test matrix; and generating a recognition result for the drug image to be tested according to the similarity between the test matrix and each reference matrix.

According to an optional embodiment of the present application, before the plurality of drug images are input into the position detection network to obtain the plurality of target images, the drug image recognition method further includes: acquiring a position detection learner and a position image, the position image including a first image and a second image, and the second image including a plurality of labeled images and a plurality of unlabeled images; training the position detection learner with the first image to obtain a first pre-trained network; adjusting the first pre-trained network based on the plurality of labeled images to obtain a first labeling network; inputting the plurality of unlabeled images into the first labeling network to obtain output images and a predicted probability value for the drug contained in each output image; adjusting the first labeling network with the output images whose predicted probability values are greater than a preset threshold to obtain a second labeling network; and calculating a first loss value of the second labeling network and adjusting the second labeling network multiple times based on the first loss value, stopping once the first loss value has dropped to its minimum, to obtain the position detection network.

According to an optional embodiment of the present application, before the plurality of text feature matrices are generated according to the plurality of target images and the text recognition network, the drug image recognition method further includes: acquiring a text image and a text recognition learner, the text image including a third image and a fourth image; training the text recognition learner with the third image to obtain a second pre-trained network, wherein the second pre-trained network includes a convolutional neural network model and a recurrent neural network model; and calculating a second loss value of the second pre-trained network, performing backpropagation with the second loss value, and adjusting the parameters of the second pre-trained network multiple times, stopping once the second pre-trained network has converged, to obtain the text recognition network.

According to an optional embodiment of the present application, generating the plurality of text feature matrices according to the plurality of target images and the text recognition network includes: performing color conversion on each target image to obtain a plurality of grayscale text images; binarizing each grayscale text image to obtain a plurality of binarized images; filtering each binarized image to obtain a plurality of filtered images; locating the drug text in each filtered image to obtain a text position; selecting a text image from each target image according to the text position; inputting each text image into the convolutional neural network model for feature extraction to obtain a feature sequence; and inputting the feature sequence into the recurrent neural network model to obtain the plurality of text feature matrices.

According to an optional embodiment of the present application, before the plurality of target images are input into the category recognition network to obtain the plurality of image feature matrices, the drug image recognition method further includes: acquiring a category recognition learner that uses the soft-max function as its activation function; and calculating a third loss value of the category recognition learner and adjusting the category recognition learner based on the third loss value, stopping once the third loss value has dropped to its minimum, then deleting the activation function from the adjusted category recognition learner to obtain the category recognition network.

According to an optional embodiment of the present application, calculating the third loss value of the category recognition learner includes: acquiring a plurality of category images, the plurality of category images covering a plurality of categories; and performing augmentation on the plurality of category images to obtain a plurality of augmented images. The third loss value is determined by a formula whose symbols are defined as follows: the left-hand side of the formula is the third loss value; $2N$ is the number of augmented images; $i$ denotes the $i$-th augmented image; $y_i$ is the category of the $i$-th augmented image; $j$ denotes the $j$-th augmented image among the augmented images of the same category as $i$; $y_j$ is the category of the $j$-th augmented image; $N_{y_i}$ is the number of all augmented images of the same category as $i$; $\mathbb{1}_{i \neq j}$ is a first indicator function, equal to 0 if and only if $i = j$ and equal to 1 when $i \neq j$; $\mathbb{1}_{y_i = y_j}$ is a second indicator function, equal to 1 if and only if $y_i = y_j$ and equal to 0 when $y_i \neq y_j$; $\mathbb{1}_{i \neq k}$ is a third indicator function, equal to 0 if and only if $i = k$ and equal to 1 when $i \neq k$; $z_i$ is the unit vector obtained by inputting $i$ into the category recognition network; $z_j$ is the unit vector obtained by inputting $j$ into the category recognition network; $k$ denotes any augmented image other than $i$; $z_k$ is the unit vector obtained by inputting $k$ into the category recognition network; and $\tau$ is a preset scalar adjustment parameter.
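The symbol definitions for the third loss value match the general form of a supervised contrastive loss. A hedged reconstruction consistent with those symbols is sketched below. The loss symbols $\mathcal{L}$ and $\mathcal{L}_i$, the summation limits, the dot product between unit vectors, and the normalising factor $1/(N_{y_i} - 1)$ are assumptions rather than content confirmed by the source, and the second indicator is taken to equal 1 when the two categories match.

```latex
\mathcal{L} = \sum_{i=1}^{2N} \mathcal{L}_i,
\qquad
\mathcal{L}_i = \frac{-1}{N_{y_i} - 1} \sum_{j=1}^{2N}
  \mathbb{1}_{i \neq j} \cdot \mathbb{1}_{y_i = y_j} \cdot
  \log \frac{\exp(z_i \cdot z_j / \tau)}
            {\sum_{k=1}^{2N} \mathbb{1}_{i \neq k} \cdot \exp(z_i \cdot z_k / \tau)}
```

Under this reading, each augmented image $i$ is pulled toward the other augmented images of its own category and pushed away from all remaining augmented images, and a smaller $\tau$ sharpens that contrast.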

According to an optional embodiment of the present application, generating the reference matrix according to each image feature matrix and the corresponding text feature matrix includes: adding each image feature matrix to the corresponding text feature matrix to obtain the reference matrix, wherein each image feature matrix and the corresponding text feature matrix have the same number of rows and the same number of columns.
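The addition step can be illustrated with a minimal Python sketch. The function name `add_feature_matrices` and the toy values are hypothetical; only the element-wise sum and the equal-shape requirement come from the text.

```python
def add_feature_matrices(image_features, text_features):
    """Element-wise sum of two equally shaped matrices given as lists of rows."""
    if len(image_features) != len(text_features):
        raise ValueError("row counts differ")
    reference = []
    for image_row, text_row in zip(image_features, text_features):
        if len(image_row) != len(text_row):
            raise ValueError("column counts differ")
        reference.append([a + b for a, b in zip(image_row, text_row)])
    return reference


# Toy 2x2 feature matrices for one target image (illustrative values only).
image_matrix = [[0.5, 1.0], [2.0, 0.0]]
text_matrix = [[0.25, 0.0], [1.0, 3.0]]
reference_matrix = add_feature_matrices(image_matrix, text_matrix)
print(reference_matrix)  # [[0.75, 1.0], [3.0, 3.0]]
```

The shape check mirrors the stated condition that both matrices have the same number of rows and columns; without it the sum is undefined.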

According to an optional embodiment of the present application, generating the recognition result for the drug image to be tested according to the similarity between the test matrix and each reference matrix includes: calculating the similarity between the test matrix and each reference matrix; determining the reference matrix with the greatest similarity as a target matrix; and mapping the target matrix based on a preset label mapping table to obtain the recognition result.
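A minimal Python sketch of this matching step follows. This passage does not say which similarity measure is used, so cosine similarity over the flattened matrices is an assumption, as are the function names and the representation of the label mapping table as a dictionary keyed by reference index.

```python
import math


def cosine_similarity(matrix_a, matrix_b):
    """Cosine similarity between two equally shaped matrices, flattened to vectors."""
    flat_a = [v for row in matrix_a for v in row]
    flat_b = [v for row in matrix_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("matrices must have the same number of elements")
    dot = sum(a * b for a, b in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(a * a for a in flat_a))
    norm_b = math.sqrt(sum(b * b for b in flat_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(test_matrix, reference_matrices, label_table):
    """Return the label of the most similar reference matrix and its similarity."""
    similarities = [cosine_similarity(test_matrix, ref) for ref in reference_matrices]
    best_index = max(range(len(similarities)), key=similarities.__getitem__)
    return label_table[best_index], similarities[best_index]


# Hypothetical reference matrices and label mapping table.
references = [[[1.0, 0.0], [0.0, 1.0]], [[0.0, 2.0], [2.0, 0.0]]]
labels = {0: "amoxicillin capsule", 1: "clarithromycin dispersible tablet"}
label, score = identify([[0.1, 1.9], [2.1, 0.0]], references, labels)
print(label)  # clarithromycin dispersible tablet
```

Any other similarity measure could be substituted for `cosine_similarity` without changing the selection and mapping logic.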

The present application provides a computer device, which includes: a memory storing at least one instruction; and a processor executing the at least one instruction to implement the drug image recognition method.

The present application provides a computer-readable storage medium in which at least one instruction is stored, the at least one instruction being executed by a processor in a computer device to implement the drug image recognition method.

As can be seen from the above technical solutions, the position detection network is obtained by adjusting the first pre-trained network multiple times with the plurality of labeled images. Because the position detection network has learned the features of the labeled images, it can mark drug positions in the image to be tested. When the category recognition network is trained, the plurality of category images are augmented, which avoids the problem of too little training data. A plurality of reference matrices are generated from each image feature matrix and the corresponding text feature matrix, the test matrix is generated from the image to be tested in the same way, the similarity between the test matrix and each reference matrix is calculated, and the label information of the reference matrix with the greatest similarity is taken as the recognition result. Because the test matrix contains both the text features and the image features of the image to be tested, it reflects the features of that image comprehensively, which makes the recognition result more accurate.

1: Computer device

2: Camera device

12: Memory

13: Processor

S10~S17: Steps

FIG. 1 is an application environment diagram of a preferred embodiment of the drug image recognition method of the present application.

FIG. 2 is a flowchart of a preferred embodiment of the drug image recognition method of the present application.

FIG. 3 is a schematic structural diagram of a computer device implementing a preferred embodiment of the drug image recognition method of the present application.

To make the purpose, technical solutions, and advantages of the present application clearer, the present application is described in detail below with reference to the accompanying drawings and specific embodiments.

FIG. 1 is an application environment diagram of a preferred embodiment of a drug image recognition method of the present application. The drug image recognition method can be applied to one or more computer devices 1. The computer device 1 communicates with a camera device 2, which may be a camera or any other device capable of capturing images; for example, the camera device 2 can photograph a drug to be tested to obtain a drug image to be tested. The drug to be tested may be a capsule or a tablet, for example an amoxicillin capsule or a clarithromycin dispersible tablet.

The computer device 1 is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions. Its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), and an embedded device.

The computer device 1 may be any computer product capable of human-computer interaction with a user, for example a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an Internet Protocol television (IPTV), or a smart wearable device.

The computer device 1 may also include a network device and/or user equipment. The network device includes, but is not limited to, a single network server, a server group composed of multiple network servers, or a cloud-computing-based cloud composed of a large number of hosts or network servers.

The network in which the computer device 1 is located includes, but is not limited to, the Internet, a wide area network, a metropolitan area network, a local area network, and a virtual private network (VPN).

FIG. 2 is a flowchart of a preferred embodiment of a drug image recognition method of the present application. The order of the steps in the flowchart can be adjusted according to actual detection requirements, and some steps can be omitted. The method is executed by a computer device, for example the computer device 1 shown in FIG. 1.

Step S10: acquire a plurality of drug images and a drug image to be tested.

In at least one embodiment of the present application, the plurality of drug images are drug images that carry label information, and they can be used to generate reference matrices.

The label information may include, but is not limited to, the name of the drug, the category of the drug, and the usage of the drug.

In at least one embodiment of the present application, the drug image to be tested is a drug image that does not carry the label information. Drug text is present on the surface of the drug to be tested shown in that image, and the drug text may consist of letters and numbers.

In at least one embodiment of the present application, the computer device acquires the plurality of drug images, and the label information corresponding to each drug image, from a preset target database.

In at least one embodiment of the present application, the computer device controls the camera device 2 to photograph the drug to be tested, obtaining the drug image to be tested.

The camera device 2 may be a camera.

Step S11: acquire a pre-trained drug detection model, the drug detection model including a position detection network, a text recognition network, and a category recognition network.

In at least one embodiment of the present application, the drug detection model is a network model that detects the position of the drug in the image to be tested and in each drug image.

In at least one embodiment of the present application, the position detection network is used to select the image of a single drug from each drug image.

In at least one embodiment of the present application, the text recognition network can be used to obtain the text information in each drug image.

In at least one embodiment of the present application, the category recognition network can be used to identify the category of the drug in each drug image.

Step S12: input the plurality of drug images into the position detection network to obtain a plurality of target images, each target image containing an image of a single drug.

In at least one embodiment of the present application, a target image is an image containing a single drug that has been selected from a drug image.

In at least one embodiment of the present application, before the plurality of drug images are input into the position detection network to obtain the plurality of target images, the drug image recognition method further includes the following. The computer device acquires a position detection learner and a position image, the position image including a first image and a second image, and the second image including a plurality of labeled images and a plurality of unlabeled images. The computer device trains the position detection learner with the first image to obtain a first pre-trained network, and adjusts the first pre-trained network based on the plurality of labeled images to obtain a first labeling network. The computer device inputs the plurality of unlabeled images into the first labeling network to obtain output images and a predicted probability value for the drug contained in each output image, and adjusts the first labeling network with the output images whose predicted probability values are greater than a preset threshold to obtain a second labeling network. The computer device then calculates a first loss value of the second labeling network and adjusts the second labeling network multiple times based on the first loss value, stopping once the first loss value has dropped to its minimum, to obtain the position detection network.

Specifically, adjusting the second labeling network multiple times based on the first loss value until the first loss value has dropped to its minimum, to obtain the position detection network, includes: the computer device inputs the output images whose predicted probability values are greater than the preset threshold into the first labeling network for training, and iteratively updates the weights of the first labeling network until the first labeling network converges, obtaining the position detection network.

The position detection learner may be the object detector efficientDet, and it can be used to accurately locate a single drug in each drug image. The first image refers to images obtained from a preset first database and may contain arbitrary objects; the first database may be, for example, the COCO database, the ImageNet database, or the CPTN database. The first image includes images of many kinds of objects, such as animals (for example, puppies and kittens) and plants (for example, flowers and trees). The second image refers to images that contain drugs and can be obtained from a preset second database. The labeled images are images in which the position of the drug has already been marked, and the unlabeled images are images in which the position of the drug has not been marked.

An output image is an image that may contain a single drug; the output images can be used to adjust the first labeling network multiple times.

The predicted probability value is the probability that an output image contains a single drug.

The preset threshold can be set as needed and is not limited by the present application.

The first pre-trained network is the network obtained by pre-training with the first image.

The first labeling network is the network obtained by retraining the first pre-trained network on the plurality of labeled images; it can be used to label features in the plurality of unlabeled images.
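The threshold-based selection of output images can be sketched in a few lines of Python. The record layout (`image_id`, `box`, `probability`), the function name, and the threshold value 0.8 are hypothetical; the text only states that outputs whose predicted probability exceeds a preset threshold are used to adjust the network fine-tuned on the labeled images.

```python
def select_confident_outputs(outputs, threshold):
    """Keep the outputs whose predicted probability is greater than the threshold."""
    return [out for out in outputs if out["probability"] > threshold]


# Hypothetical predictions produced for three unlabeled images.
predictions = [
    {"image_id": "u1", "box": (10, 12, 64, 60), "probability": 0.97},
    {"image_id": "u2", "box": (5, 8, 40, 44), "probability": 0.42},
    {"image_id": "u3", "box": (30, 30, 90, 88), "probability": 0.88},
]
pseudo_labeled = select_confident_outputs(predictions, threshold=0.8)
selected_ids = [p["image_id"] for p in pseudo_labeled]
print(selected_ids)  # ['u1', 'u3']
```

Only the retained outputs would then be fed back as additional training examples; a higher threshold trades fewer pseudo-labels for fewer wrong ones.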

The first loss value is calculated as $FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$, where $FL(p_t)$ is the first loss value, $p_t$ is the predicted probability value, $\alpha_t \in [0, 1]$, and $\gamma \geq 0$.
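The formula for the first loss value has the form of a focal loss. A minimal Python sketch for a single prediction follows; the function name `focal_loss` and the default values `alpha_t=0.25` and `gamma=2.0` are illustrative assumptions, not values given in the source.

```python
import math


def focal_loss(p_t, alpha_t=0.25, gamma=2.0):
    """FL(p_t) = -alpha_t * (1 - p_t)**gamma * log(p_t) for one prediction."""
    if not 0.0 < p_t <= 1.0:
        raise ValueError("p_t must lie in (0, 1]")
    if not 0.0 <= alpha_t <= 1.0 or gamma < 0.0:
        raise ValueError("alpha_t must lie in [0, 1] and gamma must be >= 0")
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = focal_loss(0.9)  # confident, correct prediction
hard = focal_loss(0.1)  # poorly predicted example
print(easy < hard)  # True
```

With `gamma=0.0` and `alpha_t=1.0` the expression reduces to the ordinary negative log-likelihood, so the factor $(1 - p_t)^{\gamma}$ is what down-weights examples the network already predicts well.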

Through the above embodiment, the second labeling network with the smallest first loss value can be selected as the position detection network. Because the second labeling network has learned the features of the plurality of labeled images, the position detection network can accurately mark each drug image.

Step S13: generate a plurality of text feature matrices according to the plurality of target images and the text recognition network.

In at least one embodiment of the present application, the plurality of text feature matrices are matrices containing the text features of the plurality of target images, and each text feature matrix can be used to generate a reference matrix.

In at least one embodiment of the present application, before the plurality of text feature matrices are generated according to the plurality of target images and the text recognition network, the drug image recognition method further includes the following. The computer device acquires a text image and a text recognition learner, the text image including a third image and a fourth image, and trains the text recognition learner with the third image to obtain a second pre-trained network, wherein the second pre-trained network includes a convolutional neural network model and a recurrent neural network model. The computer device calculates a second loss value of the second pre-trained network, performs backpropagation with the second loss value, and adjusts the parameters of the second pre-trained network multiple times, stopping once the second pre-trained network has converged, to obtain the text recognition network.

The text recognition learner is a learner that recognizes the text in each drug image. The text image is an image used to train the text recognition learner and contains text carried on any drug.

The convolutional neural network model may be a VGG16 network and can be used to extract the text features of the fourth image. The recurrent neural network model may be a long short-term memory (LSTM) network and can be used to extract temporal information from the text features.

The third image can be used to train the weights of the text recognition learner. The fourth image is an image containing any drug text, which may include, but is not limited to, letters and numbers. The third image can be obtained from the first database, and the fourth image can be obtained from a preset third database that stores a plurality of images of drug text.

The loss function used by the second pre-trained network may be the connectionist temporal classification (CTC) loss function.

Through the above embodiment, the second pre-trained network corresponding to the lowest second loss value is selected as the text recognition network, which improves the reliability and precision of the text recognition network and enables it to accurately extract the text features of the target images.

In at least one embodiment of the present application, generating, by the computer device, the plurality of text feature matrices according to the plurality of target images and the text recognition network includes the following. The computer device performs color conversion on each target image to obtain a plurality of grayscale text images, binarizes each grayscale text image to obtain a plurality of binarized images, and filters each binarized image to obtain a plurality of filtered images. The computer device then locates the drug text in each filtered image to obtain a text position and selects a text image from each target image according to the text position. Finally, the computer device inputs each text image into the convolutional neural network model for feature extraction to obtain a feature sequence, and inputs the feature sequence into the recurrent neural network model to obtain the plurality of text feature matrices.

The feature sequence refers to the features that the convolutional neural network model extracts from each filtered image, and the plurality of text feature matrices refer to the features that the recurrent neural network model extracts from the feature sequence.

Through the above embodiment, processing each target image by color conversion, binarization, and filtering yields clearer filtered images, from which the text features can be obtained accurately; this facilitates generation of the text feature matrices.
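The preprocessing chain (color conversion, binarization, filtering, text localization, cropping) can be sketched in pure Python on a tiny synthetic image. The source does not specify which color conversion, threshold, or filter is used, so the luminance weights, the fixed threshold of 128, the 3x3 majority filter, the bounding-box localization, and all function names are assumptions made for illustration.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) pixels to luminance values."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in rgb_image]


def binarize(gray_image, threshold=128):
    """Mark dark pixels (assumed to be text) as 1 and bright pixels as 0."""
    return [[1 if value < threshold else 0 for value in row] for row in gray_image]


def majority_filter(binary_image):
    """3x3 majority filter: a pixel stays 1 only if most of its window is 1."""
    height, width = len(binary_image), len(binary_image[0])
    filtered = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [binary_image[yy][xx]
                      for yy in range(max(0, y - 1), min(height, y + 2))
                      for xx in range(max(0, x - 1), min(width, x + 2))]
            filtered[y][x] = 1 if 2 * sum(window) > len(window) else 0
    return filtered


def locate_text(binary_image):
    """Bounding box (top, left, bottom, right), inclusive, of all foreground pixels."""
    points = [(y, x) for y, row in enumerate(binary_image)
              for x, value in enumerate(row) if value]
    if not points:
        return None
    rows = [y for y, _ in points]
    cols = [x for _, x in points]
    return min(rows), min(cols), max(rows), max(cols)


def crop(image, box):
    """Cut the boxed region out of the original target image."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in image[top:bottom + 1]]


# Synthetic 7x7 target image: a dark 3x3 "text" patch plus one noise pixel.
WHITE, DARK = (255, 255, 255), (20, 20, 20)
target = [[WHITE] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(2, 5):
        target[y][x] = DARK
target[6][6] = DARK  # isolated noise pixel

filtered = majority_filter(binarize(to_grayscale(target)))
text_box = locate_text(filtered)
text_image = crop(target, text_box)
print(text_box)  # (2, 2, 4, 4)
```

The filter removes the isolated noise pixel before localization, so the bounding box covers only the text patch; the crop is taken from the original target image, as the text describes, rather than from the filtered image.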

Step S14: input the plurality of target images into the category recognition network to obtain a plurality of image feature matrices.

In at least one embodiment of the present application, the plurality of image feature matrices refers to matrices containing the image features of the plurality of target images.

In at least one embodiment of the present application, before inputting the plurality of target images into the category recognition network to obtain the plurality of image feature matrices, the drug image recognition method further includes: the computer device acquires a category recognition learner that uses a soft-max function as its activation function; the computer device calculates a third loss value of the category recognition learner and adjusts the category recognition learner based on the third loss value until the third loss value drops to its minimum, at which point the adjustment stops; and the computer device deletes the activation function from the adjusted category recognition learner to obtain the category recognition network.

Specifically, acquiring the category recognition learner that uses the soft-max function as its activation function includes: the computer device builds the category recognition learner on the resnet50 network and uses soft-max as the activation function.
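The idea of training with a soft-max activation and then deleting it can be illustrated with a toy model. The sketch below is not the resnet50-based learner itself: `TinyLearner`, its single linear layer, and `to_recognition_network` are hypothetical names introduced only to show that, once the activation is removed, the network outputs raw feature values instead of class probabilities.

```python
import math


def softmax(scores):
    """Numerically stable soft-max over a list of scores."""
    peak = max(scores)
    exps = [math.exp(v - peak) for v in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLearner:
    """Hypothetical stand-in for the learner: one linear layer plus soft-max."""

    def __init__(self, weights):
        self.weights = weights
        self.activation = softmax  # present while the learner is being adjusted

    def forward(self, features):
        scores = [sum(w * x for w, x in zip(row, features))
                  for row in self.weights]
        return self.activation(scores) if self.activation else scores


def to_recognition_network(learner):
    """Delete the activation so the network emits raw feature values."""
    learner.activation = None
    return learner
```

With the activation in place, `forward` returns probabilities that sum to 1; after `to_recognition_network`, the same call returns the unnormalised scores, which is the kind of output a feature matrix would be built from.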

In at least one embodiment of the present application, the computer device calculating the third loss value of the category recognition learner includes: the computer device acquires a plurality of category images covering a plurality of categories and performs augmentation processing on the category images to obtain a plurality of augmented images. The augmented images exist in pairs; each pair contains a first augmented image and a second augmented image, both derived from the same category image.

The plurality of category images refers to the output images whose predicted probability values are greater than a preset threshold.

The augmentation processing refers to applying transformations such as rotation and cropping to each category image to obtain the augmented images.

The plurality of categories includes, but is not limited to, antibiotics, vitamins, and the like.
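The paired augmentation described above can be sketched in a few lines. The choice of a 90-degree rotation for the first view and a one-pixel center crop for the second is an assumption made for illustration; the text only says that transformations such as rotation and cropping are used. The function names are hypothetical.

```python
def rotate90(image):
    """Rotate a 2-D list of pixels 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def center_crop(image, margin=1):
    """Remove `margin` pixels from every side of the image."""
    return [row[margin:len(row) - margin]
            for row in image[margin:len(image) - margin]]


def augment_in_pairs(category_images):
    """Turn N (image, category) items into N pairs, i.e. 2N augmented images.

    Both members of a pair come from the same category image and therefore
    carry the same category.
    """
    pairs = []
    for image, category in category_images:
        first = (rotate90(image), category)      # first augmented image
        second = (center_crop(image), category)  # second augmented image
        pairs.append((first, second))
    return pairs
```

Flattening the returned pairs gives the 2N augmented images over which the third loss value is computed.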

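The third loss value is determined by the following formula:

[formula not reproduced in the source text]

where the left-hand side of the formula denotes the third loss value; 2N is the number of augmented images; i denotes the i-th of the augmented images; y_i is the category of the i-th augmented image; j denotes the j-th augmented image among the augmented images of the same category as i; y_j is the category of the j-th augmented image; N_{y_i} is the number of all augmented images of the same category as i; 𝟙_{i≠j} is the first indicator function, which is zero if and only if i = j and 1 when i ≠ j; 𝟙_{y_i=y_j} is the second indicator function, which is zero if and only if y_i = y_j and 1 when y_i ≠ y_j; 𝟙_{i≠k} is the third indicator function, which is zero if and only if i = k and 1 when i ≠ k; z_i is the unit vector obtained by inputting i into the category recognition network; z_j is the unit vector obtained by inputting j into the category recognition network; k denotes any augmented image other than i; z_k is the unit vector obtained by inputting k into the category recognition network; and τ is a preset scalar adjustment parameter.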
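The symbols defined for the third loss value (2N augmented images, three indicator functions, unit vectors z, and a scalar τ) match the general shape of a supervised contrastive loss. Since the formula itself does not survive in this text, the following is only a sketch of one loss consistent with those symbols, not a statement of the applicant's exact formula. Two assumptions are made: the normaliser is taken as 1/(N_{y_i} − 1), and the second indicator is read as 1 when the categories match (the usual reading of its subscript, although the text states the opposite):

$$\mathcal{L} = \sum_{i=1}^{2N} \frac{-1}{N_{y_i}-1} \sum_{j=1}^{2N} \mathbb{1}_{i \neq j}\, \mathbb{1}_{y_i = y_j} \log \frac{\exp(z_i \cdot z_j / \tau)}{\sum_{k=1}^{2N} \mathbb{1}_{i \neq k} \exp(z_i \cdot z_k / \tau)}$$

```python
import math


def _unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_loss(embeddings, categories, tau=0.1):
    """Assumed supervised-contrastive form of the third loss value.

    embeddings: one vector per augmented image (2N in total)
    categories: the category of each augmented image
    tau:        the scalar adjustment parameter (0.1 is an arbitrary default)
    """
    z = [_unit(v) for v in embeddings]
    n = len(z)
    total = 0.0
    for i in range(n):
        # j != i with the same category as i
        same = [j for j in range(n) if j != i and categories[j] == categories[i]]
        if not same:
            continue  # no partner of the same category, so no term for i
        # denominator runs over every k != i
        denom = sum(math.exp(_dot(z[i], z[k]) / tau) for k in range(n) if k != i)
        log_terms = [_dot(z[i], z[j]) / tau - math.log(denom) for j in same]
        total += -sum(log_terms) / len(same)  # len(same) equals N_{y_i} - 1
    return total
```

Under this reading the loss is small when images of the same category map to nearby unit vectors and large when they are spread apart, which is the behaviour a training signal for the category recognition learner would need.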

Through the above implementation, augmenting the category images expands the training data, and training the category recognition network on more data can improve its recognition accuracy.

Step S15: generate a reference matrix according to each image feature matrix and the corresponding text feature matrix.

In at least one embodiment of the present application, the reference matrix refers to a matrix containing the image features and text features of the plurality of target images, and it can be used to indicate the label information of the drug image to be tested.

In at least one embodiment of the present application, the computer device generating the reference matrix according to each image feature matrix and the corresponding text feature matrix includes: the computer device adds each image feature matrix to the corresponding text feature matrix to obtain the reference matrix, where each image feature matrix and its corresponding text feature matrix have the same number of rows and columns.

In at least one embodiment of the present application, the reference matrix can also be generated in other ways. For example, the computer device multiplies each image feature matrix by the corresponding text feature matrix to obtain the reference matrix, or the computer device performs a subtraction on each image feature matrix and the corresponding text feature matrix to obtain the reference matrix.

Through the above implementation, the image features and text features of the drug in each drug image can be extracted, and a reference matrix carrying both the image features and the text features can be generated.
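The combination step can be sketched as an element-wise operation on two equally shaped matrices. Addition is the operation named in the main embodiment; treating the multiplication and subtraction alternatives as element-wise as well is an assumption, because the text does not say whether multiplication means an element-wise or a matrix product. The function name `combine` is hypothetical.

```python
import operator


def combine(image_matrix, text_matrix, op=operator.add):
    """Combine an image feature matrix with its text feature matrix.

    Both matrices must have the same number of rows and columns.
    op defaults to addition; operator.mul or operator.sub give the
    alternative embodiments (assumed element-wise).
    """
    same_shape = len(image_matrix) == len(text_matrix) and all(
        len(r1) == len(r2) for r1, r2 in zip(image_matrix, text_matrix))
    if not same_shape:
        raise ValueError("matrices must have the same number of rows and columns")
    return [[op(a, b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(image_matrix, text_matrix)]
```

Applying the same function to the features of the image to be tested produces a test matrix with the same shape as every reference matrix, which is what later makes the similarity comparison straightforward.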

Step S16: process the drug image to be tested based on the drug detection model to obtain a test matrix.

In at least one embodiment of the present application, the test matrix refers to a matrix containing the image features and text features of the image to be tested.

Because the test matrix is generated in the same way as the reference matrix, the process is not repeated here.

Through the above implementation, the image to be tested is processed in the same way as the plurality of drug images to obtain the test matrix, so that the test matrix and every reference matrix have the same number of rows and columns. This makes it easier to calculate the similarity between the test matrix and each reference matrix.

Step S17: generate a recognition result for the drug image to be tested according to the similarity between the test matrix and each reference matrix.

In at least one embodiment of the present application, the recognition result refers to the label information corresponding to the drug to be tested.

In at least one embodiment of the present application, the computer device generating the recognition result of the drug image to be tested according to the similarity between the test matrix and each reference matrix includes: the computer device calculates the similarity between the test matrix and each reference matrix, determines the reference matrix with the greatest similarity as the target matrix, and maps the target matrix through a preset label mapping table to obtain the recognition result.

The preset label mapping table is a table that maps each reference matrix to its label information; the reference matrices and the label information in the table correspond one to one.

The similarity may include, but is not limited to, cosine similarity and Euclidean distance.

The cosine similarity is calculated as:

$$\mathrm{cosine} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^{2}}\,\sqrt{\sum_{i=1}^{n} B_i^{2}}}$$

where cosine is the cosine similarity, n is the number of elements in the test matrix and in any one reference matrix, i indexes the i-th element of the test matrix and of that reference matrix, A_i is the i-th element of the test matrix, and B_i is the i-th element of that reference matrix.

Specifically, the computer device mapping the target matrix through the preset label mapping table to obtain the recognition result includes: the computer device determines, according to the target matrix, the label information corresponding to the drug to be tested in the preset label mapping table, and uses that label information as the recognition result.

Through the above implementation, the reference matrix with the greatest similarity is selected as the target matrix, so the drug image corresponding to the target matrix is the one most similar to the drug image to be tested, and the label information of that drug image is used as the label information of the drug to be tested. Because target matrices and label information correspond one to one, the label information of the drug to be tested can be found quickly, which improves the efficiency of drug image recognition.
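Step S17 can be sketched as a cosine-similarity search followed by a table lookup. The sketch assumes that matrices are flattened row by row before comparison and that the preset label mapping table is a dictionary keyed by the index of each reference matrix; both are illustrative choices, and the labels used in any call are placeholders rather than data from this application.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equally shaped matrices, flattened row by row."""
    flat_a = [x for row in a for x in row]
    flat_b = [x for row in b for x in row]
    dot = sum(x * y for x, y in zip(flat_a, flat_b))
    norm_a = math.sqrt(sum(x * x for x in flat_a))
    norm_b = math.sqrt(sum(y * y for y in flat_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an all-zero matrix is treated as dissimilar
    return dot / (norm_a * norm_b)


def identify(test_matrix, reference_matrices, label_table):
    """Return the label mapped to the reference matrix most similar to the test matrix.

    label_table: dict from reference-matrix index to label information
                 (a hypothetical encoding of the one-to-one mapping table).
    """
    scores = [cosine_similarity(test_matrix, ref) for ref in reference_matrices]
    target_index = max(range(len(scores)), key=scores.__getitem__)
    return label_table[target_index]
```

If Euclidean distance were used instead, the target matrix would be the one with the smallest distance rather than the largest score, so the `max` would become a `min`.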

As can be seen from the above technical solutions, the position detection network is obtained by adjusting the second pre-training network multiple times with the plurality of marked images. Because the position detection network has learned the features of the marked images, it can mark positions in the image to be tested. When the category recognition network is trained, the category images are augmented, which avoids the problem of having too little training data. A plurality of reference matrices is generated from each image feature matrix and the corresponding text feature matrix, the test matrix is generated from the image to be tested in the same way, the similarity between the test matrix and each reference matrix is calculated, and the label information of the reference matrix with the greatest similarity is selected as the recognition result. Because the test matrix contains both the text features and the image features of the image to be tested, it reflects the features of that image comprehensively, which makes the recognition result more accurate.

FIG. 3 is a schematic structural diagram of a computer device of a preferred embodiment for implementing the drug image recognition method of the present application.

In one embodiment of the present application, the computer device 1 includes, but is not limited to, a memory 12, a processor 13, and a computer program, such as a drug image recognition program, that is stored in the memory 12 and can run on the processor 13.

Those skilled in the art will understand that the schematic diagram is only an example of the computer device 1 and does not limit it. The computer device 1 may include more or fewer components than shown, may combine certain components, or may use different components. For example, the computer device 1 may also include input/output devices, network access devices, buses, and the like.

The processor 13 may be a central processing unit (CPU), or it may be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor. The processor 13 is the computing core and control center of the computer device 1; it connects the parts of the computer device 1 through various interfaces and lines, and it executes the operating system of the computer device 1 as well as the installed applications, program code, and so on. For example, the processor 13 may acquire, through an interface, the drug image to be tested captured by the camera device 2.

The processor 13 executes the operating system of the computer device 1 and the installed applications. The processor 13 executes the application program to implement the steps of the drug image recognition method embodiments described above, for example the steps shown in FIG. 2.

For example, the computer program may be divided into one or more modules/units, which are stored in the memory 12 and executed by the processor 13 to carry out the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, and the instruction segments describe how the computer program is executed in the computer device 1.

The memory 12 may be used to store the computer program and/or modules. The processor 13 implements the various functions of the computer device 1 by running or executing the computer program and/or modules stored in the memory 12 and by calling the data stored in the memory 12. The memory 12 may mainly include a program storage area and a data storage area: the program storage area may store the operating system and the application programs required by at least one function (such as a sound playback function or an image playback function), and the data storage area may store data created through use of the computer device. In addition, the memory 12 may include non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.

The memory 12 may be an external memory and/or an internal memory of the computer device 1. Further, the memory 12 may be a memory in physical form, such as a memory stick or a TF card (Trans-flash Card).

If the modules/units integrated in the computer device 1 are implemented as software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. On this understanding, all or part of the processes in the methods of the above embodiments may also be carried out by a computer program that instructs the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above.

The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, or a read-only memory (ROM).

With reference to FIG. 2, the memory 12 in the computer device 1 stores a plurality of instructions for implementing a drug image recognition method, and the processor 13 can execute the instructions to: acquire a plurality of drug images and a drug image to be tested; acquire a pre-trained drug detection model, the drug detection model including a position detection network, a text recognition network, and a category recognition network; input the plurality of drug images into the position detection network to obtain a plurality of target images, each target image containing an image of a single drug; generate a plurality of text feature matrices according to the plurality of target images and the text recognition network; input the plurality of target images into the category recognition network to obtain a plurality of image feature matrices; generate a reference matrix according to each image feature matrix and the corresponding text feature matrix; process the drug image to be tested based on the drug detection model to obtain a test matrix; and generate a recognition result for the drug image to be tested according to the similarity between the test matrix and each reference matrix.

Specifically, for how the processor 13 implements the above instructions, refer to the description of the relevant steps in the embodiment corresponding to FIG. 2; the details are not repeated here.

In the several embodiments provided in the present application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are only illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.

The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional modules in the embodiments of the present application may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in hardware, or in hardware combined with software functional modules.

Therefore, the embodiments should in every respect be regarded as exemplary and non-restrictive. The scope of the present application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are intended to be embraced by the present application. Any reference sign in a claim shall not be construed as limiting the claim concerned.

In addition, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. A plurality of units or devices stated in the present application may also be implemented by a single unit or device through software or hardware. Words such as first and second are used to denote names and do not indicate any particular order.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application and not to limit them. Although the present application has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present application may be modified or replaced with equivalents without departing from their spirit and scope.

S10~S17: Steps

Claims

A drug image recognition method applied to a computer device, the drug image recognition method comprising: acquiring a plurality of drug images and a drug image to be tested; acquiring a pre-trained drug detection model, the drug detection model including a position detection network, a text recognition network, and a category recognition network, wherein acquiring the category recognition network includes: acquiring a category recognition learner in which a soft-max function is used as the activation function; and calculating a third loss value of the category recognition learner, including: acquiring a plurality of category images covering a plurality of categories, and performing augmentation processing on the plurality of category images to obtain a plurality of augmented images, the third loss value being determined by the formula:

[formula not reproduced in the source text]

where the left-hand side of the formula denotes the third loss value, 2N is the number of augmented images, i denotes the i-th of the augmented images, y_i is the category of the i-th augmented image, j denotes the j-th augmented image among the augmented images of the same category as i, y_j is the category of the j-th augmented image, N_{y_i} is the number of all augmented images of the same category as i, 𝟙_{i≠j} is the first indicator function, which is zero if and only if i = j and 1 when i ≠ j, 𝟙_{y_i=y_j} is the second indicator function, which is zero if and only if y_i = y_j and 1 when y_i ≠ y_j, 𝟙_{i≠k} is the third indicator function, which is zero if and only if i = k and 1 when i ≠ k, z_i is the unit vector obtained by inputting i into the category recognition network, z_j is the unit vector obtained by inputting j into the category recognition network, k denotes any augmented image other than i, z_k is the unit vector obtained by inputting k into the category recognition network, and τ is a preset scalar adjustment parameter; adjusting the category recognition learner based on the third loss value until the third loss value drops to its minimum and then stopping the adjustment; and deleting the activation function from the adjusted category recognition learner to obtain the category recognition network; inputting the plurality of drug images into the position detection network to obtain a plurality of target images, each target image containing an image of a single drug; generating a plurality of text feature matrices according to the plurality of target images and the text recognition network; inputting the plurality of target images into the category recognition network to obtain a plurality of image feature matrices; generating a reference matrix according to each image feature matrix and the corresponding text feature matrix; processing the drug image to be tested based on the drug detection model to obtain a test matrix; and generating a recognition result for the drug image to be tested according to the similarity between the test matrix and each reference matrix.

The drug image recognition method according to claim 1, wherein, before inputting the plurality of drug images into the position detection network to obtain the plurality of target images, the drug image recognition method further includes: acquiring a position detection learner and position images, the position images including a first image and a second image, the second image including a plurality of marked images and a plurality of unmarked images, wherein the first image is images of a plurality of types of items; training the position detection learner using the first image to obtain a first pre-training network; adjusting the first pre-training network to obtain a first marking network; inputting the plurality of unmarked images into the first marking network to obtain output images and, for each output image, a predicted probability value that the output image contains a drug; adjusting the first marking network using the output images whose predicted probability values are greater than a preset threshold to obtain a second marking network; and calculating a first loss value of the second marking network and adjusting the second marking network multiple times based on the first loss value until the first loss value drops to its minimum, then stopping the adjustment to obtain the position detection network.

The drug image recognition method according to claim 1, wherein, before generating the plurality of text feature matrices according to the plurality of target images and the text recognition network, the drug image recognition method further includes: acquiring text images and a text recognition learner, the text images including a third image and a fourth image, wherein the text images are text images related to any drug; training the text recognition learner using the third image to obtain a second pre-training network, wherein the second pre-training network includes a convolutional neural network model and a recurrent neural network model; and calculating a second loss value of the second pre-training network, performing backpropagation with the second loss value, and adjusting the parameters of the second pre-training network multiple times until the second pre-training network converges, then stopping the adjustment to obtain the text recognition network.

The drug image recognition method according to claim 3, wherein generating the plurality of text feature matrices according to the plurality of target images and the text recognition network includes: performing color conversion on each target image to obtain a plurality of grayscale text images; binarizing each grayscale text image to obtain a plurality of binarized images; filtering each binarized image to obtain a plurality of filtered images; locating the drug text in each filtered image to obtain a text position; cropping a text image from each target image according to the text position; inputting each text image into the convolutional neural network model for feature extraction to obtain a feature sequence; and inputting the feature sequence into the recurrent neural network model to obtain the plurality of text feature matrices.

The drug image recognition method according to claim 1, wherein the generating the reference matrix according to each image feature matrix and the corresponding text feature matrix includes: adding each image feature matrix and the corresponding text feature matrix, The reference matrix is obtained, wherein each image feature matrix and the corresponding text feature matrix have the same number of rows and columns.

The drug image recognition method according to claim 1, wherein the generating the recognition result of the drug image to be tested according to the similarity between the matrix to be tested and each reference matrix includes: calculating the matrix to be tested and the The similarity of each reference matrix; determining the reference matrix corresponding to the maximum similarity as the target matrix; performing mapping processing on the target matrix based on the preset label mapping table to obtain the identification result.

A computer device, wherein the computer device includes: a memory storing at least one instruction; and a processor that executes the at least one instruction stored in the memory to implement the drug image recognition method according to any one of claims 1 to 6.

A computer-readable storage medium, wherein at least one instruction is stored in the computer-readable storage medium, and the at least one instruction is executed by a processor in a computer device to implement the drug image recognition method according to any one of claims 1 to 6.