WO2023189734A1

WO2023189734A1 - Object identification device, object identification method, and program

Info

Publication number: WO2023189734A1
Application number: PCT/JP2023/010610
Authority: WO
Inventors: 真司羽田
Original assignee: 富士フイルム富山化学株式会社
Priority date: 2022-03-28
Filing date: 2023-03-17
Publication date: 2023-10-05

Abstract

Provided are an object identification device, an object identification method, and a program that allow for identification of the types of individual objects using an image containing a plurality of objects that can include both a type-identifiable object and a type-unidentifiable object. This object identification device comprises: a detector that detects objects from an image containing a plurality of objects on an object-by-object basis; an identifier that estimates, with respect to a type-identifiable object among the objects detected by the detector for which the type of object can be identified from the image, the type of the object from the image, and that estimates, with respect to a type-unidentifiable object for which identification of the type of object is difficult but that allows for identification of a group to which the object belongs, the group to which the object belongs from the image; and a processing unit that performs group-specific processing that leads to identification of the type with respect to the object that has been estimated as a group by the identifier.

Description

Object identification device, object identification method and program

The present disclosure relates to an object identification device, an object identification method, and a program, and particularly relates to an image recognition technique for identifying an object from an image.

Patent Document 1 discloses drug identification software that sets a drug to be differentiated in a drug imaging device, photographs the drug, and searches for the drug by referring to a database based on the data of the photographed drug image. is listed.

Patent Document 2 describes an imaging step of capturing a reflected light image and a transmitted light image of a packaging paper in which one or more tablets for one dose are packaged, and a reflected light image based on the reflected light image and the transmitted light image. The type of each tablet is determined by a cutting process in which a tablet area corresponding to the tablet inside is cut out, and by comparing the dimensions and colors of each tablet area cut out by the cutting process with model information regarding the shape and color of the tablet. At least similar tablets of different types but with similar features based on a learning model generated by performing machine learning using learning data including images of tablets. A second identification step of identifying the type of the tablet is described.

Japanese Patent Application Publication No. 2022-010060 JP2018-027242A

When identifying the drug type of a tablet, in many cases, the type of drug (drug type) can be identified using information from stamps or printed characters and/or symbols attached to the tablet. On the other hand, for drugs such as capsule drugs and half-tablets, only part of the character and symbol information used to identify the drug type is captured in the photographed image, or there are countless patterns of capsule interlocking and tablet division. For these reasons, it is difficult or impossible to identify the drug type by template matching of a photographed image with a master image or by machine learning-based methods. In other words, there are two types of drugs, such as tablets with stamps or prints, whose type can be identified from the photographed image using image recognition technology, and those such as capsules and half-tablets, which can be identified from the photographed image. There may be drugs whose type is difficult to identify. In this specification, the former is referred to as a "drug whose drug type can be identified," and the latter is referred to as a "drug whose drug type is difficult to identify." Furthermore, characters and/or symbols attached to drugs by stamping or printing are referred to as "character symbols."

For drugs whose drug type can be identified, it is expected that the drug type can be identified through image recognition based on captured images. On the other hand, in order to identify the drug type of a drug that is difficult to identify, for example, the captured image is enlarged and information on the text symbols attached to the drug is presented to the user. In some cases, the drug type can be identified by visual inspection and text or voice input, and by searching a database such as a stamped text master. As described above, for drugs whose drug type is difficult to identify, a method of presenting information useful for drug type identification to the user to support the drug type identification work is considered to be practically effective.

At the stage of photographing the drug to be identified, a system is constructed that separates drugs whose drug type can be identified and drugs whose drug type is difficult to identify, photographs only drugs whose drug type can be identified, and identifies the type of drugs whose drug type can be identified. It is also possible.

However, from a pharmaceutical standpoint and a usability standpoint, it is preferable to photograph multiple medications at the same time, at the same time of administration, at the same dispensing, etc., and to collectively identify the drug type.

However, if images are taken at the same time of administration, etc., a single image will inevitably contain a mixture of drugs whose drug type is difficult to identify and drugs whose drug type can be identified. It is necessary to choose between a difficult drug and a drug whose type can be identified. From a usability standpoint, it is preferable that this selection be performed automatically on the drug type identification device whenever possible. Further, it is preferable that this selection is performed in a series of drug type identification work flows on the drug type identification device.

Such technical issues are not limited to applications for identifying drugs, but can be understood as common issues when identifying the type of object from an image of various objects. In this specification, an object whose type can be identified from an image is referred to as an "object whose type can be identified", and an object whose type is difficult to identify from an image but whose group it belongs to can be identified as a "difficult to identify object". ”.

The present disclosure has been made in view of the above circumstances, and describes the method of identifying the type of each object using an image in which a plurality of objects are photographed, in which objects whose types can be identified and objects whose types are difficult to identify can coexist. The object of the present invention is to provide an object identification device, an object identification method, and a program that enable the object identification.

An object identification device according to an aspect of the present disclosure includes a detector that detects objects on an object-by-object basis from images in which a plurality of objects are photographed, and can identify the type of object from the image among the objects detected by the detector. a classifier that estimates the type of the object from the image for objects whose type can be specified, and estimates the group from the image for objects whose type is difficult to specify from the image, but the group to which the object belongs can be specified; and a processing unit that performs group-specific processing that leads to identification of the type of objects estimated as a group by the classifier.

According to this aspect, objects are detected from images in which a plurality of objects are photographed, and estimation is performed for each object by a classifier. If the object detected from the image is an object whose type can be identified, the classifier estimates the type of the object. On the other hand, if an object detected from an image is an object whose type is difficult to identify, the classifier estimates the group to which the object belongs, and performs group-specific processing that contributes to identifying the object type for objects whose group has been estimated. This will lead to identification of the species.

The image may be an image taken in a state in which objects whose types can be identified and objects whose types are difficult to identify coexist.

The object whose type is difficult to identify may be classified into a plurality of groups, and a group-specific process may be determined for each group.

The detector is configured to include a first trained model trained by machine learning using first training data in which each object is labeled without distinguishing between objects whose types can be identified and objects whose types are difficult to identify. It's okay.

The classifier is trained by machine learning using second training data in which objects whose type can be identified are labeled by the type of the object, and objects whose type is difficult to identify are labeled by the group to which the object belongs. The configuration may include trained models.

Labels that identify groups may be defined in a hierarchical structure.

As input to the discriminator, an object image obtained by cutting out the area of the object detected by the detector from the image in object units, a character/symbol extraction image containing at least one of characters and symbols extracted from the object image, and the external shape of the object. The configuration may be such that at least one of an image and object size information is used.

A configuration may also be adopted in which magnification information indicating the enlargement or reduction ratio of the object image is used as input to the discriminator.

The group-specific processing may include processing to display a screen that accepts input of search conditions for searching for the type of object within the estimated group.

In another aspect of the present disclosure, the object is a drug, the hard-to-identify object includes at least one of a capsule drug, a plain drug, and a divided tablet, and the identifiable object includes a tablet with an inscription or print. The configuration may include the following.

As input to the discriminator, a drug image is obtained by cutting out the region of the drug detected by the detector from the image for each drug, a character/symbol extraction image containing at least one of the characters and symbols extracted from the drug image, and the outline of the drug. The configuration may be such that at least one of an image and drug size information is used.

An object identification device according to another aspect of the present disclosure is an object identification device including one or more processors and one or more memories in which programs executed by the one or more processors are stored, the object identification device including: The two or more processors perform a detection process that detects objects in units of objects from images in which multiple objects are photographed, and detect objects that can be identified from the image to the type of the object detected by the detection process. The type of object can be estimated from the image, and although it is difficult to identify the type of object from the image, the group to which the object belongs can be estimated.For objects that are difficult to identify, a classification process is performed to estimate the group from the image, and the classification process is used to estimate the group. processing for transitioning to group-specific processing that leads to identification of the type of the object.

The one or more processors include a first trained model trained by machine learning using first training data that is labeled on an object-by-object basis without distinguishing between objects whose types can be identified and objects whose types are difficult to identify. The configuration may be such that the detection process is executed using a detector.

The one or more processors are trained by machine learning using second training data in which objects whose type can be identified are labeled by the type of the object, and objects whose type is difficult to identify are labeled by the group to which the object belongs. The configuration may also be such that the classification process is executed using a classifier including a second trained model.

The one or more processors may be configured to cut out a region of the object detected by the detection process from the image, execute processing to generate an object image for each object, and perform identification processing based on the object image.

In another aspect of the present disclosure, the object is a drug, the hard-to-identify object includes at least one of a capsule drug, a plain drug, and a divided tablet, and the identifiable object includes a tablet with an inscription or print. The group-specific processing may include processing for displaying a screen that accepts input of search conditions for searching for the type of drug within the estimated group.

In another aspect of the present disclosure, there is provided a first database in which character symbol information including at least one of characters and symbols indicated by a stamp or print attached to a drug is associated with a drug type, and a master image of the drug. a second database in which is stored, the one or more processors search at least one of the first database and the second database based on the accepted search conditions, and search for drugs corresponding to the search conditions. The configuration may be such that the candidates are output.

The one or more processors may be configured to perform processing to display a screen including a photographed image display unit that displays an image and a candidate display unit that displays information about the candidate object based on the estimation result of the identification process. good.

The one or more processors may be configured to perform a process of displaying information on a group to which a candidate object to be displayed on the candidate display section belongs.

The one or more processors may be configured to accept an instruction specifying a group to which a candidate object to be displayed on the candidate display section belongs, and control the display on the candidate display section according to the received instruction.

In another aspect of the present disclosure, a configuration may be provided that includes a camera and a display that displays an image taken by the camera and information regarding an object estimated from the image.

An object identification method according to another aspect of the present disclosure is an object identification method executed by one or more processors, wherein the one or more processors detect objects in units of objects from images in which a plurality of objects are photographed. Among the detected objects, if the type of the object can be estimated from the image and the type of the object can be identified, the type of the object is estimated from the image, and although it is difficult to identify the type of object from the image, the group to which the object belongs is determined. The method includes estimating a group from an image for objects whose type is difficult to specify, and performing group-specific processing that leads to specifying the type of objects estimated as a group.

A program according to another aspect of the present disclosure provides a computer with a function of detecting objects on an object-by-object basis from images in which a plurality of objects are photographed, and a type identification that can identify the type of the object from the image among the detected objects. For possible objects, the type of object is estimated from the image, and although it is difficult to identify the type of object from the image, it is possible to identify the group to which the object belongs.For objects that are difficult to identify, there is a function that estimates the group from the image, and it is estimated as a group. A function for executing group-specific processing that leads to identification of the type of object is realized.

According to the present disclosure, it is possible to identify the type of each object in an image even if the image is a mixture of objects whose types can be identified and objects whose types are difficult to identify.

FIG. 1 is a front perspective view of a smartphone. FIG. 2 is a rear perspective view of the smartphone. FIG. 3 is a block diagram showing the electrical configuration of the smartphone. FIG. 4 is a block diagram showing the internal configuration of the in-camera. FIG. 5 is a block diagram showing the functional configuration of the drug type identification device according to the embodiment. FIG. 6 is a flowchart showing an overview of a learning phase by machine learning for realizing a drug type identification device including a drug detector and a drug discriminator. FIG. 7 is an explanatory diagram showing an example of an identification code provided as training data for a drug whose drug type can be specified. FIG. 8 is an explanatory diagram showing an example of an identification code provided as training data for a drug whose drug type is difficult to identify. FIG. 9 is an explanatory diagram showing another example of an identification code given as training data for a capsule drug. FIG. 10 is an explanatory diagram showing another example of an identification code given as teacher data for a plain drug. FIG. 11 is a conceptual diagram showing an example of information input to the drug identifier. FIG. 12 is a diagram showing an example of input information to the neural network. FIG. 13 is a diagram showing another example of input information to the neural network. FIG. 14 is a diagram showing an example of input information in the case of a capsule medicine with an identification symbol printed on the capsule. FIG. 15 is a diagram showing an example of input information for a capsule drug without an identification symbol. FIG. 16 is a diagram showing an example of input information in the case of a half tablet. FIG. 17 is a block diagram showing the functional configuration of the machine learning system. FIG. 18 is a flowchart showing the operation of the drug type identification device according to the embodiment. FIG. 19 is a flowchart showing the operation of the drug type identification device according to the embodiment, and shows an example of loop processing applied to step S15 and step S16 in FIG. 18. FIG. 20 is a diagram showing an example of a screen displayed on a touch panel display. FIG. 21 is a diagram showing an example of a screen display when editing a drug area. FIG. 22 is an explanatory diagram illustrating an example of a region editing operation method. FIG. 23 is a diagram illustrating an example of a screen display in a state where the identification of the drug has been determined. FIG. 24 is a diagram showing an example of a screen display of an identification result confirmation GUI (Graphical User Interface). FIG. 25 is a diagram showing an example of a screen display of the candidate list display GUI. FIG. 26 is a diagram showing an example of a screen display of the engraved text GUI. FIG. 27 is a diagram showing an example of a screen display of the capsule search GUI. FIG. 28 is a diagram showing an example of a screen display of the plain medicine search GUI. FIG. 29 is a diagram showing an example of a screen display of the divided tablet search GUI. FIG. 30 is a diagram showing an example of an icon representing the shape of a typical drug. FIG. 31 is a diagram showing an example of icons used for color selection. FIG. 32 is a block diagram illustrating an example of a functional configuration for identifying the type of drug that is difficult to identify in the drug type identification device according to the embodiment. FIG. 33 is a top view of the photographic assistance device. FIG. 34 is a sectional view taken along line 34-34 in FIG. 33. FIG. 35 is a top view of the photographing auxiliary device with the auxiliary light source removed. FIG. 36 is a top view of a photographic assisting device according to another embodiment. FIG. 37 is a sectional view taken along line 37-37 in FIG. 36. FIG. 38 is a top view of the photographing auxiliary device with the auxiliary light source removed. FIG. 39 is a cross-sectional view showing an example of a lighting device. FIG. 40 is a top view of a drug mounting table using a circular marker as a reference marker. FIG. 41 is a top view of a medicine placement table using a circular marker according to a modified example. FIG. 42 is a diagram showing a specific example of a reference marker having a rectangular outer shape. FIG. 43 is a top view of a medicine placement table using a circular marker according to another modification. FIG. 44 is a diagram showing another specific example of a rectangular reference marker. FIG. 45 is a diagram showing an example of a drug mounting table having a recessed structure. FIG. 46 is a diagram showing an example of a drug mounting table having a recessed structure for capsule drugs. FIG. 47 is a diagram illustrating an example of a drug platform having a recessed structure for elliptical tablets.

Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings.

<Overview of drug type identification device according to embodiment>
The drug type identification device according to the embodiment is a device that identifies the type of drug (drug type) from an image of the drug. In this embodiment, it is assumed that a plurality of drugs are photographed in a state in which a drug whose drug type can be identified and a drug whose drug type is difficult to identify coexist.However, in the case where both drugs do not coexist or only one drug is photographed However, the drug type identification device can still function effectively.

The drug type identification device according to the embodiment detects each drug from an image of a plurality of drugs, and automatically determines whether each detected drug is a difficult-to-identify drug or a drug that can be identified. This makes it possible to efficiently identify the drug type by automatically branching to a drug type identification flow suitable for each drug type. The drug type identification device is installed in a mobile terminal device, for example. The mobile terminal device includes at least one of a smartphone, a mobile phone, a PHS (Personal Handy-phone System), a PDA (Personal Digital Assistant), a tablet computer terminal, a notebook personal computer terminal, and a mobile game console. . Hereinafter, a drug type identification device realized by hardware and software of a smartphone will be described in detail with reference to the drawings.

[Smartphone appearance]
FIG. 1 is a front perspective view of a smartphone 10 that is a camera-equipped mobile terminal device that functions as a drug type identification device according to an embodiment. As shown in FIG. 1, the smartphone 10 has a flat housing 12. The smartphone 10 includes a touch panel display 14, a speaker 16, a microphone 18, and an in-camera 20 on the front of a housing 12.

The touch panel display 14 includes a display section that displays images and the like, and a touch panel section that is placed in front of the display section and receives touch input. The display unit is, for example, a color LCD (Liquid Crystal Display) panel.

The touch panel section is, for example, a capacitive touch panel that is provided in a planar manner on a substrate main body that has optical transparency, and has a position detection electrode that has optical transparency, and an insulating layer provided on the position detection electrode. It is. The touch panel section generates and outputs two-dimensional position coordinate information corresponding to a user's touch operation. Touch operations include tap operations, double-tap operations, flick operations, swipe operations, drag operations, pinch-in operations, and pinch-out operations.

The speaker 16 is an audio output unit that outputs audio during a call and when playing a video. The microphone 18 is an audio input unit into which audio is input during a call and when shooting a video. The in-camera 20 is an imaging device that shoots moving images and still images.

FIG. 2 is a rear perspective view of the smartphone 10. As shown in FIG. 2, the smartphone 10 includes an out camera 22 and a light 24 on the back surface of the housing 12. The outside camera 22 is an imaging device that photographs moving images and still images. The light 24 is a light source that emits illumination light when photographing with the out-camera 22, and is composed of, for example, an LED (Light Emitting Diode).

Furthermore, as shown in FIGS. 1 and 2, the smartphone 10 includes switches 26 on the front and side surfaces of the housing 12, respectively. The switch 26 is an input member that receives instructions from the user. The switch 26 is a push-button switch that is turned on when pressed with a finger or the like, and turned off by the restoring force of a spring or the like when the finger is released.

Note that the configuration of the housing 12 is not limited to this, and a configuration having a folding structure or a sliding mechanism may be adopted.

[Electrical configuration of smartphone]
The main function of the smartphone 10 is a wireless communication function that performs mobile wireless communication via a base station device and a mobile communication network.

FIG. 3 is a block diagram showing the electrical configuration of the smartphone 10. As shown in FIG. 3, the smartphone 10 includes the above-mentioned touch panel display 14, speaker 16, microphone 18, in-camera 20, out-camera 22, light 24, and switch 26, as well as a CPU (Central Processing Unit) 28, wireless communication 30, a communication section 32, a memory 34, an external input/output section 40, a GPS reception section 42, and a power supply section 44.

The CPU 28 is an example of a processor that executes instructions stored in the memory 34. The CPU 28 operates according to the control program and control data stored in the memory 34, and centrally controls each part of the smartphone 10. The CPU 28 has a mobile communication control function that controls each part of the communication system and an application processing function in order to perform voice communication and data communication through the wireless communication unit 30.

Additionally, the CPU 28 has an image processing function that displays moving images, still images, characters, etc. on the touch panel display 14. This image processing function visually conveys information such as still images, moving images, and text to the user. Further, the CPU 28 acquires two-dimensional position coordinate information corresponding to the user's touch operation from the touch panel section of the touch panel display 14. Further, the CPU 28 obtains an input signal from the switch 26.

The in-camera 20 and the out-camera 22 shoot moving images and still images according to instructions from the CPU 28. FIG. 4 is a block diagram showing the internal configuration of the in-camera 20. Note that the internal configuration of the out-camera 22 is the same as that of the in-camera 20. As shown in FIG. 4, the in-camera 20 includes a photographic lens 50, an aperture 52, an image sensor 54, an AFE (Analog Front End) 56, an A/D (Analog to Digital) converter 58, and a lens drive unit 60. .

The photographing lens 50 is composed of a zoom lens 50Z and a focus lens 50F. The lens drive unit 60 performs optical zoom adjustment and focus adjustment by driving the zoom lens 50Z and focus lens 50F forward and backward in response to commands from the CPU 28. Further, the lens drive section 60 controls the aperture 52 according to commands from the CPU 28 to adjust exposure. The lens drive section 60 corresponds to an exposure correction section that performs exposure correction of the camera based on the color of gray, which will be described later. Information such as the positions of the zoom lens 50Z and the focus lens 50F and the opening degree of the aperture 52 is input to the CPU 28.

The image sensor 54 includes a light-receiving surface in which a large number of light-receiving elements are arranged in a matrix. The subject light that has passed through the zoom lens 50Z, the focus lens 50F, and the aperture 52 is imaged on the light receiving surface of the image sensor 54. On the light receiving surface of the image sensor 54, color filters of R (red), G (green), and B (blue) are provided. Each light-receiving element of the image sensor 54 converts the subject light imaged on the light-receiving surface into an electrical signal based on R, G, and B color signals. Thereby, the image sensor 54 acquires a color image of the subject. As the image sensor 54, a photoelectric conversion element such as CMOS (Complementary Metal-Oxide Semiconductor) or CCD (Charge-Coupled Device) can be used.

The AFE 56 performs noise removal, amplification, etc. of the analog image signal output from the image sensor 54. The A/D converter 58 converts the analog image signal input from the AFE 56 into a digital image signal with a gradation range. Note that an electronic shutter is used as a shutter to control the exposure time of incident light to the image sensor 54. In the case of an electronic shutter, the exposure time (shutter speed) can be adjusted by controlling the charge accumulation period of the image sensor 54 by the CPU 28.

The in-camera 20 may convert image data of captured moving pictures and still images into compressed image data such as MPEG (Moving Picture Experts Group) or JPEG (Joint Photographic Experts Group).

Returning to the explanation of FIG. 3, the CPU 28 causes the memory 34 to store the moving images and still images taken by the in-camera 20 and the out-camera 22. Further, the CPU 28 may output the moving images and still images taken by the in-camera 20 and the out-camera 22 to the outside of the smartphone 10 through the wireless communication unit 30 or the external input/output unit 40.

Further, the CPU 28 displays the moving images and still images taken by the in-camera 20 and the out-camera 22 on the touch panel display 14. The CPU 28 may use the moving images and still images taken by the in-camera 20 and the out-camera 22 within the application software.

Note that the CPU 28 may illuminate the subject with shooting auxiliary light by turning on the light 24 when shooting with the out-camera 22. The lighting and extinguishing of the light 24 may be controlled by the user's touch operation on the touch panel display 14 or the operation of the switch 26 .

The wireless communication unit 30 performs wireless communication with a base station device accommodated in a mobile communication network according to instructions from the CPU 28. The smartphone 10 uses this wireless communication to send and receive various file data such as audio data and image data, e-mail data, etc., and receive Web (abbreviation for World Wide Web) data, streaming data, and the like.

A speaker 16 and a microphone 18 are connected to the talking section 32 . The communication unit 32 decodes the audio data received by the wireless communication unit 30 and outputs it from the speaker 16. The communication unit 32 converts the user's voice input through the microphone 18 into voice data that can be processed by the CPU 28 and outputs the data to the CPU 28 .

The memory 34 stores instructions for the CPU 28 to execute. The memory 34 includes an internal storage section 36 built into the smartphone 10 and an external storage section 38 that is detachable from the smartphone 10 . The internal storage section 36 and the external storage section 38 are realized using known storage media.

The memory 34 stores the control program of the CPU 28, control data, application software, address data associated with the name and telephone number of the communication partner, data of sent and received e-mails, web data downloaded through web browsing, and downloaded content. Store data etc. Further, the memory 34 may temporarily store streaming data and the like.

The external input/output unit 40 serves as an interface with an external device connected to the smartphone 10. The smartphone 10 is directly or indirectly connected to other external devices through communication or the like via the external input/output unit 40 . The external input/output unit 40 transmits data received from an external device to each component inside the smartphone 10, and transmits data inside the smartphone 10 to the external device.

Examples of communication methods include Universal Serial Bus (USB), IEEE (Institute of Electrical and Electronics Engineers) 1394, the Internet, wireless LAN (Local Area Network), Bluetooth (registered trademark), and RFID (Radio Frequency Identification). ), and infrared communication. Further, external devices include, for example, a headset, an external charger, a data port, an audio device, a video device, a smartphone, a PDA, a personal computer, and an earphone.

The GPS receiving unit 42 detects the position of the smartphone 10 based on positioning information from GPS satellites ST1, ST2, ..., STn.

The power supply unit 44 is a power supply source that supplies power to each part of the smartphone 10 via a power supply circuit (not shown). Power supply section 44 includes a lithium ion secondary battery. The power supply section 44 may include an A/D conversion section that generates a DC voltage from an external AC power source.

The smartphone 10 configured in this manner is set to the shooting mode by inputting an instruction from the user using the touch panel display 14 or the like, and can shoot moving images and still images using the in-camera 20 and the out-camera 22.

When the smartphone 10 is set to the shooting mode, it enters a shooting standby state, a moving image is shot by the in-camera 20 or the out-camera 22, and the shot moving image is displayed on the touch panel display 14 as a live view image.

The user can visually check the live view image displayed on the touch panel display 14 to determine the composition, confirm the subject to be photographed, and set photographing conditions.

When the smartphone 10 is in the shooting standby state and is instructed to shoot by inputting an instruction from the user using the touch panel display 14 or the like, the smartphone 10 performs AF (Autofocus) and AE (Auto Exposure) control to shoot and store videos and still images. I do.

[Functional configuration of drug type identification device]
FIG. 5 is a block diagram showing the functional configuration of the drug type identification device 100 realized by the smartphone 10. Drug type identification device 100 includes a processor 102 and a storage device 104. Processor 102 includes CPU 28. The processor 102 may include a GPU (Graphics Processing Unit). Storage device 104 is a non-transitory, tangible, computer-readable medium and includes memory 34 . Processor 102 is connected to touch panel display 14 . The touch panel display 14 includes a display section 14A that functions as a display device (display), and an input section 14B that functions as an input device that receives input by touch operation.

The drug type identification device 100 includes an image acquisition section 112, a drug detector 114, a region correction section 116, a drug region cutting section 118, a drug identifier 120, a text search section 122, a display control section 124, and an input processing section 126. .

The image acquisition unit 112 acquires a captured image of the drug to be identified. The photographed image is, for example, an image photographed by the in-camera 20 or the out-camera 22. The photographed image may be an image acquired from another device via the wireless communication section 30, the external storage section 38, or the external input/output section 40. The photographed image may include a plurality of drugs to be identified. The plurality of drugs to be identified are not limited to drugs of the same drug type, but may be drugs of different drug types. In this embodiment, an example will be described in which an image in which a plurality of drugs are photographed together (as one image) is processed in a state where different types of drugs are mixed.

The photographed image may be an image in which the drug to be identified and the marker are photographed. The marker may be, for example, an ArUco marker, a circular marker, a square marker, or the like. Preferably, a plurality of markers are included in the captured image. The plurality of markers are arranged, for example, at the four corners of the rectangular area of the medicine placement range. The photographed image may be an image in which the drug to be identified and a reference gray color are photographed.

The photographed image may be an image photographed at a standard photographing distance and photographing viewpoint. The photographing distance can be expressed from the distance between the drug to be identified and the photographing lens 50, and the focal length of the photographing lens 50. Further, the photographing viewpoint can be expressed by the angle formed by the marker printing surface and the optical axis of the photographing lens 50.

The image acquisition unit 112 includes an image correction unit (not shown). When the photographed image includes a marker, the image correction unit standardizes the photographing distance and photographing viewpoint of the photographed image based on the marker to obtain a standardized image. The standardized image may be an image obtained by performing standardization processing on a photographed image and then cutting out an area inside a rectangle having four corner markers as vertices. For example, the image correction unit specifies the coordinates to which the four vertices of the rectangle whose coordinates are specified by the marker go after standardization of the photographing distance and photographing viewpoint. The image correction unit obtains a perspective transformation matrix that transforms these four vertices into respective designated coordinate positions. Such a perspective transformation matrix is uniquely determined if there are four points. For example, if there is a correspondence between four points, a transformation matrix can be obtained using the getPerspectiveTransform function of OpenCV (Open Source Computer Vision Library).

The image correction unit perspectively transforms the entire original captured image using the obtained perspective transformation matrix, and obtains a transformed image. Such perspective transformation can be executed by using the warpPerspective function of OpenCV. The image after this conversion may be a standardized image in which the photographing distance and photographing viewpoint are standardized.

Furthermore, when the captured image includes a reference gray color area, the image correction unit may perform color tone correction of the captured image based on the reference gray color.

The drug detector 114 includes a first learned model TM1 that is a learning model trained by machine learning. The first trained model TM1 is a model trained to perform a so-called object detection task. When the first trained model TM1 receives an image (pre-standardized image or standardized image) as input, it outputs position information corresponding to the region of the detected object, the class of the object, and a score indicating the probability of the object's certainty. do. The first trained model TM1 is an example of a "first trained model" in the present disclosure.

The class of objects in drug detection includes at least "drug" and may further include "marker." The drug detector 114 detects a drug from the captured image acquired by the image acquisition unit 112 and outputs information indicating the area of the detected drug. When the standardized image is acquired by the image correction unit, the drug detector 114 detects a drug area from the standardized image. If a plurality of drugs are included in the captured image, the drug detector 114 detects regions of each of the plurality of drugs.

The output from the first trained model TM1 may be position information of a bounding box indicating the region of each drug detected in the photographed image, or a segmentation mask that fills in the region of each drug pixel by pixel. It may be an image. Details of the learning method for creating the first trained model TM1 and the content of the detection process by the drug detector 114 will be described later. Drug detector 114 is an example of a "detector" in this disclosure.

The results of the detection process by the drug detector 114 are displayed on the display section 14A via the display control section 124. The display control unit 124 generates a signal for display on the touch panel display 14 and performs display control. The display control section 124 includes a magnification changing section 125 that changes the display magnification of the image displayed on the display section 14A. When an instruction to change the display magnification is input, such as when a pinch-out operation or a pinch-in operation is performed on the touch panel display 14, the magnification changing unit 125 performs enlargement or reduction processing according to the instruction. The input processing section 126 receives input from the input section 14B or the microphone 18, and sends the received input information to the corresponding processing section.

The area correction unit 116 receives a correction instruction from the user regarding the drug area detected by the drug detector 114, and performs a process of correcting the drug area according to the received instruction. That is, the region modification section 116 modifies the detection result of the drug detector 114 according to the region modification instruction received from the input section 14B. The area correction unit 116 allows the user to specify the correct area of the drug to be identified by inputting an instruction to correct the detection result from the input unit 14B when an erroneous detection or failure of detection occurs by the drug detector 114. be able to.

The drug region cutting unit 118 performs a process of cutting out a drug region for each drug from the photographed image based on the detection result of the drug detector 114. When the medicine region is modified by the region modification unit 116, the medicine region cutting unit 118 performs a process of cutting out the modified medicine region.

The drug identifier 120 includes a second learned model TM2 that is a learning model trained by machine learning. The second trained model TM2 is a model trained to perform a so-called object recognition task. The drug identifier 120 acquires each drug region image (hereinafter referred to as a drug image) cut out by the drug region cutting unit 118, estimates the type of the corresponding drug or the group to which the drug belongs, and labels it. (i.e., multi-class classification). The class classification performed by the drug identifier 120 differs in the fineness (granularity) of the classification depending on whether the drug in the input image is a drug whose drug type is difficult to identify or a drug whose drug type can be identified. The drug image is an example of an "object image" in the present disclosure. The second trained model TM2 is an example of a "second trained model" in the present disclosure.

If the input drug image is of a drug whose type is difficult to identify, the drug identifier 120 estimates the group to which the drug belongs and outputs the estimation result. Examples of the group to which drugs whose drug type is difficult to identify may include "capsule drugs," "divided tablets," or "plain drugs." Further, the group definition may be further divided into categories, and subgroups may be defined by hierarchical classification. For example, groups can be defined such as "white single-color capsule drugs," "red and white bicolor capsule drugs," "half tablets," "quarter tablets," "white plain drugs," or "transparent plain drugs." Good too.

If the input drug image is an image of a drug whose drug type can be identified, the drug identifier 120 identifies the type of drug (drug type) and outputs the estimation result of the drug type. The identification information of the drug type output by the drug identifier 120 may be, for example, a unique identification code defined for each drug type. Details of the learning method for creating the second learned model TM2 and the content of the identification process by the drug discriminator 120 will be described later. The drug identifier 120 searches a drug master database (not shown) based on the identification code unique to the estimated drug type, and obtains drug information about the corresponding drug or similar drugs. Drug identifier 120 is an example of a "discriminator" in the present disclosure.

The results of the identification process by the drug identifier 120 are displayed on the display section 14A via the display control section 124. After confirming the identification result displayed on the display unit 14A, the user can take actions such as finalizing the identification result, modifying the identification result, or performing a separate text search.

The text search unit 122 receives input of a search key from the input unit 14B or the microphone 18, accesses the stamped text master database 130, searches for relevant information, and outputs the search results. The stamped text master includes data in which text information of character symbols stamped or printed for various drugs is linked to the type of drug. The text information of characters and symbols stored in the database 130 is an example of "characters and symbols information" in the present disclosure.

The storage device 104 stores an engraved dextrose master database 130, a master image database 131 including master images of various drugs, and a drug master database (not shown). Master image database 131 may be included in a drug master database. Data such as the stamped text master, master image, and drug master may be stored on a network such as a cloud server (not shown).

The search results by the text search section 122 are displayed on the display section 14A via the display control section 124. Further, the master image read from the master image database 131 is displayed on the display section 14A via the display control section 124.

After checking the search results and master image displayed on the display unit 14A, the user can take actions such as confirming the drug type identification results, further narrowing down the search, or re-searching. Database 130 is an example of a "first database" in the present disclosure. Master image database 131 is an example of a "second database" in the present disclosure.

The storage device 104 includes an identification result storage section 132. The identification result storage unit 132 is a storage area that stores the identification results of drug types by the drug identifier 120, the identification results of drug types specified based on the search results by the text search unit 122, and the like.

[Explanation of learning phase]
The drug type identification device 100 according to this embodiment causes each of the drug detector 114 and the drug identifier 120 to learn based on machine learning as described below, and configures the drug type identification device 100 by combining them. FIG. 6 is a flowchart outlining a learning phase by machine learning for realizing the drug type identification device 100 including the drug detector 114 and the drug identifier 120.

The processing of each step shown in FIG. 6 can be performed, for example, by a computer executing a program. The machine learning method for obtaining the drug type identification device 100 includes a step of creating training data used for training the drug detector 114 (step S1: first training data creation step), and performing machine learning using the training data. A process of training the drug detector 114 (step S2: first training process), a process of creating training data used for training the drug identifier 120 (step S3: second training data creation process), A process of training the drug identifier 120 by performing machine learning using the drug detector 114 (step S4), and a drug type identification device using the drug detector 114 trained in step S2 and the drug identifier 120 trained in step S4. 100 (step S5).

The training data used for supervised learning includes input data and correct answer data (supervised data). Creating training data includes creating correct answer data (teacher data) corresponding to input data.

The process of creating the drug detector 114 (step S1, step S2) and the process of creating the drug identifier 120 (step S3, step S4) may be executed in parallel or sequentially. good. The execution timing of each step 1 to 5 is not particularly limited, and may be executed continuously, or the processing of each step may be executed at different times or using different computers. For example, a first computer executes the process of creating the drug detector 114 (step S1, step S2), and a second computer different from the first computer executes the process of creating the drug identifier 120 (step S3, step S2). S4) may also be executed. Alternatively, the first computer may execute the process of creating training data (steps S1 and S3), and the second computer may execute the learning process (steps S2 and S4).

In step S1, teacher data (correct data) is added to the training image regarding the position information and class of each drug included in the image. The drug position information provided as teacher data may be information specifying the position of a bounding box surrounding the drug, or may be a mask filled with the shape of the drug itself. Further, the class label assigned as the teacher data may be uniformly set to "drug" regardless of the type of drug (drug type). In other words, the same classification label is given to both drugs whose drug type can be identified and drugs whose drug type is difficult to identify.

[Learning of drug detector 114]
The drug detector 114 uses object position information, class information, and detected objects ( Here, we estimate the probability of drug use. The object position information includes, for example, "coordinates of the four vertices of a non-rotated bounding box surrounding the drug,""center coordinates, height, and width of a non-rotated bounding box surrounding the drug," and "rotation surrounding the drug." ``center coordinates, height, width, and rotation angle of the bounding box'' or ``a mask filled with the shape of the drug itself.''

Many drugs exist in the shape of an ellipsoid, and when ellipsoidal drugs are arranged vertically in a diagonal direction, multiple drugs are included in one bounding box if the ``bounding box surrounding the drug without rotation'' is used. Therefore, the object position information is preferably "center coordinates, height, width, and rotation angle of a rotating bounding box surrounding the drug" or "a mask filled with the shape of the drug itself."

In step S2, machine learning is performed to train a learning model (hereinafter referred to as the first learning model) applied to the drug detector 114 using the dataset of training data created in step S1. The first learning model is configured using, for example, a neural network. For example, a convolutional neural network (CNN) can be used as a network model suitable for object detection. Drug detector 114 is trained to receive image input and output object position information for individual drugs within the image.

Note that when the drug detector 114 also detects a marker, training of the first learning model requires a dataset that includes training data in which the marker region is given as correct data (teacher data). The training data used for learning the drug detector 114 is an example of "first training data" in the present disclosure.

[Learning of drug identifier 120]
The drug identifier 120 is input with images of individual drugs, and is trained to estimate the class information of the type of drug or the group to which the drug belongs, and the likelihood of the identified class.

When creating training data for the drug identifier 120, for drugs whose drug type can be identified, a unique identification code determined for each drug is given as a correct label. On the other hand, for drugs whose drug type is difficult to identify, an identification code is assigned to the group to which the drug belongs. For example, for drugs whose drug type is difficult to identify, correct labels can be added for groups such as "capsule drug," "plain drug," "half tablet," or "quarter tablet," for which you want to branch the processing flow in the drug type identification phase. Give. Note that "capsule drug" may be further divided into smaller groups such as "hard capsule drug" and "soft capsule drug" and correct labels may be assigned.

In order to improve the discrimination performance for groups, it is preferable to use at least one of a stamp extraction image, a drug external shape image, drug size information, etc., preferably in combination, as input to the drug identifier 120. The stamp extraction image is an image in which only the stamped part or printed part of the medicine is extracted, and the stamped part or printed part is represented in white on a mainly black background. The training data used for learning the drug identifier 120 is an example of "second training data" in the present disclosure.

FIG. 7 is a diagram illustrating an example of an identification code provided as training data for a drug whose drug type can be identified. For drugs whose drug type can be identified, a different identification code is assigned to each drug type. “P000001”, “P000002”, . . . “P009999” in FIG. 7 represent examples of identification codes uniquely defined corresponding to different drug types. The identification code shown in FIG. 7 is an example of a "type unit" label in the present disclosure.

FIG. 8 is a diagram showing an example of an identification code provided as training data for a drug whose drug type is difficult to identify. For drugs that are difficult to identify, an identification code is assigned to the group to which the drug belongs. For example, an identification code of "G000001" representing the group of "capsule drugs" is assigned to an image of a capsule drug, regardless of the drug type. Furthermore, an identification code of "G000002" representing the "plain drug" group is assigned to images of plain drugs, regardless of the drug type. For images of half tablets (half tablets), an identification code of "G000003" representing the group of "half tablets" is assigned regardless of the drug type. For images of quarter tablets, an identification code of "G000004" representing the group of "quarter tablets" is assigned regardless of the drug type. Note that if it is desired to handle half tablets and quarter tablets in the same way, the same identification code may be given to them. Alternatively, separate identification codes may be assigned to each of the half tablet and quarter tablet, and the program may perform the same processing on the identification codes "G000003" and "G000004."

Furthermore, regarding the identification code given to a group, identification codes of subgroups that are finely classified into groups according to a hierarchical structure may be defined.

FIG. 9 is a diagram showing another example of an identification code given as training data for a capsule drug. For example, it is expected that the inference accuracy of the drug identifier 120 will be improved by dividing capsule drugs into more finely divided groups based on capsule color or printed character color. Further, by performing such detailed grouping, usability may be improved as a search attribute during capsule search, and it may be utilized for discrimination processing. In FIG. 9, an example is shown in which classifications (subgroups) are defined for the group of "capsule drugs" into cases where the capsule color is a single color and cases where the capsule color is a combination of two colors. Furthermore, groups of single-color capsules may be further classified according to the color of the capsules. For example, an identification code of "G000010" representing a group of monochromatic and white capsule drugs is assigned to an image of monochrome and white capsule drugs. An identification code of "G000011" representing a group of monochrome blue capsule drugs is assigned to an image of monochrome blue capsule drugs. Although not shown in FIG. 9, individual identification codes representing each group may be similarly assigned to images of monochromatic capsule medicines of other colors.

Furthermore, the group of two-color capsules may be further classified based on the combination of capsule colors. For example, for an image of two-colored capsule medicines, red and white, an identification code of "G000100" representing a group of red and white two-color capsule medicines is assigned. An identification code of "G000101" representing a group of blue and white capsule medicines is assigned to an image of capsule medicines of two colors, blue and white. Although not shown in FIG. 9, individual identification codes representing each group may be similarly assigned to images of capsule medicines with other two-color combinations.

In FIG. 9, an example of capsule medicine has been described, but plain tablets may also be divided into groups according to color or shape and identification codes may be assigned.

FIG. 10 is a diagram showing another example of the identification code given as teacher data for plain drugs. FIG. 10 shows an example in which the plain drugs are further divided into groups based on the combination of the shape and color of the plain drugs. FIG. 10 shows examples in which the shape of the plain drug is circular and ellipsoidal.

As shown in FIG. 10, for an image of a plain medicine with a circular medicine shape and a white medicine color, for example, an identification code of "G001001" representing a group of circular white medicines is given. An identification code of "G001002" representing a group of circular yellow plain drugs is assigned to an image of a plain drug that is circular in shape and yellow in color. Although not shown in FIG. 10, individual identification codes representing the respective groups may be similarly assigned to images of plain drugs having circular shapes and other colors.

Furthermore, for an image of a plain drug with an ellipsoidal shape and an orange drug color, for example, an identification code of "G002001" representing a group of ellipsoidal and orange plain drugs is assigned. . An identification code of "G002002" representing a group of ellipsoidal and transparent plain drugs is assigned to an image of a plain drug whose shape is an ellipsoid and whose color is transparent. Although not shown in FIG. 10, individual identification codes representing the respective groups may be similarly assigned to images of plain drugs of other colors in which the drugs are ellipsoidal in shape. Similarly, individual identification codes representing each group may be assigned to images of plain medicines in combinations of shapes and colors (not shown).

[About input information to drug identifier 120]
When photographing is performed using the smartphone 10 under various environments, color information may not necessarily be a reliable information source due to the influence of the photographing environment. For this reason, when identifying drugs that can be identified by drug type, a stamp extraction image that excludes color information is often important. In the drug identifier 120, it is preferable to perform identification with more emphasis on the stamp extraction image. The stamp extracted image may be obtained by inputting the original image and performing inference using a trained machine learning model that inputs the original image and outputs the stamp extracted image.

On the other hand, even if the image is greatly affected by the shooting environment, color information may still be helpful, such as when it comes to drugs with very rare colors. Furthermore, drug shape information can also be a robust information source that is not easily affected by the imaging environment. An image of the external shape of a drug can be useful shape information.

Furthermore, information on the length and breadth of the drug's major axis and minor axis (size information) obtained by non-machine learning methods can also be a robust information source that is not easily affected by the imaging environment. For example, by extracting the long axis and short axis of the drug using OpenCV and using the numerical information, it becomes a robust information source that is not easily affected by the imaging environment. In this way, the size information measured for each drug from the photographed image becomes an important information source for identifying a huge number of drug types, even for drugs that are difficult to identify. In particular, size information can be an extremely important source of information for group identification for drugs that are difficult to identify, such as capsule drugs without printed identification symbols or plain tablets.

Considering these facts, the input to the drug identifier 120 is the original image, which is a region image (drug image) cut out for each drug from the captured image, the stamp extraction image extracted from the original image, and the external shape. It is preferable to use a combination of one or more of the image and size information, preferably a plurality of them. Note that the term "mark extraction image" includes not only a stamp but also an image obtained by extracting a print attached to a tablet or a capsule. The stamp extraction image may also be referred to as a character symbol extraction image. The term "imprint" may be understood as a term including concepts such as "printing", "print symbol", "identification symbol" or "letter symbol" on the tablet or capsule drug, depending on the context.

The configuration may be such that all of the information including the original image including color information, the stamp extraction image, the external shape image, and the size information is input into the drug identifier 120, or some of these information may be combined. The configuration may be such that the information is input to the drug identifier 120.

FIG. 11 is a conceptual diagram showing an example of information input to the drug identifier 120. The original image Org is a drug image for each drug, and corresponds to an individual drug image cut out for each drug from the photographed image. The stamp extraction image Egm is an image in which a stamp is extracted from the original image Org. The stamp extraction image Egm may be an image subjected to emphasis processing to increase the visibility of the extracted stamp. The external shape image Otw is an image showing the external shape of the drug extracted from the original image Org. Note that the external shape image Otw is not limited to an image showing a strict outline of the medicine, but may be an image showing an approximate shape according to the outline. The size information is, for example, numerical information obtained by measuring the major axis and minor axis of the drug based on the original image Org or the external image Otw.

In order to construct a system that is robust to the imaging environment, it is preferable to use a combination of these multiple pieces of information as input information to the drug identifier 120.

The learning model applied to the drug identifier 120 is configured using, for example, a neural network. For example, CNN can be used as a network model suitable for image recognition. When inputting a combination of two or more images among the original image, stamp extraction image, and outline image as input to the neural network, there are two methods: combining these multiple images in the channel direction and inputting them ( There may be a method in which the images are combined horizontally or vertically (within the plane of the image) and input as a composite image.

In a neural network, the input image size may be subject to a fixed value constraint, so the following two methods may be available as input image methods.

[Method 1] The first method is a method of enlarging or reducing the image according to the input image size accepted by the input layer of the neural network. FIG. 12 shows an example of input information according to the first method.

In the first method, each image is enlarged or reduced to the upper limit of the input image size accepted by the neural network and inputted. Information on the image enlargement or reduction ratio used in this image enlargement or reduction processing (hereinafter referred to as "resizing processing") is stored in case it is needed in subsequent processing. In the case of the first method, magnification information applied to the resizing process is stored, and one or more images of the original image Org1, the engraving extraction image Egm1, and the outer shape image Otw1, and size information are input to the neural network. In addition to the combination, magnification information for resizing processing is used as necessary. The original image Org1 shown here is obtained by resizing an image of each drug unit cut out from the standardized image of the captured image to match the input image size of the neural network. Furthermore, a stamp extraction image Egm1 is obtained by performing a process of extracting a stamp from the original image Org1. Outline image Otw1 is obtained by performing processing to extract the outer shape of the medicine from original image Org1.

According to the first method, for example, when the engraved characters are small or detailed on a small tablet, the image is enlarged and input to the neural network, which has the advantage of improving inference (identification) accuracy. Furthermore, according to the first method, the input image size of the neural network can be designed to be smaller, and a reduction in execution time can be expected. In other words, the input image size of the neural network can be designed to be an appropriate size for recognition performance without being constrained by the maximum size of drugs that exist in the world, thereby suppressing unnecessary data processing.

On the other hand, in the case of the first method, processing is somewhat complicated in that image resizing processing is performed during input/output of the neural network. In addition, information on the enlargement/reduction ratio (magnification information) is stored in case it is needed in later processing (such as when presenting drug identification results to the user with an image proportional to the size of the tablet). It is necessary to keep it.

[Method 2] The second method is a method in which the image is pasted and input at the same image size (without enlarging/reducing processing) at the center position of the image of the input image size accepted by the input layer of the neural network. be. In this case, the input image size to the neural network is determined in advance, and the same original image cut out from the standardized image of the photographed image is input to the input layer of the acceptable input image size. The standardized original image is an image whose correspondence with the actual size of the drug is known.

FIG. 13 shows an example of input information according to the second method. According to the second method, image resizing processing is not necessary. Therefore, in the case of the second method, there is no need to input magnification information indicating the enlargement/reduction ratio. In the second method, a combination of one or more of the original image Org2, the stamp extraction image Egm2, and the outline image Otw2 and size information can be used as input information to the neural network. In addition, in the second method, since the outline image Otw2 itself extracted from the original image Org2 includes information that substantially indicates the size of the drug, when the outline image Otw2 is used for input, the numerical values of the major axis and the minor axis are The advantage is that information is not always necessary.

On the other hand, in the case of the second method, the input image size that the neural network accepts is determined by the largest drug size that exists in the world, so extra space is created around the tablet, especially for small tablets. , there may be waste in processing. Furthermore, since the input image size accepted by the neural network is larger than that of the first method, the execution time may be longer than that of the first method. Furthermore, in the case of the second method, small tablets and tablets with finely engraved characters are processed as they are, so the identification accuracy of these tablets may be lower than that of the first method.

Comparing the first method and the second method, the first method is a more preferable method for identifying drug types using the smartphone 10.

[Example of input information for drugs whose drug type is difficult to identify]
FIGS. 14 to 16 show examples of input information regarding drugs whose drug type is difficult to identify. Here, an example will be shown in which information about a drug whose drug type is difficult to identify is input to the neural network using the first method.

FIG. 14 is an example of input information in the case of a capsule drug (with identification symbol) in which an identification symbol is printed on the capsule. For capsule medicines with identification symbols, a combination of one or more of the original image Org3, the stamp extraction image Egm3, and the external shape image Otw3 is input to the neural network as in FIG. 12. . In addition to the image, numerical information on the size (longer axis and shorter axis) of the capsule medicine measured from the original image Org3 can be input. Furthermore, in addition to these pieces of information, magnification information for resizing processing may be input. The stamp extraction image Egm3 is an image of the identification symbol extracted from the original image Org3.

FIG. 15 is an example of input information for a capsule drug without an identification symbol. A capsule drug without an identification symbol that does not have an identification symbol in the original image Org4 is a capsule drug that does not originally have an identification symbol printed on the capsule, or even if there is an identification symbol on the capsule, the printing was not done at the time of photographing. There may be cases where the identification symbol is not captured in the original image Org4 because it was photographed in a hidden state.

For a capsule medicine without an identification symbol, a combination of one or more of the original image Org4, the stamp extraction image Egm4, and the external shape image Otw4 is input. However, in the case of a capsule medicine without an identification mark, the stamp extraction image Egm4 is an image that does not include any information about the identification mark. The external shape image Otw4 is an image obtained by extracting the shape of the capsule medicine shown in the original image Org4. Further, similarly to FIG. 14, in addition to these images, numerical information on the size (longer axis and shorter axis) of the capsule medicine measured from the original image Org4 can be input. Furthermore, in addition to these pieces of information, magnification information for resizing processing may be input.

FIG. 16 is an example of input information for half tablets. Similarly, in the case of a half tablet, a combination of one or more of the original image Org5, the stamp extraction image Egm5 extracted from the original image Org5, and the external shape image Otw5 is input. In addition to the image, a combination of numerical information on the size of the half tablet (longer axis and shorter axis) measured from the original image Org5 and magnification information for resizing processing can be input.

FIG. 17 is a block diagram showing the functional configuration of the machine learning system 150. Here, an example will be shown in which the combination of input information described in FIG. 12 is input to the learning model 151.

The machine learning system 150 is a device that generates a second trained model TM2 applied to the drug identifier 120, and is realized using a computer system including one or more computers. Machine learning system 150 includes a learning model 151, a loss calculation unit 152, and an optimizer 154. The learning model 151 uses a neural network such as CNN.

The learning model 151 receives, as input information, a drug image IMj that is a region image of the drug DRj, a stamp extraction image IM1j, an outline image IM2j, and size information SZj that is numerical information indicating the size of the drug. It accepts the input of information combined with the magnification information MGj of the resizing process, and outputs the inference result PRj of the type of drug DRj or the group to which the drug DRj belongs. The subscript j represents the index number of the training data.

The machine learning system 150 shown in FIG. 17 includes a stamp extraction section 140, an external shape extraction section 142, and a size measurement section 144 before the learning model 151. The stamp extracting unit 140 extracts a stamp or printed character symbol from the drug image IMj, and generates a stamp extraction image IM1j that is an image of the extracted character symbol. The stamp extraction unit 140 processes the drug region of the input image, eliminates external edge information of the drug DRj, and extracts character symbols. The stamp extraction image IM1j is an image in which the character symbol is emphasized by expressing the luminance of the stamped portion or the printed portion relatively higher than the luminance of the portion other than the stamped portion or the printed portion.

The outer shape extraction unit 142 extracts the outer shape of the drug DRj from the drug image IMj, and generates an outer shape image IM2j that is an image showing the outer shape of the drug DRj. The size measuring unit 144 measures the size of the drug DRj from the drug image IMj and/or the external shape image IM2j, and generates size information SZj indicating the respective dimensions of the major axis and the minor axis.

The machine learning system 150 adopts a configuration in which this information is generated from the drug image IMj and input into the learning model 151. However, in the stage of preparing training data in advance, the engraving extraction image IM1j, the external shape image IM2j, and the size Part or all of the information SZj may be created in advance and included in the training data dataset. In that case, part or all of the stamp extraction section 140, the external shape extraction section 142, and the size measurement section 144 are unnecessary in the machine learning system 150.

The loss calculation unit 152 calculates a loss value (loss) between the inference result output from the learning model 151 and the correct data (teacher data) GTj associated with the input information.

The optimizer 154 adjusts the learning model 151 based on the calculation result of the loss value indicating the error between the output of the learning model 151 and the correct teacher signal so that the inference result PRj output by the learning model 151 approaches the correct answer data GTj. The update amount of the parameters is determined, and the parameters of the learning model 151 are updated. The optimizer 154 updates parameters based on an algorithm such as gradient descent. The parameters of the learning model 151 include filter coefficients (weights of connections between nodes) of filters used for processing each layer of the neural network, node biases, and the like. The machine learning system 150 may acquire training data and update parameters in units of mini-batches that are a collection of a plurality of training data.

In this way, by performing machine learning using a large amount of training data, the parameters of the learning model 151 are optimized, and a learning model 151 having the desired inference performance is generated. The learned (trained) learning model 151 for which acceptable inference accuracy has been confirmed is used as the second trained model TM2 of the drug discriminator 120.

In this case, in order to obtain a stamp extraction image, an external shape image, and size information from the photographed image, the drug type identification device 100 may, for example, insert a link between the drug region cutting unit 118 and the drug identifier 120 described in FIG. The configuration includes a processing section similar to the stamp extraction section 140, the external shape extraction section 142, and the size measurement section 144. The drug type identification device 100 is an example of an “object identification device” in the present disclosure.

[Utilization phase of drug type identification device 100]
FIG. 18 is a flowchart showing the operation of the drug type identification device 100 according to this embodiment. Here, as an example of how the medicine type identification device 100 is used, a case will be described in which medicines brought by a certain patient are identified.

In step S11, the user simultaneously photographs multiple drugs to be identified. The user can take pictures of a plurality of drugs in pharmaceutically meaningful units, such as when the drug is taken at the same time. For example, the user takes out a package of medicines included in the patient's medicines from a bag, and uses the camera function of the smartphone 10 to simultaneously (collectively) photograph the plurality of medicines. The processor 102 acquires a photographed image obtained by photographing. There is a possibility that the plurality of drugs photographed at this time include drugs whose drug types can be identified and drugs whose drug types are difficult to identify. A drug whose drug type can be specified is an example of a "type-identifiable object" in the present disclosure, and a drug whose drug type is difficult to specify is an example of an "object whose type is difficult to specify" in the present disclosure.

Next, in step S12, the processor 102 uses the drug detector 114 to detect individual drugs from the captured image. The drug detector 114 detects drugs whose drug type can be identified and drugs whose drug type is difficult to identify on an individual drug basis from the captured image. The drug detector 114 estimates, for example, "the center coordinates, height, width, and rotation angle of a bounding box with rotation surrounding the drug" or "a segmentation mask that fills the drug area with the shape of the drug itself", Output the estimation results. The estimation (detection) result by the drug detector 114 is displayed on the touch panel display 14 for confirmation by the user. The processor 102 receives an input from the input unit 14B of the touch panel display 14 or the like for an instruction to modify the detection result or an instruction to approve (confirm) the detection result. If the detection results of the drug detector 114 include over-detection or detection failure (missing detection), the user inputs an instruction to correct the detection results from the input section 14B of the touch panel display 14, etc., and selects the correct area for each drug. Can be specified. Note that the detection results can be corrected for each detected area or for each drug. A drug unit is an example of an "object unit" in the present disclosure.

In step S13, the processor 102 determines whether the detection results by the drug detector 114 include over-detection or detection failure. If there is overdetection or detection failure in the detection result by the drug detector 114 and the determination result in step S13 is Yes, the processor 102 proceeds to step S14 and corrects the drug area according to the instruction received from the user. I do. For example, if an overdetected area is included, processing is performed to delete that area. Furthermore, if there is a detection failure in which a region is not detected even though there is a drug, a process such as adding a new region for the drug is performed. After step S14, the processor 102 proceeds to step S15.

Further, if the determination result in step S13 is No, that is, if there is no overdetection or detection failure in the detection result by the drug detector 114, the processor 102 proceeds to step S15. In step S15, the processor 102 uses the drug identifier 120 to identify each drug in the image. The processor 102 cuts out drug images for each detected drug, inputs each drug image into a drug identifier 120, and identifies the drug.

The processing content of step S15 will be described later using FIG. 19, but the outline is as follows. That is, for drugs whose drug type can be specified, it is expected that the drug identifier 120 will make inferences on a drug type (drug type) basis. If it is confirmed visually that the drug is correct based on the drug type identification result obtained by the drug identifier 120, the inferred result is confirmed. On the other hand, if the drug identifier 120 mistakenly infers a group of drugs whose drug type is difficult to identify, the drug type can be selected from among the drugs presented as other high-rank inference candidates, or the drug type can be selected by voice search, text search, etc. Identify.

Furthermore, for drugs whose drug type can be identified, it is expected that the drug identifier 120 will make inferences in units of groups to which the drug belongs. If it is inferred for each group as a drug that is difficult to identify, and if the inference is confirmed to be correct by visual inspection of the photographed image of the drug, the process proceeds to the drug type identification flow defined for that group. For example, if it is a group of "capsule medicines", you can visually check the character symbols attached to the capsules on the photographed image, input the text of the characters and/or symbols by text or voice, and search the engraving text master database. Then, by comparing the drug to be identified with the master data, the process moves to a drug type identification flow in which the drug type is specified.

On the other hand, if the inference result of the drug identifier 120 for a drug whose drug type is difficult to identify is wrong, the user appropriately corrects the inference result and proceeds to the appropriate drug type identification flow (see FIG. 19).

In step S16, the processor 102 determines whether the drug types for all drugs in the image have been determined. If the determination result in step S16 is No, the processor 102 returns to step S15, changes the drug to be identified, and continues the process. If the determination result in step S16 is Yes, the processor 102 ends the flowchart of FIG.

FIG. 19 is a flowchart illustrating an example of loop processing applied to steps S15 and S16 in FIG. 18.

When the process in step S15 is started, in step S21, the processor 102 uses the drug identifier 120 to identify the drug, and determines whether the identification result is a drug belonging to a drug whose drug type can be specified. . The identification result by the drug identifier 120 is provided to the user through the touch panel display 14. The processor 102 receives input of various instructions from the input unit 14B of the touch panel display 14, such as an instruction to confirm the drug type, an instruction to correct the identification result, or an instruction to proceed to other processing such as text search. The user confirms the presented identification results and inputs an instruction to confirm the drug type from the input section of the touch panel display 14, or inputs an instruction to modify the identification results or move to an engraved text search. I can do it.

If the determination result in step S21 is Yes, the processor 102 proceeds to step S22. In step S22, the processor 102 determines whether the result of the medicine identifier 120 that the medicine can be identified is correct. If the determination result in step S22 is Yes, the processor 102 proceeds to step S23 and determines whether the drug type specified (inferred) by the drug identifier 120 is correct. If the user can visually confirm that the drug type is correct, the user can input an instruction to confirm the drug type.

If the determination result in step S23 is Yes, the processor 102 proceeds to step S29 and determines the drug type.

If the determination result in step S23 is No, the processor 102 proceeds to step S28. In step S28, the processor 102 executes a non-machine learning-based drug type identification flow using stamped text or the like. After step S28, the processor 102 proceeds to step S29.

If the determination result in step S21 is No, the processor 102 proceeds to step S24. In step S24, the processor 102 determines whether or not the result of the drug being difficult to identify by drug type identified by the drug identifier 120 is correct. If the determination result in step S24 is No, the processor 102 proceeds to step S28.

If the determination result in step S24 is Yes, the processor 102 proceeds to step S25, and determines whether the type of group to which the difficult-to-identify drug belongs, which is the identification result identified (inferred) by the drug identifier 120, is correct. do. If the determination result in step S25 is No, or if the determination result in step S22 is No, the processor 102 proceeds to step S26. In step S26, the processor 102 identifies the group of difficult-to-identify drugs to which the drug belongs. In step S26, the processor 102 receives an input from the input unit 14B of the touch panel display 14 or the like to specify the type of group to which the drug belongs, and specifies the group according to the received instruction.

After step S26, the processor 102 proceeds to step S27. Further, if the determination result in step S25 is Yes, the processor 102 proceeds to step S27. In step S27, the processor 102 executes the drug type identification flow defined for the group of drugs that are difficult to identify. After step S27, the processor 102 proceeds to step S29 and determines the drug type.

Once the drug types for all drugs included in the image have been determined, the loop processing in FIG. 19 ends. The drug identification method described using the flowcharts of FIGS. 18 and 19 is an example of the "object identification method" in the present disclosure.

[Example of GUI (Graphical User Interface) of drug type identification device 100]
[Detection result display GUI]
FIG. 20 is a diagram showing an example of a screen displayed on the touch panel display 14. FIG. 20 shows an example of a detection result display GUI that displays the detection results of the drug area detected by the drug detector 114. The screen SC1 provided by the drug type identification device 100 is roughly divided into three areas: an entire image display area EDA, a candidate display area CDA, and a button display area BDA. A standardized image generated from the acquired captured image is displayed on the entire image display section EDA. Note that the entire captured image including the marker may be displayed on the entire image display section EDA. The entire image display section EDA is an example of a "photographed image display section" in the present disclosure.

The entire image display section EDA can enlarge or reduce a portion of the image by pinching out or pinching in. Therefore, if the characters and symbols attached to drugs are too small to be seen, each drug can be enlarged and displayed.

The detection result by the drug detector 114 is displayed on the entire image display section EDA. After photographing, the area of each drug detected by the drug detector 114 is displayed in a rectangular frame (bounding box). In FIG. 20, an example of a captured image including three drugs DR1, DR2, and DR3 is shown. Frames BX1 and BX2 are displayed for drugs DR1 and DR2 detected by drug detector 114, respectively. In the example shown in FIG. 20, detection has failed for drug DR3, and the frame for drug DR3 is not displayed. Furthermore, in the example shown in FIG. 20, there is overdetection in the region Abs where no drug is present, and a frame BX3 based on erroneous detection is displayed.

Each area surrounded by frames BX1, BX2, and BX3 can be selected by the user, and for example, the selected area is displayed with a red frame, and the unselected area is displayed with a blue frame, changing the color of the frame line. Of course, another color may be used. It is preferable that the frames BX1, BX2, and BX3 indicating the regions are displayed with different border colors depending on the selected/unselected state of the region.

A check box CB is attached to each frame BX1, BX2, and BX3. The check box CB may be marked or left blank depending on the status, such as "drug type confirmed" or "undetermined state."

At the bottom of the entire image display section EDA, a region editing switch SW1, a region addition button BT2, and a region deletion button BT3 are displayed. The region editing switch SW1 is a switch operated when it is desired to modify the detection result of the drug region. When the area editing switch SW1 is slid to the right of the screen, the drug area can be edited on the screen of the entire image display section EDA. When the area editing switch SW1 is turned on, the area addition button BT2 and the area deletion button BT3 become pressable, and the screen shifts to the area correction GUI. When the area editing switch SW1 is slid to the left of the screen, the area becomes unable to be edited. If editing is not possible, the area addition button BT2 and area deletion button BT3 are grayed out.

Note that turning on/off the editability of a region is not limited to the region editing switch SW1 illustrated in FIG. 20, and may be implemented by, for example, a long press or a double tap on the entire image display section EDA. When the region addition button BT2 is pressed, a state is created in which a region can be specified on the entire image display section EDA and a drug region can be added. Further, the area deletion button BT3 is used when deleting an over-detected area.

When you tap and select each area with area editing turned off, the drug image cut out in that area is subjected to identification processing using the drug discriminator 120, and the candidate drug is displayed in the candidate display area CDA below. Display information. Additionally, the dimensions of the major axis and minor axis are displayed for each area. The dimension display may be turned on/off using a separate switch or the like. A candidate drug is an example of a "candidate object" in this disclosure.

The candidate display section CDA is an area that mainly displays drug information of candidate drugs. The candidate display section CDA displays drug information of candidate drugs based on the identification results estimated by the drug identifier 120. The drug information of the candidate drug includes, for example, a master image of the drug, an identification code of the drug, a drug surface, a drug efficacy classification, and information on a group to which the drug belongs. In FIG. 20, an example of drug information of a candidate drug for drug DR1 is shown. Note that it is also possible to enlarge/reduce the image in the candidate display section CDA by pinching out or pinching in.

The display field that displays group information is configured as a selection box SLB1. Further, an engraved text search button BT4 is displayed in the candidate display area CDA.

Various buttons are arranged in the button display area BDA, including, for example, a drug type confirmation button BT5, a drug type confirmation hold button BT6, a reshoot button BT7, and a completion button BT8. These buttons can change to a pressable state or a non-pressable state depending on the situation. The re-photographing button BT7 is used to perform re-photographing by pressing the button when there is a problem with the photographed image, such as when the photographed image is blurred.

[Area editing GUI]
FIG. 21 is an example of a screen display when editing a drug area. As illustrated in FIG. 20, when overdetection and/or detection failure is confirmed in the detection results of the drug detector 114, the user slides the area editing switch SW1 to the right of the screen to enable area editing. do.

In this state, for example, if the area of the drug DR1 is selected, the position, shape, rotation angle, etc. of the selected area can be adjusted by dragging, tapping, etc. For example, as shown in Fig. 22, you can enlarge or reduce the area by dragging the sides of frame BX, move the area in parallel by dragging the area, or pinch the area with two fingers in the rotation direction to enlarge or reduce the area. can be rotated. Furthermore, by dragging the corner portion of the frame BX, the area can be expanded or contracted while keeping the aspect ratio constant.

In the case of the screen SC2 shown in FIG. 21, since the area Abs where no drug is present is extracted (detected as a drug) due to overdetection, the user selects the area Abs indicated by the frame BX3 and clicks the area delete button. By pressing BT3, the area Abs can be deleted. Note that the "area deletion" function is not limited to the area deletion button BT3; deletion can also be performed by long-pressing the relevant area to display a pull-down menu (not shown), and selecting "area deletion" from the pull-down menu. A GUI may be implemented so that the

In addition, in the case of the example shown in FIG. 21, regarding the drug DR3, although the drug DR3 is present in the photographed image, the region is not detected, so the user presses the region addition button BT2, selects the region, and adds the region. can. To add an area, for example, tap any point, drag it to the bottom right to create an additional rectangular area, and use the method explained in Figure 22 to change the rectangle's width, height, rotation angle, etc. Adjust.

Alternatively, press and hold a point on the screen of the entire image display section EDA to display the default rectangle with an undefined area, add an area, move this rectangle in parallel to an appropriate position, change the width and height of the rectangle, and rotate it. It may be implemented to correct by adjusting the angle.

[GUI for displaying detection results with area confirmed]
FIG. 23 is an example of a screen display in a state where the identification of the drug has been determined. As shown in the screen SC3 shown in FIG. 23, a check mark indicating the status of "drug type confirmed" is entered in the check box CB for each of the frames BX1 and BX4 for drugs DR1 and DR3 whose drug type has been confirmed. By long-pressing the check box CB, a pull-down menu is displayed, and the status can be changed from the pull-down menu. For example, the status can be changed from "drug type confirmed" to "undetermined state."

Additionally, for drugs that are difficult to identify, the drug type may not necessarily be identified, so the status may be "drug type determination pending." For example, an "x" mark indicating the status of "drug type determination pending" may be displayed in the check box CB of the drug DR2 for which determination of the drug type is pending. Note that in the case of an "unconfirmed state" in which neither the drug type has been determined nor the drug type determination pending, nothing is displayed in the check box CB (see FIG. 22). The default of check box CB is blank indicating "undefined state".

After determining the identification status of all drugs, the user presses the completion button BT8 to complete the identification work for the main photographed image.

[Identification result confirmation GUI (for drugs whose drug type can be identified)]
FIG. 24 is an example of a screen display of the identification result confirmation GUI. FIG. 24 shows an example of a GUI that displays drug information about drugs belonging to the group ``Drugs that can be identified by drug type.'' Although the illustration of the entire image display section EDA is omitted in FIG. 24, the entire image display section EDA exists above the candidate display section CDA of the screen SC4 shown in FIG. 24 (see FIG. 23).

The candidate display section CDA displays the front and back sides of the master image of the drug, the drug code, the drug name, and the size ( Drug information such as major axis and minor axis dimensions is displayed. Furthermore, the type of group to which the selected drug belongs is displayed at the bottom of the drug information display section where the drug information is displayed. The type of group is displayed in the selection box SLB1.

When the right arrow button BT9 at the right end of the screen SC4 is pressed or swiped to the left, candidate drugs with lower scores are displayed in order on the drug information display section of the candidate display section CDA. Furthermore, when the left arrow button BT10 at the left end of the screen is pressed or swiped right, candidate drugs with higher scores are displayed in order on the drug information display section of the candidate display section CDA.

Furthermore, if the inferred group is incorrect, the pull-down menu PDM can be displayed by pressing the pull-down button of the selection box SLB1, the correct group can be selected from the pull-down menu PDM, and the transition can be made to a GUI specific to each group. can.

When performing a search using a text-based method that does not rely on drug type estimation by the drug identifier 120, such as when the correct drug is not found among the candidate drugs output by the drug identifier 120, the stamped text search button BT4 is pressed. By doing so, the screen transitions to the engraved text search GUI.

The drug type confirmation button BT5 is pressed to confirm the drug type with the drug displayed or selected in the candidate display area CDA. When the drug type confirmation button BT5 is pressed, the drug type is confirmed and a check mark is displayed in the check box CB of the corresponding drug on the entire image display section EDA. The drug type confirmation hold button BT6 is pressed when the discrimination of the selected drug is completed without determining the drug type. When the drug type confirmation hold button BT6 is pressed, an "x" mark is displayed in the check box CB of the corresponding drug on the entire image display section EDA.

When the arrow button BT11 at the top right of the candidate display section CDA is pressed, the screen transitions to the candidate list display GUI (see FIG. 25).

[Candidate list display GUI]
FIG. 25 is an example of a screen display of the candidate list display GUI. The candidate list display GUI is a GUI that displays a list of drugs with high scores based on inference by the drug identifier 120 when the selected drug belongs to the group of "drugs that can be identified by drug type." As shown in FIG. 25, since a plurality of candidate drugs are displayed in a list, comparative study becomes possible. Although the illustration of the entire image display section EDA and the button display section BDA is omitted in FIG. 25, the entire image display section EDA is present on the screen SC5 of the candidate display section CDA shown in FIG. A button display section BDA exists below the screen of the candidate display section CDA (see FIG. 23). The same applies to each of the figures in FIGS. 26 to 29.

FIG. 25 shows an example in which drug information about the four drugs with the highest scores is displayed in a list. By operating the knob of the scroll bar SRB at the right end of the screen SC5 downward or by swiping the screen, lower-level candidates can also be displayed. The drug type can be confirmed by selecting any drug on the candidate image list screen SC5 and pressing the "drug type confirmation button BT5" on the button display section BDA.

The display order in the list display may be determined by the score of the inference by the drug discriminator 120, or by, for example, searching the stamp text database for drugs that are similar to the stamp text of the drug with the highest score by the drug discriminator 120. They may be displayed in order of similarity score.

Furthermore, the display format of the list display is not limited to the display format in which the items are arranged vertically in a single column as shown in FIG. 25, but may be displayed in multiple rows and multiple columns by using a wider screen.

In the candidate image list, the drug information for each drug includes the master image (front and back), text information indicating the drug name, and numerical size information, as well as stamp text information registered in the stamp text database ( It is preferable to display master character information). In FIG. 25, "master text information" is used for convenience of illustration, but in actual screen display, stamp text information registered for each drug will be displayed.

When the arrow button BT11 at the top right of the candidate display area CDA is pressed, the screen returns to the identification result confirmation GUI (see FIG. 24).

[Engraving text search GUI]
FIG. 26 is an example of a screen display of the engraved text GUI. The stamped text GUI is a GUI that displays a list of drugs with text similar to the text input in the search box SBX1 when the selected drug belongs to the group of "drugs that can be identified by drug type." When the engraved text search button BT4 is pressed on the screen of FIG. 24 or 25, a engraved text GUI screen SC6 including a search box SBX1 is displayed as shown in FIG.

The user can search by engraved characters by inputting text into the search box SBX1 using a keyboard or voice input and pressing the search button SBT1. The search results are displayed in a list as in FIG. 25. The display format for displaying a list of candidate drugs is also the same as that in FIG. 25. The similarity score may be determined every time a character is input into the search box SBX1, and the candidate list display may be updated immediately.

[Capsule search GUI]
FIG. 27 is an example of a screen display of the capsule search GUI. The capsule search GUI is a GUI displayed on the candidate display section CDA when the medicine identifier 120 identifies "capsule medicine" or when "capsule" is selected from the pull-down menu PDM of the group selection box SLB1. It is.

The capsule search screen SC7 includes a text search box SBX2, a size search box SZB, capsule color selection boxes SLB2 and SLB3, a font color selection box SLB4, a capsule image display area SRD, and a group selection box SLB1. , capsule list button BT13.

By entering text into the text search box SBX2 using a keyboard or voice input and pressing the search button SBT2, you can search using the characters printed on the capsule. You can also search by text of the drug name.

The size search box SZB includes an input box IB1 for inputting a numerical value of the major axis, an input box IB2 for inputting a numerical value for the minor axis, and a search button BT12. By inputting numerical values into the input boxes IB1 and IB2 by keyboard or voice input and pressing the search button BT12, it is possible to search by the size of the capsule drug.

The search results are displayed on the capsule image display section SRD. In the case of text search, the front and back sides of the master image are displayed from the top in descending order of text similarity score. The similarity score may be determined every time a character is input into the text search box SBX2, and the candidate list display may be updated immediately. In capsule search, text search is limited to capsules. On the other hand, when searching by size, the front and back sides of the master image are displayed from the top in descending order of degree of matching with the input length and breadth values. In this case as well, the search range is limited to capsules.

Furthermore, regarding the display order of master images of capsule medicines to be displayed on the capsule image display section SRD, past discrimination history data may be referred to, and capsules with a discrimination history may be displayed preferentially. By operating the knob of the scroll bar SRB at the right end of the screen SC7 downward or by swiping the screen SC7, lower-level candidates can also be displayed.

The form of candidate display is not limited to the form in which candidates are displayed in a single column vertically as shown in FIG. 27, but may be displayed in multiple rows and in multiple columns.

When the capsule list button BT13 at the bottom right of the screen SC7 is pressed, the capsule image display section SRD expands to the right, and more capsule drug candidates are displayed as a list at once. In this case, the capsule list button BT13 is replaced with a "back button" (not shown), and when the back button is pressed, the screen returns to the original capsule candidate screen (FIG. 27).

The drug type can be determined by selecting one of the capsules from among the capsule images displayed on the capsule image display section SRD and pressing the "drug type confirmation button BT5" on the button display section BDA.

Capsule color selection boxes SLB2 and SLB3 and text color selection box SLB4 may be used to narrow down the candidates by specifying the capsule color and printed text color. Alternatively, the capsule color may be automatically recognized from the photographed image, the candidates may be narrowed down in advance, and the selection boxes SLB2 and SLB3 may be displayed by default. In the capsule candidate image list, the drug information for each drug includes the master image (front and back), text information indicating the drug name, numerical information about the size, and the stamp text registered in the stamp text database. It is preferable to display information (master character information).

[Plain drug search GUI]
FIG. 28 is an example of a screen display of the plain drug search GUI. The plain drug search GUI is displayed in the candidate display area CDA when the drug is identified as "plain drug" by the drug identifier 120 or when "plain drug" is selected in the pull-down menu PDM of the group selection box SLB1. This is a GUI.

The plain drug search screen SC8 includes a color selection box SLB5, a plain drug image display section SRD2, a group selection box SLB1, and a plain drug list button BT14.

The plain drug image display section SRD2 displays the front and back sides of the master image from top to bottom in order of degree of agreement with the values of the major axis and minor axis of the drug selected in the entire image display section EDA. In this case, the search range is limited to plain tablets.

Regarding the display order of master images of plain drugs to be displayed on the plain drug image display section SRD2, past discrimination history data may be referred to, and plain drugs with a discrimination history may be displayed preferentially. By operating the slider bar knob at the right end of the screen downward or by swiping the screen, lower-level candidates can also be displayed.

When the plain drug list button BT14 at the bottom right of the screen SC8 is pressed, the plain drug image display section SRD2 expands to the right, and more plain drug candidates are displayed as a list at once. In this case, the plain drug list button BT14 is replaced with a "back button" (not shown), and when the back button is pressed, the screen returns to the original plain drug candidate screen (FIG. 28).

A color selection box SLB5 may be used to specify the color of the plain drug to narrow down the candidates. Alternatively, the color of the plain drug may be automatically recognized from the photographed image, the candidates narrowed down in advance, and the selection box SLB5 may be displayed by default.

The user can confirm the drug type by selecting any capsule from among the capsule images displayed on the capsule image display section SRD and pressing the drug type confirmation button BT5 on the button display section BDA.

[Divided tablet search GUI]
FIG. 29 is an example of a screen display of the divided tablet search GUI. The split tablet search GUI is displayed in the candidate display area CDA when the drug is identified as "split tablet" by the drug identifier 120, or when "split tablet" is selected from the pull-down menu of the group selection box SLB1. It is a GUI.

The divided tablet search screen SC9 includes a divided tablet image display section LS1, a divided tablet call button BT15, a search box SBX3, a candidate drug image display section LS2, and a group selection box SLB1.

The divided tablet image display section LS1 displays a list of the divided tablet images selected in the entire image display section EDA and the divided tablet images called up by pressing the divided tablet call button BT15. For example, the divided tablet image selected in the entire image display section EDA is displayed at the top of the list display in the divided tablet image display section LS1. The divided tablet image display section LS1 clearly distinguishes between a display section that displays the divided tablet image called up by pressing the divided tablet call button BT15 and a display section that displays the divided tablet image selected in the entire image display section EDA. It may be configured to display the information.

When the user presses the divided tablet call button BT15, one or more divided tablet images can be selected from the list of previously registered divided tablet photographed images and added to the list displayed in the divided tablet image display section LS1. As a method for registering divided tablet images, the following implementation may be considered, for example. That is, when the drug group selected on the entire image display section EDA is in the state of "divided tablets", if you press and hold the selected area on the entire image display section EDA, a pull-down menu will be displayed, and from this pull-down menu, you can select " ``Divided tablet image registration'' item can be selected. When "Register divided tablet image" is selected, the corresponding divided tablet image is stored in the shared section, and the image can be called by pressing the divided tablet call button BT15 on this screen. The shared unit is a part of the storage area of the storage device 104, and is a storage area in which data and the like that are shared in the processing in the drug type identification device 100 beyond the work unit of discrimination are stored. The method for registering divided tablet images is not limited to this example, and other methods may be used.

You can search by engraved characters by entering text into the search box SBX3 using a keyboard or voice input and pressing the search button SBT3 at the right end.

The candidate drug image display section LS2 displays the front and back sides of the master image in descending order of text similarity score from top to bottom. The similarity score may be determined every time one character is input into the search box SBX3, and the candidate list display may be updated immediately. Regarding the display order of master images of drugs to be displayed on the candidate drug image display section LS2, past discrimination history data may be referred to, and drugs corresponding to divided tablets with a discrimination history may be displayed preferentially. By operating the slider bar knob at the right end of the screen downward or by swiping the screen, lower-level candidates can also be displayed.

The user can confirm the drug type by selecting any drug from among the drugs displayed on the candidate drug image display section LS2 and pressing the "drug type confirmation button BT5" on the button display section BDA.

In the list of candidate drug images in FIG. 29, drug information for each drug includes the master image (front and back), text information indicating the drug name, and numerical size information, which are registered in the stamp text database. It is preferable to display stamped text information (master character information).

Although not shown in FIG. 29, a color selection box may be used to specify the color of the drug to narrow down the candidates. In this case, the color of the divided tablet may be automatically recognized from the photographed image, the candidates narrowed down in advance, and the color selection box may be displayed by default.

[Improvements to improve usability in each search GUI]
In the search screen illustrated using FIGS. 20 to 29, multiple icons representing typical drug shapes are further arranged, and when the user selects one or more icons, only drugs with the corresponding shape are displayed. It may also be displayed on the candidate screen.

FIG. 30 shows an example of an icon representing the shape of a typical drug. Typical shapes for agents can be circular, oval, oval, pentagonal, and hexagonal. As shown in FIG. 30, graphic icons corresponding to each of these typical shapes and an "other" button may be arranged on the search screen, and the designation of the shape may be accepted by selecting the icon.

Regarding color selection, the configuration is not limited to selection from a pull-down menu, and color options may be provided using icons.

FIG. 31 shows an example of icons used for color selection. FIG. 31 shows an example in which, from the left, icons of white, yellow, orange, brown, red, blue, green, and transparent colors and an "other" button are arranged. Of course, the arrangement order of colors, the types of colors displayed as icons, and the number of colors can be designed as appropriate.

A configuration may also be adopted in which icons corresponding to each color and an "other" button are arranged on the search screen, and the designation of the color is accepted by selecting the icon. When the user selects one or more icons, only drugs of the corresponding color may be displayed on the candidate screen.

[Example of a configuration that performs group-specific drug type identification processing]
FIG. 32 is a block diagram illustrating an example of a functional configuration for identifying the type of drug that is difficult to identify in the drug type identification device 100 according to the embodiment. The drug type identification device 100 includes a drug type identification processing control section 160 that controls processing content based on information estimated by the drug identifier 120, a drug type estimation result presentation processing section 170, a capsule drug identification processing section 172, and a plain drug identification processing section 170. It includes a specific processing section 174 and a divided tablet specific processing section 176.

The drug identifier 120 outputs drug type estimation information in which the type of the drug is estimated when the drug to be identified is a drug whose drug type can be specified, and if the drug to be identified is a drug whose drug type is difficult to identify, it estimates the group to which the drug belongs. Output group estimation information. The drug type identification processing control unit 160 acquires the estimated information output from the drug identifier 120, and distributes subsequent processing according to the estimated information.

When the drug type identification processing control unit 160 acquires the drug type estimation information from the drug identifier 120, it causes the drug type estimation result presentation processing unit 170 to execute the process. The drug type estimation result presentation processing section 170 performs a process of displaying drug information of a candidate drug on the candidate display section CDA based on the drug estimation information estimated by the drug discriminator 120. Through the processing of the drug type estimation result presentation processing unit 170, the screen display described with reference to FIGS. 24 and 25 is realized. The drug type estimation result presentation processing section 170 cooperates with the text search section 122 and can transition to the process of searching for stamped text by pressing the stamped text search button BT4 (see FIG. 26).

The drug type identification processing control section 160 includes a group discrimination section 162. The group determining unit 162 determines the label of the estimated group when the estimated information output from the drug discriminator 120 is group estimated information. Here, an example will be shown in which it is determined which group it belongs to among "capsule drugs," "plain drugs," and "divided tablets."

If the group estimation information acquired from the drug identifier 120 indicates a group of "capsule drugs," the drug type identification processing control section 160 causes the capsule drug identification processing section 172 to execute the process. The capsule drug identification processing unit 172 performs processing to provide a capsule search GUI (see FIG. 27) for supporting identification of the drug type of a capsule drug. The capsule drug identification processing section 172 includes a capsule drug search section 173. The capsule drug search unit 173 receives input of search conditions, executes a search process based on the received search conditions, and outputs search results.

If the group estimation information acquired from the drug identifier 120 indicates a "plain drug" group, the drug type identification processing control unit 160 causes the plain drug identification processing unit 174 to execute the process. The plain drug identification processing unit 174 performs processing to provide a plain drug search GUI (see FIG. 28) for supporting identification of drug types for plain drugs. The plain drug identification processing section 174 includes a plain drug search section 175. The plain drug search unit 175 receives input of search conditions, executes a search process based on the received search conditions, and outputs search results.

If the group estimation information obtained from the drug identifier 120 indicates a group of "split medicines", the drug type identification processing control unit 160 causes the split tablet identification processing unit 176 to execute the processing. The split tablet identification processing unit 176 performs processing to provide a split tablet search GUI (see FIG. 29) for supporting identification of drug types for split tablets. The split tablet identification processing unit 176 includes a split tablet search unit 177 and a split tablet image registration unit 178. Divided tablet search unit 177 receives input of search conditions, executes search processing based on the received search conditions, and outputs search results. The divided tablet image registration unit 178 is a processing unit that performs the registration process and readout process of the divided tablet image described in FIG. 29.

Note that when the group labels are defined in a hierarchical structure, each of the capsule drug search section 173, the plain drug search section 175, and the divided tablet search section 177 narrows down the search conditions based on the estimated group labels. be able to.

The process including the provision of a group-specific search GUI executed by each of the capsule medicine identification processing unit 172, the plain medicine identification processing unit 174, and the divided tablet identification processing unit 176 is a process that leads to identification of the type of medicine in each group. This is an example of (group-specific processing).

<Photography assistance device>
FIG. 33 is a top view of the photographing auxiliary device 70 for photographing a photographed image to be input to the drug type identification device 100. Further, FIG. 34 is a sectional view taken along line 34-34 in FIG. 33. FIG. 34 also shows the smartphone 10 that takes an image of the drug using the photographing auxiliary device 70.

As shown in FIGS. 33 and 34, the photographing auxiliary device 70 includes a housing 72, a medicine table 74, a main light source 75, and an auxiliary light source 78. In FIG. 33, the shape is shown based on a square, but the housing 72, the medicine table 74, and the auxiliary light source 78 may have a rectangular shape.

The casing 72 includes a square bottom plate 72A supported horizontally, and four

rectangular side plates

72B, 72C, 72D, and 72E vertically fixed to the ends of each side of the bottom plate 72A. configured.

The drug mounting table 74 is fixed to the upper surface of the bottom plate 72A of the housing 72. The drug placement table 74 is a member having a surface on which a drug is placed, and here, it is a thin plate-like member made of plastic or paper that is square in top view, and the placement surface on which the drug to be identified is placed. It has a standard gray color. The standard gray color is expressed by 256 gradation values from 0 (black) to 255 (white), for example, in the range of 130 to 220, and more preferably in the range of 150 to 190. It is a gradation value.

Generally, when a drug is photographed with a smartphone 10 against a white background or a black background, the colors may be washed out due to the automatic exposure adjustment function, and sufficient marking information may not be obtained. According to the drug placement table 74, since the placement surface is gray in color, the details of the markings can be captured without color scattering. Further, by acquiring the gray pixel values appearing in the photographed image and correcting them to the true gray gradation values, it is possible to realize tone correction or exposure correction of the photographed image.

Reference markers

74A, 74B, 74C, and 74D made of black and white are placed at the four corners of the placement surface of the drug placement table 74 by pasting or printing. Although any

reference markers

74A, 74B, 74C, and 74D may be used, simple circular markers are used here because they have high detection robustness.

The

reference markers

74A, 74B, 74C, and 74D preferably have a size of 3 to 30 mm in the vertical and horizontal directions, and more preferably a size of 5 to 15 mm.

Further, the distance between the reference marker 74A and the reference marker 74B and the distance between the reference marker 74A and the reference marker 74D are each preferably 20 to 100 mm, more preferably 20 to 60 mm.

Furthermore, as shown in FIG. 33, the four

reference markers

74A, 74B, 74C, and 74D may be connected with a straight line to clearly define a rectangular drug placement range. Although FIG. 33 shows an example in which four

reference markers

74A, 74B, 74C, and 74D are arranged at the vertices of a square, the arrangement form of the

reference markers

74A, 74B, 74C, and 74D is not limited to the example in FIG. 33. . For example, four

fiducial markers

74A, 74B, 74C, and 74D may be placed at the vertices of a rectangle.

The drug to be photographed is placed inside a rectangular area (drug placement range) with four

reference markers

74A, 74B, 74C, and 74D as vertices. FIG. 32 shows an example in which five drugs T1, T2, T3, T4, and T5 are imaged.

Note that the bottom plate 72A of the casing 72 may also serve as the drug placement table 74.

The main light source 75 and the auxiliary light source 78 constitute an illumination device used to capture an image of the drug to be identified. The main light source 75 is used to extract the stamp of the drug to be identified. Auxiliary light source 78 is used to accurately bring out the color and shape of the drug to be identified. The photographing auxiliary device 70 does not need to include the auxiliary light source 78.

FIG. 35 is a top view of the photographing auxiliary device 70 with the auxiliary light source 78 removed.

The main light source 75 is composed of a plurality of LEDs 76. The LEDs 76 are white light sources each having a light emitting portion within 10 mm in diameter. Here, six LEDs 76 are arranged horizontally at a constant height on each of four

rectangular side plates

72B, 72C, 72D, and 72E. Thereby, the main light source 75 irradiates the identification target drug with illumination light from at least four directions. Note that the main light source 75 only needs to be able to irradiate the identification target drug with illumination light from at least two directions.

The angle θ formed by the irradiation light emitted by the LED 76 and the upper surface (horizontal surface) of the drug to be identified is preferably within the range of 10° to 20° in order to extract the stamp. Note that the main light source 75 may be composed of rod-shaped light sources each having a width of 10 mm or less and arranged horizontally on each of the four

rectangular side plates

72B, 72C, 72D, and 72E.

The main light source 75 may be lit all the time. Thereby, the photographing auxiliary device 70 can irradiate the identification target drug with illumination light from all directions. An image taken with all the LEDs 76 turned on is called a fully illuminated image. According to the full illumination image, it becomes easy to extract the print of the identification target drug to which the print is added.

In the main light source 75, the LED 76 may be turned on and off depending on the timing, or the LED 76 may be turned on and off by a switch (not shown). Thereby, the photographic assisting device 70 can irradiate the identification target drug with illumination light from a plurality of different directions using the plurality of main light sources 75.

For example, an image taken with only the six LEDs 76 provided on the side plate 72B lit is called a partially illuminated image. Similarly, a partial illumination image photographed with only the six LEDs 76 provided on the side plate 72C lit, and a portion photographed with only the six LEDs 76 provided on the side plate 72D lit. By photographing the illumination image and the partial illumination image photographed with only the six LEDs 76 provided on the side plate 72E turned on, four partial illumination images each illuminated with illumination light from different directions can be obtained. can be obtained. According to a plurality of partial illumination images in which irradiation light is irradiated from a plurality of different directions, it becomes easy to extract the stamp of the identification target drug to which the stamp has been added.

The auxiliary light source 78 is a flat white light source with a square outer shape and a square opening in the center. The auxiliary light source 78 may be an achromatic reflector that diffusely reflects the light emitted from the main light source 75. The auxiliary light source 78 is arranged between the smartphone 10 and the medicine table 74 so that the identification target medicine is uniformly irradiated with light from the photographing direction (optical axis direction of the camera). The illuminance of the irradiation light from the auxiliary light source 78 that irradiates the identification target drug is relatively lower than the illuminance of the irradiation light from the main light source 75 that irradiates the identification target drug.

FIG. 36 is a top view of a photographic assisting device 80 according to another embodiment. FIG. 37 is a sectional view taken along line 37-37 in FIG. 36. FIG. 37 also shows the smartphone 10 that takes an image of the drug using the photographing auxiliary device 80. Note that in FIGS. 36 and 37, parts common to those in FIGS. 33 and 34 are given the same reference numerals, and detailed explanation thereof will be omitted. As shown in FIGS. 36 and 37, the photographic assisting device 80 includes a housing 82, a main light source 84, and an auxiliary light source 86. The photographing auxiliary device 80 does not need to include the auxiliary light source 86.

The housing 82 has a cylindrical shape and is composed of a circular bottom plate 82A supported horizontally and a side plate 82B fixed perpendicularly to the bottom plate 82A. A drug mounting table 74 is fixed to the upper surface of the bottom plate 82A.

The main light source 84 and the auxiliary light source 86 constitute an illumination device used to capture an image of the drug to be identified. FIG. 38 is a top view of the photographing auxiliary device 80 with the auxiliary light source 86 removed.

The main light source 84 is composed of 24 LEDs 85 arranged in a ring shape at constant intervals in the horizontal direction at a constant height on the side plate 82B. The main light source 84 may be lit all the time, or the LED 85 may be switched on and off.

The auxiliary light source 86 is a flat white light source with a circular outer shape and a circular opening in the center. The auxiliary light source 86 may be an achromatic reflector that diffusely reflects the light emitted from the main light source 84. The illuminance of the irradiation light from the auxiliary light source 86 that irradiates the identification target drug is relatively lower than the illuminance of the irradiation light from the main light source 84 that irradiates the identification target drug.

The photographing auxiliary device 70 and the photographing auxiliary device 80 may include a fixing mechanism (not shown) that fixes the smartphone 10 that photographs the drug to be identified at a standard photographing distance and photographing viewpoint position. The fixing mechanism may be configured to be able to change the distance between the identification target drug and the camera according to the focal length of the photographing lens 50 of the smartphone 10.

[Lighting device]
FIG. 39 is a cross-sectional view showing the configuration of an illumination device 81 as a photographic assistance device according to another embodiment. The illumination device 81 shown in FIG. 39 has a configuration in which the bottom plate 82A and the drug mounting table 74 are removed from the configuration of the photographic assistance device 80 described in FIGS. 37 and 38. The other configuration may be the same as the configuration of the photographic assisting device 80. Note that a diffuser cover (not shown) or the like may be arranged to cover the LED 85 on the side surface.

<Example of reference marker>
FIG. 40 is a top view of the medicine mounting table 74 using the circular marker MC1. As shown in FIG. 40, circular markers MC1 are arranged at the four corners of the medicine table 74 as

reference markers

74A, 74B, 74C, and 74D. The left diagram F40A in FIG. 40 shows an example in which the centers of the

reference markers

74A, 74B, 74C, and 74D constitute four vertices of a square, and the right diagram F40B in FIG. , and 74D constitute four vertices of a rectangle. The reason for arranging four reference markers is that the coordinates of four points are required to determine a perspective transformation matrix for standardization.

Here, the four

reference markers

74A, 74B, 74C, and 74D have the same size and the same color, but may have different sizes and colors. Note that when the sizes are different, it is preferable that the centers of the

adjacent reference markers

74A, 74B, 74C, and 74D are arranged so as to constitute four vertices of a square or a rectangle. By making the reference markers different in size or color, it becomes easier to specify the shooting direction.

Furthermore, at least four circular markers MC1 may be arranged, and five or more circular markers MC1 may be arranged. When five or more circular markers MC1 are arranged, the centers of the four circular markers MC1 constitute four vertices of a square or rectangle, and the centers of the additional circular markers MC1 are arranged on the sides of the square or rectangle. It is preferable. By arranging five or more reference markers, the photographing direction can be easily identified. Additionally, by arranging five or more reference markers, even if detection of one of the reference markers fails, at least four points necessary to obtain the perspective transformation matrix for standardization can be detected simultaneously. This has the advantage of increasing the probability and reducing the time and effort required to re-photograph.

The circular marker MC1 includes an outer first perfect circle with a relatively large diameter and an inner second perfect circle that is arranged concentrically with the first perfect circle and has a relatively smaller diameter than the first perfect circle. Be prepared. That is, the first perfect circle and the second perfect circle are circles that are arranged at the same center but have different radii. Further, in the circular marker MC1, the inside of the second perfect circle is white, and the area inside the first perfect circle and outside the second perfect circle is filled in black.

The diameter of the first perfect circle is preferably 3 mm to 20 mm. Further, the diameter of the second perfect circle is preferably 0.5 mm to 5 mm. Furthermore, the ratio of the diameters of the first perfect circle and the second perfect circle (diameter of the first perfect circle/diameter of the second perfect circle) is preferably 2 to 10.

There is a possibility that the true center point of the object in the pre-standardization image and the coordinates of the center point estimated by machine learning may deviate. By giving the coordinates of the center point of the inner second perfect circle like the circular marker MC1 as training data, machine learning can accurately and easily estimate the true center coordinates of the object. Further, due to the effect of the presence of the relatively large first perfect circle on the outside, the possibility of erroneous detection due to dust or the like attached to the circular marker MC1 can be significantly reduced.

FIG. 41 is a top view of a medicine placement table 74 using a circular marker according to a modification. The left diagram F41A in FIG. 41 is a diagram showing an example in which the circular marker MC2 is used as the

reference markers

74A, 74B, 74C, and 74D. The circular marker MC2 has a cross-shaped figure made up of two white straight lines perpendicular to each other inside a black filled perfect circle so that the intersection of the two straight lines coincides with the center of the perfect circle. It is located in The

reference markers

74A, 74B, 74C, and 74D of the circular marker MC2 are arranged so that their respective centers constitute four vertices of a square, and the straight lines of the cross-shaped figure of the circular marker MC2 are connected to the sides of this square. arranged in parallel. The

reference markers

74A, 74B, 74C, and 74D of the circular marker MC2 may be arranged such that their centers constitute four vertices of a rectangle. The thickness of the line of the cross-shaped figure of the circular marker MC2 can be determined as appropriate.

On the other hand, the right diagram F40B in FIG. 41 is a diagram showing an example in which the circular marker MC3 is used as the

reference markers

74A, 74B, 74C, and 74D. The circular marker MC3 has two perfect circles with different radii placed at the same center, the inside of the inner perfect circle is white, and the area inside the outer perfect circle and outside the inner perfect circle is white. It is painted black. Further, in the circular marker MC3, a cross-shaped figure consisting of two black straight lines orthogonal to each other is arranged inside the inner perfect circle so that the intersection of the two straight lines coincides with the center of the perfect circle. There is. The

reference markers

74A, 74B, 74C, and 74D of the circular marker MC3 are arranged so that their centers constitute four vertices of a square, and the straight lines of the cross-shaped figure of the circular marker MC3 are connected to the sides of this square. arranged in parallel. The

reference markers

74A, 74B, 74C, and 74D of the circular marker MC3 may be arranged such that their centers constitute four vertices of a rectangle. The thickness of the line of the cross-shaped figure of the circular marker MC3 can be determined as appropriate.

According to the circular markers MC2 and MC3, it is possible to improve the estimation accuracy of the center point coordinates. Further, since the circular markers MC2 and MC3 have a different appearance from the medicine, the markers can be easily recognized.

FIG. 42 is a diagram showing a specific example of a reference marker having a rectangular outer shape. The left diagram F42A in FIG. 41 shows the rectangular marker MS1. The rectangular marker MS1 includes an outer square SQ1 with a relatively large side length and an inner square SQ2 that is arranged concentrically with the square SQ1 and has a relatively smaller side length than the square SQ1. That is, the squares SQ1 and SQ2 are quadrilaterals arranged at the same center (center of gravity) but having different lengths on one side. In addition, in the rectangular marker MS1, the inside of the square SQ2 is white, and the area inside the square SQ1 and outside the square SQ2 is filled in black.

The length of one side of the square SQ1 is preferably 3 mm to 20 mm. Further, the length of one side of the square SQ2 is preferably 0.5 mm to 5 mm. Furthermore, the ratio of the lengths of one side of square SQ1 and square SQ2 (length of one side of square SQ1/length of one side of square SQ2) is preferably from 2 to 10. In order to further improve the accuracy of estimating the center coordinates, a black rectangle (for example, a square), not shown, whose side length is relatively smaller than that of the square SQ2 may be concentrically arranged inside the square SQ2.

The right view F42B of FIG. 42 shows a top view of the medicine mounting table 74 using the rectangular square marker MS1. As shown in the right figure F41B, on the medicine table 74, rectangular markers MS1 are arranged at the four corners as

reference markers

74A, 74B, 74C, and 74D. Here, an example is shown in which a line connecting the centers of

adjacent reference markers

74A, 74B, 74C, and 74D forms a square. It may also be rectangular.

FIG. 43 is a top view of a medicine placement table 74 using a circular marker according to another modification. The left diagram F43A in FIG. 42 is a diagram showing an example in which circular marker MC4 is used as the

reference markers

74A, 74B, 74C, and 74D. The circular marker MC4 includes an outer first perfect circle with a relatively large diameter, an inner second perfect circle that is arranged concentrically with the first perfect circle and has a relatively smaller diameter than the first perfect circle, and A black circle (third perfect circle) that is arranged concentrically with the second perfect circle inside the second perfect circle and has a diameter that is relatively smaller than the second perfect circle. In the circular marker MC4, the inside of the second perfect circle is white, and the area inside the first perfect circle and outside the second perfect circle is filled in black.

The diameter of the first perfect circle is preferably 3 mm to 20 mm. Further, the diameter of the second perfect circle is preferably 5 mm to 18 mm. The diameter of the third perfect circle is preferably 0.5 mm to 5 mm. Furthermore, the ratio of the diameters of the first perfect circle and the second perfect circle (diameter of the first perfect circle/diameter of the second perfect circle) is preferably 1.1 to 3. Furthermore, the ratio of the diameters of the second perfect circle and the third perfect circle (diameter of the second perfect circle/diameter of the third perfect circle) is preferably 2 to 10.

The left diagram F43A in FIG. 43 shows an example in which the centers of the

reference markers

74A, 74B, 74C, and 74D constitute four vertices of a square, and the right diagram F43B in FIG. , and 74D constitute four vertices of a rectangle. A straight line connecting the four circular markers MC4 in the vertical and horizontal directions is displayed. According to the circular marker MC4, it is possible to improve the estimation accuracy of the center point coordinates. In addition, by connecting the circular markers MC4 with straight lines, it becomes easier to recognize the marker and drug placement range, and the straight lines appearing in the captured image can be used to correct distortion of the captured image, specify the cropping range of the standardized image, etc. It is also possible to deal with Note that these straight lines may not be provided.

FIG. 44 is a diagram showing another specific example of a rectangular reference marker. The left diagram F44A in FIG. 44 shows the rectangular marker MS2. The quadrilateral marker MS2 includes an outer square SQ1 whose side length is relatively large, an inner square SQ2 which is arranged concentrically with the square SQ1 and whose side length is relatively smaller than the square SQ1, and a square SQ2. A black square SQ3, which is arranged concentrically with the square SQ2 inside the square SQ2 and whose length on one side is relatively smaller than the square SQ2. The length of one side of the square SQ1 is preferably 3 mm to 20 mm. Further, the length of one side of the square SQ2 is preferably 5 mm to 18 mm. The length of one side of the square SQ3 is preferably 0.5 mm to 5 mm. Furthermore, the ratio of the lengths of one side of square SQ1 and square SQ2 (length of one side of square SQ1/length of one side of square SQ2) is preferably 1.1 to 3. The ratio of the lengths of one side of square SQ2 and square SQ3 (length of one side of square SQ2/length of one side of square SQ3) is preferably from 2 to 10.

The right diagram F44B in FIG. 44 shows a top view of the medicine mounting table 74 using the rectangular square marker MS2. As shown in the right figure F44B, quadrangular markers MS2 are arranged at the four corners of the medicine table 74 as

reference markers

adjacent reference markers

74A, 74B, 74C, and 74D forms a square. It may also be rectangular. Also, similar to FIG. 43, straight lines connecting the four rectangular markers MS2 in the vertical and horizontal directions are displayed.

According to the rectangular marker MS2, it is possible to improve the estimation accuracy of the center point coordinates. Furthermore, by connecting the rectangular markers MS2 with straight lines, it becomes easier to recognize the marker and the medicine placement range. Note that these straight lines may not be provided.

The medicine mounting table 74 may include a mixture of circular markers MC and square markers MS. By mixing circular markers MC and square markers MS, effects such as easier identification of the photographing direction can be expected.

Further, it is preferable to use a circular marker rather than a rectangular marker. This is because when detecting a reference marker using a mobile terminal device such as a smartphone, the following constraints (a) to (c) exist.
(a) Since applications in mobile terminal devices may have capacity limitations, it is desirable to perform marker detection and drug detection using the same trained model.
(b) In drug detection, it is desirable to use a bounding box with rotation so that part of another drug does not enter the cutout image of the oval tablet. In this case, according to request (a), it is necessary to prepare reasonable training data regarding the rotation angle for marker detection as well.
(c) In the case of a rectangular marker, there is an arbitrariness in determining the rotation angle of the bounding box due to 4-fold symmetry, so there is no reasonable teacher for the rotation angle of the rectangular marker in the pre-standardization image, which is the input image at the time of marker detection. Data creation is difficult. On the other hand, if a circular marker is used, it is possible to create rational training data with the rotation angle always being 0 degrees.

Furthermore, by making the circular markers concentric, it is possible to improve the accuracy of estimating the center coordinates of the markers. In pre-standardized images, a simple circular marker is distorted in shape and tends to cause errors in estimating the center coordinates, but since the inner circles of concentric circles have a narrower range, the trained model can easily estimate the center coordinates even in distorted pre-standardized images. Easy to identify. Furthermore, the outer circles of concentric circles have the advantage of having a large structure, making it easier to find trained models, and being robust against noise and dirt. Note that the accuracy of estimating the center point coordinates of the rectangular markers can also be improved by making them concentric.

<Other aspects of drug placement table>
The placement surface of the drug placement table of the photographic assistance device may be provided with a recessed structure on which the drug is placed. Indentation structures include depressions, grooves, depressions, and holes.

FIG. 45 is a diagram showing a drug mounting table 410 that is used in place of or in addition to the drug mounting table 74 (see FIG. 33) and is provided with a recessed structure. . The drug mounting table 410 is made of paper, synthetic resin, fiber, rubber, or glass. The left view F45A of FIG. 45 is a top view of the medicine mounting table 410. As shown in the left figure F45A, the placement surface of the drug placement table 410 has a gray color, and

reference markers

74A, 74B, 74C, and 74D are arranged at the four corners. In FIG. 45, the shape of the marker is shown using a circular marker MC1, but it is not limited to the circular marker MC1, and may be a circular marker MC2, MC3, MC4, a square marker MS1, or MS2.

Furthermore, a total of nine

depressions

410A, 410B, 410C, 410D, 410E, 410F, 410G, 410H, and 410I are provided on the placement surface of the drug placement table 410 as a depression structure. The depressions 410A to 410I are circular in shape and have the same size when viewed from above.

The right view F45B of FIG. 45 is a cross-sectional view of the drug mounting table 410 taken along the line 44-44. As shown in the right figure F45B, the

depressions

410A, 410B, and 410C have hemispherical bottom surfaces and each have the same depth. The same applies to the depressions 410D to 410I. The bottom surface does not need to be perfectly hemispherical, and may be a concave curved surface with a constant radius of curvature or a concave curved surface with a gradually changing radius of curvature.

In addition, right diagram F45B shows tablets T51, T52, and T53 placed in

depressions

410A, 410B, and 410C, respectively. Tablets T51 and T52 are each medicines that are circular in top view and rectangular in side view. Moreover, the tablet T53 is a drug that is circular in top view and oval in side view. In a top view, tablet T51 and tablet T53 have the same size, and tablet T52 is relatively smaller than tablet T51 and tablet T53. As shown in the right figure F45B, the tablets T51 to T53 are placed in the recesses 410A to 410C and left still. The tablets T51 to T53 may have a circular shape when viewed from above, or may have a straight line on the left and right sides and an arc shape on the top and bottom when viewed from the side.

In this way, according to the drug placement table 410, by providing the placement surface with a hemispherical recessed structure, the circular drug can be prevented from moving when viewed from above and can be left stationary. Furthermore, since the position of the medicine at the time of photographing can be determined to be the position of the depression structure, it becomes easy to detect the medicine area.

The medicine table may have a recessed structure for rolling type medicines such as capsule medicines. FIG. 46 is a diagram showing a drug mounting table 412 provided with a recessed structure for capsule drugs. Note that parts common to those in FIG. 45 are given the same reference numerals, and detailed explanation thereof will be omitted.

The left view F46A in FIG. 46 is a top view of the medicine mounting table 412. As shown in the left figure F46A, a total of six

depressions

412A, 412B, 412C, 412D, 412E, and 412F are provided on the placement surface of the drug placement table 412 as a depression structure in 3 rows and 2 columns. The depressions 412A to 412F are each rectangular in size when viewed from above. The recess need not be perfectly hemispherical, but may be a concave curved surface with a constant radius of curvature or a concave curved surface with a gradually changing radius of curvature.

The right view F46B of FIG. 46 is a cross-sectional view of the drug mounting table 412 taken along the line 46-46. As shown in right figure F46B, depressions 412A, 412B, and 412C have semi-cylindrical bottoms and are each of the same depth. The same applies to the depressions 412D to 412F. Furthermore, F63B shows capsule medicines CP1, CP2, and CP3 placed in

recesses

412A, 412B, and 412C, respectively. Capsule drugs CP1 to CP3 have a cylindrical shape with hemispherical ends (bottom surfaces), and have different diameters. As shown in the right figure F46B, the capsule medicines CP1 to CP3 are placed in the recesses 412A to 412C and left still.

In this way, according to the drug placement table 412, by providing the placement surface with the semi-cylindrical recessed structure, the cylindrical capsule drug can be prevented from moving or rolling, and can be left still. can. Furthermore, since the position of the medicine at the time of photographing can be determined to be the position of the depression structure, it becomes easy to detect the medicine area.

Additionally, the drug mounting table may have a recessed structure for elliptical tablets. FIG. 47 is a diagram showing a medicine platform 414 provided with a concave structure for oval tablets. Note that parts common to those in FIG. 46 are denoted by the same reference numerals, and detailed explanation thereof will be omitted.

The left view F47A of FIG. 47 is a top view of the medicine mounting table 414. As shown in the left figure F47A, a total of six

depressions

414A, 414B, 414C, 414D, 414E, and 414F are provided on the placement surface of the drug placement table 414 as a depression structure in 3 rows and 2 columns. Each of the recesses 414A to 414F is rectangular when viewed from above.

The

recesses

414A and 414B have the same size.

Recesses

414C and 414D have the same size and are relatively smaller than

recesses

414A and 414B.

Recesses

414E and 414F have the same size and are relatively smaller than

recesses

414C and 414D.

In addition, the left diagram F47A shows tablets T61, T62, and T63 placed in the

recesses

414B, 414D, and 414F, respectively. As shown in the left diagram F47A, the

depressions

414B, 414D, and 414F have sizes corresponding to the tablets T61, T62, and T63, respectively.

The right view F47B in FIG. 47 is a cross-sectional view of the drug mounting table 414 taken along the line 47-47. As shown in the right figure F47B, the

recesses

414A and 414B have flat bottom surfaces. As shown in the right figure F47B, the tablet T61 is placed in the depression 414B and left still. The same applies to tablets T62 and T63.

In this way, according to the drug placement table 414, by providing the placement surface with the rectangular parallelepiped-shaped recessed structure, it is possible to prevent the oval-shaped tablet from moving and to allow it to stand still. Furthermore, since the position of the medicine at the time of photographing can be determined to be the position of the depression structure, it becomes easy to detect the medicine area.

Note that the shape, number, and arrangement of the recess structures are not limited to the embodiments shown in FIGS. 45 to 47, and may be appropriately combined, enlarged, or reduced.

<About the hardware configuration of each processing unit and control unit>
The image acquisition unit 112, drug detector 114, area correction unit 116, drug area extraction unit 118, drug identifier 120, text search unit 122, display control unit 124, magnification change unit 125, and input processing unit described in FIG. 5 126, the stamp extraction unit 140, the outer shape extraction unit 142, the size measurement unit 144, the loss calculation unit 152, the optimizer 154, which were explained in FIG. Presentation processing unit 170, capsule medicine identification processing unit 172, capsule medicine search unit 173, plain medicine identification processing unit 174, plain medicine search unit 175, split tablet identification processing unit 176, split tablet search unit 177, and split tablet image registration unit The hardware structure of a processing unit that executes various processes such as 178 is the following various processors.

Various types of processors include the CPU (Central Processing Unit), which is a general-purpose processor that executes programs and functions as various processing units, the GPU (Graphics Processing Unit), which is a processor specialized in image processing, and the FPGA (Field Processing Unit). Programmable Logic Devices (PLDs), which are processors whose circuit configuration can be changed after manufacturing, such as Programmable Gate Arrays, and ASICs (Application Specific Integrated Circuits), which are specifically designed to perform specific processes. This includes a dedicated electric circuit that is a processor having a circuit configuration.

One processing unit may be composed of one of these various processors, or may be composed of two or more processors of the same type or different types. For example, one processing unit may be configured by a plurality of FPGAs, a combination of a CPU and an FPGA, a combination of a CPU and a GPU, or the like. Further, the plurality of processing units may be configured with one processor. As an example of configuring multiple processing units with one processor, first, one processor is configured with a combination of one or more CPUs and software, as typified by computers such as clients and servers. There is a form in which a processor functions as multiple processing units. Second, there are processors that use a single IC (Integrated Circuit) chip, such as System On Chip (SoC), which implements the functions of an entire system including multiple processing units. be. In this way, various processing units are configured using one or more of the various processors described above as a hardware structure.

Furthermore, the hardware structure of these various processors is, more specifically, an electric circuit (circuitry) that is a combination of circuit elements such as semiconductor elements.

<About the program that realizes the functions of the drug type identification device 100>
The processing functions of the drug type identification device 100 are not limited to the smartphone 10, and can be realized using various forms of information processing devices such as a tablet computer, a personal computer, or a workstation. A computer-readable program that allows a computer to implement part or all of the processing functions of the drug type identification device 100 described in the above embodiments is a non-transitory information storage medium such as an optical disk, a magnetic disk, or a semiconductor memory or other tangible object. It is possible to record the program on a medium and provide the program through this information storage medium. Furthermore, instead of providing the program by storing it in a tangible, non-temporary information storage medium, it is also possible to provide the program signal as a download service using a telecommunications line such as the Internet.

Furthermore, it is also possible to provide a part of the processing functions of the drug type identification device 100 described in the above embodiment as an application server, and provide a service that provides the processing functions through a telecommunications line.

<Effects of embodiment>
According to the drug type identification device 100 according to the embodiment, even if an image is taken in which a drug whose drug type can be identified and a drug whose drug type is difficult to identify are mixed, it is possible to identify the type of each drug in the image or the group to which the drug belongs. It is possible to automatically identify the drug, shift to processing specific to each group for drugs whose drug type is difficult to identify, and provide a GUI that supports identification of the type. For this reason, it becomes possible to collectively image a plurality of drugs in meaningful units from a pharmaceutical standpoint and identify each drug.

[Modification 1]
The drug identifier 120 may include an image recognizer using a template matching technique.

[Modification 2]
Although the above embodiment describes an example of a case where drugs are discriminated, the technology of the present disclosure can also be applied to a case where a drug audit is performed. Further, in the above embodiment, an example has been described in which the object to be identified is a drug, but the object to be identified is not limited to a drug. The technology of the present disclosure can be applied as a technology for identifying an object from an image with respect to various objects, regardless of the type of object and its purpose.

<Explanation of terms>
In the present disclosure, an "object" refers to an object that can be classified according to a hierarchical structure. The specific depth of the hierarchical structure of classification is called granularity, and an object belongs to one of the "types" at that granularity. "Identification" refers to determining which type an object belongs to at a certain granularity.

The hierarchical structure may be defined depending on the purpose of identification.

The above-mentioned "certain granularity" shall be determined depending on the purpose of identification.

A collection of objects with common characteristics is called a group. Groups may or may not be related to types. The group may refer to an upper layer of the type in the above hierarchical structure, or may be a newly defined set (unrelated to the above hierarchical structure) for convenience. Groups may be determined and defined depending on the purpose of identification.

〔Concrete example〕
Drugs are objects that can be classified in a hierarchical structure based on medicinal efficacy classification or hierarchical structure based on external characteristics. Drug identification can be defined as the act of determining which YJ code the target drug has (this is an example of the definition of drug identification; for example, the identification code may be defined using a type other than the YJ code ). Each group of "capsule drug" and "plain tablet" explained in the above embodiment is a set of types newly defined for convenience. Furthermore, the "half-tablet" group is a group of drugs newly defined for convenience. Although the YJ code is a set unrelated to the type, the purpose of drug identification is to specify the YJ code of half a tablet.

"others"
The technical scope of the present invention is not limited to the scope described in the above-described embodiments and modifications. The configurations of the embodiments and modified examples can be changed without departing from the spirit of the present invention, and the embodiments and modified examples can be combined as appropriate.

10 Smartphone 12 Housing 14 Touch panel display

14A Display section

14B Input section 16 Speaker 18 Microphone 20 In-camera 22 Out-camera 24 Light 26 Switch 28 CPU
30 Wireless communication section 32 Call section 34 Memory 36 Internal storage section 38 External storage section 40 External input/output section 42 GPS reception section 44 Power supply section 50 Photographic lens 50F Focus lens 50Z Zoom lens 54 Image sensor 58 A/D converter 60 Lens drive Part 70 Photography auxiliary device 72 Housing 72A Bottom plate 72B Side plate 72C Side plate 72D Side plate 72E Side plate 74 Drug placement tables 74A, 74B, 74C, 74D Reference marker 75 Main light source 78 Auxiliary light source 80 Photography auxiliary device 81 Illumination device 82 Housing 82A Bottom plate 82B Side plate 84 Main light source 86 Auxiliary light source 100 Drug type identification device 102 Processor 104 Storage device 112 Image acquisition unit 114 Drug detector 116 Area correction unit 118 Drug area extraction unit 120 Drug identifier 122 Text search unit 124 Display control unit 125 Magnification change unit 126 Input processing unit 130 Database 131 Master image database 132 Identification result storage unit 140 Engraving extraction unit 142 Outline extraction unit 144 Size measurement unit 150 Machine learning system 151 Learning model 152 Loss calculation unit 154 Optimizer 160 Drug type identification Process control section 162 Group discrimination section 170 Drug type estimation result presentation processing section 172 Capsule drug identification processing section 173 Capsule drug search section 174 Plain drug identification processing section 175 Plain drug search section 176 Divided tablet identification processing section 177 Divided tablet search section 178 Divided tablet Image registration unit 410 Drug placement table 410A, 410B, 410C, 410D Hollow 410E, 410F, 410G, 410H, 410I Hollow 412 Drug placement table 412A, 412B, 412C, 412D Hollow 412E, 412F Hollow 414 Drug placement table 414A, 41 4B, 414C , 414D Hollow 414E, 414F Hollow Abs Area BDA Button display area BT2 Add area button BT3 Area delete button BT4 Engraved text search button BT5 Drug type confirmation button BT6 Drug type confirmation hold button BT7 Reshoot button BT8 Complete button BT9 Right arrow button BT10 Left arrow button BT11 Arrow button BT12 Search button BT13 Capsule list button BT14 Plain drug list button BT15 Divided tablet call button BX, BX1, BX2, BX3, BX4 Frame CB Check box CDA Candidate display area CP1, CP2, CP3 Capsule drug DR1, DR2, DR3, DRj Drug EDA Whole image display area Egm, Egm1, Egm2, Egm3, Egm4, Egm5 Stamp extraction image F40A Left figure F40B Right figure F41A Left figure F41B Right figure F42A Left figure F42B Right figure F43A Left figure F43B Right figure F44A Left figure F44B Right Figure F45A Left figure F45B Right figure F46A Left figure F46B Right figure F47A Left figure F47B Right figure GTj Correct data IB1 Input box IB2 Input box IM1j Engraving extraction image IM2j External image IMj Drug image LS1 Divided tablet image display section LS2 Candidate drug image display section MC, MC1, MC2, MC3, MC4 Circular marker MGj Magnification information MS, MS1, MS2 Rectangular marker Org, Org1, Org2, Org3, Org4, Org5 Original image Otw, Otw1, Otw2, Otw3, Otw4, Otw5 Outer image PDM Pull-down menu PRj Inference results SBT1, SBT2, SBT3 Search button SBX1 Search box SBX2 Text search box SBX3 Search box SC1, SC2, SC3, SC4, SC5, SC6, SC7, SC8, SC9 Screen SLB1, SLB2, SLB3, SLB4, SLB5 Selection box SQ1 , SQ2, SQ3 Square SRB Scroll bar SRD Capsule image display section SRD2 Plain drug image display section ST1 GPS satellite ST2 GPS satellite SW1 Area editing switch SZB Size search box SZj Size information T1, T2, T3, T4, T5 Drug T51, T52, T53 Tablets T61, T62, T63 Tablet TM1 First trained model TM2 Second trained model S1 to S5 Steps S11 to S16 of the learning method for configuring the drug type identification device Steps S21 to S29 of the drug type identification method step

Claims

a detector that detects each object from an image of a plurality of objects;
Among the objects detected by the detector, if the type of the object can be identified from the image, the type of the object is estimated from the image, and the type of the object is difficult to identify from the image. However, for objects whose type is difficult to identify to which the group the object belongs, a classifier that estimates the group from the image;
a processing unit that performs group-specific processing that leads to identification of the type of the objects estimated as the group by the discriminator;
An object identification device comprising:
The image is an image taken in a state in which the object whose type can be identified and the object whose type is difficult to identify are mixed.
The object identification device according to claim 1.
The hard-to-identify object is classified into a plurality of groups,
Group-specific processing is defined for each of the groups;
The object identification device according to claim 1 or 2.
The detector is configured to generate a first learned model trained by machine learning using first training data in which each object is labeled without distinguishing between the object whose type can be identified and the object whose type is difficult to identify. include,
The object identification device according to any one of claims 1 to 3.
The discriminator performs machine learning using second training data in which the object whose type can be identified is labeled by the type of the object, and the object whose type is difficult to identify is labeled by the group to which the object belongs. comprising a trained second learned model;
The object identification device according to any one of claims 1 to 4.
Labels identifying the groups are defined in a hierarchical structure;
The object identification device according to any one of claims 1 to 5.
As input to the discriminator, an object image obtained by cutting out a region of the object detected by the detector from the image in units of objects, and a character symbol including at least one of a character and a symbol extracted from the object image. using at least one of an extracted image, an external shape image of the object, and size information of the object;
An object identification device according to any one of claims 1 to 6.
further using magnification information indicating an enlargement or reduction ratio of the object image as input to the discriminator;
The object identification device according to claim 7.
The group-specific processing includes processing for displaying a screen that accepts input of search conditions for searching for the type of object within the estimated group.
An object identification device according to any one of claims 1 to 8.
the object is a drug;
The object whose type is difficult to identify includes at least one of a capsule drug, a plain drug, and a divided tablet,
The type-identifiable object includes a tablet having a stamp or print.
An object identification device according to any one of claims 1 to 9.
As an input to the discriminator, a drug image is obtained by cutting out the area of the drug detected by the detector from the image in units of drugs, and character symbol extraction containing at least one of characters and symbols extracted from the drug image is provided. using at least one of an image, an external image of the drug, and size information of the drug;
The object identification device according to claim 10.
An object identification device comprising one or more processors and one or more memories in which programs executed by the one or more processors are stored,
The one or more processors include:
Detection processing that detects the object in units of objects from an image in which a plurality of objects are photographed;
Among the objects detected by the detection process, for objects whose type can be identified from the image, the type of the object is estimated from the image, and it is difficult to identify the type of the object from the image. However, for objects whose type is difficult to identify to which the group to which the object belongs can be estimated, an identification process is performed to estimate the group from the image;
A process of shifting to a group-specific process that leads to specifying the type of the object estimated as the group by the identification process;
An object identification device that performs
The one or more processors include:
Using a detector including a first trained model trained by machine learning using first training data labeled on an object basis without distinguishing between the object whose type can be identified and the object whose type is difficult to identify. executing the detection process using
The object identification device according to claim 12.
The one or more processors include:
The second training data is trained by machine learning using second training data in which the object whose type can be identified is labeled by the type of the object, and the object whose type is difficult to identify is labeled by the group to which the object belongs. Executing the identification process using a classifier including a trained model of
The object identification device according to claim 12 or 13.
The one or more processors include:
Executing a process of cutting out a region of the object detected by the detection process from the image and generating an object image for each object,
performing the identification process based on the object image;
Object identification device according to any one of claims 12 to 14.
the object is a drug;
The object whose type is difficult to identify includes at least one of a capsule drug, a plain drug, and a divided tablet,
The type-identifiable object includes a tablet having a stamp or print,
The group-specific processing includes processing for displaying a screen that accepts input of search conditions for searching for the type of drug within the estimated group.
Object identification device according to any one of claims 12 to 15.
a first database in which character/symbol information including at least one of characters and symbols indicated by a stamp or print attached to the drug and the type of the drug are linked;
a second database storing a master image of the drug;
Equipped with
The one or more processors include:
searching at least one of the first database and the second database based on the accepted search condition, and outputting drug candidates that meet the search condition;
The object identification device according to claim 16.
The one or more processors include:
a photographed image display unit that displays the image;
performing a process of displaying a screen including a candidate display unit that displays information about the candidate object based on the estimation result of the identification process;
Object identification device according to any one of claims 12 to 17.
The one or more processors include:
performing a process of displaying information on the group to which the candidate object to be displayed on the candidate display section belongs;
The object identification device according to claim 18.
The one or more processors include:
receiving an instruction specifying the group to which the candidate object to be displayed on the candidate display section belongs;
controlling the display of the candidate display section according to the received instruction;
The object identification device according to claim 19.
camera and
a display that displays the image taken by the camera and information about the object estimated from the image;
The object identification device according to any one of claims 1 to 20, comprising:
A method of object identification performed by one or more processors, the method comprising:
the one or more processors,
Detecting each object from an image in which a plurality of objects are photographed;
Among the detected objects, for objects whose type can be determined from the image, the type of the object is estimated from the image, and although the type of the object is difficult to identify from the image, estimating the group from the image for objects whose type is difficult to identify to which the group the object belongs;
performing group-specific processing that leads to identification of the type of the objects estimated as the group;
Object identification methods, including:
to the computer,
a function of detecting each object from an image in which a plurality of objects are photographed;
Among the detected objects, for objects whose type can be identified from the image, the type of the object is estimated from the image, and although it is difficult to identify the type of the object from the image, a function for estimating the group from the image for objects whose type is difficult to identify by which the group to which the object belongs can be identified;
a function of executing group-specific processing that leads to identification of the type of the objects estimated as the group;
A program that makes this possible.
A non-transitory computer-readable recording medium, on which the program according to claim 23 is recorded.