US20210313048A1 - Neural network-based object detection in visual input - Google Patents
- Publication number
- US20210313048A1 (application US17/352,438)
- Authority
- US
- United States
- Prior art keywords
- medical image
- selection
- roi
- sub
- regions
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B8/00—Diagnosis using ultrasonic, sonic or infrasonic waves
- A61B8/52—Devices using data or image processing specially adapted for diagnosis using ultrasonic, sonic or infrasonic waves
- A61B8/5215—Devices using data or image processing specially adapted for diagnosis using ultrasonic, sonic or infrasonic waves involving processing of medical diagnostic data
- A61B8/5223—Devices using data or image processing specially adapted for diagnosis using ultrasonic, sonic or infrasonic waves involving processing of medical diagnostic data for extracting a diagnostic or physiological parameter from medical diagnostic data
-
- G06K9/3233—
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/20—ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
- G16H30/40—ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/03—Recognition of patterns in medical or anatomical images
Definitions
- the field of the embodiments relates to a device to detect an object within a medical image.
- the object may be identified and labelled as a region of interest which contains a lesion (malignant).
- a modern device includes components to provide a variety of services such as communication, display, imaging, voice, and/or data capture, among others. Abilities of the modern device grow exponentially when networked to other resources that provide a previously unimagined number of services associated with medical imaging.
- Ultrasound and other medical imaging devices scan biological structures or tissues of a patient to provide images.
- the scanned images are provided to medical practitioner(s) to aid with diagnosis of illnesses such as cancer. Clarity and quality of a scanned image are usually suspect and depend on a variety of conditions associated with the patient and the skill of a technician capturing the scanned image. Furthermore, the medical practitioner is also subject to missed diagnosis or false diagnosis associated with the scanned image due to the quality of the scanned image and/or human error.
- the present invention and its embodiments relate to a method to detect an object in a medical image.
- the method may include receiving the medical image as an input.
- the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
- Bounding boxes for the selection may also be determined.
- the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
- the confidence score may designate the parts as contained within the selection.
- a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
- the selection may be designated as the ROI within the medical image.
- the medical image may also be provided with the ROI to a user.
- a device to detect an object in a medical image may be configured to receive the medical image as an input.
- the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
- Bounding boxes for the selection may also be determined.
- the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
- the confidence score may designate the parts as contained within the selection.
- a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
- the selection may be designated as the ROI within the medical image.
- the selection within the ROI may be labelled with annotation(s) associated with a type of tissue.
- the type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface.
- the medical image may also be provided with the annotation(s) and the ROI to a user.
- a device to detect an object in a medical image may include a memory configured to store instructions associated with an image analysis application.
- a processor may be coupled to the memory.
- the processor may execute the instructions associated with the image analysis application.
- the image analysis application may include a computer assisted detection (CADe) module.
- the CADe module may be configured to receive the medical image as an input.
- the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined.
- the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
- the confidence score may designate the parts as contained within the selection.
- a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image.
- the selection within the ROI may be labelled with annotation(s) associated with a type of tissue. The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface.
- the medical image may also be provided with the annotation(s) and the ROI to a user.
- FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image, according to an embodiment of the invention.
- FIG. 2 shows a display diagram illustrating components of an image analysis application determining a region of interest (ROI) in the medical image and labelling the ROI with annotation(s), according to an embodiment of the invention.
- FIG. 3 shows another display diagram illustrating components of user interface allowing a user to interact with a ROI and annotation(s) associated with the ROI within a medical image, according to an embodiment of the invention.
- FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image.
- FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image, according to an embodiment of the invention.
- FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image 108 .
- a device 104 may execute (or provide) an image analysis application 106 .
- the device 104 may include a physical computing device hosting and/or providing features associated with a client application (such as the image analysis application 106 ).
- the device 104 may include and/or be part of a smart phone, a tablet based device, a laptop computer, a desktop computer, a physical server, and/or a cluster of servers, among others.
- the device 104 may also be a node of a network.
- the network may also include other nodes such as a medical image provider 112 , among others.
- the network may connect nodes with wired and wireless infrastructure.
- the device 104 may execute the image analysis application 106 .
- the image analysis application 106 may receive the medical image 108 as an input.
- An example of the medical image 108 may include an ultrasound image (or scan).
- Other examples of the medical image 108 may include an x-ray image, a magnetic resonance imaging (MRI) scan, a computed tomography (CT) scan, and/or a positron emission tomography (PET) scan, among others.
- the medical image 108 may be received from the medical image provider 112 .
- the medical image provider 112 may include a medical imaging device/system that captures, manages, and/or presents the medical image 108 to a user 102 .
- the user 102 may include a medical practitioner such as a doctor, a nurse, and/or a technician, a patient, and/or an administrator, among others.
- the user 102 may use the medical image 108 to diagnose an issue, a malignancy (cancer), and/or other illness associated with a patient.
- the medical image 108 may include an object 110 .
- the object 110 may include a biological structure of a patient.
- the object 110 may include a malignant or a benign lesion.
- the object 110 may represent another structure associated with an organ and/or other body part of the patient.
- a computer assisted detection (CADe) module of the image analysis application 106 may partition the medical image into sub-regions.
- a size of the sub-regions may be determined by the CADe module based on a size of the object 110 .
- the object 110 that consumes a large portion of the medical image 108 may be partitioned to a large number of the sub-regions.
- the object 110 that consumes a small portion of the medical image 108 may be partitioned to a small number of the sub-regions.
- the number of the sub-regions may be determined dynamically based on attributes associated with the medical image 108 such as dimensions, resolution, quality, and clarity.
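- The partitioning step can be illustrated with a minimal sketch. It assumes a regular grid of square tiles, which the source does not specify; the function names `partition_image` and `choose_tile_size` and the parameters `target_tiles_per_side` and `min_tile` are hypothetical and only show how a tile size might be derived from the image dimensions.

```python
from typing import List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; the max coordinates are exclusive.
Box = Tuple[int, int, int, int]


def choose_tile_size(width: int, height: int,
                     target_tiles_per_side: int = 8,
                     min_tile: int = 16) -> int:
    """Hypothetical heuristic: aim for a fixed number of tiles along the
    shorter image side, but never go below a minimum tile size."""
    return max(min_tile, min(width, height) // target_tiles_per_side)


def partition_image(width: int, height: int, tile: int) -> List[Box]:
    """Split a width x height image into square sub-regions of side `tile`.

    Sub-regions on the right and bottom edges are clipped to the image
    boundary, so every pixel belongs to exactly one sub-region.
    """
    if width <= 0 or height <= 0 or tile <= 0:
        raise ValueError("width, height, and tile must be positive")
    boxes: List[Box] = []
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            boxes.append((x, y, min(x + tile, width), min(y + tile, height)))
    return boxes
```

- In this sketch a larger image (or a smaller tile) simply yields more sub-regions; a real CADe module could instead size the tiles from image resolution, quality, or the expected object size, as the description suggests.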
- the CADe module may process each sub-region to detect parts of the object 110 .
- the parts of the object 110 may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
- the DNN model may include a machine learning mechanism based on learning data representations. Learning operations associated with the DNN model may vary from supervised learning to unsupervised learning.
- the CADe module may determine bounding boxes for the selection of the sub-regions (associated with the object 110 ).
- the bounding boxes may be evaluated based on a confidence score detected as above a threshold.
- the confidence score (detected as above the threshold) may designate the parts of the object 110 as contained within the selection of the sub-regions.
- the confidence score may confirm that the CADe module has correctly recognized the parts of the object 110 within the bounding boxes representing the selection of the sub-regions (of the medical image 108 ).
- the threshold level may be determined automatically by the CADe module based on positive and negative training models within the DNN model. Alternatively, the user 102 may manually determine the threshold level.
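- The confidence-score evaluation described above can be sketched as a simple filter. This is an illustration under assumptions: the `Detection` type, the score range of 0 to 1, and the function name `select_detections` are hypothetical, and the threshold is passed in by the caller rather than derived from training models.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    """A candidate bounding box with the model's confidence score."""
    box: Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
    score: float                            # assumed to lie in [0, 1]


def select_detections(detections: List[Detection],
                      threshold: float) -> List[Detection]:
    """Keep only detections whose score is strictly above the threshold,
    mirroring 'a confidence score detected as above a threshold'."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return [d for d in detections if d.score > threshold]
```

- Whether the threshold is set automatically or by the user, the filter itself stays the same; only the value passed as `threshold` changes.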
- a region of interest (ROI) 114 may be determined as a group comprising the selection of the sub-regions.
- the CADe module may determine the ROI 114 based on a comparison of similar orientations associated with the bounding boxes (representing the selection of the sub-regions) in relation to similar orientations of a positive training model of the DNN model.
- the similar orientations of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model.
- the CADe module may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations) match comparable attributes in the positive training model.
- the selection of the sub-regions may next be designated as the ROI 114 by the CADe module.
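- One way to picture the ROI as a group comprising the selected sub-regions is the smallest rectangle that encloses all of their bounding boxes. The sketch below assumes axis-aligned boxes in (x_min, y_min, x_max, y_max) form; the function name `enclosing_roi` is hypothetical, and the source does not state that the ROI must be rectangular.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def enclosing_roi(selected_boxes: Iterable[Box]) -> Box:
    """Return the smallest axis-aligned box containing every selected box."""
    boxes = list(selected_boxes)
    if not boxes:
        raise ValueError("at least one bounding box is required")
    return (
        min(b[0] for b in boxes),
        min(b[1] for b in boxes),
        max(b[2] for b in boxes),
        max(b[3] for b in boxes),
    )
```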
- the medical image 108 may be provided with the ROI 114 to the user 102 .
- the ROI 114 may alert the user 102 regarding a disease state (such as malignant/cancer or benign) associated with the object 110 .
- the image analysis application 106 may perform operations associated with detecting the object 110 in the medical image 108 as a desktop application, a workstation application, and/or a server application, among others.
- the image analysis application 106 may also be a client interface of a server based application.
- the device 104 may include a component of the medical image provider 112 .
- the image analysis application 106 may include a service associated with the medical image provider 112 .
- the user 102 may interact with the image analysis application 106 with a keyboard based input, a mouse based input, a voice based input, a pen based input, and a gesture based input, among others.
- the gesture based input may include one or more touch based actions such as a touch action, a swipe action, and a combination of each, among others.
- Although FIG. 1 has been described with specific components including the device 104 and the image analysis application 106 , embodiments are not limited to these components or system configurations and can be implemented with other system configurations employing fewer or additional components.
- FIG. 2 shows a display diagram illustrating components of the image analysis application 106 determining the ROI 114 in the medical image 108 and labelling the ROI 114 with annotation(s) 228 .
- the medical image 108 may be received as an input by the image analysis application 106 .
- the CADe module 216 may partition the medical image 108 into sub-regions 218 and process the sub-regions 218 with a DNN model 230 to detect parts of the object 110 .
- a selection 220 of the sub-regions 218 may be correlated to the parts of the object 110 .
- Bounding boxes may next be determined for the selection 220 .
- the bounding boxes may be representations of the selection 220 and may be used interchangeably to refer to the selection 220 .
- the bounding boxes may be evaluated based on a confidence score 222 detected as above a threshold 224 level.
- the confidence score 222 (detected as above the threshold 224 level) may designate the parts of the object 110 as contained within the selection 220 (of the sub-regions 218 ).
- the confidence score 222 associated with the selection 220 (and/or the bounding boxes) may confirm that the CADe module 216 correctly identified and captured the parts of the object 110 within the selection 220 .
- the ROI 114 may be determined as a group (of the sub-regions 218 ) that includes the selection 220 .
- Similar orientations 226 associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model 230 .
- the similar orientations 226 of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model.
- the CADe module 216 may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations 226 ) match comparable attributes in the positive training model.
- Similar orientations ( 226 ) of the bounding boxes may include similar and/or complementary angular orientations between parts of the object 110 .
- parts of the object 110 within a sub-region 218 positioned in a right top quadrant may include edges that are oriented outwards toward a top and right directions.
- other parts of the object 110 within the sub-regions 218 located in top left, bottom left, and bottom right quadrants of the selection 220 may include edges that are oriented outwards toward top left, bottom left, and bottom right directions, respectively.
- similar orientations 226 may include similar distances between the parts of the object 110 .
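- The quadrant example above (edges in each quadrant pointing outward from the object) can be turned into a small consistency check. This is only a sketch under assumptions: each part is summarized by its sub-region centre and a single dominant edge direction, both of which would have to be estimated elsewhere, and the function name `outward_fraction` is hypothetical. It is not the comparison against the positive training model that the description refers to.

```python
from typing import Sequence, Tuple

# (cx, cy, dx, dy): sub-region centre and dominant outward edge direction,
# in image coordinates (x grows rightwards, y grows downwards).
Part = Tuple[float, float, float, float]


def outward_fraction(parts: Sequence[Part]) -> float:
    """Fraction of parts whose edge direction points away from the
    centroid of all part centres (positive dot product)."""
    if not parts:
        raise ValueError("at least one part is required")
    mean_x = sum(p[0] for p in parts) / len(parts)
    mean_y = sum(p[1] for p in parts) / len(parts)
    outward = 0
    for cx, cy, dx, dy in parts:
        # Vector from the centroid to this part's centre.
        rx, ry = cx - mean_x, cy - mean_y
        if rx * dx + ry * dy > 0:
            outward += 1
    return outward / len(parts)
```

- A value near 1.0 would indicate that the parts are arranged like a closed outline, as in the four-quadrant example; how such a value is weighed against training data is left open here.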
- the CADe module 216 may apply a non-maximum suppression (NMS) mechanism to the selection 220 of the sub-regions 218 to obtain the bounding boxes (associated with the selection 220 ) in relation to the object 110 .
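- A common form of non-maximum suppression is the greedy, intersection-over-union (IoU) based variant sketched below. The source names NMS but not a specific variant, so the greedy strategy and the default `iou_threshold` of 0.5 are assumptions.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box],
                        scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes to keep, highest score first.

    A box is dropped when it overlaps an already kept, higher-scoring
    box by more than `iou_threshold`.
    """
    if len(boxes) != len(scores):
        raise ValueError("boxes and scores must have the same length")
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep
```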
- the DNN model 230 associated with a detection of the parts of the object 110 may include a region based convolutional neural network (R-CNN) model, a fast R-CNN model, a faster R-CNN model, a you only look once (YOLO) model, and/or a single shot multi-box detector (SSD) model, among others.
- the CADe module 216 may label the selection 220 of the sub-regions within the ROI 114 with annotation(s) 228 associated with a type of tissue (in relation to the object 110 ).
- the type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, and/or abrupt interface, among others.
- the annotation(s) 228 may also include a relative position associated with a part (of the object 110 ). The relative position may include top, bottom, left side, right side, and/or combinations of relative positions, among others.
- the CADe module 216 may emphasize the object 110 with the ROI 114 as a lesion within the medical image 108 by processing the medical image 108 with the DNN model 230 .
- a training mechanism associated with the DNN model 230 may include a compensation for an unbalanced training data such as a majority of training medical images with no lesion and a minority of training medical images with a lesion.
- the training mechanism may include a down-sampling of the majority, an up-sampling of the minority, or a utilization of a cost sensitive mechanism, a gradient boost machine, or a hard negative mining mechanism.
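- Two of the listed remedies for unbalanced training data, up-sampling of the minority and a cost sensitive mechanism, can be sketched as follows. The function names, the random duplication strategy, and the inverse-frequency weighting formula are assumptions chosen for illustration; the source does not say how either remedy is implemented.

```python
import random
from collections import Counter
from typing import Dict, Hashable, List, Sequence, Tuple


def upsample_minority(samples: Sequence, labels: Sequence[Hashable],
                      seed: int = 0) -> List[Tuple[object, Hashable]]:
    """Randomly duplicate samples of the rarer classes until every class
    has as many samples as the most frequent class."""
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    counts = Counter(labels)
    target = max(counts.values())
    out = list(zip(samples, labels))
    for cls, n in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == cls]
        out.extend((rng.choice(pool), cls) for _ in range(target - n))
    return out


def class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Inverse-frequency weights for a cost sensitive loss:
    weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    total = len(labels)
    return {c: total / (len(counts) * n) for c, n in counts.items()}
```

- With eight no-lesion images and two lesion images, for example, `class_weights` assigns the lesion class four times the weight of the no-lesion class, so a misclassified lesion costs more during training.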
- FIG. 3 shows another display diagram illustrating components of a user interface allowing a user to interact with the ROI 114 and the annotation(s) 228 associated with the ROI 114 within the medical image 108 .
- the image analysis application 106 (executed by the device 104 ) may provide the medical image 108 (or a digital copy) with the annotation(s) 228 and the ROI 114 to a user (such as a medical practitioner or a patient).
- Examples of the annotation(s) 228 may include a lobulated 332 sub-region, a spiculated 334 sub-region, an angular 336 sub-region, a clear boundary 338 sub-region, an oval 340 sub-region, a circumscribed 342 sub-region, and/or an abrupt interface 344 sub-region, among others.
- the user interface may also be configured to allow the user to change the annotation(s) 228 associated with the ROI 114 .
- the image analysis application 106 may detect that the user has provided a change to the annotation(s) 228 to personalize the annotation(s) 228 for the user.
- the image analysis application 106 may identify a rate of concordance of the user in relation to the DNN model 230 .
- the rate of concordance may include an evaluation of a correctness of the user when diagnosing medical images in relation to the object(s) within the medical images (whether malignant or benign).
- the image analysis application 106 may determine the concordance rate of the user as above (or equal to) a threshold.
- the threshold may include a level which is correlated with a competence (associated with the user) when diagnosing an object within medical images as malignant or benign.
- the image analysis application 106 may retrain the DNN model 230 based on the change to the annotation(s) 228 .
- the selection 220 (of the sub-regions 218 ) may be re-processed based on the change to the annotation(s) 228 .
- the selection 220 of the sub-regions 218 may be re-labelled based on the change to the annotation(s) 228 to personalize the annotation(s) 228 to preference(s) of the user.
- the image analysis application 106 may determine the concordance rate of the user as below a threshold.
- the change to the annotation(s) 228 (introduced by the user) may be rejected.
- a notification may be provided to the user to re-evaluate the change to the annotation(s). The notification may alert the user that the user may be incorrect regarding the change to the annotation(s) 228 .
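- The concordance-based handling of annotation changes described above can be sketched as a small decision rule. The sketch assumes the concordance rate is the fraction of past cases in which the user's diagnosis agreed with a reference diagnosis; the source does not define the rate precisely, and the names `concordance_rate` and `review_annotation_change` and the returned strings are hypothetical.

```python
from typing import Hashable, Sequence


def concordance_rate(user_labels: Sequence[Hashable],
                     reference_labels: Sequence[Hashable]) -> float:
    """Fraction of cases where the user's label (for example malignant
    or benign) matches the reference label."""
    if len(user_labels) != len(reference_labels) or not user_labels:
        raise ValueError("label sequences must be non-empty and equal in length")
    agree = sum(u == r for u, r in zip(user_labels, reference_labels))
    return agree / len(user_labels)


def review_annotation_change(rate: float, threshold: float) -> str:
    """At or above the threshold the change is accepted and may be used
    to retrain the model; below it the change is rejected and the user
    is notified to re-evaluate."""
    if rate >= threshold:
        return "accept_and_retrain"
    return "reject_and_notify"
```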
- the user may be allowed to interact, through an augmented reality display, with the medical image 108 , the ROI 114 , and the annotation(s) 228 .
- the annotation(s) 228 may also be provided as text, sound, or texture associated with the ROI 114 .
- FIGS. 1 through 3 are shown with specific components, data types, and configurations. Embodiments are not limited to systems according to these example configurations.
- a device to detect an object in a medical image may be implemented in configurations employing fewer or additional components in applications and user interfaces.
- the example schema and components shown in FIGS. 1 through 3 and their subcomponents may be implemented in a similar manner with other values using the principles described herein.
- FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image, according to embodiments.
- computing device 400 may be used as a server, desktop computer, portable computer, smart phone, special purpose computer, or similar device.
- the computing device 400 may include one or more processors 404 and a system memory 406 .
- a memory bus 408 may be used for communication between the processor 404 and the system memory 406 .
- the basic configuration 402 may be illustrated in FIG. 4 by those components within the inner dashed line.
- the processor 404 may be of any type, including but not limited to a microprocessor (µP), a microcontroller (µC), a digital signal processor (DSP), or any combination thereof.
- the processor 404 may include one or more levels of caching, such as a level cache memory 412 , one or more processor cores 414 , and registers 416 .
- the example processor cores 414 may (each) include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP Core), a graphics processing unit (GPU), or any combination thereof.
- An example memory controller 418 may also be used with the processor 404 , or in some implementations, the memory controller 418 may be an internal part of the processor 404 .
- the system memory 406 may be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof.
- the system memory 406 may include an operating system 420 , the image analysis application 106 , and a program data 424 .
- the image analysis application 106 may include components such as the CADe module 216 .
- the CADe module 216 may execute the instructions and processes associated with the image analysis application 106 .
- the CADe module 216 may receive the medical image as an input. The medical image may next be partitioned to sub-regions.
- Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
- Bounding boxes for the selection may also be determined.
- the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
- the confidence score may designate the parts as contained within the selection.
- a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
- the selection may be designated as the ROI within the medical image.
- the medical image may also be provided with the ROI to a user.
- Input to and output out of the image analysis application 106 may be captured and displayed through a display component that may be integrated to the computing device 400 .
- the display component may include a display screen, and/or a display monitor, among others that may capture an input through a touch/gesture based component such as a digitizer.
- the program data 424 may also include, among other data, the medical image 108 , or the like, as described herein.
- the object 110 in the medical image 108 may be identified and emphasized with the ROI 114 and annotation(s) 228 , among other things.
- the computing device 400 may have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 402 and any desired devices and interfaces.
- a bus/interface controller 430 may be used to facilitate communications between the basic configuration 402 and one or more data storage devices 432 via a storage interface bus 434 .
- the data storage devices 432 may be one or more removable storage devices 436 , one or more non-removable storage devices 438 , or a combination thereof.
- Examples of the removable storage and the non-removable storage devices may include magnetic disk devices, such as flexible disk drives and hard-disk drives (HDDs), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSDs), and tape drives, to name a few.
- Example computer storage media may include volatile and nonvolatile, removable, and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data.
- the system memory 406 , the removable storage devices 436 and the non-removable storage devices 438 are examples of computer storage media.
- Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVDs), solid state drives, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by the computing device 400 . Any such computer storage media may be part of the computing device 400 .
- the computing device 400 may also include an interface bus 440 for facilitating communication from various interface devices (for example, one or more output devices 442 , one or more peripheral interfaces 444 , and one or more communication devices 466 ) to the basic configuration 402 via the bus/interface controller 430 .
- Some of the example output devices 442 include a graphics processing unit 448 and an audio processing unit 450 , which may be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 452 .
- One or more example peripheral interfaces 444 may include a serial interface controller 454 or a parallel interface controller 456 , which may be configured to communicate with external devices such as input devices (for example, keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (for example, printer, scanner, etc.) via one or more I/O ports 458 .
- An example of the communication device(s) 466 includes a network controller 460 , which may be arranged to facilitate communications with one or more other computing devices 462 over a network communication link via one or more communication ports 464 .
- the one or more other computing devices 462 may include servers, computing devices, and comparable devices.
- the network communication link may be one example of a communication media.
- Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media.
- a “modulated data signal” may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
- communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) and other wireless media.
- the term computer readable media as used herein may include both storage media and communication media.
- the computing device 400 may be implemented as a part of a specialized server, mainframe, or similar computer, which includes any of the above functions.
- the computing device 400 may also be implemented as a personal computer including both laptop computer and non-laptop computer configurations. Additionally, the computing device 400 may include specialized hardware such as an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a programmable logic device (PLD), and/or a free form logic on an integrated circuit (IC), among others.
- Example embodiments may also include methods to detect an object in a medical image. These methods can be implemented in any number of ways, including the structures described herein. One such way may be by machine operations, of devices of the type described in the present disclosure. Another optional way may be for one or more of the individual operations of the methods to be performed in conjunction with one or more human operators performing some of the operations while other operations may be performed by machines. These human operators need not be collocated with each other, but each can be only with a machine that performs a portion of the program. In other embodiments, the human interaction can be automated such as by pre-selected criteria that may be machine automated.
- FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image.
- Process 500 may be implemented on a computing device, such as the computing device 400 or another system.
- Process 500 begins with operation 510 , where an image analysis application may receive the medical image as an input.
- the medical image may be partitioned to sub-regions.
- parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
- bounding boxes for the selection may be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection.
- a ROI may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, at operation 560 , the selection may be designated as the ROI within the medical image. At operation 570 , the medical image may be provided with the ROI to a user.
- process 500 is for illustration purposes. Detecting an object in a medical image may be implemented by similar processes with fewer or additional steps, as well as in different order of operations using the principles described herein.
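The first operations of process 500 (partitioning the image and keeping the sub-regions whose confidence score is above the threshold level) can be illustrated with a minimal sketch. It is not the disclosed implementation: the fixed square tile size, the function names, and `overlap_fraction` (a toy stand-in for the score that a trained DNN model would produce) are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def partition(width: int, height: int, tile: int) -> List[Box]:
    """Split a width x height image into tile x tile sub-regions.

    Assumption: a fixed square tile; tiles on the right and bottom
    edges are clipped to the image and may be smaller.
    """
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def overlap_fraction(box: Box, target: Box) -> float:
    """Hypothetical stand-in for a DNN confidence score.

    Returns the fraction of `box` covered by `target`.
    """
    w = min(box[2], target[2]) - max(box[0], target[0])
    h = min(box[3], target[3]) - max(box[1], target[1])
    if w <= 0 or h <= 0:
        return 0.0
    area = (box[2] - box[0]) * (box[3] - box[1])
    return (w * h) / area


def select_sub_regions(
    boxes: List[Box],
    score_fn: Callable[[Box], float],
    threshold: float,
) -> List[Tuple[Box, float]]:
    """Keep only sub-regions whose score is above the threshold level."""
    scored = [(box, score_fn(box)) for box in boxes]
    return [(box, score) for box, score in scored if score > threshold]


if __name__ == "__main__":
    fake_object = (10, 10, 40, 40)  # illustrative object location
    tiles = partition(100, 100, 50)
    selection = select_sub_regions(
        tiles, lambda b: overlap_fraction(b, fake_object), threshold=0.2
    )
    print(selection)
```

In practice the scoring function would be replaced by inference with the trained model, and the threshold could be set automatically or by the user, as the description notes.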
- the operations described herein may be executed by one or more processors operated on one or more computing devices, one or more processor cores, specialized processing devices, and/or special purpose processors, among other examples.
- a method of detecting an object in a medical image includes receiving the medical image as an input.
- the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
- Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection.
- a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
- the selection may be designated as the ROI within the medical image.
- the medical image may also be provided with the ROI to a user.
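The text says only that the ROI is determined as a group including the selection. One simple way to represent such a group, shown here purely as an assumption, is the smallest axis-aligned rectangle that encloses every selected bounding box. The function name `enclosing_roi` is hypothetical, and the orientation comparison against the positive training model is not modeled.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in pixels, with x1 and y1 exclusive.
Box = Tuple[int, int, int, int]


def enclosing_roi(selection: List[Box]) -> Box:
    """Return the smallest rectangle containing all selected boxes.

    Assumption: the ROI is reported as a single rectangle; the source
    does not prescribe this representation.
    """
    if not selection:
        raise ValueError("selection is empty; there is no ROI to report")
    x0 = min(box[0] for box in selection)
    y0 = min(box[1] for box in selection)
    x1 = max(box[2] for box in selection)
    y1 = max(box[3] for box in selection)
    return (x0, y0, x1, y1)
```

A caller would pass the bounding boxes that survived the confidence threshold and draw the returned rectangle over the medical image before presenting it to the user.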
- the articles “a,” “an,” and “the” are intended to mean that there are one or more of the elements.
- the adjective “another,” when used to introduce an element, is intended to mean one or more elements.
- the terms “including” and “having” are intended to be inclusive such that there may be additional elements other than the listed elements.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Public Health (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biomedical Technology (AREA)
- General Physics & Mathematics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Databases & Information Systems (AREA)
- Computing Systems (AREA)
- Software Systems (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Pathology (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Multimedia (AREA)
- Data Mining & Analysis (AREA)
- Animal Behavior & Ethology (AREA)
- Veterinary Medicine (AREA)
- Mathematical Physics (AREA)
- Physiology (AREA)
- Heart & Thoracic Surgery (AREA)
- Surgery (AREA)
- Computational Linguistics (AREA)
- General Engineering & Computer Science (AREA)
- Fuzzy Systems (AREA)
- Psychiatry (AREA)
- Signal Processing (AREA)
- Quality & Reliability (AREA)
- Apparatus For Radiation Diagnosis (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
- Image Analysis (AREA)
Abstract
A device to detect an object in a medical image is described. An image analysis application, executed by the device, receives the medical image as an input. The medical image is next partitioned to sub-regions. Parts of the object are detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection are also determined. The bounding boxes are evaluated based on a confidence score detected as above a threshold level. The confidence score designates the parts as contained within the selection. Next, a region of interest (ROI) is determined as a group including the selection. Similar orientations associated with the bounding boxes are comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection is designated as the ROI within the medical image. The medical image is provided with the ROI to a user.
Description
- This application is a continuation application, which relates to and claims the benefit to U.S. application Ser. No. 16/989,625, filed on Aug. 10, 2020 and granted as U.S. Pat. No. 11,043,297 which issued on Jun. 22, 2021, entitled “Neural Network-Based Object Detection In Visual Input”, which relates to and claims the benefit to U.S. application Ser. No. 16/218,832, filed on Dec. 13, 2018, entitled “Object Detection in Medical Image”, the entirety of both of which are incorporated herein by reference.
- The field of the embodiments relates to a device to detect an object within a medical image. The object may be identified and labelled as a region of interest which contains a lesion (malignant).
- Information exchanges have changed processes associated with work and personal environments. Automation and improvements in processes have expanded the scope of capabilities offered for personal and business data consumption. With the development of faster and smaller electronics, a variety of devices have integrated into daily lives. A modern device includes components to provide a variety of services such as communication, display, imaging, voice, and/or data capture, among others. Abilities of the modern device jump exponentially when networked to other resources that provide a previously unimagined number of services associated with medical imaging.
- Ultrasound and other medical imaging devices scan biological structures or tissues of a patient to provide images. The scanned images are provided to medical practitioner(s) to aid with diagnosis of illnesses such as cancer. Clarity and quality of a scanned image are usually suspect and depend on a variety of conditions associated with the patient and the skill of a technician capturing the scanned image. Furthermore, the medical practitioner is also subject to missed diagnosis or false diagnosis associated with the scanned image due to the quality of the scanned image and/or human error.
- The present invention and its embodiments relate to a method to detect an object in a medical image. In an example scenario, the method may include receiving the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The medical image may also be provided with the ROI to a user.
- In another embodiment of the present invention, a device to detect an object in a medical image is described. The device may be configured to receive the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The selection within the ROI may be labelled with annotation(s) associated with a type of tissue. The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface. The medical image may also be provided with the annotation(s) and the ROI to a user.
- In yet another embodiment of the present invention, a device to detect an object in a medical image is described. The device may include a memory configured to store instructions associated with an image analysis application. A processor may be coupled to the memory. The processor may execute the instructions associated with the image analysis application. The image analysis application may include a computer assisted detection (CADe) module. The CADe module may be configured to receive the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The selection within the ROI may be labelled with annotation(s) associated with a type of tissue. The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface. The medical image may also be provided with the annotation(s) and the ROI to a user.
- It is an object of the embodiments of the present invention to detect an object in a medical image.
- It is an object of the embodiments of the present invention to partition the medical image to sub-regions.
- It is an object of the embodiments of the present invention to determine a selection of the sub-regions associated with the object.
- It is an object of the embodiments of the present invention to process the selection to determine a region of interest (ROI).
- It is an object of the embodiments of the present invention to label the selection with annotation(s).
- It is an object of the embodiments to provide the medical image, the annotation(s) and the ROI to a user.
- These and other features, aspects and advantages of the present invention will become better understood with reference to the following drawings, description and claims.
-
FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image, according to an embodiment of the invention. -
FIG. 2 shows a display diagram illustrating components of an image analysis application determining a region of interest (ROI) in the medical image and labelling the ROI with annotation(s), according to an embodiment of the invention. -
FIG. 3 shows another display diagram illustrating components of user interface allowing a user to interact with a ROI and annotation(s) associated with the ROI within a medical image, according to an embodiment of the invention. -
FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image. -
FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image, according to an embodiment of the invention. - The preferred embodiments of the present invention will now be described with reference to the drawings. Identical elements in the various figures are identified with the same reference numerals.
- Reference will now be made in detail to each embodiment of the present invention. Such embodiments are provided by way of explanation of the present invention, which is not intended to be limited thereto. In fact, those of ordinary skill in the art may appreciate upon reading the present specification and viewing the present drawings that various modifications and variations may be made thereto.
-
FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image 108. In an example scenario, a device 104 may execute (or provide) an image analysis application 106. The device 104 may include a physical computing device hosting and/or providing features associated with a client application (such as the image analysis application 106). The device 104 may include and/or be part of a smart phone, a tablet based device, a laptop computer, a desktop computer, a physical server, and/or a cluster of servers, among others. The device 104 may also be a node of a network. The network may also include other nodes such as a medical image provider 112, among others. The network may connect nodes with wired and wireless infrastructure. - The
device 104 may execute the image analysis application 106. In an example scenario, the image analysis application 106 may receive the medical image 108 as an input. An example of the medical image 108 may include an ultrasound image (or scan). Other examples of the medical image 108 may include an x-ray image, a magnetic resonance imaging (MRI) scan, a computed tomography (CT) scan, and/or a positron emission tomography (PET) scan, among others. The medical image 108 may be received from the medical image provider 112. The medical image provider 112 may include a medical imaging device/system that captures, manages, and/or presents the medical image 108 to a user 102. The user 102 may include a medical practitioner such as a doctor, a nurse, and/or a technician, a patient, and/or an administrator, among others. The user 102 may use the medical image 108 to diagnose an issue, a malignancy (cancer), and/or other illness associated with a patient. - The
medical image 108 may include an object 110. The object 110 may include a biological structure of a patient. For example, the object 110 may include a malignant or a benign lesion. Alternatively, the object 110 may represent another structure associated with an organ and/or other body part of the patient. - Next, a computer assisted detection (CADe) module of the
image analysis application 106 may partition the medical image into sub-regions. A size of the sub-regions may be determined by the CADe module based on a size of the object 110. For example, an object 110 that occupies a large portion of the medical image 108 may be partitioned into a large number of the sub-regions. Alternatively, an object 110 that occupies a small portion of the medical image 108 may be partitioned into a small number of the sub-regions. In yet another example scenario, the number of the sub-regions may be determined dynamically based on attributes associated with the medical image 108 such as dimensions, resolution, quality, and clarity. - The CADe module may process each sub-region to detect parts of the
object 110. The parts of the object 110 may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. The DNN model may include a machine learning mechanism based on learning data representations. Learning operations associated with the DNN model may vary from supervised learning to unsupervised learning. - In an example scenario, the CADe module may determine bounding boxes for the selection of the sub-regions (associated with the object 110). The bounding boxes may be evaluated based on a confidence score detected as above a threshold. The confidence score (detected as above the threshold) may designate the parts of the
object 110 as contained within the selection of the sub-regions. The confidence score may confirm that the CADe module has correctly recognized the parts of the object 110 within the bounding boxes representing the selection of the sub-regions (of the medical image 108). The threshold level may be determined automatically by the CADe module based on positive and negative training models within the DNN model. Alternatively, the user 102 may manually determine the threshold level. - Next, a region of interest (ROI) 114 may be determined as a group comprising the selection of the sub-regions. The CADe module may determine the
ROI 114 based on a comparison of similar orientations associated with the bounding boxes (representing the selection of the sub-regions) in relation to similar orientations of a positive training model of the DNN model. The similar orientations of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model. As such, the CADe module may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations) match comparable attributes in the positive training model. - In addition, the selection of the sub-regions may next be designated as the
ROI 114 by the CADe module. Moreover, the medical image 108 may be provided with the ROI 114 to the user 102. The ROI 114 may alert the user 102 regarding a disease state (such as malignant/cancer or benign) associated with the object 110. - Previous example(s) to detect the
object 110 in the medical image 108 are not provided in a limiting sense. Alternatively, the image analysis application 106 may perform operations associated with detecting the object 110 in the medical image 108 as a desktop application, a workstation application, and/or a server application, among others. The image analysis application 106 may also be a client interface of a server based application. Furthermore, the device 104 may include a component of the medical image provider 112. As such, the image analysis application 106 may include a service associated with the medical image provider 112. - The
user 102 may interact with the image analysis application 106 with a keyboard based input, a mouse based input, a voice based input, a pen based input, and a gesture based input, among others. The gesture based input may include one or more touch based actions such as a touch action, a swipe action, and a combination of each, among others. - While the example system in
FIG. 1 has been described with specific components including the device 104 and the image analysis application 106, embodiments are not limited to these components or system configurations and can be implemented with other system configurations employing fewer or additional components. -
FIG. 2 shows a display diagram illustrating components of the image analysis application 106 determining the ROI 114 in the medical image 108 and labelling the ROI 114 with annotation(s) 228. In an example scenario, the medical image 108 may be received as an input by the image analysis application 106. The CADe module 216 may partition the medical image 108 into sub-regions 218 and process the sub-regions 218 with a DNN model 230 to detect parts of the object 110. A selection 220 of the sub-regions 218 may be correlated to the parts of the object 110. - Bounding boxes may next be determined for the
selection 220. The bounding boxes may be representations of the selection 220 and may be used interchangeably to refer to the selection 220. The bounding boxes may be evaluated based on a confidence score 222 detected as above a threshold 224 level. The confidence score 222 (detected as above the threshold 224 level) may designate the parts of the object 110 as contained within the selection 220 (of the sub-regions 218). As such, the confidence score 222 associated with the selection 220 (and/or the bounding boxes) may confirm that the CADe module 216 correctly identified and captured the parts of the object 110 within the selection 220. - Next, the
ROI 114 may be determined as a group (of the sub-regions 218) that includes the selection 220. Similar orientations 226 associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model 230. The similar orientations 226 of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model. As such, the CADe module 216 may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations 226) match comparable attributes in the positive training model. - Similar orientations (226) of the bounding boxes may include similar and/or complementary angular orientations between parts of the
object 110. For example, parts of the object 110 within a sub-region 218 positioned in a top right quadrant may include edges that are oriented outwards toward the top and right directions. Similarly, other parts of the object 110 within the sub-regions 218 located in top left, bottom left, and bottom right quadrants of the selection 220 may include edges that are oriented outwards toward top left, bottom left, and bottom right directions, respectively. - Furthermore,
similar orientations 226 may include similar distances between the parts of the object 110. In addition, the CADe module 216 may apply a non-maximum suppression (NMS) mechanism to the selection 220 of the sub-regions 218 to obtain the bounding boxes (associated with the selection 220) in relation to the object 110. Moreover, the DNN model 230 associated with a detection of the parts of the object 110 may include a region based convolutional neural network (R-CNN) model, a fast R-CNN model, a faster R-CNN model, a you only look once (YOLO) model, and/or a single shot multi-box detector (SSD) model, among others. - Furthermore, the
CADe module 216 may label the selection 220 of the sub-regions within the ROI 114 with annotation(s) 228 associated with a type of tissue (in relation to the object 110). The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, and/or abrupt interface, among others. The annotation(s) 228 may also include a relative position associated with a part (of the object 110). The relative position may include top, bottom, left side, right side, and/or combinations of relative positions, among others. - The
CADe module 216 may emphasize the object 110 with the ROI 114 as a lesion within the medical image 108 by processing the medical image 108 with the DNN model 230. Furthermore, a training mechanism associated with the DNN model 230 may include a compensation for unbalanced training data such as a majority of training medical images with no lesion and a minority of training medical images with a lesion. In addition, the training mechanism may include a down-sampling of the majority, an up-sampling of the minority, or a utilization of a cost sensitive mechanism, a gradient boosting machine, or a hard negative mining mechanism. -
FIG. 3 shows another display diagram illustrating components of a user interface allowing a user to interact with the ROI 114 and the annotation(s) 228 associated with the ROI 114 within the medical image 108. The image analysis application 106 (executed by the device 104) may provide the medical image 108 (or a digital copy) with the annotation(s) 228 and the ROI 114 to a user (such as a medical practitioner or a patient). Examples of the annotation(s) 228 may include a lobulated 332 sub-region, a spiculated 334 sub-region, an angular 336 sub-region, a clear boundary 338 sub-region, an oval 340 sub-region, a circumscribed 342 sub-region, and/or an abrupt interface 344 sub-region, among others. - The user interface may also be configured to allow the user to change the annotation(s) 228 associated with the
ROI 114. In an example scenario, the image analysis application 106 may detect that the user has provided a change to the annotation(s) 228 to personalize the annotation(s) for the user. In response, the image analysis application 106 may identify a rate of concordance of the user in relation to the DNN model 230. The rate of concordance may include an evaluation of a correctness of the user when diagnosing medical images in relation to the object(s) within the medical images (whether malignant or benign). - The
image analysis application 106 may determine the concordance rate of the user as above (or equal to) a threshold. The threshold may include a level which is correlated with a competence (associated with the user) when diagnosing an object within medical images as malignant or benign. In response to the determination associated with the concordance rate, the image analysis application 106 may retrain the DNN model 230 based on the change to the annotation(s) 228. The selection 220 (of the sub-regions 218) may be re-processed based on the change to the annotation(s) 228. Furthermore, the selection 220 of the sub-regions 218 may be re-labelled based on the change to the annotation(s) 228 to personalize the annotation(s) 228 to preference(s) of the user. - In another example scenario, the
image analysis application 106 may determine the concordance rate of the user as below a threshold. In response, the change to the annotation(s) 228 (introduced by the user) may be rejected. A notification may be provided to the user to re-evaluate the change to the annotation(s). The notification may alert the user that the user may be incorrect regarding the change to the annotation(s) 228. - In yet another example scenario, the user may be allowed to interact, through an augmented reality display, with the
medical image 108, the ROI 114, and the annotation(s) 228. The annotation(s) 228 may also be provided as text, sound, or texture associated with the ROI 114. - The example scenarios and schemas in
FIGS. 1 through 3 are shown with specific components, data types, and configurations. Embodiments are not limited to systems according to these example configurations. A device to detect an object in a medical image may be implemented in configurations employing fewer or additional components in applications and user interfaces. Furthermore, the example schema and components shown in FIGS. 1 through 3 and their subcomponents may be implemented in a similar manner with other values using the principles described herein. -
FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image, according to embodiments. - For example,
computing device 400 may be used as a server, desktop computer, portable computer, smart phone, special purpose computer, or similar device. In a basic configuration 402, the computing device 400 may include one or more processors 404 and a system memory 406. A memory bus 408 may be used for communication between the processor 404 and the system memory 406. The basic configuration 402 may be illustrated in FIG. 4 by those components within the inner dashed line. - Depending on the desired configuration, the
processor 404 may be of any type, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 404 may include one or more levels of caching, such as a level cache memory 412, one or more processor cores 414, and registers 416. The example processor cores 414 may (each) include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP Core), a graphics processing unit (GPU), or any combination thereof. An example memory controller 418 may also be used with the processor 404, or in some implementations, the memory controller 418 may be an internal part of the processor 404. - Depending on the desired configuration, the
system memory 406 may be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. The system memory 406 may include an operating system 420, the image analysis application 106, and program data 424. The image analysis application 106 may include components such as the CADe module 216. The CADe module 216 may execute the instructions and processes associated with the image analysis application 106. In an example scenario, the CADe module 216 may receive the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The medical image may also be provided with the ROI to a user. - Input to and output out of the
image analysis application 106 may be captured and displayed through a display component that may be integrated to the computing device 400. The display component may include a display screen, and/or a display monitor, among others that may capture an input through a touch/gesture based component such as a digitizer. The program data 424 may also include, among other data, the medical image 108, or the like, as described herein. The object 110 in the medical image 108 may be identified and emphasized with the ROI 114 and annotation(s) 228, among other things. - The
computing device 400 may have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 402 and any desired devices and interfaces. For example, a bus/interface controller 430 may be used to facilitate communications between the basic configuration 402 and one or more data storage devices 432 via a storage interface bus 434. The data storage devices 432 may be one or more removable storage devices 436, one or more non-removable storage devices 438, or a combination thereof. Examples of the removable storage and the non-removable storage devices may include magnetic disk devices, such as flexible disk drives and hard-disk drives (HDDs), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSDs), and tape drives, to name a few. Example computer storage media may include volatile and nonvolatile, removable, and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data. - The
system memory 406, the removable storage devices 436 and the non-removable storage devices 438 are examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVDs), solid state drives, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by the computing device 400. Any such computer storage media may be part of the computing device 400. - The
computing device 400 may also include an interface bus 440 for facilitating communication from various interface devices (for example, one or more output devices 442, one or more peripheral interfaces 444, and one or more communication devices 466) to the basic configuration 402 via the bus/interface controller 430. Some of the example output devices 442 include a graphics processing unit 448 and an audio processing unit 450, which may be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 452. One or more example peripheral interfaces 444 may include a serial interface controller 454 or a parallel interface controller 456, which may be configured to communicate with external devices such as input devices (for example, keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (for example, printer, scanner, etc.) via one or more I/O ports 458. An example of the communication device(s) 466 includes a network controller 460, which may be arranged to facilitate communications with one or more other computing devices 462 over a network communication link via one or more communication ports 464. The one or more other computing devices 462 may include servers, computing devices, and comparable devices. - The network communication link may be one example of a communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A “modulated data signal” may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. 
By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) and other wireless media. The term computer readable media as used herein may include both storage media and communication media.
- The
computing device 400 may be implemented as a part of a specialized server, mainframe, or similar computer, which includes any of the above functions. The computing device 400 may also be implemented as a personal computer including both laptop computer and non-laptop computer configurations. Additionally, the computing device 400 may include specialized hardware such as an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a programmable logic device (PLD), and/or a free form logic on an integrated circuit (IC), among others. - Example embodiments may also include methods to detect an object in a medical image. These methods can be implemented in any number of ways, including the structures described herein. One such way may be by machine operations, of devices of the type described in the present disclosure. Another optional way may be for one or more of the individual operations of the methods to be performed in conjunction with one or more human operators performing some of the operations while other operations may be performed by machines. These human operators need not be collocated with each other, but each can be only with a machine that performs a portion of the program. In other embodiments, the human interaction can be automated such as by pre-selected criteria that may be machine automated.
-
FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image. Process 500 may be implemented on a computing device, such as the computing device 400 or another system. -
Process 500 begins with operation 510, where an image analysis application may receive the medical image as an input. At operation 520, the medical image may be partitioned to sub-regions. At operation 530, parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. At operation 540, bounding boxes for the selection may be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. - Next, at
operation 550, a ROI may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, at operation 560, the selection may be designated as the ROI within the medical image. At operation 570, the medical image may be provided with the ROI to a user. - The operations included in
process 500 are for illustration purposes. Detecting an object in a medical image may be implemented by similar processes with fewer or additional steps, as well as in different order of operations using the principles described herein. The operations described herein may be executed by one or more processors operated on one or more computing devices, one or more processor cores, specialized processing devices, and/or special purpose processors, among other examples. - A method of detecting an object in a medical image is also described. The method includes receiving the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The medical image may also be provided with the ROI to a user.
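By way of illustration only, and not limitation, the partitioning, confidence-threshold selection, and ROI grouping described above may be sketched as follows. The sketch makes several assumptions that are not part of the disclosure: sub-regions are taken to be fixed-size, non-overlapping tiles; the DNN model is replaced by a hypothetical scoring callable `score_fn`; the ROI is taken to be the smallest box enclosing the selection; and the comparison of bounding-box orientations against a positive training model is omitted. All function and type names are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

# (x0, y0, x1, y1) in pixel coordinates; x1 and y1 are exclusive.
Box = Tuple[int, int, int, int]


@dataclass
class Detection:
    box: Box      # bounding box of one sub-region
    score: float  # confidence that the sub-region contains part of the object


def partition(width: int, height: int, tile: int) -> List[Box]:
    """Split an image of the given size into tiles (assumed scheme; edge tiles may be smaller)."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in range(0, height, tile)
        for x in range(0, width, tile)
    ]


def select(boxes: List[Box], score_fn: Callable[[Box], float],
           threshold: float) -> List[Detection]:
    """Score each sub-region and keep those whose confidence is above the threshold.

    score_fn is a hypothetical stand-in for the DNN model.
    """
    detections = [Detection(box, score_fn(box)) for box in boxes]
    return [d for d in detections if d.score > threshold]


def region_of_interest(selection: List[Detection]) -> Optional[Box]:
    """Group the selection into one enclosing box (assumed grouping rule)."""
    if not selection:
        return None
    return (
        min(d.box[0] for d in selection),
        min(d.box[1] for d in selection),
        max(d.box[2] for d in selection),
        max(d.box[3] for d in selection),
    )
```

In such a sketch, the returned box could be drawn over the medical image before the image is provided to a user; a `None` result would indicate that no sub-region exceeded the threshold.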
- When introducing elements of the present disclosure or the embodiment(s) thereof, the articles “a,” “an,” and “the” are intended to mean that there are one or more of the elements. Similarly, the adjective “another,” when used to introduce an element, is intended to mean one or more elements. The terms “including” and “having” are intended to be inclusive such that there may be additional elements other than the listed elements.
- Although this invention has been described with a certain degree of particularity, it is to be understood that the present disclosure has been made only by way of illustration and that numerous changes in the details of construction and arrangement of parts may be resorted to without departing from the spirit and the scope of the invention.
Claims (1)
1. A method comprising:
receiving, by a processor, an image;
determining, by the processor, a size of the image;
utilizing, by the processor, a neural network (NN) model to analyze the image to identify objects within the image based at least in part on confidence score; and
instructing, by the processor, an output computer module to produce an alert indicative of the identified objects.
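By way of illustration only, and not as a construction of the claim, the recited steps may be sketched as follows. The sketch assumes an image stored as a list of pixel rows, a hypothetical `model` callable that returns (label, confidence score) pairs in place of the neural network model, and a plain list of strings in place of the output computer module; none of these names appear in the disclosure.

```python
from typing import Callable, List, Sequence, Tuple

Image = Sequence[Sequence[float]]               # rows of pixel values (assumed layout)
Model = Callable[[Image], List[Tuple[str, float]]]  # hypothetical stand-in for the NN model


def image_size(image: Image) -> Tuple[int, int]:
    """Return (height, width) of an image stored as a list of rows."""
    height = len(image)
    width = len(image[0]) if height else 0
    return height, width


def detect_and_alert(image: Image, model: Model, threshold: float) -> List[str]:
    """Determine the image size, run the model, and build one alert per
    object whose confidence score meets the threshold."""
    height, width = image_size(image)
    alerts = []
    for label, score in model(image):
        if score >= threshold:
            alerts.append(
                f"{label} detected (score {score:.2f}) in {width}x{height} image"
            )
    return alerts
```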
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17/352,438 US20210313048A1 (en) | 2018-12-13 | 2021-06-21 | Neural network-based object detection in visual input |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/218,832 US20200194108A1 (en) | 2018-12-13 | 2018-12-13 | Object detection in medical image |
US16/989,625 US11043297B2 (en) | 2018-12-13 | 2020-08-10 | Neural network-based object detection in visual input |
US17/352,438 US20210313048A1 (en) | 2018-12-13 | 2021-06-21 | Neural network-based object detection in visual input |
Related Parent Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/989,625 Continuation US11043297B2 (en) | 2018-12-13 | 2020-08-10 | Neural network-based object detection in visual input |
Publications (1)
Publication Number | Publication Date |
---|---|
US20210313048A1 (en) | 2021-10-07 |
Family
ID=69167907
Family Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/218,832 Abandoned US20200194108A1 (en) | 2018-12-13 | 2018-12-13 | Object detection in medical image |
US16/989,625 Active US11043297B2 (en) | 2018-12-13 | 2020-08-10 | Neural network-based object detection in visual input |
US17/352,438 Abandoned US20210313048A1 (en) | 2018-12-13 | 2021-06-21 | Neural network-based object detection in visual input |
Family Applications Before (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/218,832 Abandoned US20200194108A1 (en) | 2018-12-13 | 2018-12-13 | Object detection in medical image |
US16/989,625 Active US11043297B2 (en) | 2018-12-13 | 2020-08-10 | Neural network-based object detection in visual input |
Country Status (2)
Country | Link |
---|---|
US (3) | US20200194108A1 (en) |
WO (1) | WO2020123749A1 (en) |
Families Citing this family (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108230294B (en) * | 2017-06-14 | 2020-09-29 | 北京市商汤科技开发有限公司 | Image detection method, image detection device, electronic equipment and storage medium |
CN108537151A (en) * | 2018-03-27 | 2018-09-14 | 上海小蚁科技有限公司 | A kind of non-maxima suppression arithmetic unit and system |
US20200394458A1 (en) * | 2019-06-17 | 2020-12-17 | Nvidia Corporation | Weakly-supervised object detection using one or more neural networks |
US11720817B2 (en) * | 2019-07-01 | 2023-08-08 | Medical Informatics Corp. | Waveform annotator |
US11188740B2 (en) * | 2019-12-18 | 2021-11-30 | Qualcomm Incorporated | Two-pass omni-directional object detection |
CN111598866B (en) * | 2020-05-14 | 2023-04-11 | 四川大学 | Lens key feature positioning method based on eye B-ultrasonic image |
CA3103872A1 (en) * | 2020-12-23 | 2022-06-23 | Pulsemedica Corp. | Automatic annotation of condition features in medical images |
US11961314B2 (en) * | 2021-02-16 | 2024-04-16 | Nxp B.V. | Method for analyzing an output of an object detector |
CN113191353A (en) * | 2021-04-15 | 2021-07-30 | 华北电力大学扬中智能电气研究中心 | Vehicle speed determination method, device, equipment and medium |
CN113744328B (en) * | 2021-11-05 | 2022-02-15 | 极限人工智能有限公司 | Medical image mark point identification method and device, electronic equipment and storage medium |
US12020475B2 (en) * | 2022-02-21 | 2024-06-25 | Ford Global Technologies, Llc | Neural network training |
US20240112329A1 (en) * | 2022-10-04 | 2024-04-04 | HeHealth PTE Ltd. | Distinguishing a Disease State from a Non-Disease State in an Image |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180204111A1 (en) * | 2013-02-28 | 2018-07-19 | Z Advanced Computing, Inc. | System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform |
US20190384304A1 (en) * | 2018-06-13 | 2019-12-19 | Nvidia Corporation | Path detection for autonomous machines using deep neural networks |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20120102447A (en) | 2011-03-08 | 2012-09-18 | 삼성전자주식회사 | Method and apparatus for diagnostic |
US8467607B1 (en) | 2011-11-21 | 2013-06-18 | Google Inc. | Segmentation-based feature pooling for object models |
US10140709B2 (en) * | 2017-02-27 | 2018-11-27 | International Business Machines Corporation | Automatic detection and semantic description of lesions using a convolutional neural network |
CN107392218B (en) | 2017-04-11 | 2020-08-04 | 创新先进技术有限公司 | Vehicle loss assessment method and device based on image and electronic equipment |
EP3392832A1 (en) | 2017-04-21 | 2018-10-24 | General Electric Company | Automated organ risk segmentation machine learning methods and systems |
EP3399465A1 (en) * | 2017-05-05 | 2018-11-07 | Dassault Systèmes | Forming a dataset for fully-supervised learning |
JP7227168B2 (en) * | 2017-06-19 | 2023-02-21 | モハメド・アール・マーフーズ | Surgical Navigation of the Hip Using Fluoroscopy and Tracking Sensors |
US10268204B2 (en) | 2017-08-30 | 2019-04-23 | GM Global Technology Operations LLC | Cross traffic detection using cameras |
US11446008B2 (en) * | 2018-08-17 | 2022-09-20 | Tokitae Llc | Automated ultrasound video interpretation of a body part with one or more convolutional neural networks |
- 2018-12-13: US application US16/218,832, published as US20200194108A1 (not active, Abandoned)
- 2019-12-12: WO application PCT/US2019/065867, published as WO2020123749A1 (active, Application Filing)
- 2020-08-10: US application US16/989,625, published as US11043297B2 (active, Active)
- 2021-06-21: US application US17/352,438, published as US20210313048A1 (not active, Abandoned)
Also Published As
Publication number | Publication date |
---|---|
WO2020123749A1 (en) | 2020-06-18 |
US20200194108A1 (en) | 2020-06-18 |
US11043297B2 (en) | 2021-06-22 |
US20210050095A1 (en) | 2021-02-18 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
US20210313048A1 (en) | Neural network-based object detection in visual input | |
US10290101B1 (en) | Heat map based medical image diagnostic mechanism | |
KR101943011B1 (en) | Method for facilitating medical image reading and apparatus using the same | |
US10853449B1 (en) | Report formatting for automated or assisted analysis of medical imaging data and medical diagnosis | |
US9760689B2 (en) | Computer-aided diagnosis method and apparatus | |
US10692602B1 (en) | Structuring free text medical reports with forced taxonomies | |
JP2022024139A (en) | Computer-aided detection using multiple images from different views of region of interest to improve detection accuracy | |
US20210264574A1 (en) | Correcting image blur in medical image | |
US20210297588A1 (en) | Medical image based distortion correction mechanism | |
JP6796060B2 (en) | Image report annotation identification | |
US20210118551A1 (en) | Device to enhance and present medical image using corrective mechanism | |
CN109191451B (en) | Abnormality detection method, apparatus, device, and medium | |
JP7240001B2 (en) | METHOD FOR SUPPORTING BROWSING IMAGES AND DEVICE USING THE SAME | |
US20150173705A1 (en) | Apparatus and method for adapting diagnostic model for computer-aided diagnosis | |
US20210048941A1 (en) | Method for providing an image base on a reconstructed image group and an apparatus using the same | |
US20210407637A1 (en) | Method to display lesion readings result | |
JP2019030584A (en) | Image processing system, apparatus, method, and program | |
Cheng et al. | Development and validation of a deep learning pipeline to measure pericardial effusion in echocardiography | |
Arias-Londoño et al. | Analysis of the Clever Hans effect in COVID-19 detection using Chest X-Ray images and Bayesian Deep Learning | |
Ragnarsdottir et al. | Interpretable prediction of pulmonary hypertension in newborns using echocardiograms | |
KR102507451B1 (en) | Method to read chest image | |
US20230033263A1 (en) | Information processing system, information processing method, information terminal, and non-transitory computer-readable medium | |
CN111028173B (en) | Image enhancement method, device, electronic equipment and readable storage medium | |
US20240087304A1 (en) | System for medical data analysis | |
US20220076796A1 (en) | Medical document creation apparatus, method and program, learning device, method and program, and trained model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
AS | Assignment |
Owner name: RUTGERS, THE STATE UNIVERSITY OF NEW JERSEY, NEW JERSEY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PODILCHUK, CHRISTINE I.;MAMMONE, RICHARD;REEL/FRAME:057751/0897 Effective date: 20210121 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |