US20210313048A1 - Neural network-based object detection in visual input - Google Patents

Neural network-based object detection in visual input

Info

Publication number
US20210313048A1
Authority
US
United States
Prior art keywords
medical image
selection
roi
sub-regions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US17/352,438
Inventor
Christine I. Podilchuk
Richard Mammone
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Rutgers State University of New Jersey
Original Assignee
Rutgers State University of New Jersey
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Rutgers State University of New Jersey filed Critical Rutgers State University of New Jersey
Priority to US17/352,438
Publication of US20210313048A1
Assigned to RUTGERS, THE STATE UNIVERSITY OF NEW JERSEY. Assignment of assignors interest. Assignors: MAMMONE, RICHARD; PODILCHUK, CHRISTINE I.
Legal status: Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B8/00 Diagnosis using ultrasonic, sonic or infrasonic waves
    • A61B8/52 Devices using data or image processing specially adapted for diagnosis using ultrasonic, sonic or infrasonic waves
    • A61B8/5215 Devices using data or image processing specially adapted for diagnosis using ultrasonic, sonic or infrasonic waves involving processing of medical diagnostic data
    • A61B8/5223 Devices using data or image processing specially adapted for diagnosis using ultrasonic, sonic or infrasonic waves involving processing of medical diagnostic data for extracting a diagnostic or physiological parameter from medical diagnostic data
    • G06K9/3233
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/20 ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03 Recognition of patterns in medical or anatomical images

Definitions

  • the field of the embodiments relates to a device to detect an object within a medical image.
  • the object may be identified and labelled as a region of interest which contains a lesion (malignant).
  • a modern device includes components to provide a variety of services such as communication, display, imaging, voice, and/or data capture, among others. The capabilities of the modern device grow exponentially when it is networked to other resources that provide a previously unimagined number of services associated with medical imaging.
  • Ultrasound and other medical imaging devices scan biological structures or tissues of a patient to provide images.
  • the scanned images are provided to medical practitioner(s) to aid with the diagnosis of illnesses such as cancer. The clarity and quality of a scanned image are often suspect and depend on a variety of conditions associated with the patient and the skill of the technician capturing the scanned image. Furthermore, the medical practitioner is also subject to a missed diagnosis or a false diagnosis associated with the scanned image due to the quality of the scanned image and/or human error.
  • the present invention and its embodiments relate to a method to detect an object in a medical image.
  • the method may include receiving the medical image as an input.
  • the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
  • Bounding boxes for the selection may also be determined.
  • the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
  • the confidence score may designate the parts as contained within the selection.
  • a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
  • the selection may be designated as the ROI within the medical image.
  • the medical image may also be provided with the ROI to a user.
  • a device to detect an object in a medical image may be configured to receive the medical image as an input.
  • the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
  • DNN deep-learning neural network
  • Bounding boxes for the selection may also be determined.
  • the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
  • the confidence score may designate the parts as contained within the selection.
  • a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
  • the selection may be designated as the ROI within the medical image.
  • the selection within the ROI may be labelled with annotation(s) associated with a type of tissue.
  • the type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface.
  • the medical image may also be provided with the annotation(s) and the ROI to a user.
  • a device to detect an object in a medical image may include a memory configured to store instructions associated with an image analysis application.
  • a processor may be coupled to the memory.
  • the processor may execute the instructions associated with the image analysis application.
  • the image analysis application may include a computer assisted detection (CADe) module.
  • the CADe module may be configured to receive the medical image as an input.
  • the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined.
  • the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
  • the confidence score may designate the parts as contained within the selection.
  • a region of interest may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image.
  • the selection within the ROI may be labelled with annotation(s) associated with a type of tissue. The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface.
  • the medical image may also be provided with the annotation(s) and the ROI to a user.
  • ROI region of interest
  • FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image, according to an embodiment of the invention.
  • FIG. 2 shows a display diagram illustrating components of an image analysis application determining a region of interest (ROI) in the medical image and labelling the ROI with annotation(s), according to an embodiment of the invention.
  • ROI region of interest
  • FIG. 3 shows another display diagram illustrating components of user interface allowing a user to interact with a ROI and annotation(s) associated with the ROI within a medical image, according to an embodiment of the invention.
  • FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image.
  • FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image, according to an embodiment of the invention.
  • FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image 108 .
  • a device 104 may execute (or provide) an image analysis application 106 .
  • the device 104 may include a physical computing device hosting and/or providing features associated with a client application (such as the image analysis application 106 ).
  • the device 104 may include and/or be part of a smart phone, a tablet based device, a laptop computer, a desktop computer, a physical server, and/or a cluster of servers, among others.
  • the device 104 may also be a node of a network.
  • the network may also include other nodes such as a medical image provider 112 , among others.
  • the network may connect nodes with wired and wireless infrastructure.
  • the device 104 may execute the image analysis application 106 .
  • the image analysis application 106 may receive the medical image 108 as an input.
  • An example of the medical image 108 may include an ultrasound image (or scan).
  • Other examples of the medical image 108 may include an x-ray image, a magnetic resonance imaging (MRI) scan, a computed tomography (CT) scan, and/or a positron emission tomography (PET) scan, among others.
  • the medical image 108 may be received from the medical image provider 112 .
  • the medical image provider 112 may include a medical imaging device/system that captures, manages, and/or presents the medical image 108 to a user 102 .
  • the user 102 may include a medical practitioner such as a doctor, a nurse, and/or a technician, a patient, and/or an administrator, among others.
  • the user 102 may use the medical image 108 to diagnose an issue, a malignancy (cancer), and/or other illness associated with a patient.
  • cancer malignancy
  • the medical image 108 may include an object 110 .
  • the object 110 may include a biological structure of a patient.
  • the object 110 may include a malignant or a benign lesion.
  • the object 110 may represent another structure associated with an organ and/or other body part of the patient.
  • a computer assisted detection (CADe) module of the image analysis application 106 may partition the medical image into sub-regions.
  • a size of the sub-regions may be determined by the CADe module based on a size of the object 110 .
  • the object 110 that consumes a large portion of the medical image 108 may be partitioned to a large number of the sub-regions.
  • the object 110 that consumes a small portion of the medical image 108 may be partitioned to a small number of the sub-regions.
  • the number of the sub-regions may be determined dynamically based on attributes associated with the medical image 108 such as dimensions, resolution, quality, and clarity.
  • the CADe module may process each sub-region to detect parts of the object 110 .
  • the parts of the object 110 may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
  • the DNN model may include a machine learning mechanism based on learning data representations. Learning operations associated with the DNN model may vary from supervised learning to unsupervised learning.
  • the CADe module may determine bounding boxes for the selection of the sub-regions (associated with the object 110 ).
  • the bounding boxes may be evaluated based on a confidence score detected as above a threshold.
  • the confidence score (detected as above the threshold) may designate the parts of the object 110 as contained within the selection of the sub-regions.
  • the confidence score may confirm that the CADe module has correctly recognized the parts of the object 110 within the bounding boxes representing the selection of the sub-regions (of the medical image 108 ).
  • the threshold level may be determined automatically by the CADe module based on positive and negative training models within the DNN model. Alternatively, the user 102 may manually determine the threshold level.
  • a region of interest (ROI) 114 may be determined as a group comprising the selection of the sub-regions.
  • the CADe module may determine the ROI 114 based on a comparison of similar orientations associated with the bounding boxes (representing the selection of the sub-regions) in relation to similar orientations of a positive training model of the DNN model.
  • the similar orientations of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model.
  • the CADe module may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations) match comparable attributes in the positive training model.
  • the selection of the sub-regions may next be designated as the ROI 114 by the CADe module.
  • the medical image 108 may be provided with the ROI 114 to the user 102 .
  • the ROI 114 may alert the user 102 regarding a disease state (such as malignant/cancer or benign) associated with the object 110 .
  • the image analysis application 106 may perform operations associated with detecting the object 110 in the medical image 108 as a desktop application, a workstation application, and/or a server application, among others.
  • the image analysis application 106 may also be a client interface of a server based application.
  • the device 104 may include a component of the medical image provider 112 .
  • the image analysis application 106 may include a service associated with the medical image provider 112 .
  • the user 102 may interact with the image analysis application 106 with a keyboard based input, a mouse based input, a voice based input, a pen based input, and a gesture based input, among others.
  • the gesture based input may include one or more touch based actions such as a touch action, a swipe action, and a combination of each, among others.
  • While FIG. 1 has been described with specific components, including the device 104 and the image analysis application 106 , embodiments are not limited to these components or system configurations and can be implemented with other system configurations employing fewer or additional components.
  • FIG. 2 shows a display diagram illustrating components of the image analysis application 106 determining the ROI 114 in the medical image 108 and labelling the ROI 114 with annotation(s) 228 .
  • the medical image 108 may be received as an input by the image analysis application 106 .
  • the CADe module 216 may partition the medical image 108 into sub-regions 218 and process the sub-regions 218 with a DNN model 230 to detect parts of the object 110 .
  • a selection 220 of the sub-regions 218 may be correlated to the parts of the object 110 .
  • Bounding boxes may next be determined for the selection 220 .
  • the bounding boxes may be representations of the selection 220 and may be used interchangeably to refer to the selection 220 .
  • the bounding boxes may be evaluated based on a confidence score 222 detected as above a threshold 224 level.
  • the confidence score 222 (detected as above the threshold 224 level) may designate the parts of the object 110 as contained within the selection 220 (of the sub-regions 218 ).
  • the confidence score 222 associated with the selection 220 (and/or the bounding boxes) may confirm that the CADe module 216 correctly identified and captured the parts of the object 110 within the selection 220 .
  • the ROI 114 may be determined as a group (of the sub-regions 218 ) that includes the selection 220 .
  • Similar orientations 226 associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model 230 .
  • the similar orientations 226 of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model.
  • the CADe module 216 may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations 226 ) match comparable attributes in the positive training model.
  • Similar orientations ( 226 ) of the bounding boxes may include similar and/or complementary angular orientations between parts of the object 110 .
  • parts of the object 110 within a sub-region 218 positioned in a top right quadrant may include edges that are oriented outwards toward the top and right directions.
  • other parts of the object 110 within the sub-regions 218 located in top left, bottom left, and bottom right quadrants of the selection 220 may include edges that are oriented outwards toward top left, bottom left, and bottom right directions, respectively.
  • similar orientations 226 may include similar distances between the parts of the object 110 .
  • the CADe module 216 may apply a non-maximum suppression (NMS) mechanism to the selection 220 of the sub-regions 218 to obtain the bounding boxes (associated with the selection 220 ) in relation to the object 110 .
  • the DNN model 230 associated with a detection of the parts of the object 110 may include a region based convolutional neural network (R-CNN) model, a fast R-CNN model, a faster R-CNN model, a you only look once (YOLO) model, and/or a single shot multi-box (SSD) model, among others.
  • R-CNN region based convolutional neural network
  • YOLO you only look once
  • SSD single shot multi-box
  • the CADe module 216 may label the selection 220 of the sub-regions within the ROI 114 with annotation(s) 228 associated with a type of tissue (in relation to the object 110 ).
  • the type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, and/or abrupt interface, among others.
  • the annotation(s) 228 may also include a relative position associated with a part (of the object 110 ). The relative position may include top, bottom, left side, right side, and/or combinations of relative positions, among others.
  • the CADe module 216 may emphasize the object 110 with the ROI 114 as a lesion within the medical image 108 by processing the medical image 108 with the DNN model 230 .
  • a training mechanism associated with the DNN model 230 may include compensation for unbalanced training data, such as a majority of training medical images with no lesion and a minority of training medical images with a lesion.
  • the training mechanism may include a down-sampling of the majority, an up-sampling of the minority, or a utilization of a cost-sensitive mechanism, a gradient boosting machine, or a hard negative mining mechanism.
  • FIG. 3 shows another display diagram illustrating components of a user interface allowing a user to interact with the ROI 114 and the annotation(s) 228 associated with the ROI 114 within the medical image 108 .
  • the image analysis application 106 (executed by the device 104 ) may provide the medical image 108 (or a digital copy) with the annotation(s) 228 and the ROI 114 to a user (such as a medical practitioner or a patient).
  • Examples of the annotation(s) 228 may include a lobulated 332 sub-region, a spiculated 334 sub-region, an angular 336 sub-region, a clear boundary 338 sub-region, an oval 340 sub-region, a circumscribed 342 sub-region, and/or an abrupt interface 344 sub-region, among others.
  • the user interface may also be configured to allow the user to change the annotation(s) 228 associated with the ROI 114 .
  • the image analysis application 106 may detect the user providing a change to the annotation(s) 228 to personalize the annotation(s) for the user.
  • the image analysis application 106 may identify a rate of concordance of the user in relation to the DNN model 230 .
  • the rate of concordance may include an evaluation of a correctness of the user when diagnosing medical images in relation to the object(s) within the medical images (whether malignant or benign).
  • the image analysis application 106 may determine the concordance rate of the user as above (or equal to) a threshold.
  • the threshold may include a level which is correlated with a competence (associated with the user) when diagnosing an object within medical images as malignant or benign.
  • the image analysis application 106 may retrain the DNN model 230 based on the change to the annotation(s) 228 .
  • the selection 220 (of the sub-regions 218 ) may be re-processed based on the change to the annotation(s) 228 .
  • the selection 220 of the sub-regions 218 may be re-labelled based on the change to the annotation(s) 228 to personalize the annotation(s) 228 to preference(s) of the user.
  • the image analysis application 106 may determine the concordance rate of the user as below a threshold.
  • the change to the annotation(s) 228 (introduced by the user) may be rejected.
  • a notification may be provided to the user to re-evaluate the change to the annotation(s). The notification may alert the user that the user may be incorrect regarding the change to the annotation(s) 228 .
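  • As a rough illustration of this gating logic, the Python sketch below accepts or rejects a user's annotation change by comparing the user's historical concordance rate against a threshold; the 0.8 threshold, the function names, and the return structure are assumptions for illustration, not part of the disclosed embodiment.

```python
def concordance_rate(past_diagnoses, confirmed_outcomes):
    """Fraction of the user's past diagnoses that matched the confirmed outcome."""
    if not past_diagnoses:
        return 0.0
    matches = sum(1 for d, g in zip(past_diagnoses, confirmed_outcomes) if d == g)
    return matches / len(past_diagnoses)

def apply_annotation_change(past_diagnoses, confirmed_outcomes, change, threshold=0.8):
    """Accept the user's annotation change (and flag the DNN model for retraining)
    only when the user's concordance rate is at or above the threshold."""
    if concordance_rate(past_diagnoses, confirmed_outcomes) >= threshold:
        return {"accepted": True, "retrain_model": True, "change": change}
    return {"accepted": False,
            "notification": "Please re-evaluate the proposed annotation change."}
```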
  • the user may be allowed to interact, through an augmented reality display, with the medical image 108 , the ROI 114 , and the annotation(s) 228 .
  • the annotation(s) 228 may also be provided as text, sound, or texture associated with the ROI 114 .
  • FIGS. 1 through 3 are shown with specific components, data types, and configurations. Embodiments are not limited to systems according to these example configurations.
  • a device to detect an object in a medical image may be implemented in configurations employing fewer or additional components in applications and user interfaces.
  • the example schema and components shown in FIGS. 1 through 3 and their subcomponents may be implemented in a similar manner with other values using the principles described herein.
  • FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image, according to embodiments.
  • computing device 400 may be used as a server, desktop computer, portable computer, smart phone, special purpose computer, or similar device.
  • the computing device 400 may include one or more processors 404 and a system memory 406 .
  • a memory bus 408 may be used for communication between the processor 404 and the system memory 406 .
  • the basic configuration 402 may be illustrated in FIG. 4 by those components within the inner dashed line.
  • the processor 404 may be of any type, including but not limited to a microprocessor ( ⁇ P), a microcontroller ( ⁇ C), a digital signal processor (DSP), or any combination thereof.
  • the processor 404 may include one or more levels of caching, such as a level cache memory 412 , one or more processor cores 414 , and registers 416 .
  • the example processor cores 414 may (each) include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP Core), a graphics processing unit (GPU), or any combination thereof.
  • An example memory controller 418 may also be used with the processor 404 , or in some implementations, the memory controller 418 may be an internal part of the processor 404 .
  • the system memory 406 may be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof.
  • the system memory 406 may include an operating system 420 , the image analysis application 106 , and a program data 424 .
  • the image analysis application 106 may include components such as the CADe module 216 .
  • the CADe module 216 may execute the instructions and processes associated with the image analysis application 106 .
  • the CADe module 216 may receive the medical image as an input. The medical image may next be partitioned to sub-regions.
  • Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
  • Bounding boxes for the selection may also be determined.
  • the bounding boxes may be evaluated based on a confidence score detected as above a threshold level.
  • the confidence score may designate the parts as contained within the selection.
  • a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
  • the selection may be designated as the ROI within the medical image.
  • the medical image may also be provided with the ROI to a user.
  • Input to and output from the image analysis application 106 may be captured and displayed through a display component that may be integrated into the computing device 400 .
  • the display component may include a display screen, and/or a display monitor, among others that may capture an input through a touch/gesture based component such as a digitizer.
  • the program data 424 may also include, among other data, the medical image 108 , or the like, as described herein.
  • the object 110 in the medical image 108 may be identified and emphasized with the ROI 114 and annotation(s) 228 , among other things.
  • the computing device 400 may have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 402 and any desired devices and interfaces.
  • a bus/interface controller 430 may be used to facilitate communications between the basic configuration 402 and one or more data storage devices 432 via a storage interface bus 434 .
  • the data storage devices 432 may be one or more removable storage devices 436 , one or more non-removable storage devices 438 , or a combination thereof.
  • Examples of the removable storage and the non-removable storage devices may include magnetic disk devices, such as flexible disk drives and hard-disk drives (HDDs), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSDs), and tape drives, to name a few.
  • Example computer storage media may include volatile and nonvolatile, removable, and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data.
  • the system memory 406 , the removable storage devices 436 and the non-removable storage devices 438 are examples of computer storage media.
  • Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVDs), solid state drives, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by the computing device 400 . Any such computer storage media may be part of the computing device 400 .
  • the computing device 400 may also include an interface bus 440 for facilitating communication from various interface devices (for example, one or more output devices 442 , one or more peripheral interfaces 444 , and one or more communication devices 466 ) to the basic configuration 402 via the bus/interface controller 430 .
  • Some of the example output devices 442 include a graphics processing unit 448 and an audio processing unit 450 , which may be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 452 .
  • One or more example peripheral interfaces 444 may include a serial interface controller 454 or a parallel interface controller 456 , which may be configured to communicate with external devices such as input devices (for example, keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (for example, printer, scanner, etc.) via one or more I/O ports 458 .
  • An example of the communication device(s) 466 includes a network controller 460 , which may be arranged to facilitate communications with one or more other computing devices 462 over a network communication link via one or more communication ports 464 .
  • the one or more other computing devices 462 may include servers, computing devices, and comparable devices.
  • the network communication link may be one example of a communication media.
  • Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media.
  • a “modulated data signal” may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal.
  • communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) and other wireless media.
  • RF radio frequency
  • IR infrared
  • the term computer readable media as used herein may include both storage media and communication media.
  • the computing device 400 may be implemented as a part of a specialized server, mainframe, or similar computer, which includes any of the above functions.
  • the computing device 400 may also be implemented as a personal computer including both laptop computer and non-laptop computer configurations. Additionally, the computing device 400 may include specialized hardware such as an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a programmable logic device (PLD), and/or a free form logic on an integrated circuit (IC), among others.
  • ASIC application-specific integrated circuit
  • FPGA field programmable gate array
  • PLD programmable logic device
  • IC integrated circuit
  • Example embodiments may also include methods to detect an object in a medical image. These methods can be implemented in any number of ways, including the structures described herein. One such way may be by machine operations, of devices of the type described in the present disclosure. Another optional way may be for one or more of the individual operations of the methods to be performed in conjunction with one or more human operators performing some of the operations while other operations may be performed by machines. These human operators need not be collocated with each other; each can instead be with a machine that performs a portion of the program. In other embodiments, the human interaction can be automated, such as by pre-selected criteria that may be machine automated.
  • FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image.
  • Process 500 may be implemented on a computing device, such as the computing device 400 or another system.
  • Process 500 begins with operation 510 , where an image analysis application may receive the medical image as an input.
  • At operation 520, the medical image may be partitioned to sub-regions.
  • At operation 530, parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
  • DNN deep-learning neural network
  • At operation 540, bounding boxes for the selection may be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection.
  • At operation 550, a ROI may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, at operation 560, the selection may be designated as the ROI within the medical image. At operation 570, the medical image may be provided with the ROI to a user.
  • Process 500 is for illustration purposes. Detecting an object in a medical image may be implemented by similar processes with fewer or additional steps, as well as in a different order of operations, using the principles described herein.
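  • For orientation, the following self-contained Python sketch strings the operations of process 500 together, assuming a per-sub-region detector callable, a fixed grid, and an ROI formed as the smallest box enclosing the selection; the grid size, threshold, and helper names are illustrative assumptions rather than the claimed implementation.

```python
import numpy as np

def detect_roi(image: np.ndarray, detector, tiles_per_side: int = 4, threshold: float = 0.6):
    """Run the process 500 steps on one image and return the ROI with the
    high-confidence detections that produced it."""
    h, w = image.shape[:2]
    rows = np.linspace(0, h, tiles_per_side + 1, dtype=int)
    cols = np.linspace(0, w, tiles_per_side + 1, dtype=int)
    selection = []  # high-confidence boxes (operations 520-540)
    for r0, r1 in zip(rows[:-1], rows[1:]):
        for c0, c1 in zip(cols[:-1], cols[1:]):
            for (x0, y0, x1, y1), score in detector(image[r0:r1, c0:c1]):
                if score > threshold:
                    # Map sub-region coordinates back to full-image coordinates.
                    selection.append(((x0 + c0, y0 + r0, x1 + c0, y1 + r0), score))
    if not selection:
        return {"image": image, "roi": None, "selection": []}
    # Designate the ROI as the smallest box enclosing the selection (operations 550-560).
    xs0, ys0, xs1, ys1 = zip(*[box for box, _ in selection])
    roi = (min(xs0), min(ys0), max(xs1), max(ys1))
    return {"image": image, "roi": roi, "selection": selection}  # provided to a user (570)
```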
  • the operations described herein may be executed by one or more processors operated on one or more computing devices, one or more processor cores, specialized processing devices, and/or special purpose processors, among other examples.
  • a method of detecting an object in a medical image includes receiving the medical image as an input.
  • the medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model.
  • Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection.
  • a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model.
  • the selection may be designated as the ROI within the medical image.
  • the medical image may also be provided with the ROI to a user.
  • the articles “a,” “an,” and “the” are intended to mean that there are one or more of the elements.
  • the adjective “another,” when used to introduce an element, is intended to mean one or more elements.
  • the terms “including” and “having” are intended to be inclusive such that there may be additional elements other than the listed elements.

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • General Physics & Mathematics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Animal Behavior & Ethology (AREA)
  • Veterinary Medicine (AREA)
  • Mathematical Physics (AREA)
  • Physiology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Surgery (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Psychiatry (AREA)
  • Signal Processing (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)
  • Magnetic Resonance Imaging Apparatus (AREA)
  • Image Analysis (AREA)

Abstract

A device to detect an object in a medical image is described. An image analysis application, executed by the device, receives the medical image as an input. The medical image is next partitioned to sub-regions. Parts of the object are detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection are also determined. The bounding boxes are evaluated based on a confidence score detected as above a threshold level. The confidence score designates the parts as contained within the selection. Next, a region of interest (ROI) is determined as a group including the selection. Similar orientations associated with the bounding boxes are comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection is designated as the ROI within the medical image. The medical image is provided with the ROI to a user.

Description

    CROSS-REFERENCE TO RELATED APPLICATION
  • This application is a continuation application, which relates to and claims the benefit of U.S. application Ser. No. 16/989,625, filed on Aug. 10, 2020, granted as U.S. Pat. No. 11,043,297, which issued on Jun. 22, 2021, entitled “Neural Network-Based Object Detection In Visual Input”, which relates to and claims the benefit of U.S. application Ser. No. 16/218,832, filed on Dec. 13, 2018, entitled “Object Detection in Medical Image”, the entirety of both of which are incorporated herein by reference.
  • FIELD OF THE EMBODIMENTS
  • The field of the embodiments relates to a device to detect an object within a medical image. The object may be identified and labelled as a region of interest which contains a lesion (malignant).
  • BACKGROUND OF THE EMBODIMENTS
  • Information exchanges have changed processes associated with work and personal environments. Automation and improvements in processes have expanded the scope of capabilities offered for personal and business data consumption. With the development of faster and smaller electronics, a variety of devices have become integrated into daily life. A modern device includes components to provide a variety of services such as communication, display, imaging, voice, and/or data capture, among others. The capabilities of the modern device grow exponentially when it is networked to other resources that provide a previously unimagined number of services associated with medical imaging.
  • Ultrasound and other medical imaging devices scan biological structures or tissues of a patient to provide images. The scanned images are provided to medical practitioner(s) to aid with the diagnosis of illnesses such as cancer. The clarity and quality of a scanned image are often suspect and depend on a variety of conditions associated with the patient and the skill of the technician capturing the scanned image. Furthermore, the medical practitioner is also subject to a missed diagnosis or a false diagnosis associated with the scanned image due to the quality of the scanned image and/or human error.
  • SUMMARY OF THE EMBODIMENTS
  • The present invention and its embodiments relate to a method to detect an object in a medical image. In an example scenario, the method may include receiving the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The medical image may also be provided with the ROI to a user.
  • In another embodiment of the present invention, a device to detect an object in a medical image is described. The device may be configured to receive the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The selection within the ROI may be labelled with annotation(s) associated with a type of tissue. The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface. The medical image may also be provided with the annotation(s) and the ROI to a user.
  • In yet another embodiment of the present invention, a device to detect an object in a medical image is described. The device may include a memory configured to store instructions associated with an image analysis application. A processor may be coupled to the memory. The processor may execute the instructions associated with the image analysis application. The image analysis application may include a computer assisted detection (CADe) module. The CADe module may be configured to receive the medical image as an input. The medical image may next be partitioned to sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The selection within the ROI may be labelled with annotation(s) associated with a type of tissue. The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, or abrupt interface. The medical image may also be provided with the annotation(s) and the ROI to a user.
  • It is an object of the embodiments of the present invention to detect an object in a medical image.
  • It is an object of the embodiments of the present invention to partition the medical image to sub-regions.
  • It is an object of the embodiments of the present invention to determine a selection of the sub-regions associated with the object.
  • It is an object of the embodiments of the present invention to process the selection to determine a region of interest (ROI).
  • It is an object of the embodiments of the present invention to label the selection with annotation(s).
  • It is an object of the embodiments to provide the medical image, the annotation(s) and the ROI to a user.
  • These and other features, aspects and advantages of the present invention will become better understood with reference to the following drawings, description and claims.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image, according to an embodiment of the invention.
  • FIG. 2 shows a display diagram illustrating components of an image analysis application determining a region of interest (ROI) in the medical image and labelling the ROI with annotation(s), according to an embodiment of the invention.
  • FIG. 3 shows another display diagram illustrating components of user interface allowing a user to interact with a ROI and annotation(s) associated with the ROI within a medical image, according to an embodiment of the invention.
  • FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image.
  • FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image, according to an embodiment of the invention.
  • DESCRIPTION OF THE PREFERRED EMBODIMENTS
  • The preferred embodiments of the present invention will now be described with reference to the drawings. Identical elements in the various figures are identified with the same reference numerals.
  • Reference will now be made in detail to each embodiment of the present invention. Such embodiments are provided by way of explanation of the present invention, which is not intended to be limited thereto. In fact, those of ordinary skill in the art may appreciate upon reading the present specification and viewing the present drawings that various modifications and variations may be made thereto.
  • FIG. 1 shows a conceptual diagram illustrating examples of detecting an object in a medical image 108. In an example scenario, a device 104 may execute (or provide) an image analysis application 106. The device 104 may include a physical computing device hosting and/or providing features associated with a client application (such as the image analysis application 106). The device 104 may include and/or be part of a smart phone, a tablet based device, a laptop computer, a desktop computer, a physical server, and/or a cluster of servers, among others. The device 104 may also be a node of a network. The network may also include other nodes such as a medical image provider 112, among others. The network may connect nodes with wired and wireless infrastructure.
  • The device 104 may execute the image analysis application 106. In an example scenario, the image analysis application 106 may receive the medical image 108 as an input. An example of the medical image 108 may include an ultrasound image (or scan). Other examples of the medical image 108 may include an x-ray image, a magnetic resonance imaging (MRI) scan, a computed tomography (CT) scan, and/or a positron emission tomography (PET) scan, among others. The medical image 108 may be received from the medical image provider 112. The medical image provider 112 may include a medical imaging device/system that captures, manages, and/or presents the medical image 108 to a user 102. The user 102 may include a medical practitioner such as a doctor, a nurse, and/or a technician, a patient, and/or an administrator, among others. The user 102 may use the medical image 108 to diagnose an issue, a malignancy (cancer), and/or other illness associated with a patient.
  • The medical image 108 may include an object 110. The object 110 may include a biological structure of a patient. For example, the object 110 may include a malignant or a benign lesion. Alternatively, the object 110 may represent another structure associated with an organ and/or other body part of the patient.
  • Next, a computer assisted detection (CADe) module of the image analysis application 106 may partition the medical image into sub-regions. A size of the sub-regions may be determined by the CADe module based on a size of the object 110. For example, the object 110 that consumes a large portion of the medical image 108 may be partitioned to a large number of the sub-regions. Alternatively, the object 110 that consumes a small portion of the medical image 108 may be partitioned to a small number of the sub-regions. In yet another example scenario, the number of the sub-regions may be determined dynamically based on attributes associated with the medical image 108 such as dimensions, resolution, quality, and clarity.
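  • The following Python sketch illustrates one possible form of this partitioning step, assuming the grid granularity is derived from image attributes such as resolution; the heuristic, function names, and tile format are illustrative assumptions rather than the claimed method.

```python
import numpy as np

def choose_tiles_per_side(image: np.ndarray, min_tiles: int = 2, max_tiles: int = 8) -> int:
    """Pick a grid granularity that scales with the image resolution."""
    shortest_side = min(image.shape[:2])
    return int(np.clip(shortest_side // 128, min_tiles, max_tiles))

def partition(image: np.ndarray, tiles_per_side: int):
    """Return (row_slice, col_slice) pairs covering the image as a regular grid."""
    h, w = image.shape[:2]
    rows = np.linspace(0, h, tiles_per_side + 1, dtype=int)
    cols = np.linspace(0, w, tiles_per_side + 1, dtype=int)
    return [(slice(r0, r1), slice(c0, c1))
            for r0, r1 in zip(rows[:-1], rows[1:])
            for c0, c1 in zip(cols[:-1], cols[1:])]

# Example: a 512x512 ultrasound frame yields a 4x4 grid of sub-regions.
frame = np.zeros((512, 512), dtype=np.uint8)
sub_regions = [frame[rs, cs] for rs, cs in partition(frame, choose_tiles_per_side(frame))]
```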
  • The CADe module may process each sub-region to detect parts of the object 110. The parts of the object 110 may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. The DNN model may include a machine learning mechanism based on learning data representations. Learning operations associated with the DNN model may vary from supervised learning to unsupervised learning.
  • In an example scenario, the CADe module may determine bounding boxes for the selection of the sub-regions (associated with the object 110). The bounding boxes may be evaluated based on a confidence score detected as above a threshold. The confidence score (detected as above the threshold) may designate the parts of the object 110 as contained within the selection of the sub-regions. The confidence score may confirm that the CADe module has correctly recognized the parts of the object 110 within the bounding boxes representing the selection of the sub-regions (of the medical image 108). The threshold level may be determined automatically by the CADe module based on positive and negative training models within the DNN model. Alternatively, the user 102 may manually determine the threshold level.
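  • A minimal sketch of the confidence-score evaluation described above follows; the Detection structure, the 0.6 threshold, and the detector output format are assumptions for illustration only.

```python
from typing import List, NamedTuple, Tuple

class Detection(NamedTuple):
    box: Tuple[int, int, int, int]  # (x0, y0, x1, y1) in full-image coordinates
    score: float                    # confidence that the box contains part of the object

def filter_by_confidence(detections: List[Detection], threshold: float) -> List[Detection]:
    """Keep only the bounding boxes whose confidence score is above the threshold level."""
    return [d for d in detections if d.score > threshold]

# Example: with a threshold of 0.6, only the first two candidates form the selection.
candidates = [
    Detection((10, 10, 60, 60), 0.92),
    Detection((55, 12, 110, 64), 0.71),
    Detection((200, 180, 240, 220), 0.18),
]
selection = filter_by_confidence(candidates, threshold=0.6)
```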
  • Next, a region of interest (ROI) 114 may be determined as a group comprising the selection of the sub-regions. The CADe module may determine the ROI 114 based on a comparison of similar orientations associated with the bounding boxes (representing the selection of the sub-regions) in relation to similar orientations of a positive training model of the DNN model. The similar orientations of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model. As such, the CADe module may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations) match comparable attributes in the positive training model.
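  • One way the orientation comparison could be approximated in code is sketched below: the outward direction of each retained box from the group centroid is checked for alignment with a set of reference orientations assumed to come from positive training examples. This is an illustrative approximation only; the patent does not prescribe this particular computation, and the reference orientations are assumed to be given as unit vectors.

```python
import numpy as np

def box_center(box):
    x0, y0, x1, y1 = box
    return np.array([(x0 + x1) / 2.0, (y0 + y1) / 2.0])

def outward_orientations(boxes):
    """Unit vectors pointing from the group centroid toward each box center."""
    centers = np.array([box_center(b) for b in boxes])
    centroid = centers.mean(axis=0)
    vectors = centers - centroid
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, 1e-9)

def orientations_match(boxes, reference_orientations, min_cosine=0.8):
    """True if every box's outward orientation aligns (by cosine similarity) with
    at least one reference orientation; the references are assumed to be unit
    vectors derived from positive training examples."""
    observed = outward_orientations(boxes)
    reference = np.asarray(reference_orientations, dtype=float)
    return all(np.any(reference @ v >= min_cosine) for v in observed)
```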
  • In addition, the selection of the sub-regions may next be designated as the ROI 114 by the CADe module. Moreover, the medical image 108 may be provided with the ROI 114 to the user 102. The ROI 114 may alert the user 102 regarding a disease state (such as malignant/cancer or benign) associated with the object 110.
  • Previous example(s) to detect the object 110 in the medical image 108 are not provided in a limiting sense. Alternatively, the image analysis application 106 may perform operations associated with detecting the object 110 in the medical image 108 as a desktop application, a workstation application, and/or a server application, among others. The image analysis application 106 may also be a client interface of a server based application. Furthermore, the device 104 may include a component of the medical image provider 112. As such, the image analysis application 106 may include a service associated with the medical image provider 112.
  • The user 102 may interact with the image analysis application 106 with a keyboard based input, a mouse based input, a voice based input, a pen based input, and a gesture based input, among others. The gesture based input may include one or more touch based actions such as a touch action, a swipe action, and a combination of each, among others.
  • While the example system in FIG. 1 has been described with specific components, including the device 104 and the image analysis application 106, embodiments are not limited to these components or system configurations and can be implemented with other system configurations employing fewer or additional components.
  • FIG. 2 shows a display diagram illustrating components of the image analysis application 106 determining the ROI 114 in the medical image 108 and labelling the ROI 114 with annotation(s) 228. In an example scenario, the medical image 108 may be received as an input by the image analysis application 106. The CADe module 216 may partition the medical image 108 into sub-regions 218 and process the sub-regions 218 with a DNN model 230 to detect parts of the object 110. A selection 220 of the sub-regions 218 may be correlated to the parts of the object 110.
  • Bounding boxes may next be determined for the selection 220. The bounding boxes may be representations of the selection 220 and may be used interchangeably to refer to the selection 220. The bounding boxes may be evaluated based on a confidence score 222 detected as above a threshold 224 level. The confidence score 222 (detected as above the threshold 224 level) may designate the parts of the object 110 as contained within the selection 220 (of the sub-regions 218). As such, the confidence score 222 associated with the selection 220 (and/or the bounding boxes) may confirm that the CADe module 216 correctly identified and captured the parts of the object 110 within the selection 220.
  • Next, the ROI 114 may be determined as a group (of the sub-regions 218) that includes the selection 220. Similar orientations 226 associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model 230. The similar orientations 226 of the parts of the object 110 may describe orientation based relationships between the parts of the object 110 that are expected to match orientation based relationships within the positive training model. As such, the CADe module 216 may conclusively detect the object 110 as a lesion when the attributes of the parts of the object 110 (such as the similar orientations 226) match comparable attributes in the positive training model.
  • Similar orientations 226 of the bounding boxes may include similar and/or complementary angular orientations between parts of the object 110. For example, parts of the object 110 within a sub-region 218 positioned in a top right quadrant may include edges that are oriented outward toward the top and right directions. Similarly, other parts of the object 110 within the sub-regions 218 located in the top left, bottom left, and bottom right quadrants of the selection 220 may include edges that are oriented outward toward the top left, bottom left, and bottom right directions, respectively.
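  • The quadrant-wise, outward-facing orientations described above could be checked, for example, by comparing each box's offset from the candidate ROI centre against the diagonal direction of its quadrant. The sketch below is only one possible reading of that comparison and is not taken from the disclosure; the angle tolerance is a hypothetical parameter.

```python
import math

def orientations_consistent(boxes, max_angle_deg=45.0):
    """Check that each bounding box faces outward from the ROI centre.

    `boxes` is assumed to be a list of (x1, y1, x2, y2) tuples; a box is
    treated as outward facing when the angle between its offset from the
    ROI centre and the diagonal of its quadrant stays under max_angle_deg.
    """
    cx = sum((b[0] + b[2]) / 2 for b in boxes) / len(boxes)
    cy = sum((b[1] + b[3]) / 2 for b in boxes) / len(boxes)
    for x1, y1, x2, y2 in boxes:
        dx, dy = (x1 + x2) / 2 - cx, (y1 + y2) / 2 - cy
        if dx == 0 and dy == 0:
            continue  # a box at the centre carries no orientation information
        # Expected outward direction is the quadrant diagonal (multiples of 45 degrees).
        expected = math.atan2(math.copysign(1.0, dy), math.copysign(1.0, dx))
        actual = math.atan2(dy, dx)
        diff = abs((actual - expected + math.pi) % (2 * math.pi) - math.pi)
        if math.degrees(diff) > max_angle_deg:
            return False
    return True
```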
  • Furthermore, similar orientations 226 may include similar distances between the parts of the object 110. In addition, the CADe module 216 may apply a non-maximum suppression (NMS) mechanism to the selection 220 of the sub-regions 218 to obtain the bounding boxes (associated with the selection 220) in relation to the object 110. Moreover, the DNN model 230 associated with a detection of the parts of the object 110 may include a region based convolutional neural network (R-CNN) model, a fast R-CNN model, a faster R-CNN model, a you only look once (YOLO) model, and/or a single shot multibox detector (SSD) model, among others.
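  • The non-maximum suppression mentioned above is a standard post-processing pass: candidate boxes are visited in order of decreasing confidence, and any box that overlaps an already kept box too strongly is discarded. A minimal intersection-over-union sketch (the 0.5 overlap threshold is illustrative):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def non_max_suppression(detections, iou_threshold=0.5):
    """Greedy NMS over (box, score) pairs, keeping the highest-scoring boxes."""
    kept = []
    for box, score in sorted(detections, key=lambda d: d[1], reverse=True):
        if all(iou(box, kept_box) < iou_threshold for kept_box, _ in kept):
            kept.append((box, score))
    return kept
```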
  • Furthermore, the CADe module 216 may label the selection 220 of the sub-regions within the ROI 114 with annotation(s) 228 associated with a type of tissue (in relation to the object 110). The type of tissue may include lobulated, spiculated, angular, clear boundary, oval, circumscribed, and/or abrupt interface, among others. The annotation(s) 228 may also include a relative position associated with a part (of the object 110). The relative position may include top, bottom, left side, right side, and/or combinations of relative positions, among others.
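  • Labelling the selected sub-regions with tissue-type annotations can be as simple as mapping the predicted class index of each box to one of the descriptors listed above. A sketch with a hypothetical class-index ordering (the disclosure lists the descriptors but fixes no index assignment):

```python
# Hypothetical ordering of tissue-type classes.
TISSUE_TYPES = [
    "lobulated", "spiculated", "angular", "clear boundary",
    "oval", "circumscribed", "abrupt interface",
]

def annotate_selection(boxes_with_classes):
    """Attach a tissue-type annotation to each (box, class_index) pair."""
    return [(box, TISSUE_TYPES[cls]) for box, cls in boxes_with_classes]
```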
  • The CADe module 216 may emphasize the object 110 with the ROI 114 as a lesion within the medical image 108 by processing the medical image 108 with the DNN model 230. Furthermore, a training mechanism associated with the DNN model 230 may include compensation for unbalanced training data, such as a majority of training medical images with no lesion and a minority of training medical images with a lesion. In addition, the training mechanism may include a down-sampling of the majority, an up-sampling of the minority, or utilization of a cost sensitive mechanism, a gradient boosting machine, or a hard negative mining mechanism.
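  • The compensation for unbalanced training data described above can be realised, for instance, by re-sampling the training set so that lesion and no-lesion images appear equally often; down-sampling the majority or applying a cost-sensitive loss are the alternatives the disclosure also names. A minimal re-sampling sketch (the (image, label) pair format is an assumption):

```python
import random

def balance_by_oversampling(samples, seed=0):
    """Up-sample the minority class until both classes are equally frequent.

    `samples` is assumed to be a list of (image, label) pairs with label 1
    for a lesion and 0 for no lesion.
    """
    rng = random.Random(seed)
    positives = [s for s in samples if s[1] == 1]
    negatives = [s for s in samples if s[1] == 0]
    minority, majority = sorted([positives, negatives], key=len)
    if not minority:
        return list(samples)
    minority = minority + [rng.choice(minority)
                           for _ in range(len(majority) - len(minority))]
    balanced = majority + minority
    rng.shuffle(balanced)
    return balanced
```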
  • FIG. 3 shows another display diagram illustrating components of a user interface allowing a user to interact with the ROI 114 and the annotation(s) 228 associated with the ROI 114 within the medical image 108. The image analysis application 106 (executed by the device 104) may provide the medical image 108 (or a digital copy) with the annotation(s) 228 and the ROI 114 to a user (such as a medical practitioner or a patient). Examples of the annotation(s) 228 may include a lobulated 332 sub-region, a spiculated 334 sub-region, an angular 336 sub-region, a clear boundary 338 sub-region, an oval 340 sub-region, a circumscribed 342 sub-region, and/or an abrupt interface 344 sub-region, among others.
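  • Presenting the medical image 108 together with the ROI 114 and the annotation(s) 228, as described above, might be done with any plotting or viewer toolkit; matplotlib is used in the sketch below purely as an assumption, since the disclosure names no display library.

```python
import matplotlib.pyplot as plt
import matplotlib.patches as patches

def show_annotated_image(image, roi_box, annotations):
    """Display the image with the ROI rectangle and per-box tissue labels.

    `roi_box` is (x1, y1, x2, y2); `annotations` is a list of
    ((x1, y1, x2, y2), label) pairs such as (box, "spiculated").
    """
    fig, ax = plt.subplots()
    ax.imshow(image, cmap="gray")
    x1, y1, x2, y2 = roi_box
    ax.add_patch(patches.Rectangle((x1, y1), x2 - x1, y2 - y1,
                                   fill=False, edgecolor="red", linewidth=2))
    for (bx1, by1, bx2, by2), label in annotations:
        ax.add_patch(patches.Rectangle((bx1, by1), bx2 - bx1, by2 - by1,
                                       fill=False, edgecolor="yellow"))
        ax.text(bx1, by1, label, color="yellow", fontsize=8,
                verticalalignment="bottom")
    plt.show()
```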
  • The user interface may also be configured to allow the user to change the annotation(s) 228 associated with the ROI 114. In an example scenario, the image analysis application 106 may detect the user providing a change to the annotation(s) 228 to personalize the annotation(s) for the user. In response, the image analysis application 106 may identify a rate of concordance of the user in relation to the DNN model 230. The rate of concordance may include an evaluation of the user's correctness when diagnosing objects within medical images as malignant or benign.
  • The image analysis application 106 may determine the concordance rate of the user as above (or equal to) a threshold. The threshold may include a level correlated with the user's competence in diagnosing an object within medical images as malignant or benign. In response to the determination associated with the concordance rate, the image analysis application 106 may retrain the DNN model 230 based on the change to the annotation(s) 228. The selection 220 (of the sub-regions 218) may be re-processed based on the change to the annotation(s) 228. Furthermore, the selection 220 of the sub-regions 218 may be re-labelled based on the change to the annotation(s) 228 to personalize the annotation(s) 228 to the preference(s) of the user.
  • In another example scenario, the image analysis application 106 may determine the concordance rate of the user as below a threshold. In response, the change to the annotation(s) 228 (introduced by the user) may be rejected. A notification may be provided to the user to re-evaluate the change to the annotation(s). The notification may alert the user that the user may be incorrect regarding the change to the annotation(s) 228.
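  • Taken together, the two feedback paths above (retrain when the user's concordance rate is at or above the threshold, reject and notify otherwise) amount to a simple gate around the user's annotation change. A minimal sketch; the concordance value, threshold, and the `retrain`/`notify` callbacks are hypothetical placeholders supplied by the application:

```python
def handle_annotation_change(change, user_concordance, threshold,
                             retrain, notify):
    """Accept or reject a user's annotation change based on concordance.

    `user_concordance` is the user's historical agreement rate with
    confirmed diagnoses, expressed in [0, 1]; `retrain` and `notify`
    are callbacks provided by the hosting application.
    """
    if user_concordance >= threshold:
        retrain(change)  # re-train the model and re-label the selection
        return "accepted"
    notify("Please re-evaluate the proposed annotation change.")
    return "rejected"
```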
  • In yet another example scenario, the user may be allowed to interact, through an augmented reality display, with the medical image 108, the ROI 114, and the annotation(s) 228. The annotation(s) 228 may also be provided as text, sound, or texture associated with the ROI 114.
  • The example scenarios and schemas in FIGS. 1 through 3 are shown with specific components, data types, and configurations. Embodiments are not limited to systems according to these example configurations. A device to detect an object in a medical image may be implemented in configurations employing fewer or additional components in applications and user interfaces. Furthermore, the example schema and components shown in FIGS. 1 through 3 and their subcomponents may be implemented in a similar manner with other values using the principles described herein.
  • FIG. 4 is a block diagram of an example computing device, which may be used to detect an object in a medical image, according to embodiments.
  • For example, computing device 400 may be used as a server, desktop computer, portable computer, smart phone, special purpose computer, or similar device. In a basic configuration 402, the computing device 400 may include one or more processors 404 and a system memory 406. A memory bus 408 may be used for communication between the processor 404 and the system memory 406. The basic configuration 402 may be illustrated in FIG. 4 by those components within the inner dashed line.
  • Depending on the desired configuration, the processor 404 may be of any type, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 404 may include one or more levels of caching, such as a level cache memory 412, one or more processor cores 414, and registers 416. The example processor cores 414 may (each) include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP Core), a graphics processing unit (GPU), or any combination thereof. An example memory controller 418 may also be used with the processor 404, or in some implementations, the memory controller 418 may be an internal part of the processor 404.
  • Depending on the desired configuration, the system memory 406 may be of any type including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. The system memory 406 may include an operating system 420, the image analysis application 106, and program data 424. The image analysis application 106 may include components such as the CADe module 216. The CADe module 216 may execute the instructions and processes associated with the image analysis application 106. In an example scenario, the CADe module 216 may receive the medical image as an input. The medical image may next be partitioned into sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The medical image may also be provided with the ROI to a user.
  • Input to and output out of the image analysis application 106 may be captured and displayed through a display component that may be integrated into the computing device 400. The display component may include a display screen and/or a display monitor, among others, that may capture an input through a touch/gesture based component such as a digitizer. The program data 424 may also include, among other data, the medical image 108, or the like, as described herein. The object 110 in the medical image 108 may be identified and emphasized with the ROI 114 and the annotation(s) 228, among other things.
  • The computing device 400 may have additional features or functionality, and additional interfaces to facilitate communications between the basic configuration 402 and any desired devices and interfaces. For example, a bus/interface controller 430 may be used to facilitate communications between the basic configuration 402 and one or more data storage devices 432 via a storage interface bus 434. The data storage devices 432 may be one or more removable storage devices 436, one or more non-removable storage devices 438, or a combination thereof. Examples of the removable storage and the non-removable storage devices may include magnetic disk devices, such as flexible disk drives and hard-disk drives (HDDs), optical disk drives such as compact disk (CD) drives or digital versatile disk (DVD) drives, solid state drives (SSDs), and tape drives, to name a few. Example computer storage media may include volatile and nonvolatile, removable, and non-removable media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules, or other data.
  • The system memory 406, the removable storage devices 436 and the non-removable storage devices 438 are examples of computer storage media. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVDs), solid state drives, or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which may be used to store the desired information and which may be accessed by the computing device 400. Any such computer storage media may be part of the computing device 400.
  • The computing device 400 may also include an interface bus 440 for facilitating communication from various interface devices (for example, one or more output devices 442, one or more peripheral interfaces 444, and one or more communication devices 466) to the basic configuration 402 via the bus/interface controller 430. Some of the example output devices 442 include a graphics processing unit 448 and an audio processing unit 450, which may be configured to communicate to various external devices such as a display or speakers via one or more A/V ports 452. One or more example peripheral interfaces 444 may include a serial interface controller 454 or a parallel interface controller 456, which may be configured to communicate with external devices such as input devices (for example, keyboard, mouse, pen, voice input device, touch input device, etc.) or other peripheral devices (for example, printer, scanner, etc.) via one or more I/O ports 458. An example of the communication device(s) 466 includes a network controller 460, which may be arranged to facilitate communications with one or more other computing devices 462 over a network communication link via one or more communication ports 464. The one or more other computing devices 462 may include servers, computing devices, and comparable devices.
  • The network communication link may be one example of a communication media. Communication media may typically be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and may include any information delivery media. A “modulated data signal” may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR) and other wireless media. The term computer readable media as used herein may include both storage media and communication media.
  • The computing device 400 may be implemented as a part of a specialized server, mainframe, or similar computer, which includes any of the above functions. The computing device 400 may also be implemented as a personal computer including both laptop computer and non-laptop computer configurations. Additionally, the computing device 400 may include specialized hardware such as an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a programmable logic device (PLD), and/or a free form logic on an integrated circuit (IC), among others.
  • Example embodiments may also include methods to detect an object in a medical image. These methods can be implemented in any number of ways, including the structures described herein. One such way may be by machine operations, of devices of the type described in the present disclosure. Another optional way may be for one or more of the individual operations of the methods to be performed in conjunction with one or more human operators performing some of the operations while other operations may be performed by machines. These human operators need not be collocated with each other, but each can be only with a machine that performs a portion of the program. In other embodiments, the human interaction can be automated such as by pre-selected criteria that may be machine automated.
  • FIG. 5 is a logic flow diagram illustrating a process for detecting an object in a medical image. Process 500 may be implemented on a computing device, such as the computing device 400 or another system.
  • Process 500 begins with operation 510, where an image analysis application may receive the medical image as an input. At operation 520, the medical image may be partitioned into sub-regions. At operation 530, parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. At operation 540, bounding boxes for the selection may be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection.
  • Next, at operation 550, a ROI may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, at operation 560, the selection may be designated as the ROI within the medical image. At operation 570, the medical image may be provided with the ROI to a user.
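  • Read end to end, operations 510 through 570 chain the steps sketched earlier into a single pipeline. The sketch below assumes the hypothetical helper functions from the previous sketches are in scope and that the DNN model exposes a `detect_parts` method returning (box, score) pairs; none of these names come from the disclosure.

```python
def process_medical_image(image, dnn_model, confidence_threshold=0.5):
    """Hedged sketch of operations 510-570: partition, detect, filter,
    suppress, group into an ROI, and return the ROI for display."""
    sub_regions = partition_into_subregions(image)                  # operation 520
    detections = dnn_model.detect_parts(sub_regions)                # operation 530
    selection = select_confident_boxes(detections,                  # operation 540
                                       confidence_threshold)
    selection = non_max_suppression(selection)
    boxes = [box for box, _ in selection]
    if boxes and orientations_consistent(boxes):                    # operation 550
        return (min(b[0] for b in boxes), min(b[1] for b in boxes),
                max(b[2] for b in boxes), max(b[3] for b in boxes))  # operations 560-570
    return None
```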
  • The operations included in process 500 are for illustration purposes. Detecting an object in a medical image may be implemented by similar processes with fewer or additional steps, as well as in a different order of operations, using the principles described herein. The operations described herein may be executed by one or more processors operated on one or more computing devices, one or more processor cores, specialized processing devices, and/or special purpose processors, among other examples.
  • A method of detecting an object in a medical image is also described. The method includes receiving the medical image as an input. The medical image may next be partitioned into sub-regions. Parts of the object may be detected in a selection of the sub-regions using a deep-learning neural network (DNN) model. Bounding boxes for the selection may also be determined. The bounding boxes may be evaluated based on a confidence score detected as above a threshold level. The confidence score may designate the parts as contained within the selection. Next, a region of interest (ROI) may be determined as a group including the selection. Similar orientations associated with the bounding boxes may be comparable to similar orientations of a positive training model of the DNN model. Furthermore, the selection may be designated as the ROI within the medical image. The medical image may also be provided with the ROI to a user.
  • When introducing elements of the present disclosure or the embodiment(s) thereof, the articles “a,” “an,” and “the” are intended to mean that there are one or more of the elements. Similarly, the adjective “another,” when used to introduce an element, is intended to mean one or more elements. The terms “including” and “having” are intended to be inclusive such that there may be additional elements other than the listed elements.
  • Although this invention has been described with a certain degree of particularity, it is to be understood that the present disclosure has been made only by way of illustration and that numerous changes in the details of construction and arrangement of parts may be resorted to without departing from the spirit and the scope of the invention.

Claims (1)

What is claimed is:
1. A method comprising:
receiving, by a processor, an image;
determining, by the processor, a size of the image;
utilizing, by the processor, a neural network (NN) model to analyze the image to identify objects within the image based at least in part on a confidence score; and
instructing, by the processor, an output computer module to produce an alert indicative of the identified objects.
US17/352,438 2018-12-13 2021-06-21 Neural network-based object detection in visual input Abandoned US20210313048A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/352,438 US20210313048A1 (en) 2018-12-13 2021-06-21 Neural network-based object detection in visual input

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US16/218,832 US20200194108A1 (en) 2018-12-13 2018-12-13 Object detection in medical image
US16/989,625 US11043297B2 (en) 2018-12-13 2020-08-10 Neural network-based object detection in visual input
US17/352,438 US20210313048A1 (en) 2018-12-13 2021-06-21 Neural network-based object detection in visual input

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US16/989,625 Continuation US11043297B2 (en) 2018-12-13 2020-08-10 Neural network-based object detection in visual input

Publications (1)

Publication Number Publication Date
US20210313048A1 true US20210313048A1 (en) 2021-10-07

Family

ID=69167907

Family Applications (3)

Application Number Title Priority Date Filing Date
US16/218,832 Abandoned US20200194108A1 (en) 2018-12-13 2018-12-13 Object detection in medical image
US16/989,625 Active US11043297B2 (en) 2018-12-13 2020-08-10 Neural network-based object detection in visual input
US17/352,438 Abandoned US20210313048A1 (en) 2018-12-13 2021-06-21 Neural network-based object detection in visual input

Family Applications Before (2)

Application Number Title Priority Date Filing Date
US16/218,832 Abandoned US20200194108A1 (en) 2018-12-13 2018-12-13 Object detection in medical image
US16/989,625 Active US11043297B2 (en) 2018-12-13 2020-08-10 Neural network-based object detection in visual input

Country Status (2)

Country Link
US (3) US20200194108A1 (en)
WO (1) WO2020123749A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108230294B (en) * 2017-06-14 2020-09-29 北京市商汤科技开发有限公司 Image detection method, image detection device, electronic equipment and storage medium
CN108537151A (en) * 2018-03-27 2018-09-14 上海小蚁科技有限公司 A kind of non-maxima suppression arithmetic unit and system
US20200394458A1 (en) * 2019-06-17 2020-12-17 Nvidia Corporation Weakly-supervised object detection using one or more neural networks
US11720817B2 (en) * 2019-07-01 2023-08-08 Medical Informatics Corp. Waveform annotator
US11188740B2 (en) * 2019-12-18 2021-11-30 Qualcomm Incorporated Two-pass omni-directional object detection
CN111598866B (en) * 2020-05-14 2023-04-11 四川大学 Lens key feature positioning method based on eye B-ultrasonic image
CA3103872A1 (en) * 2020-12-23 2022-06-23 Pulsemedica Corp. Automatic annotation of condition features in medical images
US11961314B2 (en) * 2021-02-16 2024-04-16 Nxp B.V. Method for analyzing an output of an object detector
CN113191353A (en) * 2021-04-15 2021-07-30 华北电力大学扬中智能电气研究中心 Vehicle speed determination method, device, equipment and medium
CN113744328B (en) * 2021-11-05 2022-02-15 极限人工智能有限公司 Medical image mark point identification method and device, electronic equipment and storage medium
US12020475B2 (en) * 2022-02-21 2024-06-25 Ford Global Technologies, Llc Neural network training
US20240112329A1 (en) * 2022-10-04 2024-04-04 HeHealth PTE Ltd. Distinguishing a Disease State from a Non-Disease State in an Image

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180204111A1 (en) * 2013-02-28 2018-07-19 Z Advanced Computing, Inc. System and Method for Extremely Efficient Image and Pattern Recognition and Artificial Intelligence Platform
US20190384304A1 (en) * 2018-06-13 2019-12-19 Nvidia Corporation Path detection for autonomous machines using deep neural networks

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20120102447A (en) 2011-03-08 2012-09-18 삼성전자주식회사 Method and apparatus for diagnostic
US8467607B1 (en) 2011-11-21 2013-06-18 Google Inc. Segmentation-based feature pooling for object models
US10140709B2 (en) * 2017-02-27 2018-11-27 International Business Machines Corporation Automatic detection and semantic description of lesions using a convolutional neural network
CN107392218B (en) 2017-04-11 2020-08-04 创新先进技术有限公司 Vehicle loss assessment method and device based on image and electronic equipment
EP3392832A1 (en) 2017-04-21 2018-10-24 General Electric Company Automated organ risk segmentation machine learning methods and systems
EP3399465A1 (en) * 2017-05-05 2018-11-07 Dassault Systèmes Forming a dataset for fully-supervised learning
JP7227168B2 (en) * 2017-06-19 2023-02-21 モハメド・アール・マーフーズ Surgical Navigation of the Hip Using Fluoroscopy and Tracking Sensors
US10268204B2 (en) 2017-08-30 2019-04-23 GM Global Technology Operations LLC Cross traffic detection using cameras
US11446008B2 (en) * 2018-08-17 2022-09-20 Tokitae Llc Automated ultrasound video interpretation of a body part with one or more convolutional neural networks

Also Published As

Publication number Publication date
WO2020123749A1 (en) 2020-06-18
US20200194108A1 (en) 2020-06-18
US11043297B2 (en) 2021-06-22
US20210050095A1 (en) 2021-02-18

Similar Documents

Publication Publication Date Title
US20210313048A1 (en) Neural network-based object detection in visual input
US10290101B1 (en) Heat map based medical image diagnostic mechanism
KR101943011B1 (en) Method for facilitating medical image reading and apparatus using the same
US10853449B1 (en) Report formatting for automated or assisted analysis of medical imaging data and medical diagnosis
US9760689B2 (en) Computer-aided diagnosis method and apparatus
US10692602B1 (en) Structuring free text medical reports with forced taxonomies
JP2022024139A (en) Computer-aided detection using multiple images from different views of region of interest to improve detection accuracy
US20210264574A1 (en) Correcting image blur in medical image
US20210297588A1 (en) Medical image based distortion correction mechanism
JP6796060B2 (en) Image report annotation identification
US20210118551A1 (en) Device to enhance and present medical image using corrective mechanism
CN109191451B (en) Abnormality detection method, apparatus, device, and medium
JP7240001B2 (en) METHOD FOR SUPPORTING BROWSING IMAGES AND DEVICE USING THE SAME
US20150173705A1 (en) Apparatus and method for adapting diagnostic model for computer-aided diagnosis
US20210048941A1 (en) Method for providing an image base on a reconstructed image group and an apparatus using the same
US20210407637A1 (en) Method to display lesion readings result
JP2019030584A (en) Image processing system, apparatus, method, and program
Cheng et al. Development and validation of a deep learning pipeline to measure pericardial effusion in echocardiography
Arias-Londoño et al. Analysis of the Clever Hans effect in COVID-19 detection using Chest X-Ray images and Bayesian Deep Learning
Ragnarsdottir et al. Interpretable prediction of pulmonary hypertension in newborns using echocardiograms
KR102507451B1 (en) Method to read chest image
US20230033263A1 (en) Information processing system, information processing method, information terminal, and non-transitory computer-readable medium
CN111028173B (en) Image enhancement method, device, electronic equipment and readable storage medium
US20240087304A1 (en) System for medical data analysis
US20220076796A1 (en) Medical document creation apparatus, method and program, learning device, method and program, and trained model

Legal Events

Date Code Title Description
STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

AS Assignment

Owner name: RUTGERS, THE STATE UNIVERSITY OF NEW JERSEY, NEW JERSEY

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:PODILCHUK, CHRISTINE I.;MAMMONE, RICHARD;REEL/FRAME:057751/0897

Effective date: 20210121

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION