WO2022245191A1 - Method and apparatus for learning image for detecting lesions - Google Patents

Method and apparatus for learning image for detecting lesions

Info

Publication number
WO2022245191A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
training
data distribution
captured image
neural network
Prior art date
Application number
PCT/KR2022/007308
Other languages
French (fr)
Inventor
Sungwan Kim
Jooyoung Lee
Seung Ho Choi
Sun Young Yang
Seon Hee Lim
Soomin Park
Seunggi PARK
Dan YOON
Byeongsoo KIM
Woo Sang Cho
Jung Chan Lee
Jung Ho Bae
Hyoun-Joong Kong
Original Assignee
Endoai Co., Ltd.
Seoul National University R&Db Foundation
Seoul National Universitiy Hospital
Priority date
Filing date
Publication date
Priority claimed from KR1020210089799A external-priority patent/KR20220157833A/en
Application filed by Endoai Co., Ltd., Seoul National University R&Db Foundation, Seoul National Universitiy Hospital
Publication of WO2022245191A1 publication Critical patent/WO2022245191A1/en

Classifications

    • A HUMAN NECESSITIES
      • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
        • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
          • A61B 1/00 Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor
            • A61B 1/00002 Operational features of endoscopes
              • A61B 1/00004 Operational features of endoscopes characterised by electronic signal processing
                • A61B 1/00009 Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope
                  • A61B 1/000096 Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope using artificial intelligence
                  • A61B 1/000094 Operational features of endoscopes characterised by electronic signal processing of image signals during a use of endoscope extracting biological structures
            • A61B 1/31 Instruments for performing medical examinations of the interior of cavities or tubes of the body by visual or photographical inspection, e.g. endoscopes; Illuminating arrangements therefor for the rectum, e.g. proctoscopes, sigmoidoscopes, colonoscopes
    • G PHYSICS
      • G06 COMPUTING; CALCULATING OR COUNTING
        • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
          • G06N 3/00 Computing arrangements based on biological models
            • G06N 3/02 Neural networks
              • G06N 3/04 Architecture, e.g. interconnection topology
                • G06N 3/0464 Convolutional networks [CNN, ConvNet]
                • G06N 3/0475 Generative networks
              • G06N 3/08 Learning methods
                • G06N 3/09 Supervised learning
      • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
        • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
          • G16H 30/00 ICT specially adapted for the handling or processing of medical images
            • G16H 30/40 ICT specially adapted for the handling or processing of medical images for processing medical images, e.g. editing
          • G16H 50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
            • G16H 50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems


Abstract

In accordance with an aspect of the present disclosure, there is provided an image learning method performed by an image learning apparatus for detecting lesions, the image learning method comprising: inputting a captured image for a sessile serrated adenoma to a generative adversarial network (GAN) to obtain a synthesized image for the sessile serrated adenoma as an output of the generative adversarial network; extracting the captured image for the sessile serrated adenoma from a first training dataset that is used when training a convolutional neural network to detect lesions; waiting for a training command after visualizing and providing data distribution for the synthesized image and the captured image; generating a second training dataset including the synthesized image according to the training command; and training the convolutional neural network using the second training dataset.

Description

METHOD AND APPARATUS FOR LEARNING IMAGE FOR DETECTING LESIONS
The present disclosure relates to an apparatus for training a neural network on images for detecting lesions, and to an image learning method performed by such an apparatus.
Colorectal cancer is a cancer with a very high incidence worldwide, and colonoscopy has been widely used for early detection.
In addition, research is being actively conducted on systems that automatically detect colonic lesions using deep learning based on computer vision and image processing technology. When the detection result of such a system is provided to a clinician, it can increase the clinician's adenoma detection rate and help catch lesions that are easy to miss.
In view of the above, the present disclosure provides an image learning apparatus that acquires a synthesized image for a sessile serrated adenoma using a generative adversarial network (GAN) and uses the acquired synthesized image when training a convolutional neural network (CNN) to detect lesions, and an image learning method performed by the image learning apparatus.
The technical problems to be achieved by the present disclosure are not limited to the technical problems mentioned above, and other technical problems that are not mentioned may be clearly understood by those with ordinary knowledge in the technical field to which the present disclosure belongs from the following description.
In accordance with an aspect of the present disclosure, there is provided an image learning method performed by an image learning apparatus for detecting lesions, the image learning method comprising: inputting a captured image for a sessile serrated adenoma to a generative adversarial network (GAN) to obtain a synthesized image for the sessile serrated adenoma as an output of the generative adversarial network; extracting the captured image for the sessile serrated adenoma from a first training dataset that is used when training a convolutional neural network to detect lesions; waiting for a training command after visualizing and providing data distribution for the synthesized image and the captured image; generating a second training dataset including the synthesized image according to the training command; and training the convolutional neural network using the second training dataset.
In accordance with another aspect of the present disclosure, there is provided an image learning apparatus for detecting lesions, comprising: a user interface unit configured to provide a user interface for inputting various commands; a neural network model unit configured to include a generative adversarial network and a convolutional neural network; a visualization unit configured to provide visualization information; and a processor.
Here, the processor inputs a captured image for a sessile serrated adenoma to a generative adversarial network (GAN) to obtain a synthesized image for the sessile serrated adenoma as an output of the generative adversarial network, extracts the captured image for the sessile serrated adenoma from a first training dataset that is used when training a convolutional neural network to detect lesions, visualizes the data distribution for the synthesized image and the captured image, provides the visualized data distribution through the visualization unit, and then waits for a training command, generates a second training dataset including the synthesized image according to the training command received through the user interface, and trains the convolutional neural network using the second training dataset.
In accordance with another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium including computer executable instructions, wherein the instructions, when executed by a processor, cause the processor to perform the image learning method of any one of claims 1 to 3.
In accordance with another aspect of the present disclosure, there is provided a computer program stored on a non-transitory computer-readable storage medium including computer executable instructions, wherein the instructions, when executed by a processor, cause the processor to perform the image learning method of any one of claims 1 to 3.
According to an embodiment, it is possible to acquire a synthesized image for a sessile serrated adenoma using a generative adversarial network, and to use the acquired synthesized image when training a convolutional neural network to detect lesions. In particular, by visually providing a data distribution for the acquired synthesized image and the pre-prepared captured image, it is possible to easily check whether the synthesized image is suitable for detecting lesions of the sessile serrated adenoma. In addition, it is possible to train the convolutional neural network using a training dataset that includes the synthesized image according to a training command input after the visualization information on the data distribution is provided. By allowing only synthesized images verified by the user to be included in the training dataset of the convolutional neural network, it is possible to improve the reliability of the training dataset and, ultimately, the reliability of lesion detection.
FIG. 1 is a configuration diagram of an image learning apparatus according to an embodiment of the present disclosure.
FIG. 2 is a flowchart for describing an image learning method performed by the image learning apparatus according to the embodiment of the present disclosure.
FIGS. 3 and 4 are examples of data distribution maps provided by a visualization unit illustrated in FIG. 1.
The advantages and features of the embodiments, and the methods of accomplishing them, will be clearly understood from the following description taken in conjunction with the accompanying drawings. However, the embodiments are not limited to those described herein, as they may be implemented in various forms. The present embodiments are provided to make the disclosure complete and to allow those skilled in the art to understand the full scope of the embodiments. Therefore, the embodiments are to be defined only by the scope of the appended claims.
Terms used in the present specification will be briefly described, and the present disclosure will be described in detail.
As for the terms used in the present disclosure, general terms that are currently as widely used as possible have been selected in consideration of their functions in the present disclosure. However, the terms may vary according to the intention or precedent of a technician working in the field, the emergence of new technologies, and the like. In addition, in certain cases, there are terms arbitrarily selected by the applicant, and in such cases the meaning of the terms will be described in detail in the description of the corresponding invention. Therefore, the terms used in the present disclosure should be defined based on the meaning of the terms and the overall contents of the present disclosure, not merely on the names of the terms.
When it is described throughout the specification that a part "includes" a certain component, this means that other components may be further included, rather than excluded, unless specifically stated to the contrary.
In addition, a term such as a "unit" or a "portion" used in the specification means a software component or a hardware component such as an FPGA or an ASIC, and the "unit" or the "portion" performs a certain role. However, the "unit" or the "portion" is not limited to software or hardware. The "portion" or the "unit" may be configured to reside in an addressable storage medium, or may be configured to execute on one or more processors. Thus, as an example, the "unit" or the "portion" includes components (such as software components, object-oriented software components, class components, and task components), processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuits, data, databases, data structures, tables, arrays, and variables. The functions provided in the components and "units" may be combined into a smaller number of components and "units" or may be further divided into additional components and "units."
Hereinafter, the embodiment of the present disclosure will be described in detail with reference to the accompanying drawings so that those of ordinary skill in the art may easily implement the present disclosure. In the drawings, portions not related to the description are omitted in order to clearly describe the present disclosure.
Hereinafter, in the embodiments of the present disclosure, a term such as "sessile serrated adenoma" may comprise a "sessile serrated polyp".
FIG. 1 is a configuration diagram of an image learning apparatus 100 according to an embodiment of the present disclosure.
Referring to FIG. 1, the image learning apparatus 100 according to the embodiment includes a user interface unit 110, a neural network model unit 120, a visualization unit 130, and a processor unit 140. In addition, the image learning apparatus 100 may further include an input unit 150 and/or a providing unit 160. Here, the neural network model unit 120 and/or the processor unit 140 may include computing means such as a microprocessor.
The user interface unit 110 provides a user interface through which a user may input various commands. For example, the user interface unit 110 may include at least one of a keyboard, a keypad, and a coordinate input device (e.g., a computer mouse) connected to the image learning apparatus 100 including the computing means such as a microprocessor.
The neural network model unit 120 includes a generative adversarial network and a convolutional neural network. As illustrated in FIG. 1, the neural network model unit 120 may be implemented physically separately from the processor unit 140, but may be implemented in the form of one module by combining the neural network model unit 120 and the processor unit 140.
The visualization unit 130 provides visualization information under the control of the processor unit 140. For example, the visualization unit 130 may include a display device or the like capable of outputting the processing result of the processor unit 140 to a screen.
A training dataset that may be used when training the convolutional neural network of the neural network model unit 120 to detect lesions is input to the input unit 150. Here, the training dataset may include a captured image for a sessile serrated adenoma. In addition, the captured image for the sessile serrated adenoma that is input to the generative adversarial network of the neural network model unit 120 and used to generate the synthesized image is also input to the input unit 150. For example, the input unit 150 may include a serial interface through which the training dataset and/or the captured image can be received directly from the outside, or a communication module capable of receiving them through a communication channel.
The processor unit 140 inputs the captured image for the sessile serrated adenoma to the generative adversarial network of the neural network model unit 120 to acquire the synthesized image for the sessile serrated adenoma as an output of the generative adversarial network. The processor unit 140 extracts the captured image for the sessile serrated adenoma from the first training dataset that may be used when training the convolutional neural network to detect lesions. The processor unit 140 visualizes a data distribution for the acquired synthesized image and the extracted captured image, provides the data distribution through the visualization unit 130, and then waits for a training command. In addition, the processor unit 140 generates a second training dataset including the previously acquired synthesized image according to the training command received through the user interface unit 110, and trains the convolutional neural network using the generated second training dataset. Here, when visualizing the data distribution, the processor unit 140 may use a t-distributed stochastic neighbor embedding (t-SNE) algorithm to reduce a data distribution in a high-dimensional space to a data distribution in a two-dimensional space. In addition, when visualizing the data distribution, the processor unit 140 may input the previously acquired synthesized image and the previously extracted captured image to the t-SNE algorithm with the same number and the same size. Meanwhile, a filtering command, rather than the training command, may be input through the user interface unit 110. In this case, the processor unit 140 performs clustering based on a comparison between the relative distances of the synthesized and captured images and a preset threshold, and deletes any synthesized image located outside the resulting cluster area, thereby filtering it out so that it is not included in the second training dataset.
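Purely as an illustration, the control flow that the processor unit 140 is described as performing can be sketched as follows. Every helper callable passed in here (generate_synthetic, extract_captured, and so on) is a hypothetical stand-in supplied by the caller, not a component defined by the disclosure.

```python
# A minimal sketch, not the patented implementation, of the processor unit 140 flow.
from typing import Callable, Sequence


def image_learning_flow(
    captured_ssa: Sequence,            # captured sessile serrated adenoma images
    first_training_dataset: Sequence,  # (image, label) pairs for various polyp types
    generate_synthetic: Callable,      # GAN wrapper: captured images -> synthesized images (S210)
    extract_captured: Callable,        # pulls SSA images out of the first dataset by label (S220)
    visualize_distribution: Callable,  # e.g. a t-SNE scatter plot shown to the user (S230)
    wait_for_command: Callable,        # returns "train" or "filter" from the user interface
    filter_outliers: Callable,         # drops synthesized images outside the captured cluster (S250)
    build_second_dataset: Callable,    # merges labeled synthesized images into a new dataset (S260)
    train_cnn: Callable,               # trains the lesion-detection CNN (S270)
):
    synthesized = generate_synthetic(captured_ssa)
    captured = extract_captured(first_training_dataset)
    while True:
        visualize_distribution(synthesized, captured)
        command = wait_for_command()
        if command == "filter":
            synthesized = filter_outliers(synthesized, captured)
        elif command == "train":
            second_dataset = build_second_dataset(synthesized, first_training_dataset)
            return train_cnn(second_dataset)
```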
The providing unit 160 may provide the learned convolutional neural network to the outside under the control of the processor unit 140. For example, the providing unit 160 may include an interface capable of transmitting various data to a peripheral device, and transmit the convolutional neural network to the peripheral device through the interface. Alternatively, the providing unit 160 may include a communication module, and transmit the convolutional neural network to the outside through the communication module.
FIG. 2 is a flowchart for describing an image learning method performed by an image learning apparatus according to an embodiment of the present disclosure, and FIGS. 3 and 4 are examples of a data distribution map provided by the visualization unit illustrated in FIG. 1.
Hereinafter, an image learning method performed by the image learning apparatus 100 according to the embodiment of the present disclosure will be described in detail with reference to FIGS. 1 to 4.
First, through the input unit 150 of the image learning apparatus 100, a user inputs a training dataset that may be used when training the convolutional neural network of the neural network model unit 120 to detect lesions, and also inputs the captured image for the sessile serrated adenoma that will be fed to the generative adversarial network of the neural network model unit 120 to generate the synthesized image.
The input unit 150 provides the received captured image for the sessile serrated adenoma to the processor unit 140 of the image learning apparatus 100, and provides the training dataset to the processor unit 140 and/or the neural network model unit 120.
Next, the processor unit 140 inputs the captured image for the sessile serrated adenoma to the generative adversarial network of the neural network model unit 120 to acquire the synthesized image for the sessile serrated adenoma as the output of the generative adversarial network. Here, when the captured image for the sessile serrated adenoma input through the input unit 150 is already in a form suitable for acquiring the synthesized image, the captured image may be input to the generative adversarial network as it is; otherwise, a crop process may be performed through the user interface unit 110. For example, the processor unit 140 may display the captured image for the sessile serrated adenoma through the visualization unit 130, and the user may use a coordinate input device (e.g., a computer mouse) included in the user interface unit 110 to command that the captured image be cropped to an appropriate size and that the cropped captured image be input to the generative adversarial network (S210).
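As an illustration only, step S210 might look like the sketch below, which assumes an image-conditioned GAN generator has already been trained and exported as a TorchScript module. The file name, crop size, and generator interface are assumptions for the example, not details given by the disclosure.

```python
# A minimal sketch of step S210: crop captured SSA images and feed them to a
# (hypothetical) trained GAN generator to obtain synthesized images.
import torch
from PIL import Image
from torchvision import transforms

GENERATOR_PATH = "ssa_generator.pt"   # hypothetical exported GAN generator
CROP_SIZE = 256                       # hypothetical input size the generator expects

preprocess = transforms.Compose([
    transforms.CenterCrop(CROP_SIZE),  # crop step described for the user interface
    transforms.ToTensor(),             # HWC uint8 -> CHW float in [0, 1]
])


def synthesize(captured_paths):
    """Feed captured SSA images to the generator and collect synthesized images."""
    generator = torch.jit.load(GENERATOR_PATH).eval()
    outputs = []
    with torch.no_grad():
        for path in captured_paths:
            img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)  # 1 x C x H x W
            outputs.append(generator(img).squeeze(0))                       # synthesized image tensor
    return outputs
```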
The processor unit 140 extracts the captured image for the sessile serrated adenoma from the first training dataset that may be used when training the convolutional neural network to detect lesions. Here, the first training dataset may contain not only the captured image for the sessile serrated adenoma but also captured images and labels for various other types of polyps, and the processor unit 140 may extract the captured image for the sessile serrated adenoma based on the identification information of the label (S220).
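A minimal sketch of step S220 follows, assuming the first training dataset is an iterable of (image, label) pairs. The label value "SSA" is a hypothetical identifier; the real dataset may encode lesion classes differently.

```python
# A minimal sketch of step S220: keep only the sessile serrated adenoma images,
# selected by the identification information carried in each label.
SSA_LABEL = "SSA"  # hypothetical label identifier


def extract_ssa_images(first_training_dataset):
    """first_training_dataset: iterable of (image, label) pairs for various polyp types."""
    return [image for image, label in first_training_dataset if label == SSA_LABEL]
```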
Subsequently, the processor unit 140 determines the data distribution for the synthesized image acquired in step S210 and the captured image extracted in step S220. For example, the processor unit 140 may reduce the data distribution in the high-dimensional space to a data distribution in the two-dimensional space by using the t-SNE algorithm. In addition, the processor unit 140 may input the synthesized image acquired in step S210 and the captured image extracted in step S220 to the t-SNE algorithm with the same number and the same size. For example, when the number of synthesized images acquired in step S210 is 200 in total, the processor unit 140 may extract the captured images for a total of 200 sessile serrated adenomas from the first training dataset. In addition, when the size of the captured image extracted in step S220 is larger than the size of the synthesized image acquired in step S210, the processor unit 140 may crop the captured image to the size of the synthesized image and input the cropped captured image to the t-SNE algorithm.
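One way this step could be realized, assuming the images are NumPy arrays, is sketched below: the captured images are sampled to match the number of synthesized images, center-cropped to the synthesized image size, flattened, and reduced to two dimensions with scikit-learn's t-SNE.

```python
# A minimal sketch of the t-SNE reduction with the same number and size of
# synthesized and captured images; details are illustrative assumptions.
import numpy as np
from sklearn.manifold import TSNE


def center_crop(img, height, width):
    h, w = img.shape[:2]
    top, left = (h - height) // 2, (w - width) // 2
    return img[top:top + height, left:left + width]


def embed_2d(synthesized, captured, random_state=0):
    n = len(synthesized)                           # same number of images from each source
    target_h, target_w = synthesized[0].shape[:2]  # same size as the synthesized images
    captured = [center_crop(c, target_h, target_w) for c in captured[:n]]
    data = np.stack([img.reshape(-1) for img in synthesized + captured]).astype(np.float32)
    coords = TSNE(n_components=2, random_state=random_state).fit_transform(data)
    return coords[:n], coords[n:]                  # 2-D points for synthesized, captured
```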
The processor unit 140 generates visualization data for the data distribution identified for the synthesized image and the captured image, provides the visualization information on the data distribution through the visualization unit 130, and then waits for the input command through the user interface unit 110 (S230).
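The visualization provided at step S230 could be as simple as the following scatter plot of the 2-D coordinates, so that the user can judge whether the synthesized and captured images fall in one cluster (as in FIG. 3) or in separate clusters (as in FIG. 4). This is an illustrative sketch using matplotlib; embed_2d refers to the sketch above.

```python
# A minimal sketch of a data distribution map shown to the user at step S230.
import matplotlib.pyplot as plt


def show_distribution_map(syn_coords, cap_coords):
    plt.scatter(cap_coords[:, 0], cap_coords[:, 1], label="captured", marker="o")
    plt.scatter(syn_coords[:, 0], syn_coords[:, 1], label="synthesized", marker="x")
    plt.legend()
    plt.title("Data distribution of captured vs. synthesized SSA images")
    plt.show()
```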
FIGS. 3 and 4 are examples of data distribution maps provided by a visualization unit illustrated in FIG. 1. It can be seen from the example of FIG. 3 that the captured image and the synthesized image are located within a single clustered area. In the example of FIG. 4, it can be seen that a part of the synthesized image is located within the same cluster area as the captured image, but the remaining part of the synthesized image is located within a separate cluster area unlike the captured image. FIG. 4 illustrates a case in which the captured image for the sessile serrated adenoma received through the input unit 150 before step S210 is not actually the captured image for the sessile serrated adenoma, or a case in which the generative adversarial network used in the process of acquiring the synthesized image through step S210 is trained in an incorrect direction.
Meanwhile, the user may check the visualization information on the data distribution provided through the visualization unit 130 in step S230, and input the training command or the filtering command through the user interface unit 110. For example, in step S230, the training command may be input when the data distribution map as illustrated in FIG. 3 is provided, and in step S230, the filtering command may be input when the data distribution map as illustrated in FIG. 4 is provided.
When the filtering command is input through the user interface unit 110, the processor unit 140 identifies, among the synthesized images, any image that falls outside the cluster area shared with the captured images as a result of the clustering, and deletes the identified image. Subsequently, the processor unit 140 may provide an updated data distribution map through the visualization unit 130 by performing step S230 again. Alternatively, the processor unit 140 may wait for the next command, or may terminate the image learning process after indicating the error situation through the visualization unit 130 (S250).
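One simple realization of this filtering branch, assuming the comparison is done in the 2-D t-SNE space, is sketched below: synthesized points whose distance to the centroid of the captured points exceeds a preset threshold are treated as lying outside the captured-image cluster and are dropped. The threshold value is purely illustrative.

```python
# A minimal sketch of the filtering branch (S250): delete synthesized images
# whose relative distance to the captured cluster exceeds a preset threshold.
import numpy as np


def filter_synthesized(synthesized, syn_coords, cap_coords, threshold=10.0):
    centroid = cap_coords.mean(axis=0)                         # center of the captured cluster
    distances = np.linalg.norm(syn_coords - centroid, axis=1)  # relative distance per synthesized image
    keep = distances <= threshold
    return [img for img, ok in zip(synthesized, keep) if ok]   # outliers are deleted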
When the training command is input through the user interface unit 110, the processor unit 140 generates the second training dataset including the synthesized image acquired in step S210. For example, the processor unit 140 may generate a dataset in which the synthesized image acquired in step S210 is the input and the identification information of the sessile serrated adenoma is the label, and may generate a new second training dataset by merging this dataset with the pre-prepared first training dataset to be used when training the convolutional neural network to detect lesions (S260).
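A minimal sketch of step S260 follows: each synthesized image is paired with the sessile serrated adenoma identification label and merged with the pre-prepared first training dataset. "SSA" is the same hypothetical label identifier used in the extraction sketch above.

```python
# A minimal sketch of step S260: build the second training dataset by labeling
# the synthesized images and merging them with the first training dataset.
SSA_LABEL = "SSA"  # hypothetical label identifier


def build_second_dataset(synthesized, first_training_dataset):
    synthesized_pairs = [(image, SSA_LABEL) for image in synthesized]
    return list(first_training_dataset) + synthesized_pairs
```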
The processor unit 140 trains the convolutional neural network of the neural network model unit 120 using the second training dataset generated in step S260 (S270).
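For illustration, a training step such as S270 could look like the sketch below. It assumes the second training dataset yields (image tensor, integer class index) pairs and uses a small stand-in CNN; the architecture and hyper-parameters are illustrative only, not the network of the disclosure.

```python
# A minimal sketch of step S270: train a CNN classifier on the second dataset.
import torch
from torch import nn
from torch.utils.data import DataLoader


def train_cnn(second_dataset, num_classes, epochs=10, lr=1e-4):
    model = nn.Sequential(
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
        nn.Flatten(), nn.Linear(32, num_classes),
    )
    loader = DataLoader(second_dataset, batch_size=16, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```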
Thereafter, the processor unit 140 may control the providing unit 160 to externally provide the learned convolutional neural network through step S270. For example, the processor unit 140 may transmit the learned convolutional neural network to a peripheral device through the interface of the providing unit 160 or may transmit the learned convolutional neural network to the outside through the communication module of the providing unit 160.
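As a final illustrative sketch, handing the trained network to the providing unit 160 could amount to serializing the model and, optionally, transmitting the file over a network connection. The file name, host, and port are hypothetical.

```python
# A minimal sketch of exporting the trained CNN through a storage interface or
# a communication module; all names here are illustrative assumptions.
import socket

import torch


def provide_model(model, path="lesion_cnn.pt", host=None, port=9000):
    torch.save(model.state_dict(), path)   # export to a file for a peripheral/storage interface
    if host is not None:                    # or transmit through a communication module
        with socket.create_connection((host, port)) as conn, open(path, "rb") as f:
            conn.sendall(f.read())
```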
Meanwhile, each step included in the image learning method performed by the image learning apparatus 100 according to the above-described embodiment may be implemented in a computer-readable recording medium for recording a computer program including instructions for performing these steps.
According to the embodiment described above, it is possible to acquire the synthesized image for the sessile serrated adenoma using the generative adversarial network, and to use the acquired synthesized image when training the convolutional neural network to detect lesions. In particular, by visually providing the data distribution for the acquired synthesized image and the pre-prepared captured image, it is possible to easily check whether the synthesized image is suitable for detecting lesions of the sessile serrated adenoma. In addition, it is possible to train the convolutional neural network using the training dataset including the synthesized image according to the training command input after the visualization information on the data distribution is provided. By allowing only synthesized images verified by the user to be included in the training dataset of the convolutional neural network, it is possible to improve the reliability of the training dataset and, ultimately, the reliability of lesion detection.
Combinations of steps in each flowchart attached to the present disclosure may be executed by computer program instructions. Since the computer program instructions can be loaded onto a processor of a general-purpose computer, a special-purpose computer, or other programmable data processing equipment, the instructions executed by the processor of the computer or other programmable data processing equipment create means for performing the functions described in each step of the flowchart. The computer program instructions can also be stored on a computer-usable or computer-readable storage medium that can direct a computer or other programmable data processing equipment to implement a function in a specific manner. Accordingly, the instructions stored on the computer-usable or computer-readable recording medium can also produce an article of manufacture containing instruction means that perform the functions described in each step of the flowchart. The computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operational steps are performed on the computer or other programmable data processing equipment to create a computer-executable process; it is thus also possible for the instructions executed on the computer or other programmable data processing equipment to provide steps for performing the functions described in each step of the flowchart.
In addition, each step may represent a module, a segment, or a portion of codes which contains one or more executable instructions for executing the specified logical function(s). It should also be noted that in some alternative embodiments, the functions mentioned in the steps may occur out of order. For example, two steps illustrated in succession may in fact be performed substantially simultaneously, or the steps may sometimes be performed in a reverse order depending on the corresponding function.
The above description is merely illustrative of the technical idea of the present disclosure, and various modifications and variations can be made by those skilled in the art to which the present disclosure pertains without departing from the essential quality of the present disclosure. Therefore, the embodiments disclosed herein are not intended to limit the technical spirit of the present disclosure, but to illustrate it, and the scope of the technical spirit of the present disclosure is not limited by these embodiments. The protection scope of the present disclosure should be interpreted by the following claims, and all technical ideas within the scope equivalent thereto should be interpreted as being included in the scope of the present disclosure.

Claims (9)

  1. An image learning method performed by an image learning apparatus for detecting lesions, the image learning method comprising:
    inputting a captured image for a sessile serrated adenoma to a generative adversarial network (GAN) to obtain a synthesized image for the sessile serrated adenoma as an output of the generative adversarial network;
    extracting the captured image for the sessile serrated adenoma from a first training dataset that is used when training a convolutional neural network to detect lesions;
    waiting for a training command after visualizing and providing data distribution for the synthesized image and the captured image;
    generating a second training dataset including the synthesized image according to the training command; and
    training the convolutional neural network using the second training dataset.
  2. The image learning method of claim 1, wherein, when visualizing the data distribution, a t-distributed stochastic neighbor embedding (t-SNE) algorithm is used to reduce a data distribution in a high-dimensional space to a data distribution in a two-dimensional space.
  3. The image learning method of claim 2, wherein, when visualizing the data distribution, the synthesized image and the captured image are input to the t-SNE algorithm with the same number and the same size.
  4. An image learning apparatus for detecting lesions, comprising:
    a user interface unit configured to provide a user interface for inputting various commands;
    a neural network model unit configured to include a generative adversarial network and a convolutional neural network;
    a visualization unit configured to provide visualization information; and
    a processor,
    wherein the processor inputs a captured image for a sessile serrated adenoma to a generative adversarial network (GAN) to obtain a synthesized image for the sessile serrated adenoma as an output of the generative adversarial network,
    extracts the captured image for the sessile serrated adenoma from a first training dataset that is used when training a convolutional neural network to detect lesions,
    visualizes the data distribution for the synthesized image and the captured image, provides the visualized data distribution through the visualization unit, and then waits for a training command,
    generates a second training dataset including the synthesized image according to the training command through the user interface, and
    trains the convolutional neural network using the second training dataset.
  5. The image learning apparatus of claim 4, wherein, when visualizing the data distribution, a t-SNE algorithm is used to reduce a data distribution in a high-dimensional space to a data distribution in a two-dimensional space.
  6. The image learning apparatus of claim 5, wherein, when visualizing the data distribution, the synthesized image and the captured image are input to the t-SNE algorithm with the same number and the same size.
  7. A non-transitory computer-readable storage medium including computer executable instructions, wherein the instructions, when executed by a processor, cause the processor to perform an image learning method, the method comprising:
    inputting a captured image for a sessile serrated adenoma to a generative adversarial network (GAN) to obtain a synthesized image for the sessile serrated adenoma as an output of the generative adversarial network;
    extracting the captured image for the sessile serrated adenoma from a first training dataset that is used when training a convolutional neural network to detect lesions;
    waiting for a training command after visualizing and providing data distribution for the synthesized image and the captured image;
    generating a second training dataset including the synthesized image according to the training command; and
    training the convolutional neural network using the second training dataset.
  8. The non-transitory computer-readable storage medium of claim 7, wherein, when visualizing the data distribution, a t-distributed stochastic neighbor embedding (t-SNE) algorithm is used to reduce a data distribution in a high-dimensional space to a data distribution in a two-dimensional space.
  9. The non-transitory computer-readable storage medium of claim 8, wherein, when visualizing the data distribution, the synthesized image and the captured image are input to the t-SNE algorithm with the same number and the same size.
PCT/KR2022/007308 2021-05-21 2022-05-23 Method and apparatus for learning image for detecting lesions WO2022245191A1 (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
KR20210065538 2021-05-21
KR10-2021-0065538 2021-05-21
KR10-2021-0089799 2021-07-08
KR1020210089799A KR20220157833A (en) 2021-05-21 2021-07-08 Method and apparatus for learning image for detecting lesions

Publications (1)

Publication Number Publication Date
WO2022245191A1 true WO2022245191A1 (en) 2022-11-24

Family

ID=84140714

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/007308 WO2022245191A1 (en) 2021-05-21 2022-05-23 Method and apparatus for learning image for detecting lesions

Country Status (1)

Country Link
WO (1) WO2022245191A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200085382A1 (en) * 2017-05-30 2020-03-19 Arterys Inc. Automated lesion detection, segmentation, and longitudinal identification
US20190385018A1 (en) * 2018-06-13 2019-12-19 Casmo Artificial Intelligence-Al Limited Systems and methods for training generative adversarial networks and use of trained generative adversarial networks
US20210125000A1 (en) * 2019-10-23 2021-04-29 Samsung Sds Co., Ltd. Method and apparatus for training model for object classification and detection

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DONGHEON LEE: "Deep Learning Approaches for Clinical Performance Improvement: Applications to Colonoscopic Diagnosis and Robotic Surgical Skill Assessment", PH. D. DISSERTATION, 1 August 2020 (2020-08-01), XP093007996, [retrieved on 20221214] *
FRID-ADAR MAAYAN, DIAMANT IDIT, KLANG EYAL, AMITAI MICHAL, GOLDBERGER JACOB, GREENSPAN HAYIT: "GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification", NEUROCOMPUTING, ELSEVIER, AMSTERDAM, NL, vol. 321, 1 December 2018 (2018-12-01), AMSTERDAM, NL , pages 321 - 331, XP093007999, ISSN: 0925-2312, DOI: 10.1016/j.neucom.2018.09.013 *


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22805044

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE