WO2021085258A1 - Image processing device, image processing device control method, discriminator generation method, identification method, identification device, discriminator generation device, and discriminator - Google Patents


Info

Publication number
WO2021085258A1
WO2021085258A1 (PCT/JP2020/039496)
Authority
WO
WIPO (PCT)
Prior art keywords
data
learning
image
data set
information
Prior art date
Application number
PCT/JP2020/039496
Other languages
English (en)
Japanese (ja)
Inventor
泰 吉正
彰大 田谷
河村 英孝
Original Assignee
キヤノン株式会社
Priority date
Filing date
Publication date
Application filed by キヤノン株式会社
Publication of WO2021085258A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis

Definitions

  • the present invention relates to an image processing device, a control method of the image processing device, a method of generating a discriminator for identifying identification target information in data, an identification method and an identification device that use the discriminator generated by the generation method, a discriminator generation device, and a discriminator.
  • Segmentation is a process of specifying, for each region of an image, the class (classification) to which its pixels belong, and is used for diagnosis using medical images, infrastructure inspection, various types of particle analysis, and the like.
  • Patent Document 1 describes a technique for distinguishing whether a target abnormal shadow is benign or malignant by acquiring, from a medical image, the region and the feature amount of the abnormal shadow of interest (hereinafter referred to as the target abnormal shadow).
  • this technique extracts regions of interest from the target region in the medical image using a plurality of mutually different position coordinates, and performs learning so that differential diagnosis can be carried out with high accuracy even if there are variations caused by the doctor's work.
  • Increasing the data used for learning to give diversity in this way is called “Data Augmentation”, and is a technique often used to improve the accuracy of inference results.
  • the image processing apparatus for solving the above problems is an image processing apparatus that acquires information of a specific region in an image based on inference. It has an information acquisition means for acquiring the information of the specific region, which is inferred by inputting each piece of information of a plurality of attention regions, extracted from the image based on a predetermined inference condition, into a trained model. The plurality of attention regions include a first attention region and a second attention region, and the first attention region and the second attention region each have a region that overlaps the other and a region that does not overlap the other.
  • the control method of the image processing device is a control method of an image processing device that acquires information of a specific region in an image based on inference. It has an information acquisition step of acquiring the information of the specific region, which is inferred by inputting each piece of information of a plurality of attention regions, extracted from the image based on a predetermined inference condition, into a trained model. The plurality of attention regions include a first attention region and a second attention region, and the first attention region and the second attention region each have a region that overlaps the other and a region that does not overlap the other.
  • Another invention is a method for generating a classifier for identifying identification target information in data. The method includes a first learning step of learning using a first learning data set among an initial data set including a plurality of learning data created from the data, and a second learning step of updating the information contained in the classifier by learning using the information contained in the classifier generated in the first learning step and a second learning data set among the initial data set. The method for generating a classifier is characterized in that the amount of the identification target information included in the first learning data set is larger than the amount of the identification target information included in the second learning data set.
  • Yet another generation method is a method for generating a classifier for estimating identification target information in data. For a learning data set group having a first learning data set including learning data composed of input data and teacher data for the input data, and a second learning data set including a larger number of the learning data than the first learning data set, the method includes a generation step of generating the classifier using the learning data set group. The amount of the identification target information contained in the input data included in the first learning data set is larger than the amount of the identification target information contained in the input data included in the second learning data set.
  • Yet another generator is a generation device for generating a classifier for estimating identification target information in data. For a learning data set group having a first learning data set including learning data composed of input data and teacher data for the input data, and a second learning data set including a larger number of the learning data than the first learning data set, the generator includes a generation means for generating the classifier using the learning data set group. The amount of the identification target information contained in the input data included in the first learning data set is larger than the amount of the identification target information contained in the input data included in the second learning data set.
  • according to the image processing apparatus, since each piece of information of the plurality of areas of interest in the image is input to the trained model to perform inference, the inference accuracy of the information of the specific area in the image can be improved.
  • the image processing apparatus 1-100 acquires information of a specific region in an image based on inference. Specifically, it is provided with an information acquisition means 1-50 that acquires the information of the specific area (1-520) inferred by inputting each piece of information of the plurality of attention regions (1-540 to 1-542), extracted from the image (1-500) based on a predetermined inference condition, into the trained model 1-47.
  • the plurality of attention regions include a first attention region (for example, 1-540) and a second attention region (for example, 1-541).
  • a trained model is used to extract a specific region (1-520) in image 1-500 by inference.
  • the trained model is obtained by training an image whose specific region is known as teacher data.
  • each of the information of the plurality of areas of interest extracted from the image is input to the above-mentioned trained model.
  • the plurality of areas of interest are selected so as to have a region that overlaps with each other and a region that does not overlap with each other.
  • thereby, not only can a plurality of inference results be obtained for a certain region A (an area in which the regions of interest overlap each other) in the image, but inference results for the region around region A can also be obtained.
  • the image in the present embodiment is, for example, an image including an image of a first material and an image of a second material different from the first material.
  • the information in the specific region includes at least one of the position of the image of the second material in the image and the size of the image of the second material.
  • preferably, the size of the first attention area and the size of the second attention area are the same, because this makes them easy to input to the trained model.
  • the information of the region of interest includes information on at least one of the position and size of the region extracted from the image.
  • the image processing device may further have a reception unit 1-41 that accepts the setting of inference conditions.
  • the reception unit may be one that receives an instruction issued by the user operating the operation unit 1-140, one that receives an automatic instruction by the image processing device, or another.
  • the information acquisition means 1-50 may have a model acquisition unit 1-42 for acquiring the trained model 1-47.
  • the model acquisition unit has a generation unit (not shown) that generates a trained model, and the trained model may be acquired from the generation unit or may be acquired from the data server 1-120.
  • the information acquisition means may have extraction units 1-43 that extract a plurality of regions of interest from the image based on the inference conditions received by the reception unit. Further, the information acquisition means acquires a plurality of inference results by inputting each of the plurality of attention regions extracted by the extraction unit into the trained model, and obtains information in a specific region based on the plurality of inference results. It may have the information acquisition unit 1-45 to be acquired.
  • the extraction unit may extract a plurality of areas of interest using random numbers, may extract areas of interest regularly from end to end of the image, or may use both methods.
  • the inference conditions include, for example, at least one of the number of inferences performed on average for each pixel of the image, the threshold value for the ratio of the number of times a region is inferred to be the specific area to the number of times that region is inferred, and the size of the attention area.
  • when the image includes a plurality of specific areas and the areas of the plurality of specific areas have a distribution, the image processing apparatus according to the present embodiment is preferably used. Further, when the ratio of the maximum value to the minimum value of the areas of the plurality of specific regions is 50 or more, and particularly when it is 100 or more, the image processing apparatus according to the present embodiment is preferably used.
  • the image processing apparatus may further have a display control unit that performs display on the display unit, based on the information of the specific area, so that the display mode of the specific area in the image differs from the display mode of areas other than the specific area. For example, as shown in FIG. 4, the specific areas 1-520 can be displayed as black and the other areas as white. As a means for changing the display mode, a means other than changing the color may be used.
  • the control method of the image processing device is the control method of the image processing device that acquires the information of the specific region in the image based on the inference.
  • the plurality of areas of interest include a first area of interest and a second area of interest, and the first area of interest and the second area of interest each have a region that overlaps the other and a region that does not overlap the other.
  • the image processing apparatus performs inference processing using the trained model.
  • the user sets inference conditions, and the image processing device extracts a plurality of regions of interest from the inference image based on the inference conditions.
  • the image processing device makes inferences using a common trained model for each of the plurality of areas of interest, and calculates the final inference result based on each inference result.
  • the inference result refers to, for example, an object detection result or a segmentation result.
  • the image processing system 1-190 includes an image capturing device 1-110 for capturing an image, a data server 1-120 for storing the captured image, and an image processing device 1-100 for performing image processing. Further, it has a display unit 1-130 for displaying the acquired input image and the image processing result, and an operation unit 1-140 for inputting an instruction from the user.
  • the image processing device 1-100 acquires an input image and performs image processing on the region of interest reflected in the input image.
  • the input image is, for example, an image obtained by subjecting image data acquired by the image capturing apparatus 1-110 to image processing or the like to obtain an image suitable for analysis. Further, the input image in the present embodiment is an inference image.
  • the image processing device 1-100 is, for example, a computer, and performs image processing according to the present embodiment.
  • the image processing device 1-100 has at least a CPU 1-31, a communication IF 1-32, a ROM 1-33, a RAM 1-34, a storage unit 1-35, and a common bus 1-36.
  • the CPU 1-31 integrally controls the operation of each component of the image processing device 1-100.
  • the image processing device 1-100 may also control the operation of the image capturing device 1-110 by controlling the CPU 1-31.
  • the data server 1-120 holds an image captured by the image capturing device 1-110.
  • Communication IF (Interface) 1-32 is realized by, for example, a LAN card. Communication between the external device (for example, data server 1-120) and the image processing device 1-100 is performed by the communication IF1-32.
  • the ROM 1-33 is realized by a non-volatile memory or the like, stores a control program executed by the CPU 1-31, and provides a work area when the program is executed by the CPU 1-31.
  • RAM (Random Access Memory) 1-34 is realized by a volatile memory or the like, and temporarily stores various information.
  • the storage unit 1-35 is realized by, for example, an HDD (Hard Disk Drive) or the like. Then, the storage unit 1-35 stores various application software including an operating system (OS: Operating System), a device driver of a peripheral device, and a program for performing image processing according to the present embodiment described later.
  • the operation unit 1-140 is realized by, for example, a keyboard, a mouse, or the like, and inputs an instruction from the user into the device.
  • the display unit 1-130 is realized by, for example, a display or the like, and displays various information toward the user.
  • the operation unit 1-140 and the display unit 1-130 provide a function as a GUI (Graphical User Interface) under the control of the CPU 1-31.
  • the display unit 1-130 may be a touch panel monitor that accepts operation input, and the operation unit 1-140 may be a stylus pen.
  • Each of the above components is communicably connected to each other by common bus 1-36.
  • the imaging apparatus 1-110 is, for example, a scanning electron microscope (SEM), a transmission electron microscope (TEM: Transmission Electron Microscope), or an optical microscope.
  • the image capturing device 1-110 may also be a device having an image capturing function such as a digital camera or a smartphone.
  • the image capturing device 1-110 transmits the acquired image to the data server 1-120.
  • An imaging control unit (not shown) that controls the imaging apparatus 1-110 may be included in the image processing apparatus 1-100.
  • the main body that executes the program may be one or more CPUs, and the ROM that stores the program may also be one or more memories. Further, another processor such as a GPU (Graphics Processing Unit) may be used instead of the CPU or in combination with the CPU. That is, the functions of the respective parts shown in FIG. 2 are realized by at least one processor (hardware) executing a program stored in at least one memory communicably connected to that processor.
  • the image processing device 1-100 has, as functional configurations, a reception unit 1-41, a model acquisition unit 1-42, an extraction unit 1-43, an inference unit 1-44, an information acquisition unit 1-45, and a display control unit 1-46.
  • the image processing device 1-100 is communicably connected to the data server 1-120 and the display unit 1-130.
  • Reception unit 1-41 receives the inference condition input from the user via operation unit 1-140. That is, the operation unit 1-140 corresponds to an example of a reception means that accepts the setting of the inference condition.
  • the inference condition includes at least one of information on the number of inferences (described later), a threshold value, and a patch size.
  • the model acquisition unit 1-42 acquires the trained model 1-47 constructed in advance and the inference image from the data server 1-120.
  • the extraction unit 1-43 extracts a plurality of regions of interest from the inference image based on the inference conditions received by the reception unit 1-41. That is, it corresponds to an example of an extraction means for extracting a plurality of regions of interest from an image for inference.
  • the area of interest refers to a part cut out from the inference image.
  • the inference unit 1-44 makes inferences for each of the plurality of areas of interest using the trained model 1-47 acquired by the model acquisition unit 1-42. That is, it corresponds to an example of an inference means that makes an inference using a common trained model for each of a plurality of areas of interest.
  • the information acquisition unit 1-45 calculates the final inference result based on the inference result performed by the inference unit 1-44. That is, it corresponds to an example of a calculation means for calculating the final inference result based on a plurality of inference results.
  • the display control unit 1-46 outputs the information regarding the inference result acquired in each process to the display unit 1-130, and causes the display unit 1-130 to display the result of each process.
  • each part of the image processing device 1-100 may be realized as an independent device.
  • the image processing device 1-100 may be a workstation.
  • the functions of each part may be realized as software that operates on a computer, and the software that realizes the functions of each part may be realized on a server via a network such as a cloud.
  • each part is realized by software running on a computer installed in a local environment.
  • FIG. 3 is a diagram showing a processing procedure of processing executed by the image processing apparatus 1-100 of the present embodiment.
  • This embodiment is realized by the CPU 1-31 executing a program that realizes the functions of each part stored in the ROM 1-33.
  • an example in which the image to be processed is a TEM image will be described.
  • the TEM image is acquired as a two-dimensional shading image.
  • carbon black in the coating film of the melamine / alkyd resin paint will be described as an example of the object to be processed included in the image to be processed.
  • the reception unit 1-41 receives the inference condition input by the user in the operation unit 1-140.
  • the inference condition in the present embodiment includes at least one of information regarding the number of inferences, a threshold value, and a patch size.
  • the information regarding the number of inferences is information such as the average number of inferences and the number of extractions of each pixel, which will be described later.
  • in step S1-202, the model acquisition unit 1-42 acquires the trained model constructed in advance and the inference image.
  • the inference image is acquired from the data server 1-120. If the patch size is set in step S1-201, a trained model trained with the same patch size is acquired.
  • the patch size is the number of pixels in the vertical and horizontal directions of the cropped image when a part of the target image is cropped.
  • a pair of a TEM image, which is an image to be processed, and a teacher image is prepared.
  • the teacher image is an image obtained by processing the image to be processed using an appropriate image processing method. For example, it is an image obtained by binarizing the area to be detected and the area not to be detected, that is, an image in which the area to be detected is filled and the area not to be detected is not filled.
  • the trained model 1-47 is generated by performing machine learning according to a predetermined algorithm using the image to be processed and the teacher image.
  • in the present embodiment, U-Net is used as the predetermined algorithm. As a learning method using U-Net, a known technique can be used.
  • as the algorithm, SVM (Support Vector Machine), DNN (Deep Neural Network), CNN (Convolutional Neural Network), and GAN (Generative Adversarial Networks) can also be used. FCN (Fully Convolutional Network), SegNet, and the like can also be used as algorithms for semantic segmentation, which classifies classes in 1-pixel units.
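  • As an illustration only, the following is a minimal U-Net-style encoder-decoder sketch written in PyTorch. It is not the network actually used in the embodiment; the framework, layer widths, and class name are assumptions chosen for brevity.

```python
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    """Minimal two-level U-Net-style segmentation network (illustrative sketch only)."""
    def __init__(self, in_ch=1, out_ch=1):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.pool = nn.MaxPool2d(2)
        self.enc2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(32, 32, 3, padding=1), nn.ReLU())
        self.up = nn.ConvTranspose2d(32, 16, 2, stride=2)        # upsample back to input resolution
        self.dec1 = nn.Sequential(nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
                                  nn.Conv2d(16, 16, 3, padding=1), nn.ReLU())
        self.head = nn.Conv2d(16, out_ch, 1)                     # per-pixel class score

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))      # skip connection, as in U-Net
        return torch.sigmoid(self.head(d1))

# Example: a 128 x 128 grayscale patch yields a 128 x 128 per-pixel probability map.
model = MiniUNet()
probs = model(torch.zeros(1, 1, 128, 128))                       # shape (1, 1, 128, 128)
```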
  • in step S1-203, the extraction unit 1-43 extracts a plurality of regions of interest from the inference image.
  • FIG. 4 shows an example in which the attention region 1-540, the attention region 1-541, and the attention region 1-542 are extracted with respect to the position coordinates 1-530, the position coordinates 1-531, and the position coordinates 1-532, respectively.
  • the inference image in this embodiment is composed of a plurality of pixels whose positions can be specified by two-dimensional Cartesian coordinates (x, y).
  • a set of random numbers (x_i, y_i) satisfying 0 ≤ x_i ≤ x_size and 0 ≤ y_i ≤ y_size is generated, where x_size and y_size are the numbers of pixels in the horizontal and vertical directions of the image.
  • the region of interest is set with (x_i, y_i) as its upper-left coordinate.
  • the size of the area of interest should be equal to the patch size.
  • the user sets the average number of inferences for each pixel in the operation unit 1-140.
  • the average number of inferences is the average number of extractions for each pixel when performing extraction. When extracting, it can be obtained by recording the number of times of extraction for each pixel.
  • when (x_i, y_i) is positioned near the edge of the image and the size of the region of interest would become smaller than the patch size, the periphery of the image may be filled with pixel values of 0 (a so-called padding process) so that the size of the region of interest is adjusted to match the patch size, as in the sketch below.
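  • The following NumPy sketch illustrates this random extraction with zero padding. The function and variable names (extract_random_patches, avg_inferences_per_pixel) are illustrative rather than terms defined in this document, and the patch count is only an approximation of the requested average number of inferences per pixel.

```python
import numpy as np

def extract_random_patches(image, patch_size, avg_inferences_per_pixel, rng=None):
    """Cut square patches from a 2-D grayscale image at random upper-left coordinates (x_i, y_i)."""
    if rng is None:
        rng = np.random.default_rng()
    y_size, x_size = image.shape
    # Zero-pad the right and bottom edges so patches starting near the edge keep the patch size.
    padded = np.pad(image, ((0, patch_size), (0, patch_size)), constant_values=0)
    # Approximate number of patches needed for the requested average coverage per pixel.
    n_patches = int(np.ceil(avg_inferences_per_pixel * x_size * y_size / patch_size ** 2))
    patches, origins = [], []
    for _ in range(n_patches):
        x_i = int(rng.integers(0, x_size))
        y_i = int(rng.integers(0, y_size))
        patches.append(padded[y_i:y_i + patch_size, x_i:x_i + patch_size])
        origins.append((x_i, y_i))
    return patches, origins
```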
  • in step S1-204, the inference unit 1-44 makes an inference using the trained model 1-47 for each of the plurality of areas of interest extracted in step S1-203.
  • in step S1-205, the information acquisition unit 1-45 calculates and acquires the final inference result based on the inference results of step S1-204.
  • for each pixel, the number of times it was inferred and the number of times it was determined to be carbon black are recorded, and a pixel is finally determined to be carbon black when (number of times determined to be carbon black) / (number of times inferred) becomes equal to or greater than the threshold value.
  • the threshold value may be set by the user via the operation unit 1-140. If the inference is regression processing rather than classification, a further threshold is set in addition to the above threshold; a result at or above this further threshold is first classified as carbon black, and the final determination processing is then performed, as sketched below.
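  • A sketch of this per-pixel vote counting and threshold decision in NumPy is shown below; names are illustrative, and the 0.5 cut-off used to binarize the per-patch output is an assumed value standing in for the further threshold mentioned above.

```python
import numpy as np

def aggregate_votes(pred_patches, origins, image_shape, patch_size, threshold):
    """Combine per-patch probability maps into a final per-pixel carbon black decision."""
    y_size, x_size = image_shape
    inferred = np.zeros((y_size + patch_size, x_size + patch_size), dtype=np.int32)
    positive = np.zeros_like(inferred)
    for pred, (x_i, y_i) in zip(pred_patches, origins):
        inferred[y_i:y_i + patch_size, x_i:x_i + patch_size] += 1
        positive[y_i:y_i + patch_size, x_i:x_i + patch_size] += (pred >= 0.5).astype(np.int32)
    ratio = positive / np.maximum(inferred, 1)       # times judged carbon black / times inferred
    final = ratio[:y_size, :x_size] >= threshold     # final judgement per pixel
    return np.where(final, 255, 0).astype(np.uint8)  # brightness 255 for carbon black, 0 otherwise
```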
  • in step S1-206, the display control unit 1-46 causes the display unit 1-130 to display the final inference result.
  • the display control unit 1-46 controls the display unit 1-130 to transmit the final inference result to the display unit 1-130 connected to the image processing device 1-100 and display the final inference result on the display unit 1-130.
  • it is determined for each pixel whether or not it is carbon black, and the pixel determined to be carbon black is displayed with a brightness of 255, and the pixel determined to be not carbon black is displayed with a brightness of 0.
  • the inference accuracy can be evaluated using IoU (Intersection over Union), which is calculated from the TP (True Positive), FP (False Positive), and FN (False Negative) pixel counts as IoU = TP / (TP + FP + FN).
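  • For reference, a minimal NumPy sketch of this metric is given below; the iou function name is illustrative, and treating mIoU as the mean of iou over the evaluation images is an assumption (equation (1-2) itself is not reproduced in this text).

```python
import numpy as np

def iou(pred, truth):
    """IoU = TP / (TP + FP + FN) for a pair of binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()
    fp = np.logical_and(pred, ~truth).sum()
    fn = np.logical_and(~pred, truth).sum()
    return tp / (tp + fp + fn) if (tp + fp + fn) else 1.0

# mIoU over an evaluation set (assumed interpretation):
# miou = float(np.mean([iou(p, t) for p, t in zip(pred_masks, truth_masks)]))
```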
  • the image processing apparatus 1-100 in the present embodiment can improve the inference accuracy by performing inference using a common trained model for each of a plurality of areas of interest. Further, since the user can set the threshold value, the inference accuracy can be controlled according to the purpose. For example, to reduce missed detections, lower the threshold; to reduce false positives, raise the threshold. Inference suited to the purpose can thus be performed while using the same trained model.
  • the reception unit 1-41 receives the inference condition input by the user in the operation unit 1-140.
  • the inference condition in the present embodiment includes at least one of information regarding the number of inferences, a threshold value, and a patch size.
  • the information regarding the number of inferences is information such as the number of times the reference coordinates are set, which will be described later.
  • in step S1-203, the extraction unit 1-43 extracts a plurality of regions of interest from the inference image 1-501. FIG. 6 shows an example in which the attention regions 1-550 to 1-558 are extracted using the reference coordinates 1-560 at (x_1, y_1) as a reference.
  • a plurality of reference coordinates (x_j, y_j) (j = 1, 2, ..., N) are set, each (x_j, y_j) being a set of random numbers satisfying 0 ≤ x_j ≤ p_x and 0 ≤ y_j ≤ p_y, where p_x and p_y denote the patch size in the horizontal and vertical directions.
  • the upper-left coordinates of the other areas of interest are (x_j + p_x × m, y_j + p_y × n), where m is an integer from 1 to x_size / p_x - 1 and n is an integer from 1 to y_size / p_y - 1.
  • the user sets the reference coordinate setting number of times in the operation unit 1-140.
  • the reference coordinate setting number is the number of times that the upper-left reference coordinate (x_j, y_j) is set using random numbers during extraction (see the sketch below).
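  • The sketch below illustrates this tiling scheme: a random reference coordinate is chosen inside one patch, and the remaining regions of interest are laid out at patch-size offsets from it. Names are illustrative; regions that run past the image edge are assumed to be handled by the padding process described for the first extraction method.

```python
import numpy as np

def grid_origins(x_size, y_size, p_x, p_y, n_references, rng=None):
    """Upper-left coordinates of regions of interest tiled from random reference coordinates."""
    if rng is None:
        rng = np.random.default_rng()
    origins = []
    for _ in range(n_references):
        x_j = int(rng.integers(0, p_x + 1))    # 0 <= x_j <= p_x
        y_j = int(rng.integers(0, p_y + 1))    # 0 <= y_j <= p_y
        for m in range(x_size // p_x):         # m = 0 is the reference region itself
            for n in range(y_size // p_y):
                origins.append((x_j + p_x * m, y_j + p_y * n))
    return origins
```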
  • the reception unit 1-41 receives the inference condition input by the user in the operation unit 1-140.
  • the inference condition in the present embodiment includes at least one of information regarding the number of inferences, a threshold value, and a patch size.
  • the information regarding the number of inferences is information such as the number of times the reference coordinates are set, which will be described later.
  • in step S1-203, the extraction unit 1-43 extracts a plurality of regions of interest from the inference image 1-502. FIG. 8 shows an example in which the attention regions 1-560 to 1-568 are extracted using the reference coordinates 1-660 at (x_1, y_1) as a reference.
  • a plurality of reference coordinates (x_j, y_j) (j = 1, 2, ..., N) are set, each (x_j, y_j) being a set of random numbers satisfying 0 ≤ x_j ≤ p_x and 0 ≤ y_j ≤ p_y, where p_x and p_y denote the patch size in the horizontal and vertical directions.
  • the upper-left coordinates of the other areas of interest are (x_j + p_x × m, y_j + p_y × n), where m is an integer from 1 to x_size / p_x - 1 and n is an integer from 1 to y_size / p_y - 1.
  • the user sets the reference coordinate setting number of times in the operation unit 1-140.
  • the reference coordinate setting number is the number of times that the upper-left reference coordinate (x_j, y_j) is set using random numbers during extraction.
  • mIoU is defined by equation (1-2).
  • the patch size was 128 ⁇ 128 and the threshold was 0.2.
  • the reception unit 1-41 receives the inference condition input by the user in the operation unit 1-140.
  • the inference condition in the present embodiment includes at least one of information regarding the number of inferences, a threshold value, and a patch size.
  • the information regarding the number of inferences is information such as pitch, which will be described later.
  • in step S1-203, the extraction unit 1-43 extracts a plurality of regions of interest from the inference image. FIG. 10 shows an example in which the attention regions 1-570 to 1-572 are extracted using the reference coordinates 1-580 at (x_1, y_1) as a reference.
  • a plurality of areas of interest are extracted by shifting the areas of interest by the pitch vertically or horizontally.
  • the upper-left coordinates of the region of interest 1-571 and the region of interest 1-572 are (x_1 + pitch_x, y_1) and (x_1 + 2 × pitch_x, y_1), respectively (see the sketch below).
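  • A short sketch of this pitch-based extraction follows; the function name and the counts n_x and n_y (how many regions are taken in each direction) are illustrative assumptions.

```python
def pitch_origins(x_1, y_1, pitch_x, pitch_y, n_x, n_y):
    """Upper-left coordinates obtained by shifting the reference (x_1, y_1) by the pitch."""
    return [(x_1 + pitch_x * i, y_1 + pitch_y * j)
            for j in range(n_y) for i in range(n_x)]

# Example: pitch_origins(x_1, y_1, pitch_x, pitch_y, 3, 1)
#   -> [(x_1, y_1), (x_1 + pitch_x, y_1), (x_1 + 2 * pitch_x, y_1)]
```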
  • as in the first and second embodiments, evaluation was performed using mIoU.
  • the trained model (discriminator) that has been learned by the following first learning step and the second learning step can be used.
  • the first learning step is a step in which learning is performed using the first learning data set among the initial data sets including a plurality of learning data created from the data including the identification target information.
  • the second learning process is performed by learning using the information contained in the trained model generated by learning in the first learning process and the second training data set of the initial data sets.
  • the amount of identification target information included in the first learning data set is larger than the amount of identification target information included in the second learning data set.
  • a trained model (discriminator) generated by the following padding step and generation step can also be used.
  • in the padding step, for a learning data set group having a first learning data set including learning data composed of input data and teacher data for the input data, and a second learning data set including a larger number of learning data than the first learning data set, the learning data are inflated so that the number of learning data contained in the first learning data set becomes equal to or larger than the number of learning data contained in the second learning data set.
  • the generation step generates a trained model using the learning data set group having the learning data inflated in the padding step.
  • the amount of identification target information contained in the input data included in the first learning data set is larger than the amount of identification target information contained in the input data included in the second learning data set. Since the contents of the third embodiment will be described later, they will be omitted here.
  • when the first embodiment, the second embodiment, and the third embodiment are combined, the first learning step, the second learning step, the padding step, and the generation step are performed when generating the trained model of the first embodiment.
  • the image processing device and the image processing system in each of the above-described embodiments may be realized as a single device, or may take a form in which devices including a plurality of information acquisition devices are combined so as to be able to communicate with each other and execute the above-mentioned processing; both forms are included in the embodiments of the present invention.
  • the above-mentioned processing may be executed by a common server device or a group of servers.
  • the common server device corresponds to the image processing device according to the embodiment
  • the server group corresponds to the image processing system according to the embodiment.
  • the image processing device and the plurality of devices constituting the image processing system need not be present in the same facility or in the same country as long as they can communicate at a predetermined communication rate.
  • the present invention can take an embodiment as, for example, a system, an apparatus, a method, a program, a recording medium (storage medium), or the like. Specifically, it may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, an imaging device, a Web application, and the like), or to an apparatus composed of a single device.
  • a recording medium (or storage medium) in which a software program code (computer program) that realizes the functions of the above-described embodiment is recorded is supplied to the system or device.
  • the storage medium is a computer-readable storage medium.
  • the computer (or CPU or GPU) of the system or device reads out and executes the program code stored in the recording medium. The program code itself read from the recording medium realizes the functions of the above-described embodiment, and the recording medium on which the program code is recorded constitutes the present invention.
  • << Second Embodiment >> (Background of the second embodiment)
  • as background, identification techniques in image processing, voice processing, text processing, and the like are known.
  • the discrimination accuracy is improved by using deep learning, but various efforts are being made to further improve the discrimination accuracy.
  • Japanese Unexamined Patent Publication No. 2019-118670 (Reference 2-1) describes a diagnostic support device that supports diagnosis of a diseased area by using deep learning. This technique makes it possible to perform highly accurate diagnosis by normalizing the color brightness of an image in advance and separating the diseased part and the non-diseased part.
  • however, when there are a plurality of pieces of identification target information (information to be identified) in one piece of data, or when the identification target information is difficult to distinguish from other information, it was found that identification is difficult with the methods described in Document 2-1 and Document 2-2. Further, when there is a large difference in the amount of identification target information for each piece of data, it has been difficult with conventional methods to construct a classifier that can accurately identify the identification target information regardless of its amount.
  • the object of the second embodiment is to provide a method for generating a discriminator that can accurately identify the identification target information even when there are a plurality of pieces of identification target information in one piece of data or when the identification target information is difficult to distinguish from other information.
  • another object of the present invention is to provide an identification method and an identification device that use the discriminator generated by the discriminator generation method.
  • the method of generating the classifier according to the present embodiment includes a first learning step of learning using the first learning data set among the initial data set including a plurality of learning data created from the data. It further has a second learning step of updating the information contained in the classifier by learning using the information contained in the classifier generated in the first learning step and the second learning data set among the initial data set. At that time, the amount of identification target information included in the first learning data set is larger than the amount of identification target information included in the second learning data set. In this way, the classifier is trained in two steps, starting with a data set having a large amount of information to be identified. As a result, the parameters of image conversion with a large degree of conversion can be learned first and then changed gradually, so the identification target information can be accurately identified.
  • Data is a representation of information that is formalized for transmission, interpretation, or processing and can be reinterpreted as information. Examples of data include image data, voice data, text data, and the like.
  • the identification target information is information to be identified in the data.
  • when the data is image data, for example, at least one piece of information on the position, area, and distribution of the identification target area in the image data is the identification target information.
  • the classifier generated by the generation method according to the present embodiment can estimate and extract the identification target area in the image data, which is difficult to extract visually by the user.
  • when the data is voice data, for example, at least one of the frequency and intensity of the identification target sound in the voice data is the identification target information.
  • the classifier generated by the generation method according to the present embodiment can estimate and extract the sound to be identified in the sound data including noise, which is difficult for the user to extract.
  • when the sound data is the voice data of a plurality of speakers, the voice data of at least one speaker can be used as the identification target information.
  • when the data is text data, for example, at least one of a character and a character string to be identified in the text data is the identification target information.
  • the classifier generated by the generation method according to the present embodiment can estimate and extract a character string to be identified in text data, which is difficult for the user to extract.
  • the amount of identification target information contained in the training data set is the value obtained by dividing the total amount of identification target information contained in the training data set by the number of training data contained in the training data set (average value).
  • the learning data is a pair of input data and teacher data, and the learning data set includes a plurality of learning data.
  • the amount of identification target information included in the learning data set is, for example, the area of the identification target area in the image.
  • the area of the identification target area in the image can be calculated from the number of pixels.
  • when the data is voice data, it is, for example, the length of the identification target information in the data separated at voice breaks.
  • the initial data set may be a collection of data in which the input data and the teacher data are separated at audio breaks or the like, and the data may be sorted in descending order of the difference between the input data and the teacher data signals.
  • similarly, for text data, the initial data set may be a collection of data in which the input data and the teacher data are separated at sentence breaks, and the data may be sorted in descending order of the difference between the input data and the teacher data text.
  • FIG. 12 is a diagram showing an example of the device configuration of the learning system (identifier generation system) according to the second embodiment.
  • the learning system 2-190 composed of the learning device (identifier generator) 2-100 and each device connected to the learning device 2-100 will be described in detail.
  • the learning system 2-190 includes a learning device 2-100 for learning, a data acquisition device 2-110 for acquiring data, and a data server 2-120 for storing the acquired data.
  • the learning system 2-190 further has a data processing device 2-130 that processes data to create teacher data, a display unit 2-140 that displays the acquired input data and the learning result, and an operation unit 2-150 for inputting instructions from the user.
  • the learning device 2-100 acquires a pair (learning data) of the input data and the teacher data created by processing the input data with the data processing device 2-130.
  • the learning data set including the plurality of learning data created in this way is the initial data set.
  • the training data set is acquired from the initial data set and training is performed.
  • the data acquisition device 2-110 in the present embodiment is a transmission electron microscope (TEM: Transmission Electron Microscope), and the input data is a TEM image.
  • the learning device 2-100 is, for example, a computer, and performs learning according to the present embodiment.
  • the learning device 2-100 has at least a CPU 2-31, a communication IF2-32, a ROM 2-33, a RAM 2-34, a storage unit 2-35, and a common bus 2-36.
  • the CPU 2-31 integrally controls the operation of each component of the learning device 2-100. By controlling the CPU 2-31, the learning device 2-100 may also control the operations of the data acquisition device 2-110 and the data processing device 2-130.
  • the data server 2-120 holds the data acquired by the data acquisition device 2-110.
  • the data processing device 2-130 processes the input data stored in the database so that it can be used for learning.
  • Communication IF (Interface) 2-32 is realized by, for example, a LAN card.
  • the communication IF2-32 controls communication between the external device (for example, the data server 2-120) and the learning device 2-100.
  • the ROM 2-33 is realized by a non-volatile memory or the like, stores a control program executed by the CPU 2-31, and provides a work area when the program is executed by the CPU 2-31.
  • RAM (Random Access Memory) 2-34 is realized by a volatile memory or the like, and temporarily stores various information.
  • the storage unit 2-35 is realized by, for example, an HDD (Hard Disk Drive) or the like, and includes an operating system (OS: Operating System), a device driver of a peripheral device, and a program for performing learning according to the present embodiment described later. Stores various application software.
  • the operation unit 2-150 is realized by, for example, a keyboard or a mouse, and inputs an instruction from the user into the device.
  • the display unit 2-140 is realized by, for example, a display or the like, and displays various information toward the user.
  • the operation unit 2-150 and the display unit 2-140 provide a function as a GUI (Graphical User Interface) under the control of the CPU 2-31.
  • the display unit 2-140 may be a touch panel monitor that accepts operation input, and the operation unit 2-150 may be a stylus pen.
  • Each component of the learning device 100 is communicably connected to each other by a common bus 2-36.
  • the data acquisition device 2-110 is, for example, a scanning electron microscope (SEM), a transmission electron microscope (TEM), an optical microscope, a digital camera, a smartphone, or the like.
  • the data acquisition device 2-110 transmits the acquired data to the data server 2-120.
  • a data acquisition control unit (not shown) that controls the data acquisition device 2-110 may be included in the learning device 2-100.
  • FIG. 13 is a diagram showing an example of the functional configuration of the learning system according to the second embodiment.
  • the main body that executes the program may be one or more CPUs, and the ROM that stores the program may also be one or more memories.
  • another processor such as a GPU (Graphics Processing Unit) may be used instead of the CPU or in combination with the CPU. That is, the functions of the respective parts shown in FIG. 13 are realized by at least one processor (hardware) executing a program stored in at least one memory communicably connected to that processor.
  • the learning device 2-100 has at least a reception unit 2-41, an acquisition unit 2-42, a selection unit 2-43, a learning unit 2-44, a classifier 2-45, a display control unit 2-48, and a display unit 2-140.
  • the learning device 2-100 is communicably connected to the data server 2-120 and the display unit 2-140.
  • Reception unit 2-41 accepts data set selection conditions (described later) via operation unit 2-150.
  • Acquisition unit 2-42 acquires the initial data set from the data server 2-120.
  • the selection unit 2-43 processes the initial data set acquired by the acquisition unit 2-42, and selects the first learning data set and the second learning data set.
  • the learning unit 2-44 sequentially executes learning using the first learning data set and the second learning data set acquired by the selection unit 2-43. That is, it performs first learning using at least the first learning data set, and then performs second learning that updates the information contained in the classifier by using the information contained in the classifier generated in the first learning together with the second learning data set.
  • the information included in the classifier generated in the first learning is stored in the information storage unit in the classifier.
  • each part of the learning device 2-100 may be realized as an independent device.
  • the learning device 2-100 may be a workstation.
  • the functions of each part may be realized as software that operates on a computer, and the software that realizes the functions of each part may be realized on a server via a network such as a cloud.
  • each part is realized by software running on a computer installed in a local environment.
  • FIG. 14 is a flow chart showing an example of a method for generating a classifier according to the second embodiment.
  • This embodiment is realized by the CPU 2-31 executing a program that realizes the functions of each part stored in the ROM 2-33.
  • the image to be processed will be described as a TEM image.
  • the TEM image is acquired as a two-dimensional shading image.
  • carbon black in the coating film of the melamine / alkyd resin paint will be described as identification target information.
  • an initial data set of 1000 pairs, consisting of 2000 images of size 128 × 128, was used.
  • the data were divided into learning and evaluation sets at a ratio of 8:2.
  • the learning data set includes the learning data.
  • the learning data is composed of input data and teacher data for the input data.
  • the teacher data is the image data with the identification target information attached. For example, the identification target area is shown in the image data.
  • the correct image is an image obtained by processing the identification target information in the identification target image by using an appropriate image processing method. For example, an image obtained by binarizing the identification target information and other information, or an image filled with the identification target information.
  • the carbon black in the TEM image will be described using an image filled with a luminance value (0,255,0).
  • the reception unit 2-41 receives the data set selection condition via the operation unit 2-150.
  • the dataset selection criteria are entered by the user.
  • the data set selection condition includes at least a method of dividing the initial data set, information on the data set used for training among the divided data sets, and a learning order.
  • as the method of dividing the data set, a method of dividing by a threshold value of the amount of identification target information is used.
  • the amount of identification target information is defined by the number of pixels filled with the luminance value (0,255,0). Further, here, the threshold value is set to 5000pixel.
  • in step S2-202, the acquisition unit 2-42 acquires the initial data set from the data server 2-120.
  • in step S2-203, the selection unit 2-43 processes the initial data set acquired by the acquisition unit 2-42 and selects the first learning data set and the second learning data set.
  • in step S2-203, the carbon black in the melamine / alkyd resin is used as the identification target information.
  • the data sets are sorted in order from the one with the largest amount of identification target information. That is, the data sets are sorted in descending order of the number of pixels filled with the luminance value (0,255,0).
  • the data set is divided according to the threshold value received by the reception unit 2-41.
  • the learning process is determined according to the information of the data set used for learning received by the reception unit 2-41 and the learning order.
  • here, a data set containing images whose amount of identification target information is 5000 pixels or more is referred to as the first learning data set, and a data set containing images whose amount of identification target information is 0 pixels or more is referred to as the second learning data set. Since the second learning data set includes the first learning data set, a classifier with higher discrimination accuracy can be generated (a selection sketch is given below).
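  • A sketch of this selection step is shown below, assuming teacher images in which the identification target area is filled with the luminance value (0, 255, 0); the function names and the (input, teacher) pair representation are illustrative.

```python
import numpy as np

def target_amount(teacher_image):
    """Amount of identification target information: pixels filled with the luminance value (0, 255, 0)."""
    return int(np.all(teacher_image == (0, 255, 0), axis=-1).sum())

def select_datasets(initial_dataset, threshold=5000):
    """Sort (input, teacher) pairs by amount (descending) and split into first/second learning data sets."""
    ordered = sorted(initial_dataset, key=lambda pair: target_amount(pair[1]), reverse=True)
    first = [pair for pair in ordered if target_amount(pair[1]) >= threshold]
    second = ordered   # amount of 0 pixels or more, so it includes the first learning data set
    return first, second
```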
  • in step S2-204, the learning unit 2-44 executes learning using the first learning data set selected by the selection unit.
  • learning refers to generating a classifier by performing machine learning according to a predetermined algorithm using a learning data set.
  • U-Net is used as a predetermined algorithm. Since the learning method of U-Net is a well-known technique, detailed description thereof will be omitted in the present embodiment.
  • as the algorithm, SVM (Support Vector Machine), DNN (Deep Neural Network), CNN (Convolutional Neural Network), and GAN (Generative Adversarial Networks) can also be used. FCN (Fully Convolutional Network), SegNet, and the like can also be used as algorithms for semantic segmentation, which classifies classes in 1-pixel units.
  • the padding (inflation) in the present embodiment generates new data to be used for learning and increases the amount of data by performing, for example, at least one of rotation, inversion, luminance conversion, distortion addition, enlargement, and reduction. Inflating data can also be rephrased as data augmentation. Further, when the input data is audio data, new data to be used for learning can be generated and the amount of data increased by adding a sound combining sounds of one or more kinds of frequencies to the input data. A sketch for image data is given below.
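  • The following NumPy sketch shows a random combination of rotation, inversion, and luminance conversion applied to an image/teacher pair; distortion addition, enlargement, and reduction are omitted, and the gain range is an assumed value for illustration.

```python
import numpy as np

def augment(image, teacher, rng=None):
    """Create one new learning pair by random rotation, flipping, and brightness change."""
    if rng is None:
        rng = np.random.default_rng()
    k = int(rng.integers(0, 4))                      # rotate by 0, 90, 180, or 270 degrees
    image, teacher = np.rot90(image, k), np.rot90(teacher, k)
    if rng.random() < 0.5:                           # horizontal inversion
        image, teacher = np.fliplr(image), np.fliplr(teacher)
    gain = rng.uniform(0.8, 1.2)                     # luminance conversion on the input only
    image = np.clip(image.astype(np.float32) * gain, 0, 255).astype(np.uint8)
    return image, teacher
```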
  • the initial data set is divided in advance into a learning data set and an evaluation data set.
  • in step S2-205, the information generated in step S2-204 is stored in the information storage unit 2-46 of the classifier.
  • in step S2-206, learning is performed using the information contained in the classifier saved in step S2-205 and the second learning data set.
  • the information contained in the classifier refers to the structure, weight, bias, and the like of the model.
  • the weight and bias are parameters used when calculating the output from the input. For example, in the case of a neural network, when x in equation (2-1) is the input, w is the weight and b is the bias.
  • the model structure is not changed, and training is performed so as to optimize the weights and biases for the second training data set.
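  • Stated as a sketch, the two learning steps amount to training a model on the first data set and then continuing to train the same model (same structure, with weights w and bias b carried over) on the second data set. The train function and its arguments below are placeholders, not an API defined in this document.

```python
def generate_discriminator(model, first_dataset, second_dataset, train):
    """Two-step learning: the same model instance keeps its structure, weights, and biases between steps."""
    train(model, first_dataset)    # first learning step: data set with a larger amount of target information
    train(model, second_dataset)   # second learning step: re-optimize the weights and biases
    return model
```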
  • the display control unit 2-48 displays the learning result on the display unit 2-140.
  • the display control unit 2-48 controls to transmit the learning result to the display unit 2-140 connected to the learning device 2-100 and display the learning result on the display unit 2-140.
  • the progress of learning can be confirmed by displaying the input image, the correct answer image, and the image subjected to the inference processing using the generated discriminator side by side. Further, in order to confirm the progress of learning in more detail, the value of IoU (described later) may be displayed.
  • IoU (Intersection over Union), which is calculated from the TP (True Positive), FP (False Positive), and FN (False Negative) pixel counts as IoU = TP / (TP + FP + FN), can be used as such an indicator.
  • the learning device 2-100 in the present embodiment sequentially learns from the data set having a large amount of identification target information. Therefore, it is possible to first learn the parameters of image conversion having a large degree of conversion and gradually change the parameters, so that the identification target information can be accurately identified.
  • FIG. 15 is a diagram showing an example of the functional configuration of the learning system (identifier generation system) according to the second embodiment.
  • the learning device (identifier generator) 2-200 includes a reception unit 2-41, an acquisition unit 2-42, a selection unit 2-43, a learning unit 2-44, a discriminator 2-45, a display control unit 2-48, a data expansion unit 2-49, and a display unit 2-140.
  • the data expansion unit 2-49 expands the initial data set acquired by the acquisition unit 2-42. That is, the data expansion unit 2-49 can increase the number of images of input data.
  • FIG. 16 is a flow chart showing an example of a method for generating a classifier according to the second embodiment.
  • the reception unit 2-41 receives the data set selection condition via the operation unit 2-150.
  • the dataset selection criteria are entered by the user.
  • the data set selection condition includes at least the number of data expansions per image of the initial data set, the patch size, the method of dividing the data set, information on the data sets used for learning among the divided data sets, and the learning order.
  • the patch size is the number of vertical and horizontal pixels of the selected image when a part of the image is selected.
  • the method of classifying the data set shall be based on the threshold value of the amount of identification target information.
  • the amount of identification target information is defined by the number of pixels filled with the luminance value (0,255,0). Further, two threshold values are set: 5000 pixels and 1000 pixels.
  • in step S2-302, the acquisition unit 2-42 acquires the initial data set from the data server 2-120.
  • in step S2-303, the data expansion unit 2-43 expands the initial data set acquired by the acquisition unit 2-42.
  • 2000 input data are generated by cutting out 100 images of patch size 128 × 128 from each input image of the initial data set, which includes 20 pairs (40 images) of size 1280 × 960.
  • the data were divided into learning and evaluation sets at a ratio of 8:2.
  • FIG. 17 is a diagram showing an example of the data expansion processing procedure according to the second embodiment.
  • the process of step S2-303 will be described with reference to FIG. 17.
  • the carbon black in the melamine / alkyd resin is used as the identification target information.
  • the data expansion unit 2-43 expands the data by extracting a plurality of areas of interest with respect to the initial data set.
  • FIG. 17 shows an example in which the area of interest 2-540, the area of interest 2-541, and the area of interest 2-542 are extracted for each of the position coordinates 2-530, the position coordinates 2-531, and the position coordinates 2-532.
  • the input image in this embodiment is composed of a plurality of pixels whose positions can be specified by two-dimensional Cartesian coordinates (x, y). Assuming that the numbers of pixels in the horizontal and vertical directions of the image are x_size and y_size, respectively, 0 ≤ x ≤ x_size and 0 ≤ y ≤ y_size hold.
  • the size of the area of interest should be equal to the patch size. Further, when (x_i, y_i) is positioned near the edge of the image and the size of the region of interest would become smaller than the patch size, the periphery of the image may be filled with pixel values of 0 (a so-called padding process) so that the size of the region of interest is adjusted to match the patch size.
  • in step S2-304, the selection unit 2-43 processes the initial data set acquired by the acquisition unit 2-42 and selects the first learning data set and the second learning data set. The process of step S2-304 is described below.
  • the data sets are sorted in order from the one with the largest amount of identification target information. That is, the data sets are sorted in descending order of the number of pixels filled with the luminance value (0,255,0).
  • the data set is divided according to the threshold value received by the reception unit 2-41.
  • the learning process is determined according to the information of the data set used for learning received by the reception unit 2-41 and the learning order.
  • a data set containing images having an amount of identification target information of 5000 pixels or more is referred to as the first training data set,
  • and a data set containing images having an amount of identification target information of 1000 pixels or more is referred to as the second training data set.
  • a data set containing images in which the amount of identification target information is 0 pixels or more may be used as a third training data set for further learning.
  • the second training data set includes the first training data set,
  • and the third training data set includes the first training data set and the second training data set. This makes it possible to generate a classifier with higher discrimination accuracy.
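  • a minimal sketch of this threshold-based selection (assuming the identification-target pixel count has already been computed for each learning pair; names and the helper signature are illustrative). Because the thresholds decrease, each later data set contains the earlier ones:

    def select_training_sets(pairs, counts, thresholds=(5000, 1000, 0)):
        """Split (input, teacher) pairs into nested training data sets.

        pairs:      learning data, one (input image, correct answer image) tuple each
        counts:     identification target pixel count for each pair
        thresholds: decreasing thresholds, so each later set contains the earlier ones
        """
        return [[p for p, c in zip(pairs, counts) if c >= t] for t in thresholds]

    # first_set, second_set, third_set = select_training_sets(pairs, counts)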
  • step S2-305 the learning unit 2-44 executes learning using the first learning data set selected by the selection unit.
  • step S2-306 the information generated in step S2-305 is stored in the information storage unit 2-46 of the classifier.
  • step S2-307 the information contained in the classifier is updated by learning using the classifier information stored in the information storage unit in step S2-306 and the second learning data set.
  • the information contained in the classifier refers to the structure, weight, bias, and the like of the model.
  • further learning may be performed using the information contained in the discriminator generated by the learning using the second learning data set and the third learning data set.
  • the number of data sets may be larger; when the number of learning steps is n (n is an integer of 2 or more), the amount of identification target information preferably decreases monotonically as n increases. That is, the slope when the amount of identification target information is plotted against the number of learning steps is preferably negative.
  • the display control unit 2-48 causes the display unit 2-140 to display the learning result.
  • the learning device 2-100 in the present embodiment can accurately identify the identification target information by sequentially learning from the data set having a large amount of identification target information.
  • the data set is automatically selected, and the learning process is repeated until the evaluation value reaches the target value.
  • the red blood cell part in the image is filled with the brightness value (255,0,0)
  • the white blood cell part is filled with the brightness value (0,255,0)
  • the platelet part is filled with the brightness value (0,0,255).
  • an initial data set consisting of 2000 images (1000 pairs) of size 128×128 was used.
  • the data were divided 8:2 between learning and evaluation.
  • FIG. 18 is a diagram showing an example of input data according to the second and third embodiments.
  • FIG. 19 is a diagram showing an example of the functional configuration of the learning system (identifier generation system) according to the second to third embodiments.
  • the learning device (identifier generator) 2-300 has, as a functional configuration, at least a reception unit 2-41, an acquisition unit 2-42, a selection unit 2-43, a learning unit 2-44, a discriminator 2-45, an evaluation unit 2-50, a display control unit 2-48, and a display unit 2-140.
  • the evaluation unit 2-50 makes inferences using the discriminator stored in the discriminator 2-45, ends learning when the value of IoUavg is higher than the target value, and repeats the learning process when the value of IoUavg is lower than the target value.
  • FIG. 20 is a flow chart showing an example of a method for generating a classifier according to the second and third embodiments.
  • the reception unit 2-41 receives the data set selection condition via the operation unit 2-150.
  • the dataset selection criteria are entered by the user.
  • the selection condition includes at least the target value of IoU, the upper limit learning time, and the initial value of the class width.
  • the method of classifying the initial data set is to classify the initial data set according to the amount of identification target information.
  • the initial value of the class width is 1000.
  • step S2-402 the acquisition unit 2-42 acquires the initial data set from the data server 2-120.
  • step S2-403 the selection unit 2-43 processes the initial data set acquired by the acquisition unit 2-42, and selects the first learning data set and the second learning data set.
  • the selection unit 2-43 divides the initial data set into classes according to the initial value of the width of the class received by the reception unit 2-41.
  • the data set belonging to the class having the largest amount of identification target information is set as the first learning data set, and the combined data set belonging to the class having the largest amount of identification target information and the class having the second largest amount of identification target information is used as the second training data set.
  • step S2-404 the learning unit 2-44 executes learning using the first learning data set selected by the selection unit.
  • step S2-405 the information generated in step S2-404 is stored in the information storage unit 2-46.
  • step S2-406 the information contained in the classifier is updated by learning using the classifier information stored in the information storage unit in step S2-405 and the second learning data set.
  • the information contained in the classifier refers to the structure, weight, bias, etc. of the model.
  • step S2-407 the evaluation unit 2-50 makes an inference using the classifier 2-45, ends learning when the value of IoUavg is higher than the target value, and repeats the learning process when the value of IoUavg is lower than the target value.
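  • a minimal sketch of this evaluate-and-repeat control flow (the function names train_step and evaluate_iou_avg are placeholders, not names used in the disclosure):

    def train_until_target(train_step, evaluate_iou_avg, target_iou, max_rounds=100):
        """Repeat the learning process until IoUavg reaches the target value.

        train_step():       one round of learning (e.g. steps S2-403 to S2-406)
        evaluate_iou_avg(): inference on the evaluation images, returning IoUavg
        max_rounds:         stands in for the upper-limit learning time
        """
        iou_avg = 0.0
        for _ in range(max_rounds):
            train_step()
            iou_avg = evaluate_iou_avg()
            if iou_avg >= target_iou:
                break              # target reached: end learning
        return iou_avg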
  • the display control unit 2-48 causes the display unit 2-140 to display the learning result.
  • mIoU is defined by equation (2-3).
  • IoUavg values of 0.08 and 0.45 were obtained.
  • as described above, it is possible to provide a method of generating a classifier that can accurately identify the identification target information. Further, according to the present invention, it is possible to provide an identification method and an identification device using the classifier generated by the classifier generation method capable of accurately identifying the identification target information.
  • the learning device and the learning system in each of the above-described embodiments may be realized as a single device, or may be a form in which devices including a plurality of information processing devices are combined so as to be able to communicate with each other to execute the above-mentioned processing. Both are included in the embodiments of the present invention.
  • the above-mentioned processing may be executed by a common server device or a group of servers.
  • the common server device corresponds to the learning device according to the embodiment
  • the server group corresponds to the learning system according to the embodiment.
  • the learning device and the plurality of devices constituting the learning system need not be present in the same facility or in the same country as long as they can communicate at a predetermined communication rate.
  • the present invention can take an embodiment as, for example, a system, an apparatus, a method, a program, or a recording medium (storage medium). Specifically, it may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, an imaging device, a Web application, etc.), or to a device composed of a single device.
  • a recording medium (or storage medium) in which a software program code (computer program) that realizes the functions of the above-described embodiment is recorded is supplied to the system or device.
  • the storage medium is a computer-readable storage medium.
  • the computer (or CPU or GPU) of the system or device then reads and executes the program code stored in the recording medium.
  • the program code itself read from the recording medium realizes the function of the above-described embodiment, and the recording medium on which the program code is recorded constitutes the present invention.
  • the above document 2-1 describes a diagnostic support device that supports diagnosis of a diseased area by using deep learning. This technique performs highly accurate diagnosis by normalizing the color brightness of an image in advance and separating the diseased part and the non-diseased part.
  • the above-mentioned Document 2-2 discloses a technique for accurately identifying a nodule from a nodule candidate image by connecting a plurality of classifiers and learning while removing a sample that is clearly normal. Connecting a plurality of classifiers in this way is called a cascade type classifier, and is a technique often used to improve the discrimination accuracy.
  • the method of generating the classifier according to the present embodiment is a method of generating a classifier for estimating the identification target information in data. Specifically, it includes at least a padding step (S3-102) in which the training data of the training data set group are inflated, and a generation step (S3-103) in which a classifier is generated by performing training using the inflated training data set group (FIG. 21).
  • the training data set group includes at least the first training data set and the second training data set.
  • the first and second training data sets include training data.
  • the learning data is composed of input data and teacher data for the input data.
  • the second training data set contains a larger number of training data than the first training data set.
  • the amount of identification target information contained in the input data included in the first learning data set is larger than the amount of identification target information contained in the input data included in the second learning data set.
  • the present inventors have found that if the first learning data set and the second learning data set are used for training without going through the padding step, the identification target information cannot be accurately identified. This is considered to be due to the small amount of identification target information in the input data included in the second learning data set. That is, it was found that when learning is performed with input data having a small amount of identification target information, inference tends to miss the identification target information even when the inference data contains it. Therefore, the training data of the first training data set, whose input data have a large amount of identification target information, are inflated so that the number of training data included in the first training data set is greater than or equal to the number of training data included in the second training data set. By doing so, the amount of input data having a large amount of identification target information increases, and the identification target information can be accurately identified.
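  • a minimal sketch of inflating the first training data set until it is at least as large as the second (the augment callback is a placeholder for any of the padding operations described below):

    import random

    def inflate_first_set(first_set, second_set, augment):
        """Inflate the first training data set until it has at least as many
        training data as the second training data set.

        augment(pair) returns a new (input, teacher) pair, for example a random
        rotation or inversion applied to both images of the pair.
        """
        inflated = list(first_set)
        while first_set and len(inflated) < len(second_set):
            inflated.append(augment(random.choice(first_set)))
        return inflated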
  • a learning data set group reception step (S3-101) may be provided.
  • the data is an expression of information, which is formalized to be suitable for transmission, interpretation or processing, and can be reinterpreted as information.
  • Examples of data include image data, sound data (voice data, etc.), text data, and the like.
  • examples of the input data include input image data, sound input data, and input text data.
  • the identification target information is information to be identified in the data.
  • the data is image data
  • at least one piece of information on the position, area, and distribution of the identification target area in the image data is the identification target information.
  • the classifier generated by the generation method according to the present embodiment can estimate and extract the identification target area in the image data, which is difficult to extract visually by the user.
  • the amount of identification target information can be the number of pixels included in the identification target area.
  • the data is sound data
  • at least one of the frequency and intensity of the sound to be identified (identification target sound) in the sound data is the identification target information.
  • the classifier generated by the generation method according to the present embodiment can estimate and extract the sound to be identified in the noise-containing sound data, which is difficult for the user to extract.
  • the sound data is the voice data of a plurality of speakers
  • the voice data of at least one speaker can be used as the identification target information.
  • the information of the character, the character string, and the number of the identification target characters in the text data is the identification target information.
  • the classifier generated by the generation method according to the present embodiment can estimate and extract a character string to be identified in text data, which is difficult for the user to extract.
  • the learning data in the present embodiment is learning data for generating a discriminator, and is composed of input data and teacher data for the input data.
  • the input data is image data (input image data)
  • the teacher data is the image data with the identification target information attached. For example, the identification target area is shown in the image data.
  • the amount of identification target information contained in the input data is, for example, the ratio of the identification target area to the image data when the input data is image data. That is, a large amount of identification target information means that, for example, when the input data is image data, the ratio of the identification target region to the image data is large. Further, when the input data is sound data, a large amount of identification target information means that the intensity of the identification target sound in the sound data is large, or, when the sound data is voice data of a plurality of speakers, that the number of speakers to be extracted is large.
  • a large amount of identification target information means, for example, a large number of characters or character strings to be identified in the text data.
  • the training data set in the present embodiment includes the above-mentioned training data.
  • the number of training data contained in the second training data set is larger than the number of training data contained in the first training data set.
  • the learning data set group in the present embodiment includes at least a first learning data set and a second learning data set.
  • the training data set group may include three or more training data sets.
  • data padding means generating new input data and increasing the number of input image data by performing, for example, at least one of rotation, inversion, luminance conversion, distortion addition, enlargement, and reduction. Inflating data can also be rephrased as data augmentation.
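  • a minimal sketch of such padding for one (input, teacher) pair (assuming uint8 numpy image arrays; the parameter ranges are illustrative, and only the geometric operations are applied to the teacher image):

    import numpy as np

    def augment_pair(input_img, teacher_img, rng=None):
        """Generate one new (input, teacher) pair by rotation, inversion, and a
        luminance conversion (applied to the input image only)."""
        rng = np.random.default_rng() if rng is None else rng
        k = int(rng.integers(0, 4))                    # rotate by 0/90/180/270 degrees
        x, t = np.rot90(input_img, k), np.rot90(teacher_img, k)
        if rng.random() < 0.5:                         # horizontal inversion
            x, t = np.fliplr(x), np.fliplr(t)
        gain = rng.uniform(0.8, 1.2)                   # simple luminance conversion
        x = np.clip(x.astype(np.float32) * gain, 0, 255).astype(np.uint8)
        return x, t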
  • when the input data is sound, new sound input data can be generated and inflated by adding a sound that is a combination of one or more types of sounds to the input data.
  • the classifier generator is a classifier generator for estimating identification target information in data. Specifically, it includes at least an inflating unit 3-22 that inflates the learning data of the learning data set group and a generation unit 3-23 that generates a classifier by performing learning using the inflated learning data set group (FIG. 22).
  • the training data set group includes at least the first training data set and the second training data set.
  • the first and second training data sets include training data.
  • the learning data is composed of input data and teacher data for the input data.
  • the second training data set contains a larger number of training data than the first training data set.
  • the amount of identification target information contained in the input data included in the first learning data set is larger than the amount of identification target information contained in the input data included in the second learning data set.
  • the generation device can be configured such that the acquisition unit 3-21 acquires the learning data set group by operating the operation unit 3-150. Further, the generator according to the present embodiment can be configured to send and receive data to and from the data server 3-120.
  • the classifier according to the present embodiment is generated by the generation method and the generation device according to the present embodiment.
  • the discriminator generated by the generation method and the generation device according to the present embodiment can accurately infer the identification target information included in the input inference data.
  • the information processing apparatus includes the above-mentioned classifier, and has an inference unit that infers the identification target information included in the data for inference using the classifier.
  • the information processing method includes the above-mentioned classifier and has an inference step of inferring the identification target information included in the inference data using the classifier.
  • the area identification system 3-190 has a data input device 3-110 that captures images for learning, and a data server 3-120 that stores the captured images. It further has a data processing device 3-130 with which the user identifies areas of the image and colors the identified areas, and a classifier learning device 3-100 for training the classifier. It also has a display unit 3-140 for displaying the learning result and the frequency distribution, and an operation unit 3-150 for the user to input operation instructions to the classifier learning device.
  • the classifier learning device 3-100 acquires a learning input image and a learning correct answer image at the time of learning, learns them, and outputs a learned model.
  • inference can be performed using the classifier generated by the classifier learning device 3-100.
  • at the time of inference, an input image for inference is acquired, the generated trained model is used to extract the identification area in the input image, the entire area or its boundary is colored with a certain color, and the image can be output as an inferred image.
  • the classifier learning device 3-100 has at least a CPU 3-31, a communication IF 3-32, a ROM 3-33, a RAM 3-34, a storage unit 3-35, and a common bus 3-36.
  • the CPU 3-31 integrally controls the operation of each component of the classifier learning device 3-100.
  • the classifier learning device 3-100 may also control the operation of the data input device 3-110.
  • the data server 3-120 holds an image taken by the data input device 3-110.
  • Communication IF (Interface) 3-32 is realized by, for example, a LAN card.
  • the communication IF3-32 controls communication between the external device (for example, the data server 3-120) and the classifier learning device 3-100.
  • the ROM 3-33 is realized by a non-volatile memory or the like, stores a control program executed by the CPU 3-31, and provides a work area when the program is executed by the CPU 3-31.
  • the RAM (Random Access Memory) 3-34 is realized by a volatile memory or the like, and temporarily stores various information.
  • the storage unit 3-35 is realized by, for example, an HDD (Hard Disk Drive) or the like. It stores an operating system (OS), device drivers of peripheral devices, and various application software including the program for area identification according to the present embodiment described later.
  • the operation unit 3-150 is realized by, for example, a keyboard, a mouse, or the like, and inputs an instruction from the user into the device.
  • the display unit 3-140 is realized by, for example, a display or the like, and displays various information toward the user.
  • the operation unit 3-150 and the display unit 3-140 provide a function as a GUI (Graphical User Interface) under the control of the CPU 3-31.
  • the display unit 3-140 may be a touch panel monitor that accepts operation input, and the operation unit 3-150 may be a stylus pen.
  • Each of the above components is communicably connected to each other by a common bus 3-36.
  • the data input device 3-110 is, for example, a scanning electron microscope (SEM), a transmission electron microscope (TEM: Transmission Electron Microscope), an optical microscope, a digital camera, a smartphone, or the like.
  • the data input device 3-110 transmits the acquired image to the data server 3-120.
  • An imaging control unit (not shown) that controls the data input device 3-110 may be included in the classifier learning device 3-100.
  • the functional configuration of the area identification system including the classifier learning device 3-100 according to the present embodiment will be described with reference to FIG. 24.
  • the main body that executes the program may be one or more CPUs, and the ROM that stores the program may also be one or more memories.
  • another processor such as a GPU (Graphics Processing Unit) may be used instead of the CPU or in combination with the CPU. That is, the functions of the respective parts shown in FIG. 24 are realized by at least one processor (hardware) executing a program stored in at least one memory communicably connected to the processor.
  • the classifier learning device 3-100 has, as a functional configuration, a reception unit 3-41, an acquisition unit 3-42, a frequency distribution calculation unit 3-44, a data expansion unit 3-45, a learning unit 3-46, a storage unit 3-47, and a display control unit 3-48. Further, it may have an extraction unit 3-43.
  • the classifier learning device 3-100 is communicably connected to the data server 3-120 and the display unit 3-140.
  • the reception unit 3-41 receives the data expansion condition input from the user via the operation unit 3-150. That is, the operation unit 3-150 corresponds to an example of a reception means for receiving the setting of the expansion condition and the patch size (described later).
  • the expansion condition includes at least one of a frequency distribution (described later), a number of bins, a bin width, and an augmentation method (described later).
  • a bin is one of the mutually exclusive intervals (classes) of a frequency distribution (histogram).
  • the acquisition unit 3-42 acquires a plurality of learning data (which can also be called a learning data pair) composed of a learning input image and a learning correct answer image from the data server 3-120.
  • when the extraction unit 3-43 is provided, it extracts a plurality of small-area (data block) pairs from each of the learning input image and the learning correct answer image based on the patch size received by the reception unit 3-41.
  • the frequency distribution calculation unit 3-44 calculates the area or the number of pixels of the extraction region for each of the plurality of learning correct answer images or, if data blocks have been extracted, for each data block extracted from the learning correct answer images. Further, a frequency distribution is created with the calculated area or number of pixels as the characteristic value, using the number of bins and the bin width received by the reception unit 3-41.
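  • a minimal sketch of this frequency-distribution calculation (assuming the areas have already been computed and using numpy's histogram; how the bin width is turned into bin edges is an illustrative choice):

    import numpy as np

    def area_frequency_distribution(areas, n_bins=None, bin_width=None):
        """Create a frequency distribution of extraction-region areas.

        areas:     area (pixel count) per learning correct answer image or data block
        n_bins:    number of bins received by the reception unit, or
        bin_width: width of each bin (one of the two is given)
        """
        areas = np.asarray(areas)
        if bin_width is not None:
            bins = np.arange(0, areas.max() + bin_width, bin_width)
        else:
            bins = n_bins if n_bins is not None else 10
        counts, edges = np.histogram(areas, bins=bins)
        return counts, edges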
  • the data expansion unit 3-45 expands the data of the learning input image and the learning correct answer image based on the created frequency distribution and the instruction to execute the augmentation received by the reception unit 3-41.
  • Learning unit 3-46 learns based on the above teacher data and creates a learned model.
  • the storage unit 3-47 stores the trained model.
  • the display control unit 3-48 uses the display unit 3-140 to output information on the frequency distribution and the learning result.
  • the start command of the inference operation input from the user is received via the operation unit 3-150.
  • Acquisition unit 3-42 acquires an inference image from the data server 3-120.
  • the inference unit (not shown) makes inferences using the trained model 3-49. Subsequently, the display control unit 3-48 outputs the inference result using the display unit 3-140.
  • each part of the classifier learning device 3-100 may be realized as an independent device.
  • the classifier learning device 3-100 may be a workstation.
  • the functions of each part may be realized as software that operates on a computer, and the software that realizes the functions of each part may be realized on a server via a network such as a cloud.
  • each part is realized by software running on a computer installed in a local environment.
  • FIG. 25 is a diagram showing a processing procedure of processing executed by the classifier learning device 3-100 of the present embodiment. This embodiment is realized by the CPU 3-31 executing a program that realizes the functions of each part stored in the ROM 3-33.
  • the image to be processed will be described as a TEM image.
  • the TEM image is acquired as a two-dimensional shading image.
  • the identification target included in the image will be described as an example of the processing target object included in the processing target image.
  • step S3-201 the reception unit 3-41 receives the data expansion condition input by the user in the operation unit 3-150.
  • the data expansion condition in the present embodiment includes at least one of the number of bins, the width of the bins, and the augmentation method regarding the frequency distribution to be created.
  • step S3-202 the acquisition unit 3-42 acquires a learning data pair consisting of a learning input image and a learning correct answer image from the data server 3-120.
  • the same image pair can be used except that the entire extraction region portion or the boundary portion is colored.
  • step S3-202b a small area (data block) pair is extracted from the learning input image and the learning correct answer image according to the patch size.
  • the patch size is the number of pixels in the vertical and horizontal directions of the cropped image when a part of the target image is cropped.
  • Each pair of extracted data blocks is extracted from the same coordinates on the image.
  • step S3-203 the frequency distribution calculation unit 3-44 calculates the area of the extraction region for each of the plurality of learning correct answer images or, when the extraction unit 3-43 is provided, for each data block group extracted from the learning correct answer images, and creates a frequency distribution using this area value as the characteristic value.
  • step S3-204 the data expansion unit 3-45 expands the data of the learning input image and the learning correct answer image based on the frequency distribution and the instruction to execute the augmentation received by the reception unit 3-41.
  • specifically, augmentation methods such as inversion, enlargement, reduction, distortion addition, and brightness change are used to increase the number of learning input images and learning correct answer images so that they are included in the same frequency distribution.
  • teacher data is generated in which the frequency of the bin containing a large amount of the identification target area is higher than the frequency of the bin containing a smaller identification target area.
  • Augmentation executes, for example, rotation, inversion, enlargement, reduction, and the like, and each process can be performed as follows. That is, a blank (white) image having a size 10 times the length and width of the patch size is prepared in advance, and the image to be rotated is arranged in its center portion. Next, an affine transformation is applied to each coordinate according to Eq. (3-1) and Table 1. In the equation, x and y indicate the coordinates before conversion, and x' and y' indicate the coordinates after conversion. In a normal case, the rotation angle θ may be set between 30° and 330°. Further, a and d are enlargement/reduction ratios in the vertical and horizontal directions, respectively, and are usually set between 0.1 and 10. Finally, the center is cut out at the patch size to obtain the image after augmentation.
  • an arbitrary value is added to the x-coordinate and translated, but this arbitrary value is changed according to the y-coordinate.
  • the maximum value of any value is usually preferably between 20% and 60% of the length of the patch size in the X direction.
  • gamma correction can be used as an example of changing the brightness.
  • the gamma value at this time is usually 1.2 or more or 1 / 1.2 or less.
  • linear interpolation processing may be performed on the augmentation image. This makes it possible to smooth out a mosaic-like jagged image.
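  • a minimal sketch of one such augmentation pass (assuming a grayscale uint8 patch and scipy; Eq. (3-1) and Table 1 are not reproduced here, so a plain rotation and scaling stand in for the affine transformation, and the parameter values are merely illustrative examples within the ranges given above):

    import numpy as np
    from scipy import ndimage

    def augment_patch(patch, theta_deg=45.0, a=1.5, d=1.5, gamma=1.2):
        """One augmentation pass: place the patch on a white canvas 10x its size,
        scale and rotate it, cut the center back out at the patch size, and apply
        gamma correction as the brightness change."""
        ph, pw = patch.shape[:2]
        canvas = np.full((ph * 10, pw * 10), 255, dtype=patch.dtype)   # blank (white) image
        cy, cx = (ph * 10 - ph) // 2, (pw * 10 - pw) // 2
        canvas[cy:cy + ph, cx:cx + pw] = patch
        canvas = ndimage.zoom(canvas, (a, d), order=1)                 # enlargement / reduction
        canvas = ndimage.rotate(canvas, theta_deg, reshape=False,
                                order=1, mode='constant', cval=255)    # rotation, linear interpolation
        ch, cw = canvas.shape[:2]
        top, left = ch // 2 - ph // 2, cw // 2 - pw // 2
        out = canvas[top:top + ph, left:left + pw]                     # center cut-out at patch size
        lut = ((np.arange(256) / 255.0) ** gamma * 255).astype(patch.dtype)
        return lut[out]                                                # gamma correction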
  • the learning unit 3-46 generates a trained model 3-49 by performing machine learning according to a predetermined algorithm using the learning teacher data.
  • as the predetermined algorithm, for example, U-Net, SVM (Support Vector Machine), DNN (Deep Neural Network), CNN (Convolutional Neural Network), or the like may be used.
  • FCN (Fully Convolutional Network), SegNet, and the like can also be used as algorithms for semantic segmentation, which classifies classes in 1-pixel units.
  • an algorithm in which a so-called generative model such as GAN (Generative Adversarial Networks) is combined with the above algorithm may be used.
  • step S3-206 the storage unit 3-47 stores the trained model.
  • step S3-207 the display control unit 3-48 uses the display unit 3-140 to output information related to the frequency distribution and learning.
  • at the time of inference, the same processing as in step S3-201 (not shown) is performed except that the information received by the reception unit is an inference start command instead of the data expansion condition.
  • similarly, a process (not shown) corresponding to step S3-202 is performed except that inference input data are acquired by the acquisition unit instead of a learning data pair.
  • the area is then inferred from the inference input data using the same algorithm as in the learning process.
  • finally, a process (not shown) corresponding to step S3-207 is performed except that the inference result is output instead of the information related to the frequency distribution and learning.
  • the inference accuracy can be improved by the above processing.
  • the data handled in the third embodiment can be audio data instead of images, and the input device can be a microphone. Further, by adapting the method to voice data, for example by using the difference amount between the learning input data and the learning correct answer data instead of the area, it can be used for voice processing such as speaker identification and noise cancellation in the voice data.
  • for noise cancellation, it is possible to identify by the same method which voice component in the entire audio is unnecessary voice, that is, noise.
  • by using this classifier, the voice can be clarified by eliminating the identified noise from the entire audio.
  • the processing content is the same as that of the third embodiment except that the data expansion method is to increase / decrease the volume, frequency, and speed.
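  • a minimal sketch of such sound data expansion (assuming a 1-D float waveform in [-1, 1]; changing the speed by simple resampling also shifts the frequency content, so it is used here as a crude stand-in for the frequency change, and the gain and speed values are illustrative):

    import numpy as np

    def augment_waveform(wave, volume_gain=1.2, speed=1.1):
        """Expand sound input data by changing volume and playback speed.

        wave: 1-D float array of audio samples in [-1, 1]. Resampling to change
        the speed also shifts the frequency content of the signal.
        """
        out = wave * volume_gain                               # volume increase / decrease
        n_out = int(len(out) / speed)                          # speed increase / decrease
        src_pos = np.linspace(0, len(out) - 1, n_out)
        out = np.interp(src_pos, np.arange(len(out)), out)
        return np.clip(out, -1.0, 1.0)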
  • the classifier learning device and the area identification system in each of the above-described embodiments may be realized as a single device, or as a mode in which devices including a plurality of information processing devices are combined so as to be able to communicate with each other to execute the above-described processing. Also, both are included in the embodiments of the present invention.
  • the above-mentioned processing may be executed by a common server device or a group of servers.
  • the common server device corresponds to the classifier learning device according to the embodiment
  • the server group corresponds to the area identification system according to the embodiment.
  • the classifier learning device and the plurality of devices constituting the area identification system need only be able to communicate at a predetermined communication rate, and do not need to exist in the same facility or in the same country.
  • the present invention can take an embodiment as, for example, a system, an apparatus, a method, a program, or a recording medium (storage medium). Specifically, it may be applied to a system composed of a plurality of devices (for example, a host computer, an interface device, an imaging device, a Web application, etc.), or to a device composed of a single device.
  • a recording medium (or storage medium) in which a software program code (computer program) that realizes the functions of the above-described embodiment is recorded is supplied to the system or device.
  • the storage medium is a computer-readable storage medium.
  • the computer (or CPU or GPU) of the system or device then reads and executes the program code stored in the recording medium.
  • the program code itself read from the recording medium realizes the function of the above-described embodiment, and the recording medium on which the program code is recorded constitutes the present invention.
  • Example 1: the classifier learning device of the embodiment of the present invention was used to grasp the amount of magenta pigment in the cross-sectional TEM image of the color toner.
  • Toner preparation A pulverized toner containing a magenta pigment was obtained according to a conventional method. As a method for obtaining the pulverized toner, the methods described in JP-A-2010-140062 and JP-A-2003-233215 can be used.
  • FIG. 26 shows an example of a learning input image cut out from the TEM image of the toner.
  • FIG. 26 shows an example of a colored correct answer image for learning.
  • the patch size was 128×128, and 100 small areas (data blocks) at the same position were created from each of the learning input images and the learning correct answer images, for a total of 1800 pairs.
  • data expansion such as rotation, inversion, enlargement, reduction, distortion addition, and brightness change was performed under the conditions shown in Table 2, and learning was performed to create a classifier.
  • Example 2: In this example, learning for measuring the amount of magenta pigment in the toner was performed using the same TEM images as in Example 1, except for the data expansion condition, and a classifier was created. As shown in Table 1, the data expansion condition is such that the larger the number of pixels in the target area, the higher the frequency.
  • Comparative Example 1: In this comparative example, a discriminator was created by learning for measuring the amount of magenta pigment in the toner using the same TEM images as in Example 1, except that the data were not expanded.
  • Example 3: the classifier learning device of the embodiment of the present invention was used to identify automobile regions in order to measure the number of cars in the city from aerial photographs.
  • the images used are four aerial photographs of Potsdam City obtained from https://gdo152.llnl.gov/cowc/ (as of October 2019).
  • data expansion such as rotation, inversion, enlargement, reduction, distortion addition, and brightness change was performed under the conditions shown in Table 3, and learning was performed to create a classifier.
  • Comparative Example 2: In this comparative example, a classifier was created by learning to measure the number of cars in the city from the same aerial photographs as in Example 3, except that the data were not expanded.
  • the identification accuracy was evaluated using IoU (Intersection over Union), calculated from the numbers of True Positive (TP), False Positive (FP), and False Negative (FN) pixels as IoU = TP / (TP + FP + FN).
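  • for reference, a minimal sketch of the IoU calculation from boolean pixel masks:

    import numpy as np

    def iou(pred_mask, true_mask):
        """IoU = TP / (TP + FP + FN), computed from boolean pixel masks."""
        tp = np.logical_and(pred_mask, true_mask).sum()
        fp = np.logical_and(pred_mask, ~true_mask).sum()
        fn = np.logical_and(~pred_mask, true_mask).sum()
        denom = tp + fp + fn
        return float(tp) / float(denom) if denom else 1.0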
  • the IoU values of the examples according to the embodiment of the present invention are larger than the IoU values of the comparative examples, confirming that the identification accuracy is improved. That is, with the classifier generated by the generation method according to the above embodiment, the amount of input data having many pixels in the magenta pigment region and the automobile region increases, and it was found that the identification target information (the magenta pigment region and the automobile region) can be identified with high accuracy.
  • a classifier with high inference accuracy can be generated.
  • the fourth embodiment of the present invention is a combination of the first embodiment, the second embodiment, and the third embodiment of the present invention.
  • the effect of further improving the identification accuracy can be obtained.
  • An example of the fourth embodiment will be described in detail with reference to the drawings. The description of the configuration, function, and operation similar to those of the first to third embodiments will be omitted, and the differences from the above embodiments will be mainly described.
  • the image to be processed is a TEM image
  • the TEM image is acquired as a two-dimensional shading image.
  • carbon black in the coating film of the melamine / alkyd resin paint will be described as an example of the object to be identified.
  • the initial data set includes 25 images (50 pairs) of size 1280×960.
  • 40 pairs of 20 images were used for learning and 10 pairs of 5 images were used for evaluation.
  • 2000 input data were generated by cutting out 100 patches of patch size 128×128 from each learning image.
  • the value of the maximum area / minimum area of the identification object in one cropped image was 30 to 120, and the amount of the identification object in one cropped image was 0pixel to 16384pixel.
  • IoUavg obtained by calculating the value of IoU for each image for evaluation and averaging was used.
  • the learning processing portion is the same as that of the 3-1 embodiment, and the inference processing portion is the same as that of the 1-1 embodiment.
  • the data of the low-magnification images were expanded to the same magnification as the high-magnification images, and then the two-stage learning of the 2-1 embodiment was performed.
  • the inference processing part is the same as that of the first embodiment.
  • Table 6 shows a list of processing contents, identification targets, and evaluation values of each of the above embodiments.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Image Analysis (AREA)

Abstract

The present invention relates to an image processing device that acquires information concerning a specific region of an image on the basis of inference. The image processing device comprises an information acquisition means that acquires the information concerning the specific region inferred by inputting, into a trained model, information concerning a plurality of regions of interest extracted from the image on the basis of prescribed inference conditions. The plurality of regions of interest include a first region of interest and a second region of interest. The first region of interest and the second region of interest each have a region that overlaps the other and a region that does not overlap the other.
PCT/JP2020/039496 2019-10-31 2020-10-21 Dispositif de traitement d'image, procédé de commande de dispositif de traitement d'image, procédé de génération d'identifiant, procédé d'identification, dispositif d'identification, dispositif de génération d'identifiant, et identifiant WO2021085258A1 (fr)

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
JP2019-199099 2019-10-31
JP2019199099 2019-10-31
JP2019-217334 2019-11-29
JP2019217335 2019-11-29
JP2019217334 2019-11-29
JP2019-217335 2019-11-29

Publications (1)

Publication Number Publication Date
WO2021085258A1 true WO2021085258A1 (fr) 2021-05-06

Family

ID=75715063

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/039496 WO2021085258A1 (fr) 2019-10-31 2020-10-21 Dispositif de traitement d'image, procédé de commande de dispositif de traitement d'image, procédé de génération d'identifiant, procédé d'identification, dispositif d'identification, dispositif de génération d'identifiant, et identifiant

Country Status (2)

Country Link
JP (1) JP2021093142A (fr)
WO (1) WO2021085258A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024116309A1 (fr) * 2022-11-30 2024-06-06 日本電気株式会社 Dispositif de génération d'image, dispositif d'apprentissage, procédé de génération d'image et programme

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013114596A (ja) * 2011-11-30 2013-06-10 Kddi Corp 画像認識装置及び方法
JP2015103144A (ja) * 2013-11-27 2015-06-04 富士ゼロックス株式会社 画像処理装置及びプログラム

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013114596A (ja) * 2011-11-30 2013-06-10 Kddi Corp 画像認識装置及び方法
JP2015103144A (ja) * 2013-11-27 2015-06-04 富士ゼロックス株式会社 画像処理装置及びプログラム

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024116309A1 (fr) * 2022-11-30 2024-06-06 日本電気株式会社 Dispositif de génération d'image, dispositif d'apprentissage, procédé de génération d'image et programme

Also Published As

Publication number Publication date
JP2021093142A (ja) 2021-06-17

Similar Documents

Publication Publication Date Title
Baur et al. Generating highly realistic images of skin lesions with GANs
CN110543837B (zh) 一种基于潜在目标点的可见光机场飞机检测方法
CN109791693B (zh) 用于提供可视化全切片图像分析的数字病理学系统及相关工作流程
JP6710135B2 (ja) 細胞画像の自動分析方法及びシステム
CN111524106B (zh) 颅骨骨折检测和模型训练方法、装置、设备和存储介质
CN105144239B (zh) 图像处理装置、图像处理方法
JP5174040B2 (ja) 画像の構成要素と背景とを区別するためにコンピュータで実行される方法および画像の構成要素と背景とを区別するためのシステム
JP6235921B2 (ja) 内視鏡画像診断支援システム
JP2016534709A (ja) 顕微鏡画像内の個々の細胞を分類および識別するための方法およびシステム
Zheng et al. Improvement of grayscale image 2D maximum entropy threshold segmentation method
CN115775226B (zh) 基于Transformer的医学图像分类方法
JP2020160543A (ja) 情報処理システムおよび情報処理方法
WO2021085258A1 (fr) Dispositif de traitement d'image, procédé de commande de dispositif de traitement d'image, procédé de génération d'identifiant, procédé d'identification, dispositif d'identification, dispositif de génération d'identifiant, et identifiant
JP2021170284A (ja) 情報処理装置及びプログラム
Burget et al. Trainable segmentation based on local-level and segment-level feature extraction
JPH1091782A (ja) 濃淡画像用特定部位抽出方法
JP2018206260A (ja) 画像処理システム、評価モデル構築方法、画像処理方法及びプログラム
CN104268845A (zh) 极值温差短波红外图像的自适应双局部增强方法
Zhang et al. Simultaneous lung field detection and segmentation for pediatric chest radiographs
JP6546385B2 (ja) 画像処理装置及びその制御方法、プログラム
Santamaria-Pang et al. Cell segmentation and classification via unsupervised shape ranking
JP6425468B2 (ja) 教師データ作成支援方法、画像分類方法、教師データ作成支援装置および画像分類装置
Khalid et al. DeepMuCS: a framework for co-culture microscopic image analysis: from generation to segmentation
US20220383616A1 (en) Information processing apparatus and image processing method
Zheng et al. Improvement of grayscale image segmentation based on pso algorithm

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20883053

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20883053

Country of ref document: EP

Kind code of ref document: A1