WO2024116503A1 - Learning device, learning method, and learning program

Info

Publication number
WO2024116503A1
Authority
WO
WIPO (PCT)
Prior art keywords
image data
learning
data
xth
data group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2023/031112
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
克樹 奥野
悠 岡野
光晴 松沢
克彦 富坂
剛幸 市村
直哉 岩崎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Resonac Corp
Original Assignee
Resonac Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Resonac Corp
Priority to JP2024561166A (JPWO2024116503A1)
Priority to CN202380081173.8A (CN120283258A)
Priority to EP23897163.4A (EP4629168A1)
Publication of WO2024116503A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/26 Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06V10/7747 Organisation of the process, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/60 Type of objects
    • G06V20/69 Microscopic objects, e.g. biological cells or cellular parts
    • G06V20/698 Matching; Classification

Definitions

  • This disclosure relates to a learning device, a learning method, and a learning program.
  • Instance segmentation processing assigns class labels, at the pixel level, to the objects to be detected within image data.
  • the purpose of this disclosure is to reduce the workload of workers when generating training data.
  • a learning unit that performs a learning process using learning data including each image data included in a default image data group and each correct answer data when each image data included in the default image data group is subjected to a segmentation process, and generates a trained model;
  • a collection unit that collects an x-th output image data group output by inputting an x-th (1 ≤ x ≤ N, N is an integer equal to or greater than 2) image data group among multiple image data groups to an (x-1)-th trained model;
  • a generating unit that generates x-th learning data by acquiring a processed x-th output image data group in which each output image data included in the collected x-th output image data group is processed into each correct answer data, and adding the processed x-th output image data group to the (x-1)-th learning data;
  • the learning unit performs a learning process using the xth-order learning data to generate an xth-order trained model.
  • a second aspect of the present disclosure is the learning device according to the first aspect,
  • the learning system further includes a storage unit that stores the N-th order learning data generated by the generation unit.
  • a third aspect of the present disclosure is a learning device according to the first or second aspect,
  • the number of image data contained in the x-th image data group is greater than the number of image data contained in the default image data group.
  • a fourth aspect of the present disclosure is a learning device according to any one of the first to third aspects,
  • the number of image data included in the (x+1)th image data group is greater than the number of image data included in the xth image data group.
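  • The size relations in the third and fourth aspects can be checked mechanically. The following is a minimal sketch, not part of the disclosure: the function name `group_sizes_satisfy_aspects` and its parameters are hypothetical, and the example sizes (5, 15, 180) are taken from the application example described later.

```python
from typing import Sequence


def group_sizes_satisfy_aspects(default_size: int, group_sizes: Sequence[int]) -> bool:
    """Check the size relations of the third and fourth aspects.

    default_size: number of image data in the default (0th) image data group.
    group_sizes:  sizes of the 1st..N-th image data groups, in order.
    (Both names are hypothetical and used only for illustration.)
    """
    # Third aspect: every x-th group is larger than the default group.
    larger_than_default = all(size > default_size for size in group_sizes)
    # Fourth aspect: the (x+1)-th group is larger than the x-th group.
    strictly_growing = all(
        later > earlier for earlier, later in zip(group_sizes, group_sizes[1:])
    )
    return larger_than_default and strictly_growing


# Sizes from the application example: default 5, first 15, second 180.
example_ok = group_sizes_satisfy_aspects(5, [15, 180])
# A shrinking second group violates the fourth aspect.
example_bad = group_sizes_satisfy_aspects(5, [15, 10])
```

  • Growing the groups in this way fits the stated purpose: as later models improve, each output presumably needs less manual correction, so larger groups can be handled for a similar amount of work.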
  • a fifth aspect of the present disclosure is a learning device according to any one of the first to fourth aspects,
  • the correct answer data includes image data in which pixels are assigned class labels of normal microparticles, and image data in which pixels are assigned class labels of agglomerated microparticles.
  • a sixth aspect of the present disclosure is a learning method in which a computer of a learning device executes: a step of performing a learning process using learning data including each image data included in a default image data group and each correct answer data when each image data included in the default image data group is subjected to a segmentation process, and generating a trained model; a step of collecting an x-th output image data group output by inputting an x-th (1 ≤ x ≤ N, N is an integer equal to or greater than 2) image data group out of multiple image data groups into an (x-1)-th trained model; a step of acquiring a processed x-th output image data group in which each output image data included in the collected x-th output image data group is processed into each correct answer data, and adding the processed x-th output image data group to the (x-1)-th learning data, thereby generating x-th learning data; and a step of performing a learning process using the x-th learning data and generating an x-th trained model.
  • a seventh aspect of the present disclosure is a learning program that causes a computer of a learning device to execute: a step of performing a learning process using learning data including each image data included in a default image data group and each correct answer data when each image data included in the default image data group is subjected to a segmentation process, and generating a trained model; a step of collecting an x-th output image data group output by inputting an x-th (1 ≤ x ≤ N, N is an integer equal to or greater than 2) image data group out of multiple image data groups into an (x-1)-th trained model; a step of acquiring a processed x-th output image data group in which each output image data included in the collected x-th output image data group is processed into each correct answer data, and adding the processed x-th output image data group to the (x-1)-th learning data, thereby generating x-th learning data; and a step of performing a learning process using the x-th learning data and generating an x-th trained model.
  • This disclosure makes it possible to reduce the workload of workers when generating learning data.
  • FIG. 1 is a diagram illustrating an example of a system configuration of a learning system including a learning device according to the first embodiment.
  • FIG. 2 is a diagram for explaining a method of generating correct answer data in the instance segmentation process.
  • FIG. 3 is a diagram illustrating an example of a hardware configuration of the learning device according to the first embodiment.
  • FIG. 4 is a diagram illustrating an example of a functional configuration of the learning device according to the first embodiment.
  • FIG. 5 is a diagram showing an example of an image data group.
  • FIG. 6 is a first diagram illustrating an application example of the learning device according to the first embodiment.
  • FIG. 7 is a diagram showing an example of input data and correct answer data of learning data.
  • FIG. 8 is a second diagram illustrating an application example of the learning device according to the first embodiment.
  • FIG. 9 is a diagram showing an example of input data and correct answer data of the first additional learning data.
  • FIG. 10 is a third diagram illustrating an application example of the learning device according to the first embodiment.
  • FIG. 11 is a diagram showing an example of input data and correct answer data of the N-th additional learning data.
  • FIG. 12 is a diagram showing the result of instance segmentation processing using the Nth-order trained segmentation model.
  • FIG. 13 is a flowchart showing the flow of the learning process and the learning data generation process performed by the learning system.
  • FIG. 14 is a diagram illustrating an example of a functional configuration of a learning device according to the second embodiment.
  • Fig. 1 is a diagram showing an example of the system configuration of the learning system including the learning device according to the first embodiment.
  • the learning system 100 has an imaging device 110, an image data acquisition device 120, an image data storage device 130, a learning device 140, and an image data processing device 150.
  • the imaging device 110 captures an image of a target object and transmits the captured image data to the image data acquisition device 120.
  • the imaging device 110 may be a digital camera, an optical microscope, a scanning electron microscope (SEM), a transmission electron microscope (TEM), or the like.
  • the image data acquisition device 120 divides the image data captured by the imaging device 110, classifies the divided multiple image data into multiple image data groups, and stores them in the image data storage device 130.
  • the number of divisions of the image data is adjusted according to the processing capabilities of the learning device 140 and the image data processing device 150, for example.
  • each image data included in the 0th image data group (referred to as the default image data group) is sent to the image data processing device 150.
  • the learning device 140 acquires learning data from the image data processing device 150 in response to each image data included in the default image data group being transmitted to the image data processing device 150.
  • the learning data includes each image data included in the default image data group and each correct answer data when the image data is subjected to instance segmentation processing.
  • the learning device 140 also performs learning processing using the learning data including each acquired image data and each acquired correct answer data, and generates a trained instance segmentation model. Note that instead of generating a trained instance segmentation model using the default image data group, the subsequent learning processing may be configured to be performed using an existing trained instance segmentation model that has been trained using an appropriate training image data group.
  • the learning device 140 also reads out each image data included in the first image data group, excluding the default image data group, from among the multiple image data groups stored in the image data storage device 130.
  • the learning device 140 also inputs each read image data into the trained instance segmentation model, and transmits each output image data output as the first output image data group to the image data processing device 150.
  • the learning device 140 acquires the first additional learning data from the image data processing device 150.
  • the first additional learning data includes each image data included in the first image data group and each output image data (each correct answer data) obtained after processing each output image data included in the first output image data group.
  • the learning device 140 also generates primary learning data by adding the first additional learning data to the learning data (note that learning data generated based on the default image data group is also referred to as zeroth-order learning data). Furthermore, the learning device 140 performs a learning process using the primary learning data to generate a primary trained instance segmentation model.
  • the learning device 140 repeats the above process N times (N is an integer equal to or greater than 2). Specifically, for each x (1 ≤ x ≤ N), the learning device 140 reads out each image data included in the xth image data group from among the multiple image data groups, other than the default image data group, stored in the image data storage device 130. The learning device 140 also inputs each read image data into the (x-1)th trained instance segmentation model, and transmits the resulting output image data to the image data processing device 150 as the xth output image data group.
  • the learning device 140 acquires the xth additional learning data from the image data processing device 150.
  • the xth additional learning data includes each image data included in the xth image data group and each output image data (each correct answer data) after processing each image data included in the xth output image data group.
  • the learning device 140 generates the xth learning data by adding the xth additional learning data to the (x-1)th learning data.
  • the learning device 140 performs a learning process using the xth learning data to generate the xth learned instance segmentation model.
  • When each image data included in the default image data group is transmitted, the image data processing device 150 generates, based on each image data included in the image data group, the correct answer data for the case where that image data is subjected to instance segmentation processing. Furthermore, the image data processing device 150 associates each generated correct answer data with each image data and transmits them to the learning device 140 as learning data.
  • the image data processing device 150 processes each output image data included in the first output image data group into correct answer data.
  • the image data processing device 150 associates each output image data (each correct answer data) included in the first output image data group after processing with each image data included in the first image data group, and transmits it to the learning device 140 as the first additional learning data.
  • the image data processing device 150 processes each output image data included in the xth output image data group into correct answer data.
  • the image data processing device 150 associates each output image data (each correct answer data) included in the xth output image data group after processing with each image data included in the xth image data group, and transmits it to the learning device 140 as the xth additional learning data.
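  • The interaction between the learning device 140 and the image data processing device 150 described above can be sketched as a loop. This is an illustrative sketch only: `run_iterative_learning`, `train`, `segment`, and `process_output` are hypothetical names, and the toy stand-ins at the bottom replace the real learning process, the instance segmentation model, and the worker's corrections.

```python
from typing import Any, Callable, List, Sequence, Tuple

Sample = Tuple[Any, Any]  # (input image data, correct answer data)


def run_iterative_learning(
    default_learning_data: Sequence[Sample],
    image_groups: Sequence[Sequence[Any]],
    train: Callable[[List[Sample]], Any],
    segment: Callable[[Any, Any], Any],
    process_output: Callable[[Any, Any], Any],
) -> Tuple[List[Sample], List[Any]]:
    """Return the N-th learning data and the 0th..N-th trained models."""
    # 0th-order learning data and 0th-order trained model.
    learning_data = list(default_learning_data)
    models = [train(learning_data)]
    for group in image_groups:  # x = 1 .. N
        previous_model = models[-1]  # the (x-1)-th trained model
        # x-th output image data group.
        outputs = [segment(previous_model, image) for image in group]
        # x-th additional learning data: outputs processed into correct answers.
        additional = [
            (image, process_output(image, output))
            for image, output in zip(group, outputs)
        ]
        # x-th learning data = (x-1)-th learning data + x-th additional data.
        learning_data = learning_data + additional
        models.append(train(learning_data))  # x-th trained model
    return learning_data, models


# Toy stand-ins (assumptions for illustration only).
def toy_train(data):
    return {"trained_on": len(data)}


def toy_segment(model, image):
    return 0  # pretend the model always predicts label 0


def toy_process(image, output):
    return image % 2  # pretend the worker corrects the label


default_data = [(i, i % 2) for i in range(5)]
groups = [list(range(15)), list(range(180))]
final_data, models = run_iterative_learning(
    default_data, groups, toy_train, toy_segment, toy_process
)
```

  • In this sketch each model is retrained on the whole accumulated learning data, which matches the description of adding the xth additional learning data to the (x-1)th learning data before each learning process.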
  • image data 210 is an example of image data used as input data for learning data in the instance segmentation process.
  • Image data 220 is an example of image data used as correct answer data for learning data in the instance segmentation process.
  • In the case of image data 210, generating correct answer data for the learning data requires assigning class labels to all pixels in the image data 210 that correspond to objects.
  • For example, to assign class labels to the image data 210 shown in FIG. 2 so that a horse can be detected, the contour of the pixels that correspond to the horse must be selected accurately and class labels assigned to those pixels. For this reason, preparing multiple pieces of learning data for the instance segmentation process takes an enormous amount of time and places a heavy workload on the worker.
  • the learning device 140 is therefore configured to execute the learning process and the instance segmentation process in combination as described above. The correct answer data of the learning data is generated from scratch only for the default image data group; for the first and subsequent image data groups, it is generated by processing each output image data included in the corresponding output image data group instead of being generated from scratch.
  • FIG. 3 is a diagram showing an example of the hardware configuration of the learning device according to the first embodiment.
  • the learning device 140 has a processor 301, a memory 302, an auxiliary storage device 303, an I/F (Interface) device 304, a communication device 305, and a drive device 306.
  • The pieces of hardware in the learning device 140 are connected to one another via a bus 307.
  • the processor 301 has various computing devices such as a CPU (Central Processing Unit) and a GPU (Graphics Processing Unit).
  • the processor 301 reads various programs (e.g., a learning program, etc.) onto the memory 302 and executes them.
  • Memory 302 has a primary storage device such as a ROM (Read Only Memory) or a RAM (Random Access Memory).
  • the processor 301 and memory 302 form what is known as a computer, and the processor 301 executes various programs read onto memory 302, causing the computer to realize various functions.
  • the auxiliary storage device 303 stores various programs and various data used when the programs are executed by the processor 301.
  • the I/F device 304 is a connection device that connects to an operation device 311, which is an example of a user interface device, and a display device 312.
  • the communication device 305 is a communication device for communicating with an external device (not shown) via a network.
  • the drive device 306 is a device into which the recording medium 313 is set.
  • the recording medium 313 here includes media that record information optically, electrically, or magnetically, such as CD-ROMs, flexible disks, and magneto-optical disks.
  • the recording medium 313 may also include semiconductor memories that record information electrically, such as ROMs and flash memories.
  • the various programs to be installed in the auxiliary storage device 303 are installed, for example, by setting the distributed recording medium 313 in the drive device 306 and reading the various programs recorded on the recording medium 313 by the drive device 306.
  • the various programs to be installed in the auxiliary storage device 303 may be installed by downloading them from a network via the communication device 305.
  • FIG. 4 is a diagram showing an example of the functional configuration of the learning device according to the first embodiment.
  • a learning program is installed in the learning device 140, and by executing the learning program, the learning device 140 functions as an image data acquisition unit 410, an xth-order trained segmentation model 420, a collection unit 430, an xth-order learning data acquisition unit 440, a learning unit 450, an xth-order learning data generation unit 460, and an Nth-order learning data generation unit 470.
  • the image data acquisition unit 410 inputs each image data included in the xth image data group into the xth trained segmentation model 420.
  • the xth-order trained segmentation model 420 is an R-CNN (Region Based Convolutional Neural Networks or Regions with CNN features) based model, and is generated by the learning process performed by the learning unit 450 described below.
  • R-CNN is only an example, and models such as Mask R-CNN, YOLACT, SOLO, or models derived from these may be used as the xth-order trained segmentation model 420.
  • the xth trained segmentation model 420 performs instance segmentation processing on each image data included in the xth image data group input by the image data acquisition unit 410.
  • the xth-order trained segmentation model 420 performs instance segmentation processing to output each output image data and notify the collection unit 430.
  • the collection unit 430 collects each output image data output from the xth trained segmentation model and transmits it to the image data processing device 150 as the xth output image data group.
  • the xth-order learning data generation unit 460 receives, as learning data from the image data processing device 150, each image data included in the default image data group and the correct answer data for the case where each image data is subjected to instance segmentation processing. On receiving the learning data, the xth-order learning data generation unit 460 stores it in the xth-order learning data storage unit 480.
  • the xth-order learning data generation unit 460 also receives, as the xth additional learning data from the image data processing device 150, each image data included in the xth image data group and each output image data (each correct answer data) obtained by processing each output image data included in the xth output image data group. On receiving the xth additional learning data, the xth-order learning data generation unit 460 adds each image data and each correct answer data to the (x-1)th-order learning data already stored in the xth-order learning data storage unit 480, thereby generating the xth-order learning data.
  • the x-th order learning data acquisition unit 440 reads out the learning data stored in the x-th order learning data storage unit 480, and inputs it to the learning unit 450. In addition, each time x-th order learning data is stored in the x-th order learning data storage unit 480, the x-th order learning data acquisition unit 440 reads out the x-th order learning data and inputs it to the learning unit 450.
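  • The accumulation performed by the xth-order learning data generation unit 460 and the xth-order learning data storage unit 480 amounts to an append-only store. The sketch below is an assumption-laden illustration: the class `LearningDataStorage` and its methods are hypothetical names, and the sample counts (5 and 15) are borrowed from the application example.

```python
from dataclasses import dataclass, field
from typing import Any, List, Sequence, Tuple

Sample = Tuple[Any, Any]  # (input image data, correct answer data)


@dataclass
class LearningDataStorage:
    """Hypothetical stand-in for the xth-order learning data storage unit 480."""

    samples: List[Sample] = field(default_factory=list)
    order: int = 0  # 0 means only the default (0th-order) learning data is held

    def store_default(self, learning_data: Sequence[Sample]) -> None:
        # Learning data generated from the default image data group.
        self.samples = list(learning_data)
        self.order = 0

    def add_additional(self, additional: Sequence[Sample]) -> int:
        # Adding the x-th additional learning data to the (x-1)-th-order
        # learning data yields the x-th-order learning data.
        self.samples.extend(additional)
        self.order += 1
        return self.order


storage = LearningDataStorage()
storage.store_default([(f"default_{i}", f"answer_{i}") for i in range(5)])
order_after_first = storage.add_additional(
    [(f"first_{i}", f"answer_{i}") for i in range(15)]
)
```

  • Keeping earlier samples rather than replacing them means each learning process sees every correct answer produced so far, including the hand-made default ones.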
  • the learning unit 450 has a segmentation model 451 and a comparison/change unit 452, and inputs the "input data" of the learning data or the xth-order learning data input by the xth-order learning data acquisition unit 440 to the segmentation model 451.
  • the learning unit 450 also inputs the "correct answer data" of the learning data or the xth-order learning data input by the xth-order learning data acquisition unit 440 to the comparison/change unit 452.
  • the segmentation model 451 is a model based on R-CNN (Region Based Convolutional Neural Networks or Regions with CNN features).
  • R-CNN is an example, and the xth-order trained segmentation model 420 may be a model such as Mask R-CNN, YOLACT, SOLO, or a derived model of these models.
  • the segmentation model 451 When the segmentation model 451 receives image data included in the "input data" of the learning data or the xth-order learning data, it performs instance segmentation processing and outputs the output image data.
  • the comparison/change unit 452 updates the model parameters of the segmentation model 451 based on each output image data output from the segmentation model 451 and each correct answer data included in the "correct answer data" of the learning data or the xth-order learning data.
  • the learning unit 450 performs a learning process using the learning data and the xth-order learning data, respectively, to generate a learned segmentation model or an xth-order learned segmentation model.
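  • The division of labor between the segmentation model 451 and the comparison/change unit 452 can be illustrated with a deliberately tiny model. The sketch below swaps in a per-pixel logistic classifier and plain gradient descent on binary cross-entropy; it is NOT R-CNN and not the method of the disclosure, and the names `PixelModel` and `compare_and_update` as well as the learning rate are assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class PixelModel:
    """Toy stand-in for segmentation model 451: one weight and one bias."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def predict(self, image):
        # Per-pixel probability that the pixel carries the class label.
        return sigmoid(self.w * image + self.b)


def binary_cross_entropy(pred, target):
    pred = np.clip(pred, 1e-7, 1.0 - 1e-7)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))


def compare_and_update(model, image, correct_mask, lr=0.5):
    """Toy stand-in for comparison/change unit 452: one gradient step."""
    pred = model.predict(image)
    loss = binary_cross_entropy(pred, correct_mask)
    error = pred - correct_mask  # gradient of the loss w.r.t. the logit
    model.w -= lr * float(np.mean(error * image))
    model.b -= lr * float(np.mean(error))
    return loss  # loss measured before this update


rng = np.random.default_rng(0)
image = rng.random((16, 16))                 # "input data"
correct_mask = (image > 0.5).astype(float)   # "correct answer data"
model = PixelModel()
losses = [compare_and_update(model, image, correct_mask) for _ in range(200)]
```

  • The point of the sketch is only the data flow: input data goes to the model, the model output and the correct answer data go to the comparison step, and the comparison step changes the model parameters.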
  • the learned segmentation model and the xth-order learned segmentation model generated by the learning unit 450 function as the xth-order learned segmentation model 420.
  • the learned segmentation model generated by the learning unit 450 functions as a 0th-order learned segmentation model.
  • the 1st-order learned segmentation model generated by the learning unit 450 functions as a 1st-order learned segmentation model.
  • the Nth-order learning data generation unit 470 reads, from the xth-order learning data storage unit 480, the Nth-order learning data that was generated by adding the Nth additional learning data, and stores it in the Nth-order learning data storage unit 490.
  • Fig. 5 is a diagram showing an example of an image data group.
  • reference numeral 510 denotes one piece of image data captured by the imaging device 110 and acquired by the image data acquisition device 120.
  • Reference numeral 511 indicates an enlarged portion of the image data indicated by reference numeral 510.
  • the target object photographed in this embodiment is a group of microparticles. Instance segmentation processing is performed on the image data photographing the group of microparticles to identify individual microparticles and determine whether they are normal microparticles (normal particles) or aggregated microparticles (aggregated particles).
  • reference numerals 511a and 511b indicate aggregated particles, and the other microparticles are normal particles.
  • the microparticles covered by this disclosure include particles with diameters on the order of microns, as shown in FIG. 5 as an example.
  • the upper limit of the average particle size of the microparticles is 50 μm or less, preferably 30 μm or less, and more preferably 20 μm or less.
  • the lower limit of the average particle size of the microparticles is 1.0 μm or more, preferably 2.0 μm or more, and more preferably 2.5 μm or more.
  • the substance constituting the microparticles may be an inorganic material containing metal or the like, or an organic material.
  • the surface of the microparticles may be formed with a functional layer for manifesting a specific function.
  • reference numeral 520 indicates that the image data acquisition device 120 divides the image data into a plurality of image data, and classifies the divided plurality of image data into a plurality of image data groups. Specifically, reference numeral 520 indicates that the image data indicated by reference numeral 510 is divided into 20 pieces, and 20 pieces of image data are generated. Reference numeral 520 also indicates that, of the 20 pieces of image data, the five pieces of image data located on the first row are classified into a default image data group. Furthermore, reference numeral 520 indicates that, of the 20 pieces of image data, the 15 pieces of image data located on the second to fourth rows are classified into the first image data group.
  • the second to tenth image data (not shown in FIG. 5) are similarly divided into 20 parts each to generate 180 pieces of image data, which are then classified into the second image data group.
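  • The division and grouping shown in FIG. 5 can be sketched with array slicing. This is a minimal sketch under assumptions: the 4 x 5 grid follows from "20 pieces" with five pieces per row, while the function name `divide_into_tiles`, the 400 x 500 pixel image size, and the placeholder images are hypothetical.

```python
import numpy as np


def divide_into_tiles(image, rows=4, cols=5):
    """Split one image into rows x cols tiles, returned row by row."""
    height, width = image.shape[:2]
    tile_h, tile_w = height // rows, width // cols
    return [
        [
            image[r * tile_h:(r + 1) * tile_h, c * tile_w:(c + 1) * tile_w]
            for c in range(cols)
        ]
        for r in range(rows)
    ]


# First captured image (placeholder pixel values).
first_image = np.arange(400 * 500).reshape(400, 500)
tiles = divide_into_tiles(first_image)

# First row of tiles -> default image data group (5 pieces).
default_group = tiles[0]
# Second to fourth rows -> first image data group (15 pieces).
first_group = [tile for row in tiles[1:] for tile in row]

# Second to tenth captured images (9 placeholders) -> second image data group.
other_images = [np.zeros((400, 500)) for _ in range(9)]
second_group = [
    tile
    for img in other_images
    for row in divide_into_tiles(img)
    for tile in row
]
```

  • With these sizes the groups contain 5, 15, and 180 pieces of image data, matching the numbers given in the description.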
  • FIG. 6 is a first diagram showing an application example of the learning device according to the first embodiment.
  • the image data processing device 150 generates normal image data, in which a normal class label is assigned to the normal particles in the image data 601, and aggregated image data, in which an aggregation class label is assigned to the aggregated particles in the image data 601.
  • the image data processing device 150 then generates learning data in which the image data 601 is the input data and the normal image data and aggregated image data are the correct answer data, and transmits the learning data to the learning device 140.
  • the learning data sent from the image data processing device 150 is acquired by the x-th order learning data generation unit 460 and stored in the x-th order learning data storage unit 480 as learning data 602. Note that in the example of FIG. 6, due to space limitations, only one set of input data and correct answer data is shown as the learning data 602, but the learning data 602 includes five sets of input data and correct answer data corresponding to the number of image data included in the default image data group.
  • the five sets of input data and correct answer data contained in the learning data 602 are read out by the xth-order learning data acquisition unit 440 and input sequentially to the learning unit 450, where learning processing is performed and a trained segmentation model is generated.
  • FIG. 7 shows an example of input data and correct answer data for training data, and is an enlarged view of one set of input data and correct answer data for training data 602 shown in FIG. 6.
  • image data 601 is input data for training data 602
  • symbols 601a and 601b are agglomerated particles.
  • normal image data 701 and agglomerated image data 702 are correct answer data for training data 602.
  • FIG. 8 is a second diagram showing an application example of the learning device according to the first embodiment.
  • the image data acquisition unit 410 acquires image data 801 included in the first image data group, and inputs the image data to the xth-order learned segmentation model 420, which functions as a 0th-order learned segmentation model.
  • the xth-order learned segmentation model 420 outputs output image data 811, 812.
  • the collection unit 430 collects 15 sets of output image data as the first output image data group and transmits them to the image data processing device 150.
  • the image data processing device 150 processes the output image data as follows. For microparticles in the output image data 811 included in the first output image data group, if a normal class label has not been assigned to pixels corresponding to normal particles, a normal class label is assigned. Also, if a normal class label has been assigned to pixels corresponding to microparticles other than normal particles, the normal class label is deleted. In this way, the output image data 811 is processed into normal image data. For microparticles in the output image data 812 included in the first output image data group, if an aggregation class label has not been assigned to a pixel corresponding to an aggregated particle, an aggregation class label is assigned. Also, if an aggregation class label has been assigned to a pixel corresponding to a microparticle other than an aggregated particle, the aggregation class label is deleted. In this way, the output image data 812 is processed into aggregated image data.
  • the image data processing device 150 then generates the first additional learning data, in which the image data 801 is the input data and the normal image data and aggregated image data obtained by processing the output image data 811 and 812 are the correct answer data, and transmits it to the learning device 140.
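  • The processing of an output image into correct answer data (assigning missing class labels and deleting wrong ones) can be sketched on boolean masks. This is an illustrative sketch: the function name `process_output_mask` and the idea that the worker's corrections arrive as two boolean masks are assumptions, not details from the disclosure.

```python
import numpy as np


def process_output_mask(output_mask, missing_pixels, wrong_pixels):
    """Turn one model output mask into correct answer data for one class.

    output_mask:    True where the model assigned the class label.
    missing_pixels: True where the label should be assigned but was not.
    wrong_pixels:   True where the label was assigned but should be deleted.
    (The two correction masks are hypothetical inputs supplied by a worker.)
    """
    corrected = output_mask.copy()   # leave the model output untouched
    corrected[missing_pixels] = True   # assign the label where it was missed
    corrected[wrong_pixels] = False    # delete the label where it was wrong
    return corrected


# Toy 4 x 4 output for the "normal" class (placeholder for output image data 811).
output_811 = np.zeros((4, 4), dtype=bool)
output_811[0, 0] = True   # correctly labelled normal particle
output_811[1, 1] = True   # wrongly labelled (not a normal particle)

missing = np.zeros((4, 4), dtype=bool)
missing[2, 2] = True      # normal particle the model missed

wrong = np.zeros((4, 4), dtype=bool)
wrong[1, 1] = True

normal_image_data = process_output_mask(output_811, missing, wrong)
edited_pixels = int(np.count_nonzero(normal_image_data != output_811))
```

  • Only the pixels that differ from the model output need the worker's attention, which is the source of the workload reduction compared with labelling every pixel from scratch.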
  • the first additional learning data sent from the image data processing device 150 is acquired by the x-th order learning data generation unit 460 and added to the learning data 602 to generate primary learning data 820, which is stored in the x-th order learning data storage unit 480.
  • the learning data contains five sets of input data and correct answer data
  • the first additional learning data contains 15 sets of input data and correct answer data.
  • the primary learning data 820 contains a total of 20 sets of input data and correct answer data.
  • the 20 sets of input data and correct answer data contained in the primary learning data 820 are read out by the xth-order learning data acquisition unit 440 and sequentially input into the learning unit 450 for learning processing, generating a primary trained segmentation model.
  • FIG. 9 is a diagram showing an example of input data and correct answer data for the first additional learning data.
  • image data 801 is input data for the first additional learning data.
  • normal image data 811' and agglomerated image data 812' are correct answer data for the first additional learning data.
  • normal image data 811' is generated by processing output image data 811 included in the first output image data group to assign a normal class label to microparticles for which pixels corresponding to normal particles do not have a normal class label assigned.
  • the microparticles indicated by black arrows refer to microparticles for which pixels corresponding to normal particles do not have a normal class label assigned.
  • agglomerated image data 812' shows the output image data 812 used as is, without being processed.
  • normal image data 811' and agglomerated image data 812' are obtained by processing the output image data 811 and 812. This reduces the workload of the worker when generating learning data, compared to generating normal image data 811' and agglomerated image data 812' from scratch based on image data 801.
  • the image data acquisition unit 410 acquires image data 1001 contained in the Nth image data group, and inputs it to the xth order trained segmentation model 420, which functions as an (N-1)th order trained segmentation model.
  • the xth order trained segmentation model 420 outputs output image data 1011, 1012.
  • the collection unit 430 collects 180 sets of output image data as the Nth output image data group and transmits them to the image data processing device 150.
  • For microparticles in the output image data 1011 included in the Nth output image data group, the image data processing device 150 assigns a normal class label to pixels corresponding to normal particles if none has been assigned, and deletes any normal class label that has been assigned to pixels corresponding to microparticles other than normal particles. In this way, the output image data 1011 is processed into normal image data.
  • For microparticles in the output image data 1012 included in the Nth output image data group, it assigns an aggregation class label to pixels corresponding to aggregate particles if none has been assigned, and deletes any aggregation class label that has been assigned to pixels corresponding to microparticles other than aggregate particles. In this way, the output image data 1012 is processed into aggregation image data.
  • Image data 1001 is input data
  • the normal image data and the aggregated image data obtained by processing the output image data 1011 and 1012 are regarded as the correct answer data.
  • the N-th additional learning data set is transmitted to the learning device 140.
  • the Nth additional learning data transmitted from the image data processing device 150 is acquired by the xth learning data generation unit 460 and added to the (N-1)th learning data to generate Nth learning data 1020, which is stored in the xth learning data storage unit 480.
  • Nth learning data 1020 contains a total of 200 sets of input data and correct answer data.
  • the 200 sets of input data and correct answer data contained in the Nth-order learning data 1020 are read out by the xth-order learning data acquisition unit 440 and sequentially input into the learning unit 450, where learning processing is performed and an Nth-order trained segmentation model is generated.
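The set counts given in the description (5, then 5 + 15 = 20, then 20 + 180 = 200) can be reproduced with a short sketch. It assumes that the (N-1)th learning data in this example is the 20-set primary learning data, which is the reading under which 180 additional sets yield 200; the description does not state the value of N. The helper `add_learning_data` and the placeholder strings are hypothetical.

```python
def add_learning_data(previous, additional):
    """Return new learning data: the previous sets followed by the additional sets."""
    return list(previous) + list(additional)


# Placeholder (input data, correct answer data) pairs; only the counts matter here.
learning_data = [(f"img0_{i}", f"gt0_{i}") for i in range(5)]
first_additional = [(f"img1_{i}", f"gt1_{i}") for i in range(15)]
nth_additional = [(f"imgN_{i}", f"gtN_{i}") for i in range(180)]

primary_learning_data = add_learning_data(learning_data, first_additional)
nth_learning_data = add_learning_data(primary_learning_data, nth_additional)

sizes = [len(learning_data), len(primary_learning_data), len(nth_learning_data)]
```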
  • FIG. 11 is a diagram showing an example of input data and correct answer data for the Nth additional learning data.
  • image data 1001 is input data for the Nth additional learning data.
  • normal image data 1011' and agglomerated image data 1012' are correct answer data for the Nth additional learning data.
  • normal image data 1011' is generated by processing output image data 1011 included in the Nth output image data group to assign a normal class label to microparticles whose pixels corresponding to normal particles do not have a normal class label assigned to them.
  • the microparticles indicated by black arrows refer to microparticles whose pixels corresponding to normal particles do not have a normal class label assigned to them.
  • agglomerated image data 1012' shows the output image data 1012 included in the Nth output image data group used as is, without processing it.
  • normal image data 1011' and agglomerated image data 1012' are obtained by processing the output image data 1011 and 1012. This reduces the workload of the worker when generating learning data, compared to generating normal image data 1011' and agglomerated image data 1012' from scratch based on image data 1001.
  • the output image data processed to generate the (x+1)th additional learning data requires less processing than the output image data processed to generate the xth additional learning data. This is because the amount of data used in the learning process is increased, and the processing accuracy of the xth trained instance segmentation model is improved.
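One hypothetical way to check the claim that later rounds need less processing is to count how many pixels the worker changed between the model output and the corrected data. The function name `count_label_changes` and the boolean-mask layout are assumptions made for this sketch; the description does not define such a metric.

```python
def count_label_changes(output_mask, corrected_mask):
    """Count pixels whose class label differs between model output and corrected data.

    Both arguments are 2D lists of bool with identical shape.
    Returns (changed_pixels, changed_fraction).
    """
    changed = 0
    total = 0
    for out_row, fixed_row in zip(output_mask, corrected_mask):
        for out_px, fixed_px in zip(out_row, fixed_row):
            total += 1
            if out_px != fixed_px:
                changed += 1
    fraction = changed / total if total else 0.0
    return changed, fraction


changed, fraction = count_label_changes(
    [[True, False], [False, False]],   # model output
    [[True, True], [False, False]],    # after manual correction
)
```

Tracking this fraction per round x would show whether the correction effort actually falls as the learning data grows.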
  • FIG. 12 is a diagram showing the result of instance segmentation processing using the Nth-order trained segmentation model.
  • Performing the learning process using the Nth-order learning data improves the processing accuracy. As a result, the Nth-order trained segmentation model achieves the following.
  • pixels corresponding to normal particles can be assigned normal class labels without excess or deficiency (green in FIG. 12).
  • the pixels corresponding to the aggregated particles can be assigned the appropriate aggregation class labels (blue-purple in FIG. 12).
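A result image like the one in FIG. 12 can be rendered by mapping each class label to a display color. The sketch below assumes integer class IDs and specific RGB values (pure green, and the CSS "blueviolet" triple for blue-purple); neither the IDs nor the exact colors are given in the description.

```python
# Hypothetical class IDs and display colors (RGB).
BACKGROUND, NORMAL, AGGLOMERATE = 0, 1, 2
CLASS_COLORS = {
    BACKGROUND: (0, 0, 0),
    NORMAL: (0, 255, 0),          # green, as for normal particles in FIG. 12
    AGGLOMERATE: (138, 43, 226),  # blue-purple, as for aggregated particles in FIG. 12
}


def colorize(label_map):
    """Convert a 2D list of class IDs into a 2D list of RGB tuples."""
    return [[CLASS_COLORS[label] for label in row] for row in label_map]


overlay = colorize([[BACKGROUND, NORMAL],
                    [AGGLOMERATE, NORMAL]])
```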
  • FIG. 13 is a flowchart showing the flow of the learning process and the learning data generation process by the learning system.
  • In step S1301, the imaging device 110 captures an image of a group of microparticles, which is the target object, and generates multiple image data.
  • In step S1302, the image data acquisition device 120 divides the image data, classifies the divided image data into multiple image data groups, and stores them in the image data storage device 130.
  • The image data processing device 150 acquires the default image data group transmitted from the image data storage device 130.
  • The image data processing device 150 also generates learning data in which each image data included in the image data group is used as input data, and the normal image data and agglomerated image data generated from that image data are used as correct answer data.
  • In step S1304, the learning device 140 performs a learning process using the learning data and generates a trained instance segmentation model.
  • In step S1305, the learning device 140 assigns "1" to x.
  • In step S1306, the learning device 140 inputs the xth image data group into the (x-1)th trained instance segmentation model and collects the xth output image data group.
  • In step S1307, the image data processing device 150 processes each output image data included in the xth output image data group, and generates the xth processed output image data group, which includes each processed output image data.
  • The image data processing device 150 also generates the xth additional learning data, in which each image data included in the xth image data group is input data and each processed output image data included in the xth processed output image data group is correct answer data.
  • In step S1308, the learning device 140 adds the xth additional learning data to the (x-1)th learning data to generate the xth learning data.
  • In step S1309, the learning device 140 performs a learning process using the xth-order learning data and generates an xth-order trained instance segmentation model.
  • In step S1310, the learning device 140 increments x.
  • In step S1311, the learning device 140 determines whether x has exceeded N. If it is determined that x has not exceeded N (NO in step S1311), the process returns to step S1306.
  • If, on the other hand, it is determined in step S1311 that x has exceeded N (YES in step S1311), the process proceeds to step S1312.
  • In step S1312, the learning device 140 stores the Nth-order learning data in the Nth-order learning data storage unit 490.
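The flow of FIG. 13 from the generation of the initial learning data onward can be sketched as a loop. This is an illustration only: `annotate`, `train`, `predict` and `correct` are hypothetical callables standing in for from-scratch annotation, the learning process, instance segmentation, and manual correction of the output, and a `while x <= n` loop replaces the flowchart's check-after-increment, which is equivalent as long as there is at least one image data group.

```python
def run_learning_flow(default_group, image_groups, annotate, train, predict, correct):
    """Alternate inference, correction and retraining over N image data groups.

    default_group -- images annotated from scratch for the initial learning data
    image_groups  -- list of N image data groups (the 1st to Nth groups)
    Returns (final_model, final_learning_data).
    """
    # Initial learning data: (input data, correct answer data) pairs.
    learning_data = [(img, annotate(img)) for img in default_group]
    model = train(learning_data)                       # S1304
    n = len(image_groups)
    x = 1                                              # S1305
    while x <= n:                                      # S1311 (checked at loop top)
        group = image_groups[x - 1]
        outputs = [predict(model, img) for img in group]                     # S1306
        processed = [correct(img, out) for img, out in zip(group, outputs)]  # S1307
        additional = list(zip(group, processed))       # xth additional learning data
        learning_data = learning_data + additional     # S1308
        model = train(learning_data)                   # S1309
        x += 1                                         # S1310
    return model, learning_data                        # data is then stored (S1312)
```

With stub callables, for example a `train` that simply returns the number of sets it received, the loop can be exercised without any real model, which makes the bookkeeping easy to verify separately from the segmentation itself.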
  • As is clear from the above description, the learning device 140 according to the first embodiment:
    - performs a learning process using learning data that includes each image data included in the default image data group and the correct answer data obtained when each such image data is subjected to instance segmentation processing, thereby generating an (x-1)th order learned instance segmentation model;
    - collects the xth output image data group that is output by inputting the xth image data group, out of the multiple image data groups, into the (x-1)th order trained instance segmentation model;
    - processes each output image data included in the collected xth output image data group into correct answer data, obtains the processed xth output image data group, and adds it to the (x-1)th learning data, thereby generating the xth learning data; and
    - performs learning processing using the xth-order learning data to generate an xth-order trained instance segmentation model.
  • the learning device 140 is configured to execute a combination of learning processing and instance segmentation processing, and generates correct answer data for learning data by processing each output image data, instead of generating the correct answer data from scratch.
  • When performing learning processing using the xth-order learning data, the learning device 140 may, instead of performing additional re-learning processing on the (x-1)th-order trained instance segmentation model, perform the learning processing from scratch each time using the segmentation model that was initially prepared.
  • By generating learning data in this manner and generating an Nth-order trained instance segmentation model, the learning device 140 according to the first embodiment can avoid the biased learning that results from a small amount of learning data.
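The choice between additional re-learning on the (x-1)th-order model and learning from scratch with the initially prepared model comes down to which model each round starts from. A minimal sketch, assuming models can be represented as plain Python objects and copied with `copy.deepcopy`; the function `select_starting_model` and the dict-based model are hypothetical.

```python
import copy


def select_starting_model(initial_model, previous_model, from_scratch):
    """Pick the model that the next learning round starts from.

    from_scratch=True  -- restart from the initially prepared segmentation model
    from_scratch=False -- continue from the (x-1)th-order trained model
    A deep copy is returned so that training never mutates the stored models.
    """
    base = initial_model if from_scratch else previous_model
    return copy.deepcopy(base)


initial = {"weights": [0.0, 0.0]}     # hypothetical untrained model
previous = {"weights": [1.5, -0.3]}   # hypothetical (x-1)th-order trained model
restart = select_starting_model(initial, previous, from_scratch=True)
resume = select_starting_model(initial, previous, from_scratch=False)
```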
  • the image data acquisition device 120 and the image data processing device 150 are configured as separate entities from the learning device 140.
  • the functions of the image data acquisition device 120 and the functions of the image data processing device 150 may be realized in the learning device 140.
  • FIG. 14 is a diagram showing an example of the functional configuration of a learning device according to the second embodiment.
  • the difference from the functional configuration of the learning device 140 according to the first embodiment shown in FIG. 4 is that the learning device 1400 has a classification unit 1410 and an image data processing unit 1420.
  • the classification unit 1410 divides the multiple image data captured by the imaging device 110 and acquired by the image data acquisition unit 410. The classification unit 1410 also classifies the divided multiple image data into multiple image data groups.
  • the classification unit 1410 also notifies the image data processing unit 1420 of the default image data group from among the multiple image data groups. Furthermore, each time the xth trained segmentation model 420 is updated, the classification unit 1410 notifies the image data processing unit 1420 of the xth image data group and inputs it to the xth trained segmentation model 420.
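The two jobs of the classification unit 1410, dividing captured image data and classifying the pieces into image data groups, can be sketched as below. The tile size, the policy of dropping partial tiles at the edges, and the group sizes (5, 15, 180, borrowed from the set counts in the first embodiment) are assumptions for illustration; the description does not specify how images are divided or how groups are sized.

```python
def split_into_tiles(image, tile_h, tile_w):
    """Divide a 2D image (list of rows) into non-overlapping tiles.

    Partial tiles at the right and bottom edges are dropped (assumed policy).
    """
    rows, cols = len(image), len(image[0])
    tiles = []
    for top in range(0, rows - tile_h + 1, tile_h):
        for left in range(0, cols - tile_w + 1, tile_w):
            tiles.append([row[left:left + tile_w] for row in image[top:top + tile_h]])
    return tiles


def classify_into_groups(items, group_sizes):
    """Split items into consecutive groups of the given sizes."""
    groups, start = [], 0
    for size in group_sizes:
        groups.append(items[start:start + size])
        start += size
    return groups


image = [[r * 6 + c for c in range(6)] for r in range(4)]   # 4 x 6 test image
tiles = split_into_tiles(image, tile_h=2, tile_w=3)         # four 2 x 3 tiles
groups = classify_into_groups(list(range(200)), [5, 15, 180])
```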
  • When the image data processing unit 1420 is notified of the default image data group by the classification unit 1410, it generates the correct answer data that would result from subjecting each image data included in that group to instance segmentation processing. The image data processing unit 1420 then associates each generated correct answer data with the corresponding image data and notifies the x-th learning data generation unit 460 of the result as learning data.
  • the image data processing unit 1420 processes each output image data included in the xth output image data group into correct answer data.
  • the image data processing unit 1420 associates each output image data (each correct answer data) included in the xth output image data group after processing with each image data included in the xth image data group, and notifies the xth learning data generation unit 460 of the correspondence as the xth additional learning data.
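Associating each processed output image data with its source image data amounts to pairing two equally long sequences. A small sketch, with the hypothetical helper `build_additional_learning_data` and placeholder strings standing in for real image data:

```python
def build_additional_learning_data(images, processed_outputs):
    """Pair each image (input data) with its processed output (correct answer data)."""
    if len(images) != len(processed_outputs):
        raise ValueError("each image needs exactly one processed output")
    return list(zip(images, processed_outputs))


additional = build_additional_learning_data(
    ["image_a", "image_b"],
    ["corrected_a", "corrected_b"],
)
```

The length check guards against silently dropping data, which plain `zip` would do if one output were missing.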
  • the second embodiment can achieve the same effects as the first embodiment.
  • both the functions of the image data acquisition device 120 and the functions of the image data processing device 150 are implemented in the learning device 1400.
  • either the functions of the image data acquisition device 120 or the functions of the image data processing device 150 may be implemented in the learning device 1400.
  • the number of microparticles to which class labels should be assigned to pixels in order to generate ground truth data from each image data included in the default image data group is large.
  • the number of microparticles to which class labels should be assigned or deleted in order to process each output image data included in the xth output image data group into each ground truth data is small.
  • the number of microparticles to which class labels should be assigned or deleted from pixels in order to process the data into each correct answer data is smaller than that of each output image data included in the xth output image data group.
  • the number of microparticles to which class labels should be assigned or deleted from pixels does not increase, and the worker's workload does not increase. For this reason, by using the classification method described above, it is possible to accumulate a large amount of correct answer data without increasing the worker's workload.
  • the target object is described as a microparticle with a diameter on the order of microns, but the size of the target object may be on the order of submicrons or millimeters.
  • segmentation model 451 has been described as a model that performs instance segmentation processing, but it may also be a model that performs segmentation processing other than instance segmentation processing.
  • the Nth-order trained segmentation model is used, for example, in a process of distinguishing between normal particles and agglomerated particles.
  • the determination result by the Nth-order trained segmentation model may be used to control the manufacturing conditions of an apparatus for manufacturing microparticles.
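As an illustration of how the determination result might feed into manufacturing control, the sketch below computes the share of agglomerated particles and compares it with a limit. The class names, the 0.2 threshold and both function names are hypothetical; the description does not state which quantity is used or how conditions are adjusted.

```python
def agglomerate_fraction(particle_classes):
    """Return the fraction of particles classified as agglomerated."""
    if not particle_classes:
        return 0.0
    count = sum(1 for cls in particle_classes if cls == "agglomerate")
    return count / len(particle_classes)


def needs_adjustment(fraction, threshold=0.2):
    """Flag the batch when the agglomerate fraction exceeds the (assumed) limit."""
    return fraction > threshold


fraction = agglomerate_fraction(["normal", "normal", "agglomerate", "normal"])
flag = needs_adjustment(fraction)
```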
  • Learning system; 110: Imaging device; 120: Image data acquisition device; 140: Learning device; 150: Image data processing device; 410: Image data acquisition unit; 420: x-th order trained segmentation model; 430: Collection unit; 440: x-th order learning data acquisition unit; 450: Learning unit; 460: x-th order learning data generation unit; 470: N-th order learning data generation unit; 1400: Learning device; 1410: Classification unit; 1420: Image data processing unit

PCT/JP2023/031112 2022-11-30 2023-08-29 Learning device, learning method, and learning program Ceased WO2024116503A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
JP2024561166A JPWO2024116503A1 2022-11-30 2023-08-29
CN202380081173.8A CN120283258A (zh) 2022-11-30 2023-08-29 Learning device, learning method, and learning program
EP23897163.4A EP4629168A1 (en) 2022-11-30 2023-08-29 Training device, training method, and training program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-191298 2022-11-30
JP2022191298 2022-11-30

Publications (1)

Publication Number Publication Date
WO2024116503A1 (ja) 2024-06-06

Family

ID=91323298

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/031112 Ceased WO2024116503A1 (ja) 2022-11-30 2023-08-29 Learning device, learning method, and learning program

Country Status (4)

Country Link
EP (1) EP4629168A1
JP (1) JPWO2024116503A1
CN (1) CN120283258A
WO (1) WO2024116503A1

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
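JP2019101535A (ja) 2017-11-29 2019-06-24 Konica Minolta, Inc. Teacher data creation device and method, and image segmentation device and method
JP2019204312A (ja) * 2018-05-24 2019-11-28 JEOL Ltd. Biological tissue image processing device and method
WO2020111048A1 (ja) * 2018-11-26 2020-06-04 Dai Nippon Printing Co., Ltd. Computer program, learning model generation device, display device, particle identification device, learning model generation method, display method, and particle identification method
JP2021039748A (ja) * 2019-08-30 2021-03-11 Canon Inc. Information processing device, information processing method, information processing system, and program
JP2022191298A (ja) 2020-03-31 2022-12-27 Showa Denko Materials Co., Ltd. Method for producing cell suspension, and reagent for evaluating cell suspension or microcarrier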

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4629168A1

Also Published As

Publication number Publication date
EP4629168A1 (en) 2025-10-08
JPWO2024116503A1 2024-06-06
CN120283258A (zh) 2025-07-08

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23897163

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 202380081173.8

Country of ref document: CN

WWE Wipo information: entry into national phase

Ref document number: 2024561166

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2023897163

Country of ref document: EP

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2023897163

Country of ref document: EP

Effective date: 20250630

WWP Wipo information: published in national office

Ref document number: 202380081173.8

Country of ref document: CN

WWP Wipo information: published in national office

Ref document number: 2023897163

Country of ref document: EP