WO2023032438A1 - Regression estimation device and method, program, and method of generating trained model - Google Patents
- Publication number
- WO2023032438A1 (PCT/JP2022/025288)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- regression
- data
- model
- estimation device
- estimated
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- the present disclosure relates to a regression estimation device and method, a program, and a method of generating a trained model, and more particularly to an information processing technology that performs regression estimation for estimating numerical values of objective variables based on input data.
- Non-Patent Document 1 discloses a configuration for a classification problem in which, when integrating multiple inference results, the weight of inference results near the boundary value (0.5) is reduced.
- Non-Patent Document 2 discloses a configuration in which inference results obtained from a plurality of linear regression models are integrated with a median weighted for each model.
- a plurality of regression models are used to estimate a valence value and an arousal value as music impression values from a music sound signal, and a method of integrating the plurality of estimation results obtained by the plurality of regression models is described.
- in Non-Patent Document 3, when solving a regression problem, multiple images are created by rotating or flipping a single image and are input to a learning model, and the final result is obtained by averaging the estimated values obtained, one for each input.
- a normal deep regression model does not output a confidence level for the estimated value, but in Non-Patent Document 4, a regression confidence level is obtained by using the mean and standard deviation of a normal distribution as the outputs of the deep learning model.
- Non-Patent Document 2 uses a weighted median, but this method is intended for linear regression and does not dynamically change the weight according to the input.
- in Non-Patent Document 3, the final result is obtained by simple averaging of the multiple estimated values obtained from the learning model, so the influence of inputs unsuitable for estimation cannot be reduced by weighting.
- the method described in Non-Patent Document 4 only obtains the degree of certainty of regression, and is not a mechanism for integrating estimation results.
- the present disclosure has been made in view of such circumstances, and has an object to provide a regression estimation device and method, a program, and a method of generating a trained model that can improve estimation accuracy when the estimation results obtained from multiple different inputs to one (single) regression model are integrated to derive one estimated value.
- a regression estimation device includes one or more processors and one or more storage devices storing programs executed by the one or more processors, wherein the one or more processors, by executing program instructions, accept input of a plurality of data, input the plurality of data into a single regression model to estimate, from the plurality of data, a plurality of sets each consisting of an estimated value and the likelihood of that estimated value, and integrate the plurality of sets of estimation results based on the estimated values and likelihoods estimated by the regression model.
- a plurality of data are input to a single regression model to obtain, for each input, a set of an estimated value and its likelihood; these sets of estimation results are integrated based on the estimated values and their likelihoods, and one estimated value is obtained as the integrated result. Since the likelihood of each estimated value is taken into account during integration, the estimated value derived as the integration result (final estimated value) can be highly accurate.
- Single regression model means one type of regression model, and may have multiple processing modules that operate as the same regression model.
- estimate includes the concepts of inference and prediction.
- likelihood (probability) encompasses the concepts of certainty and confidence.
- the one or more processors may estimate, for each of the plurality of sets, a probability distribution having the estimated value as a random variable, based on the estimated value and the likelihood of the estimated value, integrate the probability distributions of the plurality of sets to generate an integrated distribution, and determine the final estimated value based on the integrated distribution.
- the one or more processors may estimate, for each of the plurality of sets, a probability distribution having the estimated value as a random variable, based on the estimated value and the likelihood of the estimated value, and specify, based on the probability distributions of the plurality of sets, the value that maximizes the product of the probabilities at the same value of the random variable.
- the one or more processors may be configured to variable-transform the estimated value output from the regression model into the first parameter of a probability distribution model, and to variable-transform the value indicating the likelihood output from the regression model into the second parameter of the probability distribution model.
- the probability distribution model may be Laplace distribution.
- the probability distribution model may be Gaussian distribution.
- the one or more processors may be configured to perform a logarithmic transformation that takes the logarithm of each probability distribution and, when integrating, to calculate the sum of the logarithmic probability densities corresponding to the probability distributions of the plurality of sets and find the value that maximizes the joint logarithmic probability density.
- the regression model includes a learned model generated by performing machine learning using training data in which input data and teacher signals are associated.
- the regression model may be constructed using a convolutional neural network.
- the plurality of data may be medical images.
- the multiple data may be slice images within the same series.
- the plurality of data may be configured to include different partial images included in the 3D image.
- the plurality of data may include generated images generated based on different partial images included in the 3D image.
- the plurality of data may be configured to include different partial images included in the time-series images.
- the plurality of data may include images with different resolutions.
- the estimated value may be a value indicating the position of a specific target.
- the estimated value may be a value indicating the position of the partial image in the 3D image.
- the estimated value may be the age of the person in the image that is the input data.
- a regression estimation method is a regression estimation method executed by a processor, the method including: receiving input of a plurality of data; inputting the plurality of data into a single regression model to estimate, from the plurality of data, a plurality of sets each consisting of an estimated value and the likelihood of the estimated value; and integrating the plurality of sets of estimation results based on the estimated values and likelihoods estimated by the regression model.
- a program causes a computer to realize: a function of receiving input of a plurality of data; a function of inputting the plurality of data into a single regression model to estimate, from the plurality of data, a plurality of sets each consisting of an estimated value and the likelihood of the estimated value; and a function of integrating the plurality of sets of estimation results based on the estimated values and likelihoods estimated by the regression model.
- a method of generating a trained model is a method of generating a trained model used as a regression model that receives data input and outputs, from the data, an estimated value and the likelihood of the estimated value, the method including: using training data in which input data and a teacher signal are associated, inputting the input data to a learning model and obtaining from the learning model an estimated value and a value indicating the likelihood of the estimated value; variable-transforming the estimated value output from the learning model into the first parameter of a probability distribution model, and variable-transforming the value indicating the likelihood output from the learning model into the second parameter of the probability distribution model; calculating a loss function using the first parameter, the second parameter, and the teacher signal; and updating the parameters of the learning model based on the calculation result of the loss function.
- a method for generating a trained model is understood as an invention of a method for manufacturing (producing) a trained model.
- the probability distribution model is a Laplace distribution
- the first parameter is μ
- the second parameter is b
- the teacher signal is t.
- the probability distribution model is a Gaussian distribution
- the first parameter is μ
- the second parameter is σ²
- the teacher signal is t.
- log σ² + (t − μ)² / (2σ²) can be used as the loss function.
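As an illustration of the Gaussian case, the loss term given in the text, log σ² + (t − μ)² / (2σ²), can be written per sample as follows. This is a minimal sketch: the function name `gaussian_loss` and the use of plain Python floats are assumptions made for illustration and are not part of the disclosure.

```python
import math


def gaussian_loss(mu: float, sigma2: float, t: float) -> float:
    """Per-sample loss log(sigma^2) + (t - mu)^2 / (2 * sigma^2).

    mu     : first parameter (estimated value after variable transformation)
    sigma2 : second parameter (variance, must be positive)
    t      : teacher signal (ground truth)
    """
    if sigma2 <= 0.0:
        raise ValueError("sigma2 must be positive")
    return math.log(sigma2) + (t - mu) ** 2 / (2.0 * sigma2)
```

For a large error, a larger variance lowers the loss, which is how the model can express low confidence in an estimate.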
- highly accurate estimates can be derived from multiple data inputs for a single regression model.
- FIG. 1 is a conceptual diagram showing an outline of processing by the regression estimation device according to the first embodiment.
- FIG. 2 is an explanatory diagram showing an example 1 of processing in the number-of-seconds distribution estimating unit.
- FIG. 4 shows an example of a graph of the number-of-seconds distribution (Laplace distribution) determined by the parameters μ and b estimated by the number-of-seconds distribution estimator.
- FIG. 5 is an explanatory diagram of an example of processing in the integrating unit and the maximum point specifying unit.
- FIG. 6 is a schematic illustration of an example of a machine learning method for generating a regression model to be applied to the seconds distribution estimator.
- FIG. 7 is an explanatory diagram of a loss function used during training.
- FIG. 8 is a block diagram schematically showing an example of the hardware configuration of the regression estimation device according to the first embodiment.
- FIG. 9 is a functional block diagram showing an overview of processing functions of the regression estimation device according to the first embodiment.
- FIG. 10 is an explanatory diagram showing example 2 of processing in the number-of-seconds distribution estimation unit of the regression estimation device according to the second embodiment.
- FIG. 11 shows an example of a graph of the number-of-seconds distribution (Gaussian distribution) determined by the parameters μ and σ² estimated by the number-of-seconds distribution estimator.
- FIG. 12 is an explanatory diagram illustrating an example of processing in the integration unit and maximum point identification unit of the regression estimation device according to the second embodiment.
- FIG. 13 is a schematic explanatory diagram of an example of a machine learning method for generating a regression model applied to the number-of-seconds distribution estimator in the second embodiment.
- FIG. 14 is an explanatory diagram showing Modified Example 1 of data used for input to the regression estimation device.
- FIG. 15 is an explanatory diagram showing Modified Example 2 of data used for input to the regression estimation apparatus.
- FIG. 16 is a block diagram showing a configuration example of a medical information system to which the regression estimation device is applied.
- FIG. 1 is a conceptual diagram showing an outline of processing by the regression estimation device 10 according to the first embodiment.
- an example is described in which a plurality of slice images sampled at equal intervals from a patient's three-dimensional CT data, captured using a CT (Computed Tomography) device, are used as input, and the elapsed time (number of seconds) since injection of the contrast agent is estimated based on the input slice images.
- unless explicitly stated otherwise, the term "seconds" in this specification refers to the number of seconds indicating the elapsed time since injection of the contrast medium.
- the slice image may also be called a tomographic image.
- a slice image may be understood as a substantially two-dimensional image (cross-sectional image).
- the regression estimation device 10 can be realized using computer hardware and software.
- the regression estimation device 10 includes a seconds distribution estimating unit 14 that receives input of an image IM and estimates a probability distribution of the number of seconds (hereinafter referred to as a "seconds distribution"), an integration unit 16 that integrates the plurality of seconds distributions PD estimated from a plurality of inputs, and a maximum point identification unit 18 that identifies the number of seconds with the maximum probability from the new distribution obtained by the integration process (hereinafter referred to as the "integrated distribution"). The number of seconds identified by the maximum point identification unit 18 is output as the final result.
- in FIG. 1, three seconds distribution estimating units 14 are drawn in order to show the flow of processing when three different images IM are input, but the seconds distribution estimating unit 14 itself is one and the same (single) processing unit.
- FIG. 2 is an explanatory diagram showing Example 1 of processing in the number-of-seconds distribution estimation unit 14.
- the number-of-seconds distribution estimator 14 includes a regression estimator 22 and a variable converter 24 .
- the regression estimation unit 22 is trained by machine learning so as to receive an input of the image IM and output an estimated value Oa of the number of seconds and a score value Ob indicating the likelihood (certainty factor) of the estimated value Oa.
- a trained model as a regression model applied to the regression estimation unit 22 is configured using, for example, a convolutional neural network (CNN).
- the numerical range of the estimated value Oa of the number of seconds output from the regression estimation unit 22 may be "−∞ < Oa < ∞", and the numerical range of the likelihood score value Ob may be "−∞ < Ob < ∞".
- the regression model is not limited to CNN, and various machine learning models can be applied.
- the function of formula (2) is an example of a mapping that converts the likelihood score value Ob to a value b in the positive region.
- Parameter μ is an example of a "first parameter" in the present disclosure.
- Parameter b is an example of a "second parameter" in the present disclosure.
- the Laplace distribution is applied as the probability distribution model of the number of seconds distribution.
- the Laplace distribution is represented by the function of the following equation (3).
- the reason for converting the likelihood score value Ob to a positive value b is that the Laplace distribution is applied as the probability distribution model for the seconds distribution: if the parameter b were negative (b < 0), the Laplace distribution would not hold as a probability distribution, so it is necessary to ensure that the parameter b is positive (b > 0).
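Equation (3) is not reproduced in this text. The sketch below assumes the standard Laplace density with location μ and scale b, P(x) = exp(−|x − μ| / b) / (2b), and shows why b must be positive. The function name `laplace_pdf` is hypothetical.

```python
import math


def laplace_pdf(x: float, mu: float, b: float) -> float:
    """Standard Laplace density with location mu and scale b (assumed form of equation (3))."""
    if b <= 0.0:
        # With b <= 0 the expression is no longer a valid probability density.
        raise ValueError("scale parameter b must be positive")
    return math.exp(-abs(x - mu) / b) / (2.0 * b)
```

A small b gives a sharp peak around μ (high confidence), while a large b spreads the density over a wide range of seconds (low confidence).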
- FIG. 4 shows an example of a graph of the number-of-seconds distribution determined by the parameters μ and b estimated by the number-of-seconds distribution estimation unit 14.
- the position indicated by the dashed line GT in the drawing corresponds to the correct number of seconds (ground truth).
- estimating a set of the estimated value Oa and the likelihood score value Ob from the input image IM substantially corresponds to estimating the number-of-seconds distribution.
- the estimated value Oa of the number of seconds is an example of a "random variable" in this disclosure.
- FIG. 5 is an explanatory diagram showing an example of processing in the integrating section 16 and the maximum point specifying section 18.
- to simplify the explanation, FIG. 5 shows an example of integrating two seconds distributions estimated by the seconds distribution estimating unit 14, but the same applies to integrating three or more seconds distributions.
- graph GD1 shown in the upper left of FIG. 5 is an example of a seconds distribution (probability distribution P1).
- the integration unit 16 takes the logarithm of the estimated number-of-seconds distribution, converts it into a logarithmic probability density, takes the sum of a plurality of logarithmic probability densities, and integrates them. This corresponds to finding the product of probabilities over the same number of seconds.
- Graph GL1 in FIG. 5 is an example of logarithmic probability density logP1 obtained by taking the logarithm of probability distribution P1.
- graph GD2 shown in the lower left of FIG. 5 is an example of a seconds distribution (probability distribution P2).
- graph GL2 in FIG. 5 is an example of the logarithmic probability density logP2 obtained by taking the logarithm of the probability distribution P2.
- the rightmost graph GLS in FIG. 5 is an example of the joint logarithmic probability density that integrates the logarithmic probability density logP1 and the logarithmic probability density logP2.
- the distribution shown in graph GLS is an example of "integrated distribution" in the present disclosure.
- the maximum point identifying unit 18 identifies the value x of the parameter μ that maximizes the logarithmic probability from the integrated logarithmic probability density.
- the processing in the maximum point identification unit 18 can be expressed by the following equation (4).
- the target function of argmin shown on the right side of the equal sign in the second row of Equation (4) corresponds to the loss function during training in machine learning, which will be described later.
- the right side of the equal sign described in the third row corresponds to the weighted median formula.
- the parameter bi corresponding to the weight for integration dynamically changes according to the output of the regression estimator 22 .
- the input value (maximum point) at which the joint log probability is maximized is μ1, and μ1 is selected as the final estimation result (final result).
- μ1 is the estimation result for the image IM1 among the plurality of input slice images.
- the calculation is performed by converting the seconds distributions into logarithmic probability densities, and the value at which the integrated distribution is maximized is derived as the final result.
- the integrated distribution takes the form of a weighted median.
- a highly accurate estimated value can be obtained by suppressing the influence of the outlier.
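The integration and maximum-point steps can be sketched as follows. Equation (4) is not reproduced in this text, so the sketch assumes the standard Laplace log density, under which maximizing the sum of log densities is equivalent to taking a weighted median of the μi with weights 1/bi. The function names are hypothetical.

```python
import math
from typing import Sequence


def joint_log_density(x: float, mus: Sequence[float], bs: Sequence[float]) -> float:
    """Sum of Laplace log densities at x (the integrated distribution in log form)."""
    return sum(-math.log(2.0 * b) - abs(x - mu) / b for mu, b in zip(mus, bs))


def weighted_median(mus: Sequence[float], bs: Sequence[float]) -> float:
    """Return the mu_i that minimizes sum_i |x - mu_i| / b_i (weights w_i = 1 / b_i)."""
    if len(mus) != len(bs) or not mus:
        raise ValueError("mus and bs must be non-empty and of equal length")
    pairs = sorted(zip(mus, (1.0 / b for b in bs)))
    half = 0.5 * sum(w for _, w in pairs)
    acc = 0.0
    for mu, w in pairs:
        acc += w
        if acc >= half:  # first point where the cumulative weight reaches half
            return mu
    return pairs[-1][0]
```

For example, with μ = [30, 35, 90] and b = [2, 1, 20], the weighted median is 35, whereas the simple mean would be about 51.7: the third estimate has a large b (low confidence), so it barely influences the result.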
- Description of medical images used for input: in the DICOM (Digital Imaging and Communications in Medicine) standard, which defines the format and communication protocol of medical images, a series ID is defined within a unit called a study ID, which is an identification code (ID) for specifying the type of examination.
- CT imaging of a range including the liver is performed a plurality of times (four times in this case) at different imaging timings as described below.
- CT data is three-dimensional data composed of a plurality of continuous slice images (tomographic images), and the aggregate of the slice images constituting the three-dimensional data (a set of continuous slice images) is called an "image series".
- CT data is an example of a "three-dimensional image" in this disclosure.
- for example, "study 1" is given as the study ID for a specific patient's liver contrast imaging examination, and a unique ID is assigned to each series: "series 1" for CT data obtained by imaging before contrast medium injection, "series 2" for CT data obtained by imaging 35 seconds after injection, "series 3" for CT data obtained by imaging 70 seconds after injection, and "series 4" for CT data obtained by imaging 180 seconds after injection. Therefore, CT data can be identified by a combination of study ID and series ID.
- the correspondence relationship between the series ID and the imaging timing (elapsed time after injection of the contrast medium)
- the number of seconds is estimated by image analysis using multiple slice images in the same series as input. "By image analysis” means by processing based on pixel values that constitute image data.
- FIG. 6 is a schematic explanatory diagram of an example of a machine learning method for generating a regression model applied to the number-of-seconds distribution estimator 14.
- Training data used for machine learning includes an image TIM as input data and correct data (teacher signal t) corresponding to the input.
- the image TIM may be a slice image that constitutes an image series of three-dimensional CT data, and the teacher signal t may be a value indicating the number of seconds (ground truth) elapsed from the injection of the contrast agent when the series to which the slice image belongs was captured.
- a plurality of training data are generated by linking the corresponding teacher signal t.
- "Linking" may also be referred to as correspondence or association.
- "Training” is synonymous with "learning.”
- the same teacher signal t may be associated with slices of the same image series. That is, the teacher signal t may be associated with each image series.
- each slice is associated with a corresponding teacher signal t to generate multiple training data.
- a set of training data thus generated is used as a training data set.
- when the image TIM read from the training data set is input to the learning model 20, the learning model 20 outputs the estimated value Oa of the number of seconds and the likelihood score value Ob.
- the estimated value Oa and the score value Ob are variable-transformed into the parameter μ and the parameter b of the probability distribution model by the variable transformation unit 24.
- the loss function L used during training is defined by the following equation (5).
- the subscript i is an index that identifies each slice.
- FIG. 7 is an explanatory diagram of the loss function used during training.
- the loss function is a negative log-likelihood, which directly optimizes the formula used for regression estimation by learning. Learning maximizes the log-likelihood of the teacher signal t in seconds.
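Equation (5) is not reproduced in this text. The following sketch assumes that the loss is the negative log-likelihood of the Laplace distribution, log(2·bi) + |ti − μi| / bi, averaged over slices i. Averaging rather than summing, and the function name `laplace_nll`, are assumptions made for illustration.

```python
import math
from typing import Sequence


def laplace_nll(mus: Sequence[float], bs: Sequence[float], ts: Sequence[float]) -> float:
    """Mean over slices of log(2 * b_i) + |t_i - mu_i| / b_i (assumed form of the loss)."""
    if not (len(mus) == len(bs) == len(ts)) or not mus:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for mu, b, t in zip(mus, bs, ts):
        if b <= 0.0:
            raise ValueError("each b must be positive")
        total += math.log(2.0 * b) + abs(t - mu) / b
    return total / len(mus)
```

For a fixed error |t − μ|, this loss is smallest when b equals the error, so training encourages b to track how wrong the estimate tends to be for a given input.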
- the graph of the loss function shown in Equation (5) with respect to the parameter μ is the graph GRμ in FIG. 7.
- the graph GRμ has a stable slope with respect to the parameter μ.
- the graph of the loss function shown in Equation (5) with respect to the parameter b is the graph GRb in FIG. 7.
- graph GRb has an unstable slope with respect to parameter b. In regions where the value of b is small, 1/b is dominant, and in regions where the value of b is large, log b is dominant.
- by using, for the variable transformation of the parameter b, a function that asymptotically approaches −1/x as x → −∞ and exp(x) as x → ∞, this instability can be canceled.
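Formula (2) is not reproduced in this text. The function below is an assumed example chosen only because it has the stated asymptotic behavior; it is not the mapping of the disclosure. It is always positive and is continuous with a continuous slope at x = 0. The name `to_positive_scale` is hypothetical.

```python
import math


def to_positive_scale(x: float) -> float:
    """Map a raw score x in (-inf, inf) to a positive scale b.

    Assumed example: 1 / (1 - x) for x < 0, which behaves like -1/x as x -> -inf,
    and exp(x) for x >= 0. Both branches equal 1 with slope 1 at x = 0.
    Note: math.exp raises OverflowError for very large x.
    """
    if x < 0.0:
        return 1.0 / (1.0 - x)
    return math.exp(x)
```

With this choice, 1/b = 1 − x for negative scores and log b = x for non-negative scores, so the term that dominates the loss is linear in the raw score in both regions.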
- the machine learning method of the learning model 20 described using FIGS. 6 and 7 is an example of the "learned model generating method" in the present disclosure.
- the regression estimation device 10 includes a processor 102, a non-transitory tangible computer-readable medium 104, a communication interface 106, an input/output interface 108, and a bus 110.
- the processor 102 includes a CPU (Central Processing Unit). Processor 102 may include a GPU (Graphics Processing Unit). Processor 102 is coupled to computer-readable media 104 , communication interface 106 , and input/output interface 108 via bus 110 . The processor 102 reads various programs and data stored in the computer-readable medium 104 and executes various processes.
- the computer-readable medium 104 includes, for example, a memory 104A that is a main storage device and a storage 104B that is an auxiliary storage device.
- the storage 104B is configured using, for example, a hard disk drive (HDD) device, a solid state drive (SSD) device, an optical disk, a magneto-optical disk, a semiconductor memory, or an appropriate combination thereof.
- Various programs, data, and the like are stored in the storage 104B.
- Computer-readable medium 104 is an example of a "storage device" in this disclosure.
- the memory 104A is used as a work area for the processor 102, and is used as a storage unit that temporarily stores programs and various data read from the storage 104B.
- a program stored in the storage 104B is loaded into the memory 104A, and the processor 102 executes the instructions of the program, whereby the processor 102 functions as means for performing various processes defined by the program.
- the memory 104A stores a regression estimation program 130 executed by the processor 102, various data, and the like.
- the regression estimation program 130 includes a trained model trained by machine learning, and causes the processor 102 to execute the processing described with reference to FIG.
- the communication interface 106 performs wired or wireless communication processing with an external device, and exchanges information with the external device.
- the regression estimation device 10 is connected to a communication line (not shown) via a communication interface 106 .
- the communication line may be a local area network or a wide area network.
- the communication interface 106 can serve as a data acquisition unit that receives input of data such as images.
- the regression estimation device 10 may further include an input device 114 and a display device 116.
- Input device 114 and display device 116 are connected to bus 110 via input/output interface 108 .
- the input device 114 may be, for example, a keyboard, mouse, multi-touch panel, or other pointing device, voice input device, or any suitable combination thereof.
- the display device 116 is an output interface that displays various information.
- the display device 116 may be, for example, a liquid crystal display, an organic electro-luminescence (OEL) display, a projector, or an appropriate combination thereof.
- FIG. 9 is a functional block diagram showing an outline of processing functions of the regression estimation device 10 according to the first embodiment.
- by executing the regression estimation program 130 stored in the memory 104A, the processor 102 of the regression estimation device 10 functions as the data acquisition unit 12, the number-of-seconds distribution estimation unit 14, the integration unit 16, the maximum point identification unit 18, and the output unit 19.
- the data acquisition unit 12 accepts input of data to be processed.
- the data acquisition unit 12 acquires an image IMi, which is a slice image sampled from CT data.
- the data acquisition unit 12 may perform processing for cutting out slice images from CT data at regular intervals, or may acquire slice images sampled in advance by a processing unit (not shown) or the like.
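Cutting out slices at regular intervals can be sketched as index selection over the slices of a series. The helper name `sample_slice_indices`, the choice of the center of each interval, and the number of samples are assumptions made for illustration; the text does not specify them.

```python
from typing import List


def sample_slice_indices(num_slices: int, num_samples: int) -> List[int]:
    """Return indices spread at (approximately) equal intervals over range(num_slices)."""
    if num_slices <= 0 or num_samples <= 0:
        raise ValueError("num_slices and num_samples must be positive")
    num_samples = min(num_samples, num_slices)  # cannot sample more slices than exist
    step = num_slices / num_samples
    # Take the slice at the center of each of the num_samples equal-width intervals.
    return [int(step * (k + 0.5)) for k in range(num_samples)]
```

Each selected slice would then be passed to the regression model as one image IMi.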
- the image IMi captured via the data acquisition unit 12 is input to the regression estimation unit 22 of the seconds distribution estimation unit 14 .
- the regression estimator 22 outputs a set of an estimated value Oa of the number of seconds and a score value Ob indicating the likelihood of the estimated value Oa from each of the input images IMi.
- the estimated value Oa output from the regression estimating unit 22 is converted into the parameter μi of the probability distribution model in the variable transforming unit 24, and the likelihood score value Ob output from the regression estimating unit 22 is converted into the parameter bi of the probability distribution model in the variable transforming unit 24. These two parameters μi and bi determine the estimated probability distribution Pi of the number of seconds.
- the integration unit 16 performs processing to integrate multiple probability distributions Pi obtained by inputting multiple images IMi.
- the logarithm of the probability distribution Pi is taken in the logarithmic conversion unit 26 and converted into the logarithmic probability density logPi, and the integrated distribution is obtained by calculating the sum of the logarithmic probability densities logPi in the integrated distribution generation unit 28 .
- the maximum point specifying unit 18 specifies the value of the number of seconds (maximum point) with the maximum probability from the integrated distribution, and outputs the value of the specified number of seconds as the final estimated value. Note that the maximum point identification unit 18 may be configured to be incorporated in the integration unit 16 .
- the output unit 19 is an output interface for displaying the final estimated value specified by the maximum point specifying unit 18 and providing it to other processing units.
- The output unit 19 may include a processing unit that generates data for display and/or performs data conversion for transmitting data to the outside.
- the number of seconds estimated by the regression estimation device 10 may be displayed on a display device (not shown) or the like.
- the contrast-enhanced state may be estimated from the number of seconds estimated by the regression estimation device 10, and the estimated result of the contrast-enhanced state classification may be displayed on a display device or the like instead of or together with the number of seconds.
- the regression estimation device 10 may be incorporated in a medical image processing device for processing medical images acquired in medical institutions such as hospitals. Also, the processing functions of the regression estimation device 10 may be provided as a cloud service.
- the method of regression estimation processing executed by the processor 102 is an example of the “regression estimation method” in the present disclosure.
- The hardware configuration of the regression estimation device 10 according to the second embodiment may be the same as that of the first embodiment. Regarding the second embodiment, points different from the first embodiment will be described. In the second embodiment, the processing contents of the number-of-seconds distribution estimation unit 14, the integration unit 16, and the maximum point identification unit 18 differ from those in the first embodiment.
- FIG. 10 is an explanatory diagram showing Example 2 of processing in the number-of-seconds distribution estimation unit 14 of the regression estimation device 10 according to the second embodiment. Instead of the processing described with reference to FIG. 2, the processing of FIG. 10 is applied.
- The variable transformation unit 24 in the second embodiment converts the likelihood score value Ob into the parameter σ² using the following equation (7) instead of equation (2).
- σ² = 1 / log(1 + exp(−Ob)) (7)
- σ² plays the role of certainty. σ² corresponds to the variance and σ to the standard deviation.
- the Gaussian distribution is represented by the function of the following formula (8).
- The reason for converting the score value Ob into a positive value (σ²) is the same as in the first embodiment: if the parameter σ² were negative, the Gaussian distribution would not be a valid probability distribution, so the parameter σ² must be guaranteed to be positive (σ² > 0).
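The variable transformation and the resulting Gaussian density can be sketched as follows. This assumes equation (7) reads as the reciprocal of a softplus of the negated score, and uses the standard Gaussian density for equation (8), which is not reproduced in this text. The function names are hypothetical.

```python
import math


def softplus(z: float) -> float:
    """Numerically stable log(1 + exp(z)); positive for finite z of moderate size."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def score_to_variance(ob: float) -> float:
    """Equation (7) as read here: sigma^2 = 1 / log(1 + exp(-Ob)).
    Note: for extremely large Ob the softplus underflows to 0, so a real
    implementation would need to clamp the score or add a small epsilon."""
    return 1.0 / softplus(-ob)


def gaussian_pdf(x: float, mu: float, sigma2: float) -> float:
    """Standard Gaussian density with mean mu and variance sigma2 (> 0)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma2)) / math.sqrt(2.0 * math.pi * sigma2)
```

Because the softplus is positive, the reciprocal is positive as well, which is what guarantees σ² > 0 for any real-valued score in the working range.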
- FIG. 11 shows an example of a graph of the number-of-seconds distribution determined by the parameters μ and σ² estimated by the number-of-seconds distribution estimation unit 14.
- FIG. 12 is an explanatory diagram showing an example of processing in the integration unit 16 and the maximum point identification unit 18 of the regression estimation device 10 according to the second embodiment. Here, an example of integrating two number-of-seconds distributions estimated by the number-of-seconds distribution estimating unit 14 is shown.
- A graph GD1g shown in the upper left of FIG. 12 is an example of the number-of-seconds distribution (probability distribution P1) represented by the parameters μ1 and σ1² estimated by the number-of-seconds distribution estimation unit 14 of FIG.
- the integration unit 16 takes the logarithm of the estimated number-of-seconds distribution, converts it into a logarithmic probability density, takes the sum of a plurality of logarithmic probability densities, and integrates them. This corresponds to finding the product of probabilities over the same number of seconds.
- a graph GL1g in FIG. 12 is an example of the logarithmic probability density logP1 obtained by taking the logarithm of the probability distribution P1.
- A graph GD2g shown in the lower left of FIG. 12 is an example of the number-of-seconds distribution (probability distribution P2) represented by the parameters μ2 and σ2² estimated by the number-of-seconds distribution estimation unit 14.
- a graph GL2g in FIG. 12 is an example of the logarithmic probability density obtained by taking the logarithm of the probability distribution P2.
- the rightmost graph GLSg in FIG. 12 is an example of the joint logarithmic probability density that integrates the logarithmic probability density logP1 and the logarithmic probability density logP2.
- the maximum point identifying unit 18 identifies the value x that maximizes the logarithmic probability from the integrated joint logarithmic probability density.
- the processing in the maximum point identification unit 18 can be represented by the following equation (9).
- the target function of argmin shown on the right side of the equal sign in the second row of Equation (9) corresponds to the loss function during training in machine learning, which will be described later. Also, the right side of the equal sign described in the third row corresponds to the weighted average formula.
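Equation (9) is not reproduced in this text. Under the Gaussian model of the second embodiment, one way to write the three rows described above that is consistent with that description is sketched below. The exact constants and notation of the original equation are an assumption; terms that do not depend on x do not affect the minimizer.

```latex
\begin{aligned}
\hat{x} &= \operatorname*{arg\,max}_{x} \sum_{i} \log P_i(x) \\
        &= \operatorname*{arg\,min}_{x} \sum_{i} \left[ \frac{(x-\mu_i)^2}{2\sigma_i^2} + \frac{1}{2}\log \sigma_i^2 \right] \\
        &= \frac{\sum_{i} \mu_i / \sigma_i^2}{\sum_{i} 1 / \sigma_i^2}
\end{aligned}
```

The last row follows by setting the derivative of the second-row objective with respect to x to zero, which gives a mean of the μi weighted by the inverse variances.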
- the input value (maximum point) x that maximizes the logarithmic probability is selected as the final estimation result (final result).
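As a hedged illustration of this step for Gaussian distributions, the sketch below computes the maximum point in closed form as an inverse-variance weighted average and cross-checks it against a brute-force search over summed log densities. The function names, the sample parameters, and the grid are hypothetical.

```python
import math


def gaussian_log_pdf(x, mu, sigma2):
    """Log density of a Gaussian with mean mu and variance sigma2."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (x - mu) ** 2 / (2.0 * sigma2)


def integrate_gaussians(params):
    """Closed-form maximizer of sum_i log N(x; mu_i, sigma2_i):
    the mean of the mu_i weighted by 1 / sigma2_i."""
    weights = [1.0 / sigma2 for _, sigma2 in params]
    return sum(w * mu for w, (mu, _) in zip(weights, params)) / sum(weights)


def grid_argmax(params, grid):
    """Brute-force check: the grid value with the largest summed log density."""
    return max(grid, key=lambda x: sum(gaussian_log_pdf(x, mu, s2) for mu, s2 in params))


# Two per-image estimates (mu_i, sigma2_i); the first is more certain.
params = [(30.0, 4.0), (36.0, 16.0)]
x_closed = integrate_gaussians(params)
x_grid = grid_argmax(params, [0.1 * k for k in range(0, 1001)])
```

The more certain estimate (smaller variance) pulls the final result toward its own mean.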
- FIG. 13 is an explanatory diagram schematically showing an example of a machine learning method for generating a regression model applied to the number-of-seconds distribution estimator 14 in the second embodiment.
- Training data used for learning may be the same as in the first embodiment.
- Regarding FIG. 13, the points that differ from FIG. 6 will be described.
- When the image TIM read from the training data set is input to the learning model 20, the learning model 20 outputs the estimated value Oa of the number of seconds and the likelihood score value Ob.
- The estimated value Oa and the likelihood score value Ob are transformed into the parameters μ and σ² of the probability distribution model by the variable transformation unit 24.
- the loss function L during training is defined by the following equation (10).
- the error backpropagation method is applied using the loss sum represented by Equation (11), and the learning model 20 is trained using the stochastic gradient descent method in the same way as in normal CNN learning.
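Equations (10) and (11) are not reproduced in this text. The sketch below assumes the loss is the Gaussian negative log-likelihood (up to an additive constant) of the correct number of seconds, summed over the training samples; this is a common choice and agrees with the argmin objective mentioned for Equation (9), but it is not confirmed by the source. All function names are hypothetical, and no gradient computation is shown.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def variance_from_score(ob):
    """Equation (7) as read here: sigma^2 = 1 / log(1 + exp(-Ob))."""
    return 1.0 / softplus(-ob)


def gaussian_nll(target, mu, sigma2):
    """Assumed per-sample loss: negative log-likelihood without the constant term."""
    return (target - mu) ** 2 / (2.0 * sigma2) + 0.5 * math.log(sigma2)


def batch_loss(samples):
    """Assumed loss sum; each sample is (correct_seconds, mu, Ob)."""
    return sum(gaussian_nll(t, mu, variance_from_score(ob)) for t, mu, ob in samples)
```

With a loss of this form, a large error is penalized less when the model reports a large variance, while the log-variance term discourages reporting a large variance everywhere.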
- By training the learning model 20 using a plurality of training data comprising a plurality of image series, the parameters of the learning model 20 are optimized and a trained model is obtained.
- the learned model thus obtained is applied to the number-of-seconds distribution estimation unit 14 .
- slice images obtained by extracting slices at equal intervals from three-dimensional CT data were used as input, but the image to be processed is not limited to this.
- For example, a MIP (Maximum Intensity Projection) image MIPimg composed from a plurality of slice images at regular intervals, or an average image AVEimg generated from a plurality of slice images, may be used.
- Data used for input is not limited to a two-dimensional image, and may be a three-dimensional image (three-dimensional data). For example, 3D partial images at different positions within the same series may be used as input.
- the input to the number-of-seconds distribution estimation unit 14 may be a combination of multiple types of data elements. For example, as shown in FIG. 15, at least one of three-dimensional images (a set of multiple slice images), slice images, MIP images, and average images, which are partial images of the same series of CT data, is used as an input. A combination of these image types may be input to the seconds distribution estimating unit 14 to obtain an output of the estimated value of seconds and its likelihood. For example, the combination of the average image and the MIP image may be input to the seconds distribution estimation unit 14 to estimate the seconds distribution.
- MIP images and average images are examples of generated images generated from partial images of three-dimensional CT data.
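A minimal sketch of how such generated images can be computed from a stack of slices is shown below. It assumes each slice is a 2D list of pixel values with identical dimensions; the function names are hypothetical, and a real pipeline would more likely use an array library.

```python
def mip_image(slices):
    """Maximum intensity projection: per-pixel maximum over a stack of 2D slices."""
    return [[max(px) for px in zip(*rows)] for rows in zip(*slices)]


def average_image(slices):
    """Average image: per-pixel mean over a stack of 2D slices."""
    n = len(slices)
    return [[sum(px) / n for px in zip(*rows)] for rows in zip(*slices)]


# Two 2x2 slices from the same (toy) series.
stack = [
    [[1, 5], [3, 2]],
    [[4, 0], [3, 6]],
]
mip = mip_image(stack)
avg = average_image(stack)
```

Either result has the same height and width as a single slice, so it can be fed to the same two-dimensional regression model.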
- FIG. 16 is a block diagram showing a configuration example of a medical information system 200 including a medical image processing device 220. The regression estimation device 10 described in the first and second embodiments is incorporated into the medical image processing device 220, for example.
- a medical information system 200 is a computer network built in a medical institution such as a hospital.
- The medical information system 200 includes a modality 230 that captures medical images, a DICOM server 240, a medical image processing device 220, an electronic chart system 244, and a viewer terminal 246. These elements are connected via a communication line 248. The communication line 248 may be a local communication line within a medical institution. Also, part of the communication line 248 may be a wide area communication line.
- Specific examples of the modality 230 include a CT device 231, an MRI (Magnetic Resonance Imaging) device 232, an ultrasonic diagnostic device 233, a PET (Positron Emission Tomography) device 234, an X-ray diagnostic device 235, an X-ray fluoroscopic diagnostic device 236, and an endoscope device 237.
- the types of modalities 230 connected to the communication line 248 can be combined in various ways for each medical institution.
- the DICOM server 240 is a server that operates according to the DICOM specifications.
- the DICOM server 240 is a computer that stores and manages various data including images captured using the modality 230, and has a large-capacity external storage device and a database management program.
- the DICOM server 240 communicates with other devices via a communication line 248 to transmit and receive various data including image data.
- the DICOM server 240 receives image data generated by the modality 230 and other various data via a communication line 248, and stores and manages them in a recording medium such as a large-capacity external storage device.
- the storage format of image data and communication between devices via the communication line 248 are based on the DICOM protocol.
- the medical image processing apparatus 220 can acquire data from the DICOM server 240 or the like via the communication line 248.
- the medical image processing apparatus 220 performs image analysis and various other processes on medical images captured by the modality 230 .
- The medical image processing apparatus 220 performs, for example, a process of recognizing a lesion area from an image, a process of identifying a classification such as a disease name, a segmentation process of recognizing an area such as an organ, and various other analysis processes such as computer-aided diagnosis or computer-aided detection (CAD).
- The medical image processing apparatus 220 can also send processing results to the DICOM server 240 and the viewer terminal 246. Note that the processing functions of the medical image processing apparatus 220 may be installed in the DICOM server 240 or the viewer terminal 246.
- Various data stored in the database of the DICOM server 240 and various information including the processing results generated by the medical image processing apparatus 220 can be displayed on the viewer terminal 246.
- the viewer terminal 246 is a terminal for viewing images called a PACS (Picture Archiving and Communication Systems) viewer or a DICOM viewer.
- a plurality of viewer terminals 246 can be connected to the communication line 248 .
- the form of the viewer terminal 246 is not particularly limited, and may be a personal computer, a workstation, a tablet terminal, or the like.
- A program that causes a computer to implement the processing functions of the regression estimation device 10 can be recorded on a computer-readable medium, which is a non-transitory tangible information storage medium such as an optical disk, a magnetic disk, or a semiconductor memory, and the program can be provided through this information storage medium.
- Part or all of the processing functions of the regression estimation device 10 may be realized by cloud computing, or may be provided as a SaaS (Software as a Service) offering.
- The various processors include a CPU, which is a general-purpose processor that runs programs and functions as various processing units; a GPU, which is a processor specialized for image processing; a PLD (Programmable Logic Device) such as an FPGA (Field Programmable Gate Array); and a dedicated circuit such as an ASIC (Application Specific Integrated Circuit).
- a single processing unit may be composed of one of these various processors, or may be composed of two or more processors of the same type or different types.
- one processing unit may be configured by a plurality of FPGAs, a combination of CPU and FPGA, or a combination of CPU and GPU.
- a plurality of processing units may be configured by one processor.
- As one example, a single processor is configured by combining one or more CPUs and software, and this processor functions as a plurality of processing units. Another example is the use of an SoC (System on Chip).
- the various processing units are configured using one or more of the above various processors as a hardware structure.
- the hardware structure of these various processors is, more specifically, an electrical circuit that combines circuit elements such as semiconductor elements.
- the first and second embodiments have the following advantages.
- Since a number of seconds with a high degree of certainty can be estimated by image analysis of the input image, the number of seconds can be estimated with high confidence even for images whose DICOM tags contain no ancillary information on the imaging time, or images in which incorrect time information is recorded.
- Depending on its size, it may be difficult to input and process three-dimensional CT data all at once as an input to the regression model. However, as described in the first and second embodiments, by sequentially processing two-dimensional images such as slice images, which are parts of the three-dimensional CT data, and integrating these estimation results, an appropriate estimated value that takes the entire input data into account can be derived.
- The estimate obtained from the joint probability distribution takes the form of a weighted median, so even when the estimation result for some of the inputs deviates greatly due to artifacts or the like, the final estimate is less affected by such outliers and is more robust.
- An image used for the final result (estimation of the final estimated value) can be extracted from the multiple images used for input.
- the technology of the present disclosure can be applied to various uses, and there are various aspects of the types of data used for input and target variables to be estimated.
- the technology of the present disclosure is applicable to, for example, the following regression estimation problem.
- Application Example 1: Regression using multiple slice images. The technology is also applicable to the task of recognizing the three-dimensional position of a target organ from slice images (two-dimensional images).
- the technology of the present disclosure can be applied to regression estimation of the coordinates of a rectangular parallelepiped (three-dimensional bounding box) indicating the position of an organ from a plurality of slice images within the same series.
- The organ referred to here is an example of the "specific object" in the present disclosure, and the coordinates of the bounding box are an example of the "value indicating the position of the specific object" in the present disclosure.
- the technique of the present disclosure can be applied to the process of estimating the slice position (position within CT data) of an input slice image.
- the slice position here is an example of the “partial image position” in the present disclosure.
- Application Example 2: Regression on time-series images such as moving images, or on multiple images. Specifically, for example, the technology of the present disclosure can be applied to processing for estimating the age of a person appearing in images such as moving images. The technology of the present disclosure can also be applied to regression estimation processing when scene recognition is performed on images such as moving images.
- Application Example 3: Regression from sound data.
- The technology of the present disclosure can be applied to regression estimation processing, for example, when performing emotion recognition from voice.
- Application Example 4: Regression of one value from multiple resolutions. Specifically, for example, the technology of the present disclosure can be applied to a process of regressively estimating the position of a bounding box for object detection from multiple images with different resolutions.
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Medical Informatics (AREA)
- Mathematical Analysis (AREA)
- Computational Mathematics (AREA)
- Pure & Applied Mathematics (AREA)
- Mathematical Optimization (AREA)
- Quality & Reliability (AREA)
- Radiology & Medical Imaging (AREA)
- Algebra (AREA)
- Probability & Statistics with Applications (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Biomedical Technology (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Molecular Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Image Analysis (AREA)
Priority Applications (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2023545114A JPWO2023032438A1 | 2021-08-31 | 2022-06-24 | |
| US18/587,965 US20250005405A1 (en) | 2021-08-31 | 2024-02-27 | Regression estimation device, regression estimation method, program, and method for generating trained model |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2021-141458 | 2021-08-31 | | |
| JP2021141458 | 2021-08-31 | | |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/587,965 Continuation US20250005405A1 (en) | 2021-08-31 | 2024-02-27 | Regression estimation device, regression estimation method, program, and method for generating trained model |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2023032438A1 (ja) | 2023-03-09 |
Family
ID=85412040
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2022/025288 Ceased WO2023032438A1 (ja) | 2021-08-31 | 2022-06-24 | Regression estimation device and method, program, and method for generating trained model |
Country Status (3)
| Country | Link |
|---|---|
| US (1) | US20250005405A1 |
| JP (1) | JPWO2023032438A1 |
| WO (1) | WO2023032438A1 |
Families Citing this family (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20240119647A1 (en) * | 2022-09-30 | 2024-04-11 | Canon Medical Systems Corporation | Nuclear medicine diagnosis apparatus, data processing method, and non-transitory computer-readable storage medium |
Citations (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2009230751A (ja) * | 2008-02-25 | 2009-10-08 | Omron Corp | Age estimation device |
| JP2013003662A (ja) * | 2011-06-13 | 2013-01-07 | Sony Corp | Information processing device and method, and program |
2022
- 2022-06-24: WO application PCT/JP2022/025288 (patent WO2023032438A1, ja), status: not active, ceased
- 2022-06-24: JP application JP2023545114A (patent JPWO2023032438A1, ja), status: active, pending
2024
- 2024-02-27: US application US18/587,965 (patent US20250005405A1, en), status: active, pending
Also Published As
| Publication number | Publication date |
|---|---|
| US20250005405A1 (en) | 2025-01-02 |
| JPWO2023032438A1 | 2023-03-09 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
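| KR101857624B1 (ko) | Medical diagnosis method reflecting clinical information and apparatus using the same | |
| CN112825619B (zh) | Method and system for training machine learning algorithms using digitally reconstructed radiographs | |
| CN113287149B (zh) | Medical image analysis using machine learning and anatomical vectors | |
| CN111369562B (zh) | Image processing method and apparatus, electronic device, and storage medium | |
| CN111223158B (zh) | Artifact correction method for cardiac coronary artery images and readable storage medium | |
| JP2018505705A (ja) | System and method for transformation of medical imaging using machine learning | |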
| Ahn et al. | Multi-frame attention network for left ventricle segmentation in 3D echocardiography | |
| US12573062B2 (en) | Image processing method, image processing device, program, and trained model | |
| US20240119750A1 (en) | Method of generating language feature extraction model, information processing apparatus, information processing method, and program | |
| US20240193781A1 (en) | Contrast state determination device, contrast state determination method, and program | |
| US20240193777A1 (en) | Medical image processing device, medical image processing method, and program | |
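| CN113316803B (zh) | Correcting segmentation of medical images using statistical analysis of historical corrections | |
| WO2019208130A1 (ja) | Medical document creation support device, method, and program, trained model, and learning device, method, and program | |
| CN115965785B (zh) | Image segmentation method, apparatus, device, program product, and medium | |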
| US20250054136A1 (en) | A computer implemented method, a method and a system | |
| US20250005405A1 (en) | Regression estimation device, regression estimation method, program, and method for generating trained model | |
| KR102556646B1 (ko) | Method and apparatus for generating medical images | |
| US20230046302A1 (en) | Blood flow field estimation apparatus, learning apparatus, blood flow field estimation method, and program | |
| CN119150993A (zh) | Multimodal generative model training method and apparatus, and text generation method and apparatus | |
| EP3667674A1 (en) | Method and system for evaluating images of different patients, computer program and electronically readable storage medium | |
| EP4724977A1 (en) | Method for determining a probability of the presence of at least one candidate lesion in at least one medical image | |
| Garcia-Jara et al. | Efficient and motion correction-free myocardial perfusion segmentation in small mri data using deep transfer learning from cine images: A promising framework for clinical implementation | |
| Tian et al. | A prediction model of microcirculation disorder in myocardium based on ultrasonic images | |
| JP7376715B2 (ja) | Progress prediction device, method for operating progress prediction device, and progress prediction program | |
| Saber et al. | Left Ventricle Segmentation Using Scale-Independent Multi-Gate UNET in MRI Images |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22864026; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2023545114; Country of ref document: JP |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 22864026; Country of ref document: EP; Kind code of ref document: A1 |