WO2024053046A1 - Information processing system, endoscope system, trained model, information storage medium, and information processing method - Google Patents
- Publication number
- WO2024053046A1 (application PCT/JP2022/033706, JP2022033706W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- image
- imaging system
- information processing
- learning
- processing system
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/60—Image enhancement or restoration using machine learning, e.g. neural networks
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/73—Deblurring; Sharpening
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10068—Endoscopic image
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
Definitions
- The present invention relates to an information processing system, an endoscope system, a learned model, an information storage medium, an information processing method, and the like.
- Patent Document 1 discloses a technique for correcting optical deterioration of an imaging system using deep learning.
- In Patent Document 1, a pre-photographed reference image with optical deterioration information added thereto is used as a learning image. However, the optical deterioration information to be learned varies without limit depending on the object distance and image height, so a huge number of learning images is required and the network scale required for processing increases. This raises concerns about a decrease in processing capacity and an increase in implementation cost.
- One aspect of the present invention relates to an information processing system including a storage unit that stores a learned model machine-learned using a dataset including a learning image group and a correct image, and a processing unit that uses the learned model to correct blur caused by defocusing of a first imaging system in an image to be processed, which is an image captured by the first imaging system.
- The learning image group includes a plurality of learning images generated by performing, on a predetermined subject image captured by an arbitrary imaging system and focused by the arbitrary imaging system, defocus simulation processing that simulates the influence of blur due to defocus of the first imaging system based on the transfer function or point spread function of the first imaging system at a plurality of object distances.
- In each of the learning images, the defocus simulation process is performed on the region on the optical axis of the first imaging system and on the other regions based on the transfer function or the point spread function on the optical axis.
- The correct image is either an image generated by performing, on the predetermined subject image, best focus simulation processing that simulates a focused state of the first imaging system based on the transfer function or the point spread function at the object distance at which the first imaging system is focused, or the predetermined subject image itself.
- The learned model is machine-learned so that each of the learning images becomes the correct image.
- Another aspect of the present invention relates to an endoscope system including a processor unit having the information processing system described above, and an endoscope scope that is connected to the processor unit and captures the image to be processed.
- Still another aspect of the present invention relates to a trained model that is used in an information processing system including a storage unit that stores the trained model, an input unit, a processing unit, and an output unit, and that has been machine-learned using a data set including a learning image group and a correct image.
- The learning image group includes a plurality of learning images generated by performing, on a predetermined subject image captured by an arbitrary imaging system and focused by the arbitrary imaging system, a defocus simulation process that simulates the effect of blurring due to defocus of the first imaging system based on the transfer function or point spread function of the first imaging system at a plurality of object distances.
- In each of the learning images, the defocus simulation process is performed on the region on the optical axis of the first imaging system and on the other regions based on the transfer function or the point spread function on the optical axis.
- The correct image is either an image generated by performing, on the predetermined subject image, a best focus simulation process that simulates a focused state of the first imaging system based on the transfer function or the point spread function at the object distance at which the first imaging system is focused, or the predetermined subject image itself.
- The trained model is machine-learned so that each learning image becomes the correct image.
- The input unit inputs a processing target image, which is an image photographed by the first imaging system, to the trained model. The processing unit uses the trained model to perform a correction process that corrects blur caused by defocusing of the first imaging system in the image to be processed. The output unit outputs a corrected image resulting from the correction process.
- Still another aspect of the present invention relates to an information storage medium that stores the trained model described above.
- Still another aspect of the present invention relates to an information processing method for correcting, using a trained model machine-learned using a dataset including a learning image group and a correct image, blur caused by defocusing of a first imaging system in an image to be processed, which is an image captured by the first imaging system.
- The learning image group includes a plurality of learning images generated by performing, on a predetermined subject image captured by an arbitrary imaging system and focused by the arbitrary imaging system, defocus simulation processing that simulates the effect of blurring due to defocus of the first imaging system based on the transfer function or point spread function of the first imaging system at a plurality of object distances.
- For the region on the optical axis of the first imaging system and the region other than on the optical axis in each learning image of the plurality of learning images, the defocus simulation process is performed based on the transfer function or the point spread function on the optical axis.
- The correct image is either an image generated by performing, on the predetermined subject image, a best focus simulation process that simulates a state in which the first imaging system is in focus based on the transfer function or the point spread function at the object distance at which the first imaging system is focused, or the predetermined subject image itself.
- The trained model is machine-learned so that each learning image becomes the correct image.
- FIG. 1 is a block diagram illustrating a configuration example of an information processing system.
- FIG. 2 is a block diagram illustrating a more detailed configuration example of the information processing system.
- 5 is a flowchart illustrating a processing example of the information processing system.
- FIG. 2 is a block diagram illustrating a configuration example of a learning device.
- FIG. 3 is a diagram illustrating the relationship between depth of field and target depth of field.
- FIG. 3 is a diagram illustrating an example of image data generation processing.
- FIG. 3 is a diagram illustrating defocus simulation processing according to the present embodiment.
- FIG. 1 is a block diagram illustrating an example of an endoscope system.
- A block diagram illustrating another example of an endoscope system.
- FIG. 6 is a diagram illustrating the relationship between object distance and MTF related to defocus simulation processing.
- FIG. 7 is another diagram illustrating the relationship between object distance and MTF related to defocus simulation processing.
- FIG. 3 is a diagram illustrating a specific calculation method of defocus simulation processing.
- FIG. 7 is another diagram illustrating a specific calculation method of defocus simulation processing.
- FIG. 7 is a diagram illustrating a specific calculation method of best focus simulation processing.
- FIG. 7 is another diagram illustrating a specific calculation method of best focus simulation processing.
- FIG. 7 is another diagram illustrating an example of the lens configuration of the first imaging system.
- FIG. 3 is a diagram illustrating the amount of distortion.
- FIG. 2 is a diagram illustrating a lens configuration including a phase modulation element.
- FIG. 3 is a diagram illustrating an example of a change in MTF due to the inclusion of a phase modulation element.
- FIG. 7 is a diagram illustrating another example of defocus simulation processing.
- FIG. 7 is a diagram illustrating another example of defocus simulation processing.
- FIG. 7 is a diagram illustrating another example of best focus simulation processing.
- FIG. 7 is a diagram illustrating another example of image data generation processing.
- FIG. 7 is a diagram illustrating another example of defocus simulation processing.
- FIG. 3 is a diagram illustrating the relationship between mosaic processing and demosaic processing.
- FIG. 7 is a diagram illustrating another example of best focus simulation processing.
- A diagram illustrating another example of the configuration of an information processing system.
- 7 is a flowchart illustrating another processing example of the information processing system.
- 5 is a flowchart illustrating the first learned model creation process.
- 12 is a flowchart illustrating second learned model creation processing.
- FIG. 7 is a diagram illustrating another example of defocus simulation processing.
- FIG. 7 is a diagram illustrating another example of best focus simulation processing.
- In the following, the information processing system is described as applied to a medical endoscope, but the present invention is not limited thereto, and the information processing system of the present invention can be applied to various imaging systems or video display systems.
- For example, the information processing system of the present invention can be applied to a still camera, a video camera, a television receiver, a microscope, or an industrial endoscope.
- FIG. 1 is a block diagram illustrating a configuration example of an information processing system 100 of this embodiment.
- the information processing system 100 includes a storage section 110 and a processing section 130.
- the storage unit 110 stores a learned model 120 that has been subjected to machine learning.
- the learned model 120 is a program module that outputs a corrected image in which blur caused by defocusing of the processing target image is corrected, and is generated or updated by performing machine learning, which will be described later.
- the processing target image is, for example, image data photographed by the first imaging system 101 as shown in FIG. 1, but is not limited thereto, and details will be described later. Note that in this embodiment, image data that can be processed as digital data may be simply referred to as an image.
- the learning image group 32G is a set of learning images 32 consisting of a first learning image 32-1, a second learning image 32-2, ..., an N-th learning image 32-N, and the details will be described later together with the correct image 36.
- the processing unit 130 of the present embodiment uses the learned model 120 to correct blur caused by defocusing of the first imaging system 101 of the processing target image, which is an image photographed by the first imaging system 101.
- the storage unit 110 and the processing unit 130 are also referred to as a storage device and a processing device, respectively.
- Machine learning in this embodiment is, for example, supervised learning.
- Training data in supervised learning is a data set in which input data and correct labels are associated with each other.
- The trained model 120 of this embodiment is generated by supervised learning based on a data set in which input data consisting of learning images 32 simulating the effects of various blurs is associated with correct labels consisting of focused correct images 36.
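- The association between input data and correct labels described above can be sketched as follows. This is only an illustrative sketch: the names `TrainingSample` and `build_dataset` are hypothetical and do not appear in the specification, and images are assumed to be NumPy arrays.

```python
from dataclasses import dataclass
from typing import List

import numpy as np


@dataclass
class TrainingSample:
    """One supervised-learning pair (hypothetical structure)."""
    learning_image: np.ndarray  # input data: image with simulated blur
    correct_image: np.ndarray   # correct label: in-focus image


def build_dataset(learning_images: List[np.ndarray],
                  correct_image: np.ndarray) -> List[TrainingSample]:
    """Associate every learning image with the same correct image."""
    return [TrainingSample(img, correct_image) for img in learning_images]


# Toy data: N = 3 blurred variants that share one correct image.
correct = np.ones((4, 4))
group = [correct * scale for scale in (0.9, 0.7, 0.5)]
dataset = build_dataset(group, correct)
```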
- the processing unit 130 of this embodiment is composed of the following hardware.
- the hardware can include at least one of a circuit that processes digital signals and a circuit that processes analog signals.
- the hardware can be composed of one or more circuit devices mounted on a circuit board or one or more circuit elements.
- the one or more circuit devices are, for example, ICs.
- the one or more circuit elements are, for example, resistors, capacitors, etc.
- the processing unit 130 may be realized by the following processor.
- the processing unit 130 of this embodiment includes a memory that stores information and a processor that operates based on the information stored in the memory.
- the memory is, for example, the storage unit 110.
- the information includes, for example, programs and various data.
- a processor includes hardware.
- Various types of processors can be used as the processor, such as a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), and a DSP (Digital Signal Processor).
- The memory may be a semiconductor memory such as SRAM (Static Random Access Memory) or DRAM (Dynamic Random Access Memory), a register, a magnetic storage device such as a hard disk drive, or an optical storage device such as an optical disk device.
- the memory stores computer-readable instructions, and when the instructions are executed by the processor, the functions of each part of the processing unit 130 are realized as processing.
- the instructions here may be instructions of an instruction set that constitutes a program, or instructions that instruct a hardware circuit of a processor to operate.
- The trained model 120 of this embodiment may be used in the information processing system 100 shown in the configuration example of FIG. 2. That is, the trained model 120 of this embodiment is used in the information processing system 100 including the storage unit 110 that stores the trained model 120, the input unit 140, the processing unit 130, and the output unit 150, and is machine-learned using a data set including the learning image group 32G and the correct image 36.
- the input unit 140 is an interface that receives images to be processed from the outside. Specifically, it is an image data interface that receives image data as a processing target image from the first imaging system 101, as shown in FIGS. 1 and 2, for example.
- The function of the input unit 140 is realized by using the received image to be processed as input data to the trained model 120 and by having the processing unit 130 perform the processing described later. That is, in the trained model 120 of this embodiment, the input unit 140 inputs the processing target image, which is an image photographed by the first imaging system 101, to the trained model 120.
- the output unit 150 is an interface that transmits the above-mentioned corrected image to the outside. For example, by using output data from the learned model 120 as a corrected image transmitted by the output unit 150, the function of the output unit 150 is achieved.
- The destination of the corrected image is, for example, a predetermined display device connected to the information processing system 100. In that case, the function of the output unit 150 is realized by, for example, making the output unit 150 an interface connectable to the predetermined display device so that the corrected image is displayed on the display device.
- the output destination of the corrected image may be a storage device of an external device or the like.
- FIG. 3 is a flowchart illustrating a method performed by the information processing system 100 of this embodiment.
- The processing unit 130 performs correction processing (step S30) after reading the processing target image (step S10) and reading the learned model (step S20). Specifically, for example, the processing unit 130 performs a process of inputting the processing target image received via the input unit 140 into the trained model 120 read out from the storage unit 110. If the processing target image, which is the input data, has features in common with a learning image 32, the trained model 120 estimates that the data to be output is the correct image 36. Therefore, when the processing target image is input, the trained model 120 outputs the correct image 36.
- the correct image 36 is an image in which blur caused by defocusing of the first imaging system 101 in the processing target image has been corrected. That is, the processing unit 130 uses the trained model 120 to perform a correction process (step S30) to correct blur caused by defocusing of the first imaging system 101 of the image to be processed.
- the processing unit 130 outputs the corrected image (step S40).
- the output unit 150 functions as described above, so that the corrected image is output to a desired output destination. In other words, the output unit 150 outputs a corrected image by the correction process.
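- The flow of steps S10 to S40 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the dictionary used as a stand-in for the storage unit 110, the list used as a stand-in for the output destination, and `placeholder_model` (which merely clips pixel values and performs no real deblurring) are all hypothetical and are not taken from the specification.

```python
import numpy as np


def placeholder_model(image: np.ndarray) -> np.ndarray:
    """Stand-in for the trained model 120 (assumption: only clips to [0, 1])."""
    return np.clip(image, 0.0, 1.0)


def read_target_image(source) -> np.ndarray:
    """Step S10: read the processing target image received by the input unit."""
    return np.asarray(source, dtype=np.float64)


def read_trained_model(storage: dict):
    """Step S20: read the trained model from the storage unit."""
    return storage["trained_model"]


def correction_process(model, image: np.ndarray) -> np.ndarray:
    """Step S30: input the target image to the model and obtain the corrected image."""
    return model(image)


def output_corrected_image(image: np.ndarray, destination: list) -> None:
    """Step S40: send the corrected image to the output destination."""
    destination.append(image)


def run(source, storage: dict, destination: list) -> None:
    image = read_target_image(source)
    model = read_trained_model(storage)
    corrected = correction_process(model, image)
    output_corrected_image(corrected, destination)


storage = {"trained_model": placeholder_model}
display = []
run([[0.2, 1.4], [-0.1, 0.5]], storage, display)
```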
- FIG. 4 is a block diagram showing a configuration example of the learning device 10.
- the learning device 10 includes, for example, a communication section 12, a learning device processing section 16, and a learning device storage section 18.
- the communication unit 12 is a communication interface that can communicate with the information processing system 100 using a predetermined communication method.
- the predetermined communication method is, for example, a communication method compliant with a wireless communication standard such as Wi-Fi (registered trademark), but is not limited thereto, and may be a communication method compliant with a wired communication standard such as USB.
- the learning device 10 can transmit the learned model 120 machine-trained by the method described later to the information processing system 100, and the information processing system 100 can update the learned model 120.
- Although FIG. 4 shows an example in which the learning device 10 and the information processing system 100 are separated, this does not preclude a configuration example in which the information processing system 100 includes a learning server corresponding to the learning device 10.
- the learning device processing section 16 performs data input/output control with each functional section such as the communication section 12 and the learning device storage section 18.
- The learning device processing section 16 can be realized by a processor similar to the processing section 130 described above.
- The learning device processing unit 16 executes various calculation processes based on a predetermined program read from the learning device storage unit 18, an operation input signal from an operation unit (not shown), and the like, and controls data output operations and so on.
- the predetermined program here includes a machine learning program. That is, the learning device processing unit 16 performs the machine learning function by reading out and executing the machine learning program and necessary data from the learning device storage unit 18.
- the learning device storage unit 18 stores a training model 20, a predetermined subject image 30, and optical system information 40 in addition to a machine learning program (not shown).
- the learning device storage section 18 can be realized by a semiconductor memory or the like similar to the storage section 110 described above. Note that the learning device storage unit 18 may further include other information. Other information is, for example, image sensor information 50, which will be described later.
- the predetermined subject image 30 is an image of a subject related to the processing target image, and a learning image 32 and a correct image 36, which will be described later, are created based on the predetermined subject image 30. That is, the learning device storage unit 18 stores in advance as many predetermined subject images 30 as there are types of subjects that can be processed target images.
- For example, when the information processing system 100 is used in an endoscope system 300 (described later), an image of a lumen or the like captured by an endoscope 310 may be used as a predetermined subject image 30.
- In the following, when the imaging system that captures the predetermined subject image 30 is not particularly restricted, it will be referred to as an arbitrary imaging system 104.
- A case where the predetermined subject image 30 is captured with limited imaging systems will be described later.
- the training model 20 is a model to be subjected to machine learning by the learning device processing unit 16.
- the model here is information for deriving the correspondence between estimation target data and estimation result data. More specifically, it is information for deriving the output image 34, which is estimation result data, from the learning image 32, which is estimation target data.
- In the training model 20 of this embodiment, at least a part of the model includes a neural network NN. Details of the neural network NN will be described later. Note that, as described above, when the information processing system 100 and the learning device 10 are integrated, machine learning may be performed on the trained model 120.
- When the first learning image 32-1 is input to the training model 20, the training model 20 outputs the first output image 34-1.
- Similarly, when the Nth learning image 32-N is input to the training model 20, the training model 20 outputs the Nth output image 34-N. That is, as shown in FIG. 5, in the learning device 10 of this embodiment, N images consisting of the first learning image 32-1 to the Nth learning image 32-N are input to the training model 20 as the learning image group 32G.
- FIG. 6 is a schematic diagram illustrating the neural network NN.
- the neural network NN has an input layer into which data is input, an intermediate layer that performs calculations based on the output from the input layer, and an output layer that outputs data based on the output from the intermediate layer.
- Although FIG. 6 illustrates a network having two intermediate layers, the intermediate layer may have one layer, or three or more layers.
- the number of nodes included in each layer is not limited to the example shown in FIG. 6, and various modifications are possible.
- Nodes included in a given layer are connected to nodes in adjacent layers.
- a weighting coefficient is set for each connection. Each node multiplies the output of the previous node by a weighting coefficient, and obtains a total value of the multiplication results.
- each node adds a bias to the total value and applies an activation function to the addition result to obtain the output of the node.
- By sequentially executing this process from the input layer to the output layer, the output of the neural network NN is obtained.
- various functions such as a sigmoid function and a ReLU function are known as activation functions, and these can be widely applied in this embodiment.
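- The node computation described above (weighted sum, bias, activation function, repeated layer by layer) can be sketched as follows. This is an illustrative sketch only: the function names and the toy weight values are assumptions, and the layer sizes are not those of FIG. 6.

```python
import numpy as np


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def relu(x: np.ndarray) -> np.ndarray:
    return np.maximum(0.0, x)


def forward(x: np.ndarray, layers, activation=relu) -> np.ndarray:
    """Propagate x from the input layer to the output layer.

    `layers` is a list of (weights, bias) pairs, one pair per layer
    after the input layer.
    """
    out = x
    for weights, bias in layers:
        total = weights @ out          # multiply previous outputs by weighting coefficients and sum
        out = activation(total + bias)  # add the bias, then apply the activation function
    return out


# Toy network: 2 inputs -> 2 intermediate nodes -> 1 output node.
layers = [
    (np.array([[1.0, -1.0], [0.5, 0.5]]), np.array([0.0, 1.0])),
    (np.array([[2.0, 1.0]]), np.array([0.5])),
]
result = forward(np.array([1.0, 2.0]), layers)  # intermediate outputs [0.0, 2.5], final [3.0]
```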
- the neural network NN may be a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), or another model.
- FIG. 7 is a flowchart illustrating an example of the learned model creation process (step S100).
- the learned model creation process (step S100) is a process of creating or updating the learned model 120 by machine learning.
- the learning device processing unit 16 performs image data generation processing (step S120) after reading a predetermined subject image (step S110). For example, the learning device processing unit 16 reads a predetermined subject image 30 from the learning device storage unit 18 and performs a predetermined process of generating a learning image 32 and a correct image 36 using the predetermined subject image 30.
- the predetermined processing includes defocus simulation processing (step S200), best focus simulation processing (step S300), and the like, and details will be described later.
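- As a rough illustration of the image data generation process, the following sketch blurs one predetermined subject image with a different point spread function for each object distance, applying the same PSF to the whole frame. It rests on several assumptions that are not in the specification: the PSF is modelled as a Gaussian (the actual PSF or transfer function would come from the optical system information 40), the mapping from object distance to the Gaussian width `sigma` is invented for the example, the convolution is circular, and the correct image is taken to be the predetermined subject image itself.

```python
import numpy as np


def gaussian_psf(size: int, sigma: float) -> np.ndarray:
    """Stand-in PSF (assumption): normalized 2-D Gaussian of odd width `size`."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()


def apply_psf(image: np.ndarray, psf: np.ndarray) -> np.ndarray:
    """Blur the whole image with a single PSF (circular convolution via FFT)."""
    k = psf.shape[0]
    kernel = np.zeros(image.shape, dtype=np.float64)
    kernel[:k, :k] = psf
    # Move the PSF centre to index (0, 0) so the blur does not shift the image.
    kernel = np.roll(kernel, (-(k // 2), -(k // 2)), axis=(0, 1))
    otf = np.fft.fft2(kernel)  # transfer function corresponding to the PSF
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))


def make_learning_images(subject_image: np.ndarray, sigmas) -> list:
    """One learning image per object distance (each distance mapped to one sigma)."""
    return [apply_psf(subject_image, gaussian_psf(9, s)) for s in sigmas]


# Toy subject: a single bright point, blurred for three hypothetical object distances.
subject = np.zeros((16, 16))
subject[8, 8] = 1.0
learning_images = make_learning_images(subject, sigmas=[0.5, 1.0, 2.0])
correct_image = subject  # option in which the subject image itself is the correct image
```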
- The learning device processing unit 16 performs a correction learning process (step S130). For example, the learning device processing unit 16 performs a process of reading out the training model 20 from the learning device storage unit 18, a process of inputting the learning image 32 generated in the image data generation process (step S120) into the training model 20, and a machine learning process based on the output image 34 that has been output and the correct image 36.
- The machine learning process based on the output image 34 and the correct image 36 is, for example, a process that changes the network parameters of the neural network NN.
- The process of changing network parameters of the neural network NN is, for example, a process of updating the weighting coefficients in the neural network NN to appropriate values.
- The weighting coefficient here includes a bias.
- As the learning method, for example, an error backpropagation method, in which the weighting coefficients are updated from the output layer toward the input layer, can be used. That is, the learning device 10 inputs the input data of the learning data into the model, and calculates the output by performing forward calculation according to the model configuration using the weighting coefficients at that time. An error function is calculated based on the output and the correct label, and the weighting coefficients are updated so as to reduce the error function.
- the learning device processing unit 16 inputs the first learning image 32-1 as input data to the neural network NN included in the training model 20, and performs forward-direction processing using the weighting coefficient at that time. By performing the calculation, a first output image 34-1, which is output data, is output. The learning device processing unit 16 calculates an error function based on the first output image 34-1 and the correct image 36 which is the correct label. Then, processing is performed to update the weighting coefficients so as to reduce the error function. Further, the learning device processing unit 16 repeatedly performs similar processing on the second output image 34-2 to the Nth output image 34-N. By doing so, the training model 20 is machine-learned so that one correct image 36 can be output for a plurality of types of learning images 32.
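- The update procedure above can be sketched with a very small network. This is a sketch under stated assumptions rather than the training model 20 itself: the two-layer fully connected structure, the mean squared error used as the error function, the learning rate, the number of epochs, and the random toy images are all chosen only for illustration.

```python
import numpy as np


def train(learning_images, correct_image, hidden=8, epochs=300, lr=0.05, seed=0):
    """Train a tiny 2-layer network so every learning image maps to one correct image."""
    rng = np.random.default_rng(seed)
    n = correct_image.size
    y = correct_image.ravel()                      # correct label
    w1 = rng.normal(0.0, 0.5, (hidden, n))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.5, (n, hidden))
    b2 = np.zeros(n)
    history = []
    for _ in range(epochs):
        epoch_loss = 0.0
        for image in learning_images:
            x = image.ravel()
            # Forward calculation with the current weighting coefficients.
            pre = w1 @ x + b1
            h = np.maximum(0.0, pre)               # ReLU activation
            out = w2 @ h + b2                      # output image (flattened)
            err = out - y
            epoch_loss += float(np.mean(err ** 2))  # error function (MSE)
            # Backpropagation: gradients flow from the output layer to the input layer.
            d_out = 2.0 * err / n
            d_h = w2.T @ d_out
            d_pre = d_h * (pre > 0)
            # Update weights and biases so as to reduce the error function.
            w2 -= lr * np.outer(d_out, h)
            b2 -= lr * d_out
            w1 -= lr * np.outer(d_pre, x)
            b1 -= lr * d_pre
        history.append(epoch_loss / len(learning_images))
    return (w1, b1, w2, b2), history


# Toy data: three degraded variants of one 2x2 correct image.
rng = np.random.default_rng(1)
correct = rng.random((2, 2))
group = [np.clip(correct + rng.normal(0.0, 0.2, correct.shape), 0.0, 1.0) for _ in range(3)]
params, history = train(group, correct)
```

- In this sketch, `history` holds the mean error per pass over the learning image group, so comparing its first and last entries gives a quick check that the updates are reducing the error function.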
- Using the training model 20 trained in this way, the learned model 120 stored in the storage unit 110 is updated.
- Although the learning device 10 and the information processing system 100 are illustrated as being communicatively connected via the communication unit 12 in FIG. 4, the learning device 10 and the information processing system 100 do not need to be communicatively connected.
- For example, the user performs a process on the learning device 10 to temporarily store the training model 20 as the learned model 120 in the information storage medium, and carries the information storage medium to the location of the information processing system 100.
- The learned model 120 can then be updated by performing a process on the information processing system 100 to update the learned model 120 based on the information storage medium.
- FIG. 9 is a diagram illustrating the relationship between the depth of focus and the depth of field when the optical axis is set as the horizontal axis for the first imaging system 101 of this embodiment.
- FIG. 9 is a diagram for convenience and does not show a specific lens configuration of the first imaging system 101.
- The range indicated by DP1 is the depth of field corresponding to the depth of focus in the optical design of the first imaging system 101. Therefore, for example, if the distance between the subject and the first imaging system 101 is the first object distance indicated by D1, the subject is located outside the depth of field, so when the first imaging system 101 captures the image, a processing target image including the effect of blur due to defocusing is obtained.
- If the distance between the subject and the first imaging system 101 is the second object distance indicated by D2, the subject is located within the depth of field, so the image becomes a focused processing target image.
- When the distance between the subject and the first imaging system 101 is the object distance indicated by D3, that is, when the subject is at the position indicated by P1 on the optical axis within the depth of field, the best focus condition is satisfied.
- The first object distance indicated by D1 and the second object distance indicated by D2 are shown on the near-point side of the position indicated by P1, but are not limited thereto and may be on the far-point side.
- In the following, the method of this embodiment will be explained using object distances on the near-point side, but this does not mean that the method of this embodiment cannot be applied when using an object distance on the far-point side.
- the depth of field becomes narrower, so it is desired to expand the depth of field.
- the first imaging system 101 is used in an endoscope 310 of an endoscope system 300 described later, it may be difficult to align the endoscope 310 to the best focus position for a desired subject. Therefore, it is desired to expand the depth of field.
- an image obtained by simulating the effect of blur on a predetermined subject image 30 captured in advance is used as a learning image 32, and an in-focus image is used as a correct image 36 as a data set as described above in FIG. 8 etc.
- the trained model 120 that has undergone machine learning is incorporated into the information processing system 100.
- the captured image to which the effect of blur due to defocus is added is set as the processing target image, and by performing the processing in FIG. 3, the information processing system 100 outputs the corrected image in focus.
- the range of the depth of field of the first imaging system 101 can be substantially expanded.
- the depth of field can be substantially expanded from the range shown in DP1 in FIG. 9 to the range shown in DP2.
- Substantially expanding means that the depth of field is not expanded optically; rather, the image processing performed by the information processing system 100 expands the apparent depth of field to a range in which a subject originally located outside the depth of field can be imaged as if it were located within the depth of field.
- a processing target image with added blur is output from the first imaging system 101, but the actual object distance indicated by DP2 is output from the first imaging system 101.
- the image to be processed is corrected into a corrected image that is in focus, and is output from the information processing system 100. Furthermore, in the following description, the substantial depth of field shown in DP2 in FIG. 9, expanded using the learned model 120 of this embodiment, will be referred to as a target depth of field. Note that the corrected image that is in focus here does not need to be strictly focused for the entire image. For example, even if a part of the output corrected image is blurred, the user may determine that the function of the information processing system 100 is sufficient as long as a treatment using the endoscope 310 can be performed.
- the distance of the target depth of field in this embodiment is wider than the distance of the optically determined depth of field, but it is a distance that can vary depending on the user's tolerance level and the like. Therefore, DP2 shown in FIG. 9 is only shown for convenience and does not indicate a fixed length. The same applies to the following explanation.
- the trained model 120 of this embodiment is machine-learned to correct, into an in-focus image, an image containing the blur obtained by imaging a subject located in the range shown in DP10 in FIG. 9, which is the difference between the target depth of field shown in DP2 and the depth of field shown in DP1.
- the distance shown in DP10 is the distance required for machine learning.
- A method of image data generation processing (step S120) for generating the learning image 32 and the correct image 36 necessary for the machine learning will be explained using FIG. 10. Note that the method of image data generation processing is not limited to that shown in FIG. 10, and various modifications can be implemented as described later. Therefore, the image data generation process shown in FIG. 10 can also be called step S120-1.
- the predetermined subject image 30 of this embodiment is assumed to be captured at an object distance that allows the imaging system to focus.
- the learning device processing unit 16 generates a learning image 32 by performing defocus simulation processing (step S200) on a predetermined subject image 30 captured by an arbitrary imaging system 104.
- the defocus simulation process for generating the first learning image 32-1 can also be called step S200-1, and similarly the defocus simulation process for generating the Nth learning image 32-N can be called step S200-N.
- step S202, step S204, step S206, step S208, step S210, step S220, and step S230 which will be described later.
- the learning device processing unit 16 selects information on the first object distance from the read optical system information 40 when generating the first learning image 32-1 through the defocus simulation process (step S200-1).
- the learning device processing unit 16 selects second object distance information from the read optical system information 40 when generating the second learning image 32-2 in step S200-2. That is, in the present embodiment, the optical system information 40 corresponding to the Nth learning image 32-N is the Nth object distance, and the description can be generalized as the learning device processing unit 16 selecting the corresponding Nth object distance information from the optical system information 40.
- the defocus simulation process will be described using the process for generating the first learning image 32-1 as an example, but the process is similar when generating the second learning image 32-2 to the Nth learning image 32-N.
- the learning device processing unit 16 generates the correct image 36 by performing the best focus simulation process (step S300) on the predetermined subject image 30.
- the learning device processing unit 16 selects information on the object distance at which the first imaging system 101 is focused from the read optical system information 40.
- the information on the object distance at which the first imaging system 101 is focused is, for example, the object distance shown in D3, that is, the designed distance from the first imaging system 101 to the point shown in P1 in FIG. 9, which corresponds to the so-called best focus condition.
- image data generation process of this embodiment may be performed as shown in FIG. 11.
- the image data generation process shown in FIG. 11 can also be called step S120-2.
- descriptions of processes similar to those in FIG. 10 will be omitted as appropriate.
- Step S120-2 in FIG. 11 differs from step S120-1 in FIG. 10 in that the best focus simulation process (step S300) is not performed and the correct image 36 is the predetermined subject image 30 itself. This is because if the predetermined subject image 30 is an image captured at an object distance that can be focused by an arbitrary imaging system 104, it can be used as the correct image 36.
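The two variants of the image data generation process (step S120-1 with the best focus simulation, step S120-2 without it) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: `build_dataset` and `box_blur` are hypothetical names, and the box blur merely stands in for a defocus simulation at one object distance.

```python
import numpy as np

def box_blur(width):
    """Hypothetical stand-in for the defocus simulation at one object distance."""
    def blur(image):
        kernel = np.ones(width) / width
        # Separable blur: filter along rows, then along columns; "same" keeps the size.
        rows = np.apply_along_axis(
            lambda r: np.convolve(r, kernel, mode="same"), 1, image)
        return np.apply_along_axis(
            lambda c: np.convolve(c, kernel, mode="same"), 0, rows)
    return blur

def build_dataset(subject_image, defocus_fns, best_focus_fn=None):
    """Pair each simulated learning image with one correct image.

    defocus_fns: one callable per object distance (steps S200-1 ... S200-N).
    best_focus_fn: callable for step S300 (as in S120-1); when None, the
    subject image itself is used as the correct image (as in S120-2).
    """
    if best_focus_fn is None:
        correct = subject_image
    else:
        correct = best_focus_fn(subject_image)
    return [(fn(subject_image), correct) for fn in defocus_fns]

# Example: two object distances, correct image = subject image itself.
subject = np.eye(8)
pairs = build_dataset(subject, [box_blur(5), box_blur(3)])
```

Each returned pair holds one learning image and the shared correct image, which is the shape of data set the machine learning described here consumes.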
- the defocus simulation process (step S200) will be explained using FIGS. 12 and 13.
- the optical system information 40 read when performing the defocus simulation process (step S200) includes information on a transfer function or a point spread function.
- the transfer function or point spread function changes depending on the amount of defocus in the optical axis direction and the image height in a plane perpendicular to the optical axis. For example, at the first object distance, assume that an area perpendicular to the optical axis and the same size as the predetermined subject image 30 is divided into area FC11-1, area FC12-1, area FC13-1, area FC21-1, area FC22-1, area FC23-1, area FC31-1, area FC32-1, and area FC33-1.
- the transfer function or point spread function at the first object distance may exhibit different values for each divided region.
- assume that an area perpendicular to the optical axis and the same size as the predetermined subject image 30 is divided into area FC11-N, area FC12-N, area FC13-N, area FC21-N, area FC22-N, area FC23-N, area FC31-N, area FC32-N, and area FC33-N.
- the transfer function or point spread function at the Nth object distance may exhibit different values for each divided region.
- the transfer function or point spread function of the area FC11-1 and the transfer function or point spread function of the area FC11-N may exhibit different values.
- a transfer function or a point spread function on the optical axis is used to perform machine learning.
- the region FC22-1 is assumed to be the region through which the optical axis of the first imaging system 101 passes. That is, the transfer function or point spread function in the region FC22-1 is the transfer function or point spread function on the optical axis of the first imaging system 101 at the first object distance.
- the transfer function or point spread function of the region FC22-N at the Nth object distance is the transfer function or point spread function on the optical axis of the first imaging system 101. Note that although the transfer function or point spread function is divided into nine parts in FIG. 12, this is just an example, and the same applies to FIG.
- areas FC22-1 to FC22-N in FIG. 12 are each a set containing a predetermined number of pixels in each of the vertical and horizontal directions, but may be one pixel.
- the transfer function or point spread function on the optical axis in this embodiment refers to the transfer function or point spread function in at least one of the area of one pixel through which the optical axis passes or an area of a predetermined number of pixels including that pixel.
- step S210 is performed for areas other than the optical axis of the predetermined subject image 30 based on the transfer function on the optical axis or the point spread function on the optical axis of the first imaging system 101.
- the predetermined subject image 30 is divided into nine regions AR11, AR12, AR13, AR21, AR22, AR23, AR31, AR32, and AR33, as in FIG.
- when generating the first learning image 32-1, the learning device processing unit 16 performs the calculation of step S210-1 on the area AR11 using the transfer function or point spread function on the optical axis shown in FC22-1 in FIG. 12. Note that in the following explanation and illustration in FIG. 13, this calculation will be simply expressed as AR11*FC22-1. The same applies to calculations such as step S210 using other areas. Furthermore, although the details will be described later, "*" here indicates convolution when, for example, PSF is used as the point spread function. Further, for example, when OTF is used as the transfer function, "*" indicates that the frequency characteristic obtained by Fourier transforming the region AR11 is multiplied by the OTF of the region FC22-1.
- the learning device processing unit 16 also performs step S210-1 for the areas AR12 to AR33 using the transfer function or point spread function on the optical axis shown in FC22-1.
- the learning device processing unit 16 performs the calculations AR12*FC22-1, AR13*FC22-1, AR21*FC22-1, AR22*FC22-1, AR23*FC22-1, AR31*FC22-1, AR32*FC22-1, and AR33*FC22-1.
- the learning device processing unit 16 divides the same area as the predetermined subject image 30 into a desired number of areas, and performs step S210 using the transfer function or point spread function of one of the divided areas.
- the generated first learning image 32-1 is assumed to be divided into nine areas: BR11-1, BR12-1, BR13-1, BR21-1, BR22-1, BR23-1, BR31-1, BR32-1, and BR33-1.
- BR12-1 = AR12*FC22-1
- BR13-1 = AR13*FC22-1
- BR21-1 = AR21*FC22-1
- BR22-1 = AR22*FC22-1
- BR23-1 = AR23*FC22-1
- BR31-1 = AR31*FC22-1
- BR32-1 = AR32*FC22-1
- BR33-1 = AR33*FC22-1
- Defocus simulation processing is performed based on the transfer function or point spread function (FC22).
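The calculation of step S210, in which every region of the predetermined subject image 30 is processed with the single on-axis function (FC22), can be sketched as below. This is an illustrative sketch under assumptions, not the embodiment's implementation: `gaussian_psf` is a hypothetical stand-in for a PSF that would really come from optical design data, and the frequency-domain product used here amounts to a circular convolution, so image borders wrap around. Because the same PSF is used for all regions, the per-region calculations AR11*FC22-1 to AR33*FC22-1 are treated here as one convolution over the whole image.

```python
import numpy as np

def gaussian_psf(size, sigma):
    """Hypothetical on-axis PSF; a real one would be taken from the optical design."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return psf / psf.sum()  # unit sum keeps overall image brightness

def apply_on_axis_psf(subject, psf):
    """Blur the whole image with one PSF (circular convolution via the FFT)."""
    h, w = subject.shape
    ph, pw = psf.shape
    # Embed the PSF in an image-sized array and move its centre to index (0, 0).
    padded = np.zeros((h, w))
    padded[:ph, :pw] = psf
    padded = np.roll(padded, (-(ph // 2), -(pw // 2)), axis=(0, 1))
    otf = np.fft.fft2(padded)  # transfer function of this PSF
    # Multiply the image spectrum by the OTF, then return to the image domain.
    return np.real(np.fft.ifft2(np.fft.fft2(subject) * otf))

# Example: blur a single bright point with the hypothetical PSF.
point = np.zeros((16, 16))
point[8, 8] = 1.0
blurred = apply_on_axis_psf(point, gaussian_psf(5, 1.0))
```

Using one function for the entire image is what keeps the required optical system information small; a position-dependent simulation would need a separate PSF or OTF for each of the divided regions.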
- the transfer function of this embodiment can also be called an optical transfer function or OTF.
- OTF is an abbreviation for Optical Transfer Function.
- the point spread function of this embodiment can also be called a PSF.
- PSF is an abbreviation for Point Spread Function.
- OTF is the result of Fourier transform of PSF.
- the PSF is the result of inverse Fourier transform of the OTF.
- OTF is a complex function, and the absolute value of OTF is called a modulation transfer function, amplitude transfer function, or MTF.
- MTF is an abbreviation for Modulation Transfer Function.
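The relationships stated above (the OTF is the Fourier transform of the PSF, the PSF is the inverse Fourier transform of the OTF, and the MTF is the absolute value of the OTF) can be checked numerically. The following is a small sketch using NumPy's FFT; the function names and the normalization of the PSF to unit sum are assumptions made for this illustration.

```python
import numpy as np

def psf_to_otf(psf):
    """OTF as the 2-D Fourier transform of a PSF normalized to unit sum."""
    psf = psf / psf.sum()
    # ifftshift moves the PSF centre to index (0, 0) before transforming.
    return np.fft.fft2(np.fft.ifftshift(psf))

def otf_to_mtf(otf):
    """MTF is the absolute value of the complex-valued OTF."""
    return np.abs(otf)

def otf_to_psf(otf):
    """Inverse Fourier transform of the OTF recovers the (centred) PSF."""
    return np.fft.fftshift(np.real(np.fft.ifft2(otf)))

# Example with a hypothetical 9x9 Gaussian PSF.
ax = np.arange(9) - 4
xx, yy = np.meshgrid(ax, ax)
psf = np.exp(-(xx ** 2 + yy ** 2) / 4.0)
otf = psf_to_otf(psf)
mtf = otf_to_mtf(otf)
```

With this normalization the MTF equals 1 at zero spatial frequency, which is the usual convention when MTF curves are compared.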
- the information processing system 100 of the present embodiment includes the storage unit 110 that stores the learned model 120 that has been machine-learned using the data set including the learning image group 32G and the correct image 36, and a processing unit 130 that uses the learned model 120 to correct blur caused by defocusing of the first imaging system 101 in a processing target image, which is an image photographed by the first imaging system 101.
- the learning image group 32G includes a plurality of learning images 32 that are generated by performing, on a predetermined subject image 30 in which a predetermined subject captured by an arbitrary imaging system 104 is in focus, a defocus simulation process (step S200) that simulates the influence of blur due to defocus of the first imaging system 101 based on the transfer functions or point spread functions of the first imaging system 101 at a plurality of object distances.
- In each of the plurality of learning images 32, the defocus simulation process is performed for the area on the optical axis of the first imaging system 101 and for the areas other than on the optical axis based on the transfer function or point spread function on the optical axis.
- the correct image 36 is an image generated by performing, on the predetermined subject image 30, the best focus simulation process (step S300) that simulates the focused state of the first imaging system 101 based on the transfer function or point spread function at the object distance at which the first imaging system 101 is focused, or is the predetermined subject image 30 itself.
- the learned model 120 undergoes machine learning such that each learning image 32 becomes the correct image 36.
- since the information processing system 100 of the present embodiment includes the storage unit 110 that stores the trained model 120 and the processing unit 130, even if the processing target image captured by the first imaging system 101 contains the influence of blur due to defocus, a corrected image in which the influence of blur is corrected can be output. Thereby, the depth of field of the first imaging system 101 can be substantially expanded. Furthermore, since the learning image group 32G and the correct image 36 are created in advance based on the predetermined subject image 30 captured by an arbitrary imaging system 104, a learned model 120 that has been machine-learned in advance can be used even when the subject in the processing target image is one photographed by the first imaging system 101 for the first time.
- step S200 defocus simulation processing
- the amount of information required for the defocus simulation process (step S200) can be reduced.
- the learned model 120 can be easily implemented in the information processing system 100.
- the method of this embodiment can also be realized as a trained model 120. That is, the trained model 120 of this embodiment is used in the information processing system 100 including the storage unit 110 that stores the trained model 120, the input unit 140, the processing unit 130, and the output unit 150, and is machine-learned using a data set including the learning image group 32G and the correct image 36.
- the learning image group 32G includes a plurality of learning images 32 that are generated by performing, on a predetermined subject image 30 in which a predetermined subject captured by an arbitrary imaging system 104 is in focus, a defocus simulation process that simulates the influence of blur due to defocus of the first imaging system 101 based on the transfer functions or point spread functions of the first imaging system 101 at a plurality of object distances.
- In each of the plurality of learning images 32, the defocus simulation process is performed for the area on the optical axis of the first imaging system 101 and for the areas other than on the optical axis based on the transfer function or point spread function on the optical axis.
- the correct image 36 is an image generated by performing, on the predetermined subject image 30, a best focus simulation process that simulates the focused state of the first imaging system 101 based on the transfer function or point spread function at the object distance at which the first imaging system 101 is focused, or is the predetermined subject image 30 itself.
- the learned model 120 undergoes machine learning such that each learning image 32 becomes the correct image 36.
- the input unit 140 inputs a processing target image, which is an image captured by the first imaging system 101, to the learned model 120.
- the processing unit 130 uses the trained model 120 to perform a correction process to correct blur caused by defocusing of the first imaging system 101 of the image to be processed.
- the output unit 150 outputs a corrected image by the correction process. By doing so, effects similar to those described above can be obtained.
- the technique of this embodiment can also be realized as an information processing method.
- the information processing method of the present embodiment uses the trained model 120 machine-learned using the data set including the learning image group 32G and the correct image 36 to obtain the processing target image, which is an image photographed by the first imaging system 101. Blur caused by defocusing of the first imaging system 101 is corrected.
- the learning image group 32G includes a plurality of learning images 32 that are generated by performing, on a predetermined subject image 30 in which a predetermined subject captured by an arbitrary imaging system 104 is in focus, a defocus simulation process that simulates the influence of blur due to defocus of the first imaging system 101 based on the transfer functions or point spread functions of the first imaging system 101 at a plurality of object distances.
- In each of the plurality of learning images 32, the defocus simulation process is performed for the area on the optical axis of the first imaging system 101 and for the areas other than on the optical axis based on the transfer function or point spread function on the optical axis.
- the correct image 36 is an image generated by performing, on the predetermined subject image 30, a best focus simulation process that simulates the focused state of the first imaging system 101 based on the transfer function or point spread function at the object distance at which the first imaging system 101 is focused, or is the predetermined subject image 30 itself.
- the learned model 120 undergoes machine learning such that each learning image 32 becomes the correct image 36. By doing so, effects similar to those described above can be obtained.
- the method of this embodiment can also be realized as an information storage medium that stores the learned model 120.
- the training model 20 machine-learned by the learning device 10 can be stored in the information storage medium.
- the training model 20 can be updated as the latest learned model 120.
- the predetermined circumstances include, for example, a situation where the location where the learning device 10 is located and a location where the information processing system 100 is located are far apart, a situation where data communication is not possible between the learning device 10 and the information processing system 100, and the like.
- the method of this embodiment may be realized as an endoscope system 300.
- the endoscope system 300 of this embodiment includes a processor unit 200 that includes the above-described information processing system 100, and an endoscope 310 that is connected to the processor unit 200 and captures images to be processed. By doing so, it is possible to construct an endoscope system 300 that includes the information processing system 100 that has the above effects.
- the endoscope system 300 can have a configuration example as shown in FIG. 14, for example.
- Endoscope system 300 includes an endoscope 310, an operation section 320, a display section 330, and a processor unit 200.
- Processor unit 200 includes a storage section 210, a control section 220, and information processing system 100.
- the information processing system 100 in FIG. 14 further includes a storage interface 160 in addition to the configuration described above in FIG. Note that descriptions of configurations similar to those in FIG. 2 will be omitted as appropriate.
- the endoscope 310 includes an imaging device at its distal end (not shown).
- the imaging device includes a first imaging system 101.
- the distal end of the endoscope 310 is inserted into the body cavity, the imaging device takes an image of the abdominal cavity, and the imaging data is transmitted from the endoscope 310 to the processor unit 200.
- the operation unit 320 is a device for a user to operate the endoscope system 300, and is, for example, a button, a dial, a foot switch, a touch panel, or the like.
- the display unit 330 is a device that displays images captured by the endoscope 310, and is, for example, a liquid crystal display, but may also be hardware integrated with the operation unit 320, such as a touch panel.
- the processor unit 200 performs various processes such as control and image processing in the endoscope system 300.
- the control section 220 realizes the function of the processor unit 200 by performing mode switching, zoom operation, display switching, etc. of the endoscope system 300 based on information input from the operation section 320.
- the storage unit 210 records images captured by the endoscope 310.
- the storage unit 210 is, for example, a semiconductor memory, a hard disk drive, an optical drive, or the like.
- the processor unit 200 may further include an interface circuit that receives image data.
- the storage interface 160 is an interface for accessing the storage unit 210.
- the storage interface 160 records the image data received by the input unit 140 in the storage unit 210.
- the storage interface 160 reads the image data from the storage section 210 and sends the image data to the processing section 130.
- the processing unit 130 performs the processing described above with reference to FIG. 3 using the image data from the input unit 140 or the storage interface 160 as a processing target image. As a result, the processing unit 130 outputs the corrected image via the output unit 150, and the corrected image in focus is displayed on the display unit 330.
- the endoscope system 300 of this embodiment may have the configuration example shown in FIG. 15, for example.
- the configuration example in FIG. 15 differs from the configuration example in FIG. 14 in that the information processing system 100 and the processor unit 200 are provided separately.
- the information processing system 100 and the processor unit 200 may be connected by inter-device communication such as USB, or may be connected by network communication such as LAN or WAN.
- the information processing system 100 is configured by one or more information processing devices.
- the information processing system 100 may be a cloud system in which a plurality of PCs, a plurality of servers, etc. connected via a network perform parallel processing.
- the storage unit 170 in FIG. 15 corresponds to the storage unit 210 in FIG. 14.
- the processor unit 200 includes a control section 220, an imaging data receiving section 230, an input section 240, an output section 250, a processing section 260, and a display interface 270.
- the imaging data receiving section 230 is configured with an interface circuit similar to the input section 140 in FIG. 14, and receives imaging data from the endoscope 310.
- the processing unit 260 transmits the image data received by the imaging data receiving unit 230 to the information processing system 100 via the output unit 250.
- the information processing system 100 performs the process shown in FIG. 3 using the received image data as a processing target image to generate a corrected image.
- the input unit 240 receives the corrected image transmitted from the information processing system 100 via the output unit 150 and outputs the corrected image to the processing unit 260.
- the processing unit 260 outputs the corrected image to the display unit 330 via the display interface 270. As a result, the corrected image is displayed on the display section 330.
- the display interface 270 in FIG. 15 is configured with the same hardware as the output unit 150 in FIG. 14, and realizes the same functions as the output unit 150 in FIG. Note that in FIG. 15, the input section 140 and the output section 150 of the information processing system 100 may be configured with separate interfaces, or the functions of the input section 140 and the output section 150 may be realized with a single input/output interface. The same applies to the input section 240 and output section 250 of the processor unit 200.
- each object distance included in the optical system information 40 may be determined based on the difference in corresponding MTF.
- the learning image group 32G is assumed to be composed of the first learning image 32-1, on which step S200-1 was performed based on the transfer function or point spread function of the first object distance, and the second learning image 32-2, on which step S200-2 was performed based on the transfer function or point spread function of the second object distance. Further, it is assumed that the first object distance is an object distance with a larger amount of defocus than the second object distance. In this case, qualitatively illustrating the spatial frequency dependence of MTF, the MTF based on the second object distance is as shown in A0 in FIG.
- the MTF based on the first object distance is as shown in A1.
- a predetermined spatial frequency shown as B0 is determined, the difference in MTF is determined as shown in C0. Therefore, the first object distance and the second object distance are determined so that the difference in MTF indicated by C0 is smaller than a predetermined value.
- the difference in MTF is the difference in MTF between adjacent object distances.
- the learning image group 32G includes a first learning image 32-1, a second learning image 32-2, and a third learning image 32-3.
- the first object distance, the second object distance, and the third object distance are in descending order of defocus amount.
- A10 in FIG. 17 shows the frequency characteristic of MTF at the third object distance
- A11 shows the frequency characteristic of MTF at the second object distance
- A12 shows the frequency characteristic of MTF at the first object distance.
- the difference between the MTF of A10 and the MTF of A11 indicated by C10 and the difference between the MTF of A11 and the MTF of A12 indicated by C11 are both lower than the predetermined value.
- the difference between the MTF of A10 and the MTF of A12 is not compared against the predetermined value.
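The rule that only the MTF difference between adjacent object distances must stay below the predetermined value can be sketched as follows. This is a hedged illustration: the function names, the greedy selection strategy, the candidate distances, and their MTF values are all hypothetical, and the MTF at the predetermined spatial frequency is assumed to change monotonically along the candidate list.

```python
def adjacent_gaps_ok(mtf_values, predetermined_value):
    """True if every pair of adjacent MTF values differs by less than the threshold."""
    return all(abs(a - b) < predetermined_value
               for a, b in zip(mtf_values, mtf_values[1:]))

def select_object_distances(candidates, predetermined_value):
    """Greedily thin a list of (object_distance, mtf) candidates.

    candidates must be ordered from the most defocused distance to the least,
    with the MTF taken at the predetermined spatial frequency. The first and
    last candidates are always kept.
    """
    chosen = [candidates[0]]
    i = 0
    while i < len(candidates) - 1:
        j = i + 1
        # Move forward while the gap to the last chosen entry stays under the threshold.
        while (j + 1 < len(candidates)
               and abs(candidates[j + 1][1] - candidates[i][1]) < predetermined_value):
            j += 1
        if abs(candidates[j][1] - candidates[i][1]) >= predetermined_value:
            raise ValueError("candidate grid is too coarse for this threshold")
        chosen.append(candidates[j])
        i = j
    return chosen

# Hypothetical candidates: (object distance in mm, MTF at the chosen frequency).
candidates = [(5.0, 0.10), (6.0, 0.14), (7.0, 0.18),
              (8.0, 0.22), (9.0, 0.26), (10.0, 0.30)]
selected = select_object_distances(candidates, 0.1)
```

Only neighbouring entries are compared, so the gap between the first and last selected distances may exceed the threshold, which matches the statement that non-adjacent MTF curves are not compared against the predetermined value.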
- the trained model 120 subjected to machine learning performs a correction process (step S30) so that both the first learning image 32-1 and the second learning image 32-2 can be corrected to the correct image 36. Furthermore, in order to correct the processing target image captured at an object distance between the first object distance and the second object distance to the correct image 36 by the correction process (step S30), it is preferable that the difference in the influence of blur added to the first learning image 32-1 and the second learning image 32-2 is within a certain range.
- the object distance of each learning image is defined based on the MTF indicating the degree of influence of blur simulated on the predetermined subject image 30, so a learning image group 32G can be generated on that basis. This allows the data set to be appropriate for machine learning.
- the optical system information 40 may include the object distance under the best focus condition of the first imaging system 101.
- the object distance under the best focus condition is, for example, the distance shown in D3 in FIG.
- the learning device processing unit 16 may generate the correct image 36 by performing best focus simulation processing (step S300) on the predetermined subject image 30 using a transfer function or a point spread function at the object distance under the best focus condition. That is, in the information processing system 100 of this embodiment, the object distance that is in focus is the object distance under the best focus condition. By doing so, an appropriate correct image 36 can be generated.
- the transfer function or point spread function based on the object distance and the learning image 32 have a one-to-one correspondence. More specifically, for example, it is assumed that the defocus simulation process (step S200) does not generate, for one predetermined subject image 30, a third learning image 32-3 using both a transfer function or point spread function based on the first object distance and a transfer function or point spread function based on the second object distance. In other words, in the information processing system 100 of the present embodiment, each learning image 32 is an image generated by performing the defocus simulation processing (step S200) on a predetermined subject image 30 based on a transfer function or a point spread function at any one of a plurality of object distances. By doing so, the relationship between the learning images 32 in the learning image group 32G can be clarified.
- the MTF of an object distance shorter than the object distance at the near point of the target expanded depth of field shown in P2 in FIG. 9 may be 0 at the spatial frequency shown in B0.
- the spatial frequency lower than the lowest spatial frequency at which aliasing occurs is the spatial frequency shown in B0.
- the processing unit 130 uses the learned model 120 to correct the blur caused by the defocus of the first imaging system 101 in the processing target image, thereby estimating an image in which the depth of field of the first imaging system 101 is expanded to a target expanded depth of field that is wider than the depth of field.
- the predetermined spatial frequency is a spatial frequency lower than the lowest spatial frequency at which the MTF value at the near point of the target expanded depth of field becomes zero.
- the predetermined spatial frequency indicated by B0 is, for example, 0.1 as a normalized frequency. That is, in the information processing system 100 of this embodiment, the predetermined spatial frequency is a spatial frequency that is 1/5 of the Nyquist frequency of the image sensor of the first imaging system 101. By doing so, it is possible to establish a one-to-one correspondence between the spatial frequency and the MTF for many optical systems. Thereby, the method of this embodiment can be applied to processing target images captured by many types of optical systems.
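The step from a normalized frequency of 0.1 to 1/5 of the Nyquist frequency can be written out as below, assuming that the normalized frequency is the spatial frequency divided by the sampling frequency of the image sensor, denoted here by the introduced symbol f_s (one sample per pixel pitch). Under that assumption the Nyquist frequency corresponds to a normalized frequency of 0.5.

```latex
\[
f_{\mathrm{norm}} = \frac{f}{f_s}, \qquad
f_{\mathrm{Nyq}} = \frac{f_s}{2}
\;\Longrightarrow\;
\frac{f}{f_{\mathrm{Nyq}}} = \frac{f_{\mathrm{norm}}}{0.5} = \frac{0.1}{0.5} = \frac{1}{5}
\]
```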
- the optical system information 40 of this embodiment may be a combination of an object distance within the depth of field and an object distance outside the depth of field.
- the optical system information 40 may include a first object distance outside the depth of field shown in D1 in FIG. 9 and a second object distance shown in D2.
- the first object distance among the plurality of object distances is an object distance outside the depth of field, and the second object distance among the plurality of object distances is an object distance within the depth of field.
- the first learning image 32-1, in which a large influence of blur is simulated by the defocus simulation process (step S200), and the second learning image 32-2, in which a small influence of blur is simulated, can be combined with the correct image 36 to form a data set.
- the trained model 120 that has been machine-trained using these data sets can correct the processing target image that has been affected by blur over a wide range by the correction process (step S30).
- the predetermined value may be determined based on the number of learning images 32 that constitute the learning image group 32G.
- the MTF indicated by A0 is the MTF at the object distance corresponding to the best focus condition
- the MTF indicated by A1 is the MTF at the object distance corresponding to the near point of the target depth of field.
- the spatial frequency is determined to be the spatial frequency indicated by B0
- the MTF range having the maximum range indicated by C0 is uniquely determined.
- a value obtained by dividing the range shown by C0 based on the desired number of learning images 32 is determined as a predetermined value. From the above, in the information processing system 100 of this embodiment, the predetermined value is determined based on the number of object distances that can be set to two or more. By doing this, the number of data sets required for machine learning can be determined by considering the load of machine learning.
- a predetermined value may be determined in advance and the number of learning images 32 may be determined based on the predetermined value.
- the machine learning policy can be determined depending on the circumstances.
- the predetermined value is preferably 0.2 or less. That is, in the information processing system 100 of this embodiment, the predetermined value is set to be 0.2 or less.
- the possible range of MTF is considered to be about 0.2. Therefore, for example, if the predetermined value is set to 0.2, the number of learning images 32 forming the learning image group 32G will be two.
- the first object distance is considered to be an object distance outside the depth of field, and the second object distance is considered to be an object distance within the depth of field.
- it is more desirable that the predetermined value be 0.1 or less. That is, in the information processing system 100 of this embodiment, the predetermined value is set to be 0.1 or less. Furthermore, it is desirable that the predetermined value be 0.05 or less. That is, in the information processing system 100 of this embodiment, the predetermined value is set to be 0.05 or less.
- the number of learning images 32 forming the learning image group 32G can be further increased.
- even when the trained model 120 receives a processing target image captured at an object distance that was not used for machine learning, it is highly likely that the trained model 120 can output a corrected image from which the effects of blur are appropriately removed. In other words, the accuracy of the correction process (step S30) by the learned model 120 can be further improved. Note that as the number of learning images 32 forming the learning image group 32G increases, the processing load of machine learning increases. Therefore, the number of learning images 32 constituting the learning image group 32G is determined as appropriate depending on the circumstances.
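The relationship between the MTF range (C0), the predetermined value, and the number of learning images can be sketched as follows. This is a minimal illustration, not the method of the embodiment: the function name `num_learning_images` and the counting rule (adjacent MTF values at B0 are at most the predetermined value apart, and both ends of the range are included) are assumptions. The rule was chosen because it reproduces the stated case in which a range of about 0.2 and a predetermined value of 0.2 give two learning images.

```python
import math


def num_learning_images(mtf_range: float, predetermined_value: float) -> int:
    """Count object distances whose MTF values at B0 span `mtf_range`
    with adjacent values at most `predetermined_value` apart (assumed rule)."""
    if mtf_range <= 0 or predetermined_value <= 0:
        raise ValueError("both arguments must be positive")
    # Number of intervals needed to cover the range; the small tolerance
    # guards against floating-point noise in the division.
    intervals = math.ceil(mtf_range / predetermined_value - 1e-9)
    # Both end points of the range are used, hence one more image than intervals.
    return intervals + 1


for value in (0.2, 0.1, 0.05):
    print(value, num_learning_images(0.2, value))
```

Under this assumed rule, a smaller predetermined value yields more learning images, which matches the trade-off between correction accuracy and machine learning load described above.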
- step S200 defocus simulation processing
- step S200 a point spread function
- the learning device processing unit 16 uses the PSF at the first object distance of the first imaging system 101 to perform convolution calculation processing on the predetermined subject image 30.
- convolution can also be called convolution integral.
- the PSF of the first object distance is a PSF consisting of the area shown in FC22-1 in FIG. 12.
- the PSF convolution calculation process corresponds to step S210 in FIG. 13.
- the learning device processing unit 16 uses the PSF at the Nth object distance of the first imaging system 101 to perform convolution calculation processing on the predetermined subject image 30.
- the defocus simulation process based on the PSF convolution calculation process can be called step S200-A.
- the defocus simulation process is a process of convolving the predetermined subject image 30 with the PSF at each object distance of the first imaging system 101. By doing so, it is possible to generate a trained model 120 that has been subjected to machine learning using the data set of the learning image 32 and the correct image 36 obtained with the PSF.
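The PSF-based defocus simulation (step S200-A) can be sketched as below. This is only an illustration under stated assumptions: the real PSFs of the first imaging system 101 are not given in the text, so a normalized Gaussian stands in for them, and the names `gaussian_psf`, `convolve_psf`, and the sigma values are hypothetical.

```python
import numpy as np


def gaussian_psf(size: int, sigma: float) -> np.ndarray:
    """Stand-in PSF (assumption): normalized 2-D Gaussian; larger sigma means more defocus."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()


def convolve_psf(image: np.ndarray, psf: np.ndarray) -> np.ndarray:
    """Linear 2-D convolution computed via FFT, cropped back to the input size."""
    h, w = image.shape
    kh, kw = psf.shape
    shape = (h + kh - 1, w + kw - 1)  # full linear-convolution size
    full = np.fft.irfft2(np.fft.rfft2(image, shape) * np.fft.rfft2(psf, shape), shape)
    top, left = kh // 2, kw // 2  # centre the odd-sized kernel
    return full[top:top + h, left:left + w]


# Point-like test subject standing in for the predetermined subject image 30.
subject = np.zeros((33, 33))
subject[16, 16] = 1.0

# Hypothetical blur widths for the first (strong blur) and second (weak blur) object distances.
sigmas = [3.0, 1.0]
learning_images = [convolve_psf(subject, gaussian_psf(15, s)) for s in sigmas]
```

Each entry of `learning_images` would then be paired with the correct image 36 to form one data set.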
- a specific method in which the learning device processing unit 16 performs defocus simulation processing (step S200) using a transfer function will be described.
- a process of multiplying the OTF of the first object distance of the first imaging system 101 and a process of inverse Fourier transforming the frequency characteristics subjected to the multiplication are performed.
- the OTF of the first object distance here is an OTF consisting of the area shown in FC22-1 in FIG. 12.
- the OTF multiplication corresponds to step S210 in FIG. 13.
- when generating the Nth learning image 32-N in step S200-N, the learning device processing unit 16 performs a Fourier transform process on the predetermined subject image 30, a process of multiplying the frequency characteristic that is the result of the Fourier transform by the OTF at the Nth object distance of the first imaging system 101, and a process of inverse Fourier transforming the multiplied frequency characteristic. Note that the defocus simulation process based on OTF multiplication can be called step S200-B.
- the defocus simulation process is a process of performing a Fourier transform on the predetermined subject image 30, multiplying the frequency characteristics of the predetermined subject image 30, which are the result of the Fourier transform, by the OTF at each object distance of the first imaging system 101, and inverse Fourier transforming the multiplied frequency characteristics. By doing so, it is possible to generate a trained model 120 that has been subjected to machine learning using the data set of the learning image 32 and the correct image 36 obtained with the OTF.
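The OTF-based variant (step S200-B) can be sketched in the same spirit. The OTF of the actual optical system is not available here, so it is derived from a hypothetical Gaussian PSF. The helper names are assumptions, and the FFT-based product implies circular (wrap-around) boundary handling, which a real pipeline might treat differently.

```python
import numpy as np


def gaussian_psf(size: int, sigma: float) -> np.ndarray:
    """Stand-in PSF (assumption), normalized to unit sum."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    psf = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return psf / psf.sum()


def otf_from_psf(psf: np.ndarray, shape: tuple) -> np.ndarray:
    """OTF on the image grid: FFT of the PSF, zero-padded and centred on pixel (0, 0)."""
    kh, kw = psf.shape
    padded = np.zeros(shape)
    padded[:kh, :kw] = psf
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def apply_otf(image: np.ndarray, otf: np.ndarray) -> np.ndarray:
    """Fourier transform, multiply by the OTF, inverse Fourier transform."""
    spectrum = np.fft.fft2(image)
    return np.real(np.fft.ifft2(spectrum * otf))


subject = np.zeros((32, 32))
subject[16, 16] = 1.0  # point-like stand-in for the predetermined subject image 30
otf_nth = otf_from_psf(gaussian_psf(9, 1.5), subject.shape)  # hypothetical Nth object distance
learning_image = apply_otf(subject, otf_nth)
```

Because multiplication in the frequency domain corresponds to convolution in the image domain, this yields the same kind of blurred learning image as the PSF route, apart from boundary handling.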
- in step S200, the user may appropriately select whether to use the PSF or the OTF.
- the learning device processing unit 16 may perform best focus simulation processing (step S300) using a point spread function. For example, as shown in FIG. 20, the learning device processing unit 16 uses the PSF at the object distance at which the first imaging system 101 is in focus to perform convolution calculation processing on the predetermined subject image 30, thereby generating a correct image 36. Note that the best focus simulation process based on the PSF convolution calculation process can also be called step S300-A.
- the learning device processing unit 16 may perform best focus simulation processing (step S300) using a transfer function. For example, as shown in FIG. 21, the learning device processing unit 16 generates a correct image 36 by performing a Fourier transform process on a predetermined subject image 30, a process of multiplying the frequency characteristics that are the result of the Fourier transform by the OTF at the object distance at which the first imaging system 101 is in focus, and a process of inverse Fourier transforming the multiplied frequency characteristics. Note that the best focus simulation process based on OTF multiplication can also be called step S300-B.
- the first imaging system 101 of this embodiment may have a retrofocus type lens configuration.
- the retrofocus type is also called the reverse telephoto type.
- a retrofocus type lens configuration can be realized by arranging, in order from the subject side, a lens with negative refractive power and a lens with positive refractive power.
- the lens group on the object side will be referred to as the front lens group
- the lens group on the image side will be referred to as the rear lens group.
- the optical system shown in FIG. 22 includes, in order from the subject side, a front lens group shown at G1, an aperture stop shown at S1, a rear lens group shown at G2, and a cover glass shown at CG1.
- the intervals between the lenses, etc. that constitute the optical system are not shown accurately.
- the positive lens shown at L6 and the cover glass shown at CG1 are actually joined together, but are shown spaced apart for convenience. The same applies to FIGS. 23 and 25, which will be described later.
- the front lens group indicated by G1 includes an object-side negative lens indicated by L1 and a positive lens indicated by L2, and has negative refractive power as a whole.
- the rear lens group indicated by G2 includes a positive lens indicated by L3, a lens obtained by cementing a positive lens indicated by L4 and a negative lens indicated by L5, and a positive lens indicated by L6, and has positive refractive power as a whole.
- the front lens group or the rear lens group may be composed of a plurality of lens groups.
- a lens group indicated by G11 functions as a front lens group
- a lens group indicated by G12 and a lens group indicated by G13 function as a rear lens group.
- the lens group indicated by G11 includes, in order from the subject side, a plano-concave lens with a concave surface facing the image side as indicated by L11, and a negative meniscus lens as indicated by L12, and has negative refractive power as a whole.
- the lens group indicated by G12 includes a subject-side positive lens indicated by L13 and an image-side positive lens indicated by L14.
- an aperture stop shown in S11 may be further arranged between the lens shown in L13 and the lens shown in L14.
- the lens group shown in G13 has positive refractive power as a whole.
- the lens group indicated by G13 may include a cemented lens composed of a positive lens indicated by L15 and a negative lens indicated by L16. Thereby, spherical aberration and comatic aberration can be favorably corrected.
- the lens group indicated by G13 may further include a plano-convex lens indicated by L17. This makes it possible to secure a wide field of view.
- although the plano-convex lens shown at L17 and the cover glass shown at CG11 are shown separated from each other in FIG. 23, they are actually joined together.
- a cover glass shown at CG11 is provided on an image sensor (not shown), and a plano-convex lens shown at L17 is used for positioning the image sensor.
- the first imaging system 101 may further include a parallel plate.
- Parallel plates are also called filters.
- although the parallel plates are arranged, for example, at the position F1 in FIG. 22 and the position F11 in FIG. 23, they can also be arranged at other positions.
- the parallel plate is used, for example, for the purpose of adjusting the position of the image point.
- the amount of distortion at the maximum angle of view is -30% or less.
- the value of the amount of distortion (%) at the maximum angle of view can be expressed as (AD-PD)/PD × 100, using the length PD of the subject shown in E1 and the length AD of the image shown in E2. It is desirable that the value be more negative than -30.
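As a small worked example of the distortion expression above, the following sketch evaluates (AD-PD)/PD × 100 and checks it against the -30% condition. The function names and the sample lengths are hypothetical, chosen only for illustration.

```python
def distortion_percent(ad: float, pd: float) -> float:
    """Distortion (%) from the imaged length AD and the reference subject length PD."""
    if pd == 0:
        raise ValueError("PD must be non-zero")
    return (ad - pd) * 100.0 / pd


def satisfies_condition(ad: float, pd: float, limit: float = -30.0) -> bool:
    """True when the distortion at the maximum angle of view is `limit` percent or less."""
    return distortion_percent(ad, pd) <= limit


# Hypothetical lengths: the image is 6.5 units long where an undistorted image would be 10.
print(distortion_percent(6.5, 10.0), satisfies_condition(6.5, 10.0))
```

A negative value means the periphery is imaged shorter than the undistorted length, which is barrel distortion.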
- the first imaging system 101 has a retrofocus type lens configuration, and the amount of distortion at the maximum angle of view is -30% or less.
- the magnification at the periphery becomes smaller than that at the center of the image, so the transfer function or point spread function in areas other than on the optical axis can be made smaller.
- the front lens group or the rear lens group may be composed of a single lens.
- the first imaging system 101 shown in FIG. 25 includes a lens group shown in G21, a lens group shown in G22, an aperture stop shown in S21, a lens group shown in G23, and a cover glass shown in CG21.
- the lens group indicated by G21 includes a single negative lens indicated by L21, and has negative refractive power.
- the lens group indicated by G21 functions as part of the front lens group.
- the lens group indicated by G23 includes a positive lens indicated by L23, a lens obtained by cementing a positive lens indicated by L24 and a negative lens indicated by L25, and a positive lens indicated by L26, and has a positive refractive power as a whole. That is, the lens group shown in G23 functions as a rear lens group.
- the first imaging system 101 of this embodiment may further include a phase modulation element.
- the second lens group G2 in FIG. 25 includes a positive lens indicated by L22, an aperture stop indicated by S21, and a phase modulation element indicated by PM.
- a phase modulation element shown as PM is arranged at the pupil position of the first imaging system 101.
- the phase modulation element shown in PM is an element to which wavefront coding (WFC) is applied, and has, for example, a phase modulation surface shown in PMS.
- the phase modulation surface shown by PMS is represented by a predetermined cubic function using coordinates perpendicular to the optical axis, but the surface shape of the phase modulation surface is not limited to this, and other surface shapes may be adopted. Further, although the phase modulation surface is shown on the image side in FIG. 25, the same effect can be obtained even if it is provided on the subject side. Further, the lens group indicated by G22 has positive refractive power as a whole, and also functions as part of a retrofocus type front lens group.
- the MTF of the first imaging system 101 changes less with respect to defocus by including the phase modulation element shown in PM.
- the MTF of the first imaging system 101 is made nearly constant with respect to changes in object distance. More specifically, for example, the difference between the MTF at the first object distance and the MTF at the second object distance in the first imaging system 101 including the phase modulation element is smaller than the corresponding difference in the first imaging system 101 not including the phase modulation element.
- A20 is the MTF of the first imaging system 101 at an in-focus object distance
- A21 is the MTF at an object distance where the amount of defocus is larger than the object distance related to A20.
- A22 is the MTF of an object distance with a larger defocus amount than the object distance related to A21.
- A20 to A22 are MTFs of the first imaging system 101 that does not include a phase modulation element.
- the MTF shown in A20 changes to the MTF shown in A30
- the MTF shown in A21 changes to the MTF shown in A31
- the MTF shown in A22 changes to the MTF shown in A32.
- the difference in MTF shown at C20 becomes small as shown in C30
- the difference in MTF shown at C21 becomes small as shown in C31.
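The reduced sensitivity of the MTF to defocus can be illustrated with a simplified one-dimensional pupil model. This is a textbook-style sketch, not the optical design of the embodiment: the 1-D pupil, the cubic phase strength `alpha`, the defocus phase `psi`, and the evaluated frequency index `lag` are all assumed values.

```python
import numpy as np


def mtf_1d(alpha: float, psi: float, n: int = 256) -> np.ndarray:
    """MTF of a 1-D pupil with cubic phase alpha*x^3 and defocus phase psi*x^2 (radians)."""
    x = np.linspace(-1.0, 1.0, n)  # normalized pupil coordinate
    pupil = np.exp(1j * (alpha * x**3 + psi * x**2))
    # The OTF is the autocorrelation of the pupil; zero-padding avoids wrap-around.
    spectrum = np.fft.fft(pupil, 4 * n)
    otf = np.fft.ifft(np.abs(spectrum) ** 2)
    return np.abs(otf[:n]) / np.abs(otf[0])


def mtf_change(alpha: float, psi: float, lag: int = 64) -> float:
    """Absolute MTF difference between in-focus and defocused states at one frequency."""
    return abs(mtf_1d(alpha, 0.0)[lag] - mtf_1d(alpha, psi)[lag])


without_mask = mtf_change(alpha=0.0, psi=10.0)   # analogue of the difference shown at C20/C21
with_mask = mtf_change(alpha=30.0, psi=10.0)     # analogue of the difference shown at C30/C31
print(without_mask, with_mask)
```

In this model the MTF change caused by defocus is much smaller with the cubic phase term than without it, at the cost of a lower in-focus MTF that the later correction process is expected to restore.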
- the first imaging system 101 further includes an optical wavefront modulation element that changes the transfer function or point spread function.
- the predetermined subject image 30 is generated based on the optical information of the first imaging system 101 for the predetermined subject image 30 captured by an arbitrary imaging system 104.
- the method of this embodiment is not limited to these.
- the learning device processing unit 16 may perform defocus simulation processing to further include processing that simulates removal of the influence of imaging by an arbitrary imaging system 104 from the predetermined subject image 30.
- FIG. 27 shows an example of image data generation processing for a predetermined subject image 30-1 captured by the first imaging system 101, in a case where the defocus simulation process further includes a process of simulating removal of the influence of imaging by the first imaging system 101. Note that the image data generation process shown in FIG. 27 can also be called step S122. Comparing step S122 in FIG. 27 and step S120-2 in FIG. 11, the content of the defocus simulation processing is different. Note that FIG. 27 is similar to FIG. 11 in that the best focus simulation process (step S300) is not performed and the correct image 36 is the predetermined subject image 30-1 itself. This is because the predetermined subject image 30-1 is an image captured under the best focus condition of the first imaging system 101, and there is no need to perform the same process as step S202 in the first place.
- FIG. 28 shows an example of the defocus simulation process (step S202-1) in the image data generation process (step S122).
- the learning device processing unit 16 performs, on the predetermined subject image 30-1, a process (step S220-1) that simulates removing the influence of the first imaging system 101 at the time the predetermined subject image 30-1 was captured and then applying the influence of the first imaging system 101 at the first object distance. Step S220-1 is performed based on the transfer function or point spread function at the focused object distance of the first imaging system 101 and the transfer function or point spread function at the first object distance of the first imaging system 101.
- the learning device processing unit 16 performs arithmetic processing that appropriately combines a calculation process of deconvolving, from the predetermined subject image 30, the PSF at the object distance at which the first imaging system 101 is in focus, and the calculation process of convolving the PSF at the first object distance (step S200-A).
- arithmetic processing that is appropriately combined means arithmetic processing that combines part or all of one arithmetic processing and the other arithmetic processing in any order. This does not preclude performing the one arithmetic processing and the other arithmetic processing separately, and the choice is made as appropriate depending on the given circumstances. The same applies to the following explanation.
- the given circumstances include, for example, the processing time required for machine learning, the processing load on the processor, and the like. That is, by performing step S220-1, for example, it is possible to obtain an arithmetic processing result that reflects both the effect of the calculation process of deconvolving, from the predetermined subject image 30-1, the PSF at the object distance at which the first imaging system 101 is in focus, and the effect of the calculation process of convolving the PSF at the first object distance (step S200-A).
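One way to combine the deconvolution and convolution into a single operation is to work in the frequency domain, as sketched below. This is an assumption-laden illustration rather than the embodiment's procedure: Gaussian OTFs stand in for the real transfer functions, the regularization constant `eps` is introduced here to keep the inverse filter stable, and the names `gaussian_otf` and `swap_blur` are hypothetical.

```python
import numpy as np


def gaussian_otf(shape: tuple, sigma: float) -> np.ndarray:
    """Stand-in OTF (assumption): Fourier transform of a Gaussian PSF of width sigma pixels."""
    fy = np.fft.fftfreq(shape[0])[:, None]
    fx = np.fft.fftfreq(shape[1])[None, :]
    return np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx**2 + fy**2))


def swap_blur(image, otf_focus, otf_target, eps=1e-3):
    """Remove the in-focus blur (regularized inverse) and apply the target blur in one pass."""
    inverse = np.conj(otf_focus) / (np.abs(otf_focus) ** 2 + eps)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * inverse * otf_target))


rng = np.random.default_rng(0)
scene = rng.random((64, 64))                      # hypothetical underlying subject
otf_focus = gaussian_otf(scene.shape, 0.8)        # in-focus object distance
otf_first = gaussian_otf(scene.shape, 2.0)        # first object distance (defocused)

# Stand-in for the predetermined subject image 30-1 captured at best focus.
captured = np.real(np.fft.ifft2(np.fft.fft2(scene) * otf_focus))
learning_image = swap_blur(captured, otf_focus, otf_first)

# Reference: blur the underlying scene directly with the first-object-distance OTF.
reference = np.real(np.fft.ifft2(np.fft.fft2(scene) * otf_first))
```

Combining both filters before the inverse transform avoids amplifying noise in an intermediate deconvolved image, which is one reason to merge the two calculations rather than run them separately.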
- the arbitrary imaging system 104 is the first imaging system 101.
- the defocus simulation process (step S202) further includes a process (step S212) of removing the influence of the first imaging system 101 from the predetermined subject image 30-1, based on the transfer function or point spread function at the focused object distance of the first imaging system 101 and the transfer function or point spread function at a plurality of object distances of the first imaging system 101. By doing so, a more accurate learning image 32 can be generated.
- the learning image 32 and the correct image 36 obtained by the method shown in FIGS. 10 and 11 have both the influence of the arbitrary imaging system 104 and the influence of the first imaging system 101 on the predetermined subject, whereas the learning image 32 and the correct image 36 obtained by the method shown in FIGS. 27 and 28 have only the influence of the first imaging system 101 on the predetermined subject. This allows machine learning to be performed using a more appropriate data set.
- FIG. 29 shows an example of image data generation processing that includes processing that simulates removal of the influence of imaging by any imaging system 104.
- the second imaging system 102 is illustrated as a representative of any imaging system 104.
- the second imaging system 102 is an imaging system whose image sensor has a higher resolution than that of the first imaging system 101.
- the image data generation process shown in FIG. 29 can also be called step S124, and the image that is the source of step S124 can also be called the predetermined subject image 30-2.
- step S126 in FIG. 29 differs in some respects: the image sensor information 50 is additionally read, and then the defocus simulation processing (step S204) and the best focus simulation processing (step S304) are performed.
- the image sensor information 50 is information related to the resolution of the image sensors included in the first imaging system 101 and the arbitrary imaging system 104. That is, in the case of the example shown in FIG. 29, the learning device storage unit 18 further stores image sensor information 50, which is not shown in FIG. Note that the image sensor information 50 is also used for calculation processing of defocus simulation processing (step S204) and best focus simulation processing (step S304).
- FIG. 30 shows an example of the defocus simulation process in the image data generation process (step S124) shown in FIG. 29.
- the defocus simulation processing shown in FIGS. 29 and 30 can also be called step S204.
- when generating the first learning image 32-1, the learning device processing unit 16 performs arithmetic processing that appropriately combines a process of simulating the difference between the second imaging system 102 and the first imaging system 101 (step S230-1), the process of reducing the predetermined subject image 30 (step S240), and the arithmetic process based on the image sensor information 50 (not shown in FIG. 30).
- Step S230-1 is performed based on the transfer function or point spread function at the focused object distance of the second imaging system 102 and the transfer function or point spread function at the first object distance of the first imaging system 101.
- by performing step S230-1, for example, it is possible to obtain an arithmetic processing result that reflects both the effect of the calculation process of deconvolving, from the predetermined subject image 30-2, the PSF at the object distance at which the second imaging system 102 is in focus, and the effect of the calculation process of convolving the PSF at the first object distance (step S200-A). Furthermore, by performing step S204-1, it is possible to obtain an arithmetic processing result that reflects the effect of the arithmetic processing of step S230-1, the effect of the arithmetic processing of step S240, and the effect of the arithmetic processing based on the image sensor information 50.
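The reduction step (step S240) can be sketched as a block average whose factor comes from the resolution ratio of the two image sensors. The text does not specify the reduction method or the contents of the image sensor information 50, so the averaging approach, the function name `reduce_image`, and the pixel counts below are assumptions.

```python
import numpy as np


def reduce_image(image: np.ndarray, factor: int) -> np.ndarray:
    """Shrink an image by an integer factor by averaging factor x factor blocks."""
    h, w = image.shape[:2]
    h2, w2 = h - h % factor, w - w % factor  # drop edge pixels that do not fill a block
    cropped = image[:h2, :w2]
    blocks = cropped.reshape(h2 // factor, factor, w2 // factor, factor, *image.shape[2:])
    return blocks.mean(axis=(1, 3))


# Hypothetical image sensor information: horizontal pixel counts of the two sensors.
pixels_second_system = 3840
pixels_first_system = 1920
factor = pixels_second_system // pixels_first_system

small = reduce_image(np.arange(16, dtype=float).reshape(4, 4), factor)
print(small)
```

Block averaging also acts as a mild low-pass filter, so a real pipeline would need to account for it when combining this step with the PSF or OTF processing.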
- FIG. 31 shows an example of the best focus simulation process shown in FIG. 29.
- the best focus simulation processing shown in FIGS. 29 and 31 can also be called step S304.
- the learning device processing unit 16 performs arithmetic processing that appropriately combines a process of simulating the difference between the second imaging system 102 and the first imaging system 101 for the predetermined subject image 30-2 (step S330), a process of reducing the predetermined subject image 30-2 (step S340), and the arithmetic process based on the image sensor information 50 (not shown in FIG. 31). Thereby, the learning device processing section 16 can generate the correct image 36.
- by performing step S330, for example, it is possible to obtain an arithmetic processing result that reflects both the effect of the calculation process of deconvolving, from the predetermined subject image 30-2, the PSF at the object distance at which the second imaging system 102 is in focus, and the effect of the calculation process of convolving the PSF at the object distance at which the first imaging system 101 is in focus (step S300-A). Further, step S340 in FIG. 31 is the same calculation process as step S240 in FIG. 30.
- by performing step S304, it is possible to obtain a calculation result that reflects the effect of the calculation process in step S330, the effect of the calculation process in step S340, and the effect of the calculation process based on the image sensor information 50.
- the correct image 36 may be generated by a process in which step S330 is omitted from the best focus simulation process (step S304) in FIG. 31.
- the correct image 36 may be generated by performing processing corresponding to step S340 on the predetermined subject image 30-2. This is because, if the predetermined subject image 30-2 is an image captured at an object distance at which the arbitrary imaging system 104 is in focus, the correct image 36 can in some cases be obtained simply by changing the number of pixels of the predetermined subject image 30-2 in step S340.
- the defocus simulation process further includes a process of simulating the difference between the arbitrary imaging system 104 and the first imaging system 101 (step S230) and a process of reducing the predetermined subject image 30-2 (step S240).
- the correct image 36 is an image generated by performing the best focus simulation process (step S304) or an image generated by performing a process of reducing the predetermined subject image 30-2.
- the processing (step S230) of simulating the difference between the arbitrary imaging system 104 and the first imaging system 101 in the defocus simulation processing (step S204) is performed using a transfer function or a point spread function at the object distance at which the arbitrary imaging system 104 is in focus.
- the best focus simulation process (step S304) further includes a process of simulating the difference between the arbitrary imaging system 104 and the first imaging system 101 (step S330) and a process of reducing the predetermined subject image 30-2 (step S340).
- the process (step S330) of simulating the difference between the arbitrary imaging system 104 and the first imaging system 101 in the best focus simulation process (step S304) is based on the transfer function or point spread function at the object distance at which the arbitrary imaging system 104 is in focus and the transfer function or point spread function at the object distance at which the first imaging system 101 is in focus.
- the method of this embodiment can be applied to a case where the imaging systems of the arbitrary imaging system 104 and the first imaging system 101 are different.
- the first imaging system 101 includes a simultaneous imaging device 106.
- any imaging system 104 includes a monochrome image sensor 108.
- a method of image data generation processing in this case will be explained using FIG. 33. Note that the image data generation process in FIG. 33 can also be called step S126, and the image that is the source of step S126 can also be called the predetermined subject image 30-3.
- as shown in FIG. 32, the first imaging system 101 includes a simultaneous imaging device 106.
- the image data generation process of FIG. 33 differs in the contents of the defocus simulation process (step S206) and the best focus simulation process (step S306), and in that the color shift determination process (step S190) is performed before step S206 and step S306.
- the second imaging system 102 is exemplified as a representative of an arbitrary imaging system 104, as in the example of FIG. 29.
- the color shift determination process (S190) is a process of comparing the amount of coloring, such as around the saturated portion of the predetermined subject image 30-3, with a predetermined threshold value. Note that color shift is a shift that occurs between an R image, a G image, and a B image due to differences in imaging timing, etc.
- steps S206 and S306 in FIG. 33 use the predetermined subject image 30-3 in which the amount of coloring around the saturated portion is determined to be less than or equal to a predetermined threshold in step S190.
- FIG. 34 shows an example of the defocus simulation process in the image data generation process (step S126) shown in FIG. 33.
- the defocus simulation processing shown in FIGS. 33 and 34 can also be called step S206.
- FIG. 34 differs from FIG. 30 in that it further includes a process of generating a mosaic image from the predetermined subject image 30-3 (step S250) and a process of demosaicing the mosaic image (step S252).
- when generating the first learning image 32-1, the learning device processing unit 16 performs, on the predetermined subject image 30-3, arithmetic processing (step S206-1) that appropriately combines the above-mentioned step S230-1, the above-mentioned step S240, step S250, step S252, and arithmetic processing based on the image sensor information 50 (not shown in FIG. 34). That is, by performing step S206-1, it is possible to obtain an arithmetic processing result that reflects the effect of the calculation process in step S230-1, the effect of the calculation process in step S240, the effect of the calculation process in step S250, the effect of the calculation process in step S252, and the effect of the arithmetic processing based on the image sensor information 50.
- Steps S250 and S252 will be specifically explained.
- the predetermined subject image 30-3 is a frame-sequential image obtained by combining a plurality of images captured by the monochrome image sensor 108 at the timings at which light of each wavelength band is irradiated while light of a plurality of wavelength bands is sequentially irradiated.
- a mosaic image is generated by the process including step S250.
- by the processing including step S252, a frame sequential image is generated again from the mosaic image, thereby generating the first learning image 32-1. Note that in step S206-1 of FIG. 35, illustration of processes other than step S250 and step S252 is omitted.
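The mosaic generation (step S250) and demosaicing (step S252) can be sketched as follows. The color filter layout and the demosaicing algorithm are not specified in the text, so this example assumes an RGGB Bayer layout and a simple interpolation that averages the known samples of each color in a 3x3 neighborhood. All function names are hypothetical.

```python
import numpy as np


def bayer_masks(h: int, w: int) -> np.ndarray:
    """Boolean (h, w, 3) masks for an RGGB layout (assumed; the layout is not given)."""
    rows, cols = np.indices((h, w))
    r = (rows % 2 == 0) & (cols % 2 == 0)
    b = (rows % 2 == 1) & (cols % 2 == 1)
    g = ~(r | b)
    return np.stack([r, g, b], axis=-1)


def make_mosaic(rgb: np.ndarray) -> np.ndarray:
    """Keep one color sample per pixel (analogue of step S250)."""
    masks = bayer_masks(*rgb.shape[:2])
    return (rgb * masks).sum(axis=-1)


def box3(a: np.ndarray) -> np.ndarray:
    """Sum over each 3x3 neighborhood with zero padding."""
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))


def demosaic(mosaic: np.ndarray) -> np.ndarray:
    """Estimate all three colors per pixel from nearby known samples (analogue of step S252)."""
    masks = bayer_masks(*mosaic.shape)
    out = np.empty(mosaic.shape + (3,))
    for c in range(3):
        m = masks[..., c].astype(float)
        # Average of the known samples of color c inside each 3x3 window.
        out[..., c] = box3(mosaic * m) / box3(m)
    return out


frame_sequential = np.ones((6, 6, 3)) * np.array([0.2, 0.5, 0.8])  # flat test image
restored = demosaic(make_mosaic(frame_sequential))
```

Passing the frame-sequential image through this round trip imprints the sampling and interpolation characteristics of a simultaneous-type sensor on the learning image, which is the purpose of steps S250 and S252.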
- FIG. 36 shows an example of the best focus simulation process in the image data generation process (step S126) shown in FIG. 33.
- the best focus simulation processing shown in FIGS. 33 and 36 can also be called step S306.
- FIG. 36 differs from FIG. 31 in that it further includes a process of generating a mosaic image from the predetermined subject image 30-3 (step S350) and a process of demosaicing the mosaic image (step S352). Further, step S350 in FIG. 36 is the same process as step S250 in FIG. 34, and step S352 in FIG. 36 is the same process as step S252 in FIG.
- the learning device processing unit 16 performs arithmetic processing that appropriately combines the above-mentioned step S330-1, the above-mentioned step S340, step S350, step S352, and arithmetic processing based on the image sensor information 50 (not shown in FIG. 36). Thereby, the learning device processing section 16 can generate the correct image 36. That is, by performing step S306, it is possible to obtain a calculation result that reflects the effect of the arithmetic processing of step S330, the effect of the arithmetic processing of step S340, the effect of the arithmetic processing of step S350, the effect of the arithmetic processing of step S352, and the effect of the calculation process based on the image sensor information 50.
- the correct image 36 may be generated by a process in which steps S330, S350, and S352 are omitted from the best focus simulation process (step S306).
- the correct image 36 may be generated by performing processing equivalent to step S340 on the predetermined subject image 30-3.
- any imaging system 104 includes the monochrome imaging element 108.
- the predetermined subject image 30-3 is a frame-sequential image obtained by combining a plurality of images captured by the monochrome image sensor 108 at the timings at which light of each wavelength band is irradiated while light of a plurality of wavelength bands is sequentially irradiated.
- the first imaging system 101 includes a simultaneous imaging element 106 that has a plurality of pixels having different colors and one color is assigned to each pixel.
- the defocus simulation process includes a process of generating a mosaic image in which each pixel is assigned one color from the predetermined subject image 30-3, a process of demosaicing the mosaic image, and a process of defocusing the mosaic image using the arbitrary imaging system 104. and the first imaging system 101; and processing to reduce the predetermined subject image 30-3.
- the process of simulating the difference between the arbitrary imaging system 104 and the first imaging system 101 in the defocus simulation process (step S206) is based on the transfer function or point spread function at the object distance at which the arbitrary imaging system 104 is in focus. , based on the transfer function or point spread function of the first imaging system 101 at a plurality of object distances.
- the correct image 36 is an image generated by performing the best focus simulation process (step S306) or an image generated by performing a process of reducing the predetermined subject image 30-3.
- the best focus simulation process (step S306) includes a process of generating a mosaic image, a process of demosaicing the mosaic image, a process of simulating the difference between the arbitrary imaging system 104 and the first imaging system 101, and a process of reducing the predetermined subject image 30-3.
- the process of simulating the difference between the arbitrary imaging system 104 and the first imaging system 101 in the best focus simulation process is based on the transfer function or point spread function of the arbitrary imaging system 104 at the object distance at which it is in focus, and on the transfer function or point spread function of the first imaging system 101 at the object distance at which it is in focus.
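The role of the point spread function (PSF) at different object distances can be sketched in one dimension. This is only an illustration under strong assumptions: the PSF of the first imaging system 101 is modelled as a Gaussian whose width depends on object distance, the arbitrary imaging system 104 is treated as ideal, and the distances and widths in `PSF_SIGMA_BY_DISTANCE_MM` are made-up values. None of these choices comes from the text.

```python
import math


def gaussian_psf(sigma, radius=3):
    """Return a normalised 1-D Gaussian kernel (hypothetical PSF model)."""
    taps = [math.exp(-(k * k) / (2.0 * sigma * sigma))
            for k in range(-radius, radius + 1)]
    total = sum(taps)
    return [t / total for t in taps]


def convolve_same(signal, kernel):
    """Convolve with edge replication and return a same-length result."""
    radius = len(kernel) // 2
    last = len(signal) - 1
    output = []
    for i in range(len(signal)):
        acc = 0.0
        for k, weight in enumerate(kernel):
            j = min(max(i + k - radius, 0), last)
            acc += weight * signal[j]
        output.append(acc)
    return output


# Hypothetical PSF widths (in pixels) of the first imaging system per object
# distance (in mm). Illustrative values only, not taken from the source.
PSF_SIGMA_BY_DISTANCE_MM = {5.0: 2.0, 10.0: 0.5, 20.0: 1.5}
IN_FOCUS_DISTANCE_MM = 10.0


def simulate_training_pair(scanline):
    """Return (learning scanlines per distance, best-focus scanline)."""
    learning = {
        distance: convolve_same(scanline, gaussian_psf(sigma))
        for distance, sigma in PSF_SIGMA_BY_DISTANCE_MM.items()
    }
    in_focus_sigma = PSF_SIGMA_BY_DISTANCE_MM[IN_FOCUS_DISTANCE_MM]
    correct = convolve_same(scanline, gaussian_psf(in_focus_sigma))
    return learning, correct
```

In this sketch the learning images use the PSF at several object distances, while the correct image uses only the PSF at the in-focus distance, mirroring the distinction drawn between the defocus and best focus simulation processes.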
- different trained models 120 may be used depending on the imaging method. That is, in the information processing system 100 of this embodiment, the storage unit 110 may store the first trained model 121 and the second trained model 122, as shown in FIG. 37, for example.
- the flow shown in FIG. 3 may be changed to the flow shown in FIG. 38, for example.
- After reading the image to be processed (step S10), the processing unit 130 performs a process of checking the imaging method of the first imaging system 101 (step S12).
- when the imaging method is the frame sequential method, reading of the first trained model (step S21), correction processing (step S31), and output of a corrected image (step S41) are performed.
- when the imaging method is the Bayer simultaneous method, reading of the second trained model (step S22), correction processing (step S32), and output of a corrected image (step S42) are performed.
- steps S21 and S22 in FIG. 38 are processes corresponding to step S20 in FIG. 3.
- steps S31 and S32 in FIG. 38 are processes corresponding to step S30 in FIG. 3
- steps S41 and S42 in FIG. 38 are processes corresponding to step S40 in FIG.
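The branching in FIG. 38 amounts to selecting a trained model by imaging method before running the correction. A minimal sketch follows; the `ImagingMethod` enum, the `correct_image` function, and the representation of a trained model as a plain callable are assumptions made for illustration, not the system's actual interfaces.

```python
from enum import Enum


class ImagingMethod(Enum):
    """Imaging methods distinguished in step S12 (names are hypothetical)."""
    FRAME_SEQUENTIAL = "frame_sequential"
    BAYER_SIMULTANEOUS = "bayer_simultaneous"


def correct_image(image, imaging_method, models):
    """Pick the trained model for the imaging method and apply it.

    `models` maps an ImagingMethod to a callable standing in for a trained
    model: the first trained model for the frame sequential method (step
    S21) and the second trained model for the Bayer simultaneous method
    (step S22).
    """
    try:
        model = models[imaging_method]  # steps S21 / S22: read the model
    except KeyError:
        raise ValueError(f"no trained model for {imaging_method!r}") from None
    corrected = model(image)            # steps S31 / S32: correction
    return corrected                    # steps S41 / S42: output
```

Keeping the mapping in a dictionary means a further imaging method could be supported by registering another model rather than by adding a new branch.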
- step S100 in FIG. 7 may be replaced with step S101 in FIG. 39 and step S102 in FIG. 40.
- the image data generation process may be performed in step S124 in FIG. 29, in contrast to step S100 in FIG.
- the image data generation process may be performed in step S126 in FIG. 33, in contrast to step S100 in FIG.
- the method of this embodiment can be applied to a case where the observation method is different between the arbitrary imaging system 104 and the first imaging system 101.
- a method of image data generation processing when different observation methods are used will be described using FIG. 41.
- the image data generation process in FIG. 41 can also be called step S128, and the image that is the source of step S128 can also be called the predetermined subject image 30-4.
- Step S128 in FIG. 41 differs from step S124 in FIG. 29 in the contents of the defocus simulation process (step S208) and the best focus simulation process (step S308), and in that the observation method information 60 is read before steps S206 and S306 are performed.
- the observation method information 60 is information regarding the observation method of the first imaging system 101, for example.
- the learning device storage unit 18 further stores observation method information 60, which is not shown in FIG. 41.
- the second imaging system 102 is exemplified as a representative of an arbitrary imaging system 104, as in the example of FIG. 29.
- the observation method can also be called observation mode.
- a case where the observation method is different is, for example, a case where the light source used for observation is different. As another example, the method of image processing performed between the time when the user images the subject and the time when the predetermined subject image 30-4 is obtained may be different.
- Observation methods include, for example, a WLI (White Light Imaging) mode that uses white illumination light, and a special light observation mode that uses special light that is not white light.
- the special light observation mode includes an NBI (Narrow Band Imaging) mode that uses two narrow band lights. The two narrowband lights are narrowband light included in the blue wavelength band and narrowband light included in the green wavelength band.
- WLI and NBI differ in image processing when generating a color image from an image signal output by an image sensor.
- the content of demosaic processing or the parameters in image processing are different.
- an RDI (Red Dichromatic Imaging) mode can be adopted as the special light observation mode.
- the RDI mode is an observation mode that uses narrowband light included in the amber wavelength band, narrowband light included in the green wavelength band, and narrowband light included in the red wavelength band. The technique disclosed in No. 9,775,497 B2 etc. is used.
- FIG. 42 shows an example of defocus simulation processing (step S208-1) that generates the first learning image 32-1 from the predetermined subject image 30-4.
- Step S208-1 in FIG. 42 is different from step S204-1 in FIG. The difference is that it further includes processing (step S268).
- TXI is an abbreviation for Texture and Color Enhancement Imaging, and the details will be described later.
- step S128 in FIG. 41 is an example in which the above-described different processing is added to S124 in FIG. Additional processing may be added.
- the color misregistration determination process (step S190) in FIG. 33 is further performed before performing step S208 and step S308.
- step S208 in FIG. 42 in this case further includes step S240, step S250, and step S252 in FIG.
- step S308 in FIG. 43 in this case further includes step S340, step S350, and step S352 in FIG.
- descriptions of points that overlap with step S124 in FIG. 29 and step S126 in FIG. 33 will be omitted as appropriate.
- the learning device processing unit 16 reads the observation method information 60 and acquires the observation method used in the first imaging system 101. The learning device processing unit 16 then selects one of step S262, step S264, step S266, and step S268 as the process corresponding to the acquired observation method.
- when the first imaging system 101 is imaging in the TXI mode, information to that effect is stored in the learning device storage unit 18 as the observation method information 60. Then, by reading the observation method information 60, the learning device processing unit 16 performs defocus simulation processing (step S208) including TXI mode processing (step S368) on the predetermined subject image 30-4. Specifically, for example, the learning device processing unit 16 decomposes the predetermined subject image 30-4 into a texture image portion, which is an image portion related to the surface structure of the predetermined subject image 30-4, and a base image portion, which is the portion other than the texture image portion.
- the learning device processing unit 16 then performs a first process to emphasize the surface structure of the texture image portion, a second process to optimize the brightness of the base image portion, and a third process to optimize the tone of the image obtained by combining the image resulting from the first process and the image resulting from the second process.
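The decomposition into a base portion and a texture portion, followed by the three processes, can be sketched on a one-dimensional signal. This is a rough illustration only: the box-blur low-pass filter, the multiplicative gains, and the clip-plus-gamma tone step are assumptions chosen for simplicity, and the function names `box_blur` and `txi_like` are hypothetical. The actual TXI processing is not detailed in this passage.

```python
def box_blur(signal, radius):
    """Low-pass filter: mean over a window clipped at the signal edges."""
    count = len(signal)
    output = []
    for i in range(count):
        window = signal[max(0, i - radius):min(count, i + radius + 1)]
        output.append(sum(window) / len(window))
    return output


def txi_like(signal, texture_gain=2.0, base_gain=1.2, gamma=0.8, radius=2):
    """Apply a TXI-style chain to intensities in the range [0, 1].

    Assumed stand-ins for the three processes described in the text:
      1. emphasise texture  -> multiply the texture portion by texture_gain
      2. optimise brightness -> multiply the base portion by base_gain
      3. optimise tone       -> clip the combined signal, then apply gamma
    """
    base = box_blur(signal, radius)                       # base portion
    texture = [s - b for s, b in zip(signal, base)]       # texture portion
    enhanced = [texture_gain * t for t in texture]        # first process
    brightened = [base_gain * b for b in base]            # second process
    combined = [b + t for b, t in zip(brightened, enhanced)]
    return [min(max(v, 0.0), 1.0) ** gamma for v in combined]  # third process
```

With all gains and `gamma` set to 1.0 the chain reproduces its input, which is a convenient sanity check that the base and texture portions sum back to the original signal.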
- the learning device processing unit 16 reads the observation method information 60 and performs, on the predetermined subject image 30-4, color interpolation corresponding to the light source. The color interpolation may be performed together with step S252 in FIG. 34, for example.
- when the learning device processing unit 16 selects the WLI mode process (step S262), it performs a process of interpolating the R image and the B image using the G image in step S252.
- when the learning device processing unit 16 selects the NBI mode processing (step S264), it performs processing to independently interpolate the G image and the B image in step S252.
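The difference between interpolating a channel using the G image (WLI mode) and interpolating channels independently (NBI mode) can be illustrated in one dimension. The sketch assumes linear interpolation and, for the guided case, interpolation of the colour difference from G. The text only says that the G image is used, so the colour-difference scheme and the function names are assumptions.

```python
def interpolate_independent(samples):
    """Fill None entries by linear interpolation between known neighbours."""
    known = [i for i, value in enumerate(samples) if value is not None]
    if not known:
        raise ValueError("at least one sample is required")
    filled = []
    for i, value in enumerate(samples):
        if value is not None:
            filled.append(float(value))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(samples[right]))
        elif right is None:
            filled.append(float(samples[left]))
        else:
            t = (i - left) / (right - left)
            filled.append((1.0 - t) * samples[left] + t * samples[right])
    return filled


def interpolate_guided(samples, guide):
    """Fill None entries using a fully known guide channel (for example G).

    Assumed scheme: interpolate the difference (sample - guide), then add
    the guide back, so detail present in the guide carries over.
    """
    difference = [
        None if value is None else value - g
        for value, g in zip(samples, guide)
    ]
    return [d + g for d, g in zip(interpolate_independent(difference), guide)]
```

For a guide `[0, 10, 0, 10, 0]` and sparse samples `[5, None, 5, None, 5]`, the independent variant returns a flat line, whereas the guided variant follows the guide's detail. This matters when the channels are strongly correlated, as under white light, and less so when narrow-band channels carry different structure.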
- FIG. 43 shows an example of the best focus simulation process (step S308) that generates the correct image 36 from the predetermined subject image 30-4 in the image data generation process (step S128).
- Step S308 in FIG. 43 is compared with step S304 in FIG. ).
- Step S362 in FIG. 43 is the same process as step S262 in FIG. 42.
- Step S364 in FIG. 43 is the same process as step S264 in FIG. 42.
- Step S368 in FIG. 43 is the same process as step S268 in FIG. 42.
- the correct image 36 may be generated by a process in which steps S330 and the like are omitted from the best focus simulation process (step S308) in FIG. 43.
- First learned model, 122... Second learned model, 130, 260... Processing section, 140, 240... Input section, 150, 250... Output section, 160... Storage interface, 170, 210... Storage section, 200... Processor unit, 220... Control section, 230... Imaging data receiving section, 270... Display interface, 300... Endoscope system, 310... Endoscope scope, 320... Operation unit, 330... Display unit, NN... Neural network
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
- Radiology & Medical Imaging (AREA)
- Quality & Reliability (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Priority Applications (4)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| JP2024545361A JP7796889B2 (ja) | 2022-09-08 | 2022-09-08 | 情報処理システム、内視鏡システム、学習済みモデル、情報記憶媒体及び情報処理方法 |
| CN202280099456.0A CN119768825A (zh) | 2022-09-08 | 2022-09-08 | 信息处理系统、内窥镜系统、学习完毕模型、信息存储介质和信息处理方法 |
| PCT/JP2022/033706 WO2024053046A1 (ja) | 2022-09-08 | 2022-09-08 | 情報処理システム、内視鏡システム、学習済みモデル、情報記憶媒体及び情報処理方法 |
| US18/960,040 US20250124575A1 (en) | 2022-09-08 | 2024-11-26 | Information processing system, endoscope system, information storage medium, and information processing method |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2022/033706 WO2024053046A1 (ja) | 2022-09-08 | 2022-09-08 | 情報処理システム、内視鏡システム、学習済みモデル、情報記憶媒体及び情報処理方法 |
Related Child Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US18/960,040 Continuation US20250124575A1 (en) | 2022-09-08 | 2024-11-26 | Information processing system, endoscope system, information storage medium, and information processing method |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| WO2024053046A1 (ja) | 2024-03-14 |
Family
ID=90192512
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| PCT/JP2022/033706 Ceased WO2024053046A1 (ja) | 2022-09-08 | 2022-09-08 | 情報処理システム、内視鏡システム、学習済みモデル、情報記憶媒体及び情報処理方法 |
Country Status (4)
| Country | Link |
|---|---|
| US (1) | US20250124575A1 |
| JP (1) | JP7796889B2 |
| CN (1) | CN119768825A |
| WO (1) | WO2024053046A1 |
Citations (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP2020201540A (ja) * | 2019-06-06 | 2020-12-17 | キヤノン株式会社 | 画像処理方法、画像処理装置、画像処理システム、学習済みウエイトの製造方法、および、プログラム |
| JP2021082118A (ja) * | 2019-11-21 | 2021-05-27 | キヤノン株式会社 | 学習方法、プログラム、学習装置、および、学習済みウエイトの製造方法 |
| JP2021140758A (ja) * | 2020-03-09 | 2021-09-16 | キヤノン株式会社 | 学習データの製造方法、学習方法、学習データ製造装置、学習装置、およびプログラム |
| JP2021168048A (ja) * | 2020-04-10 | 2021-10-21 | キヤノン株式会社 | 画像処理方法、画像処理装置、画像処理システム、およびプログラム |
| JP2022514580A (ja) * | 2018-12-18 | 2022-02-14 | ライカ マイクロシステムズ シーエムエス ゲゼルシャフト ミット ベシュレンクテル ハフツング | 機械学習による光学補正 |
| JP2022532206A (ja) * | 2019-05-22 | 2022-07-13 | エヌイーシー ラボラトリーズ アメリカ インク | ぼけた画像/ビデオを用いたsfm/slamへの適用を有する畳み込みニューラルネットワークを使用した画像/ビデオのボケ除去 |
Family Cites Families (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| JP5124835B2 (ja) | 2008-02-05 | 2013-01-23 | 富士フイルム株式会社 | 画像処理装置、画像処理方法、およびプログラム |
| JP2017050662A (ja) | 2015-09-01 | 2017-03-09 | キヤノン株式会社 | 画像処理装置、撮像装置および画像処理プログラム |
2022
- 2022-09-08 JP JP2024545361A (JP7796889B2): Active
- 2022-09-08 CN CN202280099456.0A (CN119768825A): Pending
- 2022-09-08 WO PCT/JP2022/033706 (WO2024053046A1): Ceased

2024
- 2024-11-26 US US18/960,040 (US20250124575A1): Pending
Cited By (2)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20220253979A1 (en) * | 2019-11-08 | 2022-08-11 | Olympus Corporation | Information processing system, endoscope system, and information storage medium |
| US12347066B2 (en) * | 2019-11-08 | 2025-07-01 | Olympus Corporation | Information processing system, endoscope system, and information storage medium |
Also Published As
| Publication number | Publication date |
|---|---|
| CN119768825A (zh) | 2025-04-04 |
| JPWO2024053046A1 | 2024-03-14 |
| US20250124575A1 (en) | 2025-04-17 |
| JP7796889B2 (ja) | 2026-01-09 |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22958125; Country of ref document: EP; Kind code of ref document: A1 |
| | WWE | Wipo information: entry into national phase | Ref document number: 2024545361; Country of ref document: JP |
| | WWE | Wipo information: entry into national phase | Ref document number: 202280099456.0; Country of ref document: CN |
| | WWP | Wipo information: published in national office | Ref document number: 202280099456.0; Country of ref document: CN |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 22958125; Country of ref document: EP; Kind code of ref document: A1 |