WO2022176465A1 - Image processing device and image processing method - Google Patents

Image processing device and image processing method

Info

Publication number
WO2022176465A1
WO2022176465A1 PCT/JP2022/001112 JP2022001112W WO2022176465A1 WO 2022176465 A1 WO2022176465 A1 WO 2022176465A1 JP 2022001112 W JP2022001112 W JP 2022001112W WO 2022176465 A1 WO2022176465 A1 WO 2022176465A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
teacher
posture
similarity
unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/001112
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
健太 先崎
響子 室園
昭吾 佐藤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to CN202280015888.9A (CN116868234A)
Priority to JP2023500634A (JP7464188B2)
Priority to US18/273,943 (US20240296663A1)
Publication of WO2022176465A1
Anticipated expiration
Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/74 Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/74 Image or video pattern matching; Proximity measures in feature spaces
    • G06V10/761 Proximity, similarity or dissimilarity measures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/776 Validation; Performance evaluation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Definitions

  • the present invention relates to an image processing device and an image processing method, and more particularly to an image processing device and an image processing method capable of detecting deterioration in estimation accuracy in estimating the pose of an object using machine learning.
  • SSA: Space Situational Awareness
  • information such as the position, speed, or appearance of objects is obtained by methods such as radar, optical telescopes, or satellite imaging.
  • One of the purposes of SSA is to estimate the 3D pose of an object from an external image of the object.
  • the posture of an object is represented by parameters such as Euler angles and quaternions.
  • As a method for estimating the 3D pose of an object from an image, there is a method that uses image classification based on machine learning.
  • A common image classification problem is to identify, for an object in an image, the appropriate label from predefined labels such as "dog", "cat", and "apple".
  • Image classification methods applied to 3D pose estimation indirectly estimate the pose of an object by identifying which of the predefined poses the pose of the object in the image matches.
  • Patent Document 1 describes a method of suppressing a decrease in classification accuracy for a specific posture group. Specifically, Patent Document 1 describes a technique for suppressing deterioration in pose recognition accuracy in the vicinity of a specific pose class when estimating the pose of a target object in an input image.
  • There is also a method that uses regression based on machine learning to estimate the three-dimensional pose of an object from an image.
  • a regression model is generated by directly learning the relationship between images and pose parameters using statistical methods.
  • the regression model outputs parameters representing the estimated orientation of an object that appears in the image of interest.
  • Patent Document 2 describes an information processing device that enables selection of an image that enables efficient observation of differences in posture of a person from among a plurality of images in which a person is photographed.
  • Patent Document 3 describes a video classification device and a video classification program for classifying video scenes that are still images or moving images, and a video search device and a video search program for searching for a specific scene from the video scenes.
  • the image classification method requires a database that stores labels corresponding to various postures and lighting environments.
  • a method using image recognition based on machine learning, such as regression, requires a database that stores learning images corresponding to various postures, lighting environments, and the like.
  • Patent Documents 1 to 3 do not describe a technique capable of detecting deterioration in posture estimation accuracy in actual operation.
  • an object of the present invention is to provide an image processing apparatus and an image processing method that can detect a decrease in estimation accuracy in estimating the pose of an object using machine learning.
  • An image processing apparatus includes: an estimation unit that estimates, based on a target image, which is a captured image of an object whose posture is to be estimated, a posture parameter, which is a parameter representing the posture of the object in the target image, using a posture estimation model trained using one or more teacher data each including a teacher image, which is an image in which the object is captured, and the posture parameter of the object in the teacher image; an acquisition unit that acquires, from among the one or more teacher images included in the one or more teacher data, the teacher image with the maximum posture similarity, which is the degree of similarity between the estimated posture parameter and the posture parameter related to the teacher image; a first calculation unit that calculates an image similarity, which is the degree of similarity between the target image and the acquired teacher image; and a determination unit that determines whether the calculated image similarity is equal to or less than a predetermined threshold.
  • The image processing method includes: estimating, based on a target image, which is a captured image of an object whose posture is to be estimated, a posture parameter, which is a parameter representing the posture of the object in the target image, using a posture estimation model trained using one or more teacher data each including a teacher image, which is an image in which the object is captured, and the posture parameter of the object in the teacher image; acquiring, from among the one or more teacher images included in the one or more teacher data, the teacher image with the maximum posture similarity, which is the degree of similarity between the estimated posture parameter and the posture parameter related to the teacher image; calculating an image similarity, which is the degree of similarity between the target image and the acquired teacher image; and determining whether or not the calculated image similarity is equal to or less than a predetermined threshold.
  • A computer-readable recording medium according to the present invention stores an image processing program that, when executed by a computer: estimates, based on a target image, which is a captured image of an object whose posture is to be estimated, a posture parameter representing the posture of the object in the target image, using a posture estimation model trained using one or more teacher data each including a teacher image, which is an image in which the object is captured, and the posture parameter of the object in the teacher image; acquires, from among the one or more teacher images included in the one or more teacher data, the teacher image with the largest posture similarity, which is the similarity between the estimated posture parameter and the posture parameter related to the teacher image; calculates an image similarity, which is the degree of similarity between the target image and the acquired teacher image; and determines whether or not the calculated image similarity is equal to or less than a predetermined threshold.
  • FIG. 1 is a block diagram showing a configuration example of an image processing apparatus according to a first embodiment of the present invention.
  • FIG. 2 is an explanatory diagram showing an example of a target image.
  • FIG. 3 is an explanatory diagram showing an example of processing in which the similarity calculation unit 130 processes an image of interest and a teacher image.
  • FIG. 4 is a flowchart showing the operation of posture estimation accuracy determination processing by the image processing apparatus 100 of the first embodiment.
  • FIG. 5 is a block diagram showing a configuration example of an image processing apparatus according to a second embodiment of the present invention.
  • FIG. 6 is a flowchart showing the operation of posture estimation accuracy determination processing by the image processing apparatus 101 of the second embodiment.
  • An explanatory diagram showing a hardware configuration example of an image processing apparatus according to the present invention.
  • A block diagram showing an outline of an image processing device according to the present invention.
  • FIG. 1 is a block diagram showing a configuration example of an image processing apparatus according to the first embodiment of the present invention.
  • The image processing apparatus 100 includes a posture estimation unit 110, an image acquisition unit 120, a similarity calculation unit 130, a similarity determination unit 140, an output information generation unit 150, a posture estimation model storage unit 160, and a teacher data storage unit 170.
  • An input device 200 for inputting images and related information to the image processing apparatus 100 is communicably connected to the image processing apparatus 100.
  • the input device 200 is, for example, a database in which images and related information are accumulated.
  • the input device 200 may be an interface for acquiring images and related information from a database in which images and related information are accumulated.
  • The image processing apparatus 100 is communicatively connected to an output device 300 that outputs the processing results of the image processing apparatus 100.
  • the output device 300 is, for example, a visualization device such as a display or a printer for displaying processing results.
  • the output device 300 may be a recording device that records processing results in a storage medium such as a hard disk or a memory card.
  • the output device 300 may be an interface that supplies processing results to a recording device.
  • An image input by the input device 200 to the image processing device 100 is called an "image of interest".
  • the image of interest is, for example, an image captured by an optical sensor of a satellite.
  • FIG. 2 is an explanatory diagram showing an example of a target image.
  • the above “related information” is information that accompanies the image of interest.
  • The related information includes, for example, the distance between the photographed object and the optical sensor when the image of interest was captured, the position information and velocity information, in a predetermined coordinate space, of the photographed object and of the object carrying the optical sensor, and the orientation of the object carrying the optical sensor.
  • The related information consists of parameters that can be acquired at the same time as the image.
  • the posture estimation model storage unit 160 has a function of storing the structure, parameters, etc. of an image recognizer that has been learned in advance using teacher data.
  • the image recognizer uses an algorithm for pose estimation. That is, posture estimation model storage section 160 stores the parameters of the posture estimation model.
  • the pose estimation algorithm used in the above image recognizer may be an algorithm configured by a general supervised machine learning method.
  • the pose estimation algorithm may be an algorithm configured by a method using regression, such as Support Vector Regression (SVR) or a convolutional neural network.
  • The teacher data storage unit 170 has a function of storing teacher data used in learning parameters of the posture estimation model stored in the posture estimation model storage unit 160.
  • the teacher data used in learning is data that represents the object itself, which is the target of pose estimation.
  • the training data is a set of three-dimensional pose parameters of an object whose pose is to be estimated and an image of the object.
  • An image included in the training data is hereinafter referred to as a training image.
  • the teacher data storage unit 170 may store all the teacher data used for learning, or may store part of the teacher data appropriately sampled from all the teacher data.
  • The teacher data storage unit 170 may also store parameters of the imaging conditions, such as the distance between the object to be photographed and the optical sensor when the teacher image was photographed, the position information of the object to be photographed in a predetermined coordinate space, the velocity information of the object to be photographed, and the light source position information.
  • the teacher image may be a CG image generated by a three-dimensional model as well as a photographed image.
  • The pose estimation model in this embodiment is, for example, a model trained using one or more teacher data, each including a teacher image, which is an image of an object, and a posture parameter, which is a parameter representing the posture of the object in the teacher image.
  • Let θ_X, θ_Y, and θ_Z be the rotation parameters about the X axis, the Y axis, and the Z axis, respectively.
  • the posture estimation unit 110 has a function of estimating the posture of an object. Specifically, posture estimation section 110 refers to posture estimation model storage section 160 to acquire the structure and parameters of the posture estimation model, and constructs the posture estimation model.
  • posture estimation section 110 estimates the three-dimensional posture of the object in target image I_target input from input device 200 using the constructed posture estimation model.
  • The estimated pose parameter θ_target of the object in the image of interest is defined as the set of rotation parameters about the X, Y, and Z axes for that object.
  • The posture estimation unit 110 of the present embodiment uses the posture estimation model to estimate the posture parameters of the object in the target image based on the target image (image of interest), which is a photographed image of the object whose posture is to be estimated.
  • Posture estimation section 110 inputs the estimated posture parameter θ_target to output information generation section 150 and image acquisition section 120.
  • The image acquisition unit 120 receives the estimated posture parameter θ_target of the object in the image of interest I_target from the posture estimation unit 110.
  • The image acquisition unit 120 has a function of acquiring a teacher image from the teacher data storage unit 170 based on the input posture parameter θ_target.
  • The image acquisition unit 120 acquires, from the teacher data storage unit 170, the image I_train, which is the teacher image showing an object whose posture is most similar to the posture of the object in the image of interest I_target, together with the related information of the image I_train.
  • The pose parameter θ_train,i of the object in the i-th teacher image included in the teacher data is defined in the same way.
  • The image acquisition unit 120 calculates the difference Δθ_i between the posture parameter θ_target of the object in the image of interest I_target and the posture parameter θ_train,i of the object in the i-th teacher image included in the teacher data.
  • The image acquisition unit 120 calculates Δθ_i over the one or more teacher images included in the one or more teacher data.
  • The teacher image with the smallest 2-norm of Δθ_i is the teacher image showing the object whose posture is most similar to that of the object in the image of interest I_target.
  • the calculation formula used to acquire the teacher image is not limited to formula (1).
  • the image acquisition unit 120 may acquire the training image with the smallest infinity norm as the training image showing an object with the most similar posture.
  • the image acquiring unit 120 may add a process of limiting the range of angles to [-180, 180] to the process of calculating the difference.
  • In that case, for example, the formula for calculating the angle difference around the X axis is changed so that the result falls within this range.
  • The image acquisition unit 120 of the present embodiment acquires, from among the one or more teacher images included in the one or more teacher data, the teacher image with the maximum posture similarity, which is the degree of similarity between the estimated posture parameter and the posture parameter related to the teacher image.
  • The reciprocal of the 2-norm of Δθ_i corresponds to the posture similarity.
  • The image acquiring unit 120 of the present embodiment calculates the posture similarity for each of the one or more teacher images included in the one or more teacher data, and acquires the teacher image based on the calculated posture similarity.
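  • The retrieval described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names (wrap_angle, pose_difference, difference_norm, nearest_teacher) are hypothetical, poses are (X, Y, Z) rotation triples in degrees, and the wrapped range is taken as [-180, 180). Selecting the smallest norm is equivalent to selecting the largest posture similarity when that similarity is the reciprocal of the norm, and it avoids a division by zero for an exact match.

```python
import math

def wrap_angle(deg):
    """Map an angle difference in degrees into [-180, 180) (assumed convention)."""
    return (deg + 180.0) % 360.0 - 180.0

def pose_difference(pose_target, pose_train):
    """Per-axis wrapped difference between two (X, Y, Z) rotation triples."""
    return tuple(wrap_angle(t - r) for t, r in zip(pose_target, pose_train))

def difference_norm(diff, kind="l2"):
    """2-norm by default; kind='inf' gives the infinity norm instead."""
    if kind == "inf":
        return max(abs(d) for d in diff)
    return math.sqrt(sum(d * d for d in diff))

def nearest_teacher(pose_target, teacher_poses, kind="l2"):
    """Index of the teacher pose whose difference norm is smallest."""
    return min(
        range(len(teacher_poses)),
        key=lambda i: difference_norm(
            pose_difference(pose_target, teacher_poses[i]), kind),
    )
```

  • Without the wrapping step, a target pose of 170 degrees and a teacher pose of -175 degrees about the same axis would appear 345 degrees apart rather than 15 degrees apart.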
  • the image acquisition unit 120 inputs the acquired teacher image and related information of the teacher image to the similarity calculation unit 130 .
  • The similarity calculation unit 130 has a function of calculating the similarity between the target image I_target and the teacher image I_train.
  • The similarity calculator 130 can use, for example, the peak value of the phase-only correlation method or an index such as zero-mean normalized cross-correlation as the similarity. Note that the similarity calculation unit 130 may use an index other than these.
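  • As an illustration of one of the indices named above, the following Python sketch computes zero-mean normalized cross-correlation for two equally sized grayscale images. The function name zncc and the representation of images as lists of rows of numbers are assumptions made for this example; returning 0.0 for a constant image is also a choice of this sketch, since the index is undefined in that case.

```python
import math

def zncc(img_a, img_b):
    """Zero-mean normalized cross-correlation of two same-size grayscale images.

    Images are lists of rows of numbers. The result lies in [-1, 1].
    """
    a = [p for row in img_a for p in row]
    b = [p for row in img_b for p in row]
    if not a or len(a) != len(b):
        raise ValueError("images must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    # Covariance-like numerator and the product of the two standard deviations.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0
```

  • Because the mean is subtracted and the result is normalized, a uniform brightness offset or gain between the two images does not change the value.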
  • The similarity calculation unit 130 enlarges or reduces the images using the distance between the object and the optical sensor, which is related information of I_target and I_train, so that the sizes of the objects captured in I_target and I_train are approximately the same.
  • For this purpose, the similarity calculation unit 130 calculates a value s from these distances.
  • FIG. 3 is an explanatory diagram showing an example of processing in which the similarity calculation unit 130 processes the target image and the teacher image.
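  • The size adjustment can be sketched as follows. The formula for s is not reproduced in this text, so the sketch assumes a pinhole camera in which apparent size is inversely proportional to distance, and rescales the teacher image by the ratio of the teacher distance to the target distance. The names scale_factor and resize_nearest, and the use of nearest-neighbor sampling, are hypothetical choices for illustration.

```python
def scale_factor(d_target, d_train):
    """Factor by which to resize the teacher image (assumed: size ~ 1/distance)."""
    if d_target <= 0 or d_train <= 0:
        raise ValueError("distances must be positive")
    return d_train / d_target

def resize_nearest(img, s):
    """Nearest-neighbor resize of a list-of-rows image by factor s."""
    h, w = len(img), len(img[0])
    new_h = max(1, round(h * s))
    new_w = max(1, round(w * s))
    return [
        [img[min(h - 1, int(y / s))][min(w - 1, int(x / s))]
         for x in range(new_w)]
        for y in range(new_h)
    ]
```

  • For example, a teacher image captured from twice the distance of the image of interest would be enlarged by a factor of two before the similarity is calculated.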
  • The similarity calculation unit 130 of the present embodiment calculates the image similarity, which is the similarity between the target image (image of interest) and the acquired teacher image.
  • The similarity calculation section 130 inputs the calculated similarity to the similarity determination section 140.
  • The similarity determination unit 140 has a function of comparing the similarity input from the similarity calculation unit 130 with a predetermined threshold. Specifically, similarity determination section 140 generates flag information f indicating whether or not the similarity is equal to or less than the predetermined threshold, as information representing an estimated posture error.
  • The similarity determination unit 140 of this embodiment determines whether the calculated image similarity is equal to or less than a predetermined threshold. Similarity determination section 140 then inputs the similarity and flag information f to output information generation section 150.
  • Output information generation section 150 has a function of generating output information based on the estimated posture parameter input from posture estimation section 110 and the similarity and flag information f input from similarity determination section 140.
  • When the flag information f indicates that the similarity is equal to or less than the threshold, output information generating section 150 determines that the error of the estimated posture parameter is large, that is, that the posture estimation accuracy may have decreased, and displays a warning message on the output device 300.
  • The output information generation unit 150 displays the warning message on the output device 300 together with the estimated posture parameter value and similarity.
  • Output information generation section 150 may simply input a set of the estimated posture parameter value, similarity, and flag information to a storage device (not shown) connected to output device 300.
  • The output information generation unit 150 of the present embodiment outputs information indicating that the posture estimation accuracy has decreased when the image similarity is determined to be equal to or less than a predetermined threshold.
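  • The threshold determination and output generation can be sketched as below. The function name build_output, the dictionary layout, the warning text, and the encoding of the flag as 1 or 0 are all assumptions for illustration; the text only states that the flag indicates whether the similarity is equal to or less than the threshold.

```python
def build_output(pose, similarity, threshold):
    """Bundle the estimated pose, the similarity, and a low-accuracy flag."""
    # Flag is 1 when the similarity is equal to or less than the threshold.
    flag = 1 if similarity <= threshold else 0
    info = {"pose": pose, "similarity": similarity, "flag": flag}
    if flag:
        # Hypothetical warning text for display on an output device.
        info["warning"] = "Pose estimation accuracy may have decreased."
    return info
```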
  • FIG. 4 is a flowchart showing the operation of posture estimation accuracy determination processing by the image processing apparatus 100 of the first embodiment.
  • the image processing device 100 receives from the input device 200 an image of interest showing an object whose orientation is to be estimated, and information related to the image of interest (step S101).
  • posture estimation section 110 of image processing device 100 constructs a posture estimation model using information on the structure and parameters of the posture estimation model stored in posture estimation model storage section 160 .
  • the pose estimation unit 110 uses the constructed pose estimation model to estimate the pose parameters of the object in the input image of interest (step S102).
  • posture estimation section 110 may construct a posture estimation model in advance.
  • Posture estimation section 110 inputs the estimated posture parameters to image acquisition section 120 .
  • The image acquiring unit 120 acquires a teacher image showing an object whose posture is most similar to that of the object in the image of interest from the teacher data storage unit 170 based on the estimated posture parameter (step S103).
  • the image acquisition unit 120 inputs the acquired teacher image and related information of the teacher image to the similarity calculation unit 130 .
  • the similarity calculation unit 130 calculates the similarity between the target image and the input teacher image (step S104).
  • the similarity calculation section 130 inputs the calculated similarity to the similarity determination section 140 .
  • the similarity determination unit 140 generates flag information indicating whether the input similarity is equal to or less than a predetermined threshold (step S105).
  • the similarity determination section 140 inputs the similarity and flag information to the output information generation section 150 .
  • output information generation section 150 generates output information based on the estimated posture parameter values, similarity, and flag information.
  • the output information generator 150 inputs the generated output information to the output device 300 (step S106). After inputting the output information, the image processing apparatus 100 ends the posture estimation accuracy determination process.
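  • Steps S102 to S106 can be summarized in a short Python sketch. The function judge_pose_accuracy and its parameters are hypothetical: the pose estimator, the pose distance, and the image similarity are passed in as callables so that the sketch makes no claim about how the embodiment implements them, and teacher data is assumed to be a list of (pose, image) pairs.

```python
def judge_pose_accuracy(target_image, estimate_pose, teacher_data,
                        pose_distance, image_similarity, threshold):
    """Run the accuracy check on one image of interest.

    teacher_data is a list of (pose, image) pairs.
    """
    # S102: estimate the pose of the object in the image of interest.
    pose = estimate_pose(target_image)
    # S103: pick the teacher image whose pose is closest to the estimate.
    _, teacher_image = min(
        teacher_data, key=lambda rec: pose_distance(pose, rec[0]))
    # S104: compare the image of interest with that teacher image.
    similarity = image_similarity(target_image, teacher_image)
    # S105: flag a possible accuracy drop when similarity <= threshold.
    low_accuracy = similarity <= threshold
    # S106: return the information to be passed to the output device.
    return {"pose": pose, "similarity": similarity,
            "low_accuracy": low_accuracy}
```

  • The idea behind the check is that, if the estimated pose were correct, the teacher image with the nearest pose should look like the image of interest; a low similarity therefore suggests that the estimate is unreliable.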
  • the posture estimation unit 110 estimates posture parameters from an image of interest showing an object whose posture is to be estimated.
  • the image acquisition unit 120 acquires a teacher image based on the estimated posture parameter, and the similarity calculation unit 130 calculates the similarity between the target image and the acquired teacher image.
  • similarity determination section 140 detects a decrease in posture estimation accuracy based on the calculated similarity.
  • Unlike, for example, the video classification apparatus described in Patent Document 3, the image processing apparatus 100 of the present embodiment acquires a teacher image showing an object whose orientation is most similar to that of the object in the image of interest, and determines, based on the similarity between the image of interest and the teacher image, whether or not the accuracy of posture estimation has degraded. In other words, the image processing apparatus 100 can more reliably detect a decrease in pose estimation accuracy than the video classification apparatus or the like described in Patent Document 3.
  • By detecting a decrease in the accuracy of the posture parameter estimated by image recognition, a user of the image processing apparatus 100 of this embodiment can avoid misjudging the state of an object in outer space on the basis of a low-accuracy estimated posture parameter.
  • FIG. 5 is a block diagram showing a configuration example of an image processing apparatus according to the second embodiment of the present invention.
  • The image processing apparatus 101 includes a posture estimation unit 110, a similarity calculation unit 130, a similarity determination unit 140, an output information generation unit 150, a posture estimation model storage unit 160, an image generation unit 180, and a 3D model storage unit 190. Further, as shown in FIG. 5, the image processing apparatus 101 is connected to an input device 200 and an output device 300 so as to be able to communicate with each other.
  • The posture estimation section 110, similarity calculation section 130, similarity determination section 140, output information generation section 150, and posture estimation model storage section 160 of the present embodiment are the same as those of the first embodiment.
  • Each component of the image generation unit 180 and the 3D model storage unit 190 will be described below.
  • The 3D model storage unit 190 has a function of storing a 3D model of the same object as the object indicated by the teacher data used for learning the parameters of the posture estimation model stored in the posture estimation model storage unit 160, or a 3D model of the same type of object.
  • The image generator 180 has a function of generating a simulation image of the teacher image I_train. Specifically, the image generation unit 180 rotates the 3D model acquired from the 3D model storage unit 190 based on the estimated pose parameter of the object in the image of interest I_target input from the pose estimation unit 110. By rotating the three-dimensional model, the image generator 180 generates a simulation image.
  • The image generation unit 180 may use the distance between the object in the image of interest and the optical sensor so that the object in the simulation image generated from the three-dimensional model is treated as being at the same distance from the optical sensor as the object in the image of interest. For example, the image generator 180 may appropriately enlarge or reduce the generated simulation image.
  • the image generation unit 180 of the present embodiment generates a teacher image (simulation image) with the maximum posture similarity based on the estimated posture parameters.
  • the image generator 180 generates a teacher image using a three-dimensional model representing an object.
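  • Rotating a three-dimensional model by an estimated pose can be sketched as follows. The rotation order (about X, then Y, then Z, giving R = Rz·Ry·Rx), the use of degrees, the representation of the model as a list of (x, y, z) points, and the orthographic projection are all assumptions of this example; the text does not specify the convention or the renderer used by the image generation unit 180.

```python
import math

def rotation_matrix(theta_x, theta_y, theta_z):
    """3x3 rotation matrix R = Rz * Ry * Rx for angles in degrees."""
    ax, ay, az = (math.radians(a) for a in (theta_x, theta_y, theta_z))
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]

def rotate_points(points, rot):
    """Apply the rotation matrix to each (x, y, z) model point."""
    return [
        tuple(sum(rot[r][c] * p[c] for c in range(3)) for r in range(3))
        for p in points
    ]

def project_orthographic(points):
    """Drop the depth coordinate to obtain 2D image-plane positions."""
    return [(x, y) for x, y, _ in points]
```

  • A full simulation image would additionally require rasterization and lighting; the sketch only covers the pose-dependent geometry.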
  • the similarity calculator 130 of this embodiment acquires the teacher image from the image generator 180 .
  • FIG. 6 is a flowchart showing the operation of posture estimation accuracy determination processing by the image processing apparatus 101 of the second embodiment.
  • the image processing device 101 receives from the input device 200 an image of interest showing an object whose orientation is to be estimated, and information related to the image of interest (step S201).
  • The posture estimation unit 110 of the image processing device 101 constructs a posture estimation model using the structure and parameter information of the posture estimation model stored in the posture estimation model storage unit 160.
  • the pose estimation unit 110 uses the constructed pose estimation model to estimate the pose parameters of the object in the input image of interest (step S202).
  • posture estimation section 110 may construct a posture estimation model in advance.
  • Posture estimation section 110 inputs the estimated posture parameters to image generation section 180 .
  • the image generation unit 180 rotates the 3D model acquired from the 3D model storage unit 190 based on the posture parameters estimated in step S202. By rotating the three-dimensional model, the image generation unit 180 generates a simulation image of the teacher image I_train showing an object whose posture is most similar to that of the object in the image of interest (step S203). The image generation unit 180 inputs the generated simulation image and related information of the simulation image to the similarity calculation unit 130 .
  • the similarity calculation unit 130 calculates the similarity between the image of interest and the input simulation image (step S204).
  • The similarity calculation unit 130 inputs the calculated similarity to the similarity determination unit 140.
  • The similarity determination unit 140 generates flag information indicating whether the input similarity is equal to or less than a predetermined threshold (step S205).
  • The similarity determination unit 140 inputs the similarity and the flag information to the output information generation unit 150.
  • The output information generation unit 150 generates output information based on the estimated posture parameter values, the similarity, and the flag information.
  • The output information generation unit 150 inputs the generated output information to the output device 300 (step S206). After inputting the output information, the image processing apparatus 101 ends the posture estimation accuracy determination process.
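  • The flow of steps S202 to S206 can be sketched as follows. This is only an illustration under stated assumptions: the posture estimation model and the renderer are passed in as hypothetical callables, images are simplified to flat lists of pixel values, and zero-mean normalized cross-correlation is an assumed similarity measure that this passage does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Image = Sequence[float]  # flattened grayscale pixels (a simplifying assumption)


@dataclass
class OutputInfo:
    pose: tuple         # estimated posture parameters (step S202)
    similarity: float   # similarity to the simulation image (step S204)
    low_accuracy: bool  # flag: similarity is at or below the threshold (step S205)


def normalized_correlation(a: Image, b: Image) -> float:
    """Zero-mean normalized cross-correlation in [-1, 1] (assumed measure)."""
    if len(a) != len(b) or not a:
        raise ValueError("images must be non-empty and the same size")
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = (sum(x * x for x in da) * sum(y * y for y in db)) ** 0.5
    return sum(x * y for x, y in zip(da, db)) / denom if denom else 0.0


def judge_pose_accuracy(
    image: Image,
    estimate_pose: Callable[[Image], tuple],  # hypothetical stand-in for unit 110
    render: Callable[[tuple], Image],         # hypothetical stand-in for unit 180
    threshold: float,
) -> OutputInfo:
    pose = estimate_pose(image)                            # S202
    simulated = render(pose)                               # S203
    similarity = normalized_correlation(image, simulated)  # S204
    flag = similarity <= threshold                         # S205
    return OutputInfo(pose, similarity, flag)              # S206
```

  • In this sketch a raised flag means only that the image of interest differs from the rendering of the estimated posture, which the description treats as a sign that the estimation accuracy may have decreased.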
  • The teacher data storage unit 170 of the image processing apparatus 100 of the first embodiment stores part or all of the teacher data used for learning the posture estimation model. If the posture sampling angle is small, a huge amount of data is stored in the teacher data storage unit 170, which may increase the cost of the storage area.
  • In contrast, the image processing apparatus 101 of the present embodiment includes, instead of the teacher data storage unit 170, a 3D model storage unit 190 that stores a three-dimensional model of an object that is the same as, or of the same type as, the object indicated by the teacher data used in learning the parameters of the posture estimation model.
  • Since the amount of data stored in the 3D model storage unit 190 does not change regardless of the value of the posture sampling angle, the image processing apparatus 101 can suppress an increase in the cost of the storage area.
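  • To illustrate why a small sampling angle inflates the teacher data, the following rough sketch counts stored images under the simplifying assumption that each of three posture angles is sampled over a full 360 degrees at a uniform step. The sampling scheme, the function names, and the per-image size are hypothetical and are not taken from the original description.

```python
def teacher_image_count(step_deg: int) -> int:
    """Number of sampled postures when three angles are each sampled over
    360 degrees at step_deg (a simplifying assumption)."""
    if step_deg <= 0 or 360 % step_deg != 0:
        raise ValueError("step must be a positive divisor of 360")
    per_axis = 360 // step_deg
    return per_axis ** 3


def teacher_storage_bytes(step_deg: int, bytes_per_image: int) -> int:
    """Total storage if one image is kept per sampled posture."""
    return teacher_image_count(step_deg) * bytes_per_image


# Example: compare a 10-degree step with a 5-degree step.
coarse = teacher_image_count(10)
fine = teacher_image_count(5)
```

  • Under these assumptions, halving the step from 10 to 5 degrees multiplies the number of stored images by eight (46,656 to 373,248), whereas a single three-dimensional model has the same size for any step.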
  • the image processing apparatuses 100 and 101 of each embodiment can be used, for example, in the field of remote sensing.
  • FIG. 7 is an explanatory diagram showing a hardware configuration example of the image processing apparatus according to the present invention.
  • The image processing apparatus shown in FIG. 7 includes a CPU (Central Processing Unit) 11, a main storage unit 12, a communication unit 13, and an auxiliary storage unit 14. It also has an input unit 15 for user operation, and an output unit 16 for presenting the processing result or the progress of the processing to the user.
  • the image processing device is realized by software by the CPU 11 shown in FIG. 7 executing a program that provides the functions of each component.
  • the CPU 11 loads a program stored in the auxiliary storage unit 14 into the main storage unit 12, executes it, and controls the operation of the image processing apparatus, thereby realizing each function by software.
  • the image processing apparatus shown in FIG. 7 may include a DSP (Digital Signal Processor) instead of the CPU 11.
  • the image processing apparatus shown in FIG. 7 may include both the CPU 11 and the DSP.
  • the main storage unit 12 is used as a data work area and a data temporary save area.
  • the main storage unit 12 is, for example, a RAM (Random Access Memory).
  • the communication unit 13 has a function of inputting data to and outputting data from peripheral devices via a wired network or a wireless network (information communication network).
  • The auxiliary storage unit 14 is a non-transitory tangible storage medium.
  • Non-transitory tangible storage media include magnetic disks, magneto-optical disks, CD-ROMs (Compact Disc Read Only Memory), DVD-ROMs (Digital Versatile Disc Read Only Memory), and semiconductor memories.
  • the input unit 15 has a function of inputting data and processing instructions.
  • the input unit 15 is, for example, an input device such as a keyboard or mouse.
  • the output unit 16 has a function of outputting data.
  • the output unit 16 is, for example, a display device such as a liquid crystal display device, or a printing device such as a printer.
  • each component in the image processing apparatus is connected to the system bus 17 .
  • In the image processing apparatus 100, the auxiliary storage unit 14 stores a program for implementing the posture estimation unit 110, the image acquisition unit 120, the similarity calculation unit 130, the similarity determination unit 140, and the output information generation unit 150. The posture estimation model storage unit 160 and the teacher data storage unit 170 are implemented by the main storage unit 12.
  • the image processing apparatus 100 may be implemented with a circuit containing hardware components such as LSI (Large Scale Integration) that implements the functions shown in FIG.
  • In the image processing apparatus 101, the auxiliary storage unit 14 stores a program for implementing the posture estimation unit 110, the similarity calculation unit 130, the similarity determination unit 140, the output information generation unit 150, and the image generation unit 180. The posture estimation model storage unit 160 and the 3D model storage unit 190 are implemented by the main storage unit 12.
  • the image processing apparatus 101 may be implemented with, for example, a circuit containing hardware components such as an LSI that implements the functions shown in FIG.
  • the image processing apparatuses 100 and 101 may be realized by hardware that does not include computer functions using elements such as CPUs.
  • part or all of each component may be implemented by general-purpose circuitry, dedicated circuitry, processors, etc., or combinations thereof. These may be composed of a single chip (for example, the LSI described above), or may be composed of a plurality of chips connected via a bus. A part or all of each component may be implemented by a combination of the above-described circuit or the like and a program.
  • the constituent elements of the image processing apparatuses 100 and 101 may be composed of one or more information processing apparatuses each having a calculation unit and a storage unit.
  • the plurality of information processing devices, circuits, etc. may be centrally arranged or distributed.
  • the information processing device, circuits, and the like may be implemented as a client-and-server system, a cloud computing system, or the like, each of which is connected via a communication network.
  • FIG. 8 is a block diagram showing an outline of an image processing device according to the present invention.
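  • The image processing device 20 includes: an estimating unit 21 (for example, the posture estimation unit 110) that estimates a posture parameter, which is a parameter representing the posture of an object in a target image, from the target image, which is a captured image of the object whose posture is to be estimated; an acquisition unit 22 (for example, the image acquisition unit 120 or the similarity calculation unit 130) that acquires a teacher image with the maximum posture similarity to the posture represented by the estimated posture parameter; a first calculation unit 23 (for example, the similarity calculation unit 130) that calculates the image similarity between the target image and the acquired teacher image; and a determination unit 24 (for example, the similarity determination unit 140) that determines whether the calculated image similarity is equal to or less than a predetermined threshold.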
  • the image processing device can detect a decrease in estimation accuracy in estimating the pose of an object using machine learning.
  • the image processing device 20 includes a second calculation unit (for example, the image acquisition unit 120) that calculates the posture similarity of the teacher image over one or more teacher images included in one or more teacher data.
  • the acquiring unit 22 may acquire the teacher image based on the calculated posture similarity.
  • the image processing device can calculate posture similarity using teacher data.
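  • A minimal sketch of selecting the teacher image by posture similarity is given below. The similarity measure used here (one minus the mean wrapped angular difference, normalized by 180 degrees) is an assumption made for illustration, as are the function names and the (posture, image) pair layout; this passage does not define how the posture similarity is computed.

```python
def angle_diff_deg(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def posture_similarity(p: tuple, q: tuple) -> float:
    """Assumed measure: 1.0 for identical angles, 0.0 when every angle
    differs by 180 degrees."""
    total = sum(angle_diff_deg(a, b) for a, b in zip(p, q))
    return 1.0 - total / (180.0 * len(p))


def most_similar_teacher(estimated: tuple, teachers: list) -> tuple:
    """Return the (posture, image) pair whose posture is most similar to the
    estimated posture parameters."""
    return max(teachers, key=lambda t: posture_similarity(estimated, t[0]))


# Example with hypothetical teacher data; image payloads are placeholders.
teachers = [
    ((0.0, 0.0, 0.0), "image_a"),
    ((340.0, 10.0, 0.0), "image_b"),
    ((90.0, 90.0, 90.0), "image_c"),
]
best = most_similar_teacher((350.0, 10.0, 0.0), teachers)
```

  • Differences are wrapped so that, for example, 350 degrees and 0 degrees are treated as 10 degrees apart rather than 350.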
  • The image processing device 20 may also include a generating unit (for example, the image generation unit 180) that generates a teacher image with the maximum posture similarity based on the estimated posture parameters, and the acquisition unit 22 may acquire the generated teacher image. The generating unit may generate the teacher image using a three-dimensional model representing the object.
  • the image processing device can suppress an increase in the cost of the storage area.
  • The image processing device 20 may also include an output unit (for example, the output information generation unit 150) that outputs information indicating that the posture estimation accuracy has decreased when an image similarity equal to or less than the predetermined threshold is calculated.
  • the image processing device can present the user with a decrease in estimation accuracy in estimating the pose of the object.
  • the posture parameter may be represented by Euler angles.
  • the image processing device can detect a decrease in estimation accuracy in estimating the pose of a rigid body.
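  • As an illustration of representing a rigid-body posture by Euler angles, the sketch below converts three angles into a rotation matrix and measures the angle between two rotations. The Z-Y-X (yaw, pitch, roll) convention and the function names are assumptions; the original description does not state which Euler angle convention is used.

```python
import math


def euler_zyx_to_matrix(yaw: float, pitch: float, roll: float) -> list:
    """Rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll), angles in radians."""
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def rotation_angle_between(r1: list, r2: list) -> float:
    """Angle in radians of the rotation taking r1 to r2, computed as
    acos((trace(r1^T r2) - 1) / 2)."""
    # trace(r1^T r2) equals the sum of element-wise products.
    trace = sum(r1[i][j] * r2[i][j] for i in range(3) for j in range(3))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, (trace - 1.0) / 2.0)))


# Example: the angle between no rotation and a 90-degree yaw.
identity = euler_zyx_to_matrix(0.0, 0.0, 0.0)
quarter_turn = euler_zyx_to_matrix(math.pi / 2, 0.0, 0.0)
angle = rotation_angle_between(identity, quarter_turn)
```

  • Comparing rotation matrices rather than raw angle triples avoids treating two different Euler triples that describe the same posture as dissimilar.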

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
PCT/JP2022/001112 2021-02-18 2022-01-14 画像処理装置および画像処理方法 Ceased WO2022176465A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202280015888.9A CN116868234A (zh) 2021-02-18 2022-01-14 图像处理设备和图像处理方法
JP2023500634A JP7464188B2 (ja) 2021-02-18 2022-01-14 画像処理装置および画像処理方法
US18/273,943 US20240296663A1 (en) 2021-02-18 2022-01-14 Image processing device and image processing method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021024043 2021-02-18
JP2021-024043 2021-02-18

Publications (1)

Publication Number Publication Date
WO2022176465A1 true WO2022176465A1 (ja) 2022-08-25

Family

ID=82930801

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/001112 Ceased WO2022176465A1 (ja) 2021-02-18 2022-01-14 画像処理装置および画像処理方法

Country Status (4)

Country Link
US (1) US20240296663A1
JP (1) JP7464188B2
CN (1) CN116868234A
WO (1) WO2022176465A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2026004700A1 (ja) * 2024-06-25 2026-01-02 Jfeスチール株式会社 Kr脱硫方法

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019087229A (ja) * 2017-11-02 2019-06-06 キヤノン株式会社 情報処理装置、情報処理装置の制御方法及びプログラム
JP2020098575A (ja) * 2018-12-13 2020-06-25 富士通株式会社 画像処理装置、画像処理方法、及び画像処理プログラム

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4553141B2 (ja) * 2003-08-29 2010-09-29 日本電気株式会社 重み情報を用いた物体姿勢推定・照合システム
US10977827B2 (en) * 2018-03-27 2021-04-13 J. William Mauchly Multiview estimation of 6D pose
JP7054392B2 (ja) * 2019-06-06 2022-04-13 Kddi株式会社 姿勢推定装置、方法およびプログラム
CN111311679B (zh) * 2020-01-31 2022-04-01 武汉大学 一种基于深度相机的自由漂浮目标位姿估计方法

Also Published As

Publication number Publication date
US20240296663A1 (en) 2024-09-05
JP7464188B2 (ja) 2024-04-09
CN116868234A (zh) 2023-10-10
JPWO2022176465A1 2022-08-25

Similar Documents

Publication Publication Date Title
CN111862201B (zh) 一种基于深度学习的空间非合作目标相对位姿估计方法
US11842514B1 (en) Determining a pose of an object from rgb-d images
US10535160B2 (en) Markerless augmented reality (AR) system
US20190026948A1 (en) Markerless augmented reality (ar) system
WO2019011249A1 (zh) 一种图像中物体姿态的确定方法、装置、设备及存储介质
JP6624794B2 (ja) 画像処理装置、画像処理方法及びプログラム
CN108897836B (zh) 一种机器人基于语义进行地图构建的方法和装置
CN112329663B (zh) 一种基于人脸图像序列的微表情时刻检测方法及装置
US20210338109A1 (en) Fatigue determination device and fatigue determination method
CN110096929A (zh) 基于神经网络的目标检测
CN113378712B (zh) 物体检测模型的训练方法、图像检测方法及其装置
CN110956131B (zh) 单目标追踪方法、装置及系统
JP5833507B2 (ja) 画像処理装置
CN114359377B (zh) 一种实时6d位姿估计方法及计算机可读存储介质
CN106575363A (zh) 用于追踪场景中的关键点的方法
US11836839B2 (en) Method for generating animation figure, electronic device and storage medium
CN115471863A (zh) 三维姿态的获取方法、模型训练方法和相关设备
CN114882480A (zh) 用于获取目标对象状态的方法、装置、介质以及电子设备
CN113793370A (zh) 三维点云配准方法、装置、电子设备及可读介质
US12525060B2 (en) Work estimation device, work estimation method, and non-transitory computer readable medium
US20210398292A1 (en) Movement amount estimation device, movement amount estimation method, and computer readable medium
JP7464188B2 (ja) 画像処理装置および画像処理方法
CN110287764A (zh) 姿势预测方法、装置、计算机设备和存储介质
WO2021098666A1 (zh) 手部姿态检测方法和装置、及计算机存储介质
US11983242B2 (en) Learning data generation device, learning data generation method, and learning data generation program

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22755790; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2023500634; Country of ref document: JP; Kind code of ref document: A)
WWE WIPO information: entry into national phase (Ref document number: 18273943; Country of ref document: US)
WWE WIPO information: entry into national phase (Ref document number: 202280015888.9; Country of ref document: CN)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 22755790; Country of ref document: EP; Kind code of ref document: A1)