WO2021192225A1 - Teacher data conversion device, teacher data conversion method, and non-transitory recording medium - Google Patents

Info

Publication number
WO2021192225A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
teacher data
orientation
geometric transformation
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2020/014031
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
光 古根村
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to US17/908,419 (US12380591B2)
Priority to JP2022510328A (JP7283631B2)
Priority to PCT/JP2020/014031 (WO2021192225A1)
Publication of WO2021192225A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/02 Affine transformations
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/20 Linear translation of whole images or parts thereof, e.g. panning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/60 Rotation of whole images or parts thereof
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/60 Analysis of geometric attributes
    • G06T7/62 Analysis of geometric attributes of area, perimeter, diameter or volume
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/242 Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • The present invention relates to a teacher data conversion device, a teacher data conversion method, and a non-transitory recording medium.
  • Object detection that predicts the category, position, and size of an object in an input image using a neural network, such as deep learning, is known (Patent Documents 1 and 2).
  • The position and size of an object detected in object detection are specified by the position and size of a bounding box that surrounds the detected object and whose sides are parallel to the outer frame of the input image.
  • SSD (Single Shot Multibox Detector)
  • YOLO (You Only Look Once)
  • FIG. 2 shows a prediction result image in which the bounding box is rotated so that the long side of the bounding box shown in FIG. 2 is parallel to the longitudinal direction of the object, assuming the longitudinal direction of the object can be detected by some method. Even in this case, a large gap remains between the object and the bounding box.
  • In Non-Patent Document 1, a technique has been reported in which the orientation is predicted in addition to the category, position, and size of the object in the input image by adding the orientation of the object to the teacher data used when training the neural network. This technique is said to directly predict a bounding box rotated according to the orientation of the object, as shown in FIG. However, manually adding the orientation of an object to a large amount of teacher data is expensive and impractical.
  • In view of the above-mentioned problems, an object of the present invention is to provide a technique for automatically adding the orientation of an object to teacher data.
  • A teacher data conversion device is provided that includes: a storage unit that stores a trained first neural network, which has been trained to output geometric transformation parameters corresponding to an object image when the object image is input, the object image being an image of an object specified based on the object information of first teacher data that includes an image and object information including the category, position, and size of the object in the image; a calculation unit that calculates the orientation of the object based on the geometric transformation parameters output from the first neural network; and a generation unit that, by adding the orientation of the object calculated by the calculation unit to the first teacher data, generates second teacher data that includes the image and object information including the category, position, size, and orientation of the object in the image.
  • A teacher data conversion method is provided in which: a trained first neural network, which has been trained to output the geometric transformation parameters corresponding to the object image, is stored; the orientation of the object is calculated based on the geometric transformation parameters output from the first neural network; and the calculated orientation is added to the first teacher data to generate second teacher data that includes the image and object information including the category, position, size, and orientation of the object in the image.
  • The teacher data conversion device 1000 shown in FIG. 5 includes a storage unit 1001, a calculation unit 1002, and a generation unit 1003.
  • The storage unit 1001 stores the first neural network.
  • The first neural network is a trained neural network that has been trained to output geometric transformation parameters corresponding to an object image when the object image is input.
  • The object image is an image of an object specified based on the object information of the first teacher data, which includes an image and object information including the category, position, and size of the object in the image.
  • The calculation unit 1002 calculates the orientation of the object based on the geometric transformation parameters output from the first neural network.
  • By adding the orientation of the object calculated by the calculation unit 1002 to the first teacher data, the generation unit 1003 generates second teacher data that includes the image and object information including the category, position, size, and orientation of the object in the image.
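  • As an illustration of how the first and second teacher data relate, the following minimal Python sketch models object information as a record whose orientation field is initially absent and is filled in by the conversion. The names ObjectInfo, angle_deg, and add_orientation are hypothetical and do not appear in the original.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class ObjectInfo:
    """Object information attached to one object in a teacher-data image."""
    category: str
    cx: float  # bounding box center x
    cy: float  # bounding box center y
    w: float   # bounding box width
    h: float   # bounding box height
    angle_deg: Optional[float] = None  # None in first teacher data (assumed encoding)


def add_orientation(info: ObjectInfo, angle_deg: float) -> ObjectInfo:
    """Return a copy of the object information with the orientation added."""
    return replace(info, angle_deg=angle_deg)


first = ObjectInfo(category="ship", cx=10.0, cy=20.0, w=4.0, h=2.0)
second = add_orientation(first, 30.0)
```

    The original record is left unchanged, so the first teacher data remains available after conversion.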
  • The second embodiment relates to an object detection technique for predicting the category, position, and size of an object in an image using a neural network, particularly deep learning.
  • The orientation information of the object is added to the output of the neural network, and the network is then trained on teacher data including the orientation of the object, so that a bounding box rotated according to the orientation of the object is predicted directly.
  • A network that predicts geometric transformation parameters for spatially correcting an image, as typified by Spatial Transformer Networks (http://papers.nips.cc/paper/5854-spatial-transformer-networks.pdf), is introduced as a self-learning geometric transformation device. Specifically, the following steps are added to the object detection technology.
  • A step of training a self-learning geometric converter, which outputs the spatially corrected image and the correction parameters, using as input data teacher data that has object information including the object category, position, and size used in general object detection.
  • A step of applying the trained self-learning geometric converter to the teacher data having the object category, position, and size information to generate teacher data having object information including the position, size, and orientation of the object.
  • (Learning phase) A phase in which the image correction method according to the orientation of the object is learned from first teacher data, created in advance, that has an image and the category, position, and size information of the object corresponding to the image.
  • (Teacher data conversion phase) A phase in which the orientation information of an object is derived from the learned image correction method, and the first teacher data is converted into second teacher data using that information.
  • (Object detector learning phase) A phase in which the object detector is trained using the converted second teacher data.
  • (Prediction phase) A phase in which object detection is performed using the trained object detector (trained model).
  • FIG. 6 is a configuration diagram of the object detection device 101.
  • The object detection device 101 (teacher data conversion device) includes a self-learning geometric converter learning unit 108 (learning unit), a teacher data conversion unit 110 (generation unit), an object detector learning unit 118, and a prediction unit 121.
  • The self-learning geometric transformer learning unit 108 learns a geometric transformation method for capturing the features of an object from the teacher data 102 (first teacher data), which has an image 103 and, as object information 104 corresponding to the image, the object category 105, position 106 (bounding box center coordinates cx, cy), and size 107 (bounding box scale w, h).
  • The teacher data conversion unit 110 adds orientation information 117 to the object information 104 of the teacher data (first teacher data) using the trained self-learning geometric converter 109 (storage unit).
  • The object detector learning unit 118 trains the object detector using the converted teacher data 111.
  • The prediction unit 121 makes a prediction on the prediction image data 120 using the trained object detector 119.
  • FIG. 7 is a configuration diagram of the self-learning geometric transformation device learning unit 108.
  • The self-learning geometric transformation learning unit 207 includes a marking location extraction unit 208, a transformation matrix generation unit 210, a geometric transformation unit 211, a self-learning geometric transformation storage unit 212, an image classification unit 213, and a prediction error calculation unit 214.
  • The marking location extraction unit 208 extracts the marking location of the object from the created teacher data 201 (first teacher data).
  • The transformation matrix generation unit 210 calculates the transformation matrix from the small image extracted at the marking location.
  • The transformation matrix generation unit 210 corresponds to the localization network of Spatial Transformer Networks.
  • The geometric transformation unit 211 applies the geometric transformation to the small image extracted at the marking location and outputs the converted image.
  • The geometric transformation unit 211 corresponds to the Grid Generator and Sampler of Spatial Transformer Networks.
  • The self-learning geometric transformation storage unit 212 stores the self-learning geometric transformation device once its learning is complete.
  • The self-learning geometric transformation storage unit 212 stores the trained self-learning geometric transformation device 209 (first neural network) as the self-learning geometric transformation device 215 (storage unit).
  • The image classification unit 213 (second neural network) performs image classification on the image output from the geometric transformation unit and outputs a predicted value.
  • The prediction error calculation unit 214 calculates the prediction error from the predicted value (category) of the image classification unit 213 and the category information 204 of the teacher data, and updates the parameters of the image classification unit 213 and the self-learning geometric transformation device 209.
  • FIG. 8 is a configuration diagram of the inside of the teacher data conversion unit 110.
  • The teacher data conversion unit 308 includes a marking location extraction unit 309, a self-learning geometric transformation device reading unit 310, an inverse transformation matrix calculation unit 314, and an object orientation calculation unit 315 (calculation unit).
  • The marking location extraction unit 309 extracts the marking location of the object from the created teacher data 301.
  • The self-learning geometric transformation device reading unit 310 reads a trained self-learning geometric transformation unit 311 that includes a transformation matrix generation unit 312 and a geometric transformation unit 313.
  • The inverse transformation matrix calculation unit 314 calculates an inverse transformation matrix (inverse geometric transformation matrix) of the transformation matrix (geometric transformation matrix) output from the transformation matrix generation unit 312.
  • The object orientation calculation unit 315 calculates a new orientation while correcting the position and size of the object using the inverse transformation matrix, and saves the object position (bounding box center coordinates cx, cy) 320, the size (bounding box scale w, h) 321, and the orientation (bounding box orientation θ) 322 as information of the converted teacher data.
  • FIG. 9 is a flowchart showing an example of the entire process from the process of adding orientation information to the teacher data prepared by the user to the actual object detection prediction.
  • FIG. 10 is a detailed flowchart of the self-learning geometric transformation device learning step S101 in the entire processing flow.
  • FIG. 11 is a flowchart showing the teacher data conversion step S102 in detail in the overall processing flow.
  • FIG. 12 is a supplementary material for the self-learning geometric transformation learning step.
  • FIG. 13 is a supplementary material for the geometric transformation performed on the image during the flow.
  • FIG. 14 is a supplementary material of the teacher data conversion step.
  • In step S101, the user inputs the teacher data 102 into the self-learning geometric transformer learning unit 108.
  • The self-learning geometric transformer learning unit 108 learns the correction method for the input data, and the model 109 that has reached the end condition is saved.
  • In step S102, the user can acquire new teacher data 111 including the orientation information of the object by inputting the trained self-learning geometric converter 109 and the teacher data 102 into the teacher data conversion unit 110.
  • In this teacher data 111, not only is the orientation 117 of the object added to the original teacher data 102, but the position 115 and the size 116 are also corrected.
  • In step S103, the user inputs the converted teacher data 111 into the object detector learning unit 118.
  • The object detector 119 learns information on the category, position, size, and orientation of the object, and the object detector 119 that has reached the end condition is stored.
  • Non-Patent Document 1 is used as an example of a learning method for the object detector 119 that takes orientation into account.
  • In step S104, the user detects objects in the image data 120 for prediction using the trained object detector 119.
  • The category, position, size, and orientation of each object in the input image data 120 are predicted, and the prediction result is output in a format such as a bounding box.
  • Non-Patent Document 1 is used as an example of an object detection method that takes orientation into account.
  • In step S201, the teacher data 201 input by the user is read into the self-learning geometric transformation device learning unit 108.
  • An image showing a crescent-shaped object is input to the self-learning geometric transformation device learning unit 108.
  • In step S202, the marking location extraction unit 208 acquires one piece of object information 203 from the teacher data 201.
  • The object information of the object shown in the lower right of the input image is acquired.
  • In step S203, the marking location extraction unit 208 cuts out a small image (object image) at the object position based on the position 205 and the size 206 in the object information.
  • The aspect ratio is changed so that the cut-out image becomes a square, but depending on the input method of the self-learning geometric transformation device 209, changing the aspect ratio is not always necessary.
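  • The cut-out of step S203 can be sketched as follows. This is a minimal illustration, assuming the image is a nested list of pixel values, the box is given as center coordinates and scale in pixels, and nearest-neighbour sampling is acceptable; the function name crop_to_square and the parameter out_size are hypothetical.

```python
def crop_to_square(image, cx, cy, w, h, out_size):
    """Cut out the box (cx, cy, w, h) and resample it to out_size x out_size.

    Nearest-neighbour sampling; the aspect ratio changes whenever w != h.
    """
    rows, cols = len(image), len(image[0])
    x0, y0 = cx - w / 2.0, cy - h / 2.0  # top-left corner of the box
    patch = []
    for j in range(out_size):
        # Sample at the center of each output cell, clamped to the image.
        sy = y0 + (j + 0.5) * h / out_size
        r = min(max(int(sy), 0), rows - 1)
        row = []
        for i in range(out_size):
            sx = x0 + (i + 0.5) * w / out_size
            c = min(max(int(sx), 0), cols - 1)
            row.append(image[r][c])
        patch.append(row)
    return patch


# Toy 4x8 image whose pixel value encodes its row and column.
image = [[r * 10 + c for c in range(8)] for r in range(4)]
patch = crop_to_square(image, cx=4, cy=2, w=4, h=2, out_size=2)
```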
  • When a small image is input to the self-learning geometric transformation device 209, it is first passed to the transformation matrix generation unit 210, step S204 is performed, and the transformation matrix is output.
  • The following describes the affine transformation as an example, but as described in the Spatial Transformer Networks paper, transformation methods other than the affine transformation can also be applied.
  • In step S205, the geometric transformation unit 211 applies the transformation matrix to the small image, and the geometric transformation of the data is performed.
  • FIG. 13 illustrates the geometric transformation: for the small image on the left, attention is paid to the thick frame portion in the center, and geometric transformations such as enlargement/reduction, rotation, and translation are applied to the coordinates of the thick frame portion so that it is converted into the small image on the right.
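  • To make the coordinate transformation concrete, the sketch below builds a 2x3 affine matrix from scaling, rotation, and translation and applies it to a point. It assumes the composition order translate(rotate(scale(p))), which the original does not specify; make_affine and apply_affine are hypothetical names.

```python
import math


def make_affine(scale_x, scale_y, angle_deg, tx, ty):
    """Build a 2x3 affine matrix that scales, then rotates, then translates."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [
        [c * scale_x, -s * scale_y, tx],
        [s * scale_x, c * scale_y, ty],
    ]


def apply_affine(m, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


# Scale x by 2, rotate by 90 degrees, then shift by 1 along x.
m = make_affine(2.0, 1.0, 90.0, 1.0, 0.0)
moved = apply_affine(m, (1.0, 0.0))  # approximately (1.0, 2.0)
```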
  • In step S206, the image classification unit 213 predicts the image classification of the geometrically transformed small image.
  • In step S207, the prediction error calculation unit 214 calculates the error of the prediction result based on the prediction result (classification result) output from the image classification unit 213 and the category information 204 of the teacher data 201.
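  • The original does not name the error function used in step S207. As one common choice for classification, the sketch below computes a softmax cross-entropy between the classifier's raw scores and the category index from the teacher data; both the choice of loss and the name cross_entropy are assumptions.

```python
import math


def cross_entropy(logits, target_index):
    """Softmax cross-entropy of raw class scores against the true category index."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target_index]


# A confident correct prediction yields a smaller error than a confident wrong one.
error_correct = cross_entropy([10.0, 0.0], target_index=0)
error_wrong = cross_entropy([10.0, 0.0], target_index=1)
```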
  • In step S208, the image classifier 213 and the self-learning geometric transformation device 209 are updated so as to reduce the prediction error output by the prediction error calculation unit 214.
  • Both the image classifier 213 and the self-learning geometric converter 209 are built from neural networks; updating them means updating the weighting coefficients of the neural networks that constitute them.
  • In step S209, it is checked whether the learning end condition has been reached. The processes of steps S202 to S208 are repeated until the end condition is reached.
  • In step S210, the self-learning geometric transformer storage unit 212 saves the self-learning geometric transformer 209 for which learning has been completed.
  • The image classifier 213 is provided for training the self-learning geometric transformation device 209 and does not necessarily have to be saved. This embodiment describes a flow in which it is not saved.
  • In step S301, the teacher data 301 input by the user is read into the teacher data conversion unit 308.
  • In step S301, an image showing a crescent-shaped object is input.
  • In step S302, the self-learning geometric transformation device 311 saved in step S210 is read into the teacher data conversion unit 308.
  • In step S303, one piece of object information 303 is selected from the teacher data 301.
  • The object information indicated by the thick frame around the object shown in the lower right of the input image is selected.
  • In step S304, a small image at the object position is cut out based on the position 305 and the size 306 in the object information.
  • The aspect ratio is changed so that the cut-out image becomes a square.
  • Depending on the input method of the self-learning geometric transformation device 311, changing the aspect ratio is not always necessary.
  • The small image is passed to the transformation matrix generation unit 312, step S305 is performed, and the transformation matrix is output.
  • The following describes the affine transformation as an example, but as in S204, transformation methods other than the affine transformation can be applied, as described in the Spatial Transformer Networks paper. Unlike S204, the geometric transformation itself does not need to be performed during teacher data conversion, so the geometric transformation unit 313 is not used.
  • In step S306, the transformation matrix output in step S305 is input to the inverse transformation matrix calculation unit 314, and the inverse matrix is calculated.
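  • For an affine transformation, the inverse of step S306 has a closed form. The sketch below inverts a 2x3 affine matrix by inverting its 2x2 linear part and mapping the translation back; invert_affine and apply_affine are hypothetical names, and the 2x3 layout is an assumption about how the matrix is stored.

```python
def apply_affine(m, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


def invert_affine(m):
    """Return the 2x3 matrix that undoes the 2x3 affine matrix m."""
    (a, b, tx), (c, d, ty) = m
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("affine matrix is singular and cannot be inverted")
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    # The inverse translation is the negated translation mapped by the inverse linear part.
    return [
        [ia, ib, -(ia * tx + ib * ty)],
        [ic, id_, -(ic * tx + id_ * ty)],
    ]


forward = [[0.0, -1.0, 1.0], [2.0, 0.0, 0.0]]
inverse = invert_affine(forward)
restored = apply_affine(inverse, apply_affine(forward, (1.0, 0.0)))
```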
  • The object orientation calculation unit 315 calculates the object orientation information using the inverse transformation matrix calculated in step S306.
  • The coordinates of the thick frame of the central image can be calculated by applying the inverse transformation to the coordinates of the four corners of the thick frame of the small image on the right. The position and size of the object are corrected based on these coordinate values.
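  • A minimal sketch of this correction follows. It assumes the corners of the transformed small image are given in normalized coordinates from -1 to 1 (the convention used by Spatial Transformer Networks), that the inverse matrix contains no shear, and that the corrected center is the mean of the mapped corners; corrected_box is a hypothetical name.

```python
import math


def apply_affine(m, point):
    """Apply a 2x3 affine matrix to an (x, y) point."""
    x, y = point
    return (
        m[0][0] * x + m[0][1] * y + m[0][2],
        m[1][0] * x + m[1][1] * y + m[1][2],
    )


def corrected_box(inverse, corners=((-1, -1), (1, -1), (1, 1), (-1, 1))):
    """Map the four corners back and derive the corrected center and size."""
    pts = [apply_affine(inverse, p) for p in corners]
    cx = sum(p[0] for p in pts) / 4.0
    cy = sum(p[1] for p in pts) / 4.0
    w = math.dist(pts[0], pts[1])  # side that was horizontal before mapping
    h = math.dist(pts[1], pts[2])  # side that was vertical before mapping
    return cx, cy, w, h, pts


inverse = [[0.5, 0.0, 0.25], [0.0, 0.25, 0.0]]
cx, cy, w, h, pts = corrected_box(inverse)
```

    The result is in the same normalized coordinates as the cut-out small image, so it still has to be mapped to pixel coordinates of the full image using the original position and size.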
  • The orientation of the object is determined in the following steps. First, the rotation angle in the inverse transformation matrix is obtained.
  • An affine transformation is generally the product of scaling, rotation, and translation matrices. Therefore, by decomposing the inverse transformation matrix into these three types of matrices and reading off the angle of the rotation matrix, the rotation angle applied when converting to the thick frame can be obtained.
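  • The decomposition can be sketched as follows, assuming the matrix equals translation times rotation times scaling with positive scale factors and no shear or reflection. The original does not state the multiplication order, so this order and the name decompose_affine are assumptions.

```python
import math


def decompose_affine(m):
    """Split a 2x3 affine matrix into (scale_x, scale_y, angle_deg, tx, ty).

    Assumes m = T * R * S with positive scales and no shear or reflection,
    so the first column is scale_x * (cos, sin) and the second is scale_y * (-sin, cos).
    """
    (a, b, tx), (c, d, ty) = m
    scale_x = math.hypot(a, c)
    scale_y = math.hypot(b, d)
    angle_deg = math.degrees(math.atan2(c, a))
    return scale_x, scale_y, angle_deg, tx, ty


# Matrix built from scale (2, 0.5), a 30 degree rotation, and translation (3, -1).
rad = math.radians(30.0)
m = [
    [2.0 * math.cos(rad), -0.5 * math.sin(rad), 3.0],
    [2.0 * math.sin(rad), 0.5 * math.cos(rad), -1.0],
]
scale_x, scale_y, angle_deg, tx, ty = decompose_affine(m)
```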
  • The tilt angle of this thick frame is determined according to the definition used.
  • The following is an example of the definition when SSD is adopted for the object detector 119.
  • SSD predicts the amount of translation and the scaling ratio with respect to default boxes having a plurality of different aspect ratios, as shown in FIG.
  • The translation amount, the enlargement/reduction ratio, and the angle are predicted with respect to default boxes that differ not only in aspect ratio but also in angle, as shown in FIG.
  • The angle is defined as the inclination angle of the long side of the default box, as shown in FIG.
  • The tilt angle of the thick frame of the image in the center of FIG. 13 is likewise defined as the angle of the long side, as in FIG.
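  • One way to express the long-side definition in code is sketched below. It assumes the rotation angle refers to the direction of the box's width side, that the long side is 90 degrees away when the height exceeds the width, and that angles are reported in the range from 0 (inclusive) to 180 (exclusive) because a side has no direction; none of these conventions is stated in the original, and long_side_angle is a hypothetical name.

```python
def long_side_angle(w, h, rotation_deg):
    """Return the tilt of the box's long side in degrees, in [0, 180)."""
    # If the height is the longer side, the long side is perpendicular
    # to the direction given by rotation_deg.
    angle = rotation_deg if w >= h else rotation_deg + 90.0
    return angle % 180.0


wide = long_side_angle(4.0, 2.0, 30.0)
tall = long_side_angle(2.0, 4.0, 30.0)
```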
  • In step S308, the one piece of data converted in S307 is saved as converted teacher data 316.
  • In step S309, it is confirmed whether the conversion process has been performed on all the teacher data. If there is data that has not yet been processed, processing continues from S302. When the conversion process has been performed on all the teacher data, the process ends.
  • The object detection device 101 (teacher data conversion device) includes a self-learning geometric converter 109 (storage unit), an object orientation calculation unit 315 (calculation unit), and a teacher data conversion unit 308 (generation unit).
  • The self-learning geometric converter 109, as a storage unit composed of RAM, ROM, or the like, stores a trained self-learning geometric converter (first neural network) that has been trained to output geometric transformation parameters corresponding to an object image when the object image is input. The object image is an image of the object specified based on the object information 303 of the teacher data 301 (first teacher data), which includes an image and object information including the category, position, and size of the object in the image.
  • The object orientation calculation unit 315 calculates the orientation of the object based on the geometric transformation parameters output from the self-learning geometric transformation device 109.
  • The teacher data conversion unit 308 adds the orientation of the object calculated by the object orientation calculation unit 315 to the teacher data 301 to generate the converted teacher data 316 (second teacher data), which includes the image and object information including the category, position, size, and orientation of the object in the image.
  • The object detection device 101 further includes a self-learning geometric transformation learning unit 108 (learning unit) that generates the self-learning geometric transformation device by learning.
  • The self-learning geometric transformation device learning unit 108 inputs the object image to the self-learning geometric transformation device 109 and geometrically transforms the object image based on the geometric transformation parameters output from the self-learning geometric transformation device 109.
  • The self-learning geometric converter learning unit 108 calculates the prediction error between the category output from the image classification unit 213 (second neural network), obtained by inputting the geometrically transformed object image into the image classification unit 213, and the category included in the teacher data 301.
  • The self-learning geometric transformation unit learning unit 108 trains the self-learning geometric transformation unit 109 by updating the weighting coefficients of the self-learning geometric transformation unit 109 and the image classification unit 213 so that the prediction error becomes small.
  • The geometric transformation parameter is a parameter for rotating the object image.
  • The geometric transformation parameter is a parameter for executing at least one of enlargement, reduction, and translation of the object image.
  • The geometric transformation parameters are parameters for affine transformation of the object image.
  • The geometric transformation parameter is a geometric transformation matrix.
  • The object orientation calculation unit 315 calculates the orientation of the object based on the inverse geometric transformation matrix, which is the inverse matrix of the geometric transformation matrix.
  • In the teacher data conversion method, a trained self-learning geometric converter 109 (first neural network) is stored, which has been trained to output geometric transformation parameters corresponding to an object image when the object image, an image of an object specified based on the object information of the teacher data 301, is input.
  • The orientation of the object is calculated based on the geometric transformation parameters output from the self-learning geometric transformation device 109.
  • In the teacher data conversion method, by adding the calculated orientation of the object to the teacher data 301, the converted teacher data 111 (second teacher data), which includes the image and object information including the category, position, size, and orientation of the object in the image, is generated. According to the above method, a technique for automatically adding the orientation of the object to the teacher data 301 is realized.
  • The above teacher data conversion method can be executed by a computer. That is, when the CPU of the computer reads and executes the program stored in the ROM of the computer, the program causes the computer to execute the teacher data conversion method.
  • The program may be stored on a non-transitory recording medium.
  • Modification example 1: By preparing a self-learning geometric transformation device 109 for each category to be detected, the transformation for each category can be learned more accurately. Because the position, size, and orientation of the object can then be grasped more accurately, the quality of the converted teacher data and the accuracy of object detection are expected to improve.
  • As shown in FIGS. 22 and 23, steps S504 and S605 are added, the saving and reading of the converter are changed as shown in steps S511 and S602, and the converter is selected according to the category of the object.
  • In step S504, the converter selection unit 509 selects the converter for the target category, and the subsequent processing proceeds.
  • The converter selection unit 509 identifies the category of each object by referring to the category included in the object information of the teacher data, and selects the self-learning geometric transformation device 510 corresponding to that category.
  • In step S511, the self-learning geometric transformer storage unit 513 saves all the self-learning geometric transformers 510.
  • In step S602, the self-learning geometric transformer reading unit 610 reads all the self-learning geometric transformers 612.
  • In step S605, the converter selection unit 611 selects the converter for the target category, and the subsequent processing proceeds.
  • The converter selection unit 611 identifies the category of each object by referring to the category included in the object information of the teacher data, and selects the self-learning geometric transformation device 612 corresponding to that category.
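The per-category selection described above can be sketched as a simple lookup. This is only an illustration: the class `ConverterSelector`, the `Transformer` type alias, and the helper `transform_all` are hypothetical names, and each transformer is modeled as a plain callable rather than a trained network.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical type: a per-category transformer maps an image patch to
# geometric transformation parameters.
Transformer = Callable[[object], Tuple[float, ...]]


class ConverterSelector:
    """Holds one transformer per category and selects by category name."""

    def __init__(self, transformers: Dict[str, Transformer]) -> None:
        self._transformers = dict(transformers)

    def select(self, category: str) -> Transformer:
        # Fail loudly if no transformer was prepared for this category.
        if category not in self._transformers:
            raise KeyError(f"no transformer prepared for category {category!r}")
        return self._transformers[category]


def transform_all(
    selector: ConverterSelector,
    objects: Sequence[Tuple[str, object]],
) -> List[Tuple[float, ...]]:
    """Apply the category-specific transformer to each (category, patch) pair."""
    return [selector.select(category)(patch) for category, patch in objects]
```

Keeping the mapping in one object mirrors the description in which all transformers are saved and read together and one is chosen per object at processing time.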
  • Instead of the angle value θ, the orientation of the object may be expressed as the coordinate values (cos θ, sin θ) of the corresponding point on the unit circle, as shown in FIG. 24.
  • With the coordinate values (cos θ, sin θ), both 0° and 360° have the same coordinate values, (1, 0). This can be expected to improve the detection accuracy of the object detection unit 119.
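The unit-circle representation can be sketched in a few lines. The function names `encode_orientation` and `decode_orientation` are hypothetical, and the decoding via `atan2` is an assumption added for illustration; the text only describes the encoding.

```python
import math
from typing import Tuple


def encode_orientation(theta_deg: float) -> Tuple[float, float]:
    """Map an angle in degrees to its point (cos θ, sin θ) on the unit circle."""
    theta = math.radians(theta_deg)
    return (math.cos(theta), math.sin(theta))


def decode_orientation(cos_value: float, sin_value: float) -> float:
    """Recover an angle in degrees, in the range (-180, 180], from the point."""
    return math.degrees(math.atan2(sin_value, cos_value))
```

With this encoding, 0° and 360° give the same point up to floating-point rounding, so a regression target has no jump at the wrap-around, whereas the raw angle values differ by 360.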
  • Non-transitory computer-readable media include various types of tangible storage media.
  • Examples of non-transitory computer-readable media include magnetic recording media (e.g., flexible disks, magnetic tapes, hard disk drives) and magneto-optical recording media (e.g., magneto-optical disks).
  • Further examples of non-transitory computer-readable media include CD-ROM (Read Only Memory), CD-R, CD-R/W, and semiconductor memories (e.g., mask ROM, PROM (Programmable ROM), EPROM (Erasable PROM), flash ROM, RAM (random access memory)).
  • The program may also be supplied to the computer by various types of transitory computer-readable media.
  • Examples of transitory computer-readable media include electrical signals, optical signals, and electromagnetic waves.
  • A transitory computer-readable medium can supply the program to the computer via a wired communication path, such as an electric wire or an optical fiber, or via a wireless communication path.

PCT/JP2020/014031 2020-03-27 2020-03-27 Teacher data conversion device, teacher data conversion method, and non-transitory recording medium Ceased WO2021192225A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US17/908,419 US12380591B2 (en) 2020-03-27 2020-03-27 Generation of teaching data including image and object information including category, position, size, and orientation of object included in image
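JP2022510328A JP7283631B2 (ja) 2020-03-27 2020-03-27 Teacher data conversion device, teacher data conversion method, and program
PCT/JP2020/014031 WO2021192225A1 (ja) 2020-03-27 2020-03-27 Teacher data conversion device, teacher data conversion method, and non-transitory recording medium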

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2020/014031 WO2021192225A1 (ja) 2020-03-27 2020-03-27 Teacher data conversion device, teacher data conversion method, and non-transitory recording medium

Publications (1)

Publication Number Publication Date
WO2021192225A1 (ja) 2021-09-30

Family

ID=77889952

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2020/014031 Ceased WO2021192225A1 (ja) 2020-03-27 2020-03-27 Teacher data conversion device, teacher data conversion method, and non-transitory recording medium

Country Status (3)

Country Link
US (1) US12380591B2
JP (1) JP7283631B2
WO (1) WO2021192225A1

Also Published As

Publication number Publication date
JP7283631B2 (ja) 2023-05-30
JPWO2021192225A1 2021-09-30
US20230143661A1 (en) 2023-05-11
US12380591B2 (en) 2025-08-05

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 20927782; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2022510328; Country of ref document: JP; Kind code of ref document: A
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 20927782; Country of ref document: EP; Kind code of ref document: A1
WWG WIPO information: grant in national office. Ref document number: 17908419; Country of ref document: US