WO2022250154A1 - Trained model generation device, trained model generation method, and recognition device

Info

Publication number
WO2022250154A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
model
learning
adapter
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2022/021815
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
南己 淺谷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kyocera Corp
Original Assignee
Kyocera Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kyocera Corp
Priority to EP22811422.9A (EP4350614A4)
Priority to US18/565,070 (US20240265691A1)
Priority to CN202280037790.3A (CN117396927A)
Priority to JP2023513902A (JP7271809B2)
Publication of WO2022250154A1

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75 Determining position or orientation of objects or cameras using feature-based methods involving models
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/10 Pre-processing; Data cleansing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/096 Transfer learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20076 Probabilistic image processing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Definitions

  • the present disclosure relates to a trained model generation device, a trained model generation method, and a recognition device.
  • a trained model generating device includes a control unit that generates a trained model that outputs a recognition result of a recognition target included in input information.
  • The control unit acquires an adapter that is coupled to at least one base model and can convert the input information before the input information is input to the at least one base model. The at least one base model is generated by performing first learning using teacher data including learning target information that is the same as or related to the input information. The adapter is generated, while coupled to the at least one base model, by performing second learning using teacher data including learning target information different from the information used in the first learning.
  • The control unit generates a target model by performing third learning using teacher data including, among the learning target information, information different from both the information used in the first learning and the information used in the second learning.
  • The control unit generates the trained model by combining the adapter and the target model.
  • a trained model generation method is executed by a trained model generation device that generates a trained model that outputs a recognition result of a recognition target included in input information.
  • The trained model generation method includes acquiring an adapter that is coupled to at least one base model and can convert the input information before the input information is input to the at least one base model. The at least one base model is generated by performing first learning using teacher data including learning target information that is the same as or related to the input information. The adapter is generated, while coupled to the at least one base model, by performing second learning using teacher data including learning target information different from the information used in the first learning.
  • The trained model generation method includes generating a target model by performing third learning using teacher data including, among the learning target information, information different from both the information used in the first learning and the information used in the second learning.
  • the trained model generation method includes generating the trained model by combining the adapter and the target model.
  • a recognition device includes a trained model that outputs a recognition result of a recognition target included in input information.
  • The trained model includes an adapter that is coupled to at least one base model and can convert the input information before the input information is input to the at least one base model. The at least one base model is generated by performing first learning using teacher data including learning target information that is the same as or related to the input information. The adapter is generated, while coupled to the at least one base model, by performing second learning using teacher data including learning target information different from the information used in the first learning.
  • The trained model includes a target model generated by performing third learning using teacher data including, among the learning target information, information different from both the information used in the first learning and the information used in the second learning.
  • the trained model is constructed by combining the adapter and the target model.
  • FIG. 1 is a block diagram showing a configuration example of a trained model generation system according to an embodiment.
  • FIG. 2 is a schematic diagram showing a generic library and a trained model to which an image adapter is coupled.
  • FIG. 3 is a diagram showing an example of an image adapter.
  • FIG. 4 is a schematic diagram showing generation of an image adapter coupled to a plurality of base models and generation of a trained model by transferring the image adapter to the trained model.
  • FIG. 5 is a flow chart showing an example procedure of a trained model generation method.
  • FIG. 6 is a schematic diagram showing a configuration example of a robot control system.
  • recognition accuracy can be improved.
  • As shown in FIG. 1, a trained model generation device 20 according to an embodiment of the present disclosure includes a control unit 22 and an information generation unit 26. The trained model generation device 20 generates a trained model 70 (see FIG. 2).
  • The control unit 22 acquires, from the information generation unit 26, information about the target to be used for learning.
  • Objects that are applied to learning are also referred to as learning objects.
  • the control unit 22 performs learning using the information about the learning target acquired from the information generating unit 26 as teacher data, and outputs information or data based on the learning result.
  • The learning target for generating the trained model 70 may include the object to be recognized itself, or may include another object. An object that can be recognized by the trained model 70 is also called a recognition target.
  • the control unit 22 may include at least one processor to provide control and processing power to perform various functions.
  • The processor may execute programs that implement various functions of the control unit 22.
  • a processor may be implemented as a single integrated circuit.
  • An integrated circuit is also called an IC (Integrated Circuit).
  • a processor may be implemented as a plurality of communicatively coupled integrated and discrete circuits. Processors may be implemented based on various other known technologies.
  • the control unit 22 may include a storage unit.
  • the storage unit may include an electromagnetic storage medium such as a magnetic disk, or may include a memory such as a semiconductor memory or a magnetic memory.
  • the storage unit stores various information.
  • the storage unit stores programs and the like executed by the control unit 22 .
  • the storage unit may be configured as a non-transitory readable medium.
  • The storage unit may function as a work memory for the control unit 22. At least part of the storage unit may be configured separately from the control unit 22.
  • the information generation unit 26 outputs teacher data used in learning in the control unit 22 to the control unit 22 .
  • The information generation unit 26 may generate teacher data, or may acquire teacher data from an external device.
  • The information generation unit 26 may include at least one processor to provide control and processing capabilities for generating or acquiring teacher data.
  • The processor may execute a program that generates or acquires teacher data.
  • The information generation unit 26 may be configured identically or similarly to the control unit 22.
  • The information generation unit 26 may be configured integrally with the control unit 22.
  • The information generation unit 26 may generate, as teacher data, information representing the actual aspect of the learning target. Information representing the actual aspect of the learning target is also referred to as actual information or real information.
  • The information generation unit 26 may include a camera that captures an actual image of the learning target.
  • The information generation unit 26 may perform annotation by adding information such as a label to the actual image of the learning target.
  • The information generation unit 26 may receive an operation input related to annotation from the user.
  • The information generation unit 26 may perform annotation based on a learning model for annotation prepared in advance.
  • The information generation unit 26 can generate actual information by annotating the actual image of the learning target.
  • the information generating unit 26 virtually generates, as teacher data, information about the learning target as information of a task that is the same as or related to the input information input to the trained model 70 .
  • For example, in a task of recognizing and classifying an object included in an image, the input information is an image in which the object is captured.
  • a task that is the same as or related to the input information corresponds to a task that is executed using the input information to be processed by the trained model 70 or a task that is executed using information similar to or related to the input information.
  • The same task as the input information corresponds to the task of classifying the screws and nails that are actually classified by the trained model 70.
  • the task associated with the input information corresponds to the task of classifying screws and nails from an image that also includes other types of screws or nails that are similar to a given type of screws and nails, or objects that are similar to these.
  • the information about the learning object that is virtually generated is also called pseudo information.
  • the pseudo information may be, for example, a computer graphics (CG) image of the screw or nail to be recognized instead of image information of the actual screw or nail.
  • the task may include, for example, a classification task for classifying recognition targets included in input information into at least two types.
  • the task may include, for example, a task of distinguishing whether a recognition target is a screw or a nail, or an evaluation task of calculating at least one type of evaluation value based on input information.
  • the classification task can be subdivided into, for example, a task of distinguishing whether a recognition target is a dog or a cat.
  • Tasks are not limited to classification tasks, and may include tasks that implement various other operations.
  • A task may include segmentation, which determines the pixels belonging to a particular object.
  • A task may include object detection, which detects a rectangular region enclosing an object.
  • the task may include object pose estimation.
  • a task may include keypoint detection to find certain feature points.
  • When both the input information and the information about the learning target are classification task information, the relationship between the input information and the information about the learning target is assumed to be that of related task information.
  • When both the input information and the information about the learning target are task information for distinguishing whether the recognition target is a dog or a cat, the relationship between the input information and the information about the learning target is assumed to be that of the same task information.
  • the relationship between the input information and the learning target information is not limited to these examples, and can be determined under various conditions.
  • the information generation unit 26 may generate information that virtually represents the appearance of the learning target in order to generate pseudo information.
  • the information generator 26 may generate modeling data such as three-dimensional CAD (Computer Aided Design) data of the appearance of the learning object as information that virtually represents the appearance of the learning object.
  • the information generation unit 26 may generate an image of the learning target as information that virtually represents the appearance of the learning target.
  • the information generation unit 26 may perform annotation by adding information such as a label to modeling data or an image that virtually represents the appearance of the object to be learned.
  • the information generation unit 26 can generate pseudo information by annotating the generated information that virtually represents the appearance of the object to be learned.
  • the information generation unit 26 may acquire information that virtually represents the appearance of the learning object from an external device.
  • the information generation unit 26 may receive input regarding modeling data from the user.
  • the information generation unit 26 may acquire data obtained by annotating information that virtually represents the appearance of the object to be learned.
  • the information generator 26 may receive an operation input related to annotation from the user.
  • the information generation unit 26 may perform annotation on information that virtually represents the appearance of a learning object based on a learning model for annotation that has been prepared in advance.
  • the trained model generating device 20 generates a trained model 70 that outputs recognition results of recognition targets included in input information.
  • the trained model 70 is configured as a model in which the image adapter 50 is coupled to the input side of the target model 40 .
  • The image adapter 50 is configured to receive the input information.
  • the image adapter 50 is also simply called an adapter.
  • the trained model generation device 20 performs the following operations in preparation for generating the trained model 70.
  • the trained model generating device 20 generates the base model 30 by learning based on the pseudo information.
  • the training performed to generate the base model 30 is also referred to as first training.
  • the teacher data used in the first learning may include learning target information that is the same as or related to the input information.
  • the trained model generating device 20 may use real information instead of pseudo information, or may use both pseudo information and real information.
  • the pseudo information used for learning to generate the base model 30 is also called first pseudo information.
  • the trained model generation device 20 generates the image adapter 50 by further learning based on the actual information while the image adapter 50 is connected to the input side of the base model 30 .
  • the learning performed to generate the image adapter 50 is also referred to as second learning.
  • the teacher data used in the second learning includes learning target information that is the same as or related to the input information, and may include information different from the information used in the first learning.
  • the real information used for learning to generate the image adapter 50 is also called first real information. Second pseudo information and second real information, which will be described later, may be used as the first pseudo information and first real information.
  • the trained model generation device 20 generates the target model 40 by learning based on pseudo information or real information without connecting the image adapter 50 .
  • the learning performed to generate the target model 40 is also referred to as third learning.
  • The teacher data used in the third learning includes learning target information that is the same as or related to the input information, and may include information different from both the information used in the first learning and the information used in the second learning.
  • the pseudo information used for learning to generate the target model 40 is also called second pseudo information.
  • the real information used for learning to generate the target model 40 is also referred to as second real information.
  • The trained model generation device 20 transfers the image adapter 50, which was generated in advance by pre-learning while coupled to the base model 30, and couples it to the input side of the newly generated target model 40 to generate the trained model 70. Note that the trained model generation device 20 may transfer the base model 30 used for pre-learning as the target model 40. The trained model generation device 20 may also generate the trained model 70 by coupling the image adapter 50 and the target model 40 and performing further learning using the second pseudo information and the second real information as teacher data.
  • By generating the image adapter 50 in advance through pre-learning, the trained model generation device 20 can generate the target model 40 by learning based only on pseudo information and can then generate the trained model 70 simply by coupling the image adapter 50. As a result, the workload of generating the target model 40 can be reduced.
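  • As an illustrative sketch only (not part of the disclosure), the coupling of a previously generated adapter to the input side of a newly generated target model can be pictured as function composition. The class name `TrainedModel`, the offset-removing adapter, and the threshold-based target model below are hypothetical stand-ins chosen for brevity; in the disclosure both components are obtained by learning.

```python
from dataclasses import dataclass
from typing import Callable, List

Image = List[float]  # toy "image": a flat list of pixel intensities


@dataclass
class TrainedModel:
    """Adapter coupled to the input side of a target model."""
    adapter: Callable[[Image], Image]
    target_model: Callable[[Image], str]

    def __call__(self, input_info: Image) -> str:
        # The input information is converted first, then recognized.
        return self.target_model(self.adapter(input_info))


def adapter(image: Image) -> Image:
    # Hypothetical adapter: removes a brightness offset of 0.4 that is
    # assumed to separate real images from pseudo (CG) images.
    return [pixel - 0.4 for pixel in image]


def target_model(image: Image) -> str:
    # Hypothetical target model built on pseudo information only.
    mean = sum(image) / len(image)
    return "screw" if mean >= 0.5 else "nail"


trained_model = TrainedModel(adapter=adapter, target_model=target_model)
real_image = [0.7, 0.9]
without_adapter = target_model(real_image)
with_adapter = trained_model(real_image)
```

  • In this toy setting the target model alone labels the real image differently from the combined model, which illustrates why the adapter is placed on the input side rather than inside the target model.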
  • In the pre-learning, real information, pseudo information, or a combination of these may be used as teacher data.
  • The base model 30 and the target model 40 are configured as a CNN (Convolutional Neural Network) having multiple layers. Information input to the base model 30 and the target model 40 is subjected to convolution based on predetermined weighting coefficients in each layer of the CNN. In training the base model 30 and the target model 40, the weighting coefficients are updated.
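  • The convolution performed in each layer can be illustrated with a minimal pure-Python sketch. It assumes no padding and a single channel, and the function name `conv2d`, the 4x4 input, and the 2x2 kernel are hypothetical values chosen for illustration, not values taken from the disclosure.

```python
def conv2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and sum the weighted pixels."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
kernel = [  # the weighting coefficients that learning would update
    [1, 0],
    [0, -1],
]
feature_map = conv2d(image, kernel, stride=2)
```

  • Updating the model by learning corresponds to changing the entries of `kernel`; the sliding-window computation itself stays the same.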
  • The base model 30 and the target model 40 may be configured as, for example, VGG16 or ResNet50.
  • the base model 30 and the target model 40 are not limited to these examples, and may be configured as various other models.
  • the base model 30 includes a first base model 31 and a second base model 32 .
  • the target model 40 includes a first target model 41 and a second target model 42 .
  • The first base model 31 and the first target model 41 are also called backbones.
  • the second base model 32 and the second target model 42 are also called heads.
  • Base model 30 and target model 40 include a backbone and a head.
  • each trained model included in the target model 40 may be different from the trained model included in the base model 30 .
  • each of the trained models included in the target model 40 may be subjected to a different learning process than each of the trained models included in the base model 30 . More specifically, the learning process may be performed using teacher data containing information different from each other.
  • the pre-learning model included in the target model 40 may be the same model as the pre-learning model included in the base model 30 .
  • the backbone is configured to output the result of extracting the feature quantity of the input information.
  • the feature quantity represents, for example, the feature of the appearance of the learning object as a numerical value.
  • the head is configured to make predetermined decisions about the input information based on the output of the backbone. Specifically, the head may output the recognition result of the recognition target included in the input information based on the feature amount of the input information output by the backbone. That is, the head is configured to perform recognition of the recognition target as a predetermined determination.
  • the feature quantity can be a parameter representing the ratio of striped area on the body surface.
  • the predetermined determination may be to determine whether the recognition target is a horse or a zebra by comparing the area ratio of the striped pattern on the body surface with a threshold value.
  • the feature quantity may be a parameter representing the size or the number of holes in the shell.
  • the predetermined determination may be comparing the size or the number of holes in the shell with a threshold value to determine whether the recognition target is an abalone or a tokobushi.
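  • The division of roles between backbone and head in the horse/zebra example can be sketched as follows. This is a hand-written illustration under stated assumptions: the binary stripe mask as input, the function names, and the threshold of 0.3 are all hypothetical, whereas in the disclosure both parts are learned CNN components.

```python
def backbone(stripe_mask):
    """Extract a feature quantity: the ratio of striped area on the body surface."""
    total = sum(len(row) for row in stripe_mask)
    striped = sum(sum(row) for row in stripe_mask)
    return striped / total


def head(stripe_ratio, threshold=0.3):
    """Make the predetermined determination by comparing the feature with a threshold."""
    return "zebra" if stripe_ratio >= threshold else "horse"


def recognize(stripe_mask):
    return head(backbone(stripe_mask))


striped_animal = [[1, 0, 1, 0], [0, 1, 0, 1]]  # 1 marks a striped pixel
plain_animal = [[0, 0, 0, 0], [0, 0, 0, 1]]
results = [recognize(striped_animal), recognize(plain_animal)]
```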
  • the image adapter 50 may be configured as a CNN with multiple layers, as illustrated in FIG.
  • the image adapter 50 is configured to convert information input to the base model 30 or the target model 40 before being input to the base model 30 or the target model 40 .
  • the image adapter 50 is coupled to the input side of the target model 40 in FIG. 3, but can also be coupled to the input side of the base model 30.
  • The block labeled "Conv" represents executing convolution. Convolution is also called downsampling. Also, the block labeled "Conv Trans" represents executing transposed convolution. Transposed convolution is also called upsampling. Transposed convolution is sometimes referred to as deconvolution.
  • The block labeled "Conv 4x4" represents that the size of the filter used to perform the convolution on the two-dimensional data is 4x4.
  • A filter, also called a kernel, corresponds to a set of weighting coefficients used in performing a convolution or deconvolution of the information input to the block.
  • The block labeled "Conv Trans 4x4" represents that the size of the filter used to perform the transposed convolution on the two-dimensional data is 4x4.
  • The block labeled "stride 2" represents shifting the filter by two elements when performing convolution or transposed convolution. Conversely, blocks without "stride 2" indicate that the filter is shifted by one element when performing convolution or transposed convolution.
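  • The effect of a 4x4 filter with stride 2 on the spatial size can be checked with the standard output-size formulas for convolution and transposed convolution. The padding of 1 and the 64-pixel input below are assumptions made for illustration; the disclosure does not state them.

```python
def conv_out_size(size, kernel, stride, padding=0):
    """Output length of a convolution along one spatial dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_transpose_out_size(size, kernel, stride, padding=0):
    """Output length of a transposed convolution along one spatial dimension."""
    return (size - 1) * stride - 2 * padding + kernel


# "Conv 4x4, stride 2" halves the size; "Conv Trans 4x4, stride 2" restores it.
down = conv_out_size(64, kernel=4, stride=2, padding=1)
up = conv_transpose_out_size(down, kernel=4, stride=2, padding=1)
```

  • Under these assumptions a downsampling block followed by an upsampling block returns an image of the original size, which is what allows the adapter output to be fed to a model that expects the original input dimensions.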
  • When the image adapter 50 is coupled to the input side of the base model 30, it converts the pseudo information or real information input for learning and outputs the result to the base model 30. If the pseudo information or real information is an image, the image adapter 50 converts the input image and outputs it to the base model 30. When coupled to the input side of the target model 40, the image adapter 50 converts and outputs the image of the recognition target included in the input information input to the trained model 70. The image adapter 50 may also convert the form of the input image before outputting it, for example by emphasizing the edges of the image or brightening its shaded portions. The image adapter 50 converts the input into a form that allows the coupled target model 40 to process the task correctly. For example, if the task is recognition of an object included in an image, the image adapter 50 converts the form of the image so that the base model 30 or the target model 40 can output a result in which the recognition target is correctly recognized.
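  • As a rough illustration of "brightening the shaded portion of the image", the sketch below applies gamma correction to a grayscale image with pixel values in the range 0 to 1. This fixed, hand-written transform is a swapped-in stand-in for the learned image adapter; the function name and the gamma value of 0.5 are hypothetical.

```python
def brighten_shadows(image, gamma=0.5):
    """Raise each pixel in [0, 1] to the power gamma; gamma < 1 lifts dark pixels most."""
    return [[pixel ** gamma for pixel in row] for row in image]


dark_image = [
    [0.04, 0.25],
    [0.81, 1.00],
]
converted = brighten_shadows(dark_image)
# Increase in brightness for the darkest and the brightest non-saturated pixel.
gain_dark = converted[0][0] - dark_image[0][0]
gain_bright = converted[1][0] - dark_image[1][0]
```

  • The darkest pixel gains more brightness than the already bright one, while a fully white pixel is left unchanged.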
  • the control unit 22 of the trained model generating device 20 can generate the trained model 70 by executing the operations schematically shown in FIG. 4, for example.
  • the operation of the trained model generation device 20 will be described below with reference to FIG.
  • The control unit 22 generates at least one base model 30 as a first step. Specifically, the control unit 22 acquires the first pseudo information as teacher data from the information generation unit 26. The control unit 22 generates the base model 30 by learning based on the first pseudo information. The control unit 22 updates the base model 30 so as to increase the probability that the information output from the base model 30 being trained is the information representing the learning target included in the first pseudo information. The control unit 22 may update the base model 30 by updating the weighting coefficients of the base model 30. Before starting learning, the base model 30 may be in a predetermined initial state. That is, the weighting coefficients of the base model 30 may be set to predetermined initial values.
  • the control unit 22 can generate the base model 30 by learning based on the first pseudo information. Since the learning for generating the base model 30 is executed prior to the learning for generating the image adapter 50 in the second step, which will be described later, it can be said to be pre-learning.
  • The control unit 22 has been described as acquiring the first pseudo information from the information generation unit 26 as teacher data, but the present disclosure is not limited to this.
  • As teacher data, not only the first pseudo information but also the first real information can be used.
  • The second pseudo information or the second real information may also be used as the teacher data.
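  • The first step, in which weighting coefficients start from predetermined initial values and are updated so that the probability of the correct output increases, can be illustrated with a one-dimensional logistic classifier trained by gradient descent. This toy model is an assumption made for brevity and is not the CNN of the disclosure; the data, learning rate, and step count are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_base_model(samples, labels, lr=0.5, steps=200):
    """Gradient descent on the log loss of p(label = 1 | x) = sigmoid(w * x + b)."""
    w, b = 0.0, 0.0  # predetermined initial values
    n = len(samples)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            p = sigmoid(w * x + b)
            grad_w += (p - y) * x / n
            grad_b += (p - y) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy first pseudo information: one feature per sample with a class label.
pseudo_features = [-2.0, -1.0, 1.0, 2.0]
pseudo_labels = [0, 0, 1, 1]
w, b = train_base_model(pseudo_features, pseudo_labels)
prob_positive = sigmoid(w * 2.0 + b)
prob_negative = sigmoid(w * -2.0 + b)
```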
  • the control unit 22 generates x base models 30 .
  • The x base models 30 are distinguished from one another as the first base model 301 to the x-th base model 30x.
  • the control unit 22 acquires different pieces of information as the first pseudo information used for learning to generate each base model 30 .
  • the first base model 301 includes a first base model 311 and a second base model 321 .
  • the x-th base model 30x includes a first base model 31x and a second base model 32x.
  • the control unit 22 generates the image adapter 50 as a second step. Specifically, the control unit 22 may further acquire actual information as teacher data from the information generation unit 26 .
  • the control unit 22 updates the image adapter 50 by learning based on the first pseudo information and real information while the image adapter 50 is connected to the learned base model 30 generated in the first step.
  • the controller 22 may update the image adapter 50 by updating the weighting coefficients of the image adapter 50 .
  • the control unit 22 acquires different information as actual information used for learning for generating each base model 30 .
  • the image adapter 50 coupled to the base model 30 may be in a predetermined initial state. That is, the weighting factor of the image adapter 50 may be set to a predetermined initial value.
  • The image adapter 50a being trained, which is updated by learning, is represented by a black rectangle.
  • The control unit 22 has been described as updating the image adapter 50 by learning based on the first pseudo information and the real information while the image adapter 50 is coupled to the trained base model 30 generated in the first step, but the present disclosure is not limited to this.
  • the control unit 22 may perform learning based on only one of the first pseudo information and the real information to update the image adapter 50 .
  • the control unit 22 learns based on the first pseudo information or real information corresponding to each base model 30 while the image adapter 50a being learned is connected to each of the x number of base models 30 .
  • the control unit 22 inputs the first pseudo information and the real information to the image adapter 50a under learning, and inputs the output of the image adapter 50a under learning to each of the x base models 30 for learning.
  • the control unit 22 generates the image adapter 50 by updating the image adapter 50 through learning.
  • The control unit 22 updates the image adapter 50 so that the information output from each base model 30 when the first pseudo information is input via the image adapter 50 and the information output from each base model 30 when the actual information is input via the image adapter 50 become closer to each other.
  • The control unit 22 may update the image adapter 50 so as to increase the probability that the information output from each base model 30 when the first pseudo information is input via the image adapter 50 matches the information output from each base model 30 when the actual information is input via the image adapter 50.
  • the control unit 22 may update each base model 30 together with the image adapter 50 through learning, or may update only the image adapter 50 .
  • the control unit 22 may perform learning for each combination of one base model 30 coupled with the image adapter 50a being learned.
  • the control unit 22 may combine a plurality of combinations of one base model 30 and the image adapter 50a being learned and perform learning in parallel.
  • control unit 22 can generate the image adapter 50 through learning based on the first pseudo information and real information.
  • the learning for generating the image adapter 50 can be performed independently of the learning for generating the target model 40 in the third step, which will be described later.
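The second step described above can be illustrated with a minimal sketch. It assumes a toy setting that is not in the source: each "image" is a single number, the image adapter is an affine map `(a, b)`, each base model is a frozen logistic classifier `(w, c)`, and the real information is shifted relative to the pseudo domain. Only the adapter is updated, which corresponds to one of the options described above (updating only the image adapter 50). All names and values are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def bce(p, t, eps=1e-12):
    # Binary cross-entropy for one prediction p and one label t.
    return -(t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps))

def mean_loss(adapter, base_models, samples):
    # Average loss over every (base model, sample) pair.
    a, b = adapter
    total = 0.0
    for w, c in base_models:
        for x, t in samples:
            total += bce(sigmoid(w * (a * x + b) + c), t)
    return total / (len(base_models) * len(samples))

def train_adapter(adapter, base_models, samples, lr=0.02, epochs=500):
    # Gradient descent on the adapter only; the base models stay frozen.
    a, b = adapter
    n = len(base_models) * len(samples)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for w, c in base_models:
            for x, t in samples:
                p = sigmoid(w * (a * x + b) + c)
                grad_a += (p - t) * w * x   # d(loss)/da
                grad_b += (p - t) * w       # d(loss)/db
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

# Hypothetical frozen base models whose decision boundary lies near 0.
BASE_MODELS = [(2.0, 0.0), (3.0, 0.1), (1.5, -0.1)]
# Hypothetical real samples: scaled and offset relative to the pseudo domain.
REAL_SAMPLES = [(0.5 * u + 3.0, 1.0 if u > 0 else 0.0)
                for u in (-2.0, -1.0, 1.0, 2.0)]

identity_adapter = (1.0, 0.0)
trained_adapter = train_adapter(identity_adapter, BASE_MODELS, REAL_SAMPLES)
```

Because the same adapter parameters receive gradients from every base model, the adapter is pushed toward a conversion that helps all of them rather than one in particular.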
  • the control unit 22 generates a target model 40 as a third step. Specifically, the control unit 22 acquires the second pseudo information as teacher data from the information generation unit 26 . As the second pseudo information, the control unit 22 acquires information of a task that is the same as or related to the first pseudo information used for learning to generate the base model 30 . The control unit 22 generates the target model 40 by learning based on the second pseudo information. The control unit 22 inputs the second pseudo information to the target model 40 without converting it with the image adapter 50 . The control unit 22 updates the target model 40 so as to increase the probability that the information output from the target model 40 being learned is the information representing the learning target included in the second pseudo information. The control unit 22 may update the target model 40 by updating the weighting coefficients of the target model 40 .
  • Before starting learning, the target model 40 may be in a predetermined initial state. That is, the weighting coefficients of the target model 40 may be set to predetermined initial values.
  • the target models 40 to be updated by learning include a first target model 41a and a second target model 42a that are being learned, and are represented by black rectangles.
  • the control unit 22 can generate the target model 40 by learning based on the second pseudo information.
  • the controller 22 has been described as acquiring the second pseudo information from the information generator 26 as teacher data, but the present invention is not limited to this. As training data, not only the second pseudo information but also the second real information may be used.
  • the control unit 22 has been described as updating the target model 40 by inputting the second pseudo information to the target model 40 without conversion, but the present invention is not limited to this.
  • the control unit 22 may update the target model 40 and the image adapter 50 by combining the target model 40 and the image adapter 50 and learning using the second pseudo information, the second real information, or both.
  • the control unit 22 generates a trained model 70 by connecting the image adapter 50 to the target model 40 .
  • the control unit 22 couples the trained image adapter 50b generated in the second step to the target model 40 including the trained first target model 41b and the trained second target model 42b generated in the third step. That is, the control unit 22 transfers the image adapter 50 generated in the second step and couples it to the target model 40 .
  • the target model 40 and the image adapter 50 generated in the third step have been described as being combined, but the present invention is not limited to this.
  • As the target model 40, the base model 30 generated in the first step may be used. In this case, the third step may not be executed.
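Coupling the image adapter 50 to the target model 40 amounts to function composition: the input is first converted by the adapter and the result is passed to the target model. The sketch below assumes both parts are plain callables on lists of numbers; `couple`, `brighten`, and `sum_score` are hypothetical names introduced only for illustration.

```python
from typing import Callable, List

Image = List[float]

def couple(adapter: Callable[[Image], Image],
           target_model: Callable[[Image], List[float]]
           ) -> Callable[[Image], List[float]]:
    """Return a trained model that applies the adapter, then the target model."""
    def trained_model(image: Image) -> List[float]:
        return target_model(adapter(image))
    return trained_model

# Hypothetical stand-ins for a trained adapter and a trained target model.
def brighten(image: Image) -> Image:
    return [2.0 * v for v in image]

def sum_score(image: Image) -> List[float]:
    return [sum(image)]

trained_model = couple(brighten, sum_score)
```

Since the coupled model only depends on the two callables, the same adapter can be attached to a different target model, or to a base model used as the target model, without retraining the adapter.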
  • the controller 22 of the trained model generation device 20 may perform the above-described operations as a trained model generation method including the procedures of the flowchart illustrated in FIG.
  • the learned model generation method may be implemented as a learned model generation program that is executed by a processor that configures the control unit 22 .
  • the trained model generation program may be stored on non-transitory computer-readable media.
  • the control unit 22 acquires a plurality of base models 30 (step S1).
  • the control unit 22 may generate a plurality of base models 30 by learning based on the first pseudo information, or may acquire them from an external device.
  • the control unit 22 acquires only the plurality of base models 30 used for learning to generate the image adapter 50.
  • the control unit 22 selects at least one base model 30 from a plurality of base models 30 (step S2).
  • the control unit 22 acquires information on a learning target (step S3).
  • the control unit 22 may acquire real information of a task that is the same as or related to pseudo information used in learning for generating the selected base model 30 as learning target information.
  • the control unit 22 generates the image adapter 50 by learning based on the learning target information while the image adapter 50 is connected to the selected base model 30 (step S4). Specifically, the control unit 22 inputs real information to the image adapter 50 as learning target information. Information converted from actual information by the image adapter 50 is input to the selected base model 30 . The control unit 22 generates the image adapter 50 by updating the image adapter 50 based on the information output from the selected base model 30 .
  • the control unit 22 determines whether all base models 30 have been selected (step S5). If not all the base models 30 have been selected (step S5: NO), that is, if at least one base model 30 has not been selected, the control unit 22 returns to the procedure of step S2 and selects an unselected base model 30.
  • If all the base models 30 have been selected (step S5: YES), the control unit 22 acquires information on the recognition target (step S6). Specifically, the control unit 22 may acquire, as the information on the recognition target, second pseudo information of a task that is the same as or related to the first pseudo information used in learning for generating the selected base model 30.
  • the control unit 22 generates the target model 40 by learning based on the information of the recognition target (step S7).
  • the control unit 22 connects the image adapter 50 and the target model 40 (step S8).
  • the control unit 22 can generate the learned model 70 that combines the image adapter 50 and the target model 40 by executing the above procedure.
  • the control unit 22 ends the execution of the procedure of the flowchart of FIG.
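The procedure of steps S1 to S8 can be summarized as a small driver function. This is only a sketch of the control flow: the learning itself is delegated to callbacks, and every function and parameter name here (`run_procedure`, `get_learning_info`, `update_adapter`, and so on) is an assumption made for illustration, not a name from the source.

```python
def run_procedure(base_models, get_learning_info, update_adapter,
                  get_recognition_info, train_target_model, adapter):
    unselected = list(base_models)                       # S1: acquire base models
    while unselected:                                    # S5: repeat until all are selected
        base_model = unselected.pop(0)                   # S2: select a base model
        info = get_learning_info(base_model)             # S3: learning target information
        adapter = update_adapter(adapter, base_model, info)  # S4: update the adapter
    recognition_info = get_recognition_info()            # S6: recognition target information
    target_model = train_target_model(recognition_info)  # S7: generate the target model

    def trained_model(x):                                # S8: couple adapter and target model
        return target_model(adapter(x))
    return trained_model

# Toy stubs so the control flow can be exercised end to end.
call_log = []

def stub_update(adapter, base_model, info):
    call_log.append(base_model)
    return lambda x: adapter(x) + 1

model = run_procedure(
    base_models=["base-1", "base-2"],
    get_learning_info=lambda base_model: None,
    update_adapter=stub_update,
    get_recognition_info=lambda: None,
    train_target_model=lambda info: (lambda y: y * 10),
    adapter=lambda x: x,
)
```

In this toy run the adapter is updated once per base model, so it adds 2 to its input before the stub target model multiplies by 10.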
  • the control unit 22 may input the input information to the generated trained model 70 and evaluate the recognition accuracy of the recognition target included in the input information based on the output of the trained model 70.
  • the control unit 22 may output the generated learned model 70 to the robot control device 110 (see FIG. 6), which will be described later.
  • the trained model generation device 20 combines the image adapter 50 generated by learning in the state of being connected to the base model 30 with the target model 40 newly generated by another learning. By doing so, the trained model 70 can be generated.
  • the trained model generating device 20 generates the image adapter 50 by learning based on real information or pseudo information.
  • the trained model generating device 20 generates the target model 40 by learning based only on the pseudo information.
  • the recognition accuracy by the trained model 70 combined with the image adapter 50 generated by learning based on real information or pseudo information is improved compared to the case of using only the target model 40 . Therefore, if the image adapter 50 is generated in advance by learning based on real information or pseudo information, high recognition accuracy can be expected by combining the image adapter 50 with the target model 40 .
  • the trained model generating device 20 can increase the recognition accuracy by generating the trained model 70 by connecting the image adapter 50 . In other words, the recognition accuracy of the trained model 70 can be improved without transferring the base model 30 to the target model 40 .
  • the operation of transferring the base model 30 itself can be a constraint on the generation of the trained model 70.
  • the target model 40 may not match the desired recognition target.
  • the trained model generation device 20 according to the present embodiment does not need to transfer the base model 30 to the target model 40, so that the target model 40 can be easily matched with the model desired by the end user.
  • the image adapter 50 generated by learning in a state of being linked to each of the plurality of base models 30 is also called an upstream task because it is generated by the service provider's prior learning.
  • the trained model 70, which is generated by transferring the image adapter 50 from the upstream task and combining it with the newly generated target model 40, is also called a downstream task because it is generated according to the recognition target desired by the end user of the service.
  • In the downstream task, the trained model 70 is required to be generated with little data acquisition effort or in a short learning time so that the system can be put into operation quickly.
  • In the upstream task, on the other hand, a large amount of data and computational resources can be expended in advance in order to provide a high-quality metamodel with fast transfer learning and high generalization performance.
  • the trained model generation device 20 according to the present embodiment generates upstream tasks using a large amount of data and computational resources, so that downstream tasks can be generated with a small load, and as a result, the system can be put into operation early.
  • the trained model generation device 20 can improve the recognition accuracy for real information even in a downstream task that has not learned based on the real information.
  • the image adapter 50 is generated so as to increase the recognition accuracy for real information of each of the plurality of base models 30 generated so as to increase the recognition accuracy for pseudo information.
  • the recognition accuracy of the target model 40 newly generated in the downstream task can also be improved.
  • the generation of the image adapter 50 to improve the recognition accuracy of each of the plurality of base models 30 is also called generalization of the image adapter 50 or Generalized Image Adapter (GIA).
  • With the GIA, image quality improvements that are fundamentally useful for the task can be obtained, such as emphasizing common features that perform well in multiple base models 30 while suppressing features that are sources of noise. This improvement in image quality is expected not only to mitigate the Sim-to-Real problem, but also to improve recognition accuracy with various base models.
  • the trained model generation device 20 may generate the image adapter 50 in the upstream task and transfer the image adapter 50 generated in the upstream task to the downstream task.
  • the trained model generation device 20 may generate the image adapter 50 by learning based on the second real information or the second pseudo information only in downstream tasks.
  • <Comparison of recognition accuracy> When recognizing a recognition target from input information including a real image using a model generated by learning based only on a generated image that is pseudo information, the recognition accuracy decreases due to the difference between the generated image and the real image. Specifically, in a model that can recognize a recognition target with a probability close to 100% for a generated image, the probability that a recognition target can be recognized for a real image can drop to about 70%.
  • the trained model 70 is generated as a model in which the image adapter 50 generated by learning in a state of being connected to each of the plurality of base models 30 is connected to the target model 40 .
  • the image adapter 50 can correct errors in recognition results due to differences between the generated image and the actual image.
  • the probability that the recognition target can be recognized with respect to the real image can be increased to about 80%. That is, when the image adapter 50 is connected, the probability of recognizing the recognition target can be increased compared to when the image adapter 50 is not connected.
  • the learned model 70 according to this embodiment is generated without transferring the base model 30 . That is, it is possible to increase the probability that the recognition target can be recognized with respect to the real image without transferring the base model 30 . By not having to transfer the base model 30, the target model 40 is more likely to match the model desired by the end user.
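The approximate figures above (about 70% without the image adapter 50 and about 80% with it) can also be read as a reduction in the error rate. The helper below is a small arithmetic sketch; the function name is hypothetical and the inputs are the rounded probabilities quoted in the text, not measured data.

```python
def relative_error_reduction(accuracy_without: float, accuracy_with: float) -> float:
    """Fraction of recognition errors removed by adding the adapter."""
    error_without = 1.0 - accuracy_without
    error_with = 1.0 - accuracy_with
    return (error_without - error_with) / error_without

# About 70% without the adapter and about 80% with it, as quoted above.
reduction = relative_error_reduction(0.70, 0.80)
```

With these rounded inputs the error rate falls from about 30% to about 20%, that is, roughly one third of the errors are removed.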
  • a robot control system 100 includes a robot 2 and a robot control device 110 .
  • the robot 2 moves the work object 8 from the work start point 6 to the work target point 7 . That is, the robot control device 110 controls the robot 2 so that the work object 8 moves from the work start point 6 to the work target point 7 .
  • the work object 8 is also referred to as work object.
  • the robot control device 110 controls the robot 2 based on information regarding the space in which the robot 2 works. Information about space is also referred to as spatial information.
  • the robot 2 has an arm 2A and an end effector 2B.
  • the arm 2A may be configured as, for example, a 6-axis or 7-axis vertical articulated robot.
  • the arm 2A may be configured as a 3-axis or 4-axis horizontal articulated robot or SCARA robot.
  • the arm 2A may be configured as a 2-axis or 3-axis Cartesian robot.
  • Arm 2A may be configured as a parallel link robot or the like.
  • the number of shafts forming the arm 2A is not limited to the illustrated one.
  • the robot 2 has an arm 2A connected by a plurality of joints and operates by driving the joints.
  • the end effector 2B may include, for example, a gripping hand configured to grip the work object 8.
  • the grasping hand may have multiple fingers. The number of fingers of the grasping hand may be two or more. The fingers of the grasping hand may have one or more joints.
  • the end effector 2B may include a suction hand configured to be able to suction the work object 8 .
  • the end effector 2B may include a scooping hand configured to scoop the work object 8 .
  • the end effector 2B may include a tool such as a drill, and may be configured to be able to perform various machining operations such as drilling a hole in the work object 8 .
  • the end effector 2B is not limited to these examples, and may be configured to perform various other operations. In the configuration illustrated in FIG. 1, the end effector 2B is assumed to include a grasping hand.
  • the robot 2 can control the position of the end effector 2B by operating the arm 2A.
  • the end effector 2B may have an axis that serves as a reference for the direction in which it acts on the work object 8 . If the end effector 2B has an axis, the robot 2 can control the direction of the axis of the end effector 2B by operating the arm 2A.
  • the robot 2 controls the start and end of the action of the end effector 2B acting on the work object 8 .
  • the robot 2 can move or process the workpiece 8 by controlling the position of the end effector 2B or the direction of the axis of the end effector 2B and controlling the operation of the end effector 2B. In the configuration illustrated in FIG.
  • the robot 2 causes the end effector 2B to grip the work object 8 at the work start point 6 and moves the end effector 2B to the work target point 7 .
  • the robot 2 causes the end effector 2B to release the work object 8 at the work target point 7 . By doing so, the robot 2 can move the work object 8 from the work start point 6 to the work target point 7 .
  • the robot control system 100 further comprises a sensor 3, as shown in FIG. A sensor 3 detects physical information of the robot 2 .
  • the physical information of the robot 2 may include information on the actual position or orientation of each constituent part of the robot 2 or the velocity or acceleration of each constituent part of the robot 2 .
  • the physical information of the robot 2 may include information about forces acting on each component of the robot 2 .
  • the physical information of the robot 2 may include information about the current flowing through the motors that drive each component of the robot 2 or the torque of the motors.
  • the physical information of the robot 2 represents the result of the actual motion of the robot 2 . In other words, the robot control system 100 can grasp the result of the actual motion of the robot 2 by acquiring the physical information of the robot 2 .
  • the sensor 3 may include a force sensor or a tactile sensor that detects force acting on the robot 2, distributed pressure, slip, or the like as physical information of the robot 2.
  • the sensor 3 may include a motion sensor that detects the position or posture, or the speed or acceleration of the robot 2 as the physical information of the robot 2 .
  • the sensor 3 may include a current sensor that detects the current flowing through the motor that drives the robot 2 as the physical information of the robot 2 .
  • the sensor 3 may include a torque sensor that detects the torque of the motor that drives the robot 2 as the physical information of the robot 2 .
  • the sensor 3 may be installed in a joint of the robot 2 or in a joint driving section that drives the joint.
  • the sensor 3 may be installed on the arm 2A of the robot 2 or the end effector 2B.
  • the sensor 3 outputs the detected physical information of the robot 2 to the robot control device 110 .
  • the sensor 3 detects and outputs physical information of the robot 2 at a predetermined timing.
  • the sensor 3 outputs physical information of the robot 2 as time-series data.
  • the robot control system 100 is assumed to have two cameras 4 .
  • the camera 4 captures an image of an object, a person, or the like located within the influence range 5 that may affect the motion of the robot 2 .
  • An image captured by the camera 4 may include monochrome luminance information, or may include luminance information of each color represented by RGB (Red, Green and Blue) or the like.
  • the range of influence 5 includes the motion range of the robot 2 . It is assumed that the influence range 5 is a range obtained by expanding the motion range of the robot 2 further outward.
  • the range of influence 5 may be set so that the robot 2 can be stopped before a person or the like moving from the outside to the inside of the motion range of the robot 2 enters the inside of the motion range of the robot 2 .
  • the range of influence 5 may be set, for example, as a range that extends a predetermined distance from the boundary of the motion range of the robot 2 to the outside.
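As an illustration of setting the range of influence 5 as the motion range extended outward by a predetermined distance, the sketch below assumes a circular motion range in a 2D plane and derives the margin from an assumed approach speed and an assumed time needed to stop the robot. None of these modeling choices, names, or numbers come from the source.

```python
import math

def required_margin(approach_speed_m_s: float, stop_time_s: float) -> float:
    """Distance a person can cover while the robot is being stopped."""
    return approach_speed_m_s * stop_time_s

def in_influence_range(point, center, motion_radius_m: float, margin_m: float) -> bool:
    """True if the point lies inside the motion range extended by the margin."""
    return math.dist(point, center) <= motion_radius_m + margin_m

# Hypothetical values: 1.6 m/s approach speed, 0.5 s to stop, 1.0 m motion radius.
margin = required_margin(1.6, 0.5)
person_near = in_influence_range((1.5, 0.0), (0.0, 0.0), 1.0, margin)
person_far = in_influence_range((2.0, 0.0), (0.0, 0.0), 1.0, margin)
```

A person detected inside the extended range but still outside the motion range leaves time to stop the robot before the person enters the motion range.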
  • the camera 4 may be installed so as to capture a bird's-eye view of the influence range 5 or the motion range of the robot 2 or a peripheral area thereof.
  • the number of cameras 4 is not limited to two, and may be one or three or more.
  • the robot control device 110 acquires the learned model 70 generated by the trained model generation device 20 . Based on the image captured by the camera 4 and the learned model 70, the robot control device 110 recognizes the work object 8, the work start point 6, the work target point 7, or the like existing in the space where the robot 2 works. In other words, the robot control device 110 acquires the learned model 70 generated for recognizing the work object 8 and the like based on the image captured by the camera 4 . The robot control device 110 is also referred to as a recognition device.
  • the robot controller 110 may be configured with at least one processor to provide control and processing power to perform various functions.
  • Each component of the robot control device 110 may be configured including at least one processor.
  • a plurality of components among the components of the robot control device 110 may be realized by one processor.
  • the entire robot controller 110 may be implemented with one processor.
  • the processor may execute programs that implement various functions of the robot controller 110 .
  • a processor may be implemented as a single integrated circuit.
  • An integrated circuit is also called an IC (Integrated Circuit).
  • a processor may be implemented as a plurality of communicatively coupled integrated and discrete circuits. Processors may be implemented based on various other known technologies.
  • the robot control device 110 may include a storage unit.
  • the storage unit may include an electromagnetic storage medium such as a magnetic disk, or may include a memory such as a semiconductor memory or a magnetic memory.
  • the storage unit stores various information, programs executed by the robot control device 110, and the like.
  • the storage unit may be configured as a non-transitory readable medium.
  • the storage unit may function as a work memory for the robot control device 110 . At least part of the storage unit may be configured separately from the robot controller 110 .
  • the robot control device 110 acquires the learned model 70 in advance.
  • the robot control device 110 may store the trained model 70 in the storage unit.
  • the robot control device 110 obtains an image of the work object 8 from the camera 4 .
  • the robot control device 110 inputs the captured image of the work target 8 to the learned model 70 as input information.
  • the robot control device 110 acquires output information output from the learned model 70 according to the input of input information.
  • the robot control device 110 recognizes the work object 8 based on the output information, and performs work such as gripping and moving the work object 8 .
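The recognition flow on the robot control device 110 (acquire an image, input it to the learned model 70, read the output, and act on the recognized object) can be sketched as follows. The sketch assumes that the model returns one score per candidate label; `recognize`, the label list, and the stub model are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

def recognize(trained_model: Callable[[Sequence[float]], List[float]],
              image: Sequence[float],
              labels: List[str]) -> Tuple[str, float]:
    """Run the trained model on an image and return the best label and its score."""
    scores = trained_model(image)
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], scores[best]

# Hypothetical stub standing in for the trained model 70.
def stub_model(image: Sequence[float]) -> List[float]:
    return [0.1, 0.7, 0.2]

LABELS = ["work start point", "work object", "work target point"]
label, score = recognize(stub_model, [0.0, 0.5, 1.0], LABELS)
```

The returned label and score would then be used to decide the operation, for example gripping and moving the work object 8.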
  • the robot control system 100 can acquire the learned model 70 from the learned model generation device 20 and recognize the work object 8 by the learned model 70 .
  • the trained model generation device 20 may set the loss function so that the output when input information is input to the generated trained model 70 approaches the output when teacher data is input.
  • cross-entropy can be used as the loss function.
  • Cross-entropy is calculated as a value representing the relationship between two probability distributions. Specifically, in this embodiment, the cross-entropy is calculated as a value representing the relationship between the input pseudo information or real information and the backbone, head or adapter.
  • the trained model generation device 20 learns so that the value of the loss function becomes small.
  • the output corresponding to the input of the input information can approach the output corresponding to the input of the teacher data.
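For reference, the cross-entropy between a teacher distribution p and a predicted distribution q is H(p, q) = -Σ p_i log q_i. The sketch below computes it for discrete distributions; the small `eps` term is an assumption added only to avoid taking the logarithm of zero.

```python
import math
from typing import Sequence

def cross_entropy(p: Sequence[float], q: Sequence[float], eps: float = 1e-12) -> float:
    """Cross-entropy H(p, q) between teacher distribution p and prediction q."""
    return -sum(pi * math.log(qi + eps) for pi, qi in zip(p, q))

teacher = [1.0, 0.0]
loss_good = cross_entropy(teacher, [0.9, 0.1])   # prediction close to the teacher
loss_poor = cross_entropy(teacher, [0.6, 0.4])   # prediction farther from the teacher
```

The value shrinks as the predicted distribution approaches the teacher distribution, which is why learning that reduces the loss brings the output closer to the output for the teacher data.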
  • the control unit 22 of the trained model generation device 20 may generate the image adapter 50 by learning to optimize the loss function of a task that is the same as or related to the input information while the image adapter 50 is connected to the base model 30. Optimization of the loss function may be, for example, minimization of the value of the loss function. Loss functions of tasks that are the same as or related to the input information include the loss function of the base model 30 . On the other hand, the control unit 22 may generate the image adapter 50 by learning to optimize a loss function other than that of a task that is the same as or related to the input information while the image adapter 50 is connected to the base model 30. Such loss functions include various meaningful loss functions other than the loss function of the base model 30 .
  • Discrimination Loss is a loss function used to learn the authenticity of a generated image by labeling it with a numerical value between 1, which represents complete truth, and 0, which represents complete falsehood.
  • the control unit 22 learns the image output by the image adapter 50 when an image is input to the image adapter 50 as input information, using the correct answer as its label. By doing so, the control unit 22 can generate the image adapter 50 so that the base model 30 generated by learning based on the pseudo information cannot distinguish between an image that is real information and an image output by the image adapter 50 .
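The source does not give a formula for the Discrimination Loss. One common way to score authenticity labels between 0 and 1 is binary cross-entropy, which is what the following sketch assumes; the function name and the sample scores are hypothetical.

```python
import math
from typing import Sequence

def discrimination_loss(scores: Sequence[float], labels: Sequence[float],
                        eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between authenticity scores and labels in [0, 1]."""
    total = 0.0
    for s, y in zip(scores, labels):
        total += -(y * math.log(s + eps) + (1.0 - y) * math.log(1.0 - s + eps))
    return total / len(scores)

# Label 1 means "true" and label 0 means "false", as described above.
confident = discrimination_loss([0.9, 0.1], [1.0, 0.0])
undecided = discrimination_loss([0.5, 0.5], [1.0, 0.0])
```

Under this assumption, labeling the adapter output as the correct answer and minimizing the loss pushes the adapter toward outputs that are scored as authentic.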
  • the control unit 22 of the trained model generation device 20 generates the image adapter 50 by learning with the image adapter 50 coupled to each of the plurality of base models 30 . That is, the control unit 22 applies each of the plurality of base models 30 to pre-learning for generating the image adapter 50 .
  • When the plurality of base models 30 includes the first base model 301 to the x-th base model 30x, the control unit 22 may generate combinations in which each base model 30 is coupled to the image adapter 50 in order, and generate the image adapter 50 by learning and updating the image adapter 50 for each combination. That is, the control unit 22 may sequentially apply each of the plurality of base models 30 one by one to pre-learning for generating the image adapter 50 .
  • the control unit 22 may randomly determine the order in which the base model 30 is applied to pre-learning, or may determine it based on a predetermined rule.
  • the control unit 22 may execute in parallel a plurality of pre-learnings applying each of a plurality of combinations. That is, the control unit 22 may apply a plurality of base models 30 in parallel to pre-learning.
  • the control unit 22 may classify a plurality of base models 30 into a plurality of groups, and apply each group to pre-learning for generating the image adapter 50 in order.
  • the control unit 22 may classify a plurality of base models 30 into one group. In this case, the control unit 22 may apply the plurality of base models 30 classified into the group in parallel to pre-learning, or may apply each of the plurality of base models 30 one by one to pre-learning in order.
  • the control unit 22 may classify one base model 30 into each group.
  • the control unit 22 may randomly determine the order in which each group is applied to pre-learning, or may determine it based on a predetermined rule.
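The scheduling options above (one model at a time, groups applied in order, a single group applied in parallel, random or rule-based ordering) can be expressed as different ways of partitioning the list of base models. The helper below is a sketch with hypothetical names; it only builds the schedule and leaves the learning itself to the caller.

```python
import random
from typing import List, Optional, Sequence

def make_schedule(base_models: Sequence[str], group_size: int,
                  shuffle: bool = False, seed: Optional[int] = None) -> List[List[str]]:
    """Split base models into groups; groups are applied in order,
    and the models within a group may be applied in parallel."""
    models = list(base_models)
    if shuffle:
        random.Random(seed).shuffle(models)   # random order
    return [models[i:i + group_size] for i in range(0, len(models), group_size)]

MODELS = ["base-1", "base-2", "base-3", "base-4", "base-5"]
one_by_one = make_schedule(MODELS, group_size=1)            # sequential pre-learning
in_pairs = make_schedule(MODELS, group_size=2)              # groups applied in order
single_group = make_schedule(MODELS, group_size=len(MODELS))  # all in one group
shuffled = make_schedule(MODELS, group_size=2, shuffle=True, seed=0)
```

Choosing `group_size=1` corresponds to classifying one base model 30 into each group, and choosing the full length corresponds to classifying all base models 30 into one group.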
  • the embodiments of the trained model generation system 1 and the robot control system 100 have been described above. Embodiments of the present disclosure can also be embodied as a storage medium on which a program is recorded (for example, an optical disk, a magneto-optical disk, a CD-ROM, a CD-R, a CD-RW, a magnetic tape, a hard disk, or a memory card).
  • the implementation form of the program is not limited to an application program such as object code compiled by a compiler or program code executed by an interpreter, and may take other forms.
  • the program may or may not be configured so that all processing is performed only in the CPU on the control board.
  • the program may be configured to be partially or wholly executed by another processing unit mounted on an expansion board or expansion unit added to the board as required.
  • Embodiments according to the present disclosure are not limited to any specific configuration of the embodiments described above. Embodiments of the present disclosure can extend to any novel feature described in the present disclosure or any combination thereof, or to any novel method or process step described or any combination thereof.
  • Descriptions such as “first” and “second” in this disclosure are identifiers for distinguishing the configurations. Configurations distinguished by descriptions such as “first” and “second” in this disclosure may exchange their numbers. For example, the first pseudo information and the second pseudo information can exchange the identifiers “first” and “second”. The exchange of identifiers is done simultaneously. The configurations remain distinct after the exchange of identifiers. Identifiers may be deleted. Configurations from which identifiers have been deleted are distinguished by reference signs. The description of identifiers such as “first” and “second” in this disclosure alone should not be used as a basis for interpreting the order of the configurations or for assuming the existence of an identifier with a lower number.
  • 20 trained model generation device (22: control unit, 26: information generation unit)
  • 30 base model (31: first base model (31a: during learning, 31b: already learned), 32: second base model (32a: during learning, 32b: already learned), 301 to 30x: 1st to x-th base model, 311 to 31x: 1st to x-th first base model, 321 to 32x: 1st to x-th second base model)
  • 40 target model (41: first target model (41a: during learning, 41b: already learned), 42: second target model (42a: during learning, 42b: already learned))
  • 50 adapter (50a: during learning, 50b: already learned)
  • 70 trained model
  • 100 robot control system (2: robot, 2A: arm, 2B: end effector, 3: sensor, 4: camera, 5: range of robot influence, 6: work start table, 7: work target table, 8: work object, 110: robot control device (recognition device))

PCT/JP2022/021815 2021-05-28 2022-05-27 学習済みモデル生成装置、学習済みモデル生成方法、及び認識装置 Ceased WO2022250154A1 (ja)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP22811422.9A EP4350614A4 (en) 2021-05-28 2022-05-27 DEVICE FOR GENERATING A TRAINED MODEL, METHOD FOR GENERATING A TRAINED MODEL AND RECOGNITION DEVICE
US18/565,070 US20240265691A1 (en) 2021-05-28 2022-05-27 Trained model generating device, trained model generating method, and recognition device
CN202280037790.3A CN117396927A (zh) 2021-05-28 2022-05-27 训练模型生成装置、训练模型生成方法和识别装置
JP2023513902A JP7271809B2 (ja) 2021-05-28 2022-05-27 学習済みモデル生成装置、学習済みモデル生成方法、及び認識装置

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021090676 2021-05-28
JP2021-090676 2021-05-28

Publications (1)

Publication Number Publication Date
WO2022250154A1 true WO2022250154A1 (ja) 2022-12-01

Family

ID=84228930

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/021815 Ceased WO2022250154A1 (ja) 2021-05-28 2022-05-27 学習済みモデル生成装置、学習済みモデル生成方法、及び認識装置

Country Status (5)

Country Link
US (1) US20240265691A1
EP (1) EP4350614A4
JP (2) JP7271809B2
CN (1) CN117396927A
WO (1) WO2022250154A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2025141421A1 (ja) * 2023-12-27 2025-07-03 株式会社半導体エネルギー研究所 表示装置

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016071502A (ja) 2014-09-29 2016-05-09 セコム株式会社 Object identification device
WO2019194256A1 (ja) * 2018-04-05 2019-10-10 株式会社小糸製作所 Arithmetic processing device, object identification system, learning method, automobile, and vehicle lamp
US10565471B1 (en) * 2019-03-07 2020-02-18 Capital One Services, Llc Systems and methods for transfer learning of neural networks
US20200134469A1 (en) * 2018-10-30 2020-04-30 Samsung Sds Co., Ltd. Method and apparatus for determining a base model for transfer learning
JP2020144700A (ja) * 2019-03-07 2020-09-10 株式会社日立製作所 Image diagnostic device, image processing method, and program
JP2021056785A (ja) * 2019-09-30 2021-04-08 セコム株式会社 Image recognition system, imaging device, recognition device, and image recognition method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8005932B2 (en) * 2003-11-20 2011-08-23 Hewlett-Packard Development Company, L.P. Network discovery
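CN112633459B (zh) * 2019-09-24 2024-09-20 华为技术有限公司 Method for training a neural network, data processing method, and related apparatus
CN110781976B (zh) * 2019-10-31 2021-01-05 重庆紫光华山智安科技有限公司 Training image augmentation method, training method, and related apparatus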

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP4350614A4

Also Published As

Publication number Publication date
JP7271809B2 (ja) 2023-05-11
JPWO2022250154A1 2022-12-01
EP4350614A1 (en) 2024-04-10
CN117396927A (zh) 2024-01-12
EP4350614A4 (en) 2025-05-07
JP2023099084A (ja) 2023-07-11
US20240265691A1 (en) 2024-08-08

Similar Documents

Publication Publication Date Title
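CN111275063B (zh) Robot intelligent grasping control method and system based on 3D vision
US11338435B2 (en) Gripping system with machine learning
US11694432B2 (en) System and method for augmenting a visual output from a robotic device
JP7200610B2 (ja) Position detection program, position detection method, and position detection device
CN109079777B (zh) Robotic arm hand-eye coordination operation system
JP2025147230A (ja) Robot holding mode determination device, holding mode determination method, and robot control system
JP7271809B2 (ja) Trained model generating device, trained model generating method, and recognition device
CN117260702A (zh) Method for controlling a robot to manipulate, in particular pick up, an object
JP7271810B2 (ja) Trained model generating device, trained model generating method, and recognition device
US20240289695A1 (en) Trained model generation method, inference apparatus, and trained model generation apparatus
US20250077864A1 (en) Method for training a machine learning model for generating descriptor images for images showing one or more objects
JP7483179B1 (ja) Estimation device, learning device, estimation method, and estimation program
CN118747913A (zh) Dynamic gesture recognition method based on a graph network model
Mohammed et al. Color matching based approach for robotic grasping
Andersen et al. Using a flexible skill-based approach to recognize objects in industrial scenarios
Palliwar et al. Real-time inverse kinematics function generation using (GANs) and advanced computer vision for Robotics joints
CN115229791B (zh) Control system and method for a robotic arm with a tool
US20240331213A1 (en) Method for ascertaining a descriptor image for an image of an object
JP7717169B2 (ja) Trained model generating method, trained model generating device, trained model, and holding mode estimation device
JP2022108450A (ja) Information processing device and learning recognition system
CN120347735B (zh) Robot dynamic grasping method and system based on multimodal data fusion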
Nazari et al. Robot Finger Detection and Joints Angles Estimation using DeepLabCut
Lulu AI and Vision 3D-based Robot Arm for Object Grasping and Placing

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 22811422; Country of ref document: EP; Kind code of ref document: A1
ENP Entry into the national phase. Ref document number: 2023513902; Country of ref document: JP; Kind code of ref document: A
WWE WIPO information: entry into national phase. Ref document number: 202280037790.3; Country of ref document: CN
NENP Non-entry into the national phase. Ref country code: DE
WWE WIPO information: entry into national phase. Ref document number: 2022811422; Country of ref document: EP
ENP Entry into the national phase. Ref document number: 2022811422; Country of ref document: EP; Effective date: 20240102
WWW WIPO information: withdrawn in national office. Ref document number: 2022811422; Country of ref document: EP