WO2018168536A1

WO2018168536A1 - Learning apparatus and learning method

Info

Publication number: WO2018168536A1
Application number: PCT/JP2018/008140
Authority: WO
Inventors: Tanichi Ando; Koji Takizawa
Original assignee: Omron Corporation
Priority date: 2017-03-14
Filing date: 2018-03-02
Publication date: 2018-09-20

Abstract

A mechanism for imparting a new ability to an apparatus placed at a remote location is provided. A learning apparatus according to an aspect of the present invention includes: a learning request accepting unit configured to accept, as a learning request, designation of a learning target apparatus for which machine learning is to be performed, and designation of an ability that the learning target apparatus is to acquire through the machine learning, the learning request accepting unit being placed at a location that is remote from the learning target apparatus; a remote manipulation unit configured to remotely manipulate the learning target apparatus so as to execute an operation that is associated with the designated ability; a learning data collection unit configured to collect learning data for the machine learning of the designated ability, based on a result of remotely manipulating the learning target apparatus; and a learning processing unit configured to perform machine learning of a learning device so as to acquire the designated ability, using the collected learning data.

Description

LEARNING APPARATUS AND LEARNING METHOD

The present invention relates to a learning apparatus and a learning method.

The volume of data that can be handled in a system has been explosively increasing, owing to the advance of CPUs (Central Processing Units) and storage devices of computers, as well as networks. Such large volumes of data are called big data. Furthermore, a great number of apparatuses that serve as data sources and operation targets are connected to each other via networks, and various IoT (Internet of Things) systems have been developed as a mechanism for integrating them. Various kinds of information processing can be performed by handling big data in IoT systems. However, to perform new information processing, new abilities are imparted to applications. In recent years, the amount of development regarding this has increased greatly, and there is a situation where resources for application development are insufficient. If new AI (Artificial Intelligence) technology including deep learning enables applications to acquire new abilities, such a situation where resources are insufficient can be solved.

Artificial intelligence including neural networks has been widely researched for many years. For example, technology for recognizing objects appearing in images has been improved in many ways, and the recognition rate has been increasing gradually. In particular, introduction of deep learning has rapidly increased the object recognition rate. Deep learning technology is used in a wide variety of fields including not only image recognition, but also voice recognition, text summarization, automatic translation, autonomous driving, fault prediction, sensor data analysis, and the like. Machine learning such as this deep learning enables a machine to acquire a new ability.

For example, as techniques related to a method for equipping an apparatus with a new ability,

Patent Documents

1 and 2 propose techniques of rewriting printer firmware. Patent Documents 3 and 4 propose techniques associated with machine learning, and Patent Document 5 proposes a character identification system using deep learning.

JP 2009-134474A JP 2007-140952A JP 2014-228972A JP 5816771B JP 2015-53008A

The present inventors found that the aforementioned conventional AI technology has the following problem. That is to say, in the conventional AI technology, an apparatus to which a new ability is to be imparted, such as a robot, is prepared at hand, and this apparatus is enabled to acquire the new ability by executing machine learning processing. For this reason, if the apparatus to which a new ability is to be imparted is placed at a remote location, it has been difficult to impart the new ability to this apparatus.

To solve the foregoing problem, the present inventors examined construction of a system for imparting a new ability to an apparatus placed at a remote location by causing the apparatus to execute machine learning processing by means of remote manipulation. However, the present inventors found that this system may cause the following problem.

That is to say, in machine learning such as deep learning, usually, a large volume of data is used in learning and repeated calculation is performed many times. For this reason, a large-scale training system is used. That is to say, adequate machine power is required to perform machine learning processing. Nevertheless, there are cases where an apparatus placed at a remote location has limited machine power. In addition, the machine power of this apparatus, which is placed at a remote location, cannot be readily increased. The present inventors found that, due to the insufficient machine power, a problem may occur in that an apparatus placed at a remote location cannot perform machine learning processing to acquire a new ability.

The present invention has been made in view of the foregoing situation in an aspect, and aims to provide a technical mechanism for appropriately imparting a new ability to an apparatus placed at a remote location.

To achieve the above-stated object, the present invention employs the following configuration.

That is to say, a learning apparatus according to an aspect of the present invention includes a learning request accepting unit configured to accept, as a learning request, designation of a learning target apparatus for which machine learning is to be performed, and designation of an ability that the learning target apparatus is to acquire through the machine learning, the learning request accepting unit being placed at a remote location; a remote manipulation unit configured to remotely manipulate the learning target apparatus so as to execute an operation that is associated with the designated ability, by transmitting control data to the learning target apparatus; a learning data collection unit configured to collect learning data for the machine learning of the designated ability, based on a result of remotely manipulating the learning target apparatus; and a learning processing unit configured to perform machine learning of a learning device so as to acquire the designated ability, using the collected learning data.

The learning apparatus having this configuration accepts, as the learning request from a client, the designation of the learning target apparatus for which machine learning is to be performed, and the designation of the ability that the learning target apparatus is to acquire. Subsequently, the learning apparatus collects the learning data to be used in machine learning of the designated ability by remotely manipulating the designated learning target apparatus. The learning apparatus then carries out machine learning of the learning device so as to acquire the designated ability, using the collected learning data. Thus, a learning device for causing the learning target apparatus to carry out the designated ability can be constructed. In addition, in this configuration, the learning target apparatus placed at a remote location only executes the operation that is associated with the designated ability, and the processing for machine learning of this ability is executed by the learning apparatus. For this reason, the processing for machine learning of the ability to be acquired by the learning target apparatus can be performed even if the machine power of the learning target apparatus placed at a remote location is limited. Accordingly, this configuration can provide a technical mechanism for appropriately imparting a new ability to an apparatus (learning target apparatus) that is placed at a remote location.

Note that the learning target apparatus may not be particularly limited as long as the learning target apparatus is an apparatus that can be controlled by a computer, and may be selected as appropriate, as per an embodiment. For example, the learning target apparatus may be a robot system that is used for a production line, or surgery. The ability to be acquired may include any kind of ability that the learning target apparatus can be equipped with, and is, for example, a function that can be provided by the learning target apparatus, or information processing that can be executed by the learning target apparatus. To acquire an ability includes the learning target apparatus becoming able to carry out a new function or information processing that it is not equipped with, and the learning target apparatus becoming able to more efficiently carry out an equipped function or information processing. Furthermore, being placed at a remote location means that the learning apparatus and the learning target apparatus are physically separate from each other, and refers to, for example, placement in which a person who is present on the learning apparatus side cannot see a person who is present on the learning target apparatus side, or cannot directly hear the voice of the person who is present on the learning target apparatus side, as in the case where these apparatuses are partitioned by a wall or are placed in different buildings. Also, for example, the learning target apparatus is installed in a factory of the client who has requested the learning, the learning apparatus is installed in a building of the company that takes on the learning request, and the learning target apparatus and the learning apparatus are placed in different buildings. The present invention is particularly effective in the case where it takes time, to some extent, for an engineer who belongs to a company that manages the learning apparatus to visit the location where the learning target apparatus is placed, e.g. the learning apparatus and the learning target apparatus are placed in different prefectures. It is also favorable that machine power of the learning apparatus is higher than machine power of the learning target apparatus. The machine power may be compared based on the processing speed of a CPU, the capacity of a memory, the readout speed of the memory, or the like.

The learning apparatus according to the above-described aspect may further include: an allowable area setting unit configured to set an allowable area in which the learning target apparatus is allowed to operate, within a movable area in which the learning target apparatus can move; and a state acquisition unit configured to acquire state information indicating a state of the movable area, from a monitoring apparatus that monitors the state of the movable area. The remote manipulation unit may remotely manipulate the learning target apparatus so as to operate within the set allowable area, based on the acquired state information. With this configuration, by limiting the area in which the learning target apparatus operates to the allowable area, it is possible to reduce wasteful operations, increase the efficiency of machine learning, and ensure safety around the learning target apparatus.

In the learning apparatus according to the above-described aspect, the monitoring apparatus may be a shooting apparatus that is placed so as to capture an image of the movable area of the learning target apparatus, and the state information may be a captured image that is captured by the shooting apparatus. With this configuration, the monitoring apparatus for monitoring the state of the learning target apparatus can be constructed cheaply.

In the learning apparatus according to the above-described aspect, if a foreign object enters the set allowable area, the remote manipulation unit may temporarily stop remote manipulation of the learning target apparatus, and resume remote manipulation of the learning target apparatus after the foreign object exits from the allowable area. This configuration can ensure safety within the allowable area.

In the learning apparatus according to the above-described aspect, the learning request accepting unit may further accept, as the learning request, designation of a password that is set for the learning target apparatus to allow remote manipulation, and the remote manipulation unit may remotely manipulate the learning target apparatus after being authenticated by the learning target apparatus using the designated password. This configuration can increase the security while the learning target apparatus is remotely manipulated.

In the learning apparatus according to the above-described aspect, the learning request accepting unit may further accept, as the learning request, the designation of a time period in which remote manipulation of the learning target apparatus is allowed, and the remote manipulation unit may also remotely manipulate the learning target apparatus only during the designated time period. This configuration can limit the time period in which remote manipulation of the learning target apparatus is allowed. Thus, for example, the learning data to be used in machine learning of the learning target apparatus can be collected during a time period at night or early morning in which the learning target apparatus is not used. Accordingly, the learning target apparatus can be used more efficiently.

In the learning apparatus according to the above-described aspect, the learning request accepting unit may further accept, as the learning request, designation of a learning period in which remote manipulation of the learning target apparatus is allowed, and the remote manipulation unit may remotely manipulate the learning target apparatus during the designated learning period, and delete information used in remote manipulation of the learning target apparatus after the designated learning period passes. With this configuration, the learning period can be set. As a result, for example, the learning data to be used in machine learning of the learning target apparatus can be collected in a closure period of one or two weeks during which the learning target apparatus is not used.

The learning apparatus according to the above-described aspect may further include: a cancellation accepting unit configured to accept cancellation of the learning request; and a data deletion unit configured to delete, if cancellation of the learning request is accepted, information associated with the learning request that includes the learning data collected until cancellation of the learning request is accepted and information used in remote manipulation of the learning target apparatus. With this configuration, the machine learning request that is no longer required can be canceled, and the efficiency of the resources of the learning apparatus can thus be increased.

The learning apparatus according to the above-described aspect may further include an ability-imparting data generation unit configured to generate ability-imparting data for imparting the designated ability to the learning target apparatus by mounting the trained learning device for which the machine learning has been completed onto the learning target apparatus. With this configuration, the ability-imparting data for imparting the ability designated in the learning request to the learning target apparatus can be automatically created.

Note that the format of the ability-imparting data may not be particularly limited as long as the ability designated in the learning request can be imparted to the learning target apparatus, and may be selected as appropriate, as per an embodiment. For example, in the case where the learning target apparatus includes a predetermined learning device, the ability-imparting data may be data indicating the configuration, parameters, or the like of the learning device. Also, for example, in the case where the learning target apparatus includes an FPGA (Field-Programmable Gate Array), the ability-imparting data may be data that is written into this FPGA in order to realize a trained learning device within the FPGA. Also, for example, the ability-imparting data may be a program that can be executed on the learning target apparatus, patch data for correcting the program, or the like.

The learning apparatus according to the above-described aspect may further include a distribution unit configured to distribute the generated ability-imparting data to the learning target apparatus. With this configuration, the ability designated in the learning request can be automatically imparted to the learning target apparatus.

In the learning apparatus according to the above-described aspect, the learning device may be constituted by a neural network. With this configuration, a learning apparatus for carrying out machine learning can be relatively readily realized.

In the learning apparatus according to the above-described aspect, the learning data collection unit may generate goal data indicating a task goal to be achieved, in accordance with the designated ability, determine, based on a result of the remote manipulation, whether or not the learning target apparatus achieves the task goal indicated by the goal data, and generate the learning data by combining the goal data and the control data into a set, if the learning target apparatus achieves the task goal. With this configuration, learning data that is appropriate for machine learning of the designated ability can be appropriately collected.

A learning method according to an aspect of the present invention includes: a learning request accepting step of accepting, as a learning request, designation of a learning target apparatus that is placed at a remote location and for which machine learning is to be performed, and designation of an ability that the learning target apparatus is to acquire through the machine learning; a remote manipulation step of remotely manipulating the learning target apparatus so as to execute an operation that is associated with the designated ability, by transmitting control data to the learning target apparatus; a collection step of collecting learning data for the machine learning of the designated ability, based on a result of remotely manipulating the learning target apparatus; and a machine learning step of performing machine learning of a learning device so as to acquire the designated ability using the collected learning data. The learning request accepting step, the remote manipulation step, the collection step, and the machine learning step are executed by a computer. This configuration can provide a technical mechanism for appropriately imparting a new ability to an apparatus placed at a remote location.

The learning method according to the above-described aspect may further include: an area setting step of setting an allowable area in which the learning target apparatus is allowed to operate, within a movable area in which the learning target apparatus can move; and an information acquisition step of acquiring state information indicating a state of the movable area, from a monitoring apparatus that monitors the state of the movable area. The area setting step and the information acquisition step are executed by the computer. In the remote manipulation step, the computer may remotely manipulate the learning target apparatus so as to operate within the set allowable area, based on the acquired state information. With this configuration, by limiting the area in which the learning target apparatus operates to the allowable area, it is possible to reduce wasteful operations, increase the efficiency of machine learning, and ensure safety around the learning target apparatus.

In the learning method according to the above-described aspect, the monitoring apparatus may be a shooting apparatus that is placed so as to capture an image of the movable area of the learning target apparatus, and the state information may be a captured image that is captured by the shooting apparatus. With this configuration, the monitoring apparatus for monitoring the state of the learning target apparatus can be constructed cheaply.

In the learning method according to the above-described aspect, in the remote manipulation step, if a foreign object enters the set allowable area, the computer may temporarily stop remote manipulation of the learning target apparatus, and resume remote manipulation of the learning target apparatus after the foreign object exits from the allowable area. This configuration can ensure safety within the allowable area.

In the learning method according to the above-described aspect, in the learning request accepting step, the computer may further accept, as the learning request, designation of a password that is set for the learning target apparatus to allow remote manipulation. In the remote manipulation step, the computer may remotely manipulate the learning target apparatus after being authenticated by the learning target apparatus using the designated password. This configuration can increase the security while the learning target apparatus is being remotely manipulated.

In the learning method according to the above-described aspect, in the learning request accepting step, the computer may further accept, as the learning request, designation of a time period in which remote manipulation of the learning target apparatus is allowed, and the computer may also execute the remote manipulation step only during the designated time period. With this configuration, the time period in which remote manipulation of the learning target apparatus is allowed can be limited.

In the learning method according to the above-described aspect, in the learning request accepting step, the computer may further accept, as the learning request, designation of a learning period in which remote manipulation of the learning target apparatus is allowed. The computer may execute the remote manipulation step during the designated learning period, and delete information used in remote manipulation of the learning target apparatus after the designated learning period passes. With this configuration, the learning period can be set.

The learning method according to the above-described aspect may further include: a cancellation request accepting step of accepting cancellation of the learning request; and a deletion step of deleting, if cancellation of the learning request is accepted, information associated with the learning request that includes the learning data collected until cancellation of the learning request is accepted and information used in remote manipulation of the learning target apparatus. The cancellation request accepting step and the deletion step are executed by the computer. With this configuration, the machine learning request that is no longer required can be canceled, and the efficiency of the resources of the system that performs machine learning can thus be increased.

The learning method according to the above-described aspect may further include a generation step of generating ability-imparting data for imparting the designated ability to the learning target apparatus by mounting the trained learning device for which the machine learning has been completed onto the learning target apparatus, the step being performed by the computer. With this configuration, the ability-imparting data for imparting the ability designated in the learning request to the learning target apparatus can be automatically created.

The learning method according to the above-described aspect may further include a distribution step of distributing the generated ability-imparting data to the learning target apparatus, by the computer. With this configuration, the ability designated in the learning request can be automatically imparted to the learning target apparatus.

In the learning method according to the above-described aspect, the learning device may be constituted by a neural network. With this configuration, a learning apparatus for carrying out machine learning can be relatively readily realized.

In the learning method according to the above-described aspect, the computer may generate goal data indicating a task goal to be achieved, in accordance with the designated ability, determine, based on a result of the remote manipulation, whether or not the learning target apparatus achieves the task goal indicated by the goal data, and generate the learning data by combining the goal data and the control data into a set, if the learning target apparatus achieves the task goal. With this configuration, learning data that is appropriate for machine learning of the designated ability can be appropriately collected.

The present invention can provide a mechanism for imparting a new ability to an apparatus that is placed at a remote location.

schematically shows an example of an instance in which the present invention is applied. schematically shows an example of a hardware configuration of a learning apparatus according to an embodiment. schematically shows an example of a hardware configuration of a robot arm system according to an embodiment. schematically shows an example of an operation state of the robot arm system according to an embodiment. schematically shows an example of a software configuration of the learning apparatus according to an embodiment. schematically shows an example of a software configuration of the robot arm system according to an embodiment. shows an example of a processing procedure of the learning apparatus according to an embodiment. shows an example of a processing procedure of the robot arm system according to an embodiment. schematically shows an example of a configuration of a learning apparatus according to a modification. shows an example of a processing procedure during machine learning of the learning apparatus according to a modification. schematically shows an example of a robot arm system according to a modification.

Hereinafter, an embodiment according to an aspect of the present invention (also referred to as “the embodiment” below) will be described based on the drawings. However, the embodiment described below is merely an example of the present invention in every respect. Needless to say, various improvements and modifications may be made without departing from the scope of the present invention. That is to say, to implement the present invention, a specific configuration corresponding to an embodiment may also be employed as appropriate. For example, in the following embodiment, a robot arm system that performs a predetermined task in a factory is taken as an example of a learning target apparatus for which machine learning is to be performed. However, the object to which the present invention is to be applied is not limited to robot arm systems, and the learning target apparatus may be selected as appropriate, as per an embodiment. Note that, although data that is used in the embodiment is described using natural language, more specifically, the data is defined by pseudo-language, commands, parameters, machine language, or the like that can be recognized by a computer.

§1 Application example

First, an example of an instance where the present invention is applied will be described using FIG. 1. FIG. 1 schematically shows an example of an application instance of a learning apparatus and a learning target apparatus according to the embodiment.

As shown in FIG. 1, a learning apparatus 1 according to the embodiment is an information processing apparatus that performs machine learning to acquire a new ability that is designated for a learning target apparatus placed at a remote location, in accordance with a learning request from a client. Specifically, the learning apparatus 1 accepts, as a learning request from the client, the designation of a learning target apparatus that is placed at a remote location and for which machine learning is to be performed, and the designation of an ability that is to be acquired by the learning target apparatus through machine learning. The client uses a user terminal 4 to designate the learning target apparatus and the ability to be acquired, via a network 10.

In the embodiment, it is assumed that a robot arm system 2, which performs a predetermined task in a factory, is designated as the learning target apparatus that is to acquire the designated ability through machine learning of the learning apparatus 1. Note that the ability to be acquired may be selected as appropriate, as per an embodiment, from among any kind of ability that the robot arm system 2 can be equipped with. For example, the ability to be acquired may be an ability to carry out a new task, or an ability to more efficiently carry out a task that is utilized.

The learning apparatus 1 transmits control data to the robot arm system 2, which has been designated as the learning target apparatus, and thus remotely manipulates the robot arm system 2 so as to execute an operation for learning associated with the ability designated in the learning request. Next, the learning apparatus 1 collects learning data for machine learning of the designated ability, based on the result of remotely manipulating the robot arm system 2. The learning apparatus 1 then performs machine learning of a learning device (later-described neural network 6) so as to acquire the designated ability, using the collected learning data. The learning apparatus 1 can thus generate a trained learning device for enabling the robot arm system 2 designated as the learning target apparatus to carry out the ability designated in the learning request.

In this embodiment, the robot arm system 2 placed at a remote location only executes the operation associated with the designated ability, and machine learning processing for acquiring the designated ability is executed by the learning apparatus 1. Accordingly, even if machine power of the robot arm system 2 placed at a remote location is limited, the processing for machine learning of the ability to be acquired by the robot arm system 2 can be performed.

Accordingly, according to the embodiment it is possible to provide a technical mechanism for accepting a learning request from a normal company (client) that does not have highly skilled workers and a complicated system to be used in machine learning, and carry out machine learning in accordance with the accepted learning request. In particular, it is possible to provide a technical mechanism for appropriately imparting a new ability to an apparatus, such as the robot arm system 2, that is placed in a factory, warehouse, or the like at a remote location.

Note that being placed at a remote location means that the learning apparatus 1 is physically separate from the learning target apparatus, and refers to, for example, placement in which a person who is present on the learning apparatus side cannot see a person who is present on the learning target apparatus side, or cannot directly hear the voice of the person who is present on the learning target apparatus side, as in the case where these apparatuses are partitioned by a wall or are placed in different buildings. Also, for example, the learning target apparatus is installed in a factory of the client who has requested the learning, the learning apparatus 1 is installed in a building of the company that takes on the learning request, and the learning target apparatus and the learning apparatus 1 are placed in different buildings. Accordingly, the placement of the user terminal 4 used by the client may be selected as appropriate, as per an embodiment. For example, the user terminal 4 may also be placed in a local area network that is different from that of the learning apparatus 1 and the robot arm system 2, and may also be placed so as to be connected to the learning apparatus 1 and the robot arm system 2 via a network such as the Internet. Also, for example, the user terminal 4 may also be placed in the same local area network as that of the learning apparatus 1, and may also be placed in the same local area network as that of the robot arm system 2. Furthermore, the learning apparatus 1 may accept the learning request from the client without using the user terminal 4, by directly receiving input.

On the other hand, the robot arm system 2 according to the embodiment is an example of a learning target apparatus that is placed at a remote location relative to the learning apparatus 1 for performing machine learning to acquire the ability designated in accordance with the learning request from the client, and that is made to acquire the designated ability through machine learning of the learning apparatus 1. The robot arm system 2 according to the embodiment includes a robot arm 30, which performs a predetermined task, and a robot controller (RC) 20, which controls the robot arm 30. The robot controller may also be a PLC (programmable logic controller) or the like. Thus, the robot arm system 2 is configured to accept, from the learning apparatus 1, a command made through remote manipulation for making an instruction to carry out an operation associated with the designated ability, and execute the operation associated with the designated ability in accordance with the accepted command made through remote manipulation. That is to say, the robot arm system 2 is configured so that the RC 20 causes the robot arm 30 to execute the operation designated from the learning apparatus 1.

The robot arm system 2 also includes a display 32, which performs predetermined display. The display 32 is placed at a location that can be seen by workers in the factory who are in the vicinity of the robot arm system 2, e.g. near the robot arm 30. This display 32 is an example of a display unit. The robot arm system 2 according to the embodiment is configured to cause the display 32 to display that the robot arm system 2 is operating while being remotely manipulated by the learning apparatus 1, while executing an operation in accordance with the command made through remote manipulation.

Thus, in the embodiment, the displayed content on the display 32 can notify workers or the like in the factory that the robot arm system 2 is being remotely manipulated by the learning apparatus 1. Accordingly, according to the embodiment, safety around the robot arm system 2 can be ensured while remote manipulation is performed by the learning apparatus 1.

Note that, in the embodiment, the robot arm system 2 includes a camera 31, which monitors the state of a movable area of the robot arm 30. The camera 31 is an example of a “monitoring apparatus (shooting apparatus)” of the present invention. Also, in the embodiment, not only the robot arm system 2, but also a robot apparatus 5, which can be moved heteronomously as per an operation made by an operator, or can move autonomously, perform tasks in the factory.
§2 Configuration example
(Hardware configuration)
<Learning apparatus>

Next, an example of a hardware configuration of the learning apparatus 1 according to the embodiment will be described using FIG. 2. FIG. 2 schematically shows an example of a hardware configuration of the learning apparatus 1 according to the embodiment.

As shown in FIG. 2, the learning apparatus 1 according to the embodiment is a computer in which a control unit 11, a storage unit 12, a communication interface 13, an input device 14, an output device 15, and a drive 16 are electrically connected to each other. Note that, in FIG. 2, the communication interface is denoted as “communication I/F”.

The control unit 11 includes a CPU, a RAM (Random Access Memory), a ROM (Read Only Memory), and the like, and is configured to execute various kinds of information processing based on programs and data. The storage unit 12 is constituted by a hard disk drive, a solid-state drive, or the like, for example, and stores a learning program 121 to be executed by the control unit 11, learning data 122 to be used in the learning of the learning device, ability-imparting data 123 for imparting the ability designated by the client to the robot arm system 2, and so on.

The learning program 121 is a program for causing the learning apparatus 1 to execute later-described machine learning processing (FIG. 7). The learning data 122 is data to be used in machine learning of the ability designated by the client, and is collected from the robot arm system 2 that is remotely manipulated. The ability-imparting data 123 is data for imparting an ability acquired as a result of machine learning to the robot arm system 2. The details will be described later.

The communication interface 13 is a wired LAN (Local Area Network) module, a wireless LAN module, or the like, for example, and is an interface for performing wired or wireless communication via the network 10. The learning apparatus 1 can perform data communication with the robot arm system 2 and the user terminal 4 via the network 10, using this communication interface 13. Note that the type of the network 10 may be selected as appropriate from among the Internet, a wireless communication network, a telecommunication network, a telephone network, a dedicated network, and the like, for example.

The input device 14 is a device for performing input, such as a mouse or a keyboard, for example. The output device 15 is a device for performing output, such as a display or a speaker, for example. The operator can operate the learning apparatus 1 via the input device 14 and the output device 15.

The drive 16 is a CD drive, a DVD drive, or the like, for example, and is a drive device for loading a program stored in a storage medium 91. The type of the drive 16 may be selected as appropriate in accordance with the type of the storage medium 91. The learning program 121 may also be stored in this storage medium 91.

The storage medium 91 is a medium that accumulates information such as that of a program by means of an electric, magnetic, optical, mechanical, or chemical effect, so that a computer or other kind of apparatus, machine, or the like can read the recorded information such as that of the program. The learning apparatus 1 may also acquire the learning program 121 from this storage medium 91.

Here, FIG. 2 shows a disk-type storage medium such as a CD or DVD as an example of the storage medium 91. However, the type of storage medium 91 is not limited to the disk type, and may also be a type other than the disk type. Examples of a storage medium of a type other than the disk type may include a semiconductor memory such as a flash memory.

Note that, regarding the specific hardware configuration of the learning apparatus 1, constituent elements may be omitted, replaced, and added as appropriate, as per an embodiment. For example, the control unit 11 may also include a plurality of processors. The learning apparatus 1 may also be constituted by a plurality of information processing apparatuses. The learning apparatus 1 may also be an information processing apparatus designed exclusively for a service to be provided, as well as a general-purpose server device, a PC (Personal Computer), or the like. To execute the later-described machine learning processing, it is favorable that the learning apparatus 1 is configured to have higher machine power than that of the robot arm system 2. Note that the machine power may be specified by the processing speed of a CPU, the storage capacity of a memory, the readout speed of the memory, or the like. For example, the learning apparatus 1 may also have higher machine power than that of the robot arm system 2 as a result of having a CPU that operates at a higher speed than the RC 20 in the robot arm system 2. In the case where the CPUs in the learning apparatus 1 and the RC 20 have the same processing speed, the learning apparatus 1 may also have higher machine power than that of the robot arm system 2 as a result of the RAM in the learning apparatus 1 having more capacity or a higher speed than the RAM in the RC 20.
<Robot arm system>

Next, an example of a hardware configuration of the robot arm system 2 according to the embodiment will be described, also using FIGS. 3 and 4. FIG. 3 schematically shows an example of the hardware configuration of the RC 20 according to the embodiment. FIG. 4 schematically shows an example of an operation state of the robot arm 30 according to the embodiment. As shown in FIGS. 1 and 3, the robot arm system 2 according to the embodiment includes the RC 20, the robot arm 30, the camera 31, and the display 32. The respective constituent elements are described below.
(RC)

First, the RC 20 will be described. The RC 20 according to the embodiment is a computer in which a control unit 21, a storage unit 22, external interfaces 23, and a communication interface 24 are electrically connected to one another. The RC 20 is thus configured to control operations of the robot arm 30, the camera 31, and the display 32. Note that, in FIG. 3, each of the external interfaces and the communication interface are denoted as “external I/F” and “communication I/F”, respectively.

The control unit 21 is configured to include a CPU, a RAM, a ROM, and the like, and execute various kinds of information processing based on programs and data. The storage unit 22 is constituted by a RAM, a ROM, or the like, for example, and stores a control program 221. The control program 221 is a program for causing the RC 20 to execute later-described processing for controlling the robot arm 30 (FIG. 8). The control unit 21 is configured to execute processes in later-described steps by interpreting and executing this control program 221.

The external interfaces 23 are interfaces for connection with external devices, and are configured as appropriate in accordance with the external devices to be connected. In the embodiment, the RC 20 is connected to the robot arm 30, the camera 31, and the display 32 via the respective external interfaces 23.

The communication interface 24 is a wired LAN (Local Area Network) module, a wireless LAN module, or the like, for example, and is an interface for wired or wireless communication. The communication interface 24 is an example of a “communication unit” that is configured to communicate with other apparatuses. Using the communication interface 24, the RC 20 can communicate data with the learning apparatus 1 that is placed at a remote location, and a peripheral device (e.g. a self-running robot apparatus 5) that is placed in the vicinity of the robot arm system 2 in the factory.

Note that, regarding the specific hardware configuration of the RC 20, constituent elements may be omitted, replaced, and added as appropriate, as per an embodiment. The control unit 21 may also include a plurality of processors. The control unit 21 may also be constituted by an FPGA. The storage unit 22 may also be constituted by the RAM and ROM included in the control unit 21. The storage unit 22 may also be constituted by an auxiliary storage device such as a hard disk drive or a solid-state drive. The RC 20 may also be an information processing apparatus designed exclusively for a service to be provided, as well as a general-purpose desktop PC, a tablet PC, or the like, in accordance with the control target.
(Robot arm)

Next, the robot arm 30 will be described. The robot arm 30 may be configured to be able to execute a predetermined operation, as appropriate. In the example in FIG. 4, the robot arm 30 includes a base portion 301, which serves as a starting point, two joint portions 302, which serve as movable shafts, two link portions 303, which form a frame, and an end effector 304, which is attached to a leading end.

The joint portions 302 are each configured to include a drive motor such as a servo motor or a brushless motor, and to be able to turn or rotate a corresponding link portion 303. An angle sensor able to detect an angle, such as a rotary encoder, is attached to each of the joint portions 302. Thus, the robot arm 30 is configured to be able to specify the angles of the joint portions 302.

The end effector 304 is formed as appropriate in accordance with a task to be carried out in the factory. A force sensor configured to detect a force exerted on the end effector 304 may also be attached to this end effector 304. Thus, the robot arm 30 can be configured to detect the force exerted on the end effector 304.

The robot arm 30 has a movable area 308, which is defined in accordance with the joint portions 302, the link portions 303, and the end effector 304. That is to say, the movable area 308 is an area that can be reached by the end effector 304 with the joint portions 302 being driven. In the embodiment, an allowable area 309, in which the robot arm 30 is allowed to operate, is set within the movable area 308. The details will be described later.

Note that, regarding the specific configuration of the robot arm 30, constituent elements may be omitted, replaced, and added as appropriate, as per an embodiment. For example, the number of joint portions 302 and the number of link portions 303 may be selected as appropriate, as per an embodiment. A torque sensor may also be attached, in addition to the angle sensor, to each joint portion 302. Thus, the joint portions 302 can be controlled according to torque.
(Display)

In the embodiment, the display 32 is used for displaying the state of the robot arm system 2 (robot arm 30). For this reason, the display 32 may not be particularly limited as long as it can display the state, and may be a known liquid-crystal display, a touch panel display, or the like.
(Camera)

In the embodiment, the camera 31 is placed so as to capture an image of the state of the movable area 308 of the robot arm system 2 (robot arm 30). Thus, the state of the movable area 308 is reflected in the image captured by the camera 31. This captured image is an example of “state information” of the present invention. Note that the camera 31 may also be fixed at a predetermined location, or may also be configured to be able to change its shooting direction (orientation) using a motor or the like. The camera 31 may be a general digital camera, video camera, 360-degree camera, or the like, or may also be a camera for capturing visible light or infrared light.
<Robot apparatus>

For example, the robot apparatus 5 includes a control unit, which is constituted by a CPU or the like, a storage unit for storing programs and the like, a communication interface for communicating with the RC 20, a robot arm that is similar to the robot arm 30, a wheel module for moving heteronomously or autonomously, and so on. Thus, the robot apparatus 5 is configured to move in the factory and perform a predetermined task, as appropriate. Note that the type of the robot apparatus 5 is not particularly limited. Needless to say, the robot apparatus 5 may not be a humanoid robot, and may be selected as appropriate in accordance with tasks to be performed in the factory.
<User terminal>

For example, the user terminal 4 is a computer in which a control unit, which is constituted by a CPU or the like, a storage unit for storing programs and the like, a communication interface for communication via a network, and an input-output device are electrically connected to one another. The user terminal 4 is used by a client to make a machine learning request (learning request) to a service provider that manages the learning apparatus 1. For example, the user terminal 4 may be a desktop PC, a tablet PC, a cellular phone including a smartphone, or the like that can connect to a network.
(Software configuration)
<Learning apparatus>

Next, an example of a software configuration of the learning apparatus 1 according to the embodiment will be described using FIG. 5. FIG. 5 schematically shows an example of a software configuration of the learning apparatus 1 according to the embodiment.

The control unit 11 in the learning apparatus 1 loads the learning program 121 stored in the storage unit 12 to the RAM. The control unit 11 then interprets and executes the learning program 121 loaded to the RAM, using the CPU, and controls the constituent elements. Thus, as shown in FIG. 5, the learning apparatus 1 according to the embodiment is configured to be a computer that includes, as software modules, a learning request accepting unit 110, an allowable area setting unit 111, a state acquisition unit 112, a remote manipulation unit 113, a learning data collection unit 114, a learning processing unit 115, an ability-imparting data generation unit 116, and a distribution unit 117.

The learning request accepting unit 110 accepts, from the client, the designation of a learning target apparatus placed at a remote location and the designation of an ability that the designated learning target apparatus is to acquire through machine learning, as a learning request. In the embodiment, it is assumed that a request for machine learning of the robot arm system 2 is accepted from the client.

The allowable area setting unit 111 sets the allowable area in which the learning target apparatus is allowed to operate, within the movable area of the learning target apparatus. In the embodiment, the allowable area setting unit 111 sets an allowable area 309 within the movable area 308 of the robot arm 30.

The state acquisition unit 112 acquires state information that indicates the state of the movable area, from a monitoring apparatus that monitors the state of the movable area of the learning target apparatus. In the embodiment, the state acquisition unit 112 acquires, from the camera 31 that is placed so as to capture images of the movable area 308, an image captured by this camera 31 as the state information.

The remote manipulation unit 113 transmits control data to the learning target apparatus, and thus remotely manipulates the learning target apparatus so as to execute an operation associated with the ability designated in the learning request. The learning data collection unit 114 collects the learning data for machine learning of the designated ability, based on the remote manipulation result.

In the embodiment, the remote manipulation unit 113 remotely manipulates the robot arm 30 in the robot arm system 2 by transmitting control data for making a command that a predetermined operation be performed to the RC 20 via the network 10. At this time, the remote manipulation unit 113 remotely manipulates the robot arm system 2 so as to operate within the designated allowable area 309, based on the captured image acquired from the camera 31. The learning data collection unit 114 then collects the learning data 122, in which goal data indicating a task goal to be achieved with respect to the designated ability and sensor data obtained in the process of operation performed until this task goal is achieved are pieces of input data, and control data transmitted to the RC 20 in the process of operation performed until the task goal is achieved is training data.

The learning processing unit 115 performs machine learning of the learning device so as to acquire the designated ability, using the collected learning data. The ability-imparting data generation unit 116 generates ability-imparting data for imparting the designated ability to the learning target apparatus by mounting the trained learning device for which machine learning has been completed onto the learning target apparatus. The distribution unit 117 distributes the generated ability-imparting data to the learning target apparatus.

In the embodiment, the learning processing unit 115 performs machine learning of the neural network 6, using the learning data 122 collected from the robot arm system 2. The ability-imparting data generation unit 116 generates the ability-imparting data 123 for equipping the RC 20 with the trained neural network 6. The distribution unit 117 distributes the generated ability-imparting data 123 to the RC 20 via the network 10.
(Learning device)

Next, the learning device will be described. As shown in FIG. 5, the learning device according to the embodiment is constituted by the neural network 6. The neural network 6 is a multi-layer neural network that is to be used in so-called deep learning, and includes an input layer 61, an intermediate layer (hidden layer) 62, and an output layer 63 in this order from the input side.

Note that, in the example of FIG. 5, the neural network 6 includes one intermediate layer 62, the output from the input layer 61 is input to the intermediate layer 62, and the output from the intermediate layer 62 is input to the output layer 63. However, the number of intermediate layers 62 may not be limited to one, and the neural network 6 may also include two or more intermediate layers 62.

Each of the layers 61 to 63 includes one or more neurons. For example, the number of neurons in the input layer 61 can be set in accordance with input data to be used as an input. The number of neurons in the intermediate layer 62 can be set as appropriate, as per an embodiment. The number of neurons in the output layer 63 can be set in accordance with control data to be output. A threshold value is set for each neuron. Basically, the output of each neuron is determined based on whether or not the sum of products of the input and the weight exceeds the threshold value.

Neurons in adjacent layers are connected as appropriate, and a weight (connection weight) is set for each connection. In the example in FIG. 5, each neuron is connected to all neurons in an adjacent layer. However, the connection of neurons may not be limited to this example, and may be set as appropriate, as per an embodiment.

The learning processing unit 115 performs neural network learning processing to construct the neural network 6 so as to output control data as output values upon the goal data and sensor data included in the collected learning data 122 being input. The ability-imparting data generation unit 116 generates the ability-imparting data 123 that includes information indicating the configuration of the constructed neural network 6 (e.g. the number of layers of the neural network, the number of neurons in each layer, the connection relationship between neurons, and transfer functions of the neurons), the connection weights between neurons, and the threshold value for each neuron.

<Robot arm system>

Next, an example of a software configuration of the robot arm system 2 according to the embodiment will be described using FIG. 6. FIG. 6 schematically shows an example of the software configuration of the robot arm system 2 that includes the RC 20 according to the embodiment.

The control unit 21 in the RC 20 loads the control program 221 stored in the storage unit 22 to the RAM. The control unit 21 then interprets and executes, using the CPU, the control program 221 loaded to the RAM, and controls the constituent elements. Thus, as shown in FIG. 6, the robot arm system 2 that includes the RC 20 according to the embodiment is configured to be a computer that includes, as software modules, a remote manipulation accepting unit 211, an operation processing unit 212, a display control unit 213, and a notification unit 214.

The remote manipulation accepting unit 211 accepts, from the learning apparatus 1, a command made through remote manipulation for giving an instruction to execute an operation for learning associated with the designated ability. The operation processing unit 212 executes an operation associated with the designated ability in accordance with the accepted command made through remote manipulation. The display control unit 213 causes the display 32 to display that the operation is being performed in accordance with remote manipulation from the learning apparatus 1, while the operation is being executed in accordance with the command made through remote manipulation. The notification unit 214 notifies peripheral apparatuses (e.g. the robot apparatus 5) that the operation is being performed in accordance with remote manipulation from the learning apparatus 1, while the operation is being executed in accordance with the command made through remote manipulation.
<Others>

The software modules of the learning apparatus 1 and the robot arm system 2 (RC 20) will be described in detail in a later-described operation example. Note that the embodiment describes an example in which all of the software modules of the learning apparatus 1 and the RC 20 are realized by general-purpose CPUs. However, some or all of those software modules may also be realized by one or more dedicated processors. Regarding the software configurations of the learning apparatus 1 and the RC 20, software modules may be omitted, replaced, and added as appropriate, as per an embodiment.
§3 Operation example
(Learning apparatus)

Next, an operation example of the learning apparatus 1 will be described using FIG. 7. FIG. 7 is a flowchart showing an example of a processing procedure of the learning apparatus 1 according to the embodiment. Note that the processing procedure described below is merely an example, and each process may be modified as much as possible. Regarding the processing procedure described below, steps may be omitted, replaced, and added as appropriate, as per an embodiment.

(Step S101)
In step S101, the control unit 11 operates as the learning request accepting unit 110, and accepts the learning request from the client. This step S101 is an example of a “learning request accepting step” of the present invention. For example, the client operates the user terminal 4, designates the learning target apparatus that is placed at a remote location and for which machine learning is to be performed, and designates the ability that the learning target apparatus is to acquire through machine learning. This learning request may also be input by a person who was informed of the content of the request from the client, rather than by the client himself/herself. That is to say, the content of the request may not be input by the client himself/herself. After completing designation of the learning target apparatus and the ability to be acquired, the control unit 11 advances the processing to the next step S102.

Note that, in step S101, the learning apparatus 1 acquires the information to be used in remote manipulation of the learning target apparatus by accepting the designation of the learning target apparatus. For example, the learning apparatus 1 acquires an IP address or the like of the RC 20 as the information to be used in remote manipulation of the robot arm system 2, in accordance with the robot arm system 2 having been designated as the learning target apparatus.

The learning apparatus 1 may also accept the designation of the ability that is to be learned through machine learning, by presenting a list of abilities that can be acquired through machine learning to the client in accordance with the type of the learning target apparatus. The list of abilities to be acquired through machine learning may also be prepared in advance as a template for each learning target apparatus.

The ability to be acquired through machine learning may be selected as appropriate from all abilities that the learning target apparatus can be equipped with. For example, in the case where the robot arm 30 is used in a task such as transfer, attachment, processing, removal of burrs, soldering, welding or the like of parts, an ability to carry out this task for a new object or the like may be designated as a target of machine learning, for example. In the case where such a task is already utilized, the ability to more efficiently carry out this already-utilized task or the like may be designated as a target of machine learning, for example.

Also, in step S101, the control unit 11 may also accept the input of a condition for achievement of this ability together with the designation of the ability to be acquired through machine learning. The condition for achievement of the ability is an additional condition for the ability that the learning target apparatus is to acquire, and is a temporal condition that a certain designated task is performed within a certain number of seconds, for example.

(Step S102)
In step S102, the control unit 11 operates as the allowable area setting unit 111, and sets the allowable area in which the learning target apparatus is allowed to operate, within the movable area of the learning target apparatus designated in step S101. This step S102 is an example of an “area setting step” of the present invention. In the embodiment, the control unit 11 sets the allowable area 309 in which the robot arm 30 is allowed to operate, within the movable area 308 of the robot arm 30. After completing the setting of the allowable area 309, the control unit 11 advances processing to the next step S103.

Note that the setting of the allowable area 309 may be performed as appropriate. For example, the control unit 11 may accept the designation of the allowable area 309 from the operator. In this case, the operator sets the allowable area 309 within the movable area 308 by operating the input device 14. At this time, the learning apparatus 1 acquires a captured image from the camera 31 that captures images of the state of the movable area 308, and outputs the acquired captured image to the output device 15. The operator can thus designate the allowable area 309 within the output captured image while omitting locations that are not related to the ability designated in step S101. The allowable area 309 may also be changed in real time based on the result of image processing performed on the captured image acquired from the camera 31. For example, if it is determined as a result of image processing that a person or an object is present within the movable area 308, the allowable area 309 at this point in time may be set while excluding the portion where the person or object is present from the area.

Also, for example, the control unit 11 may also accept the designation of the allowable area 309 from the client. In this case, the control unit 11 may also accept the designation of the allowable area 309 together with the learning request in step S101. Thus, the control unit 11 can set the allowable area 309 within the movable area 308 based on the input made by the client.

Also, for example, the control unit 11 may also automatically set the allowable area 309 within the movable area 308 based on the ability designated in step S101. In this case, the control unit 11 may also specify an area that is associated with the carrying out of the ability designated in step S101, and set the specified area as the allowable area 309.

(Step S103)
In step S103, the control unit 11 operates as the state acquisition unit 112, and acquires state information that indicates the state of the movable area, from the monitoring apparatus that monitors the state of the movable area of the learning target apparatus designated in step S101. This step S103 is an example of an “information acquisition step” of the present invention. In the embodiment, the control unit 11 accesses the RC 20 using the information acquired in step S101, and captures an image of the state of the movable area 308 using the camera 31 connected to the RC 20. Thus, the control unit 11 can acquire the captured image that reflects the state of the movable area 308 as the state information. After acquiring the captured image, the control unit 11 advances the processing to the next step S104.

(Step S104)
In step S104, the control unit 11 operates as the remote manipulation unit 113, transmits control data to the learning target apparatus, and thus remotely manipulates the learning target apparatus so as to execute an operation associated with the ability designated in step S101. This step S104 is an example of a “remote manipulation step” of the present invention. In the embodiment, the control unit 11 transmits the control data for making a command that a predetermined operation associated with the ability designated in step S101 be performed to the RC 20 via the network 10. The control data defines the amount of driving of the drive motors for the joint portions 302, for example. As will be described later, the RC 20 drives the joint portions 302 of the robot arm 30 based on the received control data. Thus, the control unit 11 remotely manipulates the robot arm system 2. After remotely manipulating the robot arm system 2, the control unit 11 advances the processing to the next step S105.

The content of the operation performed through remote manipulation may be determined as appropriate. For example, the content of the operation performed through remote manipulation may also be determined by the operator. For example, a plurality of templates that define different operations of the robot arm 30 may also be prepared. In this case, the control unit 11 may also determine the content of the operation performed through remote manipulation by randomly selecting a template. The control unit 11 may also determine the content of the operation performed through remote manipulation so as to match the ability to be acquired that is designated in step S101, using a method such as dynamic planning, during repeated remote manipulation. Furthermore, the control unit 11 may also cause the robot arm system 2 to execute a series of operations including a plurality of steps through this remote manipulation.

Note that, in the embodiment, the allowable area 309 is set within the movable area 308 in step S102, and in step S103, the control unit 11 acquires, from the camera 31, the captured image obtained by capturing an image of the state of the movable area 308. Then, in step S104, the control unit 11 remotely manipulates the robot arm system 2 so that the robot arm 30 operates within the set allowable area 309, based on the captured image acquired from the camera 31. That is to say, the control unit 11 remotely manipulates the robot arm system 2 while checking whether or not the robot arm 30 has moved outside the allowable area 309, using the captured image acquired from the camera 31.

The control unit 11 also monitors whether or not any foreign object (e.g. a person, object etc.) has entered the allowable area 309. For example, whether or not any foreign object has entered the allowable area 309 can be determined through known image processing, such as template matching. If it is determined that a foreign object has entered the allowable area 309, the control unit 11 temporarily stops (suspends) transmission of commands made through remote manipulation to the robot arm system 2. At this time, the control unit 11 may also transmit, to the robot arm system 2, a command for announcing that the foreign object that has entered the allowable area 309 be removed therefrom. After the foreign object has been removed from the allowable area 309, the control unit 11 resumes transmitting the commands made through remote manipulation to the robot arm system 2. Thus, safety within the allowable area 309 can be ensured.

(Step S105)
In step S105, the control unit 11 operates as the learning data collection unit 114, and collects the learning data 122 for machine learning of the designated ability, based on the result of remote manipulation in step S104. This step S105 is an example of a “collection step” of the present invention. After collecting the learning data 122, the control unit 11 advances the processing to the next step S106.

The content of the learning data 122 may be determined as appropriate in accordance with the type of learning device, the type of learning target apparatus, the ability to be acquired, and the like. In the embodiment, the neural network 6 is used as the learning device. The robot arm system 2 is designated as the learning target apparatus. The ability of the robot arm 30 to carry out a new task or to more efficiently carry out an already-utilized task is designated as the ability to be acquired. It is also assumed that the RC 20 controls operations of the robot arm 30 based on the goal to be achieved and the sensor data from the angle sensors of the joint portions 302.

In this case, the control unit 11 creates goal data that indicates a task goal to be achieved in accordance with the ability to be acquired that is designated in step S101. The content of the goal data may be determined as appropriate, as per an embodiment. For example, the goal data may specify the position, angle, moving speed, or the like of the robot arm 30 in accordance with the goal of completing a target task within a predetermined time. In the case of improving operations of the robot arm 30, the control unit 11 may also determine the content of the goal data so as to improve operations of the robot arm 30 by acquiring a captured image obtained by capturing an image of operations of the robot arm 30 from the camera 31 and performing image analysis on the acquired captured image.

Next, the control unit 11 determines whether or not the robot arm system 2 has achieved the task goal indicated by the goal data, based on the result of remote manipulation in step S104. If the robot arm system 2 has achieved the task goal indicated by the goal data, the control unit 11 acquires the control data transmitted to the RC 20 in the process of operation performed until the task goal is achieved. Furthermore, the control unit 11 acquires, from the RC 20, sensor data detected from the angle sensors of the joint portions 302 in the process of operation performed until the task goal is achieved. The sensor data is an example of the state data that indicates a state of the learning target apparatus (robot arm system 2). The sensor data may be acquired before the robot arm 30 is driven in accordance with a command indicated by the control data.

The control unit 11 then sets the goal data, the sensor data, and the control data in association with one another, with the control data as training data, and with the sensor data and goal data that are obtained immediately before the operation is performed based on the control data that is set as the training data, as input data. The control unit 11 thus collects the learning data 122 that includes the goal data and the sensor data as the input data, and includes the control data as the training data. That is to say, in step S105, the control unit 11 ignores the result of remote manipulation in the case where the task goal is not achieved, and collects the learning data to be used in machine learning of the ability from the result of remote manipulation in the case where the designated ability is achieved.

Note that whether or not the task goal has been achieved may be determined as appropriate. For example, the control unit 11 may also determine whether or not the task goal has been achieved, by acquiring a captured image in which the remote manipulation result appears from the camera 31, and performing image analysis on the acquired captured image. The remote manipulation result may also be obtained by using the robot arm system 2 and various sensors (such as the angle sensors) provided therearound. In this case, the control unit 11 may also determine whether or not the task goal has been achieved, based on detection results from various sensors (such as the angle sensors) provided in and around the robot arm system 2. If, in step S101, the input of a condition for achievement of the ability was accepted together with the designation of the ability to be acquired, the control unit 11 may also determine, as appropriate, whether or not this achievement condition is satisfied.

(Step S106)
In step S106, the control unit 11 determines whether or not a sufficient number of pieces of learning data 122 has been collected. If it is determined that a sufficient number of pieces of learning data 122 has been collected, the control unit 11 advances the processing to the next step S107. On the other hand, if it is determined that a sufficient number of pieces of learning data 122 has not been collected, the control unit 11 repeats the processes in steps S103 to S105.

Note that this determination may be performed using a threshold value. That is to say, the control unit 11 may also determine whether or not a sufficient number of pieces of learning data 122 has been collected, by comparing the number of collected pieces of learning data 122 with the threshold value. At this time, the threshold value may also be set by the operator, or may also be set in accordance with the ability that is to be learned through machine learning. The method for setting the threshold value can be selected as appropriate, as per an embodiment.

(Step S107)
In step S107, the control unit 11 transmits, to the robot arm system 2, a completion notification indicating that remote manipulation for machine learning has been completed. After completing transmission of the completion notification, the control unit 11 advances the processing to the next step S108.

(Step S108)
In step S108, the control unit 11 operates as the learning processing unit 115, and performs machine learning of the neural network 6 so as to acquire the designated ability, using the learning data 122 collected in step S105. This step S108 is an example of a “machine learning step” of the present invention.

Specifically, first, the control unit 11 prepares the neural network 6 for which machine learning processing is to be performed. The configuration of the neural network 6 to be prepared, initial values of the connection weights between neurons, and initial values of the threshold values for the respective neurons may also be provided by a template, or may also be provided through input made by an operator. In the case of re-learning, the control unit 11 may also prepare the neural network 6 based on learning result data indicating the configuration of the neural network with which re-learning is to be performed, the connection weights between neurons, and the threshold value for each neuron.

Next, the control unit 11 trains the neural network 6 using the goal data and sensor data included in the learning data 122 collected in step S105 as input data, and using the control data as the training data. The neural network 6 may be trained using a gradient descent method, a stochastic gradient descent method, or the like.

For example, the control unit 11 inputs the goal data and the sensor data included in the learning data 122 to the input layer 61, and performs computation processing for the neural network 6 in the direction of forward propagation. Thus, the control unit 11 obtains output values from the output layer 63 of the neural network 6. Next, the control unit 11 calculates errors between the output values output from the output layer 63 and the control data included in the learning data 122. Subsequently, the control unit 11 calculates errors in the connection weights between neurons and in the threshold values for the respective neurons using the errors in the output values calculated by means of an error back-propagation method. The control unit 11 updates the values of the connection weights between neurons and the threshold values for the respective neurons, based on the calculated errors.

The control unit 11 performs machine learning of the neural network 6 by repeating this series of processes until the output values output from the output layer 63 match the corresponding control data, for each piece of the learning data 122. Thus, a trained neural network 6 can be constructed that outputs corresponding control data upon goal data and sensor data being input. After completing machine learning of the neural network 6, the control unit 11 advances the processing to the next step S109.

(Step S109)
In step S109, the control unit 11 operates as the ability-imparting data generation unit 116, and generates the ability-imparting data 123 for imparting the designated ability to the robot arm system 2 by equipping the robot arm system 2 (RC 20) with the trained neural network 6 for which machine learning has been completed. This step S109 is an example of a “generation step” of the present invention. After generating the ability-imparting data 123, the control unit 11 advances the processing to the next step S110.

Note that the format of the ability-imparting data 123 may be determined as appropriate, as per an embodiment. For example, in the case where the RC 20 performs computation processing using a neural network, the control unit 11 may also generate, as the ability-imparting data 123, learning result data that indicates the configuration of the neural network 6 constructed in step S108, the connection weights between neurons, and the threshold values for the respective neurons. Also, for example, in the case where the RC 20 includes an FPGA, the control unit 11 may also generate, as the ability-imparting data 123, data that is to be written in the FPGA in order to realize, within the FPGA, the neural network 6 constructed in step S108. Also, for example, the control unit 11 may also generate, as the ability-imparting data 123, a program or patch data for correcting a program so as to cause the RC 20 to execute computation processing using the neural network 6 constructed in step S108. The ability-imparting data 123 of the aforementioned formats may also be automatically generated using any known automatic program generation method or the like.

(Step S110)
In step S110, the control unit 11 operates as the distribution unit 117, and distributes the ability-imparting data 123 generated in step S109 to the robot arm system 2 via the network 10. This step S110 is an example of a “distribution step” of the present invention. The RC 20 can acquire the ability designated in step S101 by installing the received ability-imparting data 123. After completing the distribution of the ability-imparting data 123, the control unit 11 ends the processing in this operation example.

Robot arm system

Next, an operation example of the robot arm system 2 will be described using FIG. 8. FIG. 8 is a flowchart showing an example of a processing procedure of the robot arm system 2 according to the embodiment. Note that the processing procedure described below is merely an example, and each process may be modified as much as possible. Regarding the processing procedure described below, steps may be omitted, replaced, and added as appropriate, as per an embodiment.

(Step S201)
In step S201, the control unit 21 in the RC 20 operates as the remote manipulation accepting unit 211, and accepts, from the learning apparatus 1, a command made through remote manipulation for making an instruction to execute an operation associated with the designated ability. Specifically, the control unit 21 accepts, from the learning apparatus 1, a command made through remote manipulation based on control data in step S104. At this time, the control unit 21 may also receive a plurality of pieces of control data that make instructions to execute a plurality of operations. After receiving the control data, the control unit 21 advances the processing to the next step S202.

(In step S202) In step S202, the control unit 21 operates as the operation processing unit 212, and executes the operation associated with the designated ability in accordance with the command made through remote manipulation accepted in step S201. In the embodiment, the control unit 21 causes the robot arm 30 to execute an operation corresponding to the command made through remote manipulation, by driving the drive motors of the joint portions 302 based on the control data. While the robot arm 30 is executing the operation in accordance with the command made through remote manipulation in step S202, the control unit 21 executes the next steps S203 and S204.

Note that the learning apparatus 1 monitors whether or not a foreign object has entered the allowable area 309. If it is determined that a foreign object has entered the allowable area 309, the learning apparatus 1 temporarily stops carrying out remote manipulation. At this time, the control unit 21 may also cause the display 32 to display an announcement to remove the foreign object that has entered, from the allowable area 309. If the RC 20 is connected to a speaker (not shown), this announcement may also be output from the speaker. The control unit 21 may also carry out this announcement in accordance with a command from the learning apparatus 1.

(Step S203)
In step S203, the control unit 21 operates as the display control unit 213, and causes the display 32 to display that the operation is being performed in accordance with remote manipulation from the learning apparatus 1. After completing display control for the display 32, the control unit 21 advances the processing to the next step S204.

Here, the content to be displayed on the display 32 may not be particularly limited as long as the content is associated with the fact that the operation is being performed in accordance with remote manipulation from the learning apparatus 1. For example, the control unit 21 may also display “operation in progress in accordance with remote manipulation” or “learning in progress in accordance with remote manipulation” on the display 32. Also, for example, the control unit 21 may also cause the display 32 to display the content of the operation executed in accordance with remote manipulation from the learning apparatus 1, by referencing the control data.

Also, for example, if a plurality of pieces of control data are received in step S201, the control unit 21 may also cause the display 32 to display the content of the operation that is to be executed next after the operation that is being executed in step S202. At this time, the control unit 21 may also cause the display 32 to display the content of the operation that is being executed together with the content of the operation to be executed next.

Also, for example, if the operation that is being executed in step S202 is a dangerous operation or an operation executed at a higher speed than usual, the control unit 21 may also cause the display 32 to display that the operation that is being executed is a dangerous operation or an operation executed at a high speed. The display content to indicate that the operation that is being executed is a dangerous operation may be determined as appropriate, as per an embodiment. For example, the control unit 21 may also display “dangerous operation in progress” or “high-speed operation in progress” on the display 32. Also, for example, the control unit 21 may also cause the display 32 to display a message for prompting people around the robot arm 30 to be careful, as the display content to indicate that the operation that is being executed is a dangerous operation.

Note that the method for determining whether or not the operation that is being executed is a dangerous operation may be selected as appropriate, as per an embodiment. For example, the control unit 21 may also determine whether or not the operation that is being executed in step S202 is a dangerous operation, based on conditions that define dangerous operations. Also, for example, information indicating that a target operation is dangerous may be included in the control data. In this case, the control unit 21 can determine whether or not the operation that is being executed in step S202 is a dangerous operation, by referencing the control data received in step S201.

(Step S204)
In step S204, the control unit 21 operates as the notification unit 214, and notifies peripheral apparatuses (e.g. the robot apparatus 5) that the operation is being performed in accordance with remote manipulation from the learning apparatus 1, by controlling the communication interface 24. After completing this notification, the control unit 21 advances the processing to the next step S205.

Note that the peripheral apparatuses that have received this notification can recognize that the robot arm system 2 is being remotely manipulated by the learning apparatus 1. Thus, for example, in order to not inhibit the operation executed by the robot arm 30 in accordance with remote manipulation, the robot apparatus 5, which is configured to be able to move in the factory, can be configured not to approach an area near the robot arm 30 (particularly the movable area 308 or the allowable area 309) in response to receiving the notification. That is to say, it is possible to set a movement limit in accordance with remote manipulation in progress, and cause the robot apparatus 5 to move while avoiding an area near the robot arm 30.

(Step S205)
In step S205, the control unit 21 determines whether or not remote manipulation from the learning apparatus 1 has been completed. In the embodiment, upon remote manipulation being completed, the completion notification is transmitted from the learning apparatus 1 in the above-described step S107. Therefore, the control unit 21 determines whether or not remote manipulation from the learning apparatus 1 has been completed, based on whether or not the completion notification has been received. If it is determined that remote manipulation has been completed, i.e. after remote manipulation from the learning apparatus 1 has been completed, the control unit 21 advances the processing to the next step S206. On the other hand, if it is determined that remote manipulation has not been completed, the control unit 21 repeats the processing in steps S201 to S204.

(Step S206)
In step S206, the control unit 21 operates as the display control unit 213, and causes the display 32 to display that the operation executed in accordance with remote manipulation from the learning apparatus 1 has been completed. The content to be displayed on the display 32 may not be particularly limited as long as the content is associated with the fact that the operation executed in accordance with remote manipulation from the learning apparatus 1 has been completed. For example, the control unit 21 may display “remote manipulation ended” or “operation according to remote manipulation completed” on the display 32. Thus, workers around the robot arm system 2 can be notified that the operation executed in accordance with remote manipulation from the learning apparatus 1 has been completed, and that the robot arm 30 will not suddenly move. After completing this completion display, the control unit 21 advances the processing to the next step S207.

(Step S207)
In step S207, the control unit 21 operates as the notification unit 214, and notifies peripheral apparatuses (e.g. the robot apparatus 5) that the operation executed in accordance with remote manipulation from the learning apparatus 1 has been completed, by controlling the communication interface 24. After completing the notification, the control unit 21 ends the processing in this operation example.

Note that peripheral apparatuses that have received this notification can recognize that remote manipulation of the robot arm system 2 from the learning apparatus 1 has been completed. As a result, for example, the robot apparatus 5, which is configured to be able to move in the factory, can also be allowed to approach an area near the robot arm 30 (particularly the allowable area 309) when the robot arm 30 is not operating. That is to say, it is possible to cancel the movement limit in accordance with remote manipulation in progress, and allow the robot apparatus 5 to pass through the area near the robot arm 30.

Effects

As described above, in step S101, the learning apparatus 1 according to the embodiment accepts the designation of a learning target apparatus for which machine learning is to be performed and an ability that the learning target apparatus is to acquire, as a learning request from a client. Subsequently, in steps S104 and S105, the learning apparatus 1 remotely manipulates the learning target apparatus (the robot arm system 2), thereby collecting the learning data 122 to be used in machine learning of the ability designated in the learning request. Then, in step S108, the learning apparatus 1 carries out machine learning of the neural network 6 so as to acquire the ability designated in the learning request, using the learning data 122 collected. Thus, a trained neural network 6 for causing the learning target apparatus to carry out the ability designated in the learning request can be constructed. The learning target apparatus (robot arm system 2) placed at a remote location only executes, through steps S201 and S202, an operation associated with the ability designated in step S101, and the machine learning processing in step S108 is executed by the learning apparatus 1. For this reason, the processing for machine learning of the ability to be acquired by the learning target apparatus can be performed even if the machine power of the learning target apparatus placed at a remote location is limited. Accordingly, the embodiment can provide a technical mechanism for appropriately imparting a new ability to an apparatus placed at a remote location.

In step S102, the learning apparatus 1 according to the embodiment sets, within the movable area 308, the allowable area 309 in which the robot arm 30 is allowed to operate. In step S103, the learning apparatus 1 acquires a captured image that reflects the state of the movable area 308. In step S104, the learning apparatus 1 then remotely manipulates the robot arm system 2 so that the robot arm 30 operates within the set allowable area 309, based on the captured image. Accordingly, with this configuration, the area in which the robot arm 30 operates can be limited to the allowable area 309. As a result, it is possible to reduce needless operations in collecting the learning data to be used in machine learning, and to ensure the safety around the robot arm system 2 (robot arm 30).

In step S109, the learning apparatus 1 according to the embodiment generates the ability-imparting data 123. Then, in step S110, the learning apparatus 1 distributes the generated ability-imparting data 123 to the learning target apparatus (robot arm system 2). Thus, the ability designated in the learning request can be automatically added to the learning target apparatus.
§4 Modifications

Although the embodiment of the present invention has been described above in detail, the above descriptions are merely examples of the present invention in all aspects. Needless to say, various improvements and modifications may be made without departing from the scope of the present invention. For example, the following modifications are possible. Note that, in the following description, the same constituent elements as the constituent elements described in the above embodiment are assigned the same signs, and descriptions of the same points as the points described in the above embodiment are omitted as appropriate. The following modifications may be combined as appropriate.

<4.1>
The above embodiment has described the robot arm system 2 as an example of the learning target apparatus. However, the type of learning target apparatus may not be limited to this example, and may be selected as appropriate, as per an embodiment.

For example, the learning target apparatus may also be a working robot, such as the robot apparatus 5, that moves in a warehouse and performs a task such as transportation of luggage. In this case, in the learning request, a procedure for efficiently transporting luggage in the warehouse can be designated as the ability to be acquired. An area in which the working robot can move is the movable area, and the area in which the working robot is to move can be limited by setting the allowable area.

For example, the learning target apparatus may also be a vehicle capable of autonomous driving. In this case, a client can designate, in the learning request, autonomous driving on a road as the ability that the vehicle is to acquire, using a test course or the like. The client can also designate, for example, autonomous parking, which is one function performed during autonomous driving operations, as the ability that the vehicle is to acquire. In this case, to set the movable area, one of or both a camera for capturing images of the outside of the vehicle and a laser or the like for detecting an object outside the vehicle can be used. Note that a display unit, such as a display, for displaying that an operation is being performed in accordance with remote manipulation may also be attached to an outer portion of the vehicle, or may be placed at a predetermined location in the test course.

A plurality of apparatuses, rather than one apparatus, may also be designated as the learning target apparatuses. For example, the robot arm system 2 may also include a plurality of robot arms 30. In this case, in the learning request, a task that is to be performed by the plurality of apparatuses in cooperation with each other can be designated as the ability to be acquired.

<4.2>
In the embodiment, a general, forward-propagation multi-layer neural network is used as the neural network 6, as shown in FIG. 5. However, the type of the neural network 6 may not be limited to this example, and may be selected as appropriate, as per an embodiment. For example, in the case of using images as input data, the neural network 6 may also be a convolutional neural network that includes a convolutional layer and a pooling layer. Also, for example, in the case of using time-series data as input data, the neural network 6 may also be a recurrent neural network having connections that recur from the output side to the input side, e.g. from the intermediate layer to the input layer. Note that the number of layers of the neural network 6, the number of neurons in each layer, the connection relationship between neurons, and transfer functions for neurons may be determined as appropriate, as per an embodiment.

<4.3>
In the embodiment, the learning device is constituted by a neural network. However, the type of learning device may not be limited to a neural network, and may be selected as appropriate, as per an embodiment. For example, the learning device may also be a support vector machine, a self-organizing map, a learning device that is trained by means of reinforcement learning, or the like. In the case of performing machine learning by means of reinforcement learning, the machine learning process in step S108 may also be carried out while carrying out remote manipulation in step S104.

<4.4>
In the embodiment, the camera 31 will be described as an example of the monitoring apparatus for monitoring the state of the movable area 308. However, the type of monitoring apparatus may not be limited to a shooting apparatus, and may be selected as appropriate, as per an embodiment. For example, the monitoring apparatus may also be a position detection system that is constituted by one or more infrared sensors and detects an operating position of the learning target apparatus (in the embodiment, the position of the robot arm 30). In this case, in step S103, the learning apparatus 1 can acquire information indicating the detection result from the position detection system as state information.

In the above embodiments, the camera 31 is connected to the RC 20. For this reason, the learning apparatus 1 can acquire a captured image from the camera 31 via the RC 20, using information (e.g. IP address) to be used in remote manipulation of the learning target apparatus designated in step S101. However, the method by which the learning apparatus 1 acquires the state information may not be limited to this example, and may be selected as appropriate, as per an embodiment. For example, if the camera 31 can connect to the network 10, the learning apparatus 1 may also acquire information (e.g. IP address) to be used in accessing the camera 31, similar to the learning target apparatus, in step S101.

Note that, if the movable area 308 of the robot arm system 2 does not need to be monitored, steps S102 and S103 may also be omitted in the processing procedure of the learning apparatus 1. In addition, in the software configuration of the learning apparatus 1, the allowable area setting unit 111 and the state acquisition unit 112 may also be omitted. Also, if the allowable area 309 does not need to be monitored, a series of processes from temporarily stopping remote manipulation until resumption may also be omitted in the processing procedure of the learning apparatus 1.

<4.5>
In step S109, the learning apparatus 1 according to the embodiment generates the ability-imparting data 123. Then, in step S110, the learning apparatus 1 distributes the ability-imparting data 123 to the robot arm system 2, which is the learning target apparatus. However, the method for generating and distributing the ability-imparting data 123 may not be limited to this example, and may be selected as appropriate, as per an embodiment.

For example, the ability-imparting data 123 may be generated by another information processing apparatus or the operator. In this case, step S109 may also be omitted in the processing procedure of the learning apparatus 1. In addition, in the software configuration of the learning apparatus 1, the ability-imparting data generation unit 116 may also be omitted.

For example, the ability-imparting data 123 may also be stored in a storage medium such as a CD drive, a DVD drive, or a flash memory. The storage medium that stores the ability-imparting data 123 may also be distributed to the client. In this case, step S110 may also be omitted in the processing procedure of the learning apparatus 1. In addition, the distribution unit 117 may also be omitted in the software configuration of the learning apparatus 1.

Note that, if the ability-imparting data 123 is thus distributed using a storage medium, the client reads out, as appropriate, the ability-imparting data 123 from the received storage medium, and installs the ability-imparting data 123 loaded to the RC 20 in the robot arm system 2. Thus, the ability-imparting data 123 can be applied to the robot arm system 2.

<4.6>
In the embodiment, the learning apparatus 1 is constituted by one computer. However, the learning apparatus 1 may also be constituted by a plurality of computers. In this case, each computer may be equipped with some functions of the learning apparatus 1. For example, only the learning data collection unit 114 may be mounted in one computer. When carrying out machine learning, the computer on which the learning data collection unit 114 is mounted may also be lent to the client. Thus, real-time properties in processing for collecting the learning data 122 in step S104 can be improved.

<4.7>
In step S101, the control unit 11 may further accept, as the learning request, the designation of a password that is set for the learning target apparatus (the robot arm system 2) to allow remote manipulation thereof. In this case, in step S104, the control unit 11 may also remotely manipulate the robot arm system 2 after being authenticated by the robot arm system 2 with the designated password. Thus, the security when remotely manipulating the robot arm system 2 can be improved.

<4.8>
In step S101, the control unit 11 may further accept, as the learning request, the designation of a time period in which remote manipulation of the learning target apparatus (the robot arm system 2) is allowed. In this case, the control unit 11 may also execute step S104 (remote manipulation of the robot arm system 2) only during the designated time period. Thus, for example, the learning data 122 to be used in machine learning of the robot arm system 2 can be collected during a time period at night or early morning in which the robot arm system 2 is not used. Thus, the efficiency in using the robot arm system 2 can be improved.

<4.9>
In step S101, the control unit 11 may further accept, as the learning request, the designation of a learning period in which remote manipulation of the learning target apparatus (the robot arm system 2) is allowed. In this case, the control unit 11 may also execute step S104 (remote manipulation of the robot arm system 2) during the designated learning period, and delete information (e.g. IP address) that was used in remote manipulation of the robot arm system 2, after the designated learning period has passed.

<4.10>
In the embodiment, after accepting the learning request in step S101, the learning apparatus 1 executes a series of processes until the neural network 6 that has acquired the ability designated in the accepted learning request through machine learning is constructed in step S108. However, the mode of processing the learning request performed by the learning apparatus 1 may not be limited to this example. For example, the learning apparatus 1 may also be configured to able to accept cancellation of the learning request.

FIG. 9 schematically shows an example of a software configuration of a learning apparatus 1A according to a modification. As shown in FIG. 9, the learning apparatus 1A according to this modification is configured to be a computer that further includes a cancellation accepting unit 118 for accepting cancellation of the learning request, and a data deletion unit 119 for deleting, if cancellation of the learning request is accepted, information associated with the learning request, including the learning data that has been collected until the cancellation of the learning request is accepted and information that has been used in remote manipulation of the learning target apparatus, by executing the learning program 121 using the control unit 11. Note that the learning apparatus 1A is configured similar to the learning apparatus 1 except for this point.

Next, an example of a processing procedure of the learning apparatus 1A according to the modification will be described using FIG. 10. FIG. 10 shows an example of a processing procedure associated with accepting cancellation of the learning request while the processes in steps S102 to S108 are being executed. After accepting the learning request in step S101, the control unit 11 in the learning apparatus 1A starts the process in step S102, and also starts the process in the following step S301.

(Step S301)
In step S301, the control unit 11 functions as the cancellation accepting unit 118, and accepts cancellation of the learning request. This step S301 is an example of a “cancellation request accepting step” of the present invention. The client who desires cancellation of the learning request operates the user terminal 4 to make a request to cancel the learning request made in step S101, to the learning apparatus 1A. In the case of accepting the cancellation of the learning request before starting the process in step S108, the control unit 11 advances the processing to the next step S302. On the other hand, in the case of not accepting the cancellation of the learning request before starting the process in step S108, the control unit 11 omits the processing in the following step S302, and ends processing associated with the cancellation of the learning request.

(Step S302)
In step S302, the control unit 11 functions as the data deletion unit 119, and deletes information associated with the learning request including the learning data 122 that has been collected in step S105 until the cancellation of the learning request is accepted and information (e.g. IP address) that was used in remote manipulation of the robot arm system 2. This step S302 is an example of a “deletion step” of the present invention. The information associated with the learning request includes information that was used in remote manipulation of the robot arm system 2, as well as information indicating the content of the learning request designated in step S101, for example. After completing deletion of the information associated with the learning request, the control unit 11 ends processing for canceling the learning request. According to this modification, a request for machine learning that is no longer necessary can be canceled, and thus, the efficiency of resources in the learning apparatus can be increased.

<4.11>
In the embodiment, in step S206, the control unit 21 causes the display 32 to display that remote manipulation has been completed. However, the processing in step S206 may also be omitted in the processing procedure of the robot arm system 2.

In the embodiment, in steps S204 and S207, the control unit 21 notifies peripheral apparatuses of the state of the robot arm system 2. However, at least one of these steps S204 and S207 may also be omitted in the processing procedure of the robot arm system 2. In the case of omitting both steps S204 and S207, the notification unit 214 may also be omitted in the software configuration of the robot arm system 2.

Note that the processing order between steps S203 and S204 may also be reversed. Similarly, the processing order between steps S206 and S207 may also be reversed.

<4.12>
In the embodiment, the display 32 is used as the display unit for displaying the state of the robot arm system 2. However, the type of the display unit may not be limited to a display, and may be selected as appropriate, as per an embodiment. For example, as shown in FIG. 11, an indicator lamp may also be used as the display unit.

FIG. 11 schematically shows an example of a configuration of a robot arm system 2B according to this modification. In the robot arm system 2B, the RC 20 is connected to an indicator lamp 33 via an external interface 23. For example, the indicator lamp 33 may also be an LED (light emitting diode) lamp, a neon lamp, or the like.

In this case, in step S203, the control unit 21 may also cause the indicator lamp 33 to indicate that an operation is being executed in accordance with remote manipulation from the learning apparatus 1, by causing the indicator lamp 33 to emit light in a first display mode. Then, in step S206, the control unit 21 may also cause the indicator lamp 33 to indicate that the operation executed in accordance with remote manipulation from the learning apparatus 1 has been completed, by causing the indicator lamp 33 to emit light in a second display mode that is different from the first display mode.

Note that the display mode is determined based on an element that affects the visual sense of a person who sees the indicator light 33, such as color or blinking speed. For example, in step S203, the control unit 21 may also cause the indicator lamp 33 to emit red light as the first display mode. In step S206, the control unit 21 may also cause the indicator lamp 33 to emit blue light as the second display mode. Thus, a display unit for displaying the state of the learning target apparatus can be configured cheaply.

<4. 13>
In the above embodiment, in step S105, the control unit 11 generates the learning data 122 by combining the sensor data, the goal data, and the control data into a set. Of these data, the sensor data is an example of state data that indicates a state of the learning target apparatus. However, the type of state data may not be limited to the sensor data, and may be selected as appropriate, as per an embodiment. If the state data is not required when controlling operations of the learning target apparatus, this state data may also be omitted from the learning data. In the above embodiment, in step S105, the control unit 11 may also generate the learning data 122 by combining the goal data and the control data into a set.

(Note 1)
A learning apparatus including:
a learning request accepting unit configured to accept, as a learning request, designation of a learning target apparatus for which machine learning is to be performed, and designation of an ability that the learning target apparatus is to acquire through the machine learning, the learning request accepting unit being placed at a remote location;
a remote manipulation unit configured to remotely manipulate the learning target apparatus so as to execute an operation that is associated with the designated ability, by transmitting control data to the learning target apparatus;
a learning data collection unit configured to collect learning data for the machine learning of the designated ability, based on a result of remotely manipulating the learning target apparatus; and
a learning processing unit configured to perform machine learning of a learning device so as to acquire the designated ability, using the collected learning data.

(Note 2)
The learning apparatus according to Note 1, further includes:
an allowable area setting unit configured to set an allowable area in which the learning target apparatus is allowed to operate, within a movable area in which the learning target apparatus can move; and
a state acquisition unit configured to acquire state information indicating a state of the movable area, from a monitoring apparatus that monitors the state of the movable area,
wherein the remote manipulation unit remotely manipulates the learning target apparatus so as to operate within the set allowable area, based on the acquired state information.

(Note 3)
The learning apparatus according to Note 2,
wherein the monitoring apparatus is a shooting apparatus that is placed so as to capture an image of the movable area of the learning target apparatus, and
the state information is a captured image that is captured by the shooting apparatus.

(Note 4)
The learning apparatus according to Note 2 or 3,
wherein, if a foreign object enters the set allowable area, the remote manipulation unit temporarily stops remote manipulation of the learning target apparatus, and resumes remote manipulation of the learning target apparatus after the foreign object exits from the allowable area.

(Note 5)
The learning apparatus according to any one of Notes 1 to 4,
wherein the learning request accepting unit further accepts, as the learning request, designation of a password that is set for the learning target apparatus to allow remote manipulation, and
the remote manipulation unit remotely manipulates the learning target apparatus after being authenticated by the learning target apparatus using the designated password.

(Note 6)
The learning apparatus according to any one of Notes 1 to 5,
wherein the learning request accepting unit further accepts, as the learning request, designation of a time period in which remote manipulation of the learning target apparatus is allowed, and
the remote manipulation unit remotely manipulates the learning target apparatus only during the designated time period.

(Note 7)
The learning apparatus according to any one of Notes 1 to 6,
wherein the learning request accepting unit further accepts, as the learning request, designation of a learning period in which remote manipulation of the learning target apparatus is allowed, and
the remote manipulation unit remotely manipulates the learning target apparatus during the designated learning period, and deletes information used in remote manipulation of the learning target apparatus after the designated learning period passes.

(Note 8)
The learning apparatus according to any one of Notes 1 to 7, further including:
a cancellation accepting unit configured to accept cancellation of the learning request; and
a data deletion unit configured to delete, if cancellation of the learning request is accepted, information associated with the learning request that includes the learning data collected until cancellation of the learning request is accepted and information used in remote manipulation of the learning target apparatus.

(Note 9)
The learning apparatus according to any one of Notes 1 to 8, further including:
an ability-imparting data generation unit configured to generate ability-imparting data for imparting the designated ability to the learning target apparatus by mounting the trained learning device for which the machine learning has been completed onto the learning target apparatus.

(Note 10)
The learning apparatus according to Note 9, further including:
a distribution unit configured to distribute the generated ability-imparting data to the learning target apparatus.

(Note 11)
The learning apparatus according to any one of Notes 1 to 10,
wherein the learning device is constituted by a neural network.

(Note 12)
A learning method including:
a learning request accepting step of accepting, as a learning request, designation of a learning target apparatus that is placed at a remote location and for which machine learning is to be performed, and designation of an ability that the learning target apparatus is to acquire through the machine learning;
a remote manipulation step of remotely manipulating the learning target apparatus so as to execute an operation that is associated with the designated ability, by transmitting control data to the learning target apparatus;
a collection step of collecting learning data for the machine learning of the designated ability, based on a result of remotely manipulating the learning target apparatus; and
a machine learning step of performing machine learning of a learning device so as to acquire the designated ability using the collected learning data,
the learning request accepting step, the remote manipulation step, the collection step, and the machine learning step being executed by a computer.

(Note 13)
The learning method according to Note 12, further including:
an area setting step of setting an allowable area in which the learning target apparatus is allowed to operate, within a movable area in which the learning target apparatus can move; and
an information acquisition step of acquiring state information indicating a state of the movable area, from a monitoring apparatus that monitors the state of the movable area,
the area setting step and the information acquisition step being executed by the computer,
wherein, in the remote manipulation step, the computer remotely manipulates the learning target apparatus so as to operate within the set allowable area, based on the acquired state information.

(Note 14)
The learning method according to Note 13,
wherein the monitoring apparatus is a shooting apparatus that is placed so as to capture an image of the movable area of the learning target apparatus, and
the state information is a captured image that is captured by the shooting apparatus.

(Note 15)
The learning method according to Note 13 or 14,
wherein, in the remote manipulation step, if a foreign object enters the set allowable area, the computer temporarily stops remote manipulation of the learning target apparatus, and resumes remote manipulation of the learning target apparatus after the foreign object exits from the allowable area.

(Note 16)
The learning method according to any one of Notes 12 to 15,
wherein, in the learning request accepting step, the computer further accepts, as the learning request, designation of a password that is set for the learning target apparatus to allow remote manipulation, and
in the remote manipulation step, the computer remotely manipulates the learning target apparatus after being authenticated by the learning target apparatus using the designated password.

(Note 17)
The learning method according to any one of Notes 12 to 16,
wherein, in the learning request accepting step, the computer further receives, as the learning request, designation of a time period in which remote manipulation of the learning target apparatus is allowed, and
the computer executes the remote manipulation step only during the designated time period.

(Note 18)
The learning method according to any one of Notes 12 to 17,
wherein, in the learning request accepting step, the computer further accepts, as the learning request, designation of a learning period in which remote manipulation of the learning target apparatus is allowed, and
the computer executes the remote manipulation step during the designated learning period, and deletes information used in remote manipulation of the learning target apparatus after the designated learning period passes.

(Note 19)
The learning method according to any one of Notes 12 to 18, further including:
a cancellation request accepting step of accepting cancellation of the learning request; and
a deletion step of deleting, if cancellation of the learning request is accepted, information associated with the learning request that includes the learning data collected until cancellation of the learning request is accepted and information used in remote manipulation of the learning target apparatus,
the cancellation request accepting step and the deletion step being executed by the computer.

(Note 20)
The learning method according to any one of Notes 12 to 19, further including:
a generation step of generating ability-imparting data for imparting the designated ability to the learning target apparatus by mounting the trained learning device for which the machine learning has been completed onto the learning target apparatus, the generating step being executed by the computer.

(Note 21)
The learning method according to Note 20, further including:
a distribution step of distributing the generated ability-imparting data to the learning target apparatus, by the computer.

(Note 22)
The learning method according to any one of Notes 12 to 21,
wherein the learning device is constituted by a neural network.

1,1A Learning apparatus
11 Control unit
12 Storage unit
13 Communication interface
14 Input device
15 Output device
16 Drive
110 Learning request accepting unit
111 Allowable area setting unit
112 State acquisition unit
113 Remote manipulation unit
114 Learning data collection unit
115 Learning processing unit
116 Ability-imparting data generation unit
117 Distribution unit
118 Cancellation accepting unit
119 Data deletion unit
121 Learning program
122 Learning data
123 Ability-imparting data
2 Robot arm system
20 RC
21 Control unit
22 Storage unit
23 External interface
24 Communication interface
211 Remote manipulation accepting unit
212 Operation processing unit
213 Display control unit
214 Notification unit
221 Control program
30 Robot arm
301 Base portion
302 Joint portion
303 Link portion
304 End effector
308 Movable area
309 Allowable area
31 Camera
32 Display
4 User terminal
5 Robot apparatus
6 Neural network
61 Input layer
62 Intermediate layer (hidden layer)
63 Output layer

Claims

A learning apparatus comprising:
a learning request accepting unit configured to accept, as a learning request, designation of a learning target apparatus for which machine learning is to be performed, and designation of an ability that the learning target apparatus is to acquire through the machine learning, the learning request accepting unit being placed at a remote location;
a remote manipulation unit configured to remotely manipulate the learning target apparatus so as to execute an operation that is associated with the designated ability, by transmitting control data to the learning target apparatus;
a learning data collection unit configured to collect learning data for the machine learning of the designated ability, based on a result of remotely manipulating the learning target apparatus; and
a learning processing unit configured to perform machine learning of a learning device so as to acquire the designated ability, using the collected learning data.
The learning apparatus according to claim 1, further comprising:
an allowable area setting unit configured to set an allowable area in which the learning target apparatus is allowed to operate, within a movable area in which the learning target apparatus can move; and
a state acquisition unit configured to acquire state information indicating a state of the movable area, from a monitoring apparatus that monitors the state of the movable area,
wherein the remote manipulation unit remotely manipulates the learning target apparatus so as to operate within the set allowable area, based on the acquired state information.
The learning apparatus according to claim 2,
wherein the monitoring apparatus is a shooting apparatus that is placed so as to capture an image of the movable area of the learning target apparatus, and
the state information is a captured image that is captured by the shooting apparatus.
The learning apparatus according to claim 2 or 3,
wherein, if a foreign object enters the set allowable area, the remote manipulation unit temporarily stops remote manipulation of the learning target apparatus, and resumes remote manipulation of the learning target apparatus after the foreign object exits from the allowable area.
The learning apparatus according to any one of claims 1 to 4,
wherein the learning request accepting unit further accepts, as the learning request, designation of a password that is set for the learning target apparatus to allow remote manipulation, and
the remote manipulation unit remotely manipulates the learning target apparatus after being authenticated by the learning target apparatus using the designated password.
The learning apparatus according to any one of claims 1 to 5,
wherein the learning request accepting unit further accepts, as the learning request, designation of a learning period in which remote manipulation of the learning target apparatus is allowed, and
the remote manipulation unit remotely manipulates the learning target apparatus during the designated learning period, and deletes information used in remote manipulation of the learning target apparatus after the designated learning period passes.
The learning apparatus according to any one of claims 1 to 6, further comprising:
a cancellation accepting unit configured to accept cancellation of the learning request; and
a data deletion unit configured to delete, if cancellation of the learning request is accepted, information associated with the learning request that includes the learning data collected until cancellation of the learning request is accepted and information used in remote manipulation of the learning target apparatus.
The learning apparatus according to any one of claims 1 to 7,
wherein the learning data collection unit
generates goal data indicating a task goal to be achieved, in accordance with the designated ability,
determines, based on a result of the remote manipulation, whether or not the learning target apparatus achieves the task goal indicated by the goal data, and
generates the learning data by combining the goal data and the control data into a set, if the learning target apparatus achieves the task goal.
The learning apparatus according to any one of claims 1 to 8,
wherein machine power of the learning apparatus is higher than machine power of the learning target apparatus.
A learning method comprising:
a learning request accepting step of accepting, as a learning request, designation of a learning target apparatus that is placed at a remote location and for which machine learning is to be performed, and designation of an ability that the learning target apparatus is to acquire through the machine learning;
a remote manipulation step of remotely manipulating the learning target apparatus so as to execute an operation that is associated with the designated ability, by transmitting control data to the learning target apparatus;
a collection step of collecting learning data for the machine learning of the designated ability, based on a result of remotely manipulating the learning target apparatus; and
a machine learning step of performing machine learning of a learning device so as to acquire the designated ability using the collected learning data,
the learning request accepting step, the remote manipulation step, the collection step, and the machine learning step being executed by a computer.
The learning method according to claim 10, further comprising:
an area setting step of setting an allowable area in which the learning target apparatus is allowed to operate, within a movable area in which the learning target apparatus can move; and
an information acquisition step of acquiring state information indicating a state of the movable area, from a monitoring apparatus that monitors the state of the movable area,
the area setting step and the information acquisition step being executed by the computer,
wherein, in the remote manipulation step, the computer remotely manipulates the learning target apparatus so as to operate within the set allowable area, based on the acquired state information.
The learning method according to claim 11,
wherein the monitoring apparatus is a shooting apparatus that is placed so as to capture an image of the movable area of the learning target apparatus, and
the state information is a captured image that is captured by the shooting apparatus.
The learning method according to claim 11 or 12,
wherein, in the remote manipulation step, if a foreign object enters the set allowable area, the computer temporarily stops remote manipulation of the learning target apparatus, and resumes remote manipulation of the learning target apparatus after the foreign object exits from the allowable area.
The learning method according to any one of claims 10 to 13,
wherein, in the learning request accepting step, the computer further accepts, as the learning request, designation of a password that is set for the learning target apparatus to allow remote manipulation, and
in the remote manipulation step, the computer remotely manipulates the learning target apparatus after being authenticated by the learning target apparatus using the designated password.
The learning method according to any one of claims 10 to 14,
wherein, in the learning request accepting step, the computer further accepts, as the learning request, designation of a learning period in which remote manipulation of the learning target apparatus is allowed, and
the computer executes the remote manipulation step during the designated learning period, and deletes information used in remote manipulation of the learning target apparatus after the designated learning period passes.
The learning method according to any one of claims 10 to 15, further comprising:
a cancellation request accepting step of accepting cancellation of the learning request; and
a deletion step of deleting, if cancellation of the learning request is accepted, information associated with the learning request that includes the learning data collected until cancellation of the learning request is accepted and information used in remote manipulation of the learning target apparatus,
the cancellation request accepting step and the deletion step being executed by the computer.
The learning method according claims 10 to 16,
wherein the computer
generates goal data indicating a task goal to be achieved, in accordance with the designated ability,
determines, based on a result of the remote manipulation, whether or not the learning target apparatus achieves the task goal indicated by the goal data, and
generates the learning data by combining the goal data and the control data into a set, if the learning target apparatus achieves the task goal.