WO2018139292A1

WO2018139292A1 - Control content determination device and control content determination method

Info

Publication number: WO2018139292A1
Application number: PCT/JP2018/001123
Authority: WO
Inventors: 伸裕見市; 智彦藤田; 真梨奈大野
Original assignee: パナソニックＩｐマネジメント株式会社
Priority date: 2017-01-30
Filing date: 2018-01-17
Publication date: 2018-08-02
Also published as: JP2020054408A

Abstract

A control content determination device (200) is provided with: a first acquisition unit (210) that acquires operational state information indicating the operational state of an operation assistance device (100) that mechanically assists a user operation; a determination unit (220) that (i) determines, on the basis of the operational state information, the control content for the operation assistance device (100) in accordance with a control content determination rule, or (ii) randomly determines the control content for the operation assistance device (100); an output unit (230) that outputs the determined control content; a second acquisition unit (240) that acquires comfort information indicating the comfort of the operation assistance by the operation assistance device (100); and an updating unit (250) that updates the control content determination rule on the basis of the operational state information and the comfort information. The determination unit (220) selects the determination of (ii) with a probability ε.

Description

Control content determination apparatus and control content determination method

The present invention relates to a control content determination device and a control content determination method for determining control content of an operation support device that mechanically supports a user's operation.

Conventionally, there has been proposed an operation support system that supports the standing motion of a care recipient who has difficulty walking unaided (see, for example, Patent Document 1).

JP 2016-64124 A

However, in the related art, the operation support apparatus is controlled based on a predetermined control pattern. Therefore, even if comfortable operation support can be provided for one user, there are cases where only uncomfortable operation support can be provided for another user. That is, it is difficult for the conventional technology to realize operation support adapted to individual users.

Therefore, the present invention provides a control content determination device and a control content determination method that can determine the control content of an operation support device for operation support adapted to individual users.

A control content determination apparatus according to an aspect of the present invention includes: a first acquisition unit that acquires operation state information indicating an operation state of an operation support apparatus that mechanically supports a user's operation; a determination unit that (i) determines the control content of the operation support apparatus from the operation state information according to a control content determination rule, or (ii) randomly determines the control content of the operation support apparatus; an output unit that outputs the determined control content; a second acquisition unit that acquires comfort information indicating comfort of operation support by the operation support apparatus; and an update unit that updates the control content determination rule based on the operation state information and the comfort information. The determination unit selects the determination of (ii) with a probability ε.

A control content determination method according to an aspect of the present invention includes: a first acquisition step of acquiring operation state information indicating an operation state of an operation support apparatus that mechanically supports a user's operation; a determination step of (i) determining the control content of the operation support apparatus from the operation state information according to a control content determination rule, or (ii) determining the control content of the operation support apparatus at random; an output step of outputting the determined control content; a second acquisition step of acquiring comfort information indicating comfort of operation support by the operation support apparatus; and an update step of updating the control content determination rule based on the operation state information and the comfort information. In the determination step, the determination of (ii) is selected with a probability ε.

Note that these comprehensive or specific aspects may be realized by a system, an integrated circuit, a computer program, or a recording medium such as a computer-readable CD-ROM, or by any combination of a system, an integrated circuit, a computer program, and a recording medium.

The control content determination device according to an aspect of the present invention can determine the control content of the operation support device for operation support adapted to each user.

FIG. 1 is a block diagram illustrating a configuration of the operation support system according to the first embodiment.
FIG. 2 is a perspective view of the main body of the operation support apparatus according to the first embodiment.
FIG. 3A is a diagram for explaining the operation of the main body of the operation support apparatus according to the first embodiment.
FIG. 3B is a diagram for explaining the operation of the main body of the operation support apparatus according to the first embodiment.
FIG. 3C is a diagram for explaining the operation of the main body of the operation support apparatus according to the first embodiment.
FIG. 4 is a conceptual diagram illustrating an example of a neural network in the operation support apparatus according to the first embodiment.
FIG. 5 is a diagram illustrating an example of control contents in the first embodiment.
FIG. 6 is a flowchart showing processing of the control content determination apparatus according to the first embodiment.
FIG. 7 is a diagram showing an example of a graphical user interface for inputting comfort information according to the first embodiment.
FIG. 8 is a block diagram illustrating a functional configuration of the control content determination apparatus according to the second embodiment.
FIG. 9 is a flowchart illustrating processing of the control content determination apparatus according to the second embodiment.

Hereinafter, embodiments will be specifically described with reference to the drawings.

It should be noted that each of the embodiments described below shows a comprehensive or specific example. Numerical values, shapes, materials, components, arrangement positions and connection forms of components, steps, order of steps, and the like shown in the following embodiments are merely examples, and are not intended to limit the scope of the claims. In addition, among the constituent elements in the following embodiments, constituent elements that are not described in the independent claims indicating the highest concept are described as optional constituent elements.

Each figure is a schematic diagram and is not necessarily drawn precisely. Moreover, in each figure, the same reference signs are given to the same or similar components and process steps.

(Embodiment 1)
[Operation support system configuration]
First, the overall configuration of the operation support system will be described. FIG. 1 is a block diagram illustrating a configuration of an operation support system 10 according to the first embodiment. The operation support system 10 according to the present embodiment includes an operation support device 100, a control content determination device 200, and an input device 300. Hereinafter, the configuration of each apparatus will be specifically described with reference to the drawings.

[Operation support device configuration]
The operation support apparatus 100 mechanically supports a user's operation. That is, the operation support apparatus 100 supports the user's operation by physically assisting the user. As illustrated in FIG. 1, the operation support apparatus 100 includes a main body 110, a sensor 120, a control unit 130, and a drive unit 140.

The main body 110 mechanically supports the user's operation by applying force to the user according to the user's operation. For example, the main body 110 includes a plurality of mechanical parts that move in accordance with a user's standing motion or walking motion. A specific example of the main body 110 will be described later with reference to FIGS. 2 to 3C.

The sensor 120 detects the operation state of the operation support apparatus 100 and outputs the detection result as operation state information. The operation state includes a position, an angle, a trajectory or a moving speed of a machine part included in the main body 110, or a force or pressure received from the user by the machine part. Further, the operation state may include the state of the driving unit 140. Specifically, the operation state may include an output (power level) of the drive unit 140 and the like.

Specifically, the sensor 120 is, for example, a rotary encoder. In this case, the sensor 120 detects the rotation speed (rotation angle) of the drive unit 140.
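
As an illustration of how the output of a rotary encoder might be turned into a rotation angle and a rotation speed, a minimal sketch follows. The encoder resolution `COUNTS_PER_REV` and the function names are assumptions introduced here and are not specified in this description.

```python
import math

# Hypothetical encoder resolution (counts per full revolution).
COUNTS_PER_REV = 2048


def counts_to_angle(counts: int, counts_per_rev: int = COUNTS_PER_REV) -> float:
    """Convert an accumulated encoder count into a rotation angle in radians."""
    return 2.0 * math.pi * counts / counts_per_rev


def angular_speed(prev_counts: int, curr_counts: int, dt: float,
                  counts_per_rev: int = COUNTS_PER_REV) -> float:
    """Estimate the rotation speed (rad/s) from two counts sampled dt seconds apart."""
    if dt <= 0:
        raise ValueError("dt must be positive")
    return counts_to_angle(curr_counts - prev_counts, counts_per_rev) / dt
```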

The control unit 130 controls the drive unit 140 based on the control content determined by the control content determination device 200. For example, when the drive unit 140 is an electric motor, the control unit 130 transmits a pulse signal corresponding to the rotation speed of the electric motor to the drive unit 140 based on the control content.

The driving unit 140 drives the main body unit 110. For example, the drive unit 140 includes an electric motor, a pulley, a belt, and the like. When the main body 110 is driven by the driving unit 140, the operation of the user is mechanically supported.

[Specific example of main part of motion support device]
Next, a specific example of the main body 110 of the operation support apparatus 100 will be described with reference to FIG. 2. FIG. 2 is a perspective view of the main body 110 of the motion support apparatus 100 according to the first embodiment. In each figure, the X-axis direction represents the direction opposite to the traveling direction of the motion support apparatus 100. Further, the Z-axis direction is vertically upward and orthogonal to the X axis. The Y-axis direction is orthogonal to both the X axis and the Z axis.

As shown in FIG. 2, the main body 110 of the motion support device 100 is configured to support the motions of a user seated on a seat surface, from standing up through walking, and includes a holding unit 111, an arm unit 112, and a base 113.

The holding unit 111 holds the upper body of the user by being attached to the user. The holding part 111 is detachably connected to one end of the arm part 112.

The arm unit 112 is a robot arm with two degrees of freedom. The arm part 112 includes a first joint 112a, a first arm 112b, a second joint 112c, a second arm 112d, and a connection part 112e.

One end of the first arm 112b is rotatably connected to the base 113 via the first joint 112a, and the other end of the first arm 112b is rotatably connected to the second arm 112d via the second joint 112c. The first arm 112b is driven by the driving unit 140 and rotates around the first joint 112a.

One end of the second arm 112d is rotatably connected to the first arm 112b via the second joint 112c, and the other end of the second arm 112d is connected to the connecting portion 112e. The second arm 112d is driven independently of the first arm 112b by the driving unit 140, and rotates about the second joint 112c.
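
The geometry of this two-joint arm can be sketched with planar forward kinematics. The sketch below assumes that both arms move in the XZ plane, that the connection part 112e sits at the far end of the second arm 112d, and that the link lengths and angle conventions are as stated in the comments. None of these values or names are given in this description.

```python
import math
from typing import Tuple


def connection_position(theta1: float, theta2: float,
                        len1: float, len2: float) -> Tuple[float, float]:
    """Return (x, z) of the connection part relative to the first joint.

    theta1: angle of the first arm measured from the +X axis (radians, assumed).
    theta2: angle of the second arm relative to the first arm (radians, assumed).
    len1, len2: hypothetical lengths of the first and second arms.
    """
    x = len1 * math.cos(theta1) + len2 * math.cos(theta1 + theta2)
    z = len1 * math.sin(theta1) + len2 * math.sin(theta1 + theta2)
    return x, z
```

For example, with both angles at zero the two arms are stretched out along the X axis and the connection part lies at a distance `len1 + len2` from the first joint.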

The base portion 113 supports the arm portion 112, which is connected to it. Here, the base 113 has wheels for moving over the floor surface, and the rotation of the wheels is controlled by the drive unit 140 so that the base 113 moves over the floor surface in the negative direction of the X axis. The movement of the base 113 can support the user's walking motion.

Here, the operation of the main body 110 when supporting the user's standing operation will be described with reference to FIGS. 3A to 3C. FIGS. 3A to 3C are diagrams for explaining the operation of the main body 110 of the operation support apparatus 100 according to Embodiment 1.

In FIG. 3A, the user is sitting on a seat surface (for example, a bed, a chair, or a toilet seat) while being held by the holding unit 111. Here, when receiving an instruction input for starting support for the standing motion from the user, the driving unit 140 supports the motion from the sitting posture to the forward leaning posture by driving the arm unit 112 as shown in FIG. 3B. Furthermore, as shown in FIG. 3C, the driving unit 140 drives the arm unit 112 to support the motion from the forward leaning posture to the standing posture.

[Configuration of control content determination device]
Next, the control content determination apparatus 200 will be specifically described with reference to FIG. 1. The control content determination device 200 determines the control content for the operation support device 100 and outputs the control content to the operation support device 100. As illustrated in FIG. 1, the control content determination device 200 includes a first acquisition unit 210, a determination unit 220, an output unit 230, a second acquisition unit 240, and an update unit 250.

The control content determination device 200 is realized by, for example, a processor and a memory. For example, when the processor executes a software program stored in the memory, the processor functions as the first acquisition unit 210, the determination unit 220, the output unit 230, the second acquisition unit 240, and the update unit 250. Further, the control content determination apparatus 200 may be realized as one or more dedicated electronic circuits corresponding to the first acquisition unit 210, the determination unit 220, the output unit 230, the second acquisition unit 240, and the update unit 250.

The first acquisition unit 210 acquires operation state information from the sensor 120 of the operation support apparatus 100. For example, the first acquisition unit 210 acquires, as the operation state information, the position of the connection part 112e relative to the base 113 by processing an output signal of the sensor 120.

The determining unit 220 (i) determines the control content of the operation support apparatus 100 from the operation state information according to the control content determination rule, or (ii) determines the control content of the operation support apparatus 100 at random. That is, the determination unit 220 selectively executes one determination from among a plurality of determinations including the determination of (i) and the determination of (ii). At this time, the determination unit 220 selects the determination of (ii) with the probability ε. ε is a predetermined value larger than 0 and smaller than 1. For example, the determination unit 220 selects the determination of (i) with a probability of 1−ε, and selects the determination of (ii) with a probability of ε.
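
The branch between the determination of (i) and the determination of (ii) can be sketched as follows. This is a minimal illustration and not the implementation of the determination unit 220. The function name `choose_determination`, the return labels, and the use of Python's `random` module are assumptions made here.

```python
import random


def choose_determination(epsilon: float, rng=random) -> str:
    """Return "random" with probability epsilon, otherwise "rule".

    "rule" stands for determination (i) and "random" for determination (ii).
    epsilon must be larger than 0 and smaller than 1, as stated in the text.
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must satisfy 0 < epsilon < 1")
    # rng.random() is uniform on [0, 1), so this test succeeds with probability epsilon.
    return "random" if rng.random() < epsilon else "rule"
```

Passing a seeded `random.Random` instance as `rng` makes the branch reproducible, which can be convenient when testing.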

The control content determination rule is expressed by, for example, a neural network for estimating the value of each of a plurality of control contents from the operation state information. The control content determination rule is stored in a storage unit (not shown). The neural network will be described later with reference to FIG. 4.

The output unit 230 outputs the control content determined by the determination unit 220. Here, the output unit 230 outputs the control content to the operation support apparatus 100.

The second acquisition unit 240 acquires comfort information indicating how comfortable the user finds the operation support provided by the operation support apparatus 100. This comfort information includes information input by the user via the input device 300. For example, the second acquisition unit 240 acquires, from the input device 300, a value input by the user that indicates the comfort of the operation support.

Further, for example, the second acquisition unit 240 may acquire the comfort information by receiving a voice signal from the input device 300 and detecting an utterance of a predetermined keyword by voice recognition. The predetermined keyword is a keyword, determined in advance, that indicates the comfort of the user. For example, the predetermined keyword is “painful” or “slow”.
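
As a hedged sketch of the keyword-based acquisition, the code below maps a recognized transcript to a comfort value. It assumes that voice recognition has already produced a text transcript. The numeric values assigned to “painful” and “slow”, and the rule of taking the lowest value when several keywords occur, are hypothetical choices made for illustration.

```python
from typing import Optional

# Hypothetical comfort values for the example keywords (lower means less comfortable).
KEYWORD_COMFORT = {"painful": -1.0, "slow": -0.5}


def comfort_from_transcript(transcript: str) -> Optional[float]:
    """Return a comfort value if a predetermined keyword was uttered, else None."""
    tokens = (word.strip(".,!?") for word in transcript.lower().split())
    hits = [KEYWORD_COMFORT[token] for token in tokens if token in KEYWORD_COMFORT]
    # Assumption: when several keywords are detected, the lowest value is used.
    return min(hits) if hits else None
```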

The update unit 250 updates the control content determination rule used by the determination unit 220 based on the operation state information acquired by the first acquisition unit 210 and the comfort information acquired by the second acquisition unit 240. Specifically, the updating unit 250 updates the values of the plurality of control contents using a value based on the comfort information as a reward. Then, the updating unit 250 updates the neural network parameters (for example, the weight w) based on the updated value. That is, the update unit 250 learns the determination of the control content adapted to the user by reinforcement learning based on the values of the plurality of control content.
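
The description does not fix a particular update formula. One simple possibility, shown purely as an assumption, is to move the stored value of the executed control content toward the observed reward with a learning rate `alpha`. In the sketch, a dictionary stands in for the values that the neural network would otherwise estimate, and all names are hypothetical.

```python
from typing import Dict, Hashable, Tuple


def update_value(values: Dict[Tuple[Hashable, Hashable], float],
                 state: Hashable, action: Hashable,
                 reward: float, alpha: float = 0.1) -> float:
    """Move the value of (state, action) toward the comfort-based reward.

    Assumed update rule: Q <- Q + alpha * (reward - Q).
    """
    key = (state, action)
    old = values.get(key, 0.0)  # unseen pairs start at 0 (assumption)
    values[key] = old + alpha * (reward - old)
    return values[key]
```

With `alpha=0.5` and a reward of 1.0, a value starting at 0 becomes 0.5 after one update and 0.75 after a second one.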

[Description of neural network]
Here, the neural network in the present embodiment will be described with reference to FIG. 4. FIG. 4 is a conceptual diagram showing an example of a neural network in the control content determination apparatus 200 according to the first embodiment. This neural network is a multi-layered artificial neural network and is a mathematical model for estimating the value Qai of each of a plurality of control contents ai (i = 1 to n) in the environment s based on the operation state information.
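
To make the role of such a network concrete, the following sketch computes one value per control content from a state vector using a small fully connected network. The layer sizes, the tanh activation, the random initialization, and the function names are all assumptions made here. FIG. 4 is described only conceptually and does not specify them.

```python
import math
import random
from typing import Dict, List, Sequence


def init_layer(n_in: int, n_out: int, rng: random.Random) -> Dict[str, list]:
    """Create one fully connected layer with small random weights and zero biases."""
    return {
        "w": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
        "b": [0.0] * n_out,
    }


def forward(layers: List[Dict[str, list]], state: Sequence[float]) -> List[float]:
    """Estimate one value per control content from the operation state vector."""
    x = list(state)
    for idx, layer in enumerate(layers):
        z = [sum(w * xi for w, xi in zip(row, x)) + b
             for row, b in zip(layer["w"], layer["b"])]
        # Hidden layers use tanh; the last layer is linear so values are unbounded.
        x = z if idx == len(layers) - 1 else [math.tanh(v) for v in z]
    return x
```

For instance, `forward([init_layer(2, 4, rng), init_layer(4, 3, rng)], [0.1, 0.2])` returns three estimated values, one for each of three hypothetical control contents.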

[Specific examples of control contents]
FIG. 5 is a diagram illustrating an example of control contents in the first embodiment.

The control content is information indicating the time change of the position of the connection unit 112e. For example, the control content indicates the movement trajectory of the connecting portion 112e on the XZ plane. Specifically, the control content indicates the relative position of the connection part 112e with respect to the base part 113 at each time ti (i = 0 to m). The value in each environment s (operation state) of the plurality of control contents a1 to an including such control contents is estimated by the neural network.
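
A control content of this kind could be held as a list of timed positions. The sketch below is one possible representation. The class name, the field names, and the linear interpolation between the times ti are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ControlContent:
    """Trajectory of the connection part on the XZ plane, relative to the base."""
    name: str
    # Each waypoint is (t, x, z); times must be strictly increasing.
    waypoints: Tuple[Tuple[float, float, float], ...]

    def position_at(self, t: float) -> Tuple[float, float]:
        """Return (x, z) at time t, interpolating linearly (an assumption)."""
        pts = self.waypoints
        if t <= pts[0][0]:
            return pts[0][1:]
        for (t0, x0, z0), (t1, x1, z1) in zip(pts, pts[1:]):
            if t <= t1:
                r = (t - t0) / (t1 - t0)
                return (x0 + r * (x1 - x0), z0 + r * (z1 - z0))
        # After the last waypoint the final position is held.
        return pts[-1][1:]
```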

The input device 300 receives comfort information indicating the comfort of operation support from the user. For example, the input device 300 is provided in the operation support device 100 and receives comfort information input via a touch display, a mechanical push button, or the like. For example, the input device 300 may be a microphone. In this case, the input device 300 receives voice input from the user.

[Operation of control content determination device]
Next, the operation of the control content determination apparatus 200 configured as described above will be described with reference to FIGS. 6 and 7.

FIG. 6 is a flowchart showing processing of the control content determination apparatus 200 according to Embodiment 1. This process may be executed based on an instruction from the user, for example.

First, the first acquisition unit 210 acquires operation state information (S110). The determination unit 220 estimates the value of each control content from the operation state information based on the neural network (S120).

Subsequently, the determination unit 220 performs branch processing using the probability ε (S130). Here, the determination unit 220 selects the determination of (i) with a probability of 1−ε, and selects the determination of (ii) with a probability of ε.

Here, when the determination of (ii) is selected (ε in S130), the determination unit 220 determines the control content at random (S140). That is, the determination unit 220 randomly selects a control content from the plurality of control contents. In other words, the determination unit 220 determines the control content without depending on the value estimated based on the neural network.

On the other hand, when the determination of (i) is selected (1-ε of S130), the determination unit 220 determines the control content based on the estimated value (S150). For example, the determination unit 220 selects the control content having the highest value from the plurality of control contents.

The output unit 230 outputs the control content determined in step S140 or step S150 (S160). Thereby, the motion support apparatus 100 is controlled based on the determined control content.

Thereafter, the second acquisition unit 240 acquires comfort information (S170). For example, when the input device 300 includes a display, the second acquisition unit 240 acquires a value indicating the comfort of the user via a graphical user interface (GUI) as illustrated in FIG. 7. In the GUI of FIG. 7, the comfort value is input using a slider, but the GUI is not limited to this. The GUI may include a text box in which a numerical value is directly input, numerical value increase/decrease buttons, or a combination thereof.

Subsequently, the updating unit 250 updates the values of the plurality of control contents based on the operation state information and the comfort information (S180). At this time, a value based on the comfort information is used as a reward in reinforcement learning. The value based on the comfort information is a value indicating comfort, for example, a value that increases as the comfort increases.

Furthermore, the update unit 250 updates the parameters of the neural network based on the updated value (S190). That is, the update unit 250 learns the parameters of a neural network having a plurality of layers by inputting the value of each updated control content as a teacher signal.

So-called deep reinforcement learning is performed by repeating the processes in steps S180 and S190. The deep reinforcement learning is not particularly limited, and a conventional technique may be used. Therefore, detailed description of the deep reinforcement learning is omitted.

In addition, acquisition of comfort information does not need to be performed every time the content of control is determined. That is, step S170 may be skipped. In this case, the update unit 250 may learn the value of each control content using a predetermined value (for example, 0) as a reward.
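
The flow of steps S110 to S180 can be put together in a compact sketch. To keep the example short and runnable, a dictionary of values replaces the neural network, so the parameter update of step S190 is not shown. The update formula, the learning rate `alpha`, and all function and parameter names are assumptions introduced here.

```python
import random
from typing import Callable, Dict, Hashable, Optional, Tuple


def run_cycle(state: Hashable,
              values: Dict[Tuple[Hashable, int], float],
              n_actions: int,
              epsilon: float,
              get_comfort: Callable[[int], Optional[float]],
              rng=random,
              alpha: float = 0.1) -> Tuple[int, bool]:
    """Run one determination cycle and return (chosen action, was it random)."""
    # S120: estimate the value of each control content for the current state.
    q = [values.get((state, a), 0.0) for a in range(n_actions)]
    # S130: branch with probability epsilon.
    explored = rng.random() < epsilon
    if explored:
        action = rng.randrange(n_actions)                # S140: random choice
    else:
        action = max(range(n_actions), key=lambda a: q[a])  # S150: highest value
    # S160 is represented by handing the action to get_comfort; S170 reads comfort.
    reward = get_comfort(action)
    if reward is None:
        reward = 0.0  # comfort input skipped: use the predetermined value 0
    # S180: update the value of the executed control content (assumed rule).
    values[(state, action)] = q[action] + alpha * (reward - q[action])
    return action, explored
```

Repeating `run_cycle` with a `get_comfort` callback that rewards one particular control content gradually raises the stored value of that control content, after which the determination of (i) selects it.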

[Effect]
As described above, the control content determination device 200 according to the present embodiment includes: the first acquisition unit 210 that acquires the operation state information indicating the operation state of the operation support device 100; the determination unit 220 that (i) determines the control content of the operation support device 100 from the operation state information according to the control content determination rule, or (ii) determines the control content of the operation support device 100 at random; the output unit 230 that outputs the determined control content; the second acquisition unit 240 that acquires comfort information indicating comfort of operation support by the operation support device 100; and the update unit 250 that updates the control content determination rule based on the operation state information and the comfort information. The determination unit 220 selects the determination of (ii) with the probability ε.

With this configuration, the update unit 250 can update the control content determination rule based on the comfort information. Therefore, the control content determination apparatus 200 can learn a control content determination rule suitable for improving the comfort of the user, and can realize operation support adapted to each user. Furthermore, since the determination unit 220 selects a random determination with the probability ε, the optimal control content can be searched for without being bound by the current control content determination rule. That is, the control content determination device 200 can balance exploration against use of the learning result, and can effectively update the control content determination rule.

Moreover, in the control content determination device 200 according to the present embodiment, the control content determination rule is represented by a neural network for estimating the value of each of the plurality of control contents from the operation state information, and the updating unit 250 updates the values of the plurality of control contents using a value based on the comfort information as a reward, and updates the parameters of the neural network based on the updated values.

With this configuration, so-called deep reinforcement learning can be applied to the control content determination device 200, and the control content determination device 200 can construct a control content determination rule more suitable for the user. As a result, the control content determination apparatus 200 can realize operation support suitable for each user.

Moreover, in the control content determination apparatus 200 according to the present embodiment, the second acquisition unit 240 may acquire the comfort information by detecting an utterance of a predetermined keyword by voice recognition.

With this configuration, the control content determination apparatus 200 can reduce the burden of input of user comfort information, and can improve user convenience.

(Embodiment 2)
Next, a second embodiment will be described. The second embodiment differs from Embodiment 1 mainly in that one or more control contents are extracted from a plurality of control contents based on a safety level indicating the safety of the user when the operation support is performed, and the control content is determined at random from among the extracted one or more control contents. The second embodiment will be described below with a focus on differences from the first embodiment.

[Configuration of control content determination device]
A detailed configuration of the control content determination apparatus according to Embodiment 2 will be described. FIG. 8 is a block diagram illustrating a functional configuration of the control content determination apparatus 200A according to the second embodiment. As shown in FIG. 8, the control content determination device 200A includes a first acquisition unit 210, a determination unit 220A, an output unit 230, a second acquisition unit 240, an update unit 250A, and a detection unit 260A.

The determining unit 220A (i) determines the control content of the operation support apparatus 100 from the operation state information according to the control content determination rule, or (ii) determines the control content at random. Here, in the case of (ii), the determination unit 220A refers to the safety level information, and randomly determines the control content from among one or more control content in which the safety level satisfies a predetermined condition.

Safety level information is information in which a safety level is associated with each of a plurality of control contents. The safety level is a value indicating the safety of the user when the operation support is performed. For example, the safety level information is a table in which a value representing safety is associated with each of a plurality of control contents. The safety level information is stored in a storage unit (not shown).

The predetermined condition is a condition for determining the control content with high safety. For example, the predetermined condition is that the safety value is larger than a predetermined threshold value.

For example, the determination unit 220A refers to the safety degree information and extracts one or more control contents having a safety degree value larger than the threshold value from a plurality of control contents. Then, the determination unit 220A randomly determines the control contents from the extracted one or more control contents.
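
The extraction and the subsequent random choice can be sketched as follows. The safety level information is modeled as a dictionary from control content to safety value. Raising an error when no control content passes the threshold is an assumption, since the description does not say how that case is handled.

```python
import random
from typing import Dict, Hashable


def choose_safe_random(safety_levels: Dict[Hashable, float],
                       threshold: float,
                       rng=random) -> Hashable:
    """Pick a control content at random among those whose safety exceeds threshold."""
    candidates = [action for action, level in safety_levels.items()
                  if level > threshold]
    if not candidates:
        # Assumption: the empty case is treated as an error in this sketch.
        raise ValueError("no control content satisfies the safety condition")
    return rng.choice(candidates)
```

With safety values `{"a1": 80, "a2": 40, "a3": 55}` and a threshold of 50, only `a1` and `a3` can ever be returned.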

The detection unit 260A detects whether the user is safe. That is, when the control content is randomly determined, the detection unit 260A determines whether the operation support based on the determined control content is safe. For example, when the user falls, the detection unit 260A detects that the user is not safe (that is, dangerous).

The update unit 250A updates the control content determination rule based on the operation state information and the comfort information as in the first embodiment. The update unit 250A according to the present embodiment further updates the safety level information based on the detection result by the detection unit 260A when the control content is randomly determined. For example, when it is detected that the user is not safe, the updating unit 250A decreases the value of the safety level of the determined control content. Conversely, for example, when it is detected that the user is safe, the update unit 250A increases the value of the safety level of the determined control content.
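
A minimal sketch of this safety level update is shown below. The step size and the lower and upper bounds of the safety value are hypothetical. The description states only that the value is decreased when the user is not safe and increased when the user is safe.

```python
from typing import Dict, Hashable


def update_safety(safety_levels: Dict[Hashable, float],
                  action: Hashable,
                  user_safe: bool,
                  step: float = 5.0,
                  lo: float = 0.0,
                  hi: float = 100.0) -> float:
    """Raise or lower the safety value of a randomly determined control content."""
    delta = step if user_safe else -step
    # Clamp to [lo, hi] so repeated updates stay within an assumed value range.
    safety_levels[action] = min(hi, max(lo, safety_levels[action] + delta))
    return safety_levels[action]
```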

[Operation of control content determination device]
Next, the operation of the control content determination apparatus 200A configured as described above will be described with reference to FIG. 9. FIG. 9 is a flowchart showing processing of the control content determination apparatus 200A according to the second embodiment.

When the determination of (ii) is selected in step S130 (ε of S130), the determination unit 220A extracts one or more control contents from a plurality of control contents based on the safety level information (S132A). For example, the determination unit 220A refers to the safety level information, and extracts control contents having a safety level value greater than a predetermined threshold value (for example, 50) from the plurality of control contents a1 to an.

Then, the determination unit 220A randomly determines the control content from the extracted control content (S140A).

Thereafter, Steps S160 to S190 are executed, and when the control content is not determined at random (No in S192A), the process is terminated as it is. On the other hand, when the control content is determined at random (Yes in S192A), the detection unit 260A detects whether the operation support is safe (S194A). The update unit 250A updates the safety level information based on the detection result by the detection unit 260A (S196A).

[Effect]
As described above, in the control content determination device 200A according to the present embodiment, when the control content is randomly determined, the determination unit 220A refers to safety level information in which each of the plurality of control contents is associated with a safety level indicating the safety of the user when operation support based on that control content is performed, and randomly determines the control content from among one or more control contents whose safety level satisfies a predetermined condition.

With this configuration, the determination unit 220A can reduce the possibility of danger to the user when determining the control content at random. That is, the determination unit 220A can suppress the determination of control content that causes danger to the user in the random determination.

In addition, the control content determination device 200A according to the present embodiment further includes a detection unit 260A that detects whether the operation support is dangerous, and, when the control content is randomly determined, the update unit 250A further updates the safety level information based on the detection result by the detection unit 260A.

With this configuration, the updating unit 250A can update the safety level information based on the detection result as to whether or not the user is put at risk by the operation support, and can thereby improve the safety level information. Therefore, the determination unit 220A can suppress the determination of control content that causes danger to the user in the random determination.

(Other embodiments)
The control content determination device according to one or more aspects of the present invention has been described based on the embodiments, but the present invention is not limited to these embodiments. Unless they deviate from the gist of the present invention, forms obtained by applying various modifications conceivable by those skilled in the art to the embodiments, and forms constructed by combining components in different embodiments, may also be included within the scope of one or more aspects of the present invention.

In each of the above embodiments, the second acquisition unit 240 acquires the comfort information based on the information received from the input device 300. However, the second acquisition unit 240 may acquire the comfort information based not only on the information received from the input device 300 but also on the information received from the sensor 120. For example, the second acquisition unit 240 may acquire the comfort information by correcting the information received from the input device 300 using the information received from the sensor 120. Specifically, the second acquisition unit 240 may correct the information received from the input device 300 based on the user's facial expression, brain waves, or heart rate. In this case, the sensor 120 may include an image sensor, an electroencephalogram sensor, or a heart rate sensor.

In each of the above embodiments, the determination of the control content adapted to the user is learned using deep reinforcement learning. However, the learning is not limited to deep reinforcement learning. For example, the control content determination rule may be represented not by a multi-layer neural network but by a single-layer neural network. Further, the control content determination rule may be represented not by a neural network but by another mathematical model (for example, linear regression or a support vector machine).
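
As one possible illustration of a rule represented by a model other than a multi-layer neural network, the following Python sketch estimates the value of each control content with a linear model and moves that estimate toward an observed reward. The class name, the state vector, the reward values, and the learning rate are assumptions for this sketch. The reward is treated here as a single number derived from the comfort information, which is also an assumption.

```python
class LinearValueRule:
    """Linear stand-in for the control content determination rule.

    One weight vector and one bias per control content:
    value(state, content) = w_content . state + b_content
    """

    def __init__(self, control_contents, state_size, learning_rate=0.05):
        self.lr = learning_rate
        self.weights = {c: [0.0] * state_size for c in control_contents}
        self.bias = {c: 0.0 for c in control_contents}

    def value(self, state, content):
        w = self.weights[content]
        return sum(wi * si for wi, si in zip(w, state)) + self.bias[content]

    def determine(self, state):
        # Determination (i): choose the control content with the highest value.
        return max(self.weights, key=lambda c: self.value(state, c))

    def update(self, state, content, reward):
        # One gradient step that moves the estimated value toward the reward.
        error = reward - self.value(state, content)
        self.weights[content] = [wi + self.lr * error * si
                                 for wi, si in zip(self.weights[content], state)]
        self.bias[content] += self.lr * error


rule = LinearValueRule(["slow_lift", "fast_lift"], state_size=2)
state = [1.0, 0.5]  # hypothetical operation state information
for _ in range(200):
    rule.update(state, "slow_lift", reward=1.0)    # user reported comfort
    rule.update(state, "fast_lift", reward=-1.0)   # user reported discomfort
print(rule.determine(state))
```

The sketch uses only the immediate reward and omits any estimate of future value, so it is simpler than the reinforcement learning described in the embodiments.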

In each of the above embodiments, the control content of the operation support device 100 is determined by selectively executing one of two determinations ((i) determining the control content of the operation support device 100 from the operation state information according to the control content determination rule, or (ii) determining the control content of the operation support device 100 at random). However, the determination is not necessarily limited to these two. For example, one determination may be selected from three or more determinations. In other words, the determination unit only needs to selectively execute one of a plurality of determinations including the determination of (i) and the determination of (ii). In this case, it is sufficient that the determination of (ii) is selected with the probability ε.
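
A selection among three or more determinations that still selects the determination of (ii) with the probability ε could, for example, be sketched in Python as follows. The third determination ("repeat_previous"), the relative weights, and the function name are hypothetical and serve only to show that the remaining probability 1 - ε can be divided among the other determinations in any manner.

```python
import random


def select_determination(epsilon, other_determinations, rng=random):
    """Return the name of the determination to execute.

    "random" (the determination of (ii)) is returned with probability epsilon.
    The remaining probability 1 - epsilon is shared among the other
    determinations in proportion to their (hypothetical) weights.
    """
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        return "random"
    names = list(other_determinations)
    weights = [other_determinations[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# "rule" is the determination of (i); "repeat_previous" is an assumed third one.
others = {"rule": 3.0, "repeat_previous": 1.0}
rng = random.Random(42)
draws = [select_determination(0.1, others, rng) for _ in range(10000)]
print(draws.count("random") / len(draws))  # close to epsilon = 0.1
```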

In addition, in each of the above embodiments, the control content determination device is realized by a single device, but it may be realized by a plurality of devices connected to each other. For example, the control content determination device may be realized by cloud computing.

In the second embodiment, the safety level information is updated. However, the safety level information is not necessarily updated. In this case, the control content determination device 200A may not include the detection unit 260A.

In addition, some or all of the components included in the control content determination device in each of the above embodiments may be configured by a single system LSI (Large Scale Integration). For example, the control content determination apparatus 200 may be configured by a system LSI having a first acquisition unit 210, a determination unit 220, an output unit 230, a second acquisition unit 240, and an update unit 250.

The system LSI is an ultra-multifunctional LSI manufactured by integrating a plurality of components on one chip. Specifically, it is a computer system including a microprocessor, a ROM (Read Only Memory), a RAM (Random Access Memory), and the like. A computer program is stored in the ROM. The system LSI achieves its functions by the microprocessor operating according to the computer program.

Note that although the term system LSI is used here, it may also be called an IC, LSI, super LSI, or ultra LSI depending on the degree of integration. Further, the method of circuit integration is not limited to LSI, and implementation using a dedicated circuit or a general-purpose processor is also possible. An FPGA (Field Programmable Gate Array) that can be programmed after the LSI is manufactured, or a reconfigurable processor in which the connections and settings of the circuit cells inside the LSI can be reconfigured, may be used.

Furthermore, if integrated circuit technology that replaces LSI emerges as a result of advances in semiconductor technology or another derived technology, the functional blocks may naturally be integrated using that technology. The application of biotechnology is one such possibility.

Further, one aspect of the present invention may be not only such a control content determination device but also a control content determination method whose steps are the characteristic components included in the control content determination device. Further, one aspect of the present invention may be a computer program that causes a computer to execute each characteristic step included in the control content determination method. One aspect of the present invention may also be a computer-readable non-transitory recording medium on which such a computer program is recorded.

In each of the above embodiments, each component may be configured by dedicated hardware or may be realized by executing a software program suitable for each component. Each component may be realized by a program execution unit such as a CPU or a processor reading and executing a software program recorded on a recording medium such as a hard disk or a semiconductor memory. Here, the software that realizes the control content determination device of each of the above embodiments is a program as follows.

That is, the program causes a computer to execute a control content determination method including: a first acquisition step of acquiring operation state information indicating an operation state of an operation support device that mechanically supports a user's operation; a determination step of (i) determining the control content of the operation support device from the operation state information according to a control content determination rule, or (ii) determining the control content of the operation support device at random; an output step of outputting the determined control content; a second acquisition step of acquiring comfort information indicating the comfort of the operation support by the operation support device; and an update step of updating the control content determination rule based on the operation state information and the comfort information, wherein, in the determination step, the determination of (ii) is selected with probability ε.
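
For illustration only, one pass through the steps of such a program could be organized as in the following Python sketch. All function and parameter names are assumptions for this sketch. The acquisition, output, and update operations are passed in as callables because their concrete form depends on the operation support device, the input device, and the model used for the control content determination rule.

```python
import random


def run_control_step(acquire_state, rule_determine, control_contents,
                     output, acquire_comfort, update_rule, epsilon, rng=random):
    """Execute one pass of the control content determination method."""
    state = acquire_state()                    # first acquisition step
    if rng.random() < epsilon:                 # determination step, (ii)
        content = rng.choice(control_contents)
        was_random = True
    else:                                      # determination step, (i)
        content = rule_determine(state)
        was_random = False
    output(content)                            # output step
    comfort = acquire_comfort()                # second acquisition step
    update_rule(state, content, comfort)       # update step
    return content, was_random


# Stub callables standing in for the device, the input device, and the rule.
sent = []
updates = []
content, was_random = run_control_step(
    acquire_state=lambda: (0.2, 0.8),
    rule_determine=lambda state: "slow_lift",
    control_contents=["slow_lift", "fast_lift"],
    output=sent.append,
    acquire_comfort=lambda: 1,
    update_rule=lambda s, c, comfort: updates.append((s, c, comfort)),
    epsilon=0.0,  # with epsilon = 0 the rule-based determination is always used
)
```

The returned flag indicates whether the control content was determined at random, which a caller could use to decide whether to run further processing that applies only to the random determination.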

DESCRIPTION OF SYMBOLS
100 Operation support device
200, 200A Control content determination device
210 First acquisition unit
220, 220A Determination unit
230 Output unit
240 Second acquisition unit
250, 250A Update unit
260A Detection unit

Claims

A first acquisition unit that acquires operation state information indicating an operation state of an operation support device that mechanically supports a user operation;
A determination unit that (i) determines the control content of the operation support device from the operation state information according to a control content determination rule, or (ii) determines the control content of the operation support device at random;
An output unit for outputting the determined control content;
A second acquisition unit that acquires comfort information indicating comfort of operation support by the operation support device;
An update unit that updates the control content determination rule based on the operation state information and the comfort information,
The determination unit selects the determination of (ii) with a probability ε.
Control content determination device.
The control content determination rule is represented by a neural network for estimating the value of each of a plurality of control content from operation information,
The update unit updates a value of the plurality of control contents using a value based on the operation information as a reward, and updates a parameter of the neural network based on the updated value.
The control content determination apparatus according to claim 1.
The second acquisition unit acquires the comfort information by detecting an utterance of a predetermined keyword by voice recognition.
The control content determination apparatus according to claim 1 or 2.
In the determination of (ii), the determination unit refers to safety level information in which each of a plurality of control contents is associated with a safety level indicating the safety of the user when operation support is performed based on that control content, and determines the control content at random from one or more control contents satisfying a predetermined safety level,
The control content determination device according to any one of claims 1 to 3.
The control content determination device further includes a detection unit that detects whether the operation support is safe,
The update unit further updates the safety degree information based on a detection result by the detection unit when the determination of (ii) is selected.
The control content determination apparatus according to claim 4.
A first acquisition step of acquiring operation state information indicating an operation state of an operation support device that mechanically supports a user's operation;
A determination step of (i) determining the control content of the operation support device from the operation state information according to a control content determination rule, or (ii) determining the control content of the operation support device at random;
An output step for outputting the determined control content;
A second acquisition step of acquiring comfort information indicating comfort of operation support by the operation support device;
An update step of updating the control content determination rule based on the operation state information and the comfort information,
In the determination step, the determination of (ii) is selected with probability ε.
Control content determination method.
A program for causing a computer to execute the control content determination method according to claim 6.