WO2023021776A1 - Information processing device, information processing method, and non-transitory computer-readable medium in which program is stored - Google Patents

Information processing device, information processing method, and non-transitory computer-readable medium in which program is stored

Info

Publication number
WO2023021776A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
nodes
learning
command value
data
Prior art date
Application number
PCT/JP2022/014070
Other languages
French (fr)
Japanese (ja)
Inventor
修 長谷川
洸輔 井加田
直純 津田
Original Assignee
Soinn株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Soinn株式会社 filed Critical Soinn株式会社
Priority to JP2023542213A, published as JPWO2023021776A5
Publication of WO2023021776A1

Classifications

    • BPERFORMING OPERATIONS; TRANSPORTING
    • B25HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25JMANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J9/00Programme-controlled manipulators
    • B25J9/16Programme controls
    • B25J9/1656Programme controls characterised by programming, planning systems for manipulators
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/906Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology

Definitions

  • The present invention relates to an information processing device, an information processing method, and a program.
  • Time-series data is a set of values obtained by observing a certain phenomenon continuously or intermittently, that is, a set of values that indicate changes in the phenomenon over time.
  • Time-series clustering is known as a method for analyzing such time-series data.
  • For time-series clustering, three methods are generally known: whole time-series clustering, subsequence clustering, and time point clustering (Non-Patent Documents 1 and 2).
  • In whole time-series clustering, clustering is performed by measuring the similarity between whole sets of time-series data (time-series sets).
  • In subsequence clustering, one set of time-series data (a time-series set) is divided into multiple segments, and clustering is performed for each segment.
  • In time point clustering, one set of time-series data (a time-series set) is divided into individual points, and clustering is performed by measuring the similarity between the points.
  • A method that acquires the state quantity of a certain system and gives the system a command value according to the state quantity is widely used to control the behavior of the system.
  • In such control, a suitable command value is estimated based on the state quantity of the system acquired at an arbitrary time, and the estimated command value is given to the system.
  • For this estimation, learning data, which is time-series data consisting of a previously observed history of human operations (command values) and system state quantities, is learned by machine learning, and the command value is estimated based on the learning result.
  • The k-nearest neighbor method is known as a technique for estimating the command value corresponding to the state quantity of input data.
  • The k-nearest neighbor method uses the learning data as it is, and outputs an operation value from the k pieces of history data closest to the input data.
  • The output value can be, for example, the average of the command values observed a predetermined time later in each of the k pieces of history data closest to the input data.
  • Because the k-nearest neighbor method requires no training step, unlike other supervised learning methods, it does not require a huge amount of data, and since it operates within the range of the provided data, it is used in various fields.
  • However, the k-nearest neighbor method has the problem that the distances between the input data and all the data elements of the learning data must be calculated, resulting in a large amount of computation and a long computation time. A minimal sketch of this baseline is given below.
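For reference, the k-nearest-neighbor estimation described above can be sketched as follows. This is an illustration only; the array layouts, the lookahead convention, and the function name are assumptions rather than anything specified in this document.

```python
import numpy as np

def knn_estimate(states, commands, query, k=5):
    """k-nearest-neighbor command estimation: the query is compared with
    every stored history element, so the cost grows linearly with the
    amount of learning data (the drawback noted above)."""
    dists = np.linalg.norm(states - query, axis=1)  # distances to ALL elements
    nearest = np.argsort(dists)[:k]                 # k closest history entries
    return commands[nearest].mean(axis=0)           # average their command values

# states: (n, d) history of state quantities; commands: (n, c) command values
# observed a predetermined time after each state.
states = np.random.rand(10000, 4)
commands = np.random.rand(10000, 2)
q_out = knn_estimate(states, commands, np.random.rand(4))
```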
  • The present invention has been made in view of the above circumstances, and aims to quickly give a command value to a system to be controlled according to the operating conditions and state quantities of that system.
  • An information processing apparatus according to one aspect of the present invention includes: a model holding unit that holds a model created by learning the distribution structure of learning data, the learning data including, as data elements, a first state quantity acquired in advance from a learning object and a first command value given, in accordance with the first state quantity, for controlling the motion of the learning object, the model being a set of nodes each represented as a multidimensional vector containing a second state quantity and a second command value based on the learning result; a search unit that receives, as input data, a third state quantity acquired from a controlled object and searches for a node matching or approximating the input data from among a number of the nodes smaller than the number of data elements of the learning data; and an output unit that outputs a value based on the second command value of the searched node to the controlled object as an output command value given for operating the controlled object.
  • Thereby, a command value corresponding to the operating conditions and state quantities of the controlled object can be quickly given to the controlled object.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the model is preferably created by learning each data element of the learning data as a node, and the search unit preferably selects, as the smaller number of nodes, nodes that are temporally relatively close to the input data, and searches the selected nodes for the node at the shortest distance from the input data. As a result, in order to quickly give a command value to the controlled object according to its operating conditions and state quantities, a model used for estimating the command value can be created and the node at the shortest distance can be identified.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the model is preferably created by time-series clustering the learning data and classifying the nodes into a plurality of clusters, and the search unit preferably searches for the cluster to which the input data belongs and the cluster temporally immediately after it, and searches the nodes belonging to the two searched clusters for the node at the shortest distance from the input data. In this way, in order to quickly give a command value to the controlled object according to its operating conditions and state quantities, a model used for estimating the command value can be created and the node at the shortest distance can be concretely identified.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the search unit preferably determines the output command value based on a statistic calculated from the command values of some or all of the nodes of the cluster to which the searched node belongs. Thereby, an appropriate output command value can be determined as needed.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, preferably further including a model creation unit that creates the model by approximately learning the distribution structure of the data elements of the learning data with a smaller number of nodes than the number of data elements, and outputs the created model to the model holding unit, the search unit searching for a node approximating the input data from among the nodes included in the model. As a result, in order to quickly give a command value to the controlled object according to its operating conditions and state quantities, a model used for estimating the command value can be created and the node at the shortest distance can be identified.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the search unit preferably outputs the command value of the searched node as the output command value.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the search unit preferably determines the output command value based on a statistic calculated from the command value of the searched node and the command values of one or more nodes similar to the searched node. Thereby, the output command value can be determined appropriately.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the one or more nodes similar to the searched node are preferably nodes within a predetermined distance from the searched node, or a predetermined number of nodes selected in ascending order of distance from the searched node. This makes it possible to determine the output command value according to need.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the statistic is preferably one of an average value, a median value, a maximum value, a minimum value, and a mode. This makes it possible to determine the output command value according to need.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the second state quantity is preferably the first state quantity acquired in advance from the learning object, and the second command value is preferably the first command value given to the learning object according to that first state quantity. This allows the model to be created based on appropriate data.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the first command value is preferably the actual value of the command value given to the learning object by a person operating the learning object based on the first state quantity acquired in advance from the learning object. This allows the model to be created based on appropriate data.
  • An information processing apparatus according to one aspect of the present invention is the above information processing apparatus, wherein the first command value included in a data element of the learning data is preferably a command value given a predetermined time after the first state quantity included in that data element was acquired. This makes it possible to create a model that takes into account the time lag between acquisition of the state quantity and output of the command value.
  • An information processing method according to one aspect of the present invention includes: holding a model created by learning the distribution structure of learning data, the learning data including, as data elements, a first state quantity acquired in advance from a learning object and a first command value given, in accordance with the first state quantity, for controlling the motion of the learning object, the model being a set of nodes each represented as a multidimensional vector containing a second state quantity and a second command value based on the learning result; receiving, as input data, a third state quantity acquired from a controlled object and searching for a node matching or approximating the input data from among a number of the nodes smaller than the number of data elements of the learning data; and outputting a value based on the second command value of the searched node to the controlled object as an output command value given for operating the controlled object.
  • Thereby, a command value corresponding to the operating conditions and state quantities of the controlled object can be quickly given to the controlled object.
  • A program according to one aspect of the present invention causes a computer to execute: a process of holding a model created by learning the distribution structure of learning data, the learning data including, as data elements, a first state quantity acquired in advance from a learning object and a first command value given, in accordance with the first state quantity, for controlling the motion of the learning object, the model being a set of nodes each represented as a multidimensional vector containing a second state quantity and a second command value based on the learning result; a process of receiving, as input data, a third state quantity acquired from a controlled object and searching for a node matching or approximating the input data from among the nodes included in the model; and a process of outputting a value based on the second command value of the searched node to the controlled object as an output command value given for operating the controlled object.
  • Thereby, a command value corresponding to the operating conditions and state quantities of the controlled object can be quickly given to the controlled object.
  • FIG. 1 is a diagram illustrating an example of a system configuration for realizing the information processing apparatus according to the first embodiment.
  • FIG. 2 is a diagram showing the external configuration of the information processing apparatus according to the first embodiment.
  • FIG. 3 is a diagram schematically showing the configuration of the information processing apparatus according to the first embodiment.
  • FIG. 4 is a diagram schematically showing a situation in which the information processing apparatus according to the first embodiment is used.
  • FIG. 5 is a diagram showing the outline of the robot used in the experiment.
  • FIG. 6 is a diagram showing an example of motions learned by the robot arm.
  • FIG. 7 is a diagram schematically showing the forces acting in each state when the robot arm avoids an obstacle.
  • FIG. 8 is a diagram showing the format of the time-series data.
  • FIG. 9 is a diagram showing an example of learning the motion of the robot arm.
  • FIG. 10 is a diagram showing nodes created by the model creation unit.
  • FIG. 11 is a flowchart of processing in the information processing apparatus according to the first embodiment.
  • FIG. 12 is a flowchart of a modified example of processing in the information processing apparatus according to the first embodiment.
  • FIG. 13 is a diagram showing the nodes subject to distance calculation in the processing by the information processing apparatus according to the first embodiment and in the processing by the k-nearest neighbor method.
  • FIG. 14 is a diagram schematically showing the configuration of the information processing apparatus according to the second embodiment.
  • FIG. 15 is a diagram schematically showing a situation in which the information processing apparatus according to the second embodiment is used.
  • FIG. 16 is a flowchart showing operations in the information processing apparatus according to the second embodiment.
  • FIG. 17 is a diagram schematically showing the configuration of the model creation unit according to the second embodiment.
  • FIG. 18 is a flowchart of learning processing by the SOINN method.
  • FIG. 19 is a diagram showing an example of the node distribution obtained when the learning data used in the first embodiment are learned by the SOINN method in the information processing apparatus according to the second embodiment.
  • Embodiment 1: FIG. 1 illustrates an example of a system configuration for realizing the information processing apparatus according to the first embodiment.
  • The information processing apparatus 100 can be implemented by a computer 1000 such as a dedicated computer or a personal computer (PC). The computer need not be physically single; multiple computers may be used when performing distributed processing.
  • The computer 1000 has a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1002, and a RAM (Random Access Memory) 1003, which are interconnected via a bus 1004.
  • Although a description of the OS and other software for operating the computer is omitted, the computer constituting this information processing apparatus is assumed to include such software as a matter of course.
  • An input/output interface 1005 is also connected to the bus 1004.
  • To the input/output interface 1005 are connected, for example, an input unit 1006 including a keyboard, a mouse, and sensors; an output unit 1007 including a display such as a CRT or LCD, headphones, and speakers; a storage unit 1008 including a hard disk; and a communication unit 1009 including a modem and a terminal adapter.
  • The CPU 1001 executes various kinds of processing according to programs stored in the ROM 1002 or loaded from the storage unit 1008 into the RAM 1003; in this embodiment, this includes, for example, the processing of each unit of the information processing apparatus 100 described later.
  • A GPU (Graphics Processing Unit) may also be provided.
  • The GPU is suitable for performing routine processing in parallel, and by applying it to the learning processing and the like described later, the processing speed can be improved compared to the CPU 1001.
  • The RAM 1003 also stores data necessary for the CPU 1001 and the GPU to execute various types of processing.
  • The communication unit 1009 performs, for example, communication processing via the Internet (not shown), transmits data provided by the CPU 1001, and outputs data received from the communication partner to the CPU 1001, the RAM 1003, and the storage unit 1008.
  • The storage unit 1008 exchanges data with the CPU 1001 to save and erase information.
  • The communication unit 1009 also performs communication processing of analog or digital signals with other devices.
  • A drive 1010 is also connected to the input/output interface 1005 as necessary; a magnetic disk 1011, an optical disk 1012, a flexible disk 1013, or a semiconductor memory 1014 is mounted as appropriate, and computer programs read from them are installed in the storage unit 1008 as required.
  • FIG. 2 shows the external configuration of the information processing apparatus 100 according to the first embodiment.
  • The information processing device 100 has a processing unit 110, a display unit 120, and an input unit 130.
  • The processing unit 110 is configured as hardware having the aforementioned CPU 1001, ROM 1002, RAM 1003, bus 1004, input/output interface 1005, storage unit 1008, communication unit 1009, drive 1010, and the like.
  • The display unit 120 corresponds to the output unit 1007 described above, and is configured as a display device such as an LCD that displays images in a format that the operator can visually recognize.
  • The input unit 130 corresponds to the input unit 1006 described above, and is composed of various input means such as a mouse and a keyboard.
  • Next, the time-series data TSD, which is the input data to be learned by the information processing apparatus 100, will be described.
  • The time-series data TSD is given as a set of multidimensional vectors representing data elements.
  • A multidimensional vector a_i representing a data element is defined as a vector containing the time t_i and m components p, as in the following equation: a_i = (t_i, p_1, p_2, ..., p_m).
  • Here, i is an integer of 1 or more and n or less, and m is an integer of 1 or more. Therefore, when the number of data elements is n, the time-series data TSD is represented by the following formula: TSD = {a_1, a_2, ..., a_n}.
  • The time-series data TSD is stored, for example, in a storage unit provided in the information processing device 100 (for example, the RAM 1003 or the storage unit 1008). A simple array representation of this format is sketched below.
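For illustration only, the format above can be held as a plain array; the variable names and shapes below are assumptions, not part of the patent.

```python
import numpy as np

# Each data element a_i = (t_i, p_1, ..., p_m); the series TSD is then
# an (n, 1 + m) array whose first column holds the sampling times t_i.
n, m = 1000, 6
times = np.linspace(0.0, 10.0, n).reshape(-1, 1)   # t_1 .. t_n
components = np.zeros((n, m))                       # p_1 .. p_m per element
TSD = np.hstack([times, components])                # one row per data element
```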
  • Machine learning is applied in the information processing apparatus 100 described below; as a prerequisite, an outline of machine learning is given first.
  • The general machine learning described below is merely background for understanding the control system described in the following embodiments, and the machine learning applied to the control system is not limited to it.
  • Machine learning is broadly divided into supervised learning and unsupervised learning. An outline of each method will be described below.
  • In supervised learning, learning is performed to predict a certain variable (objective variable) from given variables (explanatory variables). More specifically, supervised learning is a technique of giving correct data (objective variables) for input data (explanatory variables) and learning the relationship between the input data and the correct data.
  • When the correct data are continuous values, learning by regression analysis is performed.
  • The method of learning continuous data is not limited to regression analysis (for example, linear regression).
  • In regression analysis, by fitting the input data with various functions, it is possible to predict the output corresponding to the input data.
  • When the correct data are discrete values, learning by classification is performed.
  • For classification, techniques such as regression (logistic regression, support vector machines), trees (decision trees, random forests), neural networks, and clustering (such as the k-nearest neighbor method) are used.
  • In unsupervised learning, the features of the input data are learned without correct data being given.
  • Methods of unsupervised learning include clustering, represented by the k-means method and the SOINN method; dimension reduction, such as the PCA method; and anomaly detection, such as Hotelling's T² method. For example, clustering can extract and group items that are similar and share features from the input data.
  • The SOINN method is a learning method that grows neurons as needed during learning, using what is called a self-organizing incremental neural network (SOINN).
  • SOINN has many advantages, such as being able to learn non-stationary inputs by autonomously managing the number of nodes, and being able to extract an appropriate number of classes and an appropriate topological structure even for classes with complex distribution shapes.
  • In pattern recognition, for example, SOINN can additionally learn a class of katakana characters after learning a class of hiragana characters.
  • Improved methods such as E-SOINN (Enhanced SOINN, Patent Document 1) and LB-SOINN (Load Balance Self-Organizing Incremental Neural Network, Patent Document 2) have also been proposed.
  • In the SOINN method, a neural network having multiple nodes is used.
  • In the information processing apparatus 100, a non-hierarchical neural network in which nodes described by n-dimensional vectors (n is an integer of 1 or more) are arranged is used.
  • The neural network is stored in a storage unit such as the RAM 1003, for example.
  • The neural network in the SOINN method is a self-propagating neural network in which input vectors are fed into the network and the numbers of nodes and edges placed in the network are automatically increased based on those input vectors.
  • By using such a self-propagating neural network, the number of nodes can be increased automatically.
  • The neural network in the SOINN method has a non-hierarchical structure.
  • Therefore, additional learning can be performed without specifying the timing of starting learning in other layers; that is, additional learning can be carried out online.
  • Clustering in the SOINN method performs class classification by referring to the nodes and edges.
  • Various class classification methods can be applied to the classification of the nodes constituting the network; for example, the same processing as in the LB-SOINN of Patent Document 2 may be performed.
  • FIG. 3 schematically shows the configuration of the information processing apparatus 100 according to the first embodiment.
  • The information processing device 100 has a model holding unit 1, a search unit 2, and an output unit 3.
  • The model holding unit 1 reads and holds the model MD created from the learning data DAT, which is time-series data.
  • Various storage means, such as the RAM 1003 and the storage unit 1008 in FIG. 1, can be applied to the model holding unit 1, for example.
  • The search unit 2 is configured to be able to read the model MD from the model holding unit 1 as appropriate.
  • The search unit 2 refers to the state quantity of the separately input data to be estimated (hereinafter referred to as input data), searches for an approximating node from among the nodes included in the loaded model MD, and outputs the search result to the output unit 3.
  • The output unit 3 outputs an output command value corresponding to the state quantity of the input data, determined based on the search result.
  • FIG. 4 schematically shows a situation in which the information processing apparatus 100 according to the first embodiment is used.
  • In the learning phase, the information processing apparatus 100 learns the operations performed by the operator 10 on the operation target device 20 (for example, a robot described later); in the estimation phase, it outputs command values estimated based on the learning result to the operation target device 20, causing the operation target device 20 to perform appropriate operations.
  • Hereinafter, an object to be controlled by the information processing apparatus 100, such as the operation target device 20, is simply referred to as a controlled object.
  • The operator 10 operates the command device 11 to input an instruction INS in order to give the operation target device 20 a command value commanding an appropriate action.
  • The command device 11 outputs a command value Q corresponding to the input instruction INS to the operation target device 20 and the model creation unit 12.
  • The operation target device 20 performs an operation according to the command value Q, and outputs the state quantity P to the model creation unit 12.
  • The model creation unit 12 acquires the command value Q given to the operation target device 20 and the state quantity P of the operation target device 20 at a certain time. Since the command value Q and the state quantity P are acquired sequentially, the model creation unit 12 can obtain time-series data TSD including a plurality of data elements, each consisting of a set of the command value Q and the state quantity P together with the timing at which the set was acquired.
  • Hereinafter, the command value Q given to the operation target device 20 and the state quantity P of the operation target device 20, which are acquired in advance for model creation, are also referred to as the first command value and the first state quantity, respectively.
  • The model creation unit 12 appropriately processes the time-series data to create learning data, learns the learning data to generate a model MD, and outputs the model MD to the model holding unit 1.
  • The model MD is held by the model holding unit 1.
  • The learning data includes a plurality of data elements, each including the first command value and the first state quantity acquired in advance.
  • The model MD created by the model creation unit 12 is a learned model in which the distribution structure of the time-series data is learned as a set of nodes described by multidimensional vectors containing at least state quantities and command values as elements. Hereinafter, the state quantity and the command value included in the multidimensional vector describing a node of the model MD are also referred to as the second state quantity and the second command value, respectively.
  • In the estimation phase, the search unit 2 reads the model MD from the model holding unit 1 as appropriate.
  • The search unit 2 determines one node N_NEAR from the nodes included in the model MD based on the input data for the model MD, i.e., the state quantity p acquired from the operation target device 20.
  • Hereinafter, the state quantity acquired from the operation target device 20 in the estimation phase is also referred to as a third state quantity.
  • The output unit 3 reads the command value corresponding to the node N_NEAR and outputs the read command value as the command value Q to the operation target device 20.
  • Here, one node N_NEAR is determined from the nodes included in the model MD, but a plurality of nodes may be determined from the nodes included in the model MD.
  • In that case, the command value to be output may be calculated from the command values corresponding to the determined plurality of nodes using, for example, a statistical method.
  • Next, the time-series data TSD composed of the state quantities P and command values Q acquired by the model creation unit 12, the learning data DAT created based on the time-series data TSD, and the model MD generated by the model creation unit 12 learning the learning data DAT will be described.
  • The model creation unit 12 reads the learning data DAT, performs clustering, and learns, as nodes, the learning data DAT with the clustering information added.
  • The learning data DAT takes the form of a set of multidimensional vectors obtained by appropriately processing the time-series data, which includes information indicating the state quantity P, the command value Q, and the timing; the nodes of the model consist of multidimensional vectors of the same form.
  • FIG. 5 shows the outline of the robot used in the experiment.
  • The robot 30 has a robot arm 31, and a cylindrical pole 33 extends from the tip of the robot arm 31.
  • A force sensor 32 is provided to detect force.
  • When an external force acts on the pole 33, the force sensor 32 can detect that external force.
  • The robot arm 31 can move three-dimensionally in the horizontal plane (x and y directions) and in the height direction (z direction) by means of a multi-joint drive mechanism (not shown) or the like driving the horizontally extending beam 34.
  • In this experiment, the motion of the robot arm 31 was taught by the operator moving the robot arm 31 using the command device 11.
  • The teaching of the motion of the robot arm 31 is not limited to this example; the operator may teach the motion while holding the robot arm 31 by hand, or may otherwise operate the robot arm 31 to teach the motion.
  • FIG. 6 shows examples of motions that the robot arm 31 learns.
  • In this example, the robot arm 31 contacts the obstacle 40 and moves in the horizontal plane (xy plane) while avoiding the obstacle 40.
  • FIG. 7 schematically shows the forces acting in each state when the robot arm 31 avoids the obstacle 40.
  • The operations and acting forces in each state are described below.
  • FIG. 8 shows the format of the time-series data TSD.
  • Each data element of the time-series data TSD is arranged in the vertical direction, and each row corresponds to one sampled data element.
  • Each data element contains four areas DZ1-DZ4.
  • The first area DZ1 indicates the sampling number i of the data element.
  • The number of samples is n (n is an integer of 1 or more); therefore, i is an integer of 1 or more and n or less.
  • The second area DZ2 indicates the time (timing) t(i) at which each data element is sampled.
  • The third area DZ3 includes values relating to information about the controlled object, such as the external forces (f_x(i), f_y(i)), the velocities (v_x(i), v_y(i)), and the positions (x(i), y(i)).
  • The fourth area DZ4 contains the forces (F_x(i), F_y(i)) applied to the robot arm 31 by the operation, as the command value to be given to the robot arm 31; the quantities included in this area DZ4 correspond to the command value.
  • Although force is used as the command value here, this is merely an example; other values such as velocity or position may be used as the command value.
  • Here, the sampling number i is used to indicate to which data element the force, velocity, and position belong, but this is merely an example.
  • In this embodiment, the forces and velocities included in the third area DZ3 are used as the state quantity P(i) included in the learning data DAT, as in the following equation: P(i) = (f_x(i), f_y(i), v_x(i), v_y(i)).
  • The state quantity P is not limited to this; if necessary, it may include values of the area DZ3 for elements before i, or some or all of the values of the areas DZ1, DZ2, and DZ4.
  • As the command value Q, a command value a predetermined time h in the future is used relative to the command value included in the fourth area DZ4.
  • The data element of the learning data DAT is therefore represented by the following formula: D_i = (P(i), Q(i+h)).
  • In the estimation phase, the state quantity is acquired from the operation target device, and the command value estimated by the search unit based on the acquired state quantity is given to the operation target device, thereby operating it appropriately. It is therefore conceivable that there is a certain time lag between acquiring the state quantity and giving the command value.
  • In other words, the state quantity acquired at a certain time corresponds to a command value that lies in the future by this time lag. Therefore, in the present embodiment, to reflect this time lag, the command value h samples in the future is associated with each state quantity.
  • Since the time lag may be small enough to be ignored, h may be any value of 0 or more, that is, an integer of 0 or more.
  • The learning data DAT is then represented by the following formula: DAT = {D_1, D_2, ..., D_n}.
  • For the command values Q(n+1) to Q(n+h), which come after the last sampled command value Q(n), the value Q(n) was used.
  • However, the future command values Q(n+1) to Q(n+h) are not limited to this; other values may be used as appropriate so that the operation is favorable.
  • Alternatively, data elements D_i whose command values do not originally exist in the time-series data TSD, such as those referring to the future command values Q(n+1) to Q(n+h), may simply be excluded from the learning data DAT.
  • Here, the number of data elements of the learning data DAT is the same as the number of samples n, but it is not limited to this.
  • For example, a thinned-out subset {D_1, D_3, D_5, ...} may be used as the learning data DAT, or new data may be created to interpolate between D_i and D_(i+1) and added to the learning data DAT. A sketch of this construction of the learning data is given below.
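The following is a minimal sketch of the learning-data construction described above, assuming the state quantities and command values are held as NumPy arrays; the function name and array layouts are illustrative, not from the patent.

```python
import numpy as np

def build_learning_data(state, command, h):
    """Pair each state quantity P(i) with the command value Q(i+h)
    observed h samples later; the last h entries reuse the final
    command value Q(n), as described above."""
    shifted = np.vstack([command[h:], np.repeat(command[-1:], h, axis=0)])
    # Data element D_i = (P(i), Q(i+h)) stored as one concatenated row.
    return np.hstack([state, shifted])

# state: (n, 4) rows (f_x, f_y, v_x, v_y); command: (n, 2) rows (F_x, F_y)
state = np.random.rand(500, 4)
command = np.random.rand(500, 2)
DAT = build_learning_data(state, command, h=3)
```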
  • The model creation unit 12 performs clustering, which is unsupervised learning, on the learning data DAT, and creates the clustering result as a model.
  • In this embodiment, the k-means method is used as the clustering method.
  • However, the clustering method is not limited to this, and various other clustering methods such as DBSCAN (density-based spatial clustering of applications with noise) may be used.
  • Here, the Euclidean distance is used as the distance index, but other distance indexes may be used as appropriate.
  • FIG. 9 shows an example in which the motion of the robot arm 31 is learned.
  • In FIG. 9, the two-dimensional plane of f_x and f_y is extracted from the state quantities of the learning data and plotted for easy visualization. The regions separated by dashed lines are the respective cluster regions; in this example, the data are divided into four regions.
  • The area CL1 corresponds to the state before collision with the obstacle (state 1), the area CL2 to the state of moving leftward after the collision (state 2), the area CL3 to the state of moving upward along the left side of the obstacle (state 3), and the area CL4 to the state of moving rightward past the obstacle (state 4).
  • The model creation unit 12 creates nodes by adding the clustering results to the learning data DAT.
  • FIG. 10 shows the nodes created by the model creation unit. As shown in FIG. 10, a node is created by adding one of CL1 to CL4, indicating the clustering result, to each data element of the learning data DAT. That is, the i-th node is represented by the following formula: N_i = (P(i), Q(i+h), cl_i). Here, cl_i is a cluster number and takes one of the values CL1 to CL4.
  • The model creation unit 12 outputs a model MD, which is a set of nodes corresponding to the number of elements of the learning data DAT (n in this embodiment) represented by the above equation.
  • The model MD is represented by the following formula: MD = {N_1, N_2, ..., N_n}.
  • Since the learning data are time-series data, the nodes obtained by learning also have a time-series order.
  • Each node can be associated with a quantity related to time, here the sampling order, so the nodes can be arranged in chronological order.
  • The method of giving the nodes a chronological order is not limited to this; the order may instead be expressed by including a time-related quantity (that is, the time) as an element of the multidimensional vector representing each node.
  • In this example, the four clusters CL1 to CL4 are obtained in the chronological order CL1, CL2, CL3, CL4. Model creation along these lines is sketched below.
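As an illustration of the model creation just described, the following sketch clusters the learning data with k-means and attaches a chronologically numbered cluster label to each data element. Whether clustering uses the full data element or only its state-quantity part, and all names, are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def create_model(DAT, n_clusters=4):
    """Cluster the learning data (k-means, Euclidean distance) and append
    the cluster number cl_i to each data element, yielding the node set
    MD = {N_1, ..., N_n}. Clusters are renumbered so that their order
    follows the chronological order CL1, CL2, ... described above."""
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(DAT)
    labels = km.labels_
    # Rank clusters by the mean sampling index of their members.
    order = np.argsort([np.mean(np.where(labels == c)[0]) for c in range(n_clusters)])
    rank = np.empty(n_clusters, dtype=int)
    rank[order] = np.arange(n_clusters)
    nodes = np.hstack([DAT, rank[labels].reshape(-1, 1)])   # N_i = (P, Q, cl_i)
    centroids = km.cluster_centers_[order]                  # chronological order
    return nodes, centroids
```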
  • FIG. 11 shows a flowchart of processing in the information processing apparatus 100 according to the first embodiment.
  • The operation of the information processing apparatus 100 consists of steps SA1 to SA5 below.
  • Step SA1: The search unit 2 reads the model MD from the model holding unit 1.
  • Step SA2: The search unit 2 acquires the state quantity p, which is the input data.
  • Step SA3: The search unit 2 determines to which cluster of the model MD the state quantity p, which is the input data, belongs.
  • Hereinafter, the cluster to which the state quantity p belongs is referred to as the target cluster C_TRG.
  • In this embodiment, the distance between the state quantity p and the center of gravity of each cluster is calculated, and the cluster whose center of gravity is at the shortest distance is taken as the target cluster C_TRG.
  • However, the method of determining the target cluster C_TRG is not limited to this, and other determination methods may be used as appropriate.
  • Step SA4: The search unit 2 searches for the node N_NEAR having the state quantity closest to the state quantity p, which is the input data, from among the nodes included in the target cluster C_TRG and the cluster C_NEXT temporally immediately after the target cluster C_TRG, and outputs the search result to the output unit 3.
  • In this embodiment, the Euclidean distance is used as the distance index for searching for nodes.
  • Hereinafter, the node N_NEAR having the state quantity closest to the state quantity p, which is the input data, is simply referred to as the nearest node N_NEAR.
  • Instead of a single node, a predetermined number of nodes may be searched in order of proximity to the state quantity p, or a plurality of nodes within a predetermined distance from the state quantity p may be searched.
  • The clusters to be searched here are the target cluster C_TRG and the immediately following cluster C_NEXT, but the search range may be narrowed further; for example, only the half of a cluster's nodes that are earlier in time may be targeted.
  • Also, the Euclidean distance is used here as the distance index for searching for nodes, but other distance indexes may be used as appropriate.
  • Step SA5: The output unit 3 outputs an output command value q_OUT determined based on the nearest node N_NEAR.
  • For example, the output unit 3 may output the command value q_NEAR held by the nearest node N_NEAR as the output command value q_OUT (first output command value determination method).
  • However, the method by which the output unit 3 determines the output command value q_OUT is not limited to this; it may be determined, for example, as follows.
  • The output unit 3 may output the average of the command values of all nodes in the cluster to which the nearest node N_NEAR belongs as the output command value q_OUT (second output command value determination method).
  • The output unit 3 may output the average of the command values of some of the nodes of the cluster to which the nearest node N_NEAR belongs as the output command value q_OUT (third output command value determination method).
  • For example, the average of the command values of nodes selected by various selection methods, such as nodes within a predetermined distance from the nearest node N_NEAR or a predetermined number of nodes selected in ascending order of distance from the nearest node N_NEAR, may be output as the output command value q_OUT.
  • Further, when a plurality of nodes are searched in step SA4, the average of the command values of the plurality of nodes may be output as the output command value q_OUT (fourth output command value determination method).
  • Although the average value is used in the second to fourth output command value determination methods, this is merely an example; in other words, a value determined based on a statistic may be calculated as the command value.
  • With the processing described above, the nodes for which the distance to the input data p is calculated can be limited to the nodes included in a restricted group of clusters rather than all nodes. This makes it possible to greatly reduce the number of nodes subject to distance calculation compared to general methods such as the k-nearest neighbor method, thereby speeding up the process of finding the output value corresponding to the input data p. Steps SA3 to SA5 are sketched in code below.
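A minimal sketch of steps SA3 to SA5, continuing the create_model sketch above; it assumes nodes whose last column is the chronological cluster number, centroids listed in chronological order, and the first output command value determination method.

```python
import numpy as np

def estimate_command(p, nodes, centroids, state_dim):
    """SA3: pick the target cluster C_TRG by centroid distance.
    SA4: search only C_TRG and the temporally next cluster C_NEXT
    for the nearest node N_NEAR. SA5: output its command value q_OUT."""
    labels = nodes[:, -1].astype(int)
    c_trg = int(np.argmin(np.linalg.norm(centroids[:, :state_dim] - p, axis=1)))
    c_next = min(c_trg + 1, int(labels.max()))
    mask = (labels == c_trg) | (labels == c_next)   # limited distance calculation
    cand = nodes[mask]
    i = int(np.argmin(np.linalg.norm(cand[:, :state_dim] - p, axis=1)))
    return cand[i, state_dim:-1]                    # q_OUT = q_NEAR

# Example: nodes, centroids = create_model(DAT)
#          q = estimate_command(p, nodes, centroids, state_dim=4)
```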
  • FIG. 12 shows a flowchart of a modified example of processing in the information processing apparatus 100 according to the first embodiment.
  • In this modified example, the target cluster C_TRG determined based on the previous input data p is used to search for the nearest node N_NEAR.
  • Specifically, steps SA11 to SA14 are added to steps SA1 to SA5 of FIG. 11. Since steps SA1 to SA5 are the same as in FIG. 11, only the added steps SA11 to SA14 are explained.
  • Step SA11 is a step inserted at the start of the processing.
  • In step SA11, the search unit 2 sets the value of the initial flag FG to 0 in order to indicate that the operation is in the initial state, that is, that the input data is being input for the first time.
  • The initial flag is not limited to this example, and may be data of any format as long as it can indicate that the operation is in the initial state.
  • Step SA12 is a step inserted between steps SA2 and SA3.
  • In step SA12, the search unit 2 determines whether the operation is in the initial state, that is, whether the value of the initial flag FG is 0. If the value of the initial flag FG is 0 (initial state), the process proceeds to step SA3; if it is not 0 (not the initial state), the process proceeds to step SA4.
  • Step SA13 is a step inserted between steps SA3 and SA4.
  • In step SA13, the search unit 2 switches the value of the initial flag FG to 1 to indicate that the operation is no longer in the initial state.
  • Step SA14: The search unit 2 determines the target cluster C_TRG for the next round of processing based on the cluster to which the nearest node N_NEAR acquired in step SA4 belongs.
  • In this embodiment, the cluster of the nearest node N_NEAR detected in the current round is determined as the target cluster C_TRG for the next round.
  • However, the method of determining the target cluster C_TRG for the next round is not limited to this.
  • For example, when a plurality of nodes are searched in step SA4, the most frequent cluster among the clusters to which those nodes belong may be set as the target cluster C_TRG for the next round. For instance, if five nodes are searched in step SA4 and their clusters are {3, 3, 4, 4, 4}, the target cluster may be cluster 4.
  • In this way, the cluster to which the nearest node N_NEAR corresponding to the previously input data p belongs can be set as the target cluster C_TRG used for the next input data p.
  • Since the state of the controlled object changes continuously, the distance between temporally adjacent input data is also expected to be short.
  • Hence, the input data p input in the next round is also likely to be at a short distance from the nearest node N_NEAR of the previous round, and the target cluster C_TRG used in the next round is likely to be the same cluster as the target cluster C_TRG of the previous round. Therefore, in this modification, by using the cluster to which the previous round's nearest node N_NEAR belongs as the next round's target cluster C_TRG, the target cluster C_TRG for the second and subsequent rounds can be determined by simple processing. Only the first target cluster C_TRG needs to be determined by calculating the distance between the center of gravity of each cluster and the input data, as in step SA3 (FIG. 11).
  • In other words, the determination of the target cluster C_TRG for the second and subsequent rounds is performed by simple processing that does not require distance calculations against the center of gravity of each cluster, so faster processing can be realized.
  • Although the first target cluster C_TRG is determined here by calculating the distance between the center of gravity of each cluster and the input data, the temporally first cluster (CL1 in this embodiment) may instead be set as the first target cluster C_TRG. This modified flow is sketched below.
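A sketch of the modified flow with the initial flag FG, reusing the layout assumptions of the previous sketches; the class and attribute names are illustrative.

```python
import numpy as np

class ClusterTrackingSearcher:
    """Modified flow of FIG. 12: after the first input, the cluster of
    the previous nearest node N_NEAR is reused as the next target
    cluster C_TRG, skipping the per-cluster centroid search (SA3)."""
    def __init__(self, nodes, centroids, state_dim):
        self.nodes, self.centroids, self.state_dim = nodes, centroids, state_dim
        self.fg = 0        # initial flag FG set to 0 (step SA11)
        self.c_trg = None

    def step(self, p):
        if self.fg == 0:   # SA12: initial state -> SA3 centroid search
            d = np.linalg.norm(self.centroids[:, :self.state_dim] - p, axis=1)
            self.c_trg = int(np.argmin(d))
            self.fg = 1    # SA13: leave the initial state
        labels = self.nodes[:, -1].astype(int)
        c_next = min(self.c_trg + 1, int(labels.max()))
        cand = self.nodes[(labels == self.c_trg) | (labels == c_next)]
        i = int(np.argmin(np.linalg.norm(cand[:, :self.state_dim] - p, axis=1)))
        self.c_trg = int(cand[i, -1])   # SA14: cache N_NEAR's cluster
        return cand[i, self.state_dim:-1]
```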
  • FIG. 13 shows the nodes subject to distance calculation in the processing by the information processing apparatus 100 according to the first embodiment and in the processing by the k-nearest neighbor method.
  • In the information processing apparatus 100, as described above, only the nodes belonging to the two clusters CL2 and CL3 are subject to distance calculation for the input data p.
  • In contrast, in the k-nearest neighbor method, all nodes included in the model MD are subject to distance calculation for the input data p.
  • For simplicity of explanation, a model including four clusters to each of which dozens of nodes belong has been described as an example, but this is merely an example.
  • FIG. 14 schematically shows the configuration of an information processing apparatus 200 according to the second embodiment.
  • FIG. 15 schematically shows a situation in which the information processing apparatus 200 according to the second embodiment is used.
  • The information processing apparatus 200 has a configuration obtained by adding a model creation unit 4 to the information processing apparatus 100 and replacing the search unit 2 with a search unit 5.
  • The model creation unit 4 reads the learning data DAT, which is time-series data, and performs learning by the SOINN method to create the model MD.
  • The created model MD is output to the model holding unit 1.
  • The learning in the model creation unit 4 will be described later.
  • Using the model MD, the search unit 5 refers to the state quantity of the separately input data to be estimated (hereinafter referred to as input data), searches for an approximating node from among the nodes included in the model MD, and outputs the search result to the output unit 3.
  • The model holding unit 1 and the output unit 3 are the same as in the first embodiment, so their descriptions are omitted.
  • FIG. 16 shows a flowchart of operations in the information processing apparatus 200 according to the second embodiment.
  • Step SB1: The model creation unit 4 creates the model MD by learning, according to the SOINN method, the learning data DAT obtained by receiving, holding, and processing the state quantities P and the command values Q.
  • In the SOINN method, by inputting data described by multidimensional vectors, nodes representing the input data are generated, and the model MD is obtained as a network composed of the generated nodes. The learning processing of the SOINN method is described below.
  • The model creation unit 4 uses the SOINN method to create a model consisting of a neural network with a structure of at least one layer in which nodes described by n-dimensional vectors are arranged.
  • The neural network constituting the model created by the model creation unit 4 is a self-propagating neural network in which input vectors are fed into the network and the number of nodes arranged in the network is automatically increased based on the input vectors; here it has a one-layer structure.
  • Since the number of nodes can be increased automatically using the self-propagating neural network, additional online learning can be performed by sequentially inputting input vectors.
  • FIG. 17 schematically shows the configuration of the model creation unit 4 according to the second embodiment.
  • The model creation unit 4 has input information acquisition means 41, winner node search means 42, similarity threshold calculation means 43, similarity threshold determination means 44, node insertion means 45, weight vector update means 46, node density calculation means 47, distribution overlapping area detection means 48, edge connection determination means 49, edge connection means 50, edge deletion means 51, noise node deletion means 52, and output information display means 53.
  • The input information acquisition means 41 acquires an n-dimensional input vector as information given as input to the model creation unit 4. The acquired input vectors are stored in a temporary storage unit (for example, the RAM 1003) and sequentially input to the neural network stored in the temporary storage unit.
  • The winner node search means 42 searches, among the nodes stored in the temporary storage unit, for the node having the weight vector closest to the input vector as the first winner node, and for the node having the second-closest weight vector as the second winner node, and stores the results in the temporary storage unit. That is, for an n-dimensional input vector ξ, the nodes satisfying the following equations are searched for as the first winner node a1 and the second winner node a2, respectively, and the results are stored in the temporary storage unit: a1 = argmin_{a ∈ A} ||ξ − W_a||, a2 = argmin_{a ∈ A\{a1}} ||ξ − W_a||.
  • Here, a is a node included in the node set A stored in the temporary storage unit, and W_a is the weight vector of node a stored in the temporary storage unit.
  • For the node of interest among the nodes stored in the temporary storage unit, if a node directly connected to the node of interest by an edge (hereinafter referred to as an adjacent node) exists, the similarity threshold calculation means 43 calculates, as the similarity threshold, the distance to the adjacent node at the maximum distance from the node of interest and stores the result in the temporary storage unit; if no adjacent node exists, it calculates, as the similarity threshold, the distance to the node at the minimum distance from the node of interest and stores the result in the temporary storage unit.
  • Specifically, the similarity threshold of the node of interest is calculated as follows, and the result is stored in the temporary storage unit.
  • First, the similarity threshold calculation means 43 sets the similarity threshold T_i of a newly inserted node i to +∞ (a sufficiently large value), and stores the result in the temporary storage unit.
  • When node i has adjacent nodes, the similarity threshold T_i is set to the maximum distance to its adjacent nodes, and the result is stored in the temporary storage unit. That is, for node i, the similarity threshold T_i is calculated based on the following formula, and the result is stored in the temporary storage unit: T_i = max_{c ∈ N_i} ||W_i − W_c||.
  • Here, c is a node included in the adjacent node set N_i of node i stored in the temporary storage unit, and W_c is the weight vector of node c stored in the temporary storage unit.
  • For the input vector and the nodes and similarity thresholds stored in the temporary storage unit, the similarity threshold determination means 44 determines whether the distance between the input vector and the first winner node is greater than the similarity threshold of the first winner node, and whether the distance between the input vector and the second winner node is greater than the similarity threshold of the second winner node, and stores the results in the temporary storage unit. That is, as in the following expressions, it is determined whether ||ξ − W_a1|| > T_a1 and whether ||ξ − W_a2|| > T_a2, and the results are stored in the temporary storage unit. This winner search and threshold test is sketched in code below.
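The following is a minimal sketch of the winner-node search and similarity-threshold test just described, with nodes held as a weight matrix and edges as a set of index pairs; all names and data structures are assumptions, and at least two nodes are assumed to exist.

```python
import numpy as np

def winners_and_threshold_test(xi, W, edges):
    """Find the first and second winner nodes a1, a2 for input xi and
    test whether xi lies outside their similarity thresholds (max
    distance to adjacent nodes if any, else min distance to any node)."""
    d = np.linalg.norm(W - xi, axis=1)
    a1, a2 = np.argsort(d)[:2]

    def threshold(i):
        nbrs = [j for j in range(len(W)) if (i, j) in edges or (j, i) in edges]
        if nbrs:    # T_i = max distance to adjacent nodes
            return max(np.linalg.norm(W[i] - W[j]) for j in nbrs)
        return min(np.linalg.norm(W[i] - W[j]) for j in range(len(W)) if j != i)

    # A new node is inserted at xi when either test below holds.
    return a1, a2, d[a1] > threshold(a1) or d[a2] > threshold(a2)
```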
  • Based on the determination result of the similarity threshold determination means 44 stored in the temporary storage unit, the node insertion means 45 inserts a new node at the same position as the input vector, and stores the result in the temporary storage unit.
  • The weight vector update means 46 updates the weight vectors of the nodes stored in the temporary storage unit so that the weight vector of the first winner node and the weight vectors of the adjacent nodes of the first winner node move closer to the input vector, and stores the results in the temporary storage unit.
  • The update amount ΔW_a1 of the weight vector of the first winner node a1 and the update amount ΔW_ai of the weight vector of each adjacent node i of the first winner node a1 are calculated, for example, based on the following formulas, and the results are stored in the temporary storage unit: ΔW_a1 = ε1(t)(ξ − W_a1), ΔW_ai = ε2(t)(ξ − W_ai).
  • Here, ε1(t) and ε2(t) are calculated based on the following formulas, and the results are stored in the temporary storage unit: ε1(t) = 1/t, ε2(t) = 1/(100t).
  • As the parameter t, the accumulated winning count M_a1 of the first winner node a1 is used. A sketch of this update is given below.
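A short sketch of the weight update under the learning rates above; ε1(t) = 1/t and ε2(t) = 1/(100t) with t = M_a1 follow the common SOINN choice and are assumptions here.

```python
import numpy as np

def update_weights(xi, W, M, a1, neighbors_of_a1):
    """Move the first winner a1 and its adjacent nodes toward xi:
    dW_a1 = (1/t)(xi - W_a1), dW_ai = (1/(100 t))(xi - W_ai),
    where t is the accumulated winning count M_a1."""
    M[a1] += 1
    t = M[a1]
    W[a1] += (xi - W[a1]) / t
    for i in neighbors_of_a1:
        W[i] += (xi - W[i]) / (100.0 * t)
```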
  • The node density calculation means 47 calculates the node density of a node of interest based on the average distance between the node of interest and its adjacent nodes, and stores the result in the temporary storage unit. Furthermore, to support additional learning, the node density calculation means 47 has a unit node density calculation section that calculates the node density of the first winner node as a ratio per unit number of inputs, based on the average distance between the first winner node and its adjacent nodes, and stores the result in the temporary storage unit.
  • Specifically, the node density calculation means 47 has a node density point calculation section that calculates the node density point value of the first winner node based on the average distance between the first winner node and its adjacent nodes and accumulates the node density points in the temporary storage unit until the number of input vector inputs reaches a predetermined unit number of inputs, and a unit node density point calculation section that, when the number of input vector inputs reaches the predetermined unit number, calculates the accumulated node density points stored in the temporary storage unit as a ratio per unit number of inputs, thereby calculating the node density of the node per unit number of inputs, and stores the result in the temporary storage unit.
  • The node density point calculation section calculates the node density point value pt_i given to node i based, for example, on the following formula, and stores the result in the temporary storage unit: pt_i = 1/(1 + e_i)^2.
  • The point value pt_i calculated by the above formula is given to node i when node i becomes the first winner node; no points are given to node i when it is not the first winner node.
  • Here, e_i denotes the average distance from node i to its adjacent nodes, and is calculated based on the following formula, with the result stored in the temporary storage unit: e_i = (1/m) Σ_{c ∈ N_i} ||W_i − W_c||.
  • Here, m denotes the number of adjacent nodes of node i stored in the temporary storage unit, and W_i denotes the weight vector of node i stored in the temporary storage unit.
  • The node density point calculation method is configured as described above so that high points are given when the first winner node lies in an area with many nodes, and low points are given when it lies in an area with few nodes.
  • As a result, the density of nodes in a certain surrounding area including the node can be estimated, so that, compared with the conventional approach in which the number of times a node becomes the first winner is taken as its density, node density points whose density is closer to the input distribution density of the input vectors can be calculated, even for nodes located in areas with a high node distribution.
  • the unit node density point calculation unit calculates the node density density i per unit input number of the node i based on the following formula stored in the temporary storage unit, for example, and stores the result in the temporary storage unit.
  • The number of inputs of the continuously given input vectors is divided into sections each having a preset number of inputs λ stored in the temporary storage unit, and the points given to node i within each section are accumulated as a point sum s_i.
  • When the total number of inputs of the input vectors, preset and stored in the temporary storage unit, is LT, the total number of sections is n = LT/λ, and the result is stored in the temporary storage unit.
  • The number of sections in which the sum of the given points is greater than 0 is calculated as N, and the result is stored in the temporary storage unit (note that N and n are not necessarily the same).
  • The accumulated points s_i are calculated based on, for example, the following formula stored in the temporary storage unit, and the result is stored in the temporary storage unit.
  • Here, pt_i(j, k) indicates the point given to node i by the k-th input in the j-th section; it is calculated by the node density point calculation unit described above, and the result is stored in the temporary storage unit.
  • The unit node density point calculation unit calculates the density density_i of node i stored in the temporary storage unit as the average of the accumulated points s_i, and stores the result in the temporary storage unit.
  • N is used instead of n in order to handle additional learning. This avoids the problem that, in additional learning, nodes generated in earlier learning are often given no points, so that calculating the density with n would make the density of previously learned nodes progressively lower. That is, by calculating the node density using N instead of n, the node density of a previously learned node is held unchanged during long additional learning, unless additional data is input near that node. As a result, even when additional learning is performed for a long time, the node density of previously learned nodes is prevented from becoming relatively small.
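As a hedged reconstruction, again following the standard E-SOINN formulation, the accumulation and normalization described above can be written as:

$$ s_i = \sum_{j=1}^{n} \sum_{k=1}^{\lambda} pt_i(j, k), \qquad \mathrm{density}_i = \frac{s_i}{N} $$

where pt_i(j, k) is the point given to node i by the k-th input of the j-th section, n = LT/λ is the total number of sections, and N is the number of sections in which node i actually received points, as defined above.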
  • The distribution overlapping area detection means 48 calculates clusters, each of which is a set of nodes connected by the edges, for the nodes, the edges connecting the nodes, and the node densities stored in the temporary storage unit, divides the clusters into sub-clusters, which are subsets of the clusters, based on the node densities calculated by the node density calculation means 47, and stores the results in the temporary storage unit.
  • The distribution overlapping area detection means 48 has, for the nodes, the edges connecting the nodes, and the node densities stored in the temporary storage unit: a node search unit that searches for nodes whose node density, calculated by the node density calculation means 47, is a local maximum; a first labeling unit that gives each searched node a label different from the labels already given to other nodes; a second labeling unit that gives, to each node not labeled by the first labeling unit, the same label as that of the node labeled by the first labeling unit to which it is connected by an edge; a cluster dividing unit that, where nodes with different labels are directly connected, divides a cluster, which is a set of nodes connected by edges, into sub-clusters, which are subsets of the cluster; and a distribution overlapping region detection unit that, when a node of interest and its adjacent node belong to different sub-clusters, detects the region containing the node of interest and its adjacent node as a distribution overlapping region, which is the boundary of the sub-clusters.
  • For example, for the nodes, the edges connecting the nodes, and the node densities stored in the temporary storage unit, the overlapping region of the distributions, which is the boundary of the sub-clusters, is detected as follows, and the result is stored in the temporary storage unit.
  • (Procedure M_B1) The node search unit searches, among the nodes stored in the temporary storage unit, for nodes whose node density, calculated by the node density calculation means 47, is a local maximum, and stores the result in the temporary storage unit.
  • (Procedure M_B2) The first labeling unit gives each node searched in procedure M_B1 a label different from the labels already given to other nodes, for the nodes and node labels stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • (Procedure M_B3) The second labeling unit gives, for the nodes, the edges connecting the nodes, and the node labels stored in the temporary storage unit, each node not labeled by the first labeling unit the same label as that of the node labeled by the first labeling unit to which it is connected by an edge, and stores the result in the temporary storage unit. That is, each node is given the same label as that of the adjacent node with the locally highest density.
  • (Procedure M_B4) The cluster dividing unit divides each cluster, which is a set of nodes connected by the edges stored in the temporary storage unit, into sub-clusters, which are subsets of the cluster each consisting of nodes given the same label, for the nodes, the edges connecting the nodes, and the node labels stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • (Procedure M_B5) The distribution overlapping area detection unit, for the nodes, the edges connecting the nodes, and the node labels stored in the temporary storage unit, detects, when a node of interest and its adjacent node belong to different sub-clusters, the region containing the node of interest and its adjacent node as a distribution overlapping region, which is the boundary of the sub-clusters, and stores the result in the temporary storage unit.
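A hedged sketch of procedures M_B1 to M_B5 in Python follows. The graph representation (dicts of densities and adjacency sets) and the label-propagation order are illustrative assumptions; the patent only specifies the behavior in prose.

```python
from collections import deque

def divide_into_subclusters(nodes, adjacency, density):
    """nodes: iterable of node ids; adjacency: {node: set of neighbors};
    density: {node: float}. Returns a label per node and the node pairs
    lying in distribution overlap regions."""
    labels, next_label = {}, 0
    # M_B1/M_B2: give each local density maximum a fresh label.
    for n in nodes:
        if all(density[n] >= density[m] for m in adjacency[n]):
            labels[n] = next_label
            next_label += 1
    # M_B3: propagate labels outward from high-density nodes along edges.
    frontier = deque(sorted(labels, key=lambda n: -density[n]))
    while frontier:
        n = frontier.popleft()
        for m in adjacency[n]:
            if m not in labels:
                labels[m] = labels[n]
                frontier.append(m)
    # M_B4/M_B5: edges joining different labels mark sub-cluster
    # boundaries, i.e. distribution overlap regions.
    overlap = {(n, m) for n in nodes for m in adjacency[n]
               if labels.get(n) != labels.get(m)}
    return labels, overlap
```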
  • The edge connection determination means 49 determines, for the nodes, node densities, and distribution overlap regions stored in the temporary storage unit, whether or not an edge is to be connected between the first winner node and the second winner node, based on the node densities of the first winner node and the second winner node, and stores the result in the temporary storage unit.
  • The edge connection determination means 49 includes a belonging sub-cluster determination unit that determines, for the nodes, node densities, and sub-clusters of the nodes stored in the temporary storage unit, the sub-cluster to which each node belongs, and an edge connection determination unit that determines whether or not an edge is to be connected between the first winner node and the second winner node, based on the densities of the vertices of the sub-clusters to which the nodes belong and the densities of the nodes.
  • Based on the determination result of the edge connection determination means 49 stored in the temporary storage unit, the edge connection unit 50 connects an edge between the first winner node and the second winner node, for the nodes and the edges between the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • Based on the determination result of the edge connection determination means 49 stored in the temporary storage unit, the edge deletion unit 51 deletes the edge between the first winner node and the second winner node, for the nodes and the edges between the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • The edge connection determination means 49 determines whether or not an edge is to be connected, for example, in the following manner; then, the edge connection unit 50 and the edge deletion unit 51 execute edge generation and deletion processing, and store the result in the temporary storage unit.
  • The belonging sub-cluster determination unit determines the sub-clusters to which the first winner node and the second winner node belong, for the nodes and the sub-clusters of the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • The edge connection determination unit determines, for the node densities and the edges between nodes stored in the temporary storage unit, whether or not to connect an edge between the first winner node and the second winner node, based on the density of the vertex of the sub-cluster to which each node belongs and the densities of the nodes, and stores the result in the temporary storage unit.
  • First, the edge connection determination unit calculates the minimum node density m of the first winner node density density_win and the second winner node density density_sec-win, for example based on the following formula stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • Next, the density A_max of the vertex of the sub-cluster A to which the first winner node belongs and the density B_max of the vertex of the sub-cluster B to which the second winner node belongs are calculated, and the results are stored in the temporary storage unit.
  • Among the nodes belonging to a sub-cluster, the maximum node density is defined as the density of the vertex of that sub-cluster.
  • Then, it is determined whether or not m is smaller than α_A·A_max and m is smaller than α_B·B_max, and the result is stored in the temporary storage unit. That is, it is determined whether or not the following inequalities stored in the temporary storage unit are satisfied, and the result is stored in the temporary storage unit. When, as a result of the determination, m is smaller than α_A·A_max and m is smaller than α_B·B_max, the first winner node and the second winner node are regarded as belonging to different distributions, and no edge is connected between them (an already generated edge is deleted) for the nodes and the edges between the nodes stored in the temporary storage unit; otherwise, an edge is connected between the first winner node and the second winner node.
  • In this determination, the magnitude of the unevenness in the node density in the region containing the first winner node and the second winner node is evaluated. That is, when the node density m in the valley of the distribution existing between sub-cluster A and sub-cluster B is larger than the threshold value α_A·A_max or α_B·B_max, it can be determined that the unevenness in the node density is small, and the two sub-clusters are treated as one continuous distribution.
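Written out under the definitions given in this text, the determination above is:

$$ m = \min(\mathrm{density}_{\mathrm{win}},\ \mathrm{density}_{\mathrm{sec\text{-}win}}) $$

and the edge between the first winner node and the second winner node is withheld (or deleted) when

$$ m < \alpha_A A_{\max} \quad \text{and} \quad m < \alpha_B B_{\max}, $$

that is, when the valley between the two sub-clusters is deep enough to treat them as separate distributions; otherwise the edge is connected.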
  • Here, mean_A indicates the average value of the node densities density_i of the nodes i belonging to sub-cluster A, and N_A is the number of nodes belonging to sub-cluster A; mean_A is calculated based on the following formula stored in the temporary storage unit, and the result is stored in the temporary storage unit.
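The formula for the coefficients α_A and α_B is not reproduced in this text. In the E-SOINN literature that this processing appears to follow, the mean density and the coefficient are typically computed as below; treat the case thresholds as an assumption rather than the patent's own values.

$$ \mathrm{mean}_A = \frac{1}{N_A} \sum_{i \in A} \mathrm{density}_i, \qquad \alpha_A = \begin{cases} 0, & \text{if } 2\,\mathrm{mean}_A \ge A_{\max} \\ 0.5, & \text{if } 3\,\mathrm{mean}_A \ge A_{\max} > 2\,\mathrm{mean}_A \\ 1, & \text{if } A_{\max} > 3\,\mathrm{mean}_A \end{cases} $$

with α_B defined correspondingly for sub-cluster B.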
  • The noise node deletion means 52 deletes a node of interest regarded as noise, based on the node density calculated by the node density calculation means 47 and the number of adjacent nodes of the node of interest, and stores the result in the temporary storage unit.
  • The noise node deletion means 52 has, for the nodes, node densities, edges between nodes, and adjacent nodes stored in the temporary storage unit: a node density comparison unit that compares the node density of a node of interest with a predetermined threshold; an adjacent node number calculation unit that calculates the number of adjacent nodes of the node of interest; and a noise node deletion unit that regards the node of interest as a noise node and deletes it. Specifically, for example, it deletes nodes regarded as noise as follows, based on the node density and the number of adjacent nodes of the node of interest.
  • First, the adjacent node number calculation unit of the noise node deletion means 52 calculates the number of adjacent nodes of the node of interest i, for the nodes, edges between nodes, and adjacent nodes stored in the temporary storage unit, and stores the result in the temporary storage unit. Then, according to the number of adjacent nodes stored in the temporary storage unit, the following processing is performed. (i) When the number of adjacent nodes stored in the temporary storage unit is 2, the node density comparison unit compares the node density density_i of node i with a threshold value calculated, for example, based on the following formula stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • If the node density density_i is smaller than the threshold, the noise node deletion unit deletes the node stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • (ii) When the number of adjacent nodes stored in the temporary storage unit is 1, the node density comparison unit compares the node density density_i of node i with a threshold value calculated, for example, based on the following formula stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • If the node density density_i is smaller than the threshold, the noise node deletion unit deletes the node stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • (iii) When the number of adjacent nodes stored in the temporary storage unit is 0, the noise node deletion unit deletes the node stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • By adjusting the predetermined parameters c1 and c2, which are set in advance and stored in the temporary storage unit, the noise node deletion behavior of the noise node deletion means 52 can be adjusted.
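A hedged sketch of the processing of the noise node deletion means 52 follows. The thresholds c1 × mean density and c2 × mean density follow the usual E-SOINN formulation; since the patent only references "the following formula", the exact expressions are assumptions.

```python
def delete_noise_nodes(nodes, adjacency, density, c1, c2):
    """Return the surviving node set after noise deletion.
    adjacency: {node: set of neighbors}; density: {node: float}."""
    mean_density = sum(density[n] for n in nodes) / len(nodes)
    kept = set()
    for n in nodes:
        k = len(adjacency[n])              # number of adjacent nodes
        if k == 2 and density[n] < c1 * mean_density:
            continue                       # case (i): likely noise
        if k == 1 and density[n] < c2 * mean_density:
            continue                       # case (ii): likely noise
        if k == 0:
            continue                       # case (iii): isolated node
        kept.add(n)
    return kept
```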
  • the output information display means 53 outputs information including the nodes stored in the temporary storage unit, that is, the model MD.
  • FIG. 18 shows a flowchart of learning processing by the SOINN method.
  • Step M1 The input information acquisition means 41 acquires two input vectors at random, initializes the node set A as a set containing only the two nodes corresponding to them, and stores the result in the temporary storage unit. Also, the edge set C ⊆ A × A is initialized as an empty set, and the result is stored in the temporary storage unit.
  • Step M2 The input information acquisition means 41 inputs a new input vector ξ randomly selected from the learning data DAT, and stores the result in the temporary storage unit. An input vector that has once been selected is, of course, not selected again.
  • Step M3 The winner node search means 42 selects, for the input vector and the nodes stored in the temporary storage unit, the first winner node a1 having the weight vector closest to the input vector ξ and the second winner node a2 having the second closest weight vector, and stores the result in the temporary storage unit.
  • Step M4 The similarity threshold determination means 44 determines, for the input vector, the nodes, and the similarity thresholds of the nodes stored in the temporary storage unit, whether the distance between the input vector ξ and the first winner node a1 is greater than the similarity threshold T1 of the first winner node a1, and whether the distance between the input vector ξ and the second winner node a2 is greater than the similarity threshold T2 of the second winner node a2, and stores the result in the temporary storage unit.
  • Here, the similarity threshold T1 of the first winner node a1 and the similarity threshold T2 of the second winner node a2 stored in the temporary storage unit are calculated by the similarity threshold calculation means 43, and the results are stored in the temporary storage unit.
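The threshold calculation itself is not reproduced here. The standard SOINN definition, which this text appears to follow, sets a node's similarity threshold to the distance to its farthest adjacent node when it has adjacent nodes, and to the distance to its nearest other node when it has none; treat this as an assumption rather than the patent's own formula.

$$ T_i = \begin{cases} \max_{j \in N_i} \lVert W_i - W_j \rVert, & \text{if } N_i \neq \emptyset \\ \min_{j \neq i} \lVert W_i - W_j \rVert, & \text{if } N_i = \emptyset \end{cases} $$

where N_i is the set of adjacent nodes of node i.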
  • Step M5 As a result of the determination in step M4 stored in the temporary storage unit, if the distance between the input vector ξ and the first winner node a1 is greater than the similarity threshold T1 of the first winner node a1, or the distance between the input vector ξ and the second winner node a2 is greater than the similarity threshold T2 of the second winner node a2, the node insertion means 45 inserts, for the input vector and the nodes stored in the temporary storage unit, a new node i at the same position as the input vector ξ, and stores the result in the temporary storage unit.
  • Step M6 On the other hand, as a result of the determination in step M4 stored in the temporary storage unit, if the distance between the input vector ξ and the first winner node a1 is equal to or less than the similarity threshold T1 of the first winner node a1, and the distance between the input vector ξ and the second winner node a2 is equal to or less than the similarity threshold T2 of the second winner node a2, the edge connection determination means 49 determines, for the nodes, node densities, and edges between nodes stored in the temporary storage unit, whether or not to connect an edge between the first winner node a1 and the second winner node a2, based on the node densities of the first winner node a1 and the second winner node a2, and stores the result in the temporary storage unit.
  • Step M7 As a result of the determination in step M6 stored in the temporary storage unit, if an edge is to be generated and connected between the first winner node a1 and the second winner node a2, the edge connection unit 50 connects an edge between the first winner node and the second winner node, for the nodes and the edges between the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit. Then, the information processing device sets the age of the newly generated edge to 0 and, if an edge has already been generated between the nodes, resets the age of that edge to 0, for the edges and the edge ages stored in the temporary storage unit.
  • On the other hand, if the result of the determination in step M6 stored in the temporary storage unit is that no edge is to be connected between the first winner node a1 and the second winner node a2, the process advances to step M8. At this time, if an edge has already been generated between them, the edge deletion means 51 deletes the edge between the first winner node a1 and the second winner node a2, for the nodes and the edges between the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • In other words, the edge connection determination means 49, the edge connection unit 50, and the edge deletion unit 51 carry out the processing shown in procedures M_C1 to M_C5 described above.
  • Further, the node density calculation means 47 calculates the node density point value of the first winner node a1 stored in the temporary storage unit, stores the result in the temporary storage unit, and accumulates it as node density points by adding the calculated point value to the previously calculated point value stored in the temporary storage unit, storing the result in the temporary storage unit.
  • Also, the information processing device increments (increases by 1) the cumulative number M_a1 of times the node a1 has become the first winner node, stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • Step M8 The weight vector update means 46 updates, for the nodes and the node weight vectors stored in the temporary storage unit, the weight vector of the first winner node a1 and the weight vectors of the adjacent nodes of the first winner node a1 so as to move closer to the input vector ξ, and stores the result in the temporary storage unit.
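The update amounts are not reproduced in this text. A common SOINN choice, given here as an assumption, moves the first winner node and its adjacent nodes toward the input by amounts that shrink as the node wins more often:

$$ \Delta W_{a_1} = \frac{1}{M_{a_1}} (\xi - W_{a_1}), \qquad \Delta W_j = \frac{1}{100\, M_{a_1}} (\xi - W_j) \quad (j \in N_{a_1}) $$

where M_{a_1} is the cumulative number of times node a_1 has become the first winner node and N_{a_1} is its set of adjacent nodes.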
  • Step M9 The information processing device deletes, from among the edges stored in the temporary storage unit, edges whose age exceeds the preset threshold age_t stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • The threshold age_t is used to delete edges that are erroneously generated under the influence of noise or the like. By setting age_t to a small value, such edges are easily deleted and the influence of noise can be suppressed; on the other hand, if age_t is set to an extremely large value, edges generated due to noise cannot be properly removed. Taking these factors into account, the parameter age_t is determined in advance by experiment and stored in the temporary storage unit.
  • Step M10 The information processing device determines whether the total number of input vectors ξ given so far, stored in the temporary storage unit, is a multiple of the preset value λ stored in the temporary storage unit, and stores the result in the temporary storage unit. If, as a result of the determination stored in the temporary storage unit, the total number of input vectors is not a multiple of λ, the process returns to step M2 to process the next input vector ξ. On the other hand, when the total number of input vectors ξ is a multiple of λ, the following processing is executed.
  • The parameter λ is the period for deleting nodes considered to be noise. Setting λ to a small value allows noise processing to be performed frequently, but an extremely small value causes nodes that are not actually noise to be deleted erroneously. On the other hand, if λ is set to an extremely large value, nodes generated due to noise cannot be properly removed. Considering these factors, the parameter λ is determined in advance by experiment and stored in the temporary storage unit.
  • Step M11 The distribution overlapping region detection means 48 detects the overlapping region of the distributions, which is the boundary of the sub-clusters, as shown in the above procedures M_B1 to M_B5, for the sub-clusters and distribution overlapping regions stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • Step M12 The node density calculation means 47 calculates the accumulated node density points stored in the temporary storage unit as a ratio per unit input number, thereby calculating the node density of each node per unit input number, and stores the results in the temporary storage unit.
  • Step M13 The noise node deletion means 52 deletes the nodes regarded as noise nodes from the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
  • The parameters c1 and c2 used by the noise node deletion means 52 in step M13 determine whether or not a node is regarded as noise. A value close to 0 is used for c1, because a node having two adjacent nodes is usually not noise. Also, since a node having one adjacent node is often noise, a value close to 1 is used for c2. These parameters are set in advance and stored in the temporary storage unit.
  • Step M14 The information processing device determines whether or not the total number of input vectors ξ given so far, stored in the temporary storage unit, has reached the preset value LT stored in the temporary storage unit, and stores the result in the temporary storage unit. If, as a result of the determination stored in the temporary storage unit, the total number of input vectors has not reached LT, the process returns to step M2 to process the next input vector ξ. On the other hand, when the total number of input vectors ξ reaches LT, learning is stopped.
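Pulling steps M1 to M14 together, the following is a heavily simplified, hedged sketch of the loop structure in pure Python. The density-based edge determination of step M6 and the periodic processing of steps M11 to M13 are omitted for brevity, all helper logic is inlined, and none of the identifiers below come from the patent itself.

```python
import math
import random

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighbors(i, edges):
    # adjacent node indices of node i in the undirected edge set
    return [q if p == i else p for (p, q) in edges if i in (p, q)]

def soinn_learn(data, age_t=50):
    data = [list(v) for v in data]
    random.shuffle(data)                      # M2: inputs drawn at random
    nodes = [data[0], data[1]]                # M1: two initial nodes
    edges, ages = set(), {}
    wins = [1, 1]                             # first-winner counts M_a1
    for xi in data[2:]:
        order = sorted(range(len(nodes)), key=lambda i: dist(xi, nodes[i]))
        a1, a2 = order[0], order[1]           # M3: first/second winner
        def threshold(i):                     # similarity threshold T_i
            nbrs = neighbors(i, edges)
            pool = nbrs or [j for j in range(len(nodes)) if j != i]
            agg = max if nbrs else min
            return agg(dist(nodes[i], nodes[j]) for j in pool)
        if (dist(xi, nodes[a1]) > threshold(a1)
                or dist(xi, nodes[a2]) > threshold(a2)):
            nodes.append(list(xi))            # M4-M5: insert a new node
            wins.append(1)
            continue
        for e in edges:                       # age edges at the winner
            if a1 in e:
                ages[e] += 1
        e = (min(a1, a2), max(a1, a2))        # M6-M7: density test omitted
        edges.add(e)
        ages[e] = 0
        wins[a1] += 1
        for k in range(len(xi)):              # M8: pull winner toward input
            nodes[a1][k] += (xi[k] - nodes[a1][k]) / wins[a1]
        for j in neighbors(a1, edges):        # ...and its neighbors, weakly
            for k in range(len(xi)):
                nodes[j][k] += (xi[k] - nodes[j][k]) / (100 * wins[a1])
        for old in [e2 for e2 in edges if ages[e2] > age_t]:
            edges.discard(old)                # M9: delete over-age edges
            ages.pop(old)
    return nodes, edges                       # stop after all inputs (M14)
```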
  • FIG. 19 shows an example of node distribution obtained by learning the learning data DAT used in the first embodiment by the SOINN method in the information processing apparatus 200 according to the second embodiment.
  • In FIG. 19, the two-dimensional plane of fx and fy is extracted from the state quantities of the learning data and plotted.
  • the learning results of the data elements included in the learning data are represented by representative nodes generated by the SOINN method. That is, the number of nodes included in the model MD is smaller than the number of data elements included in the learning data DAT.
  • For the SOINN method, the methods described in Patent Documents 1 and 2 can be applied, and various SOINN variants may be applied as long as they learn the distribution structure of the data elements contained in the learning data as a distribution structure represented by a smaller number of nodes than the data elements.
  • Next, the processing from step SB2 onward in the information processing device 200 will be described.
  • Step SB2 The search unit 5 reads the model MD from the model holding unit 1, as in step SA1 of FIG. 11.
  • Step SB3 The search unit 5 acquires the state quantity p, which is the input data, in the same manner as in step SA2 of FIG. 11.
  • Step SB4 The search unit 5 searches the model MD for the node N_NEAR closest to the input data p, as in step SA4 of FIG. 11.
  • Step SB5 The output unit 3 operates in the same manner as in the first embodiment, as in step SA5 of FIG. 11. That is, the output unit 3 outputs the output command value q_OUT determined based on the nearest node N_NEAR. As in the first embodiment, the output command value q_OUT may be determined using various methods, including the first to fourth output command value determination methods.
  • In the information processing apparatus 200, the number of nodes for which the distance to the input data p is calculated is smaller than the number of data elements of the learning data. Hence, compared with a general k-nearest neighbor method that uses as many nodes as the data elements of the learning data, the number of nodes subjected to the distance calculation can be significantly reduced, and the process of obtaining the command value corresponding to the input data p can be sped up.
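As a minimal illustration of steps SB2 to SB5, the following sketch searches the learned model MD for the node nearest to the input data p and returns its command value. The node layout (a state vector paired with a command value) and the Euclidean metric are assumptions for illustration, not the patent's own data structures.

```python
import math

def output_command(model_md, p):
    """model_md: list of (state_vector, command_value) nodes;
    p: state quantity acquired from the controlled object (step SB3)."""
    def d(state):                              # SB4: distance to input p
        return math.sqrt(sum((s - x) ** 2 for s, x in zip(state, p)))
    _, q_out = min(model_md, key=lambda node: d(node[0]))
    return q_out                               # SB5: output command value
```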
  • the present invention is not limited to the above-described embodiments, and can be modified as appropriate without departing from the scope of the invention.
  • In the above description, the magnitude determination of two values has been described, but this is merely an example; the case where the two values are equal may be handled as necessary. That is, as between determining whether the first value is greater than or equal to the second value or less than the second value, and determining whether the first value is greater than the second value or less than or equal to the second value, either may be adopted as necessary. Likewise, as between determining whether the first value is less than the second value or greater than or equal to the second value, and determining whether the first value is less than or equal to the second value or greater than the second value, either may be adopted. In other words, when the magnitude of two values is determined to obtain two determination results, the case where the two values are equal may be included in either of the two determination results as required.
  • In the above embodiments, the present invention has been described mainly as a hardware configuration, but the invention is not limited to this, and arbitrary processing can also be realized by causing a CPU (Central Processing Unit) to execute a computer program. The program includes instructions (or software code) that, when loaded into a computer, cause the computer to perform one or more of the functions described in the embodiments above.
  • the program may be stored in a non-transitory computer-readable medium or tangible storage medium.
  • Non-limiting examples of non-transitory computer-readable media or tangible storage media include random-access memory (RAM), read-only memory (ROM), flash memory, SSD (solid-state drive) or other memory technology, CD-ROM (compact disc read-only memory), DVD (digital versatile disc), Blu-ray disc or other optical disc storage, and magnetic cassette, magnetic tape, magnetic disk storage or other magnetic storage devices.
  • the program may be transmitted by a transitory computer-readable medium or communication medium.
  • Non-limiting examples of transitory computer-readable media or communication media include electrical, optical, acoustic, or other forms of propagated signals.
  • In the above-described embodiments, the nodes obtained by learning the learning data DAT may be clustered; needless to say, time-series clustering may be used for this purpose.
  • (Appendix 1) An information processing device comprising: a model holding unit that holds a model created, by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector containing a first state quantity obtained in advance from a learning target and a first command value corresponding to the first state quantity given to control the motion of the learning target, so as to represent the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector containing a second state quantity and a second command value based on the learning result; a search unit to which a third state quantity obtained from a controlled object is input as input data and which searches for a node matching or approximating the input data from among the nodes, whose number is smaller than the number of data elements of the learning data included in the model; and an output unit that outputs a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
  • (Appendix 2) The information processing apparatus according to Appendix 1, wherein the model is created by learning each item of the learning data as a node, and the search unit selects, from the nodes, nodes temporally relatively close to the input data as the small number of nodes, and searches the selected nodes for the node closest to the input data.
  • (Appendix 3) The information processing apparatus according to Appendix 2, wherein the model is created by time-series clustering the learning data and classifying the nodes into a plurality of clusters, and the search unit searches for the cluster to which the input data belongs and the cluster temporally immediately after that cluster, and searches the nodes belonging to the two searched clusters for the node closest to the input data.
  • (Appendix 4) The information processing apparatus according to Appendix 3, wherein the search unit searches for the cluster whose center of gravity is closest to the input data as the cluster to which the input data belongs.
  • (Appendix 5) The information processing apparatus according to Appendix 3, wherein the search unit regards the cluster whose center of gravity is closest to the first input data as the cluster to which the first input data belongs, and regards the cluster to which the node with the shortest distance searched for the immediately preceding input data belongs as the cluster to which the input data input after the first input data belongs.
  • (Appendix 6) The information processing apparatus according to Appendix 3, wherein the search unit regards the cluster whose center of gravity is closest to the first input data as the cluster to which the first input data belongs.
  • (Appendix 7) The information processing apparatus according to any one of Appendices 4 to 6, wherein the search unit determines the output command value based on a statistic calculated from the command values of some or all of the nodes of the cluster to which the searched node belongs.
  • (Appendix 8) The information processing apparatus according to Appendix 7, wherein the some nodes are nodes within a predetermined distance from the searched node, or a predetermined number of nodes selected in order of proximity to the searched node, in the cluster to which the searched node belongs.
  • (Appendix 9) The information processing apparatus according to any one of Appendices 4 to 6, wherein the search unit determines the output command value based on a statistic calculated from the command value of the searched node and the command values of one or more nodes, in the cluster to which the input data belongs and the cluster temporally immediately after it, that approximate the searched node.
  • (Appendix 10) The information processing apparatus according to Appendix 9, wherein the search unit regards the cluster whose center of gravity is closest to the first input data as the cluster to which the first input data belongs, and searches, as the cluster to which input data input after the first input data belongs, for the cluster with the highest appearance frequency among the clusters to which the node searched for the immediately preceding input data and the one or more nodes similar to that node belong.
  • (Appendix 12) The information processing apparatus according to any one of Appendices 2 to 4 and 11, wherein the search unit outputs the command value of the searched node as the output command value.
  • (Appendix 13) The information processing apparatus according to any one of Appendices 2 to 4 and 11, wherein the search unit determines the output command value based on a statistic calculated from the command value of the searched node and the command values of one or more nodes similar to the searched node.
  • (Appendix 14) The information processing apparatus according to any one of Appendices 9, 10, and 13, wherein the one or more nodes similar to the searched node are nodes within a predetermined distance from the searched node, or a predetermined number of nodes selected in order of proximity to the searched node.
  • (Appendix 15) The information processing apparatus according to any one of Appendices 7 to 10, 13, and 14, wherein the statistic is one of an average value, a median value, a maximum value, a minimum value, and a mode.
  • (Appendix 16) The information processing apparatus according to any one of Appendices 1 to 15, wherein the second state quantity is the first state quantity obtained in advance from the learning target, and the second command value is the first command value given to the learning target according to the first state quantity obtained in advance.
  • (Appendix 17) The information processing apparatus according to Appendix 16, wherein the first command value is an actual value of the command value given to the learning target by an operator of the learning target based on the first state quantity obtained in advance from the learning target.
  • (Appendix 19) The information processing apparatus according to any one of Appendices 1 to 17, wherein the information processing apparatus is provided in the learning target or the controlled object.
  • (Appendix 20) The information processing apparatus according to any one of Appendices 1 to 17, wherein the learning target and the controlled object are the same.
  • (Appendix 21) The information processing apparatus according to Appendix 20, wherein the information processing apparatus is provided in the controlled object.
  • (Appendix 22) An information processing method comprising: holding a model created, by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector containing a first state quantity obtained in advance from a learning target and a first command value corresponding to the first state quantity given to control the motion of the learning target, so as to represent the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector containing a second state quantity and a second command value based on the learning result; receiving, as input data, a third state quantity obtained from a controlled object; searching for a node matching or approximating the input data from among the nodes, whose number is smaller than the number of data elements of the learning data included in the model; and outputting a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
  • (Appendix 23) An information processing device comprising: a model creation unit that creates, by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector containing a first state quantity obtained in advance from a learning target and a first command value corresponding to the first state quantity given to control the motion of the learning target, a model representing the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector containing a second state quantity and a second command value based on the learning result; a model holding unit that holds the created model; a search unit to which a third state quantity obtained from a controlled object is input as input data and which searches for a node matching or approximating the input data from among the nodes, whose number is smaller than the number of data elements of the learning data included in the model; and an output unit that outputs a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.

Abstract

A model holding unit (1) holds a model trained by using training data based on time-series data including a plurality of data elements each expressed by a multidimensional vector including a first state quantity acquired in advance from a training subject and a first command value that corresponds to the first state quantity and is given to control the operation of the training subject. The model is created to represent training data distribution structure as a set of nodes each expressed by a multidimensional vector including a second state quantity and second command value based on the training result. A retrieval unit (2) receives input of a third state quantity acquired from a control subject as input data and retrieves a node that matches or is similar to the input data from among a smaller number of nodes than the number of data elements of the training data included in the model. An output unit (3) outputs a value based on the second command value of the retrieved node to the control subject as an output command value to be given to operate the control subject.

Description

Information processing device, information processing method, and non-transitory computer-readable medium in which a program is stored
 The present invention relates to an information processing device, an information processing method, and a program.
 Today, various data processing methods are in use, and the data to be processed take various forms; some, for example, have values arranged systematically. A representative example of such systematic data is time-series data: a set of values obtained by observing a certain phenomenon continuously or intermittently, that is, a set of values indicating how the phenomenon changes over time.
 Time-series clustering is known as a method for analyzing such time-series data. Three time-series clustering approaches are generally known: whole time-series clustering, subsequence clustering, and time point clustering (Non-Patent Documents 1 and 2). In whole time-series clustering, clustering is performed by measuring the similarity between sets of time-series data (time-series sets). In subsequence clustering, one time-series data (time-series set) is divided into a plurality of segments, and clustering is performed for each segment. In time point clustering, one time-series data (time-series set) is divided into points, and clustering is performed by measuring the similarity between the points.
Patent Document 1: JP 2008-217246 A; Patent Document 2: JP 2014-164396 A
 For example, a method of acquiring the state quantity of a certain system and giving a command value according to the state quantity is widely used to control the behavior of the system. In such a control method, when the control processing is performed automatically, a suitable command value is estimated based on the state quantity of the system acquired at an arbitrary time, and the estimated command value is given to the system. In this case, learning data, which is time-series data consisting of, for example, a previously observed history of human operations (command values) and system state quantities, is learned by machine learning, and command values are estimated based on the learning result.
 For example, the k-nearest neighbor method is known for estimating the command value corresponding to the state quantity of the input data. The k-nearest neighbor method uses the learning data as it is and outputs an operation value from the k pieces of history data closest to the input data. The output value can be, for example, the average of the command values after a predetermined time has elapsed in each of the k pieces of history data closest to the input data. Since the k-nearest neighbor method performs no training compared with other supervised learning methods, it does not require a huge amount of data, and since it operates within the range of the provided data, it is used in various fields. However, the k-nearest neighbor method calculates the distance between the input data and every data element of the learning data, so the amount of calculation is large and the calculation takes a long time.
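As a point of comparison, the following is a minimal sketch of the baseline k-nearest neighbor estimation described above: the command value is taken as the average of the command values of the k history records closest to the input state quantity. The data layout is an assumption for illustration.

```python
import math

def knn_command(history, p, k=5):
    """history: list of (state_vector, command_value) records;
    p: input state quantity."""
    def d(state):
        return math.sqrt(sum((s - x) ** 2 for s, x in zip(state, p)))
    nearest = sorted(history, key=lambda rec: d(rec[0]))[:k]
    return sum(cmd for _, cmd in nearest) / k  # average of k command values
```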
 This poses no problem in a system that has a time margin between acquiring the state quantity of the controlled system and giving the command value; however, in a system such as a vehicle, which must be given command values quickly in response to the acquired state quantities, the length of the calculation time hinders the quick giving of command values.
 Therefore, there is a demand to establish a method capable of outputting a command value according to the state quantity of the system to be controlled at higher speed, in accordance with the operating conditions of the system.
 The present invention has been made in view of the above circumstances, and an object thereof is to quickly give a system a command value according to the operating conditions and state quantities of the system to be controlled.
 An information processing device according to one aspect of the present invention includes: a model holding unit that holds a model created, by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector containing a first state quantity obtained in advance from a learning target and a first command value corresponding to the first state quantity given to control the motion of the learning target, so as to represent the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector containing a second state quantity and a second command value based on the learning result; a search unit to which a third state quantity obtained from a controlled object is input as input data and which searches for a node matching or approximating the input data from among the nodes, whose number is smaller than the number of data elements of the learning data included in the model; and an output unit that outputs a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object. As a result, a command value corresponding to the operating conditions and state quantities of the controlled object can be quickly given to the controlled object.
 The information processing device according to one aspect of the present invention is preferably such that the model is created by learning each item of the learning data as a node, and the search unit selects, from the nodes, nodes temporally relatively close to the input data as the small number of nodes, and searches the selected nodes for the node with the shortest distance to the input data. As a result, in quickly giving the controlled object a command value according to its operating conditions and state quantities, a model used for estimating the command value can be created, and the node with the shortest distance can be identified.
 The information processing device according to one aspect of the present invention is preferably such that the model is created by time-series clustering the learning data and classifying the nodes into a plurality of clusters, and the search unit searches for the cluster to which the input data belongs and the cluster temporally immediately after it, and searches the nodes belonging to the two searched clusters for the node with the shortest distance to the input data. As a result, in quickly giving the controlled object a command value according to its operating conditions and state quantities, a model used for estimating the command value can be created, and the node with the shortest distance can be identified concretely.
 The information processing device according to one aspect of the present invention is preferably such that the search unit determines the output command value based on a statistic calculated from the command values of some or all of the nodes of the cluster to which the searched node belongs. Thereby, an appropriate output command value can be determined as needed.
 The information processing device according to one aspect of the present invention preferably further includes a model creation unit that creates the model by approximately learning the distribution structure of the data elements of the learning data with a smaller number of nodes than the data elements and outputs the created model to the model holding unit, and the search unit searches the nodes included in the model for a node approximating the input data. As a result, in quickly giving the controlled object a command value according to its operating conditions and state quantities, a model used for estimating the command value can be created, and the nearest node can be identified.
 The information processing device according to one aspect of the present invention is preferably such that the search unit outputs the command value of the searched node as the output command value. As a result, in quickly giving the controlled object a command value according to its operating conditions and state quantities, a model used for estimating the command value can be created, and the node with the shortest distance can be identified.
 The information processing device according to one aspect of the present invention is preferably such that the search unit determines the output command value based on a statistic calculated from the command value of the searched node and the command values of one or more nodes approximating the searched node. Thereby, the output command value can be determined appropriately.
 The information processing device according to one aspect of the present invention is preferably such that the one or more nodes approximating the searched node are nodes within a predetermined distance from the searched node, or a predetermined number of nodes selected in order of proximity to the searched node. Thereby, the output command value can be determined as needed.
 The information processing device according to one aspect of the present invention is preferably such that the statistic is one of an average value, a median value, a maximum value, a minimum value, and a mode. Thereby, the output command value can be determined as needed.
 The information processing device according to one aspect of the present invention is preferably such that the second state quantity is the first state quantity obtained in advance from the learning target, and the second command value is the first command value given to the learning target according to the first state quantity obtained in advance. Thereby, the model can be created based on appropriate data.
 The information processing device according to one aspect of the present invention is preferably such that the first command value is an actual value of the command value given to the learning target by an operator of the learning target based on the first state quantity obtained in advance from the learning target. Thereby, the model can be created based on appropriate data.
 The information processing device according to one aspect of the present invention is preferably such that the first command value included in a data element of the learning data is the command value given to the controlled object a predetermined time after the first state quantity obtained in advance from the learning target. Thereby, a model can be created that takes into account the time lag between the acquisition of the state quantity and the output of the command value.
 An information processing method according to one aspect of the present invention includes: holding a model created, by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector containing a first state quantity obtained in advance from a learning target and a first command value corresponding to the first state quantity given to control the motion of the learning target, so as to represent the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector containing a second state quantity and a second command value based on the learning result; receiving, as input data, a third state quantity obtained from a controlled object; searching for a node matching or approximating the input data from among the nodes, whose number is smaller than the number of data elements of the learning data included in the model; and outputting a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object. As a result, a command value corresponding to the operating conditions and state quantities of the controlled object can be quickly given to the controlled object.
 A program according to one aspect of the present invention causes a computer to execute: a process of holding a model created, by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector containing a first state quantity obtained in advance from a learning target and a first command value corresponding to the first state quantity given to control the motion of the learning target, so as to represent the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector containing a second state quantity and a second command value based on the learning result; a process of receiving, as input data, a third state quantity obtained from a controlled object and searching for a node matching or approximating the input data from among the nodes, whose number is smaller than the number of data elements of the learning data included in the model; and a process of outputting a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object. As a result, a command value corresponding to the operating conditions and state quantities of the controlled object can be quickly given to the controlled object.
 According to the present invention, a command value corresponding to the operating conditions and state quantities of the system to be controlled can be quickly given to the system.
FIG. 1 is a diagram illustrating an example of a system configuration for realizing the information processing apparatus according to the first embodiment.
FIG. 2 is a diagram showing the external configuration of the information processing apparatus according to the first embodiment.
FIG. 3 is a diagram schematically showing the configuration of the information processing apparatus according to the first embodiment.
FIG. 4 is a diagram schematically showing a situation in which the information processing apparatus according to the first embodiment is used.
FIG. 5 is a diagram showing an overview of the robot used in the experiment.
FIG. 6 is a diagram showing an example of motions learned by the robot arm.
FIG. 7 is a diagram schematically showing the forces acting in each state when the robot arm avoids an obstacle.
FIG. 8 is a diagram showing the format of the time-series data.
FIG. 9 is a diagram showing an example of learning the motion of the robot arm.
FIG. 10 is a diagram showing nodes created by the model creation unit.
FIG. 11 is a flowchart of processing in the information processing apparatus according to the first embodiment.
FIG. 12 is a flowchart of a modified example of processing in the information processing apparatus according to the first embodiment.
FIG. 13 is a diagram showing the nodes subject to distance calculation in the processing by the information processing apparatus according to the first embodiment and in the k-nearest neighbor method.
FIG. 14 is a diagram schematically showing the configuration of the information processing apparatus according to the second embodiment.
FIG. 15 is a diagram schematically showing a situation in which the information processing apparatus according to the second embodiment is used.
FIG. 16 is a flowchart showing the operation of the information processing apparatus according to the second embodiment.
FIG. 17 is a diagram schematically showing the configuration of the model creation unit according to the second embodiment.
FIG. 18 is a flowchart of learning processing by the SOINN method.
FIG. 19 is a diagram showing an example of the node distribution obtained by learning, with the SOINN method, the learning data used in the first embodiment in the information processing apparatus according to the second embodiment.
 Embodiments of the present invention will be described below with reference to the drawings. In each drawing, the same elements are denoted by the same reference numerals, and redundant description is omitted as necessary.
 Embodiment 1
 FIG. 1 is a diagram showing an example of a system configuration for realizing the information processing apparatus according to the first embodiment. The information processing apparatus 100 can be implemented by a computer 1000 such as a dedicated computer or a personal computer (PC). The computer need not be physically single; a plurality of computers may be used when distributed processing is performed. As shown in FIG. 1, the computer 1000 has a CPU (Central Processing Unit) 1001, a ROM (Read Only Memory) 1002, and a RAM (Random Access Memory) 1003, which are interconnected via a bus 1004. Although a description of the OS software and the like for operating the computer is omitted, the computer constituting this information processing apparatus naturally has such software as well.
 An input/output interface 1005 is also connected to the bus 1004. To the input/output interface 1005 are connected, for example, an input unit 1006 including a keyboard, a mouse, and sensors; an output unit 1007 including a display such as a CRT or LCD and headphones or speakers; a storage unit 1008 including a hard disk or the like; and a communication unit 1009 including a modem, a terminal adapter, and the like.
 The CPU 1001 executes various kinds of processing according to various programs stored in the ROM 1002 or loaded from the storage unit 1008 into the RAM 1003; in the present embodiment, for example, the processing of each unit of the information processing apparatus 100 described later. A GPU (Graphics Processing Unit) may be provided separately from the CPU 1001 to perform processing similar to that of the CPU 1001. The GPU is suitable for performing routine processing in parallel, and applying it to the learning processing described later and the like can improve the processing speed compared with the CPU 1001. The RAM 1003 also stores, as appropriate, data necessary for the CPU 1001 and the GPU to execute the various kinds of processing.
 The communication unit 1009 performs, for example, communication processing via the Internet (not shown), transmits data provided by the CPU 1001, and outputs data received from a communication partner to the CPU 1001, the RAM 1003, and the storage unit 1008. The storage unit 1008 exchanges data with the CPU 1001 to save and erase information. The communication unit 1009 also performs communication processing of analog or digital signals with other devices.
 A drive 1010 is also connected to the input/output interface 1005 as necessary; a magnetic disk 1011, an optical disk 1012, a flexible disk 1013, a semiconductor memory 1014, or the like is mounted as appropriate, and a computer program read from it is installed in the storage unit 1008 as needed.
 The external configuration of the information processing apparatus 100 will now be described. FIG. 2 shows the external configuration of the information processing apparatus 100 according to the first embodiment. The information processing apparatus 100 has a processing unit 110, a display unit 120, and an input unit 130. The processing unit 110 is configured as hardware having the above-described CPU 1001, ROM 1002, RAM 1003, bus 1004, input/output interface 1005, storage unit 1008, communication unit 1009, drive 1010, and the like. The display unit 120 corresponds to the above-described output unit 1007 and is configured as a display device, such as an LCD, that displays images in a form visible to the operator. The input unit 130 corresponds to the above-described input unit 1006 and is composed of various input means such as a mouse and a keyboard.
 Next, the time-series data TSD, which is the input data to be learned by the information processing apparatus 100, will be described. The time-series data TSD is given as a set of multidimensional vectors representing data elements. Letting i be a parameter representing the order of the data elements, the multidimensional vector a_i representing a data element is defined as a vector containing the time t_i and m components p, as in the following equation, where i is an integer from 1 to the number of data elements described below, and m is an integer of 1 or more:

$$a_i = (t_i, p_1(i), p_2(i), \ldots, p_m(i)) \tag{1}$$

Therefore, when the number of data elements is n, the time-series data TSD is represented by the following formula:

$$TSD = \{a_1, a_2, \ldots, a_n\} \tag{2}$$
 The time-series data TSD is stored, for example, in a storage unit provided in the information processing apparatus 100 (for example, the RAM 1003 or the storage unit 1008).
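For concreteness, the following is a minimal Python sketch of one way such time-series data might be held in memory, with one row per data element a_i = (t_i, p_1(i), ..., p_m(i)); the array names and the random values are hypothetical and only mirror the notation above.

```python
import numpy as np

# A hypothetical in-memory layout of the time-series data TSD:
# one row per data element a_i = (t_i, p_1(i), ..., p_m(i)); here m = 3.
n, m = 5, 3
times = np.linspace(0.0, 0.4, n)            # sampling times t_1 ... t_n
components = np.random.rand(n, m)           # components p_1(i) ... p_m(i)
TSD = np.column_stack([times, components])  # shape (n, m + 1): one a_i per row

print(TSD.shape)  # (5, 4) -> n data elements, each an (m+1)-dimensional vector
```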
 Machine learning is applied in the information processing apparatus 100 described below; as a premise, an outline of machine learning is given first. The general machine learning described below is merely a premise for understanding the control system described in the following embodiments, and the machine learning applied to the control system is not limited to it.
 Machine learning is broadly divided into supervised learning and unsupervised learning. An outline of each approach is given below.
 In general, supervised learning performs learning for predicting a certain variable (objective variable) from given variables (explanatory variables). More specifically, supervised learning is a technique of giving correct data (the objective variable) for input data (the explanatory variables) and learning the relationship between the input data and the correct data.
 For example, if the correct data are continuous values, learning by regression analysis is performed. The method of learning continuous data is not limited to regression analysis (for example, linear regression). In regression analysis, fitting the input data with various functions makes it possible to predict the output corresponding to the input data.
 When the correct data for the input data are label information, learning by classification is performed. In learning by classification, techniques such as regression (logistic regression, support vector machines), trees (decision trees, random forests), neural networks, and clustering (the k-nearest neighbor method, etc.) are used.
 In unsupervised learning, the features of the input data are learned without correct data being given. Unsupervised learning methods include clustering, represented by the k-means method and the SOINN method; dimensionality reduction such as the PCA method; and anomaly detection such as Hotelling's T² method. For example, clustering makes it possible to extract and group input data having similar features.
 Here, the SOINN method is described. The SOINN method is a learning technique that grows neurons as needed during learning, known as a Self-Organizing Incremental Neural Network (SOINN). By autonomously managing the number of nodes, SOINN can learn non-stationary inputs and has many advantages, such as being able to extract an appropriate number of classes and an appropriate topological structure even for classes with complex distribution shapes. As an application example of SOINN, in pattern recognition, after a class of hiragana characters has been learned, a class of katakana characters can be learned additionally.
 As an example of such a SOINN, a technique called E-SOINN (Enhanced SOINN, Patent Document 1) has been proposed. E-SOINN is capable of online additional learning, in which learning is added at any time, and has the advantage of better learning efficiency than batch learning. For this reason, E-SOINN allows additional learning even when the learning environment changes to a new one. E-SOINN also has the advantage of high noise tolerance for the input data. Various other SOINN methods, such as LB-SOINN (Load Balance Self-Organizing Incremental Neural Network, Patent Document 2), have also been proposed.
 In the SOINN method, a neural network having a plurality of nodes is used. Specifically, the information processing apparatus 100 receives a non-hierarchical neural network in which nodes described by n-dimensional vectors (n is an integer of 1 or more) are arranged. The neural network is stored in a storage unit such as the RAM 1003.
 The neural network in the SOINN method is a self-propagating neural network: when input vectors are fed into it, the nodes and edges arranged in the network are automatically increased based on those input vectors. By using a self-propagating neural network, the number of nodes can thus be increased automatically.
 The neural network in the SOINN method has a non-hierarchical structure. Adopting a non-hierarchical structure allows additional learning to be performed without specifying the timing at which learning in other layers starts. That is, additional learning can be performed online.
 Clustering in the SOINN method refers to the nodes and edges and performs class classification. Various class classification techniques can be applied to classify the nodes constituting the network; for example, the same processing as in the LB-SOINN of Patent Document 2 may be performed.
 The configuration and operation of the information processing apparatus 100 according to the first embodiment will now be described. FIG. 3 schematically shows the configuration of the information processing apparatus 100 according to the first embodiment. The information processing apparatus 100 has a model holding unit 1, a search unit 2, and an output unit 3.
 The model holding unit 1 reads and holds a model MD created from learning data DAT, which is time-series data. Various storage means, such as the RAM 1003 and the storage unit 1008 in FIG. 1, can be used as the model holding unit 1.
 The search unit 2 is configured to be able to read the model MD from the model holding unit 1 as appropriate. The search unit 2 refers to the state quantity of separately input data to be estimated (hereinafter referred to as input data) and searches the nodes included in the loaded model MD for an approximating node. The search result is then output to the output unit 3.
 The output unit 3 outputs an output command value corresponding to the state quantity of the input data, determined based on the search result.
 Next, a situation in which the information processing apparatus 100 is used will be described. FIG. 4 schematically shows a situation in which the information processing apparatus 100 according to the first embodiment is used. The information processing apparatus 100 learns, in a learning phase, the operations that an operator 10 performs on an operation target device 20 (for example, a robot described later) and, in an estimation phase, outputs command values estimated based on the learning result to the operation target device 20, thereby causing the operation target device 20 to perform appropriate motions. In the following, an object controlled by the information processing apparatus 100, such as the operation target device 20, is also simply referred to as a controlled object.
 In the following, to distinguish the data handled in the learning phase from the data handled in the estimation phase, the data handled in the learning phase are written in upper case and the data handled in the estimation phase in lower case. The state quantity handled in the learning phase is denoted P and the command value Q. The state quantity handled in the estimation phase, that is, the input data, is denoted p, and the command value, that is, the output value, is denoted q.
 In the learning phase, the operator 10 operates a command device 11 to input an instruction INS in order to give the operation target device 20 a command value commanding an appropriate motion. The command device 11 outputs a command value Q corresponding to the input instruction INS to the operation target device 20 and to a model creation unit 12. The operation target device 20 performs a motion according to the command value Q and outputs its state quantity P to the model creation unit 12.
 The model creation unit 12 thus acquires, at a given time, the command value Q given to the operation target device 20 and the state quantity P of the operation target device 20. Since the command value Q and the state quantity P are acquired sequentially, the model creation unit 12 obtains time-series data TSD containing a plurality of data elements, each including a set of the command value Q and the state quantity P and the timing at which that set was acquired. In the following, the command value Q given to the operation target device 20 and the state quantity P of the operation target device 20, acquired in advance for model creation, are also referred to as the first command value and the first state quantity, respectively.
 The model creation unit 12 processes the time-series data as appropriate to create learning data, learns the learning data to generate a model MD, and outputs it to the model holding unit 1, where the model MD is held. The learning data thus contains a plurality of data elements each consisting of the first command value and the first state quantity acquired in advance. The model MD created by the model creation unit 12 is a model in which the distribution structure of the time-series data has been learned as a set of nodes described by multidimensional vectors containing at least a state quantity and a command value as elements. In the following, the state quantity and the command value contained in the multidimensional vector describing a node of the model MD are also referred to as the second state quantity and the second command value, respectively.
 In the estimation phase, the search unit 2 reads the model MD from the model holding unit 1 as appropriate. Based on the input data, that is, the state quantity p acquired from the operation target device 20, the search unit 2 determines one node N_NEAR from the nodes included in the model MD. In the following, the state quantity acquired from the operation target device 20 in the estimation phase is also referred to as the third state quantity. The output unit 3 reads the command value corresponding to the node N_NEAR and outputs the read command value as the command value q to the operation target device 20. In the present embodiment, one node N_NEAR is determined from the nodes included in the model MD, but a plurality of nodes may be determined instead; in that case, the command value to be output may be calculated from the command values corresponding to the determined nodes, for example by a statistical method.
 Next, as a premise of the processing performed by the information processing apparatus 100, the time-series data TSD composed of the state quantities P and the command values Q acquired by the model creation unit 12, and the model MD generated by the model creation unit 12 learning the learning data DAT created based on the time-series data TSD, will be described.
 The model creation unit 12 reads the learning data DAT, performs clustering, and learns the learning data DAT with clustering information added to it as nodes. As described below, the learning data DAT takes the form of a collection of multidimensional vectors obtained by appropriately processing time-series data containing the state quantity P, the command value Q, and information indicating the timing; each node consists of the same kind of multidimensional vector.
 As a concrete example, learning in an experiment using a robot will be described. FIG. 5 shows an overview of the robot used in the experiment. The robot 30 has a robot arm 31; a cylindrical pole 33 extends from the tip of the robot arm 31, and a force sensor 32 that detects forces acting on the cylindrical pole 33 is provided at the joint between the robot arm 31 and the cylindrical pole 33. Thus, when an external force is applied to the cylindrical pole 33, the force sensor 32 can detect it. The robot arm 31 can move in three dimensions, within the horizontal plane (the x and y directions) and in the height direction (the z direction), by driving a horizontally extending beam 34 with a multi-joint drive mechanism or the like (not shown).
 In this example, the motion of the robot arm 31 was learned by the operator moving the robot arm 31 using the command device 11. However, learning the motion of the robot arm 31 is not limited to this example: the operation may be learned while the operator holds the robot arm 31 by hand, or a remote operation system may be introduced and the operation learned by operating the robot arm 31 remotely.
 FIG. 6 shows an example of the motion that the robot arm 31 learns. In this example, the robot arm 31 moves in the horizontal (x-y) plane while in contact with an obstacle 40, avoiding the obstacle 40 as it goes.
 FIG. 7 schematically shows the forces acting in each state when the robot arm 31 avoids the obstacle 40. The motion and the acting forces in each state are described below.
[State 1]
 The robot arm 31 moves straight in the +y direction from its initial position and contacts the obstacle 40. In this case, the external force acting on the robot arm 31 is approximately zero until it contacts the obstacle 40 at the end of state 1.
[State 2]
 Upon contacting the obstacle 40, the robot arm 31 changes its direction of movement to the -x direction and continues moving while remaining pressed against the obstacle 40. In this case, a reaction force fy2 in the -y direction, due to being pressed against the obstacle 40, and a frictional force fx2 in the +x direction act on the robot arm 31.
[State 3]
 When the robot arm 31 reaches the corner of the obstacle 40 (the corner on the -x and -y sides), it changes its direction of movement to the +y direction and continues moving while pressed against the obstacle 40 in the +x direction. In this case, a reaction force fx3 in the -x direction, due to being pressed against the obstacle 40, and a frictional force fy3 in the -y direction act on the robot arm 31.
[State 4]
 When the robot arm 31 reaches the corner of the obstacle 40 (the corner on the -x and +y sides), it changes its direction of movement to the +x direction and continues moving while pressed against the obstacle 40 in the -y direction. In this case, a reaction force fy4 in the +y direction, due to being pressed against the obstacle 40, and a frictional force fx4 in the -x direction act on the robot arm 31.
 Next, the format of the time-series data in this experiment is described. FIG. 8 shows the format of the time-series data TSD. The data elements of the time-series data TSD are arranged vertically, each row corresponding to one sampled data element. Each data element contains four areas DZ1 to DZ4.
 The first area DZ1 indicates the sampling number i of the data element. Here, the number of samples is n (n is an integer of 1 or more), so i is an integer of 1 or more and n or less. The second area DZ2 indicates the time (timing) t(i) at which each data element was sampled.
 The third area DZ3 contains the external forces (fx(i), fy(i)), the velocities (vx(i), vy(i)), and the position (x(i), y(i)) acting on the robot arm 31; the values contained in this area DZ3 are values relating to information about the controlled object.
 The fourth area DZ4 contains, as the command value given to the robot arm 31, the force (Fx(i), Fy(i)) applied to the robot arm 31 by the operation; the quantities contained in this area DZ4 correspond to the command value. Although force is used as the command value here, this is merely an example, and other values such as velocity and position may be used as command values.
 Here, the sampling number i is used to indicate which data element the forces, velocities, and position belong to, but this is merely an example; the time t(i) may be used instead of the sampling number i.
 In this example, as shown in the following equation, the forces and velocities contained in the third area DZ3 are used as the state quantity P(i) contained in the learning data DAT:

$$P(i) = (f_x(i), f_y(i), v_x(i), v_y(i)) \tag{3}$$

The state quantity P is not limited to this; if necessary, it may include values contained in the area DZ3 before i, or some or all of the values of the areas DZ1, DZ2, and DZ4.
 Further, as shown in the following equation, the command value a predetermined time h in the future relative to the command value contained in the fourth area DZ4 is used as the command value Q:

$$Q(i) = (F_x(i+h), F_y(i+h)) \tag{4}$$

From the above, a data element of the learning data DAT is represented by the following formula:

$$D_i = (P(i), Q(i)) = (f_x(i), f_y(i), v_x(i), v_y(i), F_x(i+h), F_y(i+h)) \tag{5}$$

Here, the reason the value at i+h, h steps in the future, is used as the time of the command value Q is explained. In the present embodiment, the operation target device is operated appropriately by acquiring a state quantity from the operation target device and giving it a command value that the search unit estimates based on the acquired state quantity. A certain time lag can therefore be expected between acquiring the state quantity and giving the command value. In that case, it is desirable to associate a state quantity acquired at a certain time with the command value lying this time lag in the future. For this reason, in the present embodiment, the command value h steps in the future is associated with the state quantity to reflect this time lag. However, since the time lag may be small enough to be ignored, h may be any value of 0 or more, that is, an integer of 0 or more.
 The learning data DAT is represented by the following formula:

$$DAT = \{D_1, D_2, \ldots, D_n\} \tag{6}$$
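The construction of the learning data DAT from the state quantities and the time-shifted command values can be sketched as follows; the function name and array shapes are hypothetical, and the tail of the recording is padded with the last command Q(n) as described in the text below.

```python
import numpy as np

def build_learning_data(P, Q, h):
    """Pair each state quantity P(i) with the command value h steps in the
    future, Q(i+h); for samples past the end of the recording, the last
    command Q(n) is reused. P: (n, d_p) array, Q: (n, d_q) array."""
    n = len(P)
    idx = np.minimum(np.arange(n) + h, n - 1)  # index i+h, clipped to n
    return np.hstack([P, Q[idx]])              # D_i = (P(i), Q(i+h))

# Example with the robot-arm layout: P = (fx, fy, vx, vy), Q = (Fx, Fy)
P = np.random.rand(100, 4)
Q = np.random.rand(100, 2)
DAT = build_learning_data(P, Q, h=3)           # shape (100, 6)
```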
 In the present embodiment, in Equations (4) and (5), the command value Q(n), sampled last in time, is used for the future command values Q(n+1) to Q(n+h). However, the future command values Q(n+1) to Q(n+h) are not limited to this, and other values may be used as appropriate so that the motion becomes favorable. Data elements Di having command values not originally present in the time-series data TSD, such as the future command values Q(n+1) to Q(n+h), may also be excluded from the learning data DAT.
 In the present embodiment, the number of data elements of the learning data DAT is the same as the number of samples n, but this is not limiting. For example, only some of the elements, such as {D1, D3, D5, ...}, may be used as the learning data DAT, or new data may be created so as to interpolate between Di and Di+1 and added to the learning data DAT.
 Next, model creation in this example is described. The model creation unit 12 performs clustering, which is unsupervised learning, on the learning data DAT and generates the clustering result as a model. Here, the k-means method is used as the clustering technique. The clustering technique is not limited to this; various other clustering techniques, such as DBSCAN (density-based spatial clustering of applications with noise), may be used. In this example, the Euclidean distance is used as the distance measure, but other distance measures may be used as appropriate.
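A minimal sketch of this clustering step, assuming scikit-learn's KMeans as the k-means implementation (the text itself does not name a library) and four clusters as in the example below:

```python
import numpy as np
from sklearn.cluster import KMeans

# DAT as built above (hypothetical shape); four clusters as in FIG. 9.
DAT = np.random.rand(200, 6)
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(DAT)
labels = kmeans.labels_               # cluster number cl_i for each D_i
centroids = kmeans.cluster_centers_   # per-cluster centroids, used in step SA3
```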
 FIG. 9 shows an example in which the motion of the robot arm 31 has been learned. In FIG. 9, for ease of visualization, the two-dimensional fx-fy plane is extracted from the state quantities of the learning data and mapped. The regions separated by broken lines are the individual clustering regions; in this example there are four. As shown in FIGS. 6 and 7, region CL1 corresponds to the state before collision with the obstacle (state 1), region CL2 to the state of moving leftward after the collision (state 2), CL3 to the state of moving upward along the left side of the obstacle (state 3), and CL4 to the state of moving rightward along the obstacle (state 4).
 The model creation unit 12 creates nodes by attaching the clustering result to the learning data DAT. FIG. 10 shows the nodes created by the model creation unit. As shown in FIG. 10, each node is created by appending one of CL1 to CL4, indicating the clustering result, to a data element of the learning data DAT. That is, the i-th node is represented by the following formula:

$$N_i = (D_i, cl_i) = (P(i), Q(i), cl_i) \tag{7}$$

where cl_i is the cluster number and takes one of the values CL1 to CL4.
 The model creation unit 12 outputs the model MD, which is the set of as many nodes, represented by the above formula, as there are elements of the learning data DAT (n in the present embodiment). The model MD is represented by the following formula:

$$MD = \{N_1, N_2, \ldots, N_n\} \tag{8}$$
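A sketch of how the model MD might be represented as a set of nodes N_i = (D_i, cl_i), together with a chronological ordering of the clusters by the mean sampling index of their members; the dictionary layout and the stand-in arrays are hypothetical.

```python
import numpy as np

# Stand-ins for the learning data and cluster labels from the sketch above.
DAT = np.random.rand(200, 6)
labels = np.random.randint(0, 4, size=len(DAT))

# Model MD: each node keeps its data element D_i, its cluster number cl_i,
# and its sampling index i (the time-related quantity used for ordering).
MD = [{"i": i, "D": DAT[i], "cl": int(labels[i])} for i in range(len(DAT))]

# Chronological order of the clusters (CL1 -> CL2 -> CL3 -> CL4 above):
# sort the cluster numbers by the mean sampling index of their member nodes.
order = sorted(set(int(c) for c in labels),
               key=lambda c: np.mean([nd["i"] for nd in MD if nd["cl"] == c]))
```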
 Since the learning data is time-series data, the nodes obtained by learning naturally also have a time-series order. Specifically, each node can be associated with a time-related quantity, here the sampling order, so the nodes can be arranged in time series. The way of giving the nodes a time-series order is not limited to this; it may also be expressed by including a time-related quantity (that is, a time) as an element of the multidimensional vector representing each node.
 In this example, the four clusters CL1 to CL4 are found to be in the chronological order CL1, CL2, CL3, CL4.
 Next, the processing performed by the information processing apparatus 100 is described. FIG. 11 shows a flowchart of the processing in the information processing apparatus 100 according to the first embodiment. The operation of the information processing apparatus 100 consists of the following steps SA1 to SA5.
Step SA1
 The search unit 2 reads the model MD from the model holding unit 1.
Step SA2
 The search unit 2 acquires the state quantity p, which is the input data.
Step SA3
 The search unit 2 determines which cluster of the model MD the input state quantity p belongs to. In the following, the cluster to which the input state quantity p belongs is called the target cluster C_TRG. Here, the distance between the input state quantity p and the centroid of each cluster is calculated, and the cluster whose centroid is at the shortest distance is taken as the target cluster C_TRG. The method of determining the target cluster C_TRG is not limited to this, and other determination methods may be used as appropriate.
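Step SA3 can be sketched as follows; comparing the input p against only the state-quantity part of each centroid is an assumption made here for illustration.

```python
import numpy as np

def target_cluster(p, centroids, state_dim):
    """Step SA3 (a sketch): choose the cluster whose centroid is closest
    (Euclidean) to the input state quantity p. Only the first state_dim
    components of each centroid (its state-quantity part) are compared."""
    d = np.linalg.norm(centroids[:, :state_dim] - p, axis=1)
    return int(np.argmin(d))
```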
Step SA4
 From the nodes contained in the target cluster C_TRG and in the cluster C_NEXT temporally immediately after the target cluster C_TRG, the search unit 2 searches for the one node N_NEAR having the state quantity closest to the input state quantity p, and outputs the search result to the output unit 3. Here, the Euclidean distance is used as the distance measure for the node search. In the following, the node N_NEAR having the state quantity closest to the input state quantity p is simply called the nearest node N_NEAR.
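A sketch of step SA4, reusing the hypothetical node layout introduced above; only the nodes of C_TRG and C_NEXT enter the distance calculation.

```python
import numpy as np

def nearest_node(p, MD, c_trg, c_next, state_dim):
    """Step SA4 (a sketch): among the nodes of the target cluster C_TRG and
    the temporally next cluster C_NEXT only, return the node whose state
    quantity is closest (Euclidean) to the input p."""
    candidates = [nd for nd in MD if nd["cl"] in (c_trg, c_next)]
    return min(candidates,
               key=lambda nd: np.linalg.norm(nd["D"][:state_dim] - p))
```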
 Here, only the single nearest node was searched for, but a plurality of nodes close to the input state quantity p may be searched for. In this case, a predetermined number of nodes may be searched for in order of increasing distance from the state quantity p, or a plurality of nodes within a predetermined distance from the state quantity p may be searched for.
 The clusters to be searched were the target cluster C_TRG and the immediately following cluster C_NEXT, but, for example, the cluster after the next may also be included, or only the temporally earlier half of the nodes of the immediately following cluster C_NEXT may be targeted.
 The Euclidean distance was used as the distance measure for the node search, but other distance measures may be used as appropriate.
Step SA5
 The output unit 3 outputs the output command value q_OUT determined based on the nearest node N_NEAR. For example, the output unit 3 may output the command value q_NEAR held by the nearest node N_NEAR as the output command value q_OUT (first output command value determination method).
 The method by which the output unit 3 determines the output command value q_OUT is not limited to this; it may be determined, for example, as follows. The output unit 3 may output the average of the command values of all the nodes of the cluster to which the nearest node N_NEAR belongs as the output command value q_OUT (second output command value determination method).
 The output unit 3 may also output, as the output command value q_OUT, the average of the command values of some of the nodes of the cluster to which the nearest node N_NEAR belongs (third output command value determination method). In this case, the average may be taken over nodes selected by various methods, such as the nodes within a predetermined distance of the nearest node N_NEAR, or a predetermined number of nodes selected in order of increasing distance from the nearest node N_NEAR.
 Furthermore, when a plurality of nodes are found in step SA4, the average of the command values of those nodes may be output as the output command value q_OUT (fourth output command value determination method).
 Although the average was used in the second to fourth output command value determination methods, this is merely an example; various values that can be calculated by statistical techniques, such as the median, maximum, and minimum, in other words values determined based on statistics, may be calculated as the command value.
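The first to fourth determination methods can all be expressed as one reduction over the command-value parts of the selected node(s), sketched below with a swappable statistic; the function name is hypothetical.

```python
import numpy as np

def output_command(nodes, state_dim, reducer=np.mean):
    """Step SA5 variants (a sketch): reduce the command-value parts of the
    selected node(s) to one output command q_OUT. The mean is the default,
    but another statistic (np.median, np.max, np.min, ...) can be passed
    as reducer, as the text allows."""
    Q = np.array([nd["D"][state_dim:] for nd in nodes])
    return reducer(Q, axis=0)
```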
 As described above, with this configuration, when obtaining the output value corresponding to the input data p, the nodes for which the distance to the input data p is calculated can be limited to the nodes contained in a restricted group of clusters, rather than all nodes. Compared with techniques such as the general k-nearest neighbor method, this greatly reduces the number of nodes subject to distance calculation and speeds up the processing of obtaining the output value corresponding to the input data p.
 The basic operation of the information processing apparatus 100 is as described above; when a plurality of input data p are input in succession, the operation described below may be performed instead. FIG. 12 shows a flowchart of a modified example of the processing in the information processing apparatus 100 according to the first embodiment. In this modified example, for the second and subsequent input data p, the nearest node N_NEAR is searched for using the target cluster C_TRG determined based on the immediately preceding input data p. In the operation shown in FIG. 12, steps SA11 to SA14 are added to steps SA1 to SA5 of FIG. 11. Since SA1 to SA5 are the same as in FIG. 11, only the added steps SA11 to SA14 are described.
Step SA11
 Step SA11 is a step inserted at the start of the processing. The search unit 2 first sets the value of an initial flag FG to 0 to indicate that the operation is in its initial state, that is, that the input data p is being input for the first time. The initial flag is not limited to this example and may be data of any format as long as it can indicate that the operation is in the initial state.
Step SA12
 Step SA12 is a step inserted between steps SA2 and SA3. The search unit 2 determines whether the operation is in the initial state, that is, whether the value of the initial flag FG is 0. If the value of the initial flag FG is 0 (initial state), the processing proceeds to step SA3; if it is not 0 (not the initial state), the processing proceeds to step SA4.
Step SA13
 Step SA13 is a step inserted between steps SA3 and SA4. The search unit 2 switches the value of the initial flag FG to 1 to indicate that the operation is no longer in the initial state.
Step SA14
 Based on the cluster to which the nearest node N_NEAR obtained in step SA4 belongs, the search unit 2 determines the target cluster C_TRG for the next processing run. Here, the cluster of the nearest node N_NEAR detected in the current run is determined to be the target cluster C_TRG. The method of determining the target cluster C_TRG for the next run is not limited to this; for example, among the clusters to which the nearest nodes N_NEAR detected in a predetermined number of past runs belong, the cluster appearing most frequently may be used as the target cluster C_TRG for the next run. Furthermore, when a plurality of nodes are found in step SA4, the most frequent of the clusters to which those nodes belong may be used as the target cluster C_TRG for the next run. For example, if five nodes are found in step SA4 and their clusters are {3, 3, 4, 4, 4}, the target cluster may be 4.
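The frequency-based variant of step SA14 can be sketched as follows; the function name is hypothetical.

```python
from collections import Counter

def next_target_cluster(recent_clusters):
    """Step SA14 variant (a sketch): among the clusters of recently found
    nearest nodes, use the most frequent one as the next target cluster."""
    return Counter(recent_clusters).most_common(1)[0][0]

print(next_target_cluster([3, 3, 4, 4, 4]))  # -> 4, as in the example above
```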
 According to the modified example described above, the cluster to which the nearest node N_NEAR corresponding to the previously input data p belongs can be used as the target cluster C_TRG for the next input data p. In general, sharp changes between adjacent input data are rare in time-series data, which is expected to change relatively gently, so the distance between adjacent input data is also expected to be short.
 Therefore, the input data p input in the next run is also considered to be at a short distance from the nearest node N_NEAR of the previous run. Consequently, the target cluster C_TRG to be used in the next run is, in many cases, expected to end up being the same cluster as the target cluster C_TRG of the previous run. In this modified example, therefore, by using the cluster to which the nearest node N_NEAR of the previous run belongs as the target cluster C_TRG of the next run, the target cluster C_TRG for the second and subsequent runs can be determined by simple processing. Specifically, the first determination of the target cluster C_TRG calculates the distance between the centroid of each cluster and the input data, as in step SA3 (FIG. 11), so the centroid of each cluster must be computed. In this modified example, by contrast, the second and subsequent determinations of the target cluster C_TRG are performed by simple processing without computing the cluster centroids, reducing the amount of computation and shortening the processing time. In the present embodiment, the first target cluster C_TRG was determined by calculating the distance between the centroid of each cluster and the input data, but this is not limiting; the temporally first cluster (CL1 in the present embodiment) may also be used as the first target cluster C_TRG.
 Next, a specific example of the processing in the information processing apparatus 100 according to the first embodiment is described. FIG. 13 shows the nodes subject to distance calculation in the processing of the information processing apparatus 100 according to the first embodiment and in the processing of the k-nearest neighbor method. In the information processing apparatus 100, as described above, only the nodes belonging to the two clusters CL2 and CL3 are subject to the calculation of the distance to the input data p. In the k-nearest neighbor method, by contrast, all the nodes contained in the model MD are subject to the calculation of the distance to the input data p. In the present embodiment, for simplicity of explanation, a model containing four clusters of several dozen nodes each was used as an example, but this is merely an illustration. Even for a model containing more clusters, each with more nodes, applying this configuration makes it possible to limit the nodes subject to the distance calculation for the input data to the nodes belonging to a restricted set of clusters, so the command value output processing can be sped up even in models with a larger number of clusters.
 Thus, with this configuration, when obtaining the output value corresponding to the input data p, the nodes for which the distance to the input data p is calculated can be restricted, improving the processing speed.
 Embodiment 2
 An information processing apparatus 200 according to the second embodiment will now be described. FIG. 14 schematically shows the configuration of the information processing apparatus 200 according to the second embodiment. FIG. 15 schematically shows a situation in which the information processing apparatus 200 according to the second embodiment is used. The information processing apparatus 200 has a configuration in which a model creation unit 4 is added to the information processing apparatus 100 and the search unit 2 is replaced with a search unit 5.
 The model creation unit 4 reads the learning data DAT, which is time-series data, performs learning by the SOINN method, and creates a model MD. The created model MD is output to the model holding unit 1. Learning in the model creation unit 4 is described later.
 Using the model MD, the search unit 5 refers to the state quantity of separately input data to be estimated (hereinafter referred to as input data) and searches the nodes contained in the model MD for an approximating node. The search result is then output to the output unit 3.
 The model holding unit 1 and the output unit 3 are the same as in the first embodiment, so their description is omitted.
 Next, the processing performed by the information processing apparatus 200 is described. FIG. 16 shows a flowchart of the operation of the information processing apparatus 200 according to the second embodiment.
Step SB1
 The model creation unit 4 creates the model MD by using the SOINN method to learn the learning data DAT obtained by receiving, holding, and processing the state quantities P and the command values Q. In the SOINN method, inputting input data described by multidimensional vectors generates nodes expressing the input data, and the model MD is obtained as a network composed of the generated nodes. The learning processing in the SOINN method is described below.
 By using the SOINN method, the model creation unit 4 creates a model consisting of a neural network with a structure of at least one layer in which nodes described by n-dimensional vectors are arranged. The neural network constituting the model created by the model creation unit 4 is a self-propagating neural network that, when input vectors are fed into it, automatically increases the nodes arranged in the network based on those input vectors, and it has a one-layer structure.
 Since the number of nodes can thus be increased automatically using the self-propagating neural network, additional learning can be performed online by sequentially feeding in input vectors.
 The configuration of the model creation unit 4 is described below. FIG. 17 schematically shows the configuration of the model creation unit 4 according to the second embodiment. The model creation unit 4 has an input information acquisition means 41, a winner node search means 42, a similarity threshold calculation means 43, a similarity threshold determination means 44, a node insertion means 45, a weight vector update means 46, a node density calculation means 47, a distribution overlap region detection means 48, an edge connection determination means 49, an edge connection means 50, an edge deletion means 51, a noise node deletion means 52, and an output information display means 53.
 In the following description of the SOINN method, various letters such as i, j, k, and x are used; these are used for convenience in explaining the SOINN method and, even where they duplicate letters used in the first embodiment, they denote different values.
 The input information acquisition means 41 acquires n-dimensional input vectors as the information given to the model creation unit 4 as input. The acquired input vectors are stored in a temporary storage unit (for example, the RAM 1003) and sequentially fed into the neural network stored in the temporary storage unit.
 For the input vector and the nodes stored in the temporary storage unit, the winner node search means 42 finds the node having the weight vector closest to the input vector as the first winner node and the node having the second-closest weight vector as the second winner node, and stores the results in the temporary storage unit. That is, for an n-dimensional input vector ξ, the nodes satisfying the following expressions, stored in the temporary storage unit, are found as the first winner node a_1 and the second winner node a_2, respectively, and the results are stored in the temporary storage unit:

$$a_1 = \arg\min_{a \in A} \lVert \xi - W_a \rVert \tag{9}$$

$$a_2 = \arg\min_{a \in A \setminus \{a_1\}} \lVert \xi - W_a \rVert \tag{10}$$

Here, a is a node contained in the node set A stored in the temporary storage unit, and W_a is the weight vector of node a stored in the temporary storage unit.
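A sketch of the winner search of Equations (9) and (10), assuming the weight vectors are stacked as the rows of a matrix; the function name is hypothetical.

```python
import numpy as np

def find_winners(xi, W):
    """Find the first and second winner nodes for input vector xi: the two
    nodes whose weight vectors (rows of W) are nearest to xi, per Equations
    (9) and (10)."""
    d = np.linalg.norm(W - xi, axis=1)  # distance from xi to every node
    a1, a2 = np.argsort(d)[:2]          # indices of the two smallest distances
    return int(a1), int(a2)
```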
The similarity threshold calculation means 43 calculates, for a node of interest and the similarity thresholds of the nodes stored in the temporary storage unit, the similarity threshold as follows: if there exist nodes directly connected to the node of interest by an edge (hereinafter referred to as adjacent nodes), the distance to the adjacent node farthest from the node of interest is calculated as the similarity threshold and the result is stored in the temporary storage unit; if no adjacent node exists, the distance to the node closest to the node of interest is calculated as the similarity threshold and the result is stored in the temporary storage unit. Specifically, the similarity threshold of the node of interest is calculated, for example, as follows, and the result is stored in the temporary storage unit.
[Procedure M_A1]
The similarity threshold calculation means 43 sets the similarity threshold T_i of a newly inserted node i stored in the temporary storage unit to +∞ (a sufficiently large value), and stores the result in the temporary storage unit.
[Procedure M_A2]
For the nodes stored in the temporary storage unit, when node i becomes the node closest or second closest to the input vector, it is determined whether node i has adjacent nodes, and the result is stored in the temporary storage unit.
[Procedure M_A3]
If the determination stored in the temporary storage unit indicates that node i has adjacent nodes, the similarity threshold T_i is set to the maximum distance to its adjacent nodes, for the similarity thresholds and nodes stored in the temporary storage unit, and the result is stored in the temporary storage unit.
That is, for node i, the similarity threshold T_i is calculated based on the following expression stored in the temporary storage unit, and the result is stored in the temporary storage unit.

$$T_i = \max_{c \in N_i} \lVert W_i - W_c \rVert$$

Here, c is a node included in the adjacent node set N_i of node i stored in the temporary storage unit, and W_c is the weight vector of node c stored in the temporary storage unit.
[Procedure M_A4]
If the determination indicates that node i has no adjacent node, the distances from node i to every other node are calculated, and the minimum of these distances is taken as the similarity threshold T_i.
That is, for node i, the similarity threshold T_i is calculated based on the following expression stored in the temporary storage unit, and the result is stored in the temporary storage unit.

$$T_i = \min_{c \in A \setminus \{i\}} \lVert W_i - W_c \rVert$$
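A compact sketch of procedures M_A3/M_A4 in Python; the adjacency bookkeeping (`neighbors` as a dict of index sets) is our assumption.

```python
import numpy as np

def similarity_threshold(i, weights, neighbors):
    """Similarity threshold T_i per procedures M_A3/M_A4.

    weights   : (num_nodes, n) array of weight vectors
    neighbors : dict mapping node index -> set of adjacent node indices
    """
    if neighbors.get(i):
        # Adjacent nodes exist: maximum distance to an adjacent node (M_A3).
        return max(np.linalg.norm(weights[i] - weights[c]) for c in neighbors[i])
    # No adjacent node: minimum distance to any other node (M_A4).
    others = (c for c in range(len(weights)) if c != i)
    return min(np.linalg.norm(weights[i] - weights[c]) for c in others)
```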
The similarity threshold determination means 44 determines, for the input vector, the nodes, and the similarity thresholds of the nodes stored in the temporary storage unit, whether the distance between the input vector and the first winner node is greater than the similarity threshold of the first winner node, and whether the distance between the input vector and the second winner node is greater than the similarity threshold of the second winner node, and stores the results in the temporary storage unit. That is, as shown by the following expressions stored in the temporary storage unit, it is determined whether the distance between the input vector ξ and the first winner node a_1 is greater than the similarity threshold T_{a1}, and whether the distance between ξ and the second winner node a_2 is greater than the similarity threshold T_{a2}; the results are stored in the temporary storage unit.

$$\lVert \xi - W_{a_1} \rVert > T_{a_1}, \qquad \lVert \xi - W_{a_2} \rVert > T_{a_2}$$

The node insertion means 45, based on the determination result of the similarity threshold determination means 44 stored in the temporary storage unit, inserts the input vector stored in the temporary storage unit as a new node at the same position as the input vector, and stores the result in the temporary storage unit.
The weight vector update means 46 updates, among the weight vectors of the nodes stored in the temporary storage unit, the weight vector of the first winner node and the weight vectors of the adjacent nodes of the first winner node so that each is brought closer to the input vector, and stores the results in the temporary storage unit.
The update amount ΔW_{a1} of the weight vector of the first winner node a_1 and the update amount ΔW_{ai} of the weight vector of an adjacent node i of the first winner node a_1 are calculated, for example, based on the following expressions stored in the temporary storage unit, and the results are stored in the temporary storage unit.

$$\Delta W_{a_1} = \varepsilon_1(t)\,(\xi - W_{a_1})$$

$$\Delta W_{a_i} = \varepsilon_2(t)\,(\xi - W_{a_i})$$

Here, ε_1(t) and ε_2(t) are calculated based on the following expressions stored in the temporary storage unit, and the results are stored in the temporary storage unit.

$$\varepsilon_1(t) = \frac{1}{t}$$

$$\varepsilon_2(t) = \frac{1}{100\,t}$$

In the present embodiment, in order to support additional learning, the cumulative number of times M_{a1} that the first winner node a_1 has become the first winner node, stored in the temporary storage unit, is used in place of the input count t of input vectors.
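A minimal sketch of this update step, assuming the learning-rate forms reconstructed above and a per-node win counter `win_count` standing in for M_{a1}.

```python
import numpy as np

def update_weights(xi, weights, a1, neighbors, win_count):
    """Move the first winner and its adjacent nodes toward xi (a sketch).

    win_count[a1] is the cumulative number of times a1 has been the first
    winner, used in place of the global input count t, as the text describes.
    """
    t = win_count[a1]
    weights[a1] += (1.0 / t) * (xi - weights[a1])              # epsilon_1(t) = 1/t
    for c in neighbors.get(a1, ()):                            # adjacent nodes of a1
        weights[c] += (1.0 / (100.0 * t)) * (xi - weights[c])  # epsilon_2(t) = 1/(100 t)
```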
The node density calculation means 47 calculates, for the nodes and node densities stored in the temporary storage unit, the node density of a node of interest based on the average distance between that node and its adjacent nodes, and stores the result in the temporary storage unit. Further, the node density calculation means 47 has a unit node density calculation unit which, in order to support additional learning, calculates, for the first winner node and the node densities stored in the temporary storage unit, the node density of the first winner node as a rate per unit number of inputs based on the average distance between the first winner node and its adjacent nodes, and stores the result in the temporary storage unit. Furthermore, the node density calculation means 47 has, for the nodes and node density points stored in the temporary storage unit, a node density point calculation unit that calculates a node density point value for the first winner node based on the average distance between the first winner node and its adjacent nodes, and a unit node density point calculation unit that accumulates the node density points in the temporary storage unit until the number of inputs of input vectors reaches a predetermined unit number of inputs and, when it does, calculates the accumulated node density points stored in the temporary storage unit as a rate per unit number of inputs, thereby calculating the node density of the node per unit number of inputs, and stores the result in the temporary storage unit.
Specifically, the node density point calculation unit calculates the node density point value pt_i given to node i based on, for example, the following expression stored in the temporary storage unit, and stores the result in the temporary storage unit. Note that the point value pt_i calculated by the following expression is given to node i only when node i becomes the first winner node; when node i is not the first winner node, no points are given to node i.

$$pt_i = \frac{1}{(1 + e_i)^2}$$

Here, e_i denotes the average distance from node i to its adjacent nodes, and is calculated based on the following expression stored in the temporary storage unit; the result is stored in the temporary storage unit.

$$e_i = \frac{1}{m} \sum_{c \in N_i} \lVert W_i - W_c \rVert$$

Note that m denotes the number of adjacent nodes of node i stored in the temporary storage unit, and W_i denotes the weight vector of node i stored in the temporary storage unit.
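A sketch of the point calculation, assuming the reconstructed formulas above; `density_point` and the `neighbors` dict are our names.

```python
import numpy as np

def density_point(i, weights, neighbors):
    """Node density point pt_i = 1 / (1 + e_i)^2, where e_i is the mean
    distance from node i to its adjacent nodes. Awarded only when i is the
    first winner; the zero for a node without neighbors is our assumption."""
    nbrs = neighbors.get(i, set())
    if not nbrs:
        return 0.0
    e_i = np.mean([np.linalg.norm(weights[i] - weights[c]) for c in nbrs])
    return 1.0 / (1.0 + e_i) ** 2
```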
When the average distance to the adjacent nodes is large, the region containing the node is considered to contain few nodes; conversely, when the average distance is small, the region is considered to contain many nodes.
Accordingly, the method of calculating the node density point value is configured as described above so that a high point value is given when a node becomes the first winner in a region with many nodes, and a low point value is given when it becomes the first winner in a region with few nodes.
This makes it possible to estimate how densely the nodes are gathered in a region of some extent around a node. Therefore, even for a node located in a region where the node distribution is dense, node density points can be calculated whose density better approximates the input distribution density of the input vectors than in the conventional case where the number of times a node became the first winner is used as its density.
The unit node density point calculation unit calculates the node density density_i of node i per unit number of inputs based on, for example, the following expression stored in the temporary storage unit, and stores the result in the temporary storage unit.

$$\mathrm{density}_i = \frac{s_i}{N}$$

Here, the input count of the continuously given input vectors is divided into intervals of a fixed number of inputs λ, preset and stored in the temporary storage unit, and the total of the points given to node i in each interval is accumulated as the accumulated points s_i. When the total input count of input vectors is LT, preset and stored in the temporary storage unit, LT/λ is taken as the total number of intervals n, and the result is stored in the temporary storage unit; of these n intervals, the number of intervals in which the total of the points given to the node was greater than 0 is calculated as N, and the result is stored in the temporary storage unit (note that N and n are not necessarily the same).
The accumulated points s_i are calculated based on, for example, the following expression stored in the temporary storage unit, and the result is stored in the temporary storage unit.

$$s_i = \sum_{j=1}^{n} \sum_{k=1}^{\lambda} pt_i^{(j,k)}$$

Here, pt_i^{(j,k)} denotes the point given to node i by the k-th input in the j-th interval, calculated by the node density point calculation unit described above; the result is stored in the temporary storage unit.
In this way, the unit node density point calculation unit calculates the density density_i of node i stored in the temporary storage unit as the average of the accumulated points s_i, and stores the result in the temporary storage unit.
In the present embodiment, N is used in place of n in order to support additional learning. This is because, in additional learning, points are often not given to the nodes generated in earlier learning, and if the density were calculated using n, the density of previously learned nodes would gradually decrease. By calculating the node density using N instead of n, even when additional learning is performed for a long time, the density of a previously learned node is kept unchanged as long as no added data is input near that node.
This prevents the node density of such nodes from becoming relatively small even when additional learning is carried out for a long time and, compared with the conventional method, allows a node density that better approximates the input distribution density of the input vectors to be maintained and calculated.
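A sketch of the per-unit-input density, assuming the per-interval point totals for node i have already been tallied elsewhere.

```python
def unit_density(point_totals_per_interval):
    """density_i = s_i / N, where s_i sums the points node i received over all
    intervals and N counts only the intervals whose point total was positive
    (the zero fallback for N == 0 is our assumption).

    point_totals_per_interval : list of per-interval point totals for node i
    """
    s_i = sum(point_totals_per_interval)
    N = sum(1 for s in point_totals_per_interval if s > 0)
    return s_i / N if N > 0 else 0.0
```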
The distribution overlap region detection means 48 divides, for the nodes, the edges connecting the nodes, and the node densities stored in the temporary storage unit, a cluster, i.e., a set of nodes connected by edges, into subclusters, i.e., subsets of the cluster, based on the node densities calculated by the node density calculation means 47, stores the result in the temporary storage unit, detects the overlap region of distributions that forms the boundary between subclusters, and stores the result in the temporary storage unit.
Further, the distribution overlap region detection means 48 has, for the nodes, the edges connecting the nodes, and the node densities stored in the temporary storage unit: a node search unit that searches, based on the node densities calculated by the node density calculation means 47, for nodes whose node density is a local maximum; a first labeling unit that gives each searched node a label different from the labels already given to other nodes; a second labeling unit that gives, to each node not labeled by the first labeling unit and connected by an edge to a node labeled by the first labeling unit, the same label as that of the labeled node; a cluster division unit that, when nodes given different labels are directly connected by an edge, divides the cluster, i.e., the set of nodes connected by such an edge, into subclusters, i.e., subsets of the cluster; and a distribution overlap region detection unit that, when a node of interest and one of its adjacent nodes belong to different subclusters, detects the region containing that node and its adjacent node as the overlap region of distributions forming the boundary between subclusters.
Specifically, for the nodes, the edges connecting the nodes, and the node densities stored in the temporary storage unit, the overlap region of distributions forming the boundary between subclusters is detected, for example, as follows, and the result is stored in the temporary storage unit. A sketch of this labeling procedure is given after the list.
[Procedure M_B1]
The node search unit searches, for the nodes and node densities stored in the temporary storage unit, for nodes whose node density is a local maximum based on the node densities calculated by the node density calculation means 47, and stores the result in the temporary storage unit.
[Procedure M_B2]
The first labeling unit gives, for the nodes and node labels stored in the temporary storage unit, each node found in procedure M_B1 a label different from the labels already given to other nodes, and stores the result in the temporary storage unit.
[Procedure M_B3]
The second labeling unit gives, for the nodes, the edges connecting the nodes, and the node labels stored in the temporary storage unit, each node not labeled by the first labeling unit in procedure M_B2 but connected by an edge to a node labeled by the first labeling unit, the same label as that of the labeled node, and stores the result in the temporary storage unit. That is, each such node receives the same label as its adjacent node whose density is a local maximum.
[Procedure M_B4]
The cluster division unit divides, for the nodes, the edges connecting the nodes, and the node labels stored in the temporary storage unit, each cluster, i.e., a set of nodes connected by edges stored in the temporary storage unit, into subclusters, i.e., subsets of the cluster consisting of nodes given the same label, and stores the result in the temporary storage unit.
[Procedure M_B5]
The distribution overlap region detection unit detects, for the nodes, the edges connecting the nodes, and the node labels stored in the temporary storage unit, when a node of interest and one of its adjacent nodes belong to different subclusters, the region containing that node and its adjacent node as the overlap region of distributions forming the boundary between subclusters, and stores the result in the temporary storage unit.
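A sketch of procedures M_B1 to M_B5 as label propagation from local density maxima. The propagation order (densest first) and the tie handling are our assumptions; the specification does not fix them.

```python
def detect_overlap_regions(density, neighbors):
    """Label local density maxima (M_B1/M_B2), propagate labels along edges
    (M_B3), and report edges whose endpoints carry different labels as
    distribution overlap regions (M_B4/M_B5)."""
    labels = {}
    # M_B1/M_B2: nodes at least as dense as all their neighbors get fresh labels.
    for i, nbrs in neighbors.items():
        if all(density[i] >= density[c] for c in nbrs):
            labels[i] = len(labels)
    # M_B3: visit remaining nodes densest-first, copying a labeled neighbor's label.
    for i in sorted(neighbors, key=lambda n: -density[n]):
        if i not in labels:
            labeled = [c for c in neighbors[i] if c in labels]
            if labeled:
                labels[i] = labels[max(labeled, key=lambda c: density[c])]
    # M_B4/M_B5: edges joining different subclusters mark overlap regions.
    return [(i, c) for i, nbrs in neighbors.items() for c in nbrs
            if c > i and labels.get(i) != labels.get(c)]
```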
The edge connection determination means 49 determines, for the nodes, node densities, and distribution overlap regions stored in the temporary storage unit, when the first winner node and the second winner node are nodes located in a distribution overlap region, whether to connect an edge between the first winner node and the second winner node based on their node densities, and stores the result in the temporary storage unit.
Further, the edge connection determination means 49 has, for the nodes, node densities, and node subclusters stored in the temporary storage unit, a belonging subcluster determination unit that determines the subcluster to which a node belongs, and an edge connection determination unit that determines whether to connect an edge between the first winner node and the second winner node based on the density of the vertex of the subcluster to which the node belongs and the density of the node.
The edge connection means 50 connects, based on the determination result of the edge connection determination means 49 stored in the temporary storage unit, an edge between the first winner node and the second winner node among the nodes and inter-node edges stored in the temporary storage unit, and stores the result in the temporary storage unit.
The edge deletion means 51 deletes, based on the determination result of the edge connection determination means 49 stored in the temporary storage unit, the edge between the first winner node and the second winner node among the nodes and inter-node edges stored in the temporary storage unit, and stores the result in the temporary storage unit.
Specifically, for the nodes, node densities, node subclusters, and inter-node edges stored in the temporary storage unit, the edge connection determination means 49 determines whether to connect an edge, and the edge connection means 50 and the edge deletion means 51 carry out edge creation and deletion, for example, as follows, storing the results in the temporary storage unit.
[Procedure M_C1]
The belonging subcluster determination unit determines, for the nodes and node subclusters stored in the temporary storage unit, the subclusters to which the first winner node and the second winner node respectively belong, and stores the results in the temporary storage unit.
[Procedure M_C2]
If, as a result of the determination in procedure M_C1 stored in the temporary storage unit, the first winner node and the second winner node belong to no subcluster, or belong to the same subcluster, the edge connection means 50 connects the nodes by creating an edge between the first winner node and the second winner node among the nodes and inter-node edges stored in the temporary storage unit, and stores the result in the temporary storage unit.
[Procedure M_C3]
If, as a result of the determination in procedure M_C1 stored in the temporary storage unit, the first winner node and the second winner node belong to different subclusters, the edge connection determination unit determines, for the nodes, node densities, and inter-node edges stored in the temporary storage unit, whether to connect an edge between the first winner node and the second winner node based on the densities of the vertices of the subclusters to which the nodes belong and the densities of the nodes, and stores the result in the temporary storage unit.
[Procedure M_C4]
If, as a result of the determination by the edge connection determination unit in procedure M_C3 stored in the temporary storage unit, it is determined that no edge needs to be connected, no edge is created between the first winner node and the second winner node among the nodes and inter-node edges stored in the temporary storage unit; if an edge had already been created between the nodes, the edge deletion means 51 deletes the edge between the first winner node and the second winner node stored in the temporary storage unit, and stores the result in the temporary storage unit.
[Procedure M_C5]
If, as a result of the determination by the edge connection determination unit in procedure M_C3 stored in the temporary storage unit, it is determined that an edge needs to be connected, the edge connection means 50 creates an edge between the first winner node and the second winner node among the nodes and inter-node edges stored in the temporary storage unit, thereby connecting the two nodes.
The determination processing by the edge connection determination unit will now be described in detail.
First, the edge connection determination unit calculates, for the nodes and node densities stored in the temporary storage unit, the minimum node density m of the node density density_win of the first winner node and the node density density_sec-win of the second winner node based on, for example, the following expression stored in the temporary storage unit, and stores the result in the temporary storage unit.

$$m = \min(\mathrm{density}_{win},\ \mathrm{density}_{sec\text{-}win})$$

Next, for the nodes, node densities, and node subclusters stored in the temporary storage unit, for the subcluster A to which the first winner node belongs and the subcluster B to which the second winner node belongs, the density A_max of the vertex of subcluster A and the density B_max of the vertex of subcluster B are calculated, and the results are stored in the temporary storage unit.
Note that the maximum node density among the nodes included in a subcluster is taken as the density of the vertex of that subcluster.
Then, for the vertex densities A_max and B_max of the subclusters to which the nodes belong and the node density m stored in the temporary storage unit, it is determined whether m is smaller than α_A·A_max and also smaller than α_B·B_max, and the result is stored in the temporary storage unit. That is, it is determined whether the following inequalities stored in the temporary storage unit are satisfied, and the result is stored in the temporary storage unit.

$$m < \alpha_A A_{max} \quad \text{and} \quad m < \alpha_B B_{max}$$

If, as a result of the determination, m is smaller than α_A·A_max and also smaller than α_B·B_max, it is determined that no edge is needed between the first winner node and the second winner node among the nodes and inter-node edges stored in the temporary storage unit, and the result is stored in the temporary storage unit.
On the other hand, if, as a result of the determination, m is greater than or equal to α_A·A_max, or greater than or equal to α_B·B_max, it is determined that an edge is needed between the first winner node and the second winner node among the nodes and inter-node edges stored in the temporary storage unit, and the result is stored in the temporary storage unit.
By comparing the minimum node density m of the first winner node and the second winner node with the typical node densities of the subclusters containing the first winner node and the second winner node in this way, the magnitude of the unevenness of the node density in the region containing the two nodes can be judged. That is, when the node density m in the valley of the distribution lying between subcluster A and subcluster B is larger than the threshold α_A·A_max or α_B·B_max, the shape of the node density can be judged to be a small unevenness.
Here, α_A and α_B are calculated based on the following rules stored in the temporary storage unit, and the results are stored in the temporary storage unit. Since α_B can be calculated in the same manner as α_A, its description is omitted here.
  i)   If A_max / mean_A − 1 ≤ 1, then α_A = 0.0.
  ii)  If 1 < A_max / mean_A − 1 ≤ 2, then α_A = 0.5.
  iii) If 2 < A_max / mean_A − 1, then α_A = 1.0.
In case i), where the value of A_max / mean_A − 1 is 1 or less, A_max and mean_A are of comparable magnitude, and the unevenness in density is judged to be due to the influence of noise. Setting the value of α to 0.0 then causes the subclusters to be merged.
In case iii), where the value of A_max / mean_A − 1 exceeds 2, A_max is sufficiently larger than mean_A, and a clear unevenness in density is judged to exist. Setting the value of α to 1.0 then causes the subclusters to be separated.
In the remaining case ii), setting the value of α to 0.5 causes the subclusters to be merged or separated according to the magnitude of the unevenness in density.
Note that mean_A denotes the average node density density_i of the nodes i belonging to subcluster A and, with N_A being the number of nodes belonging to subcluster A, is calculated based on the following expression stored in the temporary storage unit; the result is stored in the temporary storage unit.

$$\mathrm{mean}_A = \frac{1}{N_A} \sum_{i \in A} \mathrm{density}_i$$
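A sketch combining the α rules i) to iii) with the edge-necessity test above; the function names are ours.

```python
def alpha(sub_max, sub_mean):
    """alpha per rules i)-iii): 0.0 merges subclusters, 1.0 separates them."""
    r = sub_max / sub_mean - 1.0
    if r <= 1.0:
        return 0.0
    if r <= 2.0:
        return 0.5
    return 1.0

def edge_needed(density_win, density_sec, A_max, mean_A, B_max, mean_B):
    """An edge is unnecessary only if m < alpha_A*A_max and m < alpha_B*B_max."""
    m = min(density_win, density_sec)
    return not (m < alpha(A_max, mean_A) * A_max and
                m < alpha(B_max, mean_B) * B_max)
```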
By judging, when dividing into subclusters, the degree of unevenness of the node densities contained in the subclusters and merging two subclusters that satisfy a certain criterion into one in this way, destabilization caused by dividing into too many subclusters can be prevented in the detection of distribution overlap regions.
The noise node deletion means 52 deletes, for the nodes, node densities, inter-node edges, and numbers of adjacent nodes stored in the temporary storage unit, a node of interest based on the node density calculated by the node density calculation means 47 and the number of adjacent nodes of the node of interest, and stores the result in the temporary storage unit.
Further, the noise node deletion means 52 has, for the nodes, node densities, inter-node edges, and numbers of adjacent nodes stored in the temporary storage unit, a node density comparison unit that compares the node density of a node of interest with a predetermined threshold, an adjacent node count calculation unit that calculates the number of adjacent nodes of the node of interest, and a noise node deletion unit that regards the node of interest as a noise node and deletes it.
Specifically, for the nodes, node densities, inter-node edges, and numbers of adjacent nodes stored in the temporary storage unit, a node of interest is deleted based on its node density and its number of adjacent nodes, for example, as follows, and the result is stored in the temporary storage unit.
The noise node deletion means 52 causes the adjacent node count calculation unit to calculate, for the nodes, inter-node edges, and numbers of adjacent nodes stored in the temporary storage unit, the number of adjacent nodes of the node i of interest, and stores the result in the temporary storage unit. Then, according to the number of adjacent nodes stored in the temporary storage unit, the following processing is carried out.
  i)   When the number of adjacent nodes stored in the temporary storage unit is 2, the node density comparison unit compares the node density density_i of node i with a threshold calculated based on, for example, the following expression stored in the temporary storage unit, and stores the result in the temporary storage unit.

$$\mathrm{density}_i < c_1 \cdot \frac{1}{|A|} \sum_{j \in A} \mathrm{density}_j$$

Here, A denotes the node set and |A| the number of nodes included in it. If, in the comparison result stored in the temporary storage unit, the node density density_i is smaller than the threshold, the noise node deletion unit deletes the node among the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
  ii)  When the number of adjacent nodes stored in the temporary storage unit is 1, the node density comparison unit compares the node density density_i of node i with a threshold calculated based on, for example, the following expression stored in the temporary storage unit, and stores the result in the temporary storage unit.

$$\mathrm{density}_i < c_2 \cdot \frac{1}{|A|} \sum_{j \in A} \mathrm{density}_j$$

If, in the comparison result stored in the temporary storage unit, the node density density_i is smaller than the threshold, the noise node deletion unit deletes the node among the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
  iii) When, according to the number of adjacent nodes stored in the temporary storage unit, the node has no adjacent node, the noise node deletion unit deletes the node among the nodes stored in the temporary storage unit, and stores the result in the temporary storage unit.
The behavior of noise node deletion by the noise node deletion means 52 can be adjusted by tuning the predetermined parameters c1 and c2, which are set in advance and stored in the temporary storage unit.
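A sketch of the noise-removal rule under the reconstructed thresholds; the default values of c1 and c2 are illustrative only ("close to 0" and "close to 1" per the description of step M13 below).

```python
def noise_nodes(density, neighbors, c1=0.05, c2=0.9):
    """Return the nodes to delete: 2-neighbor nodes below c1 * mean density,
    1-neighbor nodes below c2 * mean density, and all isolated nodes."""
    mean_density = sum(density.values()) / len(density)
    doomed = []
    for i, nbrs in neighbors.items():
        k = len(nbrs)
        if k == 0:
            doomed.append(i)
        elif k == 1 and density[i] < c2 * mean_density:
            doomed.append(i)
        elif k == 2 and density[i] < c1 * mean_density:
            doomed.append(i)
    return doomed  # caller removes these nodes and their incident edges
```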
The output information display means 53 outputs the information including the nodes stored in the temporary storage unit, that is, the model MD.
Next, the flow of the learning processing by the SOINN method will be described. FIG. 18 shows a flowchart of the learning processing by the SOINN method.
Step M1
The input information acquisition means 41 acquires two input vectors at random, initializes the node set A as a set containing only the two nodes corresponding to them, and stores the result in the temporary storage unit. It also initializes the edge set C ⊂ A × A as an empty set and stores the result in the temporary storage unit.
Step M2
The input information acquisition means 41 inputs a new input vector ξ selected at random from the learning data DAT, and stores the result in the temporary storage unit. Needless to say, an input vector that has once been selected is not selected again.
Step M3
The winner node search means 42 searches, for the input vector and the nodes stored in the temporary storage unit, for the first winner node a_1 having the weight vector closest to the input vector ξ and the second winner node a_2 having the second closest weight vector, and stores the results in the temporary storage unit.
Step M4
The similarity threshold determination means 44 determines, for the input vector, the nodes, and the similarity thresholds of the nodes stored in the temporary storage unit, whether the distance between the input vector ξ and the first winner node a_1 is greater than the similarity threshold T_1 of the first winner node a_1, and whether the distance between ξ and the second winner node a_2 is greater than the similarity threshold T_2 of the second winner node a_2, and stores the results in the temporary storage unit.
Here, the similarity threshold T_1 of the first winner node a_1 and the similarity threshold T_2 of the second winner node a_2 stored in the temporary storage unit are calculated by the similarity threshold calculation means 43 as shown in procedures M_A1 to M_A4 above, and the results are stored in the temporary storage unit.
Step M5
If, as a result of the determination in step M4 stored in the temporary storage unit, the distance between the input vector ξ and the first winner node a_1 is greater than the similarity threshold T_1 of the first winner node a_1, or the distance between ξ and the second winner node a_2 is greater than the similarity threshold T_2 of the second winner node a_2, the node insertion means 45 inserts, for the input vector and the nodes stored in the temporary storage unit, the input vector ξ as a new node i at the same position as ξ, and stores the result in the temporary storage unit.
Step M6
On the other hand, if, as a result of the determination in step M4 stored in the temporary storage unit, the distance between the input vector ξ and the first winner node a_1 is less than or equal to the similarity threshold T_1 of the first winner node a_1, and the distance between ξ and the second winner node a_2 is less than or equal to the similarity threshold T_2 of the second winner node a_2, the edge connection determination means 49 determines, for the nodes, node densities, and inter-node edges stored in the temporary storage unit, whether to connect an edge between the first winner node a_1 and the second winner node a_2 based on their node densities, and stores the result in the temporary storage unit.
Step M7
If, as a result of the determination in step M6 stored in the temporary storage unit, an edge is to be created and connected between the first winner node a_1 and the second winner node a_2, the edge connection means 50 connects, for the nodes and inter-node edges stored in the temporary storage unit, an edge between the first winner node and the second winner node, and stores the result in the temporary storage unit.
The information processing device then sets, for the edges and edge ages stored in the temporary storage unit, the age of the newly created edge (or of the existing edge, if an edge had already been created between the nodes) to 0, stores the result in the temporary storage unit, increments (increases by 1) the ages of the edges directly connected to the first winner node a_1, and stores the results in the temporary storage unit.
On the other hand, if, as a result of the determination in step M6 stored in the temporary storage unit, no edge is to be connected between the first winner node a_1 and the second winner node a_2, the processing proceeds to step M8; if an edge had already been created between the nodes, the edge deletion means 51 deletes, for the nodes and inter-node edges stored in the temporary storage unit, the edge between the first winner node a_1 and the second winner node a_2, and stores the result in the temporary storage unit. The edge connection determination means 49, the edge connection means 50, and the edge deletion means 51 carry out this processing as shown in procedures M_C1 to M_C5 above.
Next, for the nodes and node density point values stored in the temporary storage unit, the node density calculation means 47 calculates the node density point value of the first winner node a_1, stores the result in the temporary storage unit, and accumulates it as node density points by adding the calculated point value stored in the temporary storage unit to the point values calculated and stored there previously, storing the result in the temporary storage unit.
The information processing device then increments (increases by 1) the cumulative count M_{a1} of the times the first winner node a_1 has become the first winner node, stored in the temporary storage unit, and stores the result in the temporary storage unit.
Step M8
The weight vector update means 46 updates, for the nodes and node weight vectors stored in the temporary storage unit, the weight vector of the first winner node a_1 and the weight vectors of the adjacent nodes of the first winner node a_1 so that each is brought closer to the input vector ξ, and stores the results in the temporary storage unit.
Step M9
The information processing device deletes, among the edges stored in the temporary storage unit, the edges whose age exceeds the preset threshold age_t stored in the temporary storage unit, and stores the result in the temporary storage unit. Note that age_t is used to delete edges that are erroneously created under the influence of noise and the like. Setting a small value for age_t makes edges easier to delete and prevents the influence of noise; however, if the value is made extremely small, edges are deleted frequently and the learning result becomes unstable. On the other hand, if an extremely large value is set for age_t, edges created under the influence of noise cannot be removed appropriately. Taking these points into account, the parameter age_t is calculated in advance by experiment and stored in the temporary storage unit.
Step M10
The information processing device determines, for the total number of input vectors ξ given so far stored in the temporary storage unit, whether that total is a multiple of λ, preset and stored in the temporary storage unit, and stores the result in the temporary storage unit. If, according to the determination stored in the temporary storage unit, the total number of input vectors is not a multiple of λ, the processing returns to step M2 and the next input vector ξ is processed.
On the other hand, when the total number of input vectors ξ is a multiple of λ, the following processing is executed.
Note that λ is the period at which nodes regarded as noise are deleted. Setting a small value for λ allows noise processing to be carried out frequently; however, if the value is made extremely small, nodes that are not actually noise are erroneously deleted. On the other hand, if an extremely large value is set for λ, nodes created under the influence of noise cannot be removed appropriately. Taking these points into account, the parameter λ is calculated in advance by experiment and stored in the temporary storage unit.
Step M11
The distribution overlap region detection means 48 detects, for the subclusters and distribution overlap regions stored in the temporary storage unit, the overlap regions of distributions forming the boundaries between subclusters as shown in procedures M_B1 to M_B5 above, and stores the results in the temporary storage unit.
Step M12
The node density calculation means 47 calculates the node density points accumulated in the temporary storage unit as a rate per unit number of inputs, stores the result in the temporary storage unit, calculates the node density of each node per unit number of inputs, and stores the result in the temporary storage unit.
Step M13
The noise node deletion means 52 deletes, among the nodes stored in the temporary storage unit, the nodes regarded as noise nodes, and stores the result in the temporary storage unit. The parameters c_1 and c_2 used by the noise node deletion means 52 in step M13 are used to determine whether a node is regarded as noise. A node with two adjacent nodes is usually not noise, so a value close to 0 is used for c_1; a node with one adjacent node is often noise, so a value close to 1 is used for c_2. These parameters are set in advance and stored in the temporary storage unit.
Step M14
The information processing device determines, for the total number of input vectors ξ given so far stored in the temporary storage unit, whether that total has reached LT, preset and stored in the temporary storage unit, and stores the result in the temporary storage unit. If, according to the determination stored in the temporary storage unit, the total number of input vectors has not reached LT, the processing returns to step M2 and the next input vector ξ is processed.
On the other hand, when the total number of input vectors ξ reaches LT, learning is stopped.
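Putting the steps together, a runnable but heavily simplified sketch of the M1 to M14 loop follows. It keeps only the winner search, the similarity-threshold test, node insertion, edge aging, and periodic removal of isolated nodes; the density-based edge judgment and noise rules (steps M6 and M11 to M13) are reduced to the isolated-node rule, and the learning rates are fixed illustrative values, so this is a structural sketch, not the method of the specification.

```python
import random
import numpy as np

def train_soinn_sketch(data, lam=100, age_max=50):
    """Simplified M1-M14 loop; assumes len(data) >= 2."""
    data = [np.asarray(x, float) for x in data]
    random.shuffle(data)                          # each vector is used once (M2)
    W = [data.pop(), data.pop()]                  # M1: two random initial nodes
    edges = {}                                    # (i, j) with i < j -> age

    def neighbors(i):
        return [b if a == i else a for (a, b) in edges if i in (a, b)]

    def threshold(i):                             # procedures M_A3/M_A4
        nbrs = neighbors(i)
        pool = nbrs if nbrs else [j for j in range(len(W)) if j != i]
        dists = [np.linalg.norm(W[i] - W[j]) for j in pool]
        return max(dists) if nbrs else min(dists)

    for t, xi in enumerate(data, start=1):
        d = [float(np.linalg.norm(xi - w)) for w in W]
        a1, a2 = (int(k) for k in np.argsort(d)[:2])       # M3: winners
        if d[a1] > threshold(a1) or d[a2] > threshold(a2):
            W.append(xi)                                   # M4/M5: new node
        else:
            key = (min(a1, a2), max(a1, a2))
            edges[key] = 0                                 # M7: connect, age 0
            for j in neighbors(a1):                        # age edges at a1
                k = (min(a1, j), max(a1, j))
                if k != key:
                    edges[k] += 1
            W[a1] = W[a1] + 0.5 * (xi - W[a1])             # M8, illustrative rates
            for j in neighbors(a1):
                W[j] = W[j] + 0.005 * (xi - W[j])
        edges = {k: v for k, v in edges.items() if v <= age_max}  # M9
        if t % lam == 0 and edges:                         # M10/M13 (simplified):
            keep = sorted({i for e in edges for i in e})   # drop isolated nodes
            if len(keep) >= 2:
                remap = {old: new for new, old in enumerate(keep)}
                W = [W[i] for i in keep]
                edges = {(remap[a], remap[b]): v for (a, b), v in edges.items()}
    return W, edges                                        # M14: the model
```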
By executing the above steps M1 to M14, the model MD configured as a network of nodes created by the SOINN method is obtained. FIG. 19 shows an example of the node distribution obtained in the information processing device 200 according to the second embodiment by learning the learning data DAT used in the first embodiment with the SOINN method. In FIG. 19, for ease of visualization, the two-dimensional plane of fx and fy is extracted from the state quantities of the learning data and mapped.
In this example, by learning with the SOINN method, the learning result for the data elements included in the learning data is represented by the representative nodes generated by the SOINN method. In other words, the number of nodes included in the model MD is smaller than the number of data elements included in the learning data DAT.
As the SOINN method, the SOINN methods of Patent Documents 1 and 2, for example, can be applied. Various SOINN methods that learn the distribution structure of the data elements included in the learning data as a distribution structure represented by a smaller number of nodes than the data elements may also be applied.
Returning to FIG. 16, the processing in the information processing device 200 from step SB2 onward will now be described.
Step SB2
The search unit 5 reads the model MD from the model holding unit 1, as in step SA1 of FIG. 11.
Step SB3
The search unit 5 acquires the state quantity p, which is the input data, as in step SA2 of FIG. 11.
Step SB4
The search unit 5 searches the model MD for the node N_NEAR closest to the input data p, as in step SA4 of FIG. 11.
Step SB5
As in step SA5 of FIG. 11, the output unit 3 operates in the same manner as in the first embodiment. That is, the output unit 3 outputs the output command value q_OUT determined based on the nearest node N_NEAR. As in the first embodiment, the output command value q_OUT may be determined using various methods, including the first to fourth output command value determination methods.
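A minimal sketch of this inference path (steps SB2 to SB5), assuming each node's vector is the concatenation of a state part and a command part; the split point `state_dim` and the function name are ours, and the first output command value determination method (take the nearest node's command value directly) is used for simplicity.

```python
import numpy as np

def infer_command(p, nodes, state_dim):
    """Find the node nearest to the state quantity p and return its
    command-value part as the output command value q_OUT.

    nodes     : (num_nodes, state_dim + cmd_dim) array of node vectors
    state_dim : length of the state-quantity part of each node vector
    """
    dists = np.linalg.norm(nodes[:, :state_dim] - p, axis=1)  # SB4
    n_near = np.argmin(dists)
    return nodes[n_near, state_dim:]                          # SB5
```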
As described above, according to this configuration, when obtaining the output value corresponding to the input data p, the nodes for which the distance to the input data p is calculated can be limited to a number smaller than the number of data elements of the learning data. Compared with methods such as a general k-nearest neighbor method, which use as many nodes as the learning data or its data elements, this greatly reduces the number of nodes subject to the distance calculation and speeds up the processing of obtaining the command value corresponding to the input data p.
Therefore, according to this configuration, as in the first embodiment, when obtaining the output value corresponding to the input data p, the nodes for which the distance to the input data p is calculated can be limited, and the processing speed can be improved.
その他の実施の形態
 なお、本発明は上記実施の形態に限られたものではなく、趣旨を逸脱しない範囲で適宜変更することが可能である。例えば、上述の実施の形態において、2つの値の大小判定について説明したが、これは例示に過ぎず、2つの値の大小判定において2つの値が等しい場合については、必要に応じて取り扱ってもよい。すなわち、第1の値が第2の値以上であるか又は第2の値よりも小さいかの判定と、第1の値が第2の値よりも大きいか又は第2の値以下であるかの判定とについては、必要に応じていずれを採用してもよい。第1の値が第2の値以下であるか又は第2の値よりも大きいかの判定と、第1の値が第2の値よりも小さいか又は第2の値以上であるかの判定については、いずれを採用してもよい。換言すれば、2つの値の大小判定を行って2つの判定結果を得る場合、2つの値が等しい場合については、必要に応じて2つの判定結果のいずれに含めてもよい。
Other Embodiments The present invention is not limited to the above-described embodiments, and can be modified as appropriate without departing from the scope of the invention. For example, in the above-described embodiment, the magnitude determination of two values has been described, but this is merely an example, and the case where the two values are equal in the magnitude determination of the two values may be handled as necessary. good. That is, determining whether the first value is greater than or equal to the second value or less than the second value, and determining whether the first value is greater than or less than or equal to the second value. With respect to the determination of, any may be adopted as necessary. determining whether the first value is less than or greater than the second value and determining whether the first value is less than or greater than or equal to the second value Either may be adopted. In other words, when determining the magnitude of two values to obtain two determination results, the case where the two values are equal may be included in either of the two determination results as required.
In the above-described embodiments, the present invention has been described mainly as a hardware configuration, but it is not limited to this, and arbitrary processing can also be realized by causing a CPU (Central Processing Unit) to execute a computer program. The program includes instructions (or software code) that, when loaded into a computer, cause the computer to perform one or more of the functions described in the above embodiments. The program may be stored in a non-transitory computer-readable medium or a tangible storage medium. By way of example and not limitation, non-transitory computer-readable media or tangible storage media include RAM (random-access memory), ROM (read-only memory), flash memory, an SSD (solid-state drive) or other memory technology, a CD-ROM (compact disc read-only memory), a DVD (digital versatile disc), a Blu-ray (registered trademark) disc or other optical disc storage, a magnetic cassette, magnetic tape, magnetic disk storage or another magnetic storage device. The program may be transmitted on a transitory computer-readable medium or a communication medium. By way of example and not limitation, transitory computer-readable media or communication media include electrical, optical, acoustic, or other forms of propagated signals.
In the first embodiment, the description assumed that the nodes obtained by learning the learning data DAT are clustered; needless to say, however, clustering may also be performed on nodes obtained by learning the learning data DAT with the SOINN method described in the second embodiment.
Some or all of the above embodiments can also be described as in the following supplementary notes, but are not limited to the following.
(Supplementary Note 1) An information processing device comprising: a model holding unit that holds a model created by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector that includes a first state quantity acquired in advance from a learning target and a first command value given, corresponding to the first state quantity, to control the operation of the learning target, the model representing the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector including a second state quantity and a second command value based on the learning result; a search unit that receives, as input data, a third state quantity acquired from a controlled object and searches, from among a number of the nodes smaller than the number of data elements of the learning data included in the model, for a node that matches or approximates the input data; and an output unit that outputs a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
(Supplementary Note 2) The information processing device according to Supplementary Note 1, wherein the model is created by learning each piece of the learning data as a node, and the search unit selects, from the nodes, nodes that are relatively close in time to the input data as the smaller number of nodes, and searches the selected nodes for the node at the shortest distance from the input data.
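One plausible reading of Supplementary Note 2, sketched below with an assumed window size w, is that consecutive inputs trace the learned trajectory, so only nodes whose time index lies near the previously matched node need to be scanned.

```python
import numpy as np

def temporal_window_nearest(p, node_states, prev_idx, w=20):
    """Scan only nodes whose time index is near the previous match (sketch)."""
    lo = max(0, prev_idx - w)
    hi = min(len(node_states), prev_idx + w + 1)
    d = np.linalg.norm(node_states[lo:hi] - p, axis=1)
    return lo + int(np.argmin(d))
```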
(Supplementary Note 3) The information processing device according to Supplementary Note 2, wherein the model is created by performing time-series clustering on the learning data to classify the nodes into a plurality of clusters, and the search unit searches for the cluster to which the input data belongs and the cluster temporally immediately following that cluster, and searches the nodes belonging to the two searched clusters for the node at the shortest distance from the input data.
(Supplementary Note 4) The information processing device according to Supplementary Note 3, wherein the search unit searches for the cluster whose centroid is at the shortest distance from the input data as the cluster to which the input data belongs.
(Supplementary Note 5) The information processing device according to Supplementary Note 3, wherein, for the first input data, the search unit searches for the cluster whose centroid is at the shortest distance from the first input data as the cluster to which the first input data belongs, and, for input data input after the first input data, searches for the cluster to which the shortest-distance node found for the immediately preceding input data belongs as the cluster to which the later input data belongs.
(Supplementary Note 6) The information processing device according to Supplementary Note 4, wherein, for the first input data, the search unit searches for the cluster whose centroid is at the shortest distance from the first input data as the cluster to which the first input data belongs, and, for input data input after the first input data, searches, from among the plurality of clusters to which the shortest-distance nodes found for a plurality of previously input pieces of input data belong, for the cluster with the highest frequency of appearance as the cluster to which the later input data belongs.
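The cluster-selection strategies of Supplementary Notes 5 and 6 might be sketched as follows: the first input is assigned to the cluster with the nearest centroid, while later inputs reuse either the cluster of the previous winner node (Note 5) or the most frequent cluster among recent winners (Note 6, shown here). The history structure is an assumption for illustration.

```python
from collections import Counter
import numpy as np

def pick_cluster(p, centroids, winner_history):
    """Choose the cluster to search for input p (sketch of Notes 5 and 6).

    centroids:      (C, D) cluster centroids
    winner_history: cluster labels of winner nodes for past inputs
                    (empty list for the very first input)
    """
    if not winner_history:
        # First input: nearest-centroid assignment.
        return int(np.argmin(np.linalg.norm(centroids - p, axis=1)))
    # Note 5 would simply return winner_history[-1]; Note 6 (shown)
    # takes the most frequent cluster in the history.
    return Counter(winner_history).most_common(1)[0][0]
```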
(Supplementary Note 7) The information processing device according to any one of Supplementary Notes 4 to 6, wherein the search unit determines the output command value based on a statistic calculated from the command values of some or all of the nodes of the cluster to which the searched node belongs.
(Supplementary Note 8) The information processing device according to Supplementary Note 7, wherein the some nodes are, in the cluster to which the searched node belongs, nodes within a predetermined distance from the searched node or a predetermined number of nodes selected in order of increasing distance from the searched node.
(Supplementary Note 9) The information processing device according to any one of Supplementary Notes 4 to 6, wherein the search unit determines the output command value based on a statistic calculated from the command value of the searched node and the command values of one or more nodes that approximate the searched node among the cluster to which the input data belongs and the cluster temporally immediately following the cluster to which the input data belongs.
(Supplementary Note 10) The information processing device according to Supplementary Note 9, wherein, for the first input data, the search unit searches for the cluster whose centroid is at the shortest distance from the first input data as the cluster to which the first input data belongs, and, for input data input after the first input data, searches, from among the cluster to which the node found for the immediately preceding input data belongs and the clusters to which the one or more nodes approximating the found node belong, for the cluster with the highest frequency of appearance as the cluster to which the later input data belongs.
(Supplementary Note 11) The information processing device according to Supplementary Note 1, further comprising a model creation unit that creates the model by approximately learning the distribution structure of the data elements of the learning data with a smaller number of nodes than the data elements of the learning data and outputs the created model to the model holding unit, wherein the search unit searches the nodes included in the model for a node that approximates the input data.
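Supplementary Note 11 calls for learning the distribution of the training samples with fewer nodes than samples, which a self-organizing incremental network such as SOINN provides. The following is only a crude online vector-quantization sketch of that idea, with a fixed node budget and a winner-moves-toward-sample update; it is not the SOINN algorithm itself.

```python
import numpy as np

def fit_reduced_nodes(samples, n_nodes=50, lr=0.05, epochs=3, seed=0):
    """Approximate the sample distribution with n_nodes << len(samples).

    samples: (N, D) float array of learning-data elements
    """
    rng = np.random.default_rng(seed)
    # Initialize the nodes from randomly chosen samples.
    nodes = samples[rng.choice(len(samples), n_nodes, replace=False)].copy()
    for _ in range(epochs):
        for x in samples:
            winner = int(np.argmin(np.linalg.norm(nodes - x, axis=1)))
            nodes[winner] += lr * (x - nodes[winner])  # pull winner toward x
    return nodes
```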
(Supplementary Note 12) The information processing device according to any one of Supplementary Notes 2 to 4 and 11, wherein the search unit outputs the command value of the searched node as the output command value.
(Supplementary Note 13) The information processing device according to any one of Supplementary Notes 2 to 4 and 11, wherein the search unit determines the output command value based on a statistic calculated from the command value of the searched node and the command values of one or more nodes that approximate the searched node.
(Supplementary Note 14) The information processing device according to any one of Supplementary Notes 9, 10, and 13, wherein the one or more nodes approximating the searched node are nodes within a predetermined distance from the searched node or a predetermined number of nodes selected in order of increasing distance from the searched node.
(Supplementary Note 15) The information processing device according to any one of Supplementary Notes 7 to 10, 13, and 14, wherein the statistic is any one of an average value, a median value, a maximum value, a minimum value, and a mode.
(Supplementary Note 16) The information processing device according to any one of Supplementary Notes 1 to 15, wherein the second state quantity is the first state quantity acquired in advance from the learning target, and the second command value is the first command value given to the learning target in accordance with the first state quantity acquired in advance.
(Supplementary Note 17) The information processing device according to Supplementary Note 16, wherein the first command value is an actual value of a command value given to the learning target by an operator of the learning target based on the first state quantity acquired in advance from the learning target.
(Supplementary Note 18) The information processing device according to Supplementary Note 16, wherein the first command value included in a data element of the learning data is a command value given to the controlled object a predetermined time after the first state quantity acquired in advance from the learning target.
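Building training elements of the kind described in Supplementary Note 18, where each state quantity is paired with the command value issued a fixed time later, could look like the sketch below; the sample delay offset and the array layout are assumptions made for illustration.

```python
import numpy as np

def build_offset_pairs(state_log, command_log, offset):
    """Pair the state at time t with the command issued offset steps later.

    state_log:   (T, Ds) logged state quantities
    command_log: (T, Dc) logged command values
    Returns (T - offset, Ds + Dc) data elements (state_t, command_{t+offset}).
    """
    n = len(state_log) - offset
    return np.hstack([state_log[:n], command_log[offset:offset + n]])
```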
(Supplementary Note 19) The information processing device according to any one of Supplementary Notes 1 to 17, wherein the information processing device is provided in the learning target or the controlled object.
(Supplementary Note 20) The information processing device according to any one of Supplementary Notes 1 to 17, wherein the learning target and the controlled object are the same.
(Supplementary Note 21) The information processing device according to Supplementary Note 20, wherein the information processing device is provided in the controlled object.
(Supplementary Note 22) An information processing method comprising: holding a model created by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector that includes a first state quantity acquired in advance from a learning target and a first command value given, corresponding to the first state quantity, to control the operation of the learning target, the model representing the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector including a second state quantity and a second command value based on the learning result; receiving, as input data, a third state quantity acquired from a controlled object and searching, from among a number of the nodes smaller than the number of data elements of the learning data included in the model, for a node that matches or approximates the input data; and outputting a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
(Supplementary Note 23) A program causing a computer to execute: a process of holding a model created by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector that includes a first state quantity acquired in advance from a learning target and a first command value given, corresponding to the first state quantity, to control the operation of the learning target, the model representing the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector including a second state quantity and a second command value based on the learning result; a process of receiving, as input data, a third state quantity acquired from a controlled object and searching, from among a number of the nodes smaller than the number of data elements of the learning data included in the model, for a node that matches or approximates the input data; and a process of outputting a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
(Supplementary Note 24) An information processing system comprising: a model creation unit that creates a model as a set of nodes each represented by a multidimensional vector including a second state quantity and a second command value based on the result of learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector that includes a first state quantity acquired in advance from a learning target and a first command value given, corresponding to the first state quantity, to control the operation of the learning target, the model representing the distribution structure of the learning data; a model holding unit that holds the created model; a search unit that receives, as input data, a third state quantity acquired from a controlled object and searches, from among a number of the nodes smaller than the number of data elements of the learning data included in the model, for a node that matches or approximates the input data; and an output unit that outputs a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
Although the present invention has been described above with reference to the embodiments, the present invention is not limited to the above. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within the scope of the invention.
This application claims priority based on Japanese Patent Application No. 2021-133415, filed on August 18, 2021, the entire disclosure of which is incorporated herein.
1 model holding unit
2 search unit
3 output unit
4 model creation unit
5 search unit
10 operator
11 command device
20 operation target device
30 robot
31 robot arm
32 force sensor
33 cylindrical pole
41 input information acquisition means
42 winner node search means
43 similarity threshold calculation means
44 similarity threshold determination means
45 node insertion means
46 weight vector update means
47 node density calculation means
48 distribution overlap region detection means
49 edge connection determination means
50 edge connection means
51 edge deletion means
52 noise node deletion means
53 output information display means
100 information processing device
110 processing unit
120 display unit
130 input unit
200 information processing device
1000 computer
1001 CPU
1002 ROM
1003 RAM
1004 bus
1005 input/output interface
1006 input unit
1007 output unit
1008 storage unit
1009 communication unit
1010 drive
1011 magnetic disk
1012 optical disc
1013 flexible disk
1014 semiconductor memory

Claims (14)

1.  An information processing device comprising:
     a model holding unit that holds a model created by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector that includes a first state quantity acquired in advance from a learning target and a first command value given, corresponding to the first state quantity, to control the operation of the learning target, the model representing the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector including a second state quantity and a second command value based on the learning result;
     a search unit that receives, as input data, a third state quantity acquired from a controlled object and searches, from among a number of the nodes smaller than the number of data elements of the learning data included in the model, for a node that matches or approximates the input data; and
     an output unit that outputs a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
2.  The information processing device according to claim 1, wherein
     the model is created by learning each piece of the learning data as a node, and
     the search unit selects, from the nodes, nodes that are relatively close in time to the input data as the smaller number of nodes, and searches the selected nodes for the node at the shortest distance from the input data.
3.  The information processing device according to claim 2, wherein
     the model is created by performing time-series clustering on the learning data to classify the nodes into a plurality of clusters, and
     the search unit searches for the cluster to which the input data belongs and the cluster temporally immediately following that cluster, and searches the nodes belonging to the two searched clusters for the node at the shortest distance from the input data.
4.  The information processing device according to claim 3, wherein the search unit determines the output command value based on a statistic calculated from the command values of some or all of the nodes of the cluster to which the searched node belongs.
5.  The information processing device according to claim 1, further comprising a model creation unit that creates the model by approximately learning the distribution structure of the data elements of the learning data with a smaller number of nodes than the data elements of the learning data and outputs the created model to the model holding unit, wherein
     the search unit searches the nodes included in the model for a node that approximates the input data.
6.  The information processing device according to any one of claims 2, 3, and 5, wherein the search unit outputs the command value of the searched node as the output command value.
7.  The information processing device according to any one of claims 2, 3, and 5, wherein the search unit determines the output command value based on a statistic calculated from the command value of the searched node and the command values of one or more nodes that approximate the searched node.
8.  The information processing device according to claim 7, wherein the one or more nodes approximating the searched node are nodes within a predetermined distance from the searched node or a predetermined number of nodes selected in order of increasing distance from the searched node.
9.  The information processing device according to any one of claims 4, 7, and 8, wherein the statistic is any one of an average value, a median value, a maximum value, a minimum value, and a mode.
10.  The information processing device according to any one of claims 1 to 9, wherein the second state quantity is the first state quantity acquired in advance from the learning target, and the second command value is the first command value given to the learning target in accordance with the first state quantity acquired in advance.
11.  The information processing device according to claim 10, wherein the first command value is an actual value of a command value given to the learning target by an operator of the learning target based on the first state quantity acquired in advance from the learning target.
12.  The information processing device according to claim 10, wherein the first command value included in a data element of the learning data is a command value given to the controlled object a predetermined time after the first state quantity acquired in advance from the learning target.
13.  An information processing method comprising:
     holding a model created by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector that includes a first state quantity acquired in advance from a learning target and a first command value given, corresponding to the first state quantity, to control the operation of the learning target, the model representing the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector including a second state quantity and a second command value based on the learning result;
     receiving, as input data, a third state quantity acquired from a controlled object and searching, from among a number of the nodes smaller than the number of data elements of the learning data included in the model, for a node that matches or approximates the input data; and
     outputting a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
14.  A non-transitory computer-readable medium storing a program that causes a computer to execute:
     a process of holding a model created by learning learning data based on time-series data including a plurality of data elements each represented by a multidimensional vector that includes a first state quantity acquired in advance from a learning target and a first command value given, corresponding to the first state quantity, to control the operation of the learning target, the model representing the distribution structure of the learning data as a set of nodes each represented by a multidimensional vector including a second state quantity and a second command value based on the learning result;
     a process of receiving, as input data, a third state quantity acquired from a controlled object and searching, from among a number of the nodes smaller than the number of data elements of the learning data included in the model, for a node that matches or approximates the input data; and
     a process of outputting a value based on the second command value of the searched node to the controlled object as an output command value given to operate the controlled object.
PCT/JP2022/014070 2021-08-18 2022-03-24 Information processing device, information processing method, and non-transitory computer-readable medium in which program is stored WO2023021776A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2023542213A JPWO2023021776A5 (en) 2022-03-24 Information processing device, information processing method and program

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-133415 2021-08-18
JP2021133415 2021-08-18

Publications (1)

Publication Number Publication Date
WO2023021776A1 (en) 2023-02-23

Family

ID=85240327

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/014070 WO2023021776A1 (en) 2021-08-18 2022-03-24 Information processing device, information processing method, and non-transitory computer-readable medium in which program is stored

Country Status (1)

Country Link
WO (1) WO2023021776A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000276484A (en) * 1999-03-25 2000-10-06 Konica Corp Device and method for image retrieval and image display device
JP2008299640A (en) * 2007-05-31 2008-12-11 Tokyo Institute Of Technology Pattern recognition device, pattern recognition method, and program
WO2016114240A1 (en) * 2015-01-13 2016-07-21 株式会社小松製作所 Drilling machine, method for controlling drilling machine, and drilling system
WO2020241419A1 (en) * 2019-05-24 2020-12-03 川崎重工業株式会社 Construction machine with learning function


Also Published As

Publication number Publication date
JPWO2023021776A1 (en) 2023-02-23

Similar Documents

Publication Publication Date Title
JP6781415B2 (en) Neural network learning device, method, program, and pattern recognition device
US20140046878A1 (en) Method and system for detecting sound events in a given environment
CN106778501B (en) Video face online identification method based on compression tracking and IHDR incremental learning
CN103473540B (en) The modeling of intelligent transportation system track of vehicle increment type and online method for detecting abnormality
CN107943856A (en) A kind of file classification method and system based on expansion marker samples
CN113393032B (en) Track circulation prediction method based on resampling
JP7276488B2 (en) Estimation program, estimation method, information processing device, relearning program and relearning method
CN112800682A (en) Feedback optimization fan blade fault monitoring method
CN112836735B (en) Method for processing unbalanced data set by optimized random forest
Xiong et al. Recursive learning for sparse Markov models
CN116051479A (en) Textile defect identification method integrating cross-domain migration and anomaly detection
CN112966088A (en) Unknown intention recognition method, device, equipment and storage medium
WO2023021776A1 (en) Information processing device, information processing method, and non-transitory computer-readable medium in which program is stored
Dhyaram et al. RANDOM SUBSET FEATURE SELECTION FOR CLASSIFICATION.
Priyadarshana et al. A modified cross entropy method for detecting multiple change points in DNA Count Data
Wang et al. Detecting outliers in complex nonlinear systems controlled by predictive control strategy
JP5130523B2 (en) Information processing apparatus, information processing method, and program
US20230237371A1 (en) Systems and methods for providing predictions with supervised and unsupervised data in industrial systems
JP2008299640A (en) Pattern recognition device, pattern recognition method, and program
US8943005B2 (en) Metric learning apparatus
CN112434983A (en) Method for rapidly judging product quality based on clustered hyper-rectangular model
JP3712583B2 (en) Information clustering apparatus and recording medium recording information clustering program
WO2021059527A1 (en) Learning device, learning method, and recording medium
CN112069876A (en) Handwriting recognition method based on adaptive differential gradient optimization
JP5667004B2 (en) Data classification apparatus, method and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22858101

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2023542213

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE