WO2023276255A1

WO2023276255A1 - Information processing device, information processing method, and program

Info

Publication number: WO2023276255A1
Application number: PCT/JP2022/006846
Authority: WO
Inventors: 克文杉本
Original assignee: ソニーグループ株式会社
Priority date: 2021-07-01
Filing date: 2022-02-21
Publication date: 2023-01-05

Abstract

Provided are a device and a method that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint, enabling high-speed calculation of the optimal solution to the quadratic programming problem. The information processing device comprises: a constraint norm estimator that estimates, for each constraint of the quadratic programming problem, a norm corresponding to an input parameter; and an active constraint selection unit that, by comparing the estimated constraint norm with a predefined threshold value, generates constraint activity analysis information that makes it possible to determine whether each constraint of the quadratic programming problem is an active constraint used for calculating the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used for calculating the optimal solution. A linear system analysis unit selects the active constraints using the constraint activity analysis information and calculates the optimal solution to the quadratic programming problem.

Description

Information processing device, information processing method, and program

The present disclosure relates to an information processing device, an information processing method, and a program. More specifically, the present disclosure relates to an information processing device, an information processing method, and a program that perform learning processing for determining control parameters of a robot, control of the robot using the learning result, and the like.

For example, when calculating the travel route of a robot or the optimal trajectory of a robot arm, information acquired by sensors attached to the robot, such as cameras and distance sensors, is analyzed to determine the positions of obstacles, and a route or trajectory that does not come into contact with the obstacles is calculated.

Also, when controlling the robot to travel along the optimal path or to move its arm along the optimal trajectory, a process is required to determine how to control the motors and actuators installed in each part of the robot, such as the wheels, legs, and arm.

As a method for performing such optimum control, a method using a quadratic programming problem (QP) is known.
A quadratic programming problem (QP) is an optimization problem in which the objective function is a quadratic function and the constraints are linear functions; the method of least squares is one example.
For example, when the objective function is convex, it can be solved as a minimization problem.

For example, Non-Patent Document 1 (Online Mixed-Integer Optimization in Milliseconds) discloses a method of calculating an optimal solution to a quadratic programming problem.

Non-Patent Document 1 uses a neural network trained in advance to predict the active constraints and integer values when obtaining an optimal solution to a mixed-integer quadratic programming problem, and discloses a method of calculating the optimal solution at high speed by reducing the problem to a linear problem.

From the parameters that formulate a mixed-integer quadratic programming problem, the neural network predicts the combination of active constraints and integer values at the optimal solution.
Input: parameters
Output: active constraints and integer values
The network is trained on a data set consisting of these input/output pairs. This data set can be prepared by solving a large number of problems in advance.

In Non-Patent Document 1, the neural network is modeled as a classifier. In general, the output dimension would have to equal the number of all possible "combinations of active constraints and integer values" (one-hot vectorization). However, this number grows exponentially with the number N of constraints.
Therefore, in Non-Patent Document 1, the number of combinations of active constraints and integer values that actually appear in a heuristically prepared data set is used as the output dimension.

This method avoids the problem of the output dimension growing exponentially with the number of constraints N, along with the resulting increase in memory consumption and computational cost. However, because not all patterns can be covered, the optimal solution may not be obtainable in some cases.

It should be noted that the problem of the output dimension increasing exponentially can be avoided by using a model that directly predicts the input/output relationship through regression prediction. For example, by adopting a configuration that does not use one-hot encoding, the output dimension is proportional to the number of constraints N, and the amount of calculation can be reduced. However, in this case, since the output is binary, it is not suitable for neural network learning, and the problem arises that the learning efficiency deteriorates.
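The difference between the two output encodings discussed above can be illustrated with simple arithmetic. The following is a minimal sketch that assumes each constraint is either active or inactive (two states) and ignores the integer variables; the function names and the values of N are hypothetical.

```python
# Illustrative arithmetic only. Assumption: each of the N constraints is either
# active or inactive, and integer variables are ignored.

def one_hot_output_dim(num_constraints: int) -> int:
    """Classes needed if every active/inactive pattern is its own class."""
    return 2 ** num_constraints


def per_constraint_output_dim(num_constraints: int) -> int:
    """Outputs needed if the model emits one value per constraint."""
    return num_constraints


for n in (4, 16, 64):
    print(n, one_hot_output_dim(n), per_constraint_output_dim(n))
```

Under these assumptions the one-hot dimension doubles with every added constraint, while the per-constraint dimension grows linearly.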

The present disclosure has been made in view of, for example, the above problems, and an object thereof is to provide an information processing device, an information processing method, and a program capable of solving a quadratic programming problem (QP) efficiently and at high speed.

One embodiment of the present disclosure provides, for example, an information processing device, an information processing method, and a program that perform learning processing used for determining control parameters of a robot and that perform robot control using a predictor and data generated by the learning processing.

A first aspect of the present disclosure is an information processing apparatus including:
a quadratic programming problem optimal solution calculation unit that calculates the optimal solution of the quadratic programming problem corresponding to an input parameter;
a constraint norm calculation unit that calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning process execution unit that executes learning processing using set data of the input parameter and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm corresponding to various input parameters.

Furthermore, a second aspect of the present disclosure is an information processing apparatus including:
a constraint norm estimator that estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm corresponding to an input parameter;
an active constraint selection unit that, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold value, generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used to calculate that optimal solution; and
a linear system analysis unit that selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.

Furthermore, a third aspect of the present disclosure is an information processing method executed in an information processing device, the method including:
a quadratic programming problem optimal solution calculation step in which a quadratic programming problem optimal solution calculation unit calculates the optimal solution of the quadratic programming problem corresponding to an input parameter;
a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution step in which a learning processing execution unit executes learning processing using set data of the input parameter and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm corresponding to various input parameters.

Furthermore, a fourth aspect of the present disclosure is an information processing method executed in an information processing device, the method including:
a constraint norm estimation step in which a constraint norm estimator estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm corresponding to an input parameter;
an active constraint selection step in which an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold value, generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used to calculate that optimal solution; and
a linear system analysis step in which a linear system analysis unit selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.

Furthermore, a fifth aspect of the present disclosure is a program for executing information processing in an information processing device, the program causing execution of:
a quadratic programming problem optimal solution calculation step of causing a quadratic programming problem optimal solution calculation unit to calculate the optimal solution of the quadratic programming problem corresponding to an input parameter;
a constraint norm calculation step of causing a constraint norm calculation unit to calculate a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution step of causing a learning processing execution unit to execute learning processing using set data of the input parameter and the constraint norm as learning data, and to generate a constraint norm estimator that estimates the constraint norm corresponding to various input parameters.

Furthermore, a sixth aspect of the present disclosure is a program for executing information processing in an information processing device, the program causing execution of:
a constraint norm estimation step of causing a constraint norm estimator to estimate, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm corresponding to an input parameter;
an active constraint selection step of causing an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold value, to generate constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used to calculate that optimal solution; and
a linear system analysis step of causing a linear system analysis unit to select only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and to calculate the optimal solution of the quadratic programming problem.

It should be noted that the program of the present disclosure is, for example, a program that can be provided in a computer-readable format to an information processing device or computer system capable of executing various program codes via a storage medium or communication medium. By providing such a program in a computer-readable format, processing according to the program is realized on the information processing device or computer system.

Still other objects, features, and advantages of the present disclosure will become apparent from more detailed descriptions based on the embodiments of the present disclosure and the accompanying drawings, which will be described later. In this specification, a system is a logical collective configuration of a plurality of devices, and the devices of each configuration are not limited to being in the same housing.

According to the configuration of an embodiment of the present disclosure, an apparatus and a method are realized that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint and thereby enable high-speed calculation of the optimal solution to the quadratic programming problem.
Specifically, for example, the configuration includes a constraint norm estimator that estimates, for each constraint of a quadratic programming problem, the norm corresponding to the input parameter, and an active constraint selection unit that, by comparing the estimated constraint norm with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used to calculate the optimal solution. A linear system analysis unit uses the constraint activity analysis information to select the active constraints and calculates the optimal solution to the quadratic programming problem.
With this configuration, it is possible to realize an apparatus and a method for efficiently selecting active constraints of a quadratic programming problem using the norm of each constraint and enabling high-speed calculation of the optimum solution of the quadratic programming problem.
Note that the effects described in this specification are merely examples and are not limited, and additional effects may be provided.
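The threshold comparison summarized above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: it assumes that the constraint norm is the Euclidean distance from a point to the hyperplane of each constraint and that a constraint is treated as active when its norm falls below the threshold. The function names, the data, and the threshold value are hypothetical, and the norms are computed directly here rather than estimated by a learned estimator.

```python
import numpy as np


def constraint_norms(A, b, x):
    """Distance from x to each hyperplane a_i^T x = b_i (assumed norm definition)."""
    A = np.asarray(A, dtype=float)
    residual = A @ x - np.asarray(b, dtype=float)
    return np.abs(residual) / np.linalg.norm(A, axis=1)


def select_active(norms, threshold):
    """Assumed rule: a constraint is active when its norm is below the threshold."""
    return np.asarray(norms) < threshold


# Hypothetical data: three constraints a_i^T x <= b_i in two dimensions.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 2.0, 5.0])
x_opt = np.array([1.0, 0.5])  # stand-in for an optimal solution

norms = constraint_norms(A, b, x_opt)  # an estimator would predict these values
active = select_active(norms, threshold=1e-3)
print(norms, active)
```

In this sketch only the first constraint lies on its boundary, so only it is selected as active.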

FIG. 1 is a diagram illustrating an example of control processing of a robot to which the processing of the present disclosure can be applied.
FIG. 2 is a diagram illustrating a configuration example of a device that quickly calculates an optimal solution x of a quadratic programming problem by extracting "active constraints" from "inequality constraints" set in the quadratic programming problem.
FIG. 3 is a diagram illustrating a configuration example of a device that executes learning processing to which a quadratic programming problem is applied.
FIG. 4 is a diagram illustrating an example of active constraints and inactive constraints in a quadratic programming problem.
FIG. 5 is a diagram illustrating a specific example of active constraint identification data (S^*(θ)) generated by an active constraint identification data generation unit.
FIG. 6 is a diagram illustrating a specific example of a predictor (NN: neural network) generated by a class classification processing unit (NN Classifier = neural network class classification unit).
FIG. 7 is a diagram illustrating an example of a label corresponding to active constraint identification data (S^*(θ)) output by a predictor (NN: neural network).
FIG. 8 is a diagram illustrating a configuration example of a control information generation unit that calculates an optimal solution x^* of a quadratic programming problem including optimal control information from robot observation information (θ).
FIG. 9 is a diagram illustrating the detailed configuration and processing of a class classification processing unit (NN Classifier = neural network class classification unit).
FIG. 10 is a diagram illustrating a configuration example of a device that executes learning processing to which a quadratic programming problem is applied.
FIG. 11 is a diagram illustrating a specific example of active constraints and inactive constraints in a quadratic programming problem, and of norms (constraint norms) used as indices for distinguishing active constraints from inactive constraints.
FIG. 12 is a diagram illustrating a specific example of a constraint norm (S_l^*(θ)) generated by a constraint norm calculation unit (Calc Norm).
FIG. 13 is a diagram illustrating an example of a regression analyzer generated by learning processing in a constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit).
FIG. 14 is a diagram illustrating a configuration example of a control information generation unit that calculates an optimal solution x^* of a quadratic programming problem including optimal control information from robot observation information (θ).
FIG. 15 is a diagram illustrating an example of processing executed by a constraint norm estimator (NN Regressor = neural network regression analyzer).
FIG. 16 is a diagram illustrating a specific example of processing executed by a threshold-applied active constraint selection unit (Threshold).
FIG. 17 is a diagram illustrating a hardware configuration example of the information processing apparatus of the present disclosure.

Details of the information processing apparatus, the information processing method, and the program according to the present disclosure will be described below with reference to the drawings. The description will be made according to the following items.
1. Overview of quadratic programming problem (QP) and robot control
2. Concrete examples of learning processing applying the quadratic programming problem and robot control processing using the learning results
3. Configuration that identifies active constraints based on the norm from the optimal solution of the quadratic programming problem to each constraint
4. Hardware configuration example of information processing apparatus
5. Summary of the configuration of the present disclosure

[1. Overview of quadratic programming problem (QP) and robot control]
First, an outline of a quadratic programming problem (QP) and robot control will be described.

For example, when calculating a traveling route 20 of a traveling robot 10 as shown in FIG. 1, or when calculating the optimal trajectory of a robot arm, information acquired by sensors attached to the robot, such as a camera and a distance sensor, is analyzed to determine the positions of obstacles, and a route or trajectory that does not come into contact with the obstacles is calculated.

Also, when controlling the robot to travel along the optimal path or to move its arm along the optimal trajectory, a process is required to determine how to control the motors and actuators installed in each part of the robot, such as the wheels, legs, and arm.

As a method for performing such optimum control, a method using a quadratic programming problem (QP) is known.
The quadratic programming problem is a problem of calculating an optimal solution, such as robot path information and control information, by applying quadratic programming, which is a representative example of a nonlinear programming technique for mathematical optimization.

A quadratic programming problem is an optimization problem in which the objective function is a quadratic function and the constraint condition is a linear function. For example, the least squares method is also a kind of quadratic programming problem.
A quadratic programming problem can be solved, for example, as a minimization problem with a convex objective function.

The quadratic programming problem is the problem of finding an n-dimensional vector x as the optimal solution to the problem shown in (Formula 1) below.
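(Formula 1) is not reproduced in this text. As a sketch consistent with the parameters P, q, A, l, and u defined below and with the constraint "l≦Ax≦u", and assuming the conventional factor of 1/2 on the quadratic term, it can be written as:

```latex
\begin{aligned}
\text{(a)}\quad & \min_{x}\ \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to}\ \ l \le A x \le u
\end{aligned}
```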

In the above (Formula 1), (a) is the objective function (or cost function) and (b) is the constraint function.
The quadratic programming problem is a problem of finding the optimum solution x (n-dimensional vector) that minimizes the (a) objective function in the above equation.
The constraint function (b) defines the allowable range in which the optimal solution x may exist.
When solving a quadratic programming problem, the optimal solution x must be found within the range that satisfies the constraint function (b).

Each parameter shown in (a) objective function and (b) constraint function in the above formula is the following parameter.
P is an n×n real-valued symmetric matrix,
q is an n×1 real vector,
A is an m×n matrix,
l, u are m-dimensional vectors,
x^T is the transpose of the n-dimensional vector x.

Note that “l≦Ax≦u” shown in (b) constraint function means a constraint that each element of vector Ax is greater than or equal to the corresponding element of vector l and less than or equal to the corresponding element of vector u.
Constraints that use inequalities in this way are called "inequality constraints". In contrast, constraints that use an equation, for example
Cx=d
are called "equality constraints".

Further, the constraints include both active constraints that can be used for the calculation process of the optimum solution x and non-active constraints that are not used for the calculation process of the optimum solution x.
If only active equality constraints can be extracted, quadratic programming problems can be reduced to linear equations, and high-speed optimal solution calculation becomes possible.

With reference to FIG. 2, a configuration example of a device that quickly calculates the optimal solution x of a quadratic programming problem by extracting "active constraints" from "inequality constraints" set in the quadratic programming problem will be described.

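The quadratic programming problem optimum solution calculation device 30 shown in FIG. 2 has a quadratic programming problem standardized model generation unit (QP Modeling) 31 and a quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32.
The quadratic programming problem optimum solution calculation device 30 receives the parameter θ as input and outputs the optimal solution x^* of the quadratic programming problem.
Note that x^* denotes the optimal solution for x (an n-dimensional vector).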

The relationship between the input parameter θ and the output optimal solution x^* can be the following correspondence when applied, for example, to the control configuration of the robot 10 shown in FIG. 1.
Input parameter θ = observation information (distance of obstacles, robot position, speed, direction, etc.)
Output optimum solution x* = robot control information (robot traveling direction control information, speed control information, output control information for left and right wheels, etc.)

That is, the quadratic programming problem optimal solution calculation device 30 shown in FIG. 2 can be used for processing such as receiving the parameter θ composed of the observation information of the robot 10 and outputting the robot control information as the optimal solution x^* of the quadratic programming problem.
The constraint function is composed of, for example, a function that defines speed limit information of the robot 10, minimum distance information that is allowed between the robot and an obstacle, and the like.

The input parameter θ is, for example, a k-dimensional vector (θ₀, θ₁, ..., θ_{k−1}) of observation information, and the optimal solution x^* is an n-dimensional vector (x₀, x₁, ..., x_{n−1}) of control information.
Each parameter (P, q, A, l, u) set in the (a) objective function and (b) constraint function of (Formula 1) above is defined by the relationship between the input parameter θ and the output optimal solution x^*.

Processing executed by each component of the quadratic programming problem optimum solution calculation device 30 shown in FIG. 2 will be described.
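The quadratic programming problem standardized model generation unit (QP Modeling) 31 of the quadratic programming problem optimum solution calculation device 30 shown in FIG. 2 receives the parameter θ as input and generates a quadratic programming problem standardized model.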

The quadratic programming problem standardized model is the mathematical model of (Formula 1), which was explained earlier.

In the above (Formula 1), (a) is an objective function (or cost function).
(b) is a constraint function, which is an inequality constraint function composed of inequalities.
As described above, the quadratic programming problem is the problem of calculating the optimal solution x^* (n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function in the above formula.

The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 31, that is, the model composed of (a) the objective function and (b) the constraint function of (Formula 1) above, and calculates and outputs the optimal solution x^* (n-dimensional vector).

The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 calculates the optimal solution x^* of the optimization problem (quadratic programming problem) and substitutes the calculated optimal solution x^* into the inequality constraint “l≦Ax≦u”.
This substitution identifies the rows for which equality holds. Selection matrices S_cl and S_cu are generated in which the matrix elements corresponding to rows satisfying the equality are set to 1 and all other matrix elements are set to 0. Using these matrices S_cl and S_cu, the following relational expressions hold:
S_cl Ax^* = S_cl l,
S_cu Ax^* = S_cu u.
By concatenating these relational expressions, the following (Equation 2) is generated.
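(Equation 2) is not reproduced in this text. As a sketch, stacking the two relations for S_cl and S_cu stated above gives a form such as:

```latex
\begin{bmatrix} S_{cl} \\ S_{cu} \end{bmatrix} A\, x^{*}
=
\begin{bmatrix} S_{cl}\, l \\ S_{cu}\, u \end{bmatrix}
```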

The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 uses (Equation 2) above to extract the active constraints, treats the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 31 into a quadratic programming problem with active equality constraints, as shown in (Equation 3) below.
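(Equation 3) is not reproduced in this text. As a sketch consistent with the parameters P, q, C, and d defined below, and assuming the same conventional factor of 1/2 on the quadratic term, it can be written as:

```latex
\begin{aligned}
\text{(a)}\quad & \min_{x}\ \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to}\ \ C x = d
\end{aligned}
```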

Note that (a) in the above (Equation 3) is an objective function (or a cost function).
(b) is a constraint function, a constraint function consisting of active equality constraints.

Each parameter shown in (a) objective function and (b) constraint function in the above formula is the following parameter.
P is an n×n real-valued symmetric matrix,
q is an n×1 real vector,
x^T is the transpose of the n-dimensional vector x,
C is an m×n matrix,
d is an m-dimensional vector.

In this way, the quadratic programming problem is replaced by the problem of calculating the optimal solution x^* (n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the active equality constraint function in the above formula.

As described above, the constraints of a quadratic programming problem include inequality constraints and equality constraints, and also include active constraints that are used in calculating the optimal solution x^* and inactive constraints that are not. If only the active constraints are extracted and treated as active equality constraints, the quadratic programming problem can be reduced to a linear equation, and the optimal solution x^* can be calculated at high speed.
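The reduction to a linear equation can be illustrated with the standard KKT system for an equality-constrained quadratic program. The following is a minimal sketch, assuming an objective of the form 0.5·xᵀPx + qᵀx, a positive definite P, and a full-row-rank C so that the KKT matrix is invertible; the function name and the numerical values are hypothetical and not taken from the disclosure.

```python
import numpy as np


def solve_equality_qp(P, q, C, d):
    """Minimize 0.5*x^T P x + q^T x subject to C x = d via one linear solve."""
    n, m = P.shape[0], C.shape[0]
    # KKT conditions: P x + q + C^T lam = 0 and C x = d.
    kkt = np.block([[P, C.T], [C, np.zeros((m, m))]])
    rhs = np.concatenate([-q, d])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]  # primal solution, Lagrange multipliers


# Hypothetical 2-D problem with a single active constraint x0 = 0.25.
P = np.eye(2)
q = np.array([-1.0, -1.0])
C = np.array([[1.0, 0.0]])
d = np.array([0.25])

x_star, multipliers = solve_equality_qp(P, q, C, d)
print(x_star, multipliers)
```

Once the active rows are known, no iterative search over the inequality constraints is needed; a single call to a linear solver yields the solution.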

As described above, the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 shown in FIG. 2 substitutes the calculated optimal solution x^* into the inequality constraint “l≦Ax≦u” to extract only the rows where equality holds, and generates selection matrices S_cl and S_cu in which the other rows are set to 0.

Using these selection matrices S_cl and S_cu, only the active constraints are extracted from the inequality constraints “l≦Ax≦u”, these are treated as active equality constraints, and the optimal solution x^* (n-dimensional vector) is calculated.
By performing such processing, it becomes possible to calculate the optimal solution x^* of the quadratic programming problem at high speed.
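The construction of the selection matrices can be sketched as follows. This is a minimal illustration that assumes S_cl and S_cu are diagonal 0/1 matrices and that equality is tested with a small numerical tolerance; the function name, the tolerance, and the data are hypothetical.

```python
import numpy as np


def selection_matrices(A, l, u, x_opt, tol=1e-8):
    """Diagonal 0/1 matrices marking rows of l <= A x <= u tight at x_opt."""
    ax = A @ x_opt
    s_cl = np.diag(np.isclose(ax, l, atol=tol).astype(float))  # tight at lower bound
    s_cu = np.diag(np.isclose(ax, u, atol=tol).astype(float))  # tight at upper bound
    return s_cl, s_cu


# Hypothetical box constraints 0 <= x <= 1 with an optimum on one upper bound.
A = np.eye(2)
l = np.array([0.0, 0.0])
u = np.array([1.0, 1.0])
x_opt = np.array([1.0, 0.4])

S_cl, S_cu = selection_matrices(A, l, u, x_opt)
print(np.diag(S_cl), np.diag(S_cu))
# The selected rows satisfy S_cu A x* = S_cu u, matching the relations in the text.
print(S_cu @ A @ x_opt, S_cu @ u)
```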

However, with the configuration shown in FIG. 2, the quadratic programming problem must first be solved to calculate the optimal solution x^*, and only in subsequent processing is the calculated optimal solution x^* used to extract the active equality constraints.
As a result, the active equality constraints cannot be used in the process of calculating the first optimal solution x^*.

For example, when actually controlling a robot, it is necessary to quickly calculate an optimal solution x^* as optimal control information corresponding to various observation information (θ).
For this purpose, a process is required that extracts the active constraints in advance and uses the extracted active constraints to quickly calculate the unknown optimal solution x^*.

In order to realize such processing, it is effective to perform learning processing in advance.
For example, set data of various input parameters (θ) and active constraint identification data for extracting active constraints corresponding to the input parameters (θ) are generated in advance as a learning data set.

For example, a learning process that generates active constraint identification data for extracting active constraints from the inequality constraints included in a quadratic programming problem is executed in advance, producing a learning data set (θ, S^*(θ)) consisting of pairs of an input parameter (θ) and the corresponding active constraint identification data (S^*(θ)).

During actual robot control, the active constraint identification data (S^*(θ)) is applied to extract the active constraints corresponding to the input parameter (θ), the extracted active constraints are treated as active equality constraints, and the quadratic programming problem is solved to calculate the optimal solution x^*.
By performing such processing, it becomes possible to calculate the optimal solution x^* of the quadratic programming problem at high speed.
In the following items, this learning process and an example of control processing using the learning result will be described.

[2. Concrete examples of learning processing applying the quadratic programming problem and robot control processing using the learning results]
A specific example of a learning process to which a quadratic programming problem is applied and a robot control process using the learning result of this learning process will be described below.

As described above, a method using learning processing is effective as one way to calculate the unknown optimal solution x^* of the quadratic programming problem at high speed.

Active constraint identification data corresponding to various input parameters (θ), that is, data for selectively extracting the active constraints corresponding to an input parameter (θ) from the inequality constraints included in the quadratic programming problem, is generated in advance by learning processing.
That is, through the learning process, a learning data set (θ, S^*(θ)) consisting of set data of various input parameters (θ) and the active constraint identification data (S^*(θ)) corresponding to each parameter (θ) is generated in advance.
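Generation of such a learning data set can be sketched as follows. To keep the example self-contained, it assumes a deliberately simple separable problem (P = I, A = I, q = −θ, box bounds), for which the optimal solution is obtained in closed form by clipping; an actual system would call a QP solver instead. The encoding of S^*(θ) as two 0/1 vectors, the bounds, the sampling range, and all names are hypothetical.

```python
import numpy as np

L_BOUND = np.array([0.0, 0.0])  # hypothetical lower bounds l
U_BOUND = np.array([1.0, 1.0])  # hypothetical upper bounds u


def solve_box_qp(theta):
    """Optimum of min 0.5*||x||^2 - theta^T x subject to l <= x <= u."""
    return np.clip(theta, L_BOUND, U_BOUND)


def active_constraint_id(theta, tol=1e-9):
    """S*(theta) as 0/1 flags: rows tight at the lower bound, then at the upper."""
    x_opt = solve_box_qp(theta)
    lower = (np.abs(x_opt - L_BOUND) <= tol).astype(int)
    upper = (np.abs(x_opt - U_BOUND) <= tol).astype(int)
    return np.concatenate([lower, upper])


def make_dataset(num_samples, seed=0):
    """Sample input parameters and pair each with its active constraint flags."""
    rng = np.random.default_rng(seed)
    thetas = rng.uniform(-1.0, 2.0, size=(num_samples, 2))
    labels = np.array([active_constraint_id(t) for t in thetas])
    return thetas, labels


thetas, labels = make_dataset(1000)
print(thetas.shape, labels.shape)
```

The resulting pairs (thetas, labels) play the role of the learning data set (θ, S^*(θ)) on which a predictor would subsequently be trained.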

Further, a learning process using the learning data set (θ, S ^* (θ)) is executed to generate a predictor, for example a neural network (NN), that estimates from various input parameters (θ) the active constraint identification data (S ^* (θ)) corresponding to each input parameter (θ).

When executing robot control, this predictor, such as a neural network (NN), is used to estimate the active constraint identification data (S ^* (θ)) corresponding to the input parameter (θ). Furthermore, based on the estimated active constraint identification data (S ^* (θ)), the active constraints corresponding to the input parameter (θ) are selected, the selected active constraints are used to solve the quadratic programming problem, and the optimal solution x ^* is calculated as robot control information.

That is, a predictor that predicts the active constraint identification data (S ^* (θ)) needed to selectively extract active constraints for the various input parameters (θ) acquired by the robot as observation information is generated by learning processing using the learning data set (θ, S ^* (θ)).

By using a predictor generated by such learning processing, such as a neural network (NN), it is possible to extract active constraints at high speed according to the input parameter (θ).
As a result, the optimal solution x ^* of the quadratic programming problem, that is, the optimal solution x ^* of robot control information and the like can be calculated at high speed.
An example of this learning process and a control process using the result of the learning process will be described below.

FIG. 3 is a diagram showing a configuration example of the learning processing unit 40 configured within the information processing apparatus.
As shown in FIG. 3, the learning processing unit 40 has a learning data set generation unit 50 and a predictor generation unit 60.
The learning data set generation unit 50 calculates active constraint identification data (S ^* (θ)) that enables extraction of active constraints according to the input parameter (θ), and generates a learning data set (θ, S ^* (θ)) 61 consisting of paired data of various input parameters (θ) and the active constraint identification data (S ^* (θ)) corresponding to each parameter (θ).

The predictor generator 60 has a class classification processing unit 62 that uses the learning data set (θ, S ^* (θ)) 61 to generate an NN (neural network) serving as a predictor for selecting, from various input parameters (θ), the active constraints corresponding to each input parameter (θ).

First, the configuration and processing of the learning data set generation unit 50 will be described.
As shown in FIG. 3, the learning data set generation unit 50 includes a quadratic programming problem standardized model generation unit (QP Modeling) 51, a quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52, and an active constraint identification data generation unit 53.

The quadratic programming problem standardized model generator (QP Modeling) 51 and the quadratic programming problem standardized model optimum solution calculator (QP Solver) 52 shown in FIG. 3 perform the same processing as the quadratic programming problem standardized model generator (QP Modeling) 31 and the quadratic programming problem standardized model optimal solution calculator (QP Solver) 32, which are components of the quadratic programming problem optimum solution calculation device 30 described above.
That is, the parameter θ is input and the optimal solution x ^* of the quadratic programming problem is output.
Note that x ^* denotes the optimal solution, an n-dimensional vector; the superscript * marks optimality and does not indicate a Hermitian transpose of x.

As described above, when applied to the control configuration of the robot 10, the relationship between the input parameter θ and the output optimal solution x ^* can be, for example, the following correspondence.
Input parameter θ = observation information (distance of obstacles, robot position, speed, direction, etc.)
Output optimum solution x ^* = robot control information (robot traveling direction control information, speed control information, output control information for left and right wheels, etc.)
The input parameter θ is, for example, a k-dimensional vector (θ ₀ , θ ₁ , ..., θ _k−1 ), and the optimal solution x ^* is an n-dimensional vector of control information (x ₀ , x ₁ , ..., x _n−1 ).

Processing executed by each component of the learning data set generation unit 50 shown in FIG. 3 will be described.
A quadratic programming problem standardized model generator (QP Modeling) 51 of the learning data set generator 50 shown in FIG. 3 receives a parameter θ and generates a quadratic programming problem standardized model based on the input parameter θ.

In the above (Formula 1), (a) is an objective function (or cost function).
(b) is a constraint function, which is an inequality constraint function composed of inequalities.

Each parameter shown in (a) the objective function and (b) the constraint function in the above formula is as follows.
P is an n×n real-valued symmetric matrix,
q is an n×1 real vector,
A is an m×n matrix,
l, u are m-dimensional vectors,
x ^T is the transpose of the n-dimensional vector x.

As described above, the quadratic programming problem is the problem of calculating the optimal solution x ^* (n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function in the above equation.
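
(Formula 1) itself is not reproduced in this passage. As a reading aid, a standard form that is consistent with the parameter list above (P, q, A, l, u and the inequality “l≦Ax≦u”) can be written as follows. The factor 1/2 in the objective is an assumption taken from the common convention for quadratic programs, not from the text.

```latex
\begin{aligned}
\text{(a)}\quad & \min_{x}\ \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to}\quad l \le A x \le u
\end{aligned}
```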

The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 51, that is, the model composed of (a) the objective function and (b) the constraint function of (Equation 1) above, and calculates and outputs the optimal solution x ^* (n-dimensional vector).

The quadratic programming problem standardized model optimum solution calculation unit (QP Solver) 52 calculates the optimum solution x ^* of the optimization problem (quadratic programming problem) and substitutes the calculated optimum solution x ^* into the inequality constraint “l≦Ax≦u”.
This substitution process extracts only the rows where the equality holds. Selection matrices S _cl and S _cu are generated in which the matrix elements for which the equality holds are set to 1 and the other matrix elements are set to 0. Using these matrices S _cl and S _cu , the following relational expressions hold:
S _cl Ax ^* = S _cl l,
S _cu Ax ^* = S _cu u.
By concatenating these relational expressions, (Equation 2) described above is obtained.
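
The construction of the selection matrices can be sketched as follows. This is a minimal illustration, assuming that S _cl and S _cu are diagonal 0/1 matrices and that equality is tested with a small numerical tolerance; the function name `selection_matrices`, the tolerance, and the numerical values are hypothetical and do not come from the text. The stacked check at the end corresponds to concatenating the two relational expressions.

```python
import numpy as np

def selection_matrices(A, l, u, x_opt, tol=1e-8):
    """Return diagonal 0/1 matrices (S_cl, S_cu) marking rows where A x* equals l or u."""
    Ax = A @ x_opt
    at_lower = np.isclose(Ax, l, atol=tol)   # rows with (A x*)_i == l_i
    at_upper = np.isclose(Ax, u, atol=tol)   # rows with (A x*)_i == u_i
    return np.diag(at_lower.astype(float)), np.diag(at_upper.astype(float))

# Hypothetical problem with two variables and three constraint rows.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
l = np.array([0.0, 0.0, -5.0])
u = np.array([1.0, 2.0, 5.0])
x_opt = np.array([1.0, 0.0])   # assumed to have been returned by a QP solver

S_cl, S_cu = selection_matrices(A, l, u, x_opt)

# S_cl A x* = S_cl l and S_cu A x* = S_cu u hold by construction,
# and so does the stacked (concatenated) system.
stacked_lhs = np.vstack([S_cl @ A, S_cu @ A]) @ x_opt
stacked_rhs = np.concatenate([S_cl @ l, S_cu @ u])
assert np.allclose(stacked_lhs, stacked_rhs)
```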

The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 uses the above (Equation 2) to extract the active constraints, treats the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 51 into a quadratic programming problem with active equality constraints, as shown in (Equation 3) below.

In this way, the quadratic programming problem standardized model optimum solution calculation unit (QP Solver) 52 applies the above (Equation 3) to calculate the optimal solution x ^* (n-dimensional vector) that (a) minimizes the objective function while (b) satisfying the active equality constraint function.
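
Once only equality constraints remain, the problem can be solved with a single linear solve. The sketch below is one common way to do this, via the KKT system, and is not necessarily the method of the disclosure. It assumes the objective 0.5 x^T P x + q^T x, a positive definite P, and linearly independent active rows; `solve_equality_qp`, `A_act`, and `b_act` are hypothetical names.

```python
import numpy as np

def solve_equality_qp(P, q, A_act, b_act):
    """Minimize 0.5 x^T P x + q^T x subject to A_act x = b_act with one linear solve."""
    n, m = P.shape[0], A_act.shape[0]
    # KKT conditions: P x + A_act^T lam = -q  and  A_act x = b_act
    kkt = np.block([[P, A_act.T],
                    [A_act, np.zeros((m, m))]])
    rhs = np.concatenate([-q, b_act])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]          # optimal x, Lagrange multipliers

P = np.array([[2.0, 0.0], [0.0, 2.0]])
q = np.array([-2.0, -5.0])
A_act = np.array([[1.0, 1.0]])      # one active row, treated as an equality
b_act = np.array([1.0])

x_opt, lam = solve_equality_qp(P, q, A_act, b_act)
```

For these hypothetical values the solve gives x = (-0.25, 1.25), which lies on the line x₀ + x₁ = 1.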

As described above, the constraints of the quadratic programming problem include active constraints that can be used for the calculation process of the optimum solution x ^* and inactive constraints that are not used for the calculation process of the optimum solution x ^* .
A specific example of active constraints and inactive constraints in a quadratic programming problem will be described with reference to FIG.

x ^* shown in the center of FIG. 4 indicates the optimal solution x ^* of the quadratic programming problem.
As described above, the optimal solution x ^* of the quadratic programming problem is the solution x ^* (n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function of the quadratic programming problem.

The circular dotted line shown in FIG. 4 is the contour line of the calculated value of the (a) objective function of the quadratic programming problem, and the calculated value becomes smaller toward the inner side of the contour line.
A region (V) shown in FIG. 4 is a region that satisfies the constraint of the (b) constraint function of the quadratic programming problem.

Line segments ab, cd, ef, and gh show examples of multiple constraints defined by the (b) constraint function of the quadratic programming problem.
Small dotted arrows extending vertically from each line segment indicate the direction in which each constraint is satisfied.
For example, the constraint ab, shown as line segment ab, is satisfied in the region to the lower right of line segment ab. The constraint gh, shown as line segment gh, is satisfied in the region to the upper left of line segment gh.

The region (V) shown in FIG. 4 is a region in the space of the n-dimensional state vector.
Thus, the region (V) is the region that satisfies the constraints of the (b) constraint function of the quadratic programming problem, and the solution x ^* (n-dimensional vector) at which the (a) objective function of the quadratic programming problem takes its minimum value within this region (V) is calculated as the optimal solution x ^* of the quadratic programming problem.

Of the four constraints shown in FIG. 4, namely constraint ab, constraint cd, constraint ef, and constraint gh, the two constraints ab and cd are active constraints that can be used in the process of calculating the optimal solution x ^* of the quadratic programming problem.
On the other hand, two constraints, constraint ef and constraint gh, are inactive constraints that are not used in the process of calculating the optimal solution x ^* of the quadratic programming problem.
The inactive constraint only defines a region that satisfies the constraints of the (b) constraint function, and is not used in the process of calculating the optimal solution x ^* .

The (b) constraint function of the quadratic programming problem includes a plurality of different constraints, and it is difficult to determine in advance which of these constraints are active constraints that can be used in the process of calculating the optimal solution x ^* . The activeness or inactiveness of each constraint can only be determined through trial and error during the calculation of the optimal solution x ^* .

If, before starting the calculation of the optimal solution x ^* of the quadratic programming problem, the active constraints that can be used for the calculation are discriminated from the inactive constraints that are not used, only the active constraints are extracted, and the extracted active constraints are regarded as active equality constraints, the quadratic programming problem can be reduced to a linear equation and the optimal solution x ^* can be calculated at high speed.
The active constraint identification data generator 53 of the learning data set generator 50 shown in FIG. 3 generates data for this purpose, that is, active constraint identification data (S ^* (θ)).

As described above, the quadratic programming problem standardized model optimum solution calculation unit (QP Solver) 52 of the learning data set generation unit 50 shown in FIG. 3 calculates the optimum solution x ^* of the optimization problem (quadratic programming problem), substitutes the calculated optimal solution x ^* into the inequality constraint “l≦Ax≦u”, and generates selection matrices S _cl and S _cu in which the matrix elements for which the equality holds are set to 1 and the other matrix elements are set to 0.

The matrices S _cl and S _cu are input to the active constraint identification data generator 53 of the learning data set generator 50 shown in FIG. 3, and the active constraint identification data generator 53 uses the matrices S _cl and S _cu to generate data (active constraint identification data (S ^* (θ))) for identifying the activity and inactivity of each constraint included in the (b) constraint function of the quadratic programming problem.

The quadratic programming problem standardized model optimum solution calculation unit (QP Solver) 52 of the learning data set generation unit 50 outputs the selection matrices S _cl and S _cu , generated in the process of calculating the optimal solution x ^* , to the active constraint identification data generation unit 53 .

As described above, the selection matrices S _cl and S _cu are matrices generated by substituting the calculated optimal solution x ^* into the inequality constraint “l≦Ax≦u” and extracting only the rows where the equality holds. In these selection matrices, 1 is assigned to a matrix element for which the equality holds, and 0 is assigned to the other matrix elements.

These selection matrices S _cl , S _cu and the parameters A, l, u of the inequality constraints (l≦Ax≦u) of the quadratic programming standardized model, that is,
A: m×n matrix,
l, u: m-dimensional vectors,
satisfy the following relational expressions:
S _cl Ax ^* = S _cl l,
S _cu Ax ^* = S _cu u.

The active constraint identification data generator 53 receives the following data.
(a) From the quadratic programming problem standardized model generation unit (QP Modeling) 51: the parameters A, l, u of the inequality constraints (l≦Ax≦u) of the quadratic programming standardized model.
(b) From the quadratic programming problem standardized model optimal solution calculator (QP Solver) 52: the optimal solution x ^* (n-dimensional vector) of the quadratic programming problem and the selection matrices S _cl , S _cu .

Based on these input data, the active constraint identification data generator 53 generates active constraint identification data (S ^* (θ)), which is information for selectively extracting only the active constraints from the inequality constraints (l≦Ax≦u) of the quadratic programming standardized model.

The active constraint identification data generated by the active constraint identification data generator 53 is the output of the active constraint identification data generator 53 shown in FIG. 3, that is, S ^* (θ).
This active constraint identification data (S ^* (θ)) is data summarizing the diagonal components of the selection matrices S _cl and S _cu input from the quadratic programming problem standardized model optimum solution calculator (QP Solver) 52.

The selection matrices S _cl and S _cu input from the quadratic programming problem standardized model optimum solution calculator (QP Solver) 52 are the matrices shown in (Equation 4) below.

The selection matrices S _cl and S _cu are matrices in which the diagonal elements from the upper left end to the lower right end are 0 or 1, and the other elements are 0s.
A diagonal element of 0 corresponds to an inactive constraint that is not used in the calculation of the optimal solution x ^* (n-dimensional vector) of the quadratic programming problem, and a diagonal element of 1 corresponds to an active constraint that is used to calculate the optimal solution x ^* (n-dimensional vector).

A specific example of the active constraint identification data (S ^* (θ)) generated by the active constraint identification data generator 53 will be described with reference to FIG.
The active constraint identification data generator 53 generates active constraint identification data (S ^* (θ)) corresponding to the input parameter (θ), as shown in FIG.

For example, in the example shown in FIG. 5, the active constraint identification data (S ^* (θ)) corresponding to the input parameter (θ ₀ ) is (1000).
S ^* (θ ₀ )=(1000) is a data string indicating whether each of the four constraints, constraint ab, constraint cd, constraint ef, and constraint gh, is an active constraint (1) or an inactive constraint (0).
The active constraint identification data (S ^* (θ)) is composed of a data string expressing 1 for an active constraint and 0 for a non-active constraint.

The active constraint identification data S ^* (θ ₀ )=(1000) means that the four constraints corresponding to the input parameter (θ ₀ ), namely constraint ab, constraint cd, constraint ef, and constraint gh, are set as follows.
Constraint ab=1 (active constraint)
Constraint cd=0 (inactive constraint)
Constraint ef=0 (inactive constraint)
Constraint gh=0 (inactive constraint)

Also, the active constraint identification data S ^* (θ ₁ )=(1100) means that the four constraints corresponding to the input parameter (θ ₁ ), namely constraint ab, constraint cd, constraint ef, and constraint gh, are set as follows.
Constraint ab=1 (active constraint)
Constraint cd=1 (active constraint)
Constraint ef=0 (inactive constraint)
Constraint gh=0 (inactive constraint)
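
How the diagonals of the two selection matrices are summarized into one data string is not spelled out in this passage. The sketch below assumes that a constraint is marked 1 when either its lower or its upper bound holds with equality; this OR rule, the function name `active_constraint_bits`, and the choice of which bound each constraint touches are all illustrative assumptions.

```python
import numpy as np

def active_constraint_bits(S_cl, S_cu):
    """Summarize the diagonals of S_cl and S_cu as a 0/1 string, one digit per constraint."""
    # Assumption: a constraint counts as active if either of its bounds is tight.
    active = (np.diag(S_cl) > 0) | (np.diag(S_cu) > 0)
    return "".join("1" if a else "0" for a in active)

# Constraint order: ab, cd, ef, gh.
S_cl = np.diag([1.0, 0.0, 0.0, 0.0])   # hypothetical: ab is tight at its lower bound
S_cu = np.diag([0.0, 1.0, 0.0, 0.0])   # hypothetical: cd is tight at its upper bound
bits = active_constraint_bits(S_cl, S_cu)   # "1100"
```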

In this way, the active constraint identification data generation unit 53 of the learning data set generation unit 50 generates active constraint identification data (S ^* (θ)), which is information for selectively extracting only the active constraints from the inequality constraints of the quadratic programming problem.

The active constraint identification data (S ^* (θ)) generated by the active constraint identification data generator 53 of the learning data set generator 50 is paired with the corresponding input parameter (θ) and stored as a learning data set 61 in the learning data set storage unit (storage unit).
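
The overall flow of the learning data set generation (input θ, solve the quadratic programming problem, record which constraints hold with equality, store the pair) can be illustrated with a toy problem. The sketch assumes a deliberately simple problem, minimize 0.5·||x − θ||² subject to l ≦ x ≦ u, whose optimum is the element-wise clip of θ, so that no external solver is needed. The problem, the function names, and the sampling range are hypothetical and stand in for the QP Modeling and QP Solver units.

```python
import numpy as np

def solve_toy_qp(theta, l, u):
    """Toy QP: minimize 0.5*||x - theta||^2 s.t. l <= x <= u (P = I, q = -theta, A = I).
    For this separable problem the optimum is the element-wise clip of theta."""
    return np.clip(theta, l, u)

def active_bits(x_opt, l, u, tol=1e-9):
    """0/1 string marking the constraints that hold with equality at x_opt."""
    tight = np.isclose(x_opt, l, atol=tol) | np.isclose(x_opt, u, atol=tol)
    return "".join("1" if t else "0" for t in tight)

def build_dataset(thetas, l, u):
    """Return a list of (theta, S*(theta)) pairs."""
    return [(theta, active_bits(solve_toy_qp(theta, l, u), l, u)) for theta in thetas]

l = np.zeros(4)
u = np.ones(4)
rng = np.random.default_rng(0)
thetas = rng.uniform(-0.5, 1.5, size=(100, 4))   # sampled input parameters
dataset = build_dataset(thetas, l, u)
```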

This is the learning data set (θ, S ^* (θ)) 61 shown in the predictor generator 60 shown in FIG.
The predictor generator 60 shown in FIG. 3 inputs this learning data set (θ, S ^* (θ)) to a class classification processor (NN Classifier=neural network class classifier) 62 .

The class classification processing unit (NN classifier=neural network class classifier) 62 executes a learning process using the learning data set (θ, S ^* (θ)) to generate a predictor (NN: Neural Network) that predicts the active constraint identification data (S ^* (θ)) from an input parameter θ.

A specific example of the predictor (NN: neural network) generated by the class classification processor (NN Classifier = neural network class classifier) 62 will be described with reference to FIG.

FIG. 6 shows a predictor (NN: neural network) 62a generated by the class classification processor (NN Classifier=neural network classifier) 62.

A predictor (NN: neural network) 62a is a predictor that selects and outputs a label corresponding to active constraint identification data (S ^* (θ)) from an input parameter θ.
The label is a label corresponding to active constraint identification data (S ^* (θ)).
FIG. 7 shows an example of label setting.

The table shown in FIG. 7 corresponds to data in which labels are associated with the active constraint identification data (S ^* (θ)) generated by the active constraint identification data generator 53 described above with reference to FIG.

For example, the active constraint identification data (S ^* (θ)) corresponding to the input parameter (θ ₀ ) is (1000), and the label [1] is set for this active constraint identification data (S ^* (θ))=(1000).

Also, the active constraint identification data (S ^* (θ)) corresponding to the input parameter (θ ₁ ) is (1100), and the label [2] is set for this active constraint identification data (S ^* (θ))=(1100).

In this way, the predictor (NN: neural network) 62a generated by the class classification processing unit (NN Classifier=neural network classifying unit) 62 selects, from the input parameter θ, one of the labels 1, 2, 3, ... corresponding to the active constraint identification data (S ^* (θ)) and outputs it.
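
The association between data strings and labels can be sketched as a lookup table. The text only gives the entries (1000)→1 and (1100)→2; assigning labels in first-seen order, as below, is an assumption, and `build_label_table` is a hypothetical name.

```python
def build_label_table(patterns):
    """Assign labels 1, 2, 3, ... to distinct data strings in first-seen order."""
    table = {}
    for bits in patterns:
        if bits not in table:
            table[bits] = len(table) + 1
    return table

# Patterns for theta_0 and theta_1 from the example, plus a repeated pattern.
label_of = build_label_table(["1000", "1100", "1000"])
bits_of = {label: bits for bits, label in label_of.items()}   # inverse lookup
```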

The label selection process in the predictor (NN: neural network) 62a is performed using the learning data set (θ, S ^* (θ)) 61 generated by the learning data set generation unit.

For example, from the learning data sets (θ, S ^* (θ)) 61 stored in the storage unit, learning data sets (θ, S ^* (θ)) containing parameters (θ) highly similar to the input parameter θ of the predictor (NN: neural network) 62a are selected, scores are assigned so that higher similarity gives a higher score, and the label (the label corresponding to the active constraint identification data) of the learning data set with the highest score is taken as the output label.
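
The similarity-based scoring described here can be illustrated without a neural network. The sketch below is a simple stand-in, not the predictor of the disclosure: it scores each label by a softmax-weighted vote over the stored parameters, using squared Euclidean distance as the (assumed) similarity measure. The function name `label_scores`, the `temperature` parameter, and the sample data are hypothetical.

```python
import numpy as np

def label_scores(theta, train_thetas, train_labels, temperature=0.1):
    """Score each label by the similarity of theta to the stored training parameters."""
    train_thetas = np.asarray(train_thetas, dtype=float)
    d2 = np.sum((train_thetas - theta) ** 2, axis=1)   # squared Euclidean distances
    w = np.exp(-(d2 - d2.min()) / temperature)          # closer samples weigh more
    w /= w.sum()
    scores = {}
    for weight, label in zip(w, train_labels):
        scores[label] = scores.get(label, 0.0) + float(weight)
    return scores                                       # scores sum to 1

train_thetas = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]]
train_labels = [1, 2, 2]
scores = label_scores(np.array([0.8, 1.0]), train_thetas, train_labels)
```

With these sample values the query lies close to the two parameters stored with label 2, so label 2 receives almost all of the score.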

In the example shown in FIG. 6, the calculated scores of the predictor (NN: neural network) 62a are as follows.
Score for label 1 (S ^* (θ)=1000)=0.11
Score for label 2 (S ^* (θ)=1100)=0.79
Score for label 3=0.05
Score for label 4=0.01

When such score calculation results are obtained, the predictor (NN: neural network) 62a outputs the label with the highest score, that is, label 2.
Under the label setting of FIG. 7, label 2 corresponds to active constraint identification data S ^* (θ)=(1100).
That is, the predictor outputs active constraint identification data (S ^* (θ))=(1100) as the result to be applied to the input parameter (θ).

This active constraint identification data (S ^* (θ))=(1100) means that constraint ab, constraint cd, constraint ef, and constraint gh are set as follows.
Constraint ab=1 (active constraint)
Constraint cd=1 (active constraint)
Constraint ef=0 (inactive constraint)
Constraint gh=0 (inactive constraint)
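
The label selection in this example amounts to taking the label with the largest score. A minimal sketch using the four scores quoted above (the function name `select_label` is hypothetical):

```python
# Scores quoted in the example for FIG. 6, keyed by label.
scores = {1: 0.11, 2: 0.79, 3: 0.05, 4: 0.01}

def select_label(scores):
    """Return the label whose score is largest."""
    return max(scores, key=scores.get)

output_label = select_label(scores)
print(output_label)   # prints 2
```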

Thus, the class classification processor (NN Classifier=neural network class classifier) 62 of the predictor generation unit 60 executes a learning process using the learning data set (θ, S ^* (θ)) 61 to generate a predictor (NN: Neural Network) that predicts the active constraint identification data (S ^* (θ)) from the input parameter θ.

For example, in the actual robot control process, processing using this predictor (NN: neural network), that is, the predictor generated by the learning process described above, is executed.

That is, first, a quadratic programming problem is set to calculate the optimal solution x ^* including the control information of the robot from the observation information (θ) of the robot.
Furthermore, using the above predictor (NN: neural network), the active constraints corresponding to the observation information (θ) of the robot are estimated, and the estimated active constraints are used to execute processing for calculating the optimal solution x ^* .
This processing will be described with reference to FIG.

FIG. 8 shows a control information generator 80 configured in, for example, an information processing device of a robot. The control information generating unit 80 receives, for example, an input parameter (θ), which is observation information of the robot, and executes a process of calculating the optimal solution x ^* of the quadratic programming problem as the control information of the robot.

The input parameter θ is expressed, for example, as a k-dimensional vector (θ ₀ , θ ₁ , ..., θ _k−1 ).
Also, the optimal solution x ^* of the quadratic programming problem is expressed as an n-dimensional vector (x ₀ , x ₁ , ..., x _n−1 ).

As shown in FIG. 8, the control information generation unit 80 has a class classification processing unit (NN classifier=neural network class classifier) 81 and a linear system solver 82.

The class classification processing unit (NN classifier = neural network class classification unit) 81 has the same configuration as the class classification processing unit (NN classifier = neural network class classification unit) 62 of the learning processing unit 40 described above. That is, it executes the same processing as the predictor (NN: neural network) 62a generated by the predictor generating unit 60 of the learning processing unit 40 described above with reference to FIG. 6.

The class classification processing unit (NN classifier=neural network class classifier) 81 uses the predictor (NN: neural network) to generate active constraint identification data (S ^* (θ)) from the input parameter (θ). The active constraint identification data (S ^* (θ)) is data that enables selection of only the active constraints corresponding to the input parameter (θ).

The active constraint identification data (S ^* (θ)) generated by the class classification processor (NN classifier=neural network class classifier) 81 is input to a linear system solver (Linear System Solver) 82 .

A linear system solver 82 uses the active constraint identification data (S ^* (θ)) to extract only the active constraints from the constraints included in the inequality constraints “l≦Ax≦u” of the quadratic programming problem. These are regarded as active equality constraints, and processing is performed to calculate the optimal solution x ^* (n-dimensional vector) that satisfies the active equality constraints.
By performing such processing, high-speed calculation processing of the optimal solution x ^* of the quadratic programming problem is realized, and the robot can be controlled quickly.

The detailed configuration and processing of the class classification processor (NN Classifier = neural network class classifier) 81 will be described with reference to FIG.

FIG. 9 shows a predictor (NN: neural network) 81a and a label converting section 81b configured in a class classification processing section (NN Classifier=neural network classifying section) 81.

The predictor (NN: neural network) 81a corresponds to the predictor (NN: neural network) 62a generated in the predictor generation unit 60 of the learning processing unit 40 previously described with reference to FIG.

That is, the predictor (NN: neural network) 81a selects and outputs a label corresponding to the input parameter (θ) (active constraint identification data (S ^* (θ)) corresponding label).
The label is the label described above with reference to FIG. 7, and is associated with each active constraint identification data (S ^* (θ)).

For example, label [1] is set for active constraint identification data (S ^* (θ))=(1000), and label [2] is set for active constraint identification data (S ^* (θ))=(1100).
In this way, each label identifies the setting of the active constraint identification data (S ^* (θ)).

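The predictor (NN: neural network) 81a of the class classification processing unit (NN classifier=neural network class classifying unit) 81 selects, from the input parameter θ, one of the labels 1, 2, 3, ... corresponding to the active constraint identification data (S ^* (θ)) and outputs it.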

The predictor (NN: neural network) 81a is a predictor (NN: neural network) generated by learning processing using the learning data set (θ, S ^* (θ)) 61 described above.

The predictor (NN: neural network) 81a, for example, selects learning data sets (θ, S ^* (θ)) containing parameters (θ) highly similar to the input parameter θ, assigns scores so that higher similarity gives a higher score, and performs label estimation processing in which the label (the label corresponding to the active constraint identification data) of the learning data set (θ, S ^* (θ)) with the highest score is set as the output label.

In the example shown in FIG. 9, the calculated scores of the predictor (NN: neural network) 81a are as follows.
Score for label 1 (S ^* (θ)=1000)=0.11
Score for label 2 (S ^* (θ)=1100)=0.79
Score for label 3=0.05
Score for label 4=0.01

When such score calculation results are obtained, the predictor (NN: neural network) 81a outputs the label with the highest score, that is, label 2.
Under the label setting described above, label 2 corresponds to active constraint identification data S ^* (θ)=(1100).
That is, the predictor outputs active constraint identification data (S ^* (θ))=(1100) as the result to be applied to the input parameter (θ).

Thus, the predictor (NN: neural network) 81a of the class classification processing unit (NN Classifier=neural network class classifying unit) 81 uses the learning result based on the learning data set (θ, S ^* (θ)) 61 to generate, from the input parameter θ, a label corresponding to the active constraint identification data (S ^* (θ)), and outputs the label to the label converter 81b.

The label conversion unit 81b receives the label corresponding to the active constraint identification data (S ^* (θ)) generated by the predictor (NN: neural network) 81a, selects the one active constraint identification data (S ^* (θ)) corresponding to that label, and outputs the selected active constraint identification data (S ^* (θ)) to the linear system solver 82 in the next stage.
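
The label conversion step can be sketched as a table lookup followed by row selection. The table below contains only the two entries given in the text, (1000) for label 1 and (1100) for label 2; the matrix coefficients and the names `BITS_OF_LABEL` and `label_to_mask` are hypothetical.

```python
import numpy as np

# Label table with the two entries stated in the text.
BITS_OF_LABEL = {1: "1000", 2: "1100"}

def label_to_mask(label, table=BITS_OF_LABEL):
    """Convert a predicted label into a boolean mask over the constraint rows."""
    return np.array([c == "1" for c in table[label]])

A = np.array([[1.0, 0.0],     # constraint ab (hypothetical coefficients)
              [0.0, 1.0],     # constraint cd
              [1.0, 1.0],     # constraint ef
              [1.0, -1.0]])   # constraint gh
mask = label_to_mask(2)
A_active = A[mask]            # rows that would be handed to the linear system solver
```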

The linear system solver 82 uses the active constraint identification data (S ^* (θ)) input from the class classification processor (NN Classifier=neural network class classifier) 81 to extract only the active constraints from the constraints included in the inequality constraints “l≦Ax≦u” of the quadratic programming problem, regards them as active equality constraints, and calculates the optimal solution x ^* (n-dimensional vector) that satisfies the active equality constraints.

By performing such processing, high-speed calculation processing of the optimal solution x ^* of the quadratic programming problem is realized, and the robot can be controlled quickly.

However, in such a high-speed optimal solution calculation method for quadratic programming problems, if the number of constraints N included in the (b) constraint function of the quadratic programming problem increases, the number of labels described above increases exponentially.

For example, in the above example, as described with reference to FIGS. 4 to 7, there are four constraints: constraint ab, constraint cd, constraint ef, and constraint gh. In this case, the number of labels is 2 ⁴ =16. However, for example, if the number of constraints is 8, the number of labels is 2 ⁸ =256. When the number of constraints is 10, the number of labels is 2 ¹⁰ =1024.
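
The growth in the number of labels follows directly from each constraint being either active or inactive. A short check of the figures above (the function name `label_count` is hypothetical):

```python
def label_count(num_constraints):
    """Number of distinct active/inactive patterns over N constraints."""
    return 2 ** num_constraints

for n in (4, 8, 10):
    print(n, label_count(n))   # prints "4 16", "8 256", "10 1024"
```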

If the number of labels increases significantly, the processing cost of the predictor (NN: neural network) of the class classification processing unit (NN classifier = neural network class classifying unit) increases.
Specifically, for example, memory consumption and the amount of calculation increase when the predictor (NN: neural network) is used, which may cause problems such as reduced efficiency of the learning process and reduced control speed during robot control.
An embodiment that solves such a problem will be described below.

[3. Configuration for identifying active constraints based on the norm from the optimal solution of the quadratic programming problem to each constraint]
Next, a configuration for identifying active constraints based on the norm from the optimal solution of the quadratic programming problem to the constraint will be described.

In the following, a specific example is described of a learning process that generates active constraint identification data for identifying active constraints based on the norm from the optimal solution of the quadratic programming problem to each constraint, and of a robot control process that uses the learning results generated by the learning process.

As described above, one effective technique for quickly calculating the unknown optimal solution x ^* of a quadratic programming problem is to use a predictor generated by learning processing, that is, a predictor that estimates active constraints.

Specifically, it is effective to generate in advance, by learning processing, active constraint identification data corresponding to various input parameters (θ), that is, data for selectively extracting the active constraints corresponding to the input parameters (θ) from the inequality constraints included in the quadratic programming problem.

In the embodiment described below, the norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x ^* of the quadratic programming problem to each constraint is used as the active constraint identification data corresponding to the parameter (θ).

First, through learning processing, a learning data set (θ, S _l ^* (θ)) consisting of paired data of various input parameters (θ) and the norm of each constraint corresponding to each parameter (θ), that is, the constraint norm (S _l ^* (θ)), is generated in advance.

This learning data set (θ, S _l ^* (θ)) is stored in the storage unit and used when calculating the optimal solution x ^* as control information during execution of robot control.
That is, the norm (S _l ^* (θ)) of each constraint for an input parameter (θ) acquired by the robot as observation information is estimated using the learning data set (θ, S _l ^* (θ)).
Furthermore, each constraint is identified as active or inactive according to the value of its estimated norm (S _l ^* (θ)).

Such a learning result application process enables high-speed extraction of the active constraints corresponding to the input parameter (θ). By selecting only the active constraints, the optimal solution x ^* of the quadratic programming problem, for example robot control information, can be calculated at high speed.

Hereinafter, a specific example of the process of generating the learning data set (θ, S _l ^* (θ)), which consists of paired data of various input parameters (θ) and the norms (S _l ^* (θ)) of each constraint corresponding to each parameter (θ), and a specific example of the robot control process using the learning result generated by this learning process will be described.

FIG. 10 is a diagram showing a configuration example of the learning processing unit 100 configured within the information processing apparatus.
As shown in FIG. 10, the learning processing unit 100 has a learning data set generation unit 110 and a constraint norm estimator generation unit 120.
The learning data set generation unit 110 calculates the norm corresponding to each constraint (constraint norm (S _l ^* (θ))), which enables extraction of the active constraints according to the input parameter (θ), and generates a learning data set consisting of paired data of various input parameters (θ) and the constraint norm (S _l ^* (θ)) of each constraint corresponding to each parameter (θ).

The constraint norm estimator generation unit 120 uses the learning data set (θ, S _l ^* (θ)) 121 to generate a regression analyzer (NN Regressor) that estimates, from various input parameters (θ), the constraint norm corresponding to each input parameter (θ).

First, the configuration and processing of the learning data set generation unit 110 will be described.
As shown in FIG. 10, the learning data set generator 110 includes a quadratic programming problem standardized model generator (QP Modeling) 111, a quadratic programming problem standardized model optimum solution calculator (QP Solver) 112, and a constraint norm calculator (Calc Norm) 113.

The quadratic programming problem standardized model generator (QP Modeling) 111 and the quadratic programming problem standardized model optimum solution calculator (QP Solver) 112 of the learning data set generator 110 perform the same processing as the quadratic programming problem standardized model generator (QP Modeling) 31 and the quadratic programming problem standardized model optimal solution calculator (QP Solver) 32, which are components of the quadratic programming problem optimum solution calculation device 30 described earlier.
That is, they receive the parameter θ as input and output the optimal solution x ^* of the quadratic programming problem.
Note that x ^* means a Hermitian transposed matrix of x (n-dimensional vector).

Processing executed by each component of the learning data set generation unit 110 shown in FIG. 10 will be described.
A quadratic programming problem standardized model generator (QP Modeling) 111 of the learning data set generator 110 shown in FIG. 10 receives a parameter θ and generates a quadratic programming problem standardized model based on the input parameter θ.

The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 111, that is, the model composed of (a) the objective function and (b) the constraint function shown in (Equation 1) above, and calculates and outputs the optimal solution x ^* (n-dimensional vector).

The quadratic programming problem standardized model optimum solution calculation unit (QP Solver) 112 calculates the optimal solution x ^* of the optimization problem (quadratic programming problem) and substitutes the calculated optimal solution x ^* into the inequality constraint “l≦Ax≦u”.
This substitution extracts only the rows for which equality holds. Selection matrices S _cl and S _cu are generated in which the matrix elements corresponding to rows satisfying the equality are set to 1 and the other matrix elements are set to 0. Using these matrices S _cl and S _cu , the following relational expressions hold:
S _cl Ax ^* = S _cl l,
S _cu Ax ^* = S _cu u.
By concatenating these relational expressions, (Equation 2) described above is obtained.
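The substitution step above can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: it assumes the selection matrices are diagonal 0/1 matrices, and the function name `selection_matrices`, the tolerance `tol`, and the numerical values are all hypothetical.

```python
import numpy as np

def selection_matrices(A, l, u, x_opt, tol=1e-9):
    """Build S_cl and S_cu for rows of l <= A x <= u that hold with equality at x_opt.

    Assumption: each selection matrix is diagonal, with 1 on rows where the
    lower (S_cl) or upper (S_cu) bound is met with equality and 0 elsewhere.
    """
    Ax = A @ x_opt
    lower_active = np.abs(Ax - l) <= tol   # rows where A x* = l
    upper_active = np.abs(Ax - u) <= tol   # rows where A x* = u
    S_cl = np.diag(lower_active.astype(float))
    S_cu = np.diag(upper_active.astype(float))
    return S_cl, S_cu

# Hypothetical data: three inequality rows and a given optimal solution.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
l = np.array([0.0, 0.0, -1.0])
u = np.array([2.0, 3.0, 4.0])
x_opt = np.array([0.0, 3.0])

S_cl, S_cu = selection_matrices(A, l, u, x_opt)
# The two relational expressions hold by construction.
print(np.allclose(S_cl @ A @ x_opt, S_cl @ l))
print(np.allclose(S_cu @ A @ x_opt, S_cu @ u))
```

In this example the first row meets its lower bound and the second row meets its upper bound, so only those diagonal entries are set to 1.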

The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 uses (Equation 2) above to extract the active constraints, treats the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 111 into a quadratic programming problem with active equality constraints, as shown in (Equation 3).

In this way, the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 applies (Equation 3) above to calculate the optimal solution x ^* (n-dimensional vector) that (a) minimizes the objective function and (b) satisfies the active equality constraint function.

As described above, the constraints of the quadratic programming problem include active constraints that can be used in the process of calculating the optimal solution x ^* and inactive constraints that are not used in that process.
In this embodiment, the norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x ^* of the quadratic programming problem to each constraint is used as an index for distinguishing between active constraints and inactive constraints.

A specific example of active and inactive constraints in a quadratic programming problem and norms (constraint norms) used as indices for distinguishing between active and inactive constraints will be described with reference to FIG.

The x ^* shown in the center of FIG. 11 indicates the optimal solution x ^* of the quadratic programming problem.
As described above, the optimal solution x ^* of the quadratic programming problem is the solution x ^* (n-dimensional vector) that minimizes (a) the objective function while satisfying the constraints of (b) the constraint function of the quadratic programming problem.

The circular dotted line shown in FIG. 11 is the contour line of the calculated value of the (a) objective function of the quadratic programming problem, and the calculated value becomes smaller toward the inner side of the contour line.
A region (V) shown in FIG. 11 is a region that satisfies the constraint of the (b) constraint function of the quadratic programming problem.

The region (V) shown in FIG. 11 is a region of the n-dimensional state vector.
Thus, the region (V) is the region that satisfies the constraints of (b) the constraint function of the quadratic programming problem, and the solution x ^* (an n-dimensional vector) that minimizes (a) the objective function within this region (V) is calculated as the optimal solution x ^* of the quadratic programming problem.

Of the four constraints shown in FIG. 11, that is, constraint ab, constraint cd, constraint ef, and constraint gh, the two constraints ab and cd are active constraints that can be used in the process of calculating the optimal solution x ^* of the quadratic programming problem.
On the other hand, the two constraints ef and gh are inactive constraints that are not used in the process of calculating the optimal solution x ^* of the quadratic programming problem.
An inactive constraint only defines the region that satisfies the constraints of (b) the constraint function, and is not used in the process of calculating the optimal solution x ^*.

The (b) constraint function of the quadratic programming problem includes a plurality of different constraints. In this embodiment, the norm is used to determine which of these constraints are active constraints that can be used in the process of calculating the optimal solution x ^*.

That is, the norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x ^* of the quadratic programming problem to each constraint is calculated for each constraint. If the calculated constraint norm is greater than or equal to a predefined threshold (λ), the constraint is determined to be an inactive constraint; if the constraint norm is less than the predefined threshold (λ), the constraint is determined to be an active constraint.

For example, when the threshold λ=0.20, in the example shown in FIG. 11, the constraints ab and cd have
constraint norm = 0.00,
and since the norms of these constraints ab and cd are less than the threshold λ=0.20, they are determined to be active constraints.

The constraint ef has
constraint norm = 1.60,
and since the norm of this constraint ef is greater than or equal to the threshold λ=0.20, it is determined to be an inactive constraint.
The constraint gh also has
constraint norm = 1.60,
and since the norm of this constraint gh is also greater than or equal to the threshold λ=0.20, it is determined to be an inactive constraint.

Thus, in this embodiment, the norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x ^* of the quadratic programming problem to each constraint is calculated for each constraint. If the calculated constraint norm is greater than or equal to a predefined threshold (λ), the constraint is determined to be an inactive constraint; if the constraint norm is less than the predefined threshold (λ), the constraint is determined to be an active constraint.

As described above, before starting the calculation of the optimal solution x ^* of the quadratic programming problem, the constraints can be divided into active constraints that are used in the calculation of the optimal solution x ^* and inactive constraints that are not. By extracting only the active constraints and treating them as active equality constraints, the quadratic programming problem can be reduced to a linear equation, from which the optimal solution x ^* can be calculated.

The constraint norm calculation unit (Calc Norm) 113 of the learning data set generation unit 110 calculates the norm (constraint norm (S _l ^* (θ))) that serves as an index value for distinguishing between active constraints and inactive constraints.

That is, the constraint norm (S _l ^* (θ)) calculated by the constraint norm calculation unit (Calc Norm) 113 is a constraint activity determination index value for identifying whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used for calculating the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used for calculating that optimal solution.

The constraint norm calculator (Calc Norm) 113 receives the following data.
(a) From the quadratic programming problem standardized model generation unit (QP Modeling) 111, the parameters A, l, u of the inequality constraints (l≦Ax≦u) of the quadratic programming problem standardized model,
(b) From the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112, the optimal solution x ^* (n-dimensional vector) of the quadratic programming problem.

Based on these input data, the constraint norm calculation unit (Calc Norm) 113 calculates the norm of each constraint (constraint norm (S _l ^* (θ))).
The constraint norm (S _l ^* (θ)) calculated by the constraint norm calculator (Calc Norm) 113 is, for example, data (matrix data connecting the norms of each constraint) represented by (Equation 5).
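One way to picture the calculation from the inputs A, l, u, and x ^* is the sketch below. It is only an illustration under a stated assumption: the constraint norm of each row is taken to be the Euclidean distance from x ^* to the nearer of the two boundary hyperplanes of that row. The exact definition in (Equation 5) may differ, and the function name `constraint_norms` and the sample values are hypothetical.

```python
import numpy as np

def constraint_norms(A, l, u, x_opt):
    """Distance from x_opt to the nearest boundary of each row of l <= A x <= u.

    Assumption: the 'constraint norm' is the point-to-hyperplane distance
    |a_i x - b_i| / ||a_i||, taken to the closer of b_i = l_i and b_i = u_i.
    Infinite bounds (one-sided constraints) yield an infinite distance on
    that side and are therefore ignored by the minimum.
    """
    Ax = A @ x_opt
    row_norms = np.linalg.norm(A, axis=1)
    dist_lower = np.abs(Ax - l) / row_norms
    dist_upper = np.abs(u - Ax) / row_norms
    return np.minimum(dist_lower, dist_upper)

# Hypothetical constraints: x1 >= 0, x2 >= 0, 3*x1 + 4*x2 <= 10.
A = np.array([[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]])
l = np.array([0.0, 0.0, -np.inf])
u = np.array([np.inf, np.inf, 10.0])
x_opt = np.array([0.0, 1.0])

norms = constraint_norms(A, l, u, x_opt)
print(norms)
```

A norm of zero means the optimal solution lies on that constraint boundary, which matches the role of the norm as an activity index in the text.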

A specific example of the constraint norm (S _l ^* (θ)) generated by the constraint norm calculator (Calc Norm) 113 will be described with reference to FIG. 12 .
The constraint norm calculator (Calc Norm) 113 calculates the constraint norm (S _l ^* (θ)) corresponding to the input parameter (θ), as shown in FIG. 12 .

As described above, the constraint norm (S _l ^* (θ)) is the norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x ^* of the quadratic programming problem to each constraint.
If the value of the constraint norm (S _l ^* (θ)) is greater than or equal to a predefined threshold value (λ), for example λ=0.02, the constraint is determined to be an inactive constraint, and if the value of the constraint norm is less than the predefined threshold (λ), the constraint is determined to be an active constraint.

For example, in the example shown in FIG. 12, the constraint norm (S _l ^* (θ)) corresponding to the input parameter (θ ₀ ) is
constraint norm (S _l ^* (θ))=(0.00, 1.25, 1.80, 1.50).
This constraint norm (S _l ^* (θ)) = (0.00, 1.25, 1.80, 1.50) shows the four norm values (L2 norm (Euclidean norm)) corresponding to the four constraints ab, cd, ef, and gh.

For example, when the predetermined threshold value (λ)=0.02,
only the constraint norm=0.00 of constraint ab is less than the threshold (λ)=0.02, and constraint ab is determined to be an active constraint.
The constraint norms (1.25, 1.80, 1.50) of constraint cd, constraint ef, and constraint gh are all greater than or equal to the threshold value (λ)=0.02, and constraint cd, constraint ef, and constraint gh are determined to be inactive constraints.

In this way, the constraint norm calculation unit (Calc Norm) 113 of the learning data set generation unit 110 calculates the norm (constraint norm (S _l ^* (θ))) that serves as an index value for distinguishing between active constraints and inactive constraints.

The constraint norm (S _l ^* (θ)) of each constraint calculated by the constraint norm calculator (Calc Norm) 113 of the learning data set generator 110 is associated with the input parameter (θ) and stored as the learning data set 121 in the learning data set storage unit (storage unit).
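The flow from parameter θ to a stored pair (θ, constraint norm) can be illustrated with a toy problem. The sketch below is an assumption-laden stand-in, not the disclosed processing: the quadratic programming problem is fixed to "minimize 0.5*||x - θ||^2 subject to l ≦ x ≦ u" (that is, A is the identity matrix), whose optimal solution is a simple clip, and all function names are hypothetical.

```python
import numpy as np

def solve_toy_qp(theta, l, u):
    # Stand-in for QP Modeling + QP Solver on the toy problem:
    # minimize 0.5*||x - theta||^2 subject to l <= x <= u (A = identity).
    return np.clip(theta, l, u)

def toy_constraint_norms(x_opt, l, u):
    # Stand-in for Calc Norm: distance to the nearer bound of each row.
    return np.minimum(np.abs(x_opt - l), np.abs(u - x_opt))

def generate_learning_data_set(thetas, l, u):
    # Pair every input parameter with the constraint norms at its optimum.
    norms = np.array(
        [toy_constraint_norms(solve_toy_qp(t, l, u), l, u) for t in thetas]
    )
    return thetas, norms

rng = np.random.default_rng(0)
l = np.array([-1.0, -1.0])
u = np.array([1.0, 1.0])
thetas = rng.uniform(-2.0, 2.0, size=(100, 2))

thetas, norms = generate_learning_data_set(thetas, l, u)
print(thetas.shape, norms.shape)
```

In this toy setting, a coordinate of θ that lies outside the box produces a constraint norm of exactly zero for the corresponding row, because the optimum sits on that bound.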

This is the learning data set (θ, S _l ^* (θ)) 121 shown in the constraint norm estimator generator 120 in FIG. 10.
The constraint norm estimator generation unit 120 shown in FIG. 10 inputs this learning data set (θ, S _l ^* (θ)) to the constraint norm estimator generation learning processing execution unit (NN Regressor generation unit) 122.

The constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 executes learning processing using the learning data set (θ, S _l ^* (θ)) and generates a regression analyzer (NN Regressor) that estimates the constraint norm (S _l ^* (θ)) of each constraint from the input parameter θ.

An example of a regression analyzer generated by learning processing by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 will be described with reference to FIG. 13.

FIG. 13 shows the regression analyzer generated through learning processing by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122, that is, a regression analyzer (NN Regressor) 122a that estimates the constraint norm (S _l ^* (θ)) of each constraint from the input parameter θ.

As shown in FIG. 13, a regression analyzer (NN Regressor: neural network regression analyzer) 122a estimates the constraint norm (S _l ^* (θ)) of each constraint from the input parameter θ.

The regression analyzer (NN Regressor: neural network regression analyzer) 122a performs, for example, regression analysis processing using the learning data set (θ, S _l ^* (θ)) 121 stored in the storage unit, and estimates and outputs the constraint norm (S _l ^* (θ)) corresponding to the input parameter θ.

For example, one or more learning data sets (θ, S _l ^* (θ)) containing parameters (θ) highly similar to the input parameter θ for the regression analyzer (NN Regressor: neural network regression analyzer) 122a are selected. Then, based on the constraint norm (S _l ^* (θ)) of the selected learning data sets, the constraint norm (S _l ^* (θ)) corresponding to the input parameter θ is calculated by a regression calculation method or the like, and the calculated constraint norm (S _l ^* (θ)) is output.
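The idea of estimating from stored entries with similar parameters can be sketched with a distance-weighted nearest-neighbor average. This is a deliberately simple stand-in and not the neural network regression analyzer of the disclosure; the function name `estimate_norms_knn`, the parameter `k`, and the stored values are hypothetical.

```python
import numpy as np

def estimate_norms_knn(theta_query, thetas, norms, k=2):
    """Estimate constraint norms from the k stored parameters closest to theta_query.

    Similarity is measured by Euclidean distance between parameters, and the
    estimate is an inverse-distance weighted average of the stored norms.
    """
    distances = np.linalg.norm(thetas - theta_query, axis=1)
    nearest = np.argsort(distances)[:k]
    weights = 1.0 / (distances[nearest] + 1e-12)  # small offset avoids division by zero
    return (weights[:, None] * norms[nearest]).sum(axis=0) / weights.sum()

# Hypothetical learning data set: one-dimensional theta, two constraints.
thetas = np.array([[0.0], [1.0], [2.0], [3.0]])
norms = np.array([[0.0, 1.0], [0.5, 0.5], [1.0, 0.0], [1.5, 0.0]])

exact = estimate_norms_knn(np.array([1.0]), thetas, norms, k=1)
between = estimate_norms_knn(np.array([0.5]), thetas, norms, k=2)
print(exact, between)
```

A trained neural network replaces this lookup with a learned function, so the stored data set does not need to be searched at control time.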

For example, in actual robot control processing, processing using the regression analyzer (NN Regressor: neural network regression analyzer) 122a generated by this constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 is executed.

That is, first, a quadratic programming problem is set up to calculate the optimal solution x ^* including the control information of the robot from the observation information (θ) of the robot.
Next, the above regression analyzer (NN Regressor: neural network regression analyzer) is used to estimate the norm of each constraint of the quadratic programming problem, and the active constraints corresponding to the observation information (θ) of the robot are selected based on the estimated norms.
Furthermore, the selected active constraints are used to calculate the optimal solution x ^* of the quadratic programming problem.
Such processing makes it possible to calculate, at high speed, the optimal solution x ^* of the quadratic programming problem including the optimal control information from the observation information (θ) of the robot.
This processing will be described with reference to FIG. 14.

FIG. 14 shows the control information generator 200 configured in the information processing apparatus. The control information generation unit 200 receives, for example, an input parameter (θ), which is observation information of the robot, and executes a process of calculating the optimal solution x ^* of the quadratic programming problem as the control information of the robot.

As shown in FIG. 14, the control information generator 200 includes a constraint norm estimator (NN Regressor = neural network regression analyzer) 201, a threshold applied active constraint selector (Threshold) 202, and a linear system analyzer (Linear System Solver) 203.

The constraint norm estimator (NN Regressor = neural network regression analyzer) 201 executes constraint norm estimation processing using the regression analyzer (NN Regressor: neural network regression analyzer) generated by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 in the constraint norm estimator generator 120 of the learning processing unit 100 described above.

That is, using the regression analyzer (NN Regressor: neural network regression analyzer) 122a described with reference to FIG. 13, the constraint norm (S _l ^* (θ)) of each constraint corresponding to the input parameter (θ) is estimated and output.
The constraint norm (S _l ^* (θ)) is an index value used to determine whether each constraint is an active constraint that is used to calculate the optimal solution x ^* of the quadratic programming problem or an inactive constraint that is not used.

An example of processing executed by the constraint norm estimator (NN Regressor=neural network regression analyzer) 201 will be described with reference to FIG. 15.
FIG. 15 shows a regression analyzer (NN Regressor: neural network regression analyzer) configured in the constraint norm estimator (NN Regressor: neural network regression analyzer) 201 .

This regression analyzer (NN Regressor: neural network regression analyzer) corresponds to the regression analyzer described above with reference to FIG. 13, that is, the regression analyzer (NN Regressor: neural network regression analyzer) 122a generated by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122.

That is, the regression analyzer (NN Regressor: neural network regression analyzer) shown in FIG. 15 estimates and outputs the constraint norm (S _l ^* (θ)) of each constraint corresponding to the input parameter (θ).

The regression analyzer (NN Regressor: neural network regression analyzer) shown in FIG. 15 is a regression analyzer generated by the learning processing using the learning data set (θ, S _l ^* (θ)).

Therefore, for example, a regression analysis using the constraint norm (S _l ^* (θ)) recorded in learning data sets containing parameters (θ) highly similar to the input parameter θ for the regression analyzer (NN Regressor: neural network regression analyzer) is executed to estimate the constraint norm (S _l ^* (θ)) corresponding to the input parameter θ.

For example, one or more learning data sets (θ, S _l ^* (θ)) containing parameters (θ) highly similar to the input parameter θ for the regression analyzer (NN Regressor: neural network regression analyzer) 122a are selected, and a process of estimating and outputting the constraint norm (S _l ^* (θ)) corresponding to the input parameter θ by a regression calculation method or the like, based on the constraint norm (S _l ^* (θ)) of the selected learning data sets, is executed.

In the example shown in FIG. 15, the regression analyzer (NN Regressor: neural network regression analyzer) 122a estimates and outputs the constraint norm (S _l ^* (θ))=(0.00, 1.25, 1.80, 1.50).

This constraint norm (S _l ^* (θ)) data indicates the norm of each constraint as follows:
constraint norm of constraint ab (S _l ^* (θ)) = 0.00,
constraint norm of constraint cd (S _l ^* (θ)) = 1.25,
constraint norm of constraint ef (S _l ^* (θ)) = 1.80,
constraint norm of constraint gh (S _l ^* (θ)) = 1.50.

In this way, the constraint norm estimator (NN Regressor = neural network regression analyzer) 201 uses a regression analyzer (NN Regressor: neural network regression analyzer) to estimate the constraint norm (S _l ^* (θ)) of each constraint corresponding to the input parameter (θ).

The constraint norm (S _l ^* (θ)) is an index value used to determine whether each constraint is an active constraint that is used to calculate the optimal solution x ^* of the quadratic programming problem or an inactive constraint that is not used.

The constraint norm (S _l ^* (θ)) of each constraint corresponding to the input parameter (θ) estimated by the constraint norm estimator (NN Regressor = neural network regression analyzer) 201 is input to the threshold application active constraint selector (Threshold) 202.

The threshold application active constraint selector (Threshold) 202 applies a predefined threshold (λ) to generate active constraint identification data (S ^* (θ)), which is discriminant data indicating whether each of the constraints defined in the quadratic programming problem is an active constraint that is used in the calculation of the optimal solution x ^* or an inactive constraint that is not used.

This active constraint identification data (S ^* (θ)) is similar to the active constraint identification data (S ^* (θ)) described earlier, and enables selection of only the active constraints corresponding to the parameter (θ).

A specific example of processing executed by the threshold application active constraint selection unit (Threshold) 202 will be described with reference to FIG.

As described above, the threshold application active constraint selector (Threshold) 202 applies a predefined threshold (λ) to generate active constraint identification data (S ^* (θ)), which is discriminant data for determining whether each of the constraints defined in the quadratic programming problem is an active constraint used in calculating the optimal solution x ^* of the quadratic programming problem or an inactive constraint that is not used.

FIG. 16 shows (a) input data and (b) output data for the threshold application active constraint selection unit (Threshold) 202 .
(a) Input data is the constraint norm (S _l ^* (θ)) of each constraint corresponding to the input parameter (θ) generated by the constraint norm estimator (NN Regressor=neural network regression analyzer) 201 .
As described above, the constraint norm (S _l ^* (θ)) is the norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x ^* of the quadratic programming problem to each constraint.

The threshold application active constraint selection unit (Threshold) 202 compares the constraint norm (S _l ^* (θ)) of each constraint in (a) the input data with a predetermined threshold value (λ).

If the constraint norm (S _l ^* (θ)) is greater than or equal to the predefined threshold value (λ), the constraint is determined to be an inactive constraint, and if the value of the constraint norm is less than the predefined threshold value (λ), the constraint is determined to be an active constraint. Active constraint identification data (S ^* (θ)) is generated according to the determination result.
That is, the unit generates active constraint identification data (S ^* (θ)), which is data for determining whether each constraint is an active constraint that is used or an inactive constraint that is not used.
FIG. 16(b) shows the active constraint identification data (S ^* (θ)) as the output data.

In the example shown in FIG. 16, the constraint norm (S _l ^* (θ)) of each constraint for the parameter (θ ₀ ) in (a) the input data is as follows:
constraint norm of constraint ab (S _l ^* (θ)) = 0.00,
constraint norm of constraint cd (S _l ^* (θ)) = 1.25,
constraint norm of constraint ef (S _l ^* (θ)) = 1.80,
constraint norm of constraint gh (S _l ^* (θ)) = 1.50.

The threshold application active constraint selection unit (Threshold) 202 compares the constraint norm (S _l ^* (θ)) of each constraint in (a) the input data with a predetermined threshold value (λ).
For example, if the threshold λ = 0.20, the constraint ab has
constraint norm = 0.00,
which is less than the threshold (λ=0.20), so the constraint ab is determined to be an active constraint.

The constraint norms of all of the other constraints cd, ef, and gh are greater than or equal to the threshold (λ=0.20), so these constraints are determined to be inactive constraints.

The threshold application active constraint selector (Threshold) 202 generates active constraint identification data (S ^* (θ)) based on these determination results. The active constraint identification data (S ^* (θ)) is data in which (1) is associated with each active constraint corresponding to the input parameter (θ) and (0) is associated with each inactive constraint, and it enables selection of only the active constraints.
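The threshold comparison and the resulting 1/0 identification data can be sketched as follows, using the constraint norm values and the threshold λ=0.20 from the FIG. 16 example. The function name `active_constraint_identification` is hypothetical, and the sketch is an illustration rather than the disclosed implementation.

```python
import numpy as np

def active_constraint_identification(norms, lam):
    """Return 1 for constraints whose norm is below lam (active), else 0 (inactive)."""
    return (np.asarray(norms) < lam).astype(int)

names = ["ab", "cd", "ef", "gh"]
norms = [0.00, 1.25, 1.80, 1.50]   # constraint norms from the FIG. 16 example
lam = 0.20                          # threshold used in the FIG. 16 example

flags = active_constraint_identification(norms, lam)
active = [name for name, flag in zip(names, flags) if flag == 1]
print(flags.tolist(), active)
```

A constraint whose norm equals the threshold is treated as inactive, which follows the "greater than or equal to" rule stated in the text.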

Active constraint identification data (S ^* (θ)) generated by the threshold applied active constraint selector (Threshold) 202 is input to a linear system solver (Linear System Solver) 203 .

A linear system solver 203 uses the active constraint identification data (S ^* (θ)) to extract only the active constraints from the constraints included in the inequality constraints “l≦Ax≦u” of the quadratic programming problem. These are regarded as active equality constraints, and processing is performed to calculate the optimal solution x ^* (n-dimensional vector) that satisfies the active equality constraints.

That is, if the extracted active constraints are regarded as active equality constraints, it becomes possible to reduce the quadratic programming problem to a linear equation, and by solving the linear equation, it becomes possible to calculate the optimum solution x ^* at high speed.
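The reduction to a linear equation can be illustrated with the standard optimality (KKT) system for an equality-constrained quadratic program. The sketch assumes the objective has the common form (1/2) x^T P x + q^T x; the names `P`, `q`, `A_act`, `b_act`, and `solve_with_active_constraints` are assumptions for illustration, since (Equation 1) and (Equation 3) are not reproduced in this section.

```python
import numpy as np

def solve_with_active_constraints(P, q, A_act, b_act):
    """Solve min 0.5*x^T P x + q^T x subject to A_act x = b_act via one linear solve.

    The KKT conditions give the linear system
        [P      A_act^T] [x ]   [-q   ]
        [A_act  0      ] [nu] = [b_act]
    where nu holds the Lagrange multipliers of the active equality constraints.
    """
    n, m = P.shape[0], A_act.shape[0]
    kkt = np.block([[P, A_act.T], [A_act, np.zeros((m, m))]])
    rhs = np.concatenate([-q, b_act])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n]

# Hypothetical problem: minimize (x1 - 1)^2 + (x2 - 2.5)^2
# with the single active constraint x1 + x2 = 2.
P = 2.0 * np.eye(2)
q = np.array([-2.0, -5.0])
A_act = np.array([[1.0, 1.0]])
b_act = np.array([2.0])

x_opt = solve_with_active_constraints(P, q, A_act, b_act)
print(x_opt)
```

Because only a single linear solve is needed once the active set is known, no iterative search over candidate active sets is required at control time.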

Unlike the configuration and processing described earlier with reference to FIGS. 3 to 9, this embodiment does not set labels according to the combination of active and inactive constraints. Therefore, there is no problem that the number of labels increases exponentially when the number N of constraints included in the (b) constraint function in the quadratic programming problem increases.

As described above, in the configuration described above with reference to FIGS. 3 to 9, the number of labels is 2 ⁴ =16 when there are four constraints: constraint ab, constraint cd, constraint ef, and constraint gh. However, if the number of constraints is 8, the number of labels is 2 ⁸ =256. When the number of constraints is 10, the number of labels is 2 ¹⁰ =1024.

As described above, when the number of constraints N increases, the number of labels increases exponentially. This increases the memory consumption and the amount of calculation when using a predictor (NN: neural network), and may reduce the efficiency of the learning processing and lower the control speed during robot control.

On the other hand, the configuration for determining whether each constraint is an active constraint or an inactive constraint based on the norm value, described with reference to FIGS. 10 to 16, does not require classification by label.

That is, whether each constraint is an active constraint or an inactive constraint can be determined simply by checking whether the norm of each constraint is greater than or equal to the threshold or less than the threshold, so the memory consumption and the amount of calculation can be reduced.
As a result, the efficiency of the learning process is improved, and the control speed during robot control can also be improved.
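The difference in output size between the two configurations can be made concrete with a short calculation. The sketch assumes, as in the examples above, that the label-based configuration needs one label per combination of active and inactive constraints (2^N) and that the norm-based configuration outputs one norm value per constraint (N); the function names are hypothetical.

```python
def label_count(num_constraints):
    # One label per active/inactive combination of the constraints.
    return 2 ** num_constraints

def norm_output_count(num_constraints):
    # One estimated norm value per constraint.
    return num_constraints

for n in (4, 8, 10):
    print(n, label_count(n), norm_output_count(n))
```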

[4. Hardware configuration example of information processing device]
Next, a hardware configuration example of the information processing apparatus of the present disclosure will be described.

FIG. 17 is a block diagram showing one configuration example of the hardware configuration of the information processing apparatus of the present disclosure.
The information processing device is, for example, a device capable of executing the processing executed by the learning processing unit described above with reference to FIGS. 3 and 10, or by the control information generation unit described with reference to FIGS. 8 and 14.

The information processing device can be configured as, for example, a device attached to the robot or a device capable of communicating with the robot to control the robot.
Each component of the information processing apparatus shown in FIG. 17 will be described.

A CPU (Central Processing Unit) 301 functions as a data processing section that executes various processes according to programs stored in a ROM (Read Only Memory) 302 or a storage section 308 . For example, the process according to the sequence described in the above embodiment is executed. A RAM (Random Access Memory) 303 stores programs and data executed by the CPU 301 . These CPU 301 , ROM 302 and RAM 303 are interconnected by a bus 304 .

The CPU 301 is connected to an input/output interface 305 via a bus 304. Connected to the input/output interface 305 are an input unit 306, which includes user input units such as various switches, a keyboard, a touch panel, a mouse, and a microphone, as well as a camera and various sensors 321 such as LiDAR for obtaining situation data, and an output unit 307, which includes a display, a speaker, and the like.
The output unit 307 also outputs driving information to a driving unit 322 that drives a robot or the like.

The CPU 301 receives commands, situation data, and the like input from the input unit 306 , executes various processes, and outputs processing results to the output unit 307 , for example.
A storage unit 308 connected to the input/output interface 305 is composed of, for example, a flash memory, a hard disk, or the like, and stores programs executed by the CPU 301 and various data. A communication unit 309 functions as a transmission/reception unit for data communication via a network such as the Internet or a local area network, and communicates with an external device.
In addition to the CPU, a GPU (Graphics Processing Unit) may be provided as a dedicated processing unit for image information input from a camera.

A drive 310 connected to the input/output interface 305 drives a removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card to record or read data.

[5. Summary of the configuration of the present disclosure]
The present disclosure has been described in detail above with reference to specific embodiments. However, it is obvious that those skilled in the art can modify or substitute the embodiments without departing from the gist of this disclosure. That is, the present disclosure has been presented in the form of examples and should not be construed as limiting. To determine the gist of the present disclosure, the scope of claims should be considered.

In addition, the technique disclosed in this specification can take the following configurations.
(1) An information processing apparatus having:
a quadratic programming problem optimal solution calculation unit that calculates the optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation unit that calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution unit that executes learning processing using set data of the input parameters and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm according to various input parameters.

(2) The information processing apparatus according to (1), wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value for identifying whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem.

(3) The information processing apparatus according to (2), wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value by which a constraint whose constraint norm is less than a predefined threshold is determined to be an active constraint, and a constraint whose constraint norm is equal to or greater than the predefined threshold is determined to be an inactive constraint.

(4) The information processing apparatus according to any one of (1) to (3), wherein the constraint norm according to various input parameters estimated by the constraint norm estimator generated by the learning processing unit is a constraint activity determination index value for identifying whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem.

(5) The information processing apparatus according to any one of (1) to (4), wherein the constraint norm calculation unit calculates, as the constraint norm, an L2 norm (Euclidean norm) corresponding to the distance in vector space from the optimal solution of the quadratic programming problem to the constraint.

(6) The information processing apparatus according to any one of (1) to (5), wherein the quadratic programming problem optimal solution calculation unit calculates, as the optimal solution, a solution that satisfies the constraints of the constraint function of the quadratic programming problem and minimizes the objective function of the quadratic programming problem.

(7) The information processing apparatus according to any one of (1) to (6), wherein the learning processing unit generates a constraint norm estimator configured by a neural network.

(8) The information processing apparatus according to any one of (1) to (7), wherein the learning processing unit generates a constraint norm estimator configured by a neural network that executes regression analysis processing.

(9) The information processing apparatus according to any one of (1) to (8), further having a quadratic programming problem standardized model generation unit that generates a quadratic programming problem standardized model corresponding to the input parameters,
wherein the quadratic programming problem optimal solution calculation unit calculates the optimal solution using the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit.

(10) An information processing apparatus having:
a constraint norm estimator that estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection unit that, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem; and
a linear system analysis unit that selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.

(11) The information processing apparatus according to (10), wherein the constraint norm estimator is a constraint norm estimator generated by learning processing using set data of various input parameters and the constraint norm as learning data.

(12) The information processing apparatus according to (10) or (11), wherein the constraint norm estimator is a constraint norm estimator configured by a neural network.

(13) The information processing apparatus according to any one of (10) to (12), wherein the constraint norm estimator is a constraint norm estimator configured by a neural network that executes regression analysis processing.

(14) The information processing apparatus according to any one of (10) to (13), wherein the linear system analysis unit selects only the active constraints of the quadratic programming problem, converts the quadratic programming problem into a linear equation, and calculates the optimal solution by solving the linear equation.

(15) The information processing apparatus according to any one of (10) to (14), wherein the linear system analysis unit extracts only the active constraints from the inequality constraints of the quadratic programming problem, regards the extracted constraints as active equality constraints, and calculates an optimal solution that satisfies the active equality constraints.

(16) An information processing method executed in an information processing device, the method executing:
a quadratic programming problem optimal solution calculation step in which a quadratic programming problem optimal solution calculation unit calculates the optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution step in which a learning processing execution unit executes learning processing using set data of the input parameters and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm according to various input parameters.

(17) An information processing method executed in an information processing device, the method executing:
a constraint norm estimation step in which a constraint norm estimator estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection step in which an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem; and
a linear system analysis step in which a linear system analysis unit selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.

(18) A program that causes an information processing device to execute information processing, the program causing execution of:
a quadratic programming problem optimal solution calculation step of causing a quadratic programming problem optimal solution calculation unit to calculate the optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation step of causing a constraint norm calculation unit to calculate a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution step of causing a learning processing execution unit to execute learning processing using set data of the input parameters and the constraint norm as learning data, and to generate a constraint norm estimator that estimates the constraint norm according to various input parameters.

(19) A program that causes an information processing device to execute information processing, the program causing execution of:
a constraint norm estimation step of causing a constraint norm estimator to estimate, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection step of causing an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, to generate constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem; and
a linear system analysis step of causing a linear system analysis unit to select only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and to calculate the optimal solution of the quadratic programming problem.
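
As an illustration of the constraint norm of configuration (5) and the threshold test of configuration (3), the following is a minimal Python sketch. It assumes, beyond what the text states, that the constraints are linear inequalities A x ≤ b, so that the L2 distance from the optimal solution x* to constraint i is |b_i − a_i·x*| / ‖a_i‖. The function names, the example matrices, and the threshold value are hypothetical and chosen only for illustration.

```python
import numpy as np


def constraint_norms(A, b, x_opt):
    """L2 distance from x_opt to each constraint boundary a_i . x = b_i.

    Assumption (not from the source): constraints are linear, A x <= b.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    residual = b - A @ x_opt               # slack of each constraint at x_opt
    row_norms = np.linalg.norm(A, axis=1)  # ||a_i|| for each constraint row
    return np.abs(residual) / row_norms


def select_active(norms, threshold):
    """Norm below the threshold -> active; otherwise inactive."""
    return np.asarray(norms) < threshold


# Hypothetical example: three constraints in two dimensions.
A = np.array([[1.0, 0.0],   # x1 <= 1
              [0.0, 1.0],   # x2 <= 1
              [1.0, 1.0]])  # x1 + x2 <= 3
b = np.array([1.0, 1.0, 3.0])
x_opt = np.array([1.0, 0.5])  # assumed optimal solution of some QP

norms = constraint_norms(A, b, x_opt)
active = select_active(norms, threshold=1e-6)  # threshold is illustrative
print(norms, active)
```

In this example only the first constraint has a norm of zero, so it is the only one marked active. In the learned setting described above, the norms would come from the constraint norm estimator rather than from a known optimal solution, and the threshold would typically be chosen larger to tolerate estimation error.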

It should be noted that the series of processes described in the specification can be executed by hardware, software, or a composite configuration of both. When the processing is executed by software, a program recording the processing sequence can be installed in the memory of a computer built into dedicated hardware and executed, or the program can be installed and executed on a general-purpose computer capable of executing various kinds of processing. For example, the program can be pre-recorded on a recording medium. In addition to being installed in a computer from a recording medium, the program can be received via a network such as a LAN (Local Area Network) or the Internet and installed in a recording medium such as an internal hard disk.

In addition, the various types of processing described in the specification may not only be executed in chronological order according to the description, but may also be executed in parallel or individually according to the processing capacity of the device that executes the processing or as necessary. Further, in this specification, a system is a logical collective configuration of a plurality of devices, and the devices of each configuration are not limited to being in the same housing.

As described above, according to the configuration of one embodiment of the present disclosure, a device and a method are realized that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint, enabling high-speed calculation of the optimal solution of the quadratic programming problem.
Specifically, for example, the device has a constraint norm estimator that estimates, for each constraint of a quadratic programming problem, the norm according to the input parameters, and an active constraint selection unit that, through comparison of the estimated constraint norm with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used to calculate the optimal solution. A linear analysis unit uses the constraint activity analysis information to select the active constraints and calculates the optimal solution of the quadratic programming problem.
With this configuration, it is possible to realize an apparatus and a method for efficiently selecting active constraints of a quadratic programming problem using the norm of each constraint and enabling high-speed calculation of the optimum solution of the quadratic programming problem.
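
The linear analysis step can be sketched as follows, under the assumption (not stated in this form in the text) that the quadratic programming problem is: minimize (1/2) xᵀQx + cᵀx subject to A x ≤ b, with Q positive definite. Treating the selected active constraints as equalities reduces the problem to a single linear KKT system in x and the Lagrange multipliers λ. The function name and the example data are hypothetical; this is a sketch rather than the implementation of the disclosure.

```python
import numpy as np


def solve_with_active_constraints(Q, c, A, b, active):
    """Solve min 0.5 x^T Q x + c^T x with A[active] x = b[active].

    Assumptions: Q is positive definite and the active rows of A are
    linearly independent, so the KKT matrix is nonsingular.
    """
    Q = np.asarray(Q, dtype=float)
    c = np.asarray(c, dtype=float)
    A_act = np.asarray(A, dtype=float)[active]  # keep active rows only
    b_act = np.asarray(b, dtype=float)[active]
    n, m = Q.shape[0], A_act.shape[0]

    # KKT system: [Q  A_act^T] [x  ]   [-c   ]
    #             [A_act   0 ] [lam] = [b_act]
    kkt = np.block([[Q, A_act.T],
                    [A_act, np.zeros((m, m))]])
    rhs = np.concatenate([-c, b_act])
    sol = np.linalg.solve(kkt, rhs)
    return sol[:n], sol[n:]  # primal solution, multipliers


# Hypothetical example data.
Q = np.eye(2)
c = np.array([-2.0, -0.5])
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([1.0, 1.0, 3.0])
active = np.array([True, False, False])  # e.g. output of a threshold test

x, lam = solve_with_active_constraints(Q, c, A, b, active)
print(x, lam)
```

Because only one linear solve is needed, this step avoids the iterations of a general QP solver. A practical implementation would likely also verify that the result satisfies the inactive inequalities and that the multipliers are non-negative, falling back to a full solver when the predicted active set is wrong.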

10 Robot
20 Traveling route
30 Quadratic programming problem optimal solution calculator
31 Quadratic programming problem standardized model generation unit (QP Modeling)
32 Quadratic programming problem standardized model optimal solution calculator (QP Solver)
40 Learning processing unit
50 Learning data set generation unit
51 Quadratic programming problem standardized model generation unit (QP Modeling)
52 Quadratic programming problem standardized model optimal solution calculator (QP Solver)
53 Active constraint identification data generator
60 Predictor generator
61 Learning data set (θ, S^*(θ))
62 Class classification processor (NN Classifier = neural network class classifier)
80 Control information generation unit
81 Class classification processing unit (NN Classifier = neural network class classification unit)
82 Linear System Solver
100 Learning processing unit
110 Learning data set generation unit
111 Quadratic programming problem standardized model generation unit (QP Modeling)
112 Quadratic programming problem standardized model optimal solution calculator (QP Solver)
113 Constraint norm calculator (Calc Norm)
120 Constraint norm estimator generator
121 Learning data set (θ, S_l^*(θ))
122 Constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit)
200 Control information generator
201 Constraint norm estimator (NN Regressor = neural network regression analyzer)
202 Threshold applied active constraint selector (Threshold)
203 Linear System Solver
301 CPU
302 ROM
303 RAM
304 Bus
305 Input/output interface
306 Input unit
307 Output unit
308 Storage unit
309 Communication unit
310 Drive
311 Removable medium
321 Sensor
322 Drive unit

Claims

1. An information processing apparatus having:
a quadratic programming problem optimal solution calculation unit that calculates the optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation unit that calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution unit that executes learning processing using set data of the input parameters and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm according to various input parameters.
2. The information processing apparatus according to claim 1, wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value for identifying whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem.
3. The information processing apparatus according to claim 2, wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value by which a constraint whose constraint norm is less than a predefined threshold is determined to be an active constraint, and a constraint whose constraint norm is equal to or greater than the predefined threshold is determined to be an inactive constraint.
4. The information processing apparatus according to claim 1, wherein the constraint norm according to various input parameters estimated by the constraint norm estimator generated by the learning processing unit is a constraint activity determination index value for identifying whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem.
5. The information processing apparatus according to claim 1, wherein the constraint norm calculation unit calculates, as the constraint norm, an L2 norm (Euclidean norm) corresponding to the distance in vector space from the optimal solution of the quadratic programming problem to the constraint.
6. The information processing apparatus according to claim 1, wherein the quadratic programming problem optimal solution calculation unit calculates, as the optimal solution, a solution that satisfies the constraints of the constraint function of the quadratic programming problem and minimizes the objective function of the quadratic programming problem.
7. The information processing apparatus according to claim 1, wherein the learning processing unit generates a constraint norm estimator configured by a neural network.
8. The information processing apparatus according to claim 1, wherein the learning processing unit generates a constraint norm estimator configured by a neural network that performs regression analysis processing.
9. The information processing apparatus according to claim 1, further having a quadratic programming problem standardized model generation unit that generates a quadratic programming problem standardized model corresponding to the input parameters,
wherein the quadratic programming problem optimal solution calculation unit calculates the optimal solution using the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit.
10. An information processing apparatus having:
a constraint norm estimator that estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection unit that, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem; and
a linear system analysis unit that selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.
11. The information processing apparatus according to claim 10, wherein the constraint norm estimator is a constraint norm estimator generated by learning processing using set data of various input parameters and the constraint norm as learning data.
12. The information processing apparatus according to claim 10, wherein the constraint norm estimator is a constraint norm estimator configured by a neural network.
13. The information processing apparatus according to claim 10, wherein the constraint norm estimator is a constraint norm estimator configured by a neural network that performs regression analysis processing.
14. The information processing apparatus according to claim 10, wherein the linear system analysis unit selects only the active constraints of the quadratic programming problem, converts the quadratic programming problem into a linear equation, and calculates the optimal solution by solving the linear equation.
15. The information processing apparatus according to claim 10, wherein the linear system analysis unit extracts only the active constraints from the inequality constraints of the quadratic programming problem, regards the extracted constraints as active equality constraints, and calculates an optimal solution that satisfies the active equality constraints.
16. An information processing method executed in an information processing device, the method executing:
a quadratic programming problem optimal solution calculation step in which a quadratic programming problem optimal solution calculation unit calculates the optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution step in which a learning processing execution unit executes learning processing using set data of the input parameters and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm according to various input parameters.
17. An information processing method executed in an information processing device, the method executing:
a constraint norm estimation step in which a constraint norm estimator estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection step in which an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem; and
a linear system analysis step in which a linear system analysis unit selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.
18. A program that causes an information processing device to execute information processing, the program causing execution of:
a quadratic programming problem optimal solution calculation step of causing a quadratic programming problem optimal solution calculation unit to calculate the optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation step of causing a constraint norm calculation unit to calculate a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution step of causing a learning processing execution unit to execute learning processing using set data of the input parameters and the constraint norm as learning data, and to generate a constraint norm estimator that estimates the constraint norm according to various input parameters.
19. A program that causes an information processing device to execute information processing, the program causing execution of:
a constraint norm estimation step of causing a constraint norm estimator to estimate, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection step of causing an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, to generate constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint not used to calculate the optimal solution of the objective function of the quadratic programming problem; and
a linear system analysis step of causing a linear system analysis unit to select only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and to calculate the optimal solution of the quadratic programming problem.