WO2023276255A1 - Information processing device, information processing method, and program - Google Patents

Information processing device, information processing method, and program

Info

Publication number
WO2023276255A1
Authority
WO
WIPO (PCT)
Prior art keywords
constraint
norm
quadratic programming
programming problem
active
Prior art date
Application number
PCT/JP2022/006846
Other languages
French (fr)
Japanese (ja)
Inventor
克文 杉本
Original Assignee
ソニーグループ株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニーグループ株式会社
Publication of WO2023276255A1

Classifications

    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B25: HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J: MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J13/00: Controls for manipulators
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05B: CONTROL OR REGULATING SYSTEMS IN GENERAL; FUNCTIONAL ELEMENTS OF SUCH SYSTEMS; MONITORING OR TESTING ARRANGEMENTS FOR SUCH SYSTEMS OR ELEMENTS
    • G05B13/00: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion
    • G05B13/02: Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion, electric
    • G: PHYSICS
    • G05: CONTROLLING; REGULATING
    • G05D: SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00: Control of position, course or altitude of land, water, air, or space vehicles, e.g. automatic pilot
    • G05D1/02: Control of position or course in two dimensions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N99/00: Subject matter not provided for in other groups of this subclass

Definitions

  • the present disclosure relates to an information processing device, an information processing method, and a program. More specifically, the present invention relates to an information processing apparatus, an information processing method, and a program that perform learning processing for determining control parameters of a robot, control of the robot using the learning result, and the like.
  • Information acquired by sensors attached to the robot, such as cameras and distance sensors, is analyzed to determine the positions of obstacles, and processing is performed to calculate a route or trajectory that does not come into contact with the obstacles.
  • A quadratic programming problem (QP), of which the method of least squares is one example, is an optimization problem in which the objective function is a quadratic function and the constraint conditions are linear functions.
  • It is an optimization problem that can be solved as a minimization problem in which the objective function is downwardly convex.
  • Non-Patent Document 1 ("Online Mixed-Integer Optimization in Milliseconds") discloses a method of calculating an optimal solution to a quadratic programming problem.
  • Non-Patent Document 1 uses a neural network, trained in advance on learning data, to predict the active constraints and integer values when obtaining an optimal solution of a mixed-integer quadratic programming problem, and discloses a method of calculating the optimal solution at high speed by converting the problem into a simple linear problem.
  • the neural network predicts the combination of active constraints and integer values when calculating the optimal solution from the parameters for formulating a mixed integer quadratic programming problem.
  • Input: parameters
  • Output: active constraints and integer values
  • The network is trained on a data set consisting of combinations of this input and output. This data set can be prepared by solving a large number of problems in advance.
  • In Non-Patent Document 1, the neural network is modeled as a class classification problem.
  • the dimension of the output should be the same as the number of all possible “combinations of active constraints and integer values” (one-hot vectorization). However, this number grows exponentially with the number N of constraints. Therefore, in Non-Patent Document 1, the number of combinations of active constraints and integer values appearing in a heuristically prepared data set is used as an output dimension.
  • the problem of the output dimension increasing exponentially can be avoided by using a model that directly predicts the input/output relationship through regression prediction.
  • the output dimension is proportional to the number of constraints N, and the amount of calculation can be reduced.
  • However, because the output is binary, it is not well suited to neural network learning, and the problem arises that learning efficiency deteriorates.
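  • The difference in output dimension between the modeling choices described above can be illustrated with a small sketch. It assumes, purely for illustration, that each of the N constraints is simply active or inactive (integer variables are ignored); the function names and the sample data are hypothetical and do not come from Non-Patent Document 1.

```python
import numpy as np

def one_hot_dim_all(n_constraints):
    # Each constraint is either active or inactive, so a one-hot
    # encoding over every possible pattern needs 2**N outputs.
    return 2 ** n_constraints

def one_hot_dim_observed(patterns):
    # Heuristic described above: use only the patterns that actually
    # appear in the prepared data set as output classes.
    return len({tuple(p) for p in patterns})

def regression_dim(n_constraints):
    # One output per constraint: grows linearly with N.
    return n_constraints

# Hypothetical data set of active (1) / inactive (0) patterns for N = 4.
patterns = np.array([[1, 0, 0, 0],
                     [1, 0, 0, 0],
                     [0, 1, 1, 0],
                     [0, 0, 0, 0]])
N = patterns.shape[1]
dims = (one_hot_dim_all(N), one_hot_dim_observed(patterns), regression_dim(N))
print(dims)
```

  • Under these assumptions the full one-hot dimension doubles with every added constraint, while the per-constraint output grows by one.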
  • The present disclosure has been made in view of, for example, the above problems, and aims to provide an information processing device, an information processing method, and a program that are capable of solving a quadratic programming problem (QP) efficiently and at high speed.
  • For example, an information processing device, an information processing method, and a program are provided that perform learning processing used for determining control parameters of a robot and that perform robot control using a predictor and data generated by the learning processing.
  • A first aspect of the present disclosure is an information processing apparatus including: a quadratic programming problem optimal solution calculation unit that calculates the optimal solution of a quadratic programming problem corresponding to input parameters; a constraint norm calculation unit that calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and a learning process execution unit that executes a learning process using set data of the input parameters and the constraint norms as learning data, and generates a constraint norm estimator that estimates the constraint norms according to various input parameters.
  • A second aspect of the present disclosure is an information processing apparatus including: a constraint norm estimator that estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters; an active constraint selection unit that compares the constraint norm estimated by the constraint norm estimator with a predetermined threshold and generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used in that calculation; and a linear system analysis unit that selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.
  • A third aspect of the present disclosure is an information processing method executed in an information processing device, the method executing: a quadratic programming problem optimal solution calculation step in which a quadratic programming problem optimal solution calculation unit calculates the optimal solution of a quadratic programming problem corresponding to input parameters; a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and a learning process execution step in which a learning process execution unit executes a learning process using set data of the input parameters and the constraint norms as learning data, and generates a constraint norm estimator that estimates the constraint norms according to various input parameters.
  • A fourth aspect of the present disclosure is an information processing method executed in an information processing device, the method executing: a constraint norm estimation step in which a constraint norm estimator estimates, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters; an active constraint selection step in which an active constraint selection unit compares the constraint norm estimated by the constraint norm estimator with a predetermined threshold and generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used in that calculation; and a linear system analysis step in which a linear system analysis unit selects only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimal solution of the quadratic programming problem.
  • A fifth aspect of the present disclosure is a program that causes an information processing device to execute information processing, the program causing execution of: a quadratic programming problem optimal solution calculation step of causing a quadratic programming problem optimal solution calculation unit to calculate the optimal solution of a quadratic programming problem corresponding to input parameters; a constraint norm calculation step of causing a constraint norm calculation unit to calculate a constraint norm, which is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and a learning process execution step of causing a learning process execution unit to execute a learning process using set data of the input parameters and the constraint norms as learning data, and to generate a constraint norm estimator that estimates the constraint norms according to various input parameters.
  • A sixth aspect of the present disclosure is a program that causes an information processing device to execute information processing, the program causing execution of: a constraint norm estimation step of causing a constraint norm estimator to estimate, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters; an active constraint selection step of causing an active constraint selection unit to compare the constraint norm estimated by the constraint norm estimator with a predetermined threshold and to generate constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used in that calculation; and a linear system analysis step of causing a linear system analysis unit to select only the active constraints using the constraint activity analysis information generated by the active constraint selection unit and to calculate the optimal solution of the quadratic programming problem.
  • the program of the present disclosure is, for example, a program that can be provided in a computer-readable format to an information processing device or computer system capable of executing various program codes via a storage medium or communication medium.
  • processing according to the program is realized on the information processing device or computer system.
  • a system is a logical collective configuration of a plurality of devices, and the devices of each configuration are not limited to being in the same housing.
  • An apparatus and a method are realized that enable high-speed calculation of an optimal solution to a quadratic programming problem by efficiently selecting the active constraints of the quadratic programming problem using the norm of each constraint.
  • Specifically, for example, the apparatus has a constraint norm estimator that estimates, for each constraint of a quadratic programming problem, the norm according to the input parameters, and an active constraint selection unit that, by comparing the estimated constraint norm with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate the optimal solution. A linear system analysis unit uses the constraint activity analysis information to select the active constraints and computes the optimal solution of the quadratic programming problem.
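  • The flow summarized above (estimate a norm per constraint, compare it with a threshold, then solve using only the active constraints) can be sketched as follows. This is a minimal illustration and not the implementation of the disclosure: it assumes the problem has the form "minimize (1/2) x^T P x + q^T x subject to l ≤ Ax ≤ u", it assumes the constraint norm is a distance between the optimal solution and each constraint boundary, and the estimated norms are hard-coded stand-ins for the output of a trained estimator. All function and variable names are hypothetical.

```python
import numpy as np

def select_active(norm_lower, norm_upper, threshold):
    # A constraint whose estimated norm (assumed here to be the distance
    # between its boundary and the optimal solution) is below the
    # threshold is treated as active.
    return norm_lower < threshold, norm_upper < threshold

def solve_with_active(P, q, A, l, u, act_l, act_u):
    # Collect the active rows as equalities A_act x = b_act.
    A_act = np.vstack([A[act_l], A[act_u]])
    b_act = np.concatenate([l[act_l], u[act_u]])
    n, k = P.shape[0], A_act.shape[0]
    # Assemble the linear (KKT) system
    #   [[P, A_act^T], [A_act, 0]] [x; lam] = [-q; b_act].
    K = np.zeros((n + k, n + k))
    K[:n, :n] = P
    K[:n, n:] = A_act.T
    K[n:, :n] = A_act
    rhs = np.concatenate([-q, b_act])
    return np.linalg.solve(K, rhs)[:n]

# Toy problem: minimize 0.5*||x - (2, 0.5)||^2 subject to 0 <= x <= 1.
P = np.eye(2)
q = np.array([-2.0, -0.5])
A = np.eye(2)
l = np.zeros(2)
u = np.ones(2)

# Stand-in for estimator output (hypothetical values).
est_lower = np.array([0.97, 0.52])
est_upper = np.array([0.03, 0.48])

act_l, act_u = select_active(est_lower, est_upper, threshold=0.1)
x = solve_with_active(P, q, A, l, u, act_l, act_u)
print(x)
```

  • In this toy case only the upper bound on the first variable is selected, so a single 3 × 3 linear solve replaces an iterative QP solve.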
  • FIG. 4 is a diagram illustrating an example of control processing of a robot to which the processing of the present disclosure can be applied.
  • FIG. 10 is a diagram illustrating a configuration example of a device that quickly calculates an optimal solution x of a quadratic programming problem by extracting "active constraints" from the "inequality constraints" set in the quadratic programming problem.
  • FIG. 2 is a diagram illustrating a configuration example of a device that executes learning processing to which a quadratic programming problem is applied.
  • A diagram illustrating an example of active constraints and inactive constraints in a quadratic programming problem.
  • FIG. 4 is a diagram illustrating a specific example of active constraint identification data (S*(θ)) generated by an active constraint identification data generation unit.
  • FIG. 4 is a diagram illustrating an example of a label corresponding to active constraint identification data (S*(θ)) output by a predictor (NN: neural network).
  • FIG. 4 is a diagram illustrating a configuration example of a control information generation unit that calculates an optimal solution x* of a quadratic programming problem including optimal control information from robot observation information (θ).
  • FIG. 2 is a diagram illustrating a configuration example of a device that executes learning processing to which a quadratic programming problem is applied.
  • FIG. 3 is a diagram illustrating a specific example of active constraints and inactive constraints in a quadratic programming problem, and of the norms (constraint norms) used as indices for distinguishing active constraints from inactive constraints.
  • FIG. 4 is a diagram illustrating a specific example of a constraint norm (S_l*(θ)) generated by a constraint norm calculator (Calc Norm).
  • FIG. 10 is a diagram for explaining an example of a regression analyzer generated by learning processing by a constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit).
  • FIG. 4 is a diagram illustrating a configuration example of a control information generation unit that calculates an optimal solution x* of a quadratic programming problem including optimal control information from robot observation information (θ).
  • FIG. 10 is a diagram illustrating a specific example of processing executed by a threshold-applied active constraint selection unit (Threshold).
  • A diagram illustrating a hardware configuration example of the information processing apparatus of the present disclosure.
  • quadratic programming problem is a problem of calculating an optimal solution, such as robot path information and control information, by applying quadratic programming, which is a representative example of a nonlinear programming technique for mathematical optimization.
  • a quadratic programming problem is an optimization problem in which the objective function is a quadratic function and the constraint condition is a linear function.
  • the least squares method is also a kind of quadratic programming problem.
  • a quadratic programming problem can be solved, for example, as a minimization problem with a downwardly convex objective function.
  • the quadratic programming problem is the problem of finding an n-dimensional vector x as the optimal solution to the problem shown in (Formula 1) below.
  • (a) is the objective function (or cost function) and (b) is the constraint function.
  • The quadratic programming problem is the problem of finding the optimal solution x (an n-dimensional vector) that minimizes (a) the objective function in the above equation.
  • The constraint function defines the allowable range in which the optimal solution x may lie. When solving a quadratic programming problem, the optimal solution x must be obtained within the range that satisfies (b) the constraint function.
  • P is an n × n real-valued symmetric matrix
  • q is an n × 1 real vector
  • A is an m × n matrix
  • l and u are m-dimensional vectors
  • x^T denotes the transpose of the n-dimensional vector x.
  • constraint function means a constraint that each element of vector Ax is greater than or equal to the corresponding element of vector l and less than or equal to the corresponding element of vector u.
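  • As a reading aid, one form of (a) and (b) that is consistent with the symbol definitions above is the standard form commonly used for quadratic programs, shown below. The factor 1/2 and the exact arrangement of the terms are assumptions, since only the symbols and the inequality l ≤ Ax ≤ u are described in the text.

```latex
\begin{aligned}
\text{(a)}\quad & \min_{x}\ \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to}\quad l \le A x \le u
\end{aligned}
```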
  • The constraints include both active constraints, which can be used in the calculation of the optimal solution x, and inactive constraints, which are not used in that calculation. If only the active constraints can be extracted and treated as equality constraints, the quadratic programming problem can be reduced to a system of linear equations, and the optimal solution can be calculated at high speed.
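  • A brief worked form of this reduction, given as a sketch under stated assumptions: suppose the objective is written as (1/2) x^T P x + q^T x and the active rows are collected as A_act x = b_act (the names A_act, b_act and the multiplier vector λ, as well as the factor 1/2, are introduced here for illustration). The stationarity condition P x + q + A_act^T λ = 0, together with A_act x = b_act, then gives a single linear system in x and λ.

```latex
\begin{bmatrix} P & A_{\mathrm{act}}^{T} \\ A_{\mathrm{act}} & 0 \end{bmatrix}
\begin{bmatrix} x^{*} \\ \lambda \end{bmatrix}
=
\begin{bmatrix} -q \\ b_{\mathrm{act}} \end{bmatrix}
```

  • Solving this system requires one linear solve rather than an iterative search over the inequality constraints, which is the source of the speed-up described above.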
  • The quadratic programming problem optimum solution calculation device 30 shown in FIG. 2 receives the parameter θ as input and outputs the optimum solution x* of the quadratic programming problem.
  • Here, x* denotes the optimal solution for x (an n-dimensional vector).
  • When applied to the control configuration of the robot 10 shown in FIG. 1, for example, the relationship between the input parameter θ and the output optimal solution x* can be the following correspondence.
  • Input parameter θ: observation information (distance to obstacles, robot position, speed, direction, etc.)
  • Output optimum solution x*: robot control information (robot traveling direction control information, speed control information, output control information for left and right wheels, etc.)
  • That is, the quadratic programming problem optimal solution calculation device 30 shown in FIG. 2 can be used for processing such as receiving the parameter θ composed of the observation information of the robot 10 and outputting the robot control information as the optimal solution x* of the quadratic programming problem.
  • the constraint function is composed of, for example, a function that defines speed limit information of the robot 10, minimum distance information that is allowed between the robot and an obstacle, and the like.
  • The input parameter θ is, for example, a k-dimensional vector (θ_0, θ_1, ..., θ_{k−1}), and the optimal solution x* is an n-dimensional vector of control information (x_0, x_1, ..., x_{n−1}).
  • Each parameter (P, q, A, l, u) set in (a) the objective function and (b) the constraint function of (Equation 1) above is a parameter defined by the relationship between the input parameter θ and the output optimal solution x*.
  • The quadratic programming problem standardized model is the mathematical model shown in (Equation 1), which was explained earlier.
  • (a) is an objective function (or cost function).
  • (b) is a constraint function, which is an inequality constraint function composed of inequalities.
  • The quadratic programming problem is the problem of calculating the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function in the above equation.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 31, that is, the model composed of (a) the objective function and (b) the constraint function of (Equation 1) above, and calculates and outputs the optimal solution x* (an n-dimensional vector).
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 calculates the optimal solution x* of the optimization problem (quadratic programming problem) and substitutes the calculated optimal solution x* into the inequality constraint "l ≤ Ax ≤ u". This substitution extracts only the rows where equality holds.
  • As a result, (Equation 2) is generated.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 uses (Equation 2) to extract the active constraints, treats the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 31 into a quadratic programming problem with active equality constraints, as shown in (Equation 3).
  • The quadratic programming problem is thereby replaced by the problem of calculating the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the active equality constraint function.
  • The constraints of a quadratic programming problem include inequality constraints and equality constraints, and they also include active constraints that can be used for the calculation of the optimal solution x* and inactive constraints that are not used for the calculation of the optimal solution x*.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 substitutes the optimal solution x* into the inequality constraints, extracts only the rows where equality holds, and generates selection matrices S_cl and S_cu in which the other rows are set to 0.
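  • The substitution step can be sketched as follows. This is a minimal illustration that assumes S_cl and S_cu are diagonal 0/1 matrices marking the rows at which the lower bound l and the upper bound u, respectively, hold with equality; the exact form of the selection matrices is not given in the text, and the function name and the numerical tolerance are hypothetical.

```python
import numpy as np

def selection_matrices(A, l, u, x_opt, tol=1e-8):
    # Substitute the optimal solution into l <= A x <= u and mark the
    # rows where equality holds (within a numerical tolerance).
    Ax = A @ x_opt
    at_lower = np.abs(Ax - l) <= tol
    at_upper = np.abs(Ax - u) <= tol
    # Diagonal selection matrices: rows of inactive constraints are all 0.
    S_cl = np.diag(at_lower.astype(float))
    S_cu = np.diag(at_upper.astype(float))
    return S_cl, S_cu

# Toy data: box constraints 0 <= x <= 1 with optimal solution (1, 0.5).
A = np.eye(2)
l = np.array([0.0, 0.0])
u = np.array([1.0, 1.0])
x_opt = np.array([1.0, 0.5])

S_cl, S_cu = selection_matrices(A, l, u, x_opt)
print(S_cl)
print(S_cu)
```

  • In this toy case only the upper bound of the first row holds with equality, so S_cu has a single 1 on its diagonal and S_cl is all zeros.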
  • Set data of various input parameters (θ) and active constraint identification data for extracting the active constraints corresponding to those input parameters (θ) are generated in advance as a learning data set.
  • That is, a learning process for generating active constraint identification data, which is used to extract the active constraints from the inequality constraints included in a quadratic programming problem, is executed in advance, and a learning data set (θ, S*(θ)) consisting of corresponding pairs of input parameters (θ) and active constraint identification data (S*(θ)) is generated.
  • The active constraint identification data (S*(θ)) is applied to extract the active constraints according to the input parameter (θ), the extracted active constraints are regarded as active equality constraints, and the quadratic programming problem is solved to calculate the optimal solution x*.
  • a method using learning processing is effective as one method for calculating the unknown optimal solution x * of the quadratic programming problem at high speed.
  • Active constraint identification data corresponding to various input parameters (θ), that is, data for selectively extracting the active constraints corresponding to the input parameters (θ) from the inequality constraints included in the quadratic programming problem, is generated in advance by learning processing. That is, through the learning process, a learning data set (θ, S*(θ)) consisting of set data of various input parameters (θ) and the active constraint identification data (S*(θ)) corresponding to each parameter (θ) is generated in advance.
  • A learning process using the learning data set (θ, S*(θ)) is then executed to generate a predictor, for example a neural network (NN), that estimates the active constraint identification data (S*(θ)) corresponding to the input parameter (θ) from various input parameters (θ).
  • When executing robot control, this predictor, such as a neural network (NN), is used to estimate the active constraint identification data (S*(θ)) corresponding to the input parameters (θ) from various input parameters (θ). Furthermore, based on the estimated active constraint identification data (S*(θ)), the active constraints corresponding to the input parameters (θ) are selected, the selected active constraints are used to solve the quadratic programming problem, and the optimal solution x* is calculated as robot control information.
  • In other words, a predictor that predicts the active constraint identification data (S*(θ)) necessary for selectively extracting the active constraints for the various input parameters (θ) acquired by the robot as observation information is generated by learning processing using the learning data set (θ, S*(θ)).
  • By using this predictor, the optimal solution x* of the quadratic programming problem, that is, the optimal solution x* of robot control information and the like, can be calculated at high speed.
  • FIG. 3 is a diagram showing a configuration example of the learning processing unit 40 configured within the information processing apparatus.
  • the learning processing section 40 has a learning data set generation section 50 and a predictor generation section 60 .
  • The learning data set generation unit 50 calculates active constraint identification data (S*(θ)) that enables extraction of the active constraints according to the input parameter (θ), and generates a learning data set (θ, S*(θ)) 61 consisting of set data of various input parameters (θ) and the active constraint identification data (S*(θ)) corresponding to each parameter (θ).
  • The predictor generation unit 60 has a class classification processing unit 62 that uses the learning data set (θ, S*(θ)) 61 to generate an NN (neural network) serving as a predictor that selects the active constraints according to the input parameters (θ) from various input parameters (θ).
  • The learning data set generation unit 50 includes a quadratic programming problem standardized model generation unit (QP Modeling) 51, a quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52, and an active constraint identification data generation unit 53.
  • When applied to the control configuration of the robot 10 shown in FIG. 1, for example, the relationship between the input parameter θ and the optimal solution x* can be the following correspondence.
  • Input parameter θ: observation information (distance to obstacles, robot position, speed, direction, etc.)
  • Output optimum solution x*: robot control information (robot traveling direction control information, speed control information, output control information for left and right wheels, etc.)
  • The input parameter θ is, for example, a k-dimensional vector (θ_0, θ_1, ..., θ_{k−1}), and the optimal solution x* is an n-dimensional vector of control information (x_0, x_1, ..., x_{n−1}).
  • The quadratic programming problem standardized model generation unit (QP Modeling) 51 of the learning data set generation unit 50 shown in FIG. 3 receives a parameter θ and generates a quadratic programming problem standardized model based on the input parameter θ.
  • The quadratic programming problem standardized model is the mathematical model shown in (Equation 1), which was explained earlier.
  • (a) is an objective function (or cost function).
  • (b) is a constraint function, which is an inequality constraint function composed of inequalities.
  • The quadratic programming problem is the problem of calculating the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function in the above equation.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 51, that is, the model composed of (a) the objective function and (b) the constraint function of (Equation 1) above, and calculates and outputs the optimal solution x* (an n-dimensional vector).
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 calculates the optimal solution x* of the optimization problem (quadratic programming problem) and substitutes the calculated optimal solution x* into the inequality constraint "l ≤ Ax ≤ u". This substitution extracts only the rows where equality holds.
  • As a result, (Equation 2) described above is generated.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 uses (Equation 2) to extract the active constraints, treats the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 51 into a quadratic programming problem with active equality constraints, as shown in (Equation 3).
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 applies (Equation 3) to calculate the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the active equality constraint function.
  • The constraints of the quadratic programming problem include active constraints that can be used for the calculation of the optimal solution x* and inactive constraints that are not used for the calculation of the optimal solution x*.
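  • The three stages of the learning data set generation described above (model generation, optimal solution calculation, and active constraint identification) can be sketched with a deliberately simple toy problem. Everything in this sketch is an assumption made for illustration: the model "minimize (1/2)||x − θ||² subject to 0 ≤ x ≤ 1" is hypothetical, its closed-form solution by clipping stands in for a general QP solver, the identification data is represented as a 0/1 vector over lower and upper bounds, and the function names are not taken from the disclosure.

```python
import numpy as np

def qp_modeling(theta):
    # Hypothetical model: minimize 0.5*||x - theta||^2  s.t.  0 <= x <= 1,
    # i.e. P = I, q = -theta, A = I, l = 0, u = 1.
    n = theta.size
    return np.eye(n), -theta, np.eye(n), np.zeros(n), np.ones(n)

def qp_solver(P, q, A, l, u):
    # Closed-form solution valid only for this toy model (P = I, A = I):
    # the unconstrained minimizer -q is clipped to the box [l, u].
    return np.clip(-q, l, u)

def active_constraint_ids(A, l, u, x_opt, tol=1e-9):
    # 1 where the lower / upper bound holds with equality, else 0.
    Ax = A @ x_opt
    flags = np.concatenate([np.abs(Ax - l) <= tol, np.abs(Ax - u) <= tol])
    return flags.astype(int)

def make_dataset(thetas):
    # Build pairs (theta, S*(theta)) by solving one problem per theta.
    data = []
    for theta in thetas:
        P, q, A, l, u = qp_modeling(theta)
        x_opt = qp_solver(P, q, A, l, u)
        data.append((theta, active_constraint_ids(A, l, u, x_opt)))
    return data

thetas = [np.array([2.0, 0.5]), np.array([-1.0, 3.0]), np.array([0.2, 0.8])]
dataset = make_dataset(thetas)
for theta, s in dataset:
    print(theta, s)
```

  • Each entry pairs an input parameter with a 0/1 vector ordered as (lower bounds, upper bounds); a real data set would be built the same way but with a general-purpose QP solver in place of the clipping step.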
  • active constraints and inactive constraints in a quadratic programming problem will be described with reference to FIG.
  • The optimal solution x* of the quadratic programming problem is the solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function of the quadratic programming problem.
  • the circular dotted line shown in FIG. 4 is the contour line of the calculated value of the (a) objective function of the quadratic programming problem, and the calculated value becomes smaller toward the inner side of the contour line.
  • a region (V) shown in FIG. 4 is a region that satisfies the constraint of the (b) constraint function of the quadratic programming problem.
  • Line segments ab, cd, ef, and gh show examples of multiple constraints defined by the (b) constraint function of the quadratic programming problem.
  • Small dotted arrows extending vertically from each line segment indicate the direction in which each constraint is satisfied.
  • For the constraint ab, shown as the line segment ab, the region to the lower right of the line segment ab satisfies the constraint ab.
  • For the constraint gh, shown as the line segment gh, the region to the upper left of the line segment gh satisfies the constraint gh.
  • The region (V) shown in FIG. 4 is a region of the n-dimensional state vector.
  • The region (V) is the region that satisfies the constraints of (b) the constraint function of the quadratic programming problem, and the solution x* (an n-dimensional vector) at which (a) the objective function of the quadratic programming problem takes its minimum value within this region (V) is calculated as the optimal solution x* of the quadratic programming problem.
  • Of the constraints shown in FIG. 4, the active constraints are the constraints that are used in the process of calculating the optimal solution x* of the quadratic programming problem.
  • On the other hand, two constraints, the constraint ef and the constraint gh, are inactive constraints that are not used in the process of calculating the optimal solution x* of the quadratic programming problem.
  • the inactive constraint only defines a region that satisfies the constraints of the (b) constraint function, and is not used in the process of calculating the optimal solution x * .
  • The (b) constraint function of the quadratic programming problem includes a plurality of different constraints, and it is difficult to determine in advance which of these constraints are active constraints that can be used in the process of calculating the optimal solution x*. Whether each constraint is active or inactive can be determined only as a result of trial and error in the process of calculating the optimal solution x*.
  • However, if the active constraints that can be used in the process of calculating the optimal solution x* can be discriminated from the inactive constraints that are not used, then by extracting only the active constraints and regarding them as active equality constraints, the quadratic programming problem can be reduced to a linear equation and the optimal solution x* can be calculated at high speed.
  • The active constraint identification data generator 53 of the learning data set generator 50 shown in FIG. 3 generates data for this purpose, that is, active constraint identification data (S*(θ)).
  • The quadratic programming problem standardized model optimum solution calculation unit (QP Solver) 52 of the learning data set generation unit 50 shown in FIG. 3 calculates the optimal solution x* of the optimization problem (quadratic programming problem).
  • Furthermore, the calculated optimal solution x* is substituted into the inequality constraint “l ≤ Ax ≤ u”, and selection matrices S_cl and S_cu are generated in which the matrix elements for which equality holds as a result of the substitution are set to 1 and the other matrix elements are set to 0.
  • The matrices S_cl and S_cu are input to the active constraint identification data generator 53 of the learning data set generator 50 shown in FIG. 3, and the active constraint identification data generator 53 uses the matrices S_cl and S_cu to generate data (active constraint identification data (S*(θ))) for identifying whether each constraint included in (b) the constraint function of the quadratic programming problem is active or inactive.
  • The selection matrices S_cl and S_cu generated in the process of calculating the optimal solution x* are output to the active constraint identification data generation unit 53.
  • The selection matrices S_cl and S_cu are obtained by substituting the calculated optimal solution x* into the inequality constraint “l ≤ Ax ≤ u” and selecting only the rows for which equality holds.
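As a rough sketch of the substitution step described above, the following code marks which rows of l ≤ Ax ≤ u hold with equality at x* and places the resulting 0/1 flags on the diagonals of S_cl (lower bound tight) and S_cu (upper bound tight). The tolerance `tol`, the function name `selection_matrices`, and the example numbers are assumptions for illustration; the source does not state how equality is tested numerically.

```python
def selection_matrices(A, l, u, x_opt, tol=1e-9):
    """Return (S_cl, S_cu): diagonal 0/1 matrices marking the rows of
    l <= A x <= u whose lower / upper bound holds with equality at x_opt."""
    Ax = [sum(a * x for a, x in zip(row, x_opt)) for row in A]
    lower = [1 if abs(v - lo) <= tol else 0 for v, lo in zip(Ax, l)]
    upper = [1 if abs(hi - v) <= tol else 0 for v, hi in zip(Ax, u)]

    def diag(bits):
        m = len(bits)
        return [[bits[i] if i == j else 0 for j in range(m)] for i in range(m)]

    return diag(lower), diag(upper)


# Four illustrative constraints on a 2-dimensional x; only the first is tight.
INF = float("inf")
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, -1.0]]
l = [-INF, -INF, -INF, -INF]
u = [1.0, 2.0, 5.0, 3.0]
S_cl, S_cu = selection_matrices(A, l, u, x_opt=[1.0, 0.0])
```

Multiplying A and u by S_cu keeps only the tight rows, which is one way the equality-constrained form could be assembled.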
  • The active constraint identification data generator 53 receives the following data.
  • Based on these input data, the active constraint identification data generator 53 generates active constraint identification data (S*(θ)), which is information for selectively extracting only the active constraints from the inequality constraints (l ≤ Ax ≤ u) of the quadratic programming problem standardized model.
  • The active constraint identification data generated by the active constraint identification data generator 53 is the output of the active constraint identification data generator 53 shown in FIG. 3, that is, S*(θ).
  • This active constraint identification data (S*(θ)) is data summarizing the diagonal components of the selection matrices S_cl and S_cu input from the quadratic programming problem standardized model optimum solution calculator (QP Solver) 52.
  • The selection matrices S_cl and S_cu input from the quadratic programming problem standardized model optimum solution calculator (QP Solver) 52 are the matrices shown in (Equation 4) below.
  • The selection matrices S_cl and S_cu are matrices in which the diagonal elements from the upper left to the lower right are 0 or 1 and all other elements are 0.
  • A diagonal element of 0 corresponds to an inactive constraint that is not used in the process of calculating the optimal solution x* (n-dimensional vector) of the quadratic programming problem, and a diagonal element of 1 corresponds to an active constraint that is used to calculate the optimal solution x* (n-dimensional vector) of the quadratic programming problem.
  • The active constraint identification data generator 53 generates active constraint identification data (S*(θ)) corresponding to the input parameter (θ), as shown in FIG.
  • For example, the active constraint identification data (S*(θ)) corresponding to the input parameter (θ0) is (1000).
  • The active constraint identification data (S*(θ)) is composed of a data string expressing 1 for an active constraint and 0 for an inactive constraint.
  • Constraint ab 1 (active constraint)
  • Constraint cd 0 (inactive constraint)
  • constraint ef 0 (inactive constraint)
  • Constraint gh 0 (inactive constraint)
  • Constraint ab 1 (active constraint)
  • Constraint cd 1 (active constraint)
  • constraint ef 0 (inactive constraint)
  • Constraint gh 0 (inactive constraint)
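The 0/1 strings listed above can be derived from the selection matrices with a sketch such as the following. It assumes that "summarizing the diagonal components" means a constraint is flagged as active when either its lower-bound or its upper-bound diagonal entry is 1; the exact rule is not given in the text. The helper names `active_constraint_id` and `diag` are illustrative.

```python
def active_constraint_id(S_cl, S_cu):
    """Collapse the diagonals of S_cl and S_cu into one 0/1 flag per constraint
    (assumption: a constraint is active if either of its bounds is tight)."""
    return tuple(1 if (S_cl[i][i] or S_cu[i][i]) else 0 for i in range(len(S_cl)))


def diag(bits):
    """Build a diagonal matrix from a list of 0/1 flags."""
    return [[b if i == j else 0 for j in range(len(bits))] for i, b in enumerate(bits)]


names = ("ab", "cd", "ef", "gh")
s_star = active_constraint_id(diag([0, 0, 0, 0]), diag([1, 1, 0, 0]))
as_string = "".join(str(b) for b in s_star)
active_names = [n for n, b in zip(names, s_star) if b]
```

Each pair (θ, S*(θ)) produced this way would form one entry of the learning data set.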
  • NN Classifier: neural network class classifier
  • NN: neural network
  • A predictor (NN: neural network) 62a is a predictor that selects and outputs, from an input parameter θ, a label corresponding to active constraint identification data (S*(θ)).
  • The label is a label corresponding to active constraint identification data (S*(θ)).
  • FIG. 7 shows an example of label setting.
  • The table shown in FIG. 7 corresponds to data in which labels are associated with the active constraint identification data (S*(θ)) generated by the active constraint identification data generator 53 described above with reference to FIG.
  • The label selection process in the predictor (NN: neural network) 62a is performed using the learning data set (θ, S*(θ)) 61 generated by the learning data set generation unit.
  • For example, from the learning data set (θ, S*(θ)) 61 stored in the storage unit, learning data sets (θ, S*(θ)) containing parameters (θ) highly similar to the input parameter θ of the predictor (NN: neural network) 62a are selected, scores are assigned so that higher similarity receives a higher score, and the label (active constraint identification data corresponding label) of the learning data set with the highest score is set as the output label.
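The similarity-and-score selection described above can be approximated by a small nearest-neighbour sketch. This is a stand-in for illustration only and is not the neural network of the source: the similarity measure 1 / (1 + Euclidean distance), the neighbour count `k`, the function name `predict_label`, and the sample data are all assumptions.

```python
import math
from collections import defaultdict


def predict_label(theta, dataset, k=3):
    """dataset: list of (theta_i, label_i) pairs.

    Scores the k most similar samples (similarity = 1 / (1 + distance)),
    accumulates the scores per label, and returns the top-scoring label.
    """
    ranked = sorted(dataset, key=lambda item: math.dist(theta, item[0]))[:k]
    scores = defaultdict(float)
    for theta_i, label in ranked:
        scores[label] += 1.0 / (1.0 + math.dist(theta, theta_i))
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Hypothetical learning data: 2-dimensional parameters with labels 1 and 2.
dataset = [
    ((0.0, 0.0), 1),
    ((0.1, 0.0), 1),
    ((1.0, 1.0), 2),
    ((1.1, 0.9), 2),
]
label, scores = predict_label((0.9, 1.0), dataset)
```

A trained classifier would replace the explicit search over stored samples, but the output has the same shape: one label per input parameter.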
  • The class classification processor (NN Classifier: neural network class classifier) 62 of the predictor generation unit 60 executes learning processing using the learning data set (θ, S*(θ)) 61 to generate a predictor (NN: Neural Network) that predicts the active constraint identification data (S*(θ)) from the input parameter θ, as shown in FIG.
  • Processing that uses this predictor (NN: neural network) is then executed.
  • A quadratic programming problem is set to calculate the optimal solution x*, including the control information of the robot, from the observation information (θ) of the robot. Furthermore, using the above predictor (NN: neural network), the active constraints corresponding to the observation information (θ) of the robot can be estimated, and the estimated active constraints can be used to execute processing for calculating the optimal solution x*. This processing will be described with reference to FIG.
  • FIG. 8 shows a control information generator 80 configured in, for example, an information processing device of a robot.
  • The control information generating unit 80 receives, for example, an input parameter (θ), which is observation information of the robot, and executes a process of calculating the optimal solution x* of the quadratic programming problem as the control information of the robot.
  • The input parameter θ is expressed, for example, as a k-dimensional vector (θ0, θ1, ...).
  • The optimal solution x* of the quadratic programming problem is expressed as an n-dimensional vector (x0, x1, ...).
  • NN Classifier: neural network class classifier
  • The active constraint identification data (S*(θ)) is data that enables selection of only the active constraints corresponding to the input parameter (θ).
  • A linear system solver 82 uses the active constraint identification data (S*(θ)) to extract only the active constraints from the constraints included in the inequality constraints “l ≤ Ax ≤ u” of the quadratic programming problem. These are regarded as active equality constraints, and processing is performed to calculate the optimal solution x* (n-dimensional vector) that satisfies the active equality constraints. By performing such processing, high-speed calculation of the optimal solution x* of the quadratic programming problem is realized, and the robot can be controlled quickly.
  • S*(θ): active constraint identification data
  • NN Classifier: neural network class classifier
  • NN: neural network
  • The predictor (NN: neural network) 81a corresponds to the predictor (NN: neural network) 62a generated in the predictor generation unit 60 of the learning processing unit 40 previously described with reference to FIG.
  • The predictor (NN: neural network) 81a selects and outputs a label corresponding to the input parameter (θ) (active constraint identification data (S*(θ)) corresponding label).
  • The label is the label described above with reference to FIG. 7, and is associated with each active constraint identification data (S*(θ)).
  • The label is a label from which the setting of the active constraint identification data (S*(θ)) can be identified.
  • The predictor (NN: neural network) 81a of the class classification processing unit (NN Classifier: neural network class classifier) 81 selects and outputs one of the labels 1, 2, 3, and so on, as shown in FIG.
  • The predictor (NN: neural network) 81a is a predictor (NN: neural network) generated by learning processing using the learning data set (θ, S*(θ)) 61 described above.
  • The predictor (NN: neural network) 81a performs, for example, label estimation processing in which scores are set for the learning data sets (θ, S*(θ)) and the label (active constraint identification data corresponding label) of the learning data set (θ, S*(θ)) with the highest score is set as the output label.
  • The predictor (NN: neural network) 81a outputs the label with the highest score, that is, label 2.
  • The label conversion unit 81b receives the label corresponding to active constraint identification data (S*(θ)) generated by the predictor (NN: neural network) 81a, selects the one active constraint identification data (S*(θ)) associated with that label, and outputs the selected active constraint identification data (S*(θ)) to the linear system solver 82 in the next stage.
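The label conversion step can be pictured as a table lookup. The sketch below builds a label table from distinct active-constraint patterns and converts a predicted label back into S*(θ). The label numbering, the observed patterns, and the names `build_label_table` and `convert_label` are hypothetical; the actual table of FIG. 7 is not reproduced in the text.

```python
def build_label_table(patterns):
    """Assign labels 1, 2, 3, ... to distinct S*(theta) patterns
    in order of first appearance."""
    table = {}
    for p in patterns:
        if p not in table.values():
            table[len(table) + 1] = p
    return table


def convert_label(label, table, names=("ab", "cd", "ef", "gh")):
    """Return the 0/1 pattern for a label and the names of the active constraints."""
    bits = table[label]
    return bits, [n for n, b in zip(names, bits) if b]


# Hypothetical patterns observed while generating the learning data set.
observed = [(1, 0, 0, 0), (1, 1, 0, 0), (1, 0, 0, 0), (0, 1, 0, 0)]
table = build_label_table(observed)
bits, active = convert_label(2, table)
```

Using labels keeps the classifier output one-dimensional even when the number of constraints is large, since only patterns that actually occur receive a label.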
  • constraint ab the number of labels
  • constraint cd the number of labels
  • constraint ef the number of labels
  • constraint gh the number of labels
  • One effective technique for quickly calculating an unknown optimal solution x* of a quadratic programming problem is to use a predictor generated by learning processing, that is, a predictor that estimates active constraints.
  • It is effective to generate in advance, by learning processing, active constraint identification data corresponding to various input parameters (θ), that is, data for selectively extracting the active constraints corresponding to the input parameters (θ) from the inequality constraints included in the quadratic programming problem.
  • The norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x* of the quadratic programming problem to each constraint is used as the active constraint identification data corresponding to the parameter (θ).
  • This learning data set (θ, S_l*(θ)) is stored in the storage unit and used when calculating the optimal solution x* as control information during execution of robot control. That is, the norm (S_l*(θ)) of each constraint for the various input parameters (θ) acquired by the robot as observation information is estimated using the stored learning data set (θ, S_l*(θ)). Furthermore, each constraint is identified as active or inactive depending on the value of its estimated norm (S_l*(θ)).
  • Such a learning result application process enables high-speed extraction of the active constraints according to the input parameter (θ). By selecting only the active constraints, the optimal solution x* of the quadratic programming problem, that is, the optimal solution x* such as the control information of the robot, can be calculated at high speed.
  • FIG. 10 is a diagram showing a configuration example of the learning processing unit 100 configured within the information processing apparatus.
  • the learning processing unit 100 has a learning data set generation unit 110 and a constraint norm estimator generation unit 120 .
  • The learning data set generation unit 110 calculates the norm corresponding to each constraint (constraint norm (S_l*(θ))), which enables extraction of active constraints according to the input parameter (θ), and generates a learning data set composed of various input parameters (θ) and the constraint norm (S_l*(θ)) of each constraint corresponding to the parameter (θ).
  • The constraint norm estimator generation unit 120 uses the learning data set (θ, S_l*(θ)) 121 to generate a regression analyzer (NN Regressor) that estimates, from various input parameters (θ), the constraint norm corresponding to the input parameter (θ).
  • The learning data set generator 110 includes a quadratic programming problem standardized model generator (QP Modeling) 111, a quadratic programming problem standardized model optimum solution calculator (QP Solver) 112, and a constraint norm calculator (Calc Norm) 113.
  • x * means a Hermitian transposed matrix of x (n-dimensional vector).
  • When applied to the control configuration of the robot 10, the relationship between the input parameter θ and the optimal solution x* can be, for example, the following correspondence, as shown in FIG.
  • Input parameter θ: observation information (distance of obstacles, robot position, speed, direction, etc.)
  • Output optimal solution x*: robot control information (robot traveling direction control information, speed control information, output control information for left and right wheels, etc.)
  • The input parameter θ is, for example, a k-dimensional vector (θ0, θ1, ...), and the optimal solution x* is an n-dimensional vector of control information (x0, x1, ..., xn−1).
  • A quadratic programming problem standardized model generator (QP Modeling) 111 of the learning data set generator 110 shown in FIG. 10 receives a parameter θ and generates a quadratic programming problem standardized model based on the input parameter θ.
  • the quadratic programming problem standardization model is the mathematical model shown in (Equation 1) below, which was explained earlier.
  • (a) is an objective function (or cost function).
  • (b) is a constraint function, which is an inequality constraint function composed of inequalities.
  • The quadratic programming problem is the problem of calculating the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function in the above equation.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 111, that is, the model composed of (a) the objective function and (b) the constraint function of (Equation 1) above, and calculates and outputs the optimal solution x* (an n-dimensional vector).
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 calculates the optimal solution x* of the optimization problem (quadratic programming problem) and substitutes the calculated optimal solution x* into the inequality constraint “l ≤ Ax ≤ u”. This substitution process extracts only the rows for which equality holds.
  • As a result, (Equation 2) described above is generated.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 uses (Equation 2) above to extract the active constraints, treats the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 111 into a quadratic programming problem with active equality constraints, as shown in (Equation 3) below.
  • The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 applies (Equation 3) above to calculate the optimal solution x* (an n-dimensional vector) that (a) minimizes the objective function and (b) satisfies the active equality constraint function.
  • the constraints of the quadratic programming problem include active constraints that can be used for the calculation process of the optimum solution x * and inactive constraints that are not used for the calculation process of the optimum solution x * .
  • The norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x* of the quadratic programming problem to each constraint is used as an index for distinguishing between active constraints and inactive constraints.
  • The x* shown in the center of FIG. 11 indicates the optimal solution x* of the quadratic programming problem.
  • The optimal solution x* of the quadratic programming problem is the solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying (b) the constraint function of the quadratic programming problem.
  • the circular dotted line shown in FIG. 11 is the contour line of the calculated value of the (a) objective function of the quadratic programming problem, and the calculated value becomes smaller toward the inner side of the contour line.
  • a region (V) shown in FIG. 11 is a region that satisfies the constraint of the (b) constraint function of the quadratic programming problem.
  • Line segments ab, cd, ef, and gh show examples of multiple constraints defined by the (b) constraint function of the quadratic programming problem.
  • Small dotted arrows extending vertically from each line segment indicate the direction in which each constraint is satisfied.
  • For the constraint ab, shown as the line segment ab, the region to the lower right of the line segment ab satisfies the constraint ab.
  • For the constraint gh, shown as the line segment gh, the region to the upper left of the line segment gh satisfies the constraint gh.
  • The region (V) shown in FIG. 11 is a region of the n-dimensional state vector.
  • The region (V) is the region that satisfies the constraints of (b) the constraint function of the quadratic programming problem, and the solution x* (an n-dimensional vector) at which (a) the objective function of the quadratic programming problem takes its minimum value within this region (V) is calculated as the optimal solution x* of the quadratic programming problem.
  • Of the constraints shown in FIG. 11, that is, the constraint ab, the constraint cd, the constraint ef, and the constraint gh, the active constraints are those that can be used in the process of calculating the optimal solution x* of the quadratic programming problem.
  • On the other hand, two constraints, the constraint ef and the constraint gh, are inactive constraints that are not used in the process of calculating the optimal solution x* of the quadratic programming problem.
  • the inactive constraint only defines a region that satisfies the constraints of the (b) constraint function, and is not used in the process of calculating the optimal solution x * .
  • The (b) constraint function of the quadratic programming problem includes a plurality of different constraints, and the norm is used in this embodiment to determine which of these constraints are active constraints that can be used in the process of calculating the optimal solution x*.
  • Specifically, the norm (L2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x* of the quadratic programming problem to each constraint is calculated for each constraint. If the calculated value of the constraint norm is greater than or equal to a predefined threshold, the constraint is determined to be an inactive constraint, and if the value of the constraint norm is less than the predefined threshold, the constraint is determined to be an active constraint.
  • By discriminating in this way between the active constraints that can be used in the process of calculating the optimal solution x* and the inactive constraints that are not used, extracting only the active constraints, and treating the extracted active constraints as active equality constraints, the quadratic programming problem can be reduced to a linear equation and the optimal solution x* can be calculated at high speed.
  • The constraint norm calculation unit (Calc Norm) 113 of the learning data set generation unit 110 shown in FIG. 10 calculates the norm of each constraint (constraint norm (S_l*(θ))).
  • The constraint norm (S_l*(θ)) calculated by the constraint norm calculation unit (Calc Norm) 113 is a constraint activity determination index value for identifying whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used for calculating the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used for calculating that optimal solution.
  • The constraint norm calculator (Calc Norm) 113 receives the following data.
  • The constraint norm calculation unit (Calc Norm) 113 calculates the norm of each constraint (constraint norm (S_l*(θ))).
  • The constraint norm (S_l*(θ)) calculated by the constraint norm calculator (Calc Norm) 113 is, for example, data (matrix data connecting the norms of each constraint) represented by the following (Equation 5).
  • The constraint norm calculator (Calc Norm) 113 calculates the constraint norm (S_l*(θ)) corresponding to the input parameter (θ), as shown in FIG. 12.
  • For example, the constraint norm (S_l*(θ)) = (0.00, 1.25, 1.80, 1.50) shows four values, each corresponding to the norm (L2 norm (Euclidean norm)) of one constraint.
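One plausible way to compute such a constraint norm is the Euclidean distance from x* to the bounding hyperplane of each row of l ≤ Ax ≤ u. The sketch below takes that reading as an assumption, since (Equation 5) is not reproduced in the text; the function name `constraint_norms` and the matrices are illustrative and were chosen so that the result matches the four example values quoted above.

```python
import math


def constraint_norms(A, l, u, x_opt):
    """For each row a_i of l <= A x <= u, return the Euclidean distance from
    x_opt to the nearer of the hyperplanes a_i.x = l_i and a_i.x = u_i."""
    norms = []
    for row, lo, hi in zip(A, l, u):
        ax = sum(a * x for a, x in zip(row, x_opt))
        gap = min(abs(ax - lo), abs(hi - ax))  # an infinite bound never wins
        norms.append(gap / math.sqrt(sum(a * a for a in row)))
    return norms


INF = float("inf")
A = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0], [0.0, -2.0]]
l = [-INF, -INF, -INF, -3.0]
u = [1.0, 1.25, 12.0, INF]
norms = constraint_norms(A, l, u, x_opt=[1.0, 0.0])
```

A norm of zero means x* lies on the constraint boundary, which is why small norms indicate active constraints.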
  • The constraint norm estimator generation unit 120 shown in FIG. 10 inputs this learning data set (θ, S_l*(θ)) to the constraint norm estimator generation learning processing execution unit (NN Regressor generation unit) 122.
  • The constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 executes learning processing using the learning data set (θ, S_l*(θ)) and generates a regression analyzer (NN Regressor) that estimates the constraint norm (S_l*(θ)) of each constraint from the input parameter θ.
  • The regression analyzer generated by learning processing by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 will be described with reference to FIG.
  • The constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 generates a regression analyzer by learning processing; shown here is the regression analyzer (NN Regressor) 122a, which estimates the constraint norm (S_l*(θ)) of each constraint from the input parameter θ.
  • The regression analyzer (NN Regressor: neural network regression analyzer) 122a estimates the constraint norm (S_l*(θ)) of each constraint from the input parameter θ.
  • The regression analyzer (NN Regressor: neural network regression analyzer) 122a performs, for example, regression analysis processing using the learning data set (θ, S_l*(θ)) 121 stored in a storage unit, and estimates and outputs the constraint norm (S_l*(θ)) corresponding to the input parameter θ.
  • Processing that uses the regression analyzer (NN Regressor: neural network regression analyzer) 122a generated by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 is then executed.
  • A quadratic programming problem is set to calculate the optimal solution x*, including the control information of the robot, from the observation information (θ) of the robot.
  • The above regression analyzer (NN Regressor: neural network regression analyzer) is used to estimate the norms of the constraints of the quadratic programming problem, and the active constraints corresponding to the observation information (θ) of the robot are selected based on the estimated norms.
  • The selected active constraints are used to perform processing for calculating the optimal solution x* of the quadratic programming problem.
  • Such processing makes it possible to calculate at high speed, from the observation information (θ) of the robot, the optimal solution x* of the quadratic programming problem including the optimal control information. This processing will be described with reference to FIG.
  • FIG. 14 shows the control information generator 200 configured in the information processing apparatus.
  • The control information generation unit 200 receives, for example, an input parameter (θ), which is observation information of the robot, and executes a process of calculating the optimal solution x* of the quadratic programming problem as the control information of the robot.
  • The input parameter θ is expressed, for example, as a k-dimensional vector (θ0, θ1, ...).
  • The optimal solution x* of the quadratic programming problem is expressed as an n-dimensional vector (x0, x1, ...).
  • NN Regressor: neural network regression analyzer
  • Threshold: threshold application active constraint selection unit
  • Linear System Solver: linear system solver
  • Constraint norm estimation processing is executed using the regression analyzer (NN Regressor: neural network regression analyzer) generated by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122.
  • The constraint norm (S_l*(θ)) is an index value used to determine whether each constraint is an active constraint that is used to calculate the optimal solution x* of the quadratic programming problem or an inactive constraint that is not used.
  • FIG. 15 shows a regression analyzer (NN Regressor: neural network regression analyzer) configured in the constraint norm estimator (NN Regressor: neural network regression analyzer) 201 .
  • This regression analyzer (NN Regressor: neural network regression analyzer) corresponds to the regression analyzer (NN Regressor: neural network regression analyzer) 122a generated by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122, described above with reference to FIG.
  • The regression analyzer (NN Regressor: neural network regression analyzer) shown in FIG. 15 estimates and outputs the constraint norm (S_l*(θ)) of each constraint corresponding to the input parameter (θ).
  • The regression analyzer (NN Regressor: neural network regression analyzer) shown in FIG. 15 is a regression analyzer generated by learning processing using the learning data set (θ, S_l*(θ)).
  • For example, regression analysis is executed using the constraint norms (S_l*(θ)) recorded in the learning data sets that contain parameters (θ) highly similar to the input parameter θ of the regression analyzer (NN Regressor: neural network regression analyzer), and the constraint norm (S_l*(θ)) corresponding to the input parameter θ is estimated.
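The similarity-based estimation described above can be sketched with a distance-weighted nearest-neighbour average. This is an illustrative stand-in and not the neural network regression analyzer of the source: the weighting 1 / distance, the neighbour count `k`, the function name `estimate_norms`, and the sample data are assumptions.

```python
import math


def estimate_norms(theta, dataset, k=2):
    """dataset: list of (theta_i, norms_i) pairs.

    Returns the distance-weighted average of the constraint norm vectors of
    the k training parameters most similar to theta.
    """
    ranked = sorted(dataset, key=lambda item: math.dist(theta, item[0]))[:k]
    # Small epsilon avoids division by zero when theta matches a sample exactly.
    weights = [1.0 / (1e-9 + math.dist(theta, t)) for t, _ in ranked]
    total = sum(weights)
    dim = len(ranked[0][1])
    return [
        sum(w * n[j] for w, (_, n) in zip(weights, ranked)) / total
        for j in range(dim)
    ]


# Hypothetical learning data: 1-dimensional parameter, two constraint norms.
dataset = [((0.0,), (0.0, 1.0)), ((1.0,), (1.0, 0.0))]
estimated = estimate_norms((0.25,), dataset)
```

Unlike the label classifier, the output is continuous, so the active set is only fixed afterwards by comparing each estimated norm with the threshold.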
  • A threshold application active constraint selection unit (Threshold) 202 applies a predefined threshold to the constraint norm of each constraint and generates active constraint identification data (S*(θ)), which is data for discriminating whether each of the constraints defined in the quadratic programming problem is an active constraint that is used in calculating the optimal solution x* of the quadratic programming problem or an inactive constraint that is not used.
  • This active constraint identification data (S*(θ)) is data similar to the active constraint identification data (S*(θ)) previously described with reference to FIGS., and enables selection of only the active constraints corresponding to the input parameter (θ).
  • FIG. 16 shows (a) input data and (b) output data for the threshold application active constraint selection unit (Threshold) 202 .
  • the constraint norm (S l * ( ⁇ )) is the norm (L 2 norm (Euclidean norm)) corresponding to the distance in the vector space from the optimal solution x * of the quadratic programming problem to each constraint.
  • the threshold application active constraint selection unit (Threshold) 202 compares (a) the constraint norm (S l * ( ⁇ )) of each constraint in the input data with a predetermined threshold value ( ⁇ ).
  • if the constraint norm (S l * ( ⁇ )) is greater than or equal to the predefined threshold value ( ⁇ ), the constraint is determined to be an inactive constraint
  • if the value of the constraint norm is less than the predefined threshold value ( ⁇ ), the constraint is determined to be an active constraint
  • active constraint identification data (S * ( ⁇ )) is generated according to the determination result. That is, it generates active constraint identification data (S * ( ⁇ )) that is data for determining whether each constraint is an active constraint or an inactive constraint that is not used.
  • FIG. 16B shows active constraint identification data (S * ( ⁇ )) shown as output data.
  • a threshold applied active constraint selector (Threshold) 202 generates active constraint identification data (S * ( ⁇ )) based on these determination results.
  • the active constraint identification data (S * ( ⁇ )) associates (1) with each active constraint corresponding to the input parameter ( ⁇ ) and (0) with each inactive constraint, which makes it possible to select only the active constraints.
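The threshold comparison described above can be sketched as follows. This is a minimal illustration, not the implementation of the disclosure: the function name `select_active_constraints`, the sample norm values, and the threshold value 0.05 are all hypothetical.

```python
def select_active_constraints(constraint_norms, threshold):
    """Return 1 for each constraint judged active and 0 for each inactive one.

    A constraint is treated as active when its estimated norm is below the
    threshold, and as inactive when the norm is at or above the threshold.
    """
    return [1 if norm < threshold else 0 for norm in constraint_norms]


# Hypothetical constraint norms as an estimator might output them.
estimated_norms = [0.002, 0.8, 0.0, 1.5, 0.04]
active_flags = select_active_constraints(estimated_norms, threshold=0.05)
print(active_flags)  # [1, 0, 1, 0, 1]
```

The output has one entry per constraint, so its length grows linearly with the number of constraints.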
  • Active constraint identification data (S * ( ⁇ )) generated by the threshold applied active constraint selector (Threshold) 202 is input to a linear system solver (Linear System Solver) 203 .
  • the linear system solver 203 uses the active constraint identification data (S * ( ⁇ )) to extract only the active constraints from the constraints included in the inequality constraints "l ≤ Ax ≤ u" of the quadratic programming problem. It treats these as equality constraints and calculates the optimal solution x * (an n-dimensional vector) that satisfies them.
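One common way to solve a quadratic program once its active constraints are fixed as equalities is to solve the KKT linear system. The sketch below takes that approach as an assumption; the text does not state which linear system the solver 203 actually builds. The objective form (1/2) x^T P x + q^T x, the names `P`, `q`, `bound`, and both function names are hypothetical. In particular, `bound` is assumed to hold, for each constraint, the bound value (l or u) at which that constraint would be active, since the 0/1 identification data alone does not say which side is hit.

```python
def solve_linear_system(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("linear system is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def solve_qp_with_active_set(P, q, A, bound, active_flags):
    """Minimize 0.5 x^T P x + q^T x subject to A_i x = bound_i for active rows i.

    Builds the KKT system [[P, A_act^T], [A_act, 0]] [x; nu] = [-q; b_act]
    and returns only the primal part x.
    """
    n = len(q)
    rows = [i for i, flag in enumerate(active_flags) if flag == 1]
    size = n + len(rows)
    K = [[0.0] * size for _ in range(size)]
    rhs = [0.0] * size
    for i in range(n):
        for j in range(n):
            K[i][j] = P[i][j]
        rhs[i] = -q[i]
    for k, row in enumerate(rows):
        for j in range(n):
            K[n + k][j] = A[row][j]   # A_act block
            K[j][n + k] = A[row][j]   # A_act^T block
        rhs[n + k] = bound[row]
    return solve_linear_system(K, rhs)[:n]


# Hypothetical 2-variable problem: constraint 0 (x1 + x2 = 2) is active,
# constraint 1 (x1 <= 5) is inactive and therefore ignored.
P = [[1.0, 0.0], [0.0, 1.0]]
q = [-2.0, -2.0]
A = [[1.0, 1.0], [1.0, 0.0]]
bound = [2.0, 5.0]
x_star = solve_qp_with_active_set(P, q, A, bound, active_flags=[1, 0])
print(x_star)  # [1.0, 1.0]
```

Because only the active rows enter the system, the cost is that of a single linear solve rather than an iterative search over candidate active sets.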
  • this embodiment does not set labels according to the combination of active and inactive constraints. Therefore, there is no problem that the number of labels increases exponentially when the number N of constraints included in the (b) constraint function in the quadratic programming problem increases.
  • determining whether each constraint is an active constraint or an inactive constraint based on the norm value, as described with reference to FIGS. 10 to 16, does not require classification by label.
  • whether each constraint is an active constraint or an inactive constraint can be determined simply by checking whether the norm of each constraint is greater than or equal to the threshold or less than the threshold, so the amount of processing can be reduced. As a result, the efficiency of the learning process is improved, and the control speed during robot control can also be improved.
  • FIG. 17 is a block diagram showing one configuration example of the hardware configuration of the information processing apparatus of the present disclosure.
  • the information processing device is, for example, a device capable of executing processing executed by the learning processing unit described above with reference to FIGS. 3 and 10, or the control information generation unit described with reference to FIGS. 8 and 14. .
  • the information processing device can be configured as, for example, a device attached to the robot or a device capable of communicating with the robot to control the robot. Each component of the information processing apparatus shown in FIG. 17 will be described.
  • a CPU (Central Processing Unit) 301 functions as a data processing section that executes various processes according to programs stored in a ROM (Read Only Memory) 302 or a storage section 308 . For example, the process according to the sequence described in the above embodiment is executed.
  • a RAM (Random Access Memory) 303 stores programs and data executed by the CPU 301 . These CPU 301 , ROM 302 and RAM 303 are interconnected by a bus 304 .
  • the CPU 301 is connected to an input/output interface 305 via a bus 304.
  • connected to the input/output interface 305 are an input unit 306, which includes various switches, a keyboard, a touch panel, a mouse, a microphone, a user input unit, and a status data acquisition unit such as a camera and various sensors 321 such as LiDAR, and an output unit 307, which includes a display, a speaker, and the like.
  • the output unit 307 also outputs driving information to a driving unit 322 that drives a robot or the like.
  • the CPU 301 receives commands, situation data, and the like input from the input unit 306 , executes various processes, and outputs processing results to the output unit 307 , for example.
  • a storage unit 308 connected to the input/output interface 305 is composed of, for example, a flash memory, a hard disk, or the like, and stores programs executed by the CPU 301 and various data.
  • a communication unit 309 functions as a transmission/reception unit for data communication via a network such as the Internet or a local area network, and communicates with an external device.
  • a GPU Graphics Processing Unit
  • a drive 310 connected to the input/output interface 305 drives a removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card to record or read data.
  • the technique disclosed in this specification can take the following configurations.
  • a quadratic programming problem optimum solution calculation unit that calculates the optimum solution of the quadratic programming problem corresponding to the input parameters
  • a constraint norm calculator that calculates a constraint norm that is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution
  • An information processing apparatus having a learning process execution unit that generates a constraint norm estimator that executes a learning process using set data of the input parameter and the constraint norm as learning data and estimates the constraint norm according to various input parameters.
  • the information processing apparatus according to (1), wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value for identifying whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution.
  • the constraint norm calculated by the constraint norm calculation unit is determining a constraint whose constraint norm is less than a predefined threshold as an active constraint;
  • the information processing apparatus according to any one of (1) to (3), wherein the constraint norm according to various input parameters estimated by the constraint norm estimator generated by the learning processing unit is a constraint activity determination index value for identifying whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution.
  • the information processing apparatus according to any one of (1) to (4), wherein the constraint norm calculation unit calculates, as the constraint norm, an L2 norm (Euclidean norm) corresponding to the distance in a vector space from the optimal solution of the quadratic programming problem to the constraint.
  • the information processing apparatus according to any one of (1) to (5), wherein the quadratic programming problem optimal solution calculation unit calculates, as the optimal solution, a solution that satisfies the constraints of the constraint function of the quadratic programming problem and minimizes the objective function of the quadratic programming problem.
  • the information processing apparatus according to any one of (1) to (6), wherein the learning processing unit generates a constraint norm estimator configured by a neural network.
  • the information processing apparatus according to any one of (1) to (7), wherein the learning processing unit generates a constraint norm estimator configured by a neural network that executes regression analysis processing.
  • the information processing apparatus according to any one of (1) to (8), further having a quadratic programming problem standardized model generation unit that generates a quadratic programming problem standardized model corresponding to input parameters, wherein the quadratic programming problem optimal solution calculation unit calculates the optimal solution using the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit.
  • An information processing apparatus comprising:
  • a constraint norm estimator that estimates the constraint norm according to the input parameters for each constraint defined by the constraint function of the quadratic programming problem;
  • an active constraint selection unit that, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold value, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution; and
  • a linear system analysis unit that selects only active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimum solution of the quadratic programming problem.
  • the information processing apparatus according to (10), wherein the constraint norm estimator is a constraint norm estimator generated by learning processing using set data of various input parameters and the constraint norm as learning data.
  • the information processing apparatus according to (10) or (11), wherein the constraint norm estimator is a constraint norm estimator configured by a neural network.
  • the information processing apparatus according to any one of (10) to (12), wherein the constraint norm estimator is a constraint norm estimator configured by a neural network that executes regression analysis processing.
  • An information processing method executed in an information processing device, the method executing:
  • a quadratic programming problem optimum solution calculation step in which the quadratic programming problem optimum solution calculation unit calculates the optimum solution of the quadratic programming problem corresponding to the input parameter;
  • a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm that is a norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
  • a learning processing execution step in which a learning processing execution unit executes learning processing using set data of the input parameter and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm according to various input parameters.
  • An information processing method executed in an information processing device, the method executing:
  • a constraint norm estimation step in which the constraint norm estimator estimates the constraint norm according to the input parameters for each constraint defined by the constraint function of the quadratic programming problem;
  • an active constraint selection step in which the active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution; and
  • a linear system analysis step in which a linear system analysis unit selects only active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates an optimal solution of the quadratic programming problem.
  • a program for executing information processing in an information processing device, the program causing execution of:
  • a quadratic programming problem optimum solution calculation step for causing the quadratic programming problem optimum solution calculation unit to calculate the optimum solution of the quadratic programming problem corresponding to the input parameters;
  • a constraint norm calculation step of causing a constraint norm calculation unit to calculate a constraint norm that is a norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
  • a learning processing execution step of causing a learning processing execution unit to execute learning processing using set data of the input parameter and the constraint norm as learning data, and to generate a constraint norm estimator that estimates constraint norms corresponding to various input parameters.
  • a program for executing information processing in an information processing device, the program causing execution of:
  • a constraint norm estimation step that causes a constraint norm estimator to estimate a constraint norm according to input parameters for each constraint defined by a constraint function of the quadratic programming problem;
  • an active constraint selection step that causes the active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, to generate constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution; and
  • a linear system analysis step that causes the linear system analysis unit to select only active constraints using the constraint activity analysis information generated by the active constraint selection unit and to calculate the optimal solution of the quadratic programming problem.
  • a program recording the processing sequence can be installed in the memory of a computer built into dedicated hardware and executed, or it can be installed in a general-purpose computer capable of executing various processes and executed there.
  • the program can be pre-recorded on a recording medium.
  • the program can be received via a network such as a LAN (Local Area Network) or the Internet and installed in a recording medium such as an internal hard disk.
  • a system is a logical collective configuration of a plurality of devices, and the devices of each configuration are not limited to being in the same housing.
  • a device and a method are realized that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint, enabling high-speed calculation of the optimal solution of the quadratic programming problem.
  • specifically, for example, the device has a constraint norm estimator that estimates, for each constraint of the quadratic programming problem, the norm according to the input parameter, and an active constraint selector that, by comparing the estimated constraint norm with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate the optimal solution.
  • a linear analyzer uses the constraint activity analysis information to select the active constraints and compute the optimal solution of the quadratic programming problem.

Abstract

Provided are a device and a method of efficiently selecting an active restriction of a quadratic programming problem through use of a norm of each restriction, to enable high-speed calculation of the optimum solution for the quadratic programming problem. This information processing device comprises: a restriction norm estimator that estimates a norm corresponding to an input parameter for each restriction of a quadratic programming problem; and an active restriction selection unit that, through a comparison process between the estimated restriction norm and a predefined threshold value, generates restriction activeness analysis information enabling determination of whether each restriction of the quadratic programming problem is an active restriction used for calculating the optimum solution for an objective function of the quadratic programming problem, or a non-active restriction not used for calculating the optimum solution. A linear analysis unit selects the active restriction through use of the restriction activeness analysis information, to calculate the optimum solution for the quadratic programming problem.

Description

Information processing device, information processing method, and program
The present disclosure relates to an information processing device, an information processing method, and a program. More specifically, the present disclosure relates to an information processing device, an information processing method, and a program that perform, for example, learning processing for determining control parameters of a robot and robot control using the learning results.
For example, when calculating the travel route of a robot or the optimal trajectory of a robot arm, information acquired by sensors such as cameras and distance sensors attached to the robot is analyzed to determine, for example, the positions of obstacles, and a route or trajectory that does not come into contact with the obstacles is calculated.
Likewise, when controlling the robot to travel along the optimal route or moving the arm along the optimal trajectory, processing is required to determine how the motors and actuators provided in each part of the robot, such as the wheels, legs, and arms, should be controlled.
A method using a quadratic programming problem (QP) is known as a technique for performing such optimal control.
A quadratic programming problem (QP) is an optimization problem in which the objective function is a quadratic function and the constraints are linear functions; the method of least squares is one example.
It is, for example, an optimization problem that can be solved as a minimization problem in which the objective function is convex.
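For reference, a quadratic programming problem of this kind is often written in the following standard form. The inequality form l ≤ Ax ≤ u is the one this document uses for its constraints; the symbols P and q for the quadratic and linear terms of the objective are assumptions introduced here for illustration only.

```latex
\begin{aligned}
\min_{x \in \mathbb{R}^n} \quad & \tfrac{1}{2}\, x^{\top} P x + q^{\top} x \\
\text{subject to} \quad & l \le A x \le u
\end{aligned}
```

When P is positive semidefinite the objective is convex, which corresponds to the convex minimization case mentioned above.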
One example of prior art disclosing a method of calculating the optimal solution of a quadratic programming problem is Non-Patent Document 1 (Online Mixed-Integer Optimization in Milliseconds).
Non-Patent Document 1 discloses a method that, when finding the optimal solution of a mixed-integer quadratic programming problem, uses a neural network trained in advance on learning data to predict the active constraints and the integer values, converts the problem into a linear one, and thereby calculates the optimal solution at high speed.
The neural network predicts, from the parameters used to formulate the mixed-integer quadratic programming problem, the combination of active constraints and integer values at the time the optimal solution is calculated. It is trained on a data set of the following input/output pairs:
Input: parameters
Output: constraints and integer values
This data set can be prepared in advance by solving a very large number of problems.
Non-Patent Document 1 models the neural network as a classification problem. In general, the output dimension must equal the number of all possible "combinations of active constraints and integer values" (one-hot vectorization). However, this number grows exponentially with the number N of constraints.
Therefore, in Non-Patent Document 1, the number of combinations of active constraints and integer values that appear in a data set prepared in advance is heuristically used as the output dimension.
This method avoids the problem that the output dimension grows exponentially with the number N of constraints and increases memory consumption and the amount of computation. However, because it cannot cover all patterns, there may be times at which the optimal solution cannot be obtained.
The problem of the output dimension growing exponentially can be avoided by using a model that predicts the input/output relationship directly by regression. For example, a configuration that does not use one-hot encoding makes the output dimension proportional to the number N of constraints and reduces the amount of computation. In this case, however, the output is binary, which is not well suited to neural network training, and learning efficiency deteriorates.
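The difference in growth between the two output encodings can be checked with a few lines of arithmetic. This sketch assumes, for simplicity, that each constraint is either active or inactive and that there are no integer variables, so a full one-hot encoding needs 2^N classes; the function names are hypothetical.

```python
def one_hot_output_dim(num_constraints):
    """Number of classes if every active/inactive combination gets its own label."""
    return 2 ** num_constraints


def regression_output_dim(num_constraints):
    """Number of outputs if one value is predicted per constraint."""
    return num_constraints


for n in (4, 10, 20):
    print(n, one_hot_output_dim(n), regression_output_dim(n))
# 4 16 4
# 10 1024 10
# 20 1048576 20
```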
The present disclosure has been made in view of, for example, the above problems, and aims to provide an information processing device, an information processing method, and a program that make it possible to solve a quadratic programming problem (QP) efficiently and at high speed.
One embodiment of the present disclosure provides an information processing device, an information processing method, and a program that perform, for example, learning processing used to determine control parameters of a robot, and robot control using a predictor and data generated by the learning processing.
A first aspect of the present disclosure is an information processing apparatus having:
a quadratic programming problem optimal solution calculation unit that calculates the optimal solution of the quadratic programming problem corresponding to the input parameters;
a constraint norm calculation unit that calculates a constraint norm that is the norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution unit that executes learning processing using set data of the input parameter and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm according to various input parameters.
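The constraint norm calculation in this first aspect can be illustrated with a short sketch. It assumes constraints of the form l ≤ Ax ≤ u and interprets the constraint norm as the Euclidean distance from the optimal solution to the nearer of the two bounding hyperplanes of each constraint; that interpretation, the function name `constraint_norms`, and the sample values, including the input parameter `theta`, are assumptions made for illustration.

```python
import math


def constraint_norms(A, l, u, x_star):
    """Distance from x_star to the nearer bounding hyperplane of each constraint.

    For row a_i, the hyperplanes are a_i . x = l_i and a_i . x = u_i, and the
    point-to-hyperplane distance is |a_i . x - bound| / ||a_i||.
    Rows of A are assumed to be non-zero.
    """
    norms = []
    for row, lower, upper in zip(A, l, u):
        value = sum(a * x for a, x in zip(row, x_star))
        row_length = math.sqrt(sum(a * a for a in row))
        gap = min(abs(value - lower), abs(upper - value))
        norms.append(gap / row_length)
    return norms


# Hypothetical problem instance and its already computed optimal solution.
A = [[1.0, 1.0], [1.0, 0.0]]
l = [0.0, 0.0]
u = [2.0, 5.0]
x_star = [1.0, 1.0]

norms = constraint_norms(A, l, u, x_star)
print(norms)  # [0.0, 1.0]

# One training sample pairs the input parameter with the constraint norms.
theta = [0.3, -1.2]  # hypothetical input parameter
training_sample = (theta, norms)
```

A norm of zero means the optimal solution lies exactly on that constraint, which is the situation an estimator trained on such pairs is meant to reveal through the later threshold comparison.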
Furthermore, a second aspect of the present disclosure is an information processing apparatus having:
a constraint norm estimator that estimates a constraint norm according to input parameters for each constraint defined by a constraint function of the quadratic programming problem;
an active constraint selection unit that, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold value, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution; and
a linear system analysis unit that selects only active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates the optimum solution of the quadratic programming problem.
Furthermore, a third aspect of the present disclosure is an information processing method executed in an information processing device, the method executing:
a quadratic programming problem optimum solution calculation step in which the quadratic programming problem optimum solution calculation unit calculates the optimum solution of the quadratic programming problem corresponding to the input parameter;
a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm that is a norm between each constraint defined by the constraint function of the quadratic programming problem and the optimal solution; and
a learning processing execution step in which a learning processing execution unit executes learning processing using set data of the input parameter and the constraint norm as learning data, and generates a constraint norm estimator that estimates the constraint norm according to various input parameters.
Furthermore, a fourth aspect of the present disclosure is an information processing method executed in an information processing device, the method executing:
a constraint norm estimation step in which the constraint norm estimator estimates the constraint norm according to the input parameters for each constraint defined by the constraint function of the quadratic programming problem;
an active constraint selection step in which the active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, generates constraint activity analysis information that makes it possible to identify whether each of the constraints defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution; and
a linear system analysis step in which a linear system analysis unit selects only active constraints using the constraint activity analysis information generated by the active constraint selection unit and calculates an optimal solution of the quadratic programming problem.
Furthermore, a fifth aspect of the present disclosure resides in a program that causes an information processing device to execute information processing, the program causing:
a quadratic programming problem optimal solution calculation unit to execute a quadratic programming problem optimal solution calculation step of calculating the optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation unit to execute a constraint norm calculation step of calculating a constraint norm, which is the norm between the optimal solution and each constraint defined by the constraint function of the quadratic programming problem; and
a learning processing execution unit to execute a learning processing execution step of performing learning processing that uses pairs of the input parameters and the constraint norms as learning data, thereby generating a constraint norm estimator that estimates constraint norms for various input parameters.
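The constraint norm calculation step can be illustrated with a minimal Python sketch. It assumes that the constraint norm is the Euclidean distance from the optimal solution x* to the boundary hyperplanes of each row of l ≤ Ax ≤ u; this part of the text does not fix that definition, and the function name `constraint_norms` and all sample values are hypothetical.

```python
import math

def constraint_norms(A, l, u, x_opt):
    """Return (distance to lower bound, distance to upper bound) per row.

    Assumption: the "constraint norm" is the Euclidean distance from x_opt
    to the hyperplanes a_i . x = l_i and a_i . x = u_i.
    """
    norms = []
    for row, lo, hi in zip(A, l, u):
        ax = sum(a * x for a, x in zip(row, x_opt))   # a_i . x*
        scale = math.sqrt(sum(a * a for a in row))    # ||a_i||
        norms.append((abs(ax - lo) / scale, abs(hi - ax) / scale))
    return norms

# Hypothetical 2-D example: 0 <= x0 <= 1 and 0 <= x1 <= 2.
A = [[1.0, 0.0], [0.0, 1.0]]
l = [0.0, 0.0]
u = [1.0, 2.0]
x_opt = [1.0, 0.5]   # lies on the upper bound of the first constraint

norms = constraint_norms(A, l, u, x_opt)
print(norms)
```

Under this assumed definition, a norm of zero means the solution lies exactly on that bound, so pairs of input parameters and such norms could serve as learning data for an estimator.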
Furthermore, a sixth aspect of the present disclosure resides in a program that causes an information processing device to execute information processing, the program causing:
a constraint norm estimator to execute a constraint norm estimation step of estimating, for each constraint defined by the constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection unit to execute an active constraint selection step of comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold and thereby generating constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used in that calculation; and
a linear system analysis unit to execute a linear system analysis step of using the constraint activity analysis information generated by the active constraint selection unit to select only the active constraints and calculating the optimal solution of the quadratic programming problem.
Note that the program of the present disclosure can be provided, for example, via a storage medium or a communication medium that supplies it in a computer-readable format to an information processing device or computer system capable of executing various program codes. Providing the program in a computer-readable format allows processing according to the program to be realized on the information processing device or computer system.
Still other objects, features, and advantages of the present disclosure will become apparent from the more detailed description based on the embodiments of the present disclosure described later and the accompanying drawings. In this specification, a system is a logical collection of a plurality of devices, and the devices of each configuration are not limited to being in the same housing.
According to the configuration of an embodiment of the present disclosure, a device and a method are realized that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint, enabling high-speed calculation of the optimal solution of the quadratic programming problem.
Specifically, for example, the configuration has a constraint norm estimator that estimates, for each constraint of the quadratic programming problem, a norm according to the input parameters, and an active constraint selection unit that compares the estimated constraint norm with a predetermined threshold to generate constraint activity analysis information. This information makes it possible to identify whether each constraint of the quadratic programming problem is an active constraint used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used in that calculation. A linear analysis unit uses the constraint activity analysis information to select the active constraints and calculates the optimal solution of the quadratic programming problem.
With this configuration, a device and a method are realized that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint, enabling high-speed calculation of the optimal solution of the quadratic programming problem.
Note that the effects described in this specification are merely examples and are not limiting, and there may be additional effects.
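The threshold comparison summarized above can be sketched in a few lines of Python. This is a minimal illustration, not the disclosed implementation: it assumes that a small estimated norm indicates an active constraint, and the function name `select_active`, the threshold value, and the estimator outputs are hypothetical.

```python
def select_active(estimated_norms, threshold):
    """Flag a constraint as active when its estimated norm is at or below the threshold.

    Assumption: a small norm means the optimal solution lies on or near the
    constraint boundary, so the constraint is treated as active.
    """
    return [norm <= threshold for norm in estimated_norms]

estimated_norms = [0.02, 0.80, 0.00, 1.35]  # hypothetical estimator outputs
threshold = 0.05                            # hypothetical predetermined threshold

activity = select_active(estimated_norms, threshold)
active_indices = [i for i, is_active in enumerate(activity) if is_active]
print(active_indices)
```

The list `activity` plays the role of the constraint activity analysis information: it lets a later stage pick out only the constraints flagged as active.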
A diagram illustrating an example of robot control processing to which the processing of the present disclosure can be applied.
A diagram illustrating a configuration example of a device that calculates the optimal solution x* of a quadratic programming problem at high speed by extracting "active constraints" from the "inequality constraints" set in the quadratic programming problem.
A diagram illustrating a configuration example of a device that executes learning processing to which a quadratic programming problem is applied.
A diagram illustrating specific examples of active constraints and inactive constraints in a quadratic programming problem.
A diagram illustrating a specific example of the active constraint identification data (S*(θ)) generated by the active constraint identification data generation unit.
A diagram illustrating a specific example of the predictor (NN: neural network) generated by the class classification processing unit (NN Classifier = neural network class classification unit).
A diagram illustrating an example of labels corresponding to the active constraint identification data (S*(θ)) output by the predictor (NN: neural network).
A diagram illustrating a configuration example of a control information generation unit that calculates, from robot observation information (θ), the optimal solution x* of a quadratic programming problem including optimal control information.
A diagram illustrating the detailed configuration and processing of the class classification processing unit (NN Classifier = neural network class classification unit).
A diagram illustrating a configuration example of a device that executes learning processing to which a quadratic programming problem is applied.
A diagram illustrating specific examples of active constraints and inactive constraints in a quadratic programming problem, and the norm (constraint norm) used as an index for distinguishing active constraints from inactive constraints.
A diagram illustrating a specific example of the constraint norm (S_l*(θ)) generated by the constraint norm calculation unit (Calc Norm).
A diagram illustrating an example of the regression analyzer generated through learning processing by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit).
A diagram illustrating a configuration example of a control information generation unit that calculates, from robot observation information (θ), the optimal solution x* of a quadratic programming problem including optimal control information.
A diagram illustrating an example of processing executed by the constraint norm estimation unit (NN Regressor = neural network regression analyzer).
A diagram illustrating a specific example of processing executed by the threshold-applied active constraint selection unit (Threshold).
A diagram illustrating a hardware configuration example of the information processing device of the present disclosure.
The information processing device, information processing method, and program of the present disclosure are described in detail below with reference to the drawings. The description proceeds in the following order.
1. Overview of the quadratic programming problem (QP) and robot control
2. Specific examples of learning processing to which the quadratic programming problem is applied and robot control processing using the learning results
3. Configuration that identifies active constraints based on the norm from the optimal solution of the quadratic programming problem to each constraint
4. Hardware configuration example of the information processing device
5. Summary of the configuration of the present disclosure
[1. Overview of the quadratic programming problem (QP) and robot control]
First, an overview of the quadratic programming problem (QP) and robot control is given.
For example, when calculating the travel route 20 of a traveling robot 10 as shown in FIG. 1, or calculating the trajectory of the arm of a robot that moves its arm to perform various tasks, information acquired by sensors mounted on the robot, such as cameras and distance sensors, is analyzed to determine, for example, the positions of obstacles, and a route or trajectory that avoids contact with the obstacles is calculated.
Likewise, when controlling the robot to travel along the optimal route or controlling the arm to move along the optimal trajectory, processing is required to determine how the motors and actuators provided in each part of the robot, such as the wheels, legs, and arms, should be controlled.
A method using a quadratic programming problem (QP) is known as a technique for performing such optimal control.
A quadratic programming problem applies quadratic programming, a representative nonlinear programming technique in mathematical optimization, to calculate an optimal solution such as route information or control information for a robot.
A quadratic programming problem is an optimization problem in which the objective function is quadratic and the constraints are linear. The least squares method, for example, is one kind of quadratic programming problem.
A quadratic programming problem can be solved, for example, as a minimization problem whose objective function is convex.
The quadratic programming problem is the problem of finding an n-dimensional vector x as the optimal solution of the problem shown in (Formula 1) below.
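$$
\begin{aligned}
\text{(a)}\quad & \underset{x}{\text{minimize}} \quad \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to} \quad l \le A x \le u
\end{aligned}
\qquad \text{(Formula 1)}
$$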
In (Formula 1) above, (a) is the objective function (or cost function) and (b) is the constraint function.
The quadratic programming problem is the problem of finding the optimal solution x (an n-dimensional vector) that minimizes the objective function (a).
The constraint function (b) defines the range in which the optimal solution x is allowed to exist.
Solving a quadratic programming problem therefore requires finding an optimal solution x that lies within the range satisfying the constraint function (b).
The parameters appearing in the objective function (a) and the constraint function (b) of the above formula are as follows.
P is an n×n real symmetric matrix,
q is an n×1 real vector,
A is an m×n matrix,
l and u are m-dimensional vectors, and
x^T denotes the transpose of the n-dimensional vector x.
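To make the roles of P, q, A, l, and u concrete, the following Python sketch evaluates an objective function and checks the constraint l ≤ Ax ≤ u for a candidate x. It assumes the common standard form (1/2) x^T P x + q^T x for the objective; the function names and the numeric values are hypothetical.

```python
def objective(P, q, x):
    """Evaluate (1/2) x^T P x + q^T x for plain Python lists (assumed standard form)."""
    Px = [sum(p * xj for p, xj in zip(row, x)) for row in P]
    quadratic = 0.5 * sum(xi * pxi for xi, pxi in zip(x, Px))
    linear = sum(qi * xi for qi, xi in zip(q, x))
    return quadratic + linear

def is_feasible(A, l, u, x, tol=1e-9):
    """Check l <= Ax <= u element by element, with a small tolerance."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A]
    return all(lo - tol <= v <= hi + tol for lo, v, hi in zip(l, Ax, u))

# Hypothetical problem data with n = 2 variables and m = 1 constraint row.
P = [[2.0, 0.0], [0.0, 2.0]]
q = [-2.0, -5.0]
A = [[1.0, 1.0]]
l = [0.0]
u = [2.0]

x = [0.5, 1.5]
value = objective(P, q, x)          # objective value at the candidate x
feasible = is_feasible(A, l, u, x)  # x0 + x1 = 2.0 lies inside [0, 2]
print(value, feasible)
```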
Note that "l ≤ Ax ≤ u" in the constraint function (b) means that each element of the vector Ax is greater than or equal to the corresponding element of the vector l and less than or equal to the corresponding element of the vector u.
A constraint expressed with inequalities in this way is called an "inequality constraint". In contrast, a constraint expressed with an equation, for example
Cx = d
is called an "equality constraint".
The constraints are a mixture of active constraints, which can be used in calculating the optimal solution x, and inactive constraints, which are not used in that calculation.
If only the active equality constraints can be extracted, the quadratic programming problem can be reduced to a system of linear equations, which enables high-speed calculation of the optimal solution.
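The reduction to linear equations can be sketched as follows. For a problem of the assumed form "minimize (1/2) x^T P x + q^T x subject to Cx = d", the optimality conditions form one linear system (the KKT system) in x and the Lagrange multipliers. The pure-Python solver below is only an illustration; the function names and data are hypothetical, and a practical system would normally use a numerical library.

```python
def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("KKT matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z

def solve_equality_qp(P, q, C, d):
    """Minimize (1/2) x^T P x + q^T x subject to C x = d via the KKT system.

    The system is [[P, C^T], [C, 0]] [x; nu] = [-q; d].
    """
    n, m = len(P), len(C)
    kkt = [list(P[i]) + [C[k][i] for k in range(m)] for i in range(n)]
    kkt += [list(C[k]) + [0.0] * m for k in range(m)]
    rhs = [-qi for qi in q] + list(d)
    z = solve_linear(kkt, rhs)
    return z[:n], z[n:]   # primal solution x, Lagrange multipliers nu

# Hypothetical data: one active constraint x0 + x1 = 2 treated as an equality.
P = [[2.0, 0.0], [0.0, 2.0]]
q = [-2.0, -5.0]
C = [[1.0, 1.0]]
d = [2.0]
x, multipliers = solve_equality_qp(P, q, C, d)
print(x, multipliers)
```

A single linear solve replaces the iterative search an inequality-constrained solver would otherwise perform, which is the source of the speed-up described above.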
With reference to FIG. 2, a configuration example is described of a device that calculates the optimal solution x* of a quadratic programming problem at high speed by extracting "active constraints" from the "inequality constraints" set in the quadratic programming problem.
The quadratic programming problem optimal solution calculation device 30 shown in FIG. 2 has a quadratic programming problem standardized model generation unit (QP Modeling) 31 and a quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32.
The quadratic programming problem optimal solution calculation device 30 shown in FIG. 2 takes the parameter θ as input and outputs the optimal solution x* of the quadratic programming problem.
Note that x* denotes the Hermitian transpose of x (an n-dimensional vector).
When applied to, for example, the control configuration of the robot 10 shown in FIG. 1, the input parameter θ and the output optimal solution x* can correspond as follows.
Input parameter θ = observation information (distance to obstacles, robot position, speed, direction, etc.)
Output optimal solution x* = robot control information (robot traveling direction control information, speed control information, output control information for the left and right wheels, etc.)
That is, the quadratic programming problem optimal solution calculation device 30 shown in FIG. 2 can be used for processing in which the parameter θ, composed of the observation information of the robot 10, is input and robot control information is output as the optimal solution x* of the quadratic programming problem.
The constraint function is composed of, for example, functions that define speed limit information for the robot 10, the minimum distance allowed between the robot and an obstacle, and the like.
The input parameter θ is, for example, a k-dimensional vector (θ_0, θ_1, ..., θ_{k-1}) consisting of a plurality (k) of pieces of observation information, and the optimal solution x* of the quadratic programming problem is expressed as an n-dimensional vector (x_0, x_1, ..., x_{n-1}) consisting of a plurality (n) of pieces of robot control information.
The parameters (P, q, A, l, u) set in the objective function (a) and the constraint function (b) of (Formula 1) above are determined by the relationship between the input parameter θ and the output optimal solution x*.
The processing executed by each component of the quadratic programming problem optimal solution calculation device 30 shown in FIG. 2 is described below.
The quadratic programming problem standardized model generation unit (QP Modeling) 31 of the quadratic programming problem optimal solution calculation device 30 shown in FIG. 2 receives the parameter θ and generates a quadratic programming problem standardized model based on the input parameter θ.
The quadratic programming problem standardized model is the mathematical model described earlier and shown again in (Formula 1) below.
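$$
\begin{aligned}
\text{(a)}\quad & \underset{x}{\text{minimize}} \quad \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to} \quad l \le A x \le u
\end{aligned}
\qquad \text{(Formula 1)}
$$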
In (Formula 1) above, (a) is the objective function (or cost function).
(b) is the constraint function, an inequality constraint function composed of inequalities.
As described above, the quadratic programming problem is the problem of calculating the optimal solution x* (an n-dimensional vector) that minimizes the objective function (a) while satisfying the constraint function (b).
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 31, that is, the model composed of the objective function (a) and the constraint function (b) shown in (Formula 1) above, and calculates and outputs the optimal solution x* (an n-dimensional vector) that minimizes the objective function (a) while satisfying the constraint function (b).
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 calculates the optimal solution x* of the optimization problem (quadratic programming problem) and substitutes the calculated x* into the inequality constraint "l ≤ Ax ≤ u".
This substitution extracts only the rows in which equality holds. Selection matrices S_cl and S_cu are generated in which the matrix elements where equality holds are 1 and all other matrix elements are 0. Using these matrices, the following relations hold:
S_cl A x* = S_cl l,
S_cu A x* = S_cu u.
Concatenating these relations yields (Formula 2) below.
$$
\begin{bmatrix} S_{cl} A \\ S_{cu} A \end{bmatrix} x^{*}
=
\begin{bmatrix} S_{cl}\, l \\ S_{cu}\, u \end{bmatrix}
\qquad \text{(Formula 2)}
$$
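The construction of the selection matrices can be sketched in Python as follows. The sketch assumes that S_cl and S_cu are m×m diagonal 0/1 matrices, which is one reading of "elements where equality holds are 1"; the helper names, the tolerance, and the sample data are hypothetical.

```python
def selection_matrix(values, bounds, tol=1e-9):
    """Diagonal 0/1 matrix with 1 on row i when values[i] equals bounds[i] within tol.

    Assumption: the selection matrix is m x m and diagonal.
    """
    m = len(values)
    return [[1 if i == j and abs(values[i] - bounds[i]) <= tol else 0
             for j in range(m)] for i in range(m)]

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

# Hypothetical constraint data (m = 3 rows, n = 2 variables) and solution.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
l = [0.0, 0.0, 0.0]
u = [1.0, 2.0, 2.5]
x_opt = [1.0, 1.5]

Ax = mat_vec(A, x_opt)          # [1.0, 1.5, 2.5]
S_cl = selection_matrix(Ax, l)  # no row meets its lower bound
S_cu = selection_matrix(Ax, u)  # rows 0 and 2 meet their upper bounds
print(S_cl, S_cu)
```

With these matrices, S_cu·(A x*) and S_cu·u agree row by row, which is the relation stated in the text; rows that are not selected become zero on both sides.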
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 uses (Formula 2) above to extract the active constraints, regards the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 31 into a quadratic programming problem with active equality constraints, as shown in (Formula 3) below.
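$$
\begin{aligned}
\text{(a)}\quad & \underset{x}{\text{minimize}} \quad \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to} \quad C x = d
\end{aligned}
\qquad \text{(Formula 3)}
$$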
In (Formula 3) above, (a) is the objective function (or cost function).
(b) is the constraint function, which consists of the active equality constraints.
The parameters appearing in the objective function (a) and the constraint function (b) of the above formula are as follows.
P is an n×n real symmetric matrix,
q is an n×1 real vector,
x^T is the transpose of the n-dimensional vector x,
C is an m×n matrix, and
d is an m-dimensional vector.
In this way, the quadratic programming problem is replaced by the problem of calculating the optimal solution x* (an n-dimensional vector) that minimizes the objective function (a) while satisfying the active equality constraint function (b) in the above formula.
As described above, the constraints of a quadratic programming problem include inequality constraints and equality constraints, and further include active constraints that can be used in calculating the optimal solution x* and inactive constraints that are not used in that calculation. If only the active constraints are extracted and regarded as active equality constraints, the quadratic programming problem can be reduced to a system of linear equations, and the optimal solution x* can be calculated at high speed.
As described above, the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 shown in FIG. 2 substitutes the optimal solution x* of the optimization problem (quadratic programming problem) into the inequality constraint "l ≤ Ax ≤ u", extracts only the rows in which equality holds, and generates the selection matrices S_cl and S_cu, in which all other elements are 0.
Using these selection matrices S_cl and S_cu, only the active constraints are extracted from the inequality constraint "l ≤ Ax ≤ u", they are regarded as active equality constraints, and the optimal solution x* (an n-dimensional vector) that satisfies the active equality constraints is calculated.
Such processing makes it possible to calculate the optimal solution x* of the quadratic programming problem at high speed.
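The step of turning the selected rows into an equality constraint Cx = d can be sketched as follows. This minimal Python example assumes the activity of each bound is given as a list of booleans and drops the unselected rows instead of keeping zero rows, so that the resulting linear system stays compact; the function name `build_equality_constraints` and the data are hypothetical.

```python
def build_equality_constraints(A, l, u, lower_active, upper_active):
    """Collect the active rows of l <= Ax <= u into an equality system C x = d.

    Assumption: inactive rows are dropped rather than zeroed. If both bounds
    of a row are flagged (l_i == u_i), the row is added only once.
    """
    C, d = [], []
    for row, lo, hi, at_lower, at_upper in zip(A, l, u, lower_active, upper_active):
        if at_lower:
            C.append(list(row))
            d.append(lo)
        elif at_upper:
            C.append(list(row))
            d.append(hi)
    return C, d

# Hypothetical data: rows 0 and 2 are active at their upper bounds.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
l = [0.0, 0.0, 0.0]
u = [1.0, 2.0, 2.5]
lower_active = [False, False, False]
upper_active = [True, False, True]

C, d = build_equality_constraints(A, l, u, lower_active, upper_active)
print(C, d)
```

Here the two active rows already determine x (x0 = 1.0 and x0 + x1 = 2.5), which shows how few equations may remain once inactive constraints are discarded.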
However, in the configuration shown in FIG. 2, the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32 first solves the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 31 to calculate the optimal solution x*, and only afterwards uses the already calculated x* to extract the active equality constraints.
Consequently, the active equality constraints cannot be used in the initial calculation of the optimal solution x*.
For example, actual robot control requires processing that calculates, at high speed, the optimal solution x* as the optimal control information for various observation information (θ).
This in turn requires extracting the active constraints in advance and using them to calculate the unknown optimal solution x* at high speed.
Performing learning processing in advance is an effective way to realize such processing.
For example, pairs of various input parameters (θ) and active constraint identification data, which is used to extract the active constraints corresponding to each input parameter (θ), are generated in advance as a learning data set.
For example, learning processing that generates active constraint identification data for extracting the active constraints from the inequality constraints included in the quadratic programming problem is executed in advance, producing a learning data set (θ, S*(θ)) consisting of input parameters (θ) and the corresponding active constraint identification data (S*(θ)).
During actual robot control, this learning data set (θ, S*(θ)) is used to apply the active constraint identification data (S*(θ)) to the various input parameters (θ) obtained as observation information. The active constraints corresponding to the input parameter (θ) are extracted, the extracted active constraints are regarded as active equality constraints, and the quadratic programming problem is solved to calculate the optimal solution x*.
Such processing makes it possible to calculate the optimal solution x* of the quadratic programming problem at high speed.
The following section describes this learning processing and an example of control processing using the learning results.
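The generation of a learning data set (θ, S*(θ)) can be illustrated with a deliberately small Python sketch. It uses a hypothetical one-dimensional problem, "minimize (1/2) x² − θx subject to −1 ≤ x ≤ 1", whose optimum is θ clipped to the interval, and it represents S*(θ) as a pair of booleans; both choices are assumptions made only for illustration.

```python
def solve_toy_qp(theta, lower=-1.0, upper=1.0):
    """Minimize (1/2) x^2 - theta * x subject to lower <= x <= upper.

    For this 1-D problem the optimum is the unconstrained minimizer theta
    clipped to the interval [lower, upper].
    """
    return min(max(theta, lower), upper)

def active_constraint_flags(x_opt, lower=-1.0, upper=1.0, tol=1e-9):
    """Return (lower bound active, upper bound active) at x_opt."""
    return (abs(x_opt - lower) <= tol, abs(x_opt - upper) <= tol)

# Hypothetical input parameters (observations) used to build the data set.
thetas = [-2.0, -0.5, 0.0, 0.7, 3.0]

dataset = []
for theta in thetas:
    x_opt = solve_toy_qp(theta)                      # solve the QP offline
    dataset.append((theta, active_constraint_flags(x_opt)))

for theta, flags in dataset:
    print(theta, flags)
```

Each entry pairs an input parameter with the identification data derived from its solved problem, which is the kind of pair a predictor can later be trained on.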
[2. Specific examples of learning processing to which the quadratic programming problem is applied and robot control processing using the learning results]
The following describes specific examples of learning processing to which the quadratic programming problem is applied and robot control processing that uses the results of this learning processing.
As described above, using learning processing is an effective technique for calculating the unknown optimal solution x* of a quadratic programming problem at high speed.
Active constraint identification data corresponding to various input parameters (θ), that is, data for selecting and extracting the active constraints corresponding to an input parameter (θ) from the inequality constraints included in the quadratic programming problem, is generated in advance by learning processing.
That is, the learning processing generates in advance a learning data set (θ, S*(θ)) consisting of pairs of various input parameters (θ) and the active constraint identification data (S*(θ)) corresponding to each parameter (θ).
Furthermore, learning processing using the learning data set (θ, S*(θ)) is executed to generate a predictor, for example a neural network (NN), that estimates from various input parameters (θ) the active constraint identification data (S*(θ)) corresponding to each input parameter (θ).
When robot control is executed, this predictor, for example a neural network (NN), is used to estimate from various input parameters (θ) the active constraint identification data (S*(θ)) corresponding to each input parameter (θ). Based on the estimated active constraint identification data (S*(θ)), the active constraints corresponding to the input parameter (θ) are selected, and the selected active constraints are used to solve the quadratic programming problem and calculate the optimal solution x* as robot control information.
That is, a predictor for predicting the active constraint identification data (S*(θ)) needed to select and extract the active constraints for the various input parameters (θ) acquired by the robot as observation information is generated by learning processing using the learning data set (θ, S*(θ)).
Using a predictor generated by such learning processing, for example a neural network (NN), makes it possible to extract the active constraints corresponding to an input parameter (θ) at high speed.
As a result, the optimal solution x* of the quadratic programming problem, that is, an optimal solution x* such as robot control information, can be calculated at high speed.
The following describes this learning processing and an example of control processing using the learning results.
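The flow from predictor to control output can be sketched as follows. The text uses a neural network as the predictor; the Python sketch below substitutes a plain nearest-neighbor lookup purely as a stand-in so that the example stays self-contained. The toy problem "minimize (1/2) x² − θx subject to −1 ≤ x ≤ 1", the data set, and all function names are hypothetical.

```python
def nearest_neighbor_predictor(dataset):
    """Build a predictor that returns the label of the closest stored theta.

    This is a stand-in for the neural network predictor, not the method
    described in the text.
    """
    def predict(theta):
        _, label = min(dataset, key=lambda pair: abs(pair[0] - theta))
        return label
    return predict

# Hypothetical learning data set: (theta, (lower active, upper active)).
dataset = [
    (-2.0, (True, False)),
    (-0.5, (False, False)),
    (0.7, (False, False)),
    (3.0, (False, True)),
]
predict = nearest_neighbor_predictor(dataset)

def control_from_observation(theta, lower=-1.0, upper=1.0):
    """Use the predicted active set to obtain the solution of the toy 1-D QP."""
    lower_active, upper_active = predict(theta)
    if lower_active:
        return lower   # active lower bound treated as the equality x = lower
    if upper_active:
        return upper   # active upper bound treated as the equality x = upper
    return theta       # no active constraint: unconstrained minimizer

x_control = control_from_observation(2.4)
print(x_control)
```

Once the active set is predicted, no inequality-constrained solve is needed at control time; each case reduces to a direct formula, which mirrors the speed-up the text attributes to the learned predictor.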
FIG. 3 is a diagram showing a configuration example of the learning processing unit 40 configured within the information processing device.
As shown in FIG. 3, the learning processing unit 40 has a learning data set generation unit 50 and a predictor generation unit 60.
The learning data set generation unit 50 calculates active constraint identification data (S*(θ)) that enables extraction of the active constraints corresponding to an input parameter (θ), and generates a learning data set (θ, S*(θ)) 61 consisting of pairs of various input parameters (θ) and the active constraint identification data (S*(θ)) corresponding to each parameter (θ).
 予測器生成部60は、学習データセット(θ,S(θ))61を利用して、様々な入力パラメータ(θ)から、入力パラメータ(θ)に応じたアクティブ制約を選択するための予測器に相当するNN(ニューラル・ネットワーク)を生成するクラス分類処理部62を有する。 A predictor generator 60 uses a learning data set (θ, S * (θ)) 61 to generate predictions for selecting active constraints according to input parameters (θ) from various input parameters (θ). It has a class classification processing unit 62 that generates an NN (neural network) corresponding to a device.
First, the configuration and processing of the learning data set generation unit 50 will be described.
As shown in FIG. 3, the learning data set generation unit 50 has a quadratic programming problem standardized model generation unit (QP Modeling) 51, a quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52, and an active constraint identification data generation unit 53.
The quadratic programming problem standardized model generation unit (QP Modeling) 51 and the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 shown in FIG. 3 execute the same processing as the quadratic programming problem standardized model generation unit (QP Modeling) 31 and the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32, which are components of the quadratic programming problem optimal solution calculation device 30 described above with reference to FIG. 2.
That is, they receive the parameter θ as input and output the optimal solution x* of the quadratic programming problem.
Note that x* means the Hermitian transpose of x (an n-dimensional vector).
As described above, when applied to, for example, the control configuration of the robot 10 shown in FIG. 1, the relationship between the input parameter θ and the output optimal solution x* can be set as the following correspondence.
Input parameter θ = observation information (distance to obstacles, robot position, speed, direction, etc.)
Output optimal solution x* = robot control information (robot traveling direction control information, speed control information, output control information for the left and right wheel units, etc.)
The input parameter θ is, for example, a k-dimensional vector (θ_0, θ_1, ..., θ_{k-1}) consisting of a plurality (k) of pieces of observation information, and the optimal solution x* of the quadratic programming problem is expressed as an n-dimensional vector (x_0, x_1, ..., x_{n-1}) consisting of a plurality (n) of pieces of robot control information.
The processing executed by each component of the learning data set generation unit 50 shown in FIG. 3 will now be described.
The quadratic programming problem standardized model generation unit (QP Modeling) 51 of the learning data set generation unit 50 shown in FIG. 3 receives the parameter θ as input and generates a quadratic programming problem standardized model based on the input parameter θ.
The quadratic programming problem standardized model is the mathematical model shown in (Equation 1) below, which was described earlier.
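$$
\begin{aligned}
&\text{(a)}\quad \min_{x}\ \tfrac{1}{2}\,x^{T} P x + q^{T} x \\
&\text{(b)}\quad \text{subject to}\quad l \le A x \le u
\end{aligned}
\qquad \text{(Equation 1)}
$$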
In (Equation 1) above, (a) is the objective function (or cost function).
(b) is the constraint function, an inequality constraint function composed of inequalities.
The parameters appearing in (a) the objective function and (b) the constraint function of the above equation are as follows:
P is an n×n real symmetric matrix,
q is an n×1 real vector,
A is an m×n matrix,
l and u are m-dimensional vectors, and
x^T is the transpose of the n-dimensional vector x.
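As a rough illustration of the standardized model, the following sketch evaluates the objective (a) and checks the constraint (b) for a candidate x. It assumes the common convention 0.5 * x^T P x + q^T x for the objective; the function names and the sample numbers are hypothetical and do not come from the specification.

```python
# Illustrative only: function names and sample data are assumptions.

def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def qp_objective(P, q, x):
    """Objective (a), assuming the form 0.5 * x^T P x + q^T x."""
    Px = mat_vec(P, x)
    quad = 0.5 * sum(xi * pi for xi, pi in zip(x, Px))
    lin = sum(qi * xi for qi, xi in zip(q, x))
    return quad + lin

def satisfies_constraints(A, l, u, x, tol=1e-9):
    """Constraint (b): every row i must satisfy l_i <= (A x)_i <= u_i."""
    return all(lo - tol <= ax <= hi + tol
               for lo, ax, hi in zip(l, mat_vec(A, x), u))

# A small two-variable problem with box constraints 0 <= x_i <= 1.
P = [[2.0, 0.0], [0.0, 2.0]]
q = [-2.0, -4.0]
A = [[1.0, 0.0], [0.0, 1.0]]
l = [0.0, 0.0]
u = [1.0, 1.0]

inside = [1.0, 1.0]    # satisfies (b)
outside = [1.0, 2.0]   # lower objective value, but violates (b)
print(qp_objective(P, q, inside), satisfies_constraints(A, l, u, inside))
print(qp_objective(P, q, outside), satisfies_constraints(A, l, u, outside))
```

In this example the unconstrained minimizer lies outside the feasible region, so the constrained optimum sits on a constraint boundary, which is the situation in which active constraints arise.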
As described above, the quadratic programming problem is the problem of calculating the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function in the above equation while satisfying the constraints of (b) the constraint function.
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 51, that is, the model composed of (a) the objective function and (b) the constraint function shown in (Equation 1) above, and calculates and outputs the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying the constraints of (b) the constraint function.
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 calculates the optimal solution x* of the optimization problem (quadratic programming problem) and substitutes the calculated optimal solution x* into the inequality constraint "l ≤ Ax ≤ u".
Through this substitution, only the rows for which equality holds are extracted. When selection matrices S_cl and S_cu are generated in which the matrix elements for which equality holds are set to 1 and all other matrix elements are set to 0, the following relational expressions hold:
S_cl A x* = S_cl l,
S_cu A x* = S_cu u.
Concatenating these relational expressions yields (Equation 2) below, which was described earlier.
$$
\begin{bmatrix} S_{cl} A \\ S_{cu} A \end{bmatrix} x^{*}
=
\begin{bmatrix} S_{cl}\, l \\ S_{cu}\, u \end{bmatrix}
\qquad \text{(Equation 2)}
$$
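The substitution step can be sketched as follows. This is a minimal illustration, not the specification's implementation: the function name, the numerical tolerance `tol` used to decide that "equality holds", and the sample data are all assumptions, and the optimal solution `x_opt` is simply given rather than computed by a solver.

```python
def mat_vec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def selection_matrices(A, l, u, x_opt, tol=1e-8):
    """Build diagonal 0/1 matrices S_cl and S_cu.

    S_cl[i][i] = 1 when row i meets its lower bound, (A x*)_i = l_i.
    S_cu[i][i] = 1 when row i meets its upper bound, (A x*)_i = u_i.
    The tolerance is an assumption for floating-point comparison.
    """
    Ax = mat_vec(A, x_opt)
    m = len(Ax)
    S_cl = [[0] * m for _ in range(m)]
    S_cu = [[0] * m for _ in range(m)]
    for i in range(m):
        if abs(Ax[i] - l[i]) <= tol:
            S_cl[i][i] = 1
        if abs(Ax[i] - u[i]) <= tol:
            S_cu[i][i] = 1
    return S_cl, S_cu

# Three constraints on a two-dimensional x; x_opt is an assumed optimum.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
l = [0.0, 0.0, 0.0]
u = [1.0, 1.0, 3.0]
x_opt = [1.0, 1.0]

S_cl, S_cu = selection_matrices(A, l, u, x_opt)
print([S_cl[i][i] for i in range(3)])  # diagonal of S_cl
print([S_cu[i][i] for i in range(3)])  # diagonal of S_cu
```

Here the first two rows meet their upper bounds while the third row (x_0 + x_1 = 2 < 3) is slack, so only the first two diagonal entries of S_cu are set.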
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 extracts the active constraints using (Equation 2) above, regards the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 51 into a quadratic programming problem with active equality constraints as shown in (Equation 3) below.
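$$
\begin{aligned}
&\text{(a)}\quad \min_{x}\ \tfrac{1}{2}\,x^{T} P x + q^{T} x \\
&\text{(b)}\quad \text{subject to}\quad C x = d
\end{aligned}
\qquad \text{(Equation 3)}
$$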
In (Equation 3) above, (a) is the objective function (or cost function).
(b) is the constraint function, which consists of the active equality constraints.
The parameters appearing in (a) the objective function and (b) the constraint function of the above equation are as follows:
P is an n×n real symmetric matrix,
q is an n×1 real vector,
x^T is the transpose of the n-dimensional vector x,
C is an m×n matrix, and
d is an m-dimensional vector.
In this way, the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 applies (Equation 3) above to calculate the optimal solution x* (an n-dimensional vector) that minimizes (a) the objective function while satisfying the constraints of (b) the active equality constraint function.
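One standard way to solve a quadratic program that has only equality constraints is to solve its KKT linear system, [[P, C^T], [C, 0]] [x; λ] = [-q; d]. The specification does not state which method is used, so the sketch below is an assumption: it takes the objective as 0.5 * x^T P x + q^T x, and the function names and the small Gaussian-elimination routine are illustrative.

```python
def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [[float(v) for v in row] + [float(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - rest) / aug[i][i]
    return z

def solve_equality_qp(P, q, C, d):
    """Minimize 0.5 x^T P x + q^T x subject to C x = d via the KKT system."""
    n, k = len(q), len(d)
    # Top block rows: [P | C^T]; bottom block rows: [C | 0].
    K = [list(P[i]) + [C[j][i] for j in range(k)] for i in range(n)]
    K += [list(C[j]) + [0.0] * k for j in range(k)]
    rhs = [-qi for qi in q] + list(d)
    z = solve_linear(K, rhs)
    return z[:n], z[n:]  # optimal x and Lagrange multipliers

P = [[2.0, 0.0], [0.0, 2.0]]
q = [-2.0, -4.0]
C = [[0.0, 1.0]]   # single active constraint: x_1 = 1
d = [1.0]
x_opt, multipliers = solve_equality_qp(P, q, C, d)
print(x_opt, multipliers)
```

Because the problem reduces to one linear solve, no iterative search over candidate active sets is needed once C and d are known.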
As described above, the constraints of a quadratic programming problem include active constraints, which can be used in the calculation of the optimal solution x*, and inactive constraints, which are not used in the calculation of the optimal solution x*.
A specific example of active and inactive constraints in a quadratic programming problem will be described with reference to FIG. 4.
The x* shown at the center of FIG. 4 indicates the optimal solution x* of the quadratic programming problem.
As described above, the optimal solution x* of the quadratic programming problem is the solution x* (an n-dimensional vector) that minimizes (a) the objective function of the quadratic programming problem while satisfying the constraints of (b) the constraint function.
The circular dotted lines shown in FIG. 4 are contour lines of the calculated value of (a) the objective function of the quadratic programming problem; the calculated value becomes smaller toward the inside of the contours.
The region (V) shown in FIG. 4 is the region that satisfies the constraints of (b) the constraint function of the quadratic programming problem.
The line segments ab, cd, ef, and gh show examples of a plurality of constraints defined by (b) the constraint function of the quadratic programming problem.
The small dotted arrows extending perpendicularly from each line segment indicate the side on which each constraint is satisfied.
For example, for the constraint ab shown as line segment ab, the region to the lower right of line segment ab is the region that satisfies constraint ab. For the constraint gh shown as line segment gh, the region to the upper left of line segment gh is the region that satisfies constraint gh.
The region (V) shown in FIG. 4 is the region (a region of n-dimensional state vectors) that satisfies all four constraints indicated by the four line segments ab, cd, ef, and gh, that is, constraint ab, constraint cd, constraint ef, and constraint gh.
Thus, the region (V) is the region that satisfies the constraints of (b) the constraint function of the quadratic programming problem, and the solution x* (an n-dimensional vector) that minimizes (a) the objective function of the quadratic programming problem within this region (V) is calculated as the optimal solution x* of the quadratic programming problem.
Of the four constraints shown in FIG. 4, that is, constraint ab, constraint cd, constraint ef, and constraint gh, the two constraints ab and cd are active constraints that can be used in the calculation of the optimal solution x* of the quadratic programming problem.
On the other hand, the two constraints ef and gh are inactive constraints that are not used in the calculation of the optimal solution x* of the quadratic programming problem.
The inactive constraints merely delimit the region that satisfies the constraints of (b) the constraint function and are not used in the calculation of the optimal solution x*.
The (b) constraint function of a quadratic programming problem includes a plurality of different constraints, but it is difficult to determine in advance which of these constraints are active constraints usable in the calculation of the optimal solution x*. The active or inactive status of each constraint is determined only as a result of trial and error in the course of calculating the optimal solution x*.
If, before the calculation of the optimal solution x* of the quadratic programming problem is started, the active constraints usable in the calculation of x* are distinguished from the inactive constraints not used in that calculation, only the active constraints are extracted, and the extracted active constraints are regarded as active equality constraints, the quadratic programming problem can be reduced to a system of linear equations, and the optimal solution x* can be calculated at high speed.
The active constraint identification data generation unit 53 of the learning data set generation unit 50 shown in FIG. 3 generates the data for this purpose, that is, the active constraint identification data (S*(θ)).
As described above, the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 of the learning data set generation unit 50 shown in FIG. 3 calculates the optimal solution x* of the optimization problem (quadratic programming problem), substitutes the calculated optimal solution x* into the inequality constraint "l ≤ Ax ≤ u", and generates selection matrices S_cl and S_cu in which the matrix elements for which equality holds after the substitution are set to 1 and all other matrix elements are set to 0.
These matrices S_cl and S_cu are input to the active constraint identification data generation unit 53 of the learning data set generation unit 50 shown in FIG. 3, and the active constraint identification data generation unit 53 uses them to generate data for identifying whether each constraint included in (b) the constraint function of the quadratic programming problem is active or inactive (active constraint identification data (S*(θ))).
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 of the learning data set generation unit 50 shown in FIG. 3 outputs, to the active constraint identification data generation unit 53, the calculated optimal solution x* (an n-dimensional vector) of the quadratic programming problem and the selection matrices S_cl and S_cu generated in the course of calculating the optimal solution x*.
As described above, the selection matrices S_cl and S_cu are matrices generated by substituting the calculated optimal solution x* into the inequality constraint "l ≤ Ax ≤ u" and extracting only the rows for which equality holds. They are selection matrices in which the matrix elements for which equality holds are set to 1 and all other matrix elements are set to 0.
The following relational expressions hold among these selection matrices S_cl and S_cu and the parameters A, l, and u of the inequality constraint (l ≤ Ax ≤ u) of the quadratic programming standardized model, where A is an m×n matrix and l and u are m-dimensional vectors:
S_cl A x* = S_cl l,
S_cu A x* = S_cu u.
The active constraint identification data generation unit 53 receives the following data as input.
(a) From the quadratic programming problem standardized model generation unit (QP Modeling) 51: the parameters A, l, and u of the inequality constraint (l ≤ Ax ≤ u) of the quadratic programming standardized model.
(b) From the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52: the optimal solution x* (an n-dimensional vector) of the quadratic programming problem and the selection matrices S_cl and S_cu.
Based on these input data, the active constraint identification data generation unit 53 generates active constraint identification data (S*(θ)), which is information for selectively extracting only the active constraints from the inequality constraint (l ≤ Ax ≤ u) of the quadratic programming standardized model.
The active constraint identification data generated by the active constraint identification data generation unit 53 is the output of the active constraint identification data generation unit 53 shown in FIG. 3, that is, S*(θ).
This active constraint identification data (S*(θ)) is data that collects the diagonal components of the selection matrices S_cl and S_cu input from the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52.
The selection matrices S_cl and S_cu input from the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 52 are the matrices shown in (Equation 4) below.
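$$
S_{cl} = \operatorname{diag}\left(s_{cl,1}, \dots, s_{cl,m}\right),\qquad
S_{cu} = \operatorname{diag}\left(s_{cu,1}, \dots, s_{cu,m}\right),\qquad
s_{cl,i},\, s_{cu,i} \in \{0, 1\}
\qquad \text{(Equation 4)}
$$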
The selection matrices S_cl and S_cu are matrices whose diagonal elements, running from the upper left to the lower right, are 0 or 1 and whose other elements are all 0.
A diagonal element of 0 corresponds to an inactive constraint that is not used in the calculation of the optimal solution x* (an n-dimensional vector) of the quadratic programming problem, and a diagonal element of 1 corresponds to an active constraint that is used in the calculation of the optimal solution x* (an n-dimensional vector).
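Collecting the diagonal components into a single data string can be sketched as below. The text does not specify how the two diagonals are combined, so concatenating the lower-bound bits followed by the upper-bound bits is an assumption made here for illustration, as is the function name.

```python
def active_constraint_bits(S_cl, S_cu):
    """Collect the diagonals of S_cl and S_cu into one tuple of 0/1 flags.

    Assumed layout: lower-bound flags first, then upper-bound flags.
    """
    lower = [S_cl[i][i] for i in range(len(S_cl))]
    upper = [S_cu[i][i] for i in range(len(S_cu))]
    return tuple(lower + upper)

S_cl = [[1, 0],
        [0, 0]]   # row 0 is active at its lower bound
S_cu = [[0, 0],
        [0, 1]]   # row 1 is active at its upper bound

bits = active_constraint_bits(S_cl, S_cu)
print(bits)
```

Keeping the lower and upper flags separate preserves which bound is active, which is needed later to form the equality constraints.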
A specific example of the active constraint identification data (S*(θ)) generated by the active constraint identification data generation unit 53 will be described with reference to FIG. 5.
As shown in FIG. 5, the active constraint identification data generation unit 53 generates the active constraint identification data (S*(θ)) corresponding to each input parameter (θ).
For example, in the example shown in FIG. 5, the active constraint identification data (S*(θ_0)) corresponding to the input parameter (θ_0) is (1000).
S*(θ_0) = (1000) is a data string indicating whether each of the four constraints ab, cd, ef, and gh is an active constraint (1) or an inactive constraint (0).
The active constraint identification data (S*(θ)) consists of a data string in which an active constraint is expressed as 1 and an inactive constraint as 0.
The active constraint identification data S*(θ_0) = (1000) means that the four constraints ab, cd, ef, and gh corresponding to the input parameter (θ_0) are as follows.
Constraint ab = 1 (active constraint)
Constraint cd = 0 (inactive constraint)
Constraint ef = 0 (inactive constraint)
Constraint gh = 0 (inactive constraint)
Likewise, the active constraint identification data S*(θ_1) = (1100) means that the four constraints ab, cd, ef, and gh corresponding to the input parameter (θ_1) are as follows.
Constraint ab = 1 (active constraint)
Constraint cd = 1 (active constraint)
Constraint ef = 0 (inactive constraint)
Constraint gh = 0 (inactive constraint)
In this way, the active constraint identification data generation unit 53 of the learning data set generation unit 50 shown in FIG. 3 generates the active constraint identification data (S*(θ)), which is information for selectively extracting, from the inequality constraint (l ≤ Ax ≤ u) of the quadratic programming standardized model, only the active constraints corresponding to the input parameter (θ).
The active constraint identification data (S*(θ)) generated by the active constraint identification data generation unit 53 of the learning data set generation unit 50 shown in FIG. 3 is associated with the input parameter (θ) supplied to the learning data set generation unit 50 and is stored as the learning data set 61 in a learning data set storage unit (storage unit).
This is the learning data set (θ, S*(θ)) 61 shown inside the predictor generation unit 60 in FIG. 3.
The predictor generation unit 60 shown in FIG. 3 inputs this learning data set (θ, S*(θ)) to the class classification processing unit (NN Classifier = neural network class classification unit) 62.
The class classification processing unit (NN Classifier = neural network class classification unit) 62 executes learning processing using the learning data set (θ, S*(θ)) and generates a predictor (NN: neural network) that predicts the active constraint identification data (S*(θ)) from the input parameter θ.
A specific example of the predictor (NN: neural network) generated by the class classification processing unit (NN Classifier = neural network class classification unit) 62 will be described with reference to FIG. 6.
FIG. 6 shows the predictor (NN: neural network) 62a generated by the class classification processing unit (NN Classifier = neural network class classification unit) 62.
The predictor (NN: neural network) 62a is a predictor that, given the input parameter θ, selects and outputs a label corresponding to the active constraint identification data (S*(θ)).
Each label is associated with one value of the active constraint identification data (S*(θ)).
FIG. 7 shows an example of label assignment.
The table shown in FIG. 7 corresponds to data in which labels are associated with the active constraint identification data (S*(θ)) generated by the active constraint identification data generation unit 53 described above with reference to FIG. 5.
For example, the active constraint identification data corresponding to the input parameter (θ_0) is S*(θ_0) = (1000), and label [1] is assigned to this active constraint identification data (1000).
Likewise, the active constraint identification data corresponding to the input parameter (θ_1) is S*(θ_1) = (1100), and label [2] is assigned to this active constraint identification data (1100).
In this way, the predictor (NN: neural network) 62a generated by the class classification processing unit (NN Classifier = neural network class classification unit) 62 selects and outputs one of the labels 1, 2, 3, ... according to the data string of the active constraint identification data (S*(θ)) corresponding to the input parameter (θ).
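The association between data strings and labels can be sketched as a simple lookup table. Numbering the labels in order of first appearance is an assumption made for illustration; the text only states that each distinct data string receives its own label. The function name is hypothetical.

```python
def build_label_table(patterns):
    """Assign labels 1, 2, 3, ... to distinct patterns in order of appearance."""
    table = {}
    for pattern in patterns:
        if pattern not in table:
            table[pattern] = len(table) + 1
    return table

# Data strings observed for a sequence of input parameters (sample data).
patterns = ["1000", "1100", "1000", "1100"]

table = build_label_table(patterns)
labels = [table[p] for p in patterns]   # class targets for training
print(table)
print(labels)
```

Treating each distinct data string as one class turns the prediction of S*(θ) into an ordinary multi-class classification problem.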
The label selection processing in the predictor (NN: neural network) 62a is performed using the learning data set (θ, S*(θ)) 61 generated by the learning data set generation unit.
For example, from the learning data set (θ, S*(θ)) 61 stored in the storage unit, entries (θ, S*(θ)) whose parameter (θ) is highly similar to the input parameter θ supplied to the predictor (NN: neural network) 62a are selected, scores are assigned so that higher similarity gives a higher score, and the label (the label corresponding to the active constraint identification data) of the entry with the highest score is used as the output label.
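The similarity-based scoring described above can be imitated with a nearest-neighbour style stand-in. This is not a neural network, and the similarity measure 1 / (1 + Euclidean distance), the normalization of the scores, and all names and sample data are assumptions chosen only to make the idea concrete.

```python
import math

def label_scores(theta, dataset):
    """Score each label by its most similar stored parameter vector.

    dataset: list of (theta_i, label_i) pairs.
    Similarity is assumed to be 1 / (1 + Euclidean distance).
    Scores are normalized so that they sum to 1.
    """
    raw = {}
    for theta_i, label in dataset:
        similarity = 1.0 / (1.0 + math.dist(theta, theta_i))
        raw[label] = max(raw.get(label, 0.0), similarity)
    total = sum(raw.values())
    return {label: score / total for label, score in raw.items()}

def predict_label(theta, dataset):
    """Return the label with the highest score."""
    scores = label_scores(theta, dataset)
    return max(scores, key=scores.get)

# Stored (theta, label) pairs (sample data).
dataset = [([0.0, 0.0], 1), ([1.0, 1.0], 2), ([0.9, 1.2], 2)]

scores = label_scores([1.0, 0.9], dataset)
predicted = predict_label([1.0, 0.9], dataset)
print(scores, predicted)
```

A trained classifier would replace the explicit search over stored entries with a single forward pass, but the output has the same shape: one score per label.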
In the example shown in FIG. 6, the scores calculated by the predictor (NN: neural network) 62a are as follows.
Score of label 1 (S*(θ) = 1000) = 0.11
Score of label 2 (S*(θ) = 1100) = 0.79
Score of label 3 = 0.05
Score of label 4 = 0.01
When such score calculation results are obtained, the predictor (NN: neural network) 62a outputs the label with the highest score, that is, label 2.
Label 2 corresponds to the active constraint identification data S*(θ) = 1100.
That is, the predictor outputs the result that the active constraint identification data to be applied to the input parameter (θ) is S*(θ) = 1100.
As described above with reference to FIGS. 4 and 5, this active constraint identification data S*(θ) = 1100 means that the four constraints ab, cd, ef, and gh corresponding to the input parameter θ are as follows.
Constraint ab = 1 (active constraint)
Constraint cd = 1 (active constraint)
Constraint ef = 0 (inactive constraint)
Constraint gh = 0 (inactive constraint)
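Selecting the highest-scoring label and decoding it into per-constraint flags can be sketched as follows. The label table uses the assignments described for FIG. 7 (label 1 = 1000, label 2 = 1100); the data strings for labels 3 and 4 are not given in the text, so they are left out. Function and variable names are hypothetical.

```python
CONSTRAINT_NAMES = ("ab", "cd", "ef", "gh")

# Label assignments as described for FIG. 7; labels 3 and 4 are not specified.
LABEL_TO_PATTERN = {1: "1000", 2: "1100"}

def decode_prediction(scores, label_to_pattern, names):
    """Pick the label with the highest score and list its active constraints."""
    best_label = max(scores, key=scores.get)
    pattern = label_to_pattern[best_label]
    active = [name for name, bit in zip(names, pattern) if bit == "1"]
    return best_label, active

# Scores from the FIG. 6 example.
scores = {1: 0.11, 2: 0.79, 3: 0.05, 4: 0.01}

best_label, active = decode_prediction(scores, LABEL_TO_PATTERN, CONSTRAINT_NAMES)
print(best_label, active)
```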
In this way, the class classification processing unit (NN Classifier = neural network class classification unit) 62 of the predictor generation unit 60 shown in FIG. 3 uses the learning data set (θ, S*(θ)) 61 stored in the storage unit to generate a predictor (NN: neural network) that predicts the active constraint identification data (S*(θ)) from the input parameter θ.
For example, in actual robot control processing, processing is executed using this predictor (NN: neural network), that is, the predictor generated by the class classification processing unit (NN Classifier = neural network class classification unit) 62.
That is, first, a quadratic programming problem is set up for calculating, from the robot's observation information (θ), the optimal solution x* that includes the robot's control information.
Then, the predictor (NN: neural network) described above is used to estimate the active constraints corresponding to the robot's observation information (θ), and the estimated active constraints can be used to calculate the optimal solution x* of the quadratic programming problem.
This processing will be described with reference to FIG. 8.
FIG. 8 shows a control information generation unit 80 configured in, for example, the information processing device of a robot. The control information generation unit 80 receives, for example, an input parameter (θ) that is the robot's observation information, and executes processing to calculate the optimal solution x* of the quadratic programming problem as the robot's control information.
The input parameter θ is expressed, for example, as a k-dimensional vector (θ_0, θ_1, ..., θ_{k-1}) consisting of a plurality (k) of pieces of observation information.
The optimal solution x* of the quadratic programming problem is expressed, for example, as an n-dimensional vector (x_0, x_1, ..., x_{n-1}) consisting of a plurality (n) of pieces of robot control information.
As shown in FIG. 8, the control information generation unit 80 has a class classification processing unit (NN Classifier = neural network class classification unit) 81 and a linear system analysis unit (Linear System Solver) 82.
The class classification processing unit (NN Classifier = neural network class classification unit) 81 has the same configuration as the class classification processing unit (NN Classifier = neural network class classification unit) 62 in the predictor generation unit 60 of the learning processing unit 40 described above with reference to FIG. 3. That is, it executes the same processing as the predictor (NN: neural network) 62a generated in the predictor generation unit 60 of the learning processing unit 40 described above with reference to FIG. 6.
The class classification processing unit (NN Classifier = neural network class classification unit) 81 uses the predictor (NN: neural network) to generate the active constraint identification data (S*(θ)) corresponding to the input parameter (θ). The active constraint identification data (S*(θ)) is data that makes it possible to select only the active constraints corresponding to the input parameter (θ).
The active constraint identification data (S*(θ)) generated by the class classification processing unit (NN Classifier = neural network class classification unit) 81 is input to the linear system analysis unit (Linear System Solver) 82.
The linear system solver 82 uses the active constraint identification data (S*(θ)) to extract only the active constraints from the constraints included in the inequality constraint "l ≤ Ax ≤ u" of the quadratic programming problem, regards them as active equality constraints, and calculates the optimal solution x* (n-dimensional vector) that satisfies the active equality constraints.
This processing enables high-speed calculation of the optimal solution x* of the quadratic programming problem, so that the robot can be controlled quickly.
The detailed configuration and processing of the class classification processing unit (NN Classifier) 81 will be described with reference to FIG. 9.
FIG. 9 shows a predictor (NN: neural network) 81a and a label conversion unit 81b configured in the class classification processing unit (NN Classifier) 81.
The predictor (NN: neural network) 81a corresponds to the predictor (NN: neural network) 62a generated in the predictor generation unit 60 of the learning processing unit 40 described above with reference to FIG. 6.
That is, the predictor (NN: neural network) 81a selects and outputs the label corresponding to the input parameter (θ), namely the label corresponding to the active constraint identification data (S*(θ)).
The labels are those described above with reference to FIG. 7, each associated with one piece of active constraint identification data (S*(θ)).
For example, label [1] is set for active constraint identification data (S*(θ)) = (1000), and label [2] is set for active constraint identification data (S*(θ)) = (1100).
In this way, each label identifies the setting of the active constraint identification data (S*(θ)).
The predictor (NN: neural network) 81a of the class classification processing unit (NN Classifier) 81 shown in FIG. 9 selects and outputs a label 1, 2, 3, ... according to the data string of the active constraint identification data (S*(θ)) corresponding to the input parameter (θ).
The predictor (NN: neural network) 81a is a predictor generated by learning processing using the learning data set (θ, S*(θ)) 61 described above.
For example, the predictor (NN: neural network) 81a executes label estimation processing in which higher scores are assigned to the learning data sets (θ, S*(θ)) whose parameter (θ) is more similar to the input parameter θ given to the predictor, and the label (the label corresponding to active constraint identification data) of the learning data set (θ, S*(θ)) with the highest score is used as the output label.
In the example shown in FIG. 9, the scores calculated by the predictor (NN: neural network) 81a are as follows.
Score for label 1 (S*(θ) = 1000) = 0.11
Score for label 2 (S*(θ) = 1000) = 0.79
Score for label 3 (S*(θ) = 1000) = 0.05
Score for label 4 (S*(θ) = 1000) = 0.01
When such score calculation results are obtained, the predictor (NN: neural network) 81a outputs the label with the highest score, that is, label 2.
Label 2 corresponds to active constraint identification data S*(θ) = 1000.
That is, the predictor outputs the result that the active constraint identification data to be applied to the input parameter (θ) is (S*(θ)) = 1000.
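The label selection described above amounts to taking the label with the highest score. A minimal sketch, assuming the scores are held in a dictionary keyed by label number (the dictionary layout and the function name `select_label` are illustrative assumptions, not part of the source):

```python
# Scores from the FIG. 9 example; the dict layout is an illustrative assumption.
scores = {1: 0.11, 2: 0.79, 3: 0.05, 4: 0.01}


def select_label(label_scores):
    """Return the label whose score is highest."""
    if not label_scores:
        raise ValueError("at least one label score is required")
    return max(label_scores, key=label_scores.get)


best_label = select_label(scores)
print(best_label)
```

With the FIG. 9 scores, the selected label is label 2, matching the text.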
As described above with reference to FIGS. 4 and 5, this active constraint identification data (S*(θ)) = 1000 means that the four constraints corresponding to the input parameter θ, namely constraint ab, constraint cd, constraint ef, and constraint gh, are set as follows.
Constraint ab = 1 (active constraint)
Constraint cd = 0 (inactive constraint)
Constraint ef = 0 (inactive constraint)
Constraint gh = 0 (inactive constraint)
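The decoding of an identification data string into per-constraint flags can be sketched as follows. The string representation, the constraint order (ab, cd, ef, gh), and the function name `decode_active` are assumptions made for illustration.

```python
# Constraint order assumed from the text: ab, cd, ef, gh.
CONSTRAINT_NAMES = ("ab", "cd", "ef", "gh")


def decode_active(bits, names=CONSTRAINT_NAMES):
    """Map each constraint name to True (active) or False (inactive)."""
    if len(bits) != len(names):
        raise ValueError("one bit is required per constraint")
    if any(bit not in "01" for bit in bits):
        raise ValueError("bits must be '0' or '1'")
    return {name: bit == "1" for name, bit in zip(names, bits)}


flags = decode_active("1000")
active = [name for name, is_active in flags.items() if is_active]
print(active)
```

For the data string 1000, only constraint ab is flagged as active.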
In this way, the predictor (NN: neural network) 81a of the class classification processing unit (NN Classifier) 81 shown in FIG. 9 uses the learning data set (θ, S*(θ)) 61 stored in the storage unit to generate, from the input parameter θ, a label corresponding to the active constraint identification data (S*(θ)), and outputs the label to the label conversion unit 81b.
The label conversion unit 81b receives the label corresponding to the active constraint identification data (S*(θ)) generated by the predictor (NN: neural network) 81a, selects the one piece of active constraint identification data (S*(θ)) associated with that label, and outputs it to the linear system solver 82 in the next stage.
The linear system solver 82 uses the active constraint identification data (S*(θ)) input from the class classification processing unit (NN Classifier) 81 to extract only the active constraints from the constraints included in the inequality constraint "l ≤ Ax ≤ u" of the quadratic programming problem, regards them as active equality constraints, and calculates the optimal solution x* (n-dimensional vector) that satisfies the active equality constraints.
This processing enables high-speed calculation of the optimal solution x* of the quadratic programming problem, so that the robot can be controlled quickly.
However, in this method of calculating the optimal solution of a quadratic programming problem at high speed, the number of labels described above increases exponentially as the number N of constraints included in (b) the constraint function of the quadratic programming problem increases.
For example, the above example, described with reference to FIGS. 4 to 7, has four constraints (constraint ab, constraint cd, constraint ef, and constraint gh), and the number of labels in this case is 2^4 = 16. If the number of constraints is 8, the number of labels is 2^8 = 256, and if the number of constraints is 10, the number of labels is 2^10 = 1024.
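The growth in the number of labels can be checked with a short calculation. This is only an arithmetic illustration; the function name `label_count` is hypothetical.

```python
def label_count(num_constraints):
    """Number of distinct active/inactive patterns for N constraints (2^N)."""
    if num_constraints < 0:
        raise ValueError("the number of constraints must be non-negative")
    return 2 ** num_constraints


for n in (4, 8, 10):
    print(n, label_count(n))
```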
If the number of labels increases drastically, the processing cost of the predictor (NN: neural network) in the class classification processing unit (NN Classifier) increases.
Specifically, the memory consumption and the amount of computation when using the predictor (NN: neural network) increase, for example, which may cause problems such as reduced efficiency of the learning processing and reduced control speed during robot control.
An embodiment that solves these problems is described below.
[3. Configuration for identifying active constraints based on the norm from the optimal solution of the quadratic programming problem to each constraint]
Next, a configuration for identifying active constraints based on the norm from the optimal solution of the quadratic programming problem to each constraint will be described.
The following describes learning processing that generates active constraint identification data for identifying active constraints based on the norm from the optimal solution of the quadratic programming problem to each constraint, and a specific example of robot control processing that uses the learning results generated by the learning processing.
As described above, one effective technique for calculating the unknown optimal solution x* of a quadratic programming problem at high speed is to use a predictor generated by learning processing, that is, a predictor that estimates the active constraints.
Specifically, it is effective to generate in advance, by learning processing, active constraint identification data corresponding to various input parameters (θ), that is, data for selecting and extracting, from the inequality constraints included in the quadratic programming problem, the active constraints corresponding to the input parameter (θ).
In the embodiment described below, the norm (L2 norm (Euclidean norm)) corresponding to the distance in vector space from the optimal solution x* of the quadratic programming problem to each constraint is used as the active constraint identification data corresponding to the parameter (θ).
First, learning processing generates in advance a learning data set (θ, S_l*(θ)) consisting of pairs of various input parameters (θ) and the norm corresponding to each parameter (θ), that is, the norm (S_l*(θ)) corresponding to the distance in vector space from the optimal solution x* of the quadratic programming problem to each constraint.
This learning data set (θ, S_l*(θ)) is stored in the storage unit and used when calculating the optimal solution x* as control information during robot control.
That is, the norm (S_l*(θ)) of each constraint for the various input parameters (θ) acquired by the robot as observation information is estimated using the learning data set (θ, S_l*(θ)) stored in the storage unit.
Furthermore, each constraint is identified as an active constraint or an inactive constraint according to the value of its estimated norm (S_l*(θ)).
By applying the learning results in this way, the active constraints corresponding to the input parameter (θ) can be extracted at high speed, and by selecting only the active constraints, the optimal solution x* of the quadratic programming problem, such as robot control information, can be calculated at high speed.
The following describes learning processing that generates a learning data set (θ, S_l*(θ)) consisting of pairs of various input parameters (θ) and the norm (S_l*(θ)) of each constraint corresponding to each parameter (θ), and a specific example of robot control processing that uses the learning results generated by this learning processing.
FIG. 10 is a diagram showing a configuration example of the learning processing unit 100 configured in the information processing device.
As shown in FIG. 10, the learning processing unit 100 has a learning data set generation unit 110 and a constraint norm estimator generation unit 120.
The learning data set generation unit 110 calculates a norm for each constraint (constraint norm (S_l*(θ))) that enables extraction of the active constraints corresponding to the input parameter (θ), and generates a learning data set (θ, S_l*(θ)) 121 consisting of pairs of various input parameters (θ) and the constraint norm (S_l*(θ)) of each constraint corresponding to the parameter (θ).
The constraint norm estimator generation unit 120 uses the learning data set (θ, S_l*(θ)) 121 to generate a regression analyzer (NN Regressor: neural network regression analyzer) that estimates, for various input parameters (θ), the constraint norm corresponding to the input parameter (θ).
First, the configuration and processing of the learning data set generation unit 110 will be described.
As shown in FIG. 10, the learning data set generation unit 110 has a quadratic programming problem standardized model generation unit (QP Modeling) 111, a quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112, and a constraint norm calculation unit (Calc Norm) 113.
The quadratic programming problem standardized model generation unit (QP Modeling) 111 and the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 shown in FIG. 10 execute the same processing as the quadratic programming problem standardized model generation unit (QP Modeling) 31 and the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 32, which are components of the quadratic programming problem optimal solution calculation device 30 described above with reference to FIG. 2.
That is, they receive the parameter θ as input and output the optimal solution x* of the quadratic programming problem.
Note that x* means the Hermitian transpose of x (an n-dimensional vector).
As described above, when applied to the control configuration of the robot 10 shown in FIG. 1, for example, the input parameter θ and the output optimal solution x* can correspond as follows.
Input parameter θ = observation information (distance to obstacles, robot position, speed, direction, etc.)
Output optimal solution x* = robot control information (robot traveling direction control information, speed control information, output control information for the left and right wheel units, etc.)
The input parameter θ is, for example, a k-dimensional vector (θ_0, θ_1, ..., θ_{k-1}) consisting of a plurality (k) of pieces of observation information, and the optimal solution x* of the quadratic programming problem is expressed as an n-dimensional vector (x_0, x_1, ..., x_{n-1}) consisting of a plurality (n) of pieces of robot control information.
The processing executed by each component of the learning data set generation unit 110 shown in FIG. 10 will be described.
The quadratic programming problem standardized model generation unit (QP Modeling) 111 of the learning data set generation unit 110 shown in FIG. 10 receives the parameter θ and generates a quadratic programming problem standardized model based on the input parameter θ.
The quadratic programming problem standardized model is the mathematical model shown in (Equation 1) below, which was described above.
Figure JPOXMLDOC01-appb-M000009
In (Equation 1) above, (a) is the objective function (or cost function).
(b) is the constraint function, an inequality constraint function composed of inequalities.
The parameters appearing in (a) the objective function and (b) the constraint function of the above equation are as follows.
P is an n×n real symmetric matrix,
q is an n×1 real vector,
A is an m×n matrix,
l and u are m-dimensional vectors, and
x^T is the transpose of the n-dimensional vector x.
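The image for (Equation 1) is not available in this text. Based on the parameter definitions above and the inequality constraint "l ≤ Ax ≤ u" quoted elsewhere in the description, it presumably has the standard form sketched below. The factor 1/2 is an assumption following common convention and is not confirmed by the source.

```latex
\begin{aligned}
\text{(a)}\quad & \min_{x}\ \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to}\ \ l \le A x \le u
\end{aligned}
```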
As described above, the quadratic programming problem is the problem of calculating the optimal solution x* (n-dimensional vector) that minimizes (a) the objective function while satisfying the constraints of (b) the constraint function in the above equation.
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 receives the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit (QP Modeling) 111, that is, the model composed of (a) the objective function and (b) the constraint function shown in (Equation 1) above, and calculates and outputs the optimal solution x* (n-dimensional vector) that minimizes (a) the objective function while satisfying the constraints of (b) the constraint function.
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 calculates the optimal solution x* of the optimization problem (quadratic programming problem) and substitutes the calculated optimal solution x* into the inequality constraint "l ≤ Ax ≤ u".
This substitution extracts only the rows for which equality holds. Selection matrices S_cl and S_cu are generated in which the matrix elements for which equality holds are set to 1 and all other matrix elements are set to 0. With these matrices S_cl and S_cu, the following relations hold:
S_cl A x* = S_cl l,
S_cu A x* = S_cu u.
Concatenating these relations yields (Equation 2) below, which was described above.
Figure JPOXMLDOC01-appb-M000010
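The extraction of rows for which equality holds can be sketched as follows. This is a minimal illustration under stated assumptions: picking rows by index stands in for multiplication by the selection matrices S_cl and S_cu, the tolerance `tol` is an added numerical safeguard, and the function names and the small example problem are hypothetical.

```python
def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def active_rows(A, l, u, x_opt, tol=1e-9):
    """Return (lower, upper): row indices where A x* equals l or u within tol."""
    Ax = matvec(A, x_opt)
    lower = [i for i, (v, lo) in enumerate(zip(Ax, l)) if abs(v - lo) <= tol]
    upper = [i for i, (v, hi) in enumerate(zip(Ax, u)) if abs(v - hi) <= tol]
    return lower, upper


def stack_equalities(A, l, u, lower, upper):
    """Stack the selected rows into an equality system C x = d."""
    C = [A[i] for i in lower] + [A[i] for i in upper]
    d = [l[i] for i in lower] + [u[i] for i in upper]
    return C, d


# Hypothetical two-variable problem with three two-sided constraints.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
l = [0.0, 0.0, -5.0]
u = [1.0, 2.0, 5.0]
x_opt = [1.0, 0.5]  # assumed optimal solution supplied by a QP solver

lower, upper = active_rows(A, l, u, x_opt)
C, d = stack_equalities(A, l, u, lower, upper)
print(lower, upper, C, d)
```

In this example only the upper bound of the first row holds with equality, so the stacked system contains that single row.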
The quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 uses (Equation 2) above to extract the active constraints, regards the extracted active constraints as equality constraints, and converts the quadratic programming problem with inequality constraints generated by the quadratic programming problem standardized model generation unit (QP Modeling) 111 into a quadratic programming problem with active equality constraints as shown in (Equation 3) below.
Figure JPOXMLDOC01-appb-M000011
In (Equation 3) above, (a) is the objective function (or cost function).
(b) is the constraint function, which consists of the active equality constraints.
The parameters appearing in (a) the objective function and (b) the constraint function of the above equation are as follows.
P is an n×n real symmetric matrix,
q is an n×1 real vector,
x^T is the transpose of the n-dimensional vector x,
C is an m×n matrix, and
d is an m-dimensional vector.
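The image for (Equation 3) is not available in this text. From the parameter definitions above, it presumably has the form sketched below, where C and d collect the active rows obtained from (Equation 2). Both the factor 1/2 and the explicit construction of C and d from S_cl and S_cu are assumptions, not statements taken from the source.

```latex
\begin{aligned}
\text{(a)}\quad & \min_{x}\ \tfrac{1}{2}\, x^{T} P x + q^{T} x \\
\text{(b)}\quad & \text{subject to}\ \ C x = d,
\qquad
C = \begin{bmatrix} S_{cl} A \\ S_{cu} A \end{bmatrix},\quad
d = \begin{bmatrix} S_{cl}\, l \\ S_{cu}\, u \end{bmatrix}
\end{aligned}
```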
In this way, the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112 applies (Equation 3) above to calculate the optimal solution x* (n-dimensional vector) that minimizes (a) the objective function while satisfying the constraints of (b) the active equality constraint function.
As described above, the constraints of a quadratic programming problem include active constraints, which can be used in calculating the optimal solution x*, and inactive constraints, which are not used in calculating the optimal solution x*.
In this embodiment, the norm (L2 norm (Euclidean norm)) corresponding to the distance in vector space from the optimal solution x* of the quadratic programming problem to each constraint is used as the index for distinguishing active constraints from inactive constraints.
With reference to FIG. 11, specific examples of active and inactive constraints in a quadratic programming problem, and the norm (constraint norm) used as the index for distinguishing between them, will be described.
The x* shown at the center of FIG. 11 is the optimal solution x* of the quadratic programming problem.
As described above, the optimal solution x* of the quadratic programming problem is the solution x* (n-dimensional vector) that minimizes (a) the objective function of the quadratic programming problem while satisfying the constraints of (b) the constraint function.
The circular dotted lines shown in FIG. 11 are contour lines of the value of (a) the objective function of the quadratic programming problem; the value decreases toward the inside of the contours.
The region (V) shown in FIG. 11 is the region that satisfies the constraints of (b) the constraint function of the quadratic programming problem.
The line segments ab, cd, ef, and gh show examples of a plurality of constraints defined by (b) the constraint function of the quadratic programming problem.
The small dotted arrows extending perpendicularly from each line segment indicate the side on which each constraint is satisfied.
For example, for constraint ab, shown as line segment ab, the region to the lower right of line segment ab satisfies constraint ab. For constraint gh, shown as line segment gh, the region to the upper left of line segment gh satisfies constraint gh.
The region (V) shown in FIG. 11 is the region (a region of n-dimensional state vectors) that satisfies all four constraints indicated by the four line segments ab, cd, ef, and gh, that is, constraint ab, constraint cd, constraint ef, and constraint gh.
Thus, the region (V) is the region that satisfies the constraints of (b) the constraint function of the quadratic programming problem, and the solution x* (n-dimensional vector) that minimizes (a) the objective function within this region (V) is calculated as the optimal solution x* of the quadratic programming problem.
Of the four constraints shown in FIG. 11 (constraint ab, constraint cd, constraint ef, and constraint gh), constraint ab and constraint cd are active constraints that can be used in calculating the optimal solution x* of the quadratic programming problem.
On the other hand, constraint ef and constraint gh are inactive constraints that are not used in calculating the optimal solution x* of the quadratic programming problem.
Inactive constraints only define the region that satisfies the constraints of (b) the constraint function and are not used in calculating the optimal solution x*.
(b) The constraint function of the quadratic programming problem includes a plurality of different constraints. This embodiment uses the norm to determine which of these constraints are active constraints that can be used in calculating the optimal solution x*.
That is, the norm (L2 norm (Euclidean norm)) corresponding to the distance in vector space from the optimal solution x* of the quadratic programming problem to each constraint is calculated for each constraint. If the calculated constraint norm is greater than or equal to a predefined threshold (λ), the constraint is determined to be an inactive constraint; if the constraint norm is less than the predefined threshold (λ), the constraint is determined to be an active constraint.
For example, when the threshold λ = 0.20, in the example shown in FIG. 11, constraint ab and constraint cd have
constraint norm = 0.00.
Since the norms of constraint ab and constraint cd are less than the threshold λ = 0.20, they are determined to be active constraints.
Constraint ef has
constraint norm = 1.60.
Since the norm of constraint ef is greater than or equal to the threshold λ = 0.20, it is determined to be an inactive constraint.
Constraint gh has
constraint norm = 1.60.
Since the norm of constraint gh is also greater than or equal to the threshold λ = 0.20, it is determined to be an inactive constraint.
In this way, in this embodiment, the norm (L2 norm (Euclidean norm)) corresponding to the distance in vector space from the optimal solution x* of the quadratic programming problem to each constraint is calculated for each constraint. If the calculated constraint norm is greater than or equal to a predefined threshold (λ), the constraint is determined to be an inactive constraint; if the constraint norm is less than the predefined threshold (λ), the constraint is determined to be an active constraint.
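The threshold rule can be sketched as follows, using the constraint norms and the threshold λ = 0.20 from the FIG. 11 example. The dictionary layout and the function name `classify_constraints` are illustrative assumptions.

```python
def classify_constraints(norms, threshold):
    """Map each constraint name to True (active) when its norm is below threshold."""
    return {name: value < threshold for name, value in norms.items()}


# Constraint norms from the FIG. 11 example.
norms = {"ab": 0.00, "cd": 0.00, "ef": 1.60, "gh": 1.60}

flags = classify_constraints(norms, threshold=0.20)
active = [name for name, is_active in flags.items() if is_active]
inactive = [name for name, is_active in flags.items() if not is_active]
print(active, inactive)
```

A norm exactly equal to the threshold is treated as inactive, following the "greater than or equal to" wording in the text.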
As described above, if, before starting to calculate the optimal solution x* of the quadratic programming problem, the active constraints that can be used in the calculation are distinguished from the inactive constraints that are not, and only the active constraints are extracted and regarded as active equality constraints, the quadratic programming problem can be reduced to a system of linear equations, and the optimal solution x* can be calculated at high speed by solving that system.
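One common way to realize this reduction is to solve the KKT system of the equality-constrained problem. The sketch below assumes the objective (1/2) x^T P x + q^T x with constraint C x = d, so that x* and the multipliers ν satisfy P x + C^T ν = −q and C x = d. The source does not state which linear system its solver uses, so this formulation, the function names, and the example values are assumptions.

```python
def solve_linear(M, b):
    """Solve M z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - tail) / aug[r][r]
    return z


def solve_equality_qp(P, q, C, d):
    """Solve the KKT system [[P, C^T], [C, 0]] [x; nu] = [-q; d]."""
    n, m = len(q), len(d)
    K = [list(P[i]) + [C[j][i] for j in range(m)] for i in range(n)]
    K += [list(C[j]) + [0.0] * m for j in range(m)]
    rhs = [-v for v in q] + list(d)
    z = solve_linear(K, rhs)
    return z[:n], z[n:]


# Hypothetical problem: one active constraint x0 = 1 treated as an equality.
P = [[1.0, 0.0], [0.0, 1.0]]
q = [-2.0, -0.5]
C = [[1.0, 0.0]]
d = [1.0]

x_opt, nu = solve_equality_qp(P, q, C, d)
print(x_opt, nu)
```

Because only the active rows enter C, the system has n plus the number of active constraints unknowns, and a single linear solve replaces the iterative QP solve.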
The constraint norm calculation unit (Calc Norm) 113 of the learning data set generation unit 110 shown in FIG. 10 calculates, for each constraint of the quadratic programming problem, a norm (constraint norm (S_l*(θ))) as an index value for determining whether the constraint is an active constraint or an inactive constraint.
That is, the constraint norm (S_l*(θ)) calculated by the constraint norm calculation unit (Calc Norm) 113 is a constraint activity determination index value for identifying whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used in calculating the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used in calculating it.
The constraint norm calculation unit (Calc Norm) 113 receives the following data.
(a) From the quadratic programming problem standardized model generation unit (QP Modeling) 111: the parameters A, l, and u of the inequality constraint (l ≤ Ax ≤ u) of the quadratic programming problem standardized model.
(b) From the quadratic programming problem standardized model optimal solution calculation unit (QP Solver) 112: the optimal solution x* (n-dimensional vector) of the quadratic programming problem.
Based on these input data, the constraint norm calculation unit (Calc Norm) 113 calculates, for each constraint of the quadratic programming problem, a norm (constraint norm (S_l*(θ))) as an index value for determining whether the constraint is an active constraint or an inactive constraint.
The constraint norm (S_l*(θ)) calculated by the constraint norm calculation unit (Calc Norm) 113 is, for example, the data (matrix data concatenating the norms of the constraints) shown in (Equation 5) below.
(Equation 5: formula image JPOXMLDOC01-appb-M000012)
 図12を参照して、制約ノルム算出部(Calc Norm)113が生成する制約ノルム(S (θ))の具体例について説明する。
 制約ノルム算出部(Calc Norm)113は、図12に示すように、入力パラメータ(θ)に対応する制約ノルム(S (θ))を算出する。
A specific example of the constraint norm (S l * (θ)) generated by the constraint norm calculator (Calc Norm) 113 will be described with reference to FIG. 12 .
The constraint norm calculator (Calc Norm) 113 calculates the constraint norm (S l * (θ)) corresponding to the input parameter (θ), as shown in FIG. 12 .
 制約ノルム(S (θ))は、前述したように、二次計画問題の最適解xから各制約までのベクトル空間における距離に相当するノルム(Lノルム(ユークリッドノルム))である。
 制約ノルム(S (θ))の値が、予め規定したしきい値(λ)、例えばλ=0.02以上であれば、その制約は非アクティブ制約であると判断し、制約ノルムの値が予め規定したしきい値(λ)未満であれば、その制約はアクティブ制約であると判断される。
As described above, the constraint norm (S l * (θ)) is a norm (L2 norm (Euclidean norm)) corresponding to the distance, in the vector space, from the optimal solution x * of the quadratic programming problem to each constraint.
If the value of the constraint norm (S l * (θ)) is greater than or equal to a predefined threshold value (λ), for example λ=0.02, the constraint is determined to be an inactive constraint; if the value is less than the threshold value (λ), the constraint is determined to be an active constraint.
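The distance-based norm described above can be sketched in Python. This is a minimal illustration and not the formula of (Equation 5): it assumes that each constraint is one row a_i of A with bounds l_i and u_i, and that the constraint norm is the Euclidean distance from x * to the nearer of the two bounding hyperplanes. The function name constraint_norms and the sample values are hypothetical.

```python
import math

def constraint_norms(A, l, u, x_star):
    """Distance from x_star to the nearest bound of each row of l <= A x <= u.

    Assumes every row of A is nonzero and every bound is finite.
    """
    norms = []
    for a_i, l_i, u_i in zip(A, l, u):
        row_len = math.sqrt(sum(v * v for v in a_i))   # ||a_i||_2
        ax = sum(v * x for v, x in zip(a_i, x_star))   # a_i . x*
        # Perpendicular distances to the lower and upper bounding hyperplanes.
        d_lower = abs(ax - l_i) / row_len
        d_upper = abs(u_i - ax) / row_len
        norms.append(min(d_lower, d_upper))
    return norms

# Hypothetical two-constraint problem: 0 <= x0 <= 1 and 0 <= x1 <= 2.
A = [[1.0, 0.0], [0.0, 1.0]]
l = [0.0, 0.0]
u = [1.0, 2.0]
x_star = [1.0, 0.5]          # sits on the upper bound of the first constraint
norms = constraint_norms(A, l, u, x_star)
```

Under these assumptions the first constraint gets a norm of zero because x * lies on its boundary, while the second gets a positive norm.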
 例えば、図12に示す例では、入力パラメータ(θ)に対応する制約ノルム(S (θ))は、
 制約ノルム(S (θ))=(0.00,1.25,1.80,1.50)である。
 制約ノルム(S (θ))=(0.00,1.25,1.80,1.50)は、制約ab,制約cd,制約ef,制約ghの4つの制約のノルム(Lノルム(ユークリッドノルム))に対応する4つの値を示している。
For example, in the example shown in FIG. 12, the constraint norm (S l * (θ)) corresponding to the input parameter (θ0) is
constraint norm (S l * (θ)) = (0.00, 1.25, 1.80, 1.50).
These four values are the norms (L2 norms (Euclidean norms)) of the four constraints, namely constraint ab, constraint cd, constraint ef, and constraint gh, in that order.
 例えば、予め規定したしきい値(λ)=0.02とした場合、
 制約abの制約ノルム=0.00のみが、しきい値(λ)=0.02未満であり、制約abは、アクティブ制約と判定される。
 制約cd,制約ef,制約ghの制約ノルム(1.25,1.80,1.50)は、いずれも、しきい値(λ)=0.02以上であり、これら制約cd,制約ef,制約ghは、非アクティブ制約と判定される。
For example, when the predefined threshold value (λ) is 0.02,
only the constraint norm of constraint ab (0.00) is less than the threshold value (λ)=0.02, so constraint ab is determined to be an active constraint.
The constraint norms of constraint cd, constraint ef, and constraint gh (1.25, 1.80, 1.50) are all greater than or equal to the threshold value (λ)=0.02, so these three constraints are determined to be inactive constraints.
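The threshold comparison in this example can be written out as a short Python sketch. The constraint names, norm values, and threshold are taken from the FIG. 12 example; the function name classify_constraints is a hypothetical helper introduced only for illustration.

```python
def classify_constraints(norms, threshold):
    """Label each constraint 'active' if its norm is below the threshold."""
    return {name: ("active" if value < threshold else "inactive")
            for name, value in norms.items()}

# Constraint norms from the FIG. 12 example and the threshold lambda = 0.02.
norms = {"ab": 0.00, "cd": 1.25, "ef": 1.80, "gh": 1.50}
result = classify_constraints(norms, threshold=0.02)
```

Note that a norm exactly equal to the threshold is treated as inactive, matching the "greater than or equal to" rule in the text.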
 このように、図10に示す学習データセット生成部110の制約ノルム算出部(Calc Norm)113は、二次計画問題の各制約について、制約がアクティブ制約か非アクティブ制約かを判別するための指標値となるノルム(制約ノルム(S (θ)))を算出する。 In this way, the constraint norm calculation unit (Calc Norm) 113 of the learning data set generation unit 110 shown in FIG. 10 calculates, for each constraint of the quadratic programming problem, a norm (constraint norm (S l * (θ))) that serves as an index value for determining whether the constraint is an active constraint or an inactive constraint.
 図10に示す学習データセット生成部110の制約ノルム算出部(Calc Norm)113が算出した各制約の制約ノルム(S (θ))は、学習データセット生成部110に対する入力パラメータ(θ)に対応付けられて、学習データセット121として学習データセット格納部(記憶部)に格納される。 The constraint norm (S l * (θ)) of each constraint calculated by the constraint norm calculation unit (Calc Norm) 113 of the learning data set generation unit 110 shown in FIG. 10 is associated with the input parameter (θ) supplied to the learning data set generation unit 110 and stored as the learning data set 121 in the learning data set storage unit (storage unit).
 図10に示す制約ノルム推定器生成部120内に示す学習データセット(θ,S (θ))121である。
 図10に示す制約ノルム推定器生成部120は、この学習データセット(θ,S (θ))を制約ノルム推定器生成学習処理実行部(回帰分析器(NN Regressor)生成部)122に入力する。
This is the learning data set (θ, S l * (θ)) 121 shown inside the constraint norm estimator generation unit 120 in FIG. 10.
The constraint norm estimator generation unit 120 shown in FIG. 10 inputs this learning data set (θ, S l * (θ)) to the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122.
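How pairs of an input parameter and its constraint norms might be accumulated into a learning data set can be illustrated with a toy problem. The sketch below is an assumption-laden stand-in for units 111 to 113: it uses a hypothetical one-dimensional problem, minimize (x - θ)^2 subject to lower ≤ x ≤ upper, whose optimal solution is simply θ clamped to the interval, and records the distances from x * to the two bounds. The names solve_toy_qp and make_dataset are not from the source.

```python
def solve_toy_qp(theta, lower, upper):
    """Minimize (x - theta)^2 subject to lower <= x <= upper (closed form)."""
    return min(max(theta, lower), upper)

def make_dataset(thetas, lower=0.0, upper=1.0):
    """Return a list of (theta, (norm_to_lower, norm_to_upper)) training pairs."""
    dataset = []
    for theta in thetas:
        x_star = solve_toy_qp(theta, lower, upper)
        # In one dimension the distance to each bound is an absolute difference.
        norms = (abs(x_star - lower), abs(upper - x_star))
        dataset.append((theta, norms))
    return dataset

dataset = make_dataset([-0.5, 0.25, 2.0])
```

Each stored pair plays the role of one entry (θ, S l * (θ)) of the learning data set; a zero in the norm tuple marks a bound that is active for that θ.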
 制約ノルム推定器生成学習処理実行部(回帰分析器(NN Regressor)生成部)122は、学習データセット(θ,S (θ))を利用した学習処理を実行して、入力パラメータθから、各制約の制約ノルム(S (θ))を推定する回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)を生成する。 A constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 executes learning processing using a learning data set (θ, S l * (θ)), and from the input parameter θ , generates a regression analyzer (NN Regressor) that estimates the constraint norm (S l * (θ)) for each constraint.
 図13を参照して、制約ノルム推定器生成学習処理実行部(回帰分析器(NN Regressor)生成部)122が学習処理により生成する回帰分析器の例について説明する。 An example of a regression analyzer generated by learning processing by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 will be described with reference to FIG.
 図13には、制約ノルム推定器生成学習処理実行部(回帰分析器(NN Regressor)生成部)122が学習処理により生成する回帰分析器、すなわち、入力パラメータθから、各制約の制約ノルム(S (θ))を推定する回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aを示している。 FIG. 13 shows the regression analyzer generated through learning processing by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122, that is, the regression analyzer (NN Regressor: neural network regression analyzer) 122a that estimates the constraint norm (S l * (θ)) of each constraint from the input parameter θ.
 図13に示すように、回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aは、入力パラメータθから、各制約の制約ノルム(S (θ))を推定する。 As shown in FIG. 13, a regression analyzer (NN Regressor: neural network regression analyzer) 122a estimates the constraint norm (S l * (θ)) of each constraint from the input parameter θ.
 回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aは、例えば、記憶部に格納された学習データセット(θ,S (θ))121を用いた回帰分析処理によって、入力パラメータθに対応する制約ノルム(S (θ))を推定して出力する。 The regression analyzer (NN Regressor: neural network regression analyzer) 122a estimates and outputs the constraint norm (S l * (θ)) corresponding to the input parameter θ by, for example, regression analysis processing using the learning data set (θ, S l * (θ)) 121 stored in the storage unit.
 例えば、回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aに対する入力パラメータθと類似度の高いパラメータ(θ)を含む学習データセット(θ,S (θ))を1つ以上選択し、選択した学習データセットの制約ノルム(S (θ))に基づいて、入力パラメータθに対応する制約ノルム(S (θ))を回帰的演算手法により算出して、算出した制約ノルム(S (θ))を出力する。 For example, it selects one or more learning data sets (θ, S l * (θ)) whose parameter (θ) is highly similar to the input parameter θ supplied to the regression analyzer (NN Regressor: neural network regression analyzer) 122a, calculates the constraint norm (S l * (θ)) corresponding to the input parameter θ by a regression-based calculation method from the constraint norms (S l * (θ)) of the selected learning data sets, and outputs the calculated constraint norm (S l * (θ)).
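The idea of selecting similar learning data and regressing from their stored norms can be sketched as follows. This is not a neural network and is not claimed to be the regressor of the source: it is a deliberately simple nearest-neighbour stand-in that assumes similarity is measured by Euclidean distance between parameter vectors and that the estimate is an inverse-distance-weighted average. The function name estimate_norms, the parameter k, and the data are hypothetical.

```python
import math

def estimate_norms(theta, dataset, k=2):
    """Inverse-distance-weighted average of the norms of the k nearest samples."""
    ranked = sorted(dataset, key=lambda pair: math.dist(theta, pair[0]))[:k]
    weights = []
    for sample_theta, sample_norms in ranked:
        d = math.dist(theta, sample_theta)
        if d == 0.0:
            return list(sample_norms)      # exact match: reuse the stored norms
        weights.append(1.0 / d)
    total = sum(weights)
    n_out = len(ranked[0][1])
    return [sum(w * s[1][j] for w, s in zip(weights, ranked)) / total
            for j in range(n_out)]

# Hypothetical learning data: (theta vector, constraint norms) pairs.
dataset = [
    ((0.0,), (0.0, 1.0)),
    ((1.0,), (1.0, 0.0)),
    ((3.0,), (1.0, 0.0)),
]
estimate = estimate_norms((0.5,), dataset)
```

A trained neural network would replace the lookup with a learned function, but the input and output shapes (a parameter vector in, one norm per constraint out) stay the same.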
 例えば、実際のロボット制御処理では、この制約ノルム推定器生成学習処理実行部(回帰分析器(NN Regressor)生成部)122が生成した回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aを利用した処理を実行する。 For example, actual robot control processing uses the regression analyzer (NN Regressor: neural network regression analyzer) 122a generated by this constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122.
 すなわち、まず、ロボットの観測情報(θ)から、ロボットの制御情報を含む最適解xを算出する二次計画問題を設定する。
 さらに、上記の回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)を利用して、二次計画問題の制約のノルムを推定し、推定したノルムに基づいて、ロボットの観測情報(θ)に対応するアクティブ制約を選択する。
 さらに、選択したアクティブ制約を利用して二次計画問題の最適解xを算出する処理を実行する。
 このような処理により、ロボットの観測情報(θ)から最適制御情報を含む二次計画問題の最適解xを算出する処理を高速に行うことが可能となる。
 図14を参照してこの処理について説明する。
That is, first, a quadratic programming problem is set to calculate the optimal solution x * including the control information of the robot from the observation information (θ) of the robot.
Furthermore, the regression analyzer (NN Regressor: neural network regression analyzer) described above is used to estimate the norms of the constraints of the quadratic programming problem, and the active constraints corresponding to the observation information (θ) of the robot are selected on the basis of the estimated norms.
Furthermore, the selected active constraint is used to perform processing for calculating the optimal solution x * of the quadratic programming problem.
Such processing enables high-speed processing of calculating the optimal solution x * of the quadratic programming problem including the optimal control information from the observation information (θ) of the robot.
This processing will be described with reference to FIG.
 図14には、情報処理装置内に構成される制御情報生成部200を示している。この制御情報生成部200は、例えば、ロボットの観測情報である入力パラメータ(θ)を入力し、ロボットの制御情報として、二次計画問題の最適解xを算出する処理を実行する。 FIG. 14 shows the control information generator 200 configured in the information processing apparatus. The control information generation unit 200 receives, for example, an input parameter (θ), which is observation information of the robot, and executes a process of calculating the optimal solution x * of the quadratic programming problem as the control information of the robot.
 入力パラメータθは、例えば複数(k)の観測情報からなるk次元ベクトル(θ,θ,・・・θk-1)として表現される。
 また、二次計画問題の最適解xは、例えばロボットの複数(n)の制御情報からなるn次元ベクトル(x,x,・・・xn-1)として表現される。
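The input parameter θ is expressed, for example, as a k-dimensional vector (θ0, θ1, ..., θk-1) consisting of a plurality (k) of pieces of observation information.
The optimal solution x * of the quadratic programming problem is expressed, for example, as an n-dimensional vector (x0, x1, ..., xn-1) consisting of a plurality (n) of pieces of control information for the robot.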
 図14に示すように、制御情報生成部200は、制約ノルム推定部(NN Regressor=ニューラル・ネットワーク回帰分析器)201と、しきい値適用アクティブ制約選択部(Threshold)202と、線形システム解析部(Linear System Solver)203を有する。 As shown in FIG. 14, the control information generator 200 includes a constraint norm estimator (NN Regressor = neural network regression analyzer) 201, a threshold applied active constraint selector (Threshold) 202, and a linear system analyzer (Linear System Solver) 203.
 制約ノルム推定部(NN Regressor=ニューラル・ネットワーク回帰分析器)201は、先に図10を参照して説明した学習処理部100の制約ノルム推定器生成部120内の制約ノルム推定器生成学習処理実行部(回帰分析器(NN Regressor)生成部)122が生成した回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)を利用した制約ノルム推定処理を実行する。 The constraint norm estimator (NN Regressor = neural network regression analyzer) 201 executes the constraint norm estimator generation learning process in the constraint norm estimator generator 120 of the learning processing unit 100 described above with reference to FIG. A constraint norm estimation process using a regression analyzer (NN Regressor: neural network regression analyzer) generated by the unit (regression analyzer (NN Regressor) generation unit) 122 is executed.
 すなわち、図13を参照して説明した回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aを利用して、入力パラメータ(θ)に対応する各制約の制約ノルム(S (θ))を推定して出力する。
 制約ノルム(S (θ))は、各制約が、二次計画問題の最適解xの算出に利用されるアクティブ制約であるか、利用されない非アクティブ制約であるかを判別するための指標値である。
That is, using the regression analyzer (NN Regressor: neural network regression analyzer) 122a described with reference to FIG. 13, it estimates and outputs the constraint norm (S l * (θ)) of each constraint corresponding to the input parameter (θ).
The constraint norm (S l * (θ)) is used to determine whether each constraint is an active constraint that is used to calculate the optimal solution x * of the quadratic programming problem or an inactive constraint that is not used. It is an index value.
 図15を参照して、制約ノルム推定部(NN Regressor=ニューラル・ネットワーク回帰分析器)201が実行する処理例について説明する。
 図15には、制約ノルム推定部(NN Regressor=ニューラル・ネットワーク回帰分析器)201内に構成される回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)を示している。
An example of processing executed by the constraint norm estimator (NN Regressor=neural network regression analyzer) 201 will be described with reference to FIG.
FIG. 15 shows a regression analyzer (NN Regressor: neural network regression analyzer) configured in the constraint norm estimator (NN Regressor: neural network regression analyzer) 201 .
 この回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)は、先に図13を参照して説明した回帰分析器、すなわち、学習処理部100の制約ノルム推定器生成部120内の制約ノルム推定器生成学習処理実行部(回帰分析器(NN Regressor)生成部)122が生成した回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aに相当する。 This regression analyzer (NN Regressor: neural network regression analyzer) corresponds to the regression analyzer described above with reference to FIG. 13, that is, the regression analyzer (NN Regressor: neural network regression analyzer) 122a generated by the constraint norm estimator generation learning processing execution unit (regression analyzer (NN Regressor) generation unit) 122 in the constraint norm estimator generation unit 120 of the learning processing unit 100.
 すなわち、図15に示す回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)は、入力パラメータ(θ)に対応する各制約の制約ノルム(S (θ))を推定して出力する。 That is, the regression analyzer (NN Regressor: neural network regression analyzer) shown in FIG. 15 estimates and outputs the constraint norm (S l * (θ)) of each constraint corresponding to the input parameter (θ).
 図15に示す回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)は、先に説明した学習データセット(θ,S (θ))121を利用した学習処理によって生成された回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)である。 The regression analyzer (NN Regressor: neural network regression analyzer) shown in FIG. 15 is a regression analyzer generated by learning processing using the learning data set (θ, S l * (θ)) 121 described above.
 従って、例えば、回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)に対する入力パラメータθと類似度の高いパラメータ(θ)を含む学習データセットに記録された制約ノルム(S (θ))を用いた回帰分析処理によって、入力パラメータθに対応する制約ノルム(S (θ))を推定する処理が実行されることになる。 Accordingly, the constraint norm (S l * (θ)) corresponding to the input parameter θ is estimated by, for example, regression analysis processing that uses the constraint norms (S l * (θ)) recorded in learning data sets whose parameter (θ) is highly similar to the input parameter θ supplied to the regression analyzer (NN Regressor: neural network regression analyzer).
 例えば、回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aに対する入力パラメータθと類似度の高いパラメータ(θ)を含む学習データセット(θ,S (θ))を1つ以上選択し、選択した学習データセットの制約ノルム(S (θ))に基づいて、入力パラメータθに対応する制約ノルム(S (θ))を回帰的演算手法により推定して出力する処理などが実行される。 For example, processing is executed that selects one or more learning data sets (θ, S l * (θ)) whose parameter (θ) is highly similar to the input parameter θ supplied to the regression analyzer (NN Regressor: neural network regression analyzer) 122a, estimates the constraint norm (S l * (θ)) corresponding to the input parameter θ by a regression-based calculation method from the constraint norms (S l * (θ)) of the selected learning data sets, and outputs the estimate.
 図15に示す例では、回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)122aは、入力パラメータθに基づく推定結果として制約ノルム(S (θ))=(0.00,1.25,1.80,1.50)を出力している。 In the example shown in FIG. 15, the regression analyzer (NN Regressor: neural network regression analyzer) 122a outputs the constraint norm (S l * (θ)) = (0.00, 1.25, 1.80, 1.50) as the estimation result based on the input parameter θ.
 この制約ノルム(S (θ))データは、
 制約abの制約ノルム(S (θ))=0.00、
 制約cdの制約ノルム(S (θ))=1.25、
 制約efの制約ノルム(S (θ))=1.80、
 制約ghの制約ノルム(S (θ))=1.50、
 このような各制約のノルムを示すデータである。
This constraint norm (S l * (θ)) data gives the norm of each constraint as follows:
constraint norm of constraint ab (S l * (θ)) = 0.00,
constraint norm of constraint cd (S l * (θ)) = 1.25,
constraint norm of constraint ef (S l * (θ)) = 1.80,
constraint norm of constraint gh (S l * (θ)) = 1.50.
 このように、制約ノルム推定部(NN Regressor=ニューラル・ネットワーク回帰分析器)201は、回帰分析器(NN Regressor:ニューラル・ネットワーク回帰分析器)を用いて、入力パラメータ(θ)に対応する各制約の制約ノルム(S (θ))を推定する。 In this way, the constraint norm estimator (NN Regressor = neural network regression analyzer) 201 uses the regression analyzer (NN Regressor: neural network regression analyzer) to estimate the constraint norm (S l * (θ)) of each constraint corresponding to the input parameter (θ).
 制約ノルム(S (θ))は、各制約が、二次計画問題の最適解xの算出に利用されるアクティブ制約であるか、利用されない非アクティブ制約であるかを判別するための指標値である。 The constraint norm (S l * (θ)) is used to determine whether each constraint is an active constraint that is used to calculate the optimal solution x * of the quadratic programming problem or an inactive constraint that is not used. It is an index value.
 制約ノルム推定部(NN Regressor=ニューラル・ネットワーク回帰分析器)201が推定した入力パラメータ(θ)に対応する各制約の制約ノルム(S (θ))は、しきい値適用アクティブ制約選択部(Threshold)202に入力される。 The constraint norm (S l * (θ)) of each constraint corresponding to the input parameter (θ), as estimated by the constraint norm estimator (NN Regressor = neural network regression analyzer) 201, is input to the threshold application active constraint selection unit (Threshold) 202.
 しきい値適用アクティブ制約選択部(Threshold)202は、予め規定されたしきい値(λ)を適用して、二次計画問題において定義された制約の各々が、二次計画問題の最適解xの算出に利用されるアクティブ制約であるか、利用されない非アクティブ制約であるかの判別データであるアクティブ制約識別データ(S(θ))を生成する。 The threshold application active constraint selection unit (Threshold) 202 applies a predefined threshold value (λ) and generates active constraint identification data (S * (θ)), which is data indicating whether each constraint defined in the quadratic programming problem is an active constraint used in calculating the optimal solution x * of the quadratic programming problem or an inactive constraint that is not used.
 このアクティブ制約識別データ(S(θ))は、先に図5、図7、図8等を参照して説明したアクティブ制約識別データ(S(θ))と同様のデータであり、入力パラメータ(θ)に対応するアクティブ制約のみを選択可能とするデータである。 This active constraint identification data (S * (θ)) is data similar to the active constraint identification data (S * (θ)) previously described with reference to FIGS. This is data that enables selection of only the active constraint corresponding to the parameter (θ).
 図16を参照して、しきい値適用アクティブ制約選択部(Threshold)202の実行する処理の具体例について説明する。 A specific example of processing executed by the threshold application active constraint selection unit (Threshold) 202 will be described with reference to FIG.
 上述したように、しきい値適用アクティブ制約選択部(Threshold)202は、予め規定されたしきい値(λ)を適用して、二次計画問題において定義された制約の各々が、二次計画問題の最適解xの算出に利用されるアクティブ制約であるか、利用されない非アクティブ制約であるかの判別データであるアクティブ制約識別データ(S(θ))を生成する。 As described above, the threshold application active constraint selection unit (Threshold) 202 applies a predefined threshold value (λ) and generates active constraint identification data (S * (θ)), which is data indicating whether each constraint defined in the quadratic programming problem is an active constraint used in calculating the optimal solution x * of the quadratic programming problem or an inactive constraint that is not used.
 図16には、しきい値適用アクティブ制約選択部(Threshold)202に対する(a)入力データと、(b)出力データを示している。
 (a)入力データは、制約ノルム推定部(NN Regressor=ニューラル・ネットワーク回帰分析器)201が生成した入力パラメータ(θ)に対応する各制約の制約ノルム(S (θ))である。
 前述したように、制約ノルム(S (θ))は、二次計画問題の最適解xから各制約までのベクトル空間における距離に相当するノルム(Lノルム(ユークリッドノルム))である。
FIG. 16 shows (a) input data and (b) output data for the threshold application active constraint selection unit (Threshold) 202 .
(a) Input data is the constraint norm (S l * (θ)) of each constraint corresponding to the input parameter (θ) generated by the constraint norm estimator (NN Regressor=neural network regression analyzer) 201 .
As described above, the constraint norm (S l * (θ)) is a norm (L2 norm (Euclidean norm)) corresponding to the distance, in the vector space, from the optimal solution x * of the quadratic programming problem to each constraint.
 しきい値適用アクティブ制約選択部(Threshold)202は、この(a)入力データの各制約の制約ノルム(S (θ))について、予め規定したしきい値(λ)との比較処理を行う。 The threshold application active constraint selection unit (Threshold) 202 compares the constraint norm (S l * (θ)) of each constraint in this (a) input data with a predefined threshold value (λ).
 制約ノルム(S (θ))が予め規定したしきい値(λ)以上であれば、その制約は非アクティブ制約であると判断し、制約ノルムの値が予め規定したしきい値(λ)未満であれば、その制約はアクティブ制約であると判断し、判断結果に応じてアクティブ制約識別データ(S(θ))を生成する。
 すなわち、各制約がアクティブ制約であるか、利用されない非アクティブ制約であるかの判別データであるアクティブ制約識別データ(S(θ))を生成する。
 図16の(b)出力データとして示すアクティブ制約識別データ(S(θ))である。
If the constraint norm (S l * (θ)) is greater than or equal to the predefined threshold value (λ), the constraint is determined to be an inactive constraint; if the value of the constraint norm is less than the predefined threshold value (λ), the constraint is determined to be an active constraint. Active constraint identification data (S * (θ)) is generated according to the determination results.
That is, the unit generates active constraint identification data (S * (θ)), which is data indicating whether each constraint is an active constraint or an inactive constraint that is not used.
This is the active constraint identification data (S * (θ)) shown as (b) output data in FIG. 16.
 図16に示す例において、(a)入力データ中のパラメータ(θ)の各制約の制約ノルム(S (θ))は、
 制約abの制約ノルム(S (θ))=0.00、
 制約cdの制約ノルム(S (θ))=1.25、
 制約efの制約ノルム(S (θ))=1.80、
 制約ghの制約ノルム(S (θ))=1.50、
 このようなデータである。
In the example shown in FIG. 16, the constraint norm (S l * (θ)) of each constraint for the parameter (θ0) in (a) the input data is as follows:
constraint norm of constraint ab (S l * (θ)) = 0.00,
constraint norm of constraint cd (S l * (θ)) = 1.25,
constraint norm of constraint ef (S l * (θ)) = 1.80,
constraint norm of constraint gh (S l * (θ)) = 1.50.
 しきい値適用アクティブ制約選択部(Threshold)202は、この(a)入力データの各制約の制約ノルム(S (θ))について、予め規定したしきい値(λ)との比較処理を行う。
 例えばしきい値λ=0.20とした場合、制約abは、
 制約ノルム=0.00
 であり、これら制約abは、しきい値(λ=0.20)未満であり、アクティブ制約であると判定する。
The threshold application active constraint selection unit (Threshold) 202 compares the constraint norm (S l * (θ)) of each constraint in this (a) input data with a predefined threshold value (λ).
For example, when the threshold value λ=0.20, constraint ab has
constraint norm = 0.00,
which is less than the threshold value (λ=0.20), so constraint ab is determined to be an active constraint.
 その他の、制約cd,制約ef、制約ghは、いずれも、しきい値(λ=0.20)以上であり、非アクティブ制約であると判定する。 All of the other constraints cd, ef, and gh are equal to or greater than the threshold (λ=0.20) and are determined to be inactive constraints.
 しきい値適用アクティブ制約選択部(Threshold)202は、これらの判定結果に基づいてアクティブ制約識別データ(S(θ))を生成する。アクティブ制約識別データ(S(θ))は、入力パラメータ(θ)に対応するアクティブ制約に(1)、非アクティブ制約に(0)を対応付けて設定したデータであり、アクティブ制約のみを選択可能としたデータである。 The threshold application active constraint selection unit (Threshold) 202 generates active constraint identification data (S * (θ)) based on these determination results. The active constraint identification data (S * (θ)) assigns (1) to each active constraint and (0) to each inactive constraint for the input parameter (θ), so that only the active constraints can be selected.
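The 1/0 identification data and its use for picking out active constraints can be sketched as follows. The norm values and the threshold λ = 0.20 come from the FIG. 16 example; the matrix A and the helper names active_constraint_mask and select_active_rows are hypothetical and serve only to show how a mask could select rows of the constraint matrix.

```python
def active_constraint_mask(norms, threshold):
    """Return 1 for active constraints (norm < threshold), otherwise 0."""
    return [1 if value < threshold else 0 for value in norms]

def select_active_rows(A, mask):
    """Keep only the rows of A whose mask entry is 1."""
    return [row for row, flag in zip(A, mask) if flag == 1]

# Estimated norms for constraints ab, cd, ef, gh (FIG. 16 example).
estimated_norms = [0.00, 1.25, 1.80, 1.50]
mask = active_constraint_mask(estimated_norms, threshold=0.20)

# Hypothetical constraint matrix with one row per constraint.
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
active_rows = select_active_rows(A, mask)
```

Only the row belonging to constraint ab survives the selection, which is what allows the later solving step to work with a much smaller system.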
 しきい値適用アクティブ制約選択部(Threshold)202が生成したアクティブ制約識別データ(S(θ))は、線形システム解析部(Linear System Solver)203に入力される。 Active constraint identification data (S * (θ)) generated by the threshold applied active constraint selector (Threshold) 202 is input to a linear system solver (Linear System Solver) 203 .
 線形システム解析部(Linear System Solver)203は、アクティブ制約識別データ(S(θ))を用いて、二次計画問題の不等式制約「l≦Ax≦u」に含まれる制約からアクティブ制約のみを抽出し、これらをアクティブ等式制約とみなし、アクティブ等式制約を満たす最適解x(n次元ベクトル)を算出する処理を行う。 A linear system solver 203 uses the active constraint identification data (S * (θ)) to extract only the active constraints from the constraints included in the inequality constraints “l≦Ax≦u” of the quadratic programming problem. These are regarded as active equality constraints, and processing is performed to calculate the optimal solution x * (n-dimensional vector) that satisfies the active equality constraints.
 すなわち、抽出したアクティブ制約をアクティブ等式制約としてみなせば、二次計画問題を線形方程式に帰着させることが可能となり、線形方程式を解くことで高速な最適解xの算出が可能となる。 That is, if the extracted active constraints are regarded as active equality constraints, it becomes possible to reduce the quadratic programming problem to a linear equation, and by solving the linear equation, it becomes possible to calculate the optimum solution x * at high speed.
 このような処理を行うことで二次計画問題の最適解xの高速算出処理が実現され、ロボットの制御を迅速に行うことが可能となる。 By performing such processing, high-speed calculation processing of the optimal solution x * of the quadratic programming problem is realized, and the robot can be controlled quickly.
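Why treating the active constraints as equalities reduces the problem to a linear equation can be illustrated with a small sketch. It assumes the common objective form (1/2) x^T P x + q^T x, which this passage does not restate, and assumes the active rows A_act and their bound values b_act are already known; under those assumptions the optimality conditions form one linear (KKT) system. The function names and the sample problem are hypothetical.

```python
def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("KKT matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z

def solve_equality_qp(P, q, A_act, b_act):
    """Minimize 1/2 x^T P x + q^T x subject to A_act x = b_act via the KKT system."""
    n, m = len(q), len(b_act)
    # KKT matrix [[P, A_act^T], [A_act, 0]] with right-hand side [-q, b_act].
    kkt = [list(P[i]) + [A_act[j][i] for j in range(m)] for i in range(n)]
    kkt += [list(A_act[j]) + [0.0] * m for j in range(m)]
    z = solve_linear(kkt, [-v for v in q] + list(b_act))
    return z[:n]          # drop the Lagrange multipliers

P = [[1.0, 0.0], [0.0, 1.0]]
q = [-1.0, -1.0]
x_star = solve_equality_qp(P, q, A_act=[[1.0, 0.0]], b_act=[0.5])
```

In this sample the single active constraint pins x0 to 0.5 while x1 is free to take its unconstrained optimum, and the whole solve is one pass of elimination rather than an iterative QP routine.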
 本実施例では、先に図3~図9を参照して説明した構成や処理と異なり、制約のアクティブ、非アクティブの組み合わせに応じたラベル設定を行っていない。従って、二次計画問題における(b)制約関数に含まれる制約の数Nが多くなった場合のラベル数が指数関数的に増加するという問題を発生させることがない。 Unlike the configuration and processing described earlier with reference to FIGS. 3 to 9, this embodiment does not set labels according to the combination of active and inactive constraints. Therefore, there is no problem that the number of labels increases exponentially when the number N of constraints included in the (b) constraint function in the quadratic programming problem increases.
 先に説明したように、先に図3~図9を参照して説明した構成では、制約ab,制約cd,制約ef,制約ghの4つの制約を有する場合のラベル数は2^4=16であるが、例えば制約数が8になると、ラベル数は2^8=256となる。制約数が10になると、ラベル数は2^10=1024となる。 As described above, in the configuration described with reference to FIGS. 3 to 9, the number of labels is 2^4 = 16 when there are four constraints (constraint ab, constraint cd, constraint ef, and constraint gh); with 8 constraints the number of labels becomes 2^8 = 256, and with 10 constraints it becomes 2^10 = 1024.
 このように、制約の数Nが多くなってしまうと上記のラベルの数が指数関数的に増加してしまい、予測器(NN:ニューラル・ネットワーク)使用時のメモリ消費量や計算量が増加し、学習処理の効率低下、ロボット制御時の制御速度の低下といった問題を発生させる可能性がある。 As described above, when the number of constraints N increases, the number of labels increases exponentially, resulting in an increase in memory consumption and calculation amount when using a predictor (NN: neural network). , there is a possibility of causing problems such as a decrease in the efficiency of learning processing and a decrease in control speed during robot control.
 これに対して、図10~図16を参照して説明したノルムの値に基づいて各制約がアクティブ制約であるか非アクティブ制約であるかを判定する構成では、ラベルによる場合分けが不要である。 On the other hand, the configuration for determining whether each constraint is an active constraint or an inactive constraint based on the norm value described with reference to FIGS. 10 to 16 does not require classification by label. .
 すなわち、各制約のノルムがしきい値以上であるかしきい値未満であるかを判別するのみで各制約がアクティブ制約であるか非アクティブ制約であるかを判定する処理が可能となり、処理コストを低減させることが可能となる。
 結果として学習処理の効率が向上し、ロボット制御時の制御速度も向上させることが可能となる。
That is, it is possible to determine whether each constraint is an active constraint or an inactive constraint simply by determining whether the norm of each constraint is greater than or equal to the threshold or less than the threshold. can be reduced.
As a result, the efficiency of the learning process is improved, and the control speed during robot control can also be improved.
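The difference in scaling between the two approaches can be checked with a few lines of Python. The sketch assumes that a label-based predictor needs one output per active/inactive combination (2^N) while a norm regressor needs one output per constraint (N); the function name output_sizes is hypothetical.

```python
def output_sizes(num_constraints):
    """Outputs needed by a label classifier (2**N) versus a norm regressor (N)."""
    return 2 ** num_constraints, num_constraints

# Constraint counts used as examples in the text.
sizes = {n: output_sizes(n) for n in (4, 8, 10)}
```

For 4, 8, and 10 constraints the label count grows to 16, 256, and 1024, whereas the number of estimated norms grows only to 4, 8, and 10.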
  [4.情報処理装置のハードウェア構成例について]
 次に本開示の情報処理装置のハードウェア構成例について説明する。
[4. Hardware configuration example of information processing device]
Next, a hardware configuration example of the information processing apparatus of the present disclosure will be described.
 図17は、本開示の情報処理装置のハードウェア構成の一構成例を示すブロック図である。
 情報処理装置は、例えば先に図3や図10を参照して説明した学習処理部、あるいは図8や図14を参照して説明した制御情報生成部が実行する処理を実行可能な装置である。
FIG. 17 is a block diagram showing one configuration example of the hardware configuration of the information processing apparatus of the present disclosure.
The information processing device is, for example, a device capable of executing processing executed by the learning processing unit described above with reference to FIGS. 3 and 10, or the control information generation unit described with reference to FIGS. 8 and 14. .
 なお、情報処理装置は、例えばロボットに装着された装置や、ロボットの制御を行うためにロボットと通信可能な装置として構成することが可能である。
 図17に示す情報処理装置の各構成部について説明する。
The information processing device can be configured as, for example, a device attached to the robot or a device capable of communicating with the robot to control the robot.
Each component of the information processing apparatus shown in FIG. 17 will be described.
 CPU(Central Processing Unit)301は、ROM(Read Only Memory)302、または記憶部308に記憶されているプログラムに従って各種の処理を実行するデータ処理部として機能する。例えば、上述した実施例において説明したシーケンスに従った処理を実行する。RAM(Random Access Memory)303には、CPU301が実行するプログラムやデータなどが記憶される。これらのCPU301、ROM302、およびRAM303は、バス304により相互に接続されている。 A CPU (Central Processing Unit) 301 functions as a data processing section that executes various processes according to programs stored in a ROM (Read Only Memory) 302 or a storage section 308 . For example, the process according to the sequence described in the above embodiment is executed. A RAM (Random Access Memory) 303 stores programs and data executed by the CPU 301 . These CPU 301 , ROM 302 and RAM 303 are interconnected by a bus 304 .
 CPU301はバス304を介して入出力インタフェース305に接続され、入出力インタフェース305には、各種スイッチ、キーボード、タッチパネル、マウス、マイクロホン、さらに、ユーザ入力部やカメラ、LiDAR等各種センサ321の状況データ取得部などよりなる入力部306、ディスプレイ、スピーカーなどよりなる出力部307が接続されている。
 また、出力部307は、ロボット等の駆動を行う駆動部322に対する駆動情報も出力する。
The CPU 301 is connected to an input/output interface 305 via the bus 304. Connected to the input/output interface 305 are an input unit 306, which includes various switches, a keyboard, a touch panel, a mouse, a microphone, a user input unit, and a status data acquisition unit for various sensors 321 such as a camera and LiDAR, and an output unit 307, which includes a display, a speaker, and the like.
The output unit 307 also outputs driving information to a driving unit 322 that drives a robot or the like.
 CPU301は、入力部306から入力される指令や状況データ等を入力し、各種の処理を実行し、処理結果を例えば出力部307に出力する。
 入出力インタフェース305に接続されている記憶部308は、例えばフラッシュメモリ、ハードディスク等からなり、CPU301が実行するプログラムや各種のデータを記憶する。通信部309は、インターネットやローカルエリアネットワークなどのネットワークを介したデータ通信の送受信部として機能し、外部の装置と通信する。
 また、CPUの他、カメラから入力される画像情報などの専用処理部としてGPU(Graphics Processing Unit)を備えてもよい。
The CPU 301 receives commands, situation data, and the like input from the input unit 306 , executes various processes, and outputs processing results to the output unit 307 , for example.
A storage unit 308 connected to the input/output interface 305 is composed of, for example, a flash memory, a hard disk, or the like, and stores programs executed by the CPU 301 and various data. A communication unit 309 functions as a transmission/reception unit for data communication via a network such as the Internet or a local area network, and communicates with an external device.
In addition to the CPU, a GPU (Graphics Processing Unit) may be provided as a dedicated processing unit for image information input from a camera.
 入出力インタフェース305に接続されているドライブ310は、磁気ディスク、光ディスク、光磁気ディスク、あるいはメモリカード等の半導体メモリなどのリムーバブルメディア311を駆動し、データの記録あるいは読み取りを実行する。 A drive 310 connected to the input/output interface 305 drives a removable medium 311 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory such as a memory card to record or read data.
  [5.本開示の構成のまとめ]
 以上、特定の実施例を参照しながら、本開示の実施例について詳解してきた。しかしながら、本開示の要旨を逸脱しない範囲で当業者が実施例の修正や代用を成し得ることは自明である。すなわち、例示という形態で本発明を開示してきたのであり、限定的に解釈されるべきではない。本開示の要旨を判断するためには、特許請求の範囲の欄を参酌すべきである。
[5. Summary of the configuration of the present disclosure]
Embodiments of the present disclosure have been described in detail above with reference to specific embodiments. However, it is obvious that those skilled in the art can modify or substitute the embodiments without departing from the gist of this disclosure. That is, the present invention has been disclosed in the form of examples and should not be construed as limiting. In order to determine the gist of the present disclosure, the scope of claims should be considered.
 なお、本明細書において開示した技術は、以下のような構成をとることができる。
 (1) 入力パラメータに対応する二次計画問題の最適解を算出する二次計画問題最適解算出部と、
 前記二次計画問題の制約関数によって定義される制約各々と前記最適解とのノルムである制約ノルムを算出する制約ノルム算出部と、
 前記入力パラメータと前記制約ノルムとの組データを学習データとした学習処理を実行し、様々な入力パラメータに応じた制約ノルムを推定する制約ノルム推定器を生成する学習処理実行部を有する情報処理装置。
In addition, the technique disclosed in this specification can take the following configurations.
(1) An information processing apparatus including:
a quadratic programming problem optimal solution calculation unit that calculates an optimal solution of a quadratic programming problem corresponding to an input parameter;
a constraint norm calculation unit that calculates a constraint norm, which is a norm between the optimal solution and each constraint defined by the constraint function of the quadratic programming problem; and
a learning processing execution unit that executes learning processing using pairs of the input parameter and the constraint norm as learning data, and generates a constraint norm estimator that estimates constraint norms for various input parameters.
 (2) 前記制約ノルム算出部が算出する前記制約ノルムは、
 前記二次計画問題の制約関数によって定義される制約各々が、前記二次計画問題の目的関数の最適解の算出に利用されるアクティブ制約であるか、前記二次計画問題の目的関数の最適解の算出に利用されない非アクティブ制約であるかを識別するための制約アクティブ性判定指標値である(1)に記載の情報処理装置。
(2) The information processing apparatus according to (1), wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value for identifying whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint used in calculating the optimal solution of the objective function of the quadratic programming problem or an inactive constraint not used in calculating the optimal solution of the objective function of the quadratic programming problem.
 (3) 前記制約ノルム算出部が算出する前記制約ノルムは、
 前記制約ノルムが予め規定されたしきい値未満の制約をアクティブ制約と判定し、
 前記制約ノルムが予め規定されたしきい値以上の制約を非アクティブ制約と判定するための制約アクティブ性判定指標値である(2)に記載の情報処理装置。
(3) The information processing apparatus according to (2), wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value by which a constraint whose constraint norm is less than a predefined threshold value is determined to be an active constraint, and a constraint whose constraint norm is greater than or equal to the predefined threshold value is determined to be an inactive constraint.
(4) The information processing device according to any one of (1) to (3), wherein the constraint norm for various input parameters, estimated by the constraint norm estimator generated by the learning processing unit, is a constraint activity determination index value for identifying whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution.
(5) The information processing device according to any one of (1) to (4), wherein the constraint norm calculation unit calculates, as the constraint norm, an L2 norm (Euclidean norm) corresponding to the distance in vector space from the optimal solution of the quadratic programming problem to the constraint.
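One way to picture the L2 constraint norm in (5) is the point-to-hyperplane distance. The sketch below assumes, for illustration only, that each constraint is a linear inequality a·x ≤ b, so that the distance from the optimal solution to the constraint boundary is |a·x − b| / ‖a‖₂. The linear form and the function name are assumptions, not taken from the specification.

```python
import math


def constraint_norm(a, b, x_opt):
    """L2 distance from x_opt to the boundary a . x = b of the constraint a . x <= b."""
    dot = sum(ai * xi for ai, xi in zip(a, x_opt))
    a_len = math.sqrt(sum(ai * ai for ai in a))
    return abs(dot - b) / a_len


# The origin is 2 units away from the line 3x + 4y = 10.
far = constraint_norm([3.0, 4.0], 10.0, [0.0, 0.0])
# The point (1, 5) lies on the boundary x = 1, so the norm is zero.
on_boundary = constraint_norm([1.0, 0.0], 1.0, [1.0, 5.0])
print(far, on_boundary)
```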
(6) The information processing device according to any one of (1) to (5), wherein the quadratic programming problem optimal solution calculation unit calculates, as the optimal solution, a solution that satisfies the constraints given by the constraint function of the quadratic programming problem and minimizes the objective function of the quadratic programming problem.
(7) The information processing device according to any one of (1) to (6), wherein the learning processing unit generates a constraint norm estimator configured as a neural network.
(8) The information processing device according to any one of (1) to (7), wherein the learning processing unit generates a constraint norm estimator configured as a neural network that performs regression analysis.
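A regression-type constraint norm estimator of the kind described in (8) might look like the following minimal sketch. It trains a very small one-hidden-layer network on a hypothetical toy problem (minimize (x - θ)² subject to -1 ≤ x ≤ 1). The network size, learning rate, toy problem, and all names are assumptions for illustration; the specification does not prescribe them.

```python
import math
import random


def make_dataset(n=41):
    """(theta, constraint norms) pairs for the toy problem described above."""
    data = []
    for k in range(n):
        theta = -2.0 + 4.0 * k / (n - 1)
        x_opt = max(-1.0, min(1.0, theta))
        data.append((theta, [1.0 - x_opt, x_opt + 1.0]))
    return data


class TinyRegressor:
    """One-hidden-layer network: theta -> tanh layer -> two constraint norms."""

    def __init__(self, hidden=8, seed=0):
        rng = random.Random(seed)
        self.w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.b1 = [0.0] * hidden
        self.w2 = [[rng.uniform(-1, 1) for _ in range(hidden)] for _ in range(2)]
        self.b2 = [0.0, 0.0]

    def forward(self, theta):
        h = [math.tanh(w * theta + b) for w, b in zip(self.w1, self.b1)]
        y = [sum(w * hj for w, hj in zip(row, h)) + b
             for row, b in zip(self.w2, self.b2)]
        return h, y

    def predict(self, theta):
        return self.forward(theta)[1]

    def loss(self, data):
        """Mean over samples of the summed squared error of both outputs."""
        total = 0.0
        for theta, target in data:
            y = self.predict(theta)
            total += sum((yi - ti) ** 2 for yi, ti in zip(y, target))
        return total / len(data)

    def train_step(self, data, lr):
        """One full-batch gradient-descent step on the loss above."""
        hidden = len(self.w1)
        gw1 = [0.0] * hidden
        gb1 = [0.0] * hidden
        gw2 = [[0.0] * hidden for _ in range(2)]
        gb2 = [0.0, 0.0]
        for theta, target in data:
            h, y = self.forward(theta)
            err = [2.0 * (yi - ti) / len(data) for yi, ti in zip(y, target)]
            for o in range(2):
                gb2[o] += err[o]
                for j in range(hidden):
                    gw2[o][j] += err[o] * h[j]
            for j in range(hidden):
                dh = sum(err[o] * self.w2[o][j] for o in range(2))
                dpre = dh * (1.0 - h[j] ** 2)  # derivative of tanh
                gw1[j] += dpre * theta
                gb1[j] += dpre
        for j in range(hidden):
            self.w1[j] -= lr * gw1[j]
            self.b1[j] -= lr * gb1[j]
            for o in range(2):
                self.w2[o][j] -= lr * gw2[o][j]
        for o in range(2):
            self.b2[o] -= lr * gb2[o]


data = make_dataset()
model = TinyRegressor()
loss_before = model.loss(data)
for _ in range(500):
    model.train_step(data, lr=0.01)
loss_after = model.loss(data)
print(loss_before, loss_after)
```

Because the estimator outputs continuous norms rather than class labels, the active or inactive decision can be deferred to a separate threshold comparison.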
(9) The information processing device according to any one of (1) to (8), further comprising
a quadratic programming problem standardized model generation unit that generates a quadratic programming problem standardized model corresponding to the input parameters,
wherein the quadratic programming problem optimal solution calculation unit calculates the optimal solution using the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit.
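For reference, a quadratic programming problem is commonly written in the standard form below. The symbols $Q$, $c$, $A$, $b$ and their dependence on the input parameters $\theta$ are notation assumed here for illustration; the formulation used in the specification may differ.

```latex
\begin{aligned}
\min_{x}\quad & \tfrac{1}{2}\, x^{\top} Q(\theta)\, x + c(\theta)^{\top} x \\
\text{subject to}\quad & A(\theta)\, x \le b(\theta)
\end{aligned}
```

Under this reading, generating a standardized model amounts to mapping the input parameters $\theta$ to the matrices and vectors that a general-purpose QP solver accepts.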
(10) An information processing device comprising:
a constraint norm estimator that estimates, for each constraint defined by a constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection unit that, by comparing the constraint norm estimated by the constraint norm estimator with a predefined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution; and
a linear system analysis unit that uses the constraint activity analysis information generated by the active constraint selection unit to select only the active constraints and calculate the optimal solution of the quadratic programming problem.
(11) The information processing device according to (10), wherein the constraint norm estimator is generated by a learning process using, as learning data, paired data of various input parameters and the constraint norm.
(12) The information processing device according to (10) or (11), wherein the constraint norm estimator is configured as a neural network.
(13) The information processing device according to any one of (10) to (12), wherein the constraint norm estimator is configured as a neural network that performs regression analysis.
(14) The information processing device according to any one of (10) to (13), wherein the linear system analysis unit selects only the active constraints of the quadratic programming problem, converts the quadratic programming problem into linear equations, and calculates the optimal solution by solving the linear equations.
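The conversion to linear equations can be illustrated with the Karush-Kuhn-Tucker (KKT) conditions. Assume, as notation introduced here and not taken from the specification, that the problem is $\min_x \tfrac{1}{2} x^{\top} Q x + c^{\top} x$ subject to $A x \le b$, and that $\mathcal{A}$ denotes the set of active constraints. Stationarity of the Lagrangian together with $A_{\mathcal{A}} x = b_{\mathcal{A}}$ then gives a single linear system:

```latex
\begin{bmatrix} Q & A_{\mathcal{A}}^{\top} \\ A_{\mathcal{A}} & 0 \end{bmatrix}
\begin{bmatrix} x^{*} \\ \lambda_{\mathcal{A}} \end{bmatrix}
=
\begin{bmatrix} -c \\ b_{\mathcal{A}} \end{bmatrix}
```

Here $\lambda_{\mathcal{A}}$ are the Lagrange multipliers of the active constraints. If the active set is identified correctly, solving this system once yields the optimal solution without an iterative search over candidate active sets.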
(15) The information processing device according to any one of (10) to (14), wherein the linear system analysis unit extracts only the active constraints from the inequality constraints of the quadratic programming problem, regards the extracted constraints as active equality constraints, and calculates an optimal solution that satisfies the active equality constraints.
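Treating the active constraints as equalities, as in (15), can be sketched in code. The example below assumes a problem of the form minimize ½ xᵀQx + cᵀx subject to Ax ≤ b (notation assumed for illustration) and solves the resulting linear system with a plain Gaussian elimination. The function names and the numerical example are hypothetical, and the sketch omits checks such as the sign of the Lagrange multipliers.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("linear system is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def solve_qp_with_active_set(Q, c, A, b, active):
    """Minimize 0.5 x^T Q x + c^T x subject to A[i] . x = b[i] for i in active."""
    n, m = len(c), len(active)
    kkt = [[0.0] * (n + m) for _ in range(n + m)]
    rhs = [0.0] * (n + m)
    for i in range(n):
        for j in range(n):
            kkt[i][j] = Q[i][j]
        rhs[i] = -c[i]
    for k, idx in enumerate(active):
        for j in range(n):
            kkt[n + k][j] = A[idx][j]  # active constraint rows
            kkt[j][n + k] = A[idx][j]  # their transpose
        rhs[n + k] = b[idx]
    solution = solve_linear_system(kkt, rhs)
    return solution[:n]  # drop the Lagrange multipliers


# Minimize (x0 - 2)^2 + (x1 - 2)^2 subject to x0 + x1 <= 2 and x0 <= 5.
Q = [[2.0, 0.0], [0.0, 2.0]]
c = [-4.0, -4.0]
A = [[1.0, 1.0], [1.0, 0.0]]
b = [2.0, 5.0]
x_active = solve_qp_with_active_set(Q, c, A, b, active=[0])
x_free = solve_qp_with_active_set(Q, c, A, b, active=[])
print(x_active, x_free)
```

With only the first constraint marked active, the solution is the projection of the unconstrained minimizer (2, 2) onto the line x0 + x1 = 2; the inactive constraint x0 ≤ 5 never enters the linear system.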
(16) An information processing method executed in an information processing device, the method comprising:
a quadratic programming problem optimal solution calculation step in which a quadratic programming problem optimal solution calculation unit calculates an optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm, which is a norm between the optimal solution and each constraint defined by a constraint function of the quadratic programming problem; and
a learning process execution step in which a learning process execution unit executes a learning process using, as learning data, paired data of the input parameters and the constraint norm, and generates a constraint norm estimator that estimates constraint norms for various input parameters.
(17) An information processing method executed in an information processing device, the method comprising:
a constraint norm estimation step in which a constraint norm estimator estimates, for each constraint defined by a constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection step in which an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predefined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution; and
a linear system analysis step in which a linear system analysis unit uses the constraint activity analysis information generated by the active constraint selection unit to select only the active constraints and calculate the optimal solution of the quadratic programming problem.
(18) A program that causes an information processing device to execute information processing, the program causing:
a quadratic programming problem optimal solution calculation unit to execute a quadratic programming problem optimal solution calculation step of calculating an optimal solution of a quadratic programming problem corresponding to input parameters;
a constraint norm calculation unit to execute a constraint norm calculation step of calculating a constraint norm, which is a norm between the optimal solution and each constraint defined by a constraint function of the quadratic programming problem; and
a learning process execution unit to execute a learning process execution step of executing a learning process using, as learning data, paired data of the input parameters and the constraint norm, and generating a constraint norm estimator that estimates constraint norms for various input parameters.
(19) A program that causes an information processing device to execute information processing, the program causing:
a constraint norm estimator to execute a constraint norm estimation step of estimating, for each constraint defined by a constraint function of a quadratic programming problem, a constraint norm according to input parameters;
an active constraint selection unit to execute an active constraint selection step of generating, by comparing the constraint norm estimated by the constraint norm estimator with a predefined threshold, constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution; and
a linear system analysis unit to execute a linear system analysis step of using the constraint activity analysis information generated by the active constraint selection unit to select only the active constraints and calculate the optimal solution of the quadratic programming problem.
Note that the series of processes described in this specification can be executed by hardware, by software, or by a combination of both. When the processes are executed by software, a program in which the processing sequence is recorded can be installed in memory in a computer built into dedicated hardware and executed, or the program can be installed and executed on a general-purpose computer capable of executing various processes. For example, the program can be recorded in advance on a recording medium. Besides being installed on a computer from a recording medium, the program can be received via a network such as a LAN (Local Area Network) or the Internet and installed on a recording medium such as a built-in hard disk.
The various processes described in this specification may be executed not only in time series in the order described, but also in parallel or individually, depending on the processing capability of the device that executes them or as needed. In this specification, a system is a logical collection of a plurality of devices, and the constituent devices are not necessarily housed in the same enclosure.
As described above, the configuration of one embodiment of the present disclosure realizes a device and a method that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint, enabling high-speed calculation of the optimal solution of the quadratic programming problem.
Specifically, for example, the configuration includes a constraint norm estimator that estimates, for each constraint of the quadratic programming problem, a norm according to input parameters, and an active constraint selection unit that, by comparing the estimated constraint norm with a predefined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate the optimal solution. A linear analysis unit uses the constraint activity analysis information to select the active constraints and calculate the optimal solution of the quadratic programming problem.
This configuration realizes a device and a method that efficiently select the active constraints of a quadratic programming problem using the norm of each constraint, enabling high-speed calculation of the optimal solution of the quadratic programming problem.
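The overall flow summarized above (estimate the constraint norms, compare them with a threshold, then solve using only the active constraints) can be sketched end to end on a hypothetical one-dimensional problem: minimize (x - θ)² subject to -1 ≤ x ≤ 1. The estimator below is a stand-in that returns exact norms; a trained regressor would return approximations. All names and values are assumptions for illustration.

```python
def estimate_constraint_norms(theta):
    """Stand-in for a trained constraint norm estimator (hypothetical).

    Constraint 0 is x <= 1 and constraint 1 is x >= -1.
    """
    x_opt = max(-1.0, min(1.0, theta))
    return [abs(1.0 - x_opt), abs(x_opt + 1.0)]


def select_active(norms, threshold):
    """Constraints whose estimated norm is below the threshold are active."""
    return [i for i, norm in enumerate(norms) if norm < threshold]


def solve_with_active_set(theta, active):
    """Solve the toy problem with active constraints treated as equalities."""
    if 0 in active:   # x <= 1 is active, so x = 1
        return 1.0
    if 1 in active:   # x >= -1 is active, so x = -1
        return -1.0
    return theta      # no active constraint: unconstrained minimizer


def control_step(theta, threshold=0.05):
    """Estimate norms, pick active constraints, and solve once."""
    norms = estimate_constraint_norms(theta)
    active = select_active(norms, threshold)
    return solve_with_active_set(theta, active)


results = [control_step(theta) for theta in (-3.0, 0.2, 1.5)]
print(results)
```

In this sketch the solve step involves no search over candidate active sets, which is where the speed-up claimed for the configuration would come from.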
10 Robot
20 Travel route
30 Quadratic programming problem optimal solution calculation device
31 Quadratic programming problem standardized model generation unit (QP Modeling)
32 Quadratic programming problem standardized model optimal solution calculation unit (QP Solver)
40 Learning processing unit
50 Learning data set generation unit
51 Quadratic programming problem standardized model generation unit (QP Modeling)
52 Quadratic programming problem standardized model optimal solution calculation unit (QP Solver)
53 Active constraint identification data generation unit
60 Predictor generation unit
61 Learning data set (θ, S*(θ))
62 Class classification processing unit (NN Classifier = neural network classifier)
80 Control information generation unit
81 Class classification processing unit (NN Classifier = neural network classifier)
82 Linear system analysis unit (Linear System Solver)
100 Learning processing unit
110 Learning data set generation unit
111 Quadratic programming problem standardized model generation unit (QP Modeling)
112 Quadratic programming problem standardized model optimal solution calculation unit (QP Solver)
113 Constraint norm calculation unit (Calc Norm)
120 Constraint norm estimator generation unit
121 Learning data set (θ, S_l*(θ))
122 Constraint norm estimator generation learning process execution unit (regression analyzer (NN Regressor) generation unit)
200 Control information generation unit
201 Constraint norm estimation unit (NN Regressor = neural network regression analyzer)
202 Threshold-applying active constraint selection unit (Threshold)
203 Linear system analysis unit (Linear System Solver)
301 CPU
302 ROM
303 RAM
304 Bus
305 Input/output interface
306 Input unit
307 Output unit
308 Storage unit
309 Communication unit
310 Drive
311 Removable medium
321 Sensor
322 Drive unit

Claims (19)

  1.  An information processing device comprising:
     a quadratic programming problem optimal solution calculation unit that calculates an optimal solution of a quadratic programming problem corresponding to input parameters;
     a constraint norm calculation unit that calculates a constraint norm, which is a norm between the optimal solution and each constraint defined by a constraint function of the quadratic programming problem; and
     a learning process execution unit that executes a learning process using, as learning data, paired data of the input parameters and the constraint norm, and generates a constraint norm estimator that estimates constraint norms for various input parameters.
  2.  The information processing device according to claim 1, wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value for identifying whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution.
  3.  The information processing device according to claim 2, wherein the constraint norm calculated by the constraint norm calculation unit is a constraint activity determination index value by which
     a constraint whose constraint norm is less than a predefined threshold is determined to be an active constraint, and
     a constraint whose constraint norm is equal to or greater than the predefined threshold is determined to be an inactive constraint.
  4.  The information processing device according to claim 1, wherein the constraint norm for various input parameters, estimated by the constraint norm estimator generated by the learning processing unit, is a constraint activity determination index value for identifying whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution.
  5.  The information processing device according to claim 1, wherein the constraint norm calculation unit calculates, as the constraint norm, an L2 norm (Euclidean norm) corresponding to the distance in vector space from the optimal solution of the quadratic programming problem to the constraint.
  6.  The information processing device according to claim 1, wherein the quadratic programming problem optimal solution calculation unit calculates, as the optimal solution, a solution that satisfies the constraints given by the constraint function of the quadratic programming problem and minimizes the objective function of the quadratic programming problem.
  7.  The information processing device according to claim 1, wherein the learning processing unit generates a constraint norm estimator configured as a neural network.
  8.  The information processing device according to claim 1, wherein the learning processing unit generates a constraint norm estimator configured as a neural network that performs regression analysis.
  9.  The information processing device according to claim 1, further comprising
     a quadratic programming problem standardized model generation unit that generates a quadratic programming problem standardized model corresponding to the input parameters,
     wherein the quadratic programming problem optimal solution calculation unit calculates the optimal solution using the quadratic programming problem standardized model generated by the quadratic programming problem standardized model generation unit.
  10.  An information processing device comprising:
     a constraint norm estimator that estimates, for each constraint defined by a constraint function of a quadratic programming problem, a constraint norm according to input parameters;
     an active constraint selection unit that, by comparing the constraint norm estimated by the constraint norm estimator with a predefined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution; and
     a linear system analysis unit that uses the constraint activity analysis information generated by the active constraint selection unit to select only the active constraints and calculate the optimal solution of the quadratic programming problem.
  11.  The information processing device according to claim 10, wherein the constraint norm estimator is generated by a learning process using, as learning data, paired data of various input parameters and the constraint norm.
  12.  The information processing device according to claim 10, wherein the constraint norm estimator is configured as a neural network.
  13.  The information processing device according to claim 10, wherein the constraint norm estimator is configured as a neural network that performs regression analysis.
  14.  The information processing device according to claim 10, wherein the linear system analysis unit selects only the active constraints of the quadratic programming problem, converts the quadratic programming problem into linear equations, and calculates the optimal solution by solving the linear equations.
  15.  The information processing device according to claim 10, wherein the linear system analysis unit extracts only the active constraints from the inequality constraints of the quadratic programming problem, regards the extracted constraints as active equality constraints, and calculates an optimal solution that satisfies the active equality constraints.
  16.  An information processing method executed in an information processing device, the method comprising:
     a quadratic programming problem optimal solution calculation step in which a quadratic programming problem optimal solution calculation unit calculates an optimal solution of a quadratic programming problem corresponding to input parameters;
     a constraint norm calculation step in which a constraint norm calculation unit calculates a constraint norm, which is a norm between the optimal solution and each constraint defined by a constraint function of the quadratic programming problem; and
     a learning process execution step in which a learning process execution unit executes a learning process using, as learning data, paired data of the input parameters and the constraint norm, and generates a constraint norm estimator that estimates constraint norms for various input parameters.
  17.  An information processing method executed in an information processing device, the method comprising:
     a constraint norm estimation step in which a constraint norm estimator estimates, for each constraint defined by a constraint function of a quadratic programming problem, a constraint norm according to input parameters;
     an active constraint selection step in which an active constraint selection unit, by comparing the constraint norm estimated by the constraint norm estimator with a predefined threshold, generates constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint, which is used to calculate the optimal solution of the objective function of the quadratic programming problem, or an inactive constraint, which is not used to calculate that optimal solution; and
     a linear system analysis step in which a linear system analysis unit uses the constraint activity analysis information generated by the active constraint selection unit to select only the active constraints and calculate the optimal solution of the quadratic programming problem.
  18.  A program that causes an information processing device to execute information processing, the program causing:
     a quadratic programming problem optimal solution calculation unit to execute a quadratic programming problem optimal solution calculation step of calculating an optimal solution of a quadratic programming problem corresponding to input parameters;
     a constraint norm calculation unit to execute a constraint norm calculation step of calculating a constraint norm, which is a norm between the optimal solution and each constraint defined by a constraint function of the quadratic programming problem; and
     a learning process execution unit to execute a learning process execution step of executing a learning process using, as learning data, paired data of the input parameters and the constraint norm, and generating a constraint norm estimator that estimates constraint norms for various input parameters.
  19.  A program that causes an information processing device to execute information processing, the program causing:
     a constraint norm estimator to execute a constraint norm estimation step of estimating, for each constraint defined by a constraint function of a quadratic programming problem, a constraint norm corresponding to input parameters;
     an active constraint selection unit to execute an active constraint selection step of generating, by comparing the constraint norm estimated by the constraint norm estimator with a predetermined threshold, constraint activity analysis information that makes it possible to identify whether each constraint defined by the constraint function of the quadratic programming problem is an active constraint that is used to calculate the optimal solution of the objective function of the quadratic programming problem or an inactive constraint that is not used to calculate that optimal solution; and
     a linear system analysis unit to execute a linear system analysis step of selecting only the active constraints by using the constraint activity analysis information generated by the active constraint selection unit, and calculating the optimal solution of the quadratic programming problem.
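The three steps of the claim above (estimate constraint norms, threshold them to pick active constraints, solve a linear system over the active constraints only) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the specification: the constraints are linear inequalities A x <= b, the constraint norm is taken to be the Euclidean distance from the optimal solution to each constraint boundary, a norm at or below the threshold is treated as "active", the estimated norms are hard-coded stand-ins for the output of a learned estimator, and all function names are hypothetical.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def solve_linear(M: Matrix, rhs: Sequence[float]) -> List[float]:
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(M)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular linear system")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= f * aug[col][k]
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def select_active(norms: Sequence[float], threshold: float) -> List[bool]:
    """Assumed rule: a constraint whose estimated norm is small is active."""
    return [n <= threshold for n in norms]


def solve_with_active_set(Q: Matrix, c: Sequence[float], A: Matrix,
                          b: Sequence[float], active: Sequence[bool]) -> List[float]:
    """Minimise 0.5 x^T Q x + c^T x, holding active rows of A x <= b as equalities.

    Builds the KKT system [[Q, A_w^T], [A_w, 0]] [x; lam] = [-c; b_w]
    using only the rows flagged as active, and returns x.
    """
    n = len(c)
    rows = [i for i, flag in enumerate(active) if flag]
    m = len(rows)
    kkt = [[0.0] * (n + m) for _ in range(n + m)]
    for i in range(n):
        for j in range(n):
            kkt[i][j] = Q[i][j]
    for k, r in enumerate(rows):
        for j in range(n):
            kkt[n + k][j] = A[r][j]
            kkt[j][n + k] = A[r][j]
    rhs = [-ci for ci in c] + [b[r] for r in rows]
    return solve_linear(kkt, rhs)[:n]


def constraint_norms(A: Matrix, b: Sequence[float], x: Sequence[float]) -> List[float]:
    """Euclidean distance from x to each hyperplane a_i^T x = b_i (assumed norm)."""
    out = []
    for a_i, b_i in zip(A, b):
        residual = b_i - sum(a * xi for a, xi in zip(a_i, x))
        out.append(abs(residual) / sum(a * a for a in a_i) ** 0.5)
    return out


# Toy problem: minimise 0.5*||x - (2, 0.5)||^2 subject to x0 <= 1 and x1 <= 1.
Q = [[1.0, 0.0], [0.0, 1.0]]
c = [-2.0, -0.5]
A = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 1.0]

estimated_norms = [0.02, 0.48]  # stand-in for the learned estimator's output
active = select_active(estimated_norms, threshold=0.1)
x_opt = solve_with_active_set(Q, c, A, b, active)
true_norms = constraint_norms(A, b, x_opt)
```

In this toy case only the first constraint is flagged as active, so the linear system has three unknowns (two variables and one multiplier) instead of four. The constraint norms recomputed at the resulting solution can be compared with the estimated ones as a sanity check on the selection.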
PCT/JP2022/006846 2021-07-01 2022-02-21 Information processing device, information processing method, and program WO2023276255A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2021-110328 2021-07-01
JP2021110328 2021-07-01

Publications (1)

Publication Number Publication Date
WO2023276255A1 (en) 2023-01-05

Family

ID=84692252

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2022/006846 WO2023276255A1 (en) 2021-07-01 2022-02-21 Information processing device, information processing method, and program

Country Status (1)

Country Link
WO (1) WO2023276255A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2020535539A (en) * 2018-02-05 2020-12-03 三菱電機株式会社 Predictive controllers, vehicles and methods to control the system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU SHUDI; GUO YE; TANG WENJUN; SUN HONGBIN; HUANG WENQI: "Predicting Active Constraints Set in Security-Constrained Optimal Power Flow via Deep Neural Network", 2021 IEEE POWER & ENERGY SOCIETY GENERAL MEETING (PESGM), IEEE, 26 July 2021 (2021-07-26), pages 01 - 05, XP034053084, DOI: 10.1109/PESGM46819.2021.9637964 *

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22832422

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE