CN111915060A - Processing method and processing device for combined optimization task - Google Patents


Info

Publication number
CN111915060A
Authority
CN
China
Prior art keywords
solution
target
sample
solution set
candidate
Prior art date
Legal status
Pending
Application number
CN202010612243.6A
Other languages
Chinese (zh)
Inventor
甄慧玲
王振坤
李希君
张青富
袁明轩
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Priority to CN202010612243.6A
Publication of CN111915060A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q 10/00 Administration; Management
    • G06Q 10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G06N 20/20 Ensemble learning
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods

Abstract

The application provides a processing method and a processing apparatus for a combined optimization task, which belong to the technical field of artificial intelligence. The processing method includes: a computing device obtains, from a memory, a first solution set for a target problem, where the target problem is the problem that needs to be solved to accomplish a target combination optimization task; the computing device groups the candidate solutions in the first solution set to obtain a plurality of solution sets; the computing device inputs the plurality of solution sets into a pre-trained neural network model to obtain a plurality of target confidences, where each of the target confidences represents the confidence that the corresponding solution set includes a target solution; the computing device selects, from the plurality of solution sets according to the target confidences, the solution set corresponding to the largest target confidence to obtain a second solution set; and the computing device selects the target solution from the second solution set. With this technical method, the processing time of the combined optimization task can be shortened, and the processing efficiency of the combined optimization task can be improved.

Description

Processing method and processing device for combined optimization task
Technical Field
The present application relates to the field of artificial intelligence, and more particularly, to a processing method and a processing apparatus for a combination optimization task.
Background
Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is the branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Combinatorial optimization has always been a difficult problem in the computer field: a dedicated algorithm is needed to obtain N optimal or near-optimal combination schemes from a large number of alternative results. At present, a branch-and-bound algorithm is usually adopted when solving a combinatorial optimization task, and a pre-trained neural network model can learn, through imitation learning, the branch-and-bound path characteristics of a large number of problems of similar types, so that in the solving process of a new problem the node selection of the traditional branch-and-bound method is replaced by a pre-trained ranking model. However, when the combinatorial optimization task involves a large-scale data volume, the solution time of the current neural network model is too long, that is, the processing efficiency of the combinatorial optimization task is low.
Therefore, how to improve the processing efficiency of the combinatorial optimization task has become a technical problem that needs to be solved urgently.
Disclosure of Invention
The application provides a processing method and a processing device for a combined optimization task, which can shorten the processing time of the combined optimization task and improve the processing efficiency of the combined optimization task.
In a first aspect, a method for processing a combined optimization task is provided, including: a computing device obtains, from a memory, a first solution set for a target problem, where the target problem is the problem that needs to be solved to accomplish a target combination optimization task, the first solution set includes M candidate solutions for the target problem, and M is an integer greater than 1; the computing device groups the candidate solutions in the first solution set to obtain a plurality of solution sets; the computing device inputs the plurality of solution sets into a pre-trained neural network model to obtain a plurality of target confidences, where the target confidences correspond to the solution sets one to one, each of the target confidences represents the confidence that the corresponding solution set includes a target solution, and the target solution is the candidate solution in the first solution set whose planning result for the target problem satisfies a preset constraint condition; the computing device selects, from the plurality of solution sets according to the target confidences, the solution set corresponding to the largest target confidence to obtain a second solution set; and the computing device selects the target solution from the second solution set.
It should be understood that a combinatorial optimization task may refer to a task of finding an optimal solution or a near-optimal solution within a finite set in order to solve a certain combinatorial optimization problem. In general, a combinatorial optimization task can be expressed as a mixed integer programming problem, or as a non-deterministic polynomial (NP) problem.
It should be noted that the target solution may be a candidate solution whose planning result for the target problem satisfies the preset constraint condition. The target problem may refer to a functional relationship involving a plurality of variables; a candidate solution may refer to a set of candidate values, one for each of the plurality of variables in the functional relationship; and the target solution may refer to the candidate values that make the plurality of variables satisfy the preset constraint condition. The preset constraint condition may be a constraint on the candidate value of each individual variable and/or a constraint on the plurality of variables as a whole; for example, the preset constraint condition may require candidate values under which the plurality of variables, taken as a whole, reach a maximum value, a minimum value, or satisfy some other constraint.
For example, a candidate solution may refer to a candidate target capacity of each of a plurality of production objects in a production scheduling task; the preset constraint condition may be a constraint on the daily capacity of each production object, or it may be the requirement that, under a given raw material condition, the plurality of production objects can meet the required quantity of products while maximizing the profit of each production object; the target solution may then be the candidate target capacities that maximize the profit of each production object. It should also be understood that the confidence that a solution set includes the target solution, represented by each of the plurality of target confidences, may refer to the probability that the corresponding solution set includes the target solution; the greater the confidence that a solution set includes the target solution, the greater the probability that the solution set includes the target solution.
The computing device may be a device having the processing function for the combined optimization task, for example, a device with computing capability such as a server or a computer; alternatively, the computing device may be a chip having a computing function, for example, a chip disposed in a server or in a computer. The computing device may include a memory and a processor; the memory may be configured to store program code, and the processor may be configured to invoke the program code stored in the memory to implement the corresponding functions of the computing device. The processor and the memory included in the computing device may be implemented by a chip, which is not particularly limited herein.
In the embodiment of the application, a pre-trained neural network model can be used to select some of the candidate solutions from all the candidate solutions corresponding to the target problem, the number of selected candidate solutions being smaller than the number of all candidate solutions; the target solution is then selected from these selected candidate solutions. In other words, when solving the problem that needs to be solved to accomplish the target combination optimization task, a partial data set is first selected from a large candidate data set, and the target solution is then selected from that partial data set. This avoids the long solution time and low solution efficiency of selecting the target solution directly from a large data set. With the processing method in the embodiment of the application, the time needed to solve the problem underlying the combination optimization task can be shortened, thereby improving the processing efficiency of the combination optimization task.
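For readers who prefer code, the selection flow described above can be sketched as follows. This is an illustrative sketch only, not the claimed implementation: the model object set_scorer, its predict interface, the number of groups and the profit function are hypothetical placeholders introduced for this example.

    import numpy as np

    def select_target_solution(candidates, set_scorer, num_groups, profit):
        """Group the first solution set, score each group with a pre-trained model,
        keep the highest-confidence group (the second solution set), then pick the
        best candidate solution in that group as the target solution."""
        candidates = np.asarray(candidates)            # shape (M, d): M candidate solutions

        # Step 1: randomly group the candidate solutions into a plurality of solution sets
        order = np.random.permutation(len(candidates))
        groups = np.array_split(candidates[order], num_groups)

        # Step 2: one target confidence per solution set
        # (the confidence that the solution set includes the target solution)
        confidences = [float(set_scorer.predict(g)) for g in groups]

        # Step 3: the second solution set is the group with the largest target confidence
        second_set = groups[int(np.argmax(confidences))]

        # Step 4: choose the candidate whose planning result is best, e.g. largest profit
        return max(second_set, key=profit)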
In a possible implementation manner, the plurality of target confidences may be embodied by scoring labels of the plurality of solution sets; if the value of the scoring label of solution set 1 among the plurality of solution sets is greater than the value of the scoring label of solution set 2, the probability that solution set 1 includes the target solution is greater than the probability that solution set 2 includes the target solution, and solution set 1 may be selected from the plurality of solution sets as the second solution set.
With reference to the first aspect, in certain implementations of the first aspect, the grouping, by the computing device, the candidate solutions in the first solution set to obtain a plurality of solution sets includes:
the computing device randomly groups the candidate solutions in the first solution set to obtain the plurality of solution sets.
In one possible implementation, the computing device may perform random average grouping on the candidate solutions in the first solution set to obtain the plurality of solution sets.
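As a minimal sketch of such random average grouping (assuming, hypothetically, that the candidate solutions can be held in a NumPy array), the candidate solutions can be shuffled and split into groups of nearly equal size:

    import numpy as np

    def random_even_groups(solutions, num_groups, seed=None):
        """Shuffle the candidate solutions and split them into groups of (nearly) equal size."""
        rng = np.random.default_rng(seed)
        solutions = np.asarray(solutions)
        order = rng.permutation(len(solutions))
        return np.array_split(solutions[order], num_groups)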
With reference to the first aspect, in certain implementations of the first aspect, the selecting, by the computing device, the target solution from the second solution set includes:
and the computing equipment selects a candidate solution with the maximum resource allocation rate or the maximum profit when the target combination optimization task is realized from the second solution set as the target solution.
In one possible implementation, the computing device may select, as the target solution, a candidate solution from the second solution set, where the planning result of the target problem satisfies a preset constraint condition, where the preset constraint condition may refer to that the resource allocation rate is the maximum or the profit is the maximum when the target combination optimization task is implemented.
With reference to the first aspect, in certain implementations of the first aspect, the target combination optimization task includes any one of a production scheduling task, a three-dimensional boxing task, and an address selection task.
With reference to the first aspect, in some implementations of the first aspect, when the target combination optimization task is a production scheduling task, the target problem is to achieve a maximum production yield according to the quantity of products required and the quantity of raw materials, the first solution set includes M planning results, one of the M planning results is used to represent a target capacity of each of a plurality of production objects, and the target solution is the planning result among the M planning results that achieves the maximum production yield. An illustrative mixed integer formulation is sketched below.
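Purely for illustration, a production scheduling problem of this kind can be written as a small mixed integer program. The symbols below are assumptions introduced for this sketch and do not appear in the original text: x_i is the target capacity of production object i, p_i its unit profit, r_i its raw material consumption per unit, c_i its daily capacity limit, D the product demand quantity and R the available raw material quantity.

    \begin{aligned}
    \max_{x_1,\dots,x_n}\quad & \sum_{i=1}^{n} p_i x_i \\
    \text{s.t.}\quad & \sum_{i=1}^{n} x_i \ge D, \\
    & \sum_{i=1}^{n} r_i x_i \le R, \\
    & 0 \le x_i \le c_i,\; x_i \in \mathbb{Z}, \quad i = 1,\dots,n.
    \end{aligned}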
With reference to the first aspect, in certain implementations of the first aspect, in a case that the target combination optimization task is a three-dimensional boxing task, the target problem is to achieve a maximum space utilization rate of a box according to the sizes of a plurality of articles to be boxed and the size of the box, the first solution set includes M boxing results, one of the M boxing results is used to indicate that some of the articles to be boxed are selected from the plurality of articles to be boxed and packed into the box, and the target solution is the boxing result among the M boxing results that achieves the maximum space utilization rate of the box.
With reference to the first aspect, in some implementation manners of the first aspect, in a case that the target combination optimization task is an addressing task, the target problem is that a maximum total service area is implemented according to candidate addresses of multiple objects to be built and a service area corresponding to each candidate address, the first solution set includes M addressing results, one of the M addressing results is used to represent a target addressing of each object to be built in the multiple objects to be built and a service area corresponding to the target addressing of each object to be built, and the target solution is an addressing result of the maximum total service area implemented in the M addressing results.
It should be noted that the above illustrates a target combination optimization task as a production scheduling task, a three-dimensional boxing task, and an address selection task; the processing method of the combined optimization task can also be applied to other combined optimization tasks, and the application is not limited in any way.
With reference to the first aspect, in certain implementations of the first aspect, the neural network model is obtained using the following training method:
acquiring training data, wherein the training data includes a sample problem, a first sample solution set, a second sample solution set and a sample target solution, the sample problem is the problem that needs to be solved to accomplish a sample combination optimization task, the first sample solution set includes K candidate solutions for the sample problem, the sample target solution is the candidate solution in the first sample solution set whose planning result for the sample problem satisfies a preset constraint condition, the second sample solution set is formed by selecting L candidate solutions from the first sample solution set, the second sample solution set includes the sample target solution, K is an integer greater than 1, and L is an integer smaller than K; and training the neural network model by taking the first sample solution set as input data and taking obtaining the second sample solution set as the training target.
With reference to the first aspect, in certain implementations of the first aspect, the method further includes:
the computing device groups the candidate solutions in the first sample solution set to obtain the second sample solution set;
the training the neural network model with the training data includes:
obtaining, using the neural network model, a first confidence that the second sample solution set includes the sample target solution; and training the neural network model according to the first confidence.
It should be noted that, in the process of training the neural network model, the model is made to identify the second sample solution set that includes the sample target solution; that is, when a plurality of solution sets are input into the neural network model, the model may output a confidence for each of the plurality of solution sets, each confidence representing the confidence that the corresponding solution set includes the sample target solution. Through training, the confidence output by the trained neural network model for the second sample solution set is greater than the confidences of the other solution sets, that is, the neural network model can identify, among the plurality of solution sets, the second sample solution set that includes the sample target solution.
In an embodiment of the present application, the plurality of sample solution sets may be kept fixed, and the neural network model is trained by dynamically adjusting the confidences of the plurality of sample solution sets, that is, by dynamically adjusting the confidence that each sample solution set includes the sample target solution. Compared with a training mode that dynamically regroups the plurality of sample solution sets, dynamically adjusting the confidences of the fixed sample solution sets can shorten the training time of the neural network model.
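A minimal PyTorch sketch of this training mode is given below, under assumptions that are not stated in the text: the scorer architecture, the use of a cross-entropy loss over the fixed groups, and all variable names are hypothetical. The loss simply pushes the confidence of the group containing the sample target solution above the confidences of the other, fixed groups.

    import torch
    import torch.nn as nn

    class SetScorer(nn.Module):
        """Scores one solution set; the architecture is a hypothetical example."""
        def __init__(self, dim):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

        def forward(self, group):              # group: tensor of shape (n_i, dim)
            return self.net(group).mean()      # one confidence per solution set

    def train_step(model, optimizer, groups, target_group_idx):
        """One update: raise the confidence of the fixed group that contains the
        sample target solution above the confidences of the other fixed groups."""
        scores = torch.stack([model(g) for g in groups])       # shape (num_groups,)
        loss = nn.functional.cross_entropy(scores.unsqueeze(0),
                                           torch.tensor([target_group_idx]))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()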
With reference to the first aspect, in certain implementations of the first aspect, the grouping, by the computing device, the candidate solutions in the first sample solution set to obtain the second sample solution set includes:
the computing device randomly groups the candidate solutions in the first sample solution set to obtain the second sample solution set.
In one possible implementation, the computing device may perform random average grouping on the candidate solutions in the first sample solution set to obtain a plurality of sample solution sets, where the plurality of sample solution sets includes the second sample solution set.
With reference to the first aspect, in certain implementations of the first aspect, the obtaining, using the neural network model, a first confidence that the second sample solution set includes the sample target solution includes:
performing, by the neural network model, a weighted calculation on the confidence that each candidate solution in the second sample solution set is the sample target solution, to obtain the first confidence.
In an embodiment of the present application, the label used to represent the confidence that a solution set among the plurality of solution sets includes the target solution may be obtained by weighting the confidences that the candidate solutions included in that solution set are the target solution.
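The exact weighting scheme is not specified in the text; the sketch below only shows one simple possibility (a weighted average, uniform by default), with hypothetical names.

    import numpy as np

    def group_confidence(candidate_confidences, weights=None):
        """Aggregate per-candidate confidences (each the confidence that the candidate is
        the sample target solution) into one confidence for the whole solution set."""
        c = np.asarray(candidate_confidences, dtype=float)
        w = np.ones_like(c) if weights is None else np.asarray(weights, dtype=float)
        return float(np.dot(w, c) / w.sum())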
With reference to the first aspect, in certain implementations of the first aspect, the confidence that each candidate solution is the sample target solution is obtained by the neural network model by using a two-class label mechanism.
With reference to the first aspect, in certain implementations of the first aspect, the confidence level of each candidate solution as the sample target solution is obtained by the neural network model by using an adaptive boosting algorithm.
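The text does not say how the adaptive boosting algorithm is configured. The sketch below only illustrates one conventional way to obtain such per-candidate confidences with scikit-learn's AdaBoostClassifier; the feature matrices and the binary labels (1 meaning the candidate is the sample target solution, as in the two-class label mechanism above) are assumptions made for this example.

    from sklearn.ensemble import AdaBoostClassifier

    def candidate_confidences(train_features, train_labels, features):
        """Fit an AdaBoost binary classifier on labelled sample candidates and return,
        for each new candidate, the probability that it is a target solution."""
        clf = AdaBoostClassifier(n_estimators=100)
        clf.fit(train_features, train_labels)        # labels: 0/1 two-class labels
        return clf.predict_proba(features)[:, 1]     # probability of the positive class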
In a second aspect, a training method for a combined optimization task processing model is provided, which includes: acquiring training data, wherein the training data comprises a sample problem, a first sample solution set, a second sample solution set and a sample target solution, the sample problem is a problem to be solved for realizing a sample combination optimization task, the first sample solution set comprises K candidate solutions for the sample problem, the sample target solution is a candidate solution in the first sample solution set, which enables a planning result of the sample problem to meet a preset constraint condition, the second sample solution set comprises L candidate solutions selected from the first sample solution set, the second sample solution set comprises the sample target solution, K is an integer greater than 1, and L is an integer less than K; and training the neural network model by taking the first sample solution set as input data and the second sample solution set as a training target to obtain the trained neural network model.
It should be understood that the trained neural network model may be used to select N candidate solutions from a first solution set corresponding to a to-be-processed combinatorial optimization problem to form a second solution set, where the first solution set includes M candidate solutions of the to-be-processed combinatorial optimization problem, and the to-be-processed combinatorial optimization problem is a problem to be solved to implement the to-be-processed combinatorial optimization task.
With reference to the second aspect, in some implementations of the second aspect, the method further includes:
the computing device groups the candidate solutions in the first sample solution set to obtain the second sample solution set;
the training the neural network model by taking the first sample solution set as input data to obtain the second sample solution set as a training target comprises:
obtaining a first confidence that the second set of sample solutions includes the sample target solution using the neural network model; training the neural network model according to the first confidence.
With reference to the second aspect, in certain implementations of the second aspect, the second sample solution set is obtained by randomly grouping candidate solutions in the first sample solution set.
With reference to the second aspect, in certain implementations of the second aspect, the obtaining, using the neural network model, a first confidence that the second set of sample solutions includes the sample target solution includes:
and performing weighted calculation by using the neural network model according to the confidence coefficient of each candidate solution in the second sample solution set as the sample target solution to obtain the first confidence coefficient.
With reference to the second aspect, in certain implementations of the second aspect, the confidence level that each candidate solution is the sample target solution is obtained by the neural network model by using a two-class label mechanism.
With reference to the second aspect, in certain implementations of the second aspect, the confidence level that each candidate solution is the sample target solution is obtained by the neural network model by using an adaptive boosting algorithm.
With reference to the second aspect, in some implementation manners of the second aspect, the sample combination optimization task includes any one of a factory scheduling task, a production scheduling task, a three-dimensional boxing task, and a plant address selection task.
With reference to the second aspect, in some implementations of the second aspect, when the sample combination optimization task is a production scheduling task, the sample problem refers to achieving a maximum production yield according to the product demand quantity and the raw material quantity, the first sample solution set includes M planning results, one of the M planning results is used to represent a target capacity of each of a plurality of production objects, and the sample target solution refers to the planning result among the M planning results that achieves the maximum production yield.
With reference to the second aspect, in certain implementations of the second aspect, in a case that the sample combination optimization task is a three-dimensional boxing task, the sample problem is that a maximum space utilization rate of the box is achieved according to a size of a plurality of to-be-boxed articles and a size of the box, the first sample solution set includes M boxing results, one of the M boxing results is used to indicate that a part of the to-be-boxed articles is selected from the plurality of to-be-boxed articles to be boxed into the box, and the sample target solution is a boxing result of the M boxing results that achieves the maximum space utilization rate of the box.
With reference to the second aspect, in some implementation manners of the second aspect, in a case that the sample combination optimization task is an addressing task, the sample problem refers to that a maximum total service area is implemented according to candidate addresses of multiple objects to be built and a service area corresponding to each candidate address, the first sample solution set includes M addressing results, one of the M addressing results is used to represent a target addressing of each object to be built in the multiple objects to be built and a service area corresponding to the target addressing of each object to be built, and the sample target solution refers to an addressing result of the maximum total service area implemented in the M addressing results.
It should be noted that the above illustrates a sample combination optimization task as a production scheduling task, a three-dimensional boxing task, and an address selection task; the sample combination optimization task of the present application may also refer to other combination optimization tasks, and the present application does not limit this.
In a third aspect, a processing apparatus for a combined optimization task is provided, including: an obtaining unit, configured to obtain, from a memory, a first solution set for a target problem, where the target problem is the problem that needs to be solved to accomplish a target combination optimization task, the first solution set includes M candidate solutions for the target problem, and M is an integer greater than 1; and a processing unit, configured to: group the candidate solutions in the first solution set to obtain a plurality of solution sets; input the plurality of solution sets into a pre-trained neural network model to obtain a plurality of target confidences, where the target confidences correspond to the solution sets one to one, each of the target confidences represents the confidence that the corresponding solution set includes a target solution, and the target solution is the candidate solution in the first solution set whose planning result for the target problem satisfies a preset constraint condition; select, from the plurality of solution sets according to the target confidences, the solution set corresponding to the largest target confidence to obtain a second solution set; and select the target solution from the second solution set.
With reference to the third aspect, in some implementations of the third aspect, the processing unit is specifically configured to:
and randomly grouping the candidate solutions in the first solution set to obtain the plurality of solution sets.
In one possible implementation, the candidate solutions in the first solution set may be randomly and averagely grouped to obtain a plurality of solution sets.
With reference to the third aspect, in some implementations of the third aspect, the processing unit is specifically configured to:
and selecting a candidate solution with the maximum resource allocation rate or the maximum profit when the target combination optimization task is realized from the second solution set as the target solution.
With reference to the third aspect, in certain implementations of the third aspect, the objective combination optimization task includes any one of a production scheduling task, a three-dimensional boxing task, and an addressing task.
With reference to the third aspect, in some implementations of the third aspect, in a case that the target combination optimization task is a production scheduling task, the target problem is to achieve a maximum production yield according to the product demand quantity and the raw material quantity, the first solution set includes M planning results, one of the M planning results is used to represent a target capacity of each of a plurality of production objects, and the target solution is the planning result among the M planning results that achieves the maximum production yield.
With reference to the third aspect, in certain implementations of the third aspect, in a case that the target combination optimization task is a three-dimensional boxing task, the target problem is to achieve a maximum space utilization rate of a box according to the sizes of a plurality of articles to be boxed and the size of the box, the first solution set includes M boxing results, one of the M boxing results is used to indicate that some of the articles to be boxed are selected from the plurality of articles to be boxed and packed into the box, and the target solution is the boxing result among the M boxing results that achieves the maximum space utilization rate of the box.
With reference to the third aspect, in some implementation manners of the third aspect, in a case that the target combination optimization task is an addressing task, the target problem is that a maximum total service area is implemented according to candidate addresses of multiple objects to be built and a service area corresponding to each candidate address, the first solution set includes M addressing results, one of the M addressing results is used to represent a target addressing of each object to be built in the multiple objects to be built and a service area corresponding to the target addressing of each object to be built, and the target solution is an addressing result of the maximum total service area implemented in the M addressing results.
It should be noted that the above illustrates a target combination optimization task as a production scheduling task, a three-dimensional boxing task, and an address selection task; the combination optimization task of the present application may refer to other combination optimization tasks, and the present application is not limited in this respect.
With reference to the third aspect, in certain implementations of the third aspect, the neural network model is obtained using the following training method:
acquiring training data, wherein the training data includes a sample problem, a first sample solution set, a second sample solution set and a sample target solution, the sample problem is the problem that needs to be solved to accomplish a sample combination optimization task, the first sample solution set includes K candidate solutions for the sample problem, the second sample solution set is formed by selecting L candidate solutions from the first sample solution set, the second sample solution set includes the sample target solution, the sample target solution is the candidate solution in the first sample solution set whose planning result for the sample problem satisfies a preset constraint condition, K is an integer greater than 1, and L is an integer smaller than K; and training the neural network model by taking the first sample solution set as input data and taking obtaining the second sample solution set as the training target.
With reference to the third aspect, in certain implementations of the third aspect, the processing unit is further configured to:
grouping the candidate solutions in the first sample solution set to obtain a second sample solution set;
the training the neural network model with the training data includes:
obtaining, using the neural network model, a first confidence that the second sample solution set includes the sample target solution; and training the neural network model according to the first confidence.
With reference to the third aspect, in some implementations of the third aspect, the processing unit is specifically configured to:
and randomly grouping the candidate solutions in the first sample solution set to obtain the second sample solution set.
With reference to the third aspect, in some implementations of the third aspect, the processing unit is specifically configured to:
and performing weighted calculation by using the neural network model according to the confidence coefficient of each candidate solution in the second sample solution set as the sample target solution to obtain the first confidence coefficient.
With reference to the third aspect, in certain implementations of the third aspect, the confidence that each candidate solution is the sample target solution is obtained by the neural network model by using a two-class label mechanism.
With reference to the third aspect, in certain implementations of the third aspect, the confidence level that each candidate solution is the sample target solution is obtained by the neural network model by using an adaptive boosting algorithm.
In a fourth aspect, a training apparatus for a combined optimization task processing model is provided, including: an obtaining unit, configured to obtain training data, where the training data includes a sample problem, a first sample solution set, a second sample solution set, and a sample target solution, the sample problem is the problem that needs to be solved to accomplish a sample combination optimization task, the first sample solution set includes K candidate solutions for the sample problem, the second sample solution set is formed by selecting L candidate solutions from the first sample solution set, the second sample solution set includes the sample target solution, the sample target solution is the candidate solution in the first sample solution set whose planning result for the sample problem satisfies a preset constraint condition, K is an integer greater than 1, and L is an integer smaller than K; and a processing unit, configured to train the neural network model by taking the first sample solution set as input data and taking obtaining the second sample solution set as the training target, to obtain the trained neural network model.
It should be understood that the trained neural network model may be used to select N candidate solutions from a first solution set corresponding to a to-be-processed combinatorial optimization problem to form a second solution set, where the first solution set includes M candidate solutions of the to-be-processed combinatorial optimization problem, and the to-be-processed combinatorial optimization problem is a problem to be solved to implement the to-be-processed combinatorial optimization task.
With reference to the fourth aspect, in some implementations of the fourth aspect, the processing unit is further configured to:
grouping the candidate solutions in the first sample solution set to obtain a second sample solution set;
obtaining a first confidence that the second set of sample solutions includes the sample target solution using the neural network model; training the neural network model according to the first confidence.
With reference to the fourth aspect, in some implementations of the fourth aspect, the second sample solution set is obtained by randomly grouping candidate solutions in the first sample solution set.
With reference to the fourth aspect, in some implementations of the fourth aspect, the processing unit is specifically configured to:
and performing weighted calculation by using the neural network model according to the confidence coefficient of each candidate solution in the second sample solution set as the sample target solution to obtain the first confidence coefficient.
With reference to the fourth aspect, in certain implementations of the fourth aspect, the confidence level that each candidate solution is the sample target solution is obtained by the neural network model by using a two-class label mechanism.
With reference to the fourth aspect, in certain implementations of the fourth aspect, the confidence level that each candidate solution is the sample target solution is obtained by the neural network model by using an adaptive boosting algorithm.
With reference to the fourth aspect, in some implementations of the fourth aspect, the sample combination optimization task includes any one of a production scheduling task, a three-dimensional boxing task, and an address selection task.
With reference to the fourth aspect, in some implementation manners of the fourth aspect, in a case that the sample combination optimization task is a production scheduling task, the sample problem is that a maximum production yield is achieved according to a quantity of required products and a quantity of raw materials, the first sample solution set includes M planning results, one of the M planning results is used to represent a target capacity of each of a plurality of production objects, and the sample target solution is a planning result of the M planning results that achieves the maximum production yield.
With reference to the fourth aspect, in certain implementations of the fourth aspect, in a case that the sample combination optimization task is a three-dimensional boxing task, the sample problem is that a maximum space utilization rate of the box is achieved according to sizes of a plurality of to-be-boxed articles and sizes of the box, the first sample solution set includes M boxing results, one of the M boxing results is used to indicate that a part of the to-be-boxed articles is selected from the plurality of to-be-boxed articles to be boxed into the box, and the sample target solution is a boxing result of the M boxing results that achieves the maximum space utilization rate of the box.
With reference to the fourth aspect, in some implementation manners of the fourth aspect, in a case that the sample combination optimization task is an addressing task, the sample problem refers to that a maximum total service area is implemented according to candidate addresses of multiple objects to be built and a service area corresponding to each candidate address, the first sample solution set includes M addressing results, one of the M addressing results is used to indicate a target addressing of each object to be built in the multiple objects to be built and a service area corresponding to the target addressing of each object to be built, and the sample target solution refers to an addressing result of the maximum total service area implemented in the M addressing results.
It should be noted that the above illustrates a sample combination optimization task as a production scheduling task, a three-dimensional boxing task, and an address selection task; the sample combination optimization task of the present application may also refer to other combination optimization tasks, and the present application does not limit this.
In a fifth aspect, a processing device for combining optimization tasks is provided, which includes: a memory for storing a program; a processor for executing the memory-stored program, the processor for performing, when the memory-stored program is executed: acquiring a first solution set aiming at a target problem from a memory, wherein the target problem is a problem to be solved for realizing a target combination optimization task, the first solution set comprises M candidate solutions aiming at the target problem, and M is an integer greater than 1; grouping the candidate solutions in the first solution set to obtain a plurality of solution sets; inputting the solution sets into a pre-trained neural network model to obtain a plurality of target confidence coefficients, wherein the target confidence coefficients correspond to the solution sets one to one, each target confidence coefficient in the target confidence coefficients is used for representing the confidence coefficient that the corresponding solution set comprises a target solution, and the target solution is a candidate solution which enables the planning result of the target problem to meet a preset constraint condition in the first solution set; according to the target confidence degrees, selecting a solution set corresponding to the maximum target confidence degree in the target confidence degrees from the solution sets to obtain a second solution set; and selecting the target solution from the second solution set.
In a possible implementation manner, the processor included in the apparatus is further configured to execute the processing method in any implementation manner of the first aspect.
It will be appreciated that extensions, definitions, explanations and explanations of relevant content in the above-described first aspect also apply to the same content in the fifth aspect.
In a sixth aspect, a processing apparatus for combining optimization tasks is provided, including: a memory for storing a program; a processor for executing the memory-stored program, the processor for performing, when the memory-stored program is executed: acquiring training data, wherein the training data comprises a sample problem, a first sample solution set, a second sample solution set and a sample target solution, the sample problem is a problem to be solved for realizing a sample combination optimization task, the first sample solution set comprises K candidate solutions aiming at the sample problem, the second sample solution set is formed by selecting L candidate solutions from the first sample solution set, the second sample solution set comprises the sample target solution, the sample target solution is a candidate solution which enables a planning result of the sample problem to meet a preset constraint condition in the first sample solution set, K is an integer larger than 1, and L is an integer smaller than K; and training the neural network model by taking the first sample solution set as input data and the second sample solution set as a training target to obtain the trained neural network model.
In a possible implementation manner, the processor included in the apparatus is further configured to execute the processing method in any implementation manner of the second aspect.
It will be appreciated that extensions, definitions, explanations and explanations of relevant matters in the above first aspect also apply to the same matters in the sixth aspect.
In a seventh aspect, a computer-readable medium is provided, which stores program code for execution by a device, where the program code includes instructions for performing the processing method in any one of the implementations of the first aspect and the first aspect.
In an eighth aspect, a computer-readable medium is provided, which stores program code for execution by a device, the program code comprising instructions for performing the training method of the second aspect described above and any one of the implementations of the second aspect.
A ninth aspect provides a computer program product containing instructions for causing a computer to perform the processing method of any one of the implementations of the first aspect and the first aspect when the computer program product runs on a computer.
A tenth aspect provides a computer program product comprising instructions which, when run on a computer, cause the computer to perform the training method of any one of the implementations of the second aspect and the second aspect.
In an eleventh aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface to execute the processing method in any one of the implementations of the first aspect and the first aspect.
Optionally, as an implementation manner, the chip may further include a memory, where the memory stores instructions, and the processor is configured to execute the instructions stored on the memory, and when the instructions are executed, the processor is configured to execute the processing method in any one of the implementations of the first aspect and the first aspect.
In a twelfth aspect, a chip is provided, where the chip includes a processor and a data interface, and the processor reads instructions stored in a memory through the data interface, and executes the training method in any one implementation manner of the second aspect and the second aspect.
Optionally, as an implementation manner, the chip may further include a memory, where the memory stores instructions, and the processor is configured to execute the instructions stored on the memory, and when the instructions are executed, the processor is configured to execute the training method in any one of the second aspect and the implementation manner of the second aspect.
Drawings
FIG. 1 is a schematic diagram of an artificial intelligence agent framework provided by an embodiment of the application;
FIG. 2 is a schematic diagram of an application scenario provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a vehicle dispatch application scenario provided by an embodiment of the present application;
FIG. 4 is a schematic diagram of a provisioning management application scenario provided by an embodiment of the application;
FIG. 5 is a diagram of a system architecture provided by an embodiment of the present application;
fig. 6 is a schematic diagram of a hardware structure of a chip according to an embodiment of the present disclosure;
FIG. 7 is a diagram of a system architecture provided by an embodiment of the present application;
FIG. 8 is a schematic flow chart diagram of a processing method for a combinatorial optimization task provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of a processing method for a combinatorial optimization task provided in an embodiment of the present application;
FIG. 10 is a schematic diagram of inter-group dynamic scoring provided by embodiments of the present application;
FIG. 11 is a schematic flow chart diagram of a method for training a neural network model provided by an embodiment of the present application;
FIG. 12 is a schematic flow chart diagram of a processing method for a combinatorial optimization task provided by an embodiment of the present application;
FIG. 13 is a graphical illustration of the test effect provided by one embodiment of the present application;
FIG. 14 is a graphical illustration of the test effect provided by one embodiment of the present application;
FIG. 15 is a graphical illustration of the test effect provided by another embodiment of the present application;
FIG. 16 is a schematic block diagram of a processing device for combinatorial optimization tasks provided herein;
fig. 17 is a schematic hardware structure diagram of a processing device for combining optimization tasks according to an embodiment of the present application.
Detailed Description
The technical solution in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application; it is to be understood that the embodiments described are only a few embodiments of the present application and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
FIG. 1 shows a schematic diagram of an artificial intelligence main framework, which describes the overall workflow of an artificial intelligence system and is applicable to general requirements in the artificial intelligence field.
The artificial intelligence main framework 100 is described in detail below in terms of two dimensions: the "intelligent information chain" (horizontal axis) and the "information technology (IT) value chain" (vertical axis).
The "smart information chain" reflects a list of processes processed from the acquisition of data. For example, the general processes of intelligent information perception, intelligent information representation and formation, intelligent reasoning, intelligent decision making and intelligent execution and output can be realized. In this process, the data undergoes a "data-information-knowledge-wisdom" refinement process.
The 'IT value chain' reflects the value of the artificial intelligence to the information technology industry from the bottom infrastructure of the human intelligence, information (realization of providing and processing technology) to the industrial ecological process of the system.
(1) Infrastructure 110
The infrastructure can provide computing power support for the artificial intelligent system, realize communication with the outside world, and realize support through the basic platform.
The infrastructure may communicate with the outside through sensors, and the computing power of the infrastructure may be provided by a smart chip.
The intelligent chip may be a hardware acceleration chip such as a Central Processing Unit (CPU), a neural-Network Processing Unit (NPU), a Graphics Processing Unit (GPU), an Application Specific Integrated Circuit (ASIC), and a Field Programmable Gate Array (FPGA).
The infrastructure platform may include distributed computing framework and network, and may include cloud storage and computing, interworking network, and the like.
For example, for an infrastructure, data may be obtained through sensors and external communications and then provided to an intelligent chip in a distributed computing system provided by the base platform for computation.
(2) Data 120
Data at the upper level of the infrastructure is used to represent the data source for the field of artificial intelligence. The data relates to graphics, images, voice and text, and also relates to internet of things data of traditional equipment, including service data of an existing system and sensing data such as force, displacement, liquid level, temperature, humidity and the like.
(3) Data processing 130
The data processing generally includes processing modes such as data training, machine learning, deep learning, searching, reasoning, decision making and the like.
The machine learning and the deep learning can perform symbolized and formalized intelligent information modeling, extraction, preprocessing, training and the like on data.
Inference refers to the process of simulating human intelligent inference in a computer or intelligent system, in which the machine uses formalized information to reason about and solve problems according to an inference control strategy; typical functions are searching and matching.
The decision-making refers to a process of making a decision after reasoning intelligent information, and generally provides functions of classification, sequencing, prediction and the like.
(4) General capabilities 140
After the above-mentioned data processing, further based on the result of the data processing, some general capabilities may be formed, such as algorithms or a general system, e.g. translation, analysis of text, computer vision processing, speech recognition, recognition of images, etc.
(5) Intelligent product and industry applications 150
Intelligent products and industry applications refer to the products and applications of the artificial intelligence system in various fields; they are the encapsulation of the overall artificial intelligence solution, which commercializes intelligent information decision-making and realizes practical applications. The application fields mainly include intelligent manufacturing, intelligent transportation, smart home, intelligent medical treatment, intelligent security, automatic driving, safe city, intelligent terminals, and the like.
The embodiment of the application can be applied to solving combinatorial optimization problems, which may involve fields such as intelligent transportation, intelligent production, intelligent scheduling and intelligent site selection.
For example, as shown in fig. 2, the processing method of the combinatorial optimization task provided by the embodiment of the present application may be applied to solving a combinatorial optimization problem. Acquiring data to be processed may refer to acquiring a first solution set, which may include M candidate solutions of a target problem; through the processing method provided by the embodiment of the application, a second solution set can be formed by selecting N candidate solutions from the first solution set through a pre-trained neural network model, and a target solution can then be selected from the second solution set, for example, a predicted optimal solution of the combinatorial optimization task is output.
A combinatorial optimization task may refer to the problem of finding an optimal object in a finite set. Generally, combinatorial optimization can be expressed as mixed integer programming (MIP), and it also belongs to the class of non-deterministic polynomial-time hard (NP-hard) problems.
It should be understood that the above-mentioned pre-trained ranking model can be applied to a Mathematical Programming (MP) problem, i.e., a problem of maximizing or minimizing a certain target under certain resource and condition constraints.
For example, the embodiments of the present application can be applied in the context of various combinatorial optimization problems. For example, the embodiments of the present application can be applied to various scenarios, such as a vehicle dispatch problem, a three-dimensional bin packing problem, a multi-level supply chain management problem, an advertisement/product bidding problem, a factory scheduling problem, and a factory selection problem. In addition, the embodiment of the present application may be combined with other technologies, and is not limited herein.
An application scenario of the processing method of the combinatorial optimization task in the embodiment of the present application is briefly introduced below.
Application scenario one: production scheduling task
In a specific application scenario, the processing method of the embodiment of the present application can be applied to a production scheduling task; through the processing method of the combinatorial optimization task, the maximum production profit of each of a plurality of production objects can be achieved according to the required product quantity and the raw material quantity.
In the processing method, a pre-trained neural network model is used to select some candidate solutions from all candidate solutions corresponding to the target problem, where the number of selected candidate solutions is smaller than the number of all candidate solutions; the target solution is then selected from these candidate solutions. In other words, when solving the problem that needs to be solved to realize the target combinatorial optimization task, a partial data set is first selected from a large candidate data set, and the target solution is then selected from that partial data set; this avoids the long solving time and low solving efficiency of selecting a target solution directly from a large data set. With the processing method of the embodiment of the present application, the time needed to solve the problem underlying the production scheduling task can therefore be shortened, thereby improving the processing efficiency of the production scheduling task.
Illustratively, when the processing method of the present application is applied to a production scheduling task, the computing device obtains, from a memory, a first solution set for a target problem, where the target problem is that a maximum production yield is achieved according to a quantity of required products and a quantity of raw materials, and the first solution set includes M planning results for the target problem, where one of the M planning results is used to represent a target capacity of each of a plurality of production objects; the computing equipment groups the candidate solutions in the first solution set to obtain a plurality of solution sets; the computing equipment inputs the solution sets into a pre-trained neural network model to obtain a plurality of target confidence degrees, wherein the target confidence degrees are in one-to-one correspondence with the solution sets, and each target confidence degree in the target confidence degrees is used for representing the confidence degree that the corresponding solution set comprises a target solution; the computing equipment selects a solution set corresponding to the maximum target confidence coefficient in the target confidence coefficients from the solution sets according to the target confidence coefficients to obtain a second solution set; and the computing equipment selects the target solution from the second solution set, wherein the target solution is a planning result for realizing the maximum production yield from the M planning results.
Application scenario two: three-dimensional boxing task
In a specific application scenario, the processing method of the embodiment of the application can be applied to a three-dimensional boxing task; by the processing method of the combined optimization task, the maximum space utilization rate of the box can be realized according to the sizes of the multiple articles to be boxed and the sizes of the box.
In the processing method, a pre-trained neural network model is used to select some candidate solutions from all candidate solutions corresponding to the target problem, where the number of selected candidate solutions is smaller than the number of all candidate solutions; the target solution is then selected from these candidate solutions. In other words, when solving the problem that needs to be solved to realize the target combinatorial optimization task, a partial data set is first selected from a large candidate data set, and the target solution is then selected from that partial data set; this avoids the long solving time and low solving efficiency of selecting a target solution directly from a large data set. With the processing method of the embodiment of the present application, the time needed to solve the problem underlying the three-dimensional boxing task can therefore be shortened, thereby improving the processing efficiency of the three-dimensional boxing task.
Illustratively, when the processing method of the present application is applied to a three-dimensional boxing task, the computing device obtains, from a memory, a first solution set for a target problem, where the target problem is to achieve a maximum space utilization of a box according to a size of a plurality of articles to be boxed and a size of the box, the first solution set includes M boxing results for the target problem, and one of the M boxing results is used to represent that a part of the articles to be boxed is selected from the plurality of articles to be boxed and is loaded into the box; the computing equipment groups the candidate solutions in the first solution set to obtain a plurality of solution sets; the computing equipment inputs the solution sets into a pre-trained neural network model to obtain a plurality of target confidence degrees, wherein the target confidence degrees are in one-to-one correspondence with the solution sets, and each target confidence degree in the target confidence degrees is used for representing the confidence degree that the corresponding solution set comprises a target solution; the computing equipment selects a solution set corresponding to the maximum target confidence coefficient in the target confidence coefficients from the solution sets according to the target confidence coefficients to obtain a second solution set; and the computing equipment selects the target solution from the second solution set, wherein the target solution is a boxing result which realizes the maximum space utilization rate of the box from the M boxing results.
For example, the three-dimensional dimensions of the box and the dimensions of K articles to be boxed may be given; the target problem is to select some articles to be packed from the K articles to be packed so that the space utilization rate of the box is maximized. Specifically, a coordinate position can be selected, and articles are arranged in the forward direction along that coordinate position during boxing; in the checking and boxing step, the gap height difference of the articles can be calculated according to the selected coordinate position; it is then checked whether, among the articles not yet placed, there is an article suitable for filling the gap, i.e., the size of the article to be placed must be no larger than the gap; finally the utilization rate of the box is calculated, and the boxing result is output if the utilization rate of the box is higher than that of the previous boxing result, where the boxing result may include an article number, article coordinates, and a boxing direction.
Application scenario three: site selection task
In a specific application scenario, the processing method of the embodiment of the present application can be applied to a site selection task (also referred to as an addressing task); through the processing method of the combinatorial optimization task, the maximum total service area can be achieved according to the candidate addresses of the objects to be established and the service area corresponding to each candidate address.
In the processing method, a pre-trained neural network model is used to select some candidate solutions from all candidate solutions corresponding to the target problem, where the number of selected candidate solutions is smaller than the number of all candidate solutions; the target solution is then selected from these candidate solutions. In other words, when solving the problem that needs to be solved to realize the target combinatorial optimization task, a partial data set is first selected from a large candidate data set, and the target solution is then selected from that partial data set; this avoids the long solving time and low solving efficiency of selecting a target solution directly from a large data set. With the processing method of the embodiment of the present application, the time needed to solve the problem underlying the site selection task can therefore be shortened, thereby improving the processing efficiency of the site selection task.
Illustratively, when the processing method of the present application is applied to an addressing task, a computing device obtains, from a memory, a first solution set for a target problem, where the target problem is that a maximum total service area is realized according to candidate addresses of a plurality of objects to be created and a service area corresponding to each candidate address, the first solution set includes M addressing results for the target problem, and one of the M addressing results is used to indicate a target addressing of each object to be created in the plurality of objects to be created and a service area corresponding to the target addressing of each object to be created; the computing equipment groups the candidate solutions in the first solution set to obtain a plurality of solution sets; the computing equipment inputs the solution sets into a pre-trained neural network model to obtain a plurality of target confidence degrees, wherein the target confidence degrees are in one-to-one correspondence with the solution sets, and each target confidence degree in the target confidence degrees is used for representing the confidence degree that the corresponding solution set comprises a target solution; the computing equipment selects a solution set corresponding to the maximum target confidence coefficient in the target confidence coefficients from the solution sets according to the target confidence coefficients to obtain a second solution set; and the computing equipment selects the target solution from the second solution set, wherein the target solution is an address selection result for realizing the maximum total service area in the M address selection results.
For example, the site selection task may refer to the problem of choosing construction locations for sewage treatment plants, since each different construction location can serve different cities; the target problem may therefore be to maximize the total service range according to a plurality of candidate locations of the sewage treatment plants and the service area of each location, i.e., to maximize the total service range while constructing as few sewage treatment plants as possible.
For example, the above-mentioned site selection task may also refer to selecting broadcasting stations for a broadcast program that should be heard by listeners in 30 cities; it is then necessary to decide on which broadcasting stations to broadcast the program so that as many areas as possible in the 30 cities can listen to it through as few broadcasting stations as possible; the target problem may therefore be to maximize the total service range, i.e., the coverage, over the 30 cities according to the plurality of candidate broadcasting stations and the service area corresponding to each broadcasting station.
In addition, in a specific application scenario, the processing method of the embodiment of the present application may be applied to the field of intelligent transportation. For example, in the vehicle dispatching problem shown in fig. 3, given three different pick-up points, three different destinations, and four dispatchable vehicles, the target problem may refer to the problem that needs to be solved to achieve vehicle dispatching with the shortest transportation time and the lowest transportation cost. Alternatively, in a specific application scenario, the processing method of the embodiment of the present application may be applied to the field of intelligent scheduling. Illustratively, in the supply management problem shown in fig. 4, the target problem may refer to the problem that needs to be solved to maximize supply profit given two suppliers, two production plants, three distribution centers, and four consumers.
It should be understood that the above description is illustrative of the application scenario and does not limit the application scenario of the present application in any way.
Since the embodiments of the present application relate to the application of a large number of neural networks, for the sake of understanding, the following description will be made first of all with respect to terms and concepts of the neural networks to which the embodiments of the present application may relate.
1. Neural network
The neural network may be composed of neural units. A neural unit may be an arithmetic unit that takes x_s and an intercept of 1 as inputs, and the output of the neural unit may be:

$h_{W,b}(x) = f(W^{T}x) = f\left(\sum_{s=1}^{n} W_{s}x_{s} + b\right)$

where s = 1, 2, ..., n, n is a natural number greater than 1, W_s is the weight of x_s, and b is the bias of the neural unit. f is the activation function of the neural unit, used to introduce a nonlinear characteristic into the neural network so as to convert the input signal of the neural unit into an output signal. The output signal of the activation function may be used as the input of the next convolutional layer, and the activation function may be a sigmoid function. A neural network is a network formed by joining many such single neural units together, i.e., the output of one neural unit may be the input of another neural unit. The input of each neural unit can be connected to the local receptive field of the previous layer to extract the features of the local receptive field, and the local receptive field may be a region composed of several neural units.
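For illustration only, the computation of a single neural unit described above can be sketched in Python as follows (a minimal sketch; the input values, weights, and the choice of a sigmoid activation are assumptions made purely for illustration):

```python
import numpy as np

def sigmoid(z):
    # Example activation function f; the sigmoid mentioned above.
    return 1.0 / (1.0 + np.exp(-z))

def neural_unit(x, w, b):
    # Output of one neural unit: f(sum_s W_s * x_s + b).
    return sigmoid(np.dot(w, x) + b)

# Toy inputs x_s, weights W_s and bias b (assumed values)
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.1, 0.4, -0.2])
b = 1.0
print(neural_unit(x, w, b))
```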
2. Deep neural network
A deep neural network (DNN), also called a multi-layer neural network, can be understood as a neural network with multiple hidden layers. According to the positions of the different layers, the layers inside a DNN can be divided into three categories: the input layer, the hidden layers, and the output layer. Generally, the first layer is the input layer, the last layer is the output layer, and the layers in between are hidden layers. The layers are fully connected, that is, any neuron in the i-th layer is necessarily connected to any neuron in the (i+1)-th layer.
Although a DNN appears complex, the work of each layer is actually not complex; it is simply the following linear relational expression: $\vec{y} = \alpha(W\vec{x} + \vec{b})$, where $\vec{x}$ is the input vector, $\vec{y}$ is the output vector, $\vec{b}$ is the offset vector, W is the weight matrix (also called coefficients), and $\alpha()$ is the activation function. Each layer simply performs this operation on the input vector $\vec{x}$ to obtain the output vector $\vec{y}$. Because a DNN has many layers, there are also many coefficients W and offset vectors $\vec{b}$. These parameters are defined in the DNN as follows, taking the coefficient W as an example: suppose that in a three-layer DNN, the linear coefficient from the 4th neuron of the second layer to the 2nd neuron of the third layer is defined as $W_{24}^{3}$, where the superscript 3 represents the layer in which the coefficient W is located, and the subscripts correspond to the output third-layer index 2 and the input second-layer index 4. In summary, the coefficient from the k-th neuron at layer L-1 to the j-th neuron at layer L is defined as $W_{jk}^{L}$. Note that the input layer has no W parameter. In deep neural networks, more hidden layers make the network better able to depict complex situations in the real world. Theoretically, a model with more parameters has higher complexity and larger "capacity", which means that it can accomplish more complex learning tasks. Training the deep neural network is the process of learning the weight matrices, and its final goal is to obtain the weight matrices (formed by the vectors W of many layers) of all layers of the trained deep neural network.
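As a rough illustration of the per-layer expression $\vec{y} = \alpha(W\vec{x} + \vec{b})$, the following Python sketch stacks two such layers (the layer sizes, the random parameters, and the tanh activation are assumptions for illustration only):

```python
import numpy as np

def layer_forward(x, W, b, alpha=np.tanh):
    # One DNN layer: y = alpha(W x + b), with weight matrix W,
    # offset vector b and activation function alpha.
    return alpha(W @ x + b)

rng = np.random.default_rng(0)
x = rng.normal(size=4)                                  # input vector
W1, b1 = rng.normal(size=(5, 4)), rng.normal(size=5)    # hidden layer
W2, b2 = rng.normal(size=(2, 5)), rng.normal(size=2)    # output layer
y = layer_forward(layer_forward(x, W1, b1), W2, b2)
print(y)
```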
3. Convolutional neural network
A Convolutional Neural Network (CNN) is a deep neural network with a convolutional structure. The convolutional neural network comprises a feature extractor consisting of convolutional layers and sub-sampling layers, which can be regarded as a filter. The convolutional layer is a neuron layer for performing convolutional processing on an input signal in a convolutional neural network. In convolutional layers of convolutional neural networks, one neuron may be connected to only a portion of the neighbor neurons. In a convolutional layer, there are usually several characteristic planes, and each characteristic plane may be composed of several neural units arranged in a rectangular shape. The neural units of the same feature plane share weights, where the shared weights are convolution kernels.
4. Loss function
In the process of training a deep neural network, because the output of the deep neural network is expected to be as close as possible to the value that is really desired to be predicted, the weight vector of each layer can be updated according to the difference between the predicted value of the current network and the really desired target value (of course, an initialization process is usually performed before the first update, i.e., parameters are pre-configured for each layer of the deep neural network); for example, if the predicted value of the network is too high, the weight vectors are adjusted to make the predicted value lower, and the adjustment continues until the deep neural network can predict the really desired target value or a value very close to it. Therefore, it is necessary to define in advance "how to compare the difference between the predicted value and the target value"; this is done by the loss function or objective function, which are important equations for measuring the difference between the predicted value and the target value. Taking the loss function as an example, a higher output value (loss) of the loss function indicates a larger difference, and training the deep neural network becomes the process of reducing this loss as much as possible.
5. Back propagation algorithm
In the training process, the neural network can adopt a back propagation (BP) algorithm to adjust the values of the parameters in the initial neural network model, so that the reconstruction error loss of the neural network model becomes smaller and smaller.
Specifically, the error loss is generated by transmitting the input signal in the forward direction until the output, and the parameters in the initial neural network model are updated by reversely propagating the error loss information, so that the error loss is converged. The back propagation algorithm is a back propagation motion with error loss as a dominant factor, aiming at obtaining the optimal parameters of the neural network model, such as a weight matrix.
6. Mixed integer programming
Mixed Integer Programming (MIP) refers to a mathematical programming problem that requires that variables must be partially or completely integers in order to maximize or minimize a target under certain resource and condition constraints.
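For illustration only, a mixed integer program can be written in the following generic form (the notation is standard and not taken from the application):

$$\max_{x}\ c^{\top}x \quad \text{s.t.}\quad Ax \le b,\qquad x_i \in \mathbb{Z}\ \text{for } i \in I,\qquad x \ge 0,$$

where only the variables indexed by the set $I$ are required to be integers; when $I$ contains all variables, the problem is a pure integer program.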
7. Branch and bound method
Branch and bound (branch and bound) is one of the most common algorithms for solving integer programming problems. The method can solve not only pure integer programming but also mixed integer programming. The branch-and-bound method is a search and iteration method, and different branch variables and subproblems are selected for branching. It is essentially a method of searching a target solution to a problem on a solution space tree of the problem.
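For illustration only, the following Python sketch shows the branch-and-bound idea on a toy 0-1 knapsack instance (the instance, the fractional-relaxation bound, and all names are assumptions for illustration; the embodiments of the application are not limited to this form):

```python
def branch_and_bound_knapsack(values, weights, capacity):
    # Sort items by value density so a fractional relaxation gives an upper bound.
    order = sorted(range(len(values)),
                   key=lambda i: values[i] / weights[i], reverse=True)
    best = 0

    def bound(i, cap, val):
        # Upper bound for this subproblem: fill the remaining capacity greedily,
        # allowing a fraction of the last item (relaxation of the integrality constraint).
        for j in order[i:]:
            if weights[j] <= cap:
                cap -= weights[j]
                val += values[j]
            else:
                return val + values[j] * cap / weights[j]
        return val

    def search(i, cap, val):
        nonlocal best
        best = max(best, val)
        if i == len(order) or bound(i, cap, val) <= best:
            return                                             # prune: bound cannot beat incumbent
        j = order[i]
        if weights[j] <= cap:
            search(i + 1, cap - weights[j], val + values[j])   # branch: take item j
        search(i + 1, cap, val)                                # branch: skip item j

    search(0, capacity, 0)
    return best

print(branch_and_bound_knapsack([60, 100, 120], [10, 20, 30], 50))  # expected 220
```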
First, a system architecture of a processing method for a combinatorial optimization task provided in an embodiment of the present application is introduced. Referring to fig. 5, a system architecture 200 is provided in an embodiment of the present application. As shown in the system architecture 200 in fig. 5, a data collection device 260 is used to collect training data. For the processing method of the combinatorial optimization task in the embodiment of the present application, the ranking model may be further trained through training data, that is, the training data collected by the data collecting device 260 may include a first sample solution set, a second sample solution set, and a sample target solution.
After the training data is collected, the data collection device 260 stores the training data in the database 230, and the training device 220 trains the target model/rule 201 (i.e., the pre-trained neural network model in the embodiment of the present application) based on the training data maintained in the database 230. The training device 220 inputs training data into the neural network model until a difference between output prediction data of the training neural network model and sample data satisfies a preset condition (e.g., the difference between the prediction data and the sample data is less than a certain threshold, or the difference between the prediction data and the sample data remains unchanged or no longer decreases), thereby completing training of the target model/rule 201.
In the embodiment provided by the present application, the target model/rule 201 is a first ranking model obtained through training data; through the first ranking model, a solution set corresponding to the maximum target confidence among a plurality of target confidences may be selected from a plurality of solution sets as a second solution set according to the plurality of target confidences corresponding to the plurality of solution sets. The plurality of solution sets are obtained by randomly grouping the candidate solutions in the first solution set, the plurality of solution sets correspond to the plurality of target confidences one to one, each of the plurality of target confidences is used for indicating the confidence that the corresponding solution set includes the target solution, and the target solution may be a candidate solution in the first solution set that enables the planning result of the target problem to meet a preset constraint condition. It should be noted that, in practical applications, the training data maintained in the database 230 does not necessarily all come from the collection of the data collection device 260, and may also be received from other devices.
It should be noted that, the training device 220 does not necessarily perform the training of the target model/rule 201 based on the training data maintained by the database 230, and may also obtain the training data from the cloud or other places for performing the model training, and the above description should not be taken as a limitation to the embodiment of the present application.
It should also be noted that at least a portion of the training data maintained in the database 230 may also be used by the execution device 210 when processing the problem to be processed.
The target model/rule 201 obtained by training with the training device 220 may be applied to different systems or devices, for example, the execution device 210 shown in fig. 5, where the execution device 210 may be a terminal such as a mobile phone, a tablet computer, a notebook computer, an AR/VR device, or a vehicle-mounted terminal, and may also be a server or a cloud.
In fig. 5, the execution device 210 configures an input/output (I/O) interface 212 for data interaction with an external device, and a user may input data to the I/O interface 212 through the client device 240, where the input data may include: a pending question entered by the client device.
The preprocessing module 213 and the preprocessing module 214 are used for preprocessing input data (such as a problem to be processed) received by the I/O interface 212. In the embodiment of the present application, the input data may be processed directly by the calculation module 211 without the preprocessing module 213 and the preprocessing module 214 (or only one of them may be used).
In the process that the execution device 210 preprocesses the input data, or in the process that the calculation module 211 of the execution device 210 performs the calculation or other related processes, the execution device 210 may call the data, the code, and the like in the data storage system 250 for corresponding processes, or store the data, the instruction, and the like obtained by corresponding processes in the data storage system 250.
Finally, the I/O interface 212 returns the processing result, i.e. the target node of the to-be-processed problem obtained as described above, to the client device 240, so as to provide the user with the path information of the branch-and-bound algorithm.
It should be noted that the training device 220 may generate corresponding target models/rules 201 for different targets or different tasks based on different training data, and the corresponding target models/rules 201 may be used to achieve the targets or complete the tasks, so as to provide the user with the required results.
In the case shown in fig. 5, in one case, the user may manually give the input data, which may be operated through an interface provided by the I/O interface 212.
Alternatively, the client device 240 may automatically send the input data to the I/O interface 212, and if the client device 240 is required to automatically send the input data to obtain authorization from the user, the user may set the corresponding permissions in the client device 240. The user can view the result output by the execution device 210 at the client device 240, and the specific presentation form can be display, sound, action, and the like. The client device 240 may also serve as a data collection terminal, collecting input data of the input I/O interface 212 and output results of the output I/O interface 212 as new sample data, and storing the new sample data in the database 230. Of course, the input data input to the I/O interface 212 and the output result output from the I/O interface 212 as shown in the figure may be directly stored in the database 230 as new sample data by the I/O interface 212 without being collected by the client device 240.
It should be noted that fig. 5 is only a schematic diagram of a system architecture provided in the embodiment of the present application, and the position relationship between the devices, modules, and the like shown in the diagram does not constitute any limitation. For example, in FIG. 5, the data storage system 250 is an external memory with respect to the execution device 210, in other cases, the data storage system 250 may be disposed in the execution device 210.
As shown in fig. 5, a target model/rule 201 is obtained by training according to the training device 220, and the target model/rule 201 may be a ranking model in the embodiment of the present application; for example, the pre-trained neural network model provided in the embodiment of the present application may be used to output N candidate solutions in the first solution set, and the pre-trained neural network model may be a deep neural network, a convolutional neural network, or may be a deep convolutional neural network, or the like.
Fig. 6 is a schematic diagram of a hardware structure of a chip according to an embodiment of the present disclosure.
The chip shown in fig. 6 may include a neural-Network Processing Unit (NPU) 300; the chip may be provided in the execution device 210 as shown in fig. 5 to complete the calculation work of the calculation module 211. The chip may also be disposed in the training device 220 as shown in fig. 5 to complete the training work of the training device 220 and output the target model/rule 201.
The NPU 300 is mounted as a coprocessor on a main processing unit (CPU), and tasks are allocated by the main CPU. The core portion of the NPU 300 is an arithmetic circuit 303, and the controller 304 controls the arithmetic circuit 303 to extract data in a memory (weight memory or input memory) and perform an operation.
In some implementations, the arithmetic circuitry 303 includes a plurality of processing units (PEs) internally. In some implementations, the operational circuitry 303 is a two-dimensional systolic array; the arithmetic circuit 303 may also be a one-dimensional systolic array or other electronic circuit capable of performing mathematical operations such as multiplication and addition. In some implementations, the arithmetic circuitry 303 is a general-purpose matrix processor.
For example, assume there are an input matrix a, a weight matrix B, and an output matrix C; the arithmetic circuit 303 fetches the data corresponding to the matrix B from the weight memory 302 and buffers the data on each PE in the arithmetic circuit 303; the arithmetic circuit 303 takes the matrix a data from the input memory 301 and performs matrix operation with the matrix B, and a partial result or a final result of the obtained matrix is stored in an accumulator 308 (accumulator).
The vector calculation unit 307 may further process the output of the operation circuit 303, such as vector multiplication, vector addition, exponential operation, logarithmic operation, magnitude comparison, and the like. For example, the vector calculation unit 307 may be used for network calculation of a non-convolution/non-FC layer in a neural network, such as pooling (Pooling), batch normalization (batch normalization), local response normalization (local response normalization), and the like.
In some implementations, the vector calculation unit 307 can store the processed output vector to the unified memory 306. For example, the vector calculation unit 307 may apply a non-linear function to the output of the arithmetic circuit 303, such as a vector of accumulated values, to generate the activation value. In some implementations, the vector calculation unit 307 generates normalized values, combined values, or both.
In some implementations, the vector of processed outputs can be used as activation inputs to the arithmetic circuitry 303, for example, for use in subsequent layers in a neural network.
The unified memory 306 is used to store input data as well as output data. Through a direct memory access controller (DMAC) 305, the input data in the external memory is transferred into the input memory 301 and/or the unified memory 306, the weight data in the external memory is transferred into the weight memory 302, and the data in the unified memory 306 is transferred into the external memory.
A bus interface unit 310 (BIU) for implementing interaction between the main CPU, the DMAC, and the instruction fetch memory 309 through a bus.
An instruction fetch buffer 309(instruction fetch buffer) connected to the controller 304 is used for storing instructions used by the controller 304; the controller 304 is configured to call an instruction cached in the instruction fetch memory 309, so as to control the operation process of the operation accelerator.
Generally, the unified memory 306, the input memory 301, the weight memory 302, and the instruction fetch memory 309 are On-Chip (On-Chip) memories, the external memory is a memory outside the NPU, and the external memory may be a double data rate synchronous dynamic random access memory (DDR SDRAM), a High Bandwidth Memory (HBM), or other readable and writable memories.
For example, the operation of the pre-trained neural network model in the embodiment of the present application may be performed by the operation circuit 303 or the vector calculation unit 307.
The executing device 210 in fig. 5 described above is capable of executing the steps of the processing method of the combinatorial optimization task according to the embodiment of the present application, and the chip shown in fig. 6 may also be used for executing the steps of the processing method of the combinatorial optimization task according to the embodiment of the present application.
Fig. 7 illustrates a system architecture provided in an embodiment of the present application. The system architecture 400 may include a local device 420, a local device 430, and an execution device 410 and a data storage system 450, where the local device 420 and the local device 430 are connected with the execution device 410 through a communication network.
Illustratively, the execution device 410 may be implemented by one or more servers.
Alternatively, the execution device 410 may be used with other computing devices. For example: data storage, routers, load balancers, and the like. The execution device 410 may be disposed on one physical site or distributed across multiple physical sites. The execution device 410 may use the data in the data storage system 450 or call program code in the data storage system 450 to implement the processing method of the combinatorial optimization task of the embodiment of the present application.
It should be noted that the execution device 410 may also be referred to as a cloud device, and at this time, the execution device 410 may be deployed in the cloud.
Specifically, the execution device 410 may perform the following processes:
acquiring a first solution set aiming at a target problem from a memory, wherein the target problem is a problem to be solved for realizing a target combination optimization task, the first solution set comprises M candidate solutions aiming at the target problem, and M is an integer greater than 1; grouping the candidate solutions in the first solution set to obtain a plurality of solution sets; inputting the solution sets into a pre-trained neural network model to obtain a plurality of target confidence coefficients, wherein the target confidence coefficients correspond to the solution sets one to one, each target confidence coefficient in the target confidence coefficients is used for representing the confidence coefficient that the corresponding solution set comprises a target solution, and the target solution is a candidate solution which enables the planning result of the target problem to meet a preset constraint condition in the first solution set; according to the target confidence degrees, selecting a solution set corresponding to the maximum target confidence degree in the target confidence degrees from the solution sets to obtain a second solution set; and selecting the target solution from the second solution set.
In a possible implementation manner, the processing method of the combination optimization task in the embodiment of the present application may be an offline method executed in a cloud, for example, the processing method in the embodiment of the present application may be executed in the execution device 410.
In a possible implementation manner, the processing method of the combinatorial optimization task according to the embodiment of the present application may be performed by the local device 420 or the local device 430.
For example, a user may operate respective user devices (e.g., local device 420 and local device 430) to interact with the execution device 410. Each local device may represent any computing device, such as a personal computer, computer workstation, smartphone, tablet, smart camera, smart car or other type of cellular phone, media consumption device, wearable device, set-top box, game console, and so forth. The local devices of each user may interact with the enforcement device 410 via a communication network of any communication mechanism/standard, such as a wide area network, a local area network, a peer-to-peer connection, etc., or any combination thereof.
In an implementation manner, the local device 420 and the local device 430 may acquire the relevant parameters of the pre-trained neural network model from the execution device 410, deploy the pre-trained neural network model on the local device 420 and the local device 430, and perform processing of a combination optimization task by using the pre-trained neural network model.
In another implementation, the pre-trained neural network model may be deployed directly on the performing device 410, and the performing device 410 may obtain relevant parameters of the pre-trained neural network model from the local device 420 and the local device 430.
The following describes the technical solution of the embodiment of the present application in detail with reference to fig. 8 to 15.
Fig. 8 is a schematic flowchart of a processing method of a combinatorial optimization task provided in an embodiment of the present application. In some examples, the method 500 may be performed by the execution device 210 in fig. 5, the chip shown in fig. 6, or the execution device 410 in fig. 7. The method 500 in fig. 8 may include steps S510 to S550, which are described in detail below.
S510, the computing device obtains a first solution set aiming at the target problem from the memory.
The target problem is a problem to be solved for realizing a target combination optimization task, the first solution set comprises M candidate solutions aiming at the target problem, and M is an integer larger than 1.
In one possible implementation, the objective combinatorial optimization task may include a combinatorial optimization problem in industrial production. Combinatorial optimization problems in industrial production may include any of vehicle dispatch problems, three-dimensional bin packing problems, multi-level supply chain management problems, advertisement/product bidding problems, plant scheduling problems, and plant selection problems.
It should be understood that a combinatorial optimization task may refer to a task of finding an optimal solution or a near-optimal solution within a finite set in order to solve a certain combinatorial optimization problem. In general, combinatorial optimization problems can be represented by mixed integer programming and may also be NP-hard problems.
The computing device may be a device having the function of processing combinatorial optimization tasks, for example, a device with computing capability such as a server or a computer; alternatively, the computing device may also refer to a chip having a computing function, for example, a chip disposed in a server or a chip disposed in a computer. The computing device may include a memory and a processor; the memory may be configured to store program code, and the processor may be configured to invoke the program code stored in the memory to implement the corresponding functions of the computing device. The processor and the memory included in the computing device may be implemented by a chip, which is not specifically limited herein.
S520, grouping the candidate solutions in the first solution set by the computing equipment to obtain a plurality of solution sets.
Optionally, in a possible implementation manner, the grouping, by the computing device, the candidate solutions in the first solution set to obtain a plurality of solution sets includes:
the computing device randomly groups the candidate solutions in the first solution set to obtain the plurality of solution sets.
Illustratively, the computing device may randomly and evenly group the candidate solutions in the first solution set to obtain the plurality of solution sets.
It should be noted that the number of candidate solutions included in each solution set in the multiple solution sets is less than M.
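For illustration only, the random (and even) grouping of step S520 can be sketched as follows (the function and parameter names are assumptions for illustration):

```python
import random

def random_group(first_solution_set, num_groups):
    # Randomly and evenly partition the M candidate solutions into
    # num_groups solution sets; each resulting set is smaller than M.
    candidates = list(first_solution_set)
    random.shuffle(candidates)
    return [candidates[i::num_groups] for i in range(num_groups)]

# Example: 9 candidate solutions split into 3 solution sets
print(random_group(range(1, 10), 3))
```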
S530, inputting the multiple solution sets into a pre-trained neural network model by the computing equipment to obtain multiple target confidence degrees.
The target confidence degrees are in one-to-one correspondence with the solution sets, each target confidence degree in the target confidence degrees is used for indicating that the corresponding solution set comprises the confidence degree of the target solution, and the target solution is a candidate solution which enables the planning result of the target problem to meet a preset constraint condition in the first solution set.
It should be understood that the confidence that each target confidence in the plurality of target confidence levels is used to indicate that the corresponding solution set includes the target solution may refer to the probability that each target confidence level is used to indicate that the corresponding solution set includes the target solution; the greater the confidence that a solution set includes a target solution, the greater the probability that the solution set includes the target solution.
It should also be understood that sample problems of different natures correspond to different applicable scenarios, and their solution paths also differ. Therefore, after training is finished, the neural network model is only suitable for target problems of the same type in the same or a similar scenario as the training data; for example, if the neural network is trained with data corresponding to a factory scheduling combinatorial optimization task, the pre-trained neural network model is only suitable for processing factory scheduling combinatorial optimization tasks.
And S540, selecting a solution set corresponding to the maximum target confidence coefficient in the target confidence coefficients from the solution sets by the computing equipment according to the target confidence coefficients to obtain a second solution set.
And S550, selecting a target solution from the second solution set by the computing equipment.
In the embodiment of the present application, a pre-trained neural network model is used to select some candidate solutions from all candidate solutions corresponding to the target problem, where the number of selected candidate solutions is smaller than the number of all candidate solutions; the target solution is then selected from these candidate solutions. In other words, when solving the problem that needs to be solved to realize the target combinatorial optimization task, a partial data set is first selected from a large candidate data set, and the target solution is then selected from that partial data set; this avoids the long solving time and low solving efficiency of selecting a target solution directly from a large data set. With the processing method of the embodiment of the present application, the time needed to solve the problem underlying the combinatorial optimization task can therefore be shortened, thereby improving the processing efficiency of the combinatorial optimization task.
It should be noted that the target solution may be a candidate solution that enables a planning result of the target problem to meet a preset constraint condition; wherein, the target problem may refer to a functional relationship including a plurality of variables; the candidate solution may refer to a candidate value of each of a plurality of variables in the functional relationship; the target solution may refer to a candidate value that enables the plurality of variables to satisfy a preset constraint condition among the candidate values of the plurality of variables; the preset constraint condition can be constraint limitation on candidate values of each variable and/or constraint limitation on the whole of a plurality of variables; for example, the preset constraint condition may refer to candidate values of the plurality of variables, so that the plurality of variables can reach a maximum value, a minimum value, or other constraint limits as a whole.
For example, the candidate solution may refer to a candidate target capacity of each of a plurality of production objects in the production scheduling task; the preset constraint condition may be a constraint limit on the daily capacity of each production object, or the preset constraint condition may be that, under a certain raw material condition, the plurality of production objects can meet the required product quantity while the profit of each production object is maximized; the target solution may be, among the candidate target capacities of the production objects, the candidate target capacity that maximizes the profit of each production object.

Optionally, in a possible implementation manner, the selecting, by the computing device, a target solution from the second solution set includes:
and the computing equipment selects a candidate solution with the maximum resource allocation rate or the maximum profit when the target combination optimization task is realized from the second solution set to obtain the target solution.
In a specific application scenario, when the target combinatorial optimization task is a production scheduling task, the target problem refers to achieving the maximum production profit according to the required product quantity and the raw material quantity, the first solution set includes M planning results, one of the M planning results is used for representing a target capacity of each of a plurality of production objects, and the target solution refers to the planning result among the M planning results that achieves the maximum production profit.
In a specific application scenario, when the target combinatorial optimization task is a three-dimensional boxing task, the target problem refers to achieving the maximum space utilization rate of the box according to the sizes of a plurality of articles to be boxed and the size of the box, the first solution set includes M boxing results, one of the M boxing results is used for representing that some articles to be boxed are selected from the plurality of articles to be boxed and packed into the box, and the target solution refers to the boxing result among the M boxing results that achieves the maximum space utilization rate of the box.
In a specific application scenario, when the target combinatorial optimization task is a site selection task, the target problem refers to achieving the maximum total service area according to the candidate addresses of a plurality of objects to be built and the service area corresponding to each candidate address, the first solution set includes M site selection results, one of the M site selection results is used for representing a target address of each of the plurality of objects to be built and the service area corresponding to the target address of each object to be built, and the target solution refers to the site selection result among the M site selection results that achieves the maximum total service area.
In one example, the computing device selecting the target solution from the second solution set may mean that the computing device obtains labels of the candidate solutions in the second solution set through a ranking model and selects the target solution according to those labels. For example, the specific flow can be seen in the second ranking model in fig. 11 below.
In the embodiment of the present application, in the process of processing the combinatorial optimization task in step S530, all candidate solutions included in the first solution set may be first divided into a plurality of solution sets; obtaining a plurality of target confidence coefficients corresponding to a plurality of solution sets through a pre-trained neural network, and selecting a solution set corresponding to the maximum target confidence coefficient from the plurality of solution sets as a second solution set; further, a target solution can be determined from the second solution set. By the processing method in the embodiment of the application, the time for solving the problem required to be solved for realizing the combined optimization task can be shortened when the combined optimization task is processed, so that the processing efficiency of the combined optimization task is improved.
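For illustration only, steps S510 to S550 can be sketched end to end as follows (the stand-in confidence model, the objective function, and all names are assumptions for illustration; they do not describe the actual pre-trained neural network model):

```python
import random

def process_combinatorial_task(candidates, confidence_model, objective, num_groups):
    # S510/S520: take the first solution set and randomly group it.
    pool = list(candidates)
    random.shuffle(pool)
    solution_sets = [pool[i::num_groups] for i in range(num_groups)]
    # S530: one target confidence per solution set from the (pre-trained) model.
    confidences = [confidence_model(s) for s in solution_sets]
    # S540: the solution set with the largest target confidence is the second solution set.
    second_solution_set = max(zip(confidences, solution_sets), key=lambda p: p[0])[1]
    # S550: select the target solution from the second solution set,
    # e.g. the candidate with the best objective value (profit, utilisation, ...).
    return max(second_solution_set, key=objective)

# Toy usage with a stand-in model and objective
best = process_combinatorial_task(
    candidates=range(20),
    confidence_model=lambda s: sum(s) / len(s),
    objective=lambda x: x,
    num_groups=4,
)
print(best)
```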
Optionally, in a possible implementation manner, the pre-trained neural network model is obtained by using the following training method: acquiring training data, wherein the training data comprises a sample problem, a first sample solution set, a second sample solution set and a sample target solution, the sample problem is a problem to be solved for realizing a sample combination optimization task, the first sample solution set comprises K candidate solutions aiming at the sample problem, the second sample solution set is formed by selecting L candidate solutions from the first sample solution set, the second sample solution set comprises the sample target solution, the sample target solution is a candidate solution which enables a planning result of the sample target problem to meet a preset constraint condition in the first sample solution set, K is an integer larger than 1, and L is an integer smaller than K; and training the neural network model by taking the first sample solution set as input data to obtain a second sample solution set as a training target.
Exemplarily, the processing method further includes: the computing device groups the candidate solutions in the first sample solution set to obtain the second sample solution set; the training the neural network model with the training data includes:
obtaining a first confidence that the second sample solution set includes the sample target solution by using the neural network model; and training the neural network model according to the first confidence.
When training the pre-trained neural network model, the first sample solution set is grouped to obtain a plurality of sample solution sets; the plurality of sample solution sets include the second sample solution set, and the second sample solution set refers to the solution set, among the plurality of sample solution sets, that includes the sample target solution. The plurality of sample solution sets are input into the neural network model, and the neural network model is used to obtain the first confidence that the second sample solution set includes the sample target solution; the neural network model is then trained according to the first confidence.
In other words, the neural network model may be used to obtain, for each of the plurality of sample solution sets, a first confidence that the sample solution set includes the sample target solution, and the model parameters of the neural network are continuously updated through a reverse iterative algorithm according to these first confidences, so that the first confidence of the second sample solution set becomes greater than the first confidences of the other sample solution sets in the plurality of sample solution sets; that is, so that the neural network model is able to identify, from the plurality of sample solution sets, the second sample solution set that includes the sample target solution.
In an embodiment of the present application, the plurality of sample solution sets may be fixed and invariant, and the label of each sample solution set may be dynamically adjusted by dynamically adjusting the corresponding confidence degrees of the plurality of solution sets, where the label of each sample solution set is used to indicate the confidence degree that each sample solution set includes the target solution, and the neural network model is trained according to the confidence degrees of the plurality of sample solution sets; compared with a training mode of dynamically grouping a plurality of sample solution sets, the training time of the neural network model can be shortened by dynamically adjusting the label of each sample solution set.
Optionally, in a possible implementation manner, the grouping, by the computing device, the candidate solutions in the first sample solution set to obtain the second sample solution set includes:
the computing device randomly groups the candidate solutions in the first sample solution set to obtain the second sample solution set.
Illustratively, the computing device may randomly and evenly group the candidate solutions in the first sample solution set.
Further, the first confidence that each sample solution set in the plurality of sample solution sets includes the sample target solution may be calculated by the confidence that each sample solution in each sample solution set is the sample target solution.
Optionally, in a possible implementation, the obtaining, by using the neural network model, a first confidence that the second sample solution set includes the sample target solution includes:
and performing weighted calculation by using the neural network model according to the confidence coefficient of each candidate solution in the second sample solution set as the sample target solution to obtain the first confidence coefficient.
Illustratively, one sample solution set includes 3 sample solutions, and the first confidence of the sample solution set including the sample target solution may be obtained by performing weighted summation on the confidence of each of the 3 sample solutions as the sample target solution.
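For illustration only, the weighted calculation of the first confidence can be sketched as follows (the uniform default weights are an assumption for illustration):

```python
def set_confidence(solution_confidences, weights=None):
    # First confidence of a sample solution set: a weighted combination of the
    # confidences that each candidate solution in the set is the sample target solution.
    n = len(solution_confidences)
    weights = weights or [1.0 / n] * n
    return sum(w * c for w, c in zip(weights, solution_confidences))

# Example: a sample solution set containing 3 candidate solutions
print(set_confidence([0.2, 0.7, 0.1]))
```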
Optionally, in a possible implementation, the confidence of each candidate solution as the sample target solution is obtained by the neural network model by using an adaptive enhancement algorithm.
Illustratively, the confidence of each solution in each sample solution set being the sample target solution may be obtained by adopting an adaptive boosting (AdaBoost) algorithm.
It should be understood that AdaBoost is an iterative algorithm whose core idea is to train different classifiers (weak classifiers) on the same training set and then combine these weak classifiers to form a stronger final classifier (strong classifier). The algorithm is realized by changing the data distribution: it determines the weight of each sample according to whether each sample in each training set is classified correctly and according to the accuracy of the previous overall classification. The data set with modified weights is then sent to the next classifier for training, and finally the classifiers obtained from each round of training are fused into the final decision classifier.
The essence of the adaptive boosting mechanism here is to form a weighted-average label (e.g., a score) for each group.
The method specifically includes the following steps:
Step 1: initially, make the weight of each sample solution equal, and divide the sample solutions into initial groups according to their positions; the score of an initial group is the average of the scores of all sample solutions in that group.
Step 2: rank all sample solutions and all groups respectively, and determine the group in which the optimal solution (for example, the candidate solution with the highest score) is located.
Step 3: if the best-ranked group includes the optimal solution, stop the iteration; the final score of each group is its score at this moment. Otherwise, execute step 4.
Step 4: if the best-ranked group does not include the optimal solution, raise the weight of the optimal solution and re-score the groups.
Step 5: re-rank the groups and return to step 3.
The above adaptive boosting mechanism can ensure that the best-ranked group includes the sample target solution, that is, that the target solution among the plurality of candidate solutions is contained in the best-ranked group.
For example, assume that the number of all sample candidate solutions is 9 and that every 3 of the 9 sample candidate solutions are divided into one group in the initial grouping, so that three initial groups are obtained, namely group 1: sample candidate solutions 1, 2, 3; group 2: sample candidate solutions 4, 5, 6; group 3: sample candidate solutions 7, 8, 9. Assume that the weight of each sample candidate solution in the initial grouping is 1, and score the sample candidate solutions in the initial grouping: the scores of the solutions in group 1 are 1, 2 and 3; the scores of the solutions in group 2 are 2, 2 and 2; the scores of the solutions in group 3 are 1, 1 and 4. Taking the weighted average of the initial scores gives group 1 a score of 2, group 2 a score of 2, and group 3 a score of 2. According to the scores of the solutions in each group, sample candidate solution 9 has the highest score among the 9 solutions, so solution 9 can be regarded as the optimal solution. The weights of the sample candidate solutions can then be adjusted; for example, the weights in group 1 are adjusted to 1, 1 and 3, the weights in group 3 are adjusted to 1, 1 and 4, and the weights in group 2 are not adjusted. The score of each group is then recalculated with the adjusted weights, giving group 1 a score of 4, group 2 a score of 2, and group 3 a score of 6; group 3 can therefore be determined to be the best group, and the optimal solution is included in group 3.
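For illustration only, the dynamic group-labelling idea of steps 1 to 5 can be sketched as follows (a simplified sketch: the doubling of the optimal solution's weight and the fixed iteration cap are assumptions for illustration, not the exact update rule of the application):

```python
import numpy as np

def dynamic_group_labels(scores, groups, max_iter=20):
    # scores[i]: score of candidate solution i; groups: fixed index groups.
    scores = np.asarray(scores, dtype=float)
    weights = np.ones(len(scores))                  # step 1: equal initial weights
    best_solution = int(np.argmax(scores))          # step 2: locate the optimal solution
    for _ in range(max_iter):
        group_scores = [float(np.mean(weights[g] * scores[g])) for g in groups]
        best_group = int(np.argmax(group_scores))
        if best_solution in groups[best_group]:     # step 3: best group holds the optimal solution
            break
        weights[best_solution] *= 2                 # step 4: raise the optimal solution's weight
        # step 5: groups are re-scored and re-ranked on the next iteration
    return group_scores

# Worked example similar to the text: 9 candidate solutions in 3 groups of 3
scores = [1, 2, 3, 2, 2, 2, 1, 1, 4]
groups = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
print(dynamic_group_labels(scores, groups))
```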
Optionally, in a possible implementation, the confidence of each candidate solution as the sample target solution is obtained by the neural network model by using a binary label mechanism.
Illustratively, a 0-1 binary classification scoring mechanism may be employed.
For example, all candidate solutions are randomly grouped to obtain an initial grouping; each candidate solution in the initial grouping is scored, the label of the group that includes the optimal solution is set to 1, and the labels of the remaining groups are all set to 0.
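A minimal sketch of this 0-1 labeling strategy, assuming each candidate solution already carries a scalar score and treating the highest-scoring candidate as the optimal solution; the function name is illustrative only and is not an interface defined by this application.

import random

def binary_group_labels(candidate_scores, num_groups):
    # Randomly group the candidate solutions and assign 0/1 labels to the groups:
    # the single group containing the highest-scoring candidate (treated here as the
    # optimal solution) receives label 1, all other groups receive label 0.
    indices = list(range(len(candidate_scores)))
    random.shuffle(indices)
    groups = [indices[i::num_groups] for i in range(num_groups)]
    best = max(range(len(candidate_scores)), key=lambda i: candidate_scores[i])
    labels = [1 if best in g else 0 for g in groups]
    return groups, labels

groups, labels = binary_group_labels([0.2, 0.9, 0.4, 0.1, 0.7, 0.3], num_groups=3)
print(groups, labels)  # exactly one group is labeled 1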
In one example, as shown in fig. 9, the processing method of the combinatorial optimization task in the embodiment of the present application may mainly include two steps: determining the target group (e.g., the second solution set), and determining the target solution within the target group.
Specifically, the first step is to group all candidate solutions to obtain an initial grouping, and to score each group of the initial grouping so as to obtain the target group in the initial grouping;
the second step is to sort the candidate solutions in the target group so as to obtain the Top-K ranked candidate solutions, i.e., to select the target solution within the target group.
For example, the first step may be performed in the neural network model pre-trained in the embodiment of the present application, that is, may refer to a pre-trained inter-group ranking model; the second step described above may be performed in a pre-trained intra-group ranking model.
It should be noted that the specific training process of the above-mentioned pre-trained intra-group ranking model may follow training methods of ranking models in the prior art, which is not limited in this application.
For example, as shown in fig. 9, the second solution set may be determined from the multiple solution sets obtained by randomly grouping all candidate solutions in the first solution set, together with the confidence that each solution set includes the target solution; the second solution set may be the solution set whose confidence of including the target solution is greater than that of the other solution sets among the multiple solution sets, for example, the highest-scoring one of the initial groups. Further, the plurality of candidate solutions included in the second solution set may be ranked, for example by scoring them, so as to obtain the highest-scoring candidate solution as the target solution.
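A rough, non-authoritative sketch of this two-step flow is given below; the scoring callables stand in for the pre-trained inter-group and intra-group ranking models, and their names (score_groups, score_candidates) are placeholders rather than interfaces defined by this application.

import random

def select_target_solution(candidate_solutions, num_groups,
                           score_groups, score_candidates, top_k=1):
    # Stage 1: randomly group all candidate solutions and score every group;
    # score_groups(groups) returns one confidence per group that it contains the target solution.
    shuffled = list(candidate_solutions)
    random.shuffle(shuffled)
    groups = [shuffled[i::num_groups] for i in range(num_groups)]
    confidences = score_groups(groups)
    best_group = groups[max(range(num_groups), key=lambda i: confidences[i])]  # the second solution set

    # Stage 2: rank the candidates inside the selected group and keep the Top-K;
    # score_candidates(candidates) returns one score per candidate solution.
    scores = score_candidates(best_group)
    order = sorted(range(len(best_group)), key=lambda i: scores[i], reverse=True)
    return [best_group[i] for i in order[:top_k]]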
FIG. 10 is a schematic diagram of inter-group dynamic scoring provided by an embodiment of the present application. As shown in fig. 10, learning of the inter-group ranking model may be accomplished by scoring different groups (e.g., randomly formed initial groups whose scores are obtained by averaging) and defining labels.
It should be understood that the candidate solution shown in fig. 10 may refer to the solution of each variable required to solve the combinatorial optimization problem.
It should be noted that the most intuitive idea of grouping is to group the candidate solutions dynamically so that the order of the N groups is consistent with the order of the candidate solutions; that is, the best group contains the M/N candidate solutions ranked 1st to M/N-th, the second group contains the M/N candidate solutions ranked (M/N+1)-th to 2M/N-th, and so on. However, in the embodiment of the present application, the learning objective of the pre-trained neural network model, i.e., the inter-group ranking model, is not to divide the candidate solutions into N sorted parts, but to ensure that the target solution among the candidate solutions is included in the best group, i.e., that the target solution is included in the target group.
Illustratively, the adaptive boosting algorithm described above, or a binary tagging mechanism, may be employed in dynamically scoring individual packets.
In the embodiment of the application, a relatively accurate group ordering can be learned by either of the two scoring strategies, i.e., by learning the inter-group ranking model, so that a target group is selected; a candidate-solution ranking is then learned only for the candidate solutions in the target group (the candidate solutions in the groups that were not selected are ignored). This approach makes better use of the existing samples because the number of groups is small relative to the number of candidate solutions. For example, for a problem with 10000 variables, the candidate solutions can typically be divided into 10 to 20 groups, so the number of groups is much smaller than the number of candidate solutions, and the process of learning the inter-group ranking model is correspondingly simple: for this example, the inter-group ranking model only needs to rank 10 to 20 groups. When training the inter-group ranking model, a "group" is treated as a "whole", i.e., as a "variable" in the model, and learning an inter-group ranking model on a fixed number of samples of fixed quality is relatively easy. Further, a model of inter-variable ranking may be learned for the target group determined from the plurality of candidate groups, i.e., the one or more groups ranked first by the inter-group ranking model; still taking the above example, only the best-ranked group out of the 10000 variables is selected, so the number of candidate solutions that still need to be processed is far smaller than 10000. Therefore, by adopting the processing method of the combined optimization task provided by the embodiment of the application, the processing efficiency of the combined optimization task can be improved.
Fig. 11 is a schematic flowchart of a training method of a neural network model (i.e., the above-mentioned inter-group ranking model) provided in an embodiment of the present application. In some examples, the training method 600 may be performed by the execution device 210 in fig. 5, the chip shown in fig. 6, or the execution device 410 in fig. 7. The training method 600 in fig. 11 may include steps S610 to S640, which are described in detail below.
It should be appreciated that, in embodiments of the present application, the inter-group ranking model may be the pre-trained neural network model of FIG. 8 described above.
S610, inputting a sample problem.
The sample problem may be a problem to be solved for realizing a sample combination optimization task, for example, a combinatorial optimization problem in industrial production such as resource allocation or resource scheduling; the goal is to maximize or minimize certain objectives under given resource and condition constraints.
Illustratively, the sample problems may include, but are not limited to, any of factory scheduling, production scheduling, three-dimensional boxing, and factory site selection.
It should be understood that sample problems used to train the neural network model differ in nature and applicable scenario, so the path of the solution also differs. Therefore, after the neural network model is trained, the training data of the sample problem used for training is only suitable for problems of the same type under the same or similar scenarios; for example, a model trained on factory-scheduling data is only suitable for testing on factory-scheduling problems.
S620, acquiring a sample set, i.e., acquiring training data.
For example, for a certain sample problem, sample sets need to be obtained according to the features defined by the sample problem and divided into a training set, a test set, and a validation set. The training set is the sample set used for training and is mainly used to train the parameters of the neural network; the validation set is the sample set used to verify model performance, and after different neural networks have been trained on the training set, their performance is compared and judged on the validation set (the different models here mainly refer to neural networks corresponding to different hyper-parameters, but may also be neural networks with completely different structures); the test set is used to objectively evaluate the performance of the trained neural network.
Illustratively, the training data may refer to historical strategies that solve the sample problem described above; for example, the historical allocation result corresponding to the sample problem and the historical profit data corresponding to the historical allocation result are solved.
In one example, assume that the sample problem is a factory production problem for a future day, for example, how to distribute 10000 raw materials among 5 plants so as to maximize the profit of each of the 5 plants; the training sample set may then be historical data such as the daily distribution results of the 10000 raw materials among the 5 plants over the 30 days preceding that day, and the daily profit value of each of the 5 plants.
In another example, assume that the sample problem is an assembly-line balancing problem for a future day, for example, how to allocate 100,000 products to 100 production lines so as to minimize the assembly-line cycle for producing the 100,000 products; the training sample set may then be historical data such as the daily allocation results of the 100,000 products to the 100 production lines over the 50 days preceding that day, and the daily completion efficiency of each production line.
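Purely as an illustration of what such training records could look like (the field names below are hypothetical and not defined by this application), the factory production example might be represented and split as follows.

from dataclasses import dataclass
from typing import List

@dataclass
class AllocationSample:
    # One day of historical data for the factory production example (illustrative fields).
    allocation: List[int]   # amount of raw material assigned to each of the 5 plants
    profits: List[float]    # observed profit of each of the 5 plants on that day

history = [
    AllocationSample(allocation=[2000, 2000, 2000, 2000, 2000], profits=[1.1, 0.9, 1.0, 1.2, 0.8]),
    AllocationSample(allocation=[3000, 1000, 2000, 2500, 1500], profits=[1.4, 0.7, 1.0, 1.3, 0.9]),
]

# A simple chronological split into training and validation data, as described in step S620;
# with only the two illustrative records above this puts one record in each part.
split = int(0.8 * len(history)) or 1
train, validation = history[:split], history[split:]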
It should be understood that the above description is illustrative of the training problem and the training sample, and the present application is not limited thereto.
S630, training the ranking model.
It should be understood that the above-described ranking model may be the ranking model shown in fig. 8, which may include a first ranking model and a second ranking model; the first ranking model may be the inter-group ranking model, i.e., the pre-trained neural network model shown in fig. 8; the second ranking model may be referred to as the intra-group ranking model.
For example, N candidate solutions may be selected from the first solution set by the first ranking model to form a second solution set, and a target solution may then be selected from the second solution set by the second ranking model, wherein the first solution set includes M candidate solutions of a target problem, the target problem is a problem to be solved for realizing a target combination optimization task, M is an integer greater than 1, and N is an integer smaller than M.
For example, the training method of the inter-group ranking model may refer to the processes shown in fig. 8 to 10; namely, the labels are dynamically adjusted on the initial grouping obtained at random, so that the inter-group sorting model can select a target grouping from a plurality of groupings, and the target grouping comprises a target solution.
In one example, at the stage of extracting group features, neighborhood information of the candidate solutions within a group can be extracted. The specific operation flow is as follows: a window size can be set, and when the features of the candidate solutions in a group are extracted, the intra-group features can be extracted according to the given window size; the features include the average score and the score variance of the nodes within the window, as well as the average influence of different combinations of nodes on the objective function. This influence can be tested with multiple samplings, for example by sliding the window and testing 3, 4 and 5 nodes respectively, so as to strengthen the accuracy of the correspondence between the nodes and the group.
The features of each candidate solution may include the score of the candidate solution, the influence on solving the sample problem after the candidate solution is deleted, and so on; the group features may be embodied by the average score, score variance, covariance, etc. of the candidate solutions included in a group.
It should be understood that the candidate solution may refer to candidate values of variables corresponding to solving the mathematical programming problem.
For example, if each group includes 10000 nodes and the window size is 100, this means that 100 of all the nodes are randomly selected and placed in the window each time, so 100 selections are required to cover the 10000 nodes; if the window size is 1000, meaning that 1000 of all the nodes are randomly selected each time, then 10 selections are required to cover the 10000 nodes.
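A minimal sketch of this windowed feature extraction, assuming each node (i.e., candidate solution) in a group already carries a scalar score; the function name and the choice of mean/variance features follow the description above but are otherwise illustrative.

import random
import statistics

def window_features(node_scores, window_size):
    # Extract per-window intra-group features (mean and variance of node scores).
    # Nodes are drawn randomly without replacement, window by window, until every
    # node of the group has been covered, e.g. 10000 nodes / windows of 1000 = 10 windows.
    indices = list(range(len(node_scores)))
    random.shuffle(indices)
    features = []
    for start in range(0, len(indices), window_size):
        window = [node_scores[i] for i in indices[start:start + window_size]]
        features.append((statistics.fmean(window), statistics.pvariance(window)))
    return features

print(window_features([0.1, 0.4, 0.3, 0.9, 0.5, 0.7], window_size=3))  # two windows -> two (mean, variance) pairs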
Illustratively, the second ranking model may refer to a model for ranking a plurality of variables, for example, may refer to a ranking model obtained through a supervised learning mechanism; the training method of the second ranking model may also refer to the training method of the ranking model in the prior art, and is not described herein again.
S640, completing training of the ranking model.
For example, a ranking model is obtained through the trained inter-group ranking model and the trained intra-group ranking model, and the ranking model is used for determining path information of a branch-and-bound algorithm corresponding to a mathematical programming problem.
It should be noted that the training methods of the first ranking model and the second ranking model may be implemented by a traditional machine-learning regression/classification model, or by a deep-learning neural network; the present application does not limit the training methods of the first and second ranking models.
In an embodiment of the present application, a partial solution set (e.g., the second solution set) is selected from all candidate solutions (e.g., the first solution set) by the first of two pre-trained ranking models, i.e., the pre-trained first ranking model, where the number of candidate solutions in the partial set is smaller than the number of all candidate solutions; the pre-trained second ranking model is then used to select the target solution from these partial candidate solutions. In other words, when solving the problem to be solved for realizing the target combination optimization task, a partial data set is first selected from a large candidate data set, and the target solution is then selected from that partial data set. This avoids the long solving time and low solving efficiency of selecting a target solution directly from a larger data set; with the two pre-trained ranking models of the embodiment of the present application, the solving efficiency can be improved when determining the path information of the branch-and-bound algorithm corresponding to the combinatorial optimization problem.
Fig. 12 is a schematic flowchart of a processing method of a combinatorial optimization task provided in an embodiment of the present application. In some examples, the method 700 may be performed by the execution device 210 in fig. 5, the chip shown in fig. 6, and the execution device 410 in fig. 7. The method 700 in fig. 12 may include steps S701 to S705, which are described in detail below.
S701, acquiring the problem to be processed.
Exemplarily, the data to be processed includes a problem to be processed and information of a candidate node set corresponding to the problem to be processed; the candidate node in the candidate node set corresponding to the problem to be processed may be a candidate solution corresponding to the problem to be processed.
For example, the candidate node may refer to a variable corresponding to solving the problem to be processed, and the information of the candidate node may refer to a constraint condition on the variable in the solving process.
S702, processing with the first ranking model.
Illustratively, a plurality of target confidences may be obtained from all candidate solutions (e.g., the first solution set) corresponding to the problem to be processed and the pre-trained first ranking model; the plurality of target confidences may refer to a plurality of prediction labels output by the pre-trained first ranking model.
In one example, the computing device may randomly group all candidate solutions corresponding to the problem to be processed to obtain a plurality of solution sets; inputting the multiple solution sets into a pre-trained first sequencing model to obtain multiple target confidence degrees, wherein the multiple target confidence degrees correspond to the multiple solution sets one by one, each target confidence degree in the multiple target confidence degrees is used for indicating the confidence degree that the corresponding solution set comprises a target solution, and the target solution is a candidate solution which enables the planning result of the target problem to meet a preset constraint condition in the first solution set; and selecting a solution set corresponding to the maximum target confidence degree from the multiple solution sets according to the multiple target confidence degrees to serve as a second solution set.
For example, the labels of the respective solution sets may be embodied by the scores of the respective groups; if the score of the solution set 1 in the multiple solution sets is greater than the score of the solution set 2, it can be said that the probability that the solution set 1 includes the target solution is greater than the probability that the solution set 2 includes the target solution; solution set 1 may be selected as the second solution set described above.
S703, processing with the second ranking model.
For example, the target group among the groups may be obtained through step S702, that is, a second solution set may be selected from the plurality of solution sets; further, a target solution may be obtained from the candidate solutions included in the second solution set and the pre-trained second ranking model.
In one example, the plurality of candidate solutions included in the second solution set may be input into the pre-trained second ranking model to obtain labels of the plurality of candidate solutions, where each label may be used to indicate the confidence that the corresponding candidate solution is the target solution; the target solution in the second solution set is then selected according to the labels of the candidate solutions in the second solution set.
S704, determining the path information of the branch-and-bound algorithm.
Illustratively, the path information of the branch-and-bound algorithm may be determined according to the ranking of the candidate solutions in the second solution set; the path information of the branch-and-bound algorithm is used to represent the order in which the candidate solutions in the target group are selected in the branch-and-bound algorithm.
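As an illustrative sketch only, the ranking of the target group could be turned into such a traversal order as follows; representing the path information as an ordered list of candidate identifiers is an assumption made here, not a format specified by this application.

def branch_and_bound_path(candidates, scores):
    # Order the candidates of the target group by descending score; candidates ranked
    # higher are explored (branched on) earlier in the branch-and-bound search.
    ranked = sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)
    return [candidate for candidate, _ in ranked]

# Example: candidates of the selected group with their intra-group scores.
print(branch_and_bound_path(["x3", "x7", "x1", "x9"], [0.2, 0.8, 0.5, 0.9]))  # ['x9', 'x7', 'x1', 'x3']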
S705, predicting a target solution.
For example, the solution of the problem to be processed can be performed through the obtained branch-and-bound path information, so as to obtain a target solution of each variable corresponding to the problem to be processed.
Further, a target combination result of the problem to be processed can be predicted according to the target solution of each variable.
It should be understood that the target combination optimization result may refer to the result obtained when the combinatorial optimization problem to be processed is solved: the target solution of each variable is obtained under the condition that every variable satisfies the constraint conditions, and the combination optimization result is then obtained according to the target solutions of the variables.
Exemplarily, the target solution may refer to the candidate solution, among the plurality of candidate solutions included in the second solution set, that achieves the maximum resource allocation rate or the maximum profit corresponding to the problem to be processed; the combined optimization result is then obtained according to the target solution of each variable. It should also be understood that the above illustrations are intended to assist persons skilled in the art in understanding the present embodiments, and are not intended to limit the present embodiments to the particular values or particular scenarios illustrated. It will be apparent to those skilled in the art from the foregoing description that various equivalent modifications or changes may be made, and such modifications or changes are intended to fall within the scope of the embodiments of the present application.
Fig. 13 is a schematic diagram of a test effect provided by an embodiment of the present application. The test data used in fig. 13 is an integer hybrid programming problem set (MIP-1), the number of constraints is 500, and the number of variables is 1000; fig. 13 (a) is a schematic diagram showing a comparison of the solving efficiency of MIP-1 for different models, and fig. 13 (b) is a schematic diagram showing a comparison of the number of nodes traversed by different models; the adopted test model comprises the best known open source software of Scientific Constrained Integer Programming (SCIP), wherein the origin represents a target published model; 1 represents a first sequencing model obtained by training with an adaptive boosting algorithm; 2 denotes a first ordering model obtained by pre-training using a binary label mechanism.
As shown in (a) of fig. 13, the solution times of the processing methods of the different combinatorial optimization tasks are compared; as shown in (b) of fig. 13, the numbers of variables that the different processing methods need to traverse are compared. With the processing method of the combined optimization task provided by the embodiment of the present application, the optimal solution, i.e., the target solution, can be found by traversing only 70% of the nodes, whereas the existing processing method needs to traverse 100% of the variables to determine the optimal solution; the solving efficiency is therefore improved.
In one example, the processing method of the combined optimization task provided by the present application is tested on a Set Cover (Set Cover) problem.
The set coverage problem may refer to a standard combination optimization problem, and the corresponding industrial problems may include factory site selection, production line planning, and the like.
In particular, the set covering problem describes a set U and a collection S of subsets of the elements of U; the goal is to find a sub-collection of S whose subsets together contain all the elements of U while minimizing the number of subsets used. For example, for U = {1,2,3,4,5} and S = {{1,2}, {3,4}, {2,4,5}, {4,5}}, a sub-collection satisfying the condition may be O = {{1,2}, {3,4}, {4,5}} or O = {{1,2}, {3,4}, {2,4,5}}. The essence of the set covering problem is therefore to find the smallest covering sub-collection of a set; for example, the set covering problem can often be expressed as:
\min_{x} \; c^{T}x \quad \text{s.t.} \quad Ax \ge \mathbf{1}, \; x \in \{0,1\}^{n}
where x denotes the variables to be solved, and c and A denote the cost coefficients and the constraint coefficient matrix, respectively. It can be seen that the set covering problem is very similar to the standard MIP representation, except that the variables are required to be not merely integers but 0 or 1; such mixed integer programming is therefore also referred to as 0-1 mixed integer programming (0-1 MIP).
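As a sanity check on this small example only (brute-force enumeration, not the branch-and-bound based method of this application), the following snippet confirms that the minimum cover of the U and S above uses three subsets.

from itertools import combinations

U = {1, 2, 3, 4, 5}
S = [{1, 2}, {3, 4}, {2, 4, 5}, {4, 5}]

def minimum_cover(universe, subsets):
    # Brute-force search for the smallest sub-collection of `subsets` whose union covers `universe`.
    for size in range(1, len(subsets) + 1):
        for combo in combinations(subsets, size):
            if set().union(*combo) == universe:
                return list(combo)
    return None

print(minimum_cover(U, S))  # e.g. [{1, 2}, {3, 4}, {2, 4, 5}] -- three subsets are needed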
Fig. 14 is a schematic diagram of a test effect applied to the set covering problem provided by an embodiment of the present application. The test data used in fig. 14 is the set covering problem set SC-1, which contains 5000 problems with 500 constraints and 1000 variables; fig. 14 (a) is a schematic comparison of the solving efficiency of different models on SC-1, and fig. 14 (b) is a schematic comparison of the number of nodes traversed by the different models. The test models used include the best-known open-source solver SCIP (Solving Constraint Integer Programs), where "Original" denotes the published baseline model, 1 denotes the first ranking model trained with the adaptive boosting algorithm, and 2 denotes the first ranking model pre-trained with the binary label mechanism.
As shown in fig. 14 (a), the processing method of the combined optimization task proposed in the present application is advantageous in computational efficiency; as shown in fig. 14 (b), the reason for the improved computational efficiency is that the present application only needs to traverse 50% of the variables, whereas the original method needs to traverse 60% of the variables. Therefore, when applied to the set covering problem, the processing method of the combined optimization task provided by the present application improves on the computing efficiency of the existing processing methods.
In one example, the processing method of the combined optimization task provided by the present application is tested on advertisement pricing/placement problems.
It should be appreciated that the ad pricing/placement problem is distinguished from other problems in that the constraint matrix is more complex.
The data set used in this application is a standard advertisement bidding/auction problem data set. The advertisement pricing/placement problem can also be expressed as an integer programming problem in which b represents a parameter in the optimization objective (e.g., price per unit time/area), y represents the decision variables (e.g., production, sales, holding), s represents indirectly solved variables (e.g., the relation between worker hours and sales), and j represents the ordering relationship of the variables (e.g., serial numbers of different machines or of different production lines).
Fig. 15 is a schematic diagram of the test effect applied to the advertisement pricing/placement problem provided by an embodiment of the present application. The test data used in fig. 15 is a standard advertisement bidding/auction problem data set with 200 constraints and 400 variables; fig. 15 (a) is a schematic comparison of the solving efficiency of different models on this data set, and fig. 15 (b) is a schematic comparison of the number of nodes traversed by the different models. The test models used include the best-known open-source solver SCIP (Solving Constraint Integer Programs), where "Original" denotes the published baseline model, 1 denotes the first ranking model trained with the adaptive boosting algorithm, and 2 denotes the first ranking model pre-trained with the binary label mechanism.
As shown in fig. 15 (a), the processing method of the combined optimization task proposed in the present application is advantageous in computational efficiency; as shown in fig. 15 (b), the reason for the improved computational efficiency is that the original processing method needs to traverse all variables, whereas the processing method of the present application only needs to traverse 90% of the variables to obtain the optimal solution. Therefore, when applied to the advertisement pricing/placement problem, the processing method of the combined optimization task provided by the present application improves on the computing efficiency of the existing processing methods.
It should be noted that, the above illustrates the test effect of the processing method of the combined optimization task provided in the present application on the set coverage problem and the advertisement pricing/placement problem, and the processing method of the combined optimization task provided in the embodiment of the present application may also be applied to other mixed integer programming problems, which is not limited in this application.
The processing method and the test effect of the combination optimization task provided by the embodiment of the present application are described in detail above with reference to fig. 1 to 15; the device embodiment of the present application will be described in detail below with reference to fig. 16 to 17. It should be understood that the processing device for the combination optimization task in the embodiment of the present application may perform the foregoing various processing methods in the embodiment of the present application, that is, the following specific working processes of various products, and reference may be made to the corresponding processes in the foregoing method embodiments.
Fig. 16 is a schematic block diagram of a processing device for combining optimization tasks provided herein.
It should be understood that the processing device 800 may perform the methods illustrated in fig. 8-12. The processing device 800 comprises: an acquisition unit 810 and a processing unit 820.
The obtaining unit 810 is configured to obtain, from a memory, a first solution set for a target problem, where the target problem is a problem that needs to be solved to implement a target combinatorial optimization task, the first solution set includes M candidate solutions for the target problem, and M is an integer greater than 1; the processing unit 820 is configured to group candidate solutions in the first solution set to obtain a plurality of solution sets; inputting the solution sets into a pre-trained neural network model to obtain a plurality of target confidence coefficients, wherein the target confidence coefficients correspond to the solution sets one to one, each target confidence coefficient in the target confidence coefficients is used for representing the confidence coefficient that the corresponding solution set comprises a target solution, and the target solution is a candidate solution which enables the planning result of the target problem to meet a preset constraint condition in the first solution set; according to the target confidence degrees, selecting a solution set corresponding to the maximum target confidence degree in the target confidence degrees from the solution sets to obtain a second solution set; and selecting the target solution from the second solution set.
Optionally, as an embodiment, the processing unit 820 is specifically configured to:
and randomly grouping the candidate solutions in the first solution set to obtain the plurality of solution sets.
Optionally, as an embodiment, the processing unit 820 is specifically configured to:
and selecting a candidate solution with the maximum resource allocation rate or the maximum profit when the target combination optimization task is realized from the second solution set as the target solution.
Optionally, as an embodiment, the target combination optimization task includes any one of a factory scheduling task, a production scheduling task, a three-dimensional boxing task, and a plant address selection task.
Optionally, as an embodiment, in a case that the objective combination optimization task is a production scheduling task, the objective problem refers to achieving a maximum production yield according to a product demand quantity and a raw material quantity, the first set includes M planning results, one of the M planning results is used to represent a target capacity of each of a plurality of production objects, and the objective solution refers to a planning result of the M planning results that achieves the maximum production yield.
Optionally, as an embodiment, in a case that the target combination optimization task is a three-dimensional boxing task, the target problem is that a maximum space utilization rate of the box is achieved according to a size of a plurality of articles to be boxed and a size of the box, the first candidate set includes M boxing results, one of the M boxing results is used to represent that a part of the articles to be boxed is selected from the plurality of articles to be boxed and is packed into the box, and the target solution is a boxing result that achieves the maximum space utilization rate of the box from the M boxing results.
Optionally, as an embodiment, when the objective combination optimization task is an addressing task, the objective problem refers to that a maximum total service area is realized according to candidate addresses of a plurality of objects to be built and a service area corresponding to each candidate address, the first solution set includes M addressing results, one of the M addressing results is used to represent a target addressing of each object to be built in the plurality of objects to be built and a service area corresponding to the target addressing of each object to be built, and the objective solution refers to an addressing result of the maximum total service area realized in the M addressing results.
Optionally, as an embodiment, the neural network model is obtained by using the following training method:
acquiring training data, wherein the training data comprises a first sample solution set, a second sample solution set and a sample target solution, the first sample solution set comprises K candidate solutions of a sample problem, the sample problem is a problem to be solved for realizing a sample combination optimization task, the second sample solution set comprises the sample target solution, and K is an integer greater than 1;
and training the neural network model by taking the first sample solution set as input data to obtain a second sample solution set as a training target, wherein the second sample solution set is composed of L candidate solutions selected from the first sample solution set, and L is an integer less than K.
Optionally, as an embodiment, the processing unit 820 is further configured to:
the computing device groups the candidate solutions in the first sample solution set to obtain the second sample solution set;
the training the neural network model with the training data includes:
obtaining a first confidence that the second set of sample solutions includes the sample target solution using the neural network model; training the neural network model according to the first confidence.
Optionally, as an embodiment, the processing unit 820 is specifically configured to:
the computing device randomly groups the candidate solutions in the first sample solution set to obtain the second sample solution set.
Optionally, as an embodiment, the processing unit 820 is specifically configured to:
and performing weighted calculation by using the neural network model according to the confidence coefficient of each candidate solution in the second sample solution set as the sample target solution to obtain the first confidence coefficient.
Optionally, as an embodiment, the confidence of each candidate solution as the sample target solution is obtained by the neural network model by using a binary label mechanism.
Optionally, as an embodiment, the confidence of each candidate solution as the sample target solution is obtained by the neural network model by using an adaptive boosting algorithm.
The processing device 800 is embodied as a functional unit. The term "unit" herein may be implemented in software and/or hardware, and is not particularly limited thereto.
For example, a "unit" may be a software program, a hardware circuit, or a combination of both that implement the above-described functions. The hardware circuitry may include an Application Specific Integrated Circuit (ASIC), an electronic circuit, a processor (e.g., a shared processor, a dedicated processor, or a group of processors) and memory that execute one or more software or firmware programs, a combinational logic circuit, and/or other suitable components that support the described functionality.
Accordingly, the units of the respective examples described in the embodiments of the present application can be realized in electronic hardware, or a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
Fig. 17 is a schematic hardware structure diagram of a processing device for combining optimization tasks according to an embodiment of the present application.
The processing apparatus 900 shown in fig. 17 (the processing apparatus 900 may be a computer device) includes a memory 910, a processor 920, a communication interface 930, and a bus 940. The memory 910, the processor 920 and the communication interface 930 are communicatively connected to each other through the bus 940.
The memory 910 may be a Read Only Memory (ROM), a static memory device, a dynamic memory device, or a Random Access Memory (RAM). The memory 910 may store a program, and when the program stored in the memory 910 is executed by the processor 920, the processor 920 is configured to perform the steps of the processing method of the combinatorial optimization task according to the embodiment of the present application; for example, the respective steps shown in fig. 8 to 12 are performed.
It should be understood that the processing apparatus for the combination optimization task shown in the embodiment of the present application may be a computing device, and may also be a chip configured in the computing device in the cloud.
The computing device may be a device having a combined optimization task processing function, for example, a device that may include any computing function known in the art, such as a server, a computer, and the like; alternatively, the computing device may also refer to a chip having a computing function; for example, a chip disposed in a server or a chip disposed in a computer. The computing device may include a memory and a processor therein; the memory may be configured to store program code, and the processor may be configured to invoke the program code stored by the memory to implement the corresponding functionality of the computing device. The processor and the memory included in the computing device may be implemented by a chip, and are not particularly limited herein. For example, the memory may be configured to store program instructions related to a processing method of a combinatorial optimization task provided in an embodiment of the present application, and the processor may be configured to call the program instructions related to the processing method stored in the memory when solving a target problem, execute the processing method of the embodiment of the present application to select partial solutions from the first solution set to form a second solution set, and select a target solution from the second solution set.
The processor 920 may be a general-purpose Central Processing Unit (CPU), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits, and is configured to execute related programs to implement the processing method of the combined optimization task of the embodiments of the present application.
The processor 920 may also be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the processing method of the present application for the combinatorial optimization task may be implemented by hardware integrated logic circuits in the processor 920 or instructions in the form of software.
The processor 920 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), an off-the-shelf programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic device, or discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in the memory 910, and the processor 920 reads information in the memory 910, and performs, in combination with hardware of the storage medium, functions that need to be performed by units included in the processing apparatus shown in fig. 16 in the embodiment of the present application, or performs a processing method of combining optimization tasks shown in fig. 8 to 12 in the embodiment of the method of the present application.
The communication interface 930 enables communication between the processing device 900 and other devices or communication networks using transceiver devices such as, but not limited to, transceivers.
Bus 940 may include a pathway to transfer information between various components of processing device 900 (e.g., memory 910, processor 920, communication interface 930).
It should be noted that although the processing device 900 described above shows only memories, processors, and communication interfaces, in a particular implementation, those skilled in the art will appreciate that the processing device 900 may also include other components necessary to achieve normal operation. Also, those skilled in the art will appreciate that, according to particular needs, the processing device 900 may comprise hardware components for performing other additional functions. Furthermore, those skilled in the art will appreciate that the processing device 900 described above may also include only those components necessary to implement the embodiments of the present application, and need not include all of the components shown in FIG. 17.
Illustratively, the embodiment of the present application further provides a chip, which includes a transceiver unit and a processing unit. The transceiver unit can be an input/output circuit and a communication interface; the processing unit is a processor or a microprocessor or an integrated circuit integrated on the chip; the chip can execute the processing method of the combined optimization task in the method embodiment.
Illustratively, the embodiment of the present application further provides a computer-readable storage medium, on which instructions are stored, and the instructions, when executed, perform the processing method of the combinatorial optimization task in the above method embodiment.
Illustratively, the present application further provides a computer program product containing instructions, which when executed, perform the processing method of the combinatorial optimization task in the above method embodiments.
It should be understood that the processor in the embodiments of the present application may be a Central Processing Unit (CPU), and the processor may also be other general-purpose processors, Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, and the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
It will also be appreciated that the memory in the embodiments of the subject application can be either volatile memory or nonvolatile memory, or can include both volatile and nonvolatile memory. The non-volatile memory may be a read-only memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an electrically Erasable EPROM (EEPROM), or a flash memory. Volatile memory can be Random Access Memory (RAM), which acts as external cache memory. By way of example, but not limitation, many forms of Random Access Memory (RAM) are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct bus RAM (DR RAM).
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. The computer program product comprises one or more computer instructions or computer programs. The procedures or functions according to the embodiments of the present application are wholly or partially generated when the computer instructions or the computer program are loaded or executed on a computer. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored on a computer readable storage medium or transmitted from one computer readable storage medium to another computer readable storage medium, for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device such as a server, data center, etc. that contains one or more collections of available media. The usable medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium. The semiconductor medium may be a solid state disk.
It should be understood that the term "and/or" herein is merely one type of association relationship that describes an associated object, meaning that three relationships may exist, e.g., a and/or B may mean: a exists alone, A and B exist simultaneously, and B exists alone, wherein A and B can be singular or plural. In addition, the "/" in this document generally indicates that the former and latter associated objects are in an "or" relationship, but may also indicate an "and/or" relationship, which may be understood with particular reference to the former and latter text.
In the present application, "at least one" means one or more, "a plurality" means two or more. "at least one of the following" or similar expressions refer to any combination of these items, including any combination of the singular or plural items. For example, at least one (one) of a, b, or c, may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or multiple.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the above-mentioned processes do not mean the execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a read-only memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (29)

1. A processing method for a combined optimization task is characterized by comprising the following steps:
the method comprises the steps that a computing device obtains a first solution set aiming at a target problem from a memory, wherein the target problem is a problem needing to be solved for realizing a target combination optimization task, the first solution set comprises M candidate solutions aiming at the target problem, and M is an integer larger than 1;
the computing equipment groups the candidate solutions in the first solution set to obtain a plurality of solution sets;
the computing equipment inputs the solution sets into a pre-trained neural network model to obtain a plurality of target confidence coefficients, the target confidence coefficients correspond to the solution sets one by one, each target confidence coefficient in the target confidence coefficients is used for representing the confidence coefficient that the corresponding solution set comprises a target solution, and the target solution is a candidate solution which enables the planning result of the target problem to meet a preset constraint condition in the first solution set;
the computing equipment selects a solution set corresponding to the maximum target confidence coefficient in the target confidence coefficients from the solution sets according to the target confidence coefficients to obtain a second solution set;
the computing device selects the target solution from the second solution set.
2. The processing method of claim 1, wherein the computing device groups the candidate solutions in the first set of solutions to obtain a plurality of sets of solutions, comprising:
the computing device randomly groups the candidate solutions in the first solution set to obtain the plurality of solution sets.
3. The processing method of claim 1 or 2, wherein the computing device choosing the target solution from the second set of solutions comprises:
and the computing equipment selects a candidate solution with the maximum resource allocation rate or the maximum profit when the target combination optimization task is realized from the second solution set as the target solution.
4. The process of any one of claims 1 to 3, wherein the target combination optimization task comprises any one of a production scheduling task, a three-dimensional binning task, and an addressing task.
5. The process according to any one of claims 1 to 4, wherein in a case where the target combination optimization task is a production scheduling task, the target problem is that a maximum production yield is achieved according to a quantity of required products and a quantity of raw materials, the first set includes M planning results, one of the M planning results is used for representing a target capacity of each of a plurality of production objects, and the target solution is a planning result of the M planning results that achieves the maximum production yield.
6. The processing method according to any one of claims 1 to 4, wherein in a case where the target combination optimization task is a three-dimensional boxing task, the target problem is that a maximum space utilization of the box is achieved according to a size of a plurality of to-be-boxed items and a size of the box, the first candidate set includes M boxing results, one of the M boxing results is used to represent that a part of the to-be-boxed items is selected from the plurality of to-be-boxed items to be boxed in the box, and the target solution is a boxing result of the M boxing results that achieves the maximum space utilization of the box.
7. The processing method according to any one of claims 1 to 4, wherein in a case that the objective combination optimization task is an addressing task, the objective problem is that a maximum total service area is realized according to candidate addresses of a plurality of objects to be built and a service area corresponding to each candidate address, the first solution set includes M addressing results, one of the M addressing results is used for representing an objective addressing of each object to be built in the plurality of objects to be built and a service area corresponding to the objective addressing of each object to be built, and the objective solution is an addressing result of the M addressing results that the maximum total service area is realized.
8. The process of any one of claims 1 to 7, wherein the neural network model is obtained using a training method of:
acquiring training data, wherein the training data comprises a sample problem, a first sample solution set, a second sample solution set and a sample target solution, the sample problem is a problem to be solved for realizing a sample combination optimization task, the first sample solution set comprises K candidate solutions aiming at the sample problem, the second sample solution set is formed by selecting L candidate solutions from the first sample solution set, the second sample solution set comprises the sample target solution, the sample target solution is a candidate solution which enables a planning result of the sample problem to meet a preset constraint condition in the first sample solution set, K is an integer larger than 1, and L is an integer smaller than K;
and training the neural network model by taking the first sample solution set as input data to obtain a second sample solution set as a training target.
9. The processing method according to claim 8, further comprising:
grouping, by the computing device, the candidate solutions in the first sample solution set to obtain the second sample solution set;
wherein the training the neural network model with the first sample solution set as input data and the second sample solution set as the training target comprises:
obtaining, using the neural network model, a first confidence that the second sample solution set includes the sample target solution; and
training the neural network model according to the first confidence.
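Claims 8 and 9 describe training on the first confidence that the second sample solution set includes the sample target solution. The sketch below shows one way this could look in PyTorch; the feature encoding, network architecture, and binary cross-entropy loss are assumptions for illustration, not details fixed by the claims.

```python
import torch
import torch.nn as nn

class SetScorer(nn.Module):
    """Hypothetical model: maps a fixed-size encoding of one solution set to
    the confidence that the set includes the (sample) target solution."""
    def __init__(self, feature_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feature_dim, 64),
                                 nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, set_features: torch.Tensor) -> torch.Tensor:
        # set_features: (batch, feature_dim) encodings of grouped solution sets
        return torch.sigmoid(self.net(set_features)).squeeze(-1)

def training_step(model: SetScorer,
                  optimizer: torch.optim.Optimizer,
                  set_features: torch.Tensor,
                  contains_target: torch.Tensor) -> float:
    """One update: the first confidence is pushed towards the binary label
    'this second sample solution set includes the sample target solution'."""
    first_confidence = model(set_features)
    loss = nn.functional.binary_cross_entropy(first_confidence,
                                              contains_target.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```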
10. The processing method according to claim 9, wherein the grouping, by the computing device, of the candidate solutions in the first sample solution set to obtain the second sample solution set comprises:
randomly grouping, by the computing device, the candidate solutions in the first sample solution set to obtain the second sample solution set.
11. The processing method according to claim 9 or 10, wherein the obtaining, using the neural network model, the first confidence that the second sample solution set includes the sample target solution comprises:
performing, using the neural network model, a weighted calculation on the confidence that each candidate solution in the second sample solution set is the sample target solution, to obtain the first confidence.
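The weighted calculation of claim 11 could, for instance, combine the per-candidate confidences into a single set-level first confidence as in the following sketch; the normalized weighted average used here is an assumption, since the claim does not fix a particular weighting.

```python
from typing import Optional, Sequence

def first_confidence(candidate_confidences: Sequence[float],
                     weights: Optional[Sequence[float]] = None) -> float:
    """Weighted combination of the confidences that each candidate solution in
    the second sample solution set is the sample target solution."""
    if weights is None:
        weights = [1.0] * len(candidate_confidences)
    total_weight = sum(weights)
    weighted_sum = sum(w * c for w, c in zip(weights, candidate_confidences))
    return weighted_sum / total_weight

# Example with three candidates and equal weights:
# first_confidence([0.2, 0.7, 0.4])  ->  0.433...
```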
12. The processing method according to claim 11, wherein the confidence that each candidate solution is the sample target solution is obtained by the neural network model using a binary labeling mechanism.
13. The processing method according to claim 11, wherein the confidence that each candidate solution is the sample target solution is obtained by the neural network model using an adaptive boosting algorithm.
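Claim 13 names an adaptive boosting algorithm as one way to obtain the per-candidate confidence. As a hedged illustration, the sketch below uses scikit-learn's AdaBoostClassifier over hypothetical candidate feature vectors; the library, the features, and the labels are all assumptions made for the example.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

# Hypothetical training data: each candidate solution is encoded as a feature
# vector, and the binary label marks whether it is the sample target solution.
rng = np.random.default_rng(0)
X_train = rng.random((200, 8))
y_train = rng.integers(0, 2, size=200)

booster = AdaBoostClassifier(n_estimators=50)
booster.fit(X_train, y_train)

# Per-candidate confidence that each member of a second sample solution set is
# the sample target solution; these can feed the weighted calculation of claim 11.
X_set = rng.random((5, 8))
candidate_confidences = booster.predict_proba(X_set)[:, 1]
print(candidate_confidences)
```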
14. A processing apparatus for a combinatorial optimization task, comprising:
an acquisition unit, configured to acquire, from a memory, a first solution set for a target problem, wherein the target problem is a problem to be solved to accomplish a target combinatorial optimization task, the first solution set comprises M candidate solutions for the target problem, and M is an integer greater than 1; and
a processing unit, configured to: group the candidate solutions in the first solution set to obtain a plurality of solution sets; input the plurality of solution sets into a pre-trained neural network model to obtain a plurality of target confidences, wherein the plurality of target confidences correspond one-to-one to the plurality of solution sets, each of the plurality of target confidences represents a confidence that the corresponding solution set includes a target solution, and the target solution is the candidate solution in the first solution set whose planning result for the target problem satisfies a preset constraint condition; select, according to the plurality of target confidences, the solution set corresponding to the maximum target confidence among the plurality of target confidences, to obtain a second solution set; and select the target solution from the second solution set.
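Read end to end, the acquisition unit and processing unit of claim 14 implement a simple group-score-select flow. The sketch below is a plain-Python stand-in for that flow, with a scoring callable in place of the pre-trained neural network model; all helper names and the grouping scheme are assumptions for illustration.

```python
import random
from typing import Callable, List, Sequence, TypeVar

Solution = TypeVar("Solution")

def process(first_solution_set: List[Solution],
            set_confidence: Callable[[Sequence[Solution]], float],
            objective: Callable[[Solution], float],
            num_groups: int = 4) -> Solution:
    """Claim 14 style flow (illustrative only):
    1. group the M candidate solutions into several solution sets,
    2. score each set with the pre-trained model (here `set_confidence`),
    3. keep the set with the maximum target confidence (the second solution set),
    4. select the target solution from that set."""
    candidates = list(first_solution_set)
    random.shuffle(candidates)                    # random grouping, cf. claim 15
    groups = [candidates[i::num_groups] for i in range(num_groups)]
    groups = [group for group in groups if group]  # drop empty groups when M < num_groups
    second_solution_set = max(groups, key=set_confidence)
    # e.g. the candidate with the maximum profit or resource allocation rate, cf. claim 16
    return max(second_solution_set, key=objective)
```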
15. The processing apparatus according to claim 14, wherein the processing unit is specifically configured to:
randomly group the candidate solutions in the first solution set to obtain the plurality of solution sets.
16. The processing apparatus according to claim 14 or 15, wherein the processing unit is specifically configured to:
select, from the second solution set, the candidate solution that achieves the maximum resource allocation rate or the maximum profit when the target combinatorial optimization task is accomplished, as the target solution.
17. The processing apparatus according to any one of claims 14 to 16, wherein the target combinatorial optimization task comprises any one of a production scheduling task, a three-dimensional bin-packing task, and a site-selection task.
18. The processing apparatus according to any one of claims 14 to 17, wherein, in a case where the target combinatorial optimization task is a production scheduling task, the target problem is to achieve a maximum production yield given the quantity of required products and the quantity of raw materials, the first solution set comprises M planning results, each of the M planning results represents a target capacity for each of a plurality of production objects, and the target solution is the planning result, among the M planning results, that achieves the maximum production yield.
19. The processing apparatus according to any one of claims 14 to 17, wherein, in a case where the target combinatorial optimization task is a three-dimensional bin-packing task, the target problem is to achieve a maximum space utilization of a box given the sizes of a plurality of items to be packed and the size of the box, the first solution set comprises M packing results, each of the M packing results represents a selection of some of the items to be packed that are loaded into the box, and the target solution is the packing result, among the M packing results, that achieves the maximum space utilization of the box.
20. The processing apparatus according to any one of claims 14 to 17, wherein, in a case where the target combinatorial optimization task is a site-selection task, the target problem is to achieve a maximum total service area given the candidate sites of a plurality of facilities to be built and the service area corresponding to each candidate site, the first solution set comprises M site-selection results, each of the M site-selection results represents a target site for each of the plurality of facilities to be built and the service area corresponding to that target site, and the target solution is the site-selection result, among the M site-selection results, that achieves the maximum total service area.
21. The processing apparatus according to any one of claims 14 to 20, wherein the neural network model is obtained by a training method comprising:
acquiring training data, wherein the training data comprises a sample problem, a first sample solution set, a second sample solution set, and a sample target solution; the sample problem is a problem to be solved to accomplish a sample combinatorial optimization task; the first sample solution set comprises K candidate solutions for the sample problem; the second sample solution set consists of L candidate solutions selected from the first sample solution set and comprises the sample target solution; the sample target solution is the candidate solution in the first sample solution set whose planning result for the sample problem satisfies a preset constraint condition; K is an integer greater than 1, and L is an integer less than K; and
training the neural network model with the first sample solution set as input data and the second sample solution set as the training target.
22. The processing apparatus according to claim 21, wherein the processing unit is further configured to:
group the candidate solutions in the first sample solution set to obtain the second sample solution set;
wherein the training the neural network model with the first sample solution set as input data and the second sample solution set as the training target comprises:
obtaining, using the neural network model, a first confidence that the second sample solution set includes the sample target solution; and
training the neural network model according to the first confidence.
23. The processing apparatus according to claim 22, wherein the processing unit is specifically configured to:
randomly group the candidate solutions in the first sample solution set to obtain the second sample solution set.
24. The processing apparatus according to claim 22 or 23, wherein the processing unit is specifically configured to:
perform, using the neural network model, a weighted calculation on the confidence that each candidate solution in the second sample solution set is the sample target solution, to obtain the first confidence.
25. The processing apparatus according to claim 24, wherein the confidence that each candidate solution is the sample target solution is obtained by the neural network model using a binary labeling mechanism.
26. The processing apparatus according to claim 24, wherein the confidence that each candidate solution is the sample target solution is obtained by the neural network model using an adaptive boosting algorithm.
27. A processing apparatus for a combinatorial optimization task, comprising a processor and a memory, wherein the memory is configured to store program instructions, and the processor is configured to invoke the program instructions to perform the processing method according to any one of claims 1 to 13.
28. A computer-readable storage medium, wherein the computer-readable storage medium stores program instructions that, when executed by a processor, implement the processing method according to any one of claims 1 to 13.
29. A chip, comprising a processor and a data interface, wherein the processor is configured to read, through the data interface, instructions stored in a memory, to cause the chip to perform the processing method according to any one of claims 1 to 13.
CN202010612243.6A 2020-06-30 2020-06-30 Processing method and processing device for combined optimization task Pending CN111915060A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010612243.6A CN111915060A (en) 2020-06-30 2020-06-30 Processing method and processing device for combined optimization task

Publications (1)

Publication Number Publication Date
CN111915060A true CN111915060A (en) 2020-11-10

Family

ID=73226938

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010612243.6A Pending CN111915060A (en) 2020-06-30 2020-06-30 Processing method and processing device for combined optimization task

Country Status (1)

Country Link
CN (1) CN111915060A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100325072A1 (en) * 2009-06-17 2010-12-23 Board Of Regents, The University Of Texas System System and method for solving multiobjective optimization problems
US20140324747A1 (en) * 2013-04-30 2014-10-30 Raytheon Company Artificial continuously recombinant neural fiber network
WO2017218447A1 (en) * 2016-06-13 2017-12-21 Siemens Industry, Inc. System and method for train route optimization including machine learning system
CN108549656A (en) * 2018-03-09 2018-09-18 北京百度网讯科技有限公司 Sentence analytic method, device, computer equipment and readable medium
US20200118041A1 (en) * 2018-10-11 2020-04-16 International Business Machines Corporation Intelligent learning for explaining anomalies
US20200167659A1 (en) * 2018-11-27 2020-05-28 Electronics And Telecommunications Research Institute Device and method for training neural network
CN110232584A (en) * 2019-04-11 2019-09-13 深圳市城市交通规划设计研究中心有限公司 Parking lot site selecting method, device, computer readable storage medium and terminal device

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112733910A (en) * 2020-12-30 2021-04-30 华为技术有限公司 Method for obtaining placement position, method for training model and related equipment
WO2022142654A1 (en) * 2020-12-30 2022-07-07 华为技术有限公司 Placement position acquisition method, model training method, and related devices
CN113190339A (en) * 2021-05-26 2021-07-30 中国工商银行股份有限公司 Task processing method and device
CN113190339B (en) * 2021-05-26 2024-04-09 中国工商银行股份有限公司 Task processing method and device
CN113595904A (en) * 2021-08-05 2021-11-02 东北大学秦皇岛分校 Data flow collaborative sampling method based on flow matrix
CN114338557A (en) * 2021-12-30 2022-04-12 中国电信股份有限公司 Elastic large-bandwidth splitting method and device, electronic equipment and storage medium
WO2023155820A1 (en) * 2022-02-21 2023-08-24 阿里巴巴(中国)有限公司 Method and system for processing computing task
WO2023185890A1 (en) * 2022-03-31 2023-10-05 华为技术有限公司 Data processing method and related apparatus
CN114595641A (en) * 2022-05-09 2022-06-07 支付宝(杭州)信息技术有限公司 Method and system for solving combined optimization problem
WO2023231350A1 (en) * 2022-06-02 2023-12-07 北京百度网讯科技有限公司 Task processing method implemented by using integer programming solver, device, and medium
WO2024001610A1 (en) * 2022-07-01 2024-01-04 华为云计算技术有限公司 Method for solving goal programming problem, node selection method, and apparatus

Similar Documents

Publication Publication Date Title
CN111915060A (en) Processing method and processing device for combined optimization task
CN110175671B (en) Neural network construction method, image processing method and device
WO2021043193A1 (en) Neural network structure search method and image processing method and device
US20220319154A1 (en) Neural network model update method, image processing method, and apparatus
US20230082597A1 (en) Neural Network Construction Method and System
CN109522942A (en) A kind of image classification method, device, terminal device and storage medium
US11107250B2 (en) Computer architecture for artificial image generation using auto-encoder
CN114997412A (en) Recommendation method, training method and device
CN113570029A (en) Method for obtaining neural network model, image processing method and device
CN111340190A (en) Method and device for constructing network structure, and image generation method and device
WO2023093724A1 (en) Neural network model processing method and device
CN111612215A (en) Method for training time sequence prediction model, time sequence prediction method and device
CN114492723A (en) Neural network model training method, image processing method and device
CN111428854A (en) Structure searching method and structure searching device
CN115329683B (en) Aviation luggage online loading planning method, device, equipment and medium
CN111738074B (en) Pedestrian attribute identification method, system and device based on weak supervision learning
CN108229536A (en) Optimization method, device and the terminal device of classification prediction model
CN114004383A (en) Training method of time series prediction model, time series prediction method and device
CN113536970A (en) Training method of video classification model and related device
US11195053B2 (en) Computer architecture for artificial image generation
CN117009650A (en) Recommendation method and device
EP4030347A1 (en) Neural network building method and device, and image processing method and device
Dridi et al. An artificial intelligence approach for time series next generation applications
CN116050469A (en) AI model processing method, AI model operation method and AI model operation device
CN116049536A (en) Recommendation method and related device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination