CN116842060B - Reasoning query optimization method and device based on agent model rearrangement technology - Google Patents


Info

Publication number
CN116842060B
CN116842060B (application CN202311107125.XA)
Authority
CN
China
Prior art keywords
execution
model
accuracy
target
execution sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311107125.XA
Other languages
Chinese (zh)
Other versions
CN116842060A (en)
Inventor
杨智慧
王晓阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202311107125.XA
Publication of CN116842060A
Application granted
Publication of CN116842060B


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2455 - Query execution
    • G06F16/24564 - Applying rules; Deductive queries
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2453 - Query optimisation
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2457 - Query processing with adaptation to user needs
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/48 - Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 - Task transfer initiation or dispatching
    • G06F9/4843 - Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 - Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Abstract

This specification discloses an inference query optimization method and device based on a proxy model rearrangement technique. The method comprises the following steps: acquiring task models and determining the different execution sequences in which an inference query task can be executed through them; querying, according to a target accuracy, the target value range of the execution cost corresponding to each execution sequence; filtering the execution sequences according to the target value ranges to obtain candidate execution sequences; determining, based on the target accuracy, the combinations of accuracy parameters of the different proxy models corresponding to each task model under each candidate execution sequence; and determining a target execution sequence and a target parameter combination according to the execution cost corresponding to each candidate parameter combination under each candidate execution sequence. When a query request is received, the proxy model corresponding to each task model is invoked according to the target parameter combination, and the inference query task corresponding to the query request is executed through the task models in the target execution sequence.

Description

Reasoning query optimization method and device based on agent model rearrangement technology
Technical Field
This specification relates to the field of computer technology, and in particular to an inference query optimization method and apparatus based on a proxy model rearrangement technique.
Background
With the advent of the big-data era and the great success of artificial intelligence, neural network models have been widely applied; in data processing, users can analyze and mine knowledge from massive data using machine learning and deep learning techniques. Typically, the data analysis is performed in the form of queries, and the neural network model is encapsulated in user-defined functions (UDFs) to form inference operators. Queries containing one or more inference operators are referred to as "inference queries".
However, as data grows in size, the computation and time consumed in processing such inference query tasks also increase. For example, a traffic management authority may analyze the data collected by national road monitoring cameras with a vehicle-identification neural network model and monitor traffic flow from the results; processing a vehicle-identification filtering query over a single day of video data can take several weeks, which severely reduces the efficiency of data processing and analysis and can even bear on national and social development.
Therefore, how to reduce the time consumed by inference query tasks and improve the efficiency of data processing and analysis is an urgent problem to be solved.
Disclosure of Invention
This specification provides an inference query optimization method and apparatus based on a proxy model rearrangement technique, so as to partially solve the above problems in the prior art.
The technical solutions adopted in this specification are as follows:
This specification provides an inference query optimization method based on a proxy model rearrangement technique, the method comprising the following steps:
acquiring task models, and determining the different execution sequences in which the inference query task can be executed through the task models;
querying, according to a target accuracy input by a user, the target value range of the execution cost corresponding to each execution sequence, and filtering the execution sequences according to the target value ranges to obtain candidate execution sequences;
determining, for each candidate execution sequence and based on the target accuracy, the combinations of accuracy parameters of the different proxy models corresponding to each task model under that candidate execution sequence as the parameter combinations, where a proxy model is used to filter the data input to its task model according to the filtering condition corresponding to that task model, and the accuracy parameter characterizes the proportion of data satisfying the filtering condition that is left unfiltered when a proxy model with that parameter filters the data input to the task model;
determining the execution cost corresponding to each candidate parameter combination under each candidate execution sequence, and determining a target execution sequence and a target parameter combination under the target execution sequence according to those execution costs;
when a query request is received, invoking the proxy model corresponding to each task model according to the target parameter combination, and executing the inference query task corresponding to the query request through the task models in the target execution sequence.
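The execution step above amounts to running the data through a chain of (proxy, task-model) stages in the target execution sequence. A minimal sketch follows, assuming each proxy model and each task-model filter can be represented as a Boolean predicate over a row (a hypothetical interface; the specification does not fix one):

```python
def run_inference_query(rows, plan):
    """Execute an inference query as a chain of (proxy, task-model) stages.

    plan: (proxy_pred, model_pred) pairs already arranged in the target
    execution sequence; each predicate returns True to keep a row. The
    Boolean-predicate interface is a simplifying assumption for illustration.
    """
    for proxy_pred, model_pred in plan:
        rows = [r for r in rows if proxy_pred(r)]  # cheap proxy pre-filter
        rows = [r for r in rows if model_pred(r)]  # full task model + its filter
    return rows
```

Because the proxy discards most non-matching rows cheaply, the expensive model predicate in each stage runs on far fewer rows than it otherwise would.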
Optionally, determining the execution cost corresponding to each candidate parameter combination in each candidate execution sequence specifically includes:
for each accuracy parameter in the candidate parameter combination, obtaining the mapping between accuracy and filtering rate corresponding to the proxy model with that accuracy parameter, where the mappings of proxy models with different accuracy parameters differ;
determining the filtering rate corresponding to the accuracy parameter according to its corresponding accuracy and the mapping of the proxy model with that accuracy parameter;
determining the execution cost corresponding to the accuracy parameter according to its corresponding accuracy and the filtering rate;
and determining the execution cost corresponding to each candidate parameter combination according to the execution costs corresponding to the accuracy parameters in that candidate parameter combination.
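The per-combination cost computation can be illustrated with a simple per-stage model. The formula below is an illustrative assumption, not a cost function given by the specification: each stage charges a proxy cost for every incoming row and a model cost only for the rows the proxy lets through, with the filter's selectivity shrinking the input to the next stage.

```python
def plan_cost(n, stages):
    """Execution cost of one candidate plan over n input rows.

    stages: (proxy_cost, filter_rate, model_cost, selectivity) per stage, in
    execution order. This per-row cost model is an assumption for
    illustration only.
    """
    total = 0.0
    for proxy_cost, filter_rate, model_cost, selectivity in stages:
        total += n * proxy_cost      # every incoming row passes through the proxy
        n *= (1.0 - filter_rate)     # the proxy filters out filter_rate of them
        total += n * model_cost      # the survivors are processed by the task model
        n *= selectivity             # the model's filter keeps this fraction
    return total
```

A higher accuracy parameter lowers a proxy's filtering rate (per the mapping described later in the specification), so under this model it raises the downstream model cost of that stage.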
Optionally, before acquiring each task model, the method further includes:
determining an initial value range of the execution cost corresponding to each execution sequence;
constructing and training proxy models with different accuracy parameters for each task model, and determining, according to the target accuracy, the filtering-rate range of each proxy model under the different execution sequences and the selectivity range of each task model;
updating the initial value range according to the target accuracy, the filtering-rate range and the selectivity range, to obtain the target value range of the execution cost corresponding to each execution sequence.
optionally, before acquiring each task model, the method further includes:
for each task model, dividing the task model into a first sub-model and a second sub-model, wherein the first sub-model is used to generate the labeled sample and the second sub-model is used to train the classification model and compute the mapping between accuracy and filtering rate;
determining the different execution sequences of the sub-models corresponding to each proxy model as the sub-model execution sequences;
determining, for each sub-model execution sequence, the target value range of its corresponding execution cost;
filtering the sub-model execution sequences according to those target value ranges to obtain candidate sub-model execution sequences;
determining a target sub-model execution sequence and a target parameter combination according to the execution cost corresponding to each parameter combination under each sub-model execution sequence and the time cost corresponding to each sub-model execution sequence;
and invoking each proxy model corresponding to the target parameter combination, and executing the tasks corresponding to all sub-models according to the target sub-model execution sequence.
Optionally, according to the target accuracy input by the user, querying a target value range of an execution cost corresponding to each execution sequence, and filtering each execution sequence according to the target value range to obtain each candidate execution sequence, which specifically includes:
for each execution sequence: if the highest value of its corresponding execution cost is lower than the lowest values of the execution costs corresponding to the other execution sequences, taking that execution sequence as a candidate execution sequence.
Optionally, according to the target accuracy input by the user, querying a target value range of an execution cost corresponding to each execution sequence, and filtering each execution sequence according to the target value range to obtain each candidate execution sequence, which specifically includes:
for each execution sequence: if the lowest value of its corresponding execution cost is smaller than the lowest values of the execution costs corresponding to the other execution sequences, and the highest value of its corresponding execution cost is larger than those lowest values but smaller than the highest values of the execution costs corresponding to the other execution sequences, taking that execution sequence as a candidate execution sequence.
Optionally, according to the target accuracy input by the user, querying a target value range of an execution cost corresponding to each execution sequence, and filtering each execution sequence according to the target value range to obtain each candidate execution sequence, which specifically includes:
for each execution sequence: if the lowest value of its corresponding execution cost is smaller than the lowest values of the execution costs corresponding to the other execution sequences, and the highest value of its corresponding execution cost is larger than the highest values of the execution costs corresponding to the other execution sequences, taking that execution sequence as a candidate execution sequence.
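The screening rules above all compare value ranges of execution costs. A minimal sketch of the core pruning idea, discarding any execution sequence whose lowest possible cost still exceeds the highest possible cost of some other sequence, might look like this (names are illustrative):

```python
def screen_execution_sequences(cost_ranges):
    """Keep an execution sequence unless some other sequence is always cheaper.

    cost_ranges: mapping from execution sequence to (lowest, highest) bounds
    on its execution cost. A sequence is discarded when another sequence's
    highest cost lies below its lowest cost (interval dominance).
    """
    kept = []
    for seq, (low, high) in cost_ranges.items():
        dominated = any(other_high < low
                        for other, (_, other_high) in cost_ranges.items()
                        if other != seq)
        if not dominated:
            kept.append(seq)
    return kept
```

Only the surviving sequences need a full accuracy-parameter assignment, which is what shrinks the later search.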
The present specification provides an inference query optimization apparatus based on a proxy model rearrangement technique, including:
the acquisition module is used to acquire the task models and determine the different execution sequences in which the inference query task can be executed through the task models;
the filtering module is used to query, according to a target accuracy input by a user, the target value range of the execution cost corresponding to each execution sequence, and to filter the execution sequences according to the target value ranges to obtain candidate execution sequences;
the first determining module is used to determine, for each candidate execution sequence and based on the target accuracy, the combinations of accuracy parameters of the different proxy models corresponding to each task model under that candidate execution sequence as the parameter combinations, where a proxy model filters the data input to its task model according to the filtering condition corresponding to that task model, and the accuracy parameter characterizes the proportion of data satisfying the filtering condition that is left unfiltered when a proxy model with that parameter filters the data input to the task model;
the second determining module is used to determine the execution cost corresponding to each candidate parameter combination under each candidate execution sequence, and to determine a target execution sequence and a target parameter combination under the target execution sequence according to those execution costs;
and the execution module is used to invoke, when a query request is received, the proxy model corresponding to each task model according to the target parameter combination, and to execute the inference query task corresponding to the query request through the task models in the target execution sequence.
This specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above inference query optimization method.
This specification provides an electronic device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the above inference query optimization method when executing the program.
At least one of the technical solutions adopted in this specification can achieve the following beneficial effects:
In the proxy-model-based data processing method provided by this specification, the task models are acquired and the different execution sequences for executing the inference query task through them are determined; the target value range of the execution cost corresponding to each execution sequence is queried according to the target accuracy; the execution sequences are filtered by these ranges to obtain candidate execution sequences; the combinations of accuracy parameters of the different proxy models corresponding to each task model under each candidate execution sequence are determined based on the target accuracy; and the target execution sequence and the target parameter combination are determined according to the execution cost corresponding to each candidate parameter combination under each candidate execution sequence. When a query request is received, the proxy model corresponding to each task model is invoked according to the target parameter combination, and the inference query task corresponding to the query request is executed through the task models in the target execution sequence.
With this method, during data processing, the proxy models corresponding to the most reasonable target accuracy-parameter combination can be selected, among the accuracy-parameter combinations under the different candidate execution sequences of the task models, to filter the data input to each task model, and processing proceeds in the target execution sequence; this reduces the computation of each task model and shortens the data processing time. In addition, when determining the accuracy-parameter combination, the execution sequences meeting the conditions are first screened by the value range of the execution cost corresponding to each execution sequence, which reduces the computation required to determine the target accuracy-parameter combination and further improves data processing efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of this specification, illustrate its exemplary embodiments and, together with the description, serve to explain this specification; they are not intended to limit it unduly. In the drawings:
FIG. 1 is a schematic diagram of a prior art reasoning query process provided in this specification;
FIG. 2 is a schematic flow chart of a method for optimizing inference query based on a proxy model rearrangement technique provided in the present specification;
FIG. 3 is a schematic diagram of a filtered inference query process of a proxy model provided in the present specification;
FIG. 4 is a schematic diagram of a process for determining a correspondence between accuracy and filtering rate of a proxy model provided in the present specification;
FIG. 5 is a schematic diagram of a search tree provided in the present specification;
FIG. 6 is a schematic diagram of a pruning process in an execution sequence provided in the present specification;
FIG. 7 is a schematic diagram of a fine-grained search tree structure provided in the present specification;
FIG. 8 is a schematic diagram of an accuracy distribution process corresponding to a fine-grained search tree provided in the present disclosure;
FIG. 9 is a schematic diagram of an optimal execution sequence of rearranged task models and agent models provided in the present specification;
FIG. 10 is an exemplary diagram of a rearrangement technology system for reasoning-oriented queries provided in the present specification;
FIG. 11 is a schematic diagram of an inference query optimization apparatus based on a proxy model rearrangement technique provided in the present specification;
fig. 12 is a schematic view of an electronic device corresponding to fig. 2 provided in the present specification.
Detailed Description
To make the objects, technical solutions and advantages of this specification clearer, the technical solutions of this specification are described clearly and completely below with reference to specific embodiments and the corresponding drawings. Obviously, the described embodiments are only some rather than all of the embodiments of this specification. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this specification without creative effort shall fall within the scope of protection of this specification.
Fig. 1 is a schematic diagram of a prior art reasoning query process provided in this specification.
The final purpose of this query task is to obtain the emotional attitude of people in region S toward wearing masks. The numbers on the arrows in Fig. 1 represent the amount of data passed to the next inference operator, and the time under each inference operator is the time it takes to process one piece of data. m1, m2 and m3 denote the three inference operators, namely the "topic model", "place marker" and "emotion analysis"; σ1, σ2 and σ3 denote the filters corresponding to these three inference operators, "subject = mask", "place = S" and "emotion = positive", respectively.
As can be seen from the figure, when the input data volume is large and multiple inference operators are involved, the computation of each inference operator is considerable, the time consumed by the whole query process grows, and the overall execution efficiency of the query task is severely affected.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 2 is a schematic flow chart of an inference query optimization method based on a proxy model rearrangement technology provided in the present specification, including the following steps:
s201: and acquiring each task model, and determining different execution sequences when the reasoning inquiry task is executed through each task model.
During the execution of an inference query task, each inference operator can correspond to one sub-task for screening data, and the final query result is obtained after the task has been executed through several inference operators. However, the classical reordering technique considers only the execution cost of each inference operator and the selectivity of its filter, so it merely searches the order space for an optimal execution plan, which makes it difficult to handle the optimization of complex proxy-model-based inference queries.
Based on this, the present specification provides a data processing method for proxy models: data is filtered by adding a proxy model before each task model, and the proxy models corresponding to the target parameter combination under the execution sequence with the minimum execution cost are selected to execute the inference query task, thereby shortening the time consumed by the query.
In this specification, the execution body implementing the data processing method for proxy models may be a designated device such as a server. For convenience of description, the method is described below with only a server as the execution body.
A formal description of an inference query is as follows: an inference query Q comprises inference operators m1, ..., mn, their respective filters σ1, ..., σn, and a target query accuracy A. The target accuracy can be input by a user; each inference operator corresponds to a task model used to execute the corresponding inference sub-task, and the final query result is obtained after the original data has been processed through the several task models (inference operators).
In this specification, the data used for data processing may be video data, image data, audio data or text data; this is not specifically limited. The inference query task may be to screen out the data meeting given requirements, so that data analysis can be performed on the query result.
The server may first obtain each task model and determine a different execution order when performing the inferential query task through each task model.
Taking the query task shown in fig. 1 as an example, the query result is the emotional attitude of people in region S toward masks. The corresponding query data may be video data acquired by an image acquisition device.
The query task can include three task models (inference operators), each used to execute a different inference sub-task: the topic identification model m1, the place marking model m2 and the emotion analysis model m3. Each task model corresponds to a filter, namely σ1, σ2 and σ3, whose filtering conditions are "topic = mask", "place = S" and "emotion = positive" respectively; the three task models and their corresponding filters are used to screen data whose topic is "mask", whose place is "S", and whose expressed emotion is "positive", respectively.
The server may denote the ordered relation between a task model m_i and its corresponding filter σ_i as a unit u_i = (m_i, σ_i), where m_i is the task model corresponding to the i-th inference operator, σ_i is its filter, and p_i is its proxy model. Sorting the different task models together with their filters yields the different execution sequences in which the inference query task can be executed through the task models. Taking the topic identification model m1, the place marking model m2 and the emotion analysis model m3 as an example, the three task models and their corresponding filter combinations can be arranged in six different orders, corresponding to six execution sequences.
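The six execution sequences can be enumerated directly with a permutation generator; the model and filter names below are taken from the Fig. 1 example:

```python
from itertools import permutations

# One unit per inference operator, pairing the task model with its filter.
units = [("topic_model", "subject=mask"),
         ("place_marker", "place=S"),
         ("emotion_analysis", "emotion=positive")]

# Every possible execution sequence of the three units (3! = 6 orders).
orders = list(permutations(units))
```

For n task models the order space grows as n!, which is why the later cost-range screening matters.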
S202: inquiring a target value range of the execution cost corresponding to each execution sequence according to the target accuracy input by a user, and filtering each execution sequence according to the target value range to obtain each candidate execution sequence.
S203: for each candidate execution sequence, determining the combination of accuracy parameters of different proxy models corresponding to each task model under the candidate execution sequence based on the target accuracy, wherein the proxy model is used for filtering data in an input task model according to filtering conditions corresponding to the task model as each parameter combination, and the accuracy parameters are used for representing the proportion of unfiltered data to data which do not meet the filtering conditions when the proxy model of the accuracy parameters filters the data in the input task model.
In the process of filtering the data input to each task model through its associated proxy model, this specification seeks the optimal execution plan so as to minimize the plan's execution time. It should be pointed out that an optimal accuracy-parameter assignment found for one specific order is not necessarily applicable to execution plans in other orders; and if only the order space is considered, i.e. an optimal order is computed from the execution costs of the inference operators and the selectivities of their filters and an accuracy allocation algorithm is then invoked, the resulting execution plan is not necessarily optimal. Therefore, the optimal execution plan must be sought jointly in the order space and the accuracy space.
Based on this, the following optimization problem is proposed: find the optimal order o* in the order space while simultaneously finding the optimal accuracy-parameter combination a* for the proxy models in the accuracy space, so that the overall time to execute the inference query task (the execution time of the query plan) is minimized. Formally:
(o*, a*) = argmin over o in O and a in A of C(o, a)
where O denotes the order space, which contains the several execution sequences; A denotes the accuracy space, which represents the parameter combinations of the accuracy parameters of the different proxy models; a_i is the i-th accuracy-parameter combination; o_j is the j-th execution sequence; and C is the cost function, representing the execution cost when the proxy models of a given parameter combination and the task models execute the inference query task under a specific execution sequence. The execution cost can be calculated by the cost function: the longer the time consumed in executing the inference query task, the larger the execution cost, and vice versa.
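A brute-force version of this joint search over the order space and the accuracy space can be sketched as follows. It is illustrative only, with made-up names and a discretized accuracy grid, since the point of the specification is precisely to prune this search rather than enumerate it exhaustively:

```python
from itertools import permutations, product

def best_plan(units, accuracy_grid, cost):
    """Pick the cheapest (order, accuracy-combination) pair.

    cost(order, accuracies) stands in for the cost function C; units and
    accuracy_grid are illustrative stand-ins for the order space O and the
    accuracy space A.
    """
    best = None
    for order in permutations(units):                           # order space O
        for accs in product(accuracy_grid, repeat=len(units)):  # accuracy space A
            c = cost(order, accs)
            if best is None or c < best[0]:
                best = (c, order, accs)
    return best
```

With n units and k candidate accuracies per proxy, this enumerates n! * k^n plans, which motivates the cost-range pruning introduced by the method.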
For any task model, the server can set a proxy model before it, so as to pre-filter the data input to the task model according to the filtering condition of the task model's filter, thereby reducing the task model's computation.
Before setting the above proxy model, it must be constructed in advance. In this specification, a proxy model p_i of a task model m_i can be represented as a five-tuple p_i = (I_i, σ_i, T_i, M_i, ω_i), where I_i denotes the input relation of p_i; σ_i is the filtering condition of the task-model filter that p_i is meant to lift; T_i is the labeled sample corresponding to p_i; M_i is the classification model of p_i; and ω_i is the mapping between the accuracy parameter and the filtering rate of p_i. For ease of understanding, this specification provides a schematic diagram of the inference query process after proxy-model filtering, as shown in fig. 3.
Fig. 3 is a schematic diagram of a filtered inference query process of the proxy model provided in the present specification.
The input relation I1 of proxy model p1 is the original input data, and its corresponding filtering condition σ1 is "subject = mask"; the input relation I2 of proxy model p2 is the output of the preceding unit, and its corresponding filtering condition σ2 is "place = S".
In training a proxy model, the server may first generate its labeled sample.
Specifically, the server may apply the filtering-condition relations that precede p_i in the plan to a sample of the original input data (the sample may contain several pieces of data), and then use the filter condition σ_i that the current proxy model p_i is meant to lift to mark this sample: data satisfying σ_i is marked as positive (+1), and otherwise as negative (-1), generating the labeled sample T_i. Taking fig. 3 as an example, the process of generating the labeled sample T2 of p2 is as follows: after applying the unit (m1, σ1), the filtering condition σ2 "place = S" is used to mark the sample; a piece of data is marked as positive if the place it describes is S, and as negative otherwise.
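The labeled-sample generation step can be sketched as follows, assuming the preceding units and the filter condition to be lifted are given as Boolean predicates over a row (a hypothetical interface chosen for illustration):

```python
def make_labeled_sample(sample, upstream_filters, lifted_filter):
    """Generate the labeled sample for one proxy model.

    upstream_filters: conditions of the units executed before this proxy;
    lifted_filter: the filter condition the proxy is meant to lift.
    """
    # Apply the preceding filter conditions to the raw sample first.
    survivors = [row for row in sample if all(f(row) for f in upstream_filters)]
    # Then mark each survivor: +1 if it satisfies the lifted condition, else -1.
    return [(row, 1 if lifted_filter(row) else -1) for row in survivors]
```

For p2 in the Fig. 3 example, the upstream filter would be "subject = mask" and the lifted filter "place = S".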
After generating the labeled sample, the server may train the classification model M_i. Specifically, the server can apply a lightweight classification algorithm, such as a linear SVM or a shallow neural network, to the labeled sample T_i, taking as the optimization objective minimizing the deviation between the number of identified positive samples and the number of marked positive samples, so as to train and obtain the classification model M_i.
Further, for each proxy model, the server may calculate the mapping relation ω between the accuracy r corresponding to its accuracy parameter and the filtering rate d. The accuracy parameter characterizes, when the data input to the task model is filtered through the proxy model with that accuracy parameter, the proportion of the data satisfying the filtering condition that is left unfiltered.
Specifically, the classification model C infers, for a piece of data x, a probability value on the positive class, denoted P(x). For a linear support vector machine (linear SVM), P(x) = w·x + b, where w is a weight matrix and b is a bias term. If P(x) < τ for a threshold τ, the data x will be filtered out and not passed to the subsequent task model for processing. For ease of understanding, the present specification provides a schematic diagram of a process for determining the correspondence between the accuracy and the filtering rate of the proxy model, as shown in fig. 4.
Fig. 4 is a schematic diagram of a process for determining a correspondence between accuracy and filtering rate of a proxy model provided in the present specification.
Wherein, when the threshold is τ = 0.2, data x with P(x) ≥ 0.2 is transmitted through the proxy model to the subsequent inference operator for further processing; otherwise, data x is filtered out in advance. When the threshold is τ = 0.2, the 10 pieces of positive data that pass through account for 10/10 = 100% of the total positive data, i.e., the accuracy α = 100%; the 5 pieces of data that are filtered out account for 5/18 = 28% of the total 18 pieces of data, i.e., the filtering rate r = 28%. When the threshold τ is raised to 0.4, one piece of data labeled as positive is mistakenly screened out by the proxy model, so the accuracy α drops to 9/10 = 90%; at this point the proxy model filters out more data, and its filtering rate r rises to 8/18 = 44%. In this way, the mapping relation M between the accuracy parameter α and the filtering rate r is obtained. Notably, as the accuracy α rises, the filtering rate r falls.
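The threshold sweep above can be expressed directly in code. A minimal sketch, assuming scores and labels over an 18-piece sample with 10 positives that mirrors the fig. 4 example (the concrete numbers are invented to reproduce the 100%/28% and 90%/44% pairs):

```python
# Given positive-class scores P(x) and ground-truth labels, compute the
# (accuracy, filtering rate) pair induced by a threshold tau, i.e., one
# point of the mapping relation M described above.

def accuracy_and_filter_rate(scores, labels, tau):
    """alpha = retained positives / total positives; r = filtered / total."""
    total = len(scores)
    positives = sum(1 for l in labels if l == +1)
    retained_pos = sum(1 for p, l in zip(scores, labels) if p >= tau and l == +1)
    filtered = sum(1 for p in scores if p < tau)
    return retained_pos / positives, filtered / total

# 18 pieces of data, 10 of them positive, mirroring the fig. 4 example.
scores = [0.1, 0.1, 0.15, 0.1, 0.18, 0.25, 0.3, 0.35, 0.9, 0.8,
          0.7, 0.6, 0.55, 0.5, 0.45, 0.44, 0.43, 0.42]
labels = [-1, -1, -1, -1, -1, -1, -1, +1, +1, +1,
          +1, +1, +1, +1, +1, +1, +1, -1]

alpha, r = accuracy_and_filter_rate(scores, labels, 0.2)
```

Sweeping τ over a grid of values and recording each (α, r) pair yields the full mapping M; the trade-off (α falls as r rises) matches the text.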
In the present specification, different proxy models correspond to different accuracy parameters, and the mapping relations between accuracy and filtering rate corresponding to proxy models with different accuracy parameters also differ.
In this specification, the target accuracy parameter combination with the lowest execution cost under a specific execution order may be determined by the following accuracy allocation algorithm:
procedure AccuracyAllocation(query q, target accuracy A):
    extract a sample D0 from the raw data;
    for each task model m_i in the given execution order do:
        apply the filter conditions of the preceding relations on D0, obtaining the sample D_i;
    for each accuracy value α in the discrete space Λ do:
        for each task model m_i do:
            apply α on D_i;
            if an already-trained classification model approximately satisfies α on D_i, reuse it; otherwise, train a new classification model C_i;
            calculate the relation between the accuracy α and the filtering rate r;
            calculate the execution cost of the proxy model under each accuracy parameter;
    calculate the execution cost cost(E) corresponding to each parameter combination (α1, ..., αk);
    select the combination (α1, ..., αk) with the minimum cost(E) and return it.
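The enumeration at the heart of this algorithm can be sketched as follows. The cost model and the assumption that the overall accuracy is approximated by the product of per-proxy accuracies are ours, not taken from the patent; the per-model cost tables are invented:

```python
from itertools import product

# Hedged sketch of the accuracy allocation step: enumerate combinations
# over a discrete accuracy space and return the cheapest one whose
# (assumed) overall accuracy still meets the target accuracy A.

def allocate_accuracy(per_model_tables, target_A):
    """per_model_tables: for each proxy model, a dict alpha -> execution cost."""
    best = None
    spaces = [sorted(t) for t in per_model_tables]
    for combo in product(*spaces):
        overall = 1.0
        for a in combo:
            overall *= a                     # assumed composition of accuracies
        if overall < target_A:               # combination misses the target
            continue
        cost = sum(t[a] for t, a in zip(per_model_tables, combo))
        if best is None or cost < best[1]:
            best = (combo, cost)
    return best

tables = [{0.9: 4.0, 0.95: 6.0, 1.0: 9.0},
          {0.9: 3.0, 0.95: 5.0, 1.0: 8.0}]
best_combo, best_cost = allocate_accuracy(tables, 0.9)
```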
In view of the fact that a proxy model p has an input relation R and a target filter condition F, when the attributes involved are the same but their order differs, the input relation of the proxy model changes; thus, even for the same predicate, the constructed proxy models differ and have different execution costs. Unlike the classical rearrangement problem, due to the introduction of the proxy model and its accuracy parameter, when no target accuracy has been allocated, the execution plan under each order corresponds to a range of execution costs rather than a single value. The optimal accuracy allocation combination (α1, ..., αk) of the execution plan under one specific order is not necessarily applicable to the execution plans under other orders. On the other hand, if only the different orders are considered, i.e., an optimal order is computed from the execution costs of the inference operators and the selectivities of the inference operator filters alone, the execution plan obtained by then allocating the target accuracy based on that optimal order is not necessarily optimal. Therefore, it is necessary to search for the optimal execution plan over the order space Ω and the accuracy space Λ simultaneously.
A simple and intuitive search algorithm over the order space Ω and the accuracy space Λ, namely a brute-force search algorithm based on the rearrangement technique, is as follows. First, enumerate the execution plans of all different orders from the order space Ω; the query shown in fig. 1 contains 3 pairs of inference operator filters, so its order space contains 3! = 6 sequential execution plans. Then, for each order, find the optimal accuracy allocation combination, construct the corresponding proxy models, and calculate the execution cost cost(E); for the order of the query shown in fig. 1, the execution plan shown in fig. 2 is obtained after optimization with the accuracy allocation algorithm. Finally, return the proxy models and the order with the minimum execution cost. However, this brute-force search algorithm must search for the optimal accuracy allocation combination under each order in a space of exponential size, which incurs an expensive and unacceptable optimization cost overhead.
Therefore, the present method may query, according to the target accuracy, the target value range of the execution cost corresponding to each execution order, and filter the execution orders according to the target value ranges to obtain the candidate execution orders; screening the execution orders in this way further reduces the amount of calculation when computing the cost function.
The server may determine an initial value range of the execution cost corresponding to each execution order. In the process of constructing and training the proxy models with different accuracy parameters corresponding to each task model, the server may calculate, according to the target accuracy, the range of the filtering rate corresponding to each proxy model under different execution orders and the range of the selectivity of each task model, and then update the initial value range according to the target accuracy, the filtering rate range, and the selectivity range to obtain the target value range of the execution cost corresponding to each execution order.
Specifically, the number of different execution orders in the order space Ω is exponential in the number of task models; the query task shown in fig. 1 contains 3 pairs of inference operator filters, so the order space Ω contains 3! = 6 execution orders. By merging the same prefixes among the different execution orders, a search tree is constructed. For ease of understanding, the present disclosure provides a schematic diagram of a search tree, as shown in fig. 5.
Fig. 5 is a schematic diagram of a search tree provided in the present specification.
Wherein X, Y and Z respectively represent three task models; the current search node (node 1) is X, and the next node (node 2) is Z.
For the proxy models under a specific execution order, the execution cost cost(E) has an upper bound and a lower bound. Intuitively, when all proxy models filter out all data, cost(E) reaches its lower bound; when no proxy model filters out any data, i.e., all data passes through, cost(E) reaches its upper bound. For the order in fig. 6, when the first proxy model filters out all data, cost(E) reaches the lower bound; when all three proxy models pass all data, cost(E) reaches the upper bound.
Thus, for a proxy model with a given accuracy parameter, its execution cost cost(E) is bounded.

When the selectivity takes its minimum value s_min and the filtering rate takes its maximum value r_max, the cost function attains its lower bound. In practical use, A is the target accuracy input by the user, s_min denotes the minimum value of the selectivity of the filter of the task model corresponding to the proxy model (the greater the filtering rate, the smaller the selectivity), and r_max is the maximum value of the filtering rate of the proxy model.

When the selectivity takes its maximum value s_max and the filtering rate takes its minimum value r_min, the cost function attains its upper bound, where s_max denotes the maximum value of the selectivity of the filter of the task model corresponding to the proxy model, and r_min is the minimum value of the filtering rate of the proxy model.
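The bounds above can be made concrete under a simple pipeline cost model. The model below is our own simplification (each operator pays a per-tuple cost on the fraction of data that reaches it; a proxy model with filtering rate r passes a fraction 1 - r onward, and a task-model filter with selectivity s passes a fraction s), and all numbers are invented:

```python
# Hedged sketch of the execution-cost bounds: substituting (r_max, s_min)
# yields the lower bound, substituting (r_min, s_max) the upper bound.

def cost_bound(op_costs, filter_rates, selectivities):
    """Cost of one execution order given fixed r_i and s_i per stage."""
    frac, total = 1.0, 0.0
    for c, r, s in zip(op_costs, filter_rates, selectivities):
        total += frac * c        # this stage processes the surviving fraction
        frac *= (1.0 - r) * s    # data surviving to the next stage
    return total

op_costs = [2.0, 5.0, 9.0]
# Lower bound: maximal filtering rates, minimal selectivities.
lower = cost_bound(op_costs, [0.6, 0.5, 0.4], [0.2, 0.3, 0.3])
# Upper bound: minimal filtering rates, maximal selectivities.
upper = cost_bound(op_costs, [0.1, 0.1, 0.1], [0.8, 0.9, 0.9])
```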
On the one hand, the different execution plans in the order space construct a search tree; on the other hand, the objective function cost(E) has upper and lower bounds. Based on these two observations, the problem to be solved by the present invention can be formulated as a branch-and-bound algorithm. In this specification, pruning and filtering may be performed on the different parameter combinations by the pseudocode of the branch-and-bound algorithm to obtain the candidate parameter combinations. The pseudocode of the branch-and-bound algorithm may be expressed as:
procedure BranchAndBound(query q, target accuracy A):
    construct a search tree from the execution orders corresponding to the order space Ω;
    for each execution plan E in the priority queue Q do:
        initialize the lower bound of cost(E) with (A, s_min, r_max) and the upper bound with (A, s_max, r_min), obtaining the initial value range;
    while the NextNode() function returns a node t do:
        // node computation
        determine the proxy model of node t according to the lower-bound accuracy parameter, and calculate s_min and r_max;
        determine the proxy model of node t according to the upper-bound accuracy parameter, and calculate s_max and r_min;
        update the cost functions cost(E) of the execution plans corresponding to the leaf nodes under t, obtaining the target value ranges;
        sort and prune the plans in Q;
    for each remaining candidate execution plan E in Q do:
        call the accuracy allocation algorithm on E;
        sort and prune the plans in Q;
    return the execution order and parameter combination that make cost(E) minimum.
In the search process, the algorithm gradually determines the proxy models, the selectivities and the filtering rates, so that the value ranges of the execution costs of the candidate parameter combinations shrink; pruning and filtering are then performed on the parameter combinations, reducing the cost overhead of the optimization process.
The input of the algorithm is a query q and a target accuracy A, and the output is a target execution order and a target parameter combination of proxy model accuracy parameters. First, for each order, before constructing the proxy models, the lower and upper bounds of the execution cost are initialized with (A, s_min, r_max) and (A, s_max, r_min), obtaining the initial value range of the execution cost. Then, starting from the root node of the search tree, the search proceeds gradually along the tree, performing node computation at each node t. The specific process is as follows: determine the proxy model of node t according to the lower-bound accuracy parameter, and during this determination calculate s_min and r_max; determine the proxy model of node t according to the upper-bound accuracy parameter, and calculate s_max and r_min; use s_min, r_max and s_max, r_min respectively to update the cost functions cost(E) of the execution plans corresponding to the leaf nodes under node t, obtaining the target value ranges. Next, sorting and pruning are performed according to the target value ranges of the execution costs under the different execution orders. When the leaf nodes of the tree have been searched, the while loop is exited, and the remaining candidate execution plans are traversed in turn; for each candidate execution plan, the above-mentioned accuracy allocation algorithm is called. Then, sorting and pruning are performed again according to the target value ranges of the execution costs. Finally, the target execution order that minimizes the cost function and the target parameter combination of the proxy model accuracies are returned.
For ease of understanding, the present specification provides a schematic diagram of the process of pruning the execution orders, as shown in fig. 6.
Fig. 6 is a schematic diagram of a pruning process in an execution sequence provided in the present specification.
Wherein E1 and E2 are two parameter combinations whose target value ranges of execution cost are cost(E1) and cost(E2) respectively; three relations are possible between the two. When the cost ranges of E1 and E2 do not overlap, the execution plan whose range lies higher is pruned from the priority queue Q: in reference sign (a) of fig. 6, cost(E1) and cost(E2) do not overlap and cost(E2) is higher, so E2 is pruned. When the cost ranges of E1 and E2 overlap, the execution plan with the lower lower bound has the higher priority: in reference sign (b) of fig. 6, E1 has the higher priority. When the cost range of one plan contains that of the other, the execution plan with the outer boundary has the higher priority, since preferentially searching the outer-layer execution plan allows pruning more conveniently: in reference sign (c) of fig. 6, E1 is on the outer layer and has the higher priority. The server can decide the priority relations between the different execution plans in Q pairwise according to these three rules until Q is completely ordered; a pruned execution plan E is removed from Q.
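The three interval-priority rules above can be rendered as a small comparator. This is our own rendering of the rules, with the interval values invented:

```python
# Hedged sketch: given two execution plans' cost ranges, decide which
# should be searched first. Disjoint intervals let the higher one be
# pruned outright; a contained interval yields to the outer one; for a
# plain overlap, the lower lower-bound goes first.

def higher_priority(a, b):
    """a, b: (lower, upper) cost intervals; returns the one to search first."""
    (alo, ahi), (blo, bhi) = a, b
    if ahi < blo:                        # disjoint: b would be pruned
        return a
    if bhi < alo:
        return b
    if alo <= blo and bhi <= ahi:        # a contains b: outer interval first
        return a
    if blo <= alo and ahi <= bhi:
        return b
    return a if alo < blo else b         # overlap: lower lower-bound first
```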
Taking fig. 5 as an example, where the execution order XYZ corresponds to a target execution-cost range of (5, 40) and XZY corresponds to a target execution-cost range of (3, 26), the parameter combination under the XZY order can be taken as the target parameter combination.
In other words, for each execution order: if the highest value of the execution cost corresponding to the execution order is lower than the lowest values of the execution costs corresponding to the other execution orders, the execution order is taken as a candidate execution order.

If the lowest value of the execution cost corresponding to the execution order is smaller than the lowest values of the other execution orders, and the highest value of its execution cost is larger than the lowest values but smaller than the highest values of the other execution orders, the execution order is likewise taken as a candidate execution order.

And if the lowest value of the execution cost corresponding to the execution order is smaller than the lowest values of the other execution orders and the highest value of its execution cost is larger than the highest values of the other execution orders, the execution order is also taken as a candidate execution order.
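The candidate-order rules above amount to keeping every order whose cost interval is not entirely above some other order's interval; only the survivors pay for the expensive exact accuracy allocation. A minimal sketch, in which the bounds for XYZ and XZY follow the fig. 5 example and the remaining numbers are invented:

```python
# Hedged sketch of candidate-order pruning: an order whose lower bound
# exceeds the smallest upper bound over all orders can never be optimal.

def prune_orders(bounds):
    """bounds: dict order -> (lower, upper). Returns the surviving orders."""
    best_upper = min(ub for _, ub in bounds.values())
    return {o for o, (lb, _) in bounds.items() if lb <= best_upper}

def pick_target_order(bounds, exact_cost):
    """Run the exact cost only on survivors and return the cheapest order."""
    return min(prune_orders(bounds), key=lambda o: exact_cost[o])

bounds = {"XYZ": (5, 40), "XZY": (3, 26), "YXZ": (30, 55), "ZYX": (27, 60)}
exact = {"XYZ": 20, "XZY": 12, "YXZ": 31, "ZYX": 30}
best = pick_target_order(bounds, exact)
```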
Regarding the NextNode() function mentioned in the above branch-and-bound algorithm: the invention first selects the execution plan ranked first in the priority queue Q, then selects the node of that plan to be accessed next and returns it. If all the nodes of the execution plan ranked first in Q have been accessed, the node to be accessed next is taken from the execution plan ranked next in Q. If all the nodes of all the execution plans in Q have been accessed, the loop is exited. In fig. 6, the execution plan ranked first in Q has had its current node accessed, so the NextNode() function returns the next node of that plan.
When performing node computation at a node t, the server may take into account that the cost function of a proxy model rises as its accuracy parameter rises. For node t, the server can construct, according to the lower-bound accuracy parameter, the proxy model of node t used for computing the lower bound, calculate s_min and r_max in the course of constructing this proxy model, and update the lower bounds of the cost functions cost(E) of the execution plans corresponding to the leaf nodes under node t. For node 1 in fig. 5, the proxy model for computing the lower bound is constructed, s_min and r_max are calculated during construction, and the lower bounds of the cost functions of the corresponding execution plans are updated.

Further, the server may construct, according to the upper-bound accuracy parameter, the proxy model of node t used for computing the upper bound, calculate s_max and r_min in the course of constructing it, and update the upper bounds of the cost functions cost(E) of the execution plans corresponding to the leaf nodes under node t. For node 1 in fig. 5, the proxy model for computing the upper bound is constructed, s_max and r_min are calculated during construction, and the upper bounds of the cost functions of the corresponding execution plans are updated.
It should be noted that, in this solution, the execution orders and the parameter combinations under them may be filtered before the inference query task is executed, so that the candidate execution orders can be used directly while executing the inference query task; of course, the execution orders and the parameter combinations under them may also be filtered according to the target parameters input by the user during execution of the inference query task. The above-described selectivity s and filtering rate r may be fixed values calculated in the process of constructing the proxy model.
Further, in order to perform fine-grained pruning on the execution orders, for each task model, the server may divide the task model into a first sub-model and a second sub-model, where the first sub-model is used to generate the labeled sample, and the second sub-model is used to train the classification model and calculate the mapping relation between the accuracy and the filtering rate. The server may then determine the different execution orders of the sub-models corresponding to the proxy models as the sub-model execution orders. For each sub-model execution order, the server may determine the target value range of the execution cost corresponding to that sub-model execution order, so that each sub-model execution order corresponds to a target value range of execution cost, and filter the sub-model execution orders to obtain the candidate sub-model execution orders. The server may then determine a target sub-model order and a target parameter combination according to the execution cost corresponding to each parameter combination under each sub-model execution order and the time cost corresponding to each sub-model execution order, call the proxy models corresponding to the target parameter combination, and execute the tasks corresponding to the sub-models according to the target sub-model execution order.
Specifically, on the basis of the above branch-and-bound algorithm, the invention provides a fine-grained search tree data structure and an in-search accuracy allocation algorithm based on the characteristics of the problem to be solved, obtaining a branch-and-bound optimization algorithm.
For each node t in the search tree shown in fig. 5, the process of constructing the proxy model at node t includes: generating the labeled sample D according to the corresponding input relation R and filter condition F; then training the classification model C on this sample D with a lightweight classification algorithm; and calculating the mapping relation M between the proxy model accuracy parameter α and the filtering rate r. Among these steps, generating the labeled sample D and training the classification model C are relatively time consuming. In order to further reduce the time overhead of the online optimization process, the invention proposes to split each node in the search tree into two nodes: one node is used to generate the labeled sample, and the other node is used to train the classification model in the proxy model and calculate M, thus constructing a finer-grained search tree. For ease of understanding, the present specification provides a schematic diagram of a fine-grained search tree structure, as shown in fig. 7.
Fig. 7 is a schematic diagram of a fine-grained search tree structure provided in the present specification.
Reference sign (a) of fig. 7 is a partial example diagram of the original search tree. A node t is split into two nodes: one node is used to generate the labeled sample, and the other node is used to train the classification model and calculate M; the corresponding fine-grained search tree is shown in reference sign (b) of fig. 7. Compared with the original search tree, the fine-grained search tree provides more opportunities to prune the cost function boundaries of the different execution plans: the search can be pruned at a sample-generation node without executing its corresponding classifier-training node.
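The node split described above can be sketched as a tiny tree structure. The class and field names are our own (the patent does not name them); each original proxy-model node becomes a sample-generation ("D") node whose child is a classifier-training ("C") node:

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hedged sketch of the fine-grained search tree: pruning may stop at a
# cheap "D" node without paying for the expensive "C" (training) step.

@dataclass
class Node:
    kind: str                       # "D" (generate labeled sample) or "C" (train model)
    model: str                      # name of the task model this node serves
    children: List["Node"] = field(default_factory=list)

def split(model: str, child: Optional[Node] = None) -> Node:
    """Replace one coarse node by its D-node -> C-node pair."""
    c_node = Node("C", model, [child] if child else [])
    return Node("D", model, [c_node])

# A path Y -> Z in the original tree becomes D(Y) -> C(Y) -> D(Z) -> C(Z).
t = split("Y", split("Z"))
```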
The pseudocode of the branch-and-bound optimization algorithm can be expressed as:

procedure BranchAndBoundOptimization(query q, target accuracy A):
    construct a fine-grained search tree from the order space Ω;
    for each execution plan E in the priority queue Q do:
        initialize the lower and upper bounds of cost(E) with (A, s_min, r_max) and (A, s_max, r_min);
    while the NextNode() function returns a node t do:
        // node computation
        if t is a sample-generation node do:
            compute the labeled sample D, and calculate s_min and r_max;
        else (t is a classifier-training node) do:
            call the accuracy allocation algorithm on the prefix execution plan from the root to t, and calculate the optimal accuracy combination, s_max and r_min;
        update the cost functions cost(E) of the execution plans corresponding to the leaf nodes under t, obtaining the upper and lower bounds;
        sort and prune the plans in Q;
    return the execution order and parameter combination that make cost(E) minimum.
First, the server constructs a fine-grained search tree from the execution orders corresponding to the order space Ω, enumerating all orders, and for each node on the fine-grained search tree initializes the lower and upper bounds with (A, s_min, r_max) and (A, s_max, r_min), determining the initial value range of the execution cost. Then, starting from the root node of the fine-grained search tree, the search proceeds gradually along the tree, performing node computation at each node t. Specifically: if t is a sample-generation node, the labeled sample D is computed and s_min and r_max are calculated; if t is a classifier-training node, the accuracy allocation algorithm is called for the execution plan corresponding to the path from the root node to the current node, the optimal proxy models are constructed, and the optimal accuracy combination, s_max and r_min are calculated.

Then, s_min, r_max or s_max, r_min are used to update the cost functions cost(E) of the execution plans corresponding to the leaf nodes under node t, obtaining the target value ranges of the execution cost; the sorting-and-pruning function then sorts and prunes the candidate execution plans in Q according to the upper and lower boundaries of their cost functions. Finally, the while loop ends, and the optimal order and proxy model accuracy combination are returned. Given that the sorting-and-pruning function in the branch-and-bound optimization algorithm is consistent with the sorting-and-pruning function in the above-mentioned branch-and-bound algorithm, the description is not repeated.
Regarding the next node pointed out in the branch-and-bound optimization algorithm: given that the underlying data structure is updated to the fine-grained search tree, the execution plan ranked first in the priority queue Q has at least one candidate node for the next node to be accessed, and the case of multiple candidate nodes must be handled. In reference sign (b) of fig. 7, for the execution plan whose search is at node 1, there are two candidate nodes: node 2 and node 3. Therefore, the invention proposes to estimate the time cost of the computation at each candidate node and the cost-function shrinkage benefit it brings, and then select the node with low cost and high benefit to continue the search. For the sample-generation node and the classifier-training node, the time costs are respectively the time overhead of generating the labeled sample and the time overhead of training the classification model.
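The "low cost, high benefit" selection rule above can be sketched as follows. The patent does not specify how cost and benefit are combined; using their ratio is our own assumption, and the node names and numbers are invented:

```python
# Hedged sketch of the next-node choice: among the candidate nodes of the
# front execution plan, visit the one with the best ratio of expected
# cost-interval shrinkage ("benefit") to estimated time cost.

def next_node(candidates):
    """candidates: list of (name, time_cost, shrink_benefit) tuples."""
    return max(candidates, key=lambda n: n[2] / n[1])

nodes = [("node2_sample", 4.0, 2.0),   # generate labeled sample: slow, small gain
         ("node3_train",  1.5, 3.0)]   # train classifier: cheaper, larger gain
chosen = next_node(nodes)
```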
In the node computation process, the labeled sample D is computed at a sample-generation node, and at a classifier-training node the accuracy allocation algorithm is called to calculate the optimal accuracy allocation combination and the corresponding cost on the prefix execution plan associated with that node. The present disclosure provides a schematic diagram of the accuracy allocation process corresponding to a fine-grained search tree, as shown in fig. 8.
Fig. 8 is a schematic diagram of an accuracy allocation process corresponding to a fine-grained search tree provided in the present specification.
Reference sign (a) of fig. 8 shows how, before any accuracy has been allocated, the upper and lower bounds of the accuracy used in computing the cost function are constructed. Reference sign (b) of fig. 8 shows the corresponding fine-grained search tree: when node 2 is searched, for the current classifier-training node, the accuracy allocation algorithm is called to allocate accuracy only to the first proxy model, and the optimal allocation combination on that prefix is found. Similarly, reference sign (c) of fig. 8 shows that on the fine-grained search tree, when node 4 is searched, the accuracy allocation algorithm is called for the execution plan from the root node to the current node to allocate accuracies, and the optimal allocation combination is found.
Given that the accuracy allocation algorithm, for a query q under a specific execution order, allocates the target accuracy A to the accuracy parameters α of the proxy models in the query and thereby calculates the execution cost, in this specification, when a node t of depth i is searched, the accuracy allocation algorithm is used to calculate the optimal accuracy combination and the corresponding execution cost for the execution plan corresponding to the path from the root node to the current node, thereby further shrinking the cost function boundary of the execution plan corresponding to that node.
S204: determining the execution cost corresponding to each candidate parameter combination in each candidate execution sequence, and determining a target execution sequence and a target parameter combination in the target execution sequence according to the execution cost corresponding to each candidate parameter combination in each candidate execution sequence.
The server may determine the execution cost corresponding to each candidate parameter combination under each candidate execution order, and determine the target execution order and the target parameter combination under the target execution order according to these execution costs. Specifically, the server may take the candidate execution order with the minimum execution cost, together with the parameter combination under that candidate execution order, as the target execution order and the target parameter combination.
S205: when a query request is received, the agent model corresponding to each task model is called according to the target parameter combination, and the reasoning query task corresponding to the query request is executed through each task model according to the target execution sequence.
After determining the target execution order corresponding to the target accuracy and the target parameter combination under that order, when receiving a query request, the server can call the proxy model corresponding to each task model according to the target parameter combination, and execute the inference query task corresponding to the query request through the task models according to the target execution order; after the final query result is obtained, the whole inference query process is complete. For ease of understanding, the present disclosure provides a schematic diagram of the optimal execution order of the rearranged task models and proxy models, as shown in fig. 9.
Fig. 9 is a schematic diagram of an optimal execution sequence of each rearranged task model and agent model provided in the present specification.
Again taking the topic recognition model, the place marking model and the emotion analysis model as an example: after the execution order and the accuracy parameters are allocated, the optimal execution order of the task models is determined, together with the corresponding parameter combination of the proxy models. For the query q input by the user (the attitude of users at place S toward masks), the initial data first passes through the first proxy model for preliminary filtering; the filtered data is then input into the place marking model, whose corresponding filter screens the data by its filter condition; the screened data is then passed to the proxy model corresponding to the next task model; and the subsequent proxy models and their corresponding task models are executed in turn until the final query result is obtained, namely the data in which users at place S hold a positive attitude toward masks.
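The staged proxy-then-task execution described above can be sketched end to end. All models below are stand-in predicates over toy records, not the patent's actual models; in practice each proxy would be a trained classifier thresholded according to its allocated accuracy parameter:

```python
# Hedged sketch of executing the optimized plan: each stage applies the
# proxy model's cheap pre-filter first, then the task-model filter itself.

def run_pipeline(data, stages):
    """stages: list of (proxy_predicate, task_predicate) pairs, in order."""
    for proxy, task in stages:
        data = [x for x in data if proxy(x)]   # cheap proxy pre-filter
        data = [x for x in data if task(x)]    # accurate task-model filter
    return data

records = [
    {"place": "S", "subject": "mask", "attitude": "positive"},
    {"place": "S", "subject": "mask", "attitude": "negative"},
    {"place": "T", "subject": "mask", "attitude": "positive"},
    {"place": "S", "subject": "car",  "attitude": "positive"},
]
stages = [
    (lambda x: x["place"] != "T",    lambda x: x["place"] == "S"),        # place marking
    (lambda x: True,                 lambda x: x["subject"] == "mask"),   # topic recognition
    (lambda x: x["attitude"] != "",  lambda x: x["attitude"] == "positive"),  # emotion analysis
]
result = run_pipeline(records, stages)
```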
Of course, in practical applications the method may also be applied to other scenes. For example, to screen out bus travel data of place S from image data collected by traffic cameras at various places, the server may determine the target execution order of the processing tasks and the target parameter combination of the proxy models through the method provided in this specification, so as to screen out the image data of place S through the classifier corresponding to the place marking model according to the target execution order and target parameter combination, screen out the image data containing vehicles through the main-body recognition model, and screen out the image data containing buses through the target recognition model; the public transportation of place S can then be adjusted and planned according to the screened data.
In this specification, the server may also perform the above-mentioned task of reasoning query based on the framework of the reasoning query processing system of the rearrangement technology, and for convenience of understanding, this specification provides an exemplary diagram of a rearrangement technology system for reasoning query, as shown in fig. 10.
Fig. 10 is an exemplary diagram of a rearrangement technology system for reasoning-oriented queries provided in the present specification.
The system comprises an interactive interface module, a query optimization module, an execution engine module and a data input/output module. A user first submits a query q through the user interaction interface and specifies the target accuracy A = 90% of the query result acceptable to the user. The query optimization module receives the query q submitted by the user and the target accuracy A, then generates an optimized execution plan using the initial k% of the input data and saves the output result. The execution engine module executes the optimized execution plan on the remaining (1 - k)% of the input data and saves the output result.
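The k% / (1 - k)% division of the input data described above can be sketched in one helper; the function name and the choice of a simple prefix split are our own assumptions:

```python
# Hedged sketch: use the first k% of the input data for plan optimization
# and hand the remainder to the execution engine.

def split_input(data, k):
    """Return (optimization sample, remainder) for a percentage k in [0, 100]."""
    cut = int(len(data) * k / 100)
    return data[:cut], data[cut:]

opt_sample, remainder = split_input(list(range(100)), 10)
```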
The query optimization module comprises a proxy model building sub-module, a target accuracy distribution sub-module and a rearrangement proxy model sub-module, wherein the proxy model building sub-module provides an algorithm for online proxy model building, the target accuracy distribution sub-module provides an accuracy allocation algorithm for a specific execution order, and the rearrangement proxy model sub-module provides a branch-and-bound algorithm for searching the optimal execution order and allocating accuracies.
According to the method, in the process of executing data processing, the proxy models corresponding to the most reasonable target accuracy parameter combination can be selected from the proxy models corresponding to the accuracy parameter combinations under the different candidate execution orders of the task models, so as to filter the data input into each task model, and the processing is performed according to the target execution order, thereby reducing the amount of calculation of each task model and shortening the data processing time. In addition, in the process of determining the accuracy parameter combination, the execution orders satisfying the conditions are screened through the value ranges of the execution costs corresponding to the execution orders, which reduces the amount of calculation in determining the target accuracy parameter combination and further improves the data processing efficiency.
The foregoing describes one or more implementations of the data processing method based on the proxy model according to the present specification. Based on the same concept, the present specification further provides a corresponding data processing apparatus based on the proxy model, as shown in fig. 11.
Fig. 11 is a schematic diagram of an inference query optimization apparatus based on a proxy model rearrangement technology provided in the present specification, including:
the acquisition module 1101 is configured to acquire each task model, and determine different execution sequences when performing an inference query task through each task model;
The filtering module 1102 is configured to query a target value range of an execution cost corresponding to each execution order according to a target accuracy input by a user, and filter each execution order according to the target value range to obtain each candidate execution order;
a first determining module 1103, configured to determine, for each candidate execution order, combinations of accuracy parameters of the different proxy models corresponding to each task model in the candidate execution order based on the target accuracy, as each parameter combination, where a proxy model is used to filter the data input into a task model according to the filtering condition corresponding to that task model, and an accuracy parameter characterizes the proportion of unfiltered data among the data that does not meet the filtering condition when the proxy model with that accuracy parameter filters the data input into the task model;
a second determining module 1104, configured to determine an execution cost corresponding to each candidate parameter combination in each candidate execution order, and determine a target execution order and a target parameter combination in the target execution order according to the execution cost corresponding to each candidate parameter combination in each candidate execution order;
And an execution module 1105, configured to, when a query request is received, call a proxy model corresponding to each task model according to the target parameter combination, and execute, according to the target execution order, an inference query task corresponding to the query request through each task model.
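Conceptually, the execution module's behavior (run a cheap proxy model in front of each expensive task model, in the chosen target execution order) can be sketched as below. Representing proxies and task models as predicate pairs is an assumption made for illustration only.

```python
def run_inference_chain(rows, stages):
    """stages: list of (proxy, model) predicate pairs in the target
    execution order. Each proxy is a cheap pre-filter whose accuracy
    parameter governs how many non-matching rows it lets through; rows
    it rejects never reach the expensive task model behind it."""
    for proxy, model in stages:
        rows = [r for r in rows if proxy(r)]   # cheap proxy pre-filter
        rows = [r for r in rows if model(r)]   # expensive task model
    return rows
```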
Optionally, the second determining module 1104 is specifically configured to obtain, for each accuracy parameter in the candidate parameter combination, a mapping relationship between an accuracy rate and a filtering rate corresponding to a proxy model of the accuracy parameter, where mapping relationships corresponding to proxy models of different accuracy parameters are different; determining the filtering rate corresponding to the accuracy parameter according to the accuracy corresponding to the accuracy parameter and the mapping relation corresponding to the agent model of the accuracy parameter; determining the execution cost corresponding to the accuracy parameter according to the accuracy corresponding to the accuracy parameter and the filtering rate; and determining the execution cost corresponding to each candidate parameter combination according to the execution cost corresponding to each accuracy parameter in each candidate parameter combination.
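The cost computation described above — look up the filtering rate for an accuracy parameter from that proxy model's accuracy-to-filtering-rate mapping, then derive a per-stage and per-combination execution cost — might be sketched as follows. The linear interpolation over measured mapping points and the additive expected-cost formula are illustrative assumptions, not the patent's exact formulas.

```python
import bisect

def filter_rate(accuracy, mapping):
    """mapping: (accuracy, filter_rate) pairs sorted by accuracy, measured
    for the proxy model trained at each accuracy parameter; intermediate
    values are obtained by linear interpolation."""
    xs = [a for a, _ in mapping]
    ys = [f for _, f in mapping]
    if accuracy <= xs[0]:
        return ys[0]
    if accuracy >= xs[-1]:
        return ys[-1]
    i = bisect.bisect_left(xs, accuracy)
    t = (accuracy - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])

def combination_cost(accuracies, mappings, proxy_costs, model_costs):
    """Expected per-row cost of one accuracy-parameter combination:
    every surviving row pays its stage's proxy model, and only the rows
    the proxy does not filter out reach (and pay for) the task model."""
    surviving, total = 1.0, 0.0
    for a, m, pc, mc in zip(accuracies, mappings, proxy_costs, model_costs):
        f = filter_rate(a, m)
        total += surviving * (pc + (1.0 - f) * mc)
        surviving *= (1.0 - f)  # unfiltered rows flow to the next stage
    return total
```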
Optionally, before each task model is acquired, the acquiring module 1101 is further configured to determine an initial value range of an execution cost corresponding to each execution order; constructing and training agent models of different accuracy parameters corresponding to each task model, and determining a filtering rate range of filtering rates corresponding to each agent model under different execution sequences and a selection rate range of each task model according to the target accuracy; and updating the initial value range according to the target accuracy, the filtering rate range and the selection rate range to obtain a target value range of the execution cost corresponding to each execution sequence.
Optionally, before each task model is acquired, the acquiring module 1101 is further configured to divide, for each task model, the task model into a first sub-model and a second sub-model, where the first sub-model is used for generating a labeled sample, and the second sub-model is used for training a classification model and calculating a mapping relationship between an accuracy rate and a filtering rate; determining different execution sequences of sub-models corresponding to each agent model, and taking the different execution sequences as the execution sequences of the sub-models; determining a target value range of the execution cost corresponding to the execution sequence of each sub-model aiming at the execution sequence of the sub-model; filtering the execution sequences of the sub-models according to the target value range of the corresponding execution cost of each sub-model execution sequence to obtain candidate sub-model execution sequences; determining a target sub-model sequence and a target parameter combination according to the execution cost corresponding to each parameter combination under each sub-model execution sequence and the time cost corresponding to each sub-model execution sequence; and calling each agent model corresponding to the target parameter combination, and executing tasks corresponding to all sub-models according to the execution sequence of the target sub-models.
Optionally, the filtering module 1102 is specifically configured to, for each execution order, take the execution order as the candidate execution order if the highest value of the execution cost corresponding to the execution order is lower than the lowest value of the execution cost corresponding to the other execution orders.
Optionally, the filtering module 1102 is specifically configured to, for each execution order, take the execution order as the candidate execution order if the lowest value of the execution cost corresponding to the execution order is smaller than the lowest value of the execution cost corresponding to the other execution orders, and the highest value of the execution cost corresponding to the execution order is greater than the lowest value, but smaller than the highest value, of the execution cost corresponding to the other execution orders.
Optionally, the filtering module 1102 is specifically configured to, for each execution order, take the execution order as the candidate execution order if the lowest value of the execution cost corresponding to the execution order is smaller than the lowest value of the execution cost corresponding to the other execution orders, and the highest value of the execution cost corresponding to the execution order is greater than the highest value of the execution cost corresponding to the other execution orders.
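One simple reading of these interval-comparison rules is that an execution order is discarded only when its lowest possible cost already exceeds another order's highest possible cost, so it can never be optimal. The sketch below implements that pruning pass under this assumption; it is not the patent's exact branch-and-bound procedure.

```python
def prune_orders(cost_ranges):
    """cost_ranges: {execution_order: (lowest_cost, highest_cost)}.
    Keep an order unless some other order's highest cost is strictly
    below this order's lowest cost (i.e. this order is provably worse)."""
    survivors = {}
    for name, (low, high) in cost_ranges.items():
        dominated = any(other_high < low
                        for other, (_, other_high) in cost_ranges.items()
                        if other != name)
        if not dominated:
            survivors[name] = (low, high)
    return survivors
```

Orders that survive the pruning pass are the candidate execution orders over which the accuracy parameter combinations are then enumerated.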
The present specification also provides a computer readable storage medium storing a computer program operable to perform a method of inferential query optimization based on the proxy model rearrangement technique described above with reference to FIG. 2.
The present specification also provides a schematic structural diagram, shown in fig. 12, of an electronic device corresponding to fig. 2. At the hardware level, as shown in fig. 12, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile storage, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile storage into the memory and then runs it, so as to implement the proxy model-based data processing method described above with respect to fig. 2. Of course, in addition to the software implementation, other implementations, such as logic devices or combinations of hardware and software, are not excluded from the present specification; that is, the execution subject of the following processing flows is not limited to logic units, but may also be hardware or logic devices.
Improvements to a technology could once be clearly distinguished as hardware improvements (e.g., improvements to circuit structures such as diodes, transistors, switches, etc.) or software improvements (improvements to the method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements of hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be implemented by hardware entity modules. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually manufacturing integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can easily be obtained merely by slightly programming the method flow into an integrated circuit using one of the hardware description languages described above.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller in pure computer-readable program code, the method steps can well be logically programmed so that the controller achieves the same functionality in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may thus be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (9)

1. The reasoning inquiry optimization method based on the agent model rearrangement technology is characterized by comprising the following steps:
acquiring each task model, and determining different execution sequences when the reasoning inquiry task is executed through each task model;
inquiring a target value range of the execution cost corresponding to each execution sequence according to the target accuracy input by a user, and filtering each execution sequence according to the target value range to obtain each candidate execution sequence;
determining, as each parameter combination, a combination of accuracy parameters of different proxy models corresponding to each task model in each candidate execution order based on the target accuracy, where the proxy models are used to filter data in an input task model according to filtering conditions corresponding to the task models, and the accuracy parameters are used to characterize a proportion of unfiltered data to data that do not meet the filtering conditions when the proxy models of the accuracy parameters filter the data in the input task model;
for each accuracy parameter in the candidate parameter combination, acquiring a mapping relation between the accuracy and filtering rate corresponding to the agent model of the accuracy parameter, wherein the mapping relation corresponding to the agent model of different accuracy parameters is different, determining the filtering rate corresponding to the accuracy parameter according to the accuracy corresponding to the accuracy parameter and the mapping relation corresponding to the agent model of the accuracy parameter, determining the execution cost corresponding to the accuracy parameter according to the accuracy corresponding to the accuracy parameter and the filtering rate, determining the execution cost corresponding to each candidate parameter combination according to the execution cost corresponding to each accuracy parameter in each candidate parameter combination, and determining the target execution sequence and the target parameter combination under the target execution sequence according to the execution cost corresponding to each candidate parameter combination under each candidate execution sequence;
When a query request is received, the agent model corresponding to each task model is called according to the target parameter combination, and the reasoning query task corresponding to the query request is executed through each task model according to the target execution sequence.
2. The method of claim 1, wherein prior to acquiring each task model, the method further comprises:
determining an initial value range of the execution cost corresponding to each execution sequence;
constructing and training agent models of different accuracy parameters corresponding to each task model, and determining a filtering rate range of filtering rates corresponding to each agent model under different execution sequences and a selection rate range of each task model according to the target accuracy;
and updating the initial value range according to the target accuracy, the filtering rate range and the selection rate range to obtain a target value range of the execution cost corresponding to each execution sequence.
3. The method of claim 2, wherein prior to acquiring each task model, the method further comprises:
for each task model, dividing the task model into a first sub-model and a second sub-model, wherein the first sub-model is used for generating a labeled sample, and the second sub-model is used for training a classification model and calculating the mapping relation between the accuracy and the filtering rate;
Determining different execution sequences of sub-models corresponding to each agent model, and taking the different execution sequences as the execution sequences of the sub-models;
determining a target value range of the execution cost corresponding to the execution sequence of each sub-model aiming at the execution sequence of the sub-model;
filtering the execution sequences of the sub-models according to the target value range of the corresponding execution cost of each sub-model execution sequence to obtain candidate sub-model execution sequences;
determining a target sub-model sequence and a target parameter combination according to the execution cost corresponding to each parameter combination under each sub-model execution sequence and the time cost corresponding to each sub-model execution sequence;
and calling each agent model corresponding to the target parameter combination, and executing tasks corresponding to all sub-models according to the execution sequence of the target sub-models.
4. The method of claim 1, wherein querying a target range of values for execution costs corresponding to each execution order according to a target accuracy rate entered by a user, and filtering each execution order according to the target range of values to obtain each candidate execution order, specifically comprising:
and for each execution sequence, if the highest value of the execution cost corresponding to the execution sequence is lower than the lowest value of the execution cost corresponding to other execution sequences, the execution sequence is taken as the candidate execution sequence.
5. The method of claim 1, wherein querying a target range of values for execution costs corresponding to each execution order according to a target accuracy rate entered by a user, and filtering each execution order according to the target range of values to obtain each candidate execution order, specifically comprising:
and for each execution sequence, if the lowest value of the execution cost corresponding to the execution sequence is smaller than the lowest value of the execution cost corresponding to other execution sequences, and the highest value of the execution cost corresponding to the execution sequence is larger than the lowest value of the execution cost corresponding to the other execution sequences and smaller than the highest value of the execution cost corresponding to the other execution sequences, the execution sequence is taken as the candidate execution sequence.
6. The method of claim 1, wherein querying a target range of values for execution costs corresponding to each execution order according to a target accuracy rate entered by a user, and filtering each execution order according to the target range of values to obtain each candidate execution order, specifically comprising:
and for each execution sequence, if the lowest value of the execution cost corresponding to the execution sequence is smaller than the lowest value of the execution cost corresponding to other execution sequences and the highest value of the execution cost corresponding to the execution sequence is larger than the highest value of the execution cost corresponding to other execution sequences, taking the execution sequence as the candidate execution sequence.
7. An inference query optimization device based on a proxy model rearrangement technology, which is characterized by comprising:
the acquisition module acquires each task model and determines different execution sequences when the reasoning inquiry task is executed through each task model;
the filtering module queries a target value range of the execution cost corresponding to each execution sequence according to the target accuracy input by a user, and filters each execution sequence according to the target value range to obtain each candidate execution sequence;
the first determining module is used for determining the combination of accuracy parameters of different agent models corresponding to each task model under the candidate execution sequence according to the target accuracy rate aiming at each candidate execution sequence, wherein the agent model is used for filtering data in an input task model according to the filtering condition corresponding to the task model as each parameter combination, and the accuracy parameters are used for representing the proportion of unfiltered data to data which do not accord with the filtering condition when the agent model of the accuracy parameters filters the data in the input task model;
the second determining module is used for obtaining a mapping relation between the accuracy and the filtering rate corresponding to the agent model of the accuracy parameter aiming at each accuracy parameter in the candidate parameter combination, wherein the mapping relation corresponding to the agent model of different accuracy parameters is different, determining the filtering rate corresponding to the accuracy parameter according to the accuracy corresponding to the accuracy parameter and the mapping relation corresponding to the agent model of the accuracy parameter, determining the execution cost corresponding to the accuracy parameter according to the accuracy corresponding to the accuracy parameter and the filtering rate, determining the execution cost corresponding to each candidate parameter combination according to the execution cost corresponding to each accuracy parameter in each candidate parameter combination, and determining the target execution sequence and the target parameter combination under the target execution sequence according to the execution cost corresponding to each candidate parameter combination under each candidate execution sequence;
And the execution module is used for calling the proxy model corresponding to each task model according to the target parameter combination when the query request is received, and executing the reasoning query task corresponding to the query request through each task model according to the target execution sequence.
8. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-6.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-6 when executing the program.
CN202311107125.XA 2023-08-30 2023-08-30 Reasoning query optimization method and device based on agent model rearrangement technology Active CN116842060B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311107125.XA CN116842060B (en) 2023-08-30 2023-08-30 Reasoning query optimization method and device based on agent model rearrangement technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311107125.XA CN116842060B (en) 2023-08-30 2023-08-30 Reasoning query optimization method and device based on agent model rearrangement technology

Publications (2)

Publication Number Publication Date
CN116842060A CN116842060A (en) 2023-10-03
CN116842060B true CN116842060B (en) 2024-01-09

Family

ID=88174625

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311107125.XA Active CN116842060B (en) 2023-08-30 2023-08-30 Reasoning query optimization method and device based on agent model rearrangement technology

Country Status (1)

Country Link
CN (1) CN116842060B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117171577B (en) * 2023-11-02 2024-03-22 之江实验室 Dynamic decision method and device for high-performance operator selection

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949631A (en) * 2019-05-14 2020-11-17 华为技术有限公司 Method and device for determining configuration parameters of database
CN114882437A (en) * 2022-05-23 2022-08-09 浙江大华技术股份有限公司 Recognition model training method and device, electronic equipment and storage medium
CN116384505A (en) * 2023-02-13 2023-07-04 支付宝(杭州)信息技术有限公司 Data processing method and device, storage medium and electronic equipment
CN116450344A (en) * 2023-03-13 2023-07-18 之江实验室 Task execution method and device, storage medium and electronic equipment
CN116595040A (en) * 2023-03-29 2023-08-15 之江实验室 Optimization method and device for classified query of data in overload scene

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019990B (en) * 2017-07-14 2023-05-23 阿里巴巴集团控股有限公司 Sample screening method and device and business object data searching method and device


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A design pattern recognition method based on similarity scoring; Wang Lei; Song Huina; Wang Wenfa; Journal of Hunan University (Natural Sciences) (12); full text *

Also Published As

Publication number Publication date
CN116842060A (en) 2023-10-03

Similar Documents

Publication Publication Date Title
CN116842060B (en) Reasoning query optimization method and device based on agent model rearrangement technology
CN108665175A Processing method, apparatus and processing device for insurance business risk prediction
CN110363449A Risk identification method, apparatus and system
JP2004518226A (en) Database system and query optimizer
CN110347724A Abnormal behavior recognition method, apparatus, electronic device and medium
CN113986933A (en) Materialized view creating method and device, storage medium and electronic equipment
Mann et al. A proposed hybrid clustering algorithm using K-means and BIRCH for cluster based cab recommender system (CBCRS)
CN117009441A (en) Knowledge graph construction system and method based on relational database
Yao et al. Top-down progressive computing
CN116821193B (en) Reasoning query optimization method and device based on proxy model approximation processing
US11386155B2 (en) Filter evaluation in a database system
CN116340852B (en) Model training and business wind control method and device
CN117252183B (en) Semantic-based multi-source table automatic matching method, device and storage medium
CN117391150B (en) Graph data retrieval model training method based on hierarchical pooling graph hash
CN117494068B (en) Network public opinion analysis method and device combining deep learning and causal inference
CN109033201A Method, apparatus and electronic device for acquiring file difference data
CN116306855B (en) Data processing method and device based on memory and calculation integrated system
CN117075918B (en) Model deployment method and device, storage medium and electronic equipment
CN117933707A (en) Wind control model interpretation method and device, storage medium and electronic equipment
CN109784495B (en) Method and device for establishing characteristic processing flow, storage medium and electronic equipment
CN117973867A (en) Risk control method, apparatus, storage medium and device
CN117094555A (en) Updating method, device and equipment of wind control strategy
CN117935915A Gene expression level detection data management method and device
CN116523016A (en) Training method and training device
CN116402113A (en) Task execution method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant