CN116992929A - Network architecture determining method, device and storage medium - Google Patents

Network architecture determining method, device and storage medium

Info

Publication number
CN116992929A
CN116992929A (application number CN202211147941.9A)
Authority
CN
China
Prior art keywords
task
target
capability
capacity
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211147941.9A
Other languages
Chinese (zh)
Inventor
高莹莹
张世磊
冯俊兰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd and China Mobile Communications Ltd Research Institute
Priority claimed from application CN202211147941.9A
Publication of CN116992929A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/901Indexing; Data structures therefor; Storage structures
    • G06F16/9024Graphs; Linked lists
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/903Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/166Editing, e.g. inserting or deleting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention discloses a network architecture determining method, a network architecture determining device, and a storage medium. The method comprises the following steps: querying a preset database according to a target task, and determining a target AI capability set required for realizing the target task; establishing a super network corresponding to the target task according to the target AI capability set, the super network comprising AI capability combinations of at least one path; and performing a path search on the super network, and determining a target AI capability combination corresponding to the target task.

Description

Network architecture determining method, device and storage medium
Technical Field
The present invention relates to the field of wireless communications, and in particular, to a network architecture determining method, apparatus, and storage medium.
Background
Neural Architecture Search (NAS) is a method of searching for an optimal network architecture from a search space. It comprises three parts: the search space (Search Space), the search strategy (Search Strategy), and the performance estimation strategy (Performance Estimation Strategy).
The search space defines, in principle, which architectures can be represented. Combining it with prior knowledge of the task attributes can reduce the size of the search space and simplify the search; however, this also introduces human bias, which may prevent the discovery of novel architectural building blocks that go beyond current human knowledge. The search strategy specifies how to explore the space and involves the classical exploration-exploitation trade-off: on the one hand, well-performing architectures need to be found quickly; on the other hand, premature convergence to a region of suboptimal architectures must be avoided. The goal of NAS is typically to find an architecture that achieves high predictive performance on unseen data; performance estimation refers to the process of assessing this performance. The simplest option is to perform standard training and validation of each candidate architecture on the data, but this approach is computationally expensive and limits the number of architectures that can be explored. Accordingly, much recent research has focused on developing methods that reduce these performance estimation costs.
NAS mainly aims to solve two coupled problems. The first is weight optimization: finding the weights that minimize the loss on the training set. The second is architecture optimization: finding the structure that maximizes accuracy on the validation set. In actual business scenarios there may be further constraints, such as requiring that the inference latency be less than a certain threshold.
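The two coupled problems above can be written as a bilevel optimization; a sketch in generic notation (the latency constraint with threshold $\tau$ is an illustrative example of a business constraint, not part of the standard formulation):

```latex
\begin{aligned}
\alpha^{*} &= \arg\min_{\alpha \in \mathcal{A}} \; \mathcal{L}_{\mathrm{val}}\bigl(w^{*}(\alpha), \alpha\bigr)
  && \text{(architecture optimization)} \\
\text{s.t.}\quad w^{*}(\alpha) &= \arg\min_{w} \; \mathcal{L}_{\mathrm{train}}(w, \alpha)
  && \text{(weight optimization)} \\
\mathrm{Latency}(\alpha) &\le \tau
  && \text{(optional business constraint)}
\end{aligned}
```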
Most schemes in the related art focus on how to reduce the search space, so as to find the optimal architecture quickly while avoiding convergence to a suboptimal architecture, but they are decoupled from specific tasks; that is, they adopt a task-agnostic search algorithm so that the optimal model architecture can be searched out on different task data sets. However, in the context of architecture-level artificial intelligence, facing ever-increasing artificial intelligence (AI) capabilities and cumbersome, complex business scenarios, there is a need to provide task-oriented, end-to-end automatic optimization and combination of AI capabilities.
Disclosure of Invention
In view of the above, the primary objective of the present invention is to provide a network architecture determining method, device and storage medium.
In order to achieve the above purpose, the technical scheme of the invention is realized as follows:
the embodiment of the invention provides a network architecture determining method, which comprises the following steps:
querying a preset database according to a target task, and determining a target AI capability set required for realizing the target task;
establishing a super network corresponding to the target task according to the target AI capability set; the super network includes AI capability combinations of at least one path;
and performing a path search on the super network, and determining a target AI capability combination corresponding to the target task.
In the above scheme, the database comprises at least one of the following: a task instance library, an atomic capability library; the task instance library comprises: at least one historical task, and AI capability combinations corresponding to each historical task; the atomic capability library includes: at least one AI capability;
the step of inquiring a preset database according to a target task to determine a target AI capability set required for realizing the target task includes:
querying the task instance library according to first description information of the target task, and determining whether a target historical task meeting a first condition exists in the task instance library;
in a case where the target historical task exists in the task instance library, determining the target AI capability set according to the AI capability combination corresponding to the target historical task;
and in a case where the target historical task does not exist in the task instance library, querying the atomic capability library according to the first description information of the target task, determining the AI capabilities meeting a second condition in the atomic capability library, and determining the target AI capability set according to the AI capabilities meeting the second condition.
In the above solution, the task instance library further includes: second description information of each historical task; the querying the task instance library according to the first description information of the target task includes at least one of the following:
calculating the minimum text edit distance between the first description information of the target task and the second description information of each historical task in the task instance library;
calculating cosine similarity distance of the first descriptive information and the second descriptive information;
determining a first text vector corresponding to the first description information and a second text vector corresponding to the second description information, and identifying the first text vector and the second text vector by using a multi-layer perceptron to obtain an identification result;
accordingly, the satisfaction of the first condition includes at least one of:
the minimum text edit distance is smaller than a first threshold;
the cosine similarity distance is smaller than a second threshold;
the identification result characterizes that the first descriptive information and the second descriptive information are in the same category.
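The first two similarity checks above can be sketched in a few lines; a minimal Python illustration (function names and thresholds are assumptions, the text vectors are taken as precomputed, and the multi-layer-perceptron check is omitted):

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum text edit (Levenshtein) distance via dynamic programming."""
    dp = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, len(b) + 1):
            cur = dp[j]
            # deletion, insertion, or substitution/match
            dp[j] = min(dp[j] + 1, dp[j - 1] + 1, prev + (a[i - 1] != b[j - 1]))
            prev = cur
    return dp[len(b)]

def cosine_distance(u, v):
    """1 - cosine similarity of two text vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sum(x * x for x in u) ** 0.5
    norm_v = sum(y * y for y in v) ** 0.5
    return 1.0 - dot / (norm_u * norm_v)

def meets_first_condition(desc_a, desc_b, vec_a, vec_b, t1=3, t2=0.2):
    """The first condition holds if any check passes (thresholds are illustrative)."""
    return (edit_distance(desc_a, desc_b) < t1
            or cosine_distance(vec_a, vec_b) < t2)
```

The same two checks, with their own thresholds, apply when matching the target task against the third description information in the atomic capability library.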
In the above scheme, the atomic capability library further includes: third description information of each AI capability; the querying the atomic capability library according to the first description information of the target task includes at least one of the following:
calculating the minimum text edit distance between the first description information of the target task and the third description information of each AI capability in the atomic capability library;
calculating cosine similarity distance of the first descriptive information and the third descriptive information;
determining a first text vector corresponding to the first description information and a third text vector corresponding to the third description information, and identifying the first text vector and the third text vector by using a multi-layer perceptron to obtain an identification result;
accordingly, the satisfaction of the second condition includes at least one of:
the minimum text edit distance is smaller than a third threshold;
the cosine similarity distance is smaller than a fourth threshold;
the identification result characterizes that the first descriptive information and the third descriptive information are in the same category.
In the above scheme, the establishing the super network corresponding to the target task according to the target AI capability set includes one of the following:
in a case where there are multiple target historical tasks, merging the AI capability combinations corresponding to the multiple target historical tasks to obtain the super network corresponding to the target task;
and performing a connection operation on the AI capabilities meeting the second condition to obtain the super network corresponding to the target task.
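The merging of several tasks' capability combinations can be sketched as a union of per-task capability chains into one directed graph; a minimal illustration (representing each combination as an ordered list of capability IDs, and the `<in>`/`<out>` sentinel nodes, are assumptions):

```python
def build_supernet(capability_combos):
    """Merge the capability chains of several historical tasks into one
    directed graph (the super network): nodes are AI capabilities, and
    every chain contributes its edges, so shared capabilities fan out."""
    graph = {}
    for combo in capability_combos:
        chain = ["<in>"] + list(combo) + ["<out>"]  # sentinel start/end nodes
        for src, dst in zip(chain, chain[1:]):
            graph.setdefault(src, set()).add(dst)
    return graph
```

Each path from `<in>` to `<out>` in the resulting graph is one candidate AI capability combination.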
In the above solution, the performing a path search on the super network and determining a target AI capability combination corresponding to the target task includes:
validating the AI capability combination of each path in the super network by using a preset validation set, to obtain a validation result for the AI capability combination of each path;
determining a target AI capability combination meeting a search objective according to the validation result for the AI capability combination of each path;
wherein the search objective is determined based on at least one of: performance loss on the validation set, model size, computation consumption, and inference duration.
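One simple way to combine the factors of the search objective is a weighted scalarization over the per-path validation results; a hypothetical sketch (the weights, units, and metric names are illustrative, not specified by the method):

```python
def path_score(val_loss, model_size_mb, flops_g, latency_ms,
               weights=(1.0, 0.01, 0.01, 0.005)):
    """Lower is better: weighted sum of validation-set loss, model size,
    computation consumption, and inference duration."""
    w1, w2, w3, w4 = weights
    return w1 * val_loss + w2 * model_size_mb + w3 * flops_g + w4 * latency_ms

def select_best(paths):
    """paths: list of (capability_combo, metrics_dict) pairs; returns the
    combo whose metrics minimize the scalarized search objective."""
    return min(paths, key=lambda p: path_score(**p[1]))[0]
```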
In the above solution, the performing a path search on the super network and determining a target AI capability combination corresponding to the target task includes:
performing a path search on the super network according to the transition probabilities of the AI capabilities, and determining the target AI capability combination.
In the above scheme, the method further comprises:
determining the transition probability of the AI capabilities according to the AI capability combination corresponding to each historical task in the task instance library.
In the above solution, the determining the transition probability of the AI capabilities according to the AI capability combination corresponding to each historical task in the task instance library includes:
determining, according to the AI capability combination corresponding to each historical task in the task instance library, the number of times a first AI capability transitions to a second AI capability and the number of times the first AI capability occurs; the second AI capability is an AI capability other than the first AI capability in the task instance library;
and determining the probability of the first AI capability transitioning to the second AI capability based on the number of times the first AI capability transitions to the second AI capability and the number of times the first AI capability occurs.
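The counting rule above can be sketched directly; a minimal illustration (representing each historical task's AI capability combination as an ordered list of capability IDs is an assumption):

```python
from collections import Counter

def transition_probabilities(combos):
    """P(second | first) = count(first -> second) / count(first occurs),
    counted over the capability combinations of all historical tasks."""
    occurrences, transitions = Counter(), Counter()
    for combo in combos:
        occurrences.update(combo)              # occurrences of each capability
        for a, b in zip(combo, combo[1:]):     # adjacent pairs = transitions
            transitions[(a, b)] += 1
    return {(a, b): n / occurrences[a] for (a, b), n in transitions.items()}
```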
The embodiment of the invention provides a network architecture determining device, which comprises:
the first processing module is configured to query a preset database according to a target task, and determine a target AI capability set required for realizing the target task;
the second processing module is configured to establish a super network corresponding to the target task according to the target AI capability set; the super network includes AI capability combinations of at least one path;
and the third processing module is configured to perform a path search on the super network, and determine a target AI capability combination corresponding to the target task.
In the above scheme, the database comprises at least one of the following: a task instance library, an atomic capability library; the task instance library comprises: at least one historical task, and AI capability combinations corresponding to each historical task; the atomic capability library includes: at least one AI capability;
The first processing module is configured to query the task instance library according to first description information of the target task, and determine whether a target historical task meeting a first condition exists in the task instance library;
in a case where the target historical task exists in the task instance library, determine the target AI capability set according to the AI capability combination corresponding to the target historical task;
and in a case where the target historical task does not exist in the task instance library, query the atomic capability library according to the first description information of the target task, determine the AI capabilities meeting a second condition in the atomic capability library, and determine the target AI capability set according to the AI capabilities meeting the second condition.
In the above solution, the task instance library further includes: second descriptive information for each historical task; the first processing module is specifically configured to perform at least one of the following:
calculating the minimum text edit distance between the first description information of the target task and the second description information of each historical task in the task instance library;
calculating cosine similarity distance of the first descriptive information and the second descriptive information;
determining a first text vector corresponding to the first description information and a second text vector corresponding to the second description information, and identifying the first text vector and the second text vector by using a multi-layer perceptron to obtain an identification result;
Accordingly, the satisfaction of the first condition includes at least one of:
the minimum text edit distance is smaller than a first threshold;
the cosine similarity distance is smaller than a second threshold;
the identification result characterizes that the first descriptive information and the second descriptive information are in the same category.
In the above scheme, the atomic capability library further includes: third description information of each AI capability; the first processing module is specifically configured to perform at least one of the following:
calculating the minimum text edit distance between the first description information of the target task and the third description information of each AI capability in the atomic capability library;
calculating cosine similarity distance of the first descriptive information and the third descriptive information;
determining a first text vector corresponding to the first description information and a third text vector corresponding to the third description information, and identifying the first text vector and the third text vector by using a multi-layer perceptron to obtain an identification result;
accordingly, the satisfaction of the second condition includes at least one of:
the minimum text edit distance is smaller than a third threshold;
the cosine similarity distance is smaller than a fourth threshold;
the identification result characterizes that the first descriptive information and the third descriptive information are in the same category.
In the above aspect, the second processing module is configured to perform one of the following:
in a case where there are multiple target historical tasks, merging the AI capability combinations corresponding to the multiple target historical tasks to obtain the super network corresponding to the target task;
and performing a connection operation on the AI capabilities meeting the second condition to obtain the super network corresponding to the target task.
In the above scheme, the third processing module is configured to validate the AI capability combination of each path in the super network by using a preset validation set, so as to obtain a validation result for the AI capability combination of each path;
and determine a target AI capability combination meeting a search objective according to the validation result for the AI capability combination of each path;
wherein the search objective is determined based on at least one of: performance loss on the validation set, model size, computation consumption, and inference duration.
In the above scheme, the third processing module is configured to perform a path search on the super network according to the transition probabilities of the AI capabilities, and determine the target AI capability combination.
In the above solution, the third processing module is further configured to determine a transition probability of the AI capabilities according to the AI capability combination corresponding to each historical task in the task instance library.
In the above scheme, the third processing module is specifically configured to determine, according to the AI capability combination corresponding to each historical task in the task instance library, the number of times a first AI capability transitions to a second AI capability and the number of times the first AI capability occurs; the second AI capability is an AI capability other than the first AI capability in the task instance library;
and determine the probability of the first AI capability transitioning to the second AI capability based on the number of times the first AI capability transitions to the second AI capability and the number of times the first AI capability occurs.
The embodiment of the invention provides a network architecture determining device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the steps of any one of the above methods when executing the program.
Embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the methods described above.
The embodiment of the invention provides a network architecture determining method, device, and storage medium, wherein the method comprises: querying a preset database according to a target task, and determining a target AI capability set required for realizing the target task; establishing a super network corresponding to the target task according to the target AI capability set, the super network including AI capability combinations of at least one path; and performing a path search on the super network, and determining a target AI capability combination corresponding to the target task. In this way, the database is queried for different tasks, the AI capability set for each task is determined, and the target AI capability combination for the task is established according to the AI capability set; even non-AI researchers can rapidly configure and combine various AI capabilities for a task, thereby obtaining the model combination with optimal performance for the task.
Drawings
Fig. 1 is a flow chart of a network architecture determining method according to an embodiment of the present invention;
FIG. 2 is a frame diagram of a multi-task progressive multi-constraint architecture artificial intelligent network architecture searching method provided by an embodiment of the invention;
fig. 3 is a flow chart of a network architecture determining method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram showing AI capability by edge according to an embodiment of the invention;
FIG. 5 is a schematic diagram of a super network according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a method for determining a network architecture for a language identification task according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a super network for a language identification task according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a network architecture determining apparatus according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of another network architecture determining apparatus according to an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to examples.
Fig. 1 is a flow chart of a network architecture determining method according to an embodiment of the present invention; as shown in fig. 1, the method may be applied to intelligent devices such as a server, and the method includes:
Step 101, querying a preset database according to a target task, and determining a target AI capability set required for realizing the target task;
Step 102, establishing a super network corresponding to the target task according to the target AI capability set; the super network includes AI capability combinations of at least one path;
Step 103, performing a path search on the super network, and determining a target AI capability combination corresponding to the target task.
In some embodiments, the database comprises at least one of: a task instance library, an atomic capability library;
wherein the task instance library comprises: at least one historical task, and the AI capability combination corresponding to each historical task; an AI capability combination can be understood as a super network that completes a historical task, comprising a single AI capability or multiple connected AI capabilities.
The atomic capability library includes: at least one AI capability.
Each AI capability includes, but is not limited to, the following information:
1) Model parameters Aij.
Here i denotes capability ID and j denotes model ID, i.e. each AI capability can be implemented by one or more models.
The model parameters can be determined based on training such as a neural network, and the model output can be obtained by forward deriving the input features based on the model parameters.
2) Input/output (I/O) interfaces, which respectively describe the type of the input interface (I) and the type of the output interface (O) of the AI capability.
The type of the input interface includes, but is not limited to, at least one of the following: a) signal type (e.g., voice, text, image, etc.); b) feature type, a refinement of the signal type; for example, optional feature types for a speech signal include, but are not limited to: F-bank, MFCC, the original speech signal, etc.; c) feature length: the feature dimension input to the model and the window length for truncating features (e.g., frame number and frame shift).
The type of the output interface includes, but is not limited to, at least one of the following: a) output meaning (e.g., transcribed text, synthesized speech, translated text, image semantic understanding, text semantic understanding, etc.); b) output type (e.g., discrete or continuous, indicating whether gradient conduction is supported, etc.); c) output length, etc.
3) Computation cost C, including but not limited to resource consumption, model size, inference duration, etc.
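An entry A_ij of the atomic capability library, as enumerated above, might be modeled as follows; a hypothetical sketch (the field names are illustrative, not part of the method):

```python
from dataclasses import dataclass, field

@dataclass
class AICapability:
    """One entry A_ij of the atomic capability library (illustrative fields)."""
    capability_id: str            # i: which AI capability
    model_id: str                 # j: one capability may have several models
    input_signal: str             # input interface: e.g. "speech", "text", "image"
    input_features: str           # e.g. "F-bank", "MFCC", "raw"
    output_meaning: str           # e.g. "transcribed text", "translated text"
    supports_gradient: bool       # output type: whether gradient conduction is supported
    cost: dict = field(default_factory=dict)  # resource use, model size, inference duration
```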
In some embodiments, the querying a preset database according to the target task and determining the target AI capability set required for implementing the target task includes at least one of the following:
querying the task instance library according to first description information of the target task, determining a target historical task meeting a first condition in the task instance library, and determining the target AI capability set according to the AI capability combination corresponding to the target historical task;
and querying the atomic capability library according to the first description information of the target task, determining the AI capabilities meeting a second condition in the atomic capability library, and determining the target AI capability set according to the AI capabilities meeting the second condition.
Specifically, the first description information of the target task is determined according to information such as the input characteristics and output characteristics of the target task; the task instance library is queried according to the first description information to determine the AI capability combination corresponding to a target historical task, and/or the atomic capability library is queried according to the first description information to determine the AI capabilities meeting the second condition, so as to determine the target AI capability set required for realizing the target task.
In some embodiments, the querying a preset database according to the target task and determining the target AI capability set required for implementing the target task includes:
querying the task instance library according to first description information of the target task, and determining whether a target historical task meeting a first condition exists in the task instance library;
in a case where the target historical task exists in the task instance library, determining the target AI capability set according to the AI capability combination corresponding to the target historical task;
and in a case where the target historical task does not exist in the task instance library, querying the atomic capability library according to the first description information of the target task, determining the AI capabilities meeting a second condition in the atomic capability library, and determining the target AI capability set according to the AI capabilities meeting the second condition.
Specifically, the first description information of the target task is determined according to information such as the input characteristics and output characteristics of the target task. First, the task instance library is queried according to the first description information of the target task; if a historical task identical or similar to the target task (i.e., a target historical task) exists in the task instance library, the target AI capability set of the target task is obtained directly from the target historical task. If no target historical task exists in the task instance library, the atomic capability library is queried according to the first description information of the target task, the AI capabilities meeting the second condition are determined, and the target AI capability set is determined according to the AI capabilities meeting the second condition.
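The fallback logic described here can be sketched as follows; a minimal illustration (the library formats and the two predicate functions are assumptions, standing in for the first- and second-condition checks described above):

```python
def target_capability_set(task_desc, instance_library, atomic_library,
                          is_similar_task, meets_second_condition):
    """First query the task instance library; if no identical or similar
    historical task exists, fall back to the atomic capability library."""
    hit_combos = [combo for desc, combo in instance_library
                  if is_similar_task(task_desc, desc)]
    if hit_combos:
        caps = set()
        for combo in hit_combos:
            caps.update(combo)          # union of matching tasks' capabilities
        return caps
    return {cap for cap, desc in atomic_library
            if meets_second_condition(task_desc, desc)}
```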
In some embodiments, the task instance library further comprises: second description information of each historical task;
the querying the task instance library according to the first description information of the target task includes at least one of the following:
calculating the minimum text edit distance between the first description information of the target task and the second description information of each historical task in the task instance library;
embedding and representing the first description information and the second description information by using a text pre-training model, and calculating the cosine similarity distance between the first description information and the second description information;
determining a first text vector corresponding to the first description information and a second text vector corresponding to the second description information, and classifying the first text vector and the second text vector by using a multi-layer perceptron to obtain an identification result.
Accordingly, satisfying the first condition includes at least one of:
the minimum text edit distance is smaller than a first threshold;
the cosine similarity distance is smaller than a second threshold;
the identification result indicates that the first description information and the second description information belong to the same category.
Here, the target historical task satisfying the first condition is taken as a historical task that is the same as or similar to the target task, and the super network for the target task may be established directly according to the AI capability combination corresponding to the target historical task.
Considering that too many target historical tasks may be retrieved, a K value may be set, and the top K historical tasks with the highest similarity are screened out from the target historical tasks. The top K historical tasks with the highest similarity may include: the top K historical tasks with the smallest minimum text edit distance and the top K historical tasks with the smallest cosine similarity distance.
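As an illustration of the retrieval above, the minimum text edit distance and the top-K screening can be sketched as follows (a minimal sketch; the task descriptions, threshold, and K value are hypothetical, not values fixed by the method):

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum text edit distance (Levenshtein), computed with a single-row DP."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete ca
                                     dp[j - 1] + 1,      # insert cb
                                     prev + (ca != cb))  # substitute
    return dp[len(b)]

def top_k_similar(target_desc: str, history: dict, k: int, threshold: int) -> list:
    """Screen out the top K historical tasks whose description is within the
    edit-distance threshold of the target description (the first condition)."""
    scored = sorted((edit_distance(target_desc, desc), task_id)
                    for task_id, desc in history.items())
    return [task_id for dist, task_id in scored[:k] if dist < threshold]
```

The cosine-similarity and multi-layer-perceptron variants would plug into `top_k_similar` the same way, only with a different scoring function.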
In some embodiments, the atomic capability library further comprises: third description information of each AI capability;
the querying of the atomic capability library according to the first description information of the target task includes at least one of the following:
calculating the minimum text edit distance between the first description information of the target task and the third description information of each AI capability in the atomic capability library;
embedding the first description information and the third description information using a text pre-training model, and calculating the cosine similarity distance between the first description information and the third description information;
determining a first text vector corresponding to the first description information and a third text vector corresponding to the third description information, and identifying the first text vector and the third text vector using a multi-layer perceptron to obtain an identification result.
Accordingly, satisfying the second condition includes at least one of:
the minimum text edit distance is smaller than a third threshold;
the cosine similarity distance is smaller than a fourth threshold;
the identification result indicates that the first description information and the third description information belong to the same category.
Here, the first description information of the target task includes a description of at least one of the following for the target task: input features, output features, application scenarios, resource requirements, inference duration, etc.
Similarly, the second description information of a historical task and the third description information of an AI capability may each include a description of at least one of the following: input features, output features, application scenarios, resource requirements, inference duration, etc. of the task.
The first threshold, the second threshold, the third threshold, the fourth threshold, and the K value may be set or adjusted based on actual requirements, and the values are not limited herein. The text pre-training model and the multi-layer perceptron can be obtained by pre-training, and are not described in detail herein.
In some embodiments, the establishing the super network corresponding to the target task according to the target AI capability set includes one of the following:
merging the AI capability combinations corresponding to multiple target historical tasks, when there are multiple target historical tasks, to obtain the super network corresponding to the target task;
connecting the AI capabilities meeting the second condition to obtain the super network corresponding to the target task.
Specifically, based on the first description information of the target task, historical tasks that are the same as or similar to the target task (i.e., target historical tasks) can be retrieved from the task instance library, and the super network to be searched (also referred to as a search space or super-network search space) is constructed using the AI capability combinations of the target historical tasks.
If multiple target historical tasks are retrieved, the AI capability combinations of the multiple target historical tasks (which can be understood as the optimal AI capability combinations determined by searching for the corresponding target historical tasks) are merged based on information such as the I/O interfaces to form the super network to be searched (each optimal AI capability combination path can be regarded as a graph, and the graphs corresponding to the multiple target historical tasks are merged to form the super network to be searched). Here, an AI capability combination refers to AI capabilities connected in a certain order; that is, the AI capability combination reflects not only the AI capabilities included but also the connection order of the AI capabilities, i.e., the path of the AI capability combination.
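The merging of several optimal paths into one super network can be sketched as a graph union (a minimal sketch; the capability names are illustrative, and I/O-interface compatibility is assumed to already hold between adjacent capabilities in each path):

```python
from collections import defaultdict

def merge_paths(paths):
    """Merge the optimal AI-capability paths of several target historical tasks
    into one super network, represented as an adjacency map: shared capabilities
    become a single node and the edge sets of all paths are unioned."""
    supernet = defaultdict(set)
    for path in paths:
        for src, dst in zip(path, path[1:]):
            supernet[src].add(dst)
    return dict(supernet)
```

For example, merging the paths `VAD -> ASR -> NLU` and `VAD -> SLU` yields a super network where `VAD` branches to both `ASR` and `SLU`.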
To ensure the accuracy of super-network construction, a similarity threshold may be set. If the similarity between the target task and every historical task in the task instance library is lower than the similarity threshold, it is considered that no target historical task exists for the target task; a correlation calculation is then performed between the target task and all AI capabilities in the atomic capability library, the AI capabilities related to the target task are screened out, and the super network to be searched is constructed by connecting the screened AI capabilities based on information such as their I/O interfaces.
In some embodiments, the determining, according to a path search performed on the super network, of the target AI capability combination corresponding to the target task includes:
verifying the model combination of each path in the super network by using a preset verification set to obtain a verification result of the model combination of each path;
determining a target AI capability combination meeting a search target according to the verification result of the model combination of each path;
wherein the search objective is determined based on at least one of performance loss of the validation set, model size, computation consumption, and inference duration.
Specifically, after the super network is built, a search for an optimal path may be performed to obtain an optimal AI capability combination path, i.e., a target AI capability combination.
The search target may include: learning the capability connection weights to minimize the validation set performance loss. Based on a small amount of test data given by the task, which serves as the validation set, the optimal network structure is searched: the validation set data is input into each path of the super network (each path comprises sequentially connected AI capabilities, each AI capability comprises a model, i.e., each path is a model combination connected in a certain order); forward inference is performed on each path; the loss between the final output and the validation set labels is calculated; the weight gradient of each edge is computed based on the loss; the weight of each edge is updated based on the gradient; and finally the path with the largest weight is extracted to obtain the optimal AI capability of each step. These optimal AI capabilities are connected to form the optimal AI capability combination, i.e., the target AI capability combination.
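The weight-learning step can be sketched in miniature: keep one connection weight per candidate path, minimize the softmax-weighted expected validation loss by gradient descent on the weights, and extract the path with the largest weight (a toy sketch under the assumption that each path's validation loss has already been computed; the learning rate and step count are illustrative):

```python
import math

def select_path(path_losses, lr=0.5, steps=200):
    """Learn one connection weight per candidate path by gradient descent on
    the softmax-weighted expected loss, then return the index of the path
    with the largest weight (the one with the smallest validation loss)."""
    w = [0.0] * len(path_losses)
    for _ in range(steps):
        exp_w = [math.exp(x) for x in w]
        z = sum(exp_w)
        p = [e / z for e in exp_w]                      # softmax over paths
        expected = sum(pi * li for pi, li in zip(p, path_losses))
        grad = [pi * (li - expected) for pi, li in zip(p, path_losses)]
        w = [wi - lr * gi for wi, gi in zip(w, grad)]   # gradient step
    return max(range(len(w)), key=w.__getitem__)
```

In the full method the gradient flows through the forward inference of each path rather than through precomputed losses, but the extraction of the largest-weight path is the same.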
The indices for evaluating whether a path is optimal include the validation set performance loss; the search target may further include indices such as minimum model scale, minimum computing power consumption, and minimum inference duration. If there are other requirements, for example pursuing performance without constraining the inference duration, the loss terms may be traded off or assigned different weights, as shown in equation (1) below.
L = a1 × L_performance + a2 × L_scale + a3 × L_compute + a4 × L_inference (1)
where a1 represents the weight of the validation set performance loss; L_performance represents the validation set performance loss; a2 represents the weight of the model scale; L_scale represents the model scale; a3 represents the weight of the computing power consumption; L_compute represents the computing power consumption; a4 represents the weight of the inference duration; and L_inference represents the inference duration.
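Equation (1) can be written directly as a small function (a sketch; the argument names mirror the four loss terms, and the default weight values are illustrative only, not values specified by the method):

```python
def search_loss(l_performance, l_scale, l_compute, l_inference,
                a1=1.0, a2=0.1, a3=0.1, a4=0.1):
    """Weighted multi-constraint search objective of equation (1).
    Setting a weight to 0 drops that constraint, e.g. a4 = 0 when the
    inference duration does not matter for the current task."""
    return (a1 * l_performance + a2 * l_scale
            + a3 * l_compute + a4 * l_inference)
```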
In some embodiments, the determining, according to a path search performed on the super network, of the target AI capability combination corresponding to the target task includes:
performing a path search on the super network according to the transition probabilities of the AI capabilities, and determining the target AI capability combination.
In some embodiments, the method further comprises:
determining the transition probabilities of the AI capabilities according to the AI capability combinations corresponding to the historical tasks in the task instance library.
The determining of the transition probabilities of the AI capabilities according to the AI capability combinations corresponding to the historical tasks in the task instance library includes the following steps:
determining the number of times a first AI capability transitions to a second AI capability and the number of times the first AI capability occurs, according to the AI capability combinations corresponding to the historical tasks in the task instance library, where the second AI capability is an AI capability other than the first AI capability in the task instance library;
determining the probability of the first AI capability transitioning to the second AI capability based on the number of times the first AI capability transitions to the second AI capability and the number of times the first AI capability occurs.
Specifically, when the task instance library has accumulated to a certain scale, the transition probability of each AI capability, i.e., the probability of the first AI capability transitioning to the second AI capability, may be calculated using the existing paths. Hereinafter the first AI capability is denoted capability A and the second AI capability is denoted capability B.
The probability of capability A transitioning to capability B is given by the following formula (2):
P(B|A) = (number of times capability A transitions to capability B) / (number of times capability A occurs) (2)
Here, capability A transitioning to capability B means that, in the AI capability combination used to complete a certain task, capability B is executed immediately after capability A.
By using the transition probabilities of the AI capabilities, the path search for the optimal AI capability combination can dispense with the stage of learning from a validation set. When no validation set exists, the optimal path cannot be learned, but the path of the optimal AI capability combination can be decoded directly from the AI capability transition probabilities: for example, a greedy search can be applied, selecting the capability with the largest transition probability at each stage; the candidates at each stage can also be expanded by means such as beam search, making the search result more accurate.
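Formula (2) and the greedy decoding described here can be sketched as follows (a minimal sketch; as a simplifying assumption, an occurrence of capability A is counted only when A has a successor in the combination, so that the probabilities over its successors sum to one):

```python
from collections import Counter

def transition_probs(combos):
    """Estimate P(B|A) = count(A -> B) / count(A) from the AI-capability
    combinations stored in the task instance library (formula (2))."""
    pair_counts, cap_counts = Counter(), Counter()
    for combo in combos:
        cap_counts.update(combo[:-1])            # count A only when it has a successor
        pair_counts.update(zip(combo, combo[1:]))
    return {(a, b): n / cap_counts[a] for (a, b), n in pair_counts.items()}

def greedy_decode(start, probs, steps):
    """Greedy search: at each stage pick the successor capability with the
    largest transition probability (beam search would keep several candidates)."""
    path = [start]
    for _ in range(steps):
        successors = {b: p for (a, b), p in probs.items() if a == path[-1]}
        if not successors:
            break
        path.append(max(successors, key=successors.get))
    return path
```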
For example, the target task may be an end-to-end speech recognition task that takes a segment of conversational speech as input; the task goal is to predict the intent type and slot information, and the output includes the intent type, slot type, and slot value.
First, the task instance library and the atomic capability library are queried according to the description information of the end-to-end speech recognition task, and a target AI capability set is determined, including the following AI capabilities: endpoint detection (VAD), short speech recognition (ASR), natural language understanding (NLU), long speech recognition (LASR), and speaker segmentation (SD).
Then, a super network is established based on the target AI capability set, the super network including paths of at least one of the following AI capability combinations:
1) Long speech -> endpoint detection (VAD) -> short speech recognition (ASR) -> natural language understanding (NLU) -> semantic information;
2) Long speech -> long speech recognition (LASR) -> natural language understanding (NLU) -> semantic information;
3) Long speech -> endpoint detection (VAD) -> end-to-end spoken language understanding (SLU) -> semantic information, i.e., the intent and slot information is obtained directly from the speech signal;
4) Long speech -> speaker segmentation (SD) -> short speech recognition (ASR) -> natural language understanding (NLU) -> semantic information;
5) Long speech -> speaker segmentation (SD) -> end-to-end spoken language understanding (SLU) -> semantic information;
6) Long speech -> end-to-end spoken language understanding (SLU) -> semantic information.
Wherein each AI capability may include multiple models, i.e., each path has multiple branch paths.
Finally, a path search is performed on the super network established above, and the optimal AI capability combination corresponding to the end-to-end speech recognition task is determined; it may be any one of the above, for example:
Long speech -> endpoint detection (VAD) -> end-to-end spoken language understanding (SLU) -> semantic information.
The optimal branch path for each AI capability in this path may be: VAD2 -> SLU3, i.e., endpoint detection (the second of three endpoint detection capabilities) -> end-to-end spoken language understanding (the third of five end-to-end spoken language understanding capabilities). The VAD2 model and the SLU3 model form the optimal target AI capability combination (or target model combination) corresponding to the end-to-end speech recognition task.
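The six candidate paths in this example can be enumerated mechanically from the super network's adjacency structure (a sketch; the node names abbreviate the capabilities listed above, and branch models such as VAD2 are omitted):

```python
def enumerate_paths(graph, src, dst):
    """Depth-first enumeration of every capability-combination path."""
    if src == dst:
        return [[dst]]
    return [[src] + rest
            for nxt in graph.get(src, [])
            for rest in enumerate_paths(graph, nxt, dst)]

supernet = {
    "long speech": ["VAD", "LASR", "SD", "SLU"],
    "VAD": ["ASR", "SLU"],
    "SD": ["ASR", "SLU"],
    "LASR": ["NLU"],
    "ASR": ["NLU"],
    "NLU": ["semantic info"],
    "SLU": ["semantic info"],
}
paths = enumerate_paths(supernet, "long speech", "semantic info")
print(len(paths))  # 6
```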
In some embodiments, the determining, according to a path search performed on the super network, of the target AI capability combination corresponding to the target task further includes:
updating the model parameters of the AI capability combination of each path in the super network using a preset training set to obtain an update result for the AI capability combination of each path;
determining the target AI capability combination meeting the search target according to the update result of the AI capability combination of each path;
wherein the search target further includes: updating the model parameters of the AI capabilities to minimize the training set loss.
Here, whether to update the model parameters of the AI capabilities may be determined based on the amount of available data; for example, when the available data volume is small, the model parameters may be left unchanged, and when the available data volume is large, the model parameters may be updated to obtain a target AI capability combination better suited to the target task.
In some embodiments, the method further comprises:
updating the task instance library according to the target task and the target AI capability combination corresponding to the target task.
In this way, the task instance library is continuously enriched, improving the subsequent efficiency of configuring AI capability combinations for different tasks.
The model determination method provided by the embodiment of the present invention can automatically search for the optimal AI capability combination for different tasks, so that non-AI researchers can rapidly configure and combine various AI capabilities for their tasks, find the AI capability combination with optimal performance for the current task on a complex AI capability platform, and optimally evaluate computing power, network, time delay, and the like.
Fig. 2 is a framework diagram of a multi-task, progressive, multi-constraint artificial intelligence network architecture search (AINAS, Artificial Intelligence Neural Architecture Search) method provided by an embodiment of the present invention. As shown in fig. 2, each AI capability of the AI capability platform is standardized and managed in advance, and an atomic capability library containing multiple AI capabilities is constructed, where the description of each AI capability includes information such as function description, interface specification, and resource consumption. A super network (also called a search space or super-network search space) formed by combining AI capabilities and sharable by multiple different tasks is constructed and stored in the task instance library.
When the method is applied, a progressive search mode is adopted: first, a coarse search determines the AI capability types required by the current task (such as speech recognition or image recognition) and the super network (comprising one or more groups of AI capability combinations connected in different orders); then a search for the optimal model combination is performed on the retrieved super network. Specifically, a task description is produced for the task to be processed; heuristic super-network construction is performed against the task instance library and the atomic capability library according to the task description, i.e., the super network adopted by a historical task similar or identical to the task is retrieved, or an AI capability combination is determined and a super network (i.e., a search space) is established; the established super network is then searched for the optimal path according to the search strategy, and the path of the optimal AI capability combination for processing the task is determined, yielding the optimal model combination for processing the task. This reduces the complexity of establishing the search space and improves the search speed and the accuracy of the optimal model combination.
The search process involves multiple constraint conditions (i.e., the performance evaluation strategy includes multiple constraints): in addition to model performance indices such as accuracy and error rate, it is considered whether the consumption of each AI capability in terms of computing resources, time delay, and the like meets the demands of the current task. This improves the performance of the model combination and yields the optimal model combination.
It should be noted that, after the optimal AI capability combination is searched out for each new task, the task and its optimal AI capability combination may be saved as a task instance in the task instance library, for subsequent reference by other similar tasks.
Fig. 3 is a flow chart of a network architecture determining method according to an embodiment of the present invention; as shown in fig. 3, the method includes:
step 301, constructing an atomic capability library;
the atomic capability library includes: at least one AI capability and the description information of each AI capability, where the description information includes: function description, interface specification, resource consumption, etc.
Here, each AI capability of the AI capability platform is standardized and managed in advance, and the AI capabilities are represented, including but not limited to representing each AI capability as a graph node or edge.
Taking edges as an example, each AI capability is represented as one edge in a graph, and the two endpoints of the edge represent its I/O interfaces respectively; connecting the AI capabilities forms a super network of sharable AI capabilities oriented to multiple different tasks. This is a first constraint on the super network: capabilities must be matched according to the type of their I/O interfaces; without this constraint, any two AI capabilities could be connected. As shown in fig. 4, fig. 4 is a schematic diagram of an AI capability represented by an edge, including but not limited to the following information:
1) Model parameters Aij.
The atomic capability library may include multiple AI capabilities; each AI capability may be implemented by one or more models. Here, i denotes a capability ID and j denotes a model ID. Forward inference on the input features based on the model parameters yields the model output.
2) I/O interfaces respectively represent the type of an input interface (I) and the type of an output interface (O) of AI capability.
The type of the input interface includes, but is not limited to, at least one of: a) signal type (e.g., voice, text, image, etc.); b) a feature type refined from the signal type, for example, the optional feature types of a speech signal include, but are not limited to, F-bank, MFCC, the original speech signal, etc.; c) feature length: the feature dimension input to the model and the window length for truncating features (e.g., frame count and frame shift).
The type of output interface includes, but is not limited to, at least one of: a) Outputting meaning (e.g., transcribed text, synthesized speech, translated text, image semantic understanding, text semantic understanding, etc.); b) Output type (e.g., discrete, continuous, indicating whether gradient conduction is supported, etc.); c) Output length, etc.
3) Computation cost C, including but not limited to resource consumption, model scale, inference duration, etc.
By connecting these edges, a super network of sharable AI capabilities is obtained, where each path represents a combination of AI capabilities traversed from an input to an output, as shown in fig. 5.
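Building the super network by connecting capability edges whose interface types match can be sketched as follows (a minimal sketch; the capability names and interface types are illustrative, and only the signal-type part of the interface is matched):

```python
from collections import defaultdict

def build_supernet(capabilities):
    """Connect capability edges whose interface types match: capability B may
    follow capability A only if A's output type equals B's input type (the
    first constraint on the super network). Each capability maps to an
    (input_type, output_type) pair."""
    edges = defaultdict(set)
    for a, (_, out_a) in capabilities.items():
        for b, (in_b, _) in capabilities.items():
            if a != b and out_a == in_b:
                edges[a].add(b)
    return dict(edges)
```

A fuller version would also match feature type and feature length, and attach the computation cost C to each edge for later path evaluation.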
Step 302, constructing a task instance library;
Here, a completed historical task and the optimal AI capability combination corresponding to that historical task serve together as a task instance, and the task instance library is constructed from such task instances.
The task instance library comprises: at least one historical task, descriptive information of each historical task, and AI capability combination corresponding to each historical task.
Similar task searches, calculation of transition probabilities of AI capabilities, etc. can be performed according to each task instance in the task instance library.
Step 303, determining the description information of the target task, and determining the AI capability combination corresponding to the target task according to the target task.
Here, the description information of the target task, the description information of an AI capability, and the description information of a historical task may include, but are not limited to, at least one of:
a description of the input features and output features of the task or AI capability;
a description of the application scenario;
a description of the resource requirements;
a description of the inference duration.
The step 303 may include:
performing similar-task retrieval according to the description information of the target task and the task instance library, and determining whether a historical task that is the same as or similar to the target task exists (denoted as a target historical task);
when the target historical task exists in the task instance library, determining the AI capability combination corresponding to the target task according to the AI capability combination corresponding to the target historical task;
when no target historical task exists in the task instance library, querying the atomic capability library according to the description information of the target task, and determining the AI capability combination corresponding to the target task.
Step 304, establishing a super network according to the AI capability combination corresponding to the target task.
Specifically, if a historical task that is the same as or similar to the current target task exists in the task instance library, the AI capability combination corresponding to that target historical task may be used directly as the optimal AI capability combination of the current target task. However, considering that different models may perform differently on data from different scenarios, the task input/output descriptions alone may not determine whether the AI capability combination of an existing task instance is optimal for the current target task. The embodiment of the present invention therefore further provides a progressive search method: first, based on the description information of the target task, the same or similar target historical tasks are retrieved from the task instance library, and the super network to be searched is constructed using the AI capability combinations of those target historical tasks; if multiple target historical tasks are retrieved, the retrieved optimal AI capability combination paths of the different target historical tasks are merged based on information such as the I/O interfaces to form the super network.
To ensure the accuracy of super-network construction, a similarity threshold may be set. If the similarity between the target task and every historical task in the task instance library is lower than the similarity threshold, the target task is considered to have no identical or similar historical task; a correlation calculation is then performed between the description information of the target task and all AI capabilities in the atomic capability library, the relevant AI capabilities are screened out, and the super network to be searched is constructed by connecting the AI capabilities based on information such as their I/O interfaces.
The retrieval of same or similar tasks and of related AI capabilities may be accomplished in any of the following ways: calculating the minimum text edit distance; embedding the text descriptions using a text pre-training model and calculating distances such as cosine similarity; or inputting the text vectors into a multi-layer perceptron for training (once the task instance library has accumulated to a certain scale). This is described in detail in the method shown in fig. 1 and is not repeated here.
Step 305, searching the optimal AI capability combination of the target task according to the super network.
Here, after the super network is constructed, an automatic search may be performed on it to determine the optimal AI capability combination of the target task. Whether a path is optimal is evaluated by the performance of its final output on the validation set: the input features undergo forward inference through the model parameters of each capability on the path, producing the final output, on which the loss against the validation set labels is calculated. On top of this model performance evaluation, evaluation items such as the resource consumption, model scale, and inference duration of each capability are added, so that each AI capability is evaluated comprehensively for different tasks; besides performance indices such as model accuracy, computation cost is also a key index for deciding whether a capability suits the current task. Calculating this cost requires that the information be structured when the AI capability is standardized; for example, the inference duration of a whole path is obtained by summing the inference durations of the capabilities on that path.
In an example, the step 305 may include: searching for the optimal network structure based on a small amount of test data given by the current task: extracting features from the validation set data according to the task, inputting the features into each path of the super network, performing forward inference on each path, calculating the loss between the final output and the validation set labels, computing the weight gradient of each edge based on the loss, updating the weight of each edge based on the gradient, and finally extracting the path with the largest weight to obtain the optimal AI capability of each step; these AI capabilities are connected to form the optimal AI capability combination.
The indices for evaluating whether a path is optimal include, besides the validation set performance loss, indices such as the model scale, computing power consumption, and inference duration. If there are particular requirements, for example pursuing performance more while not constraining the inference duration, the loss terms may be traded off or assigned different weights, as shown in equation (1).
L = a1 × L_performance + a2 × L_scale + a3 × L_compute + a4 × L_inference (1)
where a1 represents the weight of the validation set performance loss; L_performance represents the validation set performance loss; a2 represents the weight of the model scale; L_scale represents the model scale; a3 represents the weight of the computing power consumption; L_compute represents the computing power consumption; a4 represents the weight of the inference duration; and L_inference represents the inference duration.
In another example, the step 305 may include: performing a path search on the super network using the AI capability transition probabilities to determine the optimal AI capability combination.
Here, with the AI capability transition probabilities, the search for the optimal AI capability combination path can dispense with the stage of learning from a validation set. When no validation set exists, the optimal path cannot be learned, but it can be decoded directly from the AI capability transition probabilities, for example by applying a greedy search that selects the capability with the largest transition probability at each stage; the search result can also be made more accurate by expanding the candidates at each stage using beam search or the like.
The transition probabilities of the AI capabilities can be calculated using the existing paths once the task instance library has accumulated to a certain scale. Specifically, the probability of transitioning from capability A to capability B can be calculated by the following formula (2):
P(B|A) = (number of times capability A transitions to capability B) / (number of times capability A occurs) (2)
where P(B|A) represents the probability of transitioning from capability A to capability B.
FIG. 6 is a schematic diagram of a network architecture determining method for a speech recognition task according to an embodiment of the present invention; as shown in fig. 6, the method is applied to end-to-end speech recognition and includes:
Step 601, determining the target task, where the target task is an end-to-end speech recognition task;
the end-to-end speech recognition task takes a segment of dialogue speech as input; its goal is to predict the intent type and slot information, and its output contains the intent type, slot type, and slot value.
Assume a capability platform with 500 AI capabilities. With no constraints, a network with only one layer already has 500 × 499 possible connections; each capability may also have different interfaces (e.g., 2 input interfaces and 4 output interfaces) and tunable hyper-parameters (assume 2), so a single capability may have 2 × 4 × 2 = 16 candidates, and the entire one-layer search space may be 500 × 499 × 16. A network may call more than one layer, i.e., more than two capabilities, making the search space even larger. It is therefore necessary to reduce the search space. In addition, the learning process may not have a large amount of training data available; the capability parameters are assumed to have been learned, and only the structural connection weights between capabilities are learned, so only a small amount of data is needed as a validation set.
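The arithmetic above can be checked directly (a sketch reproducing the numbers of the example):

```python
# Unconstrained one-layer search space from the example above.
capabilities = 500
connections = capabilities * (capabilities - 1)  # 500 x 499 ordered pairs
candidates_per_capability = 2 * 4 * 2            # inputs x outputs x hyper-parameters
search_space = connections * candidates_per_capability
print(search_space)  # 3992000
```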
Step 602: query a preset database according to the end-to-end speech recognition task, and determine the target AI capability set required for realizing the end-to-end speech recognition task.
Specifically, step 602 may include: querying the task instance library according to the description information of the end-to-end speech recognition task, and determining whether a target historical task meeting a first condition exists in the task instance library (that is, a historical task whose similarity to the end-to-end speech recognition task is higher than a similarity threshold);

if target historical tasks exist, selecting the top-K target historical tasks, and combining the AI capabilities corresponding to these top-K target historical tasks to serve as the target AI capability set required for realizing the end-to-end speech recognition task;

if no target historical task exists, querying the atomic capability library according to the description information of the end-to-end speech recognition task, and determining the related AI capabilities.
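A minimal sketch of the step 602 lookup, assuming task descriptions are plain strings; difflib's similarity ratio below stands in for whichever measure the embodiment actually uses (edit distance, cosine distance, or a classifier), and all names and thresholds are illustrative:

```python
from difflib import SequenceMatcher

def query_task_instance_library(target_desc, library, threshold=0.6, top_k=3):
    """Return the top-K historical tasks whose description similarity to the
    target task exceeds the threshold (the 'first condition'), or [] if none."""
    scored = []
    for task in library:  # each task: {"desc": str, "capabilities": [str, ...]}
        sim = SequenceMatcher(None, target_desc, task["desc"]).ratio()
        if sim > threshold:
            scored.append((sim, task))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [task for _, task in scored[:top_k]]

library = [
    {"desc": "end-to-end speech recognition with intent and slots",
     "capabilities": ["VAD", "SLU"]},
    {"desc": "image classification", "capabilities": ["CNN"]},
]
hits = query_task_instance_library(
    "end-to-end speech recognition predicting intent and slot values", library)
# 'hits' holds the matching historical tasks; their capability combinations are
# merged to form the target AI capability set. If 'hits' is empty, the atomic
# capability library is queried instead.
```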
Step 603: establish a super network corresponding to the end-to-end speech recognition task according to the target AI capability set.
Here, suppose 10 target historical tasks are selected via step 602, involving 50 related AI capabilities. These AI capabilities are grouped and combined according to their input/output interfaces: capabilities with the same input/output interfaces are placed at the same layer, and as shown in FIG. 4, the different edges of the same layer serve as the different candidate AI capabilities of that step.
For example, of the 50 AI capabilities, 16 may all be speech recognition capabilities that differ only in target domain, model scale, inference time of the algorithms used, or the like; these 16 are then arranged as parallel edges at that layer.
As shown in FIG. 7, the paths from the task input, i.e., long speech, to the output, i.e., intention and slot information, may contain several types of candidates:

1) Long speech -> endpoint detection (VAD) -> short speech recognition (ASR) -> natural language understanding (NLU) -> semantic information;

2) Long speech -> long speech recognition (LASR) -> natural language understanding (NLU) -> semantic information;

3) Long speech -> endpoint detection (VAD) -> end-to-end spoken language understanding (SLU) -> semantic information, i.e., intention and slot information is obtained directly from the speech signal;

4) Long speech -> speaker segmentation (SD) -> short speech recognition (ASR) -> natural language understanding (NLU) -> semantic information;

5) Long speech -> speaker segmentation (SD) -> end-to-end spoken language understanding (SLU) -> semantic information;

6) Long speech -> end-to-end spoken language understanding (SLU) -> semantic information.
These candidate paths, combined by the same input/output interfaces, form super paths, and each link of each super path may involve multiple AI capabilities, thereby forming different branch paths. For example, endpoint detection may provide 3 tunable hyper-parameter settings, corresponding to three AI capabilities; speaker segmentation may comprise different speaker segmentation AI capabilities learned with different models; long speech recognition may have capabilities implemented with different models; and short speech recognition may be subdivided into capabilities for different scenes and different performance levels. Language understanding capabilities are likewise divided into text-based natural language understanding and speech-based end-to-end spoken language understanding, each corresponding to different AI capabilities. Finally, the size of the super network is computed as follows: within each super path, the branch counts of all links are multiplied to give the total number of candidate branch paths on that path, and the candidate branch path counts of the 6 super paths are added to obtain the final number of search paths of the super network.
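The size computation described above, multiplying the branch counts of the links within each super path and then summing over the super paths, can be sketched as follows; the branch counts are illustrative (e.g., 3 endpoint-detection variants and 5 SLU variants, as in the text):

```python
from math import prod

def super_network_size(super_paths):
    """Total candidate branch paths: for each super path, multiply the number
    of candidate AI capabilities at each link; then sum over all super paths."""
    return sum(prod(branches.values()) for branches in super_paths)

# Branch counts per link for the six super paths of FIG. 7 (illustrative values).
paths = [
    {"VAD": 3, "ASR": 4, "NLU": 2},  # 1) VAD -> ASR -> NLU : 24
    {"LASR": 2, "NLU": 2},           # 2) LASR -> NLU       : 4
    {"VAD": 3, "SLU": 5},            # 3) VAD -> SLU        : 15
    {"SD": 2, "ASR": 4, "NLU": 2},   # 4) SD -> ASR -> NLU  : 16
    {"SD": 2, "SLU": 5},             # 5) SD -> SLU         : 10
    {"SLU": 5},                      # 6) SLU               : 5
]
print(super_network_size(paths))  # 24 + 4 + 15 + 16 + 10 + 5 = 70
```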
Step 604: perform a path search according to the super network, and determine the target AI capability combination corresponding to the end-to-end speech recognition task.
The related AI capabilities and their combination order are determined via step 603; a further search is then performed using a network structure automatic search algorithm to determine the optimal AI capability combination. One method searches for the optimal structure of the network based on a small amount of test data used as a verification set, with the optimization objective of minimizing formula (1) above; the optimal path is derived by maximizing the searched weights. The other method does not depend on a verification set: an AI capability map is constructed based on the transition probabilities of the AI capabilities learned from the task instance library, and the optimal path is searched as the path with the maximum probability that also meets other indexes such as computation cost and performance. Applying a greedy search, the capability with the maximum transition probability is selected at each stage; the search results can also be made more accurate by adding candidates at each stage using beam search or the like.
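A greedy decode over the transition probabilities, as described for the verification-set-free method, might look like the following sketch. The probability table and capability names are illustrative, and a real implementation would also check computation cost and other performance indexes before accepting each step:

```python
def greedy_decode(start, end, trans_probs):
    """Follow the highest-probability outgoing transition at each stage,
    from the task input node until the output node is reached."""
    path, node = [start], start
    while node != end:
        # Candidate next capabilities and their transition probabilities.
        candidates = {b: p for (a, b), p in trans_probs.items() if a == node}
        if not candidates:
            raise ValueError(f"no outgoing transition from {node}")
        node = max(candidates, key=candidates.get)
        path.append(node)
    return path

# Illustrative AI capability map for the long-speech -> semantic-information task.
trans_probs = {
    ("input", "VAD"): 0.6, ("input", "SD"): 0.3, ("input", "SLU"): 0.1,
    ("VAD", "SLU"): 0.7, ("VAD", "ASR"): 0.3,
    ("SD", "ASR"): 1.0,
    ("ASR", "NLU"): 1.0,
    ("SLU", "output"): 1.0, ("NLU", "output"): 1.0,
}
print(greedy_decode("input", "output", trans_probs))
# ['input', 'VAD', 'SLU', 'output']
```

Beam search would instead keep the top-N partial paths at each stage rather than only the single best one, trading extra computation for a more accurate result.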
The optimal AI capability combination finally retrieved may be: long speech -> endpoint detection -> end-to-end spoken language understanding -> semantic information (VAD -> SLU), and the optimal branch path for each link in the super path may be: VAD2 -> SLU3, i.e., endpoint detection (the second of the three endpoint detection capabilities) -> end-to-end spoken language understanding (the third of the five end-to-end spoken language understanding capabilities); this is the optimal model combination for processing the end-to-end speech recognition task.
Fig. 8 is a schematic structural diagram of a network architecture determining apparatus according to an embodiment of the present invention; as shown in fig. 8, the apparatus includes:
the first processing module is configured to query a preset database according to a target task and determine a target AI capability set required for realizing the target task;

the second processing module is configured to establish a super network corresponding to the target task according to the target AI capability set, where the super network includes AI capability combinations of at least one path;

the third processing module is configured to perform a path search according to the super network and determine a target AI capability combination corresponding to the target task.
It should be noted that the network architecture determining apparatus provided in the above embodiment is illustrated only with the above division of program modules when implementing the corresponding network architecture determining method; in practical application, the above processing may be allocated to different program modules as needed, that is, the internal structure of the server may be divided into different program modules to complete all or part of the processing described above. In addition, the apparatus provided in the foregoing embodiment and the corresponding method embodiments belong to the same concept; the specific implementation processes are detailed in the method embodiments and are not repeated here.
Fig. 9 is a schematic structural diagram of a network architecture determining apparatus according to an embodiment of the present invention, as shown in fig. 9, where the network architecture determining apparatus 90 includes: a processor 901 and a memory 902 for storing a computer program capable of running on the processor;
the processor 901 is configured to execute, when running the computer program: querying a preset database according to a target task and determining a target AI capability set required for realizing the target task; establishing a super network corresponding to the target task according to the target AI capability set, where the super network includes AI capability combinations of at least one path; and performing a path search according to the super network and determining a target AI capability combination corresponding to the target task. Specifically, the network architecture determining apparatus may also execute the method shown in FIG. 1 and belongs to the same concept as the network architecture determining method embodiment shown in FIG. 1; its detailed implementation process is described in the method embodiment and is not repeated here.
In practical application, the network architecture determining apparatus 90 may further include: at least one network interface 903. The various components in the network architecture determining apparatus 90 are coupled together by a bus system 904. It is appreciated that the bus system 904 is used to implement connection and communication between these components. In addition to a data bus, the bus system 904 includes a power bus, a control bus, and a status signal bus. However, for clarity of illustration, the various buses are labeled as the bus system 904 in FIG. 9. The number of processors 901 may be at least one. The network interface 903 is used for wired or wireless communication between the network architecture determining apparatus 90 and other devices.
The memory 902 in an embodiment of the present invention is used to store various types of data to support the operation of the network architecture determination apparatus 90.
The method disclosed in the above embodiments of the present invention may be applied to, or implemented by, the processor 901. The processor 901 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 901 or by instructions in the form of software. The processor 901 may be a general-purpose processor, a Digital Signal Processor (DSP), another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The processor 901 may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments of the present invention may be directly embodied as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium; the storage medium is located in the memory 902, and the processor 901 reads the information in the memory 902 and completes the steps of the above method in combination with its hardware.
In an exemplary embodiment, the network architecture determining apparatus 90 may be implemented by one or more Application Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field-Programmable Gate Arrays (FPGAs), general-purpose processors, controllers, Micro Controller Units (MCUs), microprocessors, or other electronic components for performing the aforementioned method.
The embodiment of the invention also provides a computer readable storage medium, on which a computer program is stored;
the computer program, when executed by a processor, performs: querying a preset database according to a target task and determining a target AI capability set required for realizing the target task; establishing a super network corresponding to the target task according to the target AI capability set, where the super network includes AI capability combinations of at least one path; and performing a path search according to the super network and determining a target AI capability combination corresponding to the target task. Specifically, the computer program may also execute the method shown in FIG. 1 and belongs to the same concept as the network architecture determining method embodiment shown in FIG. 1; its detailed implementation process is described in the method embodiment and is not repeated here.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other divisions in actual implementation, for example: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or units may be electrical, mechanical, or in other forms.
The units described as separate units may or may not be physically separate, and units displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may be separately used as one unit, or two or more units may be integrated in one unit; the integrated units may be implemented in hardware or in hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps for implementing the above method embodiments may be completed by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Alternatively, the above-described integrated unit of the present invention may be stored in a computer-readable storage medium if implemented in the form of a software functional module and sold or used as an independent product. Based on such understanding, the technical solutions of the embodiments of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disk.
It should be noted that: "first," "second," etc. are used to distinguish similar objects and not necessarily to describe a particular order or sequence.
In addition, the embodiments of the present application may be combined arbitrarily as long as no conflict arises.
The foregoing is merely a specific embodiment of the present application, and the protection scope of the present application is not limited thereto; any variation or substitution readily conceivable by a person skilled in the art within the technical scope disclosed by the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (12)

1. A method for determining network architecture, the method comprising:
querying a preset database according to a target task, and determining a target AI capability set required for realizing the target task;

establishing a super network corresponding to the target task according to the target AI capability set; the super network includes AI capability combinations of at least one path;

and performing a path search according to the super network, and determining a target AI capability combination corresponding to the target task.
2. The method of claim 1, wherein the database comprises at least one of: a task instance library, an atomic capability library; the task instance library comprises: at least one historical task, and AI capability combinations corresponding to each historical task; the atomic capability library includes: at least one AI capability;
the querying a preset database according to a target task and determining a target AI capability set required for realizing the target task comprises:

querying the task instance library according to first description information of the target task, and determining whether a target historical task meeting a first condition exists in the task instance library;

in a case where a target historical task exists in the task instance library, determining the target AI capability set according to the AI capability combination corresponding to the target historical task;

and in a case where no target historical task exists in the task instance library, querying the atomic capability library according to the first description information of the target task, determining the AI capabilities meeting a second condition in the atomic capability library, and determining the target AI capability set according to the AI capabilities meeting the second condition.
3. The method of claim 2, wherein the task instance library further comprises: second descriptive information for each historical task;
the querying the task instance library according to the first description information of the target task comprises at least one of:

calculating a minimum text edit distance between the first description information of the target task and the second description information of each historical task in the task instance library;

calculating a cosine similarity distance between the first description information and the second description information;

determining a first text vector corresponding to the first description information and a second text vector corresponding to the second description information, and identifying the first text vector and the second text vector by using a multi-layer perceptron to obtain an identification result;

correspondingly, satisfying the first condition comprises at least one of:

the minimum text edit distance being smaller than a first threshold;

the cosine similarity distance being smaller than a second threshold;

the identification result characterizing that the first description information and the second description information belong to the same category.
4. The method of claim 2, wherein the atomic capability library further comprises: third description information of each AI capability;
the querying the atomic capability library according to the first description information of the target task comprises at least one of:

calculating a minimum text edit distance between the first description information of the target task and the third description information of each AI capability in the atomic capability library;

calculating a cosine similarity distance between the first description information and the third description information;

determining a first text vector corresponding to the first description information and a third text vector corresponding to the third description information, and identifying the first text vector and the third text vector by using a multi-layer perceptron to obtain an identification result;

correspondingly, satisfying the second condition comprises at least one of:

the minimum text edit distance being smaller than a third threshold;

the cosine similarity distance being smaller than a fourth threshold;

the identification result characterizing that the first description information and the third description information belong to the same category.
5. The method of claim 2, wherein the establishing a super network corresponding to the target task according to the target AI capability set comprises one of:

combining, in a case where there are a plurality of target historical tasks, the AI capability combinations corresponding to the plurality of target historical tasks to obtain the super network corresponding to the target task;

and performing a connection operation on the AI capabilities meeting the second condition to obtain the super network corresponding to the target task.
6. The method of claim 1, wherein the performing a path search according to the super network and determining the target AI capability combination corresponding to the target task comprises:

verifying the AI capability combination of each path in the super network by using a preset verification set to obtain a verification result of the AI capability combination of each path;

and determining, according to the verification result of the AI capability combination of each path, a target AI capability combination meeting a search target;

wherein the search target is determined based on at least one of performance loss on the verification set, model size, computation consumption, and inference duration.
7. The method of claim 1, wherein the performing a path search according to the super network and determining the target AI capability combination corresponding to the target task comprises:

performing a path search on the super network according to transition probabilities of AI capabilities, and determining the target AI capability combination.
8. The method of claim 7, wherein the method further comprises:
and determining the transition probability of an AI capability according to the AI capability combination corresponding to each historical task in the task instance library.
9. The method of claim 8, wherein the determining the transition probability of an AI capability according to the AI capability combination corresponding to each historical task in the task instance library comprises:

determining, according to the AI capability combination corresponding to each historical task in the task instance library, the number of times a first AI capability transitions to a second AI capability and the number of times the first AI capability occurs; the second AI capability being an AI capability in the task instance library other than the first AI capability;

and determining the probability of the first AI capability transitioning to the second AI capability according to the number of times the first AI capability transitions to the second AI capability and the number of times the first AI capability occurs.
10. A network architecture determination apparatus, the apparatus comprising:
the first processing module is configured to query a preset database according to a target task and determine a target AI capability set required for realizing the target task;

the second processing module is configured to establish a super network corresponding to the target task according to the target AI capability set; the super network includes AI capability combinations of at least one path;

and the third processing module is configured to perform a path search according to the super network and determine a target AI capability combination corresponding to the target task.
11. A network architecture determination apparatus comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 9 when executing the program.
12. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 9.
CN202211147941.9A 2022-09-19 2022-09-19 Network architecture determining method, device and storage medium Pending CN116992929A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211147941.9A CN116992929A (en) 2022-09-19 2022-09-19 Network architecture determining method, device and storage medium

Publications (1)

Publication Number Publication Date
CN116992929A true CN116992929A (en) 2023-11-03

Family

ID=88532723



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination