CN111753999A - Model using method and device

Info

Publication number: CN111753999A
Application number: CN202010611595.XA
Authority: CN
Inventors: 叶剑武
Original assignee: Beijing Xiaomi Pinecone Electronic Co Ltd
Current assignee: Beijing Xiaomi Pinecone Electronic Co Ltd
Priority date: 2020-06-29
Filing date: 2020-06-29
Publication date: 2020-10-09

Abstract

The present disclosure provides a model using method and apparatus. The method is applied to a terminal in which a deep learning model is built. Based on usage information of a plurality of operators in the deep learning model over a period of running time, the terminal combines the usage-associated operators among the plurality of operators, reducing the number of operators in the deep learning model and obtaining a combined first deep learning model. Because the combined model has fewer operators than the model before combination, the execution result of an operator is written to memory, and read back from memory for use by other operators, fewer times during model inference, which accelerates inference and shortens inference time.

Description

Model using method and device

Technical Field

The present disclosure relates to the field of computer communication technologies, and in particular, to a model using method and apparatus.

Background

After a deep learning model is built, it is trained, and the trained deep learning model is deployed in a terminal.

At present, the inference speed of a deep learning model is optimized by equipping the terminal with high-performance hardware resources. Existing optimization of model inference speed is therefore limited to this single approach.

Disclosure of Invention

To overcome the problems in the related art, the present disclosure provides a model using method and apparatus.

According to a first aspect of the embodiments of the present disclosure, there is provided a model using method, applied to a terminal, in which a deep learning model is built, the method including:

obtaining usage information of a plurality of operators in the deep learning model over a period of running time, wherein the plurality of operators include usage-associated operators;

after determining that the usage information of the usage-associated operators meets a preset information condition, combining the usage-associated operators according to their order of use to obtain a combined first deep learning model;

using the first deep learning model in response to a use condition of the first deep learning model being satisfied.

Optionally, the determining that the usage information of the usage-associated operators meets a preset information condition includes at least one of:

determining that the usage frequency of the usage-associated operators is greater than or equal to a preset frequency;

determining that the number of uses of the usage-associated operators is greater than or equal to a preset number.

Optionally, the usage-associated operators differ in complexity; the combining of the usage-associated operators according to their order of use, after determining that their usage information meets a preset information condition, includes:

after determining that the usage information of the usage-associated operators of different complexity meets the preset information condition, combining the usage-associated operators of different complexity according to their order of use.

Optionally, the usage-associated operators include two operators of different complexity, and the operator of higher complexity is used before the operator of lower complexity; the combining of the usage-associated operators of different complexity according to their order of use, after determining that their usage information meets the preset information condition, includes:

after determining that the usage information of the two operators of different complexity meets the preset information condition, combining the two operators in the order in which the operator of higher complexity precedes the operator of lower complexity.

Optionally, one operator of the plurality of operators is associated with another plurality of operators; the method further comprises the following steps:

in a case where the one operator and a part of the other operators have been combined, in response to usage information of the one operator and another part of the other operators satisfying the preset information condition, combining the one operator and the other part of the other operators in a target deep learning model to obtain a combined second deep learning model, wherein the target deep learning model is a model, obtained based on the deep learning model, in which the one operator and the operators used thereafter have not been combined;

using the second deep learning model in response to satisfying a use condition of the second deep learning model.

Optionally, the target deep learning model comprises the deep learning model in which no operators have been combined.

According to a second aspect of the embodiments of the present disclosure, there is provided a model using apparatus applied to a terminal, in which a deep learning model is built, the apparatus including:

an obtaining module configured to obtain usage information of a plurality of operators in the deep learning model over a period of running time, the plurality of operators including usage-associated operators;

a first combining module configured to, after determining that the usage information of the usage-associated operators meets a preset information condition, combine the usage-associated operators according to their order of use to obtain a combined first deep learning model;

a first usage module configured to use the first deep learning model in response to a use condition of the first deep learning model being satisfied.

Optionally, the apparatus further comprises:

a determining module configured to determine that the usage frequency of the usage-associated operators is greater than or equal to a preset frequency, and/or determine that the number of uses of the usage-associated operators is greater than or equal to a preset number.

Optionally, the usage-associated operators differ in complexity;

the first combining module is configured to, after determining that the usage information of the usage-associated operators of different complexity satisfies the preset information condition, combine the usage-associated operators of different complexity according to their order of use.

Optionally, the usage-associated operators include two operators of different complexity, and the operator of higher complexity is used before the operator of lower complexity;

the first combining module is configured to, after determining that the usage information of the two operators of different complexity satisfies the preset information condition, combine the two operators in the order in which the operator of higher complexity precedes the operator of lower complexity.

Optionally, one operator of the plurality of operators is associated with another plurality of operators; the device further comprises:

a second combination module configured to, in a case where the one operator and a part of the other plurality of operators have been combined, and in response to usage information of the one operator and another part of the other plurality of operators satisfying the preset information condition, combine the one operator and the other part of the other plurality of operators in a target deep learning model to obtain a combined second deep learning model, wherein the target deep learning model is a model, obtained based on the deep learning model, in which the one operator and the operators used thereafter have not been combined;

a second usage module configured to use the second deep learning model in response to a usage condition of the second deep learning model being satisfied.

According to a third aspect of embodiments of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method of any one of the above first aspects.

According to a fourth aspect of the embodiments of the present disclosure, there is provided a terminal, including: an internal bus, and a memory, a processor and an external interface connected through the internal bus; wherein,

the external interface is used for acquiring data;

the memory is used for storing machine-readable instructions corresponding to model use;

the processor is configured to read the machine-readable instructions from the memory and execute them to implement the following operations:

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

In the embodiments of the disclosure, the terminal combines the usage-associated operators among a plurality of operators in the deep learning model according to the usage information of the plurality of operators over a period of running time, reducing the number of operators in the deep learning model and obtaining the combined first deep learning model. Because the combined model has fewer operators than the model before combination, the execution result of an operator is written to memory, and read back from memory for use by other operators, fewer times during model inference, which accelerates inference and shortens inference time. Accordingly, the disclosed embodiments provide a new method for optimizing the inference speed of a model.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

FIG. 1 is a flow diagram illustrating a method for using a model in accordance with an exemplary embodiment;

FIG. 2 is a block diagram illustrating a model-using apparatus in accordance with an exemplary embodiment;

FIG. 3 is a schematic diagram illustrating a structure of a terminal according to an exemplary embodiment.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

The terminology used in the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used in this disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, such information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present disclosure. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.

Fig. 1 is a flowchart illustrating a model using method according to an exemplary embodiment, where the method illustrated in fig. 1 is applied to a terminal, and a deep learning model is built in the terminal, and the method includes:

In step 101, usage information of a plurality of operators in the deep learning model over a period of running time is obtained, wherein the plurality of operators include usage-associated operators.

The terminal can be provided with an application program, and the deep learning model is built in the application program. The terminal can adopt the method provided by the embodiment of the disclosure to combine operators in the deep learning model and optimize the deep learning model in the process of running the application program and using the deep learning model.

Or, the deep learning model may be directly built in the terminal, for example, the deep learning model is built in the system software of the terminal, and the terminal may combine operators in the deep learning model and optimize the deep learning model by using the method provided by the embodiment of the present disclosure in the process of using the deep learning model.

The usage-associated operators may include a plurality of operators having an input-output relationship, for example, where the execution result (i.e., output) of one operator serves as the input of another operator. Illustratively, if the execution result of operator A is the input of operator B, and the execution result of operator B is the input of operator C, then there is a usage association between operator A and operator B and between operator B and operator C; equivalently, there is a usage association among operators A, B, and C.
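The input-output relationship described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `inputs` mapping, the function names, and the assumption that each operator has at most one consumer are all hypothetical.

```python
# Hypothetical model description: each operator name maps to the operator
# whose output it consumes (None means it reads the model input).
inputs = {"A": None, "B": "A", "C": "B"}

def usage_associated_pairs(inputs):
    """Return (producer, consumer) pairs that have a usage association."""
    return [(src, op) for op, src in inputs.items() if src is not None]

def usage_chain(inputs, start):
    """Follow producer -> consumer links from `start` to build a chain.

    Assumes an acyclic model in which each operator has at most one consumer.
    """
    consumer_of = {src: op for op, src in inputs.items() if src is not None}
    chain = [start]
    while chain[-1] in consumer_of:
        chain.append(consumer_of[chain[-1]])
    return chain

pairs = usage_associated_pairs(inputs)   # A feeds B, B feeds C
chain = usage_chain(inputs, "A")         # the whole A, B, C chain
```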

The plurality of operators in this step may include all operators in the deep learning model, or may include some operators in the deep learning model, and the terminal may obtain use information of some operators preset in the deep learning model.

The plurality of operators include usage-associated operators, so the usage information of the plurality of operators over a period of running time may include usage information of the usage-associated operators over that period. This usage information may include at least one of: the usage frequency of the usage-associated operators and the number of uses of the usage-associated operators. The usage frequency can be understood as the frequency with which the usage-associated operators are used in sequence, and the number of uses as the number of times they are used in sequence.
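One way such usage information might be collected is sketched below. The `UsageRecorder` class, its method names, and the use of a fixed window length in seconds are assumptions made for illustration; the disclosure does not specify how the information is gathered.

```python
from collections import Counter

class UsageRecorder:
    """Counts how often usage-associated operators run one after another."""

    def __init__(self, associated_pairs):
        # Hypothetical: the (producer, consumer) pairs known to be associated.
        self.associated_pairs = set(associated_pairs)
        self.pair_counts = Counter()
        self._previous = None

    def record(self, op_name):
        # Called each time an operator executes during the running period.
        pair = (self._previous, op_name)
        if pair in self.associated_pairs:
            self.pair_counts[pair] += 1
        self._previous = op_name

    def usage_info(self, pair, window_seconds):
        # Number of uses, and uses per second over the observed window.
        count = self.pair_counts[pair]
        return {"count": count, "frequency": count / window_seconds}

recorder = UsageRecorder([("A", "B"), ("B", "C")])
for op in ["A", "B", "C", "A", "B", "C", "A", "B"]:
    recorder.record(op)
info = recorder.usage_info(("A", "B"), window_seconds=2.0)
```

Here the pair (A, B) is observed three times in a two-second window, so the sketch reports a count of 3 and a frequency of 1.5 uses per second.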

The length of a period of operation time can be set as desired, and the disclosure is not limited thereto.

In step 102, after it is determined that the usage information of the usage-associated operators meets a preset information condition, the usage-associated operators are combined according to their order of use, and a combined first deep learning model is obtained.

The terminal combines the usage-associated operators according to their order of use, so that the number of operators after combination is smaller than the number before combination.

The combined operator is used as a single independent operator. The purpose of the combination is to avoid writing the calculation result of one operator into memory, and instead to use that result directly in the next calculation.

For example, suppose a, b, and c are to be added. In one approach, a + b is calculated and the result is written into memory; the result of a + b is then read from memory and added to c, and the final result is written into memory. In another approach, a, b, and c are combined, a + b + c is calculated directly, and the result is written into memory. Comparing the two approaches shows that combining operators reduces the number of operations that write data into memory and read data from memory.
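The difference between the two approaches can be made concrete with a toy memory that counts accesses. `CountingMemory` and both functions are hypothetical names used only for this sketch; a real runtime would be dealing with tensors and device memory rather than a dictionary.

```python
class CountingMemory:
    """Toy memory that counts reads and writes of stored results."""

    def __init__(self):
        self.slots = {}
        self.reads = 0
        self.writes = 0

    def write(self, key, value):
        self.slots[key] = value
        self.writes += 1

    def read(self, key):
        self.reads += 1
        return self.slots[key]

def run_separately(a, b, c, mem):
    mem.write("a+b", a + b)           # first operator stores its result
    partial = mem.read("a+b")         # second operator loads it again
    mem.write("result", partial + c)  # second operator stores the final result
    return mem.slots["result"]

def run_combined(a, b, c, mem):
    mem.write("result", a + b + c)    # the intermediate never touches memory
    return mem.slots["result"]

separate_mem, combined_mem = CountingMemory(), CountingMemory()
separate_result = run_separately(1, 2, 3, separate_mem)
combined_result = run_combined(1, 2, 3, combined_mem)
```

Both runs produce the same sum, but the separate run performs two writes and one read while the combined run performs a single write.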

In one embodiment, operators can be combined in multiple ways. For example, a single execution of the method provided by the embodiments of the present disclosure may combine all usage-associated operators into one operator, such as combining operator A, operator B, and operator C into one operator. Alternatively, each execution of the method may combine only some of the usage-associated operators, so that the combination of all usage-associated operators is completed after multiple executions; for example, operator A and operator B are combined in one execution, and the combined operator is then combined with operator C in the next execution.
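The incremental variant can be sketched on a model represented, purely for illustration, as an ordered list of operator names. The `combine_pass` function and the `"A+B"` naming of combined operators are assumptions, not terms from the disclosure.

```python
def combine_pass(model, pair):
    """Merge one adjacent (first, second) pair of operator names in `model`."""
    first, second = pair
    merged = []
    i = 0
    while i < len(model):
        if i + 1 < len(model) and model[i] == first and model[i + 1] == second:
            merged.append(first + "+" + second)  # combined operator replaces both
            i += 2
        else:
            merged.append(model[i])
            i += 1
    return merged

model = ["A", "B", "C"]
# One execution of the method combines A and B ...
after_first = combine_pass(model, ("A", "B"))
# ... and a later execution combines the result with C.
after_second = combine_pass(after_first, ("A+B", "C"))
```

After the first pass the model holds two operators, and after the second pass a single combined operator remains.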

Illustratively, the deep learning model performs speech recognition. After a segment of audio is input into the model, an operator unit in the model is executed repeatedly. Each time the operator unit is executed, two operators in the unit that have an input-output relationship are combined, and after the operator unit has been executed multiple times, the combination of the multiple operators in the unit is complete.

In one embodiment, determining that the usage information of the usage-associated operators satisfies a preset information condition may include at least one of: determining that the usage frequency of the usage-associated operators is greater than or equal to a preset frequency; determining that the number of uses of the usage-associated operators is greater than or equal to a preset number.
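A minimal sketch of such a check follows. The function name, the dictionary keys, and the choice to require every configured threshold to pass are assumptions; the text only says that the determination includes at least one of the two comparisons.

```python
def meets_preset_condition(usage, min_frequency=None, min_count=None):
    """Check usage info against whichever preset thresholds are configured.

    `usage` is a hypothetical dict with "frequency" and "count" entries.
    At least one threshold must be configured for the check to pass.
    """
    checks = []
    if min_frequency is not None:
        checks.append(usage["frequency"] >= min_frequency)
    if min_count is not None:
        checks.append(usage["count"] >= min_count)
    return bool(checks) and all(checks)

usage = {"frequency": 1.5, "count": 3}
by_frequency = meets_preset_condition(usage, min_frequency=1.0)
by_count = meets_preset_condition(usage, min_count=5)
by_both = meets_preset_condition(usage, min_frequency=1.5, min_count=3)
```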

In one embodiment, the usage-associated operators differ in complexity. After determining that the usage information of the usage-associated operators of different complexity meets the preset information condition, the terminal can combine these operators according to their order of use.

In this embodiment, operator complexity serves as a basis for combination: usage-associated operators of different complexity are combined.

In one embodiment, the usage-associated operators include two operators of different complexity, and the operator of higher complexity is used before the operator of lower complexity. After the terminal determines that the usage information of the two operators meets the preset information condition, it combines them in the order in which the operator of higher complexity precedes the operator of lower complexity.

In this embodiment, the operators in the deep learning model are combined by pairing a higher-complexity operator with the lower-complexity operator that follows it.

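For example, batch normalization is a normalization operator, sigmoid is an activation function in a neural network, and bias add adds a bias term; these three operators have lower complexity than a convolution operator or a fully connected operator. Under this embodiment, a convolution or fully connected operator followed by one of these lower-complexity operators is a candidate for combination.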
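As an illustration of combining a higher-complexity operator with the lower-complexity operators that follow it, the sketch below fuses a fully connected operator with bias add and sigmoid. It uses plain Python lists instead of a tensor library, and all function names are hypothetical.

```python
import math

def fully_connected(x, weights):
    # Higher-complexity operator: matrix-vector product.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def bias_add(y, bias):
    # Lower-complexity operator: element-wise bias.
    return [yi + bi for yi, bi in zip(y, bias)]

def sigmoid(y):
    # Lower-complexity operator: element-wise activation.
    return [1.0 / (1.0 + math.exp(-yi)) for yi in y]

def fused_fc_bias_sigmoid(x, weights, bias):
    # Combined operator: each output element is finished in one step,
    # so no intermediate list is produced between the three stages.
    return [
        1.0 / (1.0 + math.exp(-(sum(w * xi for w, xi in zip(row, x)) + b)))
        for row, b in zip(weights, bias)
    ]

x = [1.0, 2.0]
weights = [[0.5, -0.25], [1.0, 1.0]]
bias = [0.0, -3.0]
separate = sigmoid(bias_add(fully_connected(x, weights), bias))
fused = fused_fc_bias_sigmoid(x, weights, bias)
```

The separate pipeline materializes two intermediate lists, while the fused version produces only the final output; both give the same values.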

In step 103, the first deep learning model is used in response to the use condition of the first deep learning model being satisfied.

Generally, the number of operators in the deep learning model is large, the large number of operators can form a plurality of operator execution paths, each operator execution path relates to a certain number of operators, and a certain function is realized through the operators related to the operator execution paths.

The terminal can combine the operators involved in an operator execution path of the deep learning model to obtain the combined first deep learning model. When the operators involved in that execution path need to be executed, the combined operator for the path in the first deep learning model is executed instead.
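The substitution of a combined operator for an execution path might look like the following sketch. The `run_path` function, the lookup keyed by the tuple of operator names, and the returned call count are illustrative assumptions only.

```python
def run_path(path, x, operators, combined_operators):
    """Run an execution path, preferring a combined operator if one exists.

    Returns the result and the number of operator calls that were made.
    """
    combined = combined_operators.get(tuple(path))
    if combined is not None:
        return combined(x), 1            # one call, no intermediate results
    for name in path:
        x = operators[name](x)           # each call hands its result onward
    return x, len(path)

operators = {
    "A": lambda v: v + 1,
    "B": lambda v: v * 2,
    "C": lambda v: v - 3,
}
# Hypothetical combined operator for the path A, B, C.
combined_operators = {("A", "B", "C"): lambda v: (v + 1) * 2 - 3}

original_result, original_calls = run_path(["A", "B", "C"], 5, operators, {})
combined_result, combined_calls = run_path(
    ["A", "B", "C"], 5, operators, combined_operators
)
```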

Because there are fewer operators after combination than before, the execution result of an operator is written to memory, and read back from memory for use by other operators, fewer times during model inference. This accelerates inference, shortens inference time, and improves the computing performance of the model.

The model inference process described in the embodiments of the present disclosure may be understood as a model use process or a model application process.

In one embodiment, the terminal obtains usage information of a plurality of operators in the deep learning model over a period of running time, where one operator among them is associated with a plurality of other operators, the output of the one operator serving as the input of those other operators.

In a case where the one operator and a part of the other operators have already been combined, if the terminal detects that the usage information of the one operator and another part of the other operators meets the preset information condition, the terminal combines the one operator and the other part of the other operators in a target deep learning model to obtain a combined second deep learning model, and, in response to a use condition of the second deep learning model being satisfied, controls the application program to use the second deep learning model.

The target deep learning model is a model, obtained based on the uncombined deep learning model, in which the one operator and the operators used after it have not been combined.

The target deep learning model can take various forms. For example, it may be the deep learning model in which no combination has occurred, i.e., the initial deep learning model. In this case, the terminal saves the uncombined deep learning model while optimizing the model.

For another example, the target deep learning model may be a model in which some operators of the deep learning model have been combined, but the one operator and the operators used after it have not been combined.
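The role of the target deep learning model can be sketched as follows, with a model represented as a plain list of operator names. This is a simplified assumption: `combine_operators`, the list representation, and the `models_by_path` lookup are hypothetical, and the saved initial model is used as the target.

```python
def combine_operators(model, first, second):
    """Return a new operator list with `first` and `second` replaced by one
    combined operator; fails if either is not present on its own."""
    if first not in model or second not in model:
        raise ValueError(f"{first} or {second} is not available uncombined")
    merged = [op for op in model if op not in (first, second)]
    merged.append(f"{first}+{second}")
    return merged

# Operator A feeds both B and C; the uncombined model is kept by the terminal.
initial_model = ["A", "B", "C"]
first_model = combine_operators(initial_model, "A", "B")

# A no longer exists on its own in first_model, so the A/C combination
# starts from a target model in which A is still uncombined.
target_model = initial_model
second_model = combine_operators(target_model, "A", "C")

# Hypothetical lookup: which combined model serves which execution path.
models_by_path = {("A", "B"): first_model, ("A", "C"): second_model}
```

Attempting `combine_operators(first_model, "A", "C")` raises `ValueError` in this sketch, which mirrors why a model with A uncombined is needed as the starting point.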

By executing the method provided by this embodiment, combined deep learning models corresponding to different operator execution paths are obtained.

While, for purposes of simplicity of explanation, the foregoing method embodiments have been described as a series of acts or combination of acts, it will be appreciated by those skilled in the art that the present disclosure is not limited by the order of acts, as some steps may, in accordance with the present disclosure, occur in other orders and concurrently.

Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments and that acts and modules referred to are not necessarily required by the disclosure.

Corresponding to the foregoing method embodiments, the present disclosure also provides embodiments of a model using apparatus and a corresponding terminal.

Fig. 2 is a block diagram illustrating a model using apparatus according to an exemplary embodiment. The apparatus illustrated in fig. 2 is applied to a terminal in which a deep learning model is built, and includes: an obtaining module 21, a first combining module 22 and a first usage module 23; wherein,

the obtaining module 21 is configured to obtain usage information of a plurality of operators in the deep learning model over a period of running time, where the plurality of operators include usage-associated operators;

the first combining module 22 is configured to, after determining that the usage information of the usage-associated operators meets a preset information condition, combine the usage-associated operators according to the usage order of the usage-associated operators to obtain a combined first deep learning model;

the first usage module 23 is configured to use the first deep learning model in response to a usage condition of the first deep learning model being satisfied.

In one embodiment, on the basis of the model using apparatus shown in fig. 2, the apparatus may further include: a determining module;

the determining module is configured to determine that the usage frequency of the usage-associated operators is greater than or equal to a preset frequency, and/or determine that the number of uses of the usage-associated operators is greater than or equal to a preset number.

In one embodiment, on the basis of the model using apparatus shown in fig. 2, the usage-associated operators differ in complexity;

the first combining module 22 may be configured to, after determining that the usage information of the usage-associated operators of different complexity satisfies the preset information condition, combine the usage-associated operators of different complexity according to their order of use.

In one embodiment, the usage-associated operators include two operators of different complexity, and the operator of higher complexity is used before the operator of lower complexity;

the first combining module 22 may be configured to, after determining that the usage information of the two operators of different complexity satisfies the preset information condition, combine the two operators in the order in which the operator of higher complexity precedes the operator of lower complexity.

In one embodiment, based on the model using apparatus shown in FIG. 2, one operator of the plurality of operators is associated with another plurality of operators; the apparatus may further include: a second combination module and a second use module;

the second combination module is configured to, in a case where the one operator and a part of the other operators have been combined, and in response to usage information of the one operator and another part of the other operators satisfying the preset information condition, combine the one operator and the other part of the other operators in a target deep learning model to obtain a combined second deep learning model, wherein the target deep learning model is a model, obtained based on the deep learning model, in which the one operator and the other part of the other operators have not been combined;

the second usage module is configured to use the second deep learning model in response to a usage condition of the second deep learning model being satisfied.

In one embodiment, the target deep learning model includes the deep learning model in which no operators have been combined.

For the device embodiments, since they substantially correspond to the method embodiments, reference may be made to the partial description of the method embodiments for relevant points. The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules can be selected according to actual needs to achieve the purpose of the disclosed solution. One of ordinary skill in the art can understand and implement it without inventive effort.

Fig. 3 is a schematic structural diagram of a terminal according to an exemplary embodiment, where the terminal may include: an internal bus 310, and a memory 320, a processor 330, and an external interface 340 connected through the internal bus 310; wherein,

an external interface 340 for acquiring data;

a memory 320 for storing corresponding machine-readable instructions for model use;

a processor 330 configured to read the machine-readable instructions from the memory 320 and execute them to perform operations comprising:

The disclosed embodiments also provide a non-transitory computer readable storage medium on which a computer program is stored, which when executed by a processor implements the steps of the above-described model using method.

In the embodiments of the present disclosure, the computer readable storage medium may take various forms. For example, in different examples, the storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), a similar storage medium, or a combination thereof. In particular, the computer readable medium may even be paper or another suitable medium on which the program is printed. With such media, the program can be captured electronically (e.g., by optical scanning), then compiled, interpreted, and processed in a suitable manner, and then stored in a computer medium.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A model using method, applied to a terminal in which a deep learning model is built, the method comprising:

2. The method of claim 1, wherein the determining that the usage information of the operator using the association satisfies a preset information condition comprises at least one of:

3. The method of claim 1, wherein the operators using associations differ in complexity; after determining that the usage information of the operators using the associations meets a preset information condition, combining the operators using the associations according to the usage sequence of the operators using the associations, including:

4. The method of claim 3, wherein the operators using associations comprise two operators of different complexity, wherein operators of higher complexity are used before operators of lower complexity; after determining that the usage information of the operators with the usage associations and different complexities meets the preset information condition, combining the operators with the usage associations and different complexities according to the usage sequence of the operators with the usage associations and different complexities, including:

5. The method of claim 1, wherein one of the plurality of operators is associated with another plurality of operators; the method further comprises the following steps:

6. The method of claim 5, wherein the target deep learning model comprises the deep learning model without combining.

7. A model using apparatus applied to a terminal having a deep learning model built therein, the apparatus comprising:

8. The apparatus of claim 7, further comprising:

9. The apparatus of claim 7, wherein the operators using associations differ in complexity;

10. The apparatus of claim 9, wherein the operators using associations comprise two operators of different complexity, wherein operators of higher complexity are used before operators of lower complexity;

11. The apparatus of claim 7, wherein one of the plurality of operators is associated with another plurality of operators; the device further comprises:

12. The apparatus of claim 11, wherein the target deep learning model comprises the deep learning model without combining.

13. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, performs the steps of the method of any one of claims 1 to 6.

14. A terminal, comprising: an internal bus, and a memory, a processor and an external interface connected through the internal bus; wherein,

the external interface is used for acquiring data;