CN114168446A - Simulation evaluation method and device for an algorithm model running on a mobile terminal


Info

Publication number
CN114168446A
Authority
CN
China
Prior art keywords
evaluation
model
environment
equipment
cloud
Prior art date
Legal status
Granted
Application number
CN202210126304.7A
Other languages
Chinese (zh)
Other versions
CN114168446B (en)
Inventor
吕承飞
吴飞
牛超越
顾仁杰
Current Assignee
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date
Filing date
Publication date
Application filed by Zhejiang University ZJU
Priority to CN202210126304.7A
Publication of CN114168446A
Application granted
Publication of CN114168446B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 11/00 Error detection; error correction; monitoring
    • G06F 11/34 Recording or statistical evaluation of computer activity, e.g. of down time or of input/output operation; recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F 11/3409 ... for performance assessment
    • G06F 11/3447 Performance evaluation by modeling
    • G06F 11/3457 Performance evaluation by simulation
    • G06N 20/00 Machine learning


Abstract

Embodiments of this application provide a simulation evaluation method and device for an algorithm model that runs on a mobile terminal. In mobile-oriented model evaluation, for one part of the device evaluation environments, a cloud evaluation environment matching those environments is simulated on cloud-side resources so that it can stand in for some real devices, solving the problem that model evaluation cannot be performed on every real device; for the other part, the corresponding real devices perform on-device evaluation. The machine learning model is evaluated jointly in the real-device environments and the cloud evaluation environments, which solves the problem of model evaluation for real devices and lays the groundwork for on-device intelligence. The combined scheme keeps the strengths of both evaluation modes, improving the evaluation efficiency for the machine learning model as well as the correctness and fidelity of the evaluation results, and realizing automated evaluation for mobile terminals.

Description

Simulation evaluation method and device for an algorithm model running on a mobile terminal
Technical Field
This application relates to the technical field of artificial intelligence, and in particular to a simulation evaluation method and device for an algorithm model running on a mobile terminal.
Background
Machine learning is the study of computer algorithms that improve automatically through data or past experience. Typically, a machine learning engineer collects data and designs an algorithm for a given problem, then trains the model in the cloud using a traditional cloud-side machine learning framework such as TensorFlow or PyTorch. During inference, the mobile device sends a model request to the cloud; the cloud runs the relevant computation with the machine learning model and returns the result to the mobile device for further processing. Using a machine learning model remotely in this way suffers from poor response timeliness.
As the computing power of mobile devices improves and model compression techniques mature, a machine learning model generated on the cloud side can be compressed and converted into a device-side model that is small and adapted to mobile devices, so that inference completes on the device side, realizing on-device intelligence and improving inference efficiency. However, compression and conversion change the structure and performance of the model, which makes evaluation of the converted model necessary. Mobile-oriented model evaluation, though, remains a pain point for on-device intelligence.
Disclosure of Invention
Various aspects of this application provide a simulation evaluation method, device, and storage medium for an algorithm model running on a mobile terminal, which solve the problem of mobile-oriented model evaluation and lay the groundwork for deploying intelligence on the device side.
An embodiment of this application provides a simulation evaluation method for an algorithm model running on a mobile terminal, comprising: in response to a model evaluation request, obtaining the model to be evaluated and the description information of the multiple device evaluation environments corresponding to it, the model to be evaluated being a machine learning model obtained by compressing and converting an original machine learning model; determining, from the description information of the multiple device evaluation environments, a first class of device evaluation environments to be simulated and a second class of device evaluation environments based on real devices; simulating, on target cloud-side resources, a cloud evaluation environment corresponding to each first-class environment, and deploying the software operating environment within the hardware operating environment provided by target real devices to obtain the second-class environments, a cloud evaluation environment comprising the same hardware and software operating environments as its corresponding first-class environment; and jointly evaluating the model to be evaluated in the second-class device evaluation environments and the cloud evaluation environments to obtain a performance evaluation result of the model.
An embodiment of this application further provides a simulation evaluation device for an algorithm model running on a mobile terminal, comprising: an obtaining module configured to, in response to a model evaluation request, obtain the model to be evaluated and the description information of the multiple device evaluation environments corresponding to it, the model to be evaluated being a machine learning model obtained by compressing and converting an original machine learning model; a determining module configured to determine, from the description information, the first class of device evaluation environments to be simulated and the second class of device evaluation environments based on real devices; a construction module configured to simulate, on target cloud-side resources, a cloud evaluation environment corresponding to each first-class environment and to deploy the software operating environment within the hardware operating environment provided by target real devices to obtain the second-class environments, a cloud evaluation environment comprising the same hardware and software operating environments as its corresponding first-class environment; and a combined evaluation module configured to jointly evaluate the model to be evaluated in the second-class device evaluation environments and the cloud evaluation environments to obtain a performance evaluation result of the model.
An embodiment of this application further provides a computer device comprising a memory and a processor, the processor being coupled to the memory and configured to execute a computer program so as to implement the simulation evaluation method for an algorithm model running on a mobile terminal.
An embodiment of this application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the simulation evaluation method for an algorithm model running on a mobile terminal.
In embodiments of this application, mobile-oriented model evaluation determines, from the multiple device evaluation environments, which environments are to be simulated and which are to be based on real devices. For the former, a matching cloud evaluation environment is simulated on cloud-side resources; because it matches the device evaluation environment, it can replace some real devices for model evaluation, solving the problem that evaluation cannot be performed on every real device. For the latter, real devices perform the on-device evaluation. Evaluating the machine learning model jointly in the real-device environments and the cloud evaluation environments solves the problem of model evaluation for real devices and lays the groundwork for on-device intelligence.
Further, in embodiments of this application, on the one hand, evaluation in a real-device environment guarantees the correctness and fidelity of the evaluation results; on the other hand, evaluation in a cloud evaluation environment exploits the advantages of cloud-side resources: it can stand in for more real devices and can complete evaluation over large-scale datasets, improving the efficiency and coverage of model evaluation, and to some extent it avoids the instability of testing models on real devices. Combining the strengths of both modes satisfies the evaluation tasks for both the accuracy and the runtime performance of models running on mobile terminals, improves the evaluation efficiency and the correctness and fidelity of the results, and realizes automated mobile-oriented model evaluation.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
Fig. 1 is a flowchart of a simulation evaluation method for an algorithm model running on a mobile terminal according to an embodiment of this application;
Fig. 2 is a development flow diagram of exemplary on-device intelligence;
Fig. 3 is a structural diagram of an evaluation system according to an embodiment of this application;
Fig. 4 is an interaction flowchart of a simulation evaluation method for an algorithm model running on a mobile terminal according to an embodiment of this application;
Fig. 5 is a structural diagram of a simulation evaluation device for an algorithm model running on a mobile terminal according to an embodiment of this application;
Fig. 6 is a structural diagram of a computer device according to an embodiment of this application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
To realize on-device intelligence, a machine learning model generated on the cloud side can be compressed and converted into a device-side model that is small and adapted to mobile devices, so that inference completes on the device side and inference efficiency improves. Compression and conversion can change the structure and performance of the model, so the new model must be evaluated to ensure it fits the mobile device. For example, compression and conversion may change the machine learning framework, say from TensorFlow on the cloud side to MNN on the device side; it must then be verified that the new model's accuracy on the new framework is essentially consistent with the original model's accuracy on the original framework, so that errors introduced by the compression and conversion steps do not degrade the algorithm's effect. For another example, since a mobile device's resources (e.g., computing power, memory, bandwidth) are limited compared with the cloud, the resource consumption of running the new model on the device side must also be evaluated, to verify the feasibility and stability of its on-device deployment.
However, mobile-oriented model evaluation remains a pain point for on-device intelligence, for two reasons:
firstly, the method comprises the following steps: the mobile equipment is seriously fragmented, and is difficult to cover all real-machine equipment to complete full-scale evaluation. There are great differences in the computing unit, OS (operating system) and hardware resources of the mobile device, and the number of types of accumulated devices exceeds 500. If the real machine evaluation is respectively completed on all equipment types for each converted model, the engineering quantity and the time cost are hard to bear. The computing unit of the mobile device includes, but is not limited to: arm (advanced RISC machines), CPU (central Processing Unit), GPU (graphics Processing Unit), NPU (Neural-network Processing Unit). The OS includes, but is not limited to: iOS (apple), Android (Android), hong meng system. Hardware resources include, for example, but are not limited to: memory, graphics card, bandwidth, etc.
Second, mobile devices are designed to be lightweight and struggle to support complex, heavy model evaluation. A single mobile device has limited computing power, storage, and memory, whereas the test set for a complete model evaluation usually contains tens of thousands or even hundreds of millions of samples, with a total size reaching gigabytes. Downloading this data to a mobile device and running the evaluation there not only makes the pipeline complex and hard to automate, but also exceeds what the device can handle. In addition, maintaining a stable long-lived connection between the mobile device and the server is itself a problem; devices frequently disconnect, so the stability of model evaluation is also a challenge.
To solve these technical problems, in embodiments of this application, mobile-oriented model evaluation determines, from the multiple device evaluation environments, which environments are to be simulated and which are to be based on real devices. For the former, a matching cloud evaluation environment is simulated on cloud-side resources and can replace some real devices for model evaluation, solving the problem that evaluation cannot be performed on every real device; for the latter, real devices perform the on-device evaluation. Evaluating the model jointly in the real-device environments and the cloud evaluation environments solves the problem of model evaluation for real devices and lays the groundwork for on-device intelligence. Furthermore, because the cloud evaluation environments replace some real devices, only a small number of real devices is needed, and most of the evaluation moves to the cloud side; this largely avoids the problems that real devices, with their limited computing power, storage, and memory, cannot complete the evaluation task well, and that the long-lived connection used for evaluation is unstable.
Further, in embodiments of this application, on the one hand, evaluation in a real-device environment guarantees the correctness and fidelity of the evaluation results; on the other hand, evaluation in a cloud evaluation environment exploits the advantages of cloud-side resources: it can stand in for more real devices and can complete evaluation over large-scale datasets, improving the efficiency and coverage of model evaluation, and to some extent it avoids the instability of testing models on real devices. Combining the strengths of both modes satisfies the evaluation tasks for both the accuracy and the runtime performance of models running on mobile terminals, improves the evaluation efficiency and the correctness and fidelity of the results, and realizes automated mobile-oriented model evaluation.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a simulation evaluation method for an algorithm model running on a mobile terminal according to an embodiment of this application. Referring to Fig. 1, the method may include the following steps:
101. In response to a model evaluation request, obtain the model to be evaluated and the description information of the multiple device evaluation environments corresponding to it; the model to be evaluated is a machine learning model obtained by compressing and converting an original machine learning model.
102. According to the description information of the multiple device evaluation environments, determine the first class of device evaluation environments to be simulated and the second class of device evaluation environments based on real devices.
103. Simulate, on target cloud-side resources, a cloud evaluation environment corresponding to each first-class device evaluation environment, and deploy the software operating environment within the hardware operating environment provided by target real devices to obtain the second-class device evaluation environments; a cloud evaluation environment comprises the same hardware and software operating environments as its corresponding first-class environment.
104. Jointly evaluate the model to be evaluated in the second-class device evaluation environments provided by the target real devices and the cloud evaluation environments provided by the target cloud-side resources, to obtain a performance evaluation result of the model.
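As a concrete illustration of steps 101 to 104, the following is a minimal Python sketch of the pipeline; every name in it (DeviceEnv, partition_environments, and so on) is hypothetical, chosen for this example rather than taken from the patent.

```python
# Hypothetical sketch of steps 101-104; all names are illustrative.
from dataclasses import dataclass, field

@dataclass
class DeviceEnv:
    hardware: dict = field(default_factory=dict)  # e.g. {"cpu": "ARM", "memory_gb": 8}
    software: dict = field(default_factory=dict)  # e.g. {"os": "Android 12", "engine": "MNN"}
    simulate: bool = True  # True: first class (cloud), False: second class (real device)

def partition_environments(envs):
    # Step 102: split into first-class (simulated) and second-class (real-device).
    return ([e for e in envs if e.simulate], [e for e in envs if not e.simulate])

def evaluate(model_name, envs):
    first_class, second_class = partition_environments(envs)
    # Step 103: a cloud environment reproduces the whole hardware/software stack;
    # a real device only needs the software environment deployed onto it.
    built = [("cloud", e) for e in first_class] + [("real", e) for e in second_class]
    # Step 104: combined evaluation over both kinds of environment.
    return [(kind, f"{model_name} evaluated on {e.software.get('os')}") for kind, e in built]

envs = [DeviceEnv(software={"os": "Android 12", "engine": "MNN"}),
        DeviceEnv(software={"os": "iOS 15", "engine": "MNN"}, simulate=False)]
print(evaluate("classifier.mnn", envs))
```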
For ease of understanding, the development flow of on-device intelligence is briefly described with reference to Fig. 2. First, the problem is defined for the application scenario that needs a machine learning model, and requirements are analyzed based on the defined problem. Then, the data to collect is determined from the requirement analysis, and the data is collected using device-side equipment such as mobile phones, tablets, Internet of Things (IoT) devices, wearables, and in-vehicle devices. Next, using the device-side data as training data, the algorithm is designed and the model structure of the machine learning model is determined; model training on the training data and model structure then yields the original machine learning model. Training of the original model is usually completed on the cloud side, producing for example a TensorFlow model developed with the cloud-side TensorFlow framework, a PyTorch model trained with the cloud-side PyTorch framework, or a Caffe model developed with the cloud-side Caffe framework. Training is not limited to cloud-side frameworks, however; the original model may also be trained on the device side with a device-side machine learning framework. After training, the original machine learning model is compressed and converted. Compression and conversion mainly optimize the model size, for example through pruning and quantization, and may change the model framework so that the model can run on the device side. The converted model may be, for example, a TFLite model based on the TFLite framework, an MNN model based on the MNN framework, or a model based on the PyTorch Mobile framework. The compressed model is then evaluated; a model that passes evaluation can be deployed to the device side for inference, and the inference results feed further analysis and decisions in the application scenario.
It should be noted that the TensorFlow, PyTorch, Caffe, TFLite, and MNN development frameworks are all machine learning development frameworks; essentially, a machine learning development framework is a programming library or tool that lets developers build machine learning models more easily and quickly. Such a framework can not only develop a machine learning model but also provide a machine learning inference engine for the developed model, or itself serve as the model's inference engine, thereby supplying the model's runtime environment.
Several deep learning frameworks are described here. TensorFlow is an end-to-end framework for machine learning and deep learning, developed as open-source mathematical software in C++ and computing in the form of data flow graphs. TFLite is a lightweight version of TensorFlow adapted to the hardware resources of mobile devices. PyTorch is a machine learning framework developed in the Python programming language. Caffe is a machine learning framework characterized by expressiveness, speed, and modularity. PyTorch Mobile is a lightweight version of PyTorch adapted to the hardware resources of mobile devices. MNN is a high-performance deep learning framework. Among these, TensorFlow, PyTorch, and Caffe are better suited to the cloud side, while TFLite, MNN, and PyTorch Mobile are better suited to the device side.
In this embodiment, the machine learning model obtained by compressing and converting the original machine learning model is called the model to be evaluated. It is a machine learning model that will run on mobile devices and must be evaluated before being deployed to them, to determine how it will behave on a mobile device. As an example, a cloud-side machine learning framework can be used for model training to obtain the original model, which is then compressed and converted according to a device-side framework to obtain the model to be evaluated. Compressing and converting according to a device-side framework means applying operator fusion, quantization, compression, and similar steps with reference to the model operators, model sizes, and model structures that the device-side framework supports, finally yielding a model to be evaluated that is adapted to that framework.
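As one concrete instance of such compression and conversion (the patent does not prescribe a particular toolchain, so this is only an example), TensorFlow's TFLite converter, TFLite being one of the device-side frameworks named above, can turn a cloud-side SavedModel into a quantized device-side model; the path saved_model_dir is a placeholder:

```python
import tensorflow as tf

# Convert a cloud-side SavedModel into a device-side TFLite model.
# "saved_model_dir" is a placeholder path for the trained original model.
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model_dir")

# Post-training quantization: one of the size optimizations the text mentions.
converter.optimizations = [tf.lite.Optimize.DEFAULT]

tflite_model = converter.convert()  # returns the converted model as bytes
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```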
The original machine learning model may be trained on a cloud server with a cloud-side machine learning framework, that is, developed using cloud-side resources and the machine learning framework the cloud server supports. Alternatively, it may be trained on a device-side device, using that device's resources and supported framework. Because the device that developed the original model and the device on which the model to be evaluated must be deployed differ in hardware and software operating environments, the model obtained by compressing and converting the original model must be evaluated. Note that in the embodiments of this application, a device-side device is a resource device opposite to cloud-side resources and includes at least mobile devices; moreover, during model evaluation, mobile devices are called real devices for ease of distinction and description.
In this embodiment, a model evaluation request may be triggered on demand by a user (for example, an algorithm developer of the model or an operations engineer responsible for evaluating it), or triggered automatically when a preset condition is met. The request asks that the model to be evaluated be evaluated in multiple device evaluation environments. Parsing the request yields the model to be evaluated and the description information of those environments. The description information describes the evaluation environments needed to evaluate the model for the real devices it targets: a device evaluation environment is the runtime environment a real device provides when running the model in practical applications, chiefly the hardware operating environment and/or software operating environment available when the model runs. Optionally, the description information of each environment includes hardware description information, describing the device-side (mobile-device) hardware operating environment used for evaluation, and software description information, describing the device-side software operating environment. Hardware operating environments include, but are not limited to: a Central Processing Unit (CPU) and its parameters, a Graphics Processing Unit (GPU) and its parameters, a Neural-network Processing Unit (NPU) and its parameters, memory and its parameters, graphics card and its parameters, and bandwidth. Software operating environments include, but are not limited to: an operating system (OS) and its version, a machine learning inference engine and its version, a database and its parameters, and so on.
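To make the two kinds of description information concrete, here is a hypothetical Python data structure for one device evaluation environment; all field names are invented for illustration, mirroring the items enumerated above.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HardwareDescription:
    cpu: str                      # e.g. "ARM Cortex-A76, 4x2.86 GHz"
    gpu: Optional[str] = None     # e.g. "Mali-G78"
    npu: Optional[str] = None
    memory_gb: int = 0
    graphics_card: Optional[str] = None
    bandwidth_mbps: int = 0

@dataclass
class SoftwareDescription:
    os: str                       # e.g. "Android"
    os_version: str               # e.g. "12"
    inference_engine: str         # e.g. "MNN"
    engine_version: str = ""
    database: Optional[str] = None

@dataclass
class DeviceEnvDescription:
    hardware: HardwareDescription
    software: SoftwareDescription
```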
Further, optionally, the model evaluation request carries an evaluation task identifier corresponding to the model to be evaluated, and the model to be evaluated and the description information of its multiple device evaluation environments can be obtained according to that identifier.
Specifically, before the model to be evaluated is evaluated, an evaluation task may be created for it. The information of the evaluation task may include, but is not limited to: the evaluation task identifier, which uniquely identifies the task; the task name; the address for obtaining the model file of the model to be evaluated; the address for obtaining the evaluation dataset; the model input parameters; the model output parameters; the performance evaluation index items; and the description information of the device evaluation environments required for evaluation. This task information (including the environment descriptions) may be stored in advance of the evaluation, but not only then; for example, it may also be submitted together with the model evaluation request.
The model file address is used to obtain the model file of the model to be evaluated. The evaluation dataset address is used to obtain the model's evaluation dataset from the evaluation data manager, which stores at least the evaluation task identifiers of the machine learning models to be evaluated and their evaluation datasets. The evaluation dataset may contain many data pairs, each represented in the form <model input data, expected model output data>. Model input parameters include, but are not limited to: the data type and data format of the input data and the batch size of each round. Model output parameters include, but are not limited to: the data type and data format of the output data and the batch size of each round's output.
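A hypothetical evaluation-task record, with every field name and URL invented for illustration, might gather the items above as follows.

```python
# Hypothetical evaluation-task record; every field name and URL is illustrative.
evaluation_task = {
    "task_id": "eval-001",                 # unique evaluation task identifier
    "task_name": "image-classifier-mnn",
    "model_url": "https://example.com/models/classifier.mnn",
    "dataset_url": "https://example.com/datasets/val-set.tar",
    "model_input": {"dtype": "float32", "format": "NCHW", "batch_size": 1},
    "model_output": {"dtype": "float32", "format": "NC", "batch_size": 1},
    "metrics": ["accuracy", "memory_footprint_mb", "avg_inference_ms"],
    "device_envs": [
        {"hardware": {"cpu": "ARM"}, "software": {"os": "Android 12", "engine": "MNN"}},
    ],
}

# Each entry of the evaluation dataset is a <model input, expected output> pair:
sample_pair = {"input": "images/0001.jpg", "expected": "golden retriever"}
```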
A performance evaluation index item is a performance indicator against which the model to be evaluated must be assessed; examples include, but are not limited to: Accuracy, Precision, Recall, and AUC, which relate to model accuracy, as well as memory footprint, CPU consumption, average inference time, and so on. The description information of the multiple device evaluation environments describes the device-side runtime environment, comprising a hardware operating environment and a software operating environment, needed to evaluate the model; see the foregoing description for details, which are not repeated here.
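Two of the listed index items lend themselves to a short sketch: accuracy over the <model input, expected output> pairs, and average inference time. The helpers below are illustrative; in practice, memory footprint and CPU consumption would be sampled from the operating system rather than computed this way.

```python
import time
import statistics

def accuracy(predictions, expected_outputs):
    # Fraction of model outputs matching the expected outputs of the data pairs.
    matches = sum(p == e for p, e in zip(predictions, expected_outputs))
    return matches / len(expected_outputs)

def avg_inference_ms(run_once, inputs, warmup=3):
    # Mean wall-clock inference time in milliseconds; a few warm-up runs are
    # excluded so that one-off initialization does not skew the average.
    for x in inputs[:warmup]:
        run_once(x)
    samples = []
    for x in inputs:
        t0 = time.perf_counter()
        run_once(x)
        samples.append((time.perf_counter() - t0) * 1000.0)
    return statistics.mean(samples)
```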
Based on the above, in an optional embodiment, the information of the corresponding evaluation task can be obtained according to the evaluation task identifier in the model evaluation request, and the model file can be fetched from the address recorded in that information; in this embodiment, obtaining the model file amounts to obtaining the model to be evaluated. Further, the description information of the model's device evaluation environments is read from the evaluation task information.
In this embodiment, considering that the model cannot be evaluated on every real device, the first class of environments to simulate and the second class of environments requiring real devices are determined from the environment descriptions. A first-class environment is one to be simulated with a cloud evaluation environment; a second-class environment is one provided directly by a real device. For each first-class environment, a corresponding cloud evaluation environment is simulated on target cloud resources; because it has the same hardware and software operating environments as the first-class environment, it can replace a real device for evaluating the model. For each second-class environment, the software operating environment required for model evaluation is deployed directly within the hardware operating environment provided by a target real device, yielding a real-device-based second-class environment.
Embodiments of this application thus support both real-device evaluation and evaluation in a cloud evaluation environment. The latter simulates, on cloud-side resources, an environment with the same hardware and software operating environments as the first-class device environment and evaluates the model there; the former runs the evaluation in the environment a real device provides. Both are evaluation environments for model evaluation. A cloud evaluation environment in this embodiment is one obtained by simulating a first-class environment with virtualization technology; for example, it may be built on an ARM server around the MNN engine, or an Android system may be run on an ARM server via virtualization with an MNN evaluation environment built inside it. Whatever its type, a device evaluation environment comprises a hardware operating environment and a software operating environment.
The description information of each device evaluation environment includes hardware description information, which describes the hardware operating environment the environment requires (such as the types of hardware resources and their occupancy), and software description information, which describes the required software operating environment (such as the type and version of the operating system and the device-side machine learning inference engine for the model to be evaluated). This embodiment does not limit how the first class of environments to simulate and the second class based on real devices are determined from these descriptions. The following modes are examples:
Mode 1: according to the hardware description information of the multiple device evaluation environments, combined with a set quantity ratio, select at least two environments as the first class to be simulated; the remaining environments form the second class based on real devices.
Optionally, when the software description information of all environments is the same, the types and occupancies of the hardware operating environments required by the various environments are determined from their hardware descriptions, and at least two environments are selected as the first class according to resource type and occupancy combined with the set quantity ratio. For example, more than 80% of the environments may be selected for cloud simulation, which fully exploits the advantages of cloud-side resources and reduces the cost of real-device evaluation. For instance, the cost of using a real device can be estimated from the resource type and occupancy of the hardware operating environment, and the more expensive environments chosen preferentially as the first class; alternatively, environments with a specific resource type or a large resource occupancy are designated the first class.
Of course, when the software descriptions differ across environments, the first class to simulate can still be determined from the hardware descriptions of the multiple environments; the details are as above and are not repeated here.
Mode 2: according to the software description information of multiple equipment evaluation environments, at least two equipment evaluation environments are selected as a first type of equipment evaluation environment to be simulated by combining a set quantity ratio, and the other equipment evaluation environments are used as second type of equipment environment information based on the real machine equipment.
Optionally, when the software description information differs across environments, the first class to simulate is determined from the software descriptions. For example, from each environment's software description, the OS type and version and the required device-side inference engine are determined; environments with a specific OS type, a specific OS version, or a specific device-side inference engine may then be selected as the first class, or, based on engine resource consumption, environments whose inference engines consume more resources may be selected. Two or more of these options may also be combined to determine the first class of environments to simulate.
Mode 3: according to hardware description information and software description information of multiple equipment evaluation environments, at least two equipment evaluation environments are selected as a first type of equipment evaluation environment to be simulated by combining a set quantity proportion, and the rest equipment evaluation environments are used as second type of equipment environment information based on the genuine equipment.
When both the hardware and the software descriptions differ across environments, the first class can be determined from both together with a set quantity ratio, such as no less than 50%, no more than 90%, or 80%. In the concrete determination, the first class is derived comprehensively from the resource type and occupancy defined by the hardware description and from the OS type and version and the inference-engine type and resource consumption defined by the software description.
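The following Python sketch illustrates the cost-based selection of modes 1 and 3 under stated assumptions: environments are plain dictionaries, the cost proxy is invented, and the 80% figure is the example ratio used above.

```python
def partition_by_cost(envs, simulate_ratio=0.8):
    """Pick the most expensive environments to simulate on the cloud; the rest
    are evaluated on real devices. The cost proxy below is a placeholder."""
    def cost(env):
        hw = env["hardware"]
        return hw.get("memory_gb", 0) + 10 * bool(hw.get("npu"))
    ranked = sorted(envs, key=cost, reverse=True)
    k = max(2, int(len(ranked) * simulate_ratio))  # at least two, per the text
    return ranked[:k], ranked[k:]  # (first class: simulate, second class: real)

envs = [{"hardware": {"memory_gb": m, "npu": n}} for m, n in
        [(12, "yes"), (8, None), (6, None), (4, None)]]
first, second = partition_by_cost(envs)
print(len(first), "to simulate,", len(second), "on real devices")  # 3 / 1
```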
Based on the above, once the first class of environments to simulate is determined, the remaining environments form the second class based on real devices. An evaluation environment identical to the first-class environment required for model evaluation can then be built on cloud-side resources; for ease of understanding, it is called the cloud evaluation environment, and the cloud-side resources carrying it are called the target cloud-side resources. Target cloud-side resources include, but are not limited to, cloud computing resources such as ARM servers and cloud storage resources such as ODPS and ODS. Thus, in this embodiment, a cloud evaluation environment corresponding to the first-class environment is simulated on the target cloud-side resources and comprises the same hardware and software operating environments as that environment. Note that the target cloud-side resources can be regarded as simulated devices on the cloud side, as opposed to real devices on the device side: real devices are actual mobile phones, tablets, IoT devices, wearables, and in-vehicle devices, while simulated devices are their cloud-side counterparts with the same hardware and software operating environments.
In concrete applications, the description information of a first-class environment includes hardware and software description information, and one optional way to simulate the corresponding cloud evaluation environment on the target cloud-side resources is: according to the hardware description, select from the cloud-side resources those whose hardware operating environment matches the one the first-class environment requires, as the target cloud-side resources; then, according to the software description, build the required software operating environment on them, obtaining the cloud evaluation environment corresponding to the first-class environment. For example, if the hardware description indicates evaluation on a real device supporting the ARM instruction set, a cloud server supporting the ARM instruction set is selected from the available servers as the target cloud-side resource; if the software description indicates the real device must be configured with the Android system and the MNN inference engine, Android and the MNN engine are deployed on the target resource.
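A minimal matching sketch for the ARM example follows, with invented field names; a real resource scheduler would of course match on more than the instruction set.

```python
def pick_target_cloud_resource(cloud_resources, hardware_desc):
    # Select a cloud-side resource whose hardware operating environment matches
    # the first-class environment's hardware description (here: instruction set).
    for res in cloud_resources:
        if res["instruction_set"] == hardware_desc["instruction_set"]:
            return res
    raise LookupError("no cloud resource provides the required hardware environment")

# An environment requiring the ARM instruction set is matched to an ARM server,
# onto which Android and the MNN engine would then be deployed.
target = pick_target_cloud_resource(
    [{"name": "x86-pool", "instruction_set": "x86_64"},
     {"name": "arm-pool", "instruction_set": "arm64"}],
    {"instruction_set": "arm64"},
)
print(target["name"])  # arm-pool
```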
Further, optionally, to improve the correctness and fidelity of the evaluation results, the operating system and the device-side machine learning inference engine required for the evaluation may be deployed on the target cloud-side resources. For example, if the model to be evaluated is based on the MNN framework, the device-side engine is the MNN inference engine; if it is based on the TFLite framework, the engine is the TFLite inference engine.
On this basis, an optional way to build the software operating environment on the target cloud-side resources from the first-class environment's software description, and thereby obtain the corresponding cloud evaluation environment, is: obtain, from the software description, the operating system the environment requires and the device-side machine learning inference engine adapted to the model; then, according to the required operating system, build a target container on the target cloud-side resources using virtualization technology and deploy the inference engine inside it. Note that different environments within the first class may require the same or different device-side inference engines; this is not limited.
In this embodiment, the operating system the target container depends on must be the same as the one the first-class environment requires. In practice, the operating system of the target cloud-side resource may or may not coincide with it, so the target container must be built accordingly. In an optional implementation: if the required operating system is the same as that of the corresponding target cloud-side resource, a first container that depends on the host operating system is built on the resource as the target container; if it differs, a second container carrying its own operating system, the same as the required one, is built as the target container.
Note that the host operating system is the operating system of the target cloud-side resource. If the required operating system is Linux (many IoT devices run on Linux) and the host operating system is also Linux, the first container need not carry an operating system; it runs on the host OS, and the cloud evaluation environment simulated on it replaces a Linux-based real-device environment. If the required operating system is Android while the host operating system is Linux, the second container must carry its own operating system, namely Android.
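The following sketch illustrates the two container cases using the Docker SDK for Python; the image names are placeholders rather than published images, and a production system might use a different virtualization stack entirely.

```python
import docker  # Docker SDK for Python: pip install docker

client = docker.from_env()

def build_target_container(required_os, host_os="Linux"):
    if required_os == host_os:
        # First container: depends on the host OS (shares its kernel) and only
        # carries the evaluation software stack, e.g. the MNN inference engine.
        return client.containers.run("example/mnn-runtime:latest", detach=True)
    # Second container: carries its own operating system, the same as the one
    # the first-class environment requires, e.g. Android emulated on a Linux host.
    return client.containers.run(
        "example/android-mnn:12",
        detach=True,
        privileged=True,  # emulators typically need extended device access
    )
```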
In this embodiment, in order to improve the correctness and authenticity of the evaluation result, a model evaluation mode based on a real-machine environment is also supported; that is, the model to be evaluated is evaluated on real-machine devices. For this, the target real-machine devices capable of providing the second-class device evaluation environment must be determined. Therefore, in this embodiment, according to the description information of the second-class device evaluation environment, a target real-machine device is selected from the real-machine devices that can provide the hardware operating environment required by the second-class device evaluation environment, and the corresponding software operating environment is deployed on the target real-machine device, so as to obtain the second-class device evaluation environment.
In practice, real-machine devices come from many manufacturers and in many models, so their variety is large. Further optionally, in order for the evaluation of the model to cover real-machine devices more comprehensively, the target real-machine devices may be selected according to the description information of the second-class device evaluation environment as follows: determine, from that description information, the real-machine devices capable of providing the hardware operating environment of the second-class device evaluation environment; classify those devices into multiple categories according to their multi-dimensional attribute information; and select some devices from each category as target real-machine devices.
Specifically, the multi-dimensional attribute information of a real-machine device includes, but is not limited to: manufacturer name, number of users, CPU parameter indexes, GPU parameter indexes, NPU parameter indexes, memory and its parameter indexes, graphics card and its parameter indexes, bandwidth, and so on. For example, the devices may be grouped by manufacturer name into devices of different manufacturers; divided by number of users into mainstream devices with many users and non-mainstream devices with few users; or divided into tiers from high to low according to one or more hardware indexes such as the CPU, GPU, NPU, memory, graphics card, or bandwidth parameters. Of course, the specific classification strategy depends on the actual application requirements.
In this embodiment, after the real-machine devices are classified, some devices may be selected from each category as the target real-machine devices for real-machine evaluation; the number selected per category may be one or more. It should be understood that having the real-machine evaluation cover different categories of devices improves the accuracy and authenticity of the evaluation result of the model to be evaluated.
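As a minimal sketch of such a classification-and-sampling strategy (the attribute names, grouping keys, and threshold are illustrative assumptions, not values from the embodiment):

```python
import random
from collections import defaultdict

def select_target_devices(devices, per_category=2, mainstream_users=1_000_000):
    """Group candidate real-machine devices by (manufacturer, user tier),
    then sample a few devices from each group for real-machine evaluation."""
    groups = defaultdict(list)
    for d in devices:
        tier = "mainstream" if d["users"] >= mainstream_users else "non-mainstream"
        groups[(d["manufacturer"], tier)].append(d)
    targets = []
    for members in groups.values():
        targets.extend(random.sample(members, min(per_category, len(members))))
    return targets

devices = [
    {"model": "A1", "manufacturer": "VendorA", "users": 5_000_000},
    {"model": "A2", "manufacturer": "VendorA", "users": 200_000},
    {"model": "B1", "manufacturer": "VendorB", "users": 3_000_000},
]
print(select_target_devices(devices))
```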
In this embodiment, after the second-class device evaluation environment is constructed on the target real-machine devices and the cloud evaluation environment is constructed on the target cloud-side resources, the model to be evaluated can be jointly evaluated in the two environments to obtain its performance evaluation result.
In a specific application, target evaluation data matched to the model to be evaluated is obtained from the evaluation data management end according to the evaluation task identifier corresponding to the model. With that data, the model is run separately in the second-class device evaluation environment provided by the target real-machine devices and in the cloud evaluation environment provided by the cloud-side resources, yielding a real-machine evaluation result and a simulation evaluation result matched to the performance evaluation index items. The two results are then jointly analyzed per performance evaluation index item to obtain the performance evaluation result of the model. When obtaining the real-machine and simulation evaluation results matched to the performance evaluation index items, at least one of the operation result data, intermediate state data, and resource consumption data generated by the model in each of the two environments may be collected according to the index items, and the real-machine and simulation evaluation results are then generated from the data collected in the different environments.
In this embodiment, according to the evaluation task identifier corresponding to the model to be evaluated, the performance evaluation index items adapted to the model in this evaluation task may be obtained; these include at least index items related to model precision and index items related to model operation performance.
In this embodiment, according to the evaluation task identifier corresponding to the model to be evaluated, the evaluation data set associated with that identifier may also be obtained from the evaluation data management end, and several data pairs selected from it as the target evaluation data adapted to the model. The model input data in the target evaluation data is used as the input of the model to be evaluated, which is run separately in the second-class device evaluation environment provided by the target real-machine devices and in the cloud evaluation environment provided by the cloud-side resources; the operation result data, intermediate state data, and resource consumption data generated during the run are then collected. The operation result data is the result produced by the model performing its algorithmic computation on the input data. The intermediate state data is the non-result data produced by one or more processing steps during the run. The resource consumption data includes, for example but not limited to: the amount of memory consumed, the amount of CPU consumed, and the average inference time during the run of the model.
It should be noted that precision-related metrics such as Accuracy, Precision, Recall, and Area Under the Curve (AUC) can be calculated by comparing the above operation result data with the expected model result data.
Here, accuracy is the number of samples correctly classified by the machine learning model divided by the total number of samples; generally, the higher the accuracy, the better the classification effect. Precision measures exactness: the proportion of samples classified as positive by the model that are truly positive. Recall measures coverage: the proportion of truly positive samples that the model classifies as positive. The area under the curve (AUC) is a comprehensive measure of the model's effect over all possible classification thresholds.
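A minimal sketch of computing these precision-related metrics from operation result data and expected labels (binary classification assumed; this is illustrative, not the embodiment's code):

```python
def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if tp + fp else 0.0

def recall(y_true, y_pred):
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0

def auc(y_true, y_score):
    # Probability that a random positive sample scores higher than a random negative one.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true, y_score = [1, 0, 1, 1, 0], [0.9, 0.3, 0.6, 0.4, 0.5]
y_pred = [int(s >= 0.5) for s in y_score]
print(accuracy(y_true, y_pred), precision(y_true, y_pred),
      recall(y_true, y_pred), auc(y_true, y_score))
```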
After at least one of the operation result data, intermediate state data, and resource consumption data generated during the run of the model is obtained, the actual performance evaluation indexes related to model precision and/or model operation performance can be calculated. The actual precision-related index is compared with the reference index of the corresponding precision-related index item to determine the evaluation result of the model in the model-precision dimension; the actual performance-related index is compared with the reference index of the corresponding performance-related index item to determine the evaluation result in the operation-performance dimension. The performance evaluation result of the model is then decided jointly from the precision-dimension and performance-dimension results obtained in the second-class device evaluation environment and the cloud evaluation environment. For example, the actual indexes of the model under the different performance evaluation index items, obtained in the two environments, may be weighted and summed to produce the final performance evaluation result; for different index items, the weights assigned to the two kinds of evaluation environment differ. The performance evaluation result of the model includes outcomes such as evaluation passed and evaluation failed.
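The weighted joint decision could look like the following sketch (the weights, index names, normalization, and pass threshold are illustrative assumptions):

```python
# actual[index][env] = measured value normalized to [0, 1] (higher is better);
# weights[index][env] = weight of each evaluation environment for that index item.
actual = {
    "accuracy":       {"real_machine": 0.92, "cloud": 0.93},
    "inference_time": {"real_machine": 0.80, "cloud": 0.70},
}
weights = {
    "accuracy":       {"real_machine": 0.3, "cloud": 0.7},  # cloud covers a larger data set
    "inference_time": {"real_machine": 0.8, "cloud": 0.2},  # real machines are authoritative for latency
}

score = sum(actual[i][e] * weights[i][e]
            for i in actual for e in actual[i]) / len(actual)
print("evaluation passed" if score >= 0.75 else "evaluation failed", round(score, 3))
```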
The cloud evaluation environment provided by the cloud-side resources enables complete evaluation of the model on large-scale data sets and automated evaluation across many kinds of simulated real-machine devices; the second-class device evaluation environment provided by the real-machine devices supports real-machine evaluation, yields the performance data indexes of the model actually running, and verifies the correctness and stability of model deployment in a complex real-machine environment. Therefore, combining evaluation environments such as the second-class device evaluation environment provided by real-machine devices and the cloud evaluation environment provided by cloud-side resources satisfies evaluation tasks for both model precision and model operation performance oriented to running on the mobile terminal, supports large-scale data set evaluation tasks, and supports automated evaluation on many kinds of real-machine devices.
In an optional embodiment, the variety and number of first-class device evaluation environments are greater than those of second-class device evaluation environments; that is, most evaluation tasks are executed on the cloud and only a small portion on real-machine devices. This improves evaluation efficiency while preserving the accuracy and authenticity of the evaluation result.
It should be noted that, in practice, all of the device evaluation environments corresponding to the model to be evaluated may be classified as first-class, so that the model is evaluated entirely in the cloud-evaluation mode across all of them. Equally, they may all be classified as second-class, so that the model is evaluated entirely in the real-machine mode.
According to the simulation evaluation method for algorithm models running on the mobile terminal provided above, in mobile-terminal-oriented model evaluation, a cloud evaluation environment adapted to part of the device evaluation environments is simulated on cloud-side resources, and this cloud evaluation environment can replace part of the real-machine devices for model evaluation, which alleviates the problem that model evaluation cannot be carried out on every real-machine device. For the remaining device evaluation environments, the corresponding real-machine devices are used for real-machine evaluation. The machine learning model to be evaluated thus undergoes both real-machine-based and cloud-based model evaluation, which solves the problem of model evaluation oriented to real-machine devices and creates the conditions for landing intelligence on the end side. Furthermore, because the cloud evaluation environment can stand in for part of the real-machine devices, only a small number of real-machine devices are needed, with most model evaluation transferred to the cloud side; this largely resolves the problems that real-machine devices cannot complete evaluation tasks well owing to limited computing power, storage, and memory, and that the long-lived connections used for model evaluation are unstable.
Further, in the embodiment of the application, on one hand, model evaluation based on the real-machine environment guarantees the correctness and authenticity of the model evaluation result; on the other hand, model evaluation based on the cloud evaluation environment exploits the advantages of cloud-side resources: it can replace more real-machine devices for model evaluation and can complete evaluation on large-scale data sets, which improves the evaluation efficiency and coverage of the machine learning model and, to a certain extent, alleviates the instability of model evaluation encountered when testing models on real-machine devices. Moreover, by combining the advantages of both evaluation modes, the evaluation tasks for model precision and model operation performance oriented to mobile-terminal operation are satisfied, the evaluation efficiency and the correctness and authenticity of the evaluation result are improved, and automated evaluation of mobile-terminal-oriented models is realized.
For ease of understanding, the simulation evaluation method for algorithm models running on the mobile terminal is described below by taking a specific evaluation system as an example.
Fig. 3 is a schematic structural diagram of an evaluation system according to an embodiment of the present application. As shown in fig. 3, the system includes: a model evaluation scheduling terminal 10, an evaluation data management terminal 20, and a model evaluation execution terminal 30, which may be connected by wired or wireless networks. Optionally, any two devices in the system shown in fig. 3 interact through a wired network (for example, coaxial cable, twisted pair, or optical fiber) or a wireless network (for example, a 2G, 3G, 4G, or 5G network, or a Wireless Fidelity (Wi-Fi) network). The specific type or form of interaction is not limited in this application, as long as the interaction function between every two devices can be realized. It should be understood that the numbers of model evaluation scheduling terminals 10, evaluation data management terminals 20, and model evaluation execution terminals 30 in fig. 3 are only illustrative; in practice, any number of each may be deployed as needed.
The embodiment of the application does not limit the device form of the model evaluation scheduling terminal 10, the evaluation data management terminal 20, or the model evaluation execution terminal 30. Each of them may be, for example but not limited to, a terminal device such as a mobile phone, tablet computer, wearable smart device, or smart home device, or a single server or a distributed cluster of multiple servers.
In this embodiment, the model evaluation scheduling terminal 10 mainly manages operations related to evaluation tasks, such as creation and release; the evaluation data management terminal 20 mainly manages evaluation data, for example storing the evaluation task identifiers of machine learning models to be evaluated in association with their evaluation data sets; and the model evaluation execution terminal 30 mainly executes model evaluation.
The simulation evaluation method for algorithm models running on the mobile terminal is described with reference to the interaction flowchart shown in fig. 4. Referring to fig. 4, the method of this embodiment may include the following steps:
S1, the user sends an evaluation task creation request to the model evaluation scheduling terminal 10.
S2, the model evaluation scheduling terminal 10 creates the evaluation task of the model to be evaluated in response to the evaluation task creation request.
S3, the model evaluation scheduling terminal 10 sends the evaluation task identifier corresponding to the model to be evaluated and the evaluation data set thereof to the evaluation data management terminal 20.
S4, the evaluation data management terminal 20 stores the evaluation task identification and the evaluation data set corresponding to the model to be evaluated in an associated manner.
S5, the user sends a model evaluation request to the model evaluation scheduling terminal 10.
S6, the model evaluation scheduling terminal 10 forwards the model evaluation request to the model evaluation execution terminal 30.
S7, in response to the model evaluation request, the model evaluation execution terminal 30 obtains the model to be evaluated and the description information of its multiple device evaluation environments from the model evaluation scheduling terminal 10, and obtains the corresponding target evaluation data from the evaluation data management terminal 20.
S8, according to the description information of the various device evaluation environments, the model evaluation execution terminal 30 determines the first-class device evaluation environments to be simulated and the second-class device evaluation environments based on real-machine devices, constructs the second-class device evaluation environments on the target real-machine devices according to their description information, and constructs the cloud evaluation environment on the target cloud-side resources according to the description information of the first-class device evaluation environments.
S9, the model evaluation execution terminal 30 uses the target evaluation data as model input data and runs the model to be evaluated in the second-class device evaluation environment provided by the target real-machine devices and in the cloud evaluation environment provided by the target cloud-side resources, respectively.
S10, the model evaluation execution terminal 30 jointly evaluates the model to be evaluated according to the relevant information of the evaluation task and the model run data, obtaining the performance evaluation result of the model.
Specifically, the whole simulation evaluation process can be divided into an evaluation task creation stage and an evaluation task execution stage. Steps S1 to S4 in fig. 4 correspond to the evaluation task creation stage, and steps S5 to S10 correspond to the evaluation task execution stage.
In the evaluation task creation stage, a user can initiate an evaluation task creation request as needed, and the model evaluation scheduling terminal 10 responds by creating the evaluation task of the model to be evaluated. For example, the model evaluation scheduling terminal 10 pushes an evaluation task creation page to the user, on which the user inputs the relevant information of the evaluation task, including but not limited to: an evaluation task identifier serving as the unique identifier of the task, the task name, the acquisition address of the model file of the model to be evaluated, the acquisition address of the evaluation data set, model input parameters, model output parameters, performance evaluation indexes, and evaluation environments. For example, the three evaluation environments shown in fig. 3 (simulated MNN running environment, simulated Android system evaluation, and real-machine evaluation) are provided on the evaluation task creation page for the user to select. After selecting an evaluation environment, the user inputs its description information on the page.
The model evaluation scheduling terminal 10 creates the evaluation task in response to the user confirming the relevant information on the evaluation task creation page. It can download the model file of the model to be evaluated from the acquisition address of the model file and store it locally, and send the evaluation task identifier corresponding to the model and its evaluation data set to the evaluation data management terminal 20, which stores them locally in an associated manner. Creation of the evaluation task of the model to be evaluated is then complete.
In the evaluation task execution stage: the model evaluation scheduling terminal 10 confirms entry into this stage upon receiving a model evaluation request from the user. It then forwards the request to the model evaluation execution terminal 30, which performs steps S7 to S10 to execute the evaluation task. For the specific implementation of steps S7 to S10, refer to the description of the foregoing embodiments.
Take as an example a user who selects all three evaluation environments of fig. 3: the simulated MNN running environment, the simulated Android system evaluation, and the real-machine evaluation. First, the model evaluation execution terminal 30 creates the three evaluation environments based on their respective description information. Then, the model evaluation execution terminal 30 acquires the model file of the model to be evaluated from the model evaluation scheduling terminal 10 and deploys it into the three evaluation environments, thereby deploying the model to be evaluated in all three; in addition, it acquires the corresponding target evaluation data from the evaluation data management terminal 20 and deploys that data into the three evaluation environments as well. The model to be evaluated is run in the three environments with the target evaluation data as model input, and the model run data generated during the run, such as operation result data, intermediate state data, and resource consumption data, is collected. Finally, the model evaluation execution terminal 30 performs model evaluation according to the reference performance evaluation indexes in the evaluation task and the model run data.
The following briefly explains the implementation principle and the evaluation target of the three evaluation environments.
1. Simulated MNN running environment
Implementation principle: an evaluation running environment is built on an ARM server based on the MNN engine. Since it uses the same instruction set architecture (ARM) as the real-machine devices, evaluation output consistent with actually running on a real-machine device can be obtained.
Evaluation target: evaluating the precision of the model on a large-scale evaluation data set, i.e. precision indexes of the model such as Accuracy, Precision, Recall, and AUC (area under the curve). Multiple evaluation engine instances can be created to evaluate the data set in parallel and so obtain the evaluation result faster.
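A minimal sketch of such parallel evaluation, assuming a hypothetical `run_model` function that wraps one inference-engine instance (in the real system this would be an MNN-engine call):

```python
from concurrent.futures import ProcessPoolExecutor

def run_model(sample):
    # Placeholder for one inference call on one evaluation sample (features, label).
    features, label = sample
    prediction = int(sum(features) > 0)  # dummy stand-in for the model's logic
    return prediction, label

def parallel_evaluate(dataset, workers=4):
    """Split the evaluation data set across several engine instances."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(run_model, dataset))
    correct = sum(pred == label for pred, label in results)
    return correct / len(results)

if __name__ == "__main__":
    dataset = [((-1, 2), 1), ((0, -3), 0), ((2, 2), 1), ((-5, 1), 0)]
    print("accuracy:", parallel_evaluate(dataset))
```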
2. Simulated Android running environment
Implementation principle: an Android system is simulated on an ARM server through virtualization, and an MNN evaluation environment is then built inside that Android system. Because virtualization simulates the end-side hardware and system environment, performance indexes essentially consistent with actually running on real-machine devices can be measured. Multiple evaluation engine instances can likewise be created to evaluate the data set in parallel and so obtain the evaluation result faster.
Evaluation target: evaluating operation-performance indexes such as the CPU, memory, and inference time consumed when the model runs on real-machine devices. The simulated hardware can be modified quickly through simulator parameter settings, and the simulated system changed quickly by switching system images; combining the two enables automated evaluation over a large number of simulated real-machine devices, improving the coverage of the evaluated real-machine devices.
3. Real-machine running environment
Implementation principle: connect the real-machine devices through a communication protocol, deploy the MNN evaluation engine, push evaluation tasks, and obtain the execution results. In fig. 3, the real-machine platform includes a plurality of real-machine devices, illustrated as mobile phones.
Evaluation target: evaluating the real-machine precision and performance of the model on specific devices, while checking the correctness and stability of model deployment in a complex real-machine environment.
It is worth noting that the two simulation environments, which run the end-side engine on an ARM server, realize complete evaluation of the model on large-scale data sets and automated evaluation of many kinds of simulated real-machine devices. Real-machine evaluation based on the real-machine platform also supports evaluation in the real running environment of specific devices, obtains the performance data indexes of the model actually running, and verifies the correctness and stability of model deployment in a complex real-machine environment. Therefore, the three evaluation environments together satisfy evaluation tasks for model precision and model operation performance oriented to mobile-terminal operation, and support both large-scale data set evaluation and automated evaluation on many kinds of real-machine devices.
In the embodiments of the present application, if the performance evaluation result of the model to be evaluated indicates that evaluation failed, notification information can be pushed to the algorithm developer responsible for the model so that it can be repaired in time. If the result indicates that evaluation passed, the model is deployed to a real-machine device running the target application, i.e. the application that needs to use the model. Further, a trigger event sequence for the model can be generated according to information on the target events occurring while the target application runs, and the computing container deployed on the real-machine device is triggered to execute the model according to that trigger event sequence.
In the embodiment of the application, a data stream processing component or framework and a computing container are also deployed on the real-machine device where the target application is located. The data stream processing component or framework generates the trigger event sequence of the model to be evaluated from information on target events occurring during the run of the target application, and triggers the computing container to execute the machine learning model to be evaluated or the machine learning task to which it belongs. The computing container is mainly responsible for model inference of the machine learning model, but is not limited to this; for example, when the machine learning model belongs to a machine learning task that also includes pre-processing and post-processing tasks related to the model, the computing container may execute those tasks as well.
This embodiment defines a machine learning task and its implementation structure. The concrete expression of a machine learning task can be regarded as a piece of program code, and its implementation structure can include a pre-processing task, a machine learning model, and a post-processing task. The pre-processing task pre-processes the input data to be fed to the machine learning model; for example, in a computer vision scenario it performs data pre-processing such as image rotation, image enlargement, and image reduction. The machine learning model is a function expressed as a piece of data that records the function structure and function parameters; after training (parameter optimization) it can recognize specific types of patterns, its role being to map a sample to a sample label. The post-processing task post-processes the output data of the machine learning model; for example, when the model outputs multiple classification results with their probabilities, it selects from them, according to those probabilities, a final classification result that meets the requirements and outputs it.
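As a minimal sketch of this three-part task structure (the resize logic and the model function are illustrative stand-ins, not the embodiment's code):

```python
def pre_process(image):
    # e.g. scale pixel values and flatten; real tasks might rotate or resize images
    return [px / 255.0 for row in image for px in row]

def model(features):
    # stand-in for the trained machine learning model: returns class probabilities
    s = sum(features) / len(features)
    return {"cat": s, "dog": 1 - s}

def post_process(probs, threshold=0.5):
    # pick the most probable class, subject to a confidence requirement
    label, p = max(probs.items(), key=lambda kv: kv[1])
    return label if p >= threshold else "uncertain"

def machine_learning_task(image):
    return post_process(model(pre_process(image)))

print(machine_learning_task([[0, 128], [255, 255]]))
```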
The working principle with respect to the data stream processing component or framework is as follows:
In this embodiment, a task tree may be used to organize and manage the machine learning tasks of the target application and their trigger event sequences. For the current target event occurring during the run of the target application, the task tree is queried to match the event against its event nodes or leaf nodes, and whether to trigger execution of a machine learning task is decided from the matching result, thereby completing end-side processing of the data stream (or event stream) related to the machine learning task. In combination with the task tree, whether the trigger condition of a machine learning task is met can thus be identified automatically, quickly, and accurately on the end side, guaranteeing automatic, accurate, and fast execution of the machine learning task.
Specifically, a task tree corresponding to the target application is generated in advance. The task tree includes a root node, event nodes, and leaf nodes; each leaf node is associated with at least one machine learning task, and the trigger events in the trigger event sequence of those tasks correspond, in order, to the event nodes on the path from the root node to the leaf node. When the target application produces a current target event, candidate event nodes for it are obtained. The candidates comprise two kinds of event node: first-kind event nodes, which are the next-level event nodes of the root node, and second-kind event nodes, which are the next-level event nodes of the event nodes matched by the previous target event in the task tree. The current target event is matched against the trigger events of the candidate event nodes to obtain, as target event nodes, at least one candidate whose trigger event matches it. For each target event node whose next-level nodes include a leaf node, the at least one machine learning task associated with that leaf node is executed according to the information of the target events matched along the path from the root node to the leaf node.
In this embodiment, the trigger condition of a machine learning task may be a trigger event sequence composed of one or more trigger event IDs (ID: Identity/Identifier), where the position of a trigger event ID in the sequence indicates the order in which the corresponding trigger event must occur. When the real-machine device detects that all trigger events in the sequence have occurred in order, the trigger condition is met, the machine learning task is triggered, and the real-machine device must execute it. Conversely, if the trigger events do not all occur in the required order, the task is not triggered and the computing container need not execute it.
In this embodiment, a trigger event may be a basic event, i.e. an original event generated in the real-machine device by user operations. Based on the user operation, basic events can be classified, for example and without limitation, into: page entry events corresponding to page enter operations, page exit events corresponding to page exit operations, scroll events corresponding to page scroll operations, click events corresponding to control click operations, and exposure events corresponding to exposure operations. An exposure operation means that specific content (e.g. goods, an advertisement banner) is presented on the screen of the real-machine device along with other interactions of the user (e.g. page entry, page sliding, control clicking), at which point the user is considered to have seen that content; "the specific content appears on the screen" may thus also be described as "the user sees the specific content", and this is recorded as an exposure event of that content.
In this embodiment, basic events may be obtained by analyzing the user behavior data generated as the user operates the App on the real-machine device. The event information of a basic event includes, but is not limited to: event ID, page ID, timestamp, and event content. The event ID distinguishes different types of basic events; different event IDs correspond to different basic events. For example, event IDs may be represented as numbers: 1 (page entry event), 2 (page exit event), 3 (page scroll event), 4 (click event), and 5 (exposure event). Of course, the numeric form is only an example; this embodiment does not limit the representation of event IDs.
The page ID is the ID of the page on which the basic event corresponding to the event ID occurred. For example, if the target application is a shopping App, it includes pages such as a home page, a shopping cart page, or a system message page; when the user performs an operation related to a basic event on one of these pages, that page is the page associated with the basic event.
In this embodiment, the user's operation of the App on the real-machine device produces a series of basic events, which form a basic event stream (Basic Events) in chronological order. A page event stream may also be constructed on top of the basic event stream, as follows: whenever a new basic event enters the basic event stream, its event ID is checked to determine whether it is a page entry event; if it is, a new page event is deemed to have occurred, the event ID of the basic event and the ID of the page it belongs to (the page ID) are obtained, the information of the new page event is constructed in the form {page ID: [event ID]}, and it is added to the page event stream. From {page ID: [event ID]}, the basic events (by event ID) that occurred on the page (by page ID) can be known. The page event stream contains multiple page events in chronological order; different page events have different page IDs, i.e. each page event is identified by its page ID, and a page event may include one or more basic events.
It should be noted that whenever a new basic event enters the basic event stream, it may be checked whether its page ID is the same as the page ID of the page event most recently added to the page event stream; if so, the event ID of the new basic event is appended to the information of that page event. As more basic events join the same page event, the event IDs of multiple basic events become associated under the same page ID, e.g. {page ID: [event ID, ..., event ID]}. When a page exit event is detected on the same page, the page event corresponding to that page ID ends; the basic events subordinate to one page event thus comprise all the basic events from entering the page to exiting it.
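A minimal sketch of this page-event-stream maintenance, using the numeric event IDs from the example above (1 = page entry, 2 = page exit; the stream structures are illustrative assumptions):

```python
PAGE_ENTER, PAGE_EXIT = 1, 2

basic_event_stream = []   # chronological basic events
page_event_stream = []    # each item: {"page_id": ..., "event_ids": [...], "open": bool}

def on_basic_event(event_id, page_id):
    basic_event_stream.append({"event_id": event_id, "page_id": page_id})
    if event_id == PAGE_ENTER:
        # a new page event starts, in the form {page ID: [event ID]}
        page_event_stream.append({"page_id": page_id, "event_ids": [event_id], "open": True})
        return
    last = page_event_stream[-1] if page_event_stream else None
    if last and last["open"] and last["page_id"] == page_id:
        last["event_ids"].append(event_id)      # same page: keep accumulating event IDs
        if event_id == PAGE_EXIT:
            last["open"] = False                # a page exit event ends the page event

for ev, page in [(1, "home"), (5, "home"), (4, "home"), (2, "home"), (1, "cart")]:
    on_basic_event(ev, page)
print(page_event_stream)
```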
In this embodiment, the various data processing requirements of the target application on the mobile device correspond to multiple different machine learning tasks, and each machine learning task corresponds to a trigger event sequence. To manage task triggering efficiently, a tree structure may be used to organize each machine learning task of the target application together with its trigger event sequence; for ease of understanding and distinction, this tree is called the task tree. The task tree includes a root node, event nodes, and leaf nodes, and a path from the root node to a leaf node passes through one or more event nodes. A task tree has exactly one root node, while the numbers of event nodes and leaf nodes may each be one or more. Any path consisting of the root node, several event nodes, and a leaf node uniquely corresponds to one trigger condition, i.e. one trigger event sequence: each event node on the path is associated with one trigger event of the condition, and the node identifier of the event node contains the identification information of that trigger event. The node identifier of the root node marks the start of the trigger condition, so the root can be regarded as the start node of every condition; the node identifier of a leaf node marks the end of the trigger condition, so the leaf is its end node, and the machine learning tasks satisfying the condition, of which there may be one or more, are stored in association with the leaf node. Further, for different trigger conditions that share one or more identical initial trigger events, the event nodes on the shared prefix of their paths are common event nodes of those conditions. In an optional embodiment, the task tree may be, but is not limited to, a dictionary tree (trie), a binary tree, or a Huffman tree.
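A minimal dictionary-tree (trie) sketch of such a task tree (the node layout and task payloads are illustrative assumptions); note how `setdefault` both reuses a shared prefix and creates a new subtree when needed:

```python
class TaskTreeNode:
    def __init__(self):
        self.children = {}   # trigger event ID -> child event node
        self.tasks = []      # machine learning tasks ending at this node (leaf role)

def add_task(root, trigger_sequence, task):
    """Insert a trigger event sequence; shared prefixes reuse existing event nodes."""
    node = root
    for event_id in trigger_sequence:
        node = node.children.setdefault(event_id, TaskTreeNode())
    node.tasks.append(task)   # leaf node associated with the task

root = TaskTreeNode()
add_task(root, [1, 5], "recommend_model")         # page entry -> exposure
add_task(root, [1, 5, 4], "click_predict_model")  # shares the [1, 5] prefix
print(root.children[1].children[5].tasks)         # ['recommend_model']
```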
In practical applications, changes in data processing requirements may, over time, require updating the trigger conditions of existing machine learning tasks, and new requirements may require deploying new machine learning tasks for the target application. Therefore, further optionally, updating an already-built task tree of the target application is supported. When the trigger condition of an existing machine learning task is updated, the task tree is updated according to the new condition. When a new machine learning task is deployed to the target application, its trigger event sequence is obtained as the sequence to be matched, and its trigger events are matched in order against the trigger events of the event nodes on the task tree. If a target path is matched that corresponds, in order, to the trigger events of the sequence to be matched, the new machine learning task is associated with the leaf node of that target path. If no such target path is matched, the last successfully matched event node is taken as the root of a subtree, a subtree is built for the remaining trigger events of the sequence to be matched, and the new machine learning task is associated with the leaf node of that subtree.
Specifically, before the trigger events of the sequence to be matched are matched in order against the event nodes of the task tree, a start mark and an end mark are added to the head and tail of the sequence respectively: the start mark corresponds to the root node of the task tree, the end mark corresponds to a leaf node, and the trigger events between the two marks correspond to event nodes on the task tree.
For a sequence to be matched with start and end marks added, a graph search algorithm such as Depth First Search (DFS) or Breadth First Search (BFS) may be used to search the task tree and identify whether a target path exists that corresponds, in order, to the trigger events of the sequence; such a target path is formed by the event nodes passed from the root node to a leaf node. When searching, the event nodes on the paths of the task tree are traversed in order starting from the root node. If a target path exists, the trigger condition of the new machine learning task is identical to that of an already-deployed task, and no new subtree needs to be added to the task tree. If no target path exists, the trigger condition of the new task differs from those of the deployed tasks, and a new subtree must be added: its root is the last event node on the task tree successfully matched to a trigger event of the sequence; the trigger events of the sequence after the last successfully matched one become, in order, the trigger events of the event nodes of the subtree; a leaf node is appended to the subtree; and the new machine learning task is associated with that leaf node, completing the creation of the subtree.
In this embodiment, during the run of the target application, the user behavior data produced while the user uses the application may be collected and analyzed to determine whether a current target event, such as a basic event and/or a page event, has occurred. In an optional implementation: in response to an interaction initiated by the user during the run of the target application, the basic event generated by the interaction is acquired, added to the basic event stream, and taken as the current target event, where the basic event is one of the preset event types; and/or it is judged whether the basic event is a page entry event, and if so, a page event is constructed from the identifier of the basic event and the identifier of the page it belongs to, and that page event is taken as the current target event. The preset event types are set according to actual requirements, for example one or more of page entry, page exit, page scroll, click, and exposure events.
In this embodiment, when a target event occurs during the run of the target application, candidate event nodes for the current target event are obtained. The candidates include first-kind and second-kind event nodes: a first-kind event node is a next-level event node of the root node, and a second-kind event node is a next-level event node of an event node matched by the previous target event in the task tree. The first-kind event nodes cannot be empty and are always candidates, while the number of second-kind event nodes may be 0. In an optional embodiment, a static node list may store the first-kind event nodes and a dynamic node list the second-kind event nodes, so that fetching the two lists yields the candidate event nodes for the current target event.
In this embodiment, after the candidate event nodes are acquired, the current target event is matched against the trigger events of the candidates, and at least one candidate whose trigger event matches becomes a target event node. When the candidates come from the static and dynamic node lists, the event nodes in both lists may be traversed in order; the trigger event of the currently traversed event node is matched against the current target event, and if the two match, that event node becomes a target event node for the current target event.
In this embodiment, there may be one or more target event nodes. When a target event node is matched, each of its next-level nodes is examined to judge whether it is a leaf node. If it is a leaf node, the computing container is triggered to execute the at least one machine learning task associated with it, according to the information of the target events matched along the path from the root node to the leaf node. If it is an event node, it is added to a dynamic cache; after all event nodes in the static and dynamic node lists have been traversed, the event nodes in the dynamic cache are assigned to the dynamic node list and the cache is emptied.
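A minimal sketch of one matching round over the candidate nodes, reusing the `TaskTreeNode` and `add_task` from the trie sketch above (the `execute` callback is an illustrative stand-in for triggering the computing container):

```python
root = TaskTreeNode()
add_task(root, [1, 5], "recommend_model")         # page entry -> exposure
add_task(root, [1, 5, 4], "click_predict_model")  # shares the [1, 5] prefix

def match_round(candidates, event_id, execute):
    """One matching round: test the current target event against each candidate."""
    dynamic_cache = []
    for node in candidates:
        child = node.children.get(event_id)       # does a trigger event match?
        if child is None:
            continue
        for task in child.tasks:                  # leaf role: condition fully met
            execute(task)                         # trigger the computing container
        if child.children:                        # event-node role: await next event
            dynamic_cache.append(child)
    return dynamic_cache                          # becomes the new dynamic node list

static_nodes = [root]      # matching against root.children = first-kind event nodes
dynamic_nodes = []         # second-kind event nodes from the previous round
for ev in [1, 5, 4]:       # page entry, then exposure, then click
    dynamic_nodes = match_round(static_nodes + dynamic_nodes, ev,
                                lambda task: print("execute", task))
# prints: execute recommend_model (after event 5), execute click_predict_model (after 4)
```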
The description of the computing container and its working principle is as follows:
In this embodiment, the computing container is a cross-platform computing engine that supports multi-end deployment and consistent end-cloud deployment. It may be implemented based on a tensor computing engine; optionally, the tensor computing engine may be, but is not limited to, the MNN engine. The MNN engine is a lightweight deep learning end-side inference engine that addresses the problem of running deep neural network models for end-side inference, covering optimization, conversion, and inference of such models; it is highly general and high-performing, supporting models from various training frameworks, common deep learning operators, various systems, and computation optimizations such as optimized convolution computation. Back-end (Backends) developers of the MNN engine write multiple sets of code to adapt each platform's hardware, which realizes the cross-platform property; a machine learning task developer then writes the task code only once and can run it across every mobile device and cloud-side server supported by the computing container. In the embodiment of the application, the computing container can thus shield the hardware differences among mobile devices and between mobile devices and cloud-side devices, and can be deployed on either. Optionally, the computing container supporting consistent end-cloud deployment comprises, from top to bottom, a virtual machine (also called a dynamic programming language interpreter), a machine learning library, and a tensor computing engine, on which the machine learning task depends; it executes the machine learning task on top of these three layers according to the trigger event sequence. The machine learning task is written in a dynamic programming language against the library functions of the machine learning library; the machine learning library is built on the model operators provided by the tensor computing engine; and those model operators correspond to multiple back-end (Backends) adaptation layers for adapting multiple hardware resources, which is what gives the computing container its cross-platform deployability. The tensor computing engine provides functions such as geometric computation and semi-automatic search, and a back-end adaptation layer is the software layer that adapts the tensor computing engine to the instruction set architecture of a given hardware resource. Optionally, the adaptable hardware platform instruction set architectures include, but are not limited to: versions of the ARM instruction set architecture, OpenCL, Vulkan, Metal, X86 AVX, CUDA, and so on. The computing container runs on the operating system and hardware resources of the cloud-side device or mobile device where it is located.
Further optionally, the machine learning task is written in the Python language, in which case the virtual machine is a virtual machine for Python; but this is not limiting. The machine learning task can also be written in JavaScript or Ruby, with the virtual machine targeting JavaScript or Ruby accordingly. The virtual machine in this embodiment is in fact an interpreter for a dynamic programming language.
Further optionally, the machine learning task of this embodiment includes at least a model algorithm task, i.e. the work to be completed by the machine learning model. On this basis, the computing container may execute the machine learning task as follows: when the trigger event sequence arrives, the dynamic programming language interpreter interprets the machine learning task into multiple machine instructions, including first machine instructions corresponding to the model algorithm task; the machine instructions are executed in order, and when a first machine instruction is executed, a first library function among the target library functions is called and executed, the first library function being the one called by the first machine instruction to realize the model algorithm task; and when the first library function involves tensor computation, the tensor computing engine is invoked and the target model operator corresponding to the first library function is executed on the target hardware resource pre-adapted for the model algorithm task, the target model operator being the operator corresponding to the first library function among the operators provided by the tensor computing engine.
Further optionally, the machine learning task of this embodiment also includes pre-processing and/or post-processing tasks adapted to the model algorithm task, and the machine instructions further include second machine instructions corresponding to those tasks. On this basis, executing the machine learning task further comprises: when a second machine instruction is executed, calling and executing a second library function provided by the dynamic programming language interpreter to complete the pre-processing and/or post-processing task, the second library function being the one called by the second machine instruction to realize that task.
Further optionally, the computing container provided in this embodiment of the application also exposes a standard API set implemented in a static or dynamic programming language, whose APIs can be called from the dynamic programming language. The standard API set comprises a first class of APIs carried by the dynamic programming language interpreter itself and a second class of APIs exposed by the machine learning library through the interpreter; machine learning tasks are written against these APIs. The first class of APIs implements the pre-processing and/or post-processing logic in the machine learning task, while the second class implements the model algorithm. In other words, a machine learning task uses the first class of APIs for pre-processing and/or post-processing tasks and the second class of APIs for the model algorithm task. On this basis, the computing container executes the machine learning task as follows: upon arrival of a trigger event sequence, the dynamic programming language interpreter interprets the task into a plurality of machine instructions; while executing these instructions in sequence, a second machine instruction (corresponding to a first-class API used by the task) calls a second library function provided by the interpreter to pre-process the trigger event sequence and/or post-process result data produced by model computation, and a first machine instruction (corresponding to a second-class API used by the task) calls a first library function in the machine learning library to run model computation on the pre-processed data; further, when the first library function involves tensor computation, the tensor computing engine is invoked, the target model operator corresponding to the first library function is executed on the hardware resource pre-adapted for the model algorithm task, and the result data of the model algorithm task is returned upward upon completion. In an alternative embodiment, the tensor computing engine is the MNN engine, and accordingly the machine learning library includes at least one of: a machine learning library for model inference built on the MNN engine, a machine learning library for model training, a machine learning library for visual computation, and a machine learning library for matrix operations.
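As an illustration of the two API classes, a machine learning task might be shaped as follows; the `container` object and every method on it are hypothetical stand-ins, with `pre_process`/`post_process` standing for first-class (interpreter-provided) APIs and `load_model`/`forward` for second-class (machine learning library) APIs.

```python
# Hypothetical task script; all API names are illustrative, not actual MNN APIs.
def run_task(event_sequence, container):
    tensor = container.pre_process(event_sequence)   # first-class API: pre-processing
    model = container.load_model("recommend.mnn")    # second-class API (placeholder path)
    scores = model.forward(tensor)                   # second-class API; tensor engine inside
    return container.post_process(scores)            # first-class API: post-processing
```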
Further optionally, pre-adapting a target hardware resource for the model algorithm task means that, before the task is executed, a semi-automatic search algorithm selects the target hardware resource for it from the hardware resources corresponding to the various back-end adaptation layers, based on the model operators the task involves and their input tensor shapes.
Concretely, this pre-adaptation comprises: computing, from the model operators involved in the model algorithm task and their input tensor shapes, the performance parameter of the task when executed on the hardware resource corresponding to each back-end adaptation layer; and then selecting, according to those performance parameters, a hardware resource whose performance parameter meets the requirement as the target hardware resource matched to the model algorithm task.
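The selection logic amounts to a minimum-cost search over backends; below is a compact sketch under the assumption that a per-algorithm loss score can be measured or estimated offline (the `measure_loss` callable is an assumed profiler, not an actual engine API).

```python
# Sketch of the semi-automatic backend search described above.
def backend_cost(backend, operators, measure_loss):
    """Sum, over the task's operators, of the minimum loss score across each
    operator's candidate implementation algorithms on the given backend."""
    return sum(
        min(measure_loss(backend, op, algo) for algo in op.algorithms)
        for op in operators
    )

def pick_backend(backends, operators, measure_loss):
    # The backend whose summed minimal loss scores are smallest wins,
    # i.e. the smaller the performance loss score, the better the performance.
    return min(backends, key=lambda b: backend_cost(b, operators, measure_loss))
```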
Further, computing the performance parameter of the model algorithm task on the hardware resource corresponding to each back-end adaptation layer, from the model operators involved and their input tensor shapes, comprises: for each back-end adaptation layer, executing on its hardware resource the candidate implementation algorithms of every model operator involved in the task, obtaining a performance loss score for each operator under each implementation algorithm; then taking, for each operator, the minimum of its loss scores across implementation algorithms, and taking the sum of these minima as the performance parameter of the task on that back-end adaptation layer. Each model operator has at least one implementation algorithm; execution performance differs across implementation algorithms, and even with the same implementation algorithm the same operator performs differently on different hardware resources. In the embodiment of the application, execution performance is characterized by the performance loss score: the smaller the score, the better the performance. In the embodiment of the application, the model operators provided by the tensor computing engine comprise atomic operators, deformation operators, and compound operators. An atomic operator is a model operator that cannot be split, or that would perform poorly if split; examples include, but are not limited to: unary operators (Unary); binary operators (Binary) such as addition, subtraction, multiplication, and division; reduction operators (Reduce), which reduce the input tensor along a given direction (taking an extremum, summing, averaging, and the like), lowering its dimensionality by one; and the raster operator (Raster), which performs region mapping (that is, data movement) in memory and, through custom settings, can equivalently realize different deformation effects. Deformation operators are numerous, but any operator that only performs memory mapping without computation can be realized by the raster operator. The raster operator realizes the various deformation operators by storing the memory mapping from the input tensor to the output tensor: the mapping is recorded as offset, stride, and size information over memory, and data is accessed through a fixed loop structure inside the raster operator. The raster operator stores this transformation information in regions. Deformation operators include, but are not limited to, transpose, slice, concatenation, permutation, and the like. A compound operator can be decomposed into atomic operators; examples include, but are not limited to: convolution, deconvolution, pooling, inner product, and the like.
Deformation operators and compound operators are formed by combining or splicing atomic operators, and geometric computation is responsible for decomposing/converting them back into atomic operators. The transformation of a deformation operator proceeds as follows: first compute the operator's output shape from its input shape; then, from the input and output shapes and the deformation rule, compute a series of linear memory-transfer regions; copying memory according to these regions is then equivalent to the original deformation operator. For a compound operator, the deformation part is extracted and its transfer regions are computed in the same way, while the remaining operators are decomposed into atomic operators.
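The region-based memory mapping can be made concrete with a small sketch: a region records an offset, strides, and loop sizes for both source and destination, and a fixed triple loop performs the copy. This mirrors the description above rather than MNN's actual C++ structures.

```python
# Illustrative raster copy driven by (offset, stride, size) regions; a sketch
# of the mechanism described above, not MNN's actual implementation.
from dataclasses import dataclass

@dataclass
class View:
    offset: int
    stride: tuple  # three strides, one per loop level

@dataclass
class Region:
    src: View
    dst: View
    size: tuple    # three loop extents

def raster(src_buf, dst_buf, region):
    # Fixed loop structure: walk the region and move data element by element.
    for i in range(region.size[0]):
        for j in range(region.size[1]):
            for k in range(region.size[2]):
                s = (region.src.offset + i * region.src.stride[0]
                     + j * region.src.stride[1] + k * region.src.stride[2])
                d = (region.dst.offset + i * region.dst.stride[0]
                     + j * region.dst.stride[1] + k * region.dst.stride[2])
                dst_buf[d] = src_buf[s]

# Example: transpose of a row-major 2x3 matrix expressed as a single region.
src = [0, 1, 2, 3, 4, 5]   # [[0,1,2],[3,4,5]]
dst = [0] * 6              # will hold the 3x2 transpose [[0,3],[1,4],[2,5]]
raster(src, dst, Region(src=View(0, (1, 3, 0)),
                        dst=View(0, (2, 1, 0)),
                        size=(3, 2, 1)))
assert dst == [0, 3, 1, 4, 2, 5]
```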
In one application scenario, the target application is a shopping APP and the machine learning task is a commodity recommendation model that makes personalized commodity recommendations to the user. On cloud-side resources, a commodity recommendation model adapted to each user is trained in advance and then compressed and converted into an end-side model; its performance is evaluated with the evaluation method provided by this embodiment, and if the evaluation passes, the model is deployed into the shopping APP used by the user. As the user browses, clicks, views commodity details, reads commodity reviews, adds items to the shopping cart, and so on, the data stream processing framework and the computing container collect and generate the corresponding basic events; when the trigger event sequence of the recommendation model is produced, for example when the user clicks into a new page, the computing container is triggered to execute the model. The computing container loads the commodity recommendation model, runs it with the method provided by this embodiment, and finally outputs the target commodities recommended to the user, which the shopping APP displays on the relevant page for the user to view and select.
In another application scenario, the target application is a live-streaming APP and the machine learning task is a tagging model that marks live content. On cloud-side resources, the tagging model is trained in advance and then compressed and converted into an end-side model; its performance is evaluated with the evaluation method provided by this embodiment, and if the evaluation passes, the model is deployed into the live-streaming APP used by the anchor, which contains the data stream processing framework and computing container provided by this embodiment. As the live session proceeds, live content, comprising the live picture and live speech, is continuously generated; the data stream processing framework continuously processes the events of the session and produces the trigger event sequence of the tagging model, for example triggering the computing container to execute the model when the anchor pauses. The computing container loads the tagging model, runs it with the method provided by this embodiment, adds marks to the key parts of the live content, and outputs the live content with the mark information so that the live-streaming server can process it further. The marked content may be private information, important information, specific commodity information, and the like.
It should be noted that the execution subjects of the steps of the methods provided in the above embodiments may be the same device, or different devices may be used as the execution subjects of the methods. For example, the execution subjects of steps 101 to 104 may be device a; for another example, the execution subject of steps 101, 102, and 103 may be device a, and the execution subject of step 104 may be device B; and so on.
In addition, some of the flows described in the above embodiments and drawings contain multiple operations in a specific order, but it should be clearly understood that these operations may be executed out of the order presented here or in parallel. Sequence numbers such as 101 and 102 merely distinguish different operations and do not by themselves represent any execution order; the flows may also include more or fewer operations, executed sequentially or in parallel. It should also be noted that the words "first", "second", and the like herein distinguish different messages, devices, modules, and so on; they represent neither a sequential order nor a requirement that the "first" and "second" items be of different types.
Fig. 5 is a schematic structural diagram of a simulation evaluating apparatus for a mobile terminal running algorithm model according to an embodiment of the present application. As shown in fig. 5, the apparatus may include:
The obtaining module 51 is configured to obtain, in response to a model evaluation request, the model to be evaluated and the description information of the multiple device evaluation environments corresponding to it, the model to be evaluated being a machine learning model obtained by compressing and converting an original machine learning model.
The determining module 54 is configured to determine, according to the description information of the multiple device evaluation environments, a first class of device evaluation environment to be simulated and a second class of device evaluation environment based on real machine devices.
The building module 52 is configured to simulate, on target cloud-side resources, a cloud evaluation environment corresponding to the first class of device evaluation environment, and to deploy a software operating environment in the hardware operating environment provided by target real machine devices to obtain the second class of device evaluation environment; the cloud evaluation environment and its corresponding first class of device evaluation environment comprise the same hardware and software operating environments.
The joint evaluation module 53 is configured to perform joint evaluation on the model to be evaluated according to the second class of device evaluation environment and the cloud evaluation environment, to obtain a performance evaluation result of the model to be evaluated.
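Read as code, the apparatus is a thin orchestration layer over the four modules; the following schematic sketch uses invented names and stands in for Fig. 5 rather than reproducing any actual implementation.

```python
# Schematic sketch of the evaluating apparatus of Fig. 5; names are illustrative.
class SimulationEvaluator:
    def __init__(self, obtaining, determining, building, joint_eval):
        self.obtaining = obtaining      # module 51
        self.determining = determining  # module 54
        self.building = building        # module 52
        self.joint_eval = joint_eval    # module 53

    def evaluate(self, request):
        model, env_descriptions = self.obtaining.obtain(request)
        simulated, real = self.determining.split(env_descriptions)
        cloud_envs = self.building.simulate_on_cloud(simulated)
        device_envs = self.building.deploy_on_real_devices(real)
        return self.joint_eval.run(model, device_envs, cloud_envs)
```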
Further optionally, where the description information of each device evaluation environment includes hardware description information and software description information, the determining module 54 is specifically configured to: select, according to the hardware description information and/or software description information of the multiple device evaluation environments and in combination with a set quantity ratio, at least two device evaluation environments as the first class of device evaluation environments to be simulated, and take the remaining device evaluation environments as the second class of device evaluation environments based on real machine devices.
Further optionally, the determining module 54 is configured to: where the software description information of the device evaluation environments is the same, determine the types and occupancies of the hardware operating resources required by the various device evaluation environments according to their hardware description information; and select, according to those resource types and occupancies and in combination with a set quantity ratio, at least two device evaluation environments as the first class of device evaluation environments to be simulated.
When simulating, on the target cloud-side resources, the cloud evaluation environment corresponding to the first class of device evaluation environment, the building module 52 is specifically configured to: select, from the cloud-side resources and according to the hardware description information of the first class of device evaluation environment, cloud-side resources whose hardware operating environment is the same as that required by the first class of device evaluation environment, as the target cloud-side resources; and construct, on the target cloud-side resources and according to the software description information of the first class of device evaluation environment, the software operating environment it requires, thereby obtaining the cloud evaluation environment corresponding to the first class of device evaluation environment.
Further optionally, when constructing, according to the software description information of the first class of device evaluation environment, the required software operating environment on the target cloud-side resources to obtain the corresponding cloud evaluation environment, the building module 52 is specifically configured to: acquire, according to that software description information, the operating system required by the first class of device evaluation environment and the end-side machine learning inference engine adapted to the model to be evaluated; and, according to the required operating system, construct a target container on the target cloud-side resources by virtualization technology and deploy the end-side machine learning inference engine in the target container, obtaining the cloud evaluation environment corresponding to the first class of device evaluation environment.
Further optionally, when constructing, by virtualization technology, the target container on the target cloud-side resources according to the operating system required by the first class of device evaluation environment, the building module 52 is specifically configured to: if that operating system is the same as the operating system of the corresponding target cloud-side resources, construct on those resources a first container that depends on the host operating system, as the target container; if the two operating systems differ, construct on the target cloud-side resources a second container that carries its own operating system, as the target container, the operating system of the second container being the same as that required by the first class of device evaluation environment.
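The OS-matching rule can be written down directly; here the `runtime` object and its two methods are assumed wrappers around a real container runtime (for example, a plain Linux container versus one that boots the target OS, such as an Android emulator image), not an actual Docker or Kubernetes API.

```python
# Sketch of the target-container decision; helper names are assumptions.
def build_target_container(required_os, cloud_host_os, runtime):
    if required_os == cloud_host_os:
        # First container: shares the host OS (e.g., a Linux container on a
        # Linux cloud host) and only packages the inference engine.
        return runtime.native_container(image="inference-engine")
    # Second container: carries its own operating system matching the device
    # environment, e.g., an Android emulator image for Android targets.
    return runtime.os_container(os_image=required_os, payload="inference-engine")
```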
Further optionally, the end-side machine learning inference engines required by the first class of device evaluation environments are the same, namely the MNN engine.
Further optionally, when deploying a software operating environment in the hardware operating environment provided by target real machine devices to obtain the second class of device evaluation environment, the building module 52 is specifically configured to: determine, according to the description information of the second class of device evaluation environment, the real machine devices able to provide the hardware operating environment it requires; and divide those real machine devices into multiple classes according to their multi-dimensional attribute information, selecting some devices from each class as the target real machine devices.
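A hedged sketch of this grouping-and-sampling step follows, under the assumption that each candidate device exposes attributes such as SoC, RAM, and OS version (the attribute names are illustrative).

```python
# Sketch: cluster candidate real devices by multi-dimensional attributes and
# sample representatives from each cluster as target devices.
import random
from collections import defaultdict

def pick_target_devices(devices, per_group=2):
    groups = defaultdict(list)
    for dev in devices:
        # Attribute keys (soc, ram_gb, os_version) are assumed for illustration.
        groups[(dev["soc"], dev["ram_gb"], dev["os_version"])].append(dev)
    targets = []
    for members in groups.values():
        targets.extend(random.sample(members, min(per_group, len(members))))
    return targets
```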
Further optionally, when performing joint evaluation on the model to be evaluated according to the second class of device evaluation environment and the cloud evaluation environment, the joint evaluation module 53 is specifically configured to: acquire, according to the evaluation task identifier corresponding to the model to be evaluated, target evaluation data matched to the model from the evaluation data management end; run the model on the target evaluation data in the second class of device evaluation environment and in the cloud evaluation environment respectively, obtaining a real-machine evaluation result and a simulation evaluation result matched to the performance evaluation index items; and perform joint analysis on the real-machine evaluation result and the simulation evaluation result according to the performance evaluation index items to obtain the performance evaluation result of the model to be evaluated.
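The joint-evaluation flow reduces to running the same model and data in both environment sets and merging per-index results; a sketch with assumed helper callables (`fetch_eval_data`, `run_in`) and illustrative index names:

```python
# Sketch of joint evaluation; helper callables and index names are assumptions.
def joint_evaluate(model, task_id, device_envs, cloud_envs, index_items,
                   fetch_eval_data, run_in):
    # fetch_eval_data pulls the matched evaluation data from the data
    # management end by task id; run_in runs the model in one environment
    # and returns a dict of per-index values.
    data = fetch_eval_data(task_id)
    real = [run_in(env, model, data, index_items) for env in device_envs]
    sim = [run_in(env, model, data, index_items) for env in cloud_envs]
    report = {}
    for item in index_items:  # e.g., latency, memory, accuracy
        report[item] = {
            "real_mean": sum(r[item] for r in real) / len(real),
            "sim_mean": sum(s[item] for s in sim) / len(sim),
        }
    return report
```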
Further optionally, the apparatus further comprises a processing module configured to perform model training with a machine learning framework suited to the cloud side to obtain the original machine learning model, and to compress and convert the original machine learning model according to a machine learning framework suited to the end side to obtain the model to be evaluated.
Further optionally, the joint evaluation module 53 is further configured to: deploy the model to be evaluated to real machine devices running the target application when the performance evaluation result of the model indicates that the evaluation passes, the target application being the application that needs to use the model. Correspondingly, the real machine device is further configured to: generate the trigger event sequence of the model according to the information of target events produced while the target application runs; and trigger the computing container deployed on the real machine device to execute the model according to the trigger event sequence.
The simulation evaluating apparatus for models running on the mobile end shown in fig. 5 can execute the simulation evaluating method shown in the embodiment of fig. 1; its implementation principles and technical effects are similar and are not described again. The specific manner in which each module and unit of the apparatus in fig. 5 performs its operations has been described in detail in the method embodiments and will not be elaborated here.
Fig. 6 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 6, the apparatus includes: a memory 61 and a processor 62.
The memory 61 is used for storing computer programs and may be configured to store other various data to support operations on the computer device. Examples of such data include instructions for any application or method operating on the computer device, contact data, phonebook data, messages, pictures, videos, and the like.
The memory 61 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
A processor 62, coupled to the memory 61, is configured to execute the computer program in the memory 61 to: acquire, in response to a model evaluation request, the model to be evaluated and the description information of the multiple device evaluation environments corresponding to it, the model to be evaluated being a machine learning model obtained by compressing and converting an original machine learning model; determine, according to the description information of the multiple device evaluation environments, a first class of device evaluation environment to be simulated and a second class of device evaluation environment based on real machine devices; simulate, on target cloud-side resources, a cloud evaluation environment corresponding to the first class of device evaluation environment, and deploy a software operating environment in the hardware operating environment provided by target real machine devices to obtain the second class of device evaluation environment, the cloud evaluation environment and its corresponding first class of device evaluation environment comprising the same hardware and software operating environments; and perform joint evaluation on the model to be evaluated according to the second class of device evaluation environment and the cloud evaluation environment to obtain a performance evaluation result of the model to be evaluated.
Further optionally, the description information of each device evaluation environment includes hardware description information and software description information, and when determining the first class of device evaluation environment to be simulated and the second class of device evaluation environment based on real machine devices, the processor 62 is specifically configured to: select, according to the hardware description information and/or software description information of the multiple device evaluation environments and in combination with a set quantity ratio, at least two device evaluation environments as the first class of device evaluation environments to be simulated, and take the remaining device evaluation environments as the second class of device evaluation environments based on real machine devices.
Further optionally, when selecting at least two device evaluation environments as the first class of device evaluation environments to be simulated, the processor 62 is specifically configured to: where the software description information of the device evaluation environments is the same, determine the types and occupancies of the hardware operating resources required by the various device evaluation environments according to their hardware description information; and select, according to those resource types and occupancies and in combination with a set quantity ratio, at least two device evaluation environments as the first class of device evaluation environments to be simulated.
Further optionally, the description information of each device evaluation environment includes hardware description information and software description information, and when simulating, on the target cloud-side resources, the cloud evaluation environment corresponding to the first class of device evaluation environment, the processor 62 is specifically configured to: select, from the cloud-side resources and according to the hardware description information of the first class of device evaluation environment, cloud-side resources whose hardware operating environment is the same as that required, as the target cloud-side resources; and construct, on the target cloud-side resources and according to the software description information of the first class of device evaluation environment, the software operating environment it requires, obtaining the corresponding cloud evaluation environment.
Further optionally, when obtaining the cloud evaluation environment corresponding to the first class of device evaluation environment, the processor 62 is specifically configured to: acquire, according to the software description information of the first class of device evaluation environment, the operating system it requires and the end-side machine learning inference engine adapted to the model to be evaluated; and, according to the required operating system, construct a target container on the target cloud-side resources by virtualization technology and deploy the end-side machine learning inference engine in the target container, obtaining the cloud evaluation environment corresponding to the first class of device evaluation environment.
Further optionally, when constructing the target container on the target cloud-side resources by virtualization technology, the processor 62 is specifically configured to: if the operating system required by the first class of device evaluation environment is the same as that of the corresponding target cloud-side resources, construct on those resources a first container that depends on the host operating system, as the target container; if the two operating systems differ, construct on the target cloud-side resources a second container that carries its own operating system, as the target container, the operating system of the second container being the same as that required by the first class of device evaluation environment.
Further optionally, before obtaining the second class of device evaluation environment, the processor 62 is further configured to: determine, according to the description information of the second class of device evaluation environment, the real machine devices able to provide its hardware operating environment; and divide those devices into multiple classes according to their multi-dimensional attribute information, selecting some devices from each class as the target real machine devices.
Further optionally, when performing joint evaluation on the model to be evaluated according to the second class of device evaluation environment and the cloud evaluation environment, the processor 62 is specifically configured to: acquire, according to the evaluation task identifier corresponding to the model to be evaluated, target evaluation data matched to the model from the evaluation data management end; run the model on the target evaluation data in the second class of device evaluation environment and in the cloud evaluation environment respectively, obtaining a real-machine evaluation result and a simulation evaluation result matched to the performance evaluation index items; and perform joint analysis on the real-machine evaluation result and the simulation evaluation result according to the performance evaluation index items to obtain the performance evaluation result of the model to be evaluated.
Further optionally, the processor 62 is further configured to perform model training with a machine learning framework suited to the cloud side to obtain the original machine learning model, and to compress and convert the original machine learning model according to a machine learning framework suited to the end side to obtain the model to be evaluated.
Further optionally, the processor 62 is further configured to deploy the model to be evaluated to real machine devices running the target application when the performance evaluation result of the model indicates that the evaluation passes, the target application being the application that needs to use the model. Correspondingly, the real machine device is further configured to: generate the trigger event sequence of the model according to the information of target events produced while the target application runs; and trigger the computing container deployed on the real machine device to execute the model according to the trigger event sequence.
Further, as shown in fig. 6, the computer device further includes: communication components 63, a display 64, power components 65, audio components 66, and the like. Only some components are shown schematically in fig. 6, which does not mean that the computer device includes only them. The components within the dashed box in fig. 6 are optional rather than mandatory, depending on the product form of the computer device. The computer device of this embodiment may be implemented as a terminal device such as a desktop computer, a notebook computer, a smartphone, or an IoT device, or as a server device such as a conventional server, a cloud server, or a server array. If implemented as a terminal device such as a desktop computer, a notebook computer, or a smartphone, it may include the components within the dashed box in fig. 6; if implemented as a server device such as a conventional server, a cloud server, or a server array, it may omit them.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program can implement the steps in the above method embodiments when executed.
Accordingly, the present application also provides a computer program product, which includes a computer program/instruction, when the computer program/instruction is executed by a processor, the processor is enabled to implement the steps in the above method embodiments.
The communication component is configured to facilitate wired or wireless communication between its host device and other devices. The host device may access a wireless network based on a communication standard, such as WiFi, a 2G, 3G, 4G/LTE, or 5G mobile communication network, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component further includes a Near Field Communication (NFC) module to facilitate short-range communication; for example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The display includes a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The power supply assembly provides power for various components of the device in which the power supply assembly is located. The power components may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power component is located.
The audio component may be configured to output and/or input an audio signal. For example, the audio component includes a Microphone (MIC) configured to receive an external audio signal when the device in which the audio component is located is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in a memory or transmitted via a communication component. In some embodiments, the audio assembly further comprises a speaker for outputting audio signals.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises", "comprising", or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The above are merely examples of the present application and are not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (13)

1. A simulation evaluation method for a mobile terminal operation algorithm model is characterized by comprising the following steps:
responding to a model evaluation request, and acquiring a model to be evaluated and description information of various equipment evaluation environments corresponding to the model to be evaluated, wherein the model to be evaluated is a machine learning model obtained by compressing and converting an original machine learning model;
determining a first type of equipment evaluation environment to be simulated and a second type of equipment evaluation environment based on the real machine equipment according to the description information of the multiple types of equipment evaluation environments;
simulating a cloud evaluation environment corresponding to the first equipment evaluation environment on a target cloud side resource, and deploying a software operation environment in a hardware operation environment provided by a target real machine equipment to obtain a second equipment evaluation environment, wherein the cloud evaluation environment and the first equipment evaluation environment corresponding to the cloud evaluation environment comprise the same hardware operation environment and software operation environment;
and performing combined evaluation on the model to be evaluated according to the second-class equipment evaluation environment and the cloud evaluation environment to obtain a performance evaluation result of the model to be evaluated.
2. The method according to claim 1, wherein the description information of each equipment evaluation environment includes hardware description information and software description information, and then determining a first type of equipment evaluation environment to be simulated and a second type of equipment evaluation environment based on the genuine equipment according to the description information of the plurality of equipment evaluation environments comprises:
and selecting at least two equipment evaluation environments as a first type of equipment evaluation environment to be simulated according to the hardware description information and/or the software description information of the multiple equipment evaluation environments and combining a set quantity ratio, and taking the remaining equipment evaluation environments as the second type of equipment evaluation environments based on the real machine equipment.
3. The method according to claim 2, wherein selecting at least two equipment evaluation environments as the first type of equipment evaluation environment to be simulated according to the hardware description information and/or the software description information of the plurality of equipment evaluation environments in combination with a set quantity ratio comprises:
under the condition that the software description information of each equipment evaluation environment is the same, determining the type and the resource occupation amount of the hardware operation environment required by the various equipment evaluation environments according to the hardware description information of the various equipment evaluation environments;
and selecting at least two equipment evaluation environments as a first type of equipment evaluation environment to be simulated according to the resource types and the resource occupancy of the hardware operating environment required by the multiple equipment evaluation environments and by combining a set quantity proportion.
4. The method according to claim 1, wherein the description information of the first-class equipment evaluation environment includes hardware description information and software description information, and then the simulation of the cloud evaluation environment corresponding to the first-class equipment evaluation environment on the target cloud-side resource includes:
according to the hardware description information of the first equipment evaluation environment, cloud side resources with the hardware operating environment the same as that required by the first equipment evaluation environment are selected from the cloud side resources as target cloud side resources;
and according to the software description information of the first equipment evaluation environment, constructing a software running environment required by the first equipment evaluation environment on the target cloud side resource to obtain a cloud evaluation environment corresponding to the first equipment evaluation environment.
5. The method according to claim 4, wherein the step of constructing a software operating environment required by the first-class equipment evaluation environment on the target cloud-side resource according to the software description information of the first-class equipment evaluation environment to obtain a cloud evaluation environment corresponding to the first-class equipment evaluation environment comprises:
acquiring an operating system required by the first equipment evaluation environment and an end-side machine learning inference engine adapted to the model to be evaluated according to the software description information of the first equipment evaluation environment;
according to the operating system required by the first equipment evaluation environment, a target container is constructed on the target cloud side resources by adopting a virtualization technology, and the end side machine learning inference engine is deployed in the target container to obtain a cloud evaluation environment corresponding to the first equipment evaluation environment.
6. The method according to claim 5, wherein building a target container on the target cloud-side resource by using a virtualization technology according to the operating system required by the first equipment evaluation environment comprises:
if the operating system required by the first equipment evaluation environment is the same as the operating system of the corresponding target cloud side resource, constructing a first container depending on a host operating system on the target cloud side resource as the target container;
if the operating system required by the first equipment evaluation environment is different from the operating system corresponding to the target cloud side resource, a second container with the operating system is built on the target cloud side resource and serves as the target container, and the operating system of the second container is the same as the operating system required by the first equipment evaluation environment.
7. The method according to claim 1, wherein before deploying the software operating environment in the hardware operating environment provided by the target genuine machine device and obtaining the second type of device evaluation environment, the method further comprises:
according to the description information of the second type equipment evaluation environment, real machine equipment which can provide the hardware operation environment required by the second type equipment evaluation environment is determined;
and dividing the real machine equipment into multiple types according to the multi-dimensional attribute information of the real machine equipment, and selecting part of real machine equipment from each type of real machine equipment as target real machine equipment.
8. The method according to claim 1, wherein the joint evaluation of the model to be evaluated according to the second-class equipment evaluation environment and the cloud evaluation environment is performed to obtain a performance evaluation result of the model to be evaluated, and the method comprises the following steps:
acquiring target evaluation data matched with the model to be evaluated from an evaluation data management end according to the evaluation task identifier corresponding to the model to be evaluated;
according to the target evaluation data, the model to be evaluated is operated in the second class of equipment evaluation environment and the cloud evaluation environment respectively to obtain a real machine evaluation result and a simulation evaluation result which are matched with the performance evaluation index item;
and performing combined analysis on the genuine machine evaluation result and the simulation evaluation result according to the performance evaluation index item to obtain a performance evaluation result of the model to be evaluated.
9. The method according to any one of claims 1-8, further comprising:
performing model training by adopting a machine learning framework suitable for the cloud side to obtain an original machine learning model;
and compressing and converting the original machine learning model according to a machine learning framework suitable for the end side to obtain a model to be evaluated.
10. The method according to any one of claims 1-8, further comprising:
under the condition that the performance evaluation result of the model to be evaluated shows that the evaluation passes, deploying the model to be evaluated to a real machine device running a target application; and
generating a trigger event sequence of the model to be evaluated according to the information of the target event generated in the running process of the target application;
and triggering a computing container deployed on the real machine equipment to execute the model to be evaluated according to the trigger event sequence.
11. A simulation evaluating device for a mobile terminal operation algorithm model is characterized by comprising:
the device comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for responding to a model evaluation request and acquiring a model to be evaluated and description information of various equipment evaluation environments corresponding to the model to be evaluated, and the model to be evaluated is a machine learning model obtained by compressing and converting an original machine learning model;
the determining module is used for determining a first equipment evaluating environment to be simulated and a second equipment evaluating environment based on the genuine equipment according to the description information of the multiple equipment evaluating environments;
the building module is used for simulating a cloud evaluation environment corresponding to the first equipment evaluation environment on a target cloud side resource, deploying a software operation environment in a hardware operation environment provided by a target real machine device to obtain a second equipment evaluation environment, and the cloud evaluation environment and the corresponding first equipment evaluation environment comprise the same hardware operation environment and software operation environment;
and the joint evaluation module is used for performing joint evaluation on the model to be evaluated according to the second-class equipment evaluation environment and the cloud evaluation environment so as to obtain a performance evaluation result of the model to be evaluated.
12. A computer device, comprising: a memory and a processor; the memory for storing a computer program, the processor being coupled to the memory for executing the computer program for implementing the steps of the method of any of claims 1-10.
13. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 10.
CN202210126304.7A 2022-02-10 2022-02-10 Simulation evaluation method and device for mobile terminal operation algorithm model Active CN114168446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210126304.7A CN114168446B (en) 2022-02-10 2022-02-10 Simulation evaluation method and device for mobile terminal operation algorithm model


Publications (2)

Publication Number Publication Date
CN114168446A true CN114168446A (en) 2022-03-11
CN114168446B CN114168446B (en) 2022-07-22

Family

ID=80489773

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210126304.7A Active CN114168446B (en) 2022-02-10 2022-02-10 Simulation evaluation method and device for mobile terminal operation algorithm model

Country Status (1)

Country Link
CN (1) CN114168446B (en)



Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180267833A1 (en) * 2017-03-16 2018-09-20 Quanta Computer Inc. Management systems of cloud resources and management methods thereof
CN109146081A (en) * 2017-06-27 2019-01-04 阿里巴巴集团控股有限公司 It is a kind of for quickly creating the method and device of model item in machine learning platform
US20190146810A1 (en) * 2017-11-13 2019-05-16 International Business Machines Corporation Automated deployment and performance evaluation of a virtualized-computing environment
CN111712795A (en) * 2018-05-23 2020-09-25 西门子股份公司 Method, apparatus, computer program product and readable medium for evaluating application deployment
US20210248056A1 (en) * 2018-05-23 2021-08-12 Siemens Aktiengesellschaft Method for evaluating application deployment, apparatus, computer program product, and readable medium
CN110705684A (en) * 2019-08-22 2020-01-17 中国科学院计算技术研究所 Environment-adaptive learning method and system based on end-cloud collaboration
WO2021077814A1 (en) * 2019-10-23 2021-04-29 支付宝(杭州)信息技术有限公司 Push model optimization method and device executed by user terminal
US20210256422A1 (en) * 2020-02-19 2021-08-19 Google Llc Predicting Machine-Learned Model Performance from the Parameter Values of the Model
CN112699016A (en) * 2021-01-04 2021-04-23 鹏城实验室 Cloud platform performance evaluation method, device, equipment and computer readable storage medium
CN113704082A (en) * 2021-02-26 2021-11-26 腾讯科技(深圳)有限公司 Model evaluation method and device, electronic equipment and storage medium
CN112990481A (en) * 2021-03-12 2021-06-18 北京航空航天大学 Automatic evaluation method for machine learning models based on blockchain

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
HONGTAO LV: "Data-Free Evaluation of User Contributions in Federated Learning", 2021 19th International Symposium on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks (WiOpt) *
WU-JIU SUN: "Inferring Relationship Semantics in Social Networks with Dual-View Features Semi-Supervised Learning", 2019 IEEE International Symposium on Circuits and Systems (ISCAS) *
YANG Tingting et al.: "Research on Evaluation Tools for Mobile AI Chip Capabilities" (in Chinese), Guangdong Communication Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116361150A (en) * 2023-02-22 2023-06-30 广州汽车集团股份有限公司 Method, device and system for testing vehicle in whole vehicle simulation environment
CN116361150B (en) * 2023-02-22 2024-03-01 广州汽车集团股份有限公司 Method, device and system for testing vehicle in whole vehicle simulation environment

Also Published As

Publication number Publication date
CN114168446B (en) 2022-07-22

Similar Documents

Publication Publication Date Title
CN109460514B (en) Method and device for pushing information
KR102103902B1 (en) Component-based machine learning automation device and method
CN110337641A Determining application test results using screenshot capture metadata
CN112036577B Method and device for applying machine learning based on data form, and electronic equipment
CN110442737A Digital twin method and system based on graph database
CN110942086A Data prediction optimization method, device, equipment, and readable storage medium
WO2022088082A1 (en) Task processing method, apparatus and device based on defect detection, and storage medium
CN110188210A Cross-modal data retrieval method and system based on graph regularization and modality independence
CN116502807B (en) Industrial chain analysis application method and device based on scientific and technological knowledge graph
CN114172908B End-cloud cooperative processing method and equipment
CN114168446B (en) Simulation evaluation method and device for mobile terminal operation algorithm model
WO2023160060A1 (en) Model optimization method and apparatus, and electronic device, computer-readable storage medium and computer program product
CN112529100A (en) Training method and device for multi-classification model, electronic equipment and storage medium
CN116738081B (en) Front-end component binding method, device and storage medium
WO2024051146A1 (en) Methods, systems, and computer-readable media for recommending downstream operator
CN112784273A (en) SQL risk identification method, device and equipment
CN112541556A (en) Model construction optimization method, device, medium, and computer program product
WO2022252694A1 (en) Neural network optimization method and apparatus
CN115983377A (en) Automatic learning method, device, computing equipment and medium based on graph neural network
CN111459990B (en) Object processing method, system, computer readable storage medium and computer device
CN114168601A (en) Mobile-end-oriented data stream processing method and equipment
CN114721932A (en) Data processing method, device, equipment and storage medium
CN113255770A (en) Compound attribute prediction model training method and compound attribute prediction method
CN109298831B (en) Information storage method and device
CN116501365B (en) Resource calling method, device and equipment based on algorithm platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant