CN118012468A - Model processing method, system and equipment - Google Patents

Model processing method, system and equipment

Info

Publication number
CN118012468A
CN118012468A (application CN202410411323.3A; granted as CN118012468B)
Authority
CN
China
Prior art keywords
model
target
test
hardware platform
configuration parameters
Prior art date
Legal status
Granted
Application number
CN202410411323.3A
Other languages
Chinese (zh)
Other versions
CN118012468B (en)
Inventor
薛盛可 (Xue Shengke)
Current Assignee
Zhejiang Shenxiang Intelligent Technology Co ltd
Original Assignee
Zhejiang Shenxiang Intelligent Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Shenxiang Intelligent Technology Co., Ltd.
Priority to CN202410411323.3A
Publication of CN118012468A
Application granted; publication of CN118012468B
Legal status: Active


Classifications

    • G: Physics
    • G06: Computing; Calculating or Counting
    • G06F: Electric Digital Data Processing
    • G06F 8/00: Arrangements for software engineering
    • G06F 8/60: Software deployment
    • G06F 8/65: Updates
    • G06F 8/70: Software maintenance or management
    • G06F 8/71: Version control; Configuration management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The embodiment of the application provides a model processing method, system, and device in the field of computer technology. The method comprises the following steps: acquiring target configuration parameters from a remote code repository, where the remote code repository stores processing data corresponding to at least one hardware platform and the target configuration parameters include identification information corresponding to a target hardware platform; generating a model processing task corresponding to the target hardware platform according to the target processing data corresponding to the target hardware platform and the target configuration parameters; and acquiring a model to be processed and executing the model processing task to process it. Embodiments of the application can effectively reduce the difficulty of quantizing and testing AI models and improve the efficiency of AI model quantization and testing.

Description

Model processing method, system and equipment
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method, a system, and an apparatus for processing a model.
Background
When an artificial intelligence (AI) model is to be deployed to hardware platforms with limited resources, such as mobile devices, embedded systems, and various Internet-of-Things terminals, practical challenges such as insufficient storage space and limited computing power arise.
To address these problems, quantization techniques have been developed. They achieve the goals of compressing model volume and improving model operation efficiency mainly by converting the complex floating-point weights and activation values inside an AI model into low-precision data types (such as integers or binary formats).
At present, deploying AI models to different hardware platforms during development is a tedious and time-consuming task: each model must be processed with quantization techniques specific to its target platform, and after quantization, performance tests may need to be performed on each target hardware platform one by one to verify the model's real performance. This results in low AI-model processing efficiency.
Disclosure of Invention
The application provides, in several aspects, a model processing method, system, and device that can effectively improve the processing efficiency of AI models.
In a first aspect, an embodiment of the present application provides a method for processing a model, including:
Acquiring target configuration parameters from a remote code repository; the remote code repository stores processing data corresponding to at least one hardware platform, and the target configuration parameters comprise identification information corresponding to the target hardware platform;
generating a model processing task corresponding to the target hardware platform according to the target processing data corresponding to the target hardware platform and the target configuration parameters;
and acquiring a model to be processed, and executing the model processing task to process the model to be processed.
In one possible implementation, the obtaining the target configuration parameters in the remote code repository includes:
detecting whether newly created or updated target configuration parameters exist in the remote code repository;
and when newly created or updated target configuration parameters exist in the remote code repository, acquiring the target configuration parameters.
In a possible implementation manner, the target configuration parameter includes a to-be-processed model path, and the obtaining the to-be-processed model and executing the model processing task includes:
Starting a Docker container; the Docker container includes the dependencies required to perform the model processing task;
and acquiring the model to be processed according to the model path to be processed.
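By way of illustration only, the step of launching a dependency-bundled Docker container and pointing it at the to-be-processed model path might be assembled as follows. The image name, script path, and mount layout are hypothetical assumptions for this sketch, not part of the patent:

```python
import shlex

def build_docker_run_command(image, host_model_dir, model_filename, task_script):
    # Mount the host directory holding the model so the container, which
    # bundles the task's dependencies, can read the model to be processed.
    return [
        "docker", "run", "--rm",
        "-v", f"{host_model_dir}:/workspace",
        image,
        "python", task_script,
        "--model", f"/workspace/{model_filename}",
    ]

cmd = build_docker_run_command(
    image="quant-toolchain:latest",        # hypothetical image with dependencies
    host_model_dir="/data/models",
    model_filename="resnet18.onnx",
    task_script="/opt/tasks/process_model.py",
)
print(shlex.join(cmd))
```

Building the argument list rather than a shell string avoids quoting pitfalls when the command is later handed to a process runner.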
In a possible implementation manner, the target processing data includes quantization processing data corresponding to the target hardware platform, the target configuration parameters further include quantization configuration parameters corresponding to the target hardware platform, and the quantization configuration parameters include at least one of a quantization data path and a quantization precision;
The generating a model processing task corresponding to the target hardware platform according to the target processing data corresponding to the target hardware platform and the target configuration parameters comprises the following steps:
And generating a model quantization task according to the quantization configuration parameters and the quantization processing data.
In one possible embodiment, the method further comprises:
Determining whether the model quantization task is successfully executed;
when the model quantization task is successfully executed, uploading the quantized model to a preset storage device;
outputting a reminder message when the model quantization task fails to execute; the reminder message is used to indicate that quantization of the model to be processed has failed.
In one possible embodiment, the method further comprises:
when the model quantization task is successfully executed, determining whether to test the quantized model;
and when the quantized model is determined to be tested, sending a test instruction to an automatic test device corresponding to the target hardware platform.
In a possible implementation manner, the target processing data includes test processing data corresponding to the target hardware platform, the target configuration parameters further include test configuration parameters corresponding to the target hardware platform, and the test configuration parameters include at least one of a test data path, a test file path and a test result storage location;
The generating a model processing task corresponding to the target hardware platform according to the target processing data corresponding to the target hardware platform and the target configuration parameters comprises the following steps:
And when a test instruction is received, generating a model test task corresponding to the target hardware platform according to the test configuration parameters and the test processing data.
In a possible implementation manner, the executing the model processing task includes:
Determining whether the local hardware resource supports running the model to be processed;
And if the local hardware resources do not support running the model to be processed, the model to be processed and the model test task are mounted on a remote terminal, and the model test task is executed by the remote terminal.
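A minimal sketch of this local-versus-remote decision, assuming resource requirements and local capabilities are expressed as simple dictionaries (the field names are illustrative, not from the patent):

```python
def choose_execution_target(model_requirements, local_resources):
    # If any required resource is missing or insufficient locally, the model
    # to be processed and its test task are dispatched to a remote terminal.
    for resource, needed in model_requirements.items():
        if local_resources.get(resource, 0) < needed:
            return "remote"
    return "local"

# The local machine has enough memory but no NPU, so the task goes remote.
target = choose_execution_target(
    {"memory_mb": 4096, "npu_tops": 8},
    {"memory_mb": 8192, "npu_tops": 0},
)
```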
In one possible embodiment, the method further comprises:
recording a test result of the model to be processed;
and generating a performance evaluation report corresponding to the to-be-processed model according to the test result.
In a second aspect, an embodiment of the present application provides a model processing system, where the system includes a user side and at least one processing device;
The user side is used for: configuring or updating configuration parameters of the hardware platform, and pushing the newly configured or updated configuration parameters to a remote code repository; the remote code repository stores processing data corresponding to at least one hardware platform;
The at least one processing device is configured to: acquire target configuration parameters from the remote code repository, and generate a model processing task corresponding to a target hardware platform according to target processing data corresponding to the target hardware platform and the target configuration parameters; and acquire a model to be processed, and execute the model processing task to process the model to be processed.
In one possible implementation, the configuration parameters include a quantization configuration parameter and a test configuration parameter; the processing device comprises an automatic quantization module and an automatic test module;
The automatic quantization module is used for: when detecting that a new target quantization configuration parameter exists in the remote code repository, acquiring the target quantization configuration parameter, and generating a model quantization task corresponding to a target hardware platform according to the target quantization configuration parameter; obtaining a model to be quantized, and executing the model quantization task to quantize the model to be quantized;
The automatic test module is used for: when a test instruction is received, acquiring a target test configuration parameter corresponding to a target hardware platform from the remote code repository, and generating a model test task corresponding to the target hardware platform according to the target test configuration parameter; and obtaining a model to be tested, and executing the model test task to test the model to be tested.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory and a processor;
The memory stores computer-executable instructions;
the processor executes computer-executable instructions stored in the memory, causing the processor to perform the model processing method as provided in the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium having stored therein computer-executable instructions for implementing the model processing method as provided in the first aspect, when the computer-executable instructions are executed by a computer.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a computer program which, when run, causes a computer to perform the model processing method as provided in the first aspect.
According to the model processing method, system, and device provided by the embodiments of the application, the processing data corresponding to each hardware platform, such as quantization method data and test method data, is integrated in the remote code repository, so a developer does not need to know the specific processing method of each hardware platform. When a hardware platform has model quantization and/or test requirements, the developer only needs to push the configuration parameters corresponding to that platform to the remote code repository; the system then automatically acquires the required configuration parameters from the remote code repository and automatically generates and executes model processing tasks, such as model quantization tasks and model test tasks. The whole process requires no manual intervention, which effectively reduces the difficulty of quantizing and/or testing an AI model and improves the efficiency of AI model quantization and/or testing.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a schematic diagram of a model processing system provided in an exemplary embodiment of the present application;
FIG. 2 is a flow chart of a model processing method according to an exemplary embodiment of the present application;
FIG. 3 is a second flow chart of a model processing method according to an exemplary embodiment of the present application;
FIG. 4 is a third schematic flow chart of a model processing method according to an exemplary embodiment of the present application;
FIG. 5 is a schematic flow diagram of the operation of a model processing system provided in an exemplary embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application. The user information (including but not limited to user equipment information, user personal information and the like) and the data (including but not limited to data for analysis, stored data, presented data and the like) related to the application are information and data authorized by a user or fully authorized by all parties, and the collection, the use and the processing of related data comply with related laws and regulations and standards, and are provided with corresponding operation inlets for users to select authorization or rejection.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "and/or", describes an association relationship of an association object, and indicates that there may be three relationships, for example, a and/or B, and may indicate: a alone, a and B together, and B alone, wherein a, B may be singular or plural. The character "/" generally indicates that the context-dependent object is an "or" relationship. "at least one of" or the like means any combination of these items, including any combination of single item(s) or plural items(s). For example, at least one (one) of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, wherein a, b, c may be single or plural.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to denote examples, illustrations, or descriptions. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
Some of the terms involved in the embodiments of the present application are explained below:
1. Quantization: in the field of artificial intelligence, and in particular in deep learning, quantization refers to converting the weights of a model and the outputs of its activation functions from a floating-point representation to a low-precision integer or other lower-bit format. The main purpose of this process is to reduce the memory footprint of the model, speed up model computation, and lower the hardware requirements so that the model can run on resource-constrained devices. For example, deep learning models often use 32-bit floating-point numbers (float32) to represent weights and activation outputs, and quantization can convert these 32-bit floating-point numbers to 8-bit integers (int8) or other precision representations.
Dock: a containerization platform may package applications and their dependent items into lightweight containers for deployment and operation anywhere.
Jenkins: an automated build tool for continuous integration and continuous delivery is primarily used for automated build, test and deployment of software projects. It supports various programming languages and build tools and can be integrated with a variety of version control systems. Jenkins enables development teams to easily set an automation flow by providing an easy-to-use Web interface, thereby speeding up software development and delivery.
OSS: an object storage service (Object Storage Service), a cloud service for storing large amounts of unstructured data. The object store stores data as objects, each of which contains data itself (e.g., a file), metadata (information describing the data), and a unique identifier.
ONNX: an open source framework for converting a deep learning model into a format that can run on different platforms. It allows different artificial intelligence frameworks to represent deep learning models in the same format, thereby facilitating migration and interoperability of models between different frameworks.
Caffe: an open source deep learning framework can help developers quickly build, train and deploy deep learning models.
In the context of rapid development of current information technology, artificial intelligence technology, particularly big data analysis and deep learning technology, has penetrated into various corners of society, such as numerous industries from finance, medical treatment, education to manufacturing, internet of things, etc.
Among them, the importance of the quantization model is increasingly highlighted as one of core technologies in the field of deep learning. When the AI model is ready to be deployed to a hardware environment with limited resources, such as mobile devices, embedded systems and various internet of things terminals, practical challenges such as insufficient storage space, limited computing power and the like are faced. To solve such problems, quantization techniques have been developed that achieve the goals of compressing model volumes and improving operational efficiency by converting complex floating point number weights and activation values within a deep learning model into low precision data types (e.g., integer or binary).
The quantization process can greatly reduce the memory requirements of a model, and because many hardware platforms have highly optimized processing for integer operations, it can also effectively accelerate model inference while preserving model performance in a variety of real-world scenarios. The practical effect and cost-effectiveness of a quantized model have a decisive influence on the breadth and feasibility of the model's application.
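As a toy illustration of the float-to-int8 conversion described above, a symmetric linear mapping might look like the following sketch. Real quantization toolchains add calibration data, zero-points, and per-channel scales; this only shows the core idea:

```python
def quantize_to_int8(values):
    # Symmetric linear quantization: map the float range [-max_abs, max_abs]
    # onto the int8 range [-127, 127] via a single scale factor.
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    return [max(-128, min(127, round(v / scale))) for v in values], scale

def dequantize(quantized, scale):
    # Approximate recovery of the original floats from the int8 values.
    return [q * scale for q in quantized]

weights = [0.5, -1.27, 0.0, 1.27]
quantized, scale = quantize_to_int8(weights)
```

Dequantizing `quantized` with the stored `scale` recovers the original weights up to the rounding error introduced by the 8-bit grid, which is the precision trade-off discussed above.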
Currently, in the development process of AI models, deploying the models onto different hardware platforms is a tedious and time-consuming task. Each model needs to be quantized by adopting different quantization technologies according to hardware platforms in a targeted manner, and performance tests can be performed on each target hardware platform one by one after quantization is completed so as to verify the actual performance of the model. In the process, a developer of the AI model not only needs to know the specific quantization technology of each hardware platform, but also needs to know the model test method of each hardware platform in depth, thus greatly increasing the workload and the technical threshold. Meanwhile, the quantization and test operation of each hardware platform may consume a great deal of manpower and time, resulting in a great increase in repetitive workload and lower model quantization and test efficiency.
In view of the above technical problems, in the embodiment of the present application, a developer may push quantization method data and test method data corresponding to each hardware platform into a remote code repository, when a hardware platform has a model quantization and/or test requirement, the developer only needs to push configuration parameters corresponding to the hardware platform into the remote code repository, and the system may automatically obtain required configuration parameters from the remote code repository, and automatically generate and execute a model quantization task and a model test task.
The technical scheme shown in the application is described in detail by specific examples. It should be noted that the following embodiments may exist alone or in combination with each other, and for the same or similar content, the description will not be repeated in different embodiments.
Referring to fig. 1, fig. 1 is a schematic diagram of a model processing system according to an exemplary embodiment of the present application. As shown in fig. 1, the model processing system includes a user side 101, a remote code repository 102, and at least one processing device 103.
In some embodiments, the client 101 is configured to create or modify configuration parameters corresponding to the hardware platform, and submit the created or modified configuration parameters to the remote code repository 102 through Git.
Git is an open-source distributed version control system for tracking the change history of code. It allows developers to record every change to a file and roll back to previous versions, thereby facilitating collaboration, code review, project management, and the like.
Optionally, the processing configuration parameters may include at least one of:
Model path to be processed: in the quantization process, the model to be processed may be an original ONNX model or a Caffe model; in the test flow, the model to be processed can be an original ONNX model or a Caffe model, and also can be a ONNX model or a Caffe model after quantization processing.
Identification information of the target hardware platform: the identification information may be the name of the target hardware platform, or may include a series of key parameters or characteristics of the target hardware platform to uniquely identify and describe the target hardware platform. Optionally, the identification information may include processor information, memory information, display card information, motherboard information, expansion interface information, and the like.
Quantization data path: may refer to a storage path for quantized data, and/or an entire flow path for data during quantization, including the source of the data, intermediate processing, and final storage location, etc.
Quantization accuracy: in model quantization, the quantization accuracy determines the degree of refinement in converting the original data or model parameters from a high-accuracy representation to a low-accuracy representation. The higher the quantization accuracy, the closer the converted data or parameters are to their original values, but the required memory space and computational resources may increase accordingly. Conversely, lower quantization accuracy may introduce larger errors, but helps reduce the size of the model and increase computational efficiency. Quantization accuracy generally relates to bit number selection (e.g., 8 bits, 16 bits, 32 bits, etc.), error and loss of accuracy, quantization method calibration and tuning, etc.
Test data path: may refer to a storage path for test data and/or a flow path for test data throughout a test procedure.
Test file path: storage locations in the file system for various files used during the test.
Test result storage location: including storage areas, storage addresses, etc.
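Taken together, a target configuration parameter set pushed to the remote code repository might resemble the following sketch. All field names, platform identifiers, and paths are hypothetical; the patent does not prescribe a concrete schema:

```python
target_config = {
    # Identification information of the target hardware platform
    "hardware_platform": "npu-board-x1",
    # To-be-processed model path (an original ONNX or Caffe model)
    "model_path": "oss://models/resnet18.onnx",
    # Quantization configuration parameters
    "quantization": {
        "data_path": "oss://datasets/calibration/",
        "precision": "int8",
    },
    # Test configuration parameters
    "test": {
        "data_path": "oss://datasets/test/",
        "file_path": "tests/npu-board-x1/",
        "result_location": "oss://results/resnet18/",
    },
}
```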
In some implementations, the remote code repository 102, also referred to simply as a remote repository, may be used to provide version control services. For example, in a distributed version control system such as Git, the remote code repository can operate as a central server, running continuously around the clock, allowing developers to push or pull code.
In some implementations, a remote code repository can be created on a code hosting platform (e.g., GitHub, GitLab, etc.). After creating the remote code repository, the local code repository is associated with it. Once the association succeeds, local modifications may be pushed to the remote code repository using the git push command, and the latest code may be pulled from the remote code repository using the git pull command.
In some implementations, the remote code repository 102 may have stored therein processing data corresponding to at least one hardware platform; the processing data includes quantization method data and/or test method data. In addition, the remote code repository 102 may also store configuration parameters corresponding to the hardware platform, including quantized configuration parameters and/or test configuration parameters.
Alternatively, the hardware platform may be a hardware platform supporting quantization.
By way of example, the hardware platforms supporting quantization may include, without limitation, a central processing unit (CPU), a graphics processing unit (GPU), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), a mobile terminal, an embedded device, and the like.
In some embodiments, the processing device 103 is configured to automatically obtain a target configuration parameter when detecting that a new target configuration parameter exists in the remote code repository 102, and generate a model processing task corresponding to a target hardware platform according to target processing data corresponding to the target hardware platform and the target configuration parameter; and acquiring a model to be processed, and executing the model processing task to process the model to be processed.
Referring to fig. 2, fig. 2 is a schematic flow chart of a model processing method according to an exemplary embodiment of the present application. In some embodiments, the model processing method includes:
S201, acquiring target configuration parameters from a remote code repository.
In some embodiments, the remote code repository stores processing data corresponding to at least one hardware platform, where the processing data includes quantization method data and/or test method data.
In some embodiments, the remote code repository may also store configuration parameters corresponding to at least one hardware platform, including a quantized configuration parameter and/or a test configuration parameter.
In some embodiments, the target configuration parameter includes identification information corresponding to the target hardware platform.
In some embodiments, it may be detected whether newly created or updated target configuration parameters exist in the remote code repository; when such target configuration parameters exist, they are acquired.
Optionally, the target configuration parameters may include quantized configuration parameters and/or test configuration parameters corresponding to the target hardware platform.
In some embodiments, a periodic or triggered detection mechanism may be provided to detect whether new or updated target configuration parameters exist in the remote code repository.
Alternatively, the detection mechanism may be implemented by a pre-written script or program, for example, the detection mechanism may check the Git commit message (commit message) by the pre-written script to determine whether the newly created or updated target configuration parameters exist in the remote code repository.
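One way such a pre-written script could flag relevant changes is to inspect the file paths touched by a commit, assuming configuration files follow a `config/<platform>.yaml` layout (that layout is an assumption for this sketch, not specified by the patent):

```python
import re

# Assumed convention: one configuration file per hardware platform.
CONFIG_PATTERN = re.compile(r"config/(?P<platform>[\w-]+)\.ya?ml$")

def detect_updated_platforms(changed_files):
    # Given the file paths touched by a commit (e.g. the output of
    # `git diff --name-only`), return the hardware platforms whose
    # configuration parameters were newly created or updated.
    platforms = []
    for path in changed_files:
        match = CONFIG_PATTERN.search(path)
        if match:
            platforms.append(match.group("platform"))
    return platforms

updated = detect_updated_platforms(["config/npu-x1.yaml", "docs/README.md"])
```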
S202, generating a model processing task corresponding to the target hardware platform according to target processing data corresponding to the target hardware platform and the target configuration parameters.
In some embodiments, when the target configuration parameter includes a quantization configuration parameter, information such as a quantization method, a quantization data path, a quantization precision and the like corresponding to the target hardware platform may be determined by analyzing the quantization configuration parameter, and a model quantization task corresponding to the target hardware platform may be generated according to the information and quantization method data corresponding to the target hardware platform stored in the remote code repository.
In some embodiments, when the target configuration parameter includes a test configuration parameter, information such as a test method, a test data path, a test file path, a test result storage location and the like corresponding to the target hardware platform may be determined by analyzing the test configuration parameter, and a model test task corresponding to the target hardware platform may be generated according to the information and test method data corresponding to the target hardware platform stored in the remote code repository.
In some embodiments, when the target configuration parameters include both the quantized configuration parameters and the test configuration parameters, a model quantization task corresponding to the target hardware platform may be generated based on the quantized configuration parameters, and after the model quantization task is completed, a model test task corresponding to the target hardware platform may be generated based on the test configuration parameters, where the model test task may be used to test the running performance of the quantized model on the target hardware platform.
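The sequencing described here, quantize first and only then test the quantized model, can be sketched with injected task callables. The function and parameter names are illustrative, not from the patent:

```python
def run_pipeline(config, quantize_task, test_task):
    # When both quantization and test configuration parameters are present,
    # the test task is generated only after the quantization task completes,
    # and it operates on the quantized model.
    results = {}
    if "quantization" in config:
        results["quantized_model"] = quantize_task(config["quantization"])
    if "test" in config and results.get("quantized_model") is not None:
        results["test_report"] = test_task(config["test"], results["quantized_model"])
    return results

# Stand-in callables let the sequencing be exercised without real tooling.
outcome = run_pipeline(
    {"quantization": {"precision": "int8"}, "test": {"data_path": "data/"}},
    quantize_task=lambda cfg: "model.int8",
    test_task=lambda cfg, model: {"model": model, "passed": True},
)
```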
S203, acquiring a model to be processed, and executing the model processing task to process the model to be processed.
In some embodiments, the model to be processed may be downloaded from a remote code repository or other storage device and processed by performing the model processing tasks described above.
Alternatively, the model to be processed may be an AI model, or may be another type of model, which is not limited in the embodiment of the present application.
The embodiment of the application provides a model processing method. Because the remote code repository integrates the processing data corresponding to each hardware platform, such as quantization method data and test method data, a developer does not need to know the specific processing mode of each hardware platform. When a hardware platform has model quantization and/or test requirements, the developer only needs to push the configuration parameters corresponding to that hardware platform to the remote code repository; the system then automatically acquires the required configuration parameters from the remote code repository and automatically generates and executes the model processing tasks, such as a model quantization task and a model test task. The whole process requires no manual intervention, which not only effectively reduces the difficulty of quantizing and/or testing an AI model but also improves the efficiency of quantizing and/or testing it.
Based on the descriptions in the foregoing embodiments, referring to fig. 3, fig. 3 is a second schematic flow chart of a model processing method according to an exemplary embodiment of the present application. In some embodiments, the model processing method includes:
S301, acquiring target configuration parameters in a remote code warehouse; the target configuration parameters comprise identification information corresponding to the target hardware platform and quantization configuration parameters corresponding to the target hardware platform.
In some embodiments, the remote code repository stores processing data corresponding to at least one hardware platform, the processing data including quantized processing data; the quantization configuration parameter may include at least one of a quantization data path and a quantization accuracy.
In some embodiments, a pre-written script or detection tool may be used to detect whether there are newly submitted or updated configuration parameters in the remote code repository, and if so, to pull them from the remote code repository (e.g., with a `git pull` command).
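A minimal sketch of such a detection step is shown below, assuming the repository state is available as a mapping from file path to content hash (as `git ls-tree` could provide); the names and the `configs/` prefix are illustrative:

```python
def changed_config_files(old_tree: dict, new_tree: dict,
                         prefix: str = "configs/") -> set:
    """Return the configuration files that are newly added or updated
    between two repository snapshots (path -> content hash)."""
    return {
        path
        for path, blob in new_tree.items()
        if path.startswith(prefix) and old_tree.get(path) != blob
    }
```

If the returned set is non-empty, the detector would pull the repository (e.g., `git pull`) and hand the changed files to the task generator.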
S302, generating a model quantization task according to the quantization configuration parameters and the quantization processing data.
In some embodiments, the Jenkins automation technology may be utilized to automatically analyze the quantization configuration parameters, determine information such as a quantization method, a quantization data path, quantization precision and the like corresponding to the target hardware platform, and generate a model quantization task corresponding to the target hardware platform according to the information and quantization processing data corresponding to the target hardware platform stored in the remote code repository.
The quantization processing data comprise quantization method data of a target hardware platform.
S303, acquiring a model to be processed, and executing a model quantization task to perform quantization processing on the model to be processed.
In some embodiments, a pre-established docker container may be started first, and the model to be processed may be obtained according to the model path to be processed in the quantization configuration parameters.
Optionally, the docker container includes the dependency items required to perform the model quantization task.
In some embodiments, after the docker container is started, the model to be processed and the quantization data may be downloaded through OSS, a quantization tool may be configured, and then the model quantization task may be performed.
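One way to assemble such a container invocation is sketched below; the image name, the environment variable, and the assumption that the container entrypoint fetches the model from OSS are hypothetical, since the application does not fix them:

```python
def build_quantize_command(image: str, model_uri: str,
                           workdir: str = "/workspace") -> list:
    """Assemble a `docker run` command line for one quantization task.
    The container entrypoint is assumed to download the model from OSS,
    configure the quantization tool, and then run the quantization."""
    return [
        "docker", "run", "--rm",
        "-e", f"MODEL_URI={model_uri}",  # tells the entrypoint what to download
        "-w", workdir,
        image,
    ]
```

Passing the model location through the environment keeps the image reusable across tasks and hardware platforms.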
In some embodiments, after the model quantization task is executed, some optional processes may be executed, including encryption, quantization error comparison, etc., where the specific execution content may depend on the configuration options of the user.
In some embodiments, after the model quantization task finishes, it may be determined whether the task executed successfully; when the task succeeds, the quantized model is uploaded to a preset storage device; when it fails, a reminding message is output; the reminding message indicates that the model quantization task failed.
Optionally, the preset storage device may be an OSS device.
Optionally, after the quantized model is uploaded to the preset storage device, a reminding message may be output; the reminding message indicates that the model to be processed was quantized successfully.
In some embodiments, after the model quantization task is successfully executed, it may also be determined whether to test the quantized model; and when the quantized model is determined to be tested, sending a test instruction to an automatic test device corresponding to the target hardware platform.
Optionally, whether to test the quantized model may be determined according to a test option configured by a user in the quantized configuration parameters.
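The branching after a quantization run (upload on success, reminder on failure, optional hand-off to testing) can be sketched as follows; the injected callables stand in for the OSS upload, the notification channel, and the test trigger, none of which are specified by this application:

```python
def handle_quantization_result(success: bool, test_requested: bool,
                               upload, notify, send_test_instruction) -> str:
    """Dispatch the outcome of a model quantization task."""
    if not success:
        notify("model quantization task failed")  # failure reminder
        return "failed"
    upload()                                      # quantized model -> preset storage
    notify("model quantized successfully")
    if test_requested:                            # per the user's test option
        send_test_instruction()
        return "testing"
    return "done"
```

Injecting the side effects as callables lets the dispatch logic be exercised without real storage or test endpoints.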
According to the model processing method provided by the embodiment of the application, since the quantization method data corresponding to each hardware platform is integrated in the remote code repository, a developer does not need to know the platform-specific quantization procedure of each hardware platform. When a hardware platform has a model quantization requirement, the developer only needs to push the quantization configuration parameters corresponding to that hardware platform to the remote code repository; the system then automatically acquires the required configuration parameters from the remote code repository and automatically generates and executes the model quantization task. The whole process requires no manual intervention, which effectively reduces the difficulty of quantizing an AI model and improves the efficiency of quantizing it.
Based on the descriptions in the foregoing embodiments, referring to fig. 4, fig. 4 is a schematic flow chart of a model processing method according to an exemplary embodiment of the present application. In some embodiments, the model processing method includes:
S401, acquiring target configuration parameters in a remote code warehouse; the target configuration parameters comprise identification information corresponding to the target hardware platform and test configuration parameters corresponding to the target hardware platform.
In some embodiments, the remote code repository stores processing data corresponding to at least one hardware platform, where the processing data includes test processing data; the test configuration parameters include at least one of a test data path, a test file path, and a test result storage location.
In some embodiments, a pre-written script or detection tool may be used to detect whether there are newly submitted or updated configuration parameters in the remote code repository, and if so, to pull them from the remote code repository (e.g., with a `git pull` command).
S402, when a test instruction is received, generating a model test task corresponding to the target hardware platform according to the test configuration parameters and the test processing data.
In some embodiments, the Jenkins automation technology may be utilized to automatically analyze the test configuration parameters, determine the information such as the test data, the test file, the test result storage location, etc. corresponding to the target hardware platform, and generate the model test task corresponding to the target hardware platform according to the information and the test processing data corresponding to the target hardware platform stored in the remote code repository.
The test processing data comprise test method data of a target hardware platform.
S403, acquiring a model to be processed, and executing a model test task to test the model to be processed.
In some embodiments, a pre-established docker container may be started first, and the model to be processed may be obtained according to the model path to be processed in the test configuration parameters.
Optionally, the docker container includes the dependencies required to perform the model test tasks.
Alternatively, the model to be processed may be a quantized model stored in the OSS device.
In some embodiments, after the docker container is started, the model to be processed and the test data may be downloaded through the OSS device, and then the model test task is executed.
In some embodiments, before executing the model test task, it may be determined whether the local hardware resource supports running the model to be processed; and if not, mounting the model to be processed and the model test task into a remote terminal, and executing the model test task by using the remote terminal.
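A possible form of that local-capability check is sketched below, with illustrative resource keys (the application does not define which resources are compared):

```python
def choose_executor(required: dict, local: dict) -> str:
    """Return "local" if every required resource is available on the
    build machine, otherwise "remote" (the model and test task would
    then be mounted into a remote terminal)."""
    for resource, needed in required.items():
        if local.get(resource, 0) < needed:
            return "remote"
    return "local"
```

A missing resource key is treated as zero availability, so an unlisted accelerator automatically forces remote execution.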
In some embodiments, in the process of executing the model test task, a test result of the model to be processed may be recorded; and after the model test task is completed, generating a performance evaluation report corresponding to the model to be processed according to the test result.
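As one possible shape for such a performance evaluation report, the sketch below condenses recorded per-inference latencies into summary metrics; the metric names are assumptions for illustration:

```python
def performance_report(latencies_ms: list) -> dict:
    """Summarize recorded test latencies (milliseconds per inference)."""
    n = len(latencies_ms)
    ordered = sorted(latencies_ms)
    return {
        "samples": n,
        "avg_ms": sum(ordered) / n,
        "p99_ms": ordered[min(n - 1, int(n * 0.99))],   # 99th-percentile latency
        "throughput_qps": 1000.0 * n / sum(ordered),    # inferences per second
    }
```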
In some embodiments, the test results may be stored in the same location as the model storage in the OSS device.
In some embodiments, after the model test task is completed, a notification message may also be output; the notification message is used for notifying the user of the test result.
According to the model processing method provided by the embodiment of the application, since the test method data corresponding to each hardware platform is integrated in the remote code repository, a developer does not need to know the platform-specific test procedure of each hardware platform. When a hardware platform has a model test requirement, the developer only needs to push the test configuration parameters corresponding to that hardware platform to the remote code repository; the system then automatically acquires the required configuration parameters from the remote code repository and automatically generates and executes the model test task. The whole process requires no manual intervention, which effectively reduces the difficulty of testing an AI model and improves the efficiency of testing it.
Based on the description in the foregoing embodiment, the embodiment of the present application further provides a model processing system, where the system may include a user side and at least one processing device; wherein:
the user terminal is used for: configuring or updating configuration parameters of the hardware platform, and pushing the newly configured or updated configuration parameters to a remote code warehouse; the remote code repository stores processing data corresponding to at least one hardware platform.
The at least one processing device is configured to: acquiring target configuration parameters from a remote code warehouse, and generating a model processing task corresponding to a target hardware platform according to target processing data corresponding to the target hardware platform and the target configuration parameters; and acquiring the model to be processed, and executing the model processing task to process the model to be processed.
In some embodiments, the configuration parameters include a quantization configuration parameter and a test configuration parameter; the processing device comprises an automatic quantization module and an automatic test module. Wherein:
The automatic quantization module is used for: when a new target quantization configuration parameter exists in the remote code warehouse, acquiring the target quantization configuration parameter, and generating a model quantization task corresponding to the target hardware platform according to the target quantization configuration parameter; and obtaining a model to be quantized, and executing a model quantization task to perform quantization processing on the model to be quantized.
The automatic test module is used for: when a test instruction is received, acquiring a target test configuration parameter corresponding to a target hardware platform in a remote code warehouse, and generating a model test task corresponding to the target hardware platform according to the target test configuration parameter; and obtaining the model to be tested, and executing the model test task to test the model to be tested.
Referring to fig. 5, fig. 5 is a schematic flow chart of an operation of a model processing system according to an exemplary embodiment of the present application. In some embodiments, the model processing system described above includes: the system comprises a user side, a remote code warehouse, an automatic quantification module, a first automatic testing module and a second automatic testing module.
Optionally, the automatic quantization module, the first automatic test module and the second automatic test module may be deployed in a Linux machine.
In some embodiments, an automatic quantization module can be deployed on a Linux machine; if many quantization tasks arrive, several Linux machines can be deployed at the same time for task allocation and scheduling, improving quantization efficiency. For different hardware platforms, the same automatic quantization module can be deployed on different terminals or Linux machines (terminal programs being run through remote calls), so that quantization or testing can proceed in parallel. Multiple quantization and test tasks for the same hardware platform are automatically queued and executed serially, which guarantees the correctness and validity of the task execution results.
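The queuing discipline described above (parallel across platforms, serial within a platform) can be sketched as:

```python
from collections import defaultdict, deque

def schedule(tasks: list) -> dict:
    """Group pending tasks into one FIFO queue per hardware platform.
    Queues for different platforms may be drained in parallel, while
    tasks for the same platform remain strictly serial (the task
    dictionaries here are an assumed shape)."""
    queues = defaultdict(deque)
    for task in tasks:
        queues[task["platform"]].append(task)
    return queues
```

Each worker (e.g., one Jenkins executor per platform) would then pop from exactly one queue, preserving submission order within that platform.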
In some embodiments, the user side is configured to configure or update the configuration parameters of a hardware platform and push (git push) the newly configured or updated configuration parameters to the remote code repository.
In some embodiments, when support for another hardware platform needs to be added, maintenance personnel only need to push that platform's quantization method data and test method data to the remote code repository; once the data has passed testing, it can be reused by all subsequent quantization and test runs, effectively saving model development time and labor cost.
In some embodiments, the automatic quantization module may automatically perform the following operations based on Jenkins automation technology:
1. Obtain (git pull) the configuration parameters in the remote code repository.
2. Check the Git commit information to determine whether newly built or updated configuration parameters exist in the remote code repository.
3. Based on whether newly built or updated configuration parameters exist in the remote code repository, determine whether a model needs to be quantized. If yes, continue with the subsequent operations; if not, end the current run.
In some embodiments, the need for a quantized model is determined when newly built or updated configuration parameters exist in the remote code repository.
4. Generate the model quantization tasks of the respective hardware platforms.
In some embodiments, the model quantification task of each target hardware platform may be generated based on the identification information of each target hardware platform in the newly-built or updated configuration parameters in the remote code repository.
5. Start the docker container.
6. Execute the model quantization task.
In some embodiments, after the docker container is started, the model to be processed and the quantization data may be downloaded first, a quantization tool may be configured, and then the model quantization task may be performed.
7. Determine whether the quantization succeeded.
In some embodiments, after the model quantization task finishes, it may be determined whether the task executed successfully; when the task succeeds, the quantized model is uploaded to a preset storage device; when it fails, a reminding message is output; the reminding message indicates that the model quantization task failed.
8. Determine whether to test the model accuracy.
In some embodiments, after uploading the quantized model to the preset storage device, it may also be determined whether to test the quantized model; and when the quantized model is determined to be tested, sending a test instruction to an automatic test device corresponding to the target hardware platform.
In some embodiments, the first automatic test module may automatically perform the following operations based on Jenkins automation technology:
1. Obtain (git pull) the test configuration parameters corresponding to hardware platform 1 from the remote code repository.
2. When a test instruction is received, download the test data according to the test configuration parameters.
3. Generate the model test task corresponding to hardware platform 1 according to the test configuration parameters.
4. Start the docker container.
5. Determine whether to execute the model test task locally. If yes, execute it locally; if not, perform data mounting, i.e., mount the model to be processed and the model test task into a remote terminal, and send a remote instruction to the remote terminal so that it executes the model test task.
6. Upload the test result.
7. Output a reminding message.
In some embodiments, the second automatic test module may automatically perform the following operations based on Jenkins automation technology:
1. Obtain (git pull) the test configuration parameters corresponding to hardware platform 2 from the remote code repository.
2. When a test instruction is received, download the test data according to the test configuration parameters.
3. Generate the model test task corresponding to hardware platform 2 according to the test configuration parameters.
4. Start the docker container.
5. Determine whether to execute the model test task locally. If yes, execute it locally; if not, perform data mounting, i.e., mount the model to be processed and the model test task into a remote terminal, and send a remote instruction to the remote terminal so that it executes the model test task.
6. Upload the test result.
7. Output a reminding message.
In the embodiment of the application, seamless access to various model formats can be supported, and different types of hardware platforms can be easily added and interfaced with, so that efficient quantization and adaptation of various models are achieved. In addition, the system can provide rich visualization tools and reporting functions; through intuitive data charts and detailed evaluation reports, a user can gain a deep understanding of how model performance changes before and after quantization and of its actual effect.
The model processing system provided by the embodiment of the application can realize the following beneficial effects:
Compatibility with multiple hardware platforms: the method can support quantization and testing for multiple deep-learning model formats and hardware platforms, greatly enhancing the flexibility and universality of model deployment.
Automated processing: by utilizing Jenkins automation technology, the full flow from model quantization to performance testing is automated; no manual intervention is needed, which greatly improves model processing efficiency and reduces the risk of human operation errors.
Ease of use and extensibility: a user can start the quantization process and/or the test process by configuring only a few parameters, which lowers the technical threshold developers face regarding each hardware platform's quantization techniques and inference-testing methods. Meanwhile, the system has good extensibility, making it convenient to add support for more types of models or hardware platforms in the future.
Resource saving and efficiency improvement: through highly integrated and automatic flow design, the labor cost and the time cost are obviously reduced, and the integral research and development efficiency of the AI model is improved.
Based on the description of the foregoing embodiments, an electronic device is further provided in some embodiments of the present application, referring to fig. 6, fig. 6 is a schematic structural diagram of an electronic device provided in an exemplary embodiment of the present application, and the electronic device 60 may include a processor 601 and a memory 602. The processor 601 and the memory 602 are illustratively interconnected by a bus 603.
Memory 602 stores computer-executable instructions;
the processor 601 executes computer-executable instructions stored in the memory 602, so that the processor 601 executes the relevant content in the embodiments corresponding to the model processing method described above, which is not described herein.
Correspondingly, the embodiment of the application further provides a computer readable storage medium, in which computer executable instructions are stored, and when the computer executable instructions are executed by a processor, the computer executable instructions are used to implement the model processing method corresponding to the relevant content in each embodiment, which is not described herein.
Correspondingly, the embodiment of the present application may further provide a computer program product, which includes a computer program, and when the computer program is executed by a processor, the method for processing a model may be implemented to correspond to the relevant content in each embodiment, which is not described herein.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors, input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, or apparatus that includes that element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (14)

1. A method of model processing, the method comprising:
Acquiring target configuration parameters in a remote code warehouse; the remote code warehouse stores processing data corresponding to at least one hardware platform, and the target configuration parameters comprise identification information corresponding to the target hardware platform;
generating a model processing task corresponding to the target hardware platform according to the target processing data corresponding to the target hardware platform and the target configuration parameters;
and acquiring a model to be processed, and executing the model processing task to process the model to be processed.
2. The method of claim 1, wherein the obtaining the target configuration parameters in the remote code repository comprises:
detecting whether newly built or updated target configuration parameters exist in the remote code warehouse;
and when the newly-built or updated target configuration parameters exist in the remote code warehouse, acquiring the target configuration parameters.
3. The method according to claim 1, wherein the target configuration parameters include a model path to be processed, and the acquiring the model to be processed and executing the model processing task includes:
Starting a docker container; the docker container includes the dependencies required to perform the model processing task;
and acquiring the model to be processed according to the model path to be processed.
4. A method according to any one of claims 1 to 3, wherein the target processing data comprises quantized processing data corresponding to the target hardware platform, the target configuration parameters further comprise quantized configuration parameters corresponding to the target hardware platform, and the quantized configuration parameters comprise at least one of quantized data paths and quantized accuracy;
The generating a model processing task corresponding to the target hardware platform according to the target processing data corresponding to the target hardware platform and the target configuration parameters comprises the following steps:
generating a model quantization task according to the quantization configuration parameters and the quantization processing data.
5. The method according to claim 4, wherein the method further comprises:
Determining whether the model quantization task is successfully executed;
when the model quantization task is successfully executed, uploading the quantized model to a preset storage device;
outputting a reminding message when the model quantization task fails to execute; the reminding message is used for indicating that quantization of the model to be processed failed.
6. The method of claim 5, wherein the method further comprises:
when the model quantization task is successfully executed, determining whether to test the quantized model;
and when the quantized model is determined to be tested, sending a test instruction to an automatic test device corresponding to the target hardware platform.
7. A method according to any one of claims 1 to 3, wherein the target processing data includes test processing data corresponding to the target hardware platform, the target configuration parameters further include test configuration parameters corresponding to the target hardware platform, and the test configuration parameters include at least one of a test data path, a test file path, and a test result storage location;
The generating a model processing task corresponding to the target hardware platform according to the target processing data corresponding to the target hardware platform and the target configuration parameters comprises the following steps:
when a test instruction is received, generating a model test task corresponding to the target hardware platform according to the test configuration parameters and the test processing data.
8. The method of claim 7, wherein the performing the model processing task comprises:
Determining whether the local hardware resource supports running the model to be processed;
And if the local hardware resource does not support running of the model to be processed, the model to be processed and the model test task are mounted in a remote terminal, and the model test task is executed by the remote terminal.
9. The method of claim 8, wherein the method further comprises:
recording a test result of the model to be processed;
and generating a performance evaluation report corresponding to the to-be-processed model according to the test result.
10. A model processing system, wherein the system comprises a user side and at least one processing device;
the user side is configured to: configure or update configuration parameters of a hardware platform, and push the newly configured or updated configuration parameters to a remote code warehouse, wherein the remote code warehouse stores processing data corresponding to at least one hardware platform;
the at least one processing device is configured to: acquire target configuration parameters from the remote code warehouse; generate a model processing task corresponding to a target hardware platform according to target processing data corresponding to the target hardware platform and the target configuration parameters; and acquire a model to be processed and execute the model processing task to process the model to be processed.
11. The system of claim 10, wherein the configuration parameters include quantization configuration parameters and test configuration parameters, and the processing device comprises an automatic quantization module and an automatic test module;
the automatic quantization module is configured to: upon detecting a new target quantization configuration parameter in the remote code warehouse, acquire the target quantization configuration parameter, and generate a model quantization task corresponding to a target hardware platform according to the target quantization configuration parameter; and obtain a model to be quantized and execute the model quantization task to quantize the model to be quantized;
the automatic test module is configured to: upon receiving a test instruction, acquire a target test configuration parameter corresponding to a target hardware platform from the remote code warehouse, and generate a model test task corresponding to the target hardware platform according to the target test configuration parameter; and obtain a model to be tested and execute the model test task to test the model to be tested.
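The automatic quantization module of claim 11 is triggered by detecting new quantization configuration parameters in the remote code warehouse. One plausible realization, sketched under the assumption of a simple polling loop, is a seen-set over configuration identities; `fetch_configs` and `generate_task` are hypothetical hooks standing in for the warehouse and scheduler interfaces, and the `platform`/`version` keys are illustrative:

```python
def poll_quantization_configs(fetch_configs, seen, generate_task):
    # One polling pass over the remote code warehouse: every quantization
    # configuration not seen before triggers generation of a model
    # quantization task for its target platform. fetch_configs and
    # generate_task are hypothetical hooks, not APIs named in the claims.
    new_configs = []
    for params in fetch_configs():
        key = (params["platform"], params["version"])
        if key in seen:
            continue  # already processed on an earlier pass
        seen.add(key)
        generate_task(params)
        new_configs.append(params)
    return new_configs
```

Keeping the seen-set outside the function lets repeated passes over an unchanged warehouse become no-ops, which matches the "when detecting a new parameter" trigger in the claim.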
12. An electronic device, comprising: a memory and a processor;
The memory stores computer-executable instructions;
The processor executes computer-executable instructions stored in the memory, causing the processor to perform the model processing method according to any one of claims 1 to 9.
13. A computer-readable storage medium, in which computer-executable instructions are stored, which, when executed by a computer, implement the model processing method according to any one of claims 1 to 9.
14. A computer program product comprising a computer program which, when run, causes a computer to perform the model processing method of any one of claims 1 to 9.
CN202410411323.3A 2024-04-08 2024-04-08 Model processing method, system and equipment Active CN118012468B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410411323.3A CN118012468B (en) 2024-04-08 2024-04-08 Model processing method, system and equipment


Publications (2)

Publication Number Publication Date
CN118012468A true CN118012468A (en) 2024-05-10
CN118012468B CN118012468B (en) 2024-07-09

Family

ID=90952666



Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109408106A (en) * 2018-09-27 2019-03-01 华东计算技术研究所(中国电子科技集团公司第三十二研究所) Agile software development continuous integration method and system
US20200104171A1 (en) * 2018-09-28 2020-04-02 Amazon Technologies, Inc. Orchestration of computations using a remote repository
CN111797969A (en) * 2020-06-23 2020-10-20 浙江大华技术股份有限公司 Neural network model conversion method and related device
WO2021227293A1 (en) * 2020-05-09 2021-11-18 烽火通信科技股份有限公司 Universal training method and system for artificial intelligence models
CN113723601A (en) * 2021-08-30 2021-11-30 北京市商汤科技开发有限公司 Neural network model conversion method, device, equipment and storage medium
CN114115987A (en) * 2021-11-05 2022-03-01 创新奇智(北京)科技有限公司 Model processing method and device
CN114399019A (en) * 2021-12-30 2022-04-26 南京风兴科技有限公司 Neural network compiling method, system, computer device and storage medium
WO2022134001A1 (en) * 2020-12-25 2022-06-30 深圳晶泰科技有限公司 Machine learning model framework development method and system based on containerization technology
CN114970883A (en) * 2022-06-29 2022-08-30 北京百度网讯科技有限公司 Model quantization method and device, electronic equipment and storage medium
US11443237B1 (en) * 2018-11-26 2022-09-13 Amazon Technologies, Inc. Centralized platform for enhanced automated machine learning using disparate datasets
CN115661723A (en) * 2022-12-11 2023-01-31 常州海图信息科技股份有限公司 Multi-scene monitoring method based on Haesi SD3403
CN116126342A (en) * 2023-02-08 2023-05-16 亿咖通(湖北)技术有限公司 Model compiling method and device for multiple hardware platforms
CN116266117A (en) * 2022-03-11 2023-06-20 北京极感科技有限公司 Model conversion method, computer program product, storage medium, and electronic device
CN116629375A (en) * 2023-04-23 2023-08-22 阿里巴巴(中国)有限公司 Model processing method and system
CN117149270A (en) * 2023-10-30 2023-12-01 中国铁塔股份有限公司 Method, system and related equipment for generating model file crossing hardware platform
EP4303776A1 (en) * 2022-07-06 2024-01-10 Tata Consultancy Services Limited System and method for recommending an optimal virtual machine (vm) instance
CN117670128A (en) * 2023-12-06 2024-03-08 中国农业银行股份有限公司 Data processing method and device




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant