CN115712441A - Model deployment method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN115712441A
CN115712441A (Application CN202211411314.1A)
Authority
CN
China
Prior art keywords
model
file
information
deployment
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211411314.1A
Other languages
Chinese (zh)
Inventor
徐洁 (Xu Jie)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Property and Casualty Insurance Company of China Ltd
Original Assignee
Ping An Property and Casualty Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Property and Casualty Insurance Company of China Ltd filed Critical Ping An Property and Casualty Insurance Company of China Ltd
Priority to CN202211411314.1A priority Critical patent/CN115712441A/en
Publication of CN115712441A publication Critical patent/CN115712441A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application belong to the field of artificial intelligence and relate to a model deployment method comprising the following steps: judging whether a model deployment request triggered by a user has been received; if so, displaying a preset information filling page and receiving the data write-back information entered by the user on that page; reading the access script corresponding to the model identifier from a first preset file; reading the code file and the model file corresponding to the model identifier from a second preset file; generating a model deployment file for the target model based on the data write-back information, the access script, the code file, and the model file; and deploying the target model based on a preset application container engine and the model deployment file. The application also provides a model deployment apparatus, a computer device, and a storage medium. In addition, the application relates to blockchain technology: the model deployment file may be stored in a blockchain. With this method and apparatus, the target model can be deployed automatically from the model deployment file, which improves the deployment efficiency of the target model.

Description

Model deployment method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a model deployment method and apparatus, a computer device, and a storage medium.
Background
Artificial intelligence (AI) is a theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results. Basic AI technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big-data processing, operation/interaction systems, mechatronics, and the like. AI software technologies mainly comprise computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
In the field of artificial intelligence, model deployment is an important research direction. At present, the Jupyter and VSCode training tools of most platforms in the industry only support online model training. If an online model needs to be deployed, code, data, and other assets must be manually exported again, and related content such as model files, images, and data must be reselected and re-run in the model deployment step. Because model files are very large in some deep-learning modeling scenarios, the export takes a long time, the operations are cumbersome, the manual workload is heavy, and model deployment efficiency is low.
Disclosure of Invention
An embodiment of the application aims to provide a model deployment method, a model deployment apparatus, a computer device, and a storage medium, so as to solve the technical problems that existing model deployment methods involve a heavy manual workload and low model deployment efficiency.
In order to solve the above technical problem, an embodiment of the present application provides a model deployment method, which adopts the following technical solutions:
judging whether a model deployment request triggered by a user is received; the model deployment request carries a model identifier of a target model whose development is complete;
if so, displaying a preset information filling page, and receiving data write-back information input by the user on the information filling page;
reading an access script corresponding to the model identifier from a first preset file;
reading a code file and a model file corresponding to the model identifier from a second preset file;
generating a model deployment file of the target model based on the data write-back information, the access script, the code file, and the model file;
and deploying the target model based on a preset application container engine and the model deployment file.
Further, the step of deploying the target model based on the preset application container engine and the model deployment file specifically includes:
acquiring the data write-back information, the access script, the code file and the model file from the deployment file based on the application container engine;
acquiring a preset docker image from the code file; and,
acquiring environment dependence information from the code file;
and loading the data write-back information, the access script, the model file, and the environment dependency information into a target directory corresponding to the docker image, and triggering the running of the model file in the target directory to deploy the target model.
Further, before the step of acquiring the preset docker image from the code file, the method further includes:
parsing, from the code file, tool information on which the running of the target model depends;
packaging the tool information to generate a corresponding docker image;
and storing the docker image into the target directory in the code file.
Further, after the step of deploying the target model based on the preset application container engine and the model deployment file, the method further includes:
receiving scheduling task information corresponding to the target model and input by the user;
generating a scheduling task corresponding to the target model based on the scheduling task information;
determining a resource allocation rule corresponding to the scheduling task;
and executing the scheduling task based on the resource allocation rule.
Further, the step of determining the resource allocation rule corresponding to the scheduling task specifically includes:
using a first preset cluster as the data-fetching computing resource of the target model when executing the scheduling task;
using a second preset cluster as the model-running resource of the target model when executing the scheduling task;
and using a third preset cluster as the data write-back resource of the target model when executing the scheduling task.
Further, after the step of receiving the data write-back information input by the user on the information filling page, the method further includes:
generating confirmation information corresponding to the data write-back information, and displaying the confirmation information;
judging whether a modification operation for the data write-back information input by the user is received;
if so, modifying the data write-back information based on the modification operation to obtain modified target data write-back information;
the step of generating a model deployment file for the target model based on the data write-back information, the access script, the code file, and the model file comprises:
and generating a model deployment file of the target model based on the target data write-back information, the access script, the code file and the model file.
Further, the model deployment request also carries user identity information of the user, and the step of displaying a preset information filling page specifically includes:
extracting the user identity information from the model deployment request;
judging whether the user identity information is stored in a preset white list or not;
if yes, acquiring the user's biometric information;
performing identity authentication on the user based on the biometric information, and judging whether the authentication passes;
and if the identity authentication is passed, executing the step of displaying the preset information filling page.
In order to solve the above technical problem, an embodiment of the present application further provides a model deployment apparatus, which adopts the following technical solutions:
the first judgment module is used for judging whether a model deployment request triggered by a user is received; the model deployment request carries a model identifier of a target model whose development is complete;
the first receiving module is used for displaying a preset information filling page if so, and receiving the data write-back information input by the user on the information filling page;
the first reading module is used for reading the access script corresponding to the model identifier from a first preset file;
the second reading module is used for reading the code file and the model file corresponding to the model identifier from a second preset file;
a first generation module, configured to generate a model deployment file of the target model based on the data write-back information, the access script, the code file, and the model file;
and the deployment module is used for deploying the target model based on a preset application container engine and the model deployment file.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solutions:
judging whether a model deployment request triggered by a user is received; the model deployment request carries a model identifier of a target model whose development is complete;
if so, displaying a preset information filling page, and receiving data write-back information input by the user on the information filling page;
reading an access script corresponding to the model identifier from a first preset file;
reading a code file and a model file corresponding to the model identifier from a second preset file;
generating a model deployment file of the target model based on the data write-back information, the access script, the code file, and the model file;
and deploying the target model based on a preset application container engine and the model deployment file.
In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solutions:
judging whether a model deployment request triggered by a user is received; the model deployment request carries a model identifier of a target model whose development is complete;
if so, displaying a preset information filling page, and receiving data write-back information input by the user on the information filling page;
reading an access script corresponding to the model identifier from a first preset file;
reading a code file and a model file corresponding to the model identifier from a second preset file;
generating a model deployment file of the target model based on the data write-back information, the access script, the code file, and the model file;
and deploying the target model based on a preset application container engine and the model deployment file.
Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:
after a model deployment request triggered by a user is received, a preset information filling page is displayed firstly, data write-back information input by the user on the information filling page is received, then an access script corresponding to a model identifier is read from a first preset file, a code file and a model file corresponding to the model identifier are read from a second preset file, a model deployment file of a target model is generated subsequently based on the data write-back information, the access script, the code file and the model file, and finally the target model is deployed based on a preset application container engine and the model deployment file. According to the method and the device, the model deployment file corresponding to the target model is built, and then the target model can be automatically deployed based on the model deployment file, so that the workload of manual deployment of a user can be reduced, the efficiency of deployment of the target model is improved, and the use experience of the user is improved.
Drawings
In order to more clearly illustrate the solution of the present application, the drawings used in the description of the embodiments of the present application will be briefly described below, and it is obvious that the drawings in the description below are some embodiments of the present application, and that other drawings may be obtained by those skilled in the art without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow diagram of one embodiment of a model deployment method according to the present application;
FIG. 3 is a schematic block diagram of one embodiment of a model deployment apparatus according to the present application;
FIG. 4 is a schematic block diagram of one embodiment of a computer device according to the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs; the terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application; the terms "including" and "having," and any variations thereof in the description and claims of this application and the description of the figures above, are intended to cover non-exclusive inclusions. The terms "first," "second," and the like in the description and claims of this application or in the above-described drawings are used for distinguishing between different objects and not for describing a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have various communication client applications installed thereon, such as a web browser application, a shopping application, a search application, an instant messaging tool, a mailbox client, social platform software, and the like.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 players (Moving Picture Experts Group Audio Layer III), MP4 players (Moving Picture Experts Group Audio Layer IV), laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that the model deployment method provided in the embodiments of the present application is generally executed by a server/terminal device, and accordingly, the model deployment apparatus is generally disposed in the server/terminal device.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for an implementation.
With continuing reference to FIG. 2, a flow diagram of one embodiment of a model deployment method in accordance with the present application is shown. The model deployment method comprises the following steps:
step S201, judging whether a model deployment request triggered by a user is received; and carrying the model identification of the target model which is developed by the model deployment instruction.
In this embodiment, the electronic device (for example, the server/terminal device shown in fig. 1) on which the model deployment method runs may obtain the model identifier through a wired or wireless connection. The wireless connection may include, but is not limited to, 3G/4G/5G, WiFi, Bluetooth, WiMAX, ZigBee, UWB (ultra-wideband), and other now known or later developed wireless connection methods. The model deployment request is a request, triggered by a user, to deploy a target model whose development is complete. The model identifier of the target model includes the model ID information of the target model; models and model ID information correspond one to one. In addition, the user can start the training tool for online model training, complete the development of the target model online, and add the code file and model file of the target model. The mount addresses of the code file and the model file are set according to actual service requirements; for example, the newly added code file and model file of the target model can be synchronized to a first specified directory of a preset mount file, so that they can be acquired automatically and quickly later.
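The synchronization of the newly added files into the mount directory can be sketched as follows. This is a minimal illustration assuming a local mount directory and plain file copies; the mount root, directory layout, and function names are hypothetical and not specified by the application.

```python
import shutil
from pathlib import Path

def sync_model_artifacts(mount_root, model_id, code_file, model_file):
    """Sync a newly developed model's code file and model file into the
    first specified directory of the preset mount file, indexed by model ID.
    The "models/<model_id>" layout is an assumption for illustration."""
    target_dir = Path(mount_root) / "models" / model_id
    target_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy(Path(code_file), target_dir / Path(code_file).name)
    shutil.copy(Path(model_file), target_dir / Path(model_file).name)
    return target_dir
```

Later steps can then locate both files directly by model identifier instead of asking the user to export them again.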
Step S202, if yes, displaying a preset information filling page, and receiving data write-back information input by the user on the information filling page.
In this embodiment, the information filling page is a page provided in advance for the user to input data write-back information related to the target model, and the data write-back information is the information corresponding to the target model's data write-back logic. Data write-back refers to the process of automatically filling the data queried by the electronic device into the relevant form when data is updated. The data write-back information may take the form of data write-back SQL, and the relevant form refers to a Hive table.
Step S203, reading the access script corresponding to the model identification from the first preset file.
In this embodiment, the access script specifically refers to the SQL script used for data preparation during training of the target model; data acquisition can be completed based on the access script. The first preset file may be a script mount file created in advance for storing script data. The access script can be provided by the user and entered in advance into a second specified directory in the preset script mount file, where it is stored with the model identifier corresponding to the access script as an index.
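A minimal sketch of this indexed lookup, assuming the access scripts are stored as `<model_id>.sql` files in the script mount directory (the naming convention is an assumption; the application only requires that the model identifier serve as the index):

```python
from pathlib import Path

def read_access_script(script_mount_dir, model_id):
    """Read the SQL access script for a model, using the model identifier
    as the index into the script mount directory."""
    script_path = Path(script_mount_dir) / f"{model_id}.sql"  # assumed naming
    return script_path.read_text(encoding="utf-8")
```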
And step S204, reading a code file and a model file corresponding to the model identification from a second preset file.
In this embodiment, the code file is the file newly added after the user completes development of the target model online according to actual service requirements; it at least includes the operating parameters, the docker image, and the environment dependency information required to run the target model. The model file contains the model data of the target model; this data is fixed once development is complete, and at least includes the network parameters of each network layer in the target model and the hierarchy information of each layer. The second preset file may be a mount file created in advance for storing the code file and the model file.
Step S205, generating a model deployment file of the target model based on the data write-back information, the access script, the code file and the model file.
In this embodiment, the model deployment file of the target model may be generated by integrating the data write-back information, the access script, the code file, and the model file.
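The integration step can be sketched as packing the four inputs into a single archive. The archive format and member names below are assumptions for illustration; the application does not fix a format for the model deployment file.

```python
import io
import tarfile
import time
from pathlib import Path

def _add_text_member(tar, name, text):
    """Add an in-memory text member (e.g. an SQL script) to the archive."""
    data = text.encode("utf-8")
    info = tarfile.TarInfo(name=name)
    info.size = len(data)
    info.mtime = int(time.time())
    tar.addfile(info, io.BytesIO(data))

def build_deployment_file(out_path, write_back_sql, access_script, code_file, model_file):
    """Integrate the data write-back information, the access script, the code
    file, and the model file into one model deployment archive."""
    with tarfile.open(out_path, "w:gz") as tar:
        _add_text_member(tar, "write_back.sql", write_back_sql)
        _add_text_member(tar, "access_script.sql", access_script)
        tar.add(code_file, arcname="code/" + Path(code_file).name)
        tar.add(model_file, arcname="model/" + Path(model_file).name)
    return out_path
```

The resulting file is self-contained, so the later container-engine step can read everything it needs from one place.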
And S206, deploying the target model based on a preset application container engine and the model deployment file.
In this embodiment, the specific implementation process of deploying the target model based on the preset application container engine and the model deployment file is described in further detail in the following specific embodiments, and is not described in detail herein.
After a model deployment request triggered by a user is received, a preset information filling page is displayed first, and the data write-back information entered by the user on that page is received. An access script corresponding to the model identifier is then read from a first preset file, and a code file and a model file corresponding to the model identifier are read from a second preset file. A model deployment file for the target model is subsequently generated based on the data write-back information, the access script, the code file, and the model file, and finally the target model is deployed based on a preset application container engine and the model deployment file. By building the model deployment file corresponding to the target model, the deployment of the target model can then be realized automatically from that file, which reduces the user's manual deployment workload, improves the deployment efficiency of the target model, and improves the user experience.
In some optional implementations, step S206 includes the following steps:
and acquiring the data write-back information, the access script, the code file and the model file from the deployment file based on the application container engine.
In this embodiment, the application container engine is specifically Docker; the data write-back information, the access script, the code file, and the model file may be acquired from the deployment file using the Docker commands related to information acquisition, such as the docker image and docker run commands.
Acquiring a preset docker image from the code file; and,
In this embodiment, the code file is the file newly added after the user completes development of the target model online according to actual service requirements; it mainly includes the operating parameters, the docker image, and the environment dependency information required to run the target model. The construction process of the docker image is described in the following embodiments and is not detailed here.
And acquiring environment dependency information from the code file.
In this embodiment, the environment-dependent information includes environment parameters of an environment required to run the model file.
And loading the data write-back information, the access script, the model file, and the environment dependency information into the target directory corresponding to the docker image, and triggering the running of the model file in the target directory to deploy the target model.
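The loading-and-triggering step might be composed as a single docker run invocation that mounts the prepared artifacts at the image's target directory and launches the model file. The command shape, entry-point name, and paths below are assumptions; the application does not specify the exact command.

```python
def docker_run_command(image, target_dir, host_artifacts_dir, entry="run_model.py"):
    """Build (but do not execute) the docker run command that mounts the
    deployment artifacts into the target directory of the docker image and
    triggers the model file there."""
    return [
        "docker", "run", "--rm",
        "-v", f"{host_artifacts_dir}:{target_dir}",  # load artifacts into the image's target directory
        image,
        "python", f"{target_dir}/{entry}",           # trigger the model file
    ]
```

The resulting list can be handed to `subprocess.run` on the deployment host.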
Because the target model is deployed automatically from the generated model deployment file, deployment does not depend on downloading various software or installing functional modules, and the user does not need to build the image manually, which helps improve the deployment efficiency of the target model and the user experience.
In some optional implementations of this embodiment, before the step of acquiring the preset docker image from the code file, the electronic device may further perform the following steps:
Parsing, from the code file, the tool information on which the running of the target model depends.
In this embodiment, the tool information refers to the software and functional modules on which the running of the target model depends. The software may include Python, Anaconda, and the like, and the functional modules may include basic modules such as TensorFlow.
Packaging the tool information to generate a corresponding docker image.
In this embodiment, the tool information may be packaged based on the application container engine to generate a corresponding docker image, so that the software and functional modules on which the target model depends are installed into the docker image.
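One way to sketch this packaging step is to render a Dockerfile from the parsed tool information and then have the container engine build it. The dictionary keys and default base image below are illustrative assumptions, not part of the application.

```python
def render_dockerfile(tool_info):
    """Render a minimal Dockerfile from the tool information parsed from the
    code file: the base software (e.g. a Python distribution) plus the
    functional modules (e.g. TensorFlow) the target model depends on."""
    base = tool_info.get("base_image", "python:3.10-slim")  # assumed default
    lines = [f"FROM {base}"]
    modules = tool_info.get("modules", [])
    if modules:
        lines.append("RUN pip install " + " ".join(modules))
    lines.append("WORKDIR /app")
    lines.append("COPY . /app")
    return "\n".join(lines)
```

Building this Dockerfile once, in advance, is what lets later deployments skip any manual image work.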
Storing the docker image into the target directory in the code file.
In this embodiment, the target directory is a directory preset in the code file for storing image data.
After obtaining the code file of the target model, the application can parse from it the tool information on which the running of the target model depends, package that information to generate the corresponding docker image, and store the image under the target directory in the code file. The docker image corresponding to the target model can then be found quickly under that directory and used to deploy the target model. Because the image file is generated automatically in advance, the user does not need to build the image manually, which helps improve the deployment efficiency of the target model and the user experience.
In some optional implementations, after step S206, the electronic device may further perform the following steps:
and receiving scheduling task information which is input by the user and corresponds to the target model.
In this embodiment, the scheduling task information at least includes information such as the scheduling period. The scheduling period may refer to the scheduled time at which the target model is run.
And generating a scheduling task corresponding to the target model based on the scheduling task information.
In this embodiment, the scheduling task may include the regularly run processing flow of the target model: data fetching, model running, and data write-back.
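A scheduling task built from the user's scheduling information might look like the sketch below. The field names and cron-style period are assumptions, since the application only requires that the task carry a scheduling period and the three processing stages.

```python
from dataclasses import dataclass

@dataclass
class ScheduledTask:
    """Scheduling task generated for a deployed target model."""
    model_id: str
    cron: str  # scheduling period, e.g. "0 2 * * *" for a nightly 02:00 run
    stages: tuple = ("fetch", "run_model", "write_back")  # the regular processing flow

def build_scheduled_task(model_id, scheduling_info):
    """Generate the scheduling task from the user-supplied scheduling information."""
    return ScheduledTask(model_id=model_id, cron=scheduling_info["period"])
```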
And determining a resource allocation rule corresponding to the scheduling task.
In this embodiment, the resource allocation rule is established in advance to decouple and separate the data-fetching, model-running, and write-back stages of the scheduled model run, thereby separating computing resources from model-running resources. The specific process of determining the resource allocation rule corresponding to the scheduling task is described further in the following embodiments and is not detailed here.
And executing the scheduling task based on the resource allocation rule.
After the target model is deployed based on the preset application container engine and the model deployment file, the application further generates a scheduling task corresponding to the target model from the scheduling task information input by the user, determines the resource allocation rule corresponding to the scheduling task, and executes the scheduling task based on that rule. This minimizes the resources used when the scheduling task runs, reduces the system's resource waste, improves resource utilization, and makes the handling of scheduling tasks more intelligent.
In some optional implementation manners, the determining a resource allocation rule corresponding to the scheduling task includes the following steps:
and using the first preset cluster as an access computing resource of the target model in executing the scheduling task.
In this embodiment, the first preset cluster is a Spark cluster backed by Hive, and the cluster includes available CPU resources.
And using a second preset cluster as a model operation resource of the target model in executing the scheduling task.
In this embodiment, the second preset cluster is a GPU cluster resource.
And using a third preset cluster as a data write-back resource of the target model executing the scheduling task.
In this embodiment, the third preset cluster is a Spark cluster backed by Hive, and the cluster includes available CPU resources.
By determining the resource allocation rule corresponding to the scheduling task (the data-fetching computing resource uses a Spark cluster backed by Hive, model running uses GPU cluster resources, and the data write-back resource uses a Spark cluster backed by Hive), the application decouples and separates the fetching, model-running, and write-back stages of the scheduled model run, separating computing resources from model-running resources. When the target model's scheduling task is later executed based on this rule, resource use is minimized, the system's resource waste is reduced, and scheduling tasks are handled more intelligently.
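The three-cluster rule above amounts to a static mapping from pipeline stage to cluster. The cluster names below are placeholders for the Hive-backed Spark clusters and the GPU cluster described in the text:

```python
# Illustrative stage-to-cluster mapping; the names are placeholders.
RESOURCE_ALLOCATION = {
    "fetch": "hive-spark-cluster-a",       # first preset cluster (CPU, data fetching)
    "run_model": "gpu-cluster",            # second preset cluster (GPU, model running)
    "write_back": "hive-spark-cluster-b",  # third preset cluster (CPU, data write-back)
}

def cluster_for(stage):
    """Return the preset cluster allocated to one stage of the scheduling task."""
    return RESOURCE_ALLOCATION[stage]
```

Keeping the mapping explicit makes the decoupling visible: each stage can be re-pointed at a different cluster without touching the other two.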
In some optional implementation manners of this embodiment, after step S202, the electronic device may further perform the following steps:
And generating confirmation information corresponding to the data write-back information, and displaying the confirmation information.
In this embodiment, the content of the confirmation information is not specifically limited and may be composed in advance according to actual use requirements. For example, the confirmation information may prompt the user to confirm whether the entered data write-back information needs to be modified: if not, the user clicks a confirm button; if so, the user clicks a modify button and modifies the data write-back information.
And judging whether a modification operation for the data write-back information input by the user is received.
In this embodiment, an information modification box is displayed in advance, and the modification operation may include operations such as addition, deletion, and replacement performed by the user on the data write-back information in the information modification box.
And if so, modifying the data write-back information based on the modification operation to obtain modified target data write-back information.
In this embodiment, the data write-back information may be modified based on the modification operation input by the user in the information modification box, so as to obtain the modified target data write-back information.
The step of generating a model deployment file of the target model based on the data write-back information, the data-fetching script, the code file, and the model file then includes:
Generating a model deployment file of the target model based on the target data write-back information, the data-fetching script, the code file, and the model file.
According to the method and the device, after the step of receiving the data write-back information input by the user on the information filling page, confirmation information corresponding to the data write-back information is intelligently generated and displayed. If a modification operation for the data write-back information input by the user is received, the data write-back information is modified based on the modification operation to obtain modified target data write-back information, and the model deployment file of the target model is subsequently generated based on the modified target data write-back information. This ensures the data accuracy of the generated model deployment file and improves the user's experience.
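The confirm-and-modify flow above can be sketched under assumed data shapes: the write-back information is taken to be a dict, and a modification operation a list of (action, key, value) tuples covering the addition, deletion, and replacement operations the embodiment mentions. Both shapes are illustrative assumptions, not formats defined by the application.

```python
# Minimal sketch of applying the user's modification operation to the
# data write-back information to obtain the target write-back information.
def apply_modifications(write_back_info, operations):
    """Apply add/delete/replace operations and return the modified
    target data write-back information (the original is left untouched)."""
    target = dict(write_back_info)
    for action, key, value in operations:
        if action in ("add", "replace"):
            target[key] = value
        elif action == "delete":
            target.pop(key, None)
    return target

info = {"table": "model_scores", "partition": "dt"}
ops = [("replace", "table", "model_scores_v2"), ("add", "db", "warehouse")]
print(apply_modifications(info, ops)["table"])  # model_scores_v2
```

Returning a copy rather than mutating in place means the original entry survives until the user confirms, matching the confirm-before-commit flow described above.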
In some optional implementation manners of this embodiment, the model deployment request further carries user identity information of the user, and step S202 includes the following steps:
and extracting the user identity information from the model deployment request.
In this embodiment, the user identity information may be extracted from the model deployment request by parsing the model deployment request. The user identity information may include the user's name or ID information.
And judging whether the user identity information is stored in a preset white list.
In this embodiment, the white list is list data that is pre-constructed according to actual service usage requirements, and user identity information of a valid employee is stored in the white list.
And if so, acquiring the biometric information of the user.
In this embodiment, the biometric information is not particularly limited and may include, for example, one or more of a face image, a pupil image, and voiceprint information of the user.
And performing identity verification on the user based on the biometric information, and judging whether the identity verification passes.
In this embodiment, the stored target biometric information corresponding to the user identity information may be acquired, and the similarity between the target biometric information and the collected biometric information is then calculated. If the similarity is greater than a preset similarity threshold, it is determined that the user passes the identity verification; otherwise, it is determined that the user fails. The value of the similarity threshold is not particularly limited and may be set according to actual use requirements.
And if the identity verification passes, executing the step of displaying the preset information filling page.
According to the method and the device, when a model deployment request triggered by a user is received, the user can be intelligently authenticated. Only after the user passes both the white-list check and the biometric verification is the model deployment request responded to and subsequently processed, which avoids responding to model deployment requests triggered by unauthorized users and improves the standardization and intelligence of model deployment request processing.
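The similarity comparison in the verification step can be sketched as follows. The application does not specify a similarity measure, so cosine similarity over feature vectors is used here as one plausible choice; the vectors and the 0.9 threshold are likewise illustrative assumptions.

```python
# Hedged sketch: compare a stored biometric feature vector with a freshly
# collected one, and pass verification only above a preset threshold.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def verify_identity(stored, collected, threshold=0.9):
    """Return True when the similarity exceeds the preset threshold."""
    return cosine_similarity(stored, collected) > threshold

stored = [0.12, 0.80, 0.55]      # target biometric information on file
collected = [0.10, 0.82, 0.54]   # biometric information just acquired
print(verify_identity(stored, collected))  # True
```

In practice the feature vectors would come from a face, pupil, or voiceprint encoder; the threshold trades false accepts against false rejects and, as the embodiment notes, is set according to actual use requirements.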
It is emphasized that, in order to further ensure the privacy and security of the model deployment file, the model deployment file may also be stored in a node of a blockchain.
The blockchain referred to in the application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each block contains a batch of network transaction information that is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
The embodiments of the application can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by instructing relevant hardware through computer readable instructions, which can be stored in a computer readable storage medium; when the instructions are executed, the processes of the embodiments of the methods described above can be included. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the steps are not bound to a strict order and may be performed in other orders. Moreover, at least a part of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least a part of the sub-steps or stages of other steps.
With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a model deployment apparatus, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus may be specifically applied to various electronic devices.
As shown in fig. 3, the model deployment apparatus 300 according to this embodiment includes: the system comprises a first judging module 301, a first receiving module 302, a first reading module 303, a second reading module 304, a first generating module 305 and a deploying module 306. Wherein:
a first judging module 301, configured to judge whether a model deployment request triggered by a user is received; the model deployment request carries a model identifier of a target model whose development is completed;
a first receiving module 302, configured to, if yes, display a preset information filling page, and receive data write-back information input by the user on the information filling page;
a first reading module 303, configured to read a data-fetching script corresponding to the model identifier from a first preset file;
a second reading module 304, configured to read a code file and a model file corresponding to the model identifier from a second preset file;
a first generating module 305, configured to generate a model deployment file of the target model based on the data write-back information, the data-fetching script, the code file, and the model file;
a deployment module 306, configured to deploy the target model based on a preset application container engine and the model deployment file.
In this embodiment, the operations that the modules or units are respectively configured to execute correspond to the steps of the model deployment method in the foregoing embodiment one to one, and are not described herein again.
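The cooperation of the reading and generating modules can be sketched as one function. Every concrete detail below — the file paths, field names, and the dict representation of the two preset files and of the deployment file — is an assumption for illustration; the application does not fix any concrete format for them.

```python
# Sketch of modules 303-305: read the per-model artifacts from the two
# preset files and bundle them, with the user's write-back info, into
# a single model deployment file.
def build_deployment_file(model_id, write_back_info, first_preset, second_preset):
    fetch_script = first_preset[model_id]             # first reading module
    code_file, model_file = second_preset[model_id]   # second reading module
    return {                                          # first generating module
        "model_id": model_id,
        "write_back": write_back_info,
        "fetch_script": fetch_script,
        "code_file": code_file,
        "model_file": model_file,
    }

first_preset = {"m-42": "fetch_m42.sql"}
second_preset = {"m-42": ("code_m42.zip", "m42.onnx")}
bundle = build_deployment_file("m-42", {"table": "scores"}, first_preset, second_preset)
print(bundle["fetch_script"])  # fetch_m42.sql
```

Keyed lookups by model identifier are what let one deployment pipeline serve many developed models, which is the point of carrying the model identifier in the request.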
In some optional implementations of this embodiment, the deployment module 306 includes:
a first obtaining submodule, configured to obtain the data write-back information, the data-fetching script, the code file, and the model file from the model deployment file based on the application container engine;
the second obtaining submodule is used for obtaining a preset docker image from the code file; and
a third obtaining submodule, configured to obtain environment dependency information from the code file;
and the deployment submodule is used for loading the data write-back information, the data-fetching script, the model file and the environment dependency information into a target directory corresponding to the docker image, and triggering the running of the model file in the target directory to realize the deployment of the target model.
In this embodiment, the operations that the modules or units are respectively configured to execute correspond to the steps of the model deployment method in the foregoing embodiment one to one, and are not described herein again.
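The deployment submodule's two actions — loading the artifacts into the target directory and triggering the model file — can be sketched as follows. To keep the sketch runnable, it only assembles the shell commands it would execute rather than invoking docker; the image name, target directory, and entry command are all assumptions.

```python
# Hedged sketch of the deployment submodule: copy each artifact into the
# image's target directory (via a bind mount here) and trigger the model
# file inside the container.
import shlex

def plan_deployment(deployment, image="model-base:latest", target="/app/model"):
    artifacts = [deployment["fetch_script"], deployment["model_file"],
                 deployment["code_file"]]
    copies = [f"cp {shlex.quote(a)} {target}/" for a in artifacts]
    run = (f"docker run --rm -v {target}:{target} {image} "
           f"python {target}/{deployment['model_file']}")
    return copies + [run]

deployment = {"fetch_script": "fetch.sql", "model_file": "run_model.py",
              "code_file": "code.zip"}
cmds = plan_deployment(deployment)
print(cmds[-1].startswith("docker run"))  # True
```

Quoting the artifact paths with `shlex.quote` keeps the generated copy commands safe even when file names contain spaces; a real implementation would execute these steps through the container engine's API instead of shell strings.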
In some optional implementations of this embodiment, the deployment module 306 further includes:
the analysis submodule is used for parsing, from the code file, the tool information on which the running of the target model depends;
the packaging submodule is used for packaging the tool information to generate a corresponding docker image;
and the storage submodule is used for storing the docker image into the target directory in the code file.
In this embodiment, operations that the modules or units are respectively used to execute correspond to the steps of the model deployment method in the foregoing embodiment one to one, and are not described herein again.
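Packaging the parsed tool information into a docker image can be sketched by rendering a Dockerfile. The shape of `tool_info` (a base image plus pip packages) is an assumed parse result, and the function only renders the Dockerfile text; an actual `docker build` step would consume it afterwards.

```python
# Hedged sketch of the packaging submodule: turn the run-time tool
# dependencies parsed from the code file into Dockerfile text.
def render_dockerfile(tool_info):
    lines = [f"FROM {tool_info['base_image']}"]
    for pkg in tool_info["pip_packages"]:
        lines.append(f"RUN pip install {pkg}")     # one layer per dependency
    lines.append("COPY . /app/model")              # model artifacts
    lines.append('CMD ["python", "/app/model/run_model.py"]')
    return "\n".join(lines)

tool_info = {"base_image": "python:3.10-slim",
             "pip_packages": ["numpy==1.26.4", "onnxruntime==1.17.0"]}
print(render_dockerfile(tool_info).splitlines()[0])  # FROM python:3.10-slim
```

Pinning exact package versions in the rendered Dockerfile is what makes the resulting image reproduce the environment the model was developed in, which is the purpose of parsing the tool information in the first place.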
In some optional implementations of this embodiment, the model deployment apparatus further includes:
the second receiving module is used for receiving scheduling task information which is input by the user and corresponds to the target model;
the second generation module is used for generating a scheduling task corresponding to the target model based on the scheduling task information;
the determining module is used for determining a resource allocation rule corresponding to the scheduling task;
and the execution module is used for executing the scheduling task based on the resource allocation rule.
In this embodiment, the operations that the modules or units are respectively configured to execute correspond to the steps of the model deployment method in the foregoing embodiment one to one, and are not described herein again.
In some optional implementations of this embodiment, the determining module includes:
the first determining submodule is used for using a first preset cluster as a data-fetching computing resource of the target model in executing the scheduling task;
the second determining submodule is used for using a second preset cluster as a model-running resource of the target model in executing the scheduling task;
and the third determining submodule is used for using a third preset cluster as a data write-back resource of the target model in executing the scheduling task.
In this embodiment, the operations that the modules or units are respectively configured to execute correspond to the steps of the model deployment method in the foregoing embodiment one to one, and are not described herein again.
In some optional implementations of this embodiment, the model deployment apparatus further includes:
the third generating module is used for generating confirmation information corresponding to the data write-back information and displaying the confirmation information;
the second judgment module is used for judging whether modification operation for the data write-back information input by the user is received or not;
the modification module is used for, if the modification operation is received, modifying the data write-back information based on the modification operation to obtain modified target data write-back information;
the first generation module comprises:
and the generation submodule is used for generating a model deployment file of the target model based on the target data write-back information, the data-fetching script, the code file and the model file.
In this embodiment, the operations that the modules or units are respectively configured to execute correspond to the steps of the model deployment method in the foregoing embodiment one to one, and are not described herein again.
In some optional implementation manners of this embodiment, the model deployment request further carries user identity information of the user, and the first receiving module 302 includes:
the extraction submodule is used for extracting the user identity information from the model deployment request;
the judging submodule is used for judging whether the user identity information is stored in a preset white list or not;
the fourth obtaining submodule is used for obtaining the biometric information of the user if the user identity information is stored in the white list;
the verification submodule is used for performing identity verification on the user based on the biometric information and judging whether the identity verification passes;
and the execution submodule is used for executing the step of displaying the preset information filling page if the identity verification passes.
In this embodiment, the operations that the modules or units are respectively configured to execute correspond to the steps of the model deployment method of the foregoing embodiment one by one, and are not described herein again.
In order to solve the technical problem, an embodiment of the present application further provides a computer device. Referring to fig. 4, fig. 4 is a block diagram of a basic structure of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42, and a network interface 43 communicatively connected to each other via a system bus. It is noted that only a computer device 4 having components 41-43 is shown, but it should be understood that not all of the shown components are required to be implemented, and more or fewer components may be implemented instead. As will be understood by those skilled in the art, the computer device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The computer device can perform human-computer interaction with a user through a keyboard, a mouse, a remote controller, a touch panel, or a voice control device.
The memory 41 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or a memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the computer device 4. Of course, the memory 41 may also include both an internal storage unit of the computer device 4 and an external storage device thereof. In this embodiment, the memory 41 is generally used for storing an operating system and various application software installed in the computer device 4, such as computer readable instructions of a model deployment method. Further, the memory 41 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute computer readable instructions stored in the memory 41 or process data, such as computer readable instructions for executing the model deployment method.
The network interface 43 may comprise a wireless network interface or a wired network interface, and the network interface 43 is generally used for establishing communication connection between the computer device 4 and other electronic devices.
Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:
in the embodiment of the application, after a model deployment request triggered by a user is received, a preset information filling page is first displayed and the data write-back information input by the user on the information filling page is received; a data-fetching script corresponding to the model identifier is read from a first preset file, and a code file and a model file corresponding to the model identifier are read from a second preset file; a model deployment file of the target model is then generated based on the data write-back information, the data-fetching script, the code file, and the model file; and finally the target model is deployed based on a preset application container engine and the model deployment file. By building the model deployment file corresponding to the target model, the target model can be deployed automatically based on that file, which reduces the user's manual deployment workload, improves the efficiency of target model deployment, and improves the user's experience.
The present application further provides another embodiment, which is to provide a computer-readable storage medium storing computer-readable instructions executable by at least one processor to cause the at least one processor to perform the steps of the model deployment method as described above.
Compared with the prior art, the embodiment of the application mainly has the following beneficial effects:
in the embodiment of the application, after a model deployment request triggered by a user is received, a preset information filling page is first displayed and the data write-back information input by the user on the information filling page is received; a data-fetching script corresponding to the model identifier is read from a first preset file, and a code file and a model file corresponding to the model identifier are read from a second preset file; a model deployment file of the target model is then generated based on the data write-back information, the data-fetching script, the code file, and the model file; and finally the target model is deployed based on a preset application container engine and the model deployment file. By building the model deployment file corresponding to the target model, the target model can be deployed automatically based on that file, which reduces the user's manual deployment workload, improves the efficiency of target model deployment, and improves the user's experience.
Through the description of the foregoing embodiments, it is clear to those skilled in the art that the method of the foregoing embodiments may be implemented by software plus a necessary general hardware platform, and certainly may also be implemented by hardware, but in many cases, the former is a better implementation. Based on such understanding, the technical solutions of the present application may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present application.
It is to be understood that the above-described embodiments are merely illustrative of some, but not all, embodiments of the present application, and that the appended drawings illustrate preferred embodiments of the application without limiting its scope. The application can be embodied in many different forms; these embodiments are provided so that this disclosure will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that the technical solutions described in the foregoing embodiments may still be modified, or some of the features therein may be equivalently replaced. All equivalent structures made by using the contents of the specification and the drawings of the present application, applied directly or indirectly in other related technical fields, fall within the protection scope of the present application.

Claims (10)

1. A method of model deployment, comprising the steps of:
judging whether a model deployment request triggered by a user is received; the model deployment request carries a model identifier of a target model whose development is completed;
if so, displaying a preset information filling page, and receiving data write-back information input by the user on the information filling page;
reading a data-fetching script corresponding to the model identifier from a first preset file;
reading a code file and a model file corresponding to the model identification from a second preset file;
generating a model deployment file of the target model based on the data write-back information, the data-fetching script, the code file, and the model file;
and deploying the target model based on a preset application container engine and the model deployment file.
2. The model deployment method according to claim 1, wherein the step of deploying the target model based on the preset application container engine and the model deployment file specifically comprises:
acquiring the data write-back information, the data-fetching script, the code file and the model file from the model deployment file based on the application container engine;
acquiring a preset docker image from the code file; and
acquiring environment dependency information from the code file;
and loading the data write-back information, the data-fetching script, the model file and the environment dependency information into a target directory corresponding to the docker image, and triggering the running of the model file in the target directory to realize the deployment of the target model.
3. The model deployment method according to claim 2, wherein before the step of acquiring the preset docker image from the code file, the method further comprises:
parsing, from the code file, tool information on which the running of the target model depends;
packaging the tool information to generate a corresponding docker image;
and storing the docker image into the target directory in the code file.
4. The model deployment method according to claim 1, further comprising, after the step of deploying the target model based on the preset application container engine and the model deployment file:
receiving scheduling task information corresponding to the target model and input by the user;
generating a scheduling task corresponding to the target model based on the scheduling task information;
determining a resource allocation rule corresponding to the scheduling task;
and executing the scheduling task based on the resource allocation rule.
5. The model deployment method according to claim 4, wherein the step of determining the resource allocation rule corresponding to the scheduling task specifically includes:
using a first preset cluster as a data-fetching computing resource of the target model in executing the scheduling task;
using a second preset cluster as a model-running resource of the target model in executing the scheduling task;
and using a third preset cluster as a data write-back resource of the target model in executing the scheduling task.
6. The model deployment method according to claim 1, wherein after the step of receiving the data write-back information input by the user on the information filling page, the method further comprises:
generating confirmation information corresponding to the data write-back information, and displaying the confirmation information;
judging whether a modification operation for the data write-back information input by the user is received;
if so, modifying the data write-back information based on the modification operation to obtain modified target data write-back information;
the step of generating a model deployment file of the target model based on the data write-back information, the data-fetching script, the code file, and the model file comprises:
and generating a model deployment file of the target model based on the target data write-back information, the data-fetching script, the code file, and the model file.
7. The model deployment method according to claim 1, wherein the model deployment request further carries user identity information of the user, and the step of displaying a preset information filling page specifically comprises:
extracting the user identity information from the model deployment request;
judging whether the user identity information is stored in a preset white list or not;
if yes, acquiring biometric information of the user;
performing identity verification on the user based on the biometric information, and judging whether the identity verification passes;
and if the identity verification passes, executing the step of displaying the preset information filling page.
8. A model deployment apparatus, comprising:
the first judgment module is used for judging whether a model deployment request triggered by a user is received; the model deployment request carries a model identifier of a target model whose development is completed;
the first receiving module is used for, if the model deployment request is received, displaying a preset information filling page and receiving data write-back information input by the user on the information filling page;
the first reading module is used for reading the data-fetching script corresponding to the model identifier from a first preset file;
the second reading module is used for reading the code file and the model file corresponding to the model identifier from a second preset file;
a first generating module, configured to generate a model deployment file of the target model based on the data write-back information, the data-fetching script, the code file, and the model file;
and the deployment module is used for deploying the target model based on a preset application container engine and the model deployment file.
9. A computer device comprising a memory having computer readable instructions stored therein and a processor which when executed implements the steps of the model deployment method of any one of claims 1 to 7.
10. A computer-readable storage medium having computer-readable instructions stored thereon which, when executed by a processor, implement the steps of the model deployment method of any one of claims 1 to 7.
CN202211411314.1A 2022-11-11 2022-11-11 Model deployment method and device, computer equipment and storage medium Pending CN115712441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211411314.1A CN115712441A (en) 2022-11-11 2022-11-11 Model deployment method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211411314.1A CN115712441A (en) 2022-11-11 2022-11-11 Model deployment method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115712441A true CN115712441A (en) 2023-02-24

Family

ID=85232828

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211411314.1A Pending CN115712441A (en) 2022-11-11 2022-11-11 Model deployment method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115712441A (en)

Similar Documents

Publication Publication Date Title
CN112395390B (en) Training corpus generation method of intention recognition model and related equipment thereof
CN115564000A (en) Two-dimensional code generation method and device, computer equipment and storage medium
CN115941322A (en) Attack detection method, device, equipment and storage medium based on artificial intelligence
CN112860662A (en) Data blood relationship establishing method and device, computer equipment and storage medium
CN114996675A (en) Data query method and device, computer equipment and storage medium
CN113869789A (en) Risk monitoring method and device, computer equipment and storage medium
CN116383787A (en) Page creation method, page creation device, computer equipment and storage medium
CN116681045A (en) Report generation method, report generation device, computer equipment and storage medium
CN116661936A (en) Page data processing method and device, computer equipment and storage medium
CN115757075A (en) Task abnormity detection method and device, computer equipment and storage medium
CN116166858A (en) Information recommendation method, device, equipment and storage medium based on artificial intelligence
CN115730603A (en) Information extraction method, device, equipment and storage medium based on artificial intelligence
CN115809241A (en) Data storage method and device, computer equipment and storage medium
CN115061916A (en) Method for automatically generating interface test case and related equipment thereof
CN115712441A (en) Model deployment method and device, computer equipment and storage medium
CN112949317B (en) Text semantic recognition method and device, computer equipment and storage medium
CN114996616A (en) Information generation method, device and equipment based on browser and storage medium
CN115826973A (en) List page generation method and device, computer equipment and storage medium
CN115080045A (en) Link generation method and device, computer equipment and storage medium
CN115344892A (en) GPS positioning information tampering automatic identification method and related equipment thereof
CN117113400A (en) Data leakage tracing method, device, equipment and storage medium thereof
CN115546356A (en) Animation generation method and device, computer equipment and storage medium
CN117278263A (en) Authentication processing method, authentication processing device, computer equipment and storage medium
CN116822454A (en) Formula configuration method, device, computer equipment and storage medium
CN114842097A (en) Method for converting picture format and related equipment thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination