CN117289904A

CN117289904A - Artificial intelligence model construction method, system, equipment and storage medium

Info

Publication number: CN117289904A
Application number: CN202311257709.5A
Authority: CN
Inventors: 王晓虎; 蒋含竹; 罗勇; 李滔
Original assignee: Zhejiang Geely Holding Group Co Ltd; Guangyu Mingdao Digital Technology Co Ltd
Current assignee: Zhejiang Geely Holding Group Co Ltd; Guangyu Mingdao Digital Technology Co Ltd
Priority date: 2023-09-26
Filing date: 2023-09-26
Publication date: 2023-12-26

Abstract

The invention relates to the technical field of artificial intelligence and discloses an artificial intelligence model construction method, system, device and storage medium. The method comprises the following steps: constructing a container orchestration platform based on a distributed micro-service architecture, managing each virtual node in a cluster, and deploying an artificial intelligence algorithm library for each started virtual node; splicing components by dragging to generate instruction information for constructing a target artificial intelligence model, wherein the instruction information comprises a directed acyclic graph formed among the components of the target artificial intelligence model; and parsing the instruction information in response to it, determining the configuration resources required for constructing the target artificial intelligence model according to the directed acyclic graph, calling matched nodes based on the configuration resources to perform distributed training, and generating the target artificial intelligence model. Through graphical, component-based modeling, the node flow is clear and the method is highly comprehensive.

Description

Artificial intelligence model construction method, system, equipment and storage medium

Technical Field

The present invention relates to the field of artificial intelligence technologies, and in particular, to a method, a system, an apparatus, and a storage medium for constructing an artificial intelligence model.

Background

With the development of science and technology, intelligentization has become a prevailing trend. Intelligentization is based on the automatic processing of data, which in turn depends on artificial intelligence models that process data of many kinds. Artificial intelligence models are therefore widely applied in fields such as computer vision, image processing, natural language processing, information classification, search, recommendation and big data, where they play a major driving role. An AI (Artificial Intelligence) model is a mathematical model that uses methods from mathematics, statistics, computer science, machine learning and related fields to analyze, process, predict and optimize data that exhibits a certain regularity and predictability.

In the related art, as artificial intelligence applications continue to spread, more and more application scenarios require an artificial intelligence model to complete tasks such as data analysis, prediction and classification. However, constructing an artificial intelligence model has two drawbacks. On the one hand, professionals with specialized knowledge are needed to write the code, so the cost of use is high and the training time is long. On the other hand, the needs of different users cannot all be met and generality is lacking. As a result, the threshold for AI modeling is greatly raised, which hinders use by ordinary users.

Disclosure of Invention

The following presents a simplified summary in order to provide a basic understanding of some aspects of the disclosed embodiments. This summary is not an extensive overview, and is intended to neither identify key/critical elements nor delineate the scope of such embodiments, but is intended as a prelude to the more detailed description that follows.

In view of the above-mentioned shortcomings of the prior art, the present invention discloses an artificial intelligence model construction method, system, device and storage medium, so as to overcome the high difficulty and high entry threshold of constructing artificial intelligence models.

In a first aspect, the present invention provides a method for constructing an artificial intelligence model, including:

constructing a container orchestration platform based on a distributed micro-service architecture, managing each virtual node in a cluster, and deploying an artificial intelligence algorithm library for each started virtual node, wherein the artificial intelligence algorithm library comprises data class operators, training class operators, algorithm class operators and verification class operators, each class of operators encapsulates each algorithm in the form of a component, and the algorithms are in one-to-one correspondence with the components;

splicing the components by dragging to generate instruction information for constructing a target artificial intelligence model, wherein the instruction information comprises a service name of the target artificial intelligence model to be constructed and a directed acyclic graph formed among the components;

and parsing the instruction information in response to it, determining the configuration resources required for constructing the target artificial intelligence model according to the directed acyclic graph, calling the matched virtual nodes based on the configuration resources to perform distributed training, and generating the target artificial intelligence model.

Optionally, calling the matched virtual nodes based on the configuration resources to perform distributed training and generating the target artificial intelligence model includes:

acquiring the current system resources of the cluster, wherein the system resources comprise the container resource allocation;

if the container resource allocation is greater than or equal to the configuration resources required by the target artificial intelligence model, responding to the instruction information and calling each virtual node according to the configuration resources required by the target artificial intelligence model, so that the virtual nodes run synchronously to train the components in a distributed manner, thereby generating the target artificial intelligence model;

if the container resource allocation is smaller than the configuration resources required by the target artificial intelligence model, monitoring the container resource allocation in real time until it is greater than or equal to the configuration resources required by the target artificial intelligence model, then responding to the instruction information and calling each virtual node according to the configuration resources required by the target artificial intelligence model, so that the virtual nodes run synchronously to train the components in a distributed manner, thereby generating the target artificial intelligence model.
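As an illustrative, non-limiting sketch, the two branches above (train immediately when resources suffice, otherwise monitor until they do) may be expressed in Python as follows. The function name `wait_for_resources`, the integer resource unit, the polling interval and the `max_polls` safety limit are assumptions introduced for illustration and are not part of the disclosure.

```python
import time
from typing import Callable


def wait_for_resources(
    get_available: Callable[[], int],
    required: int,
    poll_interval: float = 1.0,
    max_polls: int = 100,
) -> int:
    """Poll until the available container resources cover the request.

    Returns the number of polls performed. The max_polls limit is an
    assumption added so that the sketch cannot loop forever.
    """
    for polls in range(1, max_polls + 1):
        # Branch 1: resources are sufficient, training may be started.
        if get_available() >= required:
            return polls
        # Branch 2: resources are insufficient, keep monitoring.
        time.sleep(poll_interval)
    raise TimeoutError("cluster resources did not become sufficient")


# Simulated cluster whose free capacity grows as other jobs finish.
readings = iter([2, 3, 8])
polls_needed = wait_for_resources(
    lambda: next(readings), required=4, poll_interval=0.0
)
```

In a real deployment the `get_available` callable would query the cluster manager; here it is replaced by a fixed sequence of readings so that the sketch runs on its own.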

Optionally, parsing the instruction information in response to it, determining the configuration resources required for constructing the target artificial intelligence model according to the directed acyclic graph, and calling the matched virtual nodes based on the configuration resources to perform distributed training further includes:

if a plurality of pieces of instruction information are received within a preset time, parsing each piece of instruction information and determining the service name of the artificial intelligence model corresponding to each piece;

ranking the service names of the artificial intelligence models corresponding to the pieces of instruction information by priority to determine a priority list;

determining the configuration resources required by each target artificial intelligence model in order of the priority list from high to low, and calling the matched virtual nodes one by one in that order to perform distributed training;

and if the service names of at least two artificial intelligence models are detected to have the same priority, arbitrating according to the timestamps carried by those service names, and constructing the target artificial intelligence models on a first-come, first-trained basis.
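As an illustrative sketch, the priority ranking with timestamp arbitration described above may be expressed as a single sort. The `Instruction` record, its field names, and the convention that a larger number means a higher priority are assumptions made for illustration; the disclosure does not specify how priorities are encoded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    service_name: str
    priority: int     # assumption: larger value means higher priority
    timestamp: float  # assumption: arrival time in seconds


def build_priority_list(instructions):
    """Order by descending priority; ties go to the earlier timestamp."""
    return sorted(instructions, key=lambda i: (-i.priority, i.timestamp))


batch = [
    Instruction("door-handle-detection", priority=1, timestamp=10.0),
    Instruction("welding-current-recommendation", priority=2, timestamp=12.0),
    Instruction("license-plate-detection", priority=2, timestamp=11.0),
]
ordered = [i.service_name for i in build_priority_list(batch)]
```

The two models with priority 2 are ordered by arrival time, so the one received first is trained first, and the priority 1 model follows.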

Optionally, determining the configuration resources required by each target artificial intelligence model in order of the priority list from high to low, and calling the matched virtual nodes one by one in that order to perform distributed training includes:

taking the first entry in the priority list as the highest priority and the last entry as the lowest priority;

first configuring resources for the target artificial intelligence model with the highest priority; after that configuration is finished, if the current system still has remaining resources, configuring resources one by one in order of the priority list from high to low, until the remaining resources of the current system are zero or the target artificial intelligence model with the lowest priority has been configured;

decomposing each target artificial intelligence model into a plurality of processing flows according to its directed acyclic graph, wherein the processing flows support synchronous operation so as to realize distributed training;

if a plurality of processing flows from a plurality of target artificial intelligence models apply to the current system for resources at the same time, arbitrating according to the priorities of the target artificial intelligence models to which the processing flows belong, and determining the order in which the processing flows are granted running resources.
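As an illustrative sketch, the resource configuration step above may be expressed as a walk down the priority list that stops when the remaining resources reach zero. The function name, the integer resource unit, and the handling of a model that does not fit into the remaining resources (it is left pending while lower-priority models are still considered) are assumptions; the disclosure does not state what happens in that case.

```python
def allocate_by_priority(priority_list, free_resources):
    """Grant resources from the highest priority downwards.

    priority_list holds (service_name, required) pairs, highest first.
    Returns the granted amounts and the resources still free.
    """
    granted = {}
    for service_name, required in priority_list:
        if free_resources <= 0:
            break  # remaining resources are zero, stop configuring
        if required <= free_resources:
            granted[service_name] = required
            free_resources -= required
        # Assumption: a model that does not fit stays pending.
    return granted, free_resources


requests = [("model-a", 4), ("model-b", 5), ("model-c", 2)]
granted, remaining = allocate_by_priority(requests, free_resources=8)
```

With 8 units available, `model-a` receives 4, `model-b` cannot be served from the 4 units left, and `model-c` receives 2.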

Optionally, splicing the components by dragging to generate instruction information for constructing a target artificial intelligence model, wherein the instruction information comprises a service name of the target artificial intelligence model to be constructed and a directed acyclic graph formed among the components, includes:

displaying each component of the artificial intelligence algorithm library in a tool component area as text, and selecting a target component in the tool component area by dragging;

moving the target component into a modeling canvas area, and connecting the components layer by layer to form a directed acyclic graph; determining the service name of the target artificial intelligence model according to the description information and adjustable parameters of each component and the topological structure of the directed acyclic graph;

and combining the service name of the target artificial intelligence model with the directed acyclic graph to generate the instruction information for constructing the target artificial intelligence model.
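As an illustrative sketch, instruction information that combines a service name with a directed acyclic graph may be represented as a JSON document, with a check that the connections drawn on the canvas really are acyclic. The JSON field names (`service_name`, `dag`, `nodes`, `edges`) and the component names are assumptions for illustration; the disclosure does not define a serialization format. The acyclicity check uses Kahn's algorithm.

```python
import json
from collections import deque


def is_acyclic(nodes, edges):
    """Kahn's algorithm: acyclic if every node can be removed in turn."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for src, dst in edges:
        successors[src].append(dst)
        indegree[dst] += 1
    queue = deque(n for n in nodes if indegree[n] == 0)
    visited = 0
    while queue:
        node = queue.popleft()
        visited += 1
        for nxt in successors[node]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                queue.append(nxt)
    return visited == len(nodes)


def build_instruction(service_name, nodes, edges):
    """Combine the service name and the graph into one JSON string."""
    if not is_acyclic(nodes, edges):
        raise ValueError("component connections contain a cycle")
    payload = {
        "service_name": service_name,
        "dag": {"nodes": list(nodes), "edges": [list(e) for e in edges]},
    }
    return json.dumps(payload)


nodes = ["import_data", "split_data", "train_gbdt", "verify_regression"]
edges = [
    ("import_data", "split_data"),
    ("split_data", "train_gbdt"),
    ("train_gbdt", "verify_regression"),
]
instruction = build_instruction("welding-current-recommendation", nodes, edges)
```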

Optionally, the artificial intelligence model construction method further includes: constructing the distributed micro-service architecture based on a k8s cluster, wherein the k8s cluster adopts a distributed file system combination cluster as a unified object storage scheme; in response to the instruction information of the target artificial intelligence model, parsing the instruction information by using an AVES service, and determining the directed acyclic graph formed among the components in the instruction information; and calling application program interface services in order according to the directed acyclic graph formed among the components, so that the programs of the artificial intelligence model are called in a distributed manner in the sequence of data import, data processing, model training, model verification and model storage, thereby completing the construction of the target artificial intelligence model.
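As an illustrative sketch, calling services in the order implied by a directed acyclic graph may be done with a topological sort. The example below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later). The stage names mirror the sequence named above, while the handler functions are hypothetical placeholders for the application program interface services; how the AVES service performs this step is not disclosed.

```python
from graphlib import TopologicalSorter


def run_pipeline(dag, handlers):
    """Call each component's handler after all of its dependencies.

    dag maps a component to the set of components it depends on.
    """
    executed = []
    for component in TopologicalSorter(dag).static_order():
        handlers[component]()  # placeholder for an API service call
        executed.append(component)
    return executed


dag = {
    "data_processing": {"data_import"},
    "model_training": {"data_processing"},
    "model_verification": {"model_training"},
    "model_storage": {"model_verification"},
}
stages = [
    "data_import",
    "data_processing",
    "model_training",
    "model_verification",
    "model_storage",
]
log = []
# Each hypothetical handler only records that it was called.
handlers = {name: (lambda n=name: log.append(n)) for name in stages}
order = run_pipeline(dag, handlers)
```

For a branching graph, components with no dependency between them could be dispatched concurrently; the sketch runs them one after another for simplicity.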

Optionally, the artificial intelligence model building method further includes:

selecting, from the data class operators, an operator matched with the target artificial intelligence model to import a data set, forming a target data set;

dividing the target data set into a training set and a verification set according to a preset proportion;

preprocessing the training set and the verification set by using the training class operators, wherein the preprocessing comprises at least one of missing value filling, duplicate row removal, random sampling and data type conversion;

performing feature processing on the preprocessed training set and the preprocessed verification set, wherein the feature processing comprises at least one of feature normalization, feature standardization, outlier processing, one-hot encoding and wrapper feature selection;

selecting, from the algorithm class operators, at least one operator matched with the target artificial intelligence model, and jointly training the at least one operator with the feature-processed training set to construct the target artificial intelligence model;

and selecting, from the verification class operators, an operator matched with the target artificial intelligence model to perform verification and determine the performance of the target artificial intelligence model, and outputting the target artificial intelligence model once its performance meets a preset index.
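As an illustrative sketch, several of the data steps above (duplicate row removal, missing value filling, feature normalization, one-hot encoding, and splitting by a preset proportion) may be written with the Python standard library alone. The function names, the mean-fill strategy, the min-max form of normalization and the 0.75 split ratio are assumptions chosen for illustration; the operator components of the disclosure may implement these steps differently.

```python
import random


def drop_duplicates(rows):
    """Remove repeated rows while keeping the first occurrence."""
    seen, unique = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def fill_missing(values):
    """Replace None with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(labels):
    """Encode each label as a 0/1 vector over the sorted categories."""
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def split_dataset(rows, train_ratio=0.8, seed=0):
    """Shuffle, then split into training and verification sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


rows = [(1.0, "a"), (1.0, "a"), (None, "b"), (3.0, "a"), (5.0, "c")]
unique = drop_duplicates(rows)
filled = fill_missing([r[0] for r in unique])
scaled = min_max_normalize(filled)
encoded = one_hot([r[1] for r in unique])
train, verify = split_dataset(unique, train_ratio=0.75)
```

For brevity the sketch computes the fill value and the scaling range over all rows; in practice such statistics are usually derived from the training set only and then applied to the verification set.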

In a second aspect, the present invention provides an artificial intelligence model building system comprising:

the operator configuration module is used for constructing a container orchestration platform based on a distributed micro-service architecture, managing each virtual node in a cluster, and deploying an artificial intelligence algorithm library for each started virtual node, wherein the artificial intelligence algorithm library comprises data class operators, training class operators, algorithm class operators and verification class operators, each class of operators encapsulates each algorithm in the form of a component, and the algorithms are in one-to-one correspondence with the components;

the instruction determining module is used for splicing the components by dragging to generate instruction information for constructing a target artificial intelligence model, wherein the instruction information comprises a service name of the target artificial intelligence model to be constructed and a directed acyclic graph formed among the components;

and the model construction module is used for parsing the instruction information in response to it, determining the configuration resources required for constructing the target artificial intelligence model according to the directed acyclic graph, calling the matched virtual nodes based on the configuration resources to perform distributed training, and generating the target artificial intelligence model.

In a third aspect, the present invention provides an electronic device comprising a processor and a memory; the memory is used for storing a computer program, and the processor is used for executing the computer program stored in the memory, so that the electronic device performs the method described above.

In a fourth aspect, the present invention provides a computer readable medium having stored thereon a computer program for causing a computer to perform the above-described method.

The invention has the beneficial effects that:

According to the invention, a container orchestration platform is constructed based on a distributed micro-service architecture, the virtual nodes in a cluster are managed, and an artificial intelligence algorithm library is deployed for each started virtual node, with each algorithm in the library packaged one-to-one into a corresponding component. The components are spliced by dragging to generate instruction information for constructing a target artificial intelligence model; the instruction information is parsed, the configuration resources required for constructing the target artificial intelligence model are determined according to the directed acyclic graph, and the matched virtual nodes are called based on the configuration resources to perform distributed training and generate the target artificial intelligence model. In addition, through graphical, component-based modeling, the node flow is clear and the method is highly comprehensive.

Drawings

FIG. 1 is a flow diagram of an artificial intelligence model building method according to an exemplary embodiment of the present invention;

FIG. 2 is a business architecture diagram of an artificial intelligence model building method according to an exemplary embodiment of the present invention;

FIG. 3 is an operational interface diagram of an artificial intelligence model building method according to an exemplary embodiment of the present invention;

FIG. 4 is a schematic diagram of an artificial intelligence model building system according to an exemplary embodiment of the present invention;

FIG. 5 is a schematic diagram of an implementation of an artificial intelligence model building system according to an exemplary embodiment of the present invention;

FIG. 6 is a schematic diagram of a computer system suitable for implementing the electronic device according to an exemplary embodiment of the present invention.

Detailed Description

Other advantages and effects of the present invention will become apparent to those skilled in the art from the following disclosure, which describes embodiments of the present invention by way of specific examples. The invention may also be practiced or carried out in other embodiments that differ in their specific details, and the details of this description may be modified or varied without departing from the spirit and scope of the present invention. It should be noted that, where no conflict arises, the following embodiments and the features in the embodiments may be combined with each other.

It should be noted that the drawings provided in the following embodiments illustrate the basic concept of the present invention only schematically. The drawings show only the components related to the present invention and are not drawn according to the number, shape and size of the components in actual implementation; the form, number and proportion of the components in actual implementation may be changed arbitrarily, and the component layout may be more complicated.

In the following description, numerous details are set forth to provide a more thorough explanation of embodiments of the present invention. It will be apparent to one skilled in the art, however, that embodiments of the present invention may be practiced without these specific details. In other embodiments, well-known structures and devices are shown in block diagram form rather than in detail, in order to avoid obscuring the embodiments of the present invention.

The terms "first", "second" and the like in the description and claims of the embodiments of the disclosure and in the above-described figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate in order to describe embodiments of the present disclosure. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. The term "plurality" means two or more, unless otherwise indicated. In the embodiments of the present disclosure, the character "/" indicates an "or" relationship between the objects before and after it. For example, A/B represents: A or B. The term "and/or" describes an association between objects and indicates that three relationships are possible. For example, A and/or B represents: A, or B, or A and B.

Referring to FIG. 1, a flow chart of an artificial intelligence model construction method according to an exemplary embodiment of the invention is shown. In an exemplary embodiment, the artificial intelligence model construction method includes at least steps S101 to S103, which are described in detail as follows:

Step S101, constructing a container orchestration platform based on a distributed micro-service architecture, managing each virtual node in a cluster, and deploying an artificial intelligence algorithm library for each started virtual node, wherein the artificial intelligence algorithm library comprises data class operators, training class operators, algorithm class operators and verification class operators, each class of operators encapsulates each algorithm in the form of a component, and the algorithms are in one-to-one correspondence with the components;

Specifically, the distributed micro-service architecture is a software development and deployment model that breaks a complex application down into small, independent, autonomous services. These services can be developed, deployed and scaled independently, and can communicate and cooperate over a network. The micro-service architecture can improve the reliability, scalability and flexibility of an application while reducing its maintenance cost and complexity.

The algorithm library is not deployed on every physical server node of the server cluster. The server cluster is managed by K8S, and when a model is built and trained, the system calls K8S to start a virtual node (Pod) in real time, so that the virtual node contains the artificial intelligence algorithm library.
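As an illustrative sketch, a request to start such a virtual node may be described by a Kubernetes Pod manifest, built here as a plain Python dictionary. The manifest fields (`apiVersion`, `kind`, `metadata`, `spec.containers`, `resources.requests`, `restartPolicy`) follow the standard Kubernetes Pod schema, whereas the function name, the Pod naming scheme, the image name and the resource figures are hypothetical. Submitting the manifest to the cluster is not shown.

```python
def build_training_pod(service_name, image, cpu, memory):
    """Return a Pod manifest for one training node.

    The image is assumed to bundle the algorithm library, so that the
    library is present only in Pods started on demand.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"{service_name}-trainer",  # hypothetical naming scheme
            "labels": {"app": service_name},
        },
        "spec": {
            "restartPolicy": "Never",  # a training Pod runs once
            "containers": [
                {
                    "name": "trainer",
                    "image": image,
                    "resources": {
                        "requests": {"cpu": cpu, "memory": memory}
                    },
                }
            ],
        },
    }


manifest = build_training_pod(
    service_name="welding-current-recommendation",
    image="registry.example.com/ai-algorithm-library:latest",  # hypothetical
    cpu="2",
    memory="4Gi",
)
```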

Referring to FIG. 2, which is a business architecture diagram of an artificial intelligence model construction method according to an exemplary embodiment of the present invention, the details are as follows:

By using the operator components of the system (data class operators, training class operators, algorithm class operators and verification class operators), the AI models required in a business scenario can be constructed easily, for example an industrial BMA four-layer sheet welding voltage parameter recommendation model, an ME two-layer sheet welding pre-pressing time parameter recommendation model, an ME two-layer sheet welding current parameter recommendation model, an automobile door handle trim detection model, a license plate size detection model, a stamping part hole detection model, an automobile keyless entry presence detection model, and the like.

For example, data class operators include, but are not limited to: machine learning data importing, image classification data importing, text classification data importing, machine learning data segmenting, image data segmenting and the like;

training class operators include, but are not limited to: feature engineering control, regression training control, image classification training control, character recognition control, word vector training control and the like;

algorithm class operators include, but are not limited to: the GBDT (gradient boosting decision tree) algorithm, the AdaBoost (iterative boosting) algorithm, the ResNet (residual neural network) algorithm, the Faster R-CNN (object detection) algorithm, the word2vec (word vector) algorithm, etc.;

Verification class operators include, but are not limited to: regression verification control, text detection verification control, object detection verification control, image classification verification control, word vector verification control, and the like.

Each algorithm in the operators is packaged as a component, that is, each component represents one algorithm. It should be noted that a component may be either a base component or an extension component; no limitation is imposed here.

Optionally, an algorithm may also be packaged as a component in the following manner:

determining metadata defined for each algorithm, the metadata comprising: format data corresponding to input and output of the algorithm and format data of parameters; and packaging at least one algorithm according to the defined metadata to obtain a corresponding algorithm component.

Specifically, the algorithms may be based on the same computing framework or on different computing frameworks. When defining metadata, the metadata of each algorithm may be defined separately; alternatively, parameters that can be shared may be defined as a single piece of metadata, which reduces the workload and facilitates metadata reuse.

After the algorithms are componentized, data may need to be transmitted between different algorithm components. To meet this requirement, some important metadata of an algorithm, for example the format data of its inputs and outputs and the format data of its parameters, need to be defined separately. In general, the output format of an algorithm component should match the input format of the algorithm component that receives the data, where matching may mean that the output format is the same as the input format or can be converted to it automatically.

For example, if one algorithm component needs to be connected to two other algorithm components that receive JSON-format data and binary data respectively, then when the algorithm component is initially defined, its defined input and output data formats need to include both the JSON format and the binary format.
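As an illustrative sketch, the metadata described above and the format-matching requirement between connected components may be modeled as follows. The class name `ComponentMetadata`, its fields and the component names are assumptions for illustration; automatic format conversion, which the description also allows, is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ComponentMetadata:
    name: str
    input_formats: set   # formats the component can receive
    output_formats: set  # formats the component can emit
    parameters: dict = field(default_factory=dict)  # parameter name -> type


def can_connect(upstream, downstream):
    """True if upstream emits at least one format downstream accepts."""
    return bool(upstream.output_formats & downstream.input_formats)


# One component feeding two others that expect JSON and binary data.
feature = ComponentMetadata("feature_engineering", {"json"}, {"json", "binary"})
gbdt = ComponentMetadata("gbdt", {"json"}, {"binary"}, {"n_estimators": "int"})
resnet = ComponentMetadata("resnet", {"binary"}, {"binary"})

links = {
    "feature->gbdt": can_connect(feature, gbdt),
    "feature->resnet": can_connect(feature, resnet),
    "gbdt->feature": can_connect(gbdt, feature),
}
```

Because `feature_engineering` declares both output formats, it can be wired to either downstream component, while the reverse connection from `gbdt` is rejected.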

When packaging algorithms, one algorithm may be packaged into one algorithm component, or several algorithms may be packaged into one algorithm component, depending on the actual situation. For example, if an algorithm is often used alone or has no fixed pairing with other algorithms, it may be packaged alone as an algorithm component; if an algorithm is often used in a fixed combination with one or more other algorithms, those algorithms may be packaged together into a single algorithm component comprising several algorithms, which makes subsequent modeling more convenient and helps to simplify the model structure.

Step S102, splicing the components by dragging to generate instruction information for constructing a target artificial intelligence model, wherein the instruction information comprises the service name of the target artificial intelligence model and a directed acyclic graph formed among the components of the target artificial intelligence model;

Referring to FIG. 3, which is an operation interface diagram of an artificial intelligence model construction method according to an exemplary embodiment of the present invention, the details are as follows:

moving the target component into the modeling canvas area, and connecting the components layer by layer to form a directed acyclic graph; determining the service name of the target artificial intelligence model according to the description information and adjustable parameters of each component and the topological structure of the directed acyclic graph;

Specifically, the AI modeling operation page is divided into three main areas:

Tool component area: located on the left side of the page, it provides the basic components required to build an AI model, including components of the data class, algorithm class, training class and verification class, totaling more than 180 operator components. By using these tool components, the user does not have to rewrite cumbersome code and can build an AI model quickly while focusing on the business itself.

Modeling canvas area: located in the middle of the page. AI modeling is realized by dragging operator components from the tool component area on the left onto the canvas and connecting the components layer by layer to construct a directed acyclic graph.

Component configuration description area: located on the right side of the page, it contains the description information of the component selected by the user and a list of the component's adjustable parameters, helping the user understand the component and tune the parameters of the AI model.

In this way, the service name of the target artificial intelligence model can be determined and combined with the directed acyclic graph to generate the instruction information for constructing the target artificial intelligence model, so that the target artificial intelligence model can be constructed quickly and accurately in direct response to the instruction information.

Step S103, parsing the instruction information in response to it, determining the configuration resources required for constructing the target artificial intelligence model according to the directed acyclic graph, and calling matched nodes based on the configuration resources to perform distributed training, thereby generating the target artificial intelligence model.

It should be noted that the container orchestration platform includes, but is not limited to, Docker Swarm, Nomad, Elastic Container Service, Onteon and Kubernetes (each a cluster management tool). This embodiment takes Kubernetes as an example, since Kubernetes has become the industry standard and is free and open source.

Of course, a dedicated distributed container orchestration platform could also be developed for building the AI modeling system, but such distributed development is difficult and costly. For the upper-layer Web system, programming languages such as C++ or Python could be used to develop the Web (web page) system, but compared with Java, Python has lower performance and C++ has lower development efficiency. For the upper-layer AI modeling framework, frameworks such as PyTorch, TensorFlow and scikit-learn are mainly selected; AI frameworks such as MXNet, Caffe and PaddlePaddle could be used as substitutes, but these frameworks are not mainstream enough, which hinders users in developing custom AI components and troubleshooting problems.

In this embodiment, the computation of a single operator in the target artificial intelligence model is distributed across a plurality of hardware devices for concurrent computation, thereby accelerating the computation of that operator. In the model-parallel mode, the computation of a single operator is distributed across a plurality of hardware devices with the same configuration, which store and compute the model so that the computation steps remain consistent. Distributed training offers better performance, a higher upper limit on running resources, full utilization of cluster resources, reduced loss, and so on.
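As an illustrative sketch of splitting a single operator across several workers, the example below partitions a matrix multiplication by columns of the weight matrix, computes each partial product concurrently, and concatenates the results. Threads stand in for the hardware devices, and the choice of matrix multiplication as the operator and of column-wise partitioning are assumptions for illustration; the disclosure does not specify how an operator is partitioned.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def split_columns(matrix, parts):
    """Split a matrix into `parts` column blocks of near-equal width."""
    n_cols = len(matrix[0])
    bounds = [n_cols * i // parts for i in range(parts + 1)]
    return [[row[lo:hi] for row in matrix]
            for lo, hi in zip(bounds, bounds[1:])]


def parallel_matmul(a, b, workers=2):
    """Compute a @ b with each worker holding one column block of b."""
    blocks = split_columns(b, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(lambda blk: matmul(a, blk), blocks))
    # Join the partial results row by row to rebuild the full product.
    return [sum((p[i] for p in partials), []) for i in range(len(a))]


a = [[1, 2], [3, 4]]
b = [[5, 6, 7], [8, 9, 10]]
result = parallel_matmul(a, b, workers=2)
```

Each worker stores only its own block of `b`, which mirrors the idea that the devices share both the storage and the computation of one operator.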

According to the method, a container arranging platform is constructed based on a distributed micro-service architecture, virtual nodes in a cluster are managed, an artificial intelligence algorithm library is deployed for each started virtual node, each algorithm of the artificial intelligence algorithm library is packaged into corresponding components one by one, the components are spliced in a dragging mode to generate instruction information for constructing a target artificial intelligence model, the instruction information is responded to be analyzed, configuration resources required for constructing the target artificial intelligence model are determined according to a directed acyclic graph, the matched virtual nodes are called based on the configuration resources to perform distributed training, and the target artificial intelligence model is generated; on the other hand, through the graphical modeling of the components, not only the node flow is clear and clear, but also the comprehensiveness is strong.

Optionally, invoking the matched nodes for distributed training based on the configuration resources to generate the target artificial intelligence model includes:

if the container resource allocation is greater than or equal to the configuration resources required by the target artificial intelligence model, responding to the instruction information and calling each node according to the configuration resources required by the target artificial intelligence model, so that the nodes perform distributed training on the components and generate the target artificial intelligence model;

if the container resource allocation is smaller than the configuration resources required by the target artificial intelligence model, monitoring the container resource allocation in real time until it is greater than or equal to the configuration resources required by the target artificial intelligence model, and then responding to the instruction information and calling each node according to the configuration resources required by the target artificial intelligence model, so that the nodes perform distributed training on the components and generate the target artificial intelligence model.
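The two branches above amount to a resource gate: training starts at once when enough resources are available, and otherwise the cluster is polled until they are. The following is a minimal sketch under stated assumptions. The `Resources` fields, the `probe` callback that reports currently available resources, and the optional `max_polls` limit are all hypothetical and not taken from the source.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical resource vector; the field names are assumptions."""
    cpu: float
    memory_gb: float
    gpu: int = 0

    def covers(self, need):
        """True if this allocation is >= the requirement in every dimension."""
        return (self.cpu >= need.cpu
                and self.memory_gb >= need.memory_gb
                and self.gpu >= need.gpu)


def wait_for_resources(probe, required, poll_seconds=1.0,
                       max_polls=None, sleep=time.sleep):
    """Poll `probe()` until the available resources cover `required`.

    `probe` is an assumed callback returning the currently available
    Resources. `max_polls` is an optional safety limit added for this
    sketch; the source describes monitoring without a timeout.
    """
    polls = 0
    while True:
        available = probe()
        if available.covers(required):
            return available  # enough resources: training may be started
        polls += 1
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("required resources did not become available")
        sleep(poll_seconds)
```

In use, the caller would start distributed training only after `wait_for_resources` returns, which corresponds to responding to the instruction information in either branch.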

In this embodiment, the current system resources of the cluster are monitored in real time and compared with the configuration resources required by the target artificial intelligence model, so that whether the target model can be built is determined quickly. This avoids calling resources to build the model when resources are insufficient, which would cause the system to stall or slow down and reduce the efficiency of model construction.

In this way, when multiple pieces of instruction information are received at the same time, arbitration is performed according to priority to determine which target artificial intelligence model is allocated resources first. If at least two artificial intelligence models have the same priority, arbitration uses the timestamps they carry, and the target artificial intelligence models are constructed on a first-come, first-trained basis.
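The arbitration rule can be sketched as a sort on two keys: priority first, then timestamp. This is a minimal illustration under stated assumptions: the `BuildRequest` type and its field names are hypothetical, and a larger `priority` value is assumed to mean higher priority.

```python
from typing import NamedTuple


class BuildRequest(NamedTuple):
    """Hypothetical container for one piece of instruction information."""
    service_name: str
    priority: int      # assumption: larger value means higher priority
    timestamp: float   # time at which the instruction information arrived


def arbitrate(requests):
    """Order requests by priority (high to low), then by arrival time."""
    return sorted(requests, key=lambda r: (-r.priority, r.timestamp))


pending = [
    BuildRequest("model-a", priority=1, timestamp=10.0),
    BuildRequest("model-b", priority=5, timestamp=30.0),
    BuildRequest("model-c", priority=5, timestamp=20.0),
]
training_order = [r.service_name for r in arbitrate(pending)]
# model-c and model-b share the highest priority; model-c arrived first.
```

Here `training_order` is `["model-c", "model-b", "model-a"]`: the two priority-5 requests come first, and the earlier timestamp breaks the tie.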

In this embodiment, instead of requesting the full resources for the whole target artificial intelligence model at once, cluster resources are managed at fine granularity for each processing flow, which avoids resource waste and improves the concurrency of model training on the cluster.

Optionally, the artificial intelligence model building method further comprises: constructing a distributed micro-service architecture based on a k8s cluster, wherein the k8s cluster adopts a distributed file system combined cluster as a unified object storage scheme; in response to the instruction information of the target artificial intelligence model, parsing the instruction information by using an AVES service, and determining the directed acyclic graph formed among the components in the instruction information; and calling the application program interface service in order according to the directed acyclic graph formed among the components, so that the programs are invoked in a distributed manner in the order of data import, data processing, model training, model verification and model saving, thereby completing construction of the target artificial intelligence model.

Specifically, Kubernetes is an open-source container orchestration engine used to automate the deployment, scaling and management of containerized applications. It combines multiple physical servers into one large virtualized system. Pods can be created on this system; a Pod can be understood as a virtual server that is requested according to its CPU, memory and disk requirements. Specific programs, such as Java programs or a MySQL database, can run in the containers of a Pod.

For example, Ceph refers to a distributed system that provides block storage, file system and object storage services.

For example, Spring Cloud refers to a Web framework in Java programming, Feign refers to a remote HTTP request invocation framework in Java programming, and Zuul refers to a gateway routing framework in Java programming.

For example, the deep learning and machine learning frameworks employed by the present system include PyTorch, TensorFlow, scikit-learn and the like.

In this embodiment, the invention provides a low-code, componentized AI modeling scheme. By adopting technologies such as Kubernetes, virtualization, micro-services, CUDA (a GPU computing development environment), machine learning and deep learning, the scheme greatly improves overall modeling concurrency and performance, reduces unnecessary resource loss, and provides rich reusable components, so that users can select components according to actual demands and quickly build AI algorithm models. The distributed, componentized and low-code AI modeling capability provided by the invention addresses the high threshold of AI modeling, the insufficient utilization of AI training resources and the single-node bottleneck of AI training.

Optionally, the artificial intelligence model building method further comprises:

selecting an operator matched with the target artificial intelligence model from the data type operators to import a data set and form a target data set;

selecting at least one operator matched with the target artificial intelligent model from algorithm operators, and carrying out joint training on the training set after feature processing and the at least one operator to construct the target artificial intelligent model;

and selecting an operator matched with the target artificial intelligent model from verification operators to verify, determining the performance of the target artificial intelligent model until the performance of the target artificial intelligent model meets the preset index, and outputting the target artificial intelligent model.

Specifically, the middle of Fig. 3 shows an AI modeling example of a "dimensional tolerance correlation analysis model", which is built with the following steps:

First, the "dimensional three-coordinate measurement dataset" is imported by using a "machine learning data import" operator.

Next, the dimensional three-coordinate measurement data is divided into a training set and a verification set by using a "machine learning data segmentation" operator.

Next, data preprocessing is performed: the "missing value filling", "repeated line removal" and "data type conversion" operators are added, and a "data preprocessing control" operator connects the data with these processing operators to realize the data preprocessing function.

Next, feature engineering is performed: the "feature normalization", "outlier processing", "Box-Cox conversion" and "wrapper feature selection" operators are added, and a "feature engineering control" operator connects the data with these processing operators to realize the feature engineering function.

Then, the basic algorithm required by the "dimensional tolerance correlation analysis" model, a "LASSO regression" operator, is selected, and a "regression training control" operator connects the data to be trained with the algorithm to realize the AI model training function.

Finally, a "regression verification control" operator is dragged in and connects the data to be verified with the trained model to realize the evaluation function of the AI model.
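The data segmentation and data preprocessing steps of this example can be sketched as a chain of small operator functions applied in order, as a control operator would. This is a minimal sketch under stated assumptions, not the operators of the invention: all function names are hypothetical, rows are plain dictionaries of numeric values, missing values are filled with the column mean, and the feature engineering and LASSO training steps are left out.

```python
import random
from statistics import mean


def split_dataset(rows, train_ratio=0.8, seed=0):
    """Shuffle rows reproducibly and split them into training/verification sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def fill_missing(rows):
    """Replace None in each column with the mean of that column's observed values."""
    columns = list(rows[0].keys())
    means = {c: mean(r[c] for r in rows if r[c] is not None) for c in columns}
    return [{c: means[c] if r[c] is None else r[c] for c in columns} for r in rows]


def drop_duplicates(rows):
    """Remove repeated rows while keeping the first occurrence."""
    seen, unique = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def to_float(rows):
    """Convert every value to float (a simple data type conversion)."""
    return [{c: float(v) for c, v in r.items()} for r in rows]


def preprocess(rows, operators):
    """Apply the operators in order, like a preprocessing control operator."""
    for op in operators:
        rows = op(rows)
    return rows


raw = [{"x": 1, "y": 2}, {"x": None, "y": 4}, {"x": 1, "y": 2}, {"x": 4, "y": 6}]
clean = preprocess(raw, [fill_missing, drop_duplicates, to_float])
```

Because each operator takes and returns a list of rows, operators can be added, removed or reordered without changing the control function, which mirrors how components are spliced together in the graphical editor.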

In this way, the bottom layer of the AI modeling scheme is based on Kubernetes, an industry benchmark for distributed virtualization platforms, and therefore offers comparatively better robustness, stability, security, universality and concurrency. For example, the underlying system itself provides user security authentication, rather than only the upper-layer Web application, giving finer-grained security assurance. As another example, if any virtual node in the cluster fails, another virtual node is automatically started to replace it and the training task is recovered, so the cluster is robust. Meanwhile, Kubernetes has been widely adopted by many companies (Google, Alibaba, Tencent, Huawei, Baidu and the like), so the system is widely applicable, more universal and easy to maintain.

Referring to FIG. 4, a schematic diagram of an artificial intelligence model building system according to an exemplary embodiment of the present invention is shown. As shown in FIG. 4, the exemplary artificial intelligence model building system 400 includes: an operator configuration module 401, an instruction determination module 402, and a model building module 403, wherein:

The operator configuration module 401 is configured to construct a container arrangement platform based on a distributed microservice architecture, manage each virtual node in a cluster, deploy an artificial intelligent algorithm library for each virtual node started, wherein the artificial intelligent algorithm library comprises data type operators, training type operators, algorithm type operators and verification type operators, each type operator encapsulates each algorithm in a component form, and the algorithms correspond to each component one by one;

the instruction determining module 402 is configured to splice the components in a drag manner, and generate instruction information for constructing the target artificial intelligent model, where the instruction information includes a service name of the target artificial intelligent model and a directed acyclic graph formed between the components in the target artificial intelligent model;

the model building module 403 is configured to parse in response to the instruction information, determine configuration resources required for building the target artificial intelligent model according to the directed acyclic graph, and invoke matched nodes to perform distributed training based on the configuration resources, so as to generate the target artificial intelligent model.

It should be noted that the artificial intelligence model building system provided in the above embodiment and the artificial intelligence model building method provided in the above embodiment belong to the same concept. The specific manner in which each step is performed has been described in detail in the method embodiment and is not repeated here.

According to the artificial intelligence model building system provided by the embodiment of the disclosure, a container orchestration platform is constructed based on a distributed micro-service architecture, the virtual nodes in the cluster are managed, and an artificial intelligence algorithm library is deployed for each started virtual node, with each algorithm in the library packaged into a corresponding component one by one. The components are spliced by dragging to generate instruction information for constructing a target artificial intelligence model. The instruction information is then parsed, the configuration resources required for constructing the target artificial intelligence model are determined according to the directed acyclic graph, and the matched nodes are called based on the configuration resources to perform distributed training and generate the target artificial intelligence model. In addition, graphical modeling with components makes the node flow clear and the solution comprehensive.

Referring to fig. 5, a schematic diagram of an implementation structure of an artificial intelligence model building system according to an exemplary embodiment of the present invention includes:

constructing a k8s cluster based on a distributed micro-service architecture, wherein the k8s cluster adopts a distributed file system combined cluster as a unified object storage scheme; in response to the instruction information of the target artificial intelligence model, parsing the instruction information by using an AVES service, and determining the directed acyclic graph formed among the components in the instruction information; and calling the application program interface service in order according to the directed acyclic graph formed among the components, so that the programs are invoked in a distributed manner in the order of data import, data processing, model training, model verification and model saving, thereby completing construction of the target artificial intelligence model.

Specifically, a Kubernetes virtualized cluster is used as a service basis, a Ceph combined cluster is used as a unified object storage scheme, and on the basis, system service Pod including AVES service Pod, model-public service Pod, business service Pod and the like are established.

According to the user's AI model design, the following are started dynamically: a data import Pod, which imports remote data and writes it to distributed object storage; data splitting, which divides the stored data into a training set and a verification set and writes them back to distributed object storage; preprocessing and preprocessing control, which read the training set and the verification set; an AI algorithm training Pod, which reads the preprocessed training set for training and determines the AI model; a verification control Pod, which reads the preprocessed verification set to verify the AI model; and the like.

When an AI model building task (i.e., instruction information) is received from the user, it is sent to the AVES service. The AVES service parses the task information, calls the Kubernetes API Server (application programming interface service) in sequence according to the AI model DAG (directed acyclic graph) constructed by the user, starts Pods in a distributed manner, and carries out the AI model construction process step by step (including data import, data processing, model training, model evaluation and model saving) to finally complete the modeling task.
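The parse-then-dispatch flow described above can be sketched with a topological ordering of the DAG. This is a minimal sketch under stated assumptions, not the AVES service itself: the JSON layout of the instruction information (`service_name`, `components`, `edges`) is hypothetical, and `start_pod` is an assumed callback standing in for the call to the Kubernetes API Server, so no real Kubernetes client is used.

```python
import json
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def parse_instruction(raw):
    """Parse instruction information into (service name, predecessor map).

    Assumed layout: {"service_name": str, "components": [str, ...],
    "edges": [[upstream, downstream], ...]}.
    """
    info = json.loads(raw)
    graph = {name: set() for name in info["components"]}
    for upstream, downstream in info["edges"]:
        graph[downstream].add(upstream)
    return info["service_name"], graph


def run_dag(graph, start_pod):
    """Call `start_pod(step)` for every component in dependency order.

    Raises graphlib.CycleError if the graph is not acyclic.
    """
    order = list(TopologicalSorter(graph).static_order())
    for step in order:
        start_pod(step)  # assumed stand-in for starting a Pod via the API Server
    return order


raw_task = json.dumps({
    "service_name": "dimensional-tolerance-analysis",
    "components": ["data_import", "data_processing", "model_training",
                   "model_evaluation", "model_saving"],
    "edges": [["data_import", "data_processing"],
              ["data_processing", "model_training"],
              ["model_training", "model_evaluation"],
              ["model_evaluation", "model_saving"]],
})
service_name, dag = parse_instruction(raw_task)
started = []
order = run_dag(dag, started.append)
```

For this linear graph the steps are started in the order data import, data processing, model training, model evaluation and model saving. A graph containing a cycle is rejected before any step is started, because the full order is computed first.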

When the user needs to use a model, the corresponding trained AI model can be retrieved from Ceph object storage, whose distributed storage capacity is used to store a large number of models, data, operator components and the like.

Referring to FIG. 6, a schematic diagram of a computer system suitable for use in implementing an embodiment of the invention is shown. It should be noted that, the computer system 600 of the electronic device shown in fig. 6 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present invention.

As shown in fig. 6, the computer system 600 includes a central processing unit (Central Processing Unit, CPU) 601, which can perform various appropriate actions and processes, such as performing the methods in the above-described embodiments, according to a program stored in a Read-Only Memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (Random Access Memory, RAM) 603. The RAM 603 also stores various programs and data required for system operation. The CPU 601, ROM 602, and RAM 603 are connected to each other through a bus 604. An Input/Output (I/O) interface 605 is also connected to the bus 604.

The following components are connected to the I/O interface 605: an input section 606 including a keyboard, mouse, etc.; an output section 607 including a Cathode Ray Tube (CRT), a liquid crystal display (Liquid Crystal Display, LCD), and a speaker, etc.; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 609 performs communication processing via a network such as the internet. A drive 610 is also connected to the I/O interface 605 as needed. Removable media 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is installed on the drive 610 as needed, so that a computer program read therefrom is installed into the storage section 608 as needed.

In particular, according to embodiments of the present invention, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present invention include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the various functions defined in the system of the present invention are performed.

The present invention also provides a computer readable storage medium storing a computer program which when executed implements at least one embodiment described above with respect to an artificial intelligence model building method, such as the embodiment described in fig. 1.

The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method of the embodiments of the present invention.

In the embodiments provided herein, a computer-readable storage medium may include read-only memory, random-access memory, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, a USB disk, a removable hard disk, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. In addition, any connection is properly termed a computer-readable medium. For example, if the instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, Digital Subscriber Line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable and data storage media do not include connections, carrier waves, signals, or other transitory media, but are intended to be directed to non-transitory, tangible storage media. Disk and disc, as used herein, include Compact Disc (CD), laser disc, optical disc, Digital Versatile Disc (DVD), floppy disk and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.

In one or more exemplary aspects, the functions described by the computer program of the methods of the present invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. The steps of a method or algorithm disclosed in the present invention may be embodied in a processor-executable software module, which may be located on a tangible, non-transitory computer-readable and writable storage medium. Tangible, non-transitory computer readable and writable storage media may be any available media that can be accessed by a computer.

The flowcharts and block diagrams in the figures described above illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The above embodiments merely illustrate the principles of the present invention and its effects, and are not intended to limit the invention. Those skilled in the art may modify or vary the above-described embodiments without departing from the spirit and scope of the invention. Accordingly, all equivalent modifications and variations made by those of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of the present invention.

Claims

1. An artificial intelligence model construction method, comprising:

2. The method for constructing an artificial intelligence model according to claim 1, wherein the generating a target artificial intelligence model based on the distributed training of the virtual nodes matched with the configuration resource call comprises:

3. The method of claim 1, wherein the parsing in response to the instruction information determines configuration resources required for building the target artificial intelligence model according to the directed acyclic graph, invoking the matched virtual nodes for distributed training based on the configuration resources, and further comprising:

4. The method for constructing an artificial intelligence model according to claim 3, wherein determining configuration resources required by the target artificial intelligence model sequentially according to the order of the priority list from high to low, and calling the matched virtual nodes one by one according to the order to perform distributed training comprises:

5. The artificial intelligence model constructing method according to claim 1, wherein the steps of splicing the components in a drag form to generate instruction information for constructing a target artificial intelligence model, the instruction information including a service name of the target artificial intelligence model to be constructed and a directed acyclic graph formed between the components, include:

6. The method for constructing an artificial intelligence model according to any one of claims 1 to 5, wherein a distributed micro-service architecture is constructed based on a k8s cluster, and the k8s cluster adopts a distributed file system combined cluster as a unified object storage scheme; in response to the instruction information of the target artificial intelligence model, the instruction information is parsed by using an AVES service, and the directed acyclic graph formed among the components in the instruction information is determined; and the application program interface service is called in order according to the directed acyclic graph formed among the components, so that the programs are invoked in a distributed manner in the order of data import, data processing, model training, model verification and model saving, thereby completing construction of the target artificial intelligence model.

7. The artificial intelligence model construction method according to any one of claims 1 to 5, further comprising:

performing feature processing on the preprocessed training set and the preprocessed verification set, wherein the feature processing comprises at least one of feature normalization, feature standardization, outlier processing, one-hot encoding and wrapper feature selection;

8. An artificial intelligence model building system, comprising:

9. An electronic device, comprising: a processor and a memory;

The memory is configured to store a computer program, and the processor is configured to execute the computer program stored in the memory, to cause the electronic device to perform the method according to any one of claims 1 to 7.

10. A computer-readable storage medium, characterized in that a computer program is stored thereon for causing a computer to perform the method according to any one of claims 1 to 7.