CN114757307B - Artificial intelligence automatic training method, system, device and storage medium - Google Patents


Info

Publication number
CN114757307B
Authority
CN
China
Prior art keywords
training
model
target
automatic
artificial intelligence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210669062.6A
Other languages
Chinese (zh)
Other versions
CN114757307A (en)
Inventor
王晓辉
季知祥
蒲天骄
刘鹏
肖凯
郭鹏天
李道兴
Current Assignee
China Electric Power Research Institute Co Ltd CEPRI
Original Assignee
China Electric Power Research Institute Co Ltd CEPRI
Priority date
Filing date
Publication date
Application filed by China Electric Power Research Institute Co Ltd (CEPRI)
Priority to CN202210669062.6A
Publication of CN114757307A
Application granted
Publication of CN114757307B
Legal status: Active
Anticipated expiration legal status

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/214 — Generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F18/22 — Matching criteria, e.g. proximity measures
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/02 — Neural networks
    • G06N3/045 — Combinations of networks
    • G06N3/08 — Learning methods
    • G06Q50/06 — Information and communication technology [ICT] specially adapted for specific business sectors; energy or water supply
    • Y04S10/50 — Systems or methods supporting power network operation or management, involving a certain degree of interaction with the load-side end user applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Economics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Public Health (AREA)
  • Water Supply & Treatment (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an artificial intelligence automatic training method, system, device and storage medium. A family of network search frameworks is designed for convolutional neural networks, recurrent neural networks, graph convolutional neural networks and the like; basic operators adapted to the structure of power models are designed; and feasible model search strategies are explored, realizing multi-task automatic training in power grid application scenarios. This lowers the difficulty of developing artificial intelligence models, resolves the dependence of traditional artificial intelligence workflows on personal experience for tedious, time-consuming steps such as feature selection and model evaluation, and improves development efficiency. The invention can provide automatic training of artificial intelligence algorithms such as image recognition, video analysis, text analysis and speech recognition for fields such as scheduling, operation and inspection, and safety supervision; it supports building models such as inspection defect and fault recognition for power transmission and transformation equipment, violation recognition for safe-production monitoring, and power entity recognition; and it underpins artificial intelligence applications across businesses such as power scheduling, operation and inspection, safety supervision, and marketing.

Description

Artificial intelligence automatic training method, system, device and storage medium
Technical Field
The invention belongs to the field of artificial intelligence and relates to a method for constructing an artificial intelligence automatic training system for multiple power scenarios, in particular to an artificial intelligence automatic training method, system, device and storage medium.
Background
Although deep learning models reduce the difficulty of feature extraction, they introduce two major classes of hyper-parameters: those related to training, such as the learning rate, momentum and batch size; and those related to network structure, such as the network width, the number of convolution kernels and the number of convolutional layers. Together these hyper-parameters form a huge search space. Users of an algorithm need professional knowledge to tune its many parameters, which usually costs considerable time and trial and error. The recent trend of using algorithms to solve the parameter-tuning problem itself is increasingly favored by researchers.
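To make the size of this search space concrete, the following sketch enumerates a small hypothetical hyper-parameter grid; the specific values are illustrative assumptions, not taken from the patent, and even a handful of choices per hyper-parameter multiplies into hundreds of configurations.

```python
import itertools

# Hypothetical hyper-parameter grid; values are examples only.
search_space = {
    "learning_rate": [1e-1, 1e-2, 1e-3, 1e-4],
    "momentum": [0.8, 0.9, 0.99],
    "batch_size": [32, 64, 128, 256],
    "network_width": [64, 128, 256],
    "num_conv_layers": [4, 8, 16],
}

def grid_size(space):
    """Number of distinct configurations in an exhaustive grid."""
    size = 1
    for values in space.values():
        size *= len(values)
    return size

def enumerate_configs(space):
    """Yield every configuration as a dict (exhaustive grid search)."""
    keys = list(space)
    for combo in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, combo))

print(grid_size(search_space))  # 4*3*4*3*3 = 432 configurations
```

Exhaustively trying all 432 configurations is already impractical for large models, which is why the search itself must be automated.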
Although deep learning has made great progress, it is still far from a fully automated machine learning system. Each stage of a machine learning application, such as feature engineering, model selection and algorithm selection, requires customization. Freeing model developers from these tedious tasks, so that they can engage in more innovative work, has therefore become valuable.
At present, the data analysis and mining application for power service mainly includes the following technical solutions:
(1) Business understanding
For data analysis and mining applications in the power business, data analysts must learn the relevant domain knowledge and hold repeated discussions with business personnel and key stakeholders to jointly formulate business requirements and frame the business problem. The analysis target of the project, i.e. the application scenario to be implemented, is then determined together with business application personnel, and a corresponding functional design scheme is written. At the same time, the personnel, techniques, time and data available for project implementation need to be evaluated.
(2) Analytical method selection
The key of this step is to convert the business problem into an analysis problem, form an initial analysis hypothesis, and preliminarily determine the analysis and mining method to be used, so that the artificial intelligence algorithm can be chosen according to the analysis target.
(3) Data preparation
According to the results of the business requirement analysis, potential data sources are investigated and the available data are understood in light of the business rules. The data requirements, together with external data that the power system may use, are analyzed against the analysis target. Because equipment abnormalities, transmission interference and human factors leave the data of uneven quality, with many null values, abnormal values and erroneous values, data preprocessing becomes decisive for analysis and mining. The data must be cleaned using the business rules and the data distribution; data preparation is integrated and fused on a unified data model; and the cleaned data are preprocessed, e.g. by reduction, transformation and discretization, according to the chosen analysis and mining method, to improve the performance of the analysis algorithm.
Part of the data is labeled according to the training requirements of the algorithm model, forming normalized data that the algorithm can recognize and supporting model training.
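The preprocessing steps above (null filling, outlier handling, transformation, discretization) can be sketched for a single numeric column; the 3-sigma clipping threshold and equal-width binning here are common illustrative choices, not the patent's prescribed methods.

```python
from statistics import mean, pstdev

def clean_column(values, sigma=3.0):
    """Fill nulls with the mean, then clip outliers beyond ±sigma std devs."""
    present = [v for v in values if v is not None]
    mu, sd = mean(present), pstdev(present)
    filled = [mu if v is None else v for v in values]
    lo, hi = mu - sigma * sd, mu + sigma * sd
    return [min(max(v, lo), hi) for v in filled]

def min_max_scale(values):
    """Rescale to [0, 1] — a typical transformation step."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values]

def discretize(values, bins=4):
    """Equal-width binning of already-scaled values into integer bins."""
    return [min(int(v * bins), bins - 1) for v in values]

# A toy sensor column with nulls and one gross outlier.
raw = [0.2, None, 0.5, 40.0, 0.4, 0.3, None, 0.6]
scaled = min_max_scale(clean_column(raw))
```

In practice each step would be driven by the business rules and the observed data distribution, as the text describes.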
(4) Data modeling
Model training, parameter tuning and algorithm verification are carried out on the preliminarily chosen analysis method according to the analysis hypothesis and the state of the data. Through data exploration and variable selection, descriptive statistical analysis and exploratory modeling are performed to understand the relationships between variables. Based on the analysis hypothesis, the analysis target and the data exploration results, one specific analysis method or class of methods is selected; when large-scale full data are analyzed and mined, a distributed algorithm from a modern analysis and mining tool is used for model training. In this process many machine learning algorithms must be parallelized, which remains a challenge in academia and industry. During training, model parameters are adjusted and optimized according to the results of the analysis method.
(5) Model evaluation
This step verifies the analysis method on actual data (data not used in training) and iteratively optimizes the analysis and mining model according to the verification results. Data dimensions or attributes are screened against the project's analysis targets and designed business scenarios, and display modes are selected according to the purpose and user group.
(6) Application development
The analysis process and analysis methods are solidified into a model, a business application module is developed on the model, and a standard I/O interface is provided to interconnect with business systems. According to the requirements of the analysis project, an implementation strategy of trial application first, then comprehensive roll-out, is adopted. Feedback is collected during implementation, and whether the model needs correction is determined from the feedback.
The above prior art has the following disadvantages:
The whole data analysis process involves much work that requires human participation, such as algorithm selection, setting up the model training environment, and model evaluation.
In algorithm selection, a model suited to a specific power task must be chosen from a large number of algorithms for each specific power service; this requires analysts with a high level of algorithmic skill and carries a large workload.
In model training, analysts must build the software and hardware environment for the algorithm model themselves; this construction work falls on the staff, and because training conditions are usually poor, training takes long and yields weak results.
In model evaluation, analysts must define the evaluation indexes of the algorithm model by hand and write the corresponding evaluation programs.
Disclosure of Invention
The invention aims to solve the above problems in the prior art by providing an artificial intelligence automatic training method, system, device and storage medium. It realizes functions such as sample data management, adaptive model selection and automatic model training; achieves automatic training of artificial intelligence models; reduces the difficulty of developing such models; resolves the dependence of traditional artificial intelligence workflows on personal experience for tedious, time-consuming steps such as feature selection and model evaluation; and improves development efficiency.
To achieve this purpose, the invention adopts the following technical scheme:
an artificial intelligence automatic training method comprises the following steps:
receiving the uploaded sample data, and carrying out manual or automatic auxiliary labeling on the sample data;
selecting a target service field corresponding to the marking information according to the marking information of the sample data;
selecting a target model training task from model training tasks corresponding to the target business field according to the target business field;
matching a target deep learning model according to the target model training task;
carrying out automatic model training and parameter pre-training aiming at the target deep learning model;
and realizing automatic model evaluation according to the type of the model training task and the evaluation method.
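The chain of steps above (labels → business field → training task → model) can be sketched as a minimal dispatch pipeline; every mapping, label and model name below is a hypothetical placeholder standing in for the patent's real components.

```python
# Illustrative mappings only — not the patent's actual tables.
FIELD_BY_LABEL = {           # annotation label -> target business field
    "insulator": "operation_inspection",
    "no_helmet": "safety_supervision",
    "power_entity": "scheduling",
}
TASK_BY_FIELD = {            # business field -> model training task
    "operation_inspection": "image_recognition",
    "safety_supervision": "video_analysis",
    "scheduling": "text_analysis",
}
MODEL_BY_TASK = {            # training task -> deep learning model family
    "image_recognition": "convolutional_network",
    "video_analysis": "convolutional_network",
    "text_analysis": "recurrent_network",
}

def auto_train(sample_labels):
    """Resolve field -> task -> model for labeled samples.

    Actual training and evaluation would follow; the resolved pipeline is
    returned so the control flow of the claimed steps is visible.
    """
    field = FIELD_BY_LABEL[sample_labels[0]]
    task = TASK_BY_FIELD[field]
    model = MODEL_BY_TASK[task]
    return {"field": field, "task": task, "model": model}
```

The point of the sketch is that once the mappings are defined, no human decision is needed between labeling and training.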
An artificial intelligence automatic training system comprises:
a sample data management module, used to receive uploaded sample data and label it manually or with automatic assistance;
a business field selection module, used to select the target business field corresponding to the labeling information of the sample data;
a target model training task selection module, used to select a target model training task from the model training tasks corresponding to the target business field;
a target deep learning model matching module, used to match a target deep learning model according to the target model training task;
a model and parameter training module, used to carry out automatic model training and parameter pre-training on the target deep learning model;
a model evaluation module, used to realize automatic model evaluation according to the model training task type and the evaluation method.
An artificial intelligence automatic training device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the method as described above when executing the computer program.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method as described above.
Compared with the prior art, the invention has the following beneficial effects:
the artificial intelligence automatic training method is oriented to automatic training of models such as images and texts of electric power multi-scene, has the characteristics of simplicity in operation, high efficiency in model training and the like, can effectively support model training such as image recognition, text analysis, voice recognition and the like of services such as power grid scheduling, operation and inspection, safety supervision, marketing and the like, and realizes automatic matching of the services and deep learning models.
Furthermore, when using automatic neural architecture search, the method adopts graph convolutional neural network architecture search based on an information-passing mechanism, uses sparse and dense filters to handle coarse- and fine-grained feature filtering, designs basic operators adapted to the structure of power models, and explores feasible model search strategies.
Furthermore, the invention realizes automatic network training with little manual intervention by designing basic operators and using parameter update schemes such as reinforcement learning and differentiable learning.
Further, when a gradient descent algorithm is used to search the network structure, the invention addresses the growth in computation and loss of stability as the network deepens in two ways: first, search space approximation, i.e. reducing the number of candidate operations as the number of network layers increases, which reduces computation; and second, search space regularization, i.e. randomly dropping operation layers and limiting the number of skip connections, which prevents skip connections from dominating the operation layers.
By establishing a fully automatic model training system for multiple power scenarios, the invention provides model developers with a convenient, easy-to-use and efficient training environment, greatly lowers the technical threshold of model development, shortens development time, effectively builds algorithm models suited to multiple power scenarios, and strongly supports the intelligent development of the power business.
Drawings
To illustrate the technical solutions of the embodiments more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the invention and therefore should not be considered limiting of its scope; those skilled in the art can obtain other related drawings from them without inventive effort.
FIG. 1 is a flow chart of automated model training of the present invention.
FIG. 2 is a schematic diagram of a search space of a convolutional neural network of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be obtained by a person skilled in the art without inventive step based on the embodiments of the present invention, are within the scope of protection of the present invention.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
In the description of the embodiments, it should be noted that terms such as "upper", "lower", "horizontal" and "inner", if used, indicate orientations or positional relationships based on those shown in the drawings, or those in which the product of the invention is usually placed. They are used only for convenience and simplicity of description; they do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation, and thus should not be construed as limiting the invention. Furthermore, terms such as "first" and "second" are used merely to distinguish descriptions and are not to be understood as indicating or implying relative importance.
Furthermore, the term "horizontal", if present, does not mean that the component is required to be absolutely horizontal, but may be slightly inclined. For example, "horizontal" merely means that the direction is more horizontal than "vertical" and does not mean that the structure must be perfectly horizontal, but may be slightly inclined.
It should further be noted that, unless otherwise explicitly stated or limited, the terms "disposed", "mounted" and "connected" should be interpreted broadly: a connection may, for example, be fixed, detachable or integral; mechanical or electrical; direct, indirect through an intermediate medium, or internal between two elements. The specific meanings of these terms in the invention can be understood by those skilled in the art according to the situation.
Tuning the hyper-parameters of a deep learning model costs a great deal of time and trial and error. The recent trend of using algorithms to solve the parameter-tuning problem itself is increasingly favored by researchers.
Automated machine learning (AutoML) was proposed against this background. The hope is to automate the human work involved, lowering the threshold of machine learning so that more people have the opportunity to use it. AutoML treats applying machine learning as an end-to-end process: in a typical task it automates the important steps of model construction, optimization and evaluation, allowing a machine learning model to be applied without human intervention. Its main problems are selecting appropriate features, an appropriate model family, and appropriate model parameters. Transfer learning moves well-trained model parameters into a new model to help train it. Since most data and tasks are related, the learned parameters (which can be understood as the knowledge the model has learned) can be shared with the new model to accelerate and optimize its learning, so the new model need not learn from scratch as most networks do. Transfer learning can both accelerate training and alleviate the shortage of training samples in the target field.
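The parameter-transfer idea can be sketched with toy models represented as dicts of weight lists: pretrained "base" layers are copied over and frozen, and gradient updates touch only the new task head. The layer names and the single-step update rule are illustrative assumptions.

```python
# Toy transfer learning: models are dicts mapping layer name -> weight list.
def transfer(pretrained, new_model, frozen_layers):
    """Copy pretrained weights for the frozen (shared) layers into new_model."""
    for name in frozen_layers:
        new_model[name] = list(pretrained[name])  # reuse learned knowledge
    return new_model

def train_step(model, frozen_layers, grads, lr=0.1):
    """Gradient step that skips frozen layers, so only the head adapts."""
    for name, g in grads.items():
        if name in frozen_layers:
            continue
        model[name] = [w - lr * gi for w, gi in zip(model[name], g)]
    return model

pretrained = {"conv1": [0.5, -0.2], "head": [0.1, 0.1]}
target = {"conv1": [0.0, 0.0], "head": [0.0, 0.0]}
target = transfer(pretrained, target, frozen_layers={"conv1"})
target = train_step(target, {"conv1"},
                    grads={"conv1": [1.0, 1.0], "head": [1.0, -1.0]})
# conv1 keeps the transferred values; only head moved.
```

Freezing the shared layers is what lets a small target dataset suffice: only the head's parameters must be learned anew.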
The artificial intelligence automatic training system for multiple power scenarios comprises functional modules for sample data management, business field selection, adaptive model matching, model training, model evaluation, model encapsulation and the like. For the model training tasks, network search frameworks oriented to convolutional neural networks, recurrent neural networks, graph convolutional neural networks and the like are designed; basic operators adapted to the task structure are designed; feasible model search strategies are explored; and multi-task automatic training in power grid application scenarios is realized. Automatic model training achieves network training with little manual intervention by designing basic operators and using parameter update schemes such as reinforcement learning and differentiable learning.
The invention is described in further detail below with reference to the accompanying drawings:
Referring to fig. 1, an embodiment of the invention discloses an artificial intelligence automatic training method comprising the following steps:
(1) Sample data management
Sample data management in the artificial intelligence automatic training platform for multiple power scenarios covers sample data uploading, labeling and management. Uploading supports local upload, server upload and the like. Labeling offers two modes, manual and automatically assisted; automatic assistance labels the sample data with an automatic labeling tool, improving labeling efficiency. Sample data management standardizes the whole data life cycle and supports automatic model training.
Users can select the sample data type when uploading; the platform labels the data automatically based on that type, and after manual review of the automatic labels the data can be used for model training. Automatically labeled sample data types include images, videos and text for power transmission and transformation, safety supervision and the like. Power transmission and transformation image labels include foreign objects, insulators, small fittings, ground wires and the like. Safety supervision video labels include entering over a boundary line, not wearing a safety helmet, not wearing work clothing, smoking, not carrying the correct tools, face recognition, not wearing insulating gloves, using non-insulated articles near live equipment, and the like. Text labels include power equipment, lines, substations, organizations, places, personal names, and the like.
(2) Power adaptive model matching
The business field is determined from the sample data, and the application within the subdivided business field is determined by computing sample content similarity. For example, adaptive model matching first decides from the sample type whether the application belongs to the image, text (natural language processing) or speech category, and then to a concrete application such as operation-and-inspection bird nest recognition, insulator breakage, or violations such as an unfastened safety belt. The business fields of operation and inspection, scheduling, safety supervision and marketing mainly involve model training tasks such as image recognition, speech recognition, text analysis and knowledge graphs. Knowledge graphs and sentiment analysis are natural language text applications and belong to the natural language processing type; speech synthesis and voiceprint recognition belong to speech processing and speech recognition. The scheduling field involves tasks such as knowledge graph construction, text analysis, speech recognition and speech synthesis; the operation and inspection field involves image recognition, voiceprint recognition, knowledge graphs and the like; the safety supervision field mainly involves video analysis, violation recognition and the like; and the marketing field involves speech recognition, sentiment analysis, knowledge graphs and the like. Image recognition and speech recognition are broad categories of artificial intelligence application; image recognition covers operation-and-inspection cases such as tower bird nest recognition and insulator breakage, and violation recognition such as unfastened safety belts and smoking.
Adaptive model matching automatically matches the optimal deep learning model according to the task type, e.g. image recognition, text analysis or speech recognition, of power businesses such as operation and inspection, safety supervision, scheduling and marketing. For image recognition, natural language processing and speech it matches deep learning models such as convolutional neural networks, recurrent neural networks and graph convolutional neural networks.
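The content-similarity matching described above can be sketched as nearest-prototype classification under cosine similarity; the prototype feature vectors and sub-field names below are made-up placeholders for whatever representation the platform actually uses.

```python
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical prototype vectors, one per subdivided application.
PROTOTYPES = {
    "bird_nest_recognition": [1.0, 0.0, 0.0],
    "insulator_breakage": [0.0, 1.0, 0.0],
    "seatbelt_violation": [0.0, 0.0, 1.0],
}

def match_subfield(sample_vec):
    """Assign the sample to the most similar sub-field prototype."""
    return max(PROTOTYPES, key=lambda k: cosine(sample_vec, PROTOTYPES[k]))
```

Any similarity measure and feature extractor could be substituted; the essential idea is that the sub-field is chosen by proximity in feature space rather than by hand.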
(3) Network architecture search
For specific services in the fields of power operation and inspection, safety supervision, scheduling and marketing, automatic model training and parameter pre-training of deep learning models are realized on a differentiable network search framework: both are embedded in a unified framework for collaborative learning, an effective network architecture search technique is designed, and the optimal training parameters are optimized while the optimal model structure is searched automatically.
(4) When automatic neural architecture search is used, basic operators adapted to the structure of the power model are designed and feasible model search strategies are explored. Automatic network training is realized with little manual intervention by designing basic operators and using parameter update schemes such as reinforcement learning and differentiable learning. As shown in fig. 1, this embodiment employs convolutional and recurrent neural network architecture search based on search space regularization, and graph convolutional neural network architecture search based on an information-passing mechanism.
Search space regularization-based convolutional neural network and cyclic neural network architecture search
The network search strategy is improved, and the search space is regularized to improve network search performance, as shown in fig. 2. Gradient-based network search suffers from unstable training: first, the accuracy on the target task is overly sensitive to random initialization; second, the optimal substructure searched on a proxy dataset may not achieve satisfactory performance on the target dataset. This instability is caused by the optimization gap, i.e., the gap between searching the optimal substructure on the proxy dataset and retraining the subnetwork on another dataset. Even if the target dataset and the proxy dataset are the same, there is no guarantee that the network obtained by stacking cells multiple times is the optimal subnetwork.
The optimization-gap problem is solved by regularizing the search space. The discrete search space is relaxed into a continuous, differentiable function, and the network structure is searched based on a gradient descent algorithm. The search space is composed of cells; each cell is a directed acyclic graph containing N nodes and a number of edges. In a cell, each node represents a feature layer, and the edges between nodes represent feature transformation operations o(·) ∈ O. Each intermediate node x_j is densely linked with all its predecessor nodes x_1, x_2, …, x_{j-1}, and the connecting edges are denoted E^{(i,j)} (i < j). DARTS computes weight parameters α^{(i,j)} for all operations on each edge and, after normalization through a softmax function, uses them for weighted aggregation of the operations, thereby relaxing the discrete search space into a continuous, differentiable one. To counter the increased computation and reduced stability as the network deepens, two measures are adopted: first, search space approximation, i.e., the number of candidate operations is reduced as the number of network layers increases, so as to reduce computation; second, search space regularization, i.e., operation layers are randomly dropped and the number of skip connections is reduced, to prevent skip connections from dominating at the operation level.
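The continuous relaxation and the subsequent discretization described above can be sketched in a few lines of Python; the three toy scalar operations and the single edge below are illustrative assumptions, not the operator set actually used in the embodiment:

```python
import math

def softmax(alphas):
    """Normalize the architecture weights alpha on one edge."""
    m = max(alphas)
    exps = [math.exp(a - m) for a in alphas]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical candidate operations on one edge, acting on a scalar "feature".
OPS = {
    "identity": lambda x: x,
    "double":   lambda x: 2.0 * x,
    "zero":     lambda x: 0.0,
}

def mixed_op(x, alphas):
    """Continuous relaxation: softmax-weighted sum of all candidate ops."""
    weights = softmax(alphas)
    return sum(w * op(x) for w, op in zip(weights, OPS.values()))

def discretize(alphas):
    """After the search, keep only the most likely operation (argmax over alpha)."""
    names = list(OPS)
    return names[max(range(len(alphas)), key=lambda i: alphas[i])]
```

With equal weights, `mixed_op(3.0, [0.0, 0.0, 0.0])` averages the three candidate outputs (3, 6 and 0) to 3.0, and `discretize` then collapses the edge back to a single discrete operation.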
Graph convolution neural network architecture search based on information transfer mechanism
A graph neural architecture search method based on a newly designed search space and a gradient-based search strategy is designed to automatically learn a better structure with the optimal message-passing depth on the graph. A powerful graph network search space is constructed from a graph neural structure paradigm with a tree-topology computation process and two types of fine-grained basic operators, feature filtering and neighbor aggregation: feature filtering performs adaptive feature selection, and neighbor aggregation captures structure information and computes statistics of the neighbors.
First, a graph neural structure paradigm suitable for the network architecture search task is designed, and the topology of the graph neural structure computation graph is defined as a directed tree, where each node represents a latent graph-embedding representation and each edge represents an operation (feature filtering or neighbor aggregation). The feature filtering operation performs adaptive selection of features through a gating mechanism that controls information flow; two filters, a sparse filter and a dense filter, are designed for the coarse- and fine-grained feature filtering tasks, formulated as:
$$F_{s}(H)=\mathrm{diag}\big(M_{Q}([H,H_{\mathrm{in}}])\big)\,H,\qquad F_{d}(H)=M_{Z}([H,H_{\mathrm{in}}])\odot H \qquad (1)$$

where $F_{s}(H)$ denotes the sparse filter, $F_{d}(H)$ denotes the dense filter, $H$ denotes the graph embedding to be filtered, and $H_{\mathrm{in}}$ denotes the initial input of the whole graph neural structure computation tree.
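A minimal sketch of the two filters in formula (1), assuming a single hypothetical scalar gate parameter `w` in place of the learned gate networks $M_Q$ and $M_Z$, and plain Python lists in place of real graph embeddings:

```python
import math

def _sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def sparse_filter(h, h_in, w):
    """F_s: one shared gate scaling the whole embedding (coarse filtering).

    Stands in for diag(M_Q([H, H_in])) H with a hypothetical scalar gate.
    """
    gate = _sigmoid(w * (sum(h) + sum(h_in)))
    return [gate * v for v in h]

def dense_filter(h, h_in, w):
    """F_d: one gate per feature dimension (fine-grained filtering).

    Stands in for M_Z([H, H_in]) ⊙ H with a hypothetical element-wise gate.
    """
    return [_sigmoid(w * (a + b)) * a for a, b in zip(h, h_in)]
```

With `w = 0` every gate evaluates to 0.5, so both filters simply halve the embedding; a learned `w` would let the gates pass or suppress features adaptively.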
Second, the search space is made continuous using the same method as in differentiable network architecture search. The choice of a particular operation is then relaxed into a softmax over all possible operations:
$$\bar{o}^{(i,j)}(x)=\sum_{o\in\mathcal{O}}\frac{\exp\left(\alpha_{o}^{(i,j)}\right)}{\sum_{o'\in\mathcal{O}}\exp\left(\alpha_{o'}^{(i,j)}\right)}\,o(x) \qquad (2)$$
where $\mathcal{O}$ represents the set of all possible operations. After relaxation, the optimization goal is to jointly learn the architecture parameters α in all mixed operations and the weight parameters w; α and w can be optimized efficiently using a gradient-based approach.
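The joint gradient-based optimization of α and w is typically performed by alternating update steps; the toy scalar objectives below are stand-ins for the real training and validation losses, chosen only to make the alternation concrete:

```python
def train_step(w, a, lr=0.1):
    """One alternating step: update weights w on a stand-in training loss
    (w - a)^2, then architecture parameter a on a stand-in validation
    loss (a - 1)^2. Both objectives are illustrative toys."""
    grad_w = 2.0 * (w - a)       # d/dw (w - a)^2
    w = w - lr * grad_w
    grad_a = 2.0 * (a - 1.0)     # d/da (a - 1)^2
    a = a - lr * grad_a
    return w, a

w, a = 0.0, 0.0
for _ in range(200):
    w, a = train_step(w, a)
# both w and a converge toward 1.0
```

In the real search, the two gradients come from the training set (for w) and the validation set (for α), but the alternation has the same shape.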
Finally, when the search framework is determined, each mixed operation $\bar{o}^{(i,j)}$ is replaced by the most likely operation:

$$o^{(i,j)}=\arg\max_{o\in\mathcal{O}}\ \alpha_{o}^{(i,j)} \qquad (3)$$
(5) Model evaluation
Model evaluation realizes automatic evaluation according to the training task type and the evaluation method. Evaluation measures include accuracy, recall, the F value and the like.
The accuracy rates are as follows:
accuracy = number of correct entities identified/number of entities identified
The recall ratio is as follows:
recall = number of correct entities identified/number of entities in sample data
The accuracy and the recall rate both lie between 0 and 1, and the closer the value is to 1, the higher the accuracy or recall; when accuracy and recall conflict with each other, the F value is used:
f value = (2 × accuracy × recall)/(accuracy + recall)
Wherein the F value is a weighted harmonic mean of the accuracy and the recall.
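The three evaluation measures above translate directly into code; the function names are illustrative:

```python
def f_value(accuracy, recall):
    """Weighted harmonic mean of accuracy and recall (the F value)."""
    if accuracy + recall == 0:
        return 0.0
    return 2.0 * accuracy * recall / (accuracy + recall)

def evaluate(num_correct_identified, num_identified, num_in_sample):
    """Accuracy = correct identified / identified;
    recall = correct identified / entities in the sample data."""
    accuracy = num_correct_identified / num_identified
    recall = num_correct_identified / num_in_sample
    return accuracy, recall, f_value(accuracy, recall)
```

For instance, identifying 10 entities of which 8 are correct, out of 16 entities in the sample data, gives accuracy 0.8 and recall 0.5, and the F value balances the two.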
(6) Model packaging
The trained model is packaged according to the model's application scenario and deployment mode, and can be released and used in multiple modes such as a cloud API and a device-side SDK.
The invention also discloses an artificial intelligence automatic training system, which comprises:
the sample data management module is used for receiving the uploaded sample data and carrying out manual or automatic auxiliary labeling on the sample data;
the service field selection module is used for selecting a target service field corresponding to the marking information according to the marking information of the sample data;
the target model training task selection module is used for selecting a target model training task from model training tasks corresponding to the target business field according to the target business field;
the target deep learning model matching module is used for matching a target deep learning model according to the target model training task;
the model and parameter training module is used for carrying out automatic model training and parameter pre-training aiming at the target deep learning model;
and the model evaluation module is used for realizing automatic model evaluation according to the model training task type and the evaluation method.
In order to support artificial intelligence model training, the artificial intelligence automatic training system further comprises the following modules:
1) and the server management module is used for managing the CPU server and the GPU server and distributing the training task to the distributed computing nodes to execute the computation.
2) And the training frame integration module is used for realizing the abstraction of the training process, determining related data and parameters, starting a training task and monitoring and analyzing the training process.
3) And the pooling computing resource management module is used for clouding GPU resources, so that the platform can automatically distribute the training tasks to the appropriate GPUs when the training tasks are started.
4) The resource isolation and environment management module realizes efficient management of resources and environments in the large-scale computing nodes, and is compatible with GPUs of different models, CUDA/CuDNN of different versions and different deep learning frames.
The invention provides an artificial intelligence automatic training device. The artificial intelligence automatic training device of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. The processor realizes the steps of the above-mentioned method embodiments when executing the computer program. Alternatively, the processor implements the functions of the modules/units in the above device embodiments when executing the computer program.
The computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention.
The artificial intelligence automatic training device can be computing equipment such as a desktop computer, a notebook computer, a palm computer and a cloud server. The artificial intelligence automatic training device may include, but is not limited to, a processor, a memory.
The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc.
The memory may be used to store the computer programs and/or modules, and the processor may implement the various functions of the artificial intelligence automatic training device by running or executing the computer programs and/or modules stored in the memory and invoking the data stored in the memory.
The modules/units integrated by the artificial intelligence automatic training device can be stored in a computer readable storage medium if they are implemented in the form of software functional units and sold or used as independent products. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying said computer program code, a recording medium, a usb-disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-only memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, etc. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes will occur to those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (14)

1. An artificial intelligence automatic training method is characterized by comprising the following steps:
receiving the uploaded sample data, and carrying out manual or automatic auxiliary labeling on the sample data; the automatic auxiliary marking is that the automatic marking of the sample data is realized by utilizing the existing model according to the type of the sample data; the automatic marking is realized based on the sample data type selected when the user uploads the sample data, and the sample data is applied to model training based on the marked information which passes the check; the automatic auxiliary labeling comprises power transmission and transformation image labeling, safety supervision video labeling and text information labeling; the electric transmission and transformation image annotation comprises annotation of at least one of the following target objects in the electric transmission and transformation image: foreign matters, insulators, small hardware fittings or ground wires; the safety supervision video annotation comprises the annotation of at least one target object in the safety supervision video: crossing the line and breaking into, not wearing safety helmet, not wearing frock, smoking, not carrying correct tool, face recognition, not wearing insulating gloves or using non-insulating article near the live equipment; the text information annotation comprises an annotation to at least one of the following target objects in the text information: electrical equipment, lines, substations, organizations, locations or names of people;
selecting a target service field corresponding to the marking information according to the marking information of the sample data;
selecting a target model training task from model training tasks corresponding to the target business field according to the target business field;
matching a target deep learning model according to the target model training task;
carrying out automatic model training and parameter pre-training aiming at the target deep learning model;
and realizing automatic model evaluation according to the type of the model training task and the evaluation method.
2. The artificial intelligence auto-training method of claim 1, wherein the sample data is uploaded locally or from a server.
3. The artificial intelligence automatic training method of claim 1, wherein the business domain includes a shipping service, a scheduling service, a security surveillance service, and a marketing service; the operation and inspection service corresponds to model training tasks of image recognition, voiceprint recognition and knowledge graph; scheduling business corresponding to a model training task of knowledge graph construction, text analysis, voice recognition and voice synthesis; the safety supervision business corresponds to a model training task of video analysis, and the marketing business corresponds to a model training task of voice recognition, emotion analysis and a knowledge graph.
4. The artificial intelligence automatic training method of claim 1, wherein the specific method of the automatic model training and the parameter training of the deep learning model is as follows:
aiming at the convolutional neural network, carrying out automatic model training and parameter pre-training by adopting a method based on search space regularization;
aiming at the cyclic neural network, performing automatic model training and parameter pre-training by adopting a search space regularization-based method;
aiming at the graph convolution neural network, an automatic model training and parameter pre-training are carried out by adopting a method based on an information transfer mechanism.
5. The artificial intelligence automatic training method of claim 1, wherein the matching a target deep learning model according to the target model training task comprises:
determining the type of a target model training task;
if the type of the target model training task is an image type, determining that a target deep learning model is a convolutional neural network;
if the type of the target model training task is a natural language processing type, determining that a target deep learning model is a cyclic neural network;
and if the type of the target model training task is a voice recognition type, determining that the target deep learning model is a graph convolution neural network.
6. The method of claim 4, wherein the automated model training and parameter pre-training using a search space regularization based approach comprises:
relaxing the search space into a continuous, differentiable function;
and searching the network structure based on a gradient descent algorithm.
7. The artificial intelligence auto-training method of claim 6, wherein the relaxing the search space into a continuous, differentiable function comprises:
densely linking each intermediate node x_j in the search space cell to all its predecessor nodes x_1, x_2, …, x_{j-1}, wherein the search space cell is a directed acyclic graph containing N nodes and a plurality of edges, each node represents a feature layer, and the edge E^{(i,j)} between node i and node j represents a feature transformation operation o(·) ∈ O, where i < j;
computing weight parameters α^{(i,j)} for all feature transformation operations on each edge;
and normalizing the weight parameters of each feature transformation operation through a softmax function, the normalization result being used for weighted aggregation of the operations.
8. The method according to claim 6, wherein a search space approximation process and a search space regularization process are performed during the network structure search based on the gradient descent algorithm.
9. The artificial intelligence auto-training method of claim 4, wherein the search space employed by the information transfer mechanism based method is constructed by a target graph neural structure paradigm having a tree topology computation process and a target fine-grained basic operator, the target fine-grained basic operator comprising feature filtering for performing adaptive feature selection and neighbor aggregation for capturing structure information and computing statistics of neighbors;
the search strategy adopted by the information transfer mechanism-based method comprises the following steps:
performing adaptive selection of features through a gating mechanism that controls information flow, using a sparse filter $F_{s}(H)$ and a dense filter $F_{d}(H)$ for the coarse- and fine-grained feature filtering tasks:

$$F_{s}(H)=\mathrm{diag}\big(M_{Q}([H,H_{\mathrm{in}}])\big)\,H,\qquad F_{d}(H)=M_{Z}([H,H_{\mathrm{in}}])\odot H \qquad (1)$$

where $H$ denotes the graph embedding to be filtered and $H_{\mathrm{in}}$ denotes the initial input of the whole graph neural structure computation tree;
making the search space continuous using the same method as in differentiable network architecture search, then relaxing the choice of a particular operation into a softmax over all possible operations:

$$\bar{o}^{(i,j)}(x)=\sum_{o\in\mathcal{O}}\frac{\exp\left(\alpha_{o}^{(i,j)}\right)}{\sum_{o'\in\mathcal{O}}\exp\left(\alpha_{o'}^{(i,j)}\right)}\,o(x) \qquad (2)$$

where $\mathcal{O}$ represents the set of all possible operations; after relaxation, jointly learning the architecture parameters α and the weight parameters w in all mixed operations is taken as the optimization goal, and α and w are optimized efficiently using a gradient-based approach;

and when the search framework is determined, replacing each mixed operation $\bar{o}^{(i,j)}$ with the most likely operation:

$$o^{(i,j)}=\arg\max_{o\in\mathcal{O}}\ \alpha_{o}^{(i,j)} \qquad (3).$$
10. the artificial intelligence automatic training method of claim 1, wherein the evaluation method comprises calculating accuracy, calculating recall, and calculating F-number;
the calculation formula of the accuracy is as follows:
accuracy = number of correct entities identified/number of entities identified
The recall rate is calculated according to the following formula:
recall = number of correct entities identified/number of entities in sample data
The accuracy and the recall rate are between 0 and 1, and the closer the numerical value is to 1, the higher the accuracy or the recall rate is;
the calculation formula of the F value is as follows:
when the accuracy rate and the recall rate are contradictory, the F value is:
f value = (2 × accuracy × recall)/(accuracy + recall)
Wherein the F value is a weighted harmonic mean of the accuracy and the recall.
11. An artificial intelligence automatic training system, comprising:
the sample data management module is used for receiving the uploaded sample data and carrying out manual or automatic auxiliary labeling on the sample data; the automatic auxiliary marking is that the automatic marking of the sample data is realized by utilizing the existing model according to the type of the sample data; the automatic marking is realized based on the sample data type selected when the user uploads the sample data, and the sample data is applied to model training based on the marked information which passes the check; the automatic auxiliary labeling comprises power transmission and transformation image labeling, safety supervision video labeling and text information labeling; the electric transmission and transformation image annotation comprises annotation of at least one of the following target objects in the electric transmission and transformation image: foreign matters, insulators, small hardware fittings or ground wires; the safety supervision video annotation comprises the annotation of at least one target object in the safety supervision video: crossing the line and breaking into, not wearing safety helmet, not wearing frock, smoking, not carrying correct tool, face recognition, not wearing insulating gloves or using non-insulating article near the live equipment; the text information annotation comprises an annotation to at least one of the following target objects in the text information: electrical equipment, lines, substations, organizations, locations or names of people;
the service field selection module is used for selecting a target service field corresponding to the marking information according to the marking information of the sample data;
the target model training task selection module is used for selecting a target model training task from model training tasks corresponding to the target business field according to the target business field;
the target deep learning model matching module is used for matching a target deep learning model according to the target model training task;
the model and parameter training module is used for carrying out automatic model training and parameter pre-training aiming at the target deep learning model;
and the model evaluation module is used for realizing automatic model evaluation according to the model training task type and the evaluation method.
12. The artificial intelligence automatic training system of claim 11, further comprising:
the server management module is used for managing the CPU server and the GPU server and distributing model training tasks to distributed computing nodes to execute computation;
the training frame integration module realizes the abstraction of a training process, starts a training task and monitors and analyzes the training process after determining related data and parameters;
the system comprises a pooling computing resource management module, a resource management module and a resource management module, wherein the pooling computing resource management module is used for clouding GPU resources and realizing that a platform automatically distributes training tasks to corresponding GPUs when the training tasks are started;
the resource isolation and environment management module is used for managing resources and environments in the computing nodes and is compatible with GPUs of different models, CUDA/CuDNN of different versions and different deep learning frameworks.
13. An artificial intelligence auto-training device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor when executing the computer program implements the steps of the method according to any one of claims 1 to 10.
14. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 10.
CN202210669062.6A 2022-06-14 2022-06-14 Artificial intelligence automatic training method, system, device and storage medium Active CN114757307B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210669062.6A CN114757307B (en) 2022-06-14 2022-06-14 Artificial intelligence automatic training method, system, device and storage medium


Publications (2)

Publication Number Publication Date
CN114757307A CN114757307A (en) 2022-07-15
CN114757307B true CN114757307B (en) 2022-09-06

Family

ID=82336190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210669062.6A Active CN114757307B (en) 2022-06-14 2022-06-14 Artificial intelligence automatic training method, system, device and storage medium

Country Status (1)

Country Link
CN (1) CN114757307B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116010602B (en) * 2023-01-10 2023-09-29 湖北华中电力科技开发有限责任公司 Data optimization method and system based on big data
CN116594748A (en) * 2023-05-19 2023-08-15 航天宏图信息技术股份有限公司 Model customization processing method, device, equipment and medium for task
CN116468131B (en) * 2023-06-19 2023-09-01 成都市奇点软件有限公司 Automatic AI (advanced technology attachment) driven project method and system based on staged retraining

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108171117A (en) * 2017-12-05 2018-06-15 南京南瑞信息通信科技有限公司 Electric power artificial intelligence visual analysis system based on multinuclear heterogeneous Computing
CN109271539A (en) * 2018-08-31 2019-01-25 华中科技大学 A kind of image automatic annotation method and device based on deep learning
WO2019113122A1 (en) * 2017-12-04 2019-06-13 Conversica, Inc. Systems and methods for improved machine learning for conversations
CN113344098A (en) * 2021-06-22 2021-09-03 北京三快在线科技有限公司 Model training method and device
CN113706099A (en) * 2021-08-23 2021-11-26 中国电子科技集团公司第二十八研究所 Data labeling and deep learning model training and service publishing system
CN113806574A (en) * 2021-08-31 2021-12-17 北京中育神州数据科技有限公司 Software and hardware integrated artificial intelligent image recognition data processing method
CN113903081A (en) * 2021-09-29 2022-01-07 北京许继电气有限公司 Visual identification artificial intelligence alarm method and device for images of hydraulic power plant
CN114077674A (en) * 2021-10-31 2022-02-22 国电南瑞科技股份有限公司 Power grid dispatching knowledge graph data optimization method and system

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108174165A (en) * 2018-01-17 2018-06-15 重庆览辉信息技术有限公司 Electric power safety operation and O&M intelligent monitoring system and method
CN109635918A (en) * 2018-10-30 2019-04-16 银河水滴科技(北京)有限公司 The automatic training method of neural network and device based on cloud platform and preset model
CN113095346A (en) * 2020-01-08 2021-07-09 华为技术有限公司 Data labeling method and data labeling device
CN111814966A (en) * 2020-08-24 2020-10-23 国网浙江省电力有限公司 Neural network architecture searching method, neural network application method, device and storage medium
CN112434794A (en) * 2020-11-30 2021-03-02 国电南瑞科技股份有限公司 Computer vision data set semi-automatic labeling method and system based on deep learning
CN114445706A (en) * 2022-01-25 2022-05-06 南京工程学院 Power transmission line target detection and identification method based on feature fusion

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019113122A1 (en) * 2017-12-04 2019-06-13 Conversica, Inc. Systems and methods for improved machine learning for conversations
CN108171117A (en) * 2017-12-05 2018-06-15 南京南瑞信息通信科技有限公司 Electric power artificial intelligence visual analysis system based on multinuclear heterogeneous Computing
WO2019109771A1 (en) * 2017-12-05 2019-06-13 南京南瑞信息通信科技有限公司 Power artificial-intelligence visual-analysis system on basis of multi-core heterogeneous parallel computing
CN109271539A (en) * 2018-08-31 2019-01-25 华中科技大学 A kind of image automatic annotation method and device based on deep learning
CN113344098A (en) * 2021-06-22 2021-09-03 北京三快在线科技有限公司 Model training method and device
CN113706099A (en) * 2021-08-23 2021-11-26 中国电子科技集团公司第二十八研究所 Data labeling and deep learning model training and service publishing system
CN113806574A (en) * 2021-08-31 2021-12-17 北京中育神州数据科技有限公司 Software and hardware integrated artificial intelligent image recognition data processing method
CN113903081A (en) * 2021-09-29 2022-01-07 北京许继电气有限公司 Visual identification artificial intelligence alarm method and device for images of hydraulic power plant
CN114077674A (en) * 2021-10-31 2022-02-22 国电南瑞科技股份有限公司 Power grid dispatching knowledge graph data optimization method and system

Also Published As

Publication number Publication date
CN114757307A (en) 2022-07-15

Similar Documents

Publication Publication Date Title
CN114757307B (en) Artificial intelligence automatic training method, system, device and storage medium
CN110659173B (en) Operation and maintenance system and method
Yang et al. Development of image recognition software based on artificial intelligence algorithm for the efficient sorting of apple fruit
Yu et al. LSTM-EFG for wind power forecasting based on sequential correlation features
Jauro et al. Deep learning architectures in emerging cloud computing architectures: Recent development, challenges and next research trend
Wang et al. Power system network topology identification based on knowledge graph and graph neural network
CN111967271A (en) Analysis result generation method, device, equipment and readable storage medium
Xie et al. Logm: Log analysis for multiple components of hadoop platform
CN105630797A (en) Data processing method and system
CN115456093A (en) High-performance graph clustering method based on attention-graph neural network
WO2023129164A1 (en) Digital twin sequential and temporal learning and explaining
Papageorgiou et al. A systematic review on machine learning methods for root cause analysis towards zero-defect manufacturing
CN114265954B (en) Graph representation learning method based on position and structure information
Hu et al. 5G‐Oriented IoT Big Data Analysis Method System
CN115936217A (en) Method and device for judging maturity of business opportunity, storage medium and electronic equipment
CN114329099A (en) Overlapping community identification method, device, equipment, storage medium and program product
CN113807704A (en) Intelligent algorithm platform construction method for urban rail transit data
Wijayanto et al. Predicting future potential flight routes via inductive graph representation learning
Ren et al. [Retracted] A Study on Information Classification and Storage in Cloud Computing Data Centers Based on Group Collaborative Intelligent Clustering
Valls Canudas et al. Reconstruction of the LHCb Calorimeter using Machine Learning: lessons learned
CN105354298A (en) Hadoop based method for analyzing large-scale social network and analysis platform thereof
Wang et al. Unsupervised Data Anomaly Detection Based on Graph Neural Network
CN115392615B (en) Data missing value completion method and system for generating countermeasure network based on information enhancement
CN115080968B (en) Artificial intelligence server with intelligent safety protection
Ching et al. Understanding the Amazon from space

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant