CN117668701B - AI artificial intelligence machine learning system and method - Google Patents
- Publication number
- CN117668701B (application CN202410124324.XA)
- Authority
- CN
- China
- Prior art keywords
- model
- neural network
- data
- training
- machine learning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F18/2415 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
- G06F18/10 - Pre-processing; Data cleansing
- G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N3/044 - Recurrent networks, e.g. Hopfield networks
- G06N3/0464 - Convolutional networks [CNN, ConvNet]
Abstract
The invention relates to the technical field of machine learning, and in particular to an AI artificial intelligence machine learning system and method. The system comprises a data collection and processing unit, which includes a data collection module and a preprocessing module; the preprocessing module cleans the collected data. Residual connections are introduced into the AI deep learning model to improve model performance and accelerate training, and the model is trained on the preprocessed data. A model optimization unit optimizes the structure of the deep learning model by neural architecture search: newly generated individuals undergo a mutation operation that changes the connection weights by adding random noise to the current weights, further optimizing model performance. A decision output unit converts the output of the model into an actual decision. Adding residual connections to the recurrent neural network improves model performance, accelerates training, and improves the model's trainability and sequence modeling capacity.
Description
Technical Field
The invention relates to the technical field of machine learning, in particular to an AI artificial intelligence machine learning system and method.
Background
AI artificial intelligence machine learning systems use machine learning techniques to mimic human intelligence. Machine learning is a branch of artificial intelligence that allows a computer system to accomplish tasks by learning from data rather than by being explicitly programmed. These systems process and analyze large amounts of data to discover patterns and learn from them, in order to make predictions, classify, identify patterns, or make decisions.
However, machine learning systems require a large amount of data to train, and model training efficiency is low. Some machine learning models are also sensitive to changes in the distribution of new data: if the new data is distributed differently from the training data, model performance may degrade.
Disclosure of Invention
The invention aims to provide an AI artificial intelligence machine learning system and method that solve the problems raised in the Background, namely that machine learning systems need a large amount of data for training and that model training efficiency is low.
To achieve the above object, the present invention provides an AI artificial intelligence machine learning system including:
the data collection and processing unit comprises a data collection module and a preprocessing module;
wherein the data collection module collects a data set for training; the data may come from a variety of sources, such as sensors, databases, logs, social media.
The preprocessing module is used for cleaning the collected data;
the model training unit is used for establishing an AI deep learning model based on an AI machine learning algorithm and training it on the preprocessed data, so that the model learns and captures the patterns, features and relationships in the data;
the model optimization unit optimizes the structure of the AI deep learning model based on neural architecture search;
and the decision output unit is used for converting the output of the model into an actual decision.
The decision output unit is responsible for converting the output generated by the trained model into actual decisions, actions or suggestions. Its functions vary according to the specific application scenario; some possible cases and functions are as follows:
Classification and prediction conversion: if the task of the model is classification or prediction, the decision output unit converts the result output by the model into an actual classification label, probability value or prediction result. For example, if the model is for image recognition, the decision output unit may convert the object class predicted by the model into a textual description or a related action recommendation.
Decision making: in some cases, the AI system may need to make decisions about results in a particular context. This unit may map the output of the model to a series of possible actions or decisions and make the final selection according to certain set rules or conditions. For example, based on model predictions of market trends, the decision output unit may recommend buying or selling a security.
As a further improvement of the technical scheme, the data cleaning comprises missing value processing, outlier processing and data deduplication processing.
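The three cleaning steps named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: records are assumed to be tuples of numbers with `None` marking a missing value, missing values are filled with the column median, and outliers are clipped to a band around the median. The function name `clean_records` and the parameter `max_dev` are hypothetical.

```python
from statistics import median


def clean_records(records, max_dev=3.0):
    """Deduplicate, fill missing values and clip outliers, column by column.

    Assumed layout (hypothetical): a list of equal-length tuples of numbers,
    where None marks a missing value.
    """
    # Data deduplication: drop exact duplicates, keeping first-seen order.
    unique = list(dict.fromkeys(records))
    n_cols = len(unique[0])
    cleaned_cols = []
    for c in range(n_cols):
        col = [row[c] for row in unique]
        present = [v for v in col if v is not None]
        med = median(present)
        # Median absolute deviation: a spread estimate that outliers barely move.
        mad = median(abs(v - med) for v in present)
        lo, hi = med - max_dev * mad, med + max_dev * mad
        fixed = []
        for v in col:
            if v is None:
                v = med                      # missing value -> column median
            elif mad > 0:
                v = min(max(v, lo), hi)      # outlier -> clipped to [lo, hi]
            fixed.append(v)
        cleaned_cols.append(fixed)
    # Transpose the cleaned columns back into row tuples.
    return [tuple(row) for row in zip(*cleaned_cols)]


rows = [(1.0, 10.0), (2.0, None), (1.0, 10.0), (3.0, 12.0), (100.0, 11.0)]
print(clean_records(rows))
```

Clipping rather than dropping outliers keeps the number of rows stable; either policy fits the description, and which one is appropriate depends on the data source.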
As a further improvement of the technical scheme, the AI machine learning algorithm is based on a fusion model of a recurrent neural network and a convolutional neural network, and comprises the following specific steps:
S3.1, using a convolutional neural network as a feature extractor to extract image data features from the preprocessed data, the spatial features of the image being extracted through convolution and pooling layers;
S3.2, mapping the image data features extracted by the convolutional neural network, i.e. the extracted spatial features, into the recurrent neural network;
S3.3, the recurrent neural network receives the spatial features extracted by the convolutional neural network as input and processes the sequence information, with residual connections added to the recurrent neural network; the residual connections improve model performance, accelerate training, and improve the model's trainability and sequence modeling capacity, in particular for image sequence tasks;
and S3.4, modeling the sequence information with the recurrent neural network, and adding an output layer to the recurrent neural network to perform any one of the prediction, classification or decision output tasks.
As a further improvement of the present technical solution, in S3.1, in the convolutional neural network, the image feature extraction is completed through a convolutional layer and a pooling layer, which specifically includes:
convolution operation:
$\mathrm{Conv}(I, K) = I * K$;
wherein $I$ represents the input image; $K$ represents the convolution kernel;
pooling operation:
$\mathrm{Pool}(I) = \max(I)$.
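A minimal sketch of the two operations on nested lists, assuming a single-channel image, no padding, stride 1 for the convolution, and non-overlapping windows for the pooling (none of these details are fixed by the text). The function names are hypothetical, and `conv2d` computes the un-flipped sliding-window product that deep learning libraries commonly call convolution.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def max_pool2d(fmap, size=2):
    """Take the maximum over non-overlapping size x size windows.

    Trailing rows or columns that do not fill a whole window are dropped.
    """
    return [[max(fmap[i + a][j + b] for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]


image = [[1, 2, 3, 0, 1],
         [0, 1, 2, 3, 1],
         [1, 0, 1, 2, 0],
         [2, 1, 0, 1, 2],
         [0, 2, 1, 0, 1]]
kernel = [[1, 0],
          [0, -1]]          # responds to changes along the main diagonal
features = conv2d(image, kernel)
print(max_pool2d(features))
```

Here Pool(I) = max(I) is applied per window, which is the usual reading of max pooling; applying it to the whole map at once would correspond to global max pooling.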
As a further improvement of the present technical solution, in S3.3, the calculation by which the recurrent neural network (RNN) processes sequence information is:
$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$;
wherein $h_t$ represents the hidden state of the current time step; $x_t$ represents the input; $W_{hh}$ and $W_{xh}$ represent weight matrices; $b_h$ represents the bias; $\mathrm{Activation}(\cdot)$ represents the activation function;
wherein the residual connection is added to the formula, and the formula is adjusted to:
$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h) + h_{t-1}$;
wherein $h_{t-1}$ represents the hidden state of the previous time step; adding $h_{t-1}$ to the new hidden state calculation realizes the residual connection and facilitates the flow of information and the propagation of gradients.
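The residual update can be sketched in a few lines. This is an illustrative sketch only: `tanh` is assumed as the activation (the text does not name one), and the helper names `matvec`, `residual_rnn_step` and `run_sequence` are hypothetical.

```python
import math


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def residual_rnn_step(h_prev, x_t, W_hh, W_xh, b_h, activation=math.tanh):
    """One step of h_t = Activation(W_hh h_{t-1} + W_xh x_t + b_h) + h_{t-1}."""
    rec = matvec(W_hh, h_prev)
    inp = matvec(W_xh, x_t)
    # The trailing "+ h" is the residual connection from the previous state.
    return [activation(r + i + b) + h
            for r, i, b, h in zip(rec, inp, b_h, h_prev)]


def run_sequence(xs, h0, W_hh, W_xh, b_h):
    """Apply the residual step to every input in order; return the last state."""
    h = h0
    for x_t in xs:
        h = residual_rnn_step(h, x_t, W_hh, W_xh, b_h)
    return h


W_hh = [[0.1, 0.0], [0.0, 0.1]]
W_xh = [[0.5], [-0.5]]
b_h = [0.0, 0.0]
print(run_sequence([[1.0], [0.5], [-1.0]], [0.0, 0.0], W_hh, W_xh, b_h))
```

Because $h_{t-1}$ is added after the activation, the hidden state is no longer bounded by the activation's range, so in practice some form of normalization or clipping may be needed; that is an observation about this sketch, not a claim made in the text.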
As a further improvement of the present technical solution, in S3.4, in the prediction task, the predicted value of the next time step is:
$M_t = \mathrm{Activation}(W_{out} h_t + b_{out})$;
wherein $M_t$ represents the predicted output at the next time step; $h_t$ represents the hidden state of the current time step; $W_{out}$ represents the weights of the output layer; $b_{out}$ represents the bias of the output layer;
in the classification task, the Softmax activation function is used to obtain the output probability of each class:
$\mathrm{Softmax}(z)_i = e^{z_i} / \sum_{j=1}^{N} e^{z_j}$;
wherein $z$ represents the input vector; $N$ represents the number of categories; $\mathrm{Softmax}(z)_i$ represents the probability of the i-th category;
in the decision output task, a threshold function $\mathrm{DecisionOutput}(\mathrm{threshold})$ is used to determine the decision result from the output value;
wherein $M$ represents the output of the model; threshold represents the threshold;
$\mathrm{DecisionOutput}(\mathrm{threshold})$ represents the threshold function.
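The classification and decision heads can be sketched as below. The softmax is the standard definition; the threshold rule is an assumption, since the text only says that the output $M$ is compared against a threshold. The choice of `>=` and the labels 1/0 are hypothetical.

```python
import math


def softmax(z):
    """Standard softmax; subtracting max(z) avoids overflow for large inputs."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def decision_output(m, threshold=0.5):
    """Assumed threshold rule: 1 if the model output reaches the threshold, else 0."""
    return 1 if m >= threshold else 0


probs = softmax([1.0, 2.0, 3.0])
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_class, decision_output(probs[predicted_class]))
```

Mapping the label 1 or 0 to a concrete action, such as a buy or sell recommendation, is left to the application scenario.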
As a further improvement of the technical scheme, the neural architecture search is based on an evolutionary algorithm, and the structure of the AI deep learning model is optimized by the following steps:
S6.1, determining the variables of the neural network structure (parameter ranges, connection modes, number of layers and number of nodes) to form a search space;
S6.2, randomly generating an initial population of network structures as the starting point of the evolutionary algorithm;
S6.3, training and evaluating each network structure, using a training set and a validation set to assess its performance and generalization capability; the evaluation index serves as the fitness function. Fitness is measured by the performance of the neural network on the specific task: in an image classification task, classification accuracy can be used as the fitness index; in a language model generation task, perplexity can be used. The data set is typically divided into a training set and a validation set, and the fitness is calculated from the performance of the neural network on the validation set, which guards against overfitting and better measures the generalization ability of the model;
S6.4, selecting the better network structures according to the fitness function to serve as parents of the next generation population;
S6.5, performing a crossover operation on the selected parent individuals to generate new individuals;
S6.6, performing a mutation operation on the newly generated individuals, introducing randomness that changes certain characteristics of an individual so as to explore more of the network structure space;
S6.7, repeating the selection, crossover and mutation operations to gradually evolve better network structures until a stopping condition is reached;
and S6.8, after the evolution iterations finish, selecting the best network structure from the final population according to the evaluation index as the final result.
As a further improvement of the present technical solution, in S6.4, the specific algorithm for determining the parent individuals of the next generation population is:
$P_i = f_i / \sum_{j=1}^{N} f_j$;
wherein $P_i$ represents the probability that the i-th individual is selected; $f_i$ represents the fitness of the i-th individual; $N$ represents the number of individuals in the population;
in order to adapt to changes in the search space and to maintain the diversity of the population, the selection probability is dynamically adjusted in each generation according to the distribution of individual fitness, and the formula is optimized accordingly,
where α denotes a parameter that adjusts the selection probability distribution. Different values of α affect the distribution of the selection probabilities differently: a larger α increases the selection probability gap in favor of individuals with higher fitness, while a smaller α brings the selection probability closer to standard fitness-proportional selection. Dynamically adjusting the selection probability in this way tunes the selection pressure to the fitness distribution, promotes population diversity, and explores the solution space better.
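One form consistent with the description of α is power scaling, $P_i = f_i^{\alpha} / \sum_j f_j^{\alpha}$, which equals standard fitness-proportional selection at α = 1 and widens the gap between strong and weak individuals as α grows. This specific form is an assumption made for illustration and is not taken from the source; the function names are hypothetical, and fitness values are assumed to be positive.

```python
import random


def selection_probabilities(fitness, alpha=1.0):
    """Assumed form: P_i = f_i**alpha / sum_j f_j**alpha (fitness must be > 0)."""
    scaled = [f ** alpha for f in fitness]
    total = sum(scaled)
    return [s / total for s in scaled]


def select_parents(population, fitness, k, alpha=1.0, rng=random):
    """Roulette-wheel sampling of k parents (with replacement)."""
    probs = selection_probabilities(fitness, alpha)
    return rng.choices(population, weights=probs, k=k)


fitness = [1.0, 3.0]
print(selection_probabilities(fitness, alpha=1.0))  # fitness-proportional
print(selection_probabilities(fitness, alpha=2.0))  # stronger selection pressure
print(select_parents(["net_a", "net_b"], fitness, k=4, alpha=2.0))
```

How α is updated from one generation to the next is not specified in the text; a schedule driven by the spread of the fitness values would be one option.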
As a further improvement of the present technical solution, in S6.6, a mutation operation that changes the connection weights is performed on the newly generated individual: random noise is added to the current weights to adjust the connection weights, and the adjustment formula is:
$W_{mutated} = W + \mathrm{noise}$;
wherein $W_{mutated}$ represents the mutated weight matrix; $W$ represents the original weight matrix; noise represents the added random noise, which may be a random number drawn from a Gaussian distribution:
$\mathrm{noise} \sim \mathcal{N}(0, \sigma^2)$;
wherein $\mathcal{N}$ represents the Gaussian distribution; $\sigma$ represents the standard deviation of the noise; the noise variable follows a Gaussian distribution with a mean of 0 and a standard deviation of $\sigma$.
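The weight mutation can be sketched as follows, assuming independent noise for every entry of the weight matrix (the text does not say whether the noise is drawn per entry). The function name and the default value of `sigma` are hypothetical.

```python
import random


def mutate_weights(W, sigma=0.01, rng=random):
    """Return W + noise, with each noise entry drawn from N(0, sigma**2).

    The input matrix is left unchanged so the parent individual survives intact.
    """
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in W]


rng = random.Random(42)          # seeded only to make the demo repeatable
W = [[0.2, -0.1], [0.05, 0.3]]
print(mutate_weights(W, sigma=0.05, rng=rng))
```

A small `sigma` keeps the mutated individual close to its parent, while a larger one explores more aggressively; the text leaves the value open.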
In another aspect, the present invention provides an AI artificial intelligence machine learning method for use in any one of the AI artificial intelligence machine learning systems described above, including the steps of:
S10.1, acquiring a training data set from different sources through the data collection and processing unit, and preprocessing the collected data through the preprocessing module;
S10.2, establishing an AI deep learning model based on an AI machine learning algorithm through the model training unit, and training the AI deep learning model on the preprocessed data, wherein the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections are added to the recurrent neural network to improve model performance, accelerate training, and improve the model's trainability and sequence modeling capacity;
S10.3, performing structural optimization on the established AI deep learning model through the model optimization unit, automatically searching and optimizing the deep learning model structure based on neural architecture search, in which newly generated individuals undergo a mutation operation that changes the connection weights by adding random noise to the current weights, so as to optimize model performance;
and S10.4, finally, converting the model output into an actual decision through the decision output unit, performing classification and prediction conversion or decision making according to the application scenario.
Compared with the prior art, the invention has the beneficial effects that:
1. In the AI artificial intelligence machine learning system and method, the AI deep learning model established by the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections are added to the recurrent neural network to improve model performance, accelerate training, and improve the model's trainability and sequence modeling capacity.
2. In the AI artificial intelligence machine learning system and method, the model optimization unit optimizes the structure of the deep learning model based on neural architecture search, in which newly generated individuals undergo a mutation operation that changes the connection weights. The mutation operation increases the differences among individuals and thereby maintains population diversity; introducing mutation helps avoid premature convergence to a local optimum, and random noise is added to the current weights to adjust the connection weights so as to optimize model performance;
the neural architecture search optimizes the structure of the deep learning model based on an evolutionary algorithm. In order to adapt to changes in the search space and to maintain population diversity, the selection probability is dynamically adjusted in each generation according to the distribution of individual fitness, thereby addressing the problem that model performance may degrade when new data is distributed differently from the training data.
Drawings
Fig. 1 is an overall flow diagram of the present invention.
The meaning of each reference sign in the figure is:
1. data collection and processing unit; 2. model training unit; 3. model optimization unit; 4. decision output unit.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1:
Referring to Fig. 1, an AI artificial intelligence machine learning system is provided, comprising:
the data collection and processing unit 1, which comprises a data collection module and a preprocessing module;
wherein the data collection module collects a data set for training; the data is typically cleaned, converted and labeled to facilitate model learning and analysis. To ensure data quality and availability, the preprocessing module cleans the collected data, which includes missing value processing, outlier processing and data deduplication.
Sources of data include sensors, databases, logs, social media.
The model training unit 2 builds an AI deep learning model based on an AI machine learning algorithm and trains it on the preprocessed data, so that the model learns and captures the patterns, features and relationships in the data in order to make accurate predictions, classifications or decisions;
the model optimization unit 3 optimizes the structure of the AI deep learning model based on neural architecture search, which searches and evaluates neural network structures, automatically trying different combinations of layer count, node count and connection mode to find the best network structure;
and the decision output unit 4, which converts the output of the model into an actual decision.
The decision output unit 4 is responsible for converting the output generated by the trained model into actual decisions, actions or suggestions. Its functions vary according to the specific application scenario and include the following cases:
Classification and prediction conversion: if the task of the model is classification or prediction, the decision output unit converts the result output by the model into an actual classification label, probability value or prediction result. For example, if the model is for image recognition, the decision output unit may convert the object class predicted by the model into a textual description or a related action recommendation.
Decision making: in some cases, the AI system may need to make decisions about results in a particular context. This unit may map the output of the model to a series of possible actions or decisions and make the final selection according to certain set rules or conditions. For example, based on model predictions of market trends, the decision output unit may recommend buying or selling a security.
The AI machine learning algorithm is based on a fusion model of a recurrent neural network and a convolutional neural network, and comprises the following specific steps:
S3.1, using a convolutional neural network as a feature extractor to extract image data features from the preprocessed data, the spatial features of the image being extracted through convolution and pooling layers; these features capture local information and an abstract representation of the image;
in the convolutional neural network of this embodiment, image feature extraction is completed through a convolutional layer and a pooling layer, and the specific operations are as follows:
convolution operation:
$\mathrm{Conv}(I, K) = I * K$;
wherein $I$ represents the input image; $K$ represents the convolution kernel;
pooling operation:
$\mathrm{Pool}(I) = \max(I)$.
S3.2, mapping the image data features extracted by the convolutional neural network, i.e. the extracted spatial features, into the recurrent neural network; this includes reshaping the feature map into sequence data;
S3.3, the recurrent neural network receives the spatial features extracted by the convolutional neural network as input and processes the sequence information, with residual connections added to the recurrent neural network; the residual connections improve model performance, accelerate training, and improve the model's trainability and sequence modeling capacity, in particular for image sequence tasks;
in the recurrent neural network of S3.3, the RNN processes sequence information with the following calculation:
$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$;
wherein $h_t$ represents the hidden state of the current time step; $x_t$ represents the input; $W_{hh}$ and $W_{xh}$ represent weight matrices; $b_h$ represents the bias; $\mathrm{Activation}(\cdot)$ represents the activation function;
wherein the residual connection is added to the formula, and the formula is adjusted to:
$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h) + h_{t-1}$;
wherein $h_{t-1}$ represents the hidden state of the previous time step; adding $h_{t-1}$ to the new hidden state calculation realizes the residual connection and facilitates the flow of information and the propagation of gradients.
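One way to see why adding $h_{t-1}$ helps gradient propagation is to compare the Jacobians of the two update rules. The derivation below is a standard sketch rather than something stated in the source; $a_t$, $D_t$ and $I$ (the identity matrix) are shorthand introduced here for illustration.

```latex
\begin{aligned}
a_t &= W_{hh} h_{t-1} + W_{xh} x_t + b_h, \qquad
D_t = \operatorname{diag}\bigl(\mathrm{Activation}'(a_t)\bigr) \\
\text{without residual:}\quad
\frac{\partial h_t}{\partial h_{t-1}} &= D_t W_{hh} \\
\text{with residual:}\quad
\frac{\partial h_t}{\partial h_{t-1}} &= D_t W_{hh} + I \\
\text{over } k \text{ steps:}\quad
\frac{\partial h_t}{\partial h_{t-k}} &=
\bigl(D_t W_{hh} + I\bigr)\bigl(D_{t-1} W_{hh} + I\bigr)\cdots\bigl(D_{t-k+1} W_{hh} + I\bigr)
\end{aligned}
```

Expanding the product gives $I$ plus terms that contain at least one factor $D_s W_{hh}$, so a direct gradient path remains even when those factors are small. The identity term does not by itself stop the product from growing, so this argues against vanishing gradients but not against exploding ones.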
And S3.4, modeling the sequence information with the recurrent neural network, and adding an output layer to the recurrent neural network to perform any one of the prediction, classification or decision output tasks.
The output may be used for different tasks such as classifying the video sequence, generating descriptive text.
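The reshaping of the feature map into sequence data mentioned in S3.2 can be done in more than one way, and the text does not fix the layout. The sketch below assumes that each spatial position of a C x H x W feature map becomes one time step carrying a C-dimensional vector, read in row-major order; for video input, one vector per frame would be an equally plausible choice. The function name is hypothetical.

```python
def feature_maps_to_sequence(fmaps):
    """Turn C feature maps of size H x W into a sequence of H*W vectors.

    Assumed layout: fmaps[c][i][j]; each output vector collects the C channel
    values at one spatial position, and positions are visited row by row.
    """
    height, width = len(fmaps[0]), len(fmaps[0][0])
    return [[channel[i][j] for channel in fmaps]
            for i in range(height) for j in range(width)]


fmaps = [[[1, 2], [3, 4]],      # channel 0
         [[5, 6], [7, 8]]]      # channel 1
print(feature_maps_to_sequence(fmaps))
```

Each inner list can then be fed to the recurrent network as one input $x_t$.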
In this embodiment, the prediction, classification and decision output tasks of the output layer can each be expressed as follows:
in the prediction task, the predicted value of the next time step is:
$M_t = \mathrm{Activation}(W_{out} h_t + b_{out})$;
wherein $M_t$ represents the predicted output at the next time step; $h_t$ represents the hidden state of the current time step; $W_{out}$ represents the weights of the output layer; $b_{out}$ represents the bias of the output layer;
in the classification task, the Softmax activation function is used to obtain the output probability of each class:
$\mathrm{Softmax}(z)_i = e^{z_i} / \sum_{j=1}^{N} e^{z_j}$;
wherein $z$ represents the input vector; $N$ represents the number of categories; $\mathrm{Softmax}(z)_i$ represents the probability of the i-th category;
in the decision output task, a threshold function $\mathrm{DecisionOutput}(\mathrm{threshold})$ is used to determine the decision result from the output value;
wherein $M$ represents the output of the model; threshold represents the threshold;
$\mathrm{DecisionOutput}(\mathrm{threshold})$ represents the threshold function.
Still further, in this embodiment, the neural architecture search in the model optimization unit 3 is based on an evolutionary algorithm, and the steps of optimizing the structure of the AI deep learning model are as follows:
S6.1, determining the variables of the neural network structure (parameter ranges, connection modes, number of layers and number of nodes) to form a search space for generating and adjusting different network structures;
S6.2, randomly generating an initial population of network structures as the starting point of the evolutionary algorithm;
S6.3, training and evaluating each network structure, using a training set and a validation set to assess its performance and generalization capability; the evaluation index serves as the fitness function. Fitness is measured by the performance of the neural network on the specific task: in an image classification task, classification accuracy can be used as the fitness index; in a language model generation task, perplexity can be used. The data set is typically divided into a training set and a validation set, and the fitness is calculated from the performance of the neural network on the validation set, which guards against overfitting and better measures the generalization ability of the model;
S6.4, selecting the better network structures according to the fitness function to serve as parents of the next generation population:
the specific algorithm for determining the parent individuals of the next generation population is:
$P_i = f_i / \sum_{j=1}^{N} f_j$;
wherein $P_i$ represents the probability that the i-th individual is selected; $f_i$ represents the fitness of the i-th individual; $N$ represents the number of individuals in the population;
in order to adapt to changes in the search space and to maintain the diversity of the population, the selection probability is dynamically adjusted in each generation according to the distribution of individual fitness, and the formula is optimized accordingly,
where α denotes a parameter that adjusts the selection probability distribution. Different values of α affect the distribution of the selection probabilities differently: a larger α increases the selection probability gap in favor of individuals with higher fitness, while a smaller α brings the selection probability closer to standard fitness-proportional selection. Dynamically adjusting the selection probability in this way tunes the selection pressure to the fitness distribution, promotes population diversity, and explores the solution space better.
S6.5, performing cross operation on the selected parent individuals to generate new individuals. This may be achieved by exchanging, combining or reorganizing parts of the network structure;
S6.6, performing a mutation operation on each newly generated individual, introducing randomness to change certain characteristics of the individual so as to explore more of the network structure space;
In this embodiment, a mutation operation that changes the connection weights is performed on the new individual: random noise is added to the current weights to adjust the connection weights, and the adjustment formula is as follows:
$$W_{\text{mutated}} = W + \text{noise}$$
wherein $W_{\text{mutated}}$ represents the mutated weight matrix; $W$ represents the original weight matrix; noise represents the added random noise, which may be a random number drawn from a Gaussian distribution:
$$\text{noise} \sim \mathcal{N}(0, \sigma^2)$$
wherein $\mathcal{N}$ represents a Gaussian distribution and σ represents the standard deviation of the noise; that is, the noise variable follows a Gaussian distribution with a mean of 0 and a standard deviation of σ.
Introducing random noise changes the weights by a small amount, which helps explore a wider solution space and increases the differences among individuals, thereby maintaining the diversity of the population. Without a mutation operation, the search algorithm may wander around a locally optimal solution; the mutation operation helps the neural architecture search jump out of such local optima, try more possibilities and find potentially better solutions.
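The weight mutation described above can be sketched as follows. The weight matrix is represented as a plain list of rows, and the default value of `sigma` is an arbitrary illustrative choice, not a value given in the patent.

```python
import random


def mutate_weights(weights, sigma=0.01, rng=random):
    """Return W + noise, with each noise entry drawn independently from N(0, sigma^2)."""
    if sigma < 0:
        raise ValueError("sigma must be non-negative")
    # Build a new matrix so the original weights are left untouched.
    return [[w + rng.gauss(0.0, sigma) for w in row] for row in weights]
```

A small `sigma` keeps the mutated individual close to its parent, while a larger `sigma` trades stability for wider exploration.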
S6.7, gradually evolving better network structures by repeatedly executing the selection, crossover and mutation operations until a stopping condition is reached, wherein the stopping condition is that a preset number of iterations is reached or the performance converges;
S6.8, after the evolution iterations are finished, selecting the optimal network structure from the final population as the final result, the optimal neural network structure being determined according to the evaluation index.
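Steps S6.2 to S6.8 can be put together in a toy loop. Everything concrete here is an assumption for illustration: the genome is a hypothetical list of layer widths, `toy_fitness` stands in for validation performance, mutation perturbs layer widths rather than connection weights, and the only stopping condition is a fixed number of generations.

```python
import random


def toy_fitness(genome):
    """Hypothetical stand-in for validation accuracy: best when every layer has 64 units."""
    return 1.0 / (1.0 + sum(abs(g - 64) for g in genome))


def evolve(pop_size=20, genome_len=3, generations=30, sigma=4.0, seed=0):
    rng = random.Random(seed)
    # S6.2: random initial population of structure encodings.
    population = [[rng.randint(8, 256) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):  # S6.7: repeat until the iteration budget is spent
        fitness = [toy_fitness(g) for g in population]  # S6.3: evaluate
        next_population = []
        while len(next_population) < pop_size:
            # S6.4: fitness-proportionate choice of two parents.
            a, b = rng.choices(population, weights=fitness, k=2)
            # S6.5: single-point crossover.
            point = rng.randint(1, genome_len - 1)
            child = a[:point] + b[point:]
            # S6.6: Gaussian mutation, rounded and kept at >= 1 unit per layer.
            child = [max(1, round(g + rng.gauss(0.0, sigma))) for g in child]
            next_population.append(child)
        population = next_population
    # S6.8: pick the best individual of the final population.
    best = max(population, key=toy_fitness)
    return best, toy_fitness(best)
```

Calling `evolve()` returns the best encoding found and its fitness; in a real search, `toy_fitness` would be replaced by training each candidate and scoring it on the validation set, which dominates the cost.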
Example 2:
Embodiment 2 of the present invention differs from Embodiment 1 in that it describes the AI artificial intelligence machine learning method used by the AI artificial intelligence machine learning system.
The AI artificial intelligence machine learning method is applied to the AI artificial intelligence machine learning system and comprises the following steps:
S10.1, acquiring data sets for training from different sources through the data collection and processing unit 1, and preprocessing the collected data through the preprocessing module;
S10.2, establishing an AI deep learning model based on an AI machine learning algorithm through the model training unit 2, and training the AI deep learning model with the preprocessed data, wherein the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections are added to the recurrent neural network to improve the performance of the model, accelerate training and improve the sequence modeling capability of the model;
S10.3, performing structural optimization on the established AI deep learning model through the model optimization unit 3, automatically searching and optimizing the deep learning model structure based on neural architecture search, wherein a mutation operation that changes the connection weights is performed on newly generated individuals in the neural architecture search, and random noise is added to the current weights to adjust the connection weights, so as to optimize the model performance;
S10.4, finally, converting the model output into an actual decision through the decision output unit 4, performing classification, prediction or decision making according to the application scenario.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments, and that the above-described embodiments and descriptions are only preferred embodiments of the present invention, and are not intended to limit the invention, and that various changes and modifications may be made therein without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.
Claims (6)
- 1. An AI artificial intelligence machine learning system, comprising:
  a data collection and processing unit (1), the data collection and processing unit (1) comprising a data collection module and a preprocessing module, wherein the data collection module collects a data set for training, and the preprocessing module is used for cleaning the collected data;
  a model training unit (2), wherein the model training unit (2) builds an AI deep learning model based on an AI machine learning algorithm, introduces residual connections into the AI deep learning model to improve the performance of the model and accelerate training, and trains the AI deep learning model with the preprocessed data, so that the AI deep learning model learns from the preprocessed data and captures the patterns, features and relationships of the data;
  the AI machine learning algorithm is based on a fusion model of a recurrent neural network and a convolutional neural network, and comprises the following specific steps:
  S3.1, using a convolutional neural network as a feature extractor to extract image data features from the preprocessed data, the spatial features of the image being extracted through convolution and pooling layers;
  S3.2, mapping the image data features extracted by the convolutional neural network into the recurrent neural network, and extracting spatial features;
  S3.3, the recurrent neural network receives the spatial features extracted by the convolutional neural network as input and processes sequence information, and residual connections are added in the recurrent neural network;
  S3.4, modeling the sequence information data using the recurrent neural network, and adding an output layer to the recurrent neural network to perform any of the prediction, classification or decision output tasks;
  a model optimization unit (3), wherein the structure of the AI deep learning model is optimized based on neural architecture search, a mutation operation that changes the connection weights is performed on newly generated individuals in the neural architecture search, and random noise is added to the current weights to adjust the connection weights, thereby optimizing the model performance;
  the neural architecture search is based on an evolutionary algorithm, and the structural optimization of the AI deep learning model comprises the following steps:
  S6.1, determining the parameter ranges, connection modes, number of layers and number of nodes of the neural network structure as variables, forming a search space;
  S6.2, randomly generating an initial population of network structures as the starting point of the evolutionary algorithm;
  S6.3, training and evaluating each network structure, the performance and generalization capability of each network structure being evaluated using a training set and a validation set;
  S6.4, selecting the better network structures according to the fitness function to serve as parents of the next-generation population;
  the parent individuals of the next-generation population are determined by the following formula:
  $$P_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$$
  wherein $P_i$ represents the probability that the i-th individual is selected; $f_i$ represents the fitness of the i-th individual; N represents the number of individuals in the population;
  in order to adapt to changes in the search space, the selection probability is dynamically adjusted in each generation according to the distribution of individual fitness, giving an optimized form of the formula, wherein α represents a parameter for adjusting the selection probability distribution;
  S6.5, performing a crossover operation on the selected parent individuals to generate new individuals;
  S6.6, performing a mutation operation on each newly generated individual, introducing randomness to change some characteristics of the individual, wherein the mutation operation changes the connection weights by adding noise to the current weights, and the adjustment formula is as follows:
  $$W_{\text{mutated}} = W + \text{noise}$$
  wherein $W_{\text{mutated}}$ represents the mutated weight matrix; $W$ represents the original weight matrix; noise represents the added random noise;
  wherein:
  $$\text{noise} \sim \mathcal{N}(0, \sigma^2)$$
  $\mathcal{N}$ represents a Gaussian distribution; σ represents the standard deviation of the noise;
  S6.7, gradually evolving better network structures by repeatedly executing the selection, crossover and mutation operations until a stopping condition is reached;
  S6.8, after the evolution iterations are finished, selecting the optimal network structure from the final population as the final result;
  and a decision output unit (4), which is used for converting the output of the model into an actual decision.
- 2. The AI artificial intelligence machine learning system of claim 1, wherein: the data cleaning comprises missing value processing, outlier processing and data deduplication processing.
- 3. The AI artificial intelligence machine learning system of claim 1, wherein: in step S3.1, the extraction of the image data features in the convolutional neural network is completed through a convolutional layer and a pooling layer, and the specific operations are as follows:
  convolution operation: $\mathrm{Conv}(I, K) = I * K$, wherein $I$ represents the input image and $K$ represents the convolution kernel;
  pooling operation: $\mathrm{Pool}(I) = \max(I)$.
- 4. The AI artificial intelligence machine learning system of claim 3, wherein: in step S3.3, the recurrent neural network (RNN) processes sequence information according to:
  $$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$$
  wherein $h_t$ represents the hidden state of the current time step; $x_t$ represents the input; $W_{hh}$ and $W_{xh}$ represent weight matrices; $b_h$ represents the bias; Activation() represents an activation function;
  after the residual connection is added, the formula is adjusted to:
  $$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h) + h_{t-1}$$
  wherein $h_{t-1}$ represents the hidden state of the previous time step.
- 5. The AI artificial intelligence machine learning system of claim 4, wherein: in step S3.4, in the prediction task, the predicted value of the next time step is:
  $$M_t = \mathrm{Activation}(W_{out} h_t + b_{out})$$
  wherein $M_t$ represents the predicted output at the next time step; $h_t$ represents the hidden state of the current time step; $W_{out}$ represents the weights of the output layer; $b_{out}$ represents the bias of the output layer;
  in the classification task, the Softmax activation function is used to obtain the output probability for each class:
  $$\mathrm{Softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}}$$
  wherein $z$ represents the input vector; N represents the number of categories; $\mathrm{Softmax}(z)_i$ represents the probability of the i-th category;
  in the decision output task, a threshold function, Decision Output (Thresholding), is used to determine the decision result according to the output value, wherein M represents the output of the model and threshold represents the threshold.
- 6. An AI artificial intelligence machine learning method for the AI artificial intelligence machine learning system according to any of claims 1-5, characterized in that it comprises the following steps:
  S10.1, acquiring data sets for training from different sources through the data collection and processing unit (1), and preprocessing the collected data through the preprocessing module;
  S10.2, establishing an AI deep learning model based on an AI machine learning algorithm through the model training unit (2), and training the AI deep learning model with the preprocessed data, wherein the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections are added to the recurrent neural network to improve the performance of the model, accelerate training and improve the sequence modeling capability of the model;
  S10.3, performing structural optimization on the established AI deep learning model through the model optimization unit (3), automatically searching and optimizing the deep learning model structure based on neural architecture search, wherein a mutation operation that changes the connection weights is performed on newly generated individuals in the neural architecture search, and random noise is added to the current weights to adjust the connection weights, so as to optimize the model performance;
  S10.4, finally, converting the model output into an actual decision through the decision output unit (4), performing classification, prediction or decision making according to the application scenario.
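The formulas recited in claims 4 and 5 can be sketched in a few lines. Several details are assumptions for illustration only: `tanh` is used as the activation function, vectors and matrices are plain Python lists, and the threshold rule returns 1 when the output is at or above the threshold and 0 otherwise (the exact threshold formula is not reproduced in the claim text).

```python
import math


def rnn_step(h_prev, x, W_hh, W_xh, b_h, residual=True):
    """h_t = tanh(W_hh h_{t-1} + W_xh x_t + b_h), plus h_{t-1} when residual is True."""
    h_new = []
    for i in range(len(h_prev)):
        pre = b_h[i]
        pre += sum(W_hh[i][j] * h_prev[j] for j in range(len(h_prev)))
        pre += sum(W_xh[i][k] * x[k] for k in range(len(x)))
        act = math.tanh(pre)
        h_new.append(act + h_prev[i] if residual else act)
    return h_new


def softmax(z):
    """Softmax(z)_i = exp(z_i) / sum_j exp(z_j), shifted by max(z) for numerical stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def threshold_decision(m, threshold=0.5):
    """Assumed rule: decide 1 if the model output reaches the threshold, else 0."""
    return 1 if m >= threshold else 0
```

With all weights and biases set to zero, the residual step simply passes the previous hidden state through unchanged, which shows how the skip path preserves information across time steps.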
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410124324.XA CN117668701B (en) | 2024-01-30 | 2024-01-30 | AI artificial intelligence machine learning system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117668701A (en) | 2024-03-08 |
CN117668701B (en) | 2024-04-12 |
Family
ID=90073469
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410124324.XA Active CN117668701B (en) | 2024-01-30 | 2024-01-30 | AI artificial intelligence machine learning system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117668701B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101404071A (en) * | 2008-11-07 | 2009-04-08 | 湖南大学 | Electronic circuit fault diagnosis neural network method based on grouping particle swarm algorithm |
CN104318307A (en) * | 2014-10-21 | 2015-01-28 | 重庆工商职业学院 | Tread pattern noise reduction method based on self-adaptive fuzzy genetic algorithm |
CN111898689A (en) * | 2020-08-05 | 2020-11-06 | 中南大学 | Image classification method based on neural network architecture search |
CN112784949A (en) * | 2021-01-28 | 2021-05-11 | 华东计算技术研究所(中国电子科技集团公司第三十二研究所) | Neural network architecture searching method and system based on evolutionary computation |
WO2021103977A1 (en) * | 2019-11-30 | 2021-06-03 | 华为技术有限公司 | Neural network searching method, apparatus, and device |
CN113469891A (en) * | 2020-03-31 | 2021-10-01 | 武汉Tcl集团工业研究院有限公司 | Neural network architecture searching method, training method and image completion method |
CN115481727A (en) * | 2022-09-15 | 2022-12-16 | 电子科技大学 | Intention recognition neural network generation and optimization method based on evolutionary computation |
CN116245162A (en) * | 2022-11-25 | 2023-06-09 | 杭州电子科技大学 | Neural network pruning method and system based on improved adaptive genetic algorithm |
CN117173037A (en) * | 2023-08-03 | 2023-12-05 | 江南大学 | Neural network structure automatic search method for image noise reduction |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11790212B2 (en) * | 2019-03-18 | 2023-10-17 | Microsoft Technology Licensing, Llc | Quantization-aware neural architecture search |
Non-Patent Citations (6)
Title |
---|
AutoML: A systematic review on automated machine learning with neural architecture search; Imrus Salehin et al.;《Journal of Information and Intelligence》; 20231008; Vol. 2 (No. 1); 52-81 * |
Evolution of neural network architecture and weights using mutation based genetic algorithm; A. Nadi et al.;《2009 14th International CSI Computer Conference》; 20091208; 536-540 * |
Evolutionary Architecture Search for Generative Adversarial Networks Based On Weight Sharing; Yu Xue et al.;《IEEE Transactions on Evolutionary Computation》; 20231201; 1-15 * |
Particle swarm optimization of deep neural networks architectures for image classification; Francisco Erivaldo Fernandes Junior et al.;《Swarm and Evolutionary Computation》; 20190601; Vol. 49; 62-74 * |
Research on neural network architecture search methods based on evolutionary optimization; 胡文玥;《China Master's Theses Full-text Database (Information Science and Technology)》; 20220415 (No. 04); I138-910 * |
Remote sensing image segmentation method using neural architecture search; 周鹏 et al.;《Journal of Xidian University》; 20210724; Vol. 48 (No. 05); 47-57 * |
Also Published As
Publication number | Publication date |
---|---|
CN117668701A (en) | 2024-03-08 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |