CN117668701B - AI artificial intelligence machine learning system and method

Info

Publication number: CN117668701B
Application number: CN202410124324.XA
Authority: CN
Inventors: 黄家欣; 李艳; 李珊; 艾金龙
Original assignee: Yunnan Xunsheng Technology Co ltd
Current assignee: Yunnan Xunsheng Technology Co ltd
Priority date: 2024-01-30
Filing date: 2024-01-30
Publication date: 2024-04-12
Anticipated expiration: 2044-01-30
Also published as: CN117668701A

Abstract

The invention relates to the technical field of machine learning, and in particular to an AI artificial intelligence machine learning system and method. The system comprises a data collection and processing unit, which includes a data collection module and a preprocessing module; the preprocessing module cleans the collected data. A model training unit introduces residual connections into the AI deep learning model to improve model performance and accelerate training, and trains the AI deep learning model with the preprocessed data. A model optimization unit optimizes the structure of the deep learning model based on neural architecture search; during the search, a mutation operation that changes the connection weights is performed on newly generated individuals by adding random noise to the current weights, further optimizing model performance. A decision output unit converts the output of the model into an actual decision. Adding residual connections to the recurrent neural network improves model performance, accelerates training, and improves the training and sequence modeling capacity of the model.

Description

AI artificial intelligence machine learning system and method

Technical Field

The invention relates to the technical field of machine learning, in particular to an AI artificial intelligence machine learning system and method.

Background

AI artificial intelligence machine learning systems use machine learning techniques to mimic human intelligence. Machine learning is a branch of artificial intelligence that allows a computer system to accomplish tasks by learning from data rather than by being explicitly programmed. These systems discover patterns by processing and analyzing large amounts of data in order to make predictions, classify inputs, identify patterns, or make decisions.

However, machine learning systems require a large amount of data to train, and model training efficiency is low. Some machine learning models are also sensitive to changes in the distribution of new data: if the new data is distributed differently from the training data, model performance may degrade.

Disclosure of Invention

The invention aims to provide an AI artificial intelligence machine learning system and method that solve the problems identified in the background: machine learning systems need a large amount of data for training, and model training efficiency is low.

To achieve the above object, the present invention provides an AI artificial intelligence machine learning system, including:

a data collection and processing unit, comprising a data collection module and a preprocessing module;

wherein the data collection module collects a data set for training; the data may come from a variety of sources, such as sensors, databases, logs and social media;

the preprocessing module cleans the collected data;

a model training unit, which establishes an AI deep learning model based on an AI machine learning algorithm and trains it with the preprocessed data, so that the model learns and captures the patterns, features and relationships in the preprocessed data;

a model optimization unit, which optimizes the structure of the AI deep learning model based on neural architecture search;

and a decision output unit, which converts the output of the model into an actual decision.

The decision output unit is responsible for converting the output generated by the trained model into actual decisions, actions or suggestions. Its functions vary with the specific application scenario; some possible cases are as follows:

Classification and prediction conversion: if the task of the model is classification or prediction, the decision output unit converts the model output into an actual classification label, probability value or prediction result. For example, if the model is used for image recognition, the decision output unit may convert the predicted object class into a textual description or a related action recommendation.

Decision making: in some cases, the AI system needs to make a decision based on results in a particular context. The unit may map the output of the model to a set of possible actions or decisions and make the final selection according to preset rules or conditions. For example, based on the model's prediction of market trends, the decision output unit may recommend buying or selling a security.
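
The two cases above can be illustrated with a minimal Python sketch. The function names `to_label` and `trade_decision`, the example labels, and the 1% band are assumptions introduced here for illustration; the text does not prescribe any particular rule.

```python
from typing import Sequence


def to_label(probabilities: Sequence[float], labels: Sequence[str]) -> str:
    """Classification conversion: return the label with the highest probability."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best]


def trade_decision(predicted_change: float, band: float = 0.01) -> str:
    """Rule-based decision: map a predicted relative price change to an action.

    The band is a hypothetical preset rule, not a value taken from the text.
    """
    if predicted_change > band:
        return "buy"
    if predicted_change < -band:
        return "sell"
    return "hold"
```

For instance, `to_label([0.1, 0.7, 0.2], ["cat", "dog", "bird"])` selects `"dog"`, and a predicted change inside the band yields `"hold"`.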

As a further improvement of the technical scheme, the data cleaning comprises missing value processing, outlier processing and data deduplication processing.
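
A minimal sketch of the three cleaning steps, in plain Python. The concrete choices here (median imputation, a median-absolute-deviation outlier rule with cutoff `k`, order-preserving deduplication) are assumptions for illustration; the text names the steps but not the methods.

```python
from statistics import median
from typing import List, Optional, Tuple

Row = Tuple[Optional[float], ...]


def clean(rows: List[Row], k: float = 3.0) -> List[Tuple[float, ...]]:
    """Fill missing values, drop outlier rows, then remove duplicate rows."""
    n_cols = len(rows[0])

    # 1. Missing values: replace None with the median of the observed column values.
    fill = [median(r[c] for r in rows if r[c] is not None) for c in range(n_cols)]
    filled = [tuple(fill[c] if r[c] is None else r[c] for c in range(n_cols)) for r in rows]

    # 2. Outliers: drop rows holding a value more than k MADs from the column median.
    keep = [True] * len(filled)
    for c in range(n_cols):
        column = [r[c] for r in filled]
        center = median(column)
        mad = median(abs(v - center) for v in column)
        if mad == 0:
            continue  # no spread in this column, so nothing is flagged
        for i, v in enumerate(column):
            if abs(v - center) > k * mad:
                keep[i] = False
    kept = [r for r, ok in zip(filled, keep) if ok]

    # 3. Deduplication: keep the first occurrence of each row, preserving order.
    return list(dict.fromkeys(kept))
```

The steps run in the order listed in the text; other orders (for example deduplicating first) are equally plausible.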

As a further improvement of the technical scheme, the AI machine learning algorithm is a fusion model based on a recurrent neural network (RNN) and a convolutional neural network (CNN), with the following specific steps:

S3.1, using the convolutional neural network as a feature extractor to extract image data features from the preprocessed data, the spatial features of the image being extracted through convolution and pooling layers;

S3.2, mapping the image data features extracted by the convolutional neural network into the recurrent neural network, carrying over the extracted spatial features;

S3.3, the recurrent neural network receives the spatial features extracted by the convolutional neural network as input and processes the sequence information, with residual connections added in the recurrent neural network; the residual connections improve model performance, accelerate training, and improve the training and sequence modeling capacity of the model, in particular for image sequence tasks;

S3.4, modeling the sequence data with the recurrent neural network, and adding an output layer to the recurrent neural network to perform any one of the prediction, classification or decision output tasks.

As a further improvement of the present technical solution, in S3.1, image feature extraction in the convolutional neural network is completed by a convolutional layer and a pooling layer, specifically:

convolution operation:

$$\mathrm{Conv}(I, K) = I * K$$

where $I$ represents the input image and $K$ represents the convolution kernel;

pooling operation:

$$\mathrm{Pool}(I) = \max(I)$$
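
As a rough illustration of the two operations, the following plain-Python sketch applies a stride-1 "valid" convolution and non-overlapping max pooling to a small matrix. The function names, the stride, the window size, and the use of the unflipped kernel (the convention common in CNN layers) are assumptions; the text gives only the abstract formulas.

```python
from typing import List

Matrix = List[List[float]]


def conv2d(image: Matrix, kernel: Matrix) -> Matrix:
    """Valid 2-D convolution as used in CNN layers (stride 1, kernel not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map: Matrix, size: int = 2) -> Matrix:
    """Non-overlapping max pooling: keep the largest value in each size x size window."""
    out_h = len(feature_map) // size
    out_w = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j] for i in range(size) for j in range(size))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]
```

Applied to a 3x3 image with a 2x2 kernel, `conv2d` yields a 2x2 feature map, which `max_pool` then reduces to a single value.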

As a further improvement of the present technical solution, in S3.3, the calculation by which the recurrent neural network processes sequence information is:

$$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$$

where $h_t$ represents the hidden state of the current time step; $x_t$ represents the input; $W_{hh}$ and $W_{xh}$ represent weight matrices; $b_h$ represents the bias; and $\mathrm{Activation}(\cdot)$ represents the activation function;

adding the residual connection, the formula becomes:

$$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h) + h_{t-1}$$

where $h_{t-1}$ represents the hidden state of the previous time step; adding $h_{t-1}$ to the new hidden state implements the residual connection, which facilitates the flow of information and the propagation of gradients.
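
A single time step of the residual recurrence can be sketched in plain Python as follows. The choice of `tanh` as the activation and the helper names `matvec` and `residual_rnn_step` are assumptions; the text leaves the activation function unspecified.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def residual_rnn_step(h_prev: Vector, x_t: Vector,
                      w_hh: Matrix, w_xh: Matrix, b_h: Vector) -> Vector:
    """h_t = tanh(W_hh h_{t-1} + W_xh x_t + b_h) + h_{t-1} (tanh is assumed)."""
    pre = [a + b + c for a, b, c in zip(matvec(w_hh, h_prev), matvec(w_xh, x_t), b_h)]
    # The skip term h_prev is added after the activation, as in the formula above.
    return [math.tanh(p) + h for p, h in zip(pre, h_prev)]
```

With all weights and biases set to zero, the step returns `h_prev` unchanged, which shows the identity path that the residual term provides.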

As a further improvement of the present technical solution, in S3.4, in the prediction task, the predicted value of the next time step is:

$$M_t = \mathrm{Activation}(W_{out} h_t + b_{out})$$

where $M_t$ represents the predicted output at the next time step; $h_t$ represents the hidden state of the current time step; $W_{out}$ represents the weights of the output layer; $b_{out}$ represents the bias of the output layer;

in the classification task, the Softmax activation function is used to obtain the output probability distribution over the classes:

$$\mathrm{Softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}}$$

where $z$ represents the input vector; $N$ represents the number of categories; $\mathrm{Softmax}(z)_i$ represents the probability of the $i$-th category;

in the decision output task, a threshold function, Decision Output (Thresholding), determines the decision result from the output value by comparing the model output with a threshold, where $M$ represents the output of the model and threshold represents the threshold.
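
The classification and decision output cases can be sketched in plain Python. The comparison `>=`, the binary 1/0 result, and the default threshold of 0.5 in `threshold_decision` are assumptions; the text states only that a threshold function is applied to the model output.

```python
import math
from typing import List


def softmax(z: List[float]) -> List[float]:
    """Softmax over an input vector; subtracting max(z) avoids overflow."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def threshold_decision(m_out: float, threshold: float = 0.5) -> int:
    """Return 1 when the model output reaches the threshold, else 0 (assumed form)."""
    return 1 if m_out >= threshold else 0
```

The probabilities returned by `softmax` sum to 1 and preserve the ordering of the inputs, so the largest input always receives the largest probability.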

As a further improvement of the technical scheme, the neural architecture search is based on an evolutionary algorithm, and the structure of the AI deep learning model is optimized by the following steps:

S6.1, determining the parameter ranges, connection modes, number of layers and number of nodes of the neural network structure, which together form the search space;

S6.2, randomly generating an initial population of network structures as the starting point of the evolutionary algorithm;

S6.3, training and evaluating each network structure, using a training set and a validation set to assess its performance and generalization capability; the evaluation indexes usually serve as the fitness function. Fitness can be measured by the performance of the neural network on a specific task: in an image classification task, classification accuracy can be used as the fitness index; in a language model generation task, perplexity may be used. The data set is typically divided into a training set and a validation set, and the fitness may be calculated from the performance of the neural network on the validation set, which helps prevent overfitting and better measures the generalization ability of the model;

S6.4, selecting the better network structures according to the fitness function to serve as parents of the next-generation population;

S6.5, performing a crossover operation on the selected parent individuals to generate new individuals;

S6.6, performing a mutation operation on the newly generated individuals, introducing randomness to change certain characteristics of an individual so as to explore more of the network structure space;

S6.7, gradually evolving better network structures by repeatedly executing the selection, crossover and mutation operations until a stopping condition is reached;

S6.8, after the evolution iterations are finished, selecting the optimal network structure from the final population as the final result, the optimal neural network structure being determined according to the evaluation index.
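
Steps S6.1 to S6.8 can be sketched as a toy evolutionary loop in plain Python. Everything concrete here is an assumption made for illustration: an architecture is reduced to a `(layers, nodes)` pair, `toy_fitness` is a hypothetical stand-in for training and validating a real network, selection is simple truncation rather than probabilistic selection, and mutation resamples structural genes rather than perturbing weights.

```python
import random
from typing import Callable, Tuple

Arch = Tuple[int, int]  # (number of layers, nodes per layer)

LAYERS = range(1, 9)      # S6.1: illustrative search-space bounds
NODES = range(8, 257, 8)


def random_arch(rng: random.Random) -> Arch:
    return (rng.choice(LAYERS), rng.choice(NODES))


def crossover(a: Arch, b: Arch, rng: random.Random) -> Arch:
    """S6.5: take each gene from one of the two parents."""
    return (rng.choice((a[0], b[0])), rng.choice((a[1], b[1])))


def mutate(arch: Arch, rng: random.Random, rate: float = 0.2) -> Arch:
    """S6.6: occasionally resample a gene to explore new structures."""
    layers, nodes = arch
    if rng.random() < rate:
        layers = rng.choice(LAYERS)
    if rng.random() < rate:
        nodes = rng.choice(NODES)
    return (layers, nodes)


def toy_fitness(arch: Arch) -> float:
    """Hypothetical stand-in for validation performance; peaks at (4, 64)."""
    layers, nodes = arch
    return -(abs(layers - 4) + abs(nodes - 64) / 8)


def evolve(fitness: Callable[[Arch], float], generations: int = 20,
           pop_size: int = 12, seed: int = 0) -> Arch:
    rng = random.Random(seed)
    population = [random_arch(rng) for _ in range(pop_size)]      # S6.2
    for _ in range(generations):                                  # S6.7
        ranked = sorted(population, key=fitness, reverse=True)    # S6.3
        parents = ranked[: pop_size // 2]                         # S6.4 (truncation)
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=fitness)                           # S6.8
```

Because the parents survive into the next generation, the best fitness in the population never decreases from one generation to the next.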

As a further improvement of the present technical solution, in S6.4, the specific algorithm for determining the parent individuals of the next-generation population is:

$$P_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$$

where $P_i$ represents the probability that the $i$-th individual is selected; $f_i$ represents the fitness of the $i$-th individual; $N$ represents the number of individuals in the population;

to adapt to changes in the search space, the selection probability is adjusted dynamically according to how the individuals in the population change, that is, according to the distribution of individual fitness in each generation, so as to maintain the diversity of the population; the selection formula is refined accordingly with a parameter $\alpha$,

where $\alpha$ denotes a parameter for adjusting the selection probability distribution. Different values of $\alpha$ affect the distribution of the selection probabilities differently: a larger $\alpha$ widens the gap in selection probability in favor of individuals with higher fitness, while a smaller $\alpha$ brings the selection probability closer to standard fitness-proportionate selection. By dynamically adjusting the selection probability, the selection pressure can be tuned to the fitness distribution, which promotes population diversity and allows the solution space to be explored better.
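
One possible reading of the α-adjusted selection is sketched below in plain Python. The power-law form $P_i = f_i^{\alpha} / \sum_j f_j^{\alpha}$ with $\alpha \ge 1$ is an assumption chosen because $\alpha = 1$ recovers fitness-proportionate selection and larger $\alpha$ sharpens the distribution; the exact formula is not recoverable from the text. The function names are likewise hypothetical.

```python
import random
from typing import List, Optional, Sequence


def selection_probabilities(fitness: Sequence[float], alpha: float = 1.0) -> List[float]:
    """Assumed form: P_i = f_i**alpha / sum_j f_j**alpha (requires positive fitness)."""
    if any(f <= 0 for f in fitness):
        raise ValueError("fitness values must be positive")
    weights = [f ** alpha for f in fitness]
    total = sum(weights)
    return [w / total for w in weights]


def select_parents(population: Sequence, fitness: Sequence[float], k: int,
                   alpha: float = 1.0, rng: Optional[random.Random] = None) -> list:
    """Draw k parents with replacement according to the selection probabilities."""
    rng = rng or random.Random()
    probs = selection_probabilities(fitness, alpha)
    return rng.choices(population, weights=probs, k=k)
```

With fitness values 1 and 3, α = 1 gives probabilities 0.25 and 0.75, while α = 2 shifts them to 0.1 and 0.9, showing how a larger α raises the selection pressure.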

As a further improvement of the present technical solution, in S6.6, a mutation operation that changes the connection weights is performed on the newly generated individuals: random noise is added to the current weights to adjust the connection weights, with the adjustment formula:

$$W_{mutated} = W + noise$$

where $W_{mutated}$ represents the mutated weight matrix; $W$ represents the original weight matrix; $noise$ represents the added random noise, which may be drawn from a Gaussian distribution:

$$noise \sim \mathcal{N}(0, \sigma^2)$$

where $\mathcal{N}$ represents the Gaussian distribution and $\sigma$ represents the standard deviation of the noise; that is, the noise variable follows a Gaussian distribution with mean 0 and standard deviation $\sigma$.
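
The weight mutation can be sketched in a few lines of plain Python. The function name `mutate_weights`, the nested-list weight representation, and the default `sigma` are assumptions for illustration.

```python
import random
from typing import List, Optional

Matrix = List[List[float]]


def mutate_weights(w: Matrix, sigma: float = 0.01,
                   rng: Optional[random.Random] = None) -> Matrix:
    """Return W + noise, with each noise entry drawn independently from N(0, sigma^2)."""
    rng = rng or random.Random()
    # A new matrix is built so the original weights are left untouched.
    return [[value + rng.gauss(0.0, sigma) for value in row] for row in w]
```

A small `sigma` keeps the mutated weights close to the originals, while a larger one explores further from the current individual.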

In another aspect, the present invention provides an AI artificial intelligence machine learning method for use in any one of the AI artificial intelligence machine learning systems described above, including the steps of:

S10.1, the data collection and processing unit acquires training data sets from different sources, and the preprocessing module preprocesses the collected data;

S10.2, the model training unit establishes an AI deep learning model based on an AI machine learning algorithm and trains it with the preprocessed data, wherein the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections added in the recurrent neural network improve model performance, accelerate training, and improve the training and sequence modeling capacity of the model;

S10.3, the model optimization unit optimizes the structure of the established AI deep learning model, automatically searching for and optimizing the deep learning model structure based on neural architecture search; during the search, a mutation operation that changes the connection weights is performed on newly generated individuals, adding random noise to the current weights to adjust the connection weights and thereby optimize model performance;

S10.4, finally, the decision output unit converts the model output into an actual decision, performing classification and prediction conversion and decision making according to the application scenario.

Compared with the prior art, the invention has the beneficial effects that:

1. In the AI artificial intelligence machine learning system and method, the AI deep learning model established by the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections are added in the recurrent neural network to improve model performance, accelerate training, and improve the training and sequence modeling capacity of the model.

2. In the AI artificial intelligence machine learning system and method, the model optimization unit optimizes the structure of the deep learning model based on neural architecture search, and a mutation operation that changes the connection weights is performed on newly generated individuals during the search. The mutation operation increases the differences among individuals and thereby maintains population diversity; introducing mutation helps avoid premature convergence to a locally optimal solution, and random noise is added to the current weights to adjust the connection weights so as to optimize model performance.

The neural architecture search optimizes the structure of the deep learning model based on an evolutionary algorithm. To adapt to changes in the search space, the selection probability is adjusted dynamically according to the distribution of individual fitness in each generation so as to maintain population diversity, thereby solving the problem that model performance may degrade when new data is distributed differently from the training data.

Drawings

Fig. 1 is an overall flow diagram of the present invention.

The meaning of each reference sign in the figure is:

1. data collection and processing unit; 2. model training unit; 3. model optimization unit; 4. decision output unit.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Example 1:

Referring to Fig. 1, an AI artificial intelligence machine learning system is provided, comprising:

a data collection and processing unit 1, which comprises a data collection module and a preprocessing module;

wherein the data collection module collects a data set for training. The data is typically cleaned, converted and labeled to facilitate model learning and analysis; to ensure data quality and availability, the preprocessing module cleans the collected data, which includes missing value processing, outlier processing and data deduplication.

Sources of data include sensors, databases, logs and social media.

a model training unit 2, which builds an AI deep learning model based on an AI machine learning algorithm and trains it with the preprocessed data, so that the model learns and captures the patterns, features and relationships in the preprocessed data in order to make accurate predictions, classifications or decisions;

a model optimization unit 3, which optimizes the structure of the AI deep learning model based on neural architecture search, searching and evaluating neural network structures and automatically trying different combinations of layer counts, node counts and connection modes to find the optimal network structure;

and a decision output unit 4, which converts the output of the model into an actual decision.

The decision output unit 4 is responsible for converting the output generated by the trained model into actual decisions, actions or suggestions, and its functions vary according to the specific application scenario.

The AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, with the following specific steps:

S3.1, using the convolutional neural network as a feature extractor to extract image data features from the preprocessed data, the spatial features of the image being extracted through convolution and pooling layers; these features capture local information and abstract representations of the image;

In the convolutional neural network of this embodiment, image feature extraction is completed by a convolutional layer and a pooling layer, with the following specific operations:

convolution operation:

$$\mathrm{Conv}(I, K) = I * K$$

where $I$ represents the input image and $K$ represents the convolution kernel;

pooling operation:

$$\mathrm{Pool}(I) = \max(I)$$

S3.2, mapping the image data features extracted by the convolutional neural network into the recurrent neural network, carrying over the extracted spatial features; this includes reshaping the feature map into sequence data;

S3.3, in the recurrent neural network, processing the sequence information involves the following calculation:

$$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$$

where $h_t$ represents the hidden state of the current time step; $x_t$ represents the input; $W_{hh}$ and $W_{xh}$ represent weight matrices; $b_h$ represents the bias; and $\mathrm{Activation}(\cdot)$ represents the activation function;

with the residual connection added, the formula becomes:

$$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h) + h_{t-1}$$

The output may be used for different tasks such as classifying the video sequence, generating descriptive text.
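
The reshaping of feature maps into sequence data mentioned in S3.2 can be illustrated with a short plain-Python sketch. It assumes one 2-D feature map per frame of an image sequence and flattens each map into one input vector per time step; this layout and the function names are assumptions, since the text does not specify how the reshaping is done.

```python
from typing import List

Matrix = List[List[float]]


def flatten(feature_map: Matrix) -> List[float]:
    """Flatten a 2-D feature map row by row into a single vector."""
    return [value for row in feature_map for value in row]


def to_sequence(frame_feature_maps: List[Matrix]) -> List[List[float]]:
    """Turn per-frame feature maps into a sequence of input vectors, one per time step."""
    return [flatten(fm) for fm in frame_feature_maps]
```

Each vector in the returned list would then serve as the input for one time step of the recurrent network.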

In this embodiment, the output layer performs any one of the prediction, classification or decision output tasks, which can be expressed as follows:

in the prediction task, the predicted value of the next time step is:

$$M_t = \mathrm{Activation}(W_{out} h_t + b_{out})$$

in the decision output task, $M$ represents the output of the model, threshold represents the threshold, and Decision Output (Thresholding) represents the threshold function.

Still further, in the present embodiment, in the model optimizing unit 3, the neural architecture search is based on the evolution algorithm, and the steps of optimizing the structure of the AI deep learning model are as follows:

s6.1, determining the parameter range, the connection mode, the layer number and the node number variable of the neural network structure, and forming a search space for generating and adjusting different network structures;

S6.4, selecting the better network structures according to the fitness function to serve as parents of the next-generation population;

the specific algorithm for determining the parent individuals of the next-generation population is:

$$P_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$$

where $P_i$ represents the probability that the $i$-th individual is selected; $f_i$ represents the fitness of the $i$-th individual; $N$ represents the number of individuals in the population;

S6.5, performing a crossover operation on the selected parent individuals to generate new individuals; this may be achieved by exchanging, combining or reorganizing parts of the network structure;

In this embodiment, a mutation operation that changes the connection weights is performed on each new individual: random noise is added to the current weights to adjust the connection weights, with the adjustment formula:

$$W_{mutated} = W + noise$$

The random noise causes small changes in the weights, which helps explore a wider solution space and helps the neural architecture search algorithm escape locally optimal solutions, try more possibilities and find potentially better solutions. The mutation operation increases the differences among individuals and thereby maintains population diversity. Without a mutation operation, the search algorithm may wander around a locally optimal solution; introducing mutation helps avoid such premature convergence.

S6.7, gradually evolving better network structures by repeatedly executing the selection, crossover and mutation operations until a stopping condition is reached, the stopping condition being that a preset number of iterations has been reached or that performance has converged;
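
The stopping condition in S6.7 can be sketched as a small check over the history of best fitness values. The parameter names and defaults (`max_generations`, `patience`, `tol`) and the definition of convergence as "no gain of at least `tol` over the last `patience` generations" are assumptions; the text names the two conditions without quantifying them.

```python
from typing import Sequence


def should_stop(best_fitness_history: Sequence[float], max_generations: int = 50,
                patience: int = 5, tol: float = 1e-4) -> bool:
    """Stop on a preset iteration count or when the best fitness has converged."""
    # Condition 1: the preset number of iterations has been reached.
    if len(best_fitness_history) >= max_generations:
        return True
    # Condition 2: the best fitness improved by less than tol over the last
    # `patience` generations.
    if len(best_fitness_history) > patience:
        gain = best_fitness_history[-1] - best_fitness_history[-1 - patience]
        return gain < tol
    return False
```

The check would be called once per generation, after appending that generation's best fitness to the history.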

Example 2:

embodiment 2 of the present invention differs from embodiment 1 in that this embodiment describes an AI artificial intelligence machine learning method used by the AI artificial intelligence machine learning system.

The AI artificial intelligence machine learning method is used for the AI artificial intelligence machine learning system and comprises the following steps:

S10.1, the data collection and processing unit 1 acquires training data sets from different sources, and the preprocessing module preprocesses the collected data;

S10.2, the model training unit 2 establishes an AI deep learning model based on an AI machine learning algorithm and trains it with the preprocessed data, wherein the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections added in the recurrent neural network improve model performance, accelerate training, and improve the training and sequence modeling capacity of the model;

S10.3, the model optimization unit 3 optimizes the structure of the established AI deep learning model, automatically searching for and optimizing the deep learning model structure based on neural architecture search; during the search, a mutation operation that changes the connection weights is performed on newly generated individuals, adding random noise to the current weights to adjust the connection weights and thereby optimize model performance;

S10.4, finally, the decision output unit 4 converts the model output into an actual decision, performing classification and prediction conversion and decision making according to the application scenario.

The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the above-described embodiments, and that the above-described embodiments and descriptions are only preferred embodiments of the present invention, and are not intended to limit the invention, and that various changes and modifications may be made therein without departing from the spirit and scope of the invention as claimed. The scope of the invention is defined by the appended claims and equivalents thereof.

Claims

1. An AI artificial intelligence machine learning system, comprising:

a data collection and processing unit (1), the data collection and processing unit (1) comprising a data collection module and a preprocessing module, wherein the data collection module collects a data set for training; the preprocessing module is used for cleaning the collected data;

a model training unit (2), wherein the model training unit (2) builds an AI deep learning model based on an AI machine learning algorithm, introduces residual connections into the AI deep learning model to improve model performance and accelerate training, and trains the AI deep learning model with the preprocessed data, so that the AI deep learning model learns and captures the patterns, features and relationships in the preprocessed data;

the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, with the following specific steps:

S3.1, using the convolutional neural network as a feature extractor to extract image data features from the preprocessed data, the spatial features of the image being extracted through convolution and pooling layers;

S3.2, mapping the image data features extracted by the convolutional neural network into the recurrent neural network, carrying over the extracted spatial features;

S3.3, the recurrent neural network receives the spatial features extracted by the convolutional neural network as input and processes the sequence information, with residual connections added in the recurrent neural network;

S3.4, modeling the sequence data with the recurrent neural network, and adding an output layer to the recurrent neural network to perform any one of the prediction, classification or decision output tasks;

a model optimization unit (3), which optimizes the structure of the AI deep learning model based on neural architecture search, performs on newly generated individuals in the neural architecture search a mutation operation that changes the connection weights, and adds random noise to the current weights to adjust the connection weights, thereby optimizing model performance;

the neural architecture search is based on an evolutionary algorithm, and the structural optimization of the AI deep learning model comprises the following steps:

S6.1, determining the parameter ranges, connection modes, number of layers and number of nodes of the neural network structure, which together form the search space;

S6.2, randomly generating an initial population of network structures as the starting point of the evolutionary algorithm;

S6.3, training and evaluating each network structure, using a training set and a validation set to assess the performance and generalization capability of each network structure;

S6.4, selecting the better network structures according to the fitness function to serve as parents of the next-generation population;

the specific algorithm for determining the parent individuals of the next-generation population is:

$$P_i = \frac{f_i}{\sum_{j=1}^{N} f_j}$$

where $P_i$ represents the probability that the $i$-th individual is selected; $f_i$ represents the fitness of the $i$-th individual; $N$ represents the number of individuals in the population;

to adapt to changes in the search space, the selection probability is adjusted dynamically according to the distribution of individual fitness in each generation, and the selection formula is refined accordingly with a parameter $\alpha$,

where $\alpha$ represents a parameter for adjusting the selection probability distribution;

S6.5, performing a crossover operation on the selected parent individuals to generate new individuals;

S6.6, performing a mutation operation on the newly generated individuals, introducing randomness to change some characteristics of an individual; the mutation operation changes the connection weights by adding noise to the current weights, with the adjustment formula:

$$W_{mutated} = W + noise$$

where $W_{mutated}$ represents the mutated weight matrix; $W$ represents the original weight matrix; $noise$ represents the added random noise;

wherein:

$$noise \sim \mathcal{N}(0, \sigma^2)$$

where $\mathcal{N}$ represents the Gaussian distribution and $\sigma$ represents the standard deviation of the noise;

S6.7, gradually evolving better network structures by repeatedly executing the selection, crossover and mutation operations until a stopping condition is reached;

S6.8, after the evolution iterations are finished, selecting the optimal network structure from the final population as the final result;

and the decision output unit (4) is used for converting the output of the model into an actual decision.
2. The AI artificial intelligence machine learning system of claim 1, wherein: the data cleaning comprises missing value processing, outlier processing and data deduplication processing.
3. The AI artificial intelligence machine learning system of claim 1, wherein: in S3.1, extraction of the image data features in the convolutional neural network is completed by a convolutional layer and a pooling layer, with the following specific operations:

convolution operation:

$$\mathrm{Conv}(I, K) = I * K$$

where $I$ represents the input image and $K$ represents the convolution kernel;

pooling operation:

$$\mathrm{Pool}(I) = \max(I)$$
4. The AI artificial intelligence machine learning system of claim 3, wherein: in S3.3, the calculation by which the recurrent neural network processes sequence information is:

$$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h)$$

where $h_t$ represents the hidden state of the current time step; $x_t$ represents the input; $W_{hh}$ and $W_{xh}$ represent weight matrices; $b_h$ represents the bias; and $\mathrm{Activation}(\cdot)$ represents the activation function;

adding the residual connection to the above formula, the formula becomes:

$$h_t = \mathrm{Activation}(W_{hh} h_{t-1} + W_{xh} x_t + b_h) + h_{t-1}$$

where $h_{t-1}$ represents the hidden state of the previous time step.
5. The AI artificial intelligence machine learning system of claim 4, wherein: in S3.4, in the prediction task, the predicted value of the next time step is:

$$M_t = \mathrm{Activation}(W_{out} h_t + b_{out})$$

where $M_t$ represents the predicted output at the next time step; $h_t$ represents the hidden state of the current time step; $W_{out}$ represents the weights of the output layer; $b_{out}$ represents the bias of the output layer;

in the classification task, the Softmax activation function is used to obtain the output probability distribution over the classes:

$$\mathrm{Softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{N} e^{z_j}}$$

where $z$ represents the input vector; $N$ represents the number of categories; $\mathrm{Softmax}(z)_i$ represents the probability of the $i$-th category;

in the decision output task, a threshold function, Decision Output (Thresholding), determines the decision result from the output value,

where $M$ represents the output of the model; threshold represents the threshold; and Decision Output (Thresholding) represents the threshold function.
6. An AI artificial intelligence machine learning method for the AI artificial intelligence machine learning system according to any one of claims 1-5, characterized in that it comprises the following steps:

S10.1, the data collection and processing unit (1) acquires data sets for training from different sources, and the preprocessing module preprocesses the collected data;

S10.2, the model training unit (2) establishes an AI deep learning model based on an AI machine learning algorithm and trains it with the preprocessed data, wherein the AI machine learning algorithm is a fusion model based on a recurrent neural network and a convolutional neural network, and residual connections added in the recurrent neural network improve model performance, accelerate training, and improve the training and sequence modeling capacity of the model;

S10.3, the model optimization unit (3) optimizes the structure of the established AI deep learning model, automatically searching for and optimizing the deep learning model structure based on neural architecture search; during the search, a mutation operation that changes the connection weights is performed on newly generated individuals, adding random noise to the current weights to adjust the connection weights and thereby optimize model performance;

S10.4, finally, the decision output unit (4) converts the model output into an actual decision, performing classification and prediction conversion and decision making according to the application scenario.