WO2017206936A1 - Machine learning-based network model construction method and apparatus - Google Patents
Machine learning-based network model construction method and apparatus
- Publication number
- WO2017206936A1 (PCT/CN2017/086917)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- network
- data processing
- sub
- processing step
- data
- Prior art date
Classifications
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- G06F18/2414—Smoothing the distance, e.g. radial basis function networks [RBFN]
- G06F3/047—Digitisers, e.g. for touch screens or touch pads, characterised by the transducing means using sets of wires, e.g. crossed wires
- G06N3/044—Recurrent networks, e.g. Hopfield networks
- G06N3/045—Combinations of networks
- G06N3/0455—Auto-encoder networks; Encoder-decoder networks
- G06N3/047—Probabilistic or stochastic networks
- H04L41/145—Network analysis or design involving simulating, designing, planning or modelling of a network
- H04L41/16—Arrangements for maintenance, administration or management of data switching networks using machine learning or artificial intelligence
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06F18/23—Clustering techniques
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
- G06N20/20—Ensemble learning
- G06N5/01—Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
- H04L41/12—Discovery or management of network topologies
Definitions
- the present application relates to the field of Internet technologies, and in particular, to a network model construction method and apparatus based on machine learning.
- Network models include: non-neural network models and neural network models.
- a neural network (Neural Network, NN) is a complex network system formed by interconnecting a large number of processing units (called neurons), and is a highly complex nonlinear dynamic learning system.
- the basis of a neural network lies in its neurons; a neural network model is represented by the network topology, node characteristics, and learning rules of the neurons.
- the embodiment of the present application provides a network model construction method and device based on machine learning, which simplifies the construction process of the network model and improves the model construction efficiency.
- a first aspect of the embodiments of the present application provides a network model construction method based on machine learning, which may include:
- acquiring a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow; hierarchically constructing at least one sub-network according to the data processing flow and the reference data set; performing optimization training on the at least one sub-network by using the reference data set; and merging the at least one sub-network after optimization training to form a target network model.
- the acquiring of the data processing flow of the original network model and of the reference data set generated by the original network model in the data processing flow includes: acquiring at least one data processing step performed by the original network model in the data processing flow; acquiring the operation data generated by the original network model when performing each data processing step; and extracting part or all of the data from that operation data to form the reference data set;
- the reference data set includes at least one set of input/output data corresponding to each data processing step.
- the hierarchically constructing the at least one sub-network according to the data processing flow and the reference data set includes:
- the network main structure of the sub-network equivalent to each data processing step is queried from a preset equivalent correspondence table; the input layer structure and the output layer structure of the sub-network equivalent to each data processing step are determined according to the at least one set of input/output data corresponding to each data processing step; and the sub-network equivalent to each data processing step is constructed according to its network main structure, input layer structure, and output layer structure, wherein one sub-network is equivalent to one data processing step.
- the parameters of the sub-network equivalent to each data processing step are optimized and adjusted according to a neural network training optimization algorithm, where the parameters include at least one of a network node, a weight, and a training rate.
- at least one set of input/output data corresponding to the data processing step equivalent to the seed network, and at least one set of input/output data corresponding to the data processing step equivalent to the merged object network, are used as a reference to optimize and adjust the parameters of the merge-connected network;
- the merge-connected network is then used as the seed network, and the foregoing process is iterated until the at least one sub-network is fully merged and connected to form the target network model.
- the method also includes:
- an intermediate hidden layer is added between the seed network and the merged object network, and the seed network and the merged object network are merge-connected in a fully connected manner through the intermediate hidden layer.
- a second aspect of the embodiments of the present application provides a machine learning-based network model construction apparatus, which may include:
- An obtaining module configured to acquire a data processing flow of the original network model and a reference data set generated by the original network model in the data processing flow;
- a hierarchical construction module configured to hierarchically construct at least one sub-network according to the data processing flow and the reference data set;
- An optimization training module configured to perform optimization training on the at least one sub-network by using the reference data set
- the merging module is configured to combine the at least one sub-network after the optimized training to form a target network model.
- the obtaining module includes:
- a step obtaining unit configured to acquire at least one data processing step performed by the original network model in the data processing flow
- a data acquisition unit configured to acquire operation data generated by the original network model when performing each data processing step
- a sampling extracting unit configured to extract part or all of the data from the running data generated by the original network model when performing each data processing step, to form a reference data set
- the reference data set includes at least one set of input/output data corresponding to each data processing step.
- the hierarchical construction module includes:
- a query unit configured to separately query, from a preset equivalent correspondence table, the network main structure of the sub-network equivalent to each data processing step;
- a determining unit configured to determine an input layer structure and an output layer structure of a sub-network equivalent to each data processing step according to at least one set of input/output data corresponding to each data processing step;
- a construction unit configured to construct the sub-network equivalent to each data processing step according to the network main structure, the input layer structure, and the output layer structure of the sub-network equivalent to each data processing step, wherein one sub-network is equivalent to one data processing step.
- the optimization training module includes:
- a reading unit configured to sequentially read at least one set of input/output data corresponding to each data processing step from the reference data set;
- an adjusting unit configured to optimize and adjust, according to the at least one set of input/output data corresponding to each data processing step and a neural network training optimization algorithm, the parameters of the sub-network equivalent to each data processing step, where the parameters include at least one of a network node, a weight, and a training rate.
- the merge module includes:
- a seed selection unit configured to select any one of the at least one sub-network as a seed network
- a merge object selection unit configured to obtain a merge order according to each data processing step, and to select, according to the merge order, a sub-network other than the seed network from the at least one sub-network as the merged object network;
- a removal unit configured to remove an input layer and an output layer between the seed network and the merged object network
- a merge connection unit configured to perform a merge connection between the seed network and the merged object network by using a full connection
- an optimization adjustment unit configured to, if the merge connection is successful, use at least one set of input/output data corresponding to the data processing step equivalent to the seed network and at least one set of input/output data corresponding to the data processing step equivalent to the merged object network as a reference to optimize and adjust the parameters of the merge-connected network;
- the seed selection unit is further configured to use the merge-connected network as the seed network, and the merge object selection unit, the removal unit, the merge connection unit, and the optimization adjustment unit iteratively perform their corresponding processing until the at least one sub-network is fully merge-connected to form the target network model.
- the merging module further includes:
- an adding unit configured to, if the merge connection fails, add an intermediate hidden layer between the seed network and the merged object network, so that the merge connection unit merge-connects the seed network and the merged object network in a fully connected manner through the intermediate hidden layer.
- the actual running data generated by the original network model in the data processing flow is used as a reference data set, at least one equivalent sub-network is hierarchically constructed and optimization-trained, and the sub-networks are finally merged to form a target network model. Because the actual operational data of the original network model is used to flexibly and quickly construct each level of the target network model, and the levels are then merged to form the target network model, it is no longer necessary to envision the overall structure of the target network model from scratch; this simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model adopt a divide-and-conquer approach, optimizing each sub-network separately and then re-merging, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
- FIG. 1 is a flowchart of a method for constructing a network model based on machine learning according to an embodiment of the present application
- FIG. 2 is a flowchart of another method for constructing a network model based on machine learning according to an embodiment of the present application
- FIG. 2A is a schematic diagram of an original network model in an embodiment of the present application.
- FIG. 3a is a schematic diagram of a construction process and an optimization training process of the sub-network b1 that is equivalent to the data processing step a1 in the original network model according to an embodiment of the present application;
- FIG. 3b is a schematic diagram of a process of merging sub-networks b1 and b2 and an optimized training process for a merged network according to an embodiment of the present application;
- FIG. 3c is another schematic diagram of a process of merging sub-networks b1 and b2 and an optimized training process for a merged network according to an embodiment of the present application;
- FIG. 4 is a schematic structural diagram of a network model construction apparatus based on machine learning according to an embodiment of the present application
- FIG. 5 is a schematic structural diagram of another network model construction apparatus based on machine learning according to an embodiment of the present application.
- in the process of constructing a neural network model through machine learning, the data of an existing network model needs to be labeled to form a training set, and a neural network model is then constructed from scratch. Since training a neural network requires a large amount of manually annotated data as the training set, this process requires extensive human-computer interaction and consumes substantial equipment resources. In addition, constructing a neural network model from scratch requires complex parameter adjustment of the constructed network model as a whole, so the workload is large and model construction efficiency is low.
- Network models include: non-neural network models and neural network models.
- the neural network is a complex network system formed by interconnecting a large number of processing units (called neurons in the neural network), and is a highly complex nonlinear dynamic learning system.
- Neurons are the basic unit that constitutes a neural network.
- the neural network model is represented by the network topology, node characteristics, and learning rules of the neurons. Compared with the non-neural network model, the neural network model has stronger deep learning ability and better environmental adaptability.
- the embodiment of the present application provides a network model construction method and device based on machine learning, which can analyze the data processing flow of the original network model, and use the actual running data generated by the original network model in the data processing flow as a reference data set. And hierarchically constructing at least one equivalent sub-network, and performing optimization training on at least one sub-network, and finally merging to form a target network model.
- by constructing the target network model through this machine learning process, the following beneficial effects can be obtained:
- the target network model is obtained through joint optimization and adjustment of at least one sub-network equivalent to the data processing flow of the original network model; compared with the original network model, the target network model no longer needs to consider the integration and adaptation problems between the various data processing steps, and the joint optimization process is based on the actual operational data of the original network model, thereby ensuring the reliability of the processing performance of the target network model.
- the flexibility of adjustment is high. Since the target network model is driven by training on the actual operational data of the original network model, if the target network model needs to be adjusted, only the parameters involved in the training process need to be adjusted, without considering the problem of parameter allocation between the various hierarchical structures of the target network model. Moreover, if the original network model generates new operational data, the target network model can be directly adjusted by using the new operational data without excessive manual intervention, thereby saving labor costs and reducing the workload of model construction.
- the use of data is more adequate.
- the target network model is a deep network formed by joint optimization of at least one sub-network, and has high performance reliability. Compared with the original network model, it can continuously iterate parameters to maximize the value of data and achieve better performance optimization.
- the target network model is an equivalent replacement of the original network model by means of machine learning.
- the actual operational data generated by the original network model in the data processing flow is utilized, and no manually understood or cognitively annotated data is added, so the target network model can inherit the explanatory elements of the original network model to a certain extent, and is more suitable for scenarios with higher interpretability requirements.
- the original network model may be a non-neural network model or a neural network model
- the target network model may be a neural network model.
- in the following, the original network model is described by taking a non-neural network model as an example, and the target network model is described by taking a neural network model as an example.
- the embodiment of the present application discloses a network model construction method based on machine learning.
- the method may include the following steps S101 to S104:
- the original network model performs a series of data processing steps to form a complete data processing flow.
- the data processing steps herein may include, but are not limited to, at least one of the following:
- the single machine learning step may be implemented based on a classification algorithm, a clustering algorithm, a component analysis algorithm, a dimensionality reduction mapping algorithm, or an encoder method.
- the classification algorithm may include, but is not limited to: support vector machine (SVM), decision tree, threshold classifier, logistic regression, shallow neural network, gradient boosting decision tree (GBDT), Boosting (a method to improve the accuracy of weak classification algorithms), k-nearest neighbor (k-Nearest Neighbor, KNN), Bayesian classifier, random forest, and their possible variants.
- the clustering algorithm may include, but is not limited to: partition-based clustering (K-means), K-centers, MeanShift, spectral clustering, density-based clustering (Density-Based Spatial Clustering of Applications with Noise, DBSCAN), and affinity propagation.
- the component analysis algorithm may include, but is not limited to: Principal Component Analysis (PCA), Canonical Correspondence Analysis (CCA), factor analysis, Fourier transform, and wavelet analysis.
- the dimensionality reduction mapping algorithm may include, but is not limited to: Mixture Discriminant Analysis (MDA), Fisher projection, and IsoMap (a global optimization algorithm).
- the encoder method may include, but is not limited to: Linear Discriminant Analysis (LDA), Probabilistic Latent Semantic Analysis (PLSA), Latent Semantic Analysis (LSA), and Sparse Coding.
- the data statistics step may be implemented based on a data statistics algorithm, which may include, but is not limited to: summation, averaging, quantification, maximum value, central statistical moment, chi-square statistic, and the like.
- the sequence analysis step may be implemented based on a sequence analysis algorithm, which may include, but is not limited to: Autoregressive Integrated Moving Average (ARIMA) regression, Kalman filtering, and the like.
- the function processing step may use functions that include, but are not limited to: a linear mapping function, a transformation function involving information entropy, an analytic function, a transcendental function, and the like.
- the data editing processing step may include, but is not limited to: data merging, data filtering, data separation, data transformation, and the like.
- in step S101, the data processing flow of the original network model may be analyzed to record the at least one data processing step involved in its operation.
- the original network model generates operation data when performing each of the data processing steps described above, where the operation data may include, but is not limited to: the input data, intermediate data, and output data used and obtained by the original network model during actual operation; the labeled input data, intermediate data, and output data used by the original network model during training or testing; or simulated input data, intermediate data, and output data manually injected into the original network model. This step then extracts part or all of the operation data of each data processing step to form the reference data set.
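As a hedged sketch of how such a reference data set could be assembled, the code below wraps each data processing step so that every invocation records an input/output pair, then samples part or all of the records. The `RecordingStep` interface, step names, and sampling helper are illustrative assumptions, not part of the patent.

```python
# Sketch: wrap each data processing step of the original model so that the
# input/output pair of every invocation is recorded, then extract part or
# all of the records as the reference data set.
import random

class RecordingStep:
    def __init__(self, name, fn):
        self.name = name      # e.g. "a1" for the classification step
        self.fn = fn          # the original data processing step
        self.records = []     # recorded (input, output) pairs

    def __call__(self, x):
        y = self.fn(x)
        self.records.append((x, y))
        return y

def build_reference_data_set(steps, sample_ratio=1.0):
    # Extract part (sampled) or all of each step's operation data; the
    # result maps step name -> at least one set of input/output data.
    ref = {}
    for step in steps:
        if step.records:
            n = max(1, int(len(step.records) * sample_ratio))
            ref[step.name] = random.sample(step.records, n)
    return ref
```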
- S102 Hierarchically construct at least one sub-network according to the data processing flow and the reference data set.
- the data processing flow records at least one data processing step performed by the original network model, the reference data set including at least one set of input/output data corresponding to each data processing step.
- the idea of hierarchical construction is that each data processing step of the original network model can be performed by a sub-network with equivalent functions, so one data processing step corresponds to the network main structure of one sub-network; at the same time, the input/output layers of the sub-network can be determined from the input/output data of the data processing step.
- for example, step a1 is equivalent to the sub-network b1 at the first level, step a2 is equivalent to the sub-network b2 at the second level, and so on, until step a4 is equivalent to the sub-network b4 at the fourth level.
- the purpose of the optimization training is to continuously adjust the parameters of the at least one sub-network by using the data in the reference data set as a reference, so that the performance index of the sub-network is the same as the performance index of the corresponding data processing step in the original network model.
- the parameters of the subnetwork may include at least one of a network node, a weight, and a training rate.
- at least one set of input/output data extracted from each data processing step is used to perform optimization training on the sub-network equivalent to that data processing step.
- for example: at least one set of input/output data corresponding to step a1 in the reference data set is used for the optimization training of the sub-network b1; at least one set of input/output data corresponding to step a2 is used for the optimization training of the sub-network b2; and so on, until at least one set of input/output data corresponding to step a4 is used for the optimization training of the sub-network b4.
- S104 Combine at least one sub-network after optimization training to form a target network model.
- in the process of merging, it is necessary to iteratively merge and connect the at least one sub-network after optimization training.
- the target network model is a deep network with high performance reliability.
- the machine learning-based network model construction method of the embodiment of the present application analyzes the data processing flow of the original network model, uses the actual running data generated by the original network model in the data processing flow as a reference data set, hierarchically constructs at least one equivalent sub-network, performs optimization training on the at least one sub-network, and finally merges the sub-networks to form a target network model. Because the actual operational data of the original network model is used to flexibly and quickly construct each level of the target network model, and the levels are then merged to form the target network model, it is no longer necessary to envision the overall structure of the target network model from scratch; this simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model adopt a divide-and-conquer approach, optimizing each sub-network separately and then re-merging, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
- the embodiment of the present application discloses another network model construction method based on machine learning.
- the method may include the following steps S201-S204:
- the original network model performs a series of data processing steps to form a complete data processing flow.
- the data processing flow recording the at least one data processing step performed by the original network model may be analyzed; for example, assume that the obtained data processing flow of the original network model, step a1 - step a2 - step a3 - step a4, has four data processing steps.
- the data to be processed is referred to as original data; the original data is processed through a series of data processing steps, that is, through each data processing step in the original network model, and the output result is finally obtained.
- a data processing step is used to perform a particular processing function, such as classifying data, or statistics, and the like.
- Each data processing step constitutes the above data processing flow.
- the data processing flow may include: one or more sub-data processing flows consisting of data processing steps.
- the result obtained through each sub-data processing flow is referred to as a sub-output result; after each sub-output result is obtained, the sub-output results are combined to obtain the above output result.
- in the original network model, each of the data processing steps is referred to as a network node, and each of the sub-data processing flows is referred to as a sub-path; in each sub-path, the network nodes are connected into a unidirectional path in sequence according to the execution order of the above data processing steps, and the original network model is the network model obtained by combining the above sub-paths.
- three sub-data processing flows are included in a data processing flow constituting the original network model.
- the first sub-data processing flow includes: the node 11 corresponding to the classification processing step of the data, the node 12 corresponding to the statistical processing step of the data, and the node 13 corresponding to the regression processing step of the data.
- the second sub-data processing flow includes: the node 21 corresponding to the clustering processing step of the data, and the node 22 corresponding to the function mapping processing step of the data.
- the third sub-data processing flow includes: the node 31 corresponding to the component analysis processing step of the data, the node 32 corresponding to the statistical processing step of the data (that is, the above node 12), the node 33 corresponding to the regression processing step of the data (that is, the above node 13), and the node 34 corresponding to the sequence analysis processing step of the data.
- the raw data is processed by each data processing step in the first sub-data processing flow to obtain a first sub-output result
- the original data is processed by each data processing step in the second sub-data processing flow to obtain a second sub-output result.
- the raw data is processed by each data processing step in the third sub-data processing flow to obtain a third sub-output result.
- FIG. 2A is a schematic diagram of an original network model in an embodiment of the present application.
- in step S201, at least one data processing step in the data processing flow constituting the original network model may be acquired, for example, the classification processing step of the data corresponding to the node 11 in FIG. 2A.
- the original network model generates operation data at each data processing step, where the operation data may include, but is not limited to: the input data, intermediate data, and output data used and obtained by the original network model in actual operation; the labeled input data, intermediate data, and output data used by the original network model during training or testing; or simulated input data, intermediate data, and output data manually injected into the original network model.
- the operation data generated when the original network model executes each data processing step is respectively obtained.
- for example, the operation data generated when the original network model performs step a1, the operation data generated when step a2 is performed, the operation data generated when step a3 is performed, and the operation data generated when step a4 is performed are separately acquired.
- the operational data generated when each of the data processing steps described above is executed may be acquired, for example, the operational data generated when the classification processing step of the data corresponding to the node 11 is executed.
- for example, if two sets of input/output data were generated in a data processing step, this step may sample and extract one set to add to the reference data set, or extract both sets in full into the reference data set.
- the reference data set includes at least one set of input/output data corresponding to each data processing step.
- Steps S201-S203 of this embodiment may be specific refinement steps of step S101 shown in FIG. 1.
- the data processing steps may include, but are not limited to, at least one of the following: 1 single machine learning step; 2 data statistics step; 3 sequence analysis step; 4 function processing step; 5 data editing processing step.
- for each type of data processing step, there is an equivalent network main structure of a sub-network.
- for the equivalence relationship, see Table 1 below:
- [Table 1: equivalence correspondence between data processing steps and network main structures; not reproduced in this text.]
- in Table 1, any network main structure, or any combination of network main structures, corresponds to the data processing step in the left column. For example, the network main structure equivalent to the above data statistics step includes a pooling layer, a convolution layer, and a fully connected layer; any one of the pooling layer, the convolution layer, and the fully connected layer, or a combination of these structures, is then equivalent to the data statistics step. In actual application, if one data processing step has several equivalent network main structures, the final network main structure of the sub-network can be determined through actual training effect feedback, for example, by selecting the network main structure with the smallest error rate. It can be understood that Table 1 is not exhaustive; if there are other data processing steps and equivalent network main structures, they can be added to Table 1.
- each network main structure may be a neural subnetwork composed of at least one neuron.
- the neuron may be a narrow-sense or generalized neuron. In this step, the network main structure of an equivalent sub-network is determined for each data processing step in the original network model according to the above equivalent correspondence table.
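Since Table 1 itself is not reproduced here, the sketch below shows one plausible encoding of such an equivalent correspondence table as a lookup from step type to candidate network main structures, with the candidate chosen by actual training feedback as the description suggests. The data statistics entry follows the example in the text; all other entries are placeholders.

```python
# Hypothetical encoding of the equivalent correspondence table (Table 1):
# each data processing step type maps to its candidate network main
# structures.
EQUIVALENCE_TABLE = {
    "single_machine_learning": ["fully_connected"],
    "data_statistics": ["pooling", "convolution", "fully_connected"],
    "sequence_analysis": ["recurrent"],
    "function_processing": ["fully_connected"],
    "data_editing": ["fully_connected"],
}

def select_main_structure(step_type, error_rate_of):
    # When a step has several equivalent main structures, keep the one
    # with the smallest error rate under actual training effect feedback.
    return min(EQUIVALENCE_TABLE[step_type], key=error_rate_of)
```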
- S205 Determine an input layer structure and an output layer structure of the sub-network equivalent to each data processing step according to at least one set of input/output data corresponding to each data processing step.
- the dimension of the input layer of the equivalent sub-network is determined according to the input data corresponding to each data processing step, and the dimension of the output layer of the equivalent sub-network is determined according to the output data corresponding to each data processing step; that is, the input layer of the sub-network equivalent to a certain data processing step has the same dimension as the input data corresponding to that data processing step, and the output layer of that sub-network has the same dimension as the output data corresponding to that data processing step.
- the dimensions here may include: the source of the data, the amount of data, and the like.
- S206 Construct a sub-network equivalent to each data processing step according to a network main structure, an input layer structure, and an output layer structure of a sub-network equivalent to each data processing step, wherein one sub-network is equivalent to one data processing step.
- one sub-network being equivalent to one data processing step means that, under the same input, the processing result produced by the sub-network is the same as or similar to the processing result produced by the data processing step, where similar means that the difference between the two processing results is less than a preset threshold.
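A minimal PyTorch sketch of steps S205-S206, assuming a fully connected main structure: the input and output layers are sized from the dimensions of the recorded input/output data, and equivalence is checked as a difference below a preset threshold under the same input. The hidden width and the threshold are arbitrary assumptions.

```python
# Sketch (PyTorch): build a sub-network from its main structure plus
# input/output layers sized from the reference data, and check equivalence.
import torch
import torch.nn as nn

def build_subnetwork(in_dim, out_dim, hidden=64):
    return nn.Sequential(
        nn.Linear(in_dim, hidden),    # input layer, sized from input data
        nn.ReLU(),
        nn.Linear(hidden, hidden),    # network main structure
        nn.ReLU(),
        nn.Linear(hidden, out_dim),   # output layer, sized from output data
    )

def is_equivalent(subnet, step_fn, inputs, threshold=1e-2):
    # Equivalence check: under the same input, the two processing results
    # differ by less than a preset threshold.
    with torch.no_grad():
        return (subnet(inputs) - step_fn(inputs)).abs().mean().item() < threshold
```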
- Steps S204-S206 of this embodiment may be specific refinement steps of step S102 shown in FIG. 1.
- according to the at least one set of input/output data corresponding to each data processing step, the neural network training optimization algorithm optimizes and adjusts the parameters of the sub-network equivalent to each data processing step, where the parameters include at least one of a network node, a weight, and a training rate.
- the neural network training optimization algorithm may include, but is not limited to, at least one of the following: the stochastic gradient descent algorithm, RMSProp (an optimization algorithm), the momentum method, AdaGrad (an algorithm that assigns a different learning rate to each parameter), and AdaDelta (an optimization algorithm).
- for example: at least one set of input/output data corresponding to step a1 is read from the reference data set for the optimization training of the sub-network b1 equivalent to step a1; at least one set of input/output data corresponding to step a2 is read for the optimization training of the sub-network b2 equivalent to step a2; and so on, until at least one set of input/output data corresponding to step a4 is read from the reference data set for the optimization training of the sub-network b4 equivalent to step a4.
- Steps S207-S208 of this embodiment may be specific refinement steps of step S103 shown in FIG. 1.
- FIG. 3a shows a schematic diagram of the construction process and the optimization training process of the sub-network b1 which is equivalent to the data processing step a1 in the original network model.
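The following is a hedged PyTorch sketch of steps S207-S208: each sub-network is fitted to the at least one set of input/output pairs read for its equivalent data processing step. Stochastic gradient descent is shown; torch.optim.RMSprop, Adagrad, and Adadelta are drop-in replacements, consistent with the list of optimization algorithms above. Learning rate and epoch count are arbitrary assumptions.

```python
# Sketch: optimization training of one sub-network against the recorded
# input/output pairs of its equivalent data processing step.
import torch
import torch.nn as nn

def train_subnetwork(subnet, io_pairs, lr=0.01, epochs=100):
    opt = torch.optim.SGD(subnet.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in io_pairs:             # reference data for this step
            opt.zero_grad()
            loss_fn(subnet(x), y).backward()
            opt.step()
    return subnet
```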
- S210 Select a sub-network other than the seed network from the at least one sub-network as the merged object network according to a preset merge order.
- the merge order can be set according to each data processing step.
- the preset merge order includes any one of the following: the sequential execution order of the at least one data processing step, the reverse execution order of the at least one data processing step, and the order of structural similarity between the at least one sub-network from high to low.
- for example, in steps S209-S210: the original network model performs the four data processing steps step a1 - step a2 - step a3 - step a4, and the equivalent sub-networks are: sub-network b1, sub-network b2, sub-network b3, and sub-network b4.
- if the sub-network b2 is selected as the seed network, then: 1) if the sequential execution order of the at least one data processing step is followed, the sub-network b3 equivalent to step a3 should be selected as the merged object network; 2) if the reverse of the sequential execution order of the at least one data processing step is followed, the sub-network b1 equivalent to step a1 should be selected as the merged object network; 3) if the order of structural similarity between the at least one sub-network is followed, the sub-network with the highest structural similarity to b2 should be selected as the merged object network; assuming that the network main structures of sub-network b2 and sub-network b4 are both fully connected layer structures, the sub-network b4 is selected as the merged object network.
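As a compact illustration of the three preset merge orders, the sketch below picks the next merged object network relative to the seed network. The index-based interface and the `similarity` measure are hypothetical assumptions, not from the patent.

```python
# Sketch of the three preset merge orders; sub-networks are indexed in the
# execution order of their equivalent steps.
def pick_merge_object(subnets, seed_idx, merged, order="forward", similarity=None):
    rest = [i for i in range(len(subnets)) if i != seed_idx and i not in merged]
    if order == "forward":                       # sequential execution order
        later = [i for i in rest if i > seed_idx]
        return min(later) if later else min(rest)
    if order == "reverse":                       # reverse execution order
        earlier = [i for i in rest if i < seed_idx]
        return max(earlier) if earlier else max(rest)
    # "similarity": the sub-network structurally most similar to the seed
    return max(rest, key=lambda i: similarity(subnets[seed_idx], subnets[i]))
```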
- this step needs to remove the input layer or the output layer at the junction of the two sub-networks (in the bottom-up drawing convention, data flows from the lower sub-network into the upper sub-network, so the two layers removed are those at their junction): if the seed network is the upper sub-network and the merged object network is the lower sub-network, the input layer of the seed network and the output layer of the merged object network need to be removed; if the seed network is the lower sub-network and the merged object network is the upper sub-network, the output layer of the seed network and the input layer of the merged object network need to be removed.
- in step S212, the seed network and the merged object network are merge-connected in a fully connected manner. If the merge connection is successful, the process goes to step S214; if the merge connection fails, the process goes to step S213.
- assuming that the seed network is A and the merged object network is B, each neuron serving as an output in the seed network is mapped, through a set weight matrix W, to a neuron serving as an input in the merged object network; that is, a mapping relationship is established between each output neuron of the seed network and each input neuron of the merged object network. For example, the mapping relationship between each neuron of sub-network b1 and each neuron of sub-network b2 is established by a weight matrix to form the connection.
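A hedged PyTorch sketch of steps S211-S212, assuming sub-networks built as in the earlier sketch: the two layers at the junction (the output layer of the network whose data feeds forward, and the input layer of the network receiving it) are removed, and the weight matrix W is realized as a fully connected nn.Linear layer mapping every remaining output neuron of A to every remaining input neuron of B.

```python
# Sketch: merge-connect two sub-networks through a weight matrix W.
import torch.nn as nn

def merge_connect(prev_net, next_net):
    # prev_net feeds next_net in the data flow.
    a_dim = prev_net[-1].in_features   # width behind the removed output layer
    b_dim = next_net[0].out_features   # width behind the removed input layer
    body_a = list(prev_net.children())[:-1]   # drop prev_net's output layer
    body_b = list(next_net.children())[1:]    # drop next_net's input layer
    W = nn.Linear(a_dim, b_dim)        # the fully connected mapping W
    return nn.Sequential(*body_a, W, *body_b)
```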
- at least one set of input/output data corresponding to the data processing step equivalent to the seed network, and at least one set of input/output data corresponding to the data processing step equivalent to the merged object network, are used as a reference to optimize and adjust the parameters of the merge-connected network.
- both the seed network and the merged object network may be sub-neural networks, and the merge-connected network may be optimally adjusted by using the input/output data determined in this step.
- specifically, this step uses the output data corresponding to the data processing step equivalent to the upper-layer sub-network and the input data corresponding to the data processing step equivalent to the lower-layer sub-network as a reference, and optimizes and adjusts the parameters of the merge-connected network.
- S215 The merge-connected network is used as the seed network, and the processes of the foregoing steps S210-S214 are iterated until the at least one sub-network is fully merge-connected to form the target network model.
- the foregoing steps S210-S214 may be repeatedly performed in a preset merge order; for example, according to the execution order of each data processing step in each sub-data processing flow, the sub-networks corresponding to the data processing steps in each sub-data processing flow are sequentially merged to obtain the merged sub-network.
- the sub-network corresponding to each data processing step in the first sub-data processing flow is combined to obtain the merged first sub-neural network.
- the sub-networks corresponding to the data processing steps in the second sub-data processing flow are merged to obtain the merged second sub-neural network, and the sub-networks corresponding to the data processing steps in the third sub-data processing flow are merged to obtain the merged third sub-neural network.
- the merged first sub-neural network, the merged second sub-neural network, and the merged third sub-neural network are then merged, in the preset merge order, in the merging process corresponding to the above node 00, to obtain the target network model, such as a neural network model.
- Steps S209-S215 of this embodiment may be specific refinement steps of step S104 shown in FIG. 1.
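Tying steps S209-S215 together, the sketch below iterates the select/merge/fine-tune cycle using the helper sketches above (`pick_merge_object`, `merge_connect`, `train_subnetwork`). The per-step data lookup and the assumption that records are aligned across steps, so that pipeline-level input/output pairs can be formed, are illustrative, not from the patent.

```python
# Sketch of the overall iterative merge process (S209-S215);
# ref_data[name] holds the recorded input/output pairs of the step
# equivalent to the sub-network called `name`.
def build_target_network(subnets, names, ref_data, order="forward"):
    seed_idx, merged = 0, set()                   # S209: pick a seed network
    seed = subnets[seed_idx]
    pairs = list(ref_data[names[seed_idx]])
    while len(merged) < len(subnets) - 1:
        obj_idx = pick_merge_object(subnets, seed_idx, merged, order)  # S210
        seed = merge_connect(seed, subnets[obj_idx])                   # S211-S212
        # S214: seed-side inputs paired with object-side outputs become
        # the reference for fine-tuning the merge-connected network.
        pairs = [(x, y) for (x, _), (_, y) in zip(pairs, ref_data[names[obj_idx]])]
        train_subnetwork(seed, pairs)
        merged.add(obj_idx)                       # S215: iterate with new seed
    return seed
```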
- the above steps S209-S215 can be referred to FIG. 3b and FIG. 3c together.
- FIG. 3b and FIG. 3c respectively show a merge process of the sub-networks b1 and b2 and a schematic diagram of an optimized training process for the merged connected network.
- the intermediate hidden layer functions to adapt the output of the previous sub-network to the input of the following sub-network. For example, in FIG. 3b, if the format of the output of the sub-network b1 does not match the format of the input of the sub-network b2, the output of the sub-network b1 can be adjusted through the processing of the intermediate hidden layer so that the adjusted output of the sub-network b1 conforms to the form of the input of the sub-network b2.
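A small sketch of step S213 under the same assumptions: when the widths at the junction do not match and the direct merge connection fails, an intermediate hidden layer is inserted and the merge is made through it in a fully connected manner. The hidden width is an arbitrary assumption.

```python
# Sketch: merge through an intermediate hidden layer when the widths at
# the junction of the two sub-network bodies do not match directly.
import torch.nn as nn

def merge_with_hidden_layer(body_a, body_b, out_w, in_w, hidden_w=32):
    adapter = nn.Sequential(
        nn.Linear(out_w, hidden_w),    # intermediate hidden layer
        nn.ReLU(),
        nn.Linear(hidden_w, in_w),     # adapts to the next network's input
    )
    return nn.Sequential(body_a, adapter, body_b)
```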
- the machine learning-based network model construction method of the embodiment of the present application analyzes the data processing flow of the original network model, uses the actual running data generated by the original network model in the data processing flow as a reference data set, hierarchically constructs at least one equivalent sub-network, performs optimization training on the at least one sub-network, and finally merges the sub-networks to form a target network model. Because the actual operational data of the original network model is used to flexibly and quickly construct each level of the target network model, and the levels are then merged to form the target network model, it is no longer necessary to envision the overall structure of the target network model from scratch; this simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model adopt a divide-and-conquer approach, optimizing each sub-network separately and then re-merging, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
- the embodiment of the present application also discloses a network model construction device based on machine learning.
- the device can run the following units:
- the obtaining module 101 is configured to acquire a data processing flow of the original network model and a reference data set generated by the original network model in the data processing flow.
- the hierarchical construction module 102 is configured to hierarchically construct at least one sub-network according to the data processing flow and the reference data set.
- the optimization training module 103 is configured to perform optimization training on the at least one sub-network by using the reference data set.
- the merging module 104 is configured to merge at least one sub-network after the optimized training to form a target network model.
- the device runs the following unit in the process of running the obtaining module 101:
- the step obtaining unit 1001 is configured to acquire at least one data processing step performed by the original network model in the data processing flow.
- the data acquisition unit 1002 is configured to acquire operation data generated by the original network model when performing each data processing step.
- the sampling extracting unit 1003 is configured to separately extract part or all of the data from the running data generated by the original network model when performing each data processing step, to form the reference data set; wherein the reference data set includes at least one set of input/output data corresponding to each data processing step.
- the device runs the following unit in the process of running the hierarchical structure module 102:
- the query unit 2001 is configured to separately query, from the preset equivalent correspondence table, the network main structure of the sub-network that is equivalent to each data processing step.
- the determining unit 2002 is configured to determine an input layer structure and an output layer structure of the sub-network equivalent to each data processing step according to at least one set of input/output data corresponding to each data processing step.
- the constructing unit 2003 is configured to construct the sub-network equivalent to each data processing step according to the network main structure, the input layer structure, and the output layer structure of the sub-network equivalent to each data processing step, wherein one sub-network is equivalent to one data processing step.
- the device runs the following unit in the process of running the optimization training module 103:
- the reading unit 3001 is configured to sequentially read at least one set of input/output data corresponding to each data processing step from the reference data set.
- the adjusting unit 3002 is configured to optimize and adjust, according to the at least one set of input/output data corresponding to each data processing step and the neural network training optimization algorithm, the parameters of the sub-network equivalent to each data processing step, where the parameters include at least one of a network node, a weight, and a training rate.
- the device runs the following unit in the process of running the merging module 104:
- the seed selecting unit 4001 is configured to select any one of the at least one sub-network as a seed network.
- the merge object selection unit 4002 is configured to select, from the at least one sub-network according to a preset merge order, a sub-network other than the seed network as the merged object network, where the preset merge order includes any one of the following: the sequential execution order of the at least one data processing step, the reverse execution order of the at least one data processing step, and the order of structural similarity between the at least one sub-network from high to low.
- the removing unit 4003 is configured to remove the input layer and the output layer between the seed network and the merged object network.
- the merge connection unit 4004 is configured to perform a merge connection between the seed network and the merged object network in a fully connected manner.
- the optimization adjustment unit 4005 is configured to, if the merge connection is successful, use at least one set of input/output data corresponding to the data processing step equivalent to the seed network and at least one set of input/output data corresponding to the data processing step equivalent to the merged object network as a reference to optimize and adjust the parameters of the merge-connected network.
- the seed selecting unit 4001 is further configured to use the merge-connected network as the seed network, and the merge object selection unit 4002, the removing unit 4003, the merge connection unit 4004, and the optimization adjustment unit 4005 iteratively perform their corresponding processing until the at least one sub-network is fully merge-connected to form the target network model.
- the device also runs the following units in the process of running the merge module 104:
- an adding unit 4006, configured to add an intermediate hidden layer between the seed network and the merged object network if the merge connection fails, so that the merge connection unit 4004 merge-connects the seed network and the merged object network in a fully connected manner through the intermediate hidden layer.
- the apparatus shown in FIG. 4 can be used to perform the steps of the methods shown in the embodiments of FIG. 1 to FIG. 3c; for the functions of the units of the apparatus shown in FIG. 4, reference may be made to the descriptions of the steps shown in FIG. 1 to FIG. 3c, and details are not repeated here.
- the machine learning-based network model constructing apparatus of the embodiment of the present application analyzes the data processing flow of the original network model, uses the actual running data generated by the original network model in the data processing flow as a reference data set, hierarchically constructs at least one equivalent sub-network, performs optimization training on the at least one sub-network, and finally merges the sub-networks to form a target network model. Because the actual operational data of the original network model is used to flexibly and quickly construct each level of the target network model, and the levels are then merged to form the target network model, it is no longer necessary to envision the overall structure of the target network model from scratch; this simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model adopt a divide-and-conquer approach, optimizing each sub-network separately and then re-merging, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
- the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
- FIG. 5 is a schematic structural diagram of a machine learning-based network model construction apparatus provided by an embodiment of the present application.
- the network model construction device 50 can include a processor 501, a non-volatile computer readable memory 502, a display unit 503, and a network communication interface 504. These components communicate over bus 505.
- a plurality of program modules are stored in the memory 502, including an application program 506, a network communication module 507, and an operating system 508.
- the processor 501 can read various modules (not shown) included in the application in the memory 502 to execute various functional applications and data processing of the network model construction device.
- there may be one or more processors 501 in this embodiment, each of which may be a CPU, a processing unit/module, an ASIC, a logic module, or a programmable gate array.
- the operating system 508 can be: a Windows operating system, a Linux operating system, or an Android operating system.
- the operating system 508 can include a network model construction module 509.
- the network model construction module 509 can include a computer executable instruction set 509-1, and corresponding metadata and heuristic algorithms 509-2, formed by the respective functional modules of the apparatus shown in FIG. 4. These sets of computer executable instructions can be executed by the processor 501 to perform the functions of the methods shown in FIG. 1 or FIG. 2 or of the apparatus shown in FIG. 4.
- the application 506 can include an application installed and running on the terminal device.
- the network communication interface 504 cooperates with the network communication module 507 to complete transmission and reception of various network signals of the network model construction device 50.
- the display unit 503 has a display panel for completing input and display of related information.
- The functional modules in the embodiments of the present application may be integrated into one processing unit, or each module may exist physically on its own, or two or more modules may be integrated into one unit.
- The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
- The functional modules of the various embodiments may be located at one terminal or network node, or may be distributed over multiple terminals or network nodes.
- The present application therefore also provides a storage medium storing computer readable instructions that are executed by at least one processor to perform any of the above-described method embodiments of the present application.
Abstract
A machine learning-based network model construction method and apparatus. The method includes: acquiring the data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow (S101); constructing at least one sub-network layer by layer according to the data processing flow and the reference data set (S102); performing optimization training on the at least one sub-network by using the reference data set (S103); and merging the at least one optimization-trained sub-network to form a target network model (S104). The above technical solution can simplify the network model construction process and improve model construction efficiency.
Description
This application claims priority to Chinese Patent Application No. 201610389530.9, filed with the Chinese Patent Office on June 2, 2016 and entitled "Machine Learning-Based Network Model Construction Method and Apparatus", which is incorporated herein by reference in its entirety.
This application relates to the field of Internet technologies, and in particular, to a machine learning-based network model construction method and apparatus.
With the rapid development of machine learning technology, more and more fields use machine learning methods to build network models and use the built network models as tools for analysis, control, and decision-making. Network models include non-neural-network models and neural network models. A neural network (NN) is a complex network system formed by a large number of processing units (called neurons) interconnected with one another; it is a highly complex nonlinear dynamic learning system. The foundation of a neural network is the neuron, and a neural network model is characterized by the network topology of its neurons, the node properties, and the learning rules.
Summary
The embodiments of the present application provide a machine learning-based network model construction method and apparatus, which simplify the network model construction process and improve model construction efficiency.
A first aspect of the embodiments of the present application provides a machine learning-based network model construction method, which may include:
acquiring a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow;
constructing at least one sub-network layer by layer according to the data processing flow and the reference data set;
performing optimization training on the at least one sub-network by using the reference data set;
merging the at least one optimization-trained sub-network to form a target network model.
The acquiring a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow includes:
acquiring at least one data processing step performed by the original network model in the data processing flow;
acquiring running data generated by the original network model when performing each data processing step;
extracting some or all of the data from the running data generated by the original network model when performing each data processing step, to form the reference data set;
where the reference data set contains at least one set of input/output data corresponding to each data processing step.
The constructing at least one sub-network layer by layer according to the data processing flow and the reference data set includes:
looking up, in a preset equivalence correspondence table, the main network structure of the sub-network equivalent to each data processing step;
determining, according to the at least one set of input/output data corresponding to each data processing step, the input layer structure and the output layer structure of the sub-network equivalent to the data processing step;
constructing the sub-network equivalent to each data processing step according to the main network structure, the input layer structure, and the output layer structure of the sub-network equivalent to the data processing step, where one sub-network is equivalent to one data processing step.
The performing optimization training on the at least one sub-network by using the reference data set includes:
reading, from the reference data set in turn, the at least one set of input/output data corresponding to each data processing step;
using the at least one set of input/output data corresponding to each data processing step as a reference, optimizing and adjusting the parameters of the sub-network equivalent to the data processing step according to a neural network training optimization algorithm, the parameters including at least one of network nodes, weights, and training rate.
The merging the at least one optimization-trained sub-network to form a target network model includes:
selecting any sub-network from the at least one sub-network as a seed network;
acquiring a merge order set according to the data processing steps, and selecting, according to the merge order, one sub-network other than the seed network from the at least one sub-network as a merge object network;
merging and connecting the seed network and the merge object network in a fully connected manner;
if the merge connection succeeds, optimizing and adjusting the parameters of the merge-connected network by using as a reference the at least one set of input/output data corresponding to the data processing step equivalent to the seed network and the at least one set of input/output data corresponding to the data processing step equivalent to the merge object network;
taking the merge-connected network as a new seed network, and iterating the above process until all of the at least one sub-network has been merged and connected to form the target network model.
The method further includes:
if the merge connection fails, adding an intermediate hidden layer between the seed network and the merge object network, and merging and connecting the seed network and the merge object network in a fully connected manner through the intermediate hidden layer.
A second aspect of the embodiments of the present application provides a machine learning-based network model construction apparatus, which may include:
an acquiring module, configured to acquire a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow;
a layered construction module, configured to construct at least one sub-network layer by layer according to the data processing flow and the reference data set;
an optimization training module, configured to perform optimization training on the at least one sub-network by using the reference data set;
a merging module, configured to merge the at least one optimization-trained sub-network to form a target network model.
The acquiring module includes:
a step acquiring unit, configured to acquire at least one data processing step performed by the original network model in the data processing flow;
a data acquiring unit, configured to acquire running data generated by the original network model when performing each data processing step;
a sampling extraction unit, configured to extract some or all of the data from the running data generated by the original network model when performing each data processing step, to form the reference data set;
where the reference data set contains at least one set of input/output data corresponding to each data processing step.
The layered construction module includes:
a query unit, configured to look up, in a preset equivalence correspondence table, the main network structure of the sub-network equivalent to each data processing step;
a determining unit, configured to determine, according to the at least one set of input/output data corresponding to each data processing step, the input layer structure and the output layer structure of the sub-network equivalent to the data processing step;
a construction unit, configured to construct the sub-network equivalent to each data processing step according to the main network structure, the input layer structure, and the output layer structure of the sub-network equivalent to the data processing step, where one sub-network is equivalent to one data processing step.
The optimization training module includes:
a reading unit, configured to read, from the reference data set in turn, the at least one set of input/output data corresponding to each data processing step;
an adjusting unit, configured to optimize and adjust, by using the at least one set of input/output data corresponding to each data processing step as a reference, the parameters of the sub-network equivalent to the data processing step according to a neural network training optimization algorithm, the parameters including at least one of network nodes, weights, and training rate.
The merging module includes:
a seed selecting unit, configured to select any sub-network from the at least one sub-network as a seed network;
a merge object selecting unit, configured to acquire a merge order set according to the data processing steps, and to select, according to the merge order, one sub-network other than the seed network from the at least one sub-network as a merge object network;
a removing unit, configured to remove the input layer and the output layer between the seed network and the merge object network;
a merge connecting unit, configured to merge and connect the seed network and the merge object network in a fully connected manner;
an optimization adjusting unit, configured to, if the merge connection succeeds, optimize and adjust the parameters of the merge-connected network by using as a reference the at least one set of input/output data corresponding to the data processing step equivalent to the seed network and the at least one set of input/output data corresponding to the data processing step equivalent to the merge object network;
the seed selecting unit is further configured to take the merge-connected network as a new seed network, and to iterate the corresponding processing of the merge object selecting unit, the removing unit, the merge connecting unit, and the optimization adjusting unit until all of the at least one sub-network has been merged and connected to form the target network model.
The merging module further includes:
an adding unit, configured to add an intermediate hidden layer between the seed network and the merge object network if the merge connection fails, so that the merge connecting unit can merge and connect the seed network and the merge object network in a fully connected manner through the intermediate hidden layer.
In the embodiments of the present application, the data processing flow of an original network model is analyzed, the actual running data generated by the original network model in the data processing flow is used as a reference data set, at least one equivalent sub-network is constructed layer by layer, optimization training is performed on the at least one sub-network, and the sub-networks are finally merged to form a target network model. Because the actual running data of the original network model is used to construct each level of the target network model flexibly and quickly, and the target network model can be formed simply by merging these levels, there is no longer a need to conceive the overall structure of the target network model from scratch, which simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model follows a divide-and-conquer approach in which each sub-network is optimized and adjusted separately and then merged, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the following briefly introduces the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description show merely some embodiments of the present application, and a person of ordinary skill in the art may still derive other drawings from these accompanying drawings without creative efforts.
FIG. 1 is a flowchart of a machine learning-based network model construction method according to an embodiment of the present application;
FIG. 2 is a flowchart of another machine learning-based network model construction method according to an embodiment of the present application;
FIG. 2A is a schematic diagram of an original network model in an embodiment of the present application;
FIG. 3a is a schematic diagram of the construction process and optimization training process of a sub-network b1 equivalent to a data processing step a1 in the original network model according to an embodiment of the present application;
FIG. 3b is a schematic diagram of the merging process of sub-networks b1 and b2 and the optimization training process of the merge-connected network according to an embodiment of the present application;
FIG. 3c is another schematic diagram of the merging process of sub-networks b1 and b2 and the optimization training process of the merge-connected network according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a machine learning-based network model construction apparatus according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a machine learning-based network model construction apparatus according to an embodiment of the present invention.
The following clearly and completely describes the technical solutions in the embodiments of the present application with reference to the accompanying drawings in the embodiments of the present application. Apparently, the described embodiments are merely some rather than all of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative efforts shall fall within the protection scope of the present application.
In an embodiment of the present application, in the process of performing machine learning on a network model to build a neural network model, the data of an existing network model needs to be annotated to form a training set, and a neural network model is then built from scratch. Because the process of training a neural network requires a large amount of manually annotated data as the training set, this process requires a great deal of human-machine interaction and consumes a large amount of device resources. In addition, building a neural network model from scratch requires complex parameter adjustment of the built network model as a whole, so the workload is heavy and the model construction efficiency is low.
Network models include non-neural-network models and neural network models. A neural network is a complex network system formed by a large number of processing units (called neurons in a neural network) interconnected with one another; it is a highly complex nonlinear dynamic learning system. The neuron is the basic unit of a neural network. A neural network model is characterized by the network topology of its neurons, the node properties, and the learning rules. Compared with non-neural-network models, neural network models have stronger deep learning capability and better environmental adaptability.
The embodiments of the present application provide a machine learning-based network model construction method and apparatus, which analyze the data processing flow of an original network model, use the actual running data generated by the original network model in the data processing flow as a reference data set, construct at least one equivalent sub-network layer by layer, perform optimization training on the at least one sub-network, and finally merge the sub-networks to form a target network model. Building a target network model through this machine learning process has the following beneficial effects:
(1) High performance reliability. Because the target network model is obtained through joint optimization and adjustment of at least one sub-network equivalent to the data processing flow of the original network model, compared with the original network model, the target network model no longer needs to consider the run-in and adaptation problems between the individual data processing steps; moreover, the joint optimization process is based on the actual running data of the original network model, which guarantees the reliability of the processing performance of the target network model.
(2) High adjustment flexibility. Because the target network model is changed by training driven by the actual running data of the original network model, if the target network model needs to be adjusted, only the parameters involved in the training process need to be adjusted, without having to consider the parameter adaptation problems between the hierarchical structures of the target network model, so adjustment flexibility is high. Moreover, if new running data appears in or is generated by the original network model, the new running data can be used directly to adjust the target network model without much manual intervention, which saves labor costs and reduces the workload of model construction.
(3) Simple construction process. Because the target network model does not need to be built from scratch, a complex network structure adjustment process is avoided, and no cumbersome manual experience is needed as a reference, which greatly saves model construction time and improves construction efficiency.
(4) Fuller use of data. The target network model is a deep network formed by joint optimization of at least one sub-network and has high performance reliability; compared with the original network model, it can iterate its parameters continuously to exploit the value of the data to the greatest extent and achieve better performance optimization.
(5) Interpretability. The target network model is formed by equivalently replacing the original network model through machine learning; this process uses the actual running data generated by the original network model in the data processing flow without adding annotated data based on human understanding or cognition, which allows the interpretability of the original network model to be inherited to a certain extent, making the approach more suitable for scenarios with high interpretability requirements.
It should be noted that the original network model may be a non-neural-network model or a neural network model, and the target network model may be a neural network model. Unless otherwise specified, in the subsequent embodiments of the present application, the original network model is described by taking a non-neural-network model as an example, and the target network model is described as a neural network model.
Based on the above description, an embodiment of the present application discloses a machine learning-based network model construction method. Referring to FIG. 1, the method may include the following steps S101 to S104.
S101: Acquire a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow.
As a tool for analysis, control, and decision-making, the original network model performs a series of data processing steps during its operation, forming a complete data processing flow. The data processing steps here may include, but are not limited to, at least one of the following:
① A single machine learning step. This step is implemented based on a classification algorithm, a clustering algorithm, a component analysis algorithm, a dimensionality-reduction mapping algorithm, or an encoder method. The classification algorithms may include, but are not limited to: support vector machine (SVM), decision tree, threshold classifier, logistic regression, shallow neural network, gradient boost decision tree (GBDT), Boosting (a method for improving the accuracy of weak classification algorithms), k-nearest neighbor (KNN), Bayesian classifier, random forest, and their possible variants. The clustering algorithms may include, but are not limited to: partition-based clustering (Kmeans), K-medoids, MeanShift, spectral clustering, density-based spatial clustering of applications with noise (DBSCAN), and affinity propagation. The component analysis algorithms may include, but are not limited to: principal component analysis (PCA), canonical correspondence analysis (CCA), factor analysis, Fourier transform, and wavelet analysis. The dimensionality-reduction mapping algorithms may include, but are not limited to: mixture discriminant analysis (MDA), Fisher projection, and IsoMap (a global optimization algorithm). The encoder methods may include, but are not limited to: linear discriminant analysis (LDA), probabilistic latent semantic analysis (PLSA), latent semantic analysis (LSA), and sparse coding.
② A data statistics step. This step may be implemented based on a data statistics algorithm, which may include, but is not limited to: summation, averaging, quantiles, extreme values, central statistical moments, chi-square statistics, and so on.
③ A sequence analysis step. This step may be implemented based on a sequence analysis algorithm, which may include, but is not limited to: autoregressive integrated moving average (ARIMA) regression, Kalman filtering, and so on.
④ A function processing step. The functions here may include, but are not limited to: linear mapping functions, transformation functions containing information entropy, analytic functions, transcendental functions, and so on.
⑤ A data editing step. This step may include, but is not limited to: data merging, data filtering, data separation, data transformation, and so on.
In step S101, the at least one data processing step involved in the data processing flow of the original network model can be analyzed and recorded. Moreover, the original network model generates running data when performing each of the above data processing steps. The running data here may include, but is not limited to: the input data, intermediate data, and output data used and obtained by the original network model during actual operation; or the annotated input data, intermediate data, and output data used by the original network model during training or testing; or simulated input data injected manually into the original network model, together with the intermediate data and the obtained output data. Then, in this step, some or all of the running data of each data processing step is extracted to form the reference data set.
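As a concrete illustration of collecting the reference data set, the following is a minimal sketch in Python. It assumes, purely for illustration, that the original network model is available as an ordered list of named Python callables; the patent does not prescribe any concrete interface.

```python
def collect_reference_data_set(steps, raw_inputs):
    """steps: ordered list of (name, callable) pairs forming the original
    data processing flow; raw_inputs: iterable of inputs to the whole flow.
    Returns {step_name: [(input, output), ...]}, i.e. at least one set of
    input/output data per data processing step, as the reference data set."""
    reference = {name: [] for name, _ in steps}
    for x in raw_inputs:
        data = x
        for name, step in steps:
            out = step(data)                     # run the original step
            reference[name].append((data, out))  # record its input/output pair
            data = out                           # feed the next step
    return reference
```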
S102: Construct at least one sub-network layer by layer according to the data processing flow and the reference data set.
The data processing flow records the at least one data processing step performed by the original network model, and the reference data set contains at least one set of input/output data corresponding to each data processing step. The idea of layered construction is as follows: each data processing step of the original network model can be performed by a sub-network with an equivalent function, so one data processing step can correspond to the main network structure of one sub-network; meanwhile, the input/output layers of that sub-network can be determined from the input/output data of the data processing step. Therefore, in this step, at least one sub-network can be constructed layer by layer according to the data processing flow of the original network model and the extracted reference data set. For example, suppose the data processing flow of the original network model is expressed as four data processing steps, "step a1 - step a2 - step a3 - step a4". Then step a1 is equivalent to the first-level sub-network b1; the main network structure of sub-network b1 is determined by step a1, and the input layer and output layer of sub-network b1 are determined by the input/output data extracted from step a1. Similarly, step a2 is equivalent to the second-level sub-network b2, whose main network structure is determined by step a2 and whose input layer and output layer are determined by the input/output data extracted from step a2; by analogy, step a4 is equivalent to the fourth-level sub-network b4, whose main network structure is determined by step a4 and whose input layer and output layer are determined by the input/output data extracted from step a4. From this example it further follows that the target network model is formed by connecting sub-network b1, sub-network b2, sub-network b3, and sub-network b4 level by level.
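The layered construction can be sketched as follows, assuming each recorded input/output pair is a flat numeric vector and simplifying the main structure to a single hidden fully connected layer (the choice of main structure is refined in steps S204 to S206 below). The reference dict is the one produced by the sketch above.

```python
import torch.nn as nn

def build_subnetwork(io_pairs, hidden=16):
    """Build one sub-network b_i equivalent to one data processing step a_i.
    The input/output layer sizes are read off the recorded reference data."""
    x, y = io_pairs[0]
    return nn.Sequential(
        nn.Linear(len(x), hidden),  # input layer, sized by the step's input data
        nn.ReLU(),                  # placeholder main network structure
        nn.Linear(hidden, len(y)),  # output layer, sized by the step's output data
    )

# One sub-network per data processing step, e.g. b1..b4 for steps a1..a4.
subnets = {name: build_subnetwork(pairs) for name, pairs in reference.items()}
```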
S103: Perform optimization training on the at least one sub-network by using the reference data set.
The purpose of optimization training is to use the data in the reference data set as a reference baseline and continually adjust the parameters of the at least one sub-network so that the performance indicators of each sub-network reach the same or a higher level than those of the corresponding data processing step in the original network model. Here, the parameters of a sub-network may include at least one of network nodes, weights, and training rate. In a specific implementation, the at least one set of input/output data extracted from each data processing step is used to optimize and train the sub-network equivalent to that data processing step. Following the example in step S102, the at least one set of input/output data corresponding to step a1 in the reference data set is used to optimize and train sub-network b1, the at least one set of input/output data corresponding to step a2 is used to optimize and train sub-network b2, and by analogy, the at least one set of input/output data corresponding to step a4 is used to optimize and train sub-network b4.
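A minimal training sketch for one sub-network, using the recorded input/output pairs as the reference baseline; the optimizer and loss here (SGD, mean squared error) are illustrative choices, not mandated by the text.

```python
import torch
import torch.nn as nn

def train_subnetwork(net, io_pairs, epochs=100, lr=1e-2):
    """Optimization training of one sub-network against the input/output
    pairs recorded for its equivalent data processing step."""
    opt = torch.optim.SGD(net.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in io_pairs:
            x_t = torch.tensor(x, dtype=torch.float32)
            y_t = torch.tensor(y, dtype=torch.float32)
            opt.zero_grad()
            loss = loss_fn(net(x_t), y_t)  # deviation from the reference output
            loss.backward()
            opt.step()
    return net
```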
S104: Merge the at least one optimization-trained sub-network to form a target network model.
This step iteratively and continually merges the at least one optimization-trained sub-network; during the merging process, the network formed by the continual merging also needs to be jointly optimized, finally forming a complete target network model. The target network model formed through this iterative merging and joint optimization process is a deep network with high performance reliability.
In the machine learning-based network model construction method of this embodiment of the present application, the data processing flow of an original network model is analyzed, the actual running data generated by the original network model in the data processing flow is used as a reference data set, at least one equivalent sub-network is constructed layer by layer, optimization training is performed on the at least one sub-network, and the sub-networks are finally merged to form a target network model. Because the actual running data of the original network model is used to construct each level of the target network model flexibly and quickly, and the target network model can be formed simply by merging these levels, there is no longer a need to conceive the overall structure of the target network model from scratch, which simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model follows a divide-and-conquer approach in which each sub-network is optimized and adjusted separately and then merged, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
An embodiment of the present application discloses another machine learning-based network model construction method. Referring to FIG. 2, the method may include the following steps S201 to S215.
S201: Acquire at least one data processing step performed by the original network model in the data processing flow.
As a tool for analysis, control, and decision-making, the original network model performs a series of data processing steps during its operation, forming a complete data processing flow. In this step, the at least one data processing step performed in the data processing flow of the original network model can be analyzed and recorded; for example, suppose the four data processing steps "step a1 - step a2 - step a3 - step a4" performed in the data processing flow of the original network model are acquired.
In an embodiment of the present application, the data to be processed is called raw data. The raw data is processed through a series of data processing steps, that is, through the data processing steps in the original network model, to finally obtain an output result; each data processing step is used to complete a specific processing function, for example, classifying the data or computing statistics on it. The data processing steps constitute the above data processing flow. The data processing flow may include one or more sub-data-processing flows composed of data processing steps. The result obtained through each sub-data-processing flow is called a sub-output result. After the sub-output results are obtained, they are merged to obtain the above output result. In the original network model, each data processing step is called a network node, and each sub-data-processing flow is called a sub-path in the original network model; each sub-path is a unidirectional path formed by connecting the network nodes in series according to the execution order of the data processing steps. The original network model is a network model obtained by combining the sub-paths.
For example, a data processing flow constituting an original network model contains three sub-data-processing flows. The first sub-data-processing flow includes: node 11 corresponding to a data classification step, node 12 corresponding to a data statistics step, and node 13 corresponding to a data regression step. The second sub-data-processing flow includes: node 21 corresponding to a data clustering step, and node 22 corresponding to a data function-mapping step. The third sub-data-processing flow includes: node 31 corresponding to a data component analysis step, node 32 corresponding to a data statistics step (that is, the above node 12), node 33 corresponding to a data regression step (that is, the above node 13), and node 34 corresponding to a data sequence analysis step. The raw data is processed through the data processing steps in the first sub-data-processing flow to obtain a first sub-output result, through the data processing steps in the second sub-data-processing flow to obtain a second sub-output result, and through the data processing steps in the third sub-data-processing flow to obtain a third sub-output result.
The obtained first, second, and third sub-output results are processed by a data merging step (corresponding to node 00 in the original network model) to obtain the output result, as shown in FIG. 2A, which is a schematic diagram of the original network model in an embodiment of the present application.
In step S201, at least one data processing step in the data processing flow constituting the original network model can be acquired, for example, the data classification step corresponding to node 11 in FIG. 2A.
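One possible encoding of the FIG. 2A topology described above, written as a plain adjacency map; the edge set is inferred from the text (nodes 32 and 33 are the shared nodes 12 and 13), so treat it as an illustration rather than the authoritative figure.

```python
# Each key is a network node (one data processing step); each value lists the
# nodes consuming its output. "raw" is the raw data source, "00" the merge node.
original_model_graph = {
    "raw": ["11", "21", "31"],
    "11": ["12"],        # classification -> statistics (first sub-path)
    "21": ["22"],        # clustering -> function mapping (second sub-path)
    "31": ["12"],        # component analysis feeds the shared statistics node
    "12": ["13"],        # statistics -> regression (shared nodes 32/33)
    "13": ["00", "34"],  # regression -> merge node, and on to sequence analysis
    "34": ["00"],        # sequence analysis -> merge node (third sub-path)
    "22": ["00"],        # function mapping -> merge node
    "00": [],            # node 00 merges the three sub-output results
}
```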
S202: Acquire running data generated by the original network model when performing each data processing step.
The original network model generates running data when performing each data processing step. The running data here may include, but is not limited to: the input data, intermediate data, and output data used and obtained by the original network model during actual operation; or the annotated input data, intermediate data, and output data used by the original network model during training or testing; or simulated input data injected manually into the original network model, together with the intermediate data and the obtained output data. In this step, the running data generated by the original network model when performing each data processing step is acquired separately. Following the example in S201, it is necessary to separately acquire the running data generated by the original network model when performing step a1, step a2, step a3, and step a4.
For example, in this step, the running data generated when each of the above data processing steps is performed can be acquired, for example, the running data generated when the data classification step corresponding to node 11 is performed.
S203: Extract a reference data set by sampling from the running data generated by the original network model when performing each data processing step.
In a specific implementation, following the example in this embodiment, suppose the original network model generates two sets of input/output data while performing step a1; then this step may extract one of the two sets of input/output data by sampling and add it to the reference data set, or may extract both sets of input/output data and add them to the reference data set. By analogy, the reference data set contains at least one set of input/output data corresponding to each data processing step.
Steps S201 to S203 of this embodiment may be a detailed refinement of step S101 shown in FIG. 1.
S204: Look up, in a preset equivalence correspondence table, the main network structure of the sub-network equivalent to each data processing step.
As described above, the data processing steps may include, but are not limited to, at least one of the following: ① a single machine learning step; ② a data statistics step; ③ a sequence analysis step; ④ a function processing step; ⑤ a data editing step. For each type of data processing step there is an equivalent main network structure of a sub-network; the equivalence relationships can be seen in Table 1 below:
Table 1: Preset equivalence correspondence table
In the right-hand column of Table 1, if multiple main network structures are listed, any one of them, or any combination of them, corresponds to the data processing step in the left-hand column. For example, for the data statistics step, the equivalent main network structures include a pooling layer, a convolutional layer, and a fully connected layer; any one of these structures, or a combination of several of them, is equivalent to the data statistics step. In practical applications, if a data processing step has multiple equivalent main network structures, the final main network structure of the sub-network can be decided from actual training feedback, for example by selecting the main network structure with the lowest error rate. It can be understood that Table 1 is not exhaustive; if other data processing steps and equivalent main network structures exist, they can be added to Table 1.
Table 1 pre-stores the equivalent main network structure corresponding to each data processing step. For example, the data classification step can be performed equivalently by a main network structure of fully connected layers and/or Maxout layers. In an embodiment of the present application, each main network structure may be a neural sub-network composed of at least one neuron. In an embodiment of the present application, the neurons may be neurons in the narrow or broad sense. In this step, an equivalent main network structure of a sub-network is determined for each data processing step in the original network model according to the above equivalence correspondence table.
S205: Determine, according to the at least one set of input/output data corresponding to each data processing step, the input layer structure and the output layer structure of the sub-network equivalent to the data processing step.
In this step, the dimensionality of the input layer of the equivalent sub-network is determined according to the input data corresponding to each data processing step, and the dimensionality of the output layer of the equivalent sub-network is determined according to the output data corresponding to each data processing step. That is, the input layer of the sub-network equivalent to a data processing step has the same dimensionality as the input data corresponding to that step, and the output layer of the sub-network equivalent to that step has the same dimensionality as the output data corresponding to that step. The dimensionality here may include the source of the data, the quantity of the data, and so on.
S206: Construct the sub-network equivalent to each data processing step according to the main network structure, the input layer structure, and the output layer structure of the sub-network equivalent to the data processing step, where one sub-network is equivalent to one data processing step.
In an embodiment of the present application, a sub-network being equivalent to a data processing step means that, given the same input, the processing result obtained through the sub-network is the same as or similar to the processing result obtained through the data processing step. Similar means that the difference between the two processing results is smaller than a predetermined threshold.
In this step, the network structure of the sub-network can be formed by adding the input layer structure and the output layer structure to the main network structure of the sub-network. Steps S204 to S206 of this embodiment may be a detailed refinement of step S102 shown in FIG. 1.
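Steps S204 to S206 can be sketched as follows. The toy equivalence table below stands in for Table 1 and maps a step type to a factory for a candidate main structure; the real table is richer, and the entries here are illustrative assumptions.

```python
import torch.nn as nn

# Toy stand-in for the preset equivalence correspondence table (Table 1).
EQUIVALENCE_TABLE = {
    "classification": lambda d: nn.Sequential(nn.Linear(d, d), nn.ReLU()),
    "statistics":     lambda d: nn.Linear(d, d),  # e.g. a fully connected layer
    "regression":     lambda d: nn.Linear(d, d),
}

def construct_equivalent_subnetwork(step_type, io_pairs, width=16):
    """S204: look up the main structure; S205: size the input and output
    layers from the recorded input/output data; S206: assemble the sub-network."""
    in_dim, out_dim = len(io_pairs[0][0]), len(io_pairs[0][1])
    main = EQUIVALENCE_TABLE[step_type](width)   # S204
    return nn.Sequential(
        nn.Linear(in_dim, width),   # input layer (S205)
        main,                       # main network structure (S204)
        nn.Linear(width, out_dim),  # output layer (S205)
    )
```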
S207: Read, from the reference data set in turn, the at least one set of input/output data corresponding to each data processing step.
S208: Using the at least one set of input/output data corresponding to each data processing step as a reference, optimize and adjust the parameters of the sub-network equivalent to the data processing step according to a neural network training optimization algorithm, the parameters including at least one of network nodes, weights, and training rate.
The neural network training optimization algorithm may include, but is not limited to, at least one of the following: stochastic gradient descent, RMSProp (an optimization algorithm), the momentum method, AdaGrad (an algorithm that assigns different learning rates to individual parameters), and AdaDelta (an optimization algorithm). In steps S207 and S208, the at least one set of input/output data extracted from each data processing step is used to optimize and train the sub-network equivalent to that data processing step. Following the example in this embodiment, the at least one set of input/output data corresponding to step a1 is read from the reference data set and used to optimize and train sub-network b1 equivalent to step a1; the at least one set of input/output data corresponding to step a2 is read from the reference data set and used to optimize and train sub-network b2 equivalent to step a2; and by analogy, the at least one set of input/output data corresponding to step a4 is read from the reference data set and used to optimize and train sub-network b4 equivalent to step a4. Steps S207 and S208 of this embodiment may be a detailed refinement of step S103 shown in FIG. 1.
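The named optimizers all exist in mainstream frameworks; as a sketch, a selection table in PyTorch might look like this (the learning rates are illustrative):

```python
import torch

OPTIMIZERS = {
    "sgd":      lambda params: torch.optim.SGD(params, lr=1e-2),
    "momentum": lambda params: torch.optim.SGD(params, lr=1e-2, momentum=0.9),
    "rmsprop":  lambda params: torch.optim.RMSprop(params, lr=1e-3),
    "adagrad":  lambda params: torch.optim.Adagrad(params, lr=1e-2),
    "adadelta": lambda params: torch.optim.Adadelta(params),
}

# e.g. train sub-network b1 (from the earlier sketches) with RMSProp:
# opt = OPTIMIZERS["rmsprop"](subnets["a1"].parameters())
```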
For steps S204 to S208, reference may also be made to FIG. 3a, which shows a schematic diagram of the construction process and optimization training process of the sub-network b1 equivalent to the data processing step a1 in the original network model.
S209: Select any sub-network from the at least one sub-network as a seed network.
S210: Select, according to a preset merge order, one sub-network other than the seed network from the at least one sub-network as a merge object network. The merge order may be set according to the data processing steps. The preset merge order includes any one of the following: the forward execution order of the at least one data processing step, the reverse execution order of the at least one data processing step, or the order of structural similarity between the at least one sub-network.
In steps S209 and S210, following the example in this embodiment, the original network model performs the four data processing steps "step a1 - step a2 - step a3 - step a4", and the equivalent sub-networks are sub-network b1, sub-network b2, sub-network b3, and sub-network b4. Suppose sub-network b2 is selected as the seed network. Then: 1) according to the forward execution order of the at least one data processing step, sub-network b3 equivalent to step a3 should be selected as the merge object network; 2) if the selection follows the order opposite to the execution order of the at least one data processing step, sub-network b1 equivalent to step a1 should be selected as the merge object network; 3) if the selection follows the order of structural similarity between the at least one sub-network, the sub-network with the highest structural similarity to sub-network b2 should be selected as the merge object network; supposing the main network structures of sub-network b2 and sub-network b4 are both fully connected layer structures, sub-network b4 is selected as the merge object network.
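A sketch of the merge-order selection, simplified to the case where the seed still corresponds to a single step; the structural-similarity ranking of case 3 is left as a pluggable fallback.

```python
def pick_merge_object(seed_name, order, remaining):
    """Pick the next merge object network for the seed network.
    order: step execution order, e.g. ["a1", "a2", "a3", "a4"];
    remaining: names of sub-networks not yet merged into the seed."""
    i = order.index(seed_name)
    if i + 1 < len(order) and order[i + 1] in remaining:
        return order[i + 1]       # case 1: forward execution order
    if i - 1 >= 0 and order[i - 1] in remaining:
        return order[i - 1]       # case 2: reverse execution order
    return next(iter(remaining))  # case 3 would rank by structural similarity
```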
S211: Remove the input layer and the output layer between the seed network and the merge object network.
According to the network structure of the sub-networks shown in FIG. 3a, this step needs to remove the input layer of one sub-network and the output layer of the other. If the seed network serves as the upper sub-network and the merge object network serves as the lower sub-network, the input layer of the seed network and the output layer of the merge object network need to be removed; if the seed network serves as the lower sub-network and the merge object network serves as the upper sub-network, the output layer of the seed network and the input layer of the merge object network need to be removed.
S212: Merge and connect the seed network and the merge object network in a fully connected manner. If the merge connection succeeds, go to step S214; if the merge fails, go to step S213.
The fully connected manner means that for preceding data A and target data B, B = W × A, where W is a weight matrix and × denotes matrix multiplication. In this step, let the seed network be A and the merge object network be B; through the mapping of the set weight matrix W, each neuron serving as an output in the seed network is mapped to a neuron serving as an input in the merge object network, that is, a mapping relationship is established between each output neuron of the seed network and an input neuron of the merge object network. For example, in FIG. 3b, the fully connected manner uses one weight matrix to establish the mapping relationships between each neuron of sub-network b1 and the neurons of sub-network b2, thereby establishing their connection.
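Steps S211 and S212 together, as a sketch: strip the facing layers, then join the two bodies with a new fully connected layer realizing B = W × A. It assumes both sub-networks were built with the same internal width, as in the construction sketch above.

```python
import torch.nn as nn

def merge_fully_connected(lower, upper, width=16):
    """Merge-connect a lower and an upper sub-network (each an nn.Sequential
    of [input layer, main structure, output layer])."""
    lower_body = list(lower.children())[:-1]      # S211: drop lower's output layer
    upper_body = list(upper.children())[1:]       # S211: drop upper's input layer
    bridge = nn.Linear(width, width, bias=False)  # S212: the weight matrix W
    return nn.Sequential(*lower_body, bridge, *upper_body)
```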
S213: Add an intermediate hidden layer between the seed network and the merge object network, and merge and connect the seed network and the merge object network in a fully connected manner through the intermediate hidden layer.
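The fallback of S213 in the same style: an intermediate hidden layer that adapts the lower network's output format to the upper network's input format before the full connection. The dimensions are passed in explicitly, and the activation is an illustrative choice.

```python
import torch.nn as nn

def merge_with_adapter(lower_body, upper_body, lower_out_dim, upper_in_dim):
    """S213: insert an intermediate hidden layer between the stripped bodies
    of the seed network and the merge object network, then fully connect."""
    adapter = nn.Sequential(
        nn.Linear(lower_out_dim, upper_in_dim),  # intermediate hidden layer
        nn.Tanh(),
    )
    return nn.Sequential(*lower_body, adapter, *upper_body)
```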
S214: Using as a reference the at least one set of input/output data corresponding to the data processing step equivalent to the seed network and the at least one set of input/output data corresponding to the data processing step equivalent to the merge object network, optimize and adjust the parameters of the merge-connected network.
In an embodiment of the present application, both the seed network and the merge object network may be sub-neural-networks, and the input/output data determined in this step may be used to optimize and adjust the merged sub-neural-network.
Because the input layer of the merge-connected network is the input layer of the lower sub-network and the output layer of the merge-connected network is the output layer of the upper sub-network, this step needs to optimize and adjust the parameters of the merge-connected network by using as references the output data corresponding to the data processing step equivalent to the upper sub-network and the input data corresponding to the data processing step equivalent to the lower sub-network.
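A sketch of the joint optimization of S214, assuming the recorded reference pairs of the two steps are sample-aligned so that the lower step's inputs can be paired with the upper step's outputs:

```python
import torch
import torch.nn as nn

def jointly_optimize(merged, lower_pairs, upper_pairs, epochs=50, lr=1e-3):
    """Optimize the merge-connected network end to end: its input layer is
    the lower sub-network's, its output layer the upper sub-network's."""
    opt = torch.optim.SGD(merged.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for (x, _), (_, y) in zip(lower_pairs, upper_pairs):
            x_t = torch.tensor(x, dtype=torch.float32)
            y_t = torch.tensor(y, dtype=torch.float32)
            opt.zero_grad()
            loss = loss_fn(merged(x_t), y_t)
            loss.backward()
            opt.step()
    return merged
```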
S215: Take the merge-connected network as a new seed network, and iterate steps S210 to S214 until all of the at least one sub-network has been merged and connected to form the target network model.
In an embodiment of the present application, steps S210 to S214 can be repeated to merge, in a predetermined merge order (for example, the execution order of the data processing steps in each sub-data-processing flow), the sub-networks corresponding to the data processing steps in each sub-data-processing flow, obtaining a merged sub-network per flow. For example, the sub-networks corresponding to the data processing steps in the above first sub-data-processing flow are merged to obtain a merged first sub-neural-network. Similarly, the sub-networks corresponding to the data processing steps in the above second sub-data-processing flow are merged to obtain a merged second sub-neural-network, and the sub-networks corresponding to the data processing steps in the above third sub-data-processing flow are merged to obtain a merged third sub-neural-network. The merging corresponding to the above node 00 is then performed on the merged first, second, and third sub-neural-networks in the predetermined merge order to obtain the target network model, for example a neural network model.
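Pulling the pieces together, the S209 to S215 loop for one purely sequential sub-path might look like this, reusing merge_fully_connected and jointly_optimize from the sketches above:

```python
def build_target_model(subnets, order, reference):
    """Seed with the first sub-network in the merge order, then repeatedly
    merge-connect the next sub-network and jointly optimize (S209-S215)."""
    seed_name = order[0]
    seed = subnets[seed_name]
    for name in order[1:]:
        merged = merge_fully_connected(seed, subnets[name])  # S211-S212
        seed = jointly_optimize(merged,
                                reference[seed_name],  # lower step's inputs
                                reference[name])       # upper step's outputs
    return seed  # the target network model
```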
Steps S209 to S215 of this embodiment may be a detailed refinement of step S104 shown in FIG. 1. For steps S209 to S215, reference may also be made to FIG. 3b and FIG. 3c, which respectively show schematic diagrams of the merging process of sub-networks b1 and b2 and the optimization training process of the merge-connected network.
In an embodiment of the present application, the role of the intermediate hidden layer is to adapt the output of the preceding sub-network to the input of the following sub-network. For example, in FIG. 3b, if the format of the output of sub-network b1 does not match the format of the input of sub-network b2, the processing of the intermediate hidden layer can adjust the output of sub-network b1 so that the adjusted output conforms to the form of the input of sub-network b2.
In the machine learning-based network model construction method of this embodiment of the present application, the data processing flow of an original network model is analyzed, the actual running data generated by the original network model in the data processing flow is used as a reference data set, at least one equivalent sub-network is constructed layer by layer, optimization training is performed on the at least one sub-network, and the sub-networks are finally merged to form a target network model. Because the actual running data of the original network model is used to construct each level of the target network model flexibly and quickly, and the target network model can be formed simply by merging these levels, there is no longer a need to conceive the overall structure of the target network model from scratch, which simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model follows a divide-and-conquer approach in which each sub-network is optimized and adjusted separately and then merged, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
An embodiment of the present application further discloses a machine learning-based network model construction apparatus. Referring to FIG. 4, the apparatus may run the following units:
an acquiring module 101, configured to acquire a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow;
a layered construction module 102, configured to construct at least one sub-network layer by layer according to the data processing flow and the reference data set;
an optimization training module 103, configured to perform optimization training on the at least one sub-network by using the reference data set;
a merging module 104, configured to merge the at least one optimization-trained sub-network to form a target network model.
In a specific implementation, while running the acquiring module 101, the apparatus specifically runs the following units:
a step acquiring unit 1001, configured to acquire at least one data processing step performed by the original network model in the data processing flow;
a data acquiring unit 1002, configured to acquire running data generated by the original network model when performing each data processing step;
a sampling extraction unit 1003, configured to extract the reference data set by sampling from the running data generated by the original network model when performing each data processing step, where the reference data set contains at least one set of input/output data corresponding to each data processing step.
In a specific implementation, while running the layered construction module 102, the apparatus specifically runs the following units:
a query unit 2001, configured to look up, in a preset equivalence correspondence table, the main network structure of the sub-network equivalent to each data processing step;
a determining unit 2002, configured to determine, according to the at least one set of input/output data corresponding to each data processing step, the input layer structure and the output layer structure of the sub-network equivalent to the data processing step;
a construction unit 2003, configured to construct the sub-network equivalent to each data processing step according to the main network structure, the input layer structure, and the output layer structure of the sub-network equivalent to the data processing step, where one sub-network is equivalent to one data processing step.
In a specific implementation, while running the optimization training module 103, the apparatus specifically runs the following units:
a reading unit 3001, configured to read, from the reference data set in turn, the at least one set of input/output data corresponding to each data processing step;
an adjusting unit 3002, configured to optimize and adjust, by using the at least one set of input/output data corresponding to each data processing step as a reference, the parameters of the sub-network equivalent to the data processing step according to a neural network training optimization algorithm, the parameters including at least one of network nodes, weights, and training rate.
In a specific implementation, while running the merging module 104, the apparatus specifically runs the following units:
a seed selecting unit 4001, configured to select any sub-network from the at least one sub-network as a seed network;
a merge object selecting unit 4002, configured to select, according to a preset merge order, one sub-network other than the seed network from the at least one sub-network as a merge object network, the preset merge order including any one of the following: the forward execution order of the at least one data processing step, the reverse execution order of the at least one data processing step, or the order of structural similarity between the at least one sub-network;
a removing unit 4003, configured to remove the input layer and the output layer between the seed network and the merge object network;
a merge connecting unit 4004, configured to merge and connect the seed network and the merge object network in a fully connected manner;
an optimization adjusting unit 4005, configured to, if the merge connection succeeds, optimize and adjust the parameters of the merge-connected network by using as a reference the at least one set of input/output data corresponding to the data processing step equivalent to the seed network and the at least one set of input/output data corresponding to the data processing step equivalent to the merge object network.
The seed selecting unit 4001 is further configured to take the merge-connected network as a new seed network and to iterate the corresponding processing of the merge object selecting unit 4002, the removing unit 4003, the merge connecting unit 4004, and the optimization adjusting unit 4005 until all of the at least one sub-network has been merged and connected to form the target network model.
While running the merging module 104, the apparatus also runs the following unit:
an adding unit 4006, configured to add an intermediate hidden layer between the seed network and the merge object network if the merge connection fails, so that the merge connecting unit 4004 can merge and connect the seed network and the merge object network in a fully connected manner through the intermediate hidden layer.
Since the apparatus shown in FIG. 4 can be used to perform the steps of the methods shown in the embodiments of FIG. 1 to FIG. 3, the functions of the units of the apparatus shown in FIG. 4 can be understood from the descriptions of the steps shown in FIG. 1 to FIG. 3 and are not repeated here.
As with the method, the machine learning-based network model construction apparatus of this embodiment of the present application analyzes the data processing flow of an original network model, uses the actual running data generated by the original network model in the data processing flow as a reference data set, constructs at least one equivalent sub-network layer by layer, performs optimization training on the at least one sub-network, and finally merges the sub-networks to form a target network model. Because the actual running data of the original network model is used to construct each level of the target network model flexibly and quickly, and the target network model can be formed simply by merging these levels, there is no longer a need to conceive the overall structure of the target network model from scratch, which simplifies the model construction process and effectively improves model construction efficiency. The optimization and adjustment of the target network model follows a divide-and-conquer approach in which each sub-network is optimized and adjusted separately and then merged, which makes the optimization and adjustment process of the target network model more flexible and further improves model construction efficiency.
A person of ordinary skill in the art can understand that all or some of the processes of the methods in the above embodiments can be implemented by a computer program instructing relevant hardware. The program may be stored in a computer readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
FIG. 5 is a schematic structural diagram of a machine learning-based network model construction apparatus according to an embodiment of the present application. As shown in FIG. 5, the network model construction apparatus 50 may include a processor 501, a non-volatile computer readable memory 502, a display unit 503, and a network communication interface 504. These components communicate over a bus 505.
In this embodiment, a plurality of program modules are stored in the memory 502, including an application program 506, a network communication module 507, and an operating system 508.
The processor 501 can read the various modules (not shown in the figure) included in the application program in the memory 502 to execute the various functional applications and data processing of the network model construction apparatus. There may be one or more processors 501 in this embodiment, and each may be a CPU, a processing unit/module, an ASIC, a logic module, or a programmable gate array.
The operating system 508 may be a Windows operating system, a Linux operating system, or an Android operating system. The operating system 508 may include a network model construction module 509. The network model construction module 509 may include a set of computer executable instructions 509-1 formed by the respective functional modules of the apparatus shown in FIG. 4, together with corresponding metadata and heuristic algorithms 509-2. These sets of computer executable instructions can be executed by the processor 501 to perform the method shown in FIG. 1 or FIG. 2 or the functions of the apparatus shown in FIG. 4.
The application program 506 may include applications installed and running on a terminal device.
In this embodiment, the network communication interface 504 cooperates with the network communication module 507 to send and receive the various network signals of the network model construction apparatus 50.
The display unit 503 has a display panel for the input and display of related information.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing unit, or each module may exist physically on its own, or two or more modules may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of a software functional unit. The functional modules of the various embodiments may be located at one terminal or network node, or may be distributed over multiple terminals or network nodes.
Therefore, the present application further provides a storage medium storing computer readable instructions that are executed by at least one processor to perform any of the above-described method embodiments of the present application.
What is disclosed above is merely preferred embodiments of the present application, which certainly cannot limit the scope of the claims of the present application; therefore, equivalent variations made according to the claims of the present application still fall within the scope of the present application.
Claims (13)
- A machine learning-based network model construction method, comprising: acquiring a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow; constructing at least one sub-network layer by layer according to the data processing flow and the reference data set; performing optimization training on the at least one sub-network by using the reference data set; and merging the at least one optimization-trained sub-network to form a target network model.
- The method according to claim 1, wherein the acquiring a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow comprises: acquiring at least one data processing step performed by the original network model in the data processing flow; acquiring running data generated by the original network model when performing each data processing step; and extracting some or all of the data from the running data generated by the original network model when performing each data processing step, to form the reference data set; wherein the reference data set contains at least one set of input/output data corresponding to each data processing step.
- The method according to claim 2, wherein the constructing at least one sub-network layer by layer according to the data processing flow and the reference data set comprises: looking up, in a preset equivalence correspondence table, the main network structure of the sub-network equivalent to each data processing step; determining, according to the at least one set of input/output data corresponding to each data processing step, the input layer structure and the output layer structure of the sub-network equivalent to the data processing step; and constructing the sub-network equivalent to each data processing step according to the main network structure, the input layer structure, and the output layer structure of the sub-network equivalent to the data processing step, wherein one sub-network is equivalent to one data processing step.
- The method according to claim 2 or 3, wherein the performing optimization training on the at least one sub-network by using the reference data set comprises: reading, from the reference data set in turn, the at least one set of input/output data corresponding to each data processing step; and optimizing and adjusting, by using the at least one set of input/output data corresponding to each data processing step as a reference, the parameters of the sub-network equivalent to the data processing step according to a neural network training optimization algorithm, the parameters comprising at least one of network nodes, weights, and training rate.
- The method according to claim 4, wherein the merging the at least one optimization-trained sub-network to form a target network model comprises: selecting any sub-network from the at least one sub-network as a seed network; acquiring a merge order set according to the data processing steps, and selecting, according to the merge order, one sub-network other than the seed network from the at least one sub-network as a merge object network; removing the input layer and the output layer between the seed network and the merge object network; merging and connecting the seed network and the merge object network in a fully connected manner; if the merge connection succeeds, optimizing and adjusting the parameters of the merge-connected network by using as a reference the at least one set of input/output data corresponding to the data processing step equivalent to the seed network and the at least one set of input/output data corresponding to the data processing step equivalent to the merge object network; and taking the merge-connected network as a new seed network, and iterating the above process until all of the at least one sub-network has been merged and connected to form the target network model.
- The method according to claim 5, further comprising: if the merge connection fails, adding an intermediate hidden layer between the seed network and the merge object network, and merging and connecting the seed network and the merge object network in a fully connected manner through the intermediate hidden layer.
- A machine learning-based network model construction apparatus, comprising: an acquiring module, configured to acquire a data processing flow of an original network model and a reference data set generated by the original network model in the data processing flow; a layered construction module, configured to construct at least one sub-network layer by layer according to the data processing flow and the reference data set; an optimization training module, configured to perform optimization training on the at least one sub-network by using the reference data set; and a merging module, configured to merge the at least one optimization-trained sub-network to form a target network model.
- The apparatus according to claim 7, wherein the acquiring module comprises: a step acquiring unit, configured to acquire at least one data processing step performed by the original network model in the data processing flow; a data acquiring unit, configured to acquire running data generated by the original network model when performing each data processing step; and a sampling extraction unit, configured to extract some or all of the data from the running data generated by the original network model when performing each data processing step, to form the reference data set; wherein the reference data set contains at least one set of input/output data corresponding to each data processing step.
- The apparatus according to claim 8, wherein the layered construction module comprises: a query unit, configured to look up, in a preset equivalence correspondence table, the main network structure of the sub-network equivalent to each data processing step; a determining unit, configured to determine, according to the at least one set of input/output data corresponding to each data processing step, the input layer structure and the output layer structure of the sub-network equivalent to the data processing step; and a construction unit, configured to construct the sub-network equivalent to each data processing step according to the main network structure, the input layer structure, and the output layer structure of the sub-network equivalent to the data processing step, wherein one sub-network is equivalent to one data processing step.
- The apparatus according to claim 8 or 9, wherein the optimization training module comprises: a reading unit, configured to read, from the reference data set in turn, the at least one set of input/output data corresponding to each data processing step; and an adjusting unit, configured to optimize and adjust, by using the at least one set of input/output data corresponding to each data processing step as a reference, the parameters of the sub-network equivalent to the data processing step according to a neural network training optimization algorithm, the parameters comprising at least one of network nodes, weights, and training rate.
- The apparatus according to claim 10, wherein the merging module comprises: a seed selecting unit, configured to select any sub-network from the at least one sub-network as a seed network; a merge object selecting unit, configured to acquire a merge order set according to the data processing steps, and to select, according to the merge order, one sub-network other than the seed network from the at least one sub-network as a merge object network; a removing unit, configured to remove the input layer and the output layer between the seed network and the merge object network; a merge connecting unit, configured to merge and connect the seed network and the merge object network in a fully connected manner; and an optimization adjusting unit, configured to, if the merge connection succeeds, optimize and adjust the parameters of the merge-connected network by using as a reference the at least one set of input/output data corresponding to the data processing step equivalent to the seed network and the at least one set of input/output data corresponding to the data processing step equivalent to the merge object network; wherein the seed selecting unit is further configured to take the merge-connected network as a new seed network, and to iterate the corresponding processing of the merge object selecting unit, the removing unit, the merge connecting unit, and the optimization adjusting unit until all of the at least one sub-network has been merged and connected to form the target network model.
- The apparatus according to claim 11, wherein the merging module further comprises: an adding unit, configured to add an intermediate hidden layer between the seed network and the merge object network if the merge connection fails, so that the merge connecting unit can merge and connect the seed network and the merge object network in a fully connected manner through the intermediate hidden layer.
- A computer readable storage medium storing computer readable instructions that are executed by at least one processor to perform the machine learning-based network model construction method according to any one of claims 1 to 6.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP17805881.4A EP3467723B1 (en) | 2016-06-02 | 2017-06-02 | Machine learning based network model construction method and apparatus |
KR1020187013732A KR102173555B1 (ko) | 2016-06-02 | 2017-06-02 | 머신 러닝 기반 네트워크 모델 구축 방법 및 장치 |
JP2018543424A JP6549332B2 (ja) | 2016-06-02 | 2017-06-02 | 機械学習に基づくネットワークモデル構築方法及び装置 |
US15/984,754 US11741361B2 (en) | 2016-06-02 | 2018-05-21 | Machine learning-based network model building method and apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610389530.9 | 2016-06-02 | ||
CN201610389530.9A CN106096727B (zh) | 2016-06-02 | 2016-06-02 | 一种基于机器学习的网络模型构造方法及装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/984,754 Continuation US11741361B2 (en) | 2016-06-02 | 2018-05-21 | Machine learning-based network model building method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017206936A1 true WO2017206936A1 (zh) | 2017-12-07 |
Family
ID=57447501
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/086917 WO2017206936A1 (zh) | 2016-06-02 | 2017-06-02 | 基于机器学习的网络模型构造方法及装置 |
Country Status (6)
Country | Link |
---|---|
US (1) | US11741361B2 (zh) |
EP (1) | EP3467723B1 (zh) |
JP (1) | JP6549332B2 (zh) |
KR (1) | KR102173555B1 (zh) |
CN (1) | CN106096727B (zh) |
WO (1) | WO2017206936A1 (zh) |
Families Citing this family (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106096727B (zh) | 2016-06-02 | 2018-12-07 | 腾讯科技(深圳)有限公司 | 一种基于机器学习的网络模型构造方法及装置 |
CN108629413B (zh) * | 2017-03-15 | 2020-06-16 | 创新先进技术有限公司 | 神经网络模型训练、交易行为风险识别方法及装置 |
EP3596655B1 (en) * | 2017-06-05 | 2023-08-09 | Siemens Aktiengesellschaft | Method and apparatus for analysing an image |
US20190138929A1 (en) | 2017-07-07 | 2019-05-09 | Darwinai Corporation | System and method for automatic building of learning machines using learning machines |
TWI696971B (zh) * | 2017-12-12 | 2020-06-21 | 浩霆 黃 | 金融商品的相關性預測系統及其方法 |
DE102018109835A1 (de) * | 2018-04-24 | 2019-10-24 | Albert-Ludwigs-Universität Freiburg | Verfahren und Vorrichtung zum Ermitteln einer Netzkonfiguration eines neuronalen Netzes |
WO2019211651A1 (en) * | 2018-05-02 | 2019-11-07 | Telefonaktiebolaget Lm Ericsson (Publ) | Placement-aware acceleration of parameter optimization in a predictive model |
CN109034367A (zh) * | 2018-08-22 | 2018-12-18 | 广州杰赛科技股份有限公司 | 神经网络更新方法、装置、计算机设备和可读存储介质 |
CN109167789B (zh) * | 2018-09-13 | 2021-04-13 | 上海海事大学 | 一种云环境LDoS攻击数据流检测方法及系统 |
CN109447273A (zh) * | 2018-09-30 | 2019-03-08 | 深圳市元征科技股份有限公司 | 模型训练方法、广告推荐方法、相关装置、设备及介质 |
JP7115207B2 (ja) * | 2018-10-11 | 2022-08-09 | 富士通株式会社 | 学習プログラム、学習方法および学習装置 |
KR20200052439A (ko) * | 2018-10-29 | 2020-05-15 | 삼성에스디에스 주식회사 | 딥러닝 모델의 최적화 시스템 및 방법 |
CN111224805A (zh) * | 2018-11-26 | 2020-06-02 | 中兴通讯股份有限公司 | 一种网络故障根因检测方法、系统及存储介质 |
CN111291771B (zh) * | 2018-12-06 | 2024-04-02 | 西安宇视信息科技有限公司 | 对池化层特征优化的方法及装置 |
CN109684087B (zh) * | 2018-12-17 | 2020-01-10 | 中科寒武纪科技股份有限公司 | 运算方法、装置及相关产品 |
CN109685203B (zh) * | 2018-12-21 | 2020-01-17 | 中科寒武纪科技股份有限公司 | 数据处理方法、装置、计算机系统及存储介质 |
CN109684648B (zh) * | 2019-01-14 | 2020-09-01 | 浙江大学 | 一种多特征融合的古今汉语自动翻译方法 |
CN111435432B (zh) * | 2019-01-15 | 2023-05-26 | 北京市商汤科技开发有限公司 | 网络优化方法及装置、图像处理方法及装置、存储介质 |
CN111600734B (zh) * | 2019-02-21 | 2021-11-02 | 烽火通信科技股份有限公司 | 一种网络故障处理模型的构建方法、故障处理方法及系统 |
CN109978024B (zh) * | 2019-03-11 | 2020-10-27 | 北京工业大学 | 一种基于互联模块化神经网络的出水bod预测方法 |
US11488025B2 (en) | 2019-04-29 | 2022-11-01 | Landmark Graphics Corporation | Hybrid neural network and autoencoder |
CN110380888B (zh) * | 2019-05-29 | 2021-02-23 | 华为技术有限公司 | 一种网络异常检测方法和装置 |
CN110322021B (zh) * | 2019-06-14 | 2021-03-30 | 清华大学 | 大规模网络表征学习的超参数优化方法和装置 |
CN110414570B (zh) * | 2019-07-04 | 2022-01-28 | 北京迈格威科技有限公司 | 图像分类模型生成方法、装置、设备和存储介质 |
CN110570013B (zh) * | 2019-08-06 | 2023-04-07 | 山东省科学院海洋仪器仪表研究所 | 一种单站位在线波周期数据的预测诊断方法 |
US11763157B2 (en) * | 2019-11-03 | 2023-09-19 | Microsoft Technology Licensing, Llc | Protecting deep learned models |
JP7203000B2 (ja) * | 2019-11-12 | 2023-01-12 | Hoya株式会社 | プログラム、情報処理方法及び情報処理装置 |
US11847500B2 (en) | 2019-12-11 | 2023-12-19 | Cisco Technology, Inc. | Systems and methods for providing management of machine learning components |
CN113811897B (zh) * | 2019-12-30 | 2022-05-31 | 深圳元戎启行科技有限公司 | 神经网络模型的推理方法、装置、计算机设备和存储介质 |
US11501216B2 (en) * | 2020-02-21 | 2022-11-15 | King.Com Ltd. | Computer system, a computer device and a computer implemented method |
CN111584027B (zh) * | 2020-04-30 | 2022-03-11 | 天津大学 | 融合复杂网络和图卷积的脑控康复系统运动想象识别系统 |
CN113780513B (zh) * | 2020-06-10 | 2024-05-03 | 杭州海康威视数字技术股份有限公司 | 网络模型量化、推理方法、装置、电子设备及存储介质 |
CN112150605B (zh) * | 2020-08-17 | 2024-02-02 | 北京化工大学 | 用于mri局部sar估计的膝关节模型构建方法 |
US12045721B2 (en) * | 2020-10-08 | 2024-07-23 | Lg Electronics Inc. | Method and device for transmitting OFDM signal, and method and device for receiving OFDM signal |
US12047248B2 (en) | 2020-10-26 | 2024-07-23 | Samsung Electronics Co., Ltd. | Method of controlling state control parameter for adjusting state of network of base station by using any one of plurality of models and electronic device performing the method |
KR20220055363A (ko) * | 2020-10-26 | 2022-05-03 | 삼성전자주식회사 | 복수의 모델들 중 어느 하나의 모델을 이용하여 기지국의 네트워크의 상태를 조정하기 위한 상태 제어 파라미터를 제어하는 방법 및 이를 수행하는 전자 장치 |
KR102593844B1 (ko) | 2021-02-25 | 2023-10-25 | 주식회사 인제니오에이아이 | 딥러닝 네트워크 구성 방법, 딥러닝 자동화 플랫폼 서비스 시스템 및 이를 위한 컴퓨터 프로그램 |
CN113378777A (zh) * | 2021-06-30 | 2021-09-10 | 沈阳康慧类脑智能协同创新中心有限公司 | 基于单目摄像机的视线检测的方法和装置 |
KR102644593B1 (ko) * | 2021-11-23 | 2024-03-07 | 한국기술교육대학교 산학협력단 | 지능형 디바이스 개발을 위한 ai 분화 기반의 하드웨어 정보에 최적의 지능형 소프트웨어 개발도구 |
CN114492789B (zh) * | 2022-01-25 | 2024-05-14 | 天津工业大学 | 一种数据样本的神经网络模型构建方法及装置 |
WO2024130041A1 (en) * | 2022-12-14 | 2024-06-20 | Advanced Micro Devices, Inc. | Network collective offload message chunking management |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH04190461A (ja) * | 1990-11-26 | 1992-07-08 | Fujitsu Ltd | ニューラルネットワークの構築表示方法 |
JPH05128082A (ja) * | 1991-07-19 | 1993-05-25 | Fujitsu Ltd | 階層ネツトワーク構成データ処理装置とその学習処理方法 |
JP3035106B2 (ja) * | 1993-03-11 | 2000-04-17 | 株式会社東芝 | 大規模情報認識回路 |
JPH0793160A (ja) * | 1993-09-20 | 1995-04-07 | Toshiba Corp | 推論装置 |
US6119112A (en) * | 1997-11-19 | 2000-09-12 | International Business Machines Corporation | Optimum cessation of training in neural networks |
JP3678636B2 (ja) * | 2000-08-21 | 2005-08-03 | Necソフト株式会社 | ニューラルネットワークの学習方法 |
CN103345656B (zh) * | 2013-07-17 | 2016-01-20 | 中国科学院自动化研究所 | 一种基于多任务深度神经网络的数据识别方法及装置 |
US9870537B2 (en) * | 2014-01-06 | 2018-01-16 | Cisco Technology, Inc. | Distributed learning in a computer network |
CN103838836B (zh) * | 2014-02-25 | 2016-09-28 | 中国科学院自动化研究所 | 基于判别式多模态深度置信网多模态数据融合方法和系统 |
- 2016-06-02: CN application CN201610389530.9A filed; granted as CN106096727B (active)
- 2017-06-02: JP application 2018543424 filed; granted as JP6549332B2 (active)
- 2017-06-02: KR application KR1020187013732 filed; granted as KR102173555B1 (active, IP right grant)
- 2017-06-02: EP application EP17805881.4 filed; granted as EP3467723B1 (active)
- 2017-06-02: WO application PCT/CN2017/086917 filed (application filing)
- 2018-05-21: US application US15/984,754 filed; granted as US11741361B2 (active)
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2533176A1 (en) * | 2005-11-15 | 2012-12-12 | Bernadette Garner | Method for determining whether input vectors are known or unknown by a neuron |
CN101814160A (zh) * | 2010-03-08 | 2010-08-25 | 清华大学 | 一种基于特征聚类的rbf神经网络建模方法 |
CN104751227A (zh) * | 2013-12-31 | 2015-07-01 | 安徽科大讯飞信息科技股份有限公司 | 深度神经网络的构建方法及系统 |
CN103729459A (zh) * | 2014-01-10 | 2014-04-16 | 北京邮电大学 | 一种构建情感分类模型的方法 |
CN106096727A (zh) * | 2016-06-02 | 2016-11-09 | 腾讯科技(深圳)有限公司 | 一种基于机器学习的网络模型构造方法及装置 |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111105020B (zh) * | 2018-10-29 | 2024-03-29 | 西安宇视信息科技有限公司 | 特征表示迁移学习方法及相关装置 |
CN109726718A (zh) * | 2019-01-03 | 2019-05-07 | 电子科技大学 | 一种基于关系正则化的视觉场景图生成系统及方法 |
CN109726718B (zh) * | 2019-01-03 | 2022-09-16 | 电子科技大学 | 一种基于关系正则化的视觉场景图生成系统及方法 |
CN110046706A (zh) * | 2019-04-18 | 2019-07-23 | 腾讯科技(深圳)有限公司 | 模型生成方法、装置及服务器 |
CN110046706B (zh) * | 2019-04-18 | 2022-12-20 | 腾讯科技(深圳)有限公司 | 模型生成方法、装置及服务器 |
CN113378998A (zh) * | 2021-07-12 | 2021-09-10 | 西南石油大学 | 一种基于机器学习的地层岩性随钻识别方法 |
CN113554077A (zh) * | 2021-07-13 | 2021-10-26 | 南京铉盈网络科技有限公司 | 基于多模态神经网络模型的工况评估及业务量预测方法 |
CN113554077B (zh) * | 2021-07-13 | 2024-09-06 | 南京铉盈网络科技有限公司 | 基于多模态神经网络模型的工况评估及业务量预测方法 |
CN113919387A (zh) * | 2021-08-18 | 2022-01-11 | 东北林业大学 | 基于gbdt-lr模型的脑电信号情感识别 |
CN116451099A (zh) * | 2023-06-19 | 2023-07-18 | 浪潮通用软件有限公司 | 一种基于随机遍历的高熵knn聚类方法、设备及介质 |
CN116451099B (zh) * | 2023-06-19 | 2023-09-01 | 浪潮通用软件有限公司 | 一种基于随机遍历的高熵knn聚类方法、设备及介质 |
Also Published As
Publication number | Publication date |
---|---|
CN106096727A (zh) | 2016-11-09 |
KR102173555B1 (ko) | 2020-11-03 |
EP3467723A1 (en) | 2019-04-10 |
US11741361B2 (en) | 2023-08-29 |
JP6549332B2 (ja) | 2019-07-24 |
EP3467723B1 (en) | 2021-02-24 |
US20180268296A1 (en) | 2018-09-20 |
KR20180069877A (ko) | 2018-06-25 |
EP3467723A4 (en) | 2019-04-10 |
JP2018533153A (ja) | 2018-11-08 |
CN106096727B (zh) | 2018-12-07 |
Legal Events
Date | Code | Title | Description
---|---|---|---
| WWE | WIPO information: entry into national phase | Ref document number: 2018543424; Country of ref document: JP
| ENP | Entry into the national phase | Ref document number: 20187013732; Country of ref document: KR; Kind code of ref document: A
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 17805881; Country of ref document: EP; Kind code of ref document: A1
| NENP | Non-entry into the national phase | Ref country code: DE