CN113065641A - Neural network model training method and device, electronic equipment and storage medium


Publication number: CN113065641A (application CN202110304132.3A; granted publication CN113065641B)
Authority: CN (China)
Prior art keywords: sub, network model, output, network, loss
Legal status: Granted
Application number: CN202110304132.3A
Other languages: Chinese (zh)
Other versions: CN113065641B (en)
Inventors: 高志鹏, 苗东, 芮兰兰, 莫梓嘉, 赵晨, 林怡静, 谭清, 付伟
Current assignee: Beijing Quyun Technology Co ltd; Beijing University of Posts and Telecommunications
Original assignee: Beijing Quyun Technology Co ltd; Beijing University of Posts and Telecommunications
Application filed by Beijing Quyun Technology Co ltd and Beijing University of Posts and Telecommunications
Priority to CN202110304132.3A
Publication of CN113065641A; application granted; publication of CN113065641B
Legal status: Active

Classifications

    • G06N 3/045 - Combinations of networks
    • G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 - Classification techniques
    • G06N 3/08 - Learning methods
    • G06V 10/267 - Segmentation of patterns in the image field by performing operations on regions, e.g. growing, shrinking or watersheds


Abstract

The neural network model training method and device, the electronic equipment and the storage medium are applied to the field of information technology. An image classification network model to be trained is divided into a plurality of groups of sub-network models according to a plurality of preset groups of division points; for each group of sub-network models, the corresponding losses are calculated through a preset loss function; each group of sub-network models is jointly trained according to the calculated losses to obtain a plurality of groups of sub-network models to be output; for each group, a plurality of performance parameters corresponding to each sub-network model to be output are calculated; according to the plurality of performance parameters corresponding to each group of sub-network models, the comprehensive performance score corresponding to each group is calculated through a preset entropy weight model; and the group with the highest comprehensive performance score among the groups of sub-network models is selected as the target sub-network model. The image classification network model can then be deployed according to the target sub-network model, which improves the convenience of neural network deployment.

Description

Neural network model training method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of model training technologies, and in particular, to a neural network model training method and apparatus, an electronic device, and a storage medium.
Background
At present, artificial intelligence is a technology by which machines can replace human beings in completing functions such as cognition, recognition, analysis and decision-making. Artificial intelligence enables image recognition, speech recognition, smart living, autonomous driving and the like, bringing great convenience to people's lives.
However, the structure of the neural networks applied in the field of artificial intelligence is often very large, with very high demands on computing and storage resources, so most current applications based on deep neural networks have to rely on cloud platforms with massive computing resources, which greatly limits the development of artificial intelligence and its related services.
Disclosure of Invention
An object of the embodiments of the present application is to provide a neural network model training method and apparatus, an electronic device, and a storage medium, so as to improve the convenience of neural network deployment. The specific technical scheme is as follows:
in a first aspect of the embodiments of the present application, a method for training a neural network model is provided, where the method includes:
dividing an image classification network model to be trained into a plurality of groups of sub-network models according to a plurality of preset groups of dividing points, wherein each group of sub-network models comprises a first sub-network model, a second sub-network model and a third sub-network model;
for each group of sub-network models, inputting a sample image into the first sub-network model, taking the output of the first sub-network model as the input of the second sub-network model, taking the output of the second sub-network model as the input of the third sub-network model, and generating an image classification result output by the third sub-network model;
calculating a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model respectively through a preset loss function aiming at each group of sub-network models;
for each group of sub-network models, performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model through the corresponding first loss, second loss and third loss respectively to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output;
calculating a plurality of performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model respectively aiming at each group of sub-network models;
respectively calculating the comprehensive performance scores corresponding to the sub-network models through a preset entropy weight model according to the multiple performance parameters corresponding to the sub-network models;
and selecting one group with the highest comprehensive performance score in the sub-network models as a target sub-network model.
In a second aspect of the embodiments of the present application, there is also provided a neural network model training apparatus, including:
the model segmentation module is used for segmenting the image classification network model to be trained into a plurality of groups of sub-network models according to a plurality of preset groups of segmentation points, wherein each group of sub-network models comprises a first sub-network model, a second sub-network model and a third sub-network model;
a result generation module, configured to, for each group of sub-network models, input a sample image into the first sub-network model, and generate an image classification result output by the third sub-network model with an output of the first sub-network model as an input of the second sub-network model and an output of the second sub-network model as an input of the third sub-network model;
the loss calculation module is used for calculating a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model respectively through a preset loss function aiming at each group of sub-network models;
the joint training module is used for performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model respectively through the corresponding first loss, second loss and third loss aiming at each group of sub-network models to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output;
the parameter calculation module is used for calculating a plurality of performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model respectively aiming at each group of sub-network models;
the score calculation module is used for calculating the comprehensive performance scores corresponding to the sub-network models through the preset entropy weight model according to the performance parameters corresponding to the sub-network models;
and the model selection module is used for selecting one group with the highest comprehensive performance score in the sub-network models as a target sub-network model.
The embodiment of the application also provides electronic equipment which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory complete mutual communication through the communication bus;
a memory for storing a computer program;
and the processor is used for realizing any one of the neural network model training methods when executing the program stored in the memory.
An embodiment of the present application further provides a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the method for training a neural network model is implemented.
Embodiments of the present application also provide a computer program product containing instructions, which when run on a computer, cause the computer to perform any one of the above neural network model training methods.
The embodiment of the application has the following beneficial effects:
according to the neural network model training method and device, the electronic equipment and the storage medium of the embodiments of the present application, an image classification network model to be trained is divided into a plurality of groups of sub-network models according to a plurality of preset groups of dividing points, wherein each group of sub-network models comprises a first sub-network model, a second sub-network model and a third sub-network model; for each group of sub-network models, a sample image is input into the first sub-network model, the output of the first sub-network model is taken as the input of the second sub-network model, the output of the second sub-network model is taken as the input of the third sub-network model, and an image classification result output by the third sub-network model is generated; for each group of sub-network models, a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model are respectively calculated through a preset loss function; for each group of sub-network models, the corresponding first, second and third sub-network models are jointly trained through the corresponding first, second and third losses to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output; for each group of sub-network models, a plurality of performance parameters corresponding to the first, second and third sub-network models to be output are respectively calculated; according to the plurality of performance parameters corresponding to each group of sub-network models, the comprehensive performance score corresponding to each group is calculated through a preset entropy weight model; and the group with the highest comprehensive performance score among the groups of sub-network models is selected as the target sub-network model. The image classification network model can be deployed through the sub-networks corresponding to the target sub-network model, and since each sub-network in the target sub-network model contains only one part of the image classification network model, different sub-networks can be deployed on different devices, each of which is responsible for only part of the computation, thereby improving the convenience of neural network deployment.
Of course, not all advantages described above need to be achieved at the same time in the practice of any one product or method of the present application.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings can be obtained by those of ordinary skill in the art from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a neural network model training method according to an embodiment of the present application;
fig. 2 is a candidate network structure selection algorithm according to an embodiment of the present application;
FIG. 3 is a flowchart illustrating an embodiment of generating image classification results;
FIG. 4 is a diagram illustrating an example of a neural network model training method according to an embodiment of the present disclosure;
FIG. 5 is a comparison graph of results obtained by the entropy weight TOPSIS algorithm according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a neural network model training apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments that can be derived by one of ordinary skill in the art from the description herein are intended to be within the scope of the present disclosure.
In a first aspect of an embodiment of the present application, a neural network model training method is provided, where the method includes:
dividing an image classification network model to be trained into a plurality of groups of sub-network models according to a plurality of preset groups of dividing points, wherein each group of sub-network models comprises a first sub-network model, a second sub-network model and a third sub-network model;
for each group of sub-network models, inputting a sample image into a first sub-network model, taking the output of the first sub-network model as the input of a second sub-network model, taking the output of the second sub-network model as the input of a third sub-network model, and generating an image classification result output by the third sub-network model;
respectively calculating a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model by using a preset loss function aiming at each group of sub-network models;
aiming at each group of sub-network models, performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model respectively through the corresponding first loss, second loss and third loss to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output;
respectively calculating a plurality of performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model aiming at each group of sub-network models;
respectively calculating the comprehensive performance scores corresponding to the sub-network models through a preset entropy weight model according to the multiple performance parameters corresponding to the sub-network models;
and selecting one group with the highest comprehensive performance score in the sub-network models as a target sub-network model.
Therefore, by the neural network model training method, the image classification network model can be deployed through the sub-network corresponding to the target sub-network model, and each sub-network in the target sub-network model only comprises one part of the image classification network model, so that different sub-networks can be deployed in different devices, and each device only needs to be responsible for one part of calculation tasks, and convenience in neural network deployment is improved.
At present, artificial intelligence has become a technology by which machines can replace human beings in completing functions such as cognition, recognition, analysis and decision-making. Artificial intelligence enables image recognition, speech recognition, smart living, autonomous driving and the like, bringing great convenience to people's lives. However, the structure of the neural networks applied in this field is often very large, with very high demands on computing and storage resources, so most current deep-neural-network-based applications have to rely on cloud platforms with massive computing resources, which greatly limits the development of artificial intelligence and its related services. On the other hand, networked terminal devices have grown explosively in recent years; massive networked devices bring massive data, and how to use these data safely and efficiently has become an urgent problem. Meanwhile, massive data drives the development of deep learning, but it also makes the technology very sensitive to computing and storage resources. This contradiction is particularly obvious for mobile terminal devices: lacking sufficient resources, a mobile terminal device has to offload its data entirely to the cloud for processing, which greatly increases the time consumed by communication, and the resulting high latency is unacceptable for applications with strict real-time requirements; moreover, the security and privacy of the data cannot be guaranteed during offloading. If the mobile terminal device instead chooses to process the data locally, it must use a much simpler network model, which greatly reduces task accuracy.
In order to solve the above problems, the embodiments of the present application change the centralized cloud processing mode into an "end-edge-cloud" cooperative processing mode and optimize the neural network model to better suit this new computing mode.
Specifically, referring to fig. 1, fig. 1 is a schematic flow chart of a neural network model training method according to an embodiment of the present application, including:
and step S11, dividing the image classification network model to be trained into a plurality of sub-network models according to the preset plurality of groups of dividing points.
Wherein each set of sub-network models comprises a first sub-network model, a second sub-network model, and a third sub-network model.
The image classification network model may be a pre-established network model used for classifying images, and the preset division points may be set according to the structural characteristics of the image classification network model and the devices to which the divided sub-network models are to be applied. For example, the first sub-network model obtained by division is applied at the client, such as a smart phone or a computer; the second sub-network model is applied at the edge, such as an edge server like a base station; and the third sub-network model is applied at the cloud, such as a cloud device. The image classification network model to be trained can thus be divided into a plurality of sub-network models according to the computing power of the different devices.
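As a rough illustration of such a division, the following is a minimal PyTorch sketch, assuming the backbone can be written as an nn.Sequential; the layer layout, the split_at helper and the cut indices are illustrative assumptions, not taken from this description.

```python
import torch.nn as nn

# toy stand-in for the image classification network model to be trained
backbone = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 10),
)

def split_at(net: nn.Sequential, cut1: int, cut2: int):
    """Divide the backbone into first/second/third sub-network models."""
    layers = list(net.children())
    first = nn.Sequential(*layers[:cut1])        # deployed at the client
    second = nn.Sequential(*layers[cut1:cut2])   # deployed at the edge
    third = nn.Sequential(*layers[cut2:])        # deployed at the cloud
    return first, second, third

# one group of sub-network models per preset group of division points;
# the groups here share layer objects, so clone the backbone to train
# each group independently
groups = [split_at(backbone, c1, c2) for c1, c2 in [(2, 4), (4, 6)]]
```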
The neural network model training method is applied to the intelligent terminal, the network model can be trained through the intelligent terminal, and specifically, the intelligent terminal can be a computer or a server and the like.
Step S12 is to input the sample image into the first sub-network model for each group of sub-network models, and to generate an image classification result output by the third sub-network model by taking the output of the first sub-network model as the input of the second sub-network model and the output of the second sub-network model as the input of the third sub-network model.
For example, when there are two groups of division points, the first, second and third sub-network models obtained by dividing at the first group of division points are M1, M2 and M3, and those obtained by dividing at the second group of division points are M4, M5 and M6, where the output of M1 is the input of M2, the output of M2 is the input of M3, the output of M4 is the input of M5, and the output of M5 is the input of M6.
Step S13 is to calculate, for each group of sub-network models, a first loss corresponding to an output of the first sub-network model, a second loss corresponding to an output of the second sub-network model, and a third loss corresponding to an image classification result output by the third sub-network model, respectively, by a preset loss function.
The preset loss function may be any of various loss functions for loss calculation, such as a cross-entropy loss function or an absolute value loss function. A first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model are respectively calculated through the preset loss function. For example, for each group of sub-network models, the first, second and third losses corresponding to M1, M2 and M3 calculated through a cross-entropy loss function are L1, L2 and L3 respectively, and the first, second and third losses corresponding to M4, M5 and M6 are L4, L5 and L6 respectively.
Optionally, for each group of sub-network models, the calculating, through a preset loss function, of a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model includes:

for each group of sub-network models, through the preset loss function:

$$y_{output\_n} = y_{cut\_n}(x;\ \omega;\ b)$$

$$\hat{y}_n = \mathrm{SoftMax}(y_{output\_n})$$

$$loss_n = -\sum_{c \in label} y_{label,c}\,\log \hat{y}_{n,c}$$

where n denotes the nth division point, x is the input data, ω and b denote the network weights and biases from the input to the current division point respectively, SoftMax() is a normalization function, y_label is the one-hot ground-truth label vector of the data, y_output_n is the output of the network, y_cut_n maps the input of the network model to the output at the current division point, and label denotes the label set;

respectively calculating a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model.
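A hedged sketch of this per-exit loss calculation in PyTorch follows; the exit heads head1 and head2 that map each division-point activation to class logits are an assumption (the description does not spell them out), and F.cross_entropy applies the SoftMax and logarithm internally, matching the formulas above.

```python
import torch
import torch.nn.functional as F

def exit_losses(first, second, third, head1, head2, x, y_label):
    out1 = first(x)                    # output at the first division point
    out2 = second(out1)                # output at the second division point
    logits3 = third(out2)              # image classification result
    loss1 = F.cross_entropy(head1(out1.flatten(1)), y_label)  # first loss
    loss2 = F.cross_entropy(head2(out2.flatten(1)), y_label)  # second loss
    loss3 = F.cross_entropy(logits3, y_label)                 # third loss
    return loss1, loss2, loss3
```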
And step S14, performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model according to the corresponding first loss, second loss and third loss respectively aiming at each group of sub-network models to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output.
The first, second and third sub-network models are jointly trained through the corresponding first, second and third losses: a comprehensive loss corresponding to the first, second and third losses can be calculated, and the first, second and third sub-network models are then trained according to the comprehensive loss. The comprehensive loss corresponding to the first, second and third losses can be calculated by weighted summation. For example, according to preset weights, L1, L2 and L3 are weighted and summed to obtain a comprehensive loss L7, and M1, M2 and M3 are trained simultaneously according to L7 to obtain the first, second and third sub-network models to be output, m1, m2 and m3; according to preset weights, L4, L5 and L6 are weighted and summed to obtain a comprehensive loss L8, and M4, M5 and M6 are trained simultaneously according to L8 to obtain the first, second and third sub-network models to be output, m4, m5 and m6.
Optionally, for each group of sub-network models, the joint training of the corresponding first, second and third sub-network models through the corresponding first, second and third losses to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output includes:

for each group of sub-network models, through the corresponding first loss, second loss and third loss and a preset joint loss function:

$$loss_{joint} = \sum_{n} \omega_n \cdot loss_n$$

calculating the joint loss, where ω_n is the weight of the loss function at the corresponding candidate division point;

and performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model according to the joint loss to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output.
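Reusing the illustrative names from the sketches above (first/second/third from step S11, exit_losses and the heads from step S13), one joint training step under this joint loss might look as follows; the weights ω_n, the optimizer settings, and the inputs x and y_label (a sample batch and its class indices) are assumptions.

```python
import itertools
import torch

w = (0.3, 0.3, 0.4)  # omega_n for the candidate division points (illustrative)
optimizer = torch.optim.SGD(
    itertools.chain(first.parameters(), second.parameters(), third.parameters(),
                    head1.parameters(), head2.parameters()),
    lr=0.01)

# x: a batch of sample images; y_label: their class indices (e.g. from CIFAR-10)
loss1, loss2, loss3 = exit_losses(first, second, third, head1, head2, x, y_label)
loss_joint = w[0] * loss1 + w[1] * loss2 + w[2] * loss3  # weighted sum
optimizer.zero_grad()
loss_joint.backward()   # trains all three sub-network models at once
optimizer.step()
```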
Step S15 is to calculate multiple performance parameters corresponding to the first to-be-exported sub-network model, the second to-be-exported sub-network model and the third to-be-exported sub-network model respectively for each group of sub-network models.
Optionally, for each group of sub-network models, respectively calculating multiple performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model, and the third to-be-output sub-network model, including: and respectively calculating model accuracy, end-to-end time delay and data drop-out rate corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model aiming at each group of sub-network models.
The model accuracy, end-to-end delay and data drop-out rate corresponding to the first, second and third sub-network models to be output can be calculated by existing schemes, and details are not repeated here.
And step S16, respectively calculating the comprehensive performance scores corresponding to the sub-network models through the preset entropy weight model according to the multiple performance parameters corresponding to the sub-network models.
According to the plurality of performance parameters corresponding to each group of sub-network models, the comprehensive performance score corresponding to each group is calculated through the preset entropy weight model; it can be calculated by weighted summation over the performance parameters corresponding to each group of sub-network models with preset weights.
Optionally, calculating the comprehensive performance score corresponding to each group of sub-network models may also be implemented by a preset algorithm; specifically, reference may be made to the candidate network structure selection algorithm of fig. 2. The algorithm in the figure is: triple-partition network selection. The inputs are acc_n (accuracy), cal_n (computation delay), comm_n (communication delay) and exit_n (exit rate); the output is the optimal triple-partition network. The network structure, accuracy, computation delay, communication delay and exit rate for network selection are set; the ideal score and the current candidate network structure score are initialized to 0; the n different candidate network model combinations are scored by the entropy TOPSIS algorithm; when the ideal score is smaller than the score of the current candidate network structure, the ideal score is set equal to the score of the current candidate network structure; otherwise, the score of the next candidate network structure continues to be calculated.
And step S17, selecting one of the sub-network models with the highest comprehensive performance score as a target sub-network model.
For example, the model accuracy, end-to-end delay and data drop-out rate corresponding to the first sub-network model to be output, m1, calculated according to existing methods, are E1, E2 and E3 respectively; those corresponding to the second sub-network model to be output, m2, are E4, E5 and E6; and those corresponding to the third sub-network model to be output, m3, are E7, E8 and E9. Weighted summation with preset weights gives the comprehensive performance score R1 of m1, m2 and m3; the comprehensive performance score R2 of m4, m5 and m6 is calculated in the same way. R1 and R2 are compared, and if R1 is greater than R2, then m1, m2 and m3 are selected as the target sub-network model.
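As a sketch of the entropy-weight TOPSIS scoring of steps S16 and S17: each row of the decision matrix is one candidate group and the columns are (accuracy, end-to-end delay, exit rate); the sample values and the choice of which columns count as benefits rather than costs are assumptions.

```python
import numpy as np

def entropy_weight_topsis(X, benefit):
    X = np.asarray(X, dtype=float)
    rng = X.max(0) - X.min(0) + 1e-12
    # min-max normalise so that larger always means better
    Xn = np.where(benefit, (X - X.min(0)) / rng, (X.max(0) - X) / rng)
    P = Xn / (Xn.sum(0) + 1e-12)
    E = -(P * np.log(P + 1e-12)).sum(0) / np.log(len(X))  # entropy per criterion
    w = (1.0 - E) / (1.0 - E).sum()                       # entropy weights
    V = Xn * w
    d_pos = np.linalg.norm(V - V.max(0), axis=1)          # distance to ideal
    d_neg = np.linalg.norm(V - V.min(0), axis=1)          # distance to anti-ideal
    return d_neg / (d_pos + d_neg + 1e-12)                # relative closeness

scores = entropy_weight_topsis(
    [[0.91, 120.0, 0.75],   # candidate group 1: accuracy, delay, exit rate
     [0.89, 95.0, 0.70]],   # candidate group 2
    benefit=np.array([True, False, True]))
target = int(np.argmax(scores))  # index of the target sub-network model group
```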
Therefore, by the neural network model training method, the image classification network model can be deployed through the sub-network corresponding to the target sub-network model, and each sub-network in the target sub-network model only comprises one part of the image classification network model, so that different sub-networks can be deployed in different devices, and each device only needs to be responsible for one part of calculation tasks, and convenience in neural network deployment is improved.
Optionally, referring to fig. 3, in step S12, for each group of sub-network models, the inputting of a sample image into the first sub-network model and the generating of an image classification result output by the third sub-network model, with the output of the first sub-network model as the input of the second sub-network model and the output of the second sub-network model as the input of the third sub-network model, includes:
step S121, for each group of sub-network models, inputs the sample image into the first sub-network model, and obtains an output of the first sub-network model.
And step S122, calculating corresponding first credibility through a preset entropy method according to the output of the first sub-network model.
According to the output of the first sub-network model, the corresponding first credibility is calculated by a preset entropy method. Specifically, the corresponding loss can be calculated through the preset loss function y_output_n = y_cut_n(x; ω; b), and the corresponding first credibility is then determined according to the calculated loss.
And step S123, when the first credibility is larger than the preset threshold, inputting the output of the first sub-network model into the second sub-network model to obtain the output of the second sub-network model.
Optionally, the preset threshold is determined by a value calculated through a preset formula:

$$entropy(y) = \sum_{c \in C} y_c \log y_c$$

where entropy() denotes the entropy method, y is the prediction probability vector over the labels corresponding to the output of each sub-model, and C is the set of all labels in the classification task.
And step S124, calculating corresponding second credibility through a preset entropy method according to the output of the second sub-network model.
And step S125, when the second reliability is greater than the preset threshold, inputting the output of the second sub-network model into a third sub-network model to obtain an image classification result output by the third sub-network model.
In this way, the neural network can be divided into three parts by model segmentation, and segmentation exits are set so that part of the data can exit the network in advance, alleviating the problems of delay and limited device resources.
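Putting steps S121 to S125 together, the following is a hedged sketch of the early-exit inference path: a sample whose entropy value at a division point exceeds the exit threshold is forwarded to the next sub-network, otherwise it exits with the local prediction. The exit heads, the single-sample batch and the usual Shannon sign convention (a minus sign the formula above omits) are assumptions.

```python
import torch
import torch.nn.functional as F

def entropy(probs: torch.Tensor) -> torch.Tensor:
    # Shannon entropy of a prediction probability vector
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)

@torch.no_grad()
def classify(first, second, third, head1, head2, x, th1, th2):
    out1 = first(x)                                # x: a single-sample batch
    p1 = F.softmax(head1(out1.flatten(1)), dim=-1)
    if entropy(p1).item() <= th1:                  # credible: exit at the device
        return p1.argmax(-1)
    out2 = second(out1)                            # otherwise upload to the edge
    p2 = F.softmax(head2(out2.flatten(1)), dim=-1)
    if entropy(p2).item() <= th2:                  # credible: exit at the edge
        return p2.argmax(-1)
    return third(out2).argmax(-1)                  # otherwise classify at the cloud
```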
Referring to fig. 4, fig. 4 is a diagram of an example of a neural network model training method according to an embodiment of the present application, including:
and step S41, establishing candidate segmentation points according to the structural characteristics of different neural networks used by the current task. The method comprises the steps of segmenting a neural network, deploying a plurality of sub-neural networks formed after segmentation in a distributed computing model, dividing the structure of the neural network into three parts by utilizing segmentation points, sequentially deploying the three parts at a mobile terminal, an edge terminal and a cloud terminal, wherein in the computing mode, the mobile terminal, the edge terminal and the cloud terminal respectively only undertake the inference task of part of the neural network and can quit the network at the segmentation points in advance under the condition of ensuring the credibility of data, the credibility of the inference result of the segmentation points is measured by using an entropy method, and the smaller the entropy value obtained by SoftMax output computing is, the higher the credibility of the segmentation points to the data is.
And step S42, performing joint weighted training on the loss functions corresponding to the candidate division points of the neural network to obtain a multi-output model. Because the network used in the present application is provided with a plurality of candidate division points, these must be trained jointly during training. The cross-entropy loss function commonly used in classification tasks is taken as the optimization objective, and a loss function is placed behind each candidate division point. Since the network is divided into the three "end-edge-cloud" parts, two candidate division points need to be selected for optimization in addition to the fixed cloud division point, so that the loss function at each division point can reach a relatively high accuracy at its network level.
And step S43, after training, selecting all combinations of two division points as the exits of the mobile end and the edge end to form candidate networks. All two-division-point combinations are selected from the candidate division points, combined with the fixed cloud division point, and the operation of step S42 is performed; the trained networks serve as candidate networks to be evaluated in the subsequent steps.
Step S44, testing the indexes (accuracy, end-to-end delay, data exit rate) that affect performance under the different candidate networks.
In order to guarantee the accuracy of the data exiting at the end division point and the edge division point, the accuracy of the data exiting at the mobile end or the edge end should be approximately the same as that of the cloud exit in the classification task.
The end-to-end delay can be determined by a preset formula:

$$C = \sum_{l \in upload} 4 \times (|label| + S(f_{output}))$$

which calculates the amount of data to be uploaded at each division point; the communication delay of the division point is then obtained according to the current network state. Here, upload denotes the set of data to be uploaded at the division point, |label| denotes the correct label of the data, S denotes the size of the data, f_output denotes the output parameters at the current layer, and the constant 4 indicates that a floating-point number occupies 4 bytes. The computation delay is the time required for model inference; under the same computing resources, the more complex the network model and the larger the input data, the larger the computation delay. In the embodiment of the present application, the total computation delay needs to be calculated under three different computing resources: the mobile end, the edge end and the cloud end.
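A direct transcription of the upload-cost formula, under the assumption that |label| counts the label entries of a sample and S(f_output) counts the elements of the division-point output; the sample sizes and the 25 MB/s rate (taken from the experiment described below) are illustrative.

```python
def upload_bytes(exits):
    """exits: iterable of (label_len, f_output_numel) pairs uploaded at a cut."""
    return sum(4 * (label_len + numel) for label_len, numel in exits)

bytes_up = upload_bytes([(1, 32 * 8 * 8)])  # one sample with a 32x8x8 feature map
delay_s = bytes_up / 25e6                   # communication delay at 25 MB/s
```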
The data exit rate refers to the proportion of the data exiting at each division point to the whole data. The training set of CIFAR-10 can be used for network model training and threshold determination, and the exit rate of the test-set data at a division point can be made to correspond to the entropy value of that division point's output, from which the threshold of the division point is determined. After the data exit rate is determined, the data can be sorted by entropy, and the critical data entropy corresponding to the exit rate is found; this critical entropy is the threshold of the division point.
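The threshold determination just described reduces to reading off an entropy quantile, as in this small sketch (the validation entropies here are random placeholders):

```python
import numpy as np

def exit_threshold(entropies, exit_rate):
    # sort the data by entropy and find the critical entropy for the exit rate
    return float(np.quantile(np.asarray(entropies), exit_rate))

th = exit_threshold(np.random.rand(10000), exit_rate=0.75)
```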
And step S45, inputting each index under different candidate networks into the entropy weight Topsis model for analysis to obtain the network with the best performance.
The entropy weight TOPSIS model can be used in the embodiment of the present application to evaluate the comprehensive performance of the distributed neural network model. Specifically, the candidate network structure selection algorithm can be seen in fig. 2.
In the present application, using the CIFAR-10 data set in a WiFi environment with an ideal network speed of 25 MB/s, experimental results show that the algorithm provided herein can select the optimal network division points; the resulting network lets about 75% of the data exit locally and compresses the overall inference time of the model by about 3x. The results show that on CIFAR-10, most data does not need to rely on the cloud for processing, which greatly reduces the delay and energy consumption of tasks. Specifically, within a certain stage the accuracy of the model does not decrease as the data exit rate increases, i.e. the accuracy has a stable period in which it is basically the same as when all data exits at the cloud; after the stable period, the accuracy drops sharply and enters a rapid-decline period, which shows that dividing the model and exiting early is meaningful. The position where the gradient changes fastest is selected as the turning point of the accuracy, i.e. the critical point between the stable period and the rapid-decline period. Referring to fig. 5, fig. 5 shows a comparison of the results obtained by the entropy weight TOPSIS algorithm, where position denotes the position of the exit point, Ex1/Ex3 denotes exiting at exit points 1 and 3, Ex2/Ex4 denotes exiting at exit points 2 and 4, and each line corresponds to a different data exit rate; Exit rate denotes the exit rate, relative closeness C denotes the relative closeness, Rank denotes the ranking, and (Device/Edge) Threshold denotes the thresholds at the device end and the edge end.
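The turning point "where the gradient changes fastest" can be located, for example, by the largest second difference of the accuracy curve; the curve below is purely illustrative.

```python
import numpy as np

acc = np.array([0.91, 0.91, 0.909, 0.905, 0.87, 0.80])  # accuracy vs. exit rate
turning = int(np.argmax(np.abs(np.diff(acc, n=2)))) + 1  # steepest curvature
```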
In a second aspect of the embodiments of the present application, there is further provided a neural network model training apparatus, referring to fig. 6, where fig. 6 is a schematic structural diagram of the neural network model training apparatus of the embodiments of the present application, and the apparatus includes:
a model segmentation module 601, configured to segment the image classification network model to be trained into multiple sets of sub-network models according to preset multiple sets of segmentation points, where each set of sub-network model includes a first sub-network model, a second sub-network model, and a third sub-network model;
a result generation module 602, configured to, for each group of sub-network models, input the sample image into a first sub-network model, and generate an image classification result output by a third sub-network model with an output of the first sub-network model as an input of a second sub-network model and an output of the second sub-network model as an input of the third sub-network model;
a loss calculating module 603, configured to calculate, for each group of sub-network models, a first loss corresponding to an output of the first sub-network model, a second loss corresponding to an output of the second sub-network model, and a third loss corresponding to an image classification result output by the third sub-network model through a preset loss function;
the joint training module 604 is configured to perform joint training on the corresponding first sub-network model, second sub-network model, and third sub-network model respectively through the corresponding first loss, second loss, and third loss for each group of sub-network models to obtain a first to-be-output sub-network model, a second to-be-output sub-network model, and a third to-be-output sub-network model;
a parameter calculating module 605, configured to calculate, for each group of sub-network models, multiple performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model, and the third to-be-output sub-network model, respectively;
a score calculating module 606, configured to calculate, according to multiple performance parameters corresponding to each group of sub-network models, a comprehensive performance score corresponding to each group of sub-network models through a preset entropy weight model;
and the model selection module 607 is configured to select one of the groups of subnetwork models with the highest comprehensive performance score as the target subnetwork model.
Optionally, the parameter calculating module 605 includes:
and the parameter calculation submodule is used for respectively calculating model accuracy, end-to-end time delay and data drop-out rate corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model aiming at each group of sub-network models.
Optionally, the parameter calculation sub-module is specifically configured to: inputting the sample image into the first sub-network model aiming at each group of sub-network models to obtain the output of the first sub-network model; calculating corresponding first credibility through a preset entropy method according to the output of the first sub-network model; when the first credibility is larger than a preset threshold value, inputting the output of the first sub-network model into a second sub-network model to obtain the output of the second sub-network model; calculating corresponding second credibility through a preset entropy method according to the output of the second sub-network model; and when the second reliability is larger than the preset threshold, inputting the output of the second sub-network model into a third sub-network model to obtain an image classification result output by the third sub-network model.
Optionally, the preset threshold is determined by a value calculated through a preset formula:

$$entropy(y) = \sum_{c \in C} y_c \log y_c$$

where entropy() denotes the entropy method, y is the prediction probability vector over the labels corresponding to the output of each sub-model, and C is the set of all labels in the classification task.
Optionally, the loss calculating module 603 is specifically configured to: for each group of sub-network models, through the preset loss function:

$$y_{output\_n} = y_{cut\_n}(x;\ \omega;\ b)$$

$$\hat{y}_n = \mathrm{SoftMax}(y_{output\_n})$$

$$loss_n = -\sum_{c \in label} y_{label,c}\,\log \hat{y}_{n,c}$$

where n denotes the nth division point, x is the input data, ω and b denote the network weights and biases from the input to the current division point respectively, SoftMax() is a normalization function, y_label is the one-hot ground-truth label vector of the data, y_output_n is the output of the network, y_cut_n maps the input of the network model to the output at the current division point, and label denotes the label set;

respectively calculate a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model.
Optionally, the joint training module 604 includes:

a joint loss calculation submodule, configured to, for each group of sub-network models, calculate the joint loss through the corresponding first loss, second loss and third loss and a preset joint loss function:

$$loss_{joint} = \sum_{n} \omega_n \cdot loss_n$$

where ω_n is the weight of the loss function at the corresponding candidate division point;

and a to-be-output model obtaining submodule, configured to perform joint training on the corresponding first sub-network model, second sub-network model and third sub-network model according to the joint loss to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output.
Therefore, through the neural network model training device in the embodiment of the application, the image classification network model can be deployed through the sub-network corresponding to the target sub-network model, and each sub-network in the target sub-network model only comprises one part of the image classification network model, so that different sub-networks can be deployed in different devices, and each device only needs to be responsible for one part of calculation tasks, thereby improving the convenience of neural network deployment.
The embodiment of the present application further provides an electronic device, as shown in fig. 7, which includes a processor 701, a communication interface 702, a memory 703 and a communication bus 704, where the processor 701, the communication interface 702, and the memory 703 complete mutual communication through the communication bus 704,
a memory 703 for storing a computer program;
the processor 701 is configured to implement the following steps when executing the program stored in the memory 703:
dividing an image classification network model to be trained into a plurality of groups of sub-network models according to a plurality of preset groups of dividing points, wherein each group of sub-network models comprises a first sub-network model, a second sub-network model and a third sub-network model;
for each group of sub-network models, inputting a sample image into the first sub-network model, taking the output of the first sub-network model as the input of the second sub-network model, taking the output of the second sub-network model as the input of the third sub-network model, and generating an image classification result output by the third sub-network model;
calculating a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model respectively through a preset loss function aiming at each group of sub-network models;
for each group of sub-network models, performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model through the corresponding first loss, second loss and third loss respectively to obtain a first sub-network model to be output, a second sub-network model to be output and a third sub-network model to be output;
calculating a plurality of performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model respectively aiming at each group of sub-network models;
respectively calculating the comprehensive performance scores corresponding to the sub-network models through a preset entropy weight model according to the multiple performance parameters corresponding to the sub-network models;
and selecting one group with the highest comprehensive performance score in the sub-network models as a target sub-network model.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic equipment and other equipment.
The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The Processor may be a general-purpose Processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; but also Digital Signal Processors (DSPs), Application Specific Integrated Circuits (ASICs), Field Programmable Gate Arrays (FPGAs) or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components.
In yet another embodiment provided by the present application, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the above neural network model training methods.
In yet another embodiment provided by the present application, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform any of the neural network model training methods of the embodiments described above.
In the above embodiments, the implementation may be wholly or partially realized by software, hardware, firmware, or any combination thereof. When implemented in software, may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When loaded and executed on a computer, cause the processes or functions described in accordance with the embodiments of the application to occur, in whole or in part. The computer may be a general purpose computer, a special purpose computer, a network of computers, or other programmable device. The computer instructions may be stored in a computer readable storage medium or transmitted from one computer readable storage medium to another, for example, from one website site, computer, server, or data center to another website site, computer, server, or data center via wired (e.g., coaxial cable, fiber optic, Digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.). The computer-readable storage medium can be any available medium that can be accessed by a computer or a data storage device, such as a server, a data center, etc., that incorporates one or more of the available media. The usable medium may be a magnetic medium (e.g., floppy Disk, hard Disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk (SSD)), among others.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in this specification are described in an interrelated manner; identical or similar parts among the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, the apparatus, electronic device, storage medium, and computer program product embodiments are described relatively briefly because they are substantially similar to the method embodiment; for relevant details, reference may be made to the corresponding parts of the method embodiment description.
The above description covers only preferred embodiments of the present application and is not intended to limit its scope. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall be included in its protection scope.

Claims (10)

1. A neural network model training method, the method comprising:
dividing an image classification network model to be trained into a plurality of groups of sub-network models according to a plurality of preset groups of division points, wherein each group of sub-network models comprises a first sub-network model, a second sub-network model and a third sub-network model;
for each group of sub-network models, inputting a sample image into the first sub-network model, taking the output of the first sub-network model as the input of the second sub-network model and the output of the second sub-network model as the input of the third sub-network model, and generating an image classification result output by the third sub-network model;
for each group of sub-network models, calculating, through a preset loss function, a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model, respectively;
for each group of sub-network models, performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model through the corresponding first loss, second loss and third loss, respectively, to obtain a first to-be-output sub-network model, a second to-be-output sub-network model and a third to-be-output sub-network model;
for each group of sub-network models, calculating a plurality of performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model, respectively;
calculating, through a preset entropy weight model, the comprehensive performance score corresponding to each group of sub-network models according to the plurality of performance parameters corresponding to that group;
and selecting the group of sub-network models with the highest comprehensive performance score as the target sub-network models.
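Read as an algorithm, claim 1 divides one classification network at each candidate group of division points and chains the three resulting parts. Below is a minimal PyTorch sketch of that division and chaining; `split_at`, the toy network, and the cut indices (3, 5) are all illustrative assumptions, not from the patent:

```python
import torch
import torch.nn as nn

def split_at(model: nn.Sequential, cut1: int, cut2: int):
    """Split a sequential image-classification network into three
    sub-network models at one group of two division points."""
    layers = list(model.children())
    sub1 = nn.Sequential(*layers[:cut1])
    sub2 = nn.Sequential(*layers[cut1:cut2])
    sub3 = nn.Sequential(*layers[cut2:])
    return sub1, sub2, sub3

# A toy classifier split at one candidate group of division points.
toy = nn.Sequential(
    nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 10),
)
sub1, sub2, sub3 = split_at(toy, 3, 5)

x = torch.randn(8, 1, 28, 28)   # a batch of sample images
out1 = sub1(x)                  # output of the first sub-network model
out2 = sub2(out1)               # fed as input to the second sub-network model
logits = sub3(out2)             # image classification result from the third
```

Repeating this over every preset group of division points yields the candidate groups of sub-network models that the later steps of claim 1 train and score.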
2. The method of claim 1, wherein calculating, for each group of sub-network models, the plurality of performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model, respectively, comprises:
for each group of sub-network models, calculating the model accuracy, the end-to-end delay and the data drop-out rate corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model, respectively.
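Claims 1-2 combine model accuracy, end-to-end delay and data drop-out rate into one comprehensive score through a "preset entropy weight model" whose formulas the claims do not give. The NumPy sketch below substitutes the textbook entropy weight method as an assumed stand-in; the helper name `entropy_weight_scores` and the indicator values are illustrative:

```python
import numpy as np

def entropy_weight_scores(perf, benefit):
    """Composite scores via the classical entropy weight method.
    perf: (groups, indicators) matrix, e.g. columns = [accuracy,
    end-to-end delay, data drop-out rate]; benefit flags columns
    where larger is better."""
    p = np.asarray(perf, dtype=float)
    lo, hi = p.min(axis=0), p.max(axis=0)
    rng = hi - lo + 1e-12
    # Min-max normalise; invert cost indicators (delay, drop-out rate).
    norm = np.where(benefit, (p - lo) / rng, (hi - p) / rng)
    prob = norm / (norm.sum(axis=0) + 1e-12)
    # Indicator entropy; low entropy (more discriminative) -> higher weight.
    ent = -(prob * np.log(prob + 1e-12)).sum(axis=0) / np.log(len(p))
    weights = (1.0 - ent) / (1.0 - ent).sum()
    return norm @ weights

# Three candidate groups; indicators: accuracy (up), delay (down), drop-out (down).
perf = [[0.91, 120.0, 0.02],
        [0.89,  80.0, 0.01],
        [0.93, 200.0, 0.05]]
scores = entropy_weight_scores(perf, benefit=np.array([True, False, False]))
best_group = int(np.argmax(scores))   # claim 1: highest comprehensive score wins
```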
3. The method of claim 2, wherein, for each group of sub-network models, taking the output of the first sub-network model as the input of the second sub-network model and the output of the second sub-network model as the input of the third sub-network model to generate the image classification result output by the third sub-network model comprises:
for each group of sub-network models, inputting a sample image into the first sub-network model to obtain the output of the first sub-network model;
calculating corresponding first credibility through a preset entropy method according to the output of the first sub-network model;
when the first credibility is larger than a preset threshold, inputting the output of the first sub-network model into the second sub-network model to obtain the output of the second sub-network model;
calculating corresponding second credibility through a preset entropy method according to the output of the second sub-network model;
and when the second credibility is greater than the preset threshold, inputting the output of the second sub-network model into the third sub-network model to obtain an image classification result output by the third sub-network model.
4. The method of claim 3, wherein the preset threshold is a value obtained by calculation through a preset formula:
entropy(y) = Σ_{c∈C} y_c log y_c,
where entropy() represents the entropy method, y is the prediction probability vector over the labels corresponding to the output of each sub-network model, and C is the set of all labels in the classification task.
5. The method of claim 1, wherein, for each group of sub-network models, calculating, through a preset loss function, the first loss corresponding to the output of the first sub-network model, the second loss corresponding to the output of the second sub-network model and the third loss corresponding to the image classification result output by the third sub-network model, respectively, comprises:
for each group of sub-network models, through a preset loss function:
y_output_n = y_cut_n(x; ω; b);
[formula image FDA0002987422610000021]
[formula image FDA0002987422610000022]
where n denotes the nth division point; x is the input data; ω and b denote the network weights and biases from the input to the current division point, respectively; SoftMax() is a normalization function; y_label represents the one-hot ground-truth label vector of the data; y_output_n is the output of the network; y_cut_n maps the input of the network model to the output at the current division point; and label represents the label set;
and respectively calculating a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model.
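The exact claim-5 formulas are embedded as figure images, so the following PyTorch sketch encodes only the standard reading of the surrounding definitions: a SoftMax over the output at the nth division point followed by cross-entropy against the one-hot ground-truth vector. The helper `division_point_loss` and the tensor shapes are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def division_point_loss(y_output_n, y_label):
    """Assumed claim-5 loss at the nth division point: SoftMax over the
    sub-network output, then cross-entropy against the one-hot label."""
    y_hat = F.softmax(y_output_n, dim=-1)                  # SoftMax normalisation
    return -(y_label * torch.log(y_hat + 1e-12)).sum(dim=-1).mean()

# y_output_n = y_cut_n(x; ω; b): network output up to the nth division point.
y_output_n = torch.randn(8, 10)                            # illustrative logits
y_label = F.one_hot(torch.randint(0, 10, (8,)), num_classes=10).float()
loss_n = division_point_loss(y_output_n, y_label)
```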
6. The method of claim 5, wherein, for each group of sub-network models, jointly training the corresponding first, second and third sub-network models through the corresponding first, second and third losses to obtain the first, second and third to-be-output sub-network models comprises:
for each group of sub-network models, calculating a joint loss from the corresponding first loss, second loss and third loss through a preset joint loss function:
[formula image FDA0002987422610000031]
where ω_n is the weight of the loss function at the corresponding candidate division point;
and performing joint training on the corresponding first sub-network model, second sub-network model and third sub-network model according to the joint loss to obtain the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model.
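The joint loss formula is likewise a figure image; under the assumption, suggested by the ω_n weights, that it is a weighted sum of the per-division-point losses, a minimal PyTorch sketch (the weight values are illustrative):

```python
import torch

def joint_loss(losses, omegas):
    """Assumed claim-6 joint loss: a weighted sum of the per-division-point
    losses, with omega_n the weight of the nth candidate division point."""
    return sum(w * l for w, l in zip(omegas, losses))

# Illustrative per-division-point losses (see the claim-5 sketch above).
loss_1, loss_2, loss_3 = (torch.rand((), requires_grad=True) for _ in range(3))
total = joint_loss([loss_1, loss_2, loss_3], omegas=[0.3, 0.3, 0.4])
total.backward()   # one backward pass trains all three sub-networks jointly
```

Because all three sub-network models sit on one differentiable path, a single backward pass through the joint loss updates them together, which is what distinguishes the joint training of claims 1 and 6 from training each sub-network in isolation.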
7. An apparatus for neural network model training, the apparatus comprising:
the model division module is used for dividing an image classification network model to be trained into a plurality of groups of sub-network models according to a plurality of preset groups of division points, wherein each group of sub-network models comprises a first sub-network model, a second sub-network model and a third sub-network model;
the result generation module is used for, for each group of sub-network models, inputting a sample image into the first sub-network model, taking the output of the first sub-network model as the input of the second sub-network model and the output of the second sub-network model as the input of the third sub-network model, and generating an image classification result output by the third sub-network model;
the loss calculation module is used for calculating, for each group of sub-network models through a preset loss function, a first loss corresponding to the output of the first sub-network model, a second loss corresponding to the output of the second sub-network model and a third loss corresponding to the image classification result output by the third sub-network model, respectively;
the joint training module is used for performing, for each group of sub-network models, joint training on the corresponding first sub-network model, second sub-network model and third sub-network model through the corresponding first loss, second loss and third loss, respectively, to obtain a first to-be-output sub-network model, a second to-be-output sub-network model and a third to-be-output sub-network model;
the parameter calculation module is used for calculating, for each group of sub-network models, a plurality of performance parameters corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model, respectively;
the score calculation module is used for calculating, through a preset entropy weight model, the comprehensive performance score corresponding to each group of sub-network models according to the plurality of performance parameters corresponding to that group;
and the model selection module is used for selecting the group of sub-network models with the highest comprehensive performance score as the target sub-network models.
8. The apparatus of claim 7, wherein the parameter calculation module comprises:
and the parameter calculation submodule is used for calculating, for each group of sub-network models, the model accuracy, the end-to-end delay and the data drop-out rate corresponding to the first to-be-output sub-network model, the second to-be-output sub-network model and the third to-be-output sub-network model, respectively.
9. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any of claims 1-6 when executing a program stored in the memory.
10. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 6.
CN202110304132.3A 2021-03-22 2021-03-22 Neural network model training method and device, electronic equipment and storage medium Active CN113065641B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110304132.3A CN113065641B (en) 2021-03-22 2021-03-22 Neural network model training method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110304132.3A CN113065641B (en) 2021-03-22 2021-03-22 Neural network model training method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113065641A true CN113065641A (en) 2021-07-02
CN113065641B CN113065641B (en) 2023-09-26

Family

ID=76562826

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110304132.3A Active CN113065641B (en) 2021-03-22 2021-03-22 Neural network model training method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113065641B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2024014706A1 (en) * 2022-07-13 2024-01-18 Samsung Electronics Co., Ltd. (삼성전자주식회사) Electronic device for training neural network model performing image enhancement, and control method therefor

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110717589A (en) * 2019-09-03 2020-01-21 北京旷视科技有限公司 Data processing method, device and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Gao Wenlong (高文龙): "Research on Face Recognition Based on the Fusion of Image and Depth Information" (基于图像与深度信息融合的人脸识别研究), China Master's Theses Collection (《中国硕士论文数据辑》) *

Also Published As

Publication number Publication date
CN113065641B (en) 2023-09-26

Similar Documents

Publication Publication Date Title
CN113657465B (en) Pre-training model generation method and device, electronic equipment and storage medium
CN111209977B (en) Classification model training and using method, device, equipment and medium
CN111309479A (en) Method, device, equipment and medium for realizing task parallel processing
CN111046286A (en) Object recommendation method and device and computer storage medium
US20210350234A1 (en) Techniques to detect fusible operators with machine learning
US20220180209A1 (en) Automatic machine learning system, method, and device
CN108416032A (en) A kind of file classification method, device and storage medium
CN110717023A (en) Method and device for classifying interview answer texts, electronic equipment and storage medium
JP2022078310A (en) Image classification model generation method, device, electronic apparatus, storage medium, computer program, roadside device and cloud control platform
EP4343616A1 (en) Image classification method, model training method, device, storage medium, and computer program
CN112035626A (en) Rapid identification method and device for large-scale intentions and electronic equipment
CN114511083A (en) Model training method and device, storage medium and electronic device
CN113065641B (en) Neural network model training method and device, electronic equipment and storage medium
Khodaverdian et al. An energy aware resource allocation based on combination of CNN and GRU for virtual machine selection
CN112486467B (en) Interactive service recommendation method based on dual interaction relation and attention mechanism
CN113378067A (en) Message recommendation method, device, medium, and program product based on user mining
CN112766402A (en) Algorithm selection method and device and electronic equipment
CN113806501A (en) Method for training intention recognition model, intention recognition method and equipment
CN113961765B (en) Searching method, searching device, searching equipment and searching medium based on neural network model
EP4339843A1 (en) Neural network optimization method and apparatus
Liyanage et al. Automating the classification of urban issue reports: an optimal stopping approach
CN115202879A (en) Multi-type intelligent model-based cloud edge collaborative scheduling method and application
CN112214675B (en) Method, device, equipment and computer storage medium for determining user purchasing machine
Rahnamayan et al. Image Thresholding Using Differential Evolution.
CN114565105A (en) Data processing method and deep learning model training method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant