CN116935102B - Lightweight model training method, device, equipment and medium - Google Patents

Publication number
CN116935102B
Authority
CN
China
Prior art keywords
target detection
model
image classification
loss
lightweight
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310793747.6A
Other languages
Chinese (zh)
Other versions
CN116935102A (en)
Inventor
孔欧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mido Technology Co ltd
Original Assignee
Shanghai Mido Technology Co ltd
Application filed by Shanghai Mido Technology Co ltd
Priority to CN202310793747.6A
Publication of CN116935102A
Application granted
Publication of CN116935102B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G06V10/765 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects using rules for classification or partitioning the feature space
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/22 Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to a lightweight model training method, a device, equipment and a medium, wherein the training method comprises the following steps: constructing an image classification lightweight model based on the image classification large model, and respectively inputting training data into the image classification large model and the image classification lightweight model to obtain final loss of an image classification task; constructing a target detection lightweight model based on the target detection large model, and respectively inputting training data into the target detection large model and the target detection lightweight model to obtain the final loss of a target detection task; and carrying out weighted summation on the final loss of the image classification task and the final loss of the target detection task to obtain joint task loss, and updating parameters in the image classification lightweight model and the target detection lightweight model by adopting a gradient descent method based on the joint task loss. The invention can improve the effect of the lightweight model.

Description

Lightweight model training method, device, equipment and medium
Technical Field
The invention relates to the technical field of machine learning, in particular to a lightweight model training method.
Background
At present, a lightweight model, that is, a model with a small number of parameters and a high inference speed, is usually obtained by designing a lightweight model structure and then optimizing it against the ground-truth labels (group_trunk) with a loss function (loss_func). As a result, the trained lightweight model often performs much worse than a large model.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a lightweight model training method that can improve the effect of a lightweight model.
The technical solution adopted by the invention to solve this technical problem is a lightweight model training method comprising the following steps:
constructing an image classification lightweight model based on an image classification large model, wherein the parameter quantity of a main network part of the image classification lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the image classification large model; respectively inputting training data into an image classification large model and an image classification lightweight model to obtain final loss of an image classification task;
constructing a target detection lightweight model based on a target detection large model, wherein the parameter quantity of a main network part of the target detection lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the target detection large model; respectively inputting training data into a target detection large model and a target detection lightweight model to obtain the final loss of a target detection task;
and carrying out weighted summation on the final loss of the image classification task and the final loss of the target detection task to obtain joint task loss, and updating parameters in the image classification lightweight model and the target detection lightweight model by adopting a gradient descent method based on the joint task loss to realize joint training of the image classification lightweight model and the target detection lightweight model.
The image classification large model and the target detection large model both employ a softmax_t function, which is obtained by dividing the input of the exponential term in the original softmax function by a constant T.
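Written out as a formula, the softmax_t function described above takes the following form. The symbols $z_i$ (the i-th input value) and $K$ (the number of categories) are notation assumed here for illustration and do not appear in the original text.

```latex
\mathrm{softmax\_t}(z)_i \;=\; \frac{\exp\!\left(z_i / T\right)}{\sum_{j=1}^{K} \exp\!\left(z_j / T\right)},
\qquad i = 1, \dots, K
```

With $T = 1$ this reduces to the original softmax function; a larger $T$ makes the output probabilities more uniform.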
Inputting the training data into the image classification large model and the image classification lightweight model respectively to obtain the final loss of the image classification task specifically comprises the following steps:
inputting training data into the image classification large model to obtain a first probability of each category;
inputting the training data into the image classification lightweight model to obtain a second probability of each category;
and calculating the final loss of the image classification task based on the first probability of each category, the second probability of each category and the real classification label.
The calculating the final loss of the image classification task based on the first probability of each category, the second probability of each category and the real classification label specifically comprises the following steps:
performing cross entropy loss function calculation on the first probability of each category and the second probability of each category to obtain a first result;
performing cross entropy loss function calculation on the second probability of each category and the real classification label to obtain a second result;
and calculating the average value of the first result and the second result, and taking the obtained average value as the final loss of the image classification task.
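The classification loss described in the steps above can be summarized as follows. The notation is assumed for illustration: $p^{\text{large}}$ and $p^{\text{lite}}$ are the first and second probabilities, $y$ is the real classification label in one-hot form, and $K$ is the number of categories. The text does not state which argument of the cross entropy acts as the target, so treating the large-model probability as the target is an assumption.

```latex
\begin{aligned}
L_{1} &= -\sum_{k=1}^{K} p^{\text{large}}_{k} \,\log p^{\text{lite}}_{k}
  && \text{(first result)} \\
L_{2} &= -\sum_{k=1}^{K} y_{k} \,\log p^{\text{lite}}_{k}
  && \text{(second result)} \\
L_{\text{class}} &= \tfrac{1}{2}\left(L_{1} + L_{2}\right)
  && \text{(final loss of the image classification task)}
\end{aligned}
```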
Inputting the training data into the target detection large model and the target detection lightweight model respectively to obtain the final loss of the target detection task specifically comprises the following steps:
inputting training data into the target detection large model to obtain a first probability of each object category and a first rectangular frame position of each object;
inputting the training data into the target detection lightweight model to obtain a second probability of each object category and a second rectangular frame position of each object;
the final loss of the target detection task is calculated based on the first probability of each object class, the first rectangular box position of each object, the second probability of each object class, the second rectangular box position of each object, and the real target detection label.
The calculating the final loss of the target detection task based on the first probability of each object category, the first rectangular frame position of each object, the second probability of each object category, the second rectangular frame position of each object and the real target detection label specifically comprises the following steps:
performing cross entropy loss function calculation on the first probability of each object class and the second probability of each object class to obtain a first result;
performing cross entropy loss function calculation on the second probability of each object class and the classification sub-label in the real target detection label to obtain a second result;
summing the first result and the second result to obtain a classification loss value;
carrying out a mean square error loss function calculation on the first rectangular frame position of each object and the second rectangular frame position of each object to obtain a third result;
carrying out mean square error loss function calculation on the second rectangular frame position of each object and the position sub-label in the real target detection label to obtain a fourth result;
summing the third result and the fourth result to obtain a position loss value;
and obtaining an average value of the classification loss value and the position loss value, and taking the obtained average value as the final loss of the target detection task.
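The detection loss described in the steps above can be summarized as follows. The notation is assumed for illustration: $p$ denotes class probabilities, $b$ denotes rectangular frame positions, $y^{\text{class}}$ and $y^{\text{box}}$ are the classification and position sub-labels, $\mathrm{CE}(t, q) = -\sum_k t_k \log q_k$ is the cross entropy with target $t$, and $\mathrm{MSE}$ is the mean square error. Using the large-model probability as the cross-entropy target is an assumption, since the text does not fix the argument order.

```latex
\begin{aligned}
L_{\text{cls}} &= \mathrm{CE}\!\left(p^{\text{large}}, p^{\text{lite}}\right)
                + \mathrm{CE}\!\left(y^{\text{class}}, p^{\text{lite}}\right)
  && \text{(classification loss value)} \\
L_{\text{pos}} &= \mathrm{MSE}\!\left(b^{\text{large}}, b^{\text{lite}}\right)
                + \mathrm{MSE}\!\left(b^{\text{lite}}, y^{\text{box}}\right)
  && \text{(position loss value)} \\
L_{\text{det}} &= \tfrac{1}{2}\left(L_{\text{cls}} + L_{\text{pos}}\right)
  && \text{(final loss of the target detection task)}
\end{aligned}
```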
The technical solution adopted by the invention to solve this technical problem further provides a lightweight model training device, comprising:
the image classification training module is used for constructing an image classification lightweight model based on the image classification large model, wherein the parameter quantity of a main network part of the image classification lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the image classification large model; respectively inputting training data into an image classification large model and an image classification lightweight model to obtain final loss of an image classification task;
the target detection training module is used for constructing a target detection lightweight model based on a target detection large model, wherein the parameter quantity of a main network part of the target detection lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the target detection large model; respectively inputting training data into a target detection large model and a target detection lightweight model to obtain the final loss of a target detection task;
and the optimization training module is used for carrying out weighted summation on the final loss of the image classification task and the final loss of the target detection task to obtain joint task loss, and updating parameters in the image classification lightweight model and the target detection lightweight model by adopting a gradient descent method based on the joint task loss so as to realize joint training of the image classification lightweight model and the target detection lightweight model.
The technical solution adopted by the invention to solve this technical problem further provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the lightweight model training method described above when executing the computer program.
The technical solution adopted by the invention to solve this technical problem further provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the lightweight model training method described above.
Advantageous effects
By adopting the above technical solution, the invention has the following advantages and positive effects compared with the prior art: the lightweight model is trained using both the ground-truth labels and the output probabilities of the large model, which improves the effect of the lightweight model; in addition, the lightweight models for the target detection task and the image classification task can be trained and optimized jointly, which improves training efficiency.
Drawings
Fig. 1 is a flowchart of a first embodiment of the lightweight model training method of the present invention.
Detailed Description
The invention will be further illustrated with reference to specific examples. It is to be understood that these examples are illustrative of the present invention and are not intended to limit the scope of the present invention. Further, it is understood that various changes and modifications may be made by those skilled in the art after reading the teachings of the present invention, and such equivalents are intended to fall within the scope of the claims appended hereto.
A first embodiment of the present invention relates to a lightweight model training method, as shown in fig. 1, comprising the steps of:
step 1, constructing an image classification lightweight model based on an image classification large model, wherein the parameter quantity of a main network part of the image classification lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the image classification large model; and respectively inputting training data into the image classification large model and the image classification lightweight model to obtain the final loss of the image classification task.
Following the general design of image classification networks, the image classification large model and the image classification lightweight model in this step each consist of 3 parts: a backbone, a class_head and a softmax function.
The parameters of the image classification large model are fixed, and the model is only responsible for forward inference. A training image is passed through the backbone and class_head of the image classification large model to obtain a feature vector. In this embodiment, the softmax function of the image classification large model is slightly modified: the input of the exponential term in the original formula is divided by a constant T, whose value is 100 in this embodiment, and the rest is unchanged. The modified function is named the softmax_t function. The feature vector is passed through the softmax_t function to obtain a first probability prob_class_large of each category.
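The modified softmax can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the patent's implementation: the example values in `logits` are hypothetical, and subtracting the maximum before exponentiation is a standard numerical-stability step that is not mentioned in the text.

```python
import math


def softmax_t(logits, T=100.0):
    """Softmax whose exponent input is divided by a constant T."""
    scaled = [z / T for z in logits]
    # Subtracting the maximum does not change the result; it only avoids overflow.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical feature vector produced by backbone + class_head.
logits = [2.0, 1.0, 0.1]

prob_plain = softmax_t(logits, T=1.0)          # original softmax
prob_class_large = softmax_t(logits, T=100.0)  # T = 100 as in this embodiment
```

With T = 100 the probabilities stay in the same order as with the original softmax but lie much closer to a uniform distribution, so the relative scores of the non-maximal categories remain visible in the large model's output.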
The parameter quantity of the backbone of the image classification lightweight model is set to 1% of the parameter quantity of the backbone of the image classification large model, and the rest is unchanged. A training image is passed through the image classification lightweight model to obtain a second probability prob_class_lite of each category.
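The text does not say how the backbone is shrunk to the preset percentage. As one possible illustration only, the sketch below computes the parameter budget and a uniform channel-width multiplier, under the assumption that the backbone is dominated by convolution layers whose parameter count grows with the product of input and output channels. The function names and the 25,000,000-parameter figure are hypothetical.

```python
import math


def backbone_param_budget(large_backbone_params, percent):
    """Target parameter count for the lightweight backbone."""
    return round(large_backbone_params * percent / 100.0)


def uniform_width_multiplier(percent):
    """Channel multiplier s such that s * s equals the target parameter ratio.

    Assumption: parameters scale with c_in * c_out, so scaling every layer's
    channel widths by s scales the parameter count by roughly s squared.
    """
    return math.sqrt(percent / 100.0)


large_backbone_params = 25_000_000  # hypothetical size of the large backbone
lite_budget = backbone_param_budget(large_backbone_params, percent=1)
width_scale = uniform_width_multiplier(percent=1)
```

Under this assumption, a 1% parameter budget corresponds to keeping about one tenth of the channels in each layer.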
A cross entropy loss is calculated between the second probability prob_class_lite of each category and the first probability prob_class_large of each category to obtain a first result, recorded as loss_large.
A cross entropy loss is calculated between the second probability prob_class_lite of each category and the real classification label group_trunk_class to obtain a second result, recorded as loss_trunk.
The first result loss_large and the second result loss_trunk are averaged to obtain the final loss loss_final_class of the image classification task.
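The three calculations above can be sketched as follows. This is a minimal illustration for a single training image, with assumptions labeled: the probability values are hypothetical, the real label is taken to be one-hot, the large-model probability is used as the cross-entropy target (the text does not fix the argument order), and `eps` is added only to avoid log(0).

```python
import math


def cross_entropy(target, pred, eps=1e-12):
    """Cross entropy -sum_k target_k * log(pred_k) for one sample."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def classification_final_loss(prob_class_large, prob_class_lite, group_trunk_class):
    # First result: lightweight output against the large model's output.
    loss_large = cross_entropy(prob_class_large, prob_class_lite)
    # Second result: lightweight output against the real classification label.
    loss_trunk = cross_entropy(group_trunk_class, prob_class_lite)
    # Final loss of the image classification task: average of the two results.
    return (loss_large + loss_trunk) / 2.0


# Hypothetical values for one image with three categories.
prob_class_large = [0.5, 0.3, 0.2]
prob_class_lite = [0.6, 0.3, 0.1]
group_trunk_class = [1.0, 0.0, 0.0]  # one-hot real label (assumed encoding)

loss_final_class = classification_final_loss(
    prob_class_large, prob_class_lite, group_trunk_class
)
```

Because the large model's parameters are fixed, only prob_class_lite depends on trainable parameters, so both terms pull the lightweight model's output toward a target.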
Step 2, constructing a target detection lightweight model based on a target detection large model, wherein the parameter quantity of a main network part of the target detection lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the target detection large model; and respectively inputting training data into the target detection large model and the target detection lightweight model to obtain the final loss of the target detection task.
Following the general design of target detection networks, the target detection large model and the target detection lightweight model in this step each mainly consist of 5 parts: a backbone, a class_head, a box_head, a softmax function and a sigmoid function.
As with the image classification large model, the parameters of the target detection large model are fixed, and the model is only responsible for forward inference. A training image passes in sequence through the backbone, class_head and softmax_t function of the target detection large model (the softmax_t function has the same design as in the image classification large model) to obtain a first probability prob_class_large of each object category, and passes in sequence through the backbone, box_head and sigmoid function of the target detection large model to obtain a first rectangular frame position value_box_large of each object.
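The two output branches can be sketched for a single detected object as below. This is an illustration under assumptions: `class_logits` and `box_logits` stand in for the outputs of class_head and box_head and are hypothetical values, and the box is assumed to be encoded as four numbers that the sigmoid maps into the range (0, 1).

```python
import math


def softmax_t(logits, T=100.0):
    """Softmax whose exponent input is divided by a constant T."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # stability shift; does not change the result
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical head outputs for one object.
class_logits = [3.0, 0.5, -1.0]     # output of class_head
box_logits = [0.0, 1.2, -0.8, 2.0]  # output of box_head (assumed 4 values)

prob_class_large = softmax_t(class_logits)          # class branch
value_box_large = [sigmoid(v) for v in box_logits]  # box branch
```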
The parameter quantity of the backbone in the target detection lightweight model is set to 1% of the parameter quantity of the backbone of the target detection large model, and the rest is unchanged. A training image is passed through the target detection lightweight model to obtain a second probability prob_class_lite of each object category and a second rectangular frame position value_box_lite of each object.
A cross entropy loss is calculated between the second probability prob_class_lite of each object category and the first probability prob_class_large of each object category to obtain a first result, recorded as loss_large_class.
A cross entropy loss is calculated between the second probability prob_class_lite of each object category and the classification sub-label group_trunk_class in the real target detection label to obtain a second result, recorded as loss_trunk_class.
The first result loss_large_class and the second result loss_trunk_class are summed to obtain a classification loss value loss_class_final.
A mean square error loss is calculated between the first rectangular frame position value_box_large of each object and the second rectangular frame position value_box_lite of each object to obtain a third result, recorded as loss_large_box.
A mean square error loss is calculated between the second rectangular frame position value_box_lite of each object and the position sub-label group_trunk_box in the real target detection label to obtain a fourth result, recorded as loss_trunk_box.
The third result loss_large_box and the fourth result loss_trunk_box are summed to obtain a position loss value loss_box_final.
The classification loss value loss_class_final and the position loss value loss_box_final are averaged to obtain the final loss loss_final_detection of the target detection task.
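The detection loss can be sketched for a single object as follows. This is a minimal illustration with assumptions labeled: all numeric values are hypothetical, the classification sub-label is taken to be one-hot, the box is assumed to be four numbers, and the large-model probability is used as the cross-entropy target because the text does not fix the argument order.

```python
import math


def cross_entropy(target, pred, eps=1e-12):
    """Cross entropy -sum_k target_k * log(pred_k) for one sample."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def mse(a, b):
    """Mean square error between two equally long vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def detection_final_loss(prob_class_large, value_box_large,
                         prob_class_lite, value_box_lite,
                         group_trunk_class, group_trunk_box):
    loss_large_class = cross_entropy(prob_class_large, prob_class_lite)
    loss_trunk_class = cross_entropy(group_trunk_class, prob_class_lite)
    loss_class_final = loss_large_class + loss_trunk_class  # summed, not averaged

    loss_large_box = mse(value_box_large, value_box_lite)
    loss_trunk_box = mse(value_box_lite, group_trunk_box)
    loss_box_final = loss_large_box + loss_trunk_box        # summed, not averaged

    # Final loss of the target detection task: average of the two loss values.
    return (loss_class_final + loss_box_final) / 2.0


# Hypothetical values for one object.
loss_final_detection = detection_final_loss(
    prob_class_large=[0.5, 0.3, 0.2],
    value_box_large=[0.5, 0.5, 0.2, 0.2],
    prob_class_lite=[0.6, 0.3, 0.1],
    value_box_lite=[0.4, 0.5, 0.2, 0.3],
    group_trunk_class=[1.0, 0.0, 0.0],
    group_trunk_box=[0.5, 0.5, 0.25, 0.25],
)
```

Note the asymmetry with the image classification task: here the large-model term and the label term are summed within each branch, and only the two branch values are averaged.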
Step 3, carrying out weighted summation on the final loss loss_final_class of the image classification task and the final loss loss_final_detection of the target detection task to obtain the joint task loss, and updating the parameters in the image classification lightweight model and the target detection lightweight model by a gradient descent method based on the joint task loss, so as to realize joint training of the image classification lightweight model and the target detection lightweight model.
The formula for weighted summation in this embodiment is: 0.6 × loss_final_class + 0.4 × loss_final_detection.
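The weighted summation and a plain gradient descent update can be sketched as below. The weights 0.6 and 0.4 come from this embodiment; everything else is an assumption for illustration: the loss values, parameters and gradients are hypothetical numbers, the learning rate is not given in the text, and in practice the gradients of the joint task loss would come from backpropagation through both lightweight models.

```python
def joint_task_loss(loss_final_class, loss_final_detection,
                    w_class=0.6, w_detection=0.4):
    """Weighted sum of the two task losses (weights from this embodiment)."""
    return w_class * loss_final_class + w_detection * loss_final_detection


def gradient_descent_step(params, grads, learning_rate=0.1):
    """One plain gradient descent update: p <- p - learning_rate * dL/dp."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Hypothetical task losses for one training batch.
joint_loss = joint_task_loss(loss_final_class=0.8, loss_final_detection=0.5)

# Hypothetical parameters of the lightweight models and gradients of joint_loss.
params = [1.0, -2.0]
grads = [0.5, -1.0]
new_params = gradient_descent_step(params, grads)
```

Because a single scalar loss drives the update, the parameters of both lightweight models are updated in the same step, which is what makes the training joint.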
As can be seen, the invention trains the lightweight model using both the ground-truth labels and the output probabilities of the large model, which improves the effect of the lightweight model.
A second embodiment of the present invention relates to a lightweight model training device, comprising:
the image classification training module is used for constructing an image classification lightweight model based on the image classification large model, wherein the parameter quantity of a main network part of the image classification lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the image classification large model; respectively inputting training data into an image classification large model and an image classification lightweight model to obtain final loss of an image classification task;
the target detection training module is used for constructing a target detection lightweight model based on a target detection large model, wherein the parameter quantity of a main network part of the target detection lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the target detection large model; respectively inputting training data into a target detection large model and a target detection lightweight model to obtain the final loss of a target detection task;
and the optimization training module is used for carrying out weighted summation on the final loss of the image classification task and the final loss of the target detection task to obtain joint task loss, and updating parameters in the image classification lightweight model and the target detection lightweight model by adopting a gradient descent method based on the joint task loss so as to realize joint training of the image classification lightweight model and the target detection lightweight model.
When the image classification training module inputs training data into the image classification large model and the image classification lightweight model respectively to obtain the final loss of the image classification task, it specifically performs the following steps:
inputting training data into the image classification large model to obtain a first probability of each category;
inputting the training data into the image classification lightweight model to obtain a second probability of each category;
and calculating the final loss of the image classification task based on the first probability of each category, the second probability of each category and the real classification label.
The calculating the final loss of the image classification task based on the first probability of each category, the second probability of each category and the real classification label specifically comprises the following steps:
performing cross entropy loss function calculation on the first probability of each category and the second probability of each category to obtain a first result;
performing cross entropy loss function calculation on the second probability of each category and the real classification label to obtain a second result;
and calculating the average value of the first result and the second result, and taking the obtained average value as the final loss of the image classification task.
When the target detection training module inputs training data into the target detection large model and the target detection lightweight model respectively to obtain the final loss of the target detection task, it specifically performs the following steps:
inputting training data into the target detection large model to obtain a first probability of each object category and a first rectangular frame position of each object;
inputting the training data into the target detection lightweight model to obtain a second probability of each object category and a second rectangular frame position of each object;
the final loss of the target detection task is calculated based on the first probability of each object class, the first rectangular box position of each object, the second probability of each object class, the second rectangular box position of each object, and the real target detection label.
The calculating the final loss of the target detection task based on the first probability of each object category, the first rectangular frame position of each object, the second probability of each object category, the second rectangular frame position of each object and the real target detection label specifically comprises the following steps:
performing cross entropy loss function calculation on the first probability of each object class and the second probability of each object class to obtain a first result;
performing cross entropy loss function calculation on the second probability of each object class and the classification sub-label in the real target detection label to obtain a second result;
summing the first result and the second result to obtain a classification loss value;
carrying out a mean square error loss function calculation on the first rectangular frame position of each object and the second rectangular frame position of each object to obtain a third result;
carrying out mean square error loss function calculation on the second rectangular frame position of each object and the position sub-label in the real target detection label to obtain a fourth result;
summing the third result and the fourth result to obtain a position loss value;
and obtaining an average value of the classification loss value and the position loss value, and taking the obtained average value as the final loss of the target detection task.
A third embodiment of the invention is directed to an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the lightweight model training method of the first embodiment when executing the computer program.
A fourth embodiment of the present invention is directed to a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the lightweight model training method of the first embodiment.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein. The scheme in the embodiments of the invention can be implemented in various computer languages, such as the object-oriented programming language Java and the interpreted scripting language JavaScript.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (5)

1. A lightweight model training method, characterized by comprising the following steps:
constructing an image classification lightweight model based on an image classification large model, wherein the parameter quantity of a main network part of the image classification lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the image classification large model; respectively inputting training data into the image classification large model and the image classification lightweight model to obtain a final loss of an image classification task, which specifically comprises the following steps:
inputting training data into the image classification large model to obtain a first probability of each category;
inputting the training data into the image classification lightweight model to obtain a second probability of each category;
performing cross entropy loss function calculation on the first probability of each category and the second probability of each category to obtain a first result;
performing cross entropy loss function calculation on the second probability of each category and the real classification label to obtain a second result; calculating the average value of the first result and the second result, and taking the obtained average value as the final loss of the image classification task;
constructing a target detection lightweight model based on a target detection large model, wherein the parameter quantity of a main network part of the target detection lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the target detection large model; respectively inputting training data into the target detection large model and the target detection lightweight model to obtain a final loss of a target detection task, which specifically comprises the following steps:
inputting training data into the target detection large model to obtain a first probability of each object category and a first rectangular frame position of each object;
inputting the training data into the target detection lightweight model to obtain a second probability of each object category and a second rectangular frame position of each object;
performing cross entropy loss function calculation on the first probability of each object class and the second probability of each object class to obtain a first calculation result;
performing cross entropy loss function calculation on the second probability of each object class and the classification sub-label in the real target detection label to obtain a second calculation result;
summing the first calculation result and the second calculation result to obtain a classification loss value;
carrying out mean square error loss function calculation on the first rectangular frame position of each object and the second rectangular frame position of each object to obtain a third calculation result;
carrying out mean square error loss function calculation on the second rectangular frame position of each object and the position sub-label in the real target detection label to obtain a fourth calculation result;
summing the third calculation result and the fourth calculation result to obtain a position loss value;
calculating the average value of the classification loss value and the position loss value, and taking the obtained average value as the final loss of the target detection task;
and carrying out weighted summation on the final loss of the image classification task and the final loss of the target detection task to obtain joint task loss, and updating parameters in the image classification lightweight model and the target detection lightweight model by adopting a gradient descent method based on the joint task loss to realize joint training of the image classification lightweight model and the target detection lightweight model.
2. The lightweight model training method of claim 1, wherein the image classification large model and the target detection large model each employ a softmax_t function, the softmax_t function being the original softmax function with the input of each exponent divided by a constant T.
3. A lightweight model training apparatus, characterized by applying the lightweight model training method as claimed in any one of claims 1-2, comprising:
the image classification training module is used for constructing an image classification lightweight model based on the image classification large model, wherein the parameter quantity of a main network part of the image classification lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the image classification large model; respectively inputting training data into an image classification large model and an image classification lightweight model to obtain final loss of an image classification task;
the target detection training module is used for constructing a target detection lightweight model based on a target detection large model, wherein the parameter quantity of a main network part of the target detection lightweight model is set to be a preset percentage of the parameter quantity of the main network part of the target detection large model; respectively inputting training data into a target detection large model and a target detection lightweight model to obtain the final loss of a target detection task;
and the optimization training module is used for carrying out weighted summation on the final loss of the image classification task and the final loss of the target detection task to obtain joint task loss, and updating parameters in the image classification lightweight model and the target detection lightweight model by adopting a gradient descent method based on the joint task loss so as to realize joint training of the image classification lightweight model and the target detection lightweight model.
4. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the lightweight model training method of any of claims 1-2 when the computer program is executed.
5. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the lightweight model training method as claimed in any of the claims 1-2.
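
The loss computation recited in claims 1 and 2 can be illustrated with a minimal sketch in plain Python. It is not the patented implementation. The following are assumptions made only for illustration: all function names, the choice of the large-model probabilities as the target of the first cross entropy, one-hot real labels, a single object per sample in the detection task, rectangular frames given as four numbers, and the default weights of 0.5. The gradient descent update itself is omitted and would normally be delegated to a training framework.

```python
import math
from typing import List, Sequence


def softmax_t(logits: Sequence[float], T: float = 1.0) -> List[float]:
    """Softmax in which the input of each exponent is divided by a constant T."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtracting the max leaves the result unchanged but avoids overflow
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target: Sequence[float], pred: Sequence[float],
                  eps: float = 1e-12) -> float:
    """H(target, pred) = -sum_i target_i * log(pred_i); eps guards against log(0)."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, pred))


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean square error between two rectangular frame positions."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def classification_loss(p_large: Sequence[float], p_light: Sequence[float],
                        label: Sequence[float]) -> float:
    """Average of the large-vs-lightweight and label-vs-lightweight cross entropies."""
    first = cross_entropy(p_large, p_light)   # assumption: large model acts as target
    second = cross_entropy(label, p_light)    # real classification label (one-hot)
    return (first + second) / 2


def detection_loss(p_large: Sequence[float], p_light: Sequence[float],
                   label: Sequence[float], box_large: Sequence[float],
                   box_light: Sequence[float], box_label: Sequence[float]) -> float:
    """Average of the classification loss value (a sum) and the position loss value (a sum)."""
    cls_value = cross_entropy(p_large, p_light) + cross_entropy(label, p_light)
    pos_value = mse(box_large, box_light) + mse(box_light, box_label)
    return (cls_value + pos_value) / 2


def joint_loss(cls_task_loss: float, det_task_loss: float,
               w_cls: float = 0.5, w_det: float = 0.5) -> float:
    """Weighted sum of the two task losses; the weights are hypothetical."""
    return w_cls * cls_task_loss + w_det * det_task_loss


# Example with made-up logits, labels, and frame positions.
p_large = softmax_t([2.0, 1.0, 0.1], T=4.0)   # large model uses softmax_t
p_light = softmax_t([1.5, 1.2, 0.3])          # lightweight model, T = 1 assumed
label = [1.0, 0.0, 0.0]
l_cls = classification_loss(p_large, p_light, label)
l_det = detection_loss(p_large, p_light, label,
                       [0.1, 0.1, 0.9, 0.9], [0.2, 0.1, 0.8, 0.9], [0.1, 0.1, 0.9, 1.0])
print(joint_loss(l_cls, l_det))
```

In this sketch a larger T flattens the large model's probability distribution, and the joint loss is the single scalar from which the gradients for both lightweight models would be computed.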
CN202310793747.6A 2023-06-30 2023-06-30 Lightweight model training method, device, equipment and medium Active CN116935102B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310793747.6A CN116935102B (en) 2023-06-30 2023-06-30 Lightweight model training method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310793747.6A CN116935102B (en) 2023-06-30 2023-06-30 Lightweight model training method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN116935102A (en) 2023-10-24
CN116935102B (en) 2024-02-20

Family

ID=88383498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310793747.6A Active CN116935102B (en) 2023-06-30 2023-06-30 Lightweight model training method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN116935102B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160379A (en) * 2018-11-07 2020-05-15 北京嘀嘀无限科技发展有限公司 Training method and device of image detection model and target detection method and device
WO2020155518A1 (en) * 2019-02-03 2020-08-06 平安科技(深圳)有限公司 Object detection method and device, computer device and storage medium
WO2021087985A1 (en) * 2019-11-08 2021-05-14 深圳市欢太科技有限公司 Model training method and apparatus, storage medium, and electronic device
CN114022658A (en) * 2021-09-17 2022-02-08 浙江智慧视频安防创新中心有限公司 Target detection method, device, storage medium and terminal
CN115758245A (en) * 2022-11-18 2023-03-07 上海蜜度信息技术有限公司 Multi-mode data classification method, device, equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160379A (en) * 2018-11-07 2020-05-15 北京嘀嘀无限科技发展有限公司 Training method and device of image detection model and target detection method and device
WO2020155518A1 (en) * 2019-02-03 2020-08-06 平安科技(深圳)有限公司 Object detection method and device, computer device and storage medium
WO2021087985A1 (en) * 2019-11-08 2021-05-14 深圳市欢太科技有限公司 Model training method and apparatus, storage medium, and electronic device
CN114424253A (en) * 2019-11-08 2022-04-29 深圳市欢太科技有限公司 Model training method and device, storage medium and electronic equipment
CN114022658A (en) * 2021-09-17 2022-02-08 浙江智慧视频安防创新中心有限公司 Target detection method, device, storage medium and terminal
CN115758245A (en) * 2022-11-18 2023-03-07 上海蜜度信息技术有限公司 Multi-mode data classification method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN116935102A (en) 2023-10-24

Similar Documents

Publication Publication Date Title
CN109816032B (en) Unbiased mapping zero sample classification method and device based on generative countermeasure network
CN113947196A (en) Network model training method and device and computer readable storage medium
CN111079532A (en) Video content description method based on text self-encoder
CN110362814B (en) Named entity identification method and device based on improved loss function
US11449707B2 (en) Method for processing automobile image data, apparatus, and readable storage medium
CN114359938B (en) Form identification method and device
CN108053454A (en) Graph-structured data generation method based on a deep convolutional generative adversarial network
JP7110929B2 (en) Knowledge Complementary Program, Knowledge Complementary Method, and Knowledge Complementary Device
CN113741886A (en) Statement level program repairing method and system based on graph
CN116402352A (en) Enterprise risk prediction method and device, electronic equipment and medium
CN107729885B (en) Face enhancement method based on multiple residual error learning
CN116594601A (en) Pre-training large model code generation method based on knowledge base and multi-step prompt
CN111461353A (en) Model training method and system
CN111950579A (en) Training method and training device for classification model
CN113705402A (en) Video behavior prediction method, system, electronic device and storage medium
CN116935102B (en) Lightweight model training method, device, equipment and medium
CN116152609B (en) Distributed model training method, system, device and computer readable medium
CN110889316B (en) Target object identification method and device and storage medium
CN117574262A (en) Underwater sound signal classification method, system and medium for small sample problem
JPWO2019180868A1 (en) Image generator, image generator and image generator
CN112861601A (en) Method for generating confrontation sample and related equipment
Li et al. Automated deep learning system for power line inspection image analysis and processing: Architecture and design issues
CN114547391A (en) Message auditing method and device
CN112698977B (en) Method, device, equipment and medium for positioning server fault
CN112288032B (en) Method and device for quantitative model training based on generation of confrontation network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 301AB, No. 10, Lane 198, Zhangheng Road, Free Trade Pilot Zone, Pudong New Area, Shanghai, 200120

Applicant after: Shanghai Mido Technology Co.,Ltd.

Address before: Room 301AB, No. 10, Lane 198, Zhangheng Road, Free Trade Pilot Zone, Pudong New Area, Shanghai, 200120

Applicant before: SHANGHAI MDATA INFORMATION TECHNOLOGY Co.,Ltd.

GR01 Patent grant