CN113240019A - Verification set loss curve correction method and device, terminal device and storage medium


Info

Publication number
CN113240019A
Authority
CN
China
Prior art keywords
loss, verification, subset, target, total
Legal status
Pending
Application number
CN202110545055.0A
Other languages
Chinese (zh)
Inventor
肖谦
洪坤磊
钱令军
Current Assignee
Shenzhen Zhiying Medical Technology Co ltd
Original Assignee
Shenzhen Zhiying Medical Technology Co ltd
Priority date
2021-05-19
Filing date
2021-05-19
Publication date
2021-08-10
Application filed by Shenzhen Zhiying Medical Technology Co ltd
Priority to CN202110545055.0A
Publication of CN113240019A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/211 - Selection of the most significant subset of features

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The application relates to the technical field of artificial intelligence and provides a verification set loss curve correction method and apparatus, a terminal device, and a storage medium. In the embodiments of the application, a verification loss subset is obtained in the current training process, where the verification loss subset comprises the total loss of each verification set sample; the verification loss subset is added to a first set, and if the number of verification loss subsets in the first set exceeds a first preset number, the oldest verification loss subset in the first set is deleted; each verification loss subset in the first set is normalized, and a target loss subset is determined from the normalized verification loss subsets, where the target loss subset comprises a target total loss for each verification set sample; finally, a second preset number of verification set samples are selected from the target loss subset, and a loss curve is drawn from the mean of the total losses corresponding to those samples, thereby improving the accuracy of the verification set loss curve.

Description

Verification set loss curve correction method and device, terminal device and storage medium
Technical Field
The application belongs to the technical field of artificial intelligence, and particularly relates to a verification set loss curve correction method and device, terminal equipment and a storage medium.
Background
With the development of society, medical imaging technology has entered the digital era. Medical imaging devices such as CR and DR are increasingly common in daily life, and target detection algorithms are widely applied in these devices. In a target detection algorithm, if the verification set loss curve is used directly to describe how the model is optimized during training, a small amount of noisy data in the verification set usually makes the curve fluctuate sharply, and can even change the trend of the curve prematurely, causing the trainer to terminate training too early.
In the prior art, metrics such as verification set mAP and AP are usually used to describe how model performance is optimized during training, while loss curves are generally used to describe how well the training set is fitted. However, such metrics are difficult to compare synchronously against the training set, and the accuracy of the verification set loss curve is poor.
Disclosure of Invention
The embodiments of the application provide a verification set loss curve correction method and apparatus, a terminal device, and a storage medium, which can solve the problem of poor accuracy of the verification set loss curve.
In a first aspect, an embodiment of the present application provides a method for correcting a verification set loss curve, including:
obtaining a verification loss subset in the current training process, wherein the verification loss subset comprises the total loss of each verification set sample;
adding the verification loss subsets to a first set, and deleting the verification loss subset with the earliest time in the first set if the number of the verification loss subsets in the first set exceeds a first preset number;
normalizing each verification loss subset in the first set, and determining a target loss subset according to each verification loss subset after the normalization, wherein the target loss subset comprises a target total loss of each verification set sample;
and selecting a second preset number of verification set samples from the target loss subset, and drawing a loss curve according to the average values of the total losses corresponding to the second preset number of verification set samples.
In a second aspect, an embodiment of the present application provides a verification set loss curve modification apparatus, including:
an obtaining module, configured to obtain a verification loss subset in the current training process, where the verification loss subset includes the total loss of each verification set sample;
an adding module, configured to add the verification loss subsets to a first set, and delete a verification loss subset with an earliest time in the first set if the number of the verification loss subsets in the first set exceeds a first preset number;
a normalization processing module, configured to perform normalization processing on each verification loss subset in the first set, and determine a target loss subset according to each verification loss subset after the normalization processing, where the target loss subset includes a target total loss of each verification set sample;
and the selecting module is used for selecting a second preset number of verification set samples from the target loss subset and drawing a loss curve according to the average value of the total loss corresponding to the second preset number of verification set samples.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of any one of the verification set loss curve correction methods described above.
In a fourth aspect, the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program, when executed by a processor, implements the steps of any of the verification set loss curve modification methods described above.
In a fifth aspect, embodiments of the present application provide a computer program product, which, when run on a terminal device, causes the terminal device to execute any one of the verification set loss curve correction methods in the first aspect.
In the embodiments of the application, a verification loss subset is obtained in the current training process, and the verification loss subset includes the total loss of each verification set sample. The verification loss subset is added to the first set, and if the number of verification loss subsets in the first set exceeds a first preset number, the oldest verification loss subset in the first set is deleted, so that the stable performance of the samples is captured and subsequent sample screening is made easier. Each verification loss subset in the first set is then normalized, which improves data accuracy, and a target loss subset is determined from the normalized subsets. A second preset number of verification set samples are selected from the target loss subset, so that samples conforming to the training set distribution are chosen, and a loss curve is drawn from the mean of the total losses corresponding to those samples, thereby improving the accuracy of the verification set loss curve.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and other drawings can be obtained by those skilled in the art from these drawings without creative effort.
Fig. 1 is a first flowchart illustrating a verification set loss curve modification method according to an embodiment of the present disclosure;
FIG. 2 is a second flowchart of a verification set loss curve modification method according to an embodiment of the present disclosure;
FIG. 3 is a third flowchart illustrating a verification set loss curve modification method according to an embodiment of the present disclosure;
FIG. 4 is a validation set loss graph provided by an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a verification set loss curve modification apparatus according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted contextually to mean "upon determining", "in response to determining", "upon detecting [the described condition or event]", or "in response to detecting [the described condition or event]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Fig. 1 is a schematic flowchart of a verification set loss curve correction method in an embodiment of the present application. The execution subject of the method may be a terminal device. As shown in fig. 1, the verification set loss curve correction method may include the following steps:
step S101, obtaining a verification loss subset in the current training process, wherein the verification loss subset comprises the total loss of each verification set sample.
In this embodiment, in order to describe the improvement of model performance during training more accurately, the terminal device needs to remove the interference that noisy verification set data causes on the verification set curve. The terminal device therefore performs loss screening by obtaining a verification loss subset that includes the total loss of each verification set sample, so that the overfitting point of the training process can be located accurately from the finally obtained verification set loss curve without changing the magnitude of the losses. In the medical field, there may be medical DR data for different target tasks, and this medical DR data constitutes the verification set samples; when multiple tasks are detected in a verification set sample, the total loss of that sample further includes the individual task losses, and the sum of the task losses equals the total loss of the verification set sample.
It can be understood that the training set loss shows a decreasing trend as training proceeds, so in a normal training process the verification set loss curve should also decrease as the training set loss decreases, until the training overfits. The verification set loss curve correction method above performs a corresponding correction at each training pass, so as to determine the final loss curve.
By way of specific example and not limitation, suppose the total losses of 10 verification set samples in the current training process are 0.1, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1 and 9.1; the verification loss subset is then V = [0.1, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1, 9.1]. If each verification set sample involves 4 tasks, the corresponding task loss sets may be, for example, Z0 = [0.03, 0.03, 0.03, 0.01], Z1 = [1.03, 0.03, 0.03, 0.01], Z2 = [2.03, 0.03, 0.03, 0.01], ..., Z9 = [9.03, 0.03, 0.03, 0.01], where the sum of the task losses of a sample equals that sample's total loss in V.
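As a minimal sketch of the relationship just described (not code from the patent; the helper name build_verification_loss_subset and the data layout are illustrative assumptions), the total loss of each verification set sample can be obtained by summing that sample's task losses:

```python
# Minimal sketch: assemble a verification loss subset V from per-sample task
# losses. Names and data layout are illustrative, not taken from the patent.
from typing import List

def build_verification_loss_subset(task_losses: List[List[float]]) -> List[float]:
    """task_losses[i] holds the task losses of verification set sample i;
    a sample's total loss is the sum of its task losses."""
    return [sum(losses) for losses in task_losses]

# Values mirroring the example in the text: 10 samples, 4 tasks each.
task_losses = [[k + 0.03, 0.03, 0.03, 0.01] for k in range(10)]
V = build_verification_loss_subset(task_losses)
print(V)  # approximately [0.1, 1.1, 2.1, ..., 9.1], up to floating-point rounding
```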
Step S102, adding the verification loss subsets to the first set, and deleting the verification loss subset with the earliest time in the first set if the number of the verification loss subsets in the first set exceeds a first preset number.
In this embodiment, in order to improve the timeliness of the medical DR data, the terminal device deletes the oldest verification loss subset in the first set when the number of verification loss subsets in the first set exceeds the first preset number, so that only the latest fixed number of verification loss subsets is processed. Because the losses are gradually reduced as each training pass is corrected, selecting the latest first-preset-number of verification loss subsets captures the stable performance of each sample and improves the accuracy of the subsequent screening of verification set samples.
By way of specific example and not limitation, if 5 training passes have been performed, the first preset number is 3, and there are 3 verification set samples, then the first set may be Q = [[0.1, 0.15, 0.2], [0.08, 0.14, 0.19], [0.06, 0.18, 0.11]], i.e. the first set contains the verification loss subsets of the three most recent training passes.
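A minimal sketch of how the first set of step S102 might be maintained, assuming a fixed-length deque is an acceptable realisation; the subset values reuse the second-set example given further below, and all names are illustrative:

```python
# Minimal sketch of step S102: a fixed-length deque keeps only the latest
# first-preset-number verification loss subsets; appending when full drops
# the oldest one automatically. Names are illustrative.
from collections import deque

first_preset_number = 3
first_set = deque(maxlen=first_preset_number)

for epoch_subset in ([0.1, 0.15, 0.2],
                     [0.1, 0.15, 0.2],
                     [0.1, 0.15, 0.2],
                     [0.08, 0.14, 0.19],
                     [0.06, 0.18, 0.11]):
    first_set.append(epoch_subset)  # oldest subset is discarded once full

print(list(first_set))
# [[0.1, 0.15, 0.2], [0.08, 0.14, 0.19], [0.06, 0.18, 0.11]]
```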
Step S103, normalizing each verification loss subset in the first set, and determining a target loss subset according to each verification loss subset after the normalization, where the target loss subset includes a target total loss of each verification set sample.
In this embodiment, the terminal device normalizes the total loss of each verification set sample in each verification loss subset of the first set, so that the total losses of the same verification set sample in different verification loss subsets become comparable. This makes it easy to combine the total losses of the same verification set sample across the different verification loss subsets and to determine a target loss subset, in which each target total loss is the value obtained by combining the total losses of the corresponding verification set sample across the different verification loss subsets.
In one embodiment, as shown in fig. 2, the normalizing each subset of verification losses in the first set in step S103 includes:
step S201, adding the verification loss subsets to a second set, and performing an average processing on the second set, where the second set includes all the verification loss subsets in the history training process.
Step S202, normalizing each verification loss subset in the first set according to the second set after the mean processing.
In this embodiment, the terminal device normalizes the latest certain number of verification loss subsets by using all the verification loss subsets obtained in the training process, so as to make the total loss of the same verification set sample in different verification loss subsets comparable.
By way of specific example and not limitation, if 5 training passes have been performed and there are 3 verification set samples, the second set may be L = [[0.1, 0.15, 0.2], [0.1, 0.15, 0.2], [0.1, 0.15, 0.2], [0.08, 0.14, 0.19], [0.06, 0.18, 0.11]], containing one verification loss subset per training pass.
In one embodiment, the mean processing on the second set in step S201 includes: the terminal device calculates, for each verification set sample in the second set, the sum of that sample's total losses over the verification loss subsets, and divides this sum by the number of training passes to obtain the loss mean corresponding to each verification set sample.
By way of specific example and not limitation, if 5 training passes have been performed and the second set is L = [[0.1, 0.15, 0.2], [0.1, 0.15, 0.2], [0.1, 0.15, 0.2], [0.08, 0.14, 0.19], [0.06, 0.18, 0.11]], then the set M of loss means corresponding to the verification set samples is M = [(0.1+0.1+0.1+0.08+0.06)/5, (0.15+0.15+0.15+0.14+0.18)/5, (0.2+0.2+0.2+0.19+0.11)/5] = [0.088, 0.154, 0.18].
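The mean processing just illustrated could be sketched as follows, assuming a numpy array layout for the second set; the names second_set and M are illustrative rather than from the patent:

```python
# Minimal sketch of the mean processing of step S201: the second set holds
# every verification loss subset seen so far, and M is the per-sample loss
# mean over all training passes. Illustrative names only.
import numpy as np

second_set = np.array([[0.10, 0.15, 0.20],
                       [0.10, 0.15, 0.20],
                       [0.10, 0.15, 0.20],
                       [0.08, 0.14, 0.19],
                       [0.06, 0.18, 0.11]])

M = second_set.mean(axis=0)  # loss mean of each verification set sample
print(M)  # approximately [0.088, 0.154, 0.18]
```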
In one embodiment, step S202 includes: the terminal device calculates, for each verification set sample, the difference between that sample's total loss in the verification loss subset and the loss mean corresponding to that sample, and then determines the ratio of each sample's difference to that sample's loss mean as the normalized verification loss subset. The normalized verification loss subset Q1 is calculated as:
Q1 = (Q - M) / M
where Q is a verification loss subset in the first set, M is the set of loss means, and the subtraction and division are applied element-wise.
By way of specific example and not limitation, if the first set is Q = [[0.1, 0.15, 0.2], [0.08, 0.14, 0.19], [0.06, 0.18, 0.11]] and the set of loss means for the 3 verification set samples is M = [0.088, 0.154, 0.18], then the normalized verification loss subsets are Q1 = [[(0.1-0.088)/0.088, (0.15-0.154)/0.154, (0.2-0.18)/0.18], [(0.08-0.088)/0.088, (0.14-0.154)/0.154, (0.19-0.18)/0.18], [(0.06-0.088)/0.088, (0.18-0.154)/0.154, (0.11-0.18)/0.18]] = [[0.13636364, -0.02597403, 0.11111111], [-0.09090909, -0.09090909, 0.05555556], [-0.31818182, 0.16883117, -0.38888889]].
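A minimal sketch of the normalization Q1 = (Q - M)/M on the same example values, assuming numpy broadcasting over the rows of the first set; names are illustrative:

```python
# Minimal sketch of step S202: normalize each verification loss subset in the
# first set against the per-sample loss means M. Illustrative names only.
import numpy as np

Q = np.array([[0.10, 0.15, 0.20],
              [0.08, 0.14, 0.19],
              [0.06, 0.18, 0.11]])   # first set (latest subsets)
M = np.array([0.088, 0.154, 0.18])   # per-sample loss means

Q1 = (Q - M) / M                     # element-wise, broadcast over the rows
print(np.round(Q1, 8))
# [[ 0.13636364 -0.02597403  0.11111111]
#  [-0.09090909 -0.09090909  0.05555556]
#  [-0.31818182  0.16883117 -0.38888889]]
```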
In one embodiment, as shown in fig. 3, the determining the target loss subset according to each verification loss subset after the normalization process in step S103 includes:
step S301, a total sum of total losses of the same verification set sample in each verification loss subset after the normalization process is calculated.
Step S302, calculating the average value of the total losses of the same verification set sample in each verification loss subset according to the number of the verification loss subsets, and obtaining the target total loss of each verification set sample.
And step S303, determining a target loss subset according to the target total loss of each verification set sample.
By way of specific example and not limitation, if the normalized verification loss subsets are Q1 = [[0.13636364, -0.02597403, 0.11111111], [-0.09090909, -0.09090909, 0.05555556], [-0.31818182, 0.16883117, -0.38888889]], then the target loss subset is Q2 = [(0.136 - 0.091 - 0.318)/3, (-0.026 - 0.091 + 0.169)/3, (0.111 + 0.056 - 0.389)/3] = [-0.091, 0.0173, -0.074].
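The column-wise averaging of steps S301 to S303 on the same example values might look as follows; this is a sketch with illustrative names, not the patent's implementation:

```python
# Minimal sketch of steps S301-S303: the target total loss of each sample is
# the mean of that sample's normalized losses over the subsets in the first
# set. Illustrative names only.
import numpy as np

Q1 = np.array([[ 0.13636364, -0.02597403,  0.11111111],
               [-0.09090909, -0.09090909,  0.05555556],
               [-0.31818182,  0.16883117, -0.38888889]])

Q2 = Q1.mean(axis=0)    # target loss subset: one target total loss per sample
print(np.round(Q2, 4))  # approximately [-0.0909, 0.0173, -0.0741]
```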
And step S104, selecting a second preset number of verification set samples from the target loss subset, and drawing a loss curve according to the average values of the total losses corresponding to the second preset number of verification set samples.
In this embodiment, the terminal device draws the loss curve from the verification set samples selected from the target loss subset. This denoises the verification set data, yields a smoother loss curve with a more accurate inflection point, and makes it possible to find the optimal trained model more accurately.
In one embodiment, the step S104 of selecting a second preset number of verification set samples from the target loss subset includes: and the terminal equipment sorts the verification set samples in the target loss subset from small to large according to the target total loss, and selects a second preset number of verification set samples sorted in the front from the sorted verification set samples.
By way of specific example and not limitation, if the target loss subset is Q2 = [-0.091, 0.0173, -0.074, -0.167, -0.167, 0.0185, 0.038, -0.237, -0.203, -0.290], then sorting the samples by target total loss from small to large gives the order 9, 7, 8, 3, 4, 0, 2, 1, 5, 6. If the second preset number is 8, the selected verification set samples are the 8 samples at positions 9, 7, 8, 3, 4, 0, 2 and 1. If the current verification loss subset is V = [0.1, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1, 9.1], the total losses of the finally selected samples in the current training process are V1 = [0.1, 1.1, 2.1, 3.1, 4.1, 7.1, 8.1, 9.1]. If each verification set sample involves 4 tasks, the task loss sets of the selected samples may be, for example, Z10 = [0.03, 0.03, 0.03, 0.01], Z11 = [1.03, 0.03, 0.03, 0.01], Z12 = [2.03, 0.03, 0.03, 0.01], Z13 = [3.03, 0.03, 0.03, 0.01], Z14 = [4.03, 0.03, 0.03, 0.01], Z17 = [7.03, 0.03, 0.03, 0.01], Z18 = [8.03, 0.03, 0.03, 0.01] and Z19 = [9.03, 0.03, 0.03, 0.01], while the task losses of the unselected samples 5 and 6 are discarded.
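A minimal sketch of this selection, assuming numpy and the example values above; a stable argsort reproduces the stated ordering, and all names are illustrative:

```python
# Minimal sketch of the sample selection in step S104: sort by target total
# loss ascending, keep the first second-preset-number samples, and gather
# their current total losses. Illustrative names only.
import numpy as np

Q2 = np.array([-0.091, 0.0173, -0.074, -0.167, -0.167,
                0.0185, 0.038, -0.237, -0.203, -0.290])            # target losses
V  = np.array([0.1, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1, 9.1])  # current losses

second_preset_number = 8
order = np.argsort(Q2, kind="stable")        # [9 7 8 3 4 0 2 1 5 6]
selected = np.sort(order[:second_preset_number])
V1 = V[selected]                             # total losses of the kept samples
print(selected)  # [0 1 2 3 4 7 8 9]
print(V1)        # [0.1 1.1 2.1 3.1 4.1 7.1 8.1 9.1]
```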
In one embodiment, drawing a loss curve according to the mean of the total losses corresponding to the second preset number of verification set samples in step S104 includes: taking the mean of the total losses as the total loss value of the current training pass, and drawing a loss curve from the total loss values of the historical training passes together with the total loss value of the current training pass, as shown in fig. 4. Fig. 4 is a verification set loss graph in which the horizontal axis is the number of training passes and the vertical axis is the loss value. Each point on the curve loss is the training set loss of the corresponding pass; each point on the curve val_loss is the uncorrected verification set total loss of the corresponding pass; each point on test_loss is the loss on an independent-source test set for the corresponding pass; each point on fixed_val_loss0.8 is the corrected verification set loss of the corresponding pass, where the correction uses a second preset number of verification set samples equal to 80% of the total number of samples in the target loss subset; and each point on fixed_val_loss0.7 is the corrected verification set loss where the second preset number equals 70% of the total number of samples in the target loss subset.
By way of specific example and not limitation, if the current verification loss subset is V = [0.1, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1, 7.1, 8.1, 9.1] and the total losses of the finally selected samples in the current training process are V1 = [0.1, 1.1, 2.1, 3.1, 4.1, 7.1, 8.1, 9.1], then the total loss value of the current training pass is the mean of V1, i.e. (0.1 + 1.1 + 2.1 + 3.1 + 4.1 + 7.1 + 8.1 + 9.1)/8 = 4.35. If each verification set sample involves 4 tasks, with task loss sets such as Z10 = [0.03, 0.03, 0.03, 0.01], Z11 = [1.03, 0.03, 0.03, 0.01], ..., Z19 = [9.03, 0.03, 0.03, 0.01] for the selected samples, the loss value of each individual task in the current training pass can likewise be obtained by averaging that task's losses over the selected samples.
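A hedged sketch of how the corrected total loss value could be computed and added to the plotted curve, assuming matplotlib is available; the earlier history values in the list are invented purely for illustration:

```python
# Minimal sketch of the curve drawing in step S104. The history values for
# earlier passes are made-up placeholders; names are illustrative.
import matplotlib.pyplot as plt
import numpy as np

V1 = np.array([0.1, 1.1, 2.1, 3.1, 4.1, 7.1, 8.1, 9.1])  # selected samples' losses
corrected_loss = float(V1.mean())   # total loss value of the current pass
print(corrected_loss)               # 4.35

fixed_val_loss = [5.2, 4.9, 4.6, 4.45]   # earlier passes (placeholder values)
fixed_val_loss.append(corrected_loss)     # append the current pass

plt.plot(range(1, len(fixed_val_loss) + 1), fixed_val_loss, label="fixed_val_loss")
plt.xlabel("training pass")
plt.ylabel("loss")
plt.legend()
plt.savefig("fixed_val_loss.png")
```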
In the embodiments of the application, a verification loss subset is obtained in the current training process, and the verification loss subset includes the total loss of each verification set sample. The verification loss subset is added to the first set, and if the number of verification loss subsets in the first set exceeds a first preset number, the oldest verification loss subset in the first set is deleted, so that the stable performance of the samples is captured and subsequent sample screening is made easier. Each verification loss subset in the first set is then normalized, which improves data accuracy, and a target loss subset is determined from the normalized subsets. A second preset number of verification set samples are selected from the target loss subset, so that samples conforming to the training set distribution are chosen, and a loss curve is drawn from the mean of the total losses corresponding to those samples, thereby improving the accuracy of the verification set loss curve.
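Pulling the preceding sketches together, the following end-to-end helper shows one possible way to wire up the described correction per training pass. It is only a sketch under the stated assumptions (numpy available, one verification loss subset per pass), and every name is illustrative rather than taken from the patent:

```python
# End-to-end sketch of the described correction; illustrative names only.
from collections import deque
import numpy as np

class ValLossCorrector:
    def __init__(self, first_preset_number: int, second_preset_number: int):
        self.first_set = deque(maxlen=first_preset_number)  # latest subsets only
        self.second_set = []                                 # every subset so far
        self.second_preset_number = second_preset_number
        self.history = []                                    # corrected loss per pass

    def update(self, verification_loss_subset) -> float:
        V = np.asarray(verification_loss_subset, dtype=float)
        self.first_set.append(V)                 # step S102: oldest dropped if full
        self.second_set.append(V)                # step S201: keep the full history
        M = np.mean(self.second_set, axis=0)     # loss mean per sample
        Q1 = (np.array(self.first_set) - M) / M  # step S202: normalization
        Q2 = Q1.mean(axis=0)                     # steps S301-S303: target losses
        keep = np.argsort(Q2, kind="stable")[: self.second_preset_number]
        corrected = float(V[keep].mean())        # step S104: corrected loss value
        self.history.append(corrected)
        return corrected

# Usage on small made-up subsets (3 verification set samples, keep 2 of them):
corrector = ValLossCorrector(first_preset_number=3, second_preset_number=2)
for subset in ([0.1, 0.15, 0.2], [0.08, 0.14, 0.19], [0.06, 0.18, 0.11]):
    print(round(corrector.update(subset), 4))
```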
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Corresponding to the verification set loss curve modification method described above, fig. 5 is a schematic structural diagram of a verification set loss curve modification apparatus in an embodiment of the present application, and as shown in fig. 5, the verification set loss curve modification apparatus may include:
an obtaining module 501, configured to obtain a verification loss subset in a current training process, where the verification loss subset includes total losses of each verification set sample.
An adding module 502 is configured to add the verification loss subsets to the first set, and delete the verification loss subset with the earliest time in the first set if the number of the verification loss subsets in the first set exceeds a first preset number.
The normalization processing module 503 is configured to perform normalization processing on each verification loss subset in the first set, and determine a target loss subset according to each verification loss subset after the normalization processing, where the target loss subset includes a target total loss of each verification set sample.
The selecting module 504 is configured to select a second preset number of verification set samples from the target loss subset, and draw a loss curve according to the average values of the total losses corresponding to the second preset number of verification set samples.
In one embodiment, the normalization processing module 503 may include:
and the adding unit is used for adding the verification loss subsets into a second set and carrying out average processing on the second set, wherein the second set comprises all the verification loss subsets in the historical training process.
And the normalization processing unit is used for normalizing each verification loss subset in the first set according to the second set after the mean processing.
In one embodiment, the adding unit may include:
and the sum calculating subunit is used for calculating the sum of the total losses of the verification set samples in the second set in the verification loss subsets respectively.
And the mean value calculating subunit is used for calculating the mean value of the sum corresponding to each verification set sample according to the training times to obtain the loss mean value corresponding to each verification set sample.
In one embodiment, the normalization processing unit may include:
and the difference value calculating subunit is used for calculating the difference value between the total loss of each verification set sample in the verification loss subset and the loss mean value corresponding to each verification set sample.
And the ratio calculating subunit is used for determining the ratio between the difference value corresponding to each verification set sample and the loss mean value corresponding to each verification set sample as the normalized verification loss subset.
In one embodiment, the normalization processing module 503 may include:
and the sum calculating unit is used for calculating the sum of the total loss of the same verification set sample in each verification loss subset after the normalization processing.
And the mean value calculating unit is used for calculating the mean value of the total loss of the same verification set sample in each verification loss subset according to the number of the verification loss subsets to obtain the target total loss of each verification set sample.
And the subset determining unit is used for determining a target loss subset according to the target total loss of each verification set sample.
In one embodiment, the selecting module 504 may include:
and the sorting unit is used for sorting the verification set samples in the target loss subset from small to large according to the target total loss, and selecting a second preset number of verification set samples sorted in the front from the sorted verification set samples.
In one embodiment, the selecting module 504 may include:
and the total loss value determining unit is used for taking the average value of the total loss as the total loss value in the current training process.
And the curve drawing unit is used for drawing a loss curve according to all the total loss values in the historical training process and the total loss values in the current training process.
In the embodiments of the application, a verification loss subset is obtained in the current training process, and the verification loss subset includes the total loss of each verification set sample. The verification loss subset is added to the first set, and if the number of verification loss subsets in the first set exceeds a first preset number, the oldest verification loss subset in the first set is deleted, so that the stable performance of the samples is captured and subsequent sample screening is made easier. Each verification loss subset in the first set is then normalized, which improves data accuracy, and a target loss subset is determined from the normalized subsets. A second preset number of verification set samples are selected from the target loss subset, so that samples conforming to the training set distribution are chosen, and a loss curve is drawn from the mean of the total losses corresponding to those samples, thereby improving the accuracy of the verification set loss curve.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the apparatus and the module described above may refer to corresponding processes in the foregoing system embodiments and method embodiments, and are not described herein again.
Fig. 6 is a schematic structural diagram of a terminal device according to an embodiment of the present application. For convenience of explanation, only portions related to the embodiments of the present application are shown.
As shown in fig. 6, the terminal device 6 of this embodiment includes: at least one processor 600 (only one is shown in fig. 6), a memory 601 coupled to the processor 600, and a computer program 602, such as a verification set loss curve correction program, stored in the memory 601 and executable on the at least one processor 600. The processor 600 executes the computer program 602 to implement the steps in the verification set loss curve correction method embodiments, such as steps S101 to S104 shown in fig. 1. Alternatively, the processor 600 executes the computer program 602 to implement the functions of the modules in the apparatus embodiments, such as modules 501 to 504 shown in fig. 5.
Illustratively, the computer program 602 may be divided into one or more modules, and the one or more modules are stored in the memory 601 and executed by the processor 600 to complete the present application. The one or more modules may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 602 in the terminal device 6. For example, the computer program 602 may be divided into an obtaining module 501, an adding module 502, a normalizing module 503, and a selecting module 504, and the specific functions of the modules are as follows:
an obtaining module 501, configured to obtain a verification loss subset in a current training process, where the verification loss subset includes total loss of each verification set sample;
an adding module 502, configured to add the verification loss subsets to the first set, and delete the verification loss subset with the earliest time in the first set if the number of the verification loss subsets in the first set exceeds a first preset number;
a normalization processing module 503, configured to perform normalization processing on each verification loss subset in the first set, and determine a target loss subset according to each verification loss subset after the normalization processing, where the target loss subset includes a target total loss of each verification set sample;
the selecting module 504 is configured to select a second preset number of verification set samples from the target loss subset, and draw a loss curve according to the average values of the total losses corresponding to the second preset number of verification set samples.
The terminal device 6 may include, but is not limited to, a processor 600 and a memory 601. Those skilled in the art will appreciate that fig. 6 is merely an example of the terminal device 6 and does not constitute a limitation on it; the terminal device may include more or fewer components than those shown, combine certain components, or have different components, such as an input/output device, a network access device, or a bus.
The processor 600 may be a Central Processing Unit (CPU), or another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The storage 601 may be an internal storage unit of the terminal device 6, such as a hard disk or a memory of the terminal device 6 in some embodiments. In other embodiments, the memory 601 may also be an external storage device of the terminal device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like provided on the terminal device 6. Further, the memory 601 may include both an internal storage unit and an external storage device of the terminal device 6. The memory 601 is used for storing an operating system, an application program, a Boot Loader (Boot Loader), data, and other programs, such as program codes of the computer programs. The memory 601 described above may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned functions may be distributed as different functional units and modules according to needs, that is, the internal structure of the apparatus may be divided into different functional units or modules to implement all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the above modules or units is only one logical function division, and there may be other division manners in actual implementation, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments described above. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include at least: any entity or device capable of carrying the computer program code to a photographing apparatus/terminal device, a recording medium, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example a USB flash disk, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, computer-readable media may not be an electrical carrier signal or a telecommunications signal in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method for correcting a verification set loss curve is characterized by comprising the following steps:
obtaining a verification loss subset in the current training process, wherein the verification loss subset comprises the total loss of each verification set sample;
adding the verification loss subsets into the first set, and deleting the verification loss subset with the earliest time in the first set if the number of the verification loss subsets in the first set exceeds a first preset number;
normalizing each verification loss subset in the first set, and determining a target loss subset according to each verification loss subset after the normalization, wherein the target loss subset comprises a target total loss of each verification set sample;
and selecting a second preset number of verification set samples from the target loss subset, and drawing a loss curve according to the average values of the total losses corresponding to the second preset number of verification set samples.
2. The validation set loss curve modification method of claim 1, wherein the normalizing each subset of validation losses in the first set comprises:
adding the verification loss subsets into a second set, and carrying out mean processing on the second set, wherein the second set comprises all verification loss subsets in a historical training process;
and normalizing each verification loss subset in the first set according to the second set after the mean processing.
3. The validation set loss curve modification method of claim 2, wherein the averaging the second set comprises:
calculating the sum of the total loss of each verification set sample in the second set in each verification loss subset;
and calculating the average value of the sum corresponding to each verification set sample according to the training times to obtain the loss average value corresponding to each verification set sample.
4. The validation set loss curve modification method of claim 3, wherein the normalizing each validation loss subset of the first set according to the second set after the averaging comprises:
calculating the difference value between the total loss of each verification set sample in the verification loss subset and the loss mean value corresponding to each verification set sample;
and determining the ratio of the difference value corresponding to each verification set sample to the loss average value corresponding to each verification set sample as a verification loss subset after the standardization processing.
5. The validation set loss curve modification method of claim 1, wherein the determining a target loss subset from each of the validation loss subsets after the normalization process comprises:
calculating the sum of the total loss of the same verification set sample in each verification loss subset after the normalization processing;
calculating the average value of the total losses of the same verification set sample in each verification loss subset according to the number of the verification loss subsets to obtain the target total loss of each verification set sample;
determining the target loss subset according to the target total loss of each validation set sample.
6. The validation set loss curve modification method of claim 1, wherein selecting a second predetermined number of validation set samples from the target loss subset comprises:
and sorting the verification set samples in the target loss subset from small to large according to the target total loss, and selecting a second preset number of verification set samples sorted in the front from the sorted verification set samples.
7. The validation set loss curve modification method of claim 1, wherein the plotting the loss curve according to the mean of the total losses corresponding to the second preset number of validation set samples comprises:
taking the average value of the total loss as a total loss value in the current training process;
and drawing a loss curve according to all the total loss values in the historical training process and the total loss values in the current training process.
8. A validation set loss curve modification apparatus, comprising:
an obtaining module, configured to obtain a verification loss subset in the current training process, wherein the verification loss subset comprises the total loss of each verification set sample;
an adding module, configured to add the verification loss subsets to a first set, and delete a verification loss subset with an earliest time in the first set if the number of the verification loss subsets in the first set exceeds a first preset number;
a normalization processing module, configured to perform normalization processing on each verification loss subset in the first set, and determine a target loss subset according to each verification loss subset after the normalization processing, where the target loss subset includes a target total loss of each verification set sample;
and the selecting module is used for selecting a second preset number of verification set samples from the target loss subset and drawing a loss curve according to the average value of the total loss corresponding to the second preset number of verification set samples.
9. A terminal device comprising a memory, a processor and a computer program stored in said memory and executable on said processor, characterized in that said processor when executing said computer program implements the steps of a validation set loss curve modification method according to any of claims 1 to 7.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of a method of validation set loss curve modification according to any one of claims 1 to 7.
CN202110545055.0A 2021-05-19 2021-05-19 Verification set loss curve correction method and device, terminal device and storage medium Pending CN113240019A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110545055.0A CN113240019A (en) 2021-05-19 2021-05-19 Verification set loss curve correction method and device, terminal device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110545055.0A CN113240019A (en) 2021-05-19 2021-05-19 Verification set loss curve correction method and device, terminal device and storage medium

Publications (1)

Publication Number Publication Date
CN113240019A true CN113240019A (en) 2021-08-10

Family

ID=77137623

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110545055.0A Pending CN113240019A (en) 2021-05-19 2021-05-19 Verification set loss curve correction method and device, terminal device and storage medium

Country Status (1)

Country Link
CN (1) CN113240019A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060277033A1 (en) * 2005-06-01 2006-12-07 Microsoft Corporation Discriminative training for language modeling
CN111079785A (en) * 2019-11-11 2020-04-28 深圳云天励飞技术有限公司 Image identification method and device and terminal equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈军 (CHEN Jun) et al.: "Driver distracted driving behavior detection based on cascaded convolutional neural networks", 《科学技术与工程》 (Science Technology and Engineering), vol. 20, no. 14, 28 June 2020 (2020-06-28), pages 5702-5708 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination