CN111340174A - Adjusting method, device, equipment and medium of cost sensitive neural network - Google Patents


Info

Publication number
CN111340174A
Authority
CN
China
Prior art keywords
cost
classification
classification cost
neural network
adjusting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010107273.1A
Other languages
Chinese (zh)
Inventor
王楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Heilongjiang University
Original Assignee
Heilongjiang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Heilongjiang University filed Critical Heilongjiang University
Priority to CN202010107273.1A priority Critical patent/CN111340174A/en
Publication of CN111340174A publication Critical patent/CN111340174A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0611 Request for offers or quotes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G06Q30/0625 Directed, with specific intent or strategy

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention discloses a method, a device, equipment and a medium for adjusting a cost sensitive neural network. The method comprises the following steps: configuring a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction; configuring a second cost matrix in a discriminant function of the cost sensitive neural network; and adjusting the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1, wherein the samples belonging to the minority class in the unbalanced data sample set are positive samples. The adjusting method, device, equipment and medium can bring the recall rate of the samples belonging to the minority class in the unbalanced data sample set to 1 and improve the accuracy of predicting minority class samples in the unbalanced data sample set.

Description

Adjusting method, device, equipment and medium of cost sensitive neural network
Technical Field
The invention relates to the technical field of computers, in particular to a method, a device, equipment and a medium for adjusting a cost sensitive neural network.
Background
Flight prices change dynamically with market sales: when a flight sells well, the airline or the seller raises the price; otherwise, it lowers the price. In practical applications, the samples used for flight price prediction are unbalanced: samples in which the flight price is unchanged account for about 90% of all samples, and samples in which the flight price changes (increases or decreases) account for about 10%. For the prediction to be of real value, the price-change samples, namely the minority class samples, must be predicted accurately; with the minority class samples taken as positive samples, this means that the recall rate of the minority class samples must reach 1.
However, a conventional neural network aims to minimize the overall error rate and cannot bring the recall rate of the minority class samples to 1.
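The gap between a low overall error rate and a useful recall can be seen in a minimal sketch. It assumes a hypothetical set of 100 samples with the 90%/10% split described above and a trivial classifier, not taken from the source, that always predicts "price unchanged":

```python
# Hypothetical data: 90 "price unchanged" samples (label 0, majority class)
# and 10 "price changed" samples (label 1, minority class, positive).
labels = [0] * 90 + [1] * 10

# Illustrative classifier: always predict the majority class.
predictions = [0] * len(labels)

correct = sum(1 for y, p in zip(labels, predictions) if y == p)
true_pos = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
false_neg = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)

accuracy = correct / len(labels)            # share of all samples classified correctly
recall = true_pos / (true_pos + false_neg)  # share of positive samples that were found

print(f"accuracy={accuracy}, recall={recall}")
```

The classifier is right on 90 of 100 samples, yet it finds none of the price-change samples, so its recall on the minority class is 0.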
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a medium for adjusting a cost sensitive neural network, which can bring the recall rate of the minority class samples to 1 and improve the accuracy of predicting minority class samples in an unbalanced data sample set.
In a first aspect, an embodiment of the present invention provides a method for adjusting a cost sensitive neural network, including:
configuring a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction;
configuring a second cost matrix in a discriminant function of the cost sensitive neural network;
adjusting the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1; the samples in the unbalanced data sample set belonging to the minority class are positive samples, and the unbalanced data comprise unbalanced flight data or other unbalanced data except the flight data.
In one possible implementation manner of the embodiment of the present invention, configuring a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction may include:
configuring a first classification cost and a first classification cost threshold of the first cost matrix in the loss function.
In one possible implementation manner of the embodiment of the present invention, configuring the second cost matrix in the discriminant function of the cost sensitive neural network may include:
configuring a second classification cost and a second classification cost threshold of the second cost matrix in the discriminant function.
In one possible implementation manner of the embodiment of the present invention, adjusting the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1 may include:
adjusting the first classification cost, the first classification cost threshold, the second classification cost and the second classification cost threshold until the recall rate of the positive samples in the unbalanced data sample set is 1.
In a possible implementation manner of the embodiment of the present invention, adjusting the first classification cost, the first classification cost threshold, the second classification cost, and the second classification cost threshold until the recall rate of the positive samples in the unbalanced data sample set is 1 may include:
adjusting the first classification cost, the first classification cost threshold, the second classification cost and the second classification cost threshold based on a test set comprising the samples that are classified incorrectly by the cost sensitive neural network, until the recall rate of the positive samples in the unbalanced data sample set is 1.
In a possible implementation manner of the embodiment of the present invention, adjusting the first classification cost, the first classification cost threshold, the second classification cost, and the second classification cost threshold based on the test set including the sample that is classified incorrectly by the cost-sensitive neural network until the recall rate of the positive sample in the unbalanced data sample set is 1 may include:
detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost;
if the recall rate of the positive samples in the test set is not 1, adjusting the first classification cost; judging whether the adjusted first classification cost is larger than a first classification cost threshold value or not; if the adjusted first classification cost is not greater than the first classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the current second classification cost, and continuously adjusting the first classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted first classification cost is greater than the first classification cost threshold;
if the adjusted first classification cost is larger than the first classification cost threshold, adjusting the second classification cost, and judging whether the adjusted second classification cost is larger than the second classification cost threshold; if the adjusted second classification cost is not greater than the second classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost, and continuously adjusting the second classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted second classification cost is greater than the second classification cost threshold;
and if the adjusted second classification cost is larger than the second classification cost threshold value, returning to the step of adjusting the first classification cost to continue executing.
In a possible implementation manner of the embodiment of the present invention, the first classification cost and the second classification cost may be adjusted by a preset cost adjustment value, and the first classification cost threshold and the second classification cost threshold may be adjusted by a preset threshold adjustment value.
In a possible implementation manner of the embodiment of the present invention, the method for adjusting a cost sensitive neural network provided in the embodiment of the present invention may further include:
extracting samples which are wrongly classified by the cost sensitive neural network from the unbalanced data sample set;
based on the extracted samples, a test set is generated.
In a second aspect, an embodiment of the present invention provides an adjusting apparatus for a cost sensitive neural network, including:
a first configuration module, configured to configure a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction;
the second configuration module is used for configuring a second cost matrix in the discriminant function of the cost sensitive neural network;
the adjusting module is used for adjusting the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1; the samples in the unbalanced data sample set belonging to the minority class are positive samples, and the unbalanced data comprise unbalanced flight data or other unbalanced data except the flight data.
In a possible implementation manner of the embodiment of the present invention, the first configuration module may be specifically configured to:
configuring a first classification cost and a first classification cost threshold of the first cost matrix in the loss function.
In a possible implementation manner of the embodiment of the present invention, the second configuration module may be specifically configured to:
configuring a second classification cost and a second classification cost threshold of the second cost matrix in the discriminant function.
In a possible implementation manner of the embodiment of the present invention, the adjusting module may be specifically configured to:
and adjusting the first classification cost, the first classification cost threshold, the second classification cost and the second classification cost threshold until the recall rate of the positive samples in the unbalanced data sample set is 1.
In a possible implementation manner of the embodiment of the present invention, the adjusting module may be specifically configured to:
adjusting the first classification cost, the first classification cost threshold, the second classification cost and the second classification cost threshold based on a test set comprising the samples that are classified incorrectly by the cost sensitive neural network, until the recall rate of the positive samples in the unbalanced data sample set is 1.
In a possible implementation manner of the embodiment of the present invention, the adjusting module may be specifically configured to:
detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost;
if the recall rate of the positive samples in the test set is not 1, adjusting the first classification cost; judging whether the adjusted first classification cost is larger than a first classification cost threshold value or not; if the adjusted first classification cost is not greater than the first classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the current second classification cost, and continuously adjusting the first classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted first classification cost is greater than the first classification cost threshold;
if the adjusted first classification cost is larger than the first classification cost threshold, adjusting the second classification cost, and judging whether the adjusted second classification cost is larger than the second classification cost threshold; if the adjusted second classification cost is not greater than the second classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost, and continuously adjusting the second classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted second classification cost is greater than the second classification cost threshold;
and if the adjusted second classification cost is larger than the second classification cost threshold value, returning to the step of adjusting the first classification cost to continue executing.
In a possible implementation manner of the embodiment of the present invention, the first classification cost and the second classification cost may be adjusted by a preset cost adjustment value, and the first classification cost threshold and the second classification cost threshold may be adjusted by a preset threshold adjustment value.
In a possible implementation manner of the embodiment of the present invention, the adjusting apparatus of a cost sensitive neural network provided in the embodiment of the present invention may further include:
the extraction module is used for extracting samples which are classified wrongly by the cost sensitive neural network from the unbalanced data sample set;
and the generating module is used for generating a test set based on the extracted samples.
In a third aspect, an embodiment of the present invention provides a device for adjusting a cost sensitive neural network, including: a memory, a processor, and a computer program stored on the memory and executable on the processor;
the processor, when executing the computer program, implements the method for adjusting a cost sensitive neural network according to the first aspect of the embodiments of the present invention or any possible implementation manner of the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the method for adjusting a cost sensitive neural network according to the first aspect of the present invention or any possible implementation manner of the first aspect of the present invention is implemented.
The adjusting method, the adjusting device, the adjusting equipment and the adjusting medium of the cost sensitive neural network can enable the recall rate of samples belonging to a minority class in the unbalanced data sample set to reach 1, and improve the accuracy rate of sample prediction of the minority class in the unbalanced data sample set.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a method for adjusting a cost sensitive neural network according to an embodiment of the present invention;
Fig. 2 is a schematic flow chart of another adjusting method of a cost sensitive neural network according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an adjusting apparatus of a cost sensitive neural network according to an embodiment of the present invention;
Fig. 4 is a block diagram of a hardware architecture of a computing device according to an embodiment of the present invention.
Detailed Description
Features and exemplary embodiments of various aspects of the present invention will be described in detail below, and in order to make objects, technical solutions and advantages of the present invention more apparent, the present invention will be further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention. It will be apparent to one skilled in the art that the present invention may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present invention by illustrating examples of the present invention.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
In order to solve the problem of the prior art, embodiments of the present invention provide a method, an apparatus, a device, and a medium for adjusting a cost sensitive neural network. First, a detailed description is given to the adjusting method of the cost sensitive neural network according to the embodiment of the present invention.
Fig. 1 is a schematic flow chart of a method for adjusting a cost sensitive neural network according to an embodiment of the present invention. The adjusting method of the cost sensitive neural network can comprise the following steps:
S101: configuring a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction.
S102: configuring a second cost matrix in the discriminant function of the cost sensitive neural network.
S103: adjusting the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1.
The unbalanced data sample set comprises samples belonging to a majority class and samples belonging to a minority class, wherein the samples belonging to the minority class are positive samples, and the samples belonging to the majority class are negative samples.
The unbalanced data may include unbalanced flight data or other unbalanced data in addition to flight data. Examples of other unbalanced data include unbalanced medical diagnosis data, unbalanced network recommendation data, and unbalanced network intrusion detection data.
The following description will take the unbalanced data as the unbalanced flight data as an example.
In one embodiment of the invention, the flight data in the embodiment of the invention may be flight price data.
The loss function of a general neural network is expressed as $E = \sum (y_i - o_i)^2$, where $y_i$ is the desired output and $o_i$ is the actual output.
In the embodiment of the invention, after the first cost matrix is configured in the loss function of the cost sensitive neural network for flight data prediction, the loss function is $E = \sum ((y_i - o_i) \cdot CS)^2$, where CS is the first cost matrix.
Figure BDA0002388809070000071
Here, $a_i$ and $a_j$ are the classification costs in the first cost matrix CS, $i$ is the target class, and $j$ is the actual output class.
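The difference between the two loss functions can be sketched as follows. The entries of CS are given only in an equation image, so this sketch assumes that the cost matrix contributes one scalar factor per output unit; the function names, the `costs` argument, and all numbers are hypothetical.

```python
def sse_loss(targets, outputs):
    """Plain sum-of-squares loss: E = sum((y_i - o_i)^2)."""
    return sum((y - o) ** 2 for y, o in zip(targets, outputs))


def cost_sensitive_sse_loss(targets, outputs, costs):
    """Cost-weighted loss: E = sum(((y_i - o_i) * c_i)^2).

    Assumption: costs[i] stands in for the factor that the first cost
    matrix CS applies to the error of output unit i.
    """
    return sum(((y - o) * c) ** 2 for y, o, c in zip(targets, outputs, costs))


# Hypothetical two-class case: the target is class 0 (taken here as the
# minority class) and the network is undecided between the two classes.
targets = [1.0, 0.0]
outputs = [0.5, 0.5]
costs = [9.0, 1.0]  # illustrative: errors on the minority-class unit weigh more

plain = sse_loss(targets, outputs)
weighted = cost_sensitive_sse_loss(targets, outputs, costs)
print(plain, weighted)
```

With these numbers the same prediction error is penalized far more heavily once the cost factor is applied, which is what pushes training toward the minority class.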
For a general neural network, the normalized maximum actual output with which the discriminant function assigns sample $t$ to the $i$-th class is denoted $P_t(i)$.
Figure BDA0002388809070000072
Where p (i) is the prior probability that the sample belongs to the ith class, and p (j) is the prior probability that the sample belongs to the jth class.
According to the embodiment of the invention, after the second cost matrix is configured in the discriminant function, the normalized maximum actual output with which the discriminant function assigns sample $t$ to the $i$-th class is denoted $P'_t(i)$.
Figure BDA0002388809070000073
Here, $b_i$ and $b_j$ are the classification costs in the second cost matrix CP.
Second cost matrix
Figure BDA0002388809070000074
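The formulas for $P_t(i)$ and $P'_t(i)$ survive only as equation images, so their exact form is not available here. The sketch below is therefore not the patent's formula; it assumes one common form of cost-weighted rescaling, in which each class output is multiplied by its classification cost and the results are normalized to sum to 1. All names and numbers are hypothetical.

```python
def normalize(scores):
    """Scale non-negative scores so that they sum to 1."""
    total = sum(scores)
    return [s / total for s in scores]


def cost_weighted_scores(outputs, costs):
    """Assumed rescaling: multiply each class output by its cost, then normalize."""
    return normalize([o * c for o, c in zip(outputs, costs)])


def predict(scores):
    """Return the index of the class with the largest score."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical network outputs for one sample: class 0 = price unchanged
# (majority), class 1 = price changed (minority, positive).
outputs = [0.6, 0.4]
costs = [1.0, 9.0]  # illustrative stand-ins for b_j and b_i

plain_scores = normalize(outputs)
weighted_scores = cost_weighted_scores(outputs, costs)
print(predict(plain_scores), predict(weighted_scores))
```

Under this assumption, raising the cost attached to the minority class moves borderline samples from the majority class to the minority class at decision time, without retraining.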
After the first cost matrix CS and the second cost matrix CP are configured, they may be adjusted until the recall rate of the positive samples in the unbalanced data sample set is 1.
With the embodiment of the invention, the recall rate of the samples belonging to the minority class in the unbalanced data sample set can reach 1, and the accuracy of predicting minority class samples in the unbalanced data sample set is improved.
In a possible implementation manner of the embodiment of the present invention, a first classification cost and a first classification cost threshold of a first cost matrix may be configured in a loss function; and configuring a second classification cost and a second classification cost threshold value of the second cost matrix in the discriminant function.
The first cost matrix CS and the second cost matrix CP each include two classification costs. The classification costs $a_i$ and $a_j$ of the first cost matrix CS and the classification costs $b_i$ and $b_j$ of the second cost matrix CP are configured respectively. The classification cost $a_i$ is taken as the first classification cost, and the classification cost $b_i$ is taken as the second classification cost. A first classification cost threshold of the first cost matrix CS and a second classification cost threshold of the second cost matrix CP are also configured.
In one possible implementation of the embodiment of the present invention, the first classification cost $a_i$, the first classification cost threshold, the second classification cost $b_i$, and the second classification cost threshold may be adjusted until the recall rate of the positive samples in the unbalanced data sample set is 1.
In one possible implementation of the embodiment of the present invention, the classification costs $a_j$ and $b_j$ may be fixed while the first classification cost $a_i$ and the second classification cost $b_i$ are adjusted.
In a possible implementation manner of the embodiment of the present invention, the first classification cost, the first classification cost threshold, the second classification cost, and the second classification cost threshold may be adjusted based on a test set including the samples that are classified incorrectly by the cost sensitive neural network, until the recall rate of the positive samples in the unbalanced data sample set is 1.
In one possible implementation manner of the embodiment of the invention, samples which are wrongly classified by the cost sensitive neural network can be extracted from the unbalanced data sample set; based on the extracted samples, a test set is generated.
For example, assume that the unbalanced flight price data sample set includes 10 samples, sample 1 to sample 10. Samples 1 to 7 belong to the flight price invariant class, i.e., the majority class, and samples 8 to 10 belong to the flight price variant class, i.e., the minority class.
When the 10 samples are classified using the cost sensitive neural network, sample 2, sample 8, and sample 9 are classified erroneously. That is, sample 2 is classified into the flight price change class, and sample 8 and sample 9 are classified into the flight price invariant class.
Samples of misclassification by the cost sensitive neural network include: sample 2, sample 8 and sample 9.
Based on sample 2, sample 8 and sample 9, a test set is generated.
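The 10-sample example above can be reproduced in a short sketch. The label encoding (0 for "price unchanged", 1 for "price changed") and the function name `misclassified` are assumptions made for illustration.

```python
# True classes from the example: samples 1-7 are "price unchanged" (0, majority),
# samples 8-10 are "price changed" (1, minority, positive).
true_labels = {n: 0 for n in range(1, 8)}
true_labels.update({n: 1 for n in range(8, 11)})

# Network output from the example: sample 2 is wrongly put in the price-change
# class, samples 8 and 9 are wrongly put in the price-unchanged class.
predicted = dict(true_labels)
predicted[2] = 1
predicted[8] = 0
predicted[9] = 0


def misclassified(true_labels, predicted):
    """Return the sample numbers whose predicted class differs from the true class."""
    return sorted(n for n in true_labels if predicted[n] != true_labels[n])


test_set = misclassified(true_labels, predicted)
print(test_set)  # [2, 8, 9]
```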
In one possible implementation manner of the embodiment of the present invention, one or more test sets may be provided.
When there is one test set, it may include all samples that are misclassified by the cost sensitive neural network.
When there are multiple test sets, each test set may include part of the samples that are misclassified by the cost sensitive neural network, but the union of the multiple test sets is all samples that are misclassified by the cost sensitive neural network. Furthermore, multiple test sets may include the same sample.
It can be understood that when there are multiple test sets, in order to achieve a recall rate of 1 for the positive samples in the unbalanced data sample set, the recall rate of the positive samples in each test set must be 1 when the test set is used for testing.
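The condition that every test set must reach a recall rate of 1 can be sketched as below. The data layout (pairs of true and predicted labels, with 1 as the positive class) and the treatment of a test set without positive samples are assumptions, not details from the source.

```python
def recall(samples):
    """Recall of the positive class (label 1) over (true, predicted) pairs."""
    true_pos = sum(1 for y, p in samples if y == 1 and p == 1)
    false_neg = sum(1 for y, p in samples if y == 1 and p != 1)
    if true_pos + false_neg == 0:
        # Assumption: a test set without positive samples has nothing to miss.
        return 1.0
    return true_pos / (true_pos + false_neg)


def all_recalls_are_one(test_sets):
    """True only if the positive-sample recall is 1 in every test set."""
    return all(recall(samples) == 1.0 for samples in test_sets)


# Hypothetical test sets after re-classification with the current costs.
set_a = [(1, 1), (0, 1)]  # the only positive sample is found
set_b = [(1, 1), (1, 0)]  # one of two positive samples is still missed

print(all_recalls_are_one([set_a]), all_recalls_are_one([set_a, set_b]))
```

A single test set that still misses a positive sample is enough to require another adjustment round.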
In a possible implementation manner of the embodiment of the present invention, whether the recall rate of the positive samples in the test set is 1 may be detected based on the current first classification cost and the second classification cost;
if the recall rate of the positive samples in the test set is not 1, adjusting the first classification cost; judging whether the adjusted first classification cost is larger than a first classification cost threshold value or not; if the adjusted first classification cost is not greater than the first classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the current second classification cost, and continuously adjusting the first classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted first classification cost is greater than the first classification cost threshold;
if the adjusted first classification cost is larger than the first classification cost threshold, adjusting the second classification cost, and judging whether the adjusted second classification cost is larger than the second classification cost threshold; if the adjusted second classification cost is not greater than the second classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost, and continuously adjusting the second classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted second classification cost is greater than the second classification cost threshold;
and if the adjusted second classification cost is larger than the second classification cost threshold value, returning to the step of adjusting the first classification cost to continue executing.
In a possible implementation manner of the embodiment of the present invention, the first classification cost and the second classification cost may be adjusted by a preset cost adjustment value, and the first classification cost threshold and the second classification cost threshold may be adjusted by a preset threshold adjustment value.
For example, assume that the first classification cost $a_i$ and the second classification cost $b_i$ both have an initial value of 9. The initial values of the first and second classification cost thresholds are both 18. The preset cost adjustment value is 1, and the preset threshold adjustment value is 9. There are three test sets.
First, based on $a_i = 9$ and $b_i = 9$, detect whether the recall rates of the positive samples in all three test sets are 1.
If the recall rates of the positive samples in all three test sets are 1, the recall rate of the positive samples in the unbalanced data sample set is 1, and no adjustment is needed.
If the recall rate of the positive samples in at least one of the three test sets is not 1, adjust $a_i = 9 + 1 = 10$. Based on $a_i = 10$ and $b_i = 9$, detect whether the recall rates of the positive samples in all three test sets are 1.
If the recall rates of the positive samples in all three test sets are 1, the recall rate of the positive samples in the unbalanced data sample set is 1, and no further adjustment is needed.
If the recall rate of the positive samples in at least one of the three test sets is not 1, adjust $a_i = 10 + 1 = 11$. Based on $a_i = 11$ and $b_i = 9$, detect whether the recall rates of the positive samples in all three test sets are 1.
Suppose that, up to $a_i = 18$ and $b_i = 9$, the recall rate of the positive samples in at least one of the three test sets is still not 1. Then adjust $a_i = 18 + 1 = 19$. Because $a_i = 19$ is greater than the first classification cost threshold of 18, the first classification cost threshold is adjusted to $18 + 9 = 27$, and $b_i$ is adjusted to $9 + 1 = 10$. Based on $a_i = 18$ and $b_i = 10$, detect whether the recall rates of the positive samples in all three test sets are 1.
If the recall rate of the positive samples in at least one of the three test sets is not 1, adjust $b_i = 10 + 1 = 11$. Based on $a_i = 18$ and $b_i = 11$, detect whether the recall rates of the positive samples in all three test sets are 1.
Suppose that, up to $a_i = 18$ and $b_i = 18$, the recall rate of the positive samples in at least one of the three test sets is still not 1. Then adjust $b_i = 18 + 1 = 19$. Because $b_i = 19$ is greater than the second classification cost threshold of 18, the second classification cost threshold is adjusted to $18 + 9 = 27$, and $a_i$ is adjusted to $18 + 1 = 19$. Based on $a_i = 19$ and $b_i = 18$, detect whether the recall rates of the positive samples in all three test sets are 1.
Suppose that, based on $a_i = 25$ and $b_i = 18$, the recall rates of the positive samples in all three test sets are detected to be 1. Then $a_i = 25$ and $b_i = 18$ are determined, and the adjustment ends.
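The check that the example above repeats after every adjustment can be sketched as follows. This is only an illustrative sketch: the function names `positive_recall` and `all_recalls_are_one`, and the representation of each test set as a pair of label lists (1 for positive, 0 for negative), are assumptions and do not appear in the embodiment.

```python
from typing import List, Sequence, Tuple


def positive_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Recall of the positive (minority) class: TP / (TP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("test set contains no positive samples")
    return tp / (tp + fn)


def all_recalls_are_one(
    test_sets: List[Tuple[Sequence[int], Sequence[int]]]
) -> bool:
    """True only if every test set has a positive-sample recall of exactly 1."""
    return all(positive_recall(t, p) == 1.0 for t, p in test_sets)


# Three hypothetical test sets given as (true labels, predicted labels).
sets_ok = [([1, 0, 1], [1, 0, 1]), ([1, 0], [1, 1]), ([0, 1], [0, 1])]
sets_bad = sets_ok[:2] + [([1, 1, 0], [1, 0, 0])]  # one positive is missed
```

Note that a false positive (second set in `sets_ok`) does not lower recall; only a missed positive does, which is why a single missed positive in any one test set triggers a further adjustment.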
Based on the above, an adjusting method of the cost sensitive neural network provided by the embodiment of the present invention is shown in fig. 2. Fig. 2 is a schematic flow chart of another adjusting method for a cost sensitive neural network according to an embodiment of the present invention. The adjusting method of the cost sensitive neural network can comprise the following steps:
S201: Configure a first classification cost and a first classification cost threshold of a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction.
S202: Configure a second classification cost and a second classification cost threshold of the second cost matrix in a loss function of the cost sensitive neural network.
S203: Detect whether the recall rate of the positive samples in the test set is 1 based on the current first classification cost and second classification cost; if not, execute S204; if so, end.
S204: Adjust the first classification cost.
S205: Judge whether the adjusted first classification cost is greater than the first classification cost threshold; if not, execute S206; if so, execute S207 and S208.
S206: Detect whether the recall rate of the positive samples in the test set is 1 based on the current first classification cost and second classification cost; if not, execute S204; if so, end.
S207: Adjust the first classification cost threshold.
S208: Adjust the second classification cost.
S209: Judge whether the adjusted second classification cost is greater than the second classification cost threshold; if not, execute S210; if so, execute S211 and S204.
S210: Detect whether the recall rate of the positive samples in the test set is 1 based on the current first classification cost and second classification cost; if not, execute S208; if so, end.
S211: Adjust the second classification cost threshold.
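The flow S201 to S211 can be sketched as a single loop. The sketch below is a reading of the steps under stated assumptions, not a definitive implementation: `tune_costs`, `recall_is_one` and `max_steps` are hypothetical names, and `recall_is_one(a, b)` stands in for re-evaluating the network on the test sets with the given costs. One further assumption is inferred from the worked example rather than from the steps themselves: when a cost overshoots its threshold, the example continues with the previous value (18, not 19), so the sketch rolls the overshooting cost back before raising the threshold.

```python
from typing import Callable, Tuple


def tune_costs(
    recall_is_one: Callable[[int, int], bool],
    a: int = 9,              # first classification cost (S201)
    b: int = 9,              # second classification cost (S202)
    a_max: int = 18,         # first classification cost threshold
    b_max: int = 18,         # second classification cost threshold
    cost_step: int = 1,      # preset cost adjustment value
    threshold_step: int = 9,  # preset threshold adjustment value
    max_steps: int = 10_000,  # hypothetical safety bound, not in the source
) -> Tuple[int, int]:
    if recall_is_one(a, b):              # S203
        return a, b
    tuning_first = True
    for _ in range(max_steps):
        if tuning_first:
            a += cost_step               # S204
            if a > a_max:                # S205
                a -= cost_step           # assumed rollback (see lead-in)
                a_max += threshold_step  # S207
                tuning_first = False     # continue with S208
                continue
        else:
            b += cost_step               # S208
            if b > b_max:                # S209
                b -= cost_step           # assumed rollback (see lead-in)
                b_max += threshold_step  # S211
                tuning_first = True      # back to S204
                continue
        if recall_is_one(a, b):          # S206 / S210
            return a, b
    raise RuntimeError("recall of 1 not reached within max_steps")


# Stub oracle matching the worked example: recall becomes 1 at a=25, b=18.
result = tune_costs(lambda a, b: a >= 25 and b >= 18)
print(result)  # (25, 18)
```

With the default values taken from the worked example, the loop visits a = 10..18, then b = 10..18, then a = 19..25, and stops at the same pair the example reports.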
Corresponding to the above method embodiment, the embodiment of the present invention further provides an adjusting apparatus for a cost sensitive neural network. As shown in fig. 3, fig. 3 is a schematic structural diagram of an adjusting apparatus of a cost sensitive neural network according to an embodiment of the present invention. The adjusting device of the cost sensitive neural network may include:
a first configuration module 301, configured to configure a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction;
a second configuration module 302, configured to configure a second cost matrix in a discriminant function of the cost-sensitive neural network;
an adjusting module 303, configured to adjust the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1; wherein the samples belonging to the minority class in the unbalanced data sample set are positive samples, and the unbalanced data comprises unbalanced flight data or unbalanced data other than flight data.
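To make the two configuration points concrete, the following sketch shows one common way a classification cost can enter a loss function and a discriminant function in the binary case. It is an assumption for illustration only: the functional forms, the names `cost_weighted_loss` and `cost_weighted_decision`, and the reading of `a` and `b` as weights on the positive class are not taken from this section and may differ from the cost matrices actually used by the embodiment.

```python
import math


def cost_weighted_loss(y_true: int, p_pos: float, a: float) -> float:
    """Cross-entropy in which a missed positive is penalised a times more.

    a plays the role of a first classification cost (assumed form).
    """
    eps = 1e-12
    p = min(max(p_pos, eps), 1.0 - eps)  # keep log() finite
    if y_true == 1:
        return -a * math.log(p)
    return -math.log(1.0 - p)


def cost_weighted_decision(p_pos: float, b: float) -> int:
    """Predict positive when the cost-scaled positive score wins.

    b plays the role of a second classification cost (assumed form).
    """
    return 1 if b * p_pos > (1.0 - p_pos) else 0


# With b = 1 a score of 0.3 is rejected; with b = 9 it is accepted.
low_cost_label = cost_weighted_decision(0.3, 1)
high_cost_label = cost_weighted_decision(0.3, 9)
```

Under these assumed forms, raising either cost pushes the network toward predicting the positive class, which is the direction needed to raise positive-sample recall.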
In a possible implementation manner of the embodiment of the present invention, the first configuration module 301 may be specifically configured to:
configure a first classification cost and a first classification cost threshold of the first cost matrix in the loss function.
In a possible implementation manner of the embodiment of the present invention, the second configuration module 302 may be specifically configured to:
configure a second classification cost and a second classification cost threshold of the second cost matrix in the discriminant function.
In a possible implementation manner of the embodiment of the present invention, the adjusting module 303 may be specifically configured to:
adjust the first classification cost, the first classification cost threshold, the second classification cost and the second classification cost threshold until the recall rate of the positive samples in the unbalanced data sample set is 1.
In a possible implementation manner of the embodiment of the present invention, the adjusting module 303 may be specifically configured to:
adjust the first classification cost, the first classification cost threshold, the second classification cost and the second classification cost threshold based on a test set comprising samples misclassified by the cost sensitive neural network, until the recall rate of the positive samples in the unbalanced data sample set is 1.
In a possible implementation manner of the embodiment of the present invention, the adjusting module 303 may be specifically configured to:
detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost;
if the recall rate of the positive samples in the test set is not 1, adjusting the first classification cost; judging whether the adjusted first classification cost is larger than a first classification cost threshold value or not; if the adjusted first classification cost is not greater than the first classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the current second classification cost, and continuously adjusting the first classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted first classification cost is greater than the first classification cost threshold;
if the adjusted first classification cost is larger than the first classification cost threshold, adjusting the second classification cost, and judging whether the adjusted second classification cost is larger than the second classification cost threshold; if the adjusted second classification cost is not greater than the second classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost, and continuously adjusting the second classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted second classification cost is greater than the second classification cost threshold;
and if the adjusted second classification cost is larger than the second classification cost threshold value, returning to the step of adjusting the first classification cost to continue executing.
In a possible implementation manner of the embodiment of the present invention, the first classification cost and the second classification cost may be adjusted by a preset cost adjustment value, and the first classification cost threshold and the second classification cost threshold may be adjusted by a preset threshold adjustment value.
In a possible implementation manner of the embodiment of the present invention, the adjusting apparatus of a cost sensitive neural network provided in the embodiment of the present invention may further include:
an extraction module, configured to extract samples misclassified by the cost sensitive neural network from the unbalanced data sample set;
a generating module, configured to generate a test set based on the extracted samples.
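The two modules above can be sketched as a pair of small functions. This is a minimal sketch under assumptions: `Sample`, `extract_misclassified`, `generate_test_set` and the `predict` callback are hypothetical names, and because the text does not say how the test set is derived from the extracted samples, the sketch simply copies them.

```python
from typing import Callable, List, Sequence, Tuple

# A sample is a (feature vector, label) pair; label 1 marks the minority class.
Sample = Tuple[Sequence[float], int]


def extract_misclassified(
    samples: Sequence[Sample],
    predict: Callable[[Sequence[float]], int],
) -> List[Sample]:
    """Extraction module: keep samples whose predicted label is wrong."""
    return [(x, y) for x, y in samples if predict(x) != y]


def generate_test_set(extracted: Sequence[Sample]) -> List[Sample]:
    """Generating module: build a test set from the extracted samples.

    No further processing is described in the text, so this only copies them.
    """
    return list(extracted)


# Hypothetical classifier: predicts positive when the single feature > 0.5.
data: List[Sample] = [([0.2], 0), ([0.9], 1), ([0.4], 1)]
test_set = generate_test_set(
    extract_misclassified(data, lambda x: 1 if x[0] > 0.5 else 0)
)
```

Here only the third sample, a positive scored below the cut-off, ends up in the test set, so the test set concentrates on exactly the cases the cost adjustment is meant to recover.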
Fig. 4 is a block diagram of a hardware architecture of a computing device according to an embodiment of the present invention. As shown in fig. 4, computing device 400 includes an input device 401, an input interface 402, a central processor 403, a memory 404, an output interface 405, and an output device 406. The input interface 402, the central processing unit 403, the memory 404, and the output interface 405 are connected to each other through a bus 410, and the input device 401 and the output device 406 are connected to the bus 410 through the input interface 402 and the output interface 405, respectively, and further connected to other components of the computing device 400.
Specifically, the input device 401 receives input information from the outside and transmits the input information to the central processor 403 through the input interface 402; the central processor 403 processes the input information based on computer-executable instructions stored in the memory 404 to generate output information, stores the output information temporarily or permanently in the memory 404, and then transmits the output information to the output device 406 through the output interface 405; output device 406 outputs the output information outside of computing device 400 for use by a user.
That is, the computing device shown in fig. 4 may also be implemented as an adjusting device of a cost sensitive neural network, which may include: a memory storing a computer program; and a processor, which when executing the computer program, can implement the adjusting method of the cost sensitive neural network provided by the embodiment of the present invention.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium; the computer program is used for realizing the adjusting method of the cost sensitive neural network provided by the embodiment of the invention when being executed by a processor.
It is to be understood that the invention is not limited to the specific arrangements and instrumentality described above and shown in the drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present invention are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications and additions or change the order between the steps after comprehending the spirit of the present invention.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the invention are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this patent describe some methods or systems based on a series of steps or devices. However, the present invention is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
As described above, only the specific embodiments of the present invention are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present invention, and these modifications or substitutions should be covered within the scope of the present invention.

Claims (10)

1. A method for tuning a cost sensitive neural network, the method comprising:
configuring a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction;
configuring a second cost matrix in a discriminant function of the cost sensitive neural network;
adjusting the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1; wherein the samples belonging to the minority class in the unbalanced data sample set are positive samples, and the unbalanced data comprises unbalanced flight data or unbalanced data other than flight data.
2. The method of claim 1, wherein configuring a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction comprises:
configuring a first classification cost and a first classification cost threshold of the first cost matrix in the loss function;
and the configuring a second cost matrix in a discriminant function of the cost sensitive neural network comprises:
configuring a second classification cost and a second classification cost threshold of the second cost matrix in the discriminant function.
3. The method of claim 2, wherein the adjusting the first cost matrix and the second cost matrix until the recall rate of positive samples in the unbalanced data sample set is 1 comprises:
adjusting the first classification cost, the first classification cost threshold, the second classification cost and the second classification cost threshold until the recall rate of the positive samples in the unbalanced data sample set is 1.
4. The method of claim 3, wherein the adjusting the first classification cost, the first classification cost threshold, the second classification cost, and the second classification cost threshold until the recall rate of positive samples in the unbalanced data sample set is 1 comprises:
adjusting the first classification cost, the first classification cost threshold, the second classification cost, and the second classification cost threshold based on a test set including samples misclassified by the cost sensitive neural network until a recall rate of positive samples in the unbalanced data sample set is 1.
5. The method of claim 4, wherein the adjusting the first classification cost, the first classification cost threshold, the second classification cost, and the second classification cost threshold based on the test set including samples classified incorrectly by the cost-sensitive neural network until a recall rate of positive samples in the unbalanced data sample set is 1 comprises:
detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost;
if the recall rate of the positive samples in the test set is not 1, adjusting the first classification cost; judging whether the adjusted first classification cost is larger than the first classification cost threshold value or not; if the adjusted first classification cost is not greater than the first classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the current second classification cost, and continuing to adjust the first classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted first classification cost is greater than the first classification cost threshold;
if the adjusted first classification cost is larger than the first classification cost threshold, adjusting the second classification cost, and judging whether the adjusted second classification cost is larger than the second classification cost threshold; if the adjusted second classification cost is not greater than the second classification cost threshold, detecting whether the recall rate of the positive samples in the test set is 1 or not based on the current first classification cost and the second classification cost, and continuously adjusting the second classification cost until the recall rate of the positive samples in the test set is 1 or the adjusted second classification cost is greater than the second classification cost threshold;
and if the adjusted second classification cost is larger than the second classification cost threshold, returning to the step of adjusting the first classification cost for continuous execution.
6. The method of claim 5, wherein the first classification cost and the second classification cost are adjusted by a preset cost adjustment value, and wherein the first classification cost threshold and the second classification cost threshold are adjusted by a preset threshold adjustment value.
7. The method of claim 4, further comprising:
extracting samples which are wrongly classified by the cost sensitive neural network from an unbalanced data sample set;
generating the test set based on the extracted samples.
8. An apparatus for adjusting a cost sensitive neural network, the apparatus comprising:
a first configuration module, configured to configure a first cost matrix in a loss function of a cost sensitive neural network for unbalanced data prediction;
a second configuration module, configured to configure a second cost matrix in a discriminant function of the cost sensitive neural network;
an adjusting module, configured to adjust the first cost matrix and the second cost matrix until the recall rate of the positive samples in the unbalanced data sample set is 1; wherein the samples belonging to the minority class in the unbalanced data sample set are positive samples, and the unbalanced data comprises unbalanced flight data or unbalanced data other than flight data.
9. An apparatus for tuning a cost sensitive neural network, the apparatus comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor;
the processor, when executing the computer program, implements the method of tuning a cost sensitive neural network of any one of claims 1-7.
10. A computer-readable storage medium, having stored thereon a computer program which, when being executed by a processor, implements the method of tuning a cost-sensitive neural network of any one of claims 1 to 7.
CN202010107273.1A 2020-02-21 2020-02-21 Adjusting method, device, equipment and medium of cost sensitive neural network Pending CN111340174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010107273.1A CN111340174A (en) 2020-02-21 2020-02-21 Adjusting method, device, equipment and medium of cost sensitive neural network

Publications (1)

Publication Number Publication Date
CN111340174A true CN111340174A (en) 2020-06-26

Family

ID=71181741

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010107273.1A Pending CN111340174A (en) 2020-02-21 2020-02-21 Adjusting method, device, equipment and medium of cost sensitive neural network

Country Status (1)

Country Link
CN (1) CN111340174A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022121289A1 (en) * 2020-12-11 2022-06-16 Huawei Cloud Computing Technologies Co., Ltd. Methods and systems for mining minority-class data samples for training neural network
US11816183B2 (en) 2020-12-11 2023-11-14 Huawei Cloud Computing Technologies Co., Ltd. Methods and systems for mining minority-class data samples for training a neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200626