CN108985379B - Method and device for evaluating performance of classifier and computer readable storage medium - Google Patents

Method and device for evaluating performance of classifier and computer readable storage medium

Info

Publication number
CN108985379B
CN108985379B (application CN201810823251.8A)
Authority
CN
China
Prior art keywords
data
classifier
category
weight
training
Prior art date
Legal status
Active
Application number
CN201810823251.8A
Other languages
Chinese (zh)
Other versions
CN108985379A (en)
Inventor
孙胜方
刘丹
Current Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Jingdong Shangke Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Jingdong Shangke Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN201810823251.8A priority Critical patent/CN108985379B/en
Publication of CN108985379A publication Critical patent/CN108985379A/en
Application granted granted Critical
Publication of CN108985379B publication Critical patent/CN108985379B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/217 Validation; Performance evaluation; Active pattern learning techniques
    • G06F 18/2193 Validation; Performance evaluation; Active pattern learning techniques based on specific statistical tests
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The disclosure relates to a method and a device for evaluating the performance of a classifier, and a computer-readable storage medium, and relates to the technical field of artificial intelligence. The method comprises the following steps: determining the weight of each data category according to the number of data contained in that category in the training data set, where the more data a category contains, the larger its weight; training the classifier with the training data set; testing the trained classifier on a test data set to obtain the classification accuracy of the classifier for each data category; and performing a weighted summation of the per-category classification accuracies using the category weights, so as to determine the classification performance of the classifier. The technical scheme of the disclosure can improve the accuracy of the performance evaluation of the classifier.

Description

Method and device for evaluating performance of classifier and computer readable storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technologies, and in particular, to a method and an apparatus for evaluating performance of a classifier, and a computer-readable storage medium.
Background
Artificial intelligence technology is widely used in various fields, for example, automatic artificial intelligence response systems based on natural language processing. The effectiveness of an artificial intelligence response system depends to a large extent on the classification performance of the classifier it employs. Therefore, it is important to accurately evaluate the classification performance of the classifier.
The related technologies mainly include: the holdout evaluation method, which randomly divides a data set into a training data set and a test data set; and the K-fold cross-validation evaluation method, which divides the data set into a plurality of subsets to carry out multiple rounds of training and testing.
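For illustration only, a minimal sketch of these two related-art procedures follows; scikit-learn and the toy data set are assumptions of the sketch, not part of the disclosure.

```python
# Sketch of the related-art evaluations (holdout and K-fold cross-validation).
# scikit-learn and the synthetic data are illustrative assumptions only.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split, cross_val_score

X, y = make_classification(n_samples=200, n_classes=3, n_informative=5, random_state=0)
clf = LogisticRegression(max_iter=1000)

# Holdout: randomly split the data set into a training set and a test set once.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
holdout_accuracy = clf.fit(X_train, y_train).score(X_test, y_test)

# K-fold cross-validation: split into K subsets and run K rounds of training/testing.
kfold_accuracy = cross_val_score(clf, X, y, cv=5).mean()
```

Both scores treat every test sample equally, regardless of which data category it belongs to; this is the behavior the weighted evaluation described below is intended to improve.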
Disclosure of Invention
The inventors of the present disclosure found that the following problems exist in the above-described related art: these methods are only suitable for performance evaluation with a large training set, and the accuracy of the performance evaluation is poor when the training set is small.
In view of this, the present disclosure provides a technical solution of a performance evaluation method for a classifier, which can improve accuracy of performance evaluation on the classifier.
According to some embodiments of the present disclosure, there is provided a performance evaluation method of a classifier, including: determining the weight of each data category according to the number of data included in each data category in a training data set, wherein the larger the number of data included in a data category is, the larger the weight of the data category is; training a classifier using the training data set; testing the trained classifier through a test data set to obtain the classification accuracy of the classifier on each data category; and carrying out weighted summation on the classification accuracy of each data category by using the weight of each data category so as to evaluate the classification performance of the classifier.
In some embodiments, the weight of each data category is determined according to a difference between the number of data included in each data category in the training data set and the number of data included in the training data set.
In some embodiments, the weight of data category i is w_i = N_i / N, where i is a positive integer, N_i is the number of data contained in data category i, and N is the number of data contained in the training data set.
In some embodiments, the classification accuracy of the classifier for data category i is R_i = m_i / M_i, where i is a positive integer, m_i is the number of data in data category i that the classifier classifies correctly during testing, and M_i is the number of data contained in data category i in the test data set.
In some embodiments, the number of data categories in the test data set is the same as the number of data categories in the training data set, and each data category in the test data set contains the same amount of data.
According to other embodiments of the present disclosure, there is provided a performance evaluation apparatus of a classifier including: a weight determining unit, configured to determine a weight of each data category according to a number of data included in each data category in a training data set, where the larger the number of data included in the data category is, the larger the weight of the data category is; a training unit for training a classifier using the training data set; the testing unit is used for testing the trained classifier through a testing data set so as to obtain the classification accuracy of the classifier on each data category; and the performance evaluation unit is used for carrying out weighted summation on the classification accuracy of each data category by using the weight of each data category so as to evaluate the classification performance of the classifier.
In some embodiments, the weight determining unit determines the weight of each data category according to a difference between the number of data included in each data category in the training data set and the number of data included in the training data set.
In some embodiments, the weight determination unit determines the weight of data category i as w_i = N_i / N, where i is a positive integer, N_i is the number of data contained in data category i, and N is the number of data contained in the training data set.
In some embodiments, the test unit determines the classification accuracy of the classifier for data category i as R_i = m_i / M_i, where i is a positive integer, m_i is the number of data in data category i that the classifier classifies correctly during testing, and M_i is the number of data contained in data category i in the test data set.
In some embodiments, the number of data categories in the test data set is the same as the number of data categories in the training data set, and each data category in the test data set contains the same amount of data.
According to still further embodiments of the present disclosure, there is provided a performance evaluation apparatus of a classifier, including: a memory; and a processor coupled to the memory, the processor configured to perform one or more steps of the method for performance evaluation of a classifier in any of the above embodiments based on instructions stored in the memory.
According to still further embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements one or more steps of the performance evaluation method of the classifier in any of the above embodiments.
In the above embodiments, a weight is determined according to the number of data contained in each data category in the training data set, and the test results for the data categories are weighted and summed with these weights to evaluate the classification performance of the classifier. The influence of each category's classification accuracy on the performance evaluation is thus adjusted according to its weight, that is, according to the importance of the category, so the performance of the classifier can be evaluated accurately even with a small training set, improving the accuracy of the performance evaluation.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the disclosure.
The present disclosure may be more clearly understood from the following detailed description, taken with reference to the accompanying drawings, in which:
FIG. 1 illustrates a flow diagram of some embodiments of a performance evaluation method of a classifier of the present disclosure;
FIG. 2 illustrates a block diagram of some embodiments of a performance evaluation apparatus of a classifier of the present disclosure;
FIG. 3 shows a block diagram of further embodiments of a performance evaluation apparatus of a classifier of the present disclosure;
fig. 4 shows a block diagram of further embodiments of a performance evaluation apparatus of a classifier of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings. It should be noted that: the relative arrangement of the components and steps, the numerical expressions, and numerical values set forth in these embodiments do not limit the scope of the present disclosure unless specifically stated otherwise.
Meanwhile, it should be understood that the sizes of the respective portions shown in the drawings are not drawn in an actual proportional relationship for the convenience of description.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the disclosure, its application, or uses.
Techniques, methods, and apparatus known to those of ordinary skill in the relevant art may not be discussed in detail but are intended to be part of the specification where appropriate.
In all examples shown and discussed herein, any particular value should be construed as merely illustrative, and not limiting. Thus, other examples of the exemplary embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
As mentioned above, the classification accuracy of each data class in the related art has the same effect on the classifier evaluation. Therefore, in the case of a smaller training set, the classification result of the unimportant data category (e.g., the data category containing a smaller amount of data) also greatly affects the performance evaluation result of the classifier, thereby causing inaccurate performance evaluation.
In order to solve the above problems, the present disclosure evaluates the performance of the classifier by assigning a weight to the classification accuracy of each data category. For example, the following embodiments may be employed.
Fig. 1 illustrates a flow diagram of some embodiments of a performance evaluation method of a classifier of the present disclosure.
As shown in fig. 1, the method includes: step 110, determining the weight of each data category; step 120, training the classifier; step 130, testing the classifier; and step 140, evaluating the classification performance of the classifier.
In step 110, the weight of each data category is determined based on the number of data included in each data category in the training data set. The greater the amount of data contained in a data category, the greater the corresponding weight. For example, the weight of each data class is determined based on the difference between the amount of data included in each data class in the training data set and the amount of data included in the training data set. The weights may take a variety of forms that reflect the difference between the quantities, such as differences, ratios, and the like.
In some embodiments, the N data in the training data set comprise I data categories, and data category i contains N_i data. In this case, the weight of data category i may be determined as w_i = N_i / N.
Therefore, the classification result of the data category occupying a larger part in the training data set can be ensured to have larger influence on the performance evaluation of the classifier, and the performance evaluation accuracy is improved.
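A minimal sketch of this weighting step follows; Python and the function/variable names are illustrative assumptions, not part of the disclosure.

```python
# Minimal sketch of step 110: weight w_i = N_i / N for each data category.
from collections import Counter

def class_weights(train_labels):
    """Return {category i: N_i / N} computed from the training labels."""
    counts = Counter(train_labels)      # N_i for each data category i
    total = len(train_labels)           # N, total amount of training data
    return {label: count / total for label, count in counts.items()}

# Example: class_weights(["refund", "refund", "refund", "shipping"])
# -> {"refund": 0.75, "shipping": 0.25}
```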
In step 120, the classifier is trained using a training data set. For example, the classifier can be an LR (Logistic Regression) classifier, a Bayesian classifier, or the like.
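As one possible instantiation of this step, a hedged sketch is given below; scikit-learn and the toy utterances are assumptions of the sketch, not requirements of the disclosure.

```python
# Sketch of step 120: train a logistic-regression text classifier on toy data.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = ["where is my parcel", "track my order", "i want a refund", "refund please"]
train_labels = ["shipping", "shipping", "refund", "refund"]

classifier = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
classifier.fit(train_texts, train_labels)
```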
In step 130, the trained classifier is tested through the test data set to obtain the classification accuracy of the classifier for each data class.
In some embodiments, the test data set includes the same I data categories as the training data set, and data category i contains M_i data. For example, each data category in the test data set may contain the same number M_i of data, so that the test data set contains I × M_i data in total; every category is then tested against the classifier under a uniform standard, which, combined with the weights, further improves the accuracy of the performance evaluation.
In some embodiments, the classifier is tested using the test data set, and the number of data in data category i that the classifier classifies correctly is m_i. The classification accuracy of the classifier for data category i is then R_i = m_i / M_i.
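A sketch of this per-category accuracy computation follows; the helper name per_class_accuracy and its arguments are hypothetical, and any trained model with a predict() method would do.

```python
# Sketch of step 130: per-category accuracy R_i = m_i / M_i on the test data set.
from collections import Counter

def per_class_accuracy(classifier, test_data, test_labels):
    """Return {category i: m_i / M_i} measured on the test data set."""
    predictions = classifier.predict(test_data)
    correct, totals = Counter(), Counter()
    for true_label, predicted in zip(test_labels, predictions):
        totals[true_label] += 1          # M_i: test data contained in category i
        if predicted == true_label:
            correct[true_label] += 1     # m_i: correctly classified data in category i
    return {label: correct[label] / totals[label] for label in totals}
```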
In step 140, the classification accuracy of each data class is weighted and summed by using the weight of each data class to evaluate the classification performance of the classifier.
In some embodiments, a performance parameter of the classifier may be set as H = Σ_i w_i · R_i, where the sum runs over the I data categories. The larger H is, the better the classifier performs.
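Tying the steps together, a minimal sketch of the weighted summation is given below; the helper names refer to the hypothetical sketches above.

```python
# Sketch of step 140: H = sum over categories i of w_i * R_i.
def performance_parameter(weights, accuracies):
    """Weighted sum of per-category accuracies; larger H means better performance."""
    # Categories absent from the test results contribute 0 here (an assumption of this sketch).
    return sum(w * accuracies.get(label, 0.0) for label, w in weights.items())

# Possible usage, combining the earlier sketches:
#   weights = class_weights(train_labels)
#   accuracies = per_class_accuracy(classifier, test_texts, test_labels)
#   H = performance_parameter(weights, accuracies)
```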
In the above embodiments, a weight is determined according to the number of data contained in each data category in the training data set, and the test results for the data categories are weighted and summed with these weights to evaluate the classification performance of the classifier. The influence of each category's classification accuracy on the performance evaluation is thus adjusted according to its weight, that is, according to the importance of the category, so the performance of the classifier can be evaluated accurately even with a small training set, improving the accuracy of the performance evaluation.
Fig. 2 illustrates a block diagram of some embodiments of a performance evaluation apparatus of a classifier of the present disclosure.
As shown in fig. 2, the performance evaluation apparatus 2 includes a weight determination unit 21, a training unit 22, a test unit 23, and a performance evaluation unit 24.
The weight determination unit 21 determines the weight of each data type based on the number of data included in each data type in the training data set. The greater the amount of data contained in a data category, the greater the corresponding weight.
In some embodiments, the weight determining unit 21 determines the weight of each data category according to the difference between the number of data contained in that category and the number of data contained in the training data set. For example, the weight of data category i is w_i = N_i / N.
The training unit 22 trains the classifier with a training data set.
The testing unit 23 tests the trained classifier on the test data set to obtain the classification accuracy of the classifier for each data category. For example, each data category in the test data set contains the same amount of data, and the testing unit 23 determines the classification accuracy of the classifier for data category i as R_i = m_i / M_i.
The performance evaluation unit 24 performs weighted summation on the classification accuracy of each data class by using the weight of each data class to evaluate the classification performance of the classifier.
In the above embodiments, a weight is determined according to the number of data contained in each data category in the training data set, and the test results for the data categories are weighted and summed with these weights to evaluate the classification performance of the classifier. The influence of each category's classification accuracy on the performance evaluation is thus adjusted according to its weight, that is, according to the importance of the category, so the performance of the classifier can be evaluated accurately even with a small training set, improving the accuracy of the performance evaluation.
Fig. 3 shows a block diagram of further embodiments of a performance evaluation apparatus of a classifier of the present disclosure.
As shown in fig. 3, the performance evaluation device 3 of the classifier of this embodiment includes: a memory 31 and a processor 32 coupled to the memory 31, the processor 32 being configured to perform one or more steps of a method of evaluating the performance of a classifier in any one of the embodiments of the present disclosure based on instructions stored in the memory 31.
The memory 31 may include, for example, a system memory, a fixed nonvolatile storage medium, and the like. The system memory stores, for example, an operating system, an application program, a Boot Loader (Boot Loader), a database, and other programs.
Fig. 4 shows a block diagram of further embodiments of a performance evaluation apparatus of a classifier of the present disclosure.
As shown in fig. 4, the performance evaluation device 4 of the classifier of this embodiment includes: a memory 410 and a processor 420 coupled to the memory 410, the processor 420 configured to execute a method of performance evaluation of a classifier according to any of the foregoing embodiments based on instructions stored in the memory 410.
The memory 410 may include, for example, system memory, fixed non-volatile storage media, and the like. The system memory stores, for example, an operating system, an application program, a Boot Loader (Boot Loader), and other programs.
The performance evaluation device 4 of the classifier may further include an input-output interface 430, a network interface 440, a storage interface 450, and the like. These interfaces 430, 440, 450, as well as the memory 410 and the processor 420, may be connected, for example, via a bus 460. The input/output interface 430 provides a connection interface for input/output devices such as a display, a mouse, a keyboard, and a touch screen. The network interface 440 provides a connection interface for various networking devices. The storage interface 450 provides a connection interface for external storage devices such as an SD card and a USB disk.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be provided as a method, system, or computer program product. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
So far, the performance evaluation method of the classifier, the apparatus of the classifier, and the computer-readable storage medium according to the present disclosure have been described in detail. Some details that are well known in the art have not been described in order to avoid obscuring the concepts of the present disclosure. It will be fully apparent to those skilled in the art from the foregoing description how to practice the presently disclosed embodiments.
The method and system of the present disclosure may be implemented in a number of ways. For example, the methods and systems of the present disclosure may be implemented by software, hardware, firmware, or any combination of software, hardware, and firmware. The above-described order for the steps of the method is for illustration only, and the steps of the method of the present disclosure are not limited to the order specifically described above unless specifically stated otherwise. Further, in some embodiments, the present disclosure may also be embodied as programs recorded in a recording medium, the programs including machine-readable instructions for implementing the methods according to the present disclosure. Thus, the present disclosure also covers a recording medium storing a program for executing the method according to the present disclosure.
Although some specific embodiments of the present disclosure have been described in detail by way of example, it should be understood by those skilled in the art that the foregoing examples are for purposes of illustration only and are not intended to limit the scope of the present disclosure. It will be appreciated by those skilled in the art that modifications may be made to the above embodiments without departing from the scope and spirit of the present disclosure. The scope of the present disclosure is defined by the appended claims.

Claims (10)

1. A performance evaluation method of a classifier includes:
determining the weight of each data category according to the quantity of data contained in each data category in a training data set, wherein the larger the quantity of data contained in a data category is, the larger the weight of the data category is, wherein the data in the training data set is natural language data, and the data category is a semantic classification in an automatic artificial intelligence response system;
training a classifier using the training data set;
testing the trained classifier through a test data set to obtain the classification accuracy of the classifier on each data category, wherein the data in the test data set are natural language data, and the classifier is used for the automatic artificial intelligence response system to perform semantic classification on the natural language data;
carrying out weighted summation on the classification accuracy of each data category by using the weight of each data category so as to evaluate the classification performance of the classifier,
the number of data classes in the test data set is the same as the number of data classes in the training data set;
each data category in the test data set contains the same amount of data.
2. The performance evaluation method of claim 1, wherein the determining the weight of each data category comprises:
and determining the weight of each data category according to the difference between the number of the data included in each data category in the training data set and the number of the data included in the training data set.
3. The performance evaluation method of claim 2, wherein the determining the weight of each data category comprises:
the weight of data category i is w_i = N_i / N, where i is a positive integer, N_i is the number of data contained in data category i, and N is the number of data contained in the training data set.
4. The performance evaluation method of claim 1, wherein the obtaining the classification accuracy of the classifier for the data classes comprises:
the classification accuracy R of the classifier on the data class ii=mi/MiI is a positive integer, miNumber of data for which the classifier achieved correct classification of data class i under test, MiAnd the data category i in the test data set contains the number of data.
5. A performance evaluation apparatus of a classifier, comprising:
the weight determining unit is used for determining the weight of each data category according to the number of data included in each data category in a training data set, wherein the larger the number of data included in a data category is, the larger the weight of the data category is, wherein the data in the training data set is natural language data, and the data category is a semantic classification in an automatic artificial intelligence response system;
a training unit for training a classifier using the training data set;
the testing unit is used for testing the trained classifier through a testing data set so as to obtain the classification accuracy of the classifier on each data category, the data in the testing data set is natural language data, and the classifier is used for performing semantic classification on the natural language data by the automatic artificial intelligence response system;
a performance evaluation unit for performing weighted summation on the classification accuracy of each data class by using the weight of each data class to evaluate the classification performance of the classifier,
the number of data classes in the test data set is the same as the number of data classes in the training data set;
each data category in the test data set contains the same amount of data.
6. The performance evaluation apparatus according to claim 5,
the weight determining unit determines the weight of each data type according to the difference between the number of data included in each data type in the training data set and the number of data included in the training data set.
7. The performance evaluation apparatus according to claim 5,
the weight determination unit determines the weight of data category i as w_i = N_i / N, where i is a positive integer, N_i is the number of data contained in data category i, and N is the number of data contained in the training data set.
8. The performance evaluation apparatus according to claim 5,
the test unit determines the classification accuracy of the classifier for data category i as R_i = m_i / M_i, where i is a positive integer, m_i is the number of data in data category i that the classifier classifies correctly during testing, and M_i is the number of data contained in data category i in the test data set.
9. A performance evaluation apparatus of a classifier, comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the performance evaluation method of the classifier of any of claims 1-4 based on instructions stored in the memory.
10. A computer-readable storage medium on which a computer program is stored which, when executed by a processor, implements the performance evaluation method of the classifier of any one of claims 1-4.
CN201810823251.8A 2018-07-25 2018-07-25 Method and device for evaluating performance of classifier and computer readable storage medium Active CN108985379B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810823251.8A CN108985379B (en) 2018-07-25 2018-07-25 Method and device for evaluating performance of classifier and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810823251.8A CN108985379B (en) 2018-07-25 2018-07-25 Method and device for evaluating performance of classifier and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN108985379A CN108985379A (en) 2018-12-11
CN108985379B (en) 2021-04-30

Family

ID=64550843

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810823251.8A Active CN108985379B (en) 2018-07-25 2018-07-25 Method and device for evaluating performance of classifier and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN108985379B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110210558B (en) * 2019-05-31 2021-10-26 北京市商汤科技开发有限公司 Method and device for evaluating performance of neural network
CN111126487A (en) * 2019-12-24 2020-05-08 北京安兔兔科技有限公司 Equipment performance testing method and device and electronic equipment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8972307B1 (en) * 2011-09-15 2015-03-03 Google Inc. Method and apparatus for machine learning
CN106203485A (en) * 2016-07-01 2016-12-07 北京邮电大学 A kind of parallel training method and device of support vector machine
CN106202044A (en) * 2016-07-07 2016-12-07 武汉理工大学 A kind of entity relation extraction method based on deep neural network
CN106897940B (en) * 2017-01-03 2021-04-13 内蒙古电力(集团)有限责任公司 Wind power prediction index evaluation method and system for power grid peak regulation capacity limit value

Also Published As

Publication number Publication date
CN108985379A (en) 2018-12-11

Similar Documents

Publication Publication Date Title
US11899800B2 (en) Open source vulnerability prediction with machine learning ensemble
CN106796585B (en) Conditional validation rules
US9218572B2 (en) Technique for classifying data
Choi et al. End-to-end prediction of buffer overruns from raw source code via neural memory networks
CN108596410B (en) Automatic wind control event processing method and device
JP2019021303A (en) Software program fault position specification
Zheng et al. Learning criteria weights of an optimistic Electre Tri sorting rule
US10437587B2 (en) Software package analyzer for increasing parallelization of code editing
CN110472802B (en) Data characteristic evaluation method, device and equipment
CN106708729B (en) The prediction technique and device of aacode defect
US10754744B2 (en) Method of estimating program speed-up in highly parallel architectures using static analysis
CN111160959B (en) User click conversion prediction method and device
CN108985379B (en) Method and device for evaluating performance of classifier and computer readable storage medium
US20190026108A1 (en) Recommendations based on the impact of code changes
CN114556360A (en) Generating training data for machine learning models
CN104504334B (en) System and method for assessing classifying rules selectivity
EP4189547A1 (en) Machine-learning based software testing technique
US10719482B2 (en) Data comparison
US20200250560A1 (en) Determining pattern similarities using a multi-level machine learning system
US20130013244A1 (en) Pattern based test prioritization using weight factors
US20200174760A1 (en) Automatic code generation
CN110544166A (en) Sample generation method, device and storage medium
CN111382052A (en) Code quality evaluation method and device and electronic equipment
US11797775B1 (en) Determining emebedding vectors for an unmapped content item using embedding inferenece
US11138099B2 (en) Method for testing software, and computing device and computer-readable storage medium thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant