CN110874608B - Classification method, classification system and electronic equipment - Google Patents

Publication number
CN110874608B
Authority
CN
China
Prior art keywords
classified
classification
module
misjudgment
similar
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811024493.7A
Other languages
Chinese (zh)
Other versions
CN110874608A (en
Inventor
李亚健
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN201811024493.7A
Publication of CN110874608A
Application granted
Publication of CN110874608B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a classification method that comprises obtaining an object to be classified, determining, based on a first predetermined rule, whether the object to be classified is prone to misjudgment, and, if it is, classifying the object based on a second predetermined rule. The present disclosure also provides a classification system, an electronic device, and a computer-readable medium.

Description

Classification method, classification system and electronic equipment
Technical Field
The present disclosure relates to the field of internet technology, and more particularly, to a classification method, system, and electronic device.
Background
With the development of artificial intelligence, machine learning algorithms are increasingly applied in various fields. Classification algorithms are constantly being studied as basic and core algorithms for artificial intelligence. However, it is difficult to achieve a satisfactory classification result for various classification models including naive bayes, decision trees, support vector machines, and the like.
Disclosure of Invention
In view of this, the present disclosure provides a classification method, system, and electronic device.
One aspect of the disclosure provides a classification method, which includes obtaining an object to be classified, determining whether the object to be classified belongs to an object prone to misjudgment based on a first predetermined rule, and classifying the object to be classified based on a second predetermined rule when the object to be classified belongs to the object prone to misjudgment.
According to an embodiment of the present disclosure, the method further comprises classifying the object to be classified by means of a trained classification model, in case the object to be classified does not belong to an object subject to misjudgment.
According to an embodiment of the disclosure, determining whether the object to be classified belongs to an object prone to misjudgment based on the first predetermined rule includes determining, based on a misjudgment-prone set, whether the object to be classified is similar to an object in that set, and determining that the object to be classified is prone to misjudgment when it is similar to an object in the misjudgment-prone set.
According to an embodiment of the disclosure, the classifying the object to be classified based on the second predetermined rule includes classifying the object to be classified based on a classification result of the object in the misjudgment-prone set when the object to be classified is similar to the object in the misjudgment-prone set.
According to an embodiment of the disclosure, the object to be classified has a plurality of attributes, and determining whether the object to be classified is similar to the object in the misjudgment-prone set includes determining, attribute by attribute, whether each attribute of the object to be classified is similar to the corresponding attribute of the object in the misjudgment-prone set, determining the proportion of attributes in which the object to be classified is similar to the object in the misjudgment-prone set, and determining whether the two objects are similar by comparing that proportion with a preset threshold.
According to an embodiment of the present disclosure, the object to be classified is network flow data, and the obtaining the object to be classified includes obtaining transport layer data of a network, and processing the transport layer data to obtain the network flow data.
Another aspect of the present disclosure provides a classification system including an obtaining module, a first classification module, and a second classification module. The obtaining module is used for obtaining the object to be classified. The first classification module is used for determining, based on a first predetermined rule, whether the object to be classified is prone to misjudgment. The second classification module is used for classifying the object to be classified based on a second predetermined rule when the object to be classified is prone to misjudgment.
According to an embodiment of the disclosure, the system further comprises a third classification module, configured to classify the object to be classified by means of a trained classification model, in case the object to be classified does not belong to an object subject to misjudgment.
According to an embodiment of the disclosure, the first classification module includes a similarity determination sub-module and a first classification sub-module. The similarity determination sub-module is used for determining, based on the misjudgment-prone set, whether the object to be classified is similar to an object in that set. The first classification sub-module is used for determining that the object to be classified is prone to misjudgment when it is similar to an object in the misjudgment-prone set.
According to an embodiment of the disclosure, the second classification module includes a second classification sub-module, configured to classify the object to be classified based on a classification result of the object in the misjudgment-prone set, when the object to be classified is similar to the object in the misjudgment-prone set.
According to an embodiment of the disclosure, the object to be classified has a plurality of attributes, and the similarity determination sub-module includes an attribute comparison unit, a duty ratio (proportion) determining unit, and a similarity determining unit. The attribute comparison unit is used for determining, attribute by attribute, whether each attribute of the object to be classified is similar to the corresponding attribute of the object in the misjudgment-prone set. The duty ratio determining unit is used for determining the proportion of attributes in which the object to be classified is similar to the object in the misjudgment-prone set. The similarity determining unit is used for determining whether the object to be classified is similar to the object in the misjudgment-prone set by comparing that proportion with a preset threshold.
According to an embodiment of the disclosure, the object to be classified is network flow data, and the obtaining module includes an obtaining sub-module and a processing sub-module. The obtaining sub-module is used for obtaining transport layer data of the network. The processing sub-module is used for processing the transport layer data to obtain the network flow data.
Another aspect of the disclosure provides an electronic device comprising at least one processor and at least one memory for storing one or more computer-readable instructions, wherein the one or more computer-readable instructions, when executed by the at least one processor, cause the processor to perform the method as described above.
Another aspect of the present disclosure provides a computer-readable medium having stored thereon computer-readable instructions that, when executed, cause a processor to perform a method as described above.
Another aspect of the present disclosure provides a computer program comprising computer executable instructions which when executed are for implementing a method as described above.
According to the method, whether the object to be classified is prone to misjudgment is determined in advance. If it is, the classification result is determined directly based on the second predetermined rule, without passing the object through the classification model. This largely reduces misclassification and improves classification accuracy.
Drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the following description of embodiments thereof with reference to the accompanying drawings in which:
FIG. 1 schematically illustrates a schematic diagram of a classification method according to an embodiment of the disclosure;
FIG. 2 schematically illustrates a flow chart of a classification method according to an embodiment of the disclosure;
FIG. 3A and 3B schematically illustrate flow charts of determining, based on a misjudgment-prone set, whether the object to be classified is similar to an object in that set according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a block diagram of a classification system according to an embodiment of the disclosure;
FIG. 5 schematically illustrates a block diagram of a similarity determination sub-module, according to an embodiment of the disclosure; and
FIG. 6 schematically illustrates a block diagram of a computer system suitable for implementing the classification method and system according to an embodiment of the disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression such as "at least one of A, B and C, etc." is used, it should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together). Where an expression such as "at least one of A, B or C, etc." is used, it should likewise be interpreted in accordance with the ordinary understanding of one skilled in the art (e.g., "a system having at least one of A, B or C" shall include, but not be limited to, systems having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together). It should also be appreciated by those skilled in the art that virtually any disjunctive word and/or phrase presenting two or more alternative items, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the items, either of the items, or both. For example, the phrase "A or B" should be understood to include the possibility of "A", or "B", or "A and B".
The embodiment of the disclosure provides a classification method that comprises obtaining an object to be classified, determining, based on a first predetermined rule, whether the object to be classified is prone to misjudgment, and, if it is, classifying the object based on a second predetermined rule.
Fig. 1 schematically illustrates a schematic diagram of a classification method according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical contents of the present disclosure.
According to the method, before the classification model is applied, the objects to be classified are pre-screened: objects prone to misjudgment are separated from the rest and handled on their own. As shown in fig. 1, after an object to be classified is obtained, it is assigned either to the misjudgment-prone objects or to the other objects. Other objects can be input directly into the classification model to obtain a classification result. Misjudgment-prone objects are not classified by the classification model; they are classified according to other predetermined rules to obtain a classification result. In this way, the accuracy of the classification result can be improved.
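The routing described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the patented implementation: the names `classify`, `is_misjudgment_prone`, `classify_by_rule`, and `classify_by_model` are hypothetical, and the three callables merely stand in for the first predetermined rule, the second predetermined rule, and the trained classification model.

```python
from typing import Callable, TypeVar

T = TypeVar("T")  # type of the object to be classified
L = TypeVar("L")  # type of the class label


def classify(
    obj: T,
    is_misjudgment_prone: Callable[[T], bool],  # stands in for the first rule
    classify_by_rule: Callable[[T], L],         # stands in for the second rule
    classify_by_model: Callable[[T], L],        # stands in for the trained model
) -> L:
    """Send one object down the rule-based path or the model path."""
    if is_misjudgment_prone(obj):
        # Misjudgment-prone objects never reach the classification model.
        return classify_by_rule(obj)
    return classify_by_model(obj)


# Toy usage: the values 3 and 7 play the role of misjudgment-prone objects.
labels = [
    classify(x, lambda v: v in {3, 7}, lambda v: "rule", lambda v: "model")
    for x in (1, 3, 7, 9)
]
```

Passing the rules and the model in as callables keeps the sketch independent of any particular similarity measure or model type.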
Fig. 2 schematically illustrates a flow chart of a classification method according to an embodiment of the disclosure.
As shown in fig. 2, the method includes operations S210 to S230.
In operation S210, an object to be classified is obtained.
According to an embodiment of the present disclosure, the object to be classified may be network flow data, in which case obtaining the object to be classified includes obtaining transport layer data of a network and processing the transport layer data to obtain the network flow data. Because network flow data has many attributes, it is particularly well suited to the embodiments of the present disclosure, and classification accuracy can be effectively improved. It should be appreciated that the methods of the embodiments of the present disclosure can also be applied to the classification of various other objects.
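The source does not say how transport layer data is turned into network flow data. One common convention, assumed here purely for illustration, is to group packets by their five-tuple (source address, destination address, source port, destination port, protocol) and compute per-flow attributes. The `Packet` record, the `packets_to_flows` function, and the three attributes chosen below are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Packet:
    """Hypothetical transport-layer record; the field names are assumptions."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    timestamp: float  # seconds
    size: int         # bytes


FlowKey = Tuple[str, str, int, int, str]


def packets_to_flows(packets: Iterable[Packet]) -> Dict[FlowKey, Dict[str, float]]:
    """Group packets by five-tuple and derive a few per-flow attributes."""
    groups: Dict[FlowKey, List[Packet]] = defaultdict(list)
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.protocol)
        groups[key].append(p)

    flows: Dict[FlowKey, Dict[str, float]] = {}
    for key, pkts in groups.items():
        times = [p.timestamp for p in pkts]
        flows[key] = {
            "packet_count": len(pkts),
            "total_bytes": sum(p.size for p in pkts),
            "duration": max(times) - min(times),
        }
    return flows


example_packets = [
    Packet("10.0.0.1", "10.0.0.2", 5000, 80, "TCP", 1.0, 100),
    Packet("10.0.0.1", "10.0.0.2", 5000, 80, "TCP", 2.0, 300),
    Packet("10.0.0.3", "10.0.0.2", 6000, 53, "UDP", 1.5, 60),
]
example_flows = packets_to_flows(example_packets)
```

Each resulting flow is a record with several numeric attributes, which is the shape of object the attribute-wise similarity test operates on.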
In operation S220, it is determined whether the object to be classified belongs to an object subject to erroneous judgment based on a first predetermined rule.
According to an embodiment of the disclosure, determining whether the object to be classified belongs to an object prone to misjudgment based on the first predetermined rule includes determining, based on a misjudgment-prone set, whether the object to be classified is similar to an object in that set, and determining that the object to be classified is prone to misjudgment when it is similar to an object in the misjudgment-prone set.
For example, the historical classification results produced by the classification model for a certain number of objects may be analyzed to distinguish correctly classified objects from incorrectly classified ones. Objects that were classified incorrectly, or whose classification accuracy is below a certain threshold, are added to the misjudgment-prone set. When the method of the embodiment of the disclosure is applied, if the object to be classified is similar to an object in the misjudgment-prone set, it is not classified using the classification model.
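As a rough sketch of how such a set might be assembled from historical results, the following Python assumes that each history record is a triple of a hypothetical object identifier, the label the model predicted, and the known correct label. The function name `build_misjudgment_prone_set` and the `accuracy_threshold` parameter are illustrative, not taken from the source.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Tuple

Record = Tuple[Hashable, Hashable, Hashable]  # (object_id, predicted, correct)


def build_misjudgment_prone_set(
    history: Iterable[Record],
    accuracy_threshold: float = 1.0,
) -> Dict[Hashable, Hashable]:
    """Map each misjudgment-prone object to its known correct label.

    An object is added when its historical accuracy is strictly below
    accuracy_threshold. With the default of 1.0, any object that was ever
    classified incorrectly is included.
    """
    correct_count: Dict[Hashable, int] = defaultdict(int)
    total_count: Dict[Hashable, int] = defaultdict(int)
    correct_label: Dict[Hashable, Hashable] = {}

    for object_id, predicted, correct in history:
        total_count[object_id] += 1
        correct_count[object_id] += int(predicted == correct)
        correct_label[object_id] = correct

    return {
        object_id: correct_label[object_id]
        for object_id in total_count
        if correct_count[object_id] / total_count[object_id] < accuracy_threshold
    }


example_history = [
    ("a", "x", "x"),  # always correct
    ("b", "x", "y"),  # always wrong
    ("c", "y", "y"),  # correct once...
    ("c", "x", "y"),  # ...and wrong once
]
prone_any_error = build_misjudgment_prone_set(example_history)
prone_below_half = build_misjudgment_prone_set(example_history, accuracy_threshold=0.5)
```

Storing the correct label alongside each entry is a design choice made here so that the set can later supply a classification result directly.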
According to embodiments of the present disclosure, the object to be classified may have a plurality of attributes. The following describes, with reference to fig. 3A and 3B, determining whether the object to be classified is similar to the object in the misjudgment-prone set according to the embodiment of the present disclosure.
Fig. 3A and 3B schematically illustrate flow charts of determining, based on a misjudgment-prone set, whether the object to be classified is similar to an object in that set according to an embodiment of the present disclosure.
As shown in fig. 3A, the method includes operations S310 to S330.
In operation S310, it is determined, attribute by attribute, whether each attribute of the object to be classified is similar to the corresponding attribute of the object in the misjudgment-prone set.
In operation S320, the proportion of attributes in which the object to be classified is similar to the object in the misjudgment-prone set is determined.
In operation S330, it is determined whether the object to be classified is similar to the object in the misjudgment-prone set by comparing that proportion with a preset threshold.
Please refer to fig. 3B. As shown in fig. 3B, the method includes operations S301 to S306.
In operation S301, an object is obtained.
In operation S302, attributes of an object are traversed.
In operation S303, it is determined whether the attributes are similar. If so, operation S304 is performed; otherwise, the flow returns to operation S302. According to embodiments of the present disclosure, the similarity of attributes may be defined as:
$$C_1 = \frac{|A_{i,j} - A_{0,j}|}{A_{0,j}}$$
where $A_{i,j}$ is the value of the j-th attribute of object i, and $A_{0,j}$ is the value of the j-th attribute of an object in the misjudgment-prone set. The smaller $C_1$ is, the more similar the j-th attributes of the two objects are. A threshold E (0 ≤ E ≤ 1) may be set; when $C_1 \le E$, $A_{i,j}$ and $A_{0,j}$ are considered similar.
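The attribute test can be sketched in Python as follows. Two points are assumptions rather than statements from the source: that two values count as similar when the relative difference does not exceed E (the direction follows from "smaller means more similar"), and the handling of a zero or negative reference value, which the formula leaves undefined. The names `relative_difference` and `attributes_similar` are hypothetical.

```python
def relative_difference(value: float, reference: float) -> float:
    """C_1 = |A_ij - A_0j| / A_0j, with the reference taken from the prone set.

    Assumptions not covered by the source formula: the denominator uses the
    absolute value of the reference so that C_1 stays non-negative, and a
    zero reference yields 0.0 for an exact match and infinity otherwise.
    """
    if reference == 0:
        return 0.0 if value == 0 else float("inf")
    return abs(value - reference) / abs(reference)


def attributes_similar(value: float, reference: float, e: float = 0.1) -> bool:
    """Treat two attribute values as similar when C_1 does not exceed E."""
    if not 0.0 <= e <= 1.0:
        raise ValueError("threshold E must lie in [0, 1]")
    return relative_difference(value, reference) <= e


close_pair = attributes_similar(104.0, 100.0)   # C_1 = 0.04
far_pair = attributes_similar(150.0, 100.0)     # C_1 = 0.5
```

Because the measure is relative, attributes on very different scales (for example byte counts and durations) can share one threshold E.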
In operation S304, the count parameter is increased by 1 for counting the number of similar attributes between two objects.
In operation S305, it is determined whether the traversal is complete. If so, operation S306 is performed; otherwise, the flow returns to operation S302.
In operation S306, a similarity between two objects is calculated.
According to embodiments of the present disclosure, the similarity of objects may be defined as:
$$C_2 = \frac{S}{N},$$
where N is the number of object attributes and S is the number of similar attributes among them, which can be obtained by reading the count parameter. A threshold B (0 ≤ B ≤ 1) may be set: if $C_2 \ge B$, the object is considered similar to the object in the misjudgment-prone set. The larger B is, the more similar two samples must be to pass this test.
When objects have many attributes, calculating similarity with the commonly used Euclidean distance would require working in a space of hundreds of dimensions, where a suitable similarity threshold is hard to determine. The attribute-ratio measure above avoids that difficulty.
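Operations S301 to S306 amount to a counting loop over the attributes. The sketch below is one possible Python rendering, assuming that objects are equal-length sequences of numeric attribute values, that an attribute is similar when its relative difference does not exceed E, and that a zero reference value matches only zero. The function names and the default thresholds are illustrative.

```python
from typing import Sequence


def _attribute_similar(value: float, reference: float, e: float) -> bool:
    # Relative difference against the reference; zero reference is an assumption.
    if reference == 0:
        return value == 0
    return abs(value - reference) / abs(reference) <= e


def object_similarity(
    candidate: Sequence[float], reference: Sequence[float], e: float = 0.1
) -> float:
    """Return C_2 = S / N for a candidate and one object of the prone set."""
    if not reference or len(candidate) != len(reference):
        raise ValueError("objects must have the same non-zero number of attributes")
    count = 0                                       # the count parameter
    for value, ref in zip(candidate, reference):    # S302: traverse attributes
        if _attribute_similar(value, ref, e):       # S303: attribute similar?
            count += 1                              # S304: increment the count
    return count / len(reference)                   # S306: S / N


def objects_similar(
    candidate: Sequence[float],
    reference: Sequence[float],
    e: float = 0.1,
    b: float = 0.8,
) -> bool:
    """Objects are similar when C_2 is at least the threshold B."""
    return object_similarity(candidate, reference, e) >= b


c2 = object_similarity([100.0, 200.0, 300.0, 400.0], [105.0, 200.0, 500.0, 400.0])
```

In this example three of the four attributes fall within E = 0.1 of the reference, so `c2` is 0.75: the pair fails the default B = 0.8 but would pass with B = 0.75.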
Referring back to fig. 2. In operation S230, in the case that the object to be classified belongs to an object subject to misjudgment, the object to be classified is classified based on a second predetermined rule.
According to an embodiment of the present disclosure, the method further comprises classifying the object to be classified by means of a trained classification model, in case the object to be classified does not belong to an object subject to misjudgment.
According to an embodiment of the disclosure, the classifying the object to be classified based on the second predetermined rule includes classifying the object to be classified based on a classification result of the object in the misjudgment-prone set when the object to be classified is similar to the object in the misjudgment-prone set. For example, in the case where the correct classification result of the object in the misjudgment-prone set is known, if the object a to be classified is similar to the object B in the misjudgment-prone set, and the object B belongs to the X category, according to the embodiment of the present disclosure, the object a to be classified may not be sent into the classification model, but may be directly determined as the X category consistent with the classification result of the object B.
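A minimal Python sketch of this label lookup follows. It assumes the misjudgment-prone set is stored as a list of pairs of attribute values and known correct label, with non-empty attribute sequences of equal length. The source does not say what to do when several entries are similar, so picking the most similar one is an assumption made here; `classify_by_prone_set` and `_similarity` are hypothetical names.

```python
from typing import Hashable, List, Optional, Sequence, Tuple

ProneEntry = Tuple[Sequence[float], Hashable]  # (attribute values, correct label)


def _similarity(candidate: Sequence[float], reference: Sequence[float], e: float) -> float:
    """Fraction of attributes whose relative difference does not exceed e."""
    similar = 0
    for value, ref in zip(candidate, reference):
        if ref == 0:
            close = value == 0  # assumption: a zero reference matches only zero
        else:
            close = abs(value - ref) / abs(ref) <= e
        similar += int(close)
    return similar / len(reference)


def classify_by_prone_set(
    candidate: Sequence[float],
    prone_set: List[ProneEntry],
    e: float = 0.1,
    b: float = 0.8,
) -> Optional[Hashable]:
    """Return the label of the most similar prone-set entry, or None.

    None means the candidate is not misjudgment-prone and should be passed
    to the trained classification model instead.
    """
    best_label: Optional[Hashable] = None
    best_score = -1.0
    for attributes, label in prone_set:
        score = _similarity(candidate, attributes, e)
        if score >= b and score > best_score:
            best_label, best_score = label, score
    return best_label


example_prone_set: List[ProneEntry] = [
    ([100.0, 200.0, 300.0, 400.0, 500.0], "X"),
    ([1.0, 2.0, 3.0, 4.0, 5.0], "Y"),
]
hit = classify_by_prone_set([101.0, 199.0, 310.0, 390.0, 900.0], example_prone_set)
miss = classify_by_prone_set([9.0, 9.0, 9.0, 9.0, 9.0], example_prone_set)
```

Here `hit` matches the first entry on four of five attributes and takes its label, while `miss` matches nothing and would fall through to the model.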
According to the method of the embodiments of the disclosure, whether the object to be classified is prone to misjudgment is determined in advance. If it is, the classification result is determined directly based on the second predetermined rule, without passing the object through the classification model. This largely reduces misclassification and improves classification accuracy.
Fig. 4 schematically illustrates a block diagram of a classification system 400 according to an embodiment of the disclosure.
As shown in fig. 4, the classification system 400 includes an obtaining module 410, a first classification module 420, and a second classification module 430.
The obtaining module 410, for example, performs operation S210 described above with reference to fig. 2, for obtaining an object to be classified.
The first classification module 420, for example, performs operation S220 described above with reference to fig. 2, for determining whether the object to be classified belongs to an object subject to erroneous judgment based on a first predetermined rule.
The second classification module 430, for example, performs operation S230 described above with reference to fig. 2, and is configured to classify the object to be classified based on a second predetermined rule in a case where the object to be classified belongs to an object that is prone to misjudgment.
According to an embodiment of the disclosure, the system further comprises a third classification module, configured to classify the object to be classified by means of a trained classification model, in case the object to be classified does not belong to an object subject to misjudgment.
According to an embodiment of the disclosure, the first classification module includes a similarity determination sub-module and a first classification sub-module. The similarity determination sub-module is used for determining, based on the misjudgment-prone set, whether the object to be classified is similar to an object in that set. The first classification sub-module is used for determining that the object to be classified is prone to misjudgment when it is similar to an object in the misjudgment-prone set.
According to an embodiment of the disclosure, the second classification module includes a second classification sub-module, configured to classify the object to be classified based on a classification result of the object in the misjudgment-prone set, when the object to be classified is similar to the object in the misjudgment-prone set.
Fig. 5 schematically illustrates a block diagram of a similarity determination sub-module 500 according to an embodiment of the disclosure.
As shown in fig. 5, the similarity determination sub-module 500 includes an attribute comparison unit 510, a duty ratio determination unit 520, and a similarity determination unit 530.
According to an embodiment of the disclosure, the object to be classified has a plurality of attributes.
The attribute comparison unit 510, for example, performs operation S310 described above with reference to fig. 3A, for determining, attribute by attribute, whether each attribute of the object to be classified is similar to the corresponding attribute of the object in the misjudgment-prone set.
The duty ratio determining unit 520, for example, performs operation S320 described above with reference to fig. 3A, for determining the proportion of attributes in which the object to be classified is similar to the object in the misjudgment-prone set.
The similarity determining unit 530, for example, performs operation S330 described above with reference to fig. 3A, for determining whether the object to be classified is similar to the object in the misjudgment-prone set by comparing that proportion with a preset threshold.
According to an embodiment of the disclosure, the object to be classified is network flow data, and the obtaining module includes an obtaining sub-module and a processing sub-module. The obtaining sub-module is used for obtaining transport layer data of the network. The processing sub-module is used for processing the transport layer data to obtain the network flow data.
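The module structure can be pictured as a small composition of callables. The Python below is only one hypothetical way to wire it up: the class names `ObtainingModule` and `ClassificationSystem`, their field names, and the error raised when no model is configured are assumptions, and the comments indicate which described module each field stands in for.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ObtainingModule:
    """Stands in for the obtaining module and its two sub-modules."""
    obtain_transport_data: Callable[[Any], Any]  # obtaining sub-module
    to_flow_data: Callable[[Any], Any]           # processing sub-module

    def __call__(self, source: Any) -> Any:
        return self.to_flow_data(self.obtain_transport_data(source))


@dataclass
class ClassificationSystem:
    obtaining: Callable[[Any], Any]               # obtaining module 410
    first: Callable[[Any], bool]                  # first classification module 420
    second: Callable[[Any], Any]                  # second classification module 430
    third: Optional[Callable[[Any], Any]] = None  # optional third module (model)

    def run(self, source: Any) -> Any:
        obj = self.obtaining(source)
        if self.first(obj):
            return self.second(obj)
        if self.third is None:
            raise RuntimeError("no classification model configured for this object")
        return self.third(obj)


# Toy wiring: comma-separated text plays the role of raw transport layer data.
system = ClassificationSystem(
    obtaining=ObtainingModule(
        obtain_transport_data=lambda text: text.split(","),
        to_flow_data=lambda parts: [float(p) for p in parts],
    ),
    first=lambda obj: sum(obj) > 10,
    second=lambda obj: "rule",
    third=lambda obj: "model",
)
results = [system.run("1,2"), system.run("5,6")]
```

Keeping each module behind a plain callable reflects the statement that modules may be merged, split, or implemented in hardware or software without changing the overall flow.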
Any number of modules, sub-modules, units, sub-units, or at least some of the functionality of any number of the sub-units according to embodiments of the present disclosure may be implemented in one module. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented as split into multiple modules. Any one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be implemented at least in part as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system-on-chip, a system-on-substrate, a system-on-package, an Application Specific Integrated Circuit (ASIC), or in any other reasonable manner of hardware or firmware that integrates or encapsulates the circuit, or in any one of or a suitable combination of three of software, hardware, and firmware. Alternatively, one or more of the modules, sub-modules, units, sub-units according to embodiments of the present disclosure may be at least partially implemented as computer program modules, which when executed, may perform the corresponding functions.
For example, any of the obtaining module 410, the first classifying module 420, the second classifying module 430, the third classifying module, the similarity judging sub-module, the first classifying sub-module, the second classifying sub-module, the attribute comparing unit 510, the duty ratio determining unit 520, the similarity judging unit 530, the obtaining sub-module, and the processing sub-module may be combined in one module to be implemented, or any of the modules may be split into a plurality of modules. Alternatively, at least some of the functionality of one or more of the modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the obtaining module 410, the first classifying module 420, the second classifying module 430, the third classifying module, the similarity judging sub-module, the first classifying sub-module, the second classifying sub-module, the attribute comparing unit 510, the duty ratio determining unit 520, the similarity judging unit 530, the obtaining sub-module, and the processing sub-module may be at least partially implemented as a hardware circuit, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system on a substrate, a system on a package, an Application Specific Integrated Circuit (ASIC), or may be implemented in hardware or firmware in any other reasonable manner of integrating or packaging the circuits, or in any one of or a suitable combination of three of software, hardware, and firmware. 
Alternatively, at least one of the obtaining module 410, the first classifying module 420, the second classifying module 430, the third classifying module, the similarity judging sub-module, the first classifying sub-module, the second classifying sub-module, the attribute comparing unit 510, the duty ratio determining unit 520, the similarity judging unit 530, the obtaining sub-module, and the processing sub-module may be at least partially implemented as a computer program module, which may perform the corresponding functions when being executed.
Fig. 6 schematically illustrates a block diagram of a computer system 600 suitable for implementing the classification method and system according to an embodiment of the disclosure. The computer system illustrated in fig. 6 is merely an example and should not be construed as limiting the functionality and scope of use of the embodiments of the present disclosure. The computer system shown in fig. 6 may be implemented as an electronic device including at least one processor (e.g., processor 601) and at least one memory (e.g., storage section 608).
As shown in fig. 6, a computer system 600 according to an embodiment of the present disclosure includes a processor 601 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 602 or a program loaded from a storage section 608 into a Random Access Memory (RAM) 603. The processor 601 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. Processor 601 may also include on-board memory for caching purposes. The processor 601 may comprise a single processing unit or a plurality of processing units for performing different actions of the method flows according to embodiments of the disclosure.
In the RAM 603, various programs and data required for the operation of the system 600 are stored. The processor 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604. The processor 601 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 602 and/or the RAM 603. Note that the program may be stored in one or more memories other than the ROM 602 and the RAM 603. The processor 601 may also perform various operations of the method flow according to embodiments of the present disclosure by executing programs stored in the one or more memories.
According to an embodiment of the present disclosure, the system 600 may further include an input/output (I/O) interface 605, which is also connected to the bus 604. The system 600 may also include one or more of the following components connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card, a modem, or the like. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
According to embodiments of the present disclosure, the method flow may be implemented as a computer software program. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611. The above-described functions defined in the system of the embodiments of the present disclosure are performed when the computer program is executed by the processor 601. The systems, devices, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
The present disclosure also provides a computer-readable medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer readable medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer readable medium may be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, fiber optic cable, radio frequency signals, or the like, or any suitable combination of the foregoing.
For example, according to embodiments of the present disclosure, the computer-readable medium may include ROM 602 and/or RAM 603 and/or one or more memories other than ROM 602 and RAM 603 described above.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or integrated in a variety of ways, even if such combinations or integrations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be combined and/or integrated in various ways without departing from the spirit and teachings of the present disclosure. All such combinations and/or integrations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these embodiments are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described separately above, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.
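As one illustration of the step of processing transport layer data to obtain network flow data, the following minimal Python sketch groups transport layer segments into flows and derives a few per-flow attributes. The disclosure does not specify how flows are formed in this section; keying on the unidirectional 5-tuple, the chosen attributes, and all names (`Segment`, `flow_key`, `build_flows`) are assumptions made for this example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Segment:
    """Hypothetical record for one captured transport layer segment."""
    timestamp: float
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    payload_len: int


FlowKey = Tuple[str, int, str, int, str]


def flow_key(seg: Segment) -> FlowKey:
    """Assumed flow definition: the unidirectional 5-tuple."""
    return (seg.src_ip, seg.src_port, seg.dst_ip, seg.dst_port, seg.protocol)


def build_flows(segments: Iterable[Segment]) -> Dict[FlowKey, Dict[str, float]]:
    """Group segments by flow key and compute simple per-flow attributes."""
    grouped: Dict[FlowKey, List[Segment]] = defaultdict(list)
    for seg in segments:
        grouped[flow_key(seg)].append(seg)

    flows: Dict[FlowKey, Dict[str, float]] = {}
    for key, segs in grouped.items():
        times = [s.timestamp for s in segs]
        total_bytes = sum(s.payload_len for s in segs)
        flows[key] = {
            "segment_count": float(len(segs)),
            "total_bytes": float(total_bytes),
            "mean_payload": total_bytes / len(segs),
            "duration": max(times) - min(times),
        }
    return flows
```

Each resulting attribute dictionary could then serve as the object to be classified. A bidirectional key or additional attributes would work equally well under this scheme.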

Claims (8)

1. A classification method, comprising:
obtaining an object to be classified, wherein the object to be classified is network flow data;
respectively determining whether each attribute of the object to be classified is similar to each attribute of an object in a misjudgment-prone set;
determining the proportion, by number, of the attributes of the object to be classified that are similar to those of the object in the misjudgment-prone set;
determining whether the object to be classified is similar to the object in the misjudgment-prone set by comparing the proportion with a preset threshold;
in the case that the object to be classified is similar to the object in the misjudgment-prone set, determining that the object to be classified is a misjudgment-prone object;
in the case that the object to be classified is a misjudgment-prone object, classifying the object to be classified based on a second preset rule; and
in the case that the object to be classified is not a misjudgment-prone object, classifying the object to be classified through a trained classification model.
2. The method according to claim 1, wherein, in the case that the object to be classified is a misjudgment-prone object, classifying the object to be classified based on the second preset rule comprises:
in the case that the object to be classified is similar to the object in the misjudgment-prone set, classifying the object to be classified based on the classification result of the object in the misjudgment-prone set.
3. The method of claim 1, wherein the object to be classified is network flow data, the obtaining the object to be classified comprising:
obtaining transport layer data of a network; and
processing the transport layer data to obtain the network flow data.
4. A classification system, comprising:
an obtaining module, configured to obtain an object to be classified, wherein the object to be classified is network flow data;
a first classification module, comprising a similarity judging sub-module and a first classification sub-module;
wherein the similarity judging sub-module comprises an attribute comparing unit, a duty ratio determining unit, and a similarity judging unit; the attribute comparing unit is configured to respectively determine whether each attribute of the object to be classified is similar to each attribute of an object in a misjudgment-prone set; the duty ratio determining unit is configured to determine the proportion, by number, of the attributes of the object to be classified that are similar to those of the object in the misjudgment-prone set; and the similarity judging unit is configured to determine whether the object to be classified is similar to the object in the misjudgment-prone set by comparing the proportion with a preset threshold;
the first classification sub-module is configured to determine that the object to be classified is a misjudgment-prone object in the case that the object to be classified is similar to the object in the misjudgment-prone set;
a second classification module, configured to classify the object to be classified based on a second preset rule in the case that the object to be classified is a misjudgment-prone object; and
a third classification module, configured to classify the object to be classified through a trained classification model in the case that the object to be classified is not a misjudgment-prone object.
5. The system of claim 4, wherein the second classification module comprises:
a second classification sub-module, configured to classify the object to be classified based on the classification result of the object in the misjudgment-prone set in the case that the object to be classified is similar to the object in the misjudgment-prone set.
6. The system of claim 4, wherein the object to be classified is network flow data, the obtaining module comprising:
an obtaining sub-module, configured to obtain transport layer data of a network; and
a processing sub-module, configured to process the transport layer data to obtain the network flow data.
7. An electronic device, comprising:
one or more processors;
a memory for storing one or more computer programs,
wherein the one or more computer programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1 to 3.
8. A computer readable medium having stored thereon executable instructions which when executed by a processor cause the processor to implement the method of any of claims 1 to 3.
CN201811024493.7A 2018-09-03 2018-09-03 Classification method, classification system and electronic equipment Active CN110874608B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811024493.7A CN110874608B (en) 2018-09-03 2018-09-03 Classification method, classification system and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811024493.7A CN110874608B (en) 2018-09-03 2018-09-03 Classification method, classification system and electronic equipment

Publications (2)

Publication Number Publication Date
CN110874608A CN110874608A (en) 2020-03-10
CN110874608B true CN110874608B (en) 2024-04-05

Family

ID=69716895

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811024493.7A Active CN110874608B (en) 2018-09-03 2018-09-03 Classification method, classification system and electronic equipment

Country Status (1)

Country Link
CN (1) CN110874608B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104484461A (en) * 2014-12-29 2015-04-01 北京奇虎科技有限公司 Method and system based on encyclopedia data for classifying entities
WO2017088125A1 (en) * 2015-11-25 2017-06-01 中国科学院自动化研究所 Dense matching relation-based rgb-d object recognition method using adaptive similarity measurement, and device
CN107644364A (en) * 2017-09-18 2018-01-30 北京京东尚科信息技术有限公司 Object filter method and system
CN108182279A (en) * 2018-01-26 2018-06-19 有米科技股份有限公司 Object classification method, device and computer equipment based on text feature
CN108388924A (en) * 2018-03-08 2018-08-10 平安科技(深圳)有限公司 A kind of data classification method, device, equipment and computer readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9785866B2 (en) * 2015-01-22 2017-10-10 Microsoft Technology Licensing, Llc Optimizing multi-class multimedia data classification using negative data
DE112015007176T5 (en) * 2015-12-10 2018-08-23 Intel Corporation Visual recognition using deep learning attributes
US9846822B2 (en) * 2015-12-31 2017-12-19 Dropbox, Inc. Generating and utilizing normalized scores for classifying digital objects

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
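Multi-label classification method for harmful information based on label similarity (基于标签相似度度的不良信息多标签分类方法); Liu Zhuoran et al.; Application Research of Computers (No. 4); full text *
Confusion class discrimination techniques for text classification (面向文本分类的混淆类判别技术); Zhu Jingbo; Wang Huizhen; Zhang Xijuan; Journal of Software (No. 3); full text *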

Also Published As

Publication number Publication date
CN110874608A (en) 2020-03-10

Similar Documents

Publication Publication Date Title
CN108197652B (en) Method and apparatus for generating information
EP3620988A1 (en) Method, device for optimizing simulation data, and computer-readable storage medium
US11379718B2 (en) Ground truth quality for machine learning models
CN111145076B (en) Data parallelization processing method, system, equipment and storage medium
CN110489345A (en) A kind of collapse polymerization, device, medium and equipment
US20210279618A1 (en) System and method for building and using learning machines to understand and explain learning machines
CN112329762A (en) Image processing method, model training method, device, computer device and medium
CN111611390B (en) Data processing method and device
CN111523558A (en) Ship shielding detection method and device based on electronic purse net and electronic equipment
CN111291715B (en) Vehicle type identification method based on multi-scale convolutional neural network, electronic device and storage medium
US8141015B1 (en) Reporting status of timing exceptions
CN109446324B (en) Sample data processing method and device, storage medium and electronic equipment
CN116109907B (en) Target detection method, target detection device, electronic equipment and storage medium
CN117370767A (en) User information evaluation method and system based on big data
CN110874608B (en) Classification method, classification system and electronic equipment
CN114693052A (en) Risk prediction model training method and device, computing equipment and medium
CN112906726B (en) Model training method, image processing device, computing equipment and medium
CN110766228B (en) Method, device, picking system, electronic device and medium for picking
CN112214770A (en) Malicious sample identification method and device, computing equipment and medium
CN112686298A (en) Target detection method and device and electronic equipment
CN111461152A (en) Cargo detection method and device, electronic equipment and computer readable medium
CN110298302A (en) A kind of human body target detection method and relevant device
CN113869904B (en) Suspicious data identification method, device, electronic equipment, medium and computer program
CN115984207A (en) Vehicle defect detection method, device, system and medium
CN115409985A (en) Target object detection method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant after: Jingdong Technology Holding Co.,Ltd.

Address before: Room 221, 2nd floor, Block C, 18 Kechuang 11th Street, Daxing Economic and Technological Development Zone, Beijing, 100176

Applicant before: BEIJING JINGDONG FINANCIAL TECHNOLOGY HOLDING Co.,Ltd.

GR01 Patent grant