CN109214435A - A kind of data classification method and device - Google Patents

A kind of data classification method and device

Publication number
CN109214435A
Authority
CN
China
Prior art keywords
data
class
label
classification processing
class label
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810956382.3A
Other languages
Chinese (zh)
Inventor
李明
孙翯
池天宇
刘冬阳
张启龙
王玲玲
黎佳林
胡海波
张仲朋
薛旭锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Harmony Information Technology Ltd By Share Ltd
Original Assignee
Beijing Harmony Information Technology Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Harmony Information Technology Ltd By Share Ltd filed Critical Beijing Harmony Information Technology Ltd By Share Ltd
Priority to CN201810956382.3A priority Critical patent/CN109214435A/en
Publication of CN109214435A publication Critical patent/CN109214435A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155Bayesian classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • G06F18/232Non-hierarchical techniques
    • G06F18/2321Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the invention disclose a data classification method and device. The method comprises: after obtaining randomly combined source data in real time, first performing classification processing on the randomly combined source data to obtain multiple class data, wherein each class data among the multiple class data corresponds to one class label; meanwhile, determining the weight of the class label corresponding to each class data; and then storing the determined weight of the class label corresponding to each class data in the tag library in correspondence with that class label. In this way, real-time association between data and class labels can be achieved in the tag system business module, errors in data classification are effectively avoided, and the accuracy and availability of the entire data management system are improved.

Description

A data classification method and device
Technical field
The present invention relates to big data processing technology, and in particular to a data classification method and device.
Background
Currently, a big data management system mainly comprises three business modules: a data management platform, a tag system and a model platform (MMP). Under high-concurrency big data conditions, the data held by the business modules of the data management system is usually required to be updated in real time as far as possible.
However, big data management systems in the related art have obvious defects: 1) some of the updates between data and labels do not update the degree of association in real time; 2) the tag library is not kept up to date in real time; 3) the setting of labels is subject to considerable limitations.
In actual data management, these defects of the big data management system directly cause errors in data classification, which in turn affect the accuracy and availability of the entire big data management system.
Summary of the invention
To effectively overcome the defects of big data management systems in the prior art, the embodiments of the present invention creatively provide a data classification method and device.
According to a first aspect of the invention, a data classification method is provided, the method comprising: obtaining randomly combined source data; performing classification processing on the randomly combined source data to obtain multiple class data, wherein each class data among the multiple class data corresponds to one class label; determining the weight of the class label corresponding to each class data; and storing the determined weight of the class label corresponding to each class data in the tag library in correspondence with that class label.
According to an embodiment of the invention, performing classification processing on the randomly combined source data to obtain multiple class data comprises: performing classification processing on the randomly combined source data based on preset class labels, to obtain class data that each have a mapping relationship with the preset class labels. Determining the weight of the class label corresponding to each class data comprises: in the course of performing classification processing on the randomly combined source data based on the preset class labels, determining the weights of the preset class labels according to at least one feature dimension of the randomly combined source data.
According to an embodiment of the invention, performing classification processing on the randomly combined source data to obtain multiple class data comprises: performing primary classification processing on the randomly combined source data according to different feature dimensions using a first specific classification algorithm, to obtain a primary classification result; and performing secondary classification processing on the primary classification result by refining the feature dimensions using a second specific classification algorithm, to obtain multiple class data. Determining the weight of the class label corresponding to each class data comprises: in the course of performing the secondary classification processing on the primary classification result by refining each feature dimension using the second specific classification algorithm, determining the weight of the class label corresponding to each class data according to the proportion that each class data accounts for among the multiple class data.
According to an embodiment of the invention, the first specific classification algorithm includes at least one of the following algorithms: clustering, classification tree, random forest.
According to an embodiment of the invention, the second specific classification algorithm includes at least one of the following algorithms: Bayes, logistic regression.
According to an embodiment of the invention, the method further comprises: in response to the application of a class label, performing an update operation on the weight of the class label corresponding to each class data.
According to a second aspect of the invention, a data classification device is provided, the device comprising: an acquisition module, configured to obtain randomly combined source data; a classification processing module, configured to perform classification processing on the randomly combined source data to obtain multiple class data, wherein each class data among the multiple class data corresponds to one class label; a determining module, configured to determine the weight of the class label corresponding to each class data; and a storage module, configured to store the determined weight of the class label corresponding to each class data in the tag library in correspondence with that class label.
According to an embodiment of the invention, the classification processing module is further configured to perform classification processing on the randomly combined source data based on preset class labels, to obtain class data that each have a mapping relationship with the preset class labels; and the determining module is further configured to determine the weights of the preset class labels according to at least one feature dimension of the randomly combined source data, in the course of the classification processing module performing classification processing on the randomly combined source data based on the preset class labels.
According to an embodiment of the invention, the classification processing module is further configured to perform primary classification processing on the randomly combined source data according to different feature dimensions using a first specific classification algorithm, to obtain a primary classification result, and to perform secondary classification processing on the primary classification result by refining the feature dimensions using a second specific classification algorithm, to obtain multiple class data; and the determining module is further configured to determine the weight of the class label corresponding to each class data according to the proportion that each class data accounts for among the multiple class data, in the course of the classification processing module performing the secondary classification processing on the primary classification result by refining each feature dimension using the second specific classification algorithm.
According to an embodiment of the invention, the device further comprises: an update module, configured to perform, in response to the application of a class label, an update operation on the weight of the class label corresponding to each class data.
With the data classification method and device of the embodiments of the invention, after randomly combined source data is obtained in real time, classification processing is first performed on the randomly combined source data to obtain multiple class data, wherein each class data among the multiple class data corresponds to one class label; meanwhile, the weight of the class label corresponding to each class data is determined; and then the determined weight of the class label corresponding to each class data is stored in the tag library in correspondence with that class label. In this way, real-time association between data and class labels can be achieved in the tag system business module, errors in data classification are effectively avoided, and the accuracy and availability of the entire data management system are improved.
It is to be understood that the teachings of the invention need not achieve all of the beneficial effects described above; a specific technical solution may achieve a specific technical effect, and other embodiments of the invention may also achieve beneficial effects not mentioned above.
Brief description of the drawings
The above and other objects, features and advantages of exemplary embodiments of the invention will become readily understood by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the invention are shown by way of example and not limitation, in which:
In the drawings, identical or corresponding reference numerals denote identical or corresponding parts.
Fig. 1 shows the structure of the data management system of the invention;
Fig. 2 shows a schematic flowchart of one implementation of the data classification method of an embodiment of the invention;
Fig. 3 shows a schematic flowchart of another implementation of the data classification method of an embodiment of the invention;
Fig. 4 shows a schematic structural diagram of the data classification device of an embodiment of the invention.
Detailed description of the embodiments
The principle and spirit of the invention are described below with reference to several illustrative embodiments. It should be understood that these embodiments are provided only so that those skilled in the art can better understand and thereby implement the invention, and not to limit the scope of the invention in any way. Rather, these embodiments are provided so that the invention will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The technical solution of the invention is further elaborated below with reference to the drawings and specific embodiments.
Fig. 1 shows the structure of the data management system of the invention.
As shown in Fig. 1, the data management system of the invention comprises a data management platform (DMP), a tag system (LOS) and a model platform (MMP).
The DMP is mainly used for cleaning, filtering and repairing data; the LOS is mainly used for updating and storing data class labels; and the MMP is mainly used for creating, maintaining and storing models.
The operation of the entire data management system mainly comprises the following key steps:
In the first step, the DMP pulls data and then formats it; the formatted data vectors are then input to the service pool for cleaning and filtering; abnormal data that does not pass the service pool is further repaired to ensure the accuracy of the data.
In the second step, the data cleaned, filtered and repaired by the DMP is input to the LOS system for classification processing, and the data carrying weights and class labels is stored in the tag library.
In the third step, the MMP analyzes and screens the data with weights and class labels stored in the tag library, by automatically creating a model or modifying a model.
It should be added here that, in the second step, the class label may be a common label or an anonymous label.
The following mainly describes in detail how the LOS system is implemented.
Fig. 2 shows a schematic flowchart of an implementation of the data classification method of an embodiment of the invention.
As shown in Fig. 2, the data classification method of the embodiment of the invention comprises: operation 201, obtaining randomly combined source data; operation 202, performing classification processing on the randomly combined source data to obtain multiple class data, wherein each class data among the multiple class data corresponds to one class label; operation 203, determining the weight of the class label corresponding to each class data; and operation 204, storing the determined weight of the class label corresponding to each class data in the tag library in correspondence with that class label.
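Operations 201 to 204 can be pictured with a minimal Python sketch. The record fields, the `label_of` rule and the use of each class's share of records as its weight are illustrative assumptions; the source does not prescribe them at this point.

```python
from collections import defaultdict

def classify(records, label_of):
    """Operation 202: group records into class data, one class label per group."""
    classes = defaultdict(list)
    for record in records:
        classes[label_of(record)].append(record)
    return dict(classes)

def determine_weights(classes):
    """Operation 203: weight of each class label, taken here as the share of
    all records that fall into its class (an assumption for illustration)."""
    total = sum(len(members) for members in classes.values())
    return {label: len(members) / total for label, members in classes.items()}

def store(tag_library, weights):
    """Operation 204: store each weight in correspondence with its class label."""
    tag_library.update(weights)
    return tag_library

# Operation 201: obtain source data (hypothetical driving records).
source = [{"speed": 35}, {"speed": 80}, {"speed": 110}, {"speed": 90}]

def label_of(record):
    # Hypothetical classification rule standing in for the real classifier.
    return "fast" if record["speed"] >= 80 else "slow"

tag_library = store({}, determine_weights(classify(source, label_of)))
print(tag_library)  # {'slow': 0.25, 'fast': 0.75}
```

In this sketch the tag library is a plain dictionary keyed by class label; a real tag library would presumably be a persistent store.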
In operation 202, depending on the type of class label, there are two different implementations of the classification processing of the randomly combined source data.
For common labels, in operation 202, performing classification processing on the randomly combined source data to obtain multiple class data comprises: performing classification processing on the randomly combined source data based on preset class labels, to obtain class data that each have a mapping relationship with the preset class labels. Here, the preset class labels are usually determined manually by the user. The preset class labels include driving score, driving speed, driving duration, and the like.
Correspondingly, in operation 203, determining the weight of the class label corresponding to each class data comprises: in the course of performing classification processing on the randomly combined source data based on the preset class labels, determining the weights of the preset class labels according to at least one feature dimension of the randomly combined source data. The at least one feature dimension may include time, region, position, and the like.
In an application example, a user wants to analyze and screen the driving behavior of all drivers in the Beijing area, and therefore presets a series of class labels, such as driving score, driving speed and driving duration. The source data is then partitioned with a support vector machine (SVM), which realizes the classification processing of the data and maps the DMP data to the series of preset class labels in a many-to-many manner. At the same time, the weights are initialized according to at least one feature dimension of the source data, such as time, region or position. The weights can, of course, be updated dynamically.
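The many-to-many mapping and weight initialization in this example can be sketched as follows. This is only an illustration under stated assumptions: simple threshold predicates stand in for the trained SVM, the field names and thresholds are hypothetical, and the initialization rule (the share of a label's records that match a given feature-dimension value) is one possible choice that the source does not specify.

```python
# Preset class labels chosen by the user. Each predicate is a hypothetical
# stand-in for the decision of a trained SVM; it is not the SVM itself.
PRESET_LABELS = {
    "driving_score": lambda r: r["score"] >= 80,
    "driving_speed": lambda r: r["speed_kmh"] >= 60,
    "driving_duration": lambda r: r["hours"] >= 4,
}

def map_to_labels(records):
    """Many-to-many mapping: one record may fall under several labels,
    and one label may cover several records."""
    mapping = {label: [] for label in PRESET_LABELS}
    for record in records:
        for label, matches in PRESET_LABELS.items():
            if matches(record):
                mapping[label].append(record)
    return mapping

def init_weights(mapping, dimension, value):
    """Initial weight of a label = share of its records whose feature
    dimension (e.g. region) equals the given value. Assumed rule."""
    weights = {}
    for label, members in mapping.items():
        hits = sum(1 for r in members if r[dimension] == value)
        weights[label] = hits / len(members) if members else 0.0
    return weights

records = [
    {"score": 90, "speed_kmh": 70, "hours": 5, "region": "Beijing"},
    {"score": 85, "speed_kmh": 40, "hours": 2, "region": "Beijing"},
    {"score": 60, "speed_kmh": 90, "hours": 6, "region": "Tianjin"},
]
mapping = map_to_labels(records)
weights = init_weights(mapping, "region", "Beijing")
print(weights)
# {'driving_score': 1.0, 'driving_speed': 0.5, 'driving_duration': 0.5}
```

The first record appears under all three labels, which is what makes the mapping many-to-many rather than a partition.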
In this way, the setting of class labels innovatively adds a communication mechanism among the MMP, LOS and DMP, which improves the activity of the data, ties data, labels and original data more closely together, and improves the accuracy of the data index.
In the classification process for common labels, it is easy to see that the whole classification processing relies mainly on manual labelling, which brings subjectivity and unknowns. Common labels therefore cannot satisfy the user's need to refine the data, nor guarantee the user's control over the data, so an anonymous tag system needs to be established.
The classification processing of source data for anonymous labels is described in detail below.
For anonymous labels, in operation 202, performing classification processing on the randomly combined source data to obtain multiple class data comprises: performing primary classification processing on the randomly combined source data according to different feature dimensions using a first specific classification algorithm, to obtain a primary classification result; and performing secondary classification processing on the primary classification result by refining the feature dimensions using a second specific classification algorithm, to obtain multiple class data. The first specific classification algorithm includes at least one of the following algorithms: clustering, classification tree, random forest.
Correspondingly, in operation 203, determining the weight of the class label corresponding to each class data comprises: in the course of performing the secondary classification processing on the primary classification result by refining each feature dimension using the second specific classification algorithm, determining the weight of the class label corresponding to each class data according to the proportion that each class data accounts for among the multiple class data. The second specific classification algorithm includes at least one of the following algorithms: Bayes, logistic regression.
In an application example, the first specific classification algorithm and the second specific classification algorithm may be clustering and Bayes respectively, which together realize the classification processing of the source data. The general classification approach is as follows: the data is clustered according to different feature dimensions, such as a region dimension, a time dimension, a work dimension and a residence dimension; Bayes is then used to refine each feature dimension and to initialize the weight of each anonymous label obtained after refinement.
For example:
Step 1, all data is first serialized and normalized according to its coordinate vectors, mainly to make the later similarity calculation easier and to reduce its computational complexity.
Step 2, the similarity between data vectors is calculated and clustering is performed according to similarity. Each class stands alone as one anonymous label. Classification is then performed again inside each class, and so on, until the difference between the elements inside each class is sufficiently small (this can be judged against a configurable similarity threshold).
Step 3, the time period is roughly divided into 24 intervals, each time interval being 1 hour. Step 2 is executed for each interval, and the weight of each label is calculated. There are many ways to calculate the weight; one that is relatively easy to understand is given here: the element counts of the similar class are summed over all time intervals and the sum is denoted S; the element count of the corresponding class in each time interval is denoted e; the weight is e/S*100%.
Step 4, the same process is executed for all labels to obtain the weight of each label.
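The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions: cosine similarity on unit-length vectors is one possible similarity measure, the greedy single-level clustering omits the recursive re-clustering inside each class, and the function names, the 0.9 threshold and the sample data are hypothetical.

```python
import math
from collections import Counter

def normalize(v):
    """Step 1: scale a vector to unit length so similarities are comparable."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def cluster(vectors, threshold=0.9):
    """Step 2 (single level only): greedy clustering by cosine similarity.
    A vector joins the first cluster whose first member is similar enough;
    otherwise it starts a new cluster, i.e. a new anonymous label."""
    clusters = []
    for v in map(normalize, vectors):
        for members in clusters:
            if sum(a * b for a, b in zip(members[0], v)) >= threshold:
                members.append(v)
                break
        else:
            clusters.append([v])
    return clusters

def hourly_weights(events):
    """Step 3: events are (hour, label) pairs with hour in 0..23.
    Returns {label: {hour: weight in percent}} with weight = e / S * 100."""
    per_label = {}
    for hour, label in events:
        per_label.setdefault(label, Counter())[hour] += 1
    weights = {}
    for label, counts in per_label.items():
        s = sum(counts.values())  # S: elements of this class over all intervals
        # e: elements of this class within one hourly interval
        weights[label] = {hour: e / s * 100 for hour, e in counts.items()}
    return weights

clusters = cluster([(1, 0), (2, 0.1), (0, 1), (0.1, 3)])
events = [(8, "A"), (8, "A"), (9, "A"), (18, "A"), (8, "B"), (22, "B")]
w = hourly_weights(events)
print(len(clusters), w["A"][8], w["B"][22])  # 2 50.0 50.0
```

With this weighting the hourly weights of one label always sum to 100%, so a weight reads directly as the share of that label's elements that fall into a given hour.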
In this way, the setting of class labels innovatively adds a communication mechanism among the MMP, LOS and DMP, which improves the activity of the data, ties data, labels and original data more closely together, and improves the accuracy of the data index. Moreover, the embodiments of the invention creatively add anonymous labels, which give the data a finer granularity and make it possible to mine the potential features of the data.
According to a possible embodiment of the invention, as shown in Fig. 3, after operation 204 the method further comprises: operation 205, in response to the application of a class label, performing an update operation on the weight of the class label corresponding to each class data.
Here, the application of a class label may be the MMP system triggering a model index. When a model index is triggered, the update operation on the weight of the class label corresponding to each class data is started.
It should be added that, in practical applications, the update operation on the weight of the class label corresponding to each class data may be performed based on user satisfaction.
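The source does not state how the weight is updated. Purely as an illustration of operation 205, the sketch below blends the stored weight with a user satisfaction score each time a class label is applied; the blending rule, the `rate` parameter and the [0, 1] satisfaction scale are all assumptions.

```python
def update_weight(tag_library, label, satisfaction, rate=0.2):
    """Move the stored weight of `label` toward a satisfaction score in [0, 1].
    The blending rule and `rate` are assumptions, not taken from the source."""
    if not 0.0 <= satisfaction <= 1.0:
        raise ValueError("satisfaction must be in [0, 1]")
    old = tag_library[label]
    tag_library[label] = (1 - rate) * old + rate * satisfaction
    return tag_library[label]

# Hypothetical tag library; the update would be triggered when the MMP
# applies the label, for example when a model index is triggered.
tag_library = {"fast": 0.75, "slow": 0.25}
new_weight = update_weight(tag_library, "fast", satisfaction=1.0)
print(round(new_weight, 3))
```

A small `rate` keeps the weight stable against a single unusual rating, while a larger one makes the tag library react faster to feedback.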
With the data classification method and device of the embodiments of the invention, after randomly combined source data is obtained in real time, classification processing is first performed on the randomly combined source data to obtain multiple class data, wherein each class data among the multiple class data corresponds to one class label; meanwhile, the weight of the class label corresponding to each class data is determined; and then the determined weight of the class label corresponding to each class data is stored in the tag library in correspondence with that class label. In this way, real-time association between data and class labels can be achieved in the tag system business module, errors in data classification are effectively avoided, and the accuracy and availability of the entire data management system are improved.
Fig. 4 shows a schematic structural diagram of the data classification device of an embodiment of the invention. As shown in Fig. 4, the data classification device 40 comprises:
an acquisition module 401, configured to obtain randomly combined source data;
a classification processing module 402, configured to perform classification processing on the randomly combined source data to obtain multiple class data, wherein each class data among the multiple class data corresponds to one class label;
a determining module 403, configured to determine the weight of the class label corresponding to each class data;
a storage module 404, configured to store the determined weight of the class label corresponding to each class data in the tag library in correspondence with that class label.
According to an embodiment of the invention, the classification processing module 402 is further configured to perform classification processing on the randomly combined source data based on preset class labels, to obtain class data that each have a mapping relationship with the preset class labels; and the determining module 403 is further configured to determine the weights of the preset class labels according to at least one feature dimension of the randomly combined source data, in the course of the classification processing module performing classification processing on the randomly combined source data based on the preset class labels.
According to an embodiment of the invention, the classification processing module 402 is further configured to perform primary classification processing on the randomly combined source data according to different feature dimensions using a first specific classification algorithm, to obtain a primary classification result, and to perform secondary classification processing on the primary classification result by refining the feature dimensions using a second specific classification algorithm, to obtain multiple class data; and the determining module 403 is further configured to determine the weight of the class label corresponding to each class data according to the proportion that each class data accounts for among the multiple class data, in the course of the classification processing module performing the secondary classification processing on the primary classification result by refining each feature dimension using the second specific classification algorithm.
According to an embodiment of the invention, as shown in Fig. 4, the device 40 further comprises: an update module 405, configured to perform, in response to the application of a class label, an update operation on the weight of the class label corresponding to each class data.
It should be pointed out here that the above description of the data classification device embodiment is similar to the description of the foregoing method embodiment and has beneficial effects similar to those of the method embodiment, so it is not repeated. For technical details not disclosed in the data classification device embodiment of the invention, please refer to the description of the method embodiment of the invention; to save space, they are not repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes that element.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or of other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, the functional units in the embodiments of the invention may all be integrated into one processing unit, or each unit may stand alone as one unit, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM), a magnetic disk or an optical disk.
Alternatively, if the integrated unit of the invention is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the embodiments of the invention, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the methods of the embodiments of the invention. The storage medium includes various media that can store program code, such as a removable storage device, a ROM, a magnetic disk or an optical disk.
The above is only a specific embodiment of the invention, but the scope of protection of the invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall be covered by the scope of protection of the invention. Therefore, the scope of protection of the invention shall be subject to the scope of protection of the claims.

Claims (10)

1. A data classification method, characterized in that the method comprises:
obtaining randomly combined source data;
performing classification processing on the randomly combined source data to obtain a plurality of classes of data, wherein each class of data in the plurality corresponds to one class label;
determining, for each class of data, the weight of its corresponding class label; and
storing each determined weight in a tag library in association with its class label.
2. The method according to claim 1, characterized in that
performing classification processing on the randomly combined source data to obtain a plurality of classes of data comprises:
performing classification processing on the randomly combined source data based on preset class labels, to obtain classes of data each having a mapping relationship with one of the preset class labels; and
determining, for each class of data, the weight of its corresponding class label comprises:
while the randomly combined source data is being classified based on the preset class labels, determining the weights of the preset class labels according to at least one feature dimension of the randomly combined source data.
3. The method according to claim 1, characterized in that
performing classification processing on the randomly combined source data to obtain a plurality of classes of data comprises:
performing primary classification processing on the randomly combined source data according to different feature dimensions using a first specific classification algorithm, to obtain a primary classification result; and
performing secondary classification processing on the primary classification result by refining the feature dimensions using a second specific classification algorithm, to obtain the plurality of classes of data; and
determining, for each class of data, the weight of its corresponding class label comprises:
while the secondary classification processing is performed on the primary classification result by refining each feature dimension using the second specific classification algorithm, determining the weight of the class label corresponding to each class of data according to the proportion of that class of data among the plurality of classes of data.
4. The method according to claim 3, characterized in that the first specific classification algorithm comprises at least one of the following algorithms: clustering, classification tree, random forest.
5. The method according to claim 3, characterized in that the second specific classification algorithm comprises at least one of the following algorithms: Bayes, logistic regression training.
6. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
in response to application of a class label, performing an update operation on the weight of the class label corresponding to each class of data.
7. A data classification device, characterized in that the device comprises:
an obtaining module, configured to obtain randomly combined source data;
a classification processing module, configured to perform classification processing on the randomly combined source data to obtain a plurality of classes of data, wherein each class of data in the plurality corresponds to one class label;
a determining module, configured to determine, for each class of data, the weight of its corresponding class label; and
a storage module, configured to store each determined weight in a tag library in association with its class label.
8. The device according to claim 7, characterized in that
the classification processing module is further configured to perform classification processing on the randomly combined source data based on preset class labels, to obtain classes of data each having a mapping relationship with one of the preset class labels; and
the determining module is further configured to, while the classification processing module classifies the randomly combined source data based on the preset class labels, determine the weights of the preset class labels according to at least one feature dimension of the randomly combined source data.
9. The device according to claim 7, characterized in that
the classification processing module is further configured to perform primary classification processing on the randomly combined source data according to different feature dimensions using a first specific classification algorithm, to obtain a primary classification result, and to perform secondary classification processing on the primary classification result by refining the feature dimensions using a second specific classification algorithm, to obtain the plurality of classes of data; and
the determining module is further configured to, while the classification processing module performs the secondary classification processing on the primary classification result by refining each feature dimension using the second specific classification algorithm, determine the weight of the class label corresponding to each class of data according to the proportion of that class of data among the plurality of classes of data.
10. The device according to any one of claims 7 to 9, characterized in that the device further comprises:
an update module, configured to, in response to application of a class label, perform an update operation on the weight of the class label corresponding to each class of data.
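
The flow recited in claims 1, 3 and 6 can be illustrated with a minimal Python sketch. This is not the applicant's implementation: the function `classify_and_weight`, the class `TagLibrary`, the sample records, and the additive `boost` update rule are all hypothetical names and choices introduced here for illustration. Two simple rule functions stand in for the first and second specific classification algorithms (the claims name clustering, classification trees, random forests, Bayes, and logistic regression training instead).

```python
from collections import Counter, defaultdict


def classify_and_weight(records, coarse, fine):
    """Two-pass classification; each label's weight is its share of all records.

    `coarse` and `fine` are stand-in rule functions (assumptions), not the
    algorithms named in the claims.
    """
    # Primary pass: group records by a coarse feature dimension.
    coarse_groups = defaultdict(list)
    for record in records:
        coarse_groups[coarse(record)].append(record)

    # Secondary pass: refine each coarse group into final class labels.
    counts = Counter()
    for group_key, group in coarse_groups.items():
        for record in group:
            counts[fine(group_key, record)] += 1

    # Weight of a class label = proportion of records that fall in that class.
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}


class TagLibrary:
    """Stores each class label together with its weight (hypothetical structure)."""

    def __init__(self):
        self._weights = {}

    def store(self, weights):
        self._weights.update(weights)

    def weight(self, label):
        return self._weights[label]

    def apply(self, label, boost=0.1):
        # Assumed update rule: each application of a label raises its weight.
        self._weights[label] += boost
        return self._weights[label]


# Made-up sample data standing in for randomly combined source data.
records = [
    {"kind": "text", "length": 120},
    {"kind": "text", "length": 4000},
    {"kind": "image", "length": 300},
    {"kind": "text", "length": 90},
]
weights = classify_and_weight(
    records,
    coarse=lambda r: r["kind"],
    fine=lambda kind, r: f"{kind}/{'long' if r['length'] > 1000 else 'short'}",
)
library = TagLibrary()
library.store(weights)
```

In this sketch the weights are plain proportions, so they sum to 1 immediately after classification. Once `apply` has been called they no longer do, and whether an update should renormalize the weights is a design choice the claims leave open.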

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810956382.3A CN109214435A (en) 2018-08-21 2018-08-21 A kind of data classification method and device

Publications (1)

Publication Number Publication Date
CN109214435A true CN109214435A (en) 2019-01-15

Family

ID=64989298

Country Status (1)

Country Link
CN (1) CN109214435A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111159158B (en) * 2019-12-31 2024-03-29 北京懿医云科技有限公司 Data normalization method and device, computer readable storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080301077A1 (en) * 2007-06-04 2008-12-04 Siemens Medical Solutions Usa, Inc. System and Method for Medical Predictive Models Using Likelihood Gamble Pricing
CN105824945A (en) * 2016-03-21 2016-08-03 中国电力科学研究院 Method for collecting global energy Internet technology resource data
CN106777043A (en) * 2016-12-09 2017-05-31 宁波大学 A kind of academic resources acquisition methods based on LDA
CN107526741A (en) * 2016-06-21 2017-12-29 华为软件技术有限公司 user tag generation method and device

Similar Documents

Publication Publication Date Title
US20220004931A1 (en) Root cause discovery engine
US10423647B2 (en) Descriptive datacenter state comparison
CN103513983B (en) method and system for predictive alert threshold determination tool
WO2021011088A1 (en) Automated generation of machine learning models for network evaluation
JP6488009B2 (en) Method and system for constructing behavioral queries in a graph over time using characteristic subtrace mining
JP2019164761A (en) Improving quality of labeled training data
Mishra et al. A comparative study of customer churn prediction in telecom industry using ensemble based classifiers
US11860721B2 (en) Utilizing automatic labelling, prioritizing, and root cause analysis machine learning models and dependency graphs to determine recommendations for software products
Solaimani et al. Statistical technique for online anomaly detection using spark over heterogeneous data from multi-source vmware performance data
US9600795B2 (en) Measuring process model performance and enforcing process performance policy
US11550707B2 (en) Systems and methods for generating and executing a test case plan for a software product
US10771562B2 (en) Analyzing device-related data to generate and/or suppress device-related alerts
US20130290238A1 (en) Discovery and grouping of related computing resources using machine learning
WO2022134778A1 (en) Dynamic facet ranking
US12015629B2 (en) Tailored network risk analysis using deep learning modeling
US20240112229A1 (en) Facilitating responding to multiple product or service reviews associated with multiple sources
Liu et al. A framework for online process concept drift detection from event streams
US11386331B2 (en) Detecting correlation among sets of time series data
CN103226748B (en) Project management system based on associative memory
CN109214435A (en) A kind of data classification method and device
US11537391B2 (en) Software change analysis and automated remediation
WO2024118188A1 (en) Computer application error root cause diagnostic tool
KR20190105147A (en) Data clustering method using firefly algorithm and the system thereof
US11197597B2 (en) System and method for a task management and communication system
CN115277124B (en) Online system and server for searching matching attack mode based on system traceability graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190115)
RJ01 Rejection of invention patent application after publication