CN110941751A - Method, system, electronic product and medium for classifying data of data set - Google Patents


Info

Publication number
CN110941751A
Authority
CN
China
Prior art keywords
point
type
axis coordinate
data
polyhedron
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911155581.5A
Other languages
Chinese (zh)
Other versions
CN110941751B (en)
Inventor
李雪
邹杨
欧阳永生
黄生
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Electric Distributed Energy Technology Co Ltd
Original Assignee
Shanghai Electric Distributed Energy Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Electric Distributed Energy Technology Co Ltd
Priority to CN201911155581.5A
Publication of CN110941751A
Application granted
Publication of CN110941751B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification

Abstract

The invention discloses a method, a system, an electronic product and a medium for classifying data of a data set. The classification method comprises the following steps: constructing a target Cartesian coordinate system and marking the data as sample points in it; obtaining a starting point from the sample points; acquiring first-type reference points, which are the k sample points with the smallest Euclidean distance from the starting point; acquiring a step radius, which is the maximum Euclidean distance between the first-type reference points and the starting point; constructing a first polyhedron whose vertices are the first-type reference points; if continuation points exist, setting them as new first-type reference points; if no continuation point exists, marking the second-type reference points as first-type target points as well, and marking the sample points other than the first-type target points as second-type target points. The invention improves the accuracy of classifying the data of a data set.

Description

Method, system, electronic product and medium for classifying data of data set
Technical Field
The invention belongs to the technical field of data classification, and particularly relates to a method, a system, an electronic product and a medium for classifying data of a data set.
Background
When the full-charge capacity of an electric vehicle battery falls below a rated threshold (typically 80% of the battery's initial full-charge capacity), the battery should be replaced. Discarding the retired battery outright wastes resources. It is therefore particularly important to recycle retired batteries according to the consistency of their battery cells.
To assess the consistency of the cells in a retired battery, performance data of the cells are collected into a data set, and the consistency of the sample data in that data set is used to evaluate the consistency of the cells. Evaluating the consistency of the sample data usually requires classifying the sample data in the data set. Existing data classification methods have low classification accuracy, which affects the evaluation of cell consistency.
Disclosure of Invention
The invention aims to overcome the low accuracy of prior-art methods for classifying sample data in a data set, and provides a method, a system, an electronic product and a medium for classifying data of a data set.
The invention solves the technical problems through the following technical scheme:
The invention provides a method for classifying data of a data set, wherein the data set comprises a plurality of data, each data comprises a first numerical element, a second numerical element and a third numerical element, and the classification method comprises the following steps:
S1, constructing a target Cartesian coordinate system, taking the first numerical element as the X-axis coordinate value of the data, the second numerical element as the Y-axis coordinate value of the data, and the third numerical element as the Z-axis coordinate value of the data, and marking the data as a sample point in the target Cartesian coordinate system; obtaining a starting point from the sample points;
S2, acquiring first-type reference points, wherein the first-type reference points are the k sample points with the smallest Euclidean distance from the starting point, and k is an integer greater than or equal to 4; acquiring a step radius, which is the maximum Euclidean distance between the first-type reference points and the starting point;
S3, constructing a first polyhedron, wherein the first polyhedron is a polyhedron whose vertices are the first-type reference points;
S4, marking the sample points located inside the first polyhedron as first-type target points; acquiring second-type reference points, wherein the second-type reference points are the sample points located on the surface of the first polyhedron; judging whether continuation points exist, wherein a continuation point is a sample point whose distance from at least one second-type reference point is less than the step radius, the continuation points are sample points other than the second-type reference points and the first-type target points, and the continuation points are non-coplanar sample points; if continuation points exist, setting the continuation points as new first-type reference points and then returning to step S3; if no continuation point exists, marking the second-type reference points as first-type target points as well, and marking the sample points other than the first-type target points as second-type target points.
Preferably, the X-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the X-axis coordinate values, the Y-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the Y-axis coordinate values, and the Z-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the Z-axis coordinate values;
or the X-axis coordinate of the starting point is the average value of the X-axis coordinate values, the Y-axis coordinate of the starting point is the average value of the Y-axis coordinate values, and the Z-axis coordinate of the starting point is the average value of the Z-axis coordinate values.
Preferably, before constructing the first polyhedron, the step S3 further includes:
judging whether the first-type reference points are coplanar and, if so, assigning k + 1 to k and returning to step S2.
Preferably, k is 2% to 10% of the number of sample points.
The invention also discloses a data classification system of a data set, wherein the data set comprises a plurality of data, each data comprises a first numerical value element, a second numerical value element and a third numerical value element, and the classification system comprises a starting point acquisition unit, a step length acquisition unit, a polyhedron construction unit and a marking unit;
the starting point acquisition unit is used for constructing a target Cartesian coordinate system, and is also used for marking the data as a sample point in the target Cartesian coordinate system by taking the first numerical element as the X-axis coordinate value of the data, the second numerical element as the Y-axis coordinate value of the data and the third numerical element as the Z-axis coordinate value of the data; the starting point acquisition unit is also used for obtaining a starting point from the sample points;
the step length obtaining unit is used for obtaining a first type of reference point, wherein the first type of reference point is k sample points with the minimum Euclidean distance from a starting point, and k is an integer greater than or equal to 4; the step length obtaining unit is also used for obtaining a step length radius, wherein the step length radius is the maximum value in the Euclidean distance between the first type reference point and the starting point;
the polyhedron constructing unit is used for constructing a first polyhedron, and the first polyhedron is a polyhedron taking a first class of reference points as vertexes;
the marking unit is used for marking the sample points located inside the first polyhedron as first-type target points; the marking unit is further used for acquiring second-type reference points, wherein the second-type reference points are the sample points located on the surface of the first polyhedron; the marking unit is further used for judging whether continuation points exist, wherein a continuation point is a sample point whose distance from at least one second-type reference point is smaller than the step radius, the continuation points are sample points other than the second-type reference points and the first-type target points, and the continuation points are non-coplanar sample points; if continuation points exist, the marking unit is also used for setting the continuation points as new first-type reference points and then calling the polyhedron constructing unit; if no continuation point exists, the marking unit is further used for marking the second-type reference points as first-type target points and for marking the sample points other than the first-type target points as second-type target points.
Preferably, the X-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the X-axis coordinate values, the Y-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the Y-axis coordinate values, and the Z-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the Z-axis coordinate values;
or the X-axis coordinate of the starting point is the average value of the X-axis coordinate values, the Y-axis coordinate of the starting point is the average value of the Y-axis coordinate values, and the Z-axis coordinate of the starting point is the average value of the Z-axis coordinate values.
Preferably, before the first polyhedron is constructed, the polyhedron constructing unit is further used for judging whether the first-type reference points are coplanar and, if they are coplanar, assigning k + 1 to k and calling the step length obtaining unit.
Preferably, k is 2% to 10% of the number of sample points.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method for classifying data of a data set according to the invention when executing the computer program.
The invention also provides a computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method for classification of data of a data set of the invention.
The beneficial effect of the invention is that it improves the accuracy of classifying the data of a data set.
Drawings
Fig. 1 is a flowchart of a data classification method for a data set according to embodiment 1 of the present invention.
Fig. 2 is a schematic structural diagram of a data classification system for a data set according to embodiment 1 of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device according to embodiment 2 of the present invention.
Detailed Description
The invention is further illustrated by the following examples, which are not intended to limit the scope of the invention.
Example 1
The present embodiment provides a method for classifying data of a data set. The data set includes a number of data, each data including a first numerical element, a second numerical element, and a third numerical element. Referring to fig. 1, the method for classifying data of a data set of the present embodiment includes the steps of:
Step S101, constructing a target Cartesian coordinate system and marking the data as sample points in it. Specifically, the first numerical element is used as the X-axis coordinate value of the data, the second numerical element as the Y-axis coordinate value of the data, and the third numerical element as the Z-axis coordinate value of the data, and the data are marked as sample points in the target Cartesian coordinate system. That is, each data in the data set is mapped into the target Cartesian coordinate system and corresponds to one sample point.
Step S102, obtaining a starting point from the sample points. As an alternative embodiment, the X-axis coordinate of the starting point is the average of the maximum value and the minimum value of the X-axis coordinate values, that is, the sum of the maximum and the minimum of the X-axis coordinate values of all the sample points, divided by 2. Likewise, the Y-axis coordinate of the starting point is the sum of the maximum and the minimum of the Y-axis coordinate values of all the sample points, divided by 2, and the Z-axis coordinate of the starting point is the sum of the maximum and the minimum of the Z-axis coordinate values of all the sample points, divided by 2.
In another alternative embodiment, the X-axis coordinate of the starting point is an average value of the X-axis coordinate values, that is, the X-axis coordinate of the starting point is the sum of the X-axis coordinate values of all the sample points divided by the total number of the sample points; the Y-axis coordinate of the starting point is the average value of the Y-axis coordinate values, namely the Y-axis coordinate of the starting point is the sum of the Y-axis coordinate values of all the sample points and then is divided by the total number of the sample points; the Z-axis coordinate of the starting point is an average value of the Z-axis coordinate values, that is, the Z-axis coordinate of the starting point is an accumulated sum of the Z-axis coordinate values of all the sample points divided by the total number of the sample points.
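The two ways of choosing the starting point in step S102 can be sketched as follows. This is a minimal illustration in Python rather than the implementation of the invention; the function names `midrange_start` and `mean_start` and the sample values are hypothetical.

```python
def midrange_start(points):
    """Per axis: (maximum coordinate value + minimum coordinate value) / 2."""
    return tuple((max(axis) + min(axis)) / 2 for axis in zip(*points))


def mean_start(points):
    """Per axis: sum of the coordinate values / total number of sample points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


# Hypothetical sample points (x, y, z); the values are made up for illustration.
samples = [(0.0, 0.0, 0.0), (2.0, 4.0, 6.0), (1.0, 1.0, 9.0)]
print(midrange_start(samples))  # (1.0, 2.0, 4.5)
print(mean_start(samples))
```

The first variant depends only on the extreme values along each axis, while the second is pulled toward wherever most sample points lie.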
Step S103, acquiring the first-type reference points and the step radius. The first-type reference points are the k sample points with the smallest Euclidean distance from the starting point, and k is an integer greater than or equal to 4. The step radius is the maximum Euclidean distance between the first-type reference points and the starting point. To balance accuracy and computational efficiency, in an alternative embodiment k is 5% of the number of sample points. In other alternative embodiments, k is 2% to 10% of the number of sample points.
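Step S103 can be illustrated with a short Python sketch. The helper names `choose_k` and `first_reference_points`, the rounding rule used to turn a percentage into an integer k, and the sample values are assumptions made for this example; the text only requires that k be an integer of at least 4.

```python
import math


def choose_k(n_samples, fraction=0.05):
    """k as a fraction of the sample count (assumed rounding), never below 4."""
    return max(4, round(n_samples * fraction))


def first_reference_points(points, start, k):
    """Return (indices of the k sample points nearest to start, step radius)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], start))
    nearest = order[:k]
    # The step radius is the largest of the k distances, i.e. that of the last index.
    step_radius = math.dist(points[nearest[-1]], start)
    return nearest, step_radius


# Hypothetical sample points; the values are made up for illustration.
points = [(1, 0, 0), (0, 2, 0), (0, 0, 3), (4, 0, 0), (0, 5, 0), (6, 6, 6)]
nearest, radius = first_reference_points(points, (0, 0, 0), k=4)
print(nearest, radius)  # [0, 1, 2, 3] 4.0
```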
Step S104, judging whether the first type of reference points are coplanar, and if the first type of reference points are coplanar, executing step S105; if the first type of reference points are not coplanar, step S106 is performed.
Step S105, assigning k + 1 to k, and returning to step S103.
Step S106, constructing a first polyhedron. The first-type reference points being coplanar means that all first-type reference points lie in the same plane. The first polyhedron is a polyhedron whose vertices are the first-type reference points: it is formed by planes each passing through any 3 of the first-type reference points, and every first-type reference point is a vertex of the first polyhedron.
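The coplanarity check of step S104 and the enlargement of k in step S105 can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the names `are_coplanar` and `grow_until_non_coplanar`, the absolute tolerance `tol`, and the `ValueError` raised when every sample point lies in one plane are not part of the original text.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def are_coplanar(points, tol=1e-9):
    """True when all points lie in one plane (fewer than 4 points always do)."""
    if len(points) < 4:
        return True
    origin = points[0]
    for i in range(1, len(points)):
        for j in range(i + 1, len(points)):
            normal = _cross(_sub(points[i], origin), _sub(points[j], origin))
            if _dot(normal, normal) > tol:  # found a non-collinear triple
                return all(abs(_dot(normal, _sub(p, origin))) <= tol for p in points)
    return True  # every point is collinear with the first one


def grow_until_non_coplanar(points, start, k):
    """Increase k by 1 until the k nearest sample points are not coplanar."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], start))
    while k <= len(order) and are_coplanar([points[i] for i in order[:k]]):
        k += 1
    if k > len(order):
        raise ValueError("all sample points are coplanar")  # assumed guard
    return order[:k]


# Hypothetical data: the four nearest points all lie in the plane z = 0.
pts = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, -1, 0), (0, 0, 2), (5, 5, 5)]
print(grow_until_non_coplanar(pts, (0, 0, 0), k=4))  # [0, 1, 2, 3, 4]
```

Because the ordering by distance does not change between iterations, sorting once and extending the prefix gives the same reference points as returning to step S103 with the larger k.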
Step S107, marking the sample points positioned inside the first polyhedron as first-class target points; and acquiring a second type of reference point. The second type of reference points are sample points located on the surface of the first polyhedron, and the second type of reference points include the first type of reference points. In specific implementation, the first type of target point is labeled as 1.
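One way to separate interior points from surface points in step S107 is sketched below. It assumes that the first polyhedron is convex and treats it as the convex hull of the first-type reference points; the text does not state this, so it is an assumption of the example, as are the names `classify_against_hull` and `split_samples` and the tolerance `tol`. The brute-force face search is meant for small k only.

```python
from itertools import combinations


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def classify_against_hull(vertices, p, tol=1e-9):
    """Return 'inside', 'surface' or 'outside' for p relative to the convex hull."""
    on_face_plane = False
    for a, b, c in combinations(vertices, 3):
        normal = _cross(_sub(b, a), _sub(c, a))
        if _dot(normal, normal) <= tol:
            continue  # collinear triple, defines no plane
        sides = [_dot(normal, _sub(v, a)) for v in vertices]
        has_pos = any(s > tol for s in sides)
        has_neg = any(s < -tol for s in sides)
        if has_pos and has_neg:
            continue  # plane cuts through the hull, so it is not a face plane
        # Orient the normal outward: all vertices end up on the non-positive side.
        sign = -1.0 if has_pos else 1.0
        d = sign * _dot(normal, _sub(p, a))
        if d > tol:
            return "outside"
        if abs(d) <= tol:
            on_face_plane = True
    return "surface" if on_face_plane else "inside"


def split_samples(points, reference_idx):
    """Indices of first-type target points (inside) and second-type reference points (surface)."""
    vertices = [points[i] for i in reference_idx]
    inside, surface = [], []
    for i, p in enumerate(points):
        where = classify_against_hull(vertices, p)
        if where == "inside":
            inside.append(i)
        elif where == "surface":
            surface.append(i)
    return inside, surface


# Hypothetical data: a tetrahedron plus one interior, one face and one exterior point.
pts = [(0, 0, 0), (4, 0, 0), (0, 4, 0), (0, 0, 4), (1, 1, 1), (1, 1, 0), (3, 3, 3)]
print(split_samples(pts, [0, 1, 2, 3]))  # ([4], [0, 1, 2, 3, 5])
```

The vertices themselves are reported as surface points, which matches the statement that the second-type reference points include the first-type reference points.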
Step S108, judging whether continuation points exist; if so, executing step S109; if not, executing step S110. A continuation point is a sample point whose distance from at least one second-type reference point is smaller than the step radius; the continuation points are sample points other than the second-type reference points and the first-type target points, and the continuation points are non-coplanar sample points. As an alternative embodiment, in step S108, a sphere is constructed around each second-type reference point, with that point as the center and the step radius as the radius. Among the sample points located inside any of these spheres (i.e., whose distance from the center of the sphere is smaller than the step radius), those that are not second-type reference points, are not already-marked first-type target points, and are not coplanar (so that a polyhedron can be constructed with them as vertices) are set as continuation points. If no sample points satisfy all of these conditions, no continuation point exists.
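The search in step S108 can be sketched in Python as follows. The sketch reads the non-coplanarity condition as applying to the candidate set as a whole (a coplanar set cannot serve as the vertices of a new polyhedron, so it yields no continuation points); that reading, the function names and the tolerance are assumptions of this example.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def are_coplanar(points, tol=1e-9):
    """True when all points lie in one plane (fewer than 4 points always do)."""
    if len(points) < 4:
        return True
    origin = points[0]
    for i in range(1, len(points)):
        for j in range(i + 1, len(points)):
            normal = _cross(_sub(points[i], origin), _sub(points[j], origin))
            if _dot(normal, normal) > tol:
                return all(abs(_dot(normal, _sub(p, origin))) <= tol for p in points)
    return True


def find_continuation_points(points, second_type, first_type_targets, step_radius):
    """Indices of continuation points, or [] when none exist."""
    excluded = set(second_type) | set(first_type_targets)
    candidates = [
        i for i in range(len(points))
        if i not in excluded
        # strictly inside the sphere around at least one second-type reference point
        and any(math.dist(points[i], points[j]) < step_radius for j in second_type)
    ]
    if are_coplanar([points[i] for i in candidates]):
        return []  # no polyhedron can be built from these points
    return candidates


# Hypothetical data: index 0 is a second-type reference point, index 6 is already marked.
pts = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1), (5, 5, 5), (0.5, 0, 0)]
print(find_continuation_points(pts, [0], [6], step_radius=2.0))  # [1, 2, 3, 4]
```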
Step S109, setting the continuation point as a new first-type reference point, and then returning to step S106.
Step S110, mark the second type of reference point as the first type of target point, and mark the sample points other than the first type of target point as the second type of target point. In specific implementation, the second type target point is marked as 0.
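The final marking of step S110 amounts to assigning 1 to every first-type target point, including the remaining second-type reference points, and 0 to everything else. A minimal Python sketch, with the hypothetical function name `final_labels` and made-up indices:

```python
def final_labels(n_samples, first_type_targets, second_type_refs):
    """Label 1 for first-type target points, 0 for second-type target points."""
    # When no continuation point exists, the second-type reference points
    # are also marked as first-type target points.
    consistent = set(first_type_targets) | set(second_type_refs)
    return [1 if i in consistent else 0 for i in range(n_samples)]


# Hypothetical indices for a data set of six sample points.
labels = final_labels(6, first_type_targets=[0, 1], second_type_refs=[2, 3])
print(labels)  # [1, 1, 1, 1, 0, 0]
```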
According to the steps, the classification of the data in the data set can be completed. Wherein, the data corresponding to the first kind of target points have consistency; and the data corresponding to the second type target points does not have consistency with the data corresponding to the first type target points.
The data classification method of the data set of the embodiment can be applied to the evaluation of the consistency of the battery cells of the retired battery.
The embodiment also provides a classification system of the data set. The classification system of data of a data set can implement the classification method of data of a data set of the present embodiment. The data set comprises a plurality of data, and each data comprises a first numerical value element, a second numerical value element and a third numerical value element. Referring to fig. 2, the classification system of data of a data set of the present embodiment includes a start point acquisition unit 201, a step size acquisition unit 202, a polyhedron construction unit 203, and a labeling unit 204.
The starting point acquisition unit is used for constructing a target Cartesian coordinate system, and is also used for marking the data as a sample point in the target Cartesian coordinate system by taking the first numerical element as the X-axis coordinate value of the data, the second numerical element as the Y-axis coordinate value of the data and the third numerical element as the Z-axis coordinate value of the data; the starting point acquisition unit is further used for obtaining a starting point from the sample points.
The step length obtaining unit is used for obtaining a first type of reference point, wherein the first type of reference point is k sample points with the minimum Euclidean distance from a starting point, and k is an integer greater than or equal to 4; the step length obtaining unit is further configured to obtain a step length radius, where the step length radius is a maximum value in the euclidean distance between the first-class reference point and the start point.
The polyhedron construction unit is used for constructing a first polyhedron, and the first polyhedron is a polyhedron taking a first class of reference points as vertexes.
The marking unit is used for marking the sample points located inside the first polyhedron as first-type target points; the marking unit is further used for acquiring second-type reference points, wherein the second-type reference points are the sample points located on the surface of the first polyhedron; the marking unit is further used for judging whether continuation points exist, wherein a continuation point is a sample point whose distance from at least one second-type reference point is smaller than the step radius, the continuation points are sample points other than the second-type reference points and the first-type target points, and the continuation points are non-coplanar sample points; if continuation points exist, the marking unit is also used for setting the continuation points as new first-type reference points and then calling the polyhedron constructing unit; if no continuation point exists, the marking unit is further used for marking the second-type reference points as first-type target points and for marking the sample points other than the first-type target points as second-type target points.
In specific implementation, first, the starting point acquisition unit constructs a target Cartesian coordinate system and marks the data as sample points in it. Specifically, the first numerical element is used as the X-axis coordinate value of the data, the second numerical element as the Y-axis coordinate value of the data, and the third numerical element as the Z-axis coordinate value of the data, and the data are marked as sample points in the target Cartesian coordinate system. That is, the starting point acquisition unit maps each data in the data set into the target Cartesian coordinate system, where it corresponds to one sample point.
Then, the start point acquisition unit acquires a start point from the sample point. As an alternative embodiment, the X-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the X-axis coordinate values, that is, the X-axis coordinate of the starting point is the sum of the maximum value in the X-axis coordinate values of all the sample points and the minimum value in the X-axis coordinate values of all the sample points, and then divided by 2. The Y-axis coordinate of the starting point is the average value of the maximum value and the minimum value in the Y-axis coordinate values, namely the Y-axis coordinate of the starting point is the sum of the maximum value in the Y-axis coordinate values of all the sample points and the minimum value in the Y-axis coordinate values of all the sample points, and then the sum is divided by 2; the Z-axis coordinate of the starting point is an average value of the maximum value and the minimum value in the Z-axis coordinate values, that is, the Z-axis coordinate of the starting point is the sum of the maximum value in the Z-axis coordinate values of all the sample points and the minimum value in the Z-axis coordinate values of all the sample points, and then divided by 2.
In another alternative embodiment, the X-axis coordinate of the starting point is an average value of the X-axis coordinate values, that is, the X-axis coordinate of the starting point is the sum of the X-axis coordinate values of all the sample points divided by the total number of the sample points; the Y-axis coordinate of the starting point is the average value of the Y-axis coordinate values, namely the Y-axis coordinate of the starting point is the sum of the Y-axis coordinate values of all the sample points and then is divided by the total number of the sample points; the Z-axis coordinate of the starting point is an average value of the Z-axis coordinate values, that is, the Z-axis coordinate of the starting point is an accumulated sum of the Z-axis coordinate values of all the sample points divided by the total number of the sample points.
Next, the step length obtaining unit acquires the first-type reference points and the step radius. The first-type reference points are the k sample points with the smallest Euclidean distance from the starting point, and k is an integer greater than or equal to 4. The step radius is the maximum Euclidean distance between the first-type reference points and the starting point. To balance accuracy and computational efficiency, in an alternative embodiment k is 5% of the number of sample points. In other alternative embodiments, k is 2% to 10% of the number of sample points.
Then, the polyhedron constructing unit judges whether the first-type reference points are coplanar; if they are, k + 1 is assigned to k and the step length obtaining unit is called to regenerate the first-type reference points. This is repeated until the generated first-type reference points are not coplanar, whereupon the polyhedron constructing unit constructs the first polyhedron. The first-type reference points being coplanar means that all first-type reference points lie in the same plane. The first polyhedron is a polyhedron whose vertices are the first-type reference points: it is formed by planes each passing through any 3 of the first-type reference points, and every first-type reference point is a vertex of the first polyhedron.
Then, the marking unit marks the sample points positioned inside the first polyhedron as first-type target points; the marking unit also acquires a second type of reference point. The second type of reference points are sample points located on the surface of the first polyhedron, and the second type of reference points include the first type of reference points. In specific implementation, the first type of target point is labeled as 1.
Next, the marking unit judges whether continuation points exist. If continuation points exist, the marking unit sets them as new first-type reference points and then calls the polyhedron constructing unit to construct a new first polyhedron based on the new first-type reference points. Then, based on the new first polyhedron, the marking unit marks the sample points and acquires new second-type reference points. Sample points that have already been marked as first-type target points are not given the opposite mark. As an optional implementation, a sample point that has already been marked as a first-type target point is excluded from subsequent judgment and marking and is not marked again, which reduces the computational load and improves efficiency.
A continuation point is a sample point whose distance from at least one second-type reference point is smaller than the step radius; the continuation points are sample points other than the second-type reference points and the first-type target points, and the continuation points are non-coplanar sample points. As an alternative embodiment, the marking unit constructs a sphere around each second-type reference point, with that point as the center and the step radius as the radius. Among the sample points located inside any of these spheres (i.e., whose distance from the center of the sphere is smaller than the step radius), those that are not second-type reference points, are not already-marked first-type target points, and are not coplanar (so that a polyhedron can be constructed with them as vertices) are set as continuation points. If no sample points satisfy all of these conditions, no continuation point exists.
And repeating the steps until no continuation point exists, marking the current second-type reference point as the first-type target point by the marking unit, and marking the sample points except the first-type target point as the second-type target points by the marking unit. In specific implementation, the marking unit marks the second type target point as 0.
To this end, the data classification system of the data set of the present embodiment completes classification of data in the data set. Wherein, the data corresponding to the first kind of target points have consistency; and the data corresponding to the second type target points does not have consistency with the data corresponding to the first type target points.
The classification system of the data set of the present embodiment can be applied to the evaluation of the consistency of the battery cells of the retired battery.
Example 2
Fig. 3 is a schematic structural diagram of an electronic device provided in this embodiment. The electronic device comprises a memory, a processor and a computer program stored on the memory and executable on the processor, which when executed by the processor implements the method of classifying data of a data set of embodiment 1. The electronic device 30 shown in fig. 3 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiment of the present invention.
The electronic device 30 may be embodied in the form of a general purpose computing device, which may be, for example, a server device. The components of the electronic device 30 may include, but are not limited to: the at least one processor 31, the at least one memory 32, and a bus 33 connecting the various system components (including the memory 32 and the processor 31).
The bus 33 includes a data bus, an address bus, and a control bus.
The memory 32 may include volatile memory, such as Random Access Memory (RAM) 321 and/or cache memory 322, and may further include Read Only Memory (ROM) 323.
Memory 32 may also include a program/utility 325 having a set (at least one) of program modules 324, such program modules 324 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
The processor 31 executes various functional applications and data processing, such as the method of classifying data of a data set of embodiment 1 of the present invention, by executing the computer program stored in the memory 32.
The electronic device 30 may also communicate with one or more external devices 34 (e.g., a keyboard, a pointing device, etc.). Such communication may take place through input/output (I/O) interfaces 35. The electronic device 30 may also communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) via the network adapter 36. As shown, the network adapter 36 communicates with the other modules of the electronic device 30 via the bus 33. It should be understood that, although not shown in the figures, other hardware and/or software modules may be used in conjunction with the electronic device 30, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID (disk array) systems, tape drives, and data backup storage systems.
It should be noted that although several units/modules or sub-units/modules of the electronic device are mentioned in the above detailed description, such a division is merely exemplary and not mandatory. Indeed, according to embodiments of the invention, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.
Example 3
The present embodiment provides a computer-readable storage medium on which a computer program is stored, which program, when executed by a processor, implements the steps of the method of classifying data of a data set of embodiment 1.
More specific examples of the readable storage medium may include, but are not limited to: a portable disk, a hard disk, random access memory, read only memory, erasable programmable read only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In a possible implementation, the invention can also be implemented in the form of a program product comprising program code which, when the program product is run on a terminal device, causes the terminal device to carry out the steps of the method of classifying data of a data set of embodiment 1.
The program code for carrying out the invention may be written in any combination of one or more programming languages, and may be executed entirely on the user device, partly on the user device, as a stand-alone software package, partly on the user device and partly on a remote device, or entirely on the remote device.
While specific embodiments of the invention have been described above, it will be appreciated by those skilled in the art that this is by way of example only, and that the scope of the invention is defined by the appended claims. Various changes and modifications to these embodiments may be made by those skilled in the art without departing from the spirit and scope of the invention, and these changes and modifications are within the scope of the invention.

Claims (10)

1. A method of classifying data of a data set, wherein said data set comprises a plurality of data, each of said data comprising a first numerical element, a second numerical element, and a third numerical element, said method comprising the steps of:
S1, constructing a target Cartesian coordinate system, taking the first numerical element as an X-axis coordinate value of the data, taking the second numerical element as a Y-axis coordinate value of the data, taking the third numerical element as a Z-axis coordinate value of the data, and marking the data as a sample point in the target Cartesian coordinate system; obtaining a starting point according to the sample points;
S2, acquiring first-type reference points, wherein the first-type reference points are the k sample points having the smallest Euclidean distances from the starting point, and k is an integer greater than or equal to 4; acquiring a step radius, wherein the step radius is the maximum of the Euclidean distances between the first-type reference points and the starting point;
S3, constructing a first polyhedron, wherein the first polyhedron is a polyhedron taking the first-type reference points as vertices;
S4, marking the sample points located inside the first polyhedron as first-type target points; acquiring second-type reference points, wherein the second-type reference points are the sample points on the surface of the first polyhedron; judging whether a continuation point exists, wherein the continuation point is a sample point whose distance from at least one second-type reference point is smaller than the step radius, the continuation point is a sample point other than the second-type reference points and the first-type target points, and the continuation points are non-coplanar sample points; if the continuation point exists, setting the continuation point as a new first-type reference point, and then returning to step S3; if the continuation point does not exist, marking the second-type reference points as first-type target points, and marking the sample points other than the first-type target points as second-type target points.
2. The method of classifying data of a data set according to claim 1, wherein the X-axis coordinate of the starting point is the average of the maximum value and the minimum value among the X-axis coordinate values, the Y-axis coordinate of the starting point is the average of the maximum value and the minimum value among the Y-axis coordinate values, and the Z-axis coordinate of the starting point is the average of the maximum value and the minimum value among the Z-axis coordinate values;
or the X-axis coordinate of the starting point is the average value of the X-axis coordinate values, the Y-axis coordinate of the starting point is the average value of the Y-axis coordinate values, and the Z-axis coordinate of the starting point is the average value of the Z-axis coordinate values.
3. The method of classifying data of a data set according to claim 1, wherein, before said constructing a first polyhedron, step S3 further comprises:
judging whether the first-type reference points are coplanar, and if so, assigning k + 1 to k and returning to step S2.
4. The method of classifying data of a data set according to claim 1, wherein k is 2% to 10% of the number of sample points.
5. A system for classifying data of a data set, characterized in that the data set comprises a plurality of data, each of the data comprises a first numerical element, a second numerical element, and a third numerical element, and the system comprises a starting point obtaining unit, a step length obtaining unit, a polyhedron constructing unit, and a marking unit;
the starting point obtaining unit is configured to construct a target Cartesian coordinate system, and the starting point obtaining unit is further configured to mark the data as a sample point in the target Cartesian coordinate system by using the first numerical element as an X-axis coordinate value of the data, using the second numerical element as a Y-axis coordinate value of the data, and using the third numerical element as a Z-axis coordinate value of the data; the starting point obtaining unit is further configured to obtain a starting point according to the sample points;
the step length obtaining unit is configured to obtain first-type reference points, where the first-type reference points are the k sample points having the smallest Euclidean distances from the starting point, and k is an integer greater than or equal to 4; the step length obtaining unit is further configured to obtain a step radius, where the step radius is the maximum of the Euclidean distances between the first-type reference points and the starting point;
the polyhedron constructing unit is configured to construct a first polyhedron, where the first polyhedron is a polyhedron taking the first-type reference points as vertices;
the marking unit is configured to mark the sample points located inside the first polyhedron as first-type target points; the marking unit is further configured to obtain second-type reference points, where the second-type reference points are the sample points located on the surface of the first polyhedron; the marking unit is further configured to determine whether a continuation point exists, where the continuation point is a sample point whose distance from at least one of the second-type reference points is smaller than the step radius, the continuation point is a sample point other than the second-type reference points and the first-type target points, and the continuation points are non-coplanar sample points; if the continuation point exists, the marking unit is further configured to set the continuation point as a new first-type reference point and then invoke the polyhedron constructing unit; if the continuation point does not exist, the marking unit is further configured to mark the second-type reference points as first-type target points and to mark the sample points other than the first-type target points as second-type target points.
6. The system for classifying data of a data set according to claim 5, wherein the X-axis coordinate of the starting point is the average of the maximum value and the minimum value among the X-axis coordinate values, the Y-axis coordinate of the starting point is the average of the maximum value and the minimum value among the Y-axis coordinate values, and the Z-axis coordinate of the starting point is the average of the maximum value and the minimum value among the Z-axis coordinate values;
or the X-axis coordinate of the starting point is the average value of the X-axis coordinate values, the Y-axis coordinate of the starting point is the average value of the Y-axis coordinate values, and the Z-axis coordinate of the starting point is the average value of the Z-axis coordinate values.
7. The system for classifying data of a data set according to claim 5, wherein, before said constructing a first polyhedron, said polyhedron constructing unit is further configured to determine whether said first-type reference points are coplanar, and if said first-type reference points are coplanar, to assign k + 1 to k and invoke said step length obtaining unit.
8. The system for classifying data of a data set according to claim 5, characterized in that k is 2% to 10% of the number of sample points.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of classification of data of a data set according to any of claims 1-4 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of classification of data of a data set according to any one of claims 1 to 4.
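
As an informal illustration of steps S1 and S2 and of the choices recited in claims 2 and 4, the following Python sketch computes a starting point, a value of k, the first-type reference points, and the step radius. It is a sketch under stated assumptions and not part of the claims: the function names, the `mode` and `percent` parameters, the rounding-up rule for k, and the tuple representation of sample points are all hypothetical choices made here.

```python
import math


def starting_point(samples, mode="midrange"):
    """Starting point per claim 2: per-axis midrange or per-axis mean."""
    axes = list(zip(*samples))  # group the X, Y and Z coordinate values
    if mode == "midrange":
        return tuple((max(a) + min(a)) / 2 for a in axes)
    if mode == "mean":
        return tuple(sum(a) / len(a) for a in axes)
    raise ValueError("mode must be 'midrange' or 'mean'")


def choose_k(n_samples, percent=5):
    """Pick k as 2% to 10% of the sample count, but never below 4.

    Rounding up with integer arithmetic is an assumption; the claims
    state only the percentage range and that k is at least 4.
    """
    if not 2 <= percent <= 10:
        raise ValueError("percent must be between 2 and 10")
    return max(4, (n_samples * percent + 99) // 100)


def first_type_reference_points(samples, start, k):
    """Return the k samples nearest to start, and the step radius."""
    if k < 4 or k > len(samples):
        raise ValueError("k must satisfy 4 <= k <= number of samples")
    ranked = sorted(samples, key=lambda p: math.dist(p, start))
    refs = ranked[:k]
    # The step radius is the largest of the k distances to the start.
    step_radius = math.dist(refs[-1], start)
    return refs, step_radius
```

Ties at the k-th distance are broken here by the original order of `samples`, because Python's sort is stable; the claims do not say how ties should be handled.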
CN201911155581.5A 2019-11-22 2019-11-22 Data classification method, system, electronic product and medium of data set Active CN110941751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911155581.5A CN110941751B (en) 2019-11-22 2019-11-22 Data classification method, system, electronic product and medium of data set

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911155581.5A CN110941751B (en) 2019-11-22 2019-11-22 Data classification method, system, electronic product and medium of data set

Publications (2)

Publication Number Publication Date
CN110941751A (en) 2020-03-31
CN110941751B CN110941751B (en) 2023-09-15

Family

ID=69908009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911155581.5A Active CN110941751B (en) 2019-11-22 2019-11-22 Data classification method, system, electronic product and medium of data set

Country Status (1)

Country Link
CN (1) CN110941751B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999046731A1 (en) * 1998-03-13 1999-09-16 The University Of Houston System Methods for performing daf data filtering and padding
US6982726B1 (en) * 2000-07-18 2006-01-03 Canon Kabushiki Kaisha Non-Cartesian representation
CN109325118A (en) * 2018-09-03 2019-02-12 平安科技(深圳)有限公司 Uneven sample data preprocess method, device and computer equipment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
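王艳云; 唐煜: "任意凸多面体上布点均匀性的度量" [Measuring the uniformity of point placement on an arbitrary convex polyhedron], 应用数学学报, no. 05 *
陈志群: "数控车床坐标系统的机理分析" [Mechanism analysis of the coordinate system of CNC lathes], 机械工程师, no. 09 *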

Also Published As

Publication number Publication date
CN110941751B (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN108959127B (en) Address translation method, device and system
CN112364213A (en) Graph database-based power grid retrieval method and system
CN103970604A (en) Method and device for realizing image processing based on MapReduce framework
CN110209348B (en) Data storage method and device, electronic equipment and storage medium
CN104301442A (en) Method for achieving client of access object storage cluster based on fuse
CN104572448A (en) Method and device for realizing use condition of thread stack
CN110769002A (en) LabVIEW-based message analysis method, system, electronic device and medium
CN117236236B (en) Chip design data management method and device, electronic equipment and storage medium
CN110941751B (en) Data classification method, system, electronic product and medium of data set
CN113344074A (en) Model training method, device, equipment and storage medium
CN109284108B (en) Unmanned vehicle data storage method and device, electronic equipment and storage medium
CN111079796A (en) Battery screening method, system, electronic product and medium
CN110175128A (en) A kind of similar codes case acquisition methods, device, equipment and storage medium
CN112541834B (en) Identifier processing method, device and system for hydropower industry digital object
CN114123190A (en) Method and device for determining target region to which ammeter belongs, electronic equipment and storage medium
US9021496B2 (en) Method and program for recording object allocation site
CN111737593A (en) Method, device, equipment and storage medium for acquiring cross-group communication relationship diagram
CN108386325A (en) A kind of method and system of wind power generating set intelligent diagnostics
CN110968645B (en) Data read-write method, system, equipment and storage medium of distributed system
CN115964002B (en) Electric energy meter terminal archive management method, device, equipment and medium
CN117094268B (en) Inter-grid data transmission method and device, storage medium and electronic equipment
CN117592311B (en) Multi-level simulation method, device and equipment for workflow and readable medium
CN110673893B (en) Application program configuration method, system, electronic device and storage medium
CN116860183A (en) Data storage method, electronic equipment and storage medium
CN114817766A (en) Method, system, device and medium for determining company city

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant