CN117349630A

CN117349630A - Method and system for biochemical data analysis

Info

Publication number: CN117349630A
Application number: CN202311641939.1A
Authority: CN
Inventors: 周静茹; 曹志勇
Original assignee: Xingtai Medical College
Current assignee: Xingtai Medical College
Priority date: 2023-12-04
Filing date: 2023-12-04
Publication date: 2024-01-05
Anticipated expiration: 2043-12-04
Also published as: CN117349630B

Abstract

The invention relates to the technical field of digital data processing and provides a method and a system for biochemical data analysis. The method comprises the following steps: acquiring a time sequence of biochemical detection parameters; acquiring a neighbor data set and a local density from the time sequence of the biochemical detection parameter, acquiring a local density change sequence from the neighbor data set and the local density, and acquiring a structure change index from the local density change sequence; acquiring a candidate representative point set from the structure change index, acquiring a distance distribution difference degree and a distance distribution sequence from the candidate representative point set, acquiring target representative points from the distance distribution sequence, and acquiring a clustering result of the biochemical detection parameter based on the target representative points by using the CURE clustering algorithm; and obtaining an analysis result of the biochemical detection parameters from the clustering result of the biochemical detection parameters. The invention prevents the representative points in the CURE clustering algorithm from being concentrated in one region and improves the accuracy of the clustering result of the biochemical detection parameters.

Description

Method and system for biochemical data analysis

Technical Field

The invention relates to the technical field of digital data processing, in particular to a method and a system for biochemical data analysis.

Background

Biochemical data generally refers to data that reflect metabolism, physiological function, and disease state in a living body. Biochemical data analysis is widely used in medicine: by analyzing biochemical index data from the human body, the metabolic condition of the body can be understood and the likelihood of certain diseases can be assessed. Common biochemical data include the biochemical detection parameters of human urine, and urine biochemical analysis is used to study the effect of kidney disease on the human body. At present, because biochemical data analysis is complex, its quality is poor and it cannot provide accurate scientific support for medical treatment.

To determine the level of an index in a patient, changes in the indexes of patients with kidney disease have mainly been studied with statistical methods, but this approach is time-consuming, labor-intensive, and error-prone. With the development of digital data processing, the index levels in urine can be obtained quickly by applying cluster analysis to the biochemical data from urine tests. For example, the CURE hierarchical clustering algorithm can be used to cluster complex biochemical data. However, because patients with kidney disease of different severity have different detection indexes, the choice of cluster representative points strongly affects the clustering, and the accuracy of the cluster analysis tends to be poor.

Disclosure of Invention

The invention provides a method and a system for biochemical data analysis, which aim to solve the problem of poor accuracy of cluster analysis, and the adopted technical scheme is as follows:

In a first aspect, an embodiment of the present invention provides a method for biochemical data analysis, the method comprising the following steps:

acquiring a time sequence of biochemical detection parameters;

acquiring a local density and a neighbor data set of each data point in the time sequence of each biochemical detection parameter according to the time sequence of each biochemical detection parameter, and acquiring a local density change sequence of each data point in the time sequence of each biochemical detection parameter according to the local density and the neighbor data set of each data point in the time sequence of each biochemical detection parameter; acquiring a density difference index of each data point in the time sequence of each biochemical detection parameter according to the local density change sequence of each data point in the time sequence of each biochemical detection parameter; obtaining the structural change index of each data point in the time sequence of each biochemical detection parameter according to the density difference index of each data point in the time sequence of each biochemical detection parameter;

acquiring a candidate representative point set of each biochemical detection parameter according to the structural change index of the data points in the time sequence of each biochemical detection parameter; obtaining the distance distribution difference degree of each candidate representative point in the candidate representative point set of each biochemical detection parameter according to the candidate representative point set of each biochemical detection parameter; acquiring target representative points of each biochemical detection parameter according to the distance distribution difference degree of the candidate representative points in the candidate representative point set of each biochemical detection parameter, and acquiring a clustering result of each biochemical detection parameter based on the target representative points of each biochemical detection parameter by adopting a CURE clustering algorithm;

and acquiring an abnormal cluster of each biochemical detection parameter according to the clustering result of each biochemical detection parameter, and acquiring an analysis result of the biochemical detection parameter according to the abnormal cluster of the biochemical detection parameter.

Preferably, the method for obtaining the local density and the neighbor data set of each data point in the time sequence of each biochemical detection parameter according to the time sequence of each biochemical detection parameter, and obtaining the local density change sequence of each data point in the time sequence of each biochemical detection parameter according to the local density and the neighbor data set of each data point in the time sequence of each biochemical detection parameter comprises the following steps:

for the time sequence of each biochemical detection parameter, taking a set formed by all data points in the time sequence of the biochemical detection parameter as input of a DPC density peak clustering algorithm, and taking output of the DPC density peak clustering algorithm as local density of each data point in the time sequence of the biochemical detection parameter;

for each data point in the time sequence of each biochemical detection parameter, taking the data point as a central data point, and taking a set formed by all data points within a preset cut-off distance range of the central data point as a neighbor data set of the central data point;

for the time sequence of each biochemical detection parameter, a sequence formed by the local densities of all data points in the neighbor data set of each data point according to the ascending order of the numerical value is used as the local density change sequence of each data point.

Preferably, the method for obtaining the density difference index of each data point in the time sequence of each biochemical detection parameter according to the local density change sequence of each data point in the time sequence of each biochemical detection parameter comprises the following steps:

[Formula image not reproduced.] The formula defines the density difference index of the j-th data point in the time series of the i-th biochemical detection parameter in terms of the following quantities: an exponential function with the natural constant as its base; the number of data points in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter; a distance function applied to two local density change sequences; the local density change sequence of the j-th data point in the time series of the i-th biochemical detection parameter; the local density change sequence of the c-th data point in the neighbor data set of that j-th data point; and the maximum value and the minimum value of the data in the local density change sequence of the j-th data point.

Preferably, the method for obtaining the structural change index of each data point in the time sequence of each biochemical detection parameter according to the density difference index of each data point in the time sequence of each biochemical detection parameter comprises the following steps:

acquiring the local data proximity of each data point in the time sequence of each biochemical detection parameter according to the neighbor data set of each data point in the time sequence of each biochemical detection parameter;

for each data point in the time sequence of each biochemical detection parameter, taking the value of the exponential function with the natural constant as base and the negative of the local data proximity of the data point as exponent as a first product factor, and taking the product of the first product factor and the density difference index of the data point as the structure change index of the data point.

Preferably, the method for obtaining the local data proximity of each data point in the time sequence of each biochemical detection parameter according to the neighbor data set of each data point in the time sequence of each biochemical detection parameter comprises the following steps:

[Formula image not reproduced.] The formula defines the local data proximity of the j-th data point in the time series of the i-th biochemical detection parameter in terms of the following quantities: the coefficient of variation of the data in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter; the number of data points in that neighbor data set; and the local densities of the d-th and the b-th data points in that neighbor data set.

Preferably, the method for obtaining the candidate representative point set of each biochemical detection parameter according to the structural change index of the data points in the time sequence of each biochemical detection parameter comprises the following steps:

taking a data set consisting of structural change indexes of all data points in the time sequence of each biochemical detection parameter as a structural data set of each biochemical detection parameter, taking all data in the structural data set of each biochemical detection parameter as the input of a k-means clustering algorithm, and taking the output of the k-means clustering algorithm as the clustering result of the structural data set of each biochemical detection parameter;

taking each cluster in the clustering result of the structural data set of each biochemical detection parameter as each data distribution category, acquiring the average value of all data in each data distribution category, and taking the average value as the average level of each data distribution category;

for each biochemical detection parameter, taking the data point closest to the average level of the data distribution category in each data distribution category as each candidate representative point, and taking a set formed by all candidate representative points as a candidate representative point set of the biochemical detection parameter.

Preferably, the specific method for obtaining the distance distribution difference degree of each candidate representative point in the candidate representative point set of each biochemical detection parameter according to the candidate representative point set of each biochemical detection parameter comprises the following steps:

[Formula image not reproduced.] The formula defines the distance distribution difference degree of the g-th candidate representative point in the candidate representative point set of the i-th biochemical detection parameter in terms of the following quantities: the number of candidate representative points in the candidate representative point set of the i-th biochemical detection parameter; the Euclidean distance function; the position in the data space of the g-th candidate representative point in that set; and the position in the data space of the h-th candidate representative point in that set.

Preferably, the method for obtaining the target representative point of each biochemical detection parameter according to the distance distribution difference degree of the candidate representative point in the candidate representative point set of each biochemical detection parameter and obtaining the clustering result of each biochemical detection parameter based on the target representative point of each biochemical detection parameter by adopting the CURE clustering algorithm comprises the following steps:

for each candidate representative point set of the biochemical detection parameters, taking a sequence formed by the distance distribution difference degrees of all candidate representative points in the candidate representative point set according to the ascending order of the numerical values as a distance distribution sequence of the biochemical detection parameters, and taking all candidate representative points corresponding to a preset number of distance distribution difference degrees at the tail end of the distance distribution sequence of the biochemical detection parameters as target representative points of the biochemical detection parameters;

for each biochemical detection parameter, taking all data in the time sequence of the biochemical detection parameter as input of a CURE clustering algorithm, taking a target representative point of the biochemical detection parameter as a representative point selected when all data points in the time sequence of the biochemical detection parameter are clustered, and taking output of the CURE clustering algorithm as a clustering result of the biochemical detection parameter.

Preferably, the method for obtaining the abnormal cluster of each biochemical detection parameter according to the clustering result of each biochemical detection parameter and obtaining the analysis result of the biochemical detection parameter according to the abnormal cluster of the biochemical detection parameter comprises the following steps:

for the clustering result of each biochemical detection parameter, calculating the element mean value of all elements in each clustering cluster in the clustering result, and obtaining a clustering cluster corresponding to the maximum element mean value and the minimum element mean value;

the biochemical detection parameters comprise urine specific gravity, urinary β2-microglobulin, urinary N-acetyl-β-D-glucosaminidase, and urinary cystatin C, wherein the cluster corresponding to the smallest element mean in the clustering result of urine specific gravity is taken as a first abnormal cluster, and the clusters corresponding to the largest element mean in the clustering results of the time sequences of urinary β2-microglobulin, urinary N-acetyl-β-D-glucosaminidase, and urinary cystatin C are taken as a second abnormal cluster, a third abnormal cluster, and a fourth abnormal cluster, respectively;

taking each element in the abnormal cluster of the biochemical detection parameters as each abnormal element, wherein each abnormal element represents the content of the biochemical detection parameters in urine of each patient, and taking the abnormal cluster of the biochemical detection parameters as an analysis result of the biochemical detection parameters.

In a second aspect, an embodiment of the present invention further provides a system for biochemical data analysis, including a memory, a processor, and a computer program stored in the memory and running on the processor, where the processor implements the steps of any one of the methods described above when executing the computer program.

The beneficial effects of the invention are as follows: the local density of the time series data of each biochemical detection parameter is obtained with a density peak clustering algorithm; a structure change index is obtained from the way the local density varies within the local neighborhood of each data point; candidate representative points are obtained from the structure change index with a k-means clustering algorithm; a distance distribution difference degree and a distance distribution sequence are obtained from the Euclidean distances between candidate representative points; and the representative points are selected adaptively from the distance distribution sequence. By combining the structural change in the local neighborhood of each data point with the Euclidean distances between candidates when selecting representative points adaptively, the method prevents the selected representative points from being concentrated in one region when the biochemical detection parameters are clustered and improves the accuracy of the clustering result of the biochemical detection parameters.

Drawings

In order to illustrate the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings described below show only some embodiments of the invention, and a person skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a flow chart of a method for biochemical data analysis according to an embodiment of the present invention;

FIG. 2 is a flowchart showing an embodiment of a method for biochemical data analysis according to the present invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to FIG. 1, a flowchart of a method for biochemical data analysis according to an embodiment of the present invention is shown. The method includes the following steps:

Step S001, obtaining a time sequence of biochemical detection parameters.

The invention mainly analyzes the biochemical indexes of urine from patients to obtain a cluster analysis result of the biochemical detection parameters. Biochemical detection parameter data for the urine of 500 recent patients are collected from the biochemical data platform of a hospital. The biochemical detection parameters comprise urine specific gravity, urinary β2-microglobulin, urinary N-acetyl-β-D-glucosaminidase, and urinary cystatin C. For each biochemical detection parameter, the sequence formed by the parameter data in ascending order of patient visit time is taken as the time series of that biochemical detection parameter.
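The ordering step above can be illustrated with a minimal sketch. The record layout (a list of dictionaries with a `visit_date` key), the function name `build_time_series`, and the sample values are all assumptions made for illustration; they do not come from the source.

```python
from datetime import date


def build_time_series(records, parameter):
    """Return the values of one parameter in ascending order of visit date."""
    ordered = sorted(records, key=lambda record: record["visit_date"])
    return [record[parameter] for record in ordered]


# Hypothetical records, one per patient (illustrative values only).
records = [
    {"visit_date": date(2023, 11, 3), "urine_specific_gravity": 1.012},
    {"visit_date": date(2023, 11, 1), "urine_specific_gravity": 1.020},
    {"visit_date": date(2023, 11, 2), "urine_specific_gravity": 1.008},
]
series = build_time_series(records, "urine_specific_gravity")
```

Each biochemical detection parameter would get its own series, built by calling the function once per parameter name.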

Thus, the time series of each biochemical detection parameter is obtained.

Step S002, obtaining a neighbor data set and local density according to the time sequence of the biochemical detection parameters, obtaining a local density change sequence and a density difference index according to the neighbor data set and the local density, and obtaining a structure change index according to the density difference index.

When selecting representative points, the traditional CURE clustering algorithm considers only the distances between data points. Some of the representative points selected in this way do not represent part of the data well, that is, they do not reflect the properties expected of representative points, and the accuracy of the resulting clustering is poor. The way representative points are selected therefore needs to be improved in order to obtain accurate clustering results and improve the accuracy of biochemical data analysis. A flowchart of an embodiment of the present invention is shown in FIG. 2.

Based on the above analysis, since kidney damage of different degrees is reflected in different data, the distribution of the biochemical data must be analyzed in order to obtain representative points accurately. The time series of each biochemical detection parameter is:

$$X_i = \{x_{i,1}, x_{i,2}, \ldots, x_{i,n}\}$$

where $X_i$ denotes the time series of the i-th biochemical detection parameter, and $x_{i,1}$ and $x_{i,n}$ denote the content of the i-th biochemical detection parameter in the urine of the 1st and the n-th patient, respectively.

For the time series of each biochemical detection parameter, the DPC (density peak clustering) algorithm is used. The preset cut-off distance is chosen so that the average number of data points lying within the cut-off distance of each data point accounts for a preset proportion of all data points in the data set. The set formed by all time series data of each biochemical detection parameter is taken as the input of the DPC algorithm, and the output of the DPC algorithm is taken as the local density of each data point in the time series of each biochemical detection parameter. The DPC algorithm is a well-known technique and is not described further here.
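As a rough illustration of the local density step, the sketch below uses the cut-off kernel commonly associated with density peak clustering: the local density of a point is the number of other points closer than the cut-off distance. The cut-off is picked with a common heuristic, namely a chosen quantile of the sorted pairwise distances. The `proportion` argument is a hypothetical parameter, because the value used in the source is not reproduced in this text, and the function names are illustrative.

```python
def choose_cutoff(values, proportion):
    """Pick the cut-off as a quantile of all pairwise distances (needs >= 2 points)."""
    n = len(values)
    dists = sorted(abs(values[a] - values[b]) for a in range(n) for b in range(a + 1, n))
    # Clamp the position so that very small proportions still yield a valid index.
    index = min(len(dists) - 1, max(0, round(len(dists) * proportion) - 1))
    return dists[index]


def local_density(values, cutoff):
    """Cut-off kernel: count the other points strictly closer than the cut-off."""
    return [
        sum(1 for k, other in enumerate(values) if k != j and abs(value - other) < cutoff)
        for j, value in enumerate(values)
    ]


values = [10, 11, 12, 50]            # scalar measurements of one parameter (made up)
densities = local_density(values, 1.5)
```

An isolated value such as 50 receives a density of 0, while the value in the middle of the dense group receives the highest density.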

The change of the local density of the data point neighbor region can reflect the data distribution structure in the data point neighbor region to a certain extent, and in order to select an effective representative point by using the CURE clustering algorithm, the spatial distribution of the data needs to be analyzed.

Specifically, for the time series of each biochemical detection parameter, each data point is taken in turn as a central data point, and the set of data points within the cut-off distance of the central data point is taken as the neighbor data set of that central data point. The sequence formed by the local densities of all data points in the neighbor data set of each data point, arranged in ascending order of value, is then taken as the local density change sequence of that data point.
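The neighbor data set and the local density change sequence can be sketched as follows. Whether the central data point belongs to its own neighbor data set is not stated in the text; this sketch assumes it does not. The function names and sample values are illustrative.

```python
def neighbor_indices(values, j, cutoff):
    """Indices of the points within the cut-off distance of point j (j itself excluded)."""
    return [k for k, value in enumerate(values) if k != j and abs(value - values[j]) <= cutoff]


def density_change_sequence(densities, neighbors):
    """Local densities of the neighbors, arranged in ascending order of value."""
    return sorted(densities[k] for k in neighbors)


values = [10, 11, 12, 50]        # made-up scalar measurements
densities = [1, 2, 1, 0]         # local densities of the same points (made up)
neighbors_of_first = neighbor_indices(values, 0, 2)
sequence_of_first = density_change_sequence(densities, neighbors_of_first)
```

Because neighbor data sets differ in size, the resulting sequences generally have different lengths, which matters for whatever distance function is later used to compare them.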

A density difference index is calculated for each data point in the time series of each biochemical detection parameter. [Formula image not reproduced.]

The larger the distance between the local density change sequence of the j-th data point in the time series of the i-th biochemical detection parameter and the local density change sequence of the c-th data point in the neighbor data set of that data point, and the larger the difference between the maximum value and the minimum value of the data in the local density change sequence of the j-th data point, the lower the similarity between the local density change sequences and the larger the local density variation. That is, the larger the density difference in the local region of the data point, the larger the density difference index.

Further, the structural change index of each data point in the time series of each biochemical detection parameter is calculated:

$$S_{i,j} = \exp(-L_{i,j}) \cdot D_{i,j}$$

where $L_{i,j}$ denotes the local data proximity of the j-th data point in the time series of the i-th biochemical detection parameter; $S_{i,j}$ denotes the structure change index of the j-th data point in the time series of the i-th biochemical detection parameter; $D_{i,j}$ denotes the density difference index of the j-th data point in the time series of the i-th biochemical detection parameter; and $\exp(\cdot)$ denotes the exponential function with the natural constant as its base. The local data proximity is computed from the coefficient of variation of the data in the neighbor data set of the j-th data point, the number of data points in that neighbor data set, and the local densities of the d-th and the b-th data points in that neighbor data set. [Formula image for the local data proximity not reproduced.]

The larger the differences between the local densities of the d-th and b-th data points in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter, and the larger the coefficient of variation of the data in that neighbor data set, the larger the density variation in the neighborhood of the data point, the larger the variation in its local spatial distribution, and the smaller the local data proximity. In addition, the smaller the local data proximity of the j-th data point in the time series of the i-th biochemical detection parameter and the larger its density difference index, the larger the data variation in the local region of the data point, that is, the larger the change in data structure, and the larger the structure change index.
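The relationship just described, in which the structure change index is the density difference index multiplied by the exponential of the negative local data proximity, can be checked numerically with a small sketch. The function name and the input values are illustrative; how the two inputs themselves are computed is not shown here.

```python
import math


def structure_change_index(proximity, density_difference):
    """Product of exp(-proximity) and the density difference index."""
    return math.exp(-proximity) * density_difference


# Smaller proximity or larger density difference gives a larger index.
low_proximity = structure_change_index(0.5, 1.0)
high_proximity = structure_change_index(1.0, 1.0)
larger_difference = structure_change_index(1.0, 2.0)
```

With these inputs, `low_proximity` exceeds `high_proximity`, and `larger_difference` is twice `high_proximity`, matching the monotonic behavior stated in the text.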

Thus, the structural change index of each data point in the time sequence of each biochemical detection parameter is obtained.

Step S003, a candidate representative point set is obtained according to the structure change index, a distance distribution sequence is obtained according to the candidate representative point set, a target representative point is obtained according to the distance distribution sequence, and a clustering result of biochemical detection parameters is obtained based on the target representative point by using a CURE clustering algorithm.

The structure change index reflects, to some extent, the dispersion of the local spatial distribution of the data. Because the representative points selected in the CURE clustering algorithm must be representative of the data characteristics, taking the data distribution in the neighborhood of each data point into account helps to select effective representative points and obtain a better clustering result.

Further, the data set formed by the structure change indexes of all data points in the time series of each biochemical detection parameter is taken as the structural data set of that parameter. To select suitable representative points, the k-means clustering algorithm is applied: all data in the structural data set of each biochemical detection parameter are taken as its input, the number of clusters k is preset to an empirical value of 30, the Euclidean distance is used as the distance measure, and its output is taken as the clustering result of the structural data set. For the clustering result of the structural data set of each biochemical detection parameter, each of the 30 clusters is taken as a data distribution category, and the mean of the data in each data distribution category is calculated and taken as the average level of that category.

Based on the above analysis, for the data of each biochemical detection parameter, the representative points are selected with the data structure information taken into account so that they are effective. Specifically, for each biochemical detection parameter, the data point whose element value is closest to the average level of its data distribution category is taken as the candidate representative point of that category, and the set formed by all candidate representative points is taken as the candidate representative point set of the biochemical detection parameter. The different candidate representative points in the set can, to some extent, represent data points with different data distribution characteristics.
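The candidate selection can be sketched with a plain one-dimensional k-means (Lloyd's algorithm) followed by a nearest-to-mean lookup. This is a minimal illustration, not the source's implementation: the deterministic initialization from evenly spaced order statistics is an assumption, the demo uses k = 2 rather than the empirical value of 30, and the function names and data are illustrative.

```python
def kmeans_1d(data, k, iterations=100):
    """Lloyd's algorithm on scalars; returns one cluster label per element."""
    ordered = sorted(data)
    # Assumed initialization: evenly spaced order statistics (deterministic).
    centers = [ordered[(2 * c + 1) * len(ordered) // (2 * k)] for c in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in data]
        for c in range(k):
            members = [x for x, label in zip(data, new_labels) if label == c]
            if members:  # keep the old center if a cluster becomes empty
                centers[c] = sum(members) / len(members)
        if new_labels == labels:
            break
        labels = new_labels
    return labels


def candidate_representatives(structure_indices, labels):
    """For each cluster, the index of the point closest to the cluster mean."""
    candidates = []
    for c in sorted(set(labels)):
        members = [j for j, label in enumerate(labels) if label == c]
        mean_level = sum(structure_indices[j] for j in members) / len(members)
        candidates.append(min(members, key=lambda j: abs(structure_indices[j] - mean_level)))
    return candidates


structure_indices = [0.1, 0.2, 0.3, 5.0, 5.1, 5.3]   # made-up structure change indexes
labels = kmeans_1d(structure_indices, k=2)
candidates = candidate_representatives(structure_indices, labels)
```

The returned indices refer back to positions in the time series, so each candidate representative point can be mapped to the original measurement of the corresponding patient.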

Further, the selected representative points should not be concentrated in one region, because concentrated representative points cannot adequately represent the whole data set. Therefore, taking the Euclidean distances between candidate representative points into account, the distance distribution difference degree of each candidate representative point in the candidate representative point set of each biochemical detection parameter is calculated. [Formula image not reproduced.]

The smaller the Euclidean distance in the data space between the g-th and h-th candidate representative points in the candidate representative point set of the i-th biochemical detection parameter, the closer the g-th candidate representative point is to the remaining candidate representative points, the more likely the representative points are to be concentrated, and the smaller the distance distribution difference degree.

Further, for each candidate representative point set of the biochemical detection parameters, a sequence formed by the distance distribution differences of all candidate representative points in the candidate representative point set according to the ascending order of the values is used as a distance distribution sequence.
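The sorting and selection of target representative points can be sketched as below. The exact formula for the distance distribution difference degree is not reproduced in this text, so the sketch substitutes an assumed stand-in, the mean Euclidean distance from a candidate to all other candidates, which at least shares the stated property that candidates close to the others score low. Positions are treated as scalars, and the function names and values are illustrative.

```python
def distance_distribution_difference(positions):
    """Assumed stand-in: mean distance from each candidate to all other candidates."""
    m = len(positions)
    return [
        sum(abs(p - q) for h, q in enumerate(positions) if h != g) / (m - 1)
        for g, p in enumerate(positions)
    ]


def select_target_representatives(positions, count):
    """Sort candidates by ascending difference degree and keep the last `count`."""
    diffs = distance_distribution_difference(positions)
    order = sorted(range(len(positions)), key=lambda g: diffs[g])
    return order[-count:]


positions = [0.0, 1.0, 2.0, 10.0]        # made-up candidate positions
targets = select_target_representatives(positions, 2)
```

Taking the tail of the ascending sequence keeps the candidates that lie farthest from the rest, which is what spreads the chosen representative points across the data.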

For the time series data of each biochemical detection parameter, the CURE clustering algorithm is applied: all time series data of the parameter are taken as its input, the preset number of clusters is 15, the preset number of representative points is 20, the shrink factor takes an empirical value of 0.9, and its output is taken as the clustering result of the time series data of that parameter. Note that, because the preset number of representative points is 20, the target representative points are the candidate representative points corresponding to the last 20 elements of the distance distribution sequence of each parameter. Thus, 20 distinct target representative points are available when the time series data of each biochemical detection parameter are clustered.
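For orientation, the role of the shrink factor in CURE can be sketched as follows: representative points are moved toward the cluster centroid by the shrink factor, and the distance between two clusters is the smallest distance between their representative points. This sketch covers only these two standard CURE ingredients on scalar data; how the target representative points are supplied to the algorithm in the source is not shown, and the function names and values are illustrative.

```python
def shrink_representatives(representatives, centroid, shrink_factor):
    """Move each representative point toward the centroid by the shrink factor."""
    return [p + shrink_factor * (centroid - p) for p in representatives]


def cluster_distance(reps_a, reps_b):
    """Distance between clusters: closest pair of representative points."""
    return min(abs(p - q) for p in reps_a for q in reps_b)


shrunk = shrink_representatives([0.0, 10.0], centroid=5.0, shrink_factor=0.9)
gap = cluster_distance([1.0, 2.0], [5.0, 9.0])
```

With a shrink factor of 0.9, the representative points end up close to the centroid, which damps the influence of outlying points on merge decisions.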

The traditional CURE clustering algorithm selects representative points based on distance alone, which easily leads to representative points concentrated in one region that cannot represent the biochemical detection parameter data set well. The present method considers the structural change in the local neighborhood of each data point and combines it with the distances between data points to obtain representative points that better represent the data, thereby yielding more accurate clustering results.

So far, the clustering result of each biochemical detection parameter is obtained.

Step S004, performing abnormal analysis according to the clustering result of the biochemical detection parameters to obtain an analysis result of the biochemical detection parameters.

The clustering results corresponding to urine specific gravity, urine microglobulin, urinary N-acetyl-D amino acid glucosidase, and urinary cystatin C are obtained respectively. For each biochemical detection parameter, the mean value of the elements in each cluster of the clustering result is calculated, and the cluster with the maximum element mean and the cluster with the minimum element mean are obtained.
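The per-cluster mean computation described above can be sketched as follows. The function names, the label encoding, and the sample values are hypothetical and serve only to illustrate the calculation.

```python
def cluster_means(values, labels):
    """Map each cluster label to the mean of the values assigned to it."""
    totals, counts = {}, {}
    for value, label in zip(values, labels):
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}

def extreme_clusters(values, labels):
    """Return (label with maximum mean, label with minimum mean)."""
    means = cluster_means(values, labels)
    return max(means, key=means.get), min(means, key=means.get)

# Hypothetical measurements of one parameter and their cluster labels.
values = [1.005, 1.010, 1.020, 1.025, 1.030]
labels = [0, 0, 1, 1, 2]
max_label, min_label = extreme_clusters(values, labels)
print(max_label, min_label)
```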

Medical research shows that high levels of urine microglobulin, urinary N-acetyl-D amino acid glucosidase, and urinary cystatin C in the human body are abnormal and indicate that kidney disease is likely to be present; a low urine specific gravity is likewise abnormal and makes kidney disease more likely.

Therefore, the cluster corresponding to the minimum element mean in the clustering result of the time series data of urine specific gravity is taken as the first abnormal cluster, and the clusters corresponding to the maximum element mean in the clustering results of the time series data of urine microglobulin, urinary N-acetyl-D amino acid glucosidase, and urinary cystatin C are taken as the second, third, and fourth abnormal clusters, respectively. An abnormal cluster reflects, to a certain extent, an abnormal condition of the patient's kidneys, and the patient corresponding to each element in an abnormal cluster is more likely to suffer from kidney disease. Therefore, the abnormal clusters of the biochemical detection parameters are taken as the analysis result of the biochemical detection parameters.
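The rule above, in which the abnormal cluster is the one with the lowest mean for urine specific gravity and the one with the highest mean for the other three parameters, can be sketched as follows. The dictionary keys, the function name, and the sample values are hypothetical.

```python
from statistics import mean

# Direction treated as abnormal for each parameter, as stated in the text.
# The key names are illustrative only.
ABNORMAL_DIRECTION = {
    "urine_specific_gravity": "low",
    "urine_microglobulin": "high",
    "urinary_n_acetyl_d_amino_acid_glucosidase": "high",
    "urinary_cystatin_c": "high",
}

def abnormal_cluster(parameter, clusters):
    """Return (label, members) of the abnormal cluster for one parameter.

    `clusters` maps a cluster label to the list of values in that cluster.
    """
    pick = min if ABNORMAL_DIRECTION[parameter] == "low" else max
    label = pick(clusters, key=lambda lab: mean(clusters[lab]))
    return label, clusters[label]

# Hypothetical clustering results for two of the parameters.
gravity_clusters = {0: [1.005, 1.008], 1: [1.020, 1.022]}
cystatin_clusters = {0: [0.8, 0.9], 1: [2.1, 2.4]}
first = abnormal_cluster("urine_specific_gravity", gravity_clusters)
fourth = abnormal_cluster("urinary_cystatin_c", cystatin_clusters)
print(first, fourth)
```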

Based on the same inventive concept as the method, an embodiment of the invention also provides a system for biochemical data analysis. After the clustering result of each biochemical detection parameter is obtained, it is transmitted to an abnormal analysis module, which uses the above method to obtain the abnormal cluster in the clustering result of each biochemical detection parameter; these abnormal clusters are taken as the analysis result of the biochemical detection parameters.

In this specification, the embodiments are described in a progressive manner; identical or similar parts of the embodiments may be understood by reference to one another, and each embodiment focuses on its differences from the others. The above describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent substitution, or improvement made within the principles of the present invention shall fall within its scope of protection.

Claims

1. A method for biochemical data analysis, the method comprising the steps of:

acquiring a time sequence of biochemical detection parameters;

2. The method for analyzing biochemical data according to claim 1, wherein the method for obtaining the local density and the neighbor data set of each data point in the time sequence of each biochemical detection parameter according to the time sequence of each biochemical detection parameter, and obtaining the local density change sequence of each data point in the time sequence of each biochemical detection parameter according to the local density and the neighbor data set of each data point in the time sequence of each biochemical detection parameter comprises:

3. The method for analyzing biochemical data according to claim 1, wherein the method for obtaining the density difference index of each data point in the time series of each biochemical detection parameter from the local density variation sequence of each data point in the time series of each biochemical detection parameter comprises:

wherein the terms of the formula denote, respectively: the density difference index of the j-th data point in the time series of the i-th biochemical detection parameter; the exponential function with the natural constant as its base; the number of data points in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter; the distance function; the local density change sequence of the j-th data point in the time series of the i-th biochemical detection parameter; the local density change sequence of the c-th data point in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter; and the maximum value and the minimum value of the data in the local density change sequence of the j-th data point in the time series of the i-th biochemical detection parameter.

4. The method for analyzing biochemical data according to claim 1, wherein the method for obtaining the structural change index of each data point in the time series of each biochemical test parameter according to the density difference index of each data point in the time series of each biochemical test parameter comprises:

5. The method for analyzing biochemical data according to claim 4, wherein the method for obtaining the local data proximity of each data point in the time series of each biochemical detection parameter from the neighboring data set of each data point in the time series of each biochemical detection parameter comprises:

wherein the terms of the formula denote, respectively: the local data proximity of the j-th data point in the time series of the i-th biochemical detection parameter; the coefficient of variation of the data in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter; the number of data points in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter; and the local densities of the d-th and b-th data points in the neighbor data set of the j-th data point in the time series of the i-th biochemical detection parameter.

6. The method for analyzing biochemical data according to claim 1, wherein the method for obtaining the candidate representative point set of each biochemical detection parameter according to the structural change index of the data points in the time series of each biochemical detection parameter is as follows:

7. The method for analyzing biochemical data according to claim 1, wherein the specific method for obtaining the distance distribution difference degree of each candidate representative point in the candidate representative point set of each biochemical detection parameter according to the candidate representative point set of each biochemical detection parameter is as follows:

8. The method for analyzing biochemical data according to claim 1, wherein the method for obtaining the target representative point of each biochemical detection parameter according to the difference of the distance distribution of the candidate representative points in the candidate representative point set of each biochemical detection parameter, and obtaining the clustering result of each biochemical detection parameter based on the target representative point of each biochemical detection parameter by using the CURE clustering algorithm comprises the following steps:

9. The method for analyzing biochemical data according to claim 1, wherein the method for acquiring the abnormal cluster of each biochemical detection parameter according to the clustering result of each biochemical detection parameter and acquiring the analysis result of the biochemical detection parameter according to the abnormal cluster of the biochemical detection parameter comprises the steps of:

the biochemical detection parameters comprise urine specific gravity, urine microglobulin, urinary N-acetyl-D amino acid glucosidase, and urinary cystatin C; the cluster corresponding to the minimum element mean in the clustering result of the time sequence of urine specific gravity is taken as a first abnormal cluster, and the clusters corresponding to the maximum element mean in the clustering results of the time sequences of urine microglobulin, urinary N-acetyl-D amino acid glucosidase, and urinary cystatin C are taken as a second abnormal cluster, a third abnormal cluster, and a fourth abnormal cluster, respectively;

10. A system for biochemical data analysis comprising a memory, a processor and a computer program stored in the memory and running on the processor, characterized in that the processor implements the steps of the method according to any one of claims 1-9 when executing the computer program.