CN117851836B - Intelligent data analysis method for pension information service system - Google Patents

Publication number
CN117851836B
CN117851836B CN202410245134.3A CN202410245134A CN117851836B CN 117851836 B CN117851836 B CN 117851836B CN 202410245134 A CN202410245134 A CN 202410245134A CN 117851836 B CN117851836 B CN 117851836B
Authority
CN
China
Prior art keywords
sample
data
representative
obtaining
day
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410245134.3A
Other languages
Chinese (zh)
Other versions
CN117851836A (en)
Inventor
倪佳斌
应必善
蔡修成
朱小龙
李川
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Pukang Intelligent Old Age Industry Technology Co ltd
Zhejiang Pukang Intelligent Elderly Care Industry Technology Co ltd
Original Assignee
Suzhou Pukang Intelligent Old Age Industry Technology Co ltd
Zhejiang Pukang Intelligent Elderly Care Industry Technology Co ltd

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of physiological data processing, and in particular to an intelligent data analysis method for a pension information service system. The method obtains, in each dimension, the data structure similarity between samples and the degree of data abnormality of each sample within each day; combines these with the relative distance between the data sequences of the corresponding samples within each day to obtain a distance metric between samples; and obtains a local range density for each sample. While performing CURE clustering on all samples, initial sample clusters are obtained; any preset number of samples in each initial sample cluster is taken as a group of representative points, the optimal representative point group is selected according to the representative degree of each group, and the initial sample clusters are clustered to obtain a clustering result, according to which the elderly are served. By obtaining suitable representative points within the sample clusters, the invention improves the clustering effect and provides personalized services for the elderly.

Description

Intelligent data analysis method for pension information service system
Technical Field
The invention relates to the technical field of physiological data processing, and in particular to an intelligent data analysis method for a pension information service system.
Background
Intelligent data analysis for pension information service systems has developed rapidly in the industry and has become one of the key means of improving the quality and efficiency of elderly care services. The system platform can analyze users' health data, daily activity information, and social interactions to raise the level of service for the elderly, and it plays an important role in promoting their health and social contact.
By monitoring the health data of the elderly and applying the idea of clustering, elderly people in different physical states can be grouped, and more personalized and accurate services can be provided to each group according to the clustering result. In the prior art, the CURE clustering algorithm is used to perform cluster analysis on the physiological parameter data of different elderly people. In the traditional CURE algorithm, the distance between clusters is measured according to the positions of the representative points in the clusters and a shrinkage factor. However, differences in the selected representative points and shrinkage factors lead to large differences in the resulting inter-cluster distances. Because suitable representative points in the clusters are not obtained, the inter-cluster distance measurement is inaccurate, the clustering effect is poor, and personalized services cannot be provided.
Disclosure of Invention
To solve the technical problem that suitable representative points in the clusters are not obtained and the clustering effect is therefore poor, the invention aims to provide an intelligent data analysis method for a pension information service system. The technical scheme adopted is as follows:
The invention provides an intelligent data analysis method for a pension information service system, which comprises the following steps:
taking each elderly person as one sample, acquiring multidimensional physiological parameter data of each sample at each moment, and acquiring a multidimensional data sequence of each sample at each hour in each day;
Obtaining, in each dimension, the data structure similarity between samples according to the correlation coefficient of the data sequences of the samples within each day; obtaining the degree of data abnormality of each sample within each day according to the correlation coefficients of that sample's data sequences between each day and the other days in the time neighborhood range;
Obtaining a distance metric between samples according to the data structure similarity between the samples in each dimension, the degree of data abnormality of the corresponding samples within each day, and the relative distance between the data sequences; obtaining the local range density of each sample according to the distance metric between that sample and each other sample in the sample neighborhood range;
In the process of performing CURE clustering on all samples, obtaining initial sample clusters according to the distance metric between samples; in each initial sample cluster, obtaining the representativeness of the center sample of each sample neighborhood range according to the distance metric and the density difference characteristics between the samples in that range; taking a preset number of samples in each initial sample cluster as a group of representative points, and obtaining the representative degree of each group according to the representativeness of its representative points and the corresponding distance metrics; selecting the optimal representative point group according to the representative degree of each group of representative points in each initial sample cluster, and clustering the initial sample clusters to obtain a clustering result;
And serving the elderly according to the clustering result.
Further, the method for obtaining the data structure similarity comprises the following steps: in each dimension, concatenating the data sequences of each sample at all hours of each day to form the overall data sequence of that sample for the day; calculating the correlation coefficient between the overall data sequences of two samples within each day, and averaging the correlation coefficients over all days to obtain the data structure similarity between the two samples.
Further, the method for acquiring the data anomaly degree comprises the following steps:
the degree of data abnormality is obtained according to its formula (Formula 1), in which: the result is the degree of data abnormality of the i-th sample in the l-th dimension within the r-th day; ω denotes the number of other days on each side of the r-th day within the time neighborhood range centered on the r-th day; the two data sequences are those of the i-th sample in the l-th dimension at the t-th hour on the r-th day and on the (r+u)-th day, respectively; the correlation coefficient is taken between these two data sequences, that is, for the t-th hour between the r-th and (r+u)-th days; and min denotes the minimum function.
Further, the method for obtaining the distance measurement comprises the following steps:
The distance metric is obtained according to its formula (Formula 2), in which: the result is the distance metric between the i-th sample and the j-th sample; m denotes the number of dimensions of the physiological parameter data; the data structure similarity is that between the i-th and j-th samples in the l-th dimension; n denotes the number of days over which the physiological parameter data are collected; the two degrees of data abnormality are those of the i-th sample and of the j-th sample in the l-th dimension within the v-th day; the relative distance is that between the data sequences of the i-th and j-th samples in the l-th dimension on the v-th day; and exp() denotes the exponential function with the natural constant as its base.
Further, the method for obtaining the local range density comprises the following steps:
Performing negative correlation mapping on the distance measurement between each sample and each other sample in a sample neighborhood range, and accumulating all negative correlation mapping results to obtain a first accumulated value; and calculating the ratio of the first accumulated value to the preset distance threshold value to obtain the local range density of each sample.
Further, the method for obtaining the representativeness comprises the following steps:
In each initial sample cluster, for each sample in each sample neighborhood range, finding the other sample in that range with the largest distance metric to it and taking that other sample as the comparison sample; calculating the density difference between each sample in the sample neighborhood range and its comparison sample, and averaging all the density differences to obtain the representativeness of the center sample of the sample neighborhood range.
Further, the representative degree obtaining method includes:
for each group of representative points in each initial sample cluster, calculating, for every pair of representative points, the mean of their representativeness as a first representative mean value; calculating the product of the first representative mean value of each pair and the distance metric between that pair to obtain a first product value;
and calculating the mean of the first product values over all pairs of representative points to obtain the representative degree of the group of representative points.
Further, the method for obtaining the optimal representative point group comprises the following steps:
and selecting a group of representative points with the maximum representative degree corresponding to all groups of representative points in each initial sample cluster as an optimal representative point group.
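The scoring and selection of representative point groups described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the exhaustive enumeration of candidate groups with `itertools.combinations`, and the toy values are all assumptions, and the representativeness values and distance metrics are taken as already computed.

```python
from itertools import combinations

import numpy as np


def representative_degree(group, representativeness, distance_matrix):
    """Mean over all pairs of (mean representativeness of the pair * pair distance)."""
    products = [
        0.5 * (representativeness[a] + representativeness[b]) * distance_matrix[a][b]
        for a, b in combinations(group, 2)
    ]
    return float(np.mean(products))


def best_representative_group(cluster, representativeness, distance_matrix, group_size):
    """Enumerate every candidate group (assumed strategy) and keep the highest score."""
    candidates = combinations(cluster, group_size)
    return max(
        candidates,
        key=lambda g: representative_degree(g, representativeness, distance_matrix),
    )


# Toy cluster of three samples (hypothetical values).
rep = [1.0, 0.2, 0.8]
D = np.array([[0.0, 1.0, 4.0],
              [1.0, 0.0, 2.0],
              [4.0, 2.0, 0.0]])
best = best_representative_group([0, 1, 2], rep, D, group_size=2)
```

Enumerating all groups grows combinatorially with the cluster size, so a practical system would probably sample or prune candidates; the text does not say how candidate groups are generated.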
Further, the method for acquiring the clustering result comprises the following steps:
for every two initial sample clusters, calculating the mean of the distance metrics between all representative points of their optimal representative point groups as the average distance between the two clusters; and merging the initial sample clusters by CURE according to the average distance to obtain new initial sample clusters, repeating until the preset number of clusters is reached, to obtain the clustering result.
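One merge decision of this procedure might look like the sketch below. It assumes the optimal representative point group of each cluster is already known and is given as a tuple of sample indices; the helper names and the toy distance matrix are hypothetical, and standard CURE details such as shrinking representative points are left out.

```python
import numpy as np


def average_group_distance(group_a, group_b, distance_matrix):
    """Mean distance metric over all cross pairs of representative points."""
    return float(np.mean([distance_matrix[a][b] for a in group_a for b in group_b]))


def closest_cluster_pair(rep_groups, distance_matrix):
    """Return (index_a, index_b, average_distance) for the two closest clusters."""
    best = None
    for a in range(len(rep_groups)):
        for b in range(a + 1, len(rep_groups)):
            d = average_group_distance(rep_groups[a], rep_groups[b], distance_matrix)
            if best is None or d < best[2]:
                best = (a, b, d)
    return best


# Three clusters described by their representative point groups (toy values).
D = np.array([[0.0, 1.0, 5.0, 6.0],
              [1.0, 0.0, 4.0, 7.0],
              [5.0, 4.0, 0.0, 9.0],
              [6.0, 7.0, 9.0, 0.0]])
rep_groups = [(0,), (1,), (2, 3)]
pair = closest_cluster_pair(rep_groups, D)
```

After each merge, the representative point group of the merged cluster would need to be selected again before the next pair is chosen, and the loop stops once the preset number of clusters remains.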
Further, the correlation coefficient is the Pearson correlation coefficient.
The invention has the following beneficial effects:
To determine whether the trends in the physiological parameters of two samples within each day are similar, the data structure similarity between samples is obtained, in each dimension, from the correlation coefficient of the data sequences of the two samples within each day. The degree of data abnormality of each sample within each day is obtained from the correlation coefficients of that sample's data sequences between each day and the other days in the time neighborhood range, which allows the trend of an elderly person's physiological condition over different periods to be analyzed. The distance metric between samples is obtained from the data structure similarity in each dimension, the degree of data abnormality of the corresponding samples within each day, and the relative distance between the data sequences; this captures fine differences between samples and improves the accuracy of the distance metric. The local range density of each sample is obtained from the distance metric between that sample and each other sample in its sample neighborhood range; this accounts for differences in life patterns among the elderly, analyzes the spatial relationship between samples, and better captures the distribution of the data. While performing CURE clustering on all samples, initial sample clusters are obtained according to the distance metric between samples. In each initial sample cluster, the representativeness of the center sample of each sample neighborhood range is obtained from the distance metric and the density difference characteristics between the samples in that range, which better identifies and describes the structure and characteristics of each cluster and evaluates the representativeness of each sample within a local range. A preset number of samples in each initial sample cluster is taken as a group of representative points, and the representative degree of each group is obtained from the representativeness of its representative points and the corresponding distance metrics, so that highly representative samples are identified and the structure and distribution of the clusters are better understood. The optimal representative point group is selected according to the representative degree of each group in each initial sample cluster, and the initial sample clusters are clustered to obtain the clustering result, so that the true cluster structure of the data is found more accurately and quickly. By obtaining suitable representative points within the sample clusters, the invention improves the clustering effect and provides personalized services for the elderly.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an intelligent data analysis method for a pension information service system according to an embodiment of the present invention.
Detailed Description
To further describe the technical means adopted by the invention to achieve its intended purpose and their effects, the specific implementation, structure, features, and effects of the intelligent data analysis method for a pension information service system proposed by the invention are described in detail below with reference to the accompanying drawings and preferred embodiments. In the following description, different references to "one embodiment" or "another embodiment" do not necessarily refer to the same embodiment. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The specific scheme of the intelligent data analysis method for a pension information service system provided by the invention is described below with reference to the accompanying drawings.
Referring to Fig. 1, which shows a flowchart of an intelligent data analysis method for a pension information service system according to an embodiment of the present invention, the method specifically includes:
Step S1: taking each elderly person as one sample, acquiring multidimensional physiological parameter data of each sample at each moment, and acquiring multidimensional data sequences of each sample at each hour in each day.
In one embodiment of the present invention, to improve the level of elderly care services, the health data of the elderly are continuously monitored so that more personalized and accurate services can be provided. First, the physiological parameter data of the elderly are monitored in real time through wearable devices such as smartwatches and health bracelets. Each elderly person is taken as one sample, and the multidimensional physiological parameter data of each sample are acquired at each moment. The dimensions include heart rate, blood pressure, blood oxygen saturation, body temperature, blood glucose level, respiratory rate, and so on. The multidimensional data sequence of each sample in each hour of each day is then obtained, and the physiological parameter data of different elderly people are analyzed so that more personalized and accurate services can be provided for them.
It should be noted that, owing to external factors such as equipment errors, the data may contain noise that interferes with subsequent analysis. In one embodiment of the invention, to facilitate subsequent processing, the acquired physiological parameter data are denoised as a preprocessing step, which removes the influence of noise and improves data quality. The specific denoising algorithm is a technical means well known to those skilled in the art and is not described in detail here.
It should be noted that, in this embodiment of the present invention, the data are acquired at an interval of 1 min; in other embodiments of the present invention, the acquisition interval may be set according to the specific situation, which is not limited or described further here.
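As an illustration of how per-minute readings can be arranged into the hourly data sequences used by the later steps, the sketch below reshapes one sample's series for one dimension into an array indexed by day, hour, and minute. The function name, the array layout, and the synthetic heart-rate values are assumptions made for this example; denoising is omitted.

```python
import numpy as np

MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24


def to_hourly_sequences(minute_values):
    """Split a 1-D per-minute series into an array of shape (days, 24, 60)."""
    values = np.asarray(minute_values, dtype=float)
    minutes_per_day = HOURS_PER_DAY * MINUTES_PER_HOUR
    n_days = values.size // minutes_per_day  # drop any trailing partial day
    trimmed = values[: n_days * minutes_per_day]
    return trimmed.reshape(n_days, HOURS_PER_DAY, MINUTES_PER_HOUR)


# Two days of synthetic heart-rate readings for one sample and one dimension.
rng = np.random.default_rng(0)
heart_rate = 70 + rng.normal(0, 3, size=2 * 24 * 60)
hourly = to_hourly_sequences(heart_rate)
```

With this layout, `hourly[r, t]` is the data sequence at the t-th hour of the r-th day, and `hourly[r].ravel()` is the overall data sequence for that day.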
Step S2: obtaining, in each dimension, the data structure similarity between samples according to the correlation coefficient of the data sequences of the samples within each day; and obtaining the degree of data abnormality of each sample within each day according to the correlation coefficients of that sample's data sequences between each day and the other days in the time neighborhood range.
Because the living habits of different elderly people differ, different activities produce different physiological parameter data; for example, body temperature and heart rate rise to some extent during exercise. The correlation coefficient quantifies the linear relationship between the data sequences of two samples and shows whether the trends of their physiological parameters within each day are similar. Comparing these similarities reveals common characteristics and patterns among the samples, so that the physiological condition of the elderly can be analyzed more accurately. Therefore, in each dimension, the data structure similarity between samples is obtained from the correlation coefficient of the samples' data sequences within each day.
Preferably, in one embodiment of the present invention, the method for acquiring data structure similarity includes:
In each dimension, the data sequences of each sample at all hours of each day are concatenated to form the overall data sequence of that sample for the day; the correlation coefficient between the overall data sequences of two samples is calculated for each day, and the correlation coefficients over all days are averaged to obtain the data structure similarity between the two samples. In one embodiment of the invention, the formula for the data structure similarity (Formula 3) is expressed as:

$$S^{l}_{i,j} = \frac{1}{n}\sum_{r=1}^{n} \mathrm{corr}\left(X^{l}_{i,r},\, X^{l}_{j,r}\right)$$

where $S^{l}_{i,j}$ denotes the data structure similarity between the i-th sample and the j-th sample in the l-th dimension; $X^{l}_{i,r}$ denotes the overall data sequence of the i-th sample in the l-th dimension within the r-th day; $X^{l}_{j,r}$ denotes the overall data sequence of the j-th sample in the l-th dimension within the r-th day; $\mathrm{corr}(X^{l}_{i,r}, X^{l}_{j,r})$ denotes the correlation coefficient of the overall data sequences of the i-th and j-th samples in the l-th dimension within the r-th day; and n denotes the number of days over which the physiological parameter data are collected.
In the formula for the data structure similarity, the larger the correlation coefficient between the overall data sequences of two samples in the l-th dimension within each day, the smaller the difference between the data and the higher the data structure similarity, that is, the closer the life patterns of the two elderly people.
It should be noted that, in one embodiment of the present invention, the correlation coefficient is the Pearson correlation coefficient; the Pearson correlation coefficient is a technical means well known to those skilled in the art and is not described further here.
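The averaging of daily Pearson coefficients described above can be sketched in a few lines. The function name and the input layout (one row per day, holding that day's overall data sequence for a single dimension) are assumptions made for illustration.

```python
import numpy as np


def data_structure_similarity(seq_i, seq_j):
    """Average daily Pearson correlation between two samples in one dimension.

    seq_i, seq_j: arrays of shape (n_days, points_per_day).
    """
    seq_i = np.asarray(seq_i, dtype=float)
    seq_j = np.asarray(seq_j, dtype=float)
    daily_corr = [np.corrcoef(a, b)[0, 1] for a, b in zip(seq_i, seq_j)]
    return float(np.mean(daily_corr))


# Two days of per-minute data: sample b follows the same daily trend as sample a.
t = np.linspace(0.0, 2.0 * np.pi, 1440)
a = np.vstack([np.sin(t), 2.0 * np.sin(t)])
b = np.vstack([np.sin(t) + 1.0, 3.0 * np.sin(t)])
sim_same = data_structure_similarity(a, b)
sim_opposite = data_structure_similarity(a, -b)
```

Because the Pearson coefficient ignores offset and scale, samples with the same daily trend score close to 1 even when their absolute values differ. A constant sequence has zero variance, so `np.corrcoef` returns NaN for that day and such days would need separate handling.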
The daily life of the same elderly person is not constant: activities differ from day to day and produce different physiological parameter data. By analyzing the data sequences of each sample across different days, the trend of the person's physiological condition over different periods can be analyzed. If a sample's data on one day are highly correlated with its data on the other days in the neighborhood range, the data on that day are more likely to be normal; conversely, the smaller the correlation coefficient between a sample's data on one day and its data on the other days in the neighborhood range, the more likely the data on that day are abnormal. Therefore, the degree of data abnormality of each sample within each day is obtained from the correlation coefficients of that sample's data sequences between each day and the other days in the time neighborhood range.
Preferably, in one embodiment of the present invention, the method for acquiring the degree of abnormality of data includes:
the degree of data abnormality is obtained according to its formula (Formula 1), in which: the result is the degree of data abnormality of the i-th sample in the l-th dimension within the r-th day; ω denotes the number of other days on each side of the r-th day within the time neighborhood range centered on the r-th day; the two data sequences are those of the i-th sample in the l-th dimension at the t-th hour on the r-th day and on the (r+u)-th day, respectively; the correlation coefficient is taken between these two data sequences, that is, for the t-th hour between the r-th and (r+u)-th days; and min denotes the minimum function.
In the formula for the degree of data abnormality, the minimum function selects the smallest correlation coefficient among the hourly data sequences between the r-th and (r+u)-th days for the i-th sample in the l-th dimension, and this minimum is normalized. The smaller the minimum correlation coefficient, the larger the difference between the sample's data sequences in the corresponding hour on the two days, the larger the degree of data abnormality, and the larger the difference in physiological activity. The smaller the correlation coefficients of the sample's hourly data sequences between a given day and the other days in the time neighborhood range, the more likely that day is abnormal and the greater its degree of data abnormality.
It should be noted that, in one embodiment of the present invention, the time neighborhood range is a window of size 7 centered on each day, that is, ω takes the empirical value 3; in other embodiments of the present invention, the size of the time neighborhood range may be set according to the specific situation, which is not limited or described further here.
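The exact form of Formula 1 is not reproduced in this text, so the following is only a sketch of one way the described quantities could be combined. Assumptions: the minimum hourly Pearson coefficient between two days is normalized from [-1, 1] to an abnormality score in [0, 1] by (1 - c) / 2, the scores are averaged over the neighbouring days, and neighbours that fall outside the recorded period are skipped. The function name and array layout are likewise hypothetical.

```python
import numpy as np


def data_abnormality(hourly, r, omega=3):
    """Abnormality of day r for one sample in one dimension (assumed form).

    hourly: array of shape (n_days, n_hours, points_per_hour).
    """
    hourly = np.asarray(hourly, dtype=float)
    n_days, n_hours, _ = hourly.shape
    scores = []
    for u in range(-omega, omega + 1):
        other = r + u
        if u == 0 or not 0 <= other < n_days:
            continue  # skip the day itself and days outside the record
        # Smallest hourly correlation between day r and the neighbouring day.
        min_corr = min(
            np.corrcoef(hourly[r, t], hourly[other, t])[0, 1] for t in range(n_hours)
        )
        scores.append((1.0 - min_corr) / 2.0)  # assumed normalization to [0, 1]
    return float(np.mean(scores)) if scores else 0.0


# Five days with the same hourly ramp, except day 2, whose trend is inverted.
base = np.tile(np.linspace(0.0, 1.0, 60), (24, 1))
days = np.stack([base, base, -base, base, base])
abnormal_day = data_abnormality(days, r=2)
normal_day = data_abnormality(days, r=0)
```

In this toy data the inverted day scores 1.0, while day 0 scores 1/3 because one of its three available neighbours is the inverted day.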
Step S3: obtaining a distance metric between samples according to the data structure similarity between the samples in each dimension, the degree of data abnormality of the corresponding samples within each day, and the relative distance between the data sequences; and obtaining the local range density of each sample according to the distance metric between that sample and each other sample in the sample neighborhood range.
A traditional distance measure between samples considers only the difference between the data of two samples at corresponding moments and ignores the fact that sample data change differently under different activities. To understand the internal structure and relationships of the data more comprehensively, the data structure similarity, the degree of data abnormality within each day, and the relative distance are analyzed together. The data structure similarity reflects the data distribution or pattern of the samples in a given dimension: the higher the similarity, the smaller the distance metric between the samples. The degree of data abnormality within each day reflects the local characteristics of the data: the greater the degree of data abnormality, the less accurate the analysis of the distance metric between samples. The relative distance provides global information about the relationship between the samples: the greater the relative distance, the greater the distance metric between the samples. Therefore, the distance metric between samples is obtained from the data structure similarity between the samples in each dimension, the degree of data abnormality of the corresponding samples within each day, and the relative distance between the data sequences.
Preferably, in one embodiment of the present invention, the method for obtaining the distance metric includes:
The distance metric is obtained according to its formula (Formula 2), in which: the result is the distance metric between the i-th sample and the j-th sample; m denotes the number of dimensions of the physiological parameter data; the data structure similarity is that between the i-th and j-th samples in the l-th dimension; n denotes the number of days over which the physiological parameter data are collected; the two degrees of data abnormality are those of the i-th sample and of the j-th sample in the l-th dimension within the v-th day; the relative distance is that between the data sequences of the i-th and j-th samples in the l-th dimension on the v-th day; and exp() denotes the exponential function with the natural constant as its base.
In the formula for the distance metric, the exponential function with the natural constant as its base applies a negative-correlation mapping to the data structure similarity: the greater the data structure similarity in each dimension, the smaller the difference between the data and the smaller the distance metric between the samples. The formula also uses the mean of the degrees of data abnormality of the i-th and j-th samples in the l-th dimension within the v-th day: the larger this mean, the more abnormal the data on that day and the less accurate the data analysis, so the relative distance must be adjusted to a greater extent to avoid the influence of irrelevant factors. The greater the relative distance, the greater the distance metric between the samples.
In one embodiment of the present invention, the method for obtaining the relative distance includes: in each dimension, concatenating the data sequences of each sample at all hours of each day to form the overall data sequence of that sample for the day, and calculating the Euclidean distance between the overall data sequences of two samples within each day as the relative distance; the Euclidean distance is a technical means well known to those skilled in the art and is not described further here.
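Formula 2 itself is not reproduced in this text, so the sketch below shows only one plausible combination of the three ingredients, chosen to match the stated monotonic behaviour. Assumptions: per dimension, exp(-similarity) multiplies the day-averaged product of the mean abnormality of the two samples and their relative distance, and the result is averaged over the m dimensions. The function names and input layouts are hypothetical.

```python
import numpy as np


def relative_distance(seq_a, seq_b):
    """Euclidean distance between two overall (whole-day) data sequences."""
    return float(np.linalg.norm(np.asarray(seq_a, float) - np.asarray(seq_b, float)))


def distance_metric(similarity, abnormal_i, abnormal_j, rel_dist):
    """Assumed combination of similarity, abnormality and relative distance.

    similarity: shape (m,), data structure similarity per dimension.
    abnormal_i, abnormal_j, rel_dist: shape (m, n), per dimension and day.
    """
    similarity = np.asarray(similarity, dtype=float)
    mean_abnormal = (np.asarray(abnormal_i, float) + np.asarray(abnormal_j, float)) / 2.0
    weighted = mean_abnormal * np.asarray(rel_dist, float)  # adjusted relative distance
    per_dimension = np.exp(-similarity) * weighted.mean(axis=1)
    return float(per_dimension.mean())


# Two dimensions, three days (toy values).
abnormal = np.full((2, 3), 0.5)
rel_dist = np.full((2, 3), 2.0)
d_example = distance_metric([1.0, 0.0], abnormal, abnormal, rel_dist)
d_more_similar = distance_metric([1.0, 1.0], abnormal, abnormal, rel_dist)
```

Under these assumptions, a higher similarity lowers the metric and a larger relative distance raises it, as the description requires; how the abnormality term actually scales the relative distance in the patent is not recoverable from this text.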
Because the number of samples is large, directly processing the relationships between all samples would be very time-consuming and computationally intensive. Using the distance metric between samples within each sample's neighborhood range to select elderly people with similar life patterns, that is, reference samples that are similar but not identical in certain key characteristics, greatly reduces the amount of data to be processed without losing much information and improves computational efficiency and accuracy. Calculating the distance metric between each sample and each other sample in its sample neighborhood range gives an estimate of the density around that sample: the larger the distance metrics, the less clustered the samples and the lower the local range density of the sample. The local range density of each sample is therefore obtained from the distance metric between that sample and each other sample in the sample neighborhood range and the preset distance threshold.
Preferably, in one embodiment of the present invention, the method for obtaining the local area density includes:
A negative-correlation mapping is applied to the distance metric between each sample and each other sample in the sample neighborhood range, and all mapping results are accumulated to obtain a first accumulated value; the ratio of the first accumulated value to the preset distance threshold is calculated to obtain the local range density of each sample. In one embodiment of the invention, the formula for the local range density (Formula 4) is expressed as:

$$\rho_{i} = \frac{1}{L}\sum_{k=1}^{K_{i}} \exp\left(-D_{i,k}\right)$$

where $\rho_{i}$ denotes the local range density of the i-th sample; $D_{i,k}$ denotes the distance metric between the i-th sample and the k-th other sample in its sample neighborhood range; L denotes the preset distance threshold; $K_{i}$ denotes the number of other samples in the sample neighborhood range of the i-th sample; and $\exp()$ denotes the exponential function with the natural constant as its base.
In the formula for the local range density, the exponential function with the natural constant as its base applies a negative-correlation mapping to the distance metric: the larger the distance metric between samples, the more dispersed the samples are. The ratio of the first accumulated value to the preset distance threshold reflects the distribution and similarity of the samples in space: the larger the ratio, the more tightly the samples are distributed and the higher the sample density.
It should be noted that, in one embodiment of the present invention, the sample neighborhood range of a sample consists of the other samples, centered on that sample, whose distance metric to it is smaller than the preset distance threshold, and the preset distance threshold is 20; in other embodiments of the present invention, the size of the sample neighborhood range may be set according to the specific situation, which is not limited or described further here. In other embodiments of the present invention, positive and negative correlations may also be constructed by other basic mathematical operations; the specific means are well known to those skilled in the art and are not described further here.
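The local range density computation can be sketched as follows, using exp(-d) as the negative-correlation mapping and the threshold of 20 from the embodiment. The function name and the dense pairwise distance matrix are assumptions made for illustration.

```python
import numpy as np


def local_range_density(distance_matrix, threshold=20.0):
    """Local range density of every sample from a pairwise distance-metric matrix."""
    D = np.asarray(distance_matrix, dtype=float)
    n_samples = D.shape[0]
    density = np.zeros(n_samples)
    for i in range(n_samples):
        neighbours = D[i] < threshold  # samples inside the neighborhood range
        neighbours[i] = False          # exclude the sample itself
        # First accumulated value divided by the preset distance threshold.
        density[i] = np.exp(-D[i, neighbours]).sum() / threshold
    return density


# Samples 0 and 1 are close to each other; sample 2 is beyond the threshold.
D = np.array([[0.0, 1.0, 30.0],
              [1.0, 0.0, 30.0],
              [30.0, 30.0, 0.0]])
density = local_range_density(D, threshold=20.0)
```

A sample with no neighbours inside the threshold, such as sample 2 here, gets a density of 0.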
Step S4: in the process of performing CURE clustering on all samples, obtain initial sample clusters according to the distance metric between samples; in each initial sample cluster, obtain the representativeness of the center sample of each sample neighborhood range according to the distance metric and the density difference characteristics between the samples in that neighborhood range; take any preset number of samples in each initial sample cluster as a group of representative points, and obtain the representative degree of each group according to the representativeness of its representative points and the distance metric between them; screen out the optimal representative point group according to the representative degree of each group of representative points in each initial sample cluster, and cluster the initial sample clusters to obtain a clustering result.
In the clustering process, samples are assigned to different clusters according to the similarity of their features, so that samples within each cluster are as similar as possible and samples in different clusters are as different as possible. To improve the quality and effect of clustering, initial sample clusters are obtained according to the distance metric between samples in the process of performing CURE clustering on all samples.
It should be noted that, the specific CURE clustering is a technical means well known to those skilled in the art, and will not be described herein.
To better identify and describe the structure and characteristics of each cluster, the representativeness of each sample within its local range is evaluated by jointly considering the density difference characteristics and the distance metric. The distance metric is an important indicator of the similarity or difference between samples, and the density difference describes how the samples are distributed in space. The larger the distance metric and the larger the density difference, the more likely the sample is an outlier or edge point, and the more representative it is. Accordingly, in each initial sample cluster, the representativeness of the center sample of each sample neighborhood range is obtained according to the distance metric and the density difference characteristics between the samples in that neighborhood range.
Preferably, in one embodiment of the present invention, the method for obtaining the representativeness includes:
In each initial sample cluster, for each sample within each sample neighborhood range, obtain the other sample in that neighborhood range with the largest distance metric from it, and take that other sample as the comparison sample; calculate the density difference between each sample in the sample neighborhood range and its comparison sample, and average all density difference results to obtain the representativeness of the center sample of the sample neighborhood range. In one embodiment of the invention, the formula for the representativeness is expressed as:

$$Z_p = \frac{1}{N_p} \sum_{\tau=1}^{N_p} \left| \rho_{\tau} - \rho_{\tau}^{\max} \right| \quad \text{(formula five)}$$

wherein $Z_p$ represents the representativeness of the p-th sample in each initial sample cluster; $N_p$ represents the number of samples within the neighborhood range of the p-th sample in each initial sample cluster; $\rho_{\tau}$ represents the local range density of the τ-th sample within the neighborhood range of the p-th sample; $\rho_{\tau}^{\max}$ represents the local range density of the comparison sample, that is, the sample within the neighborhood range of the p-th sample that has the largest distance metric from the τ-th sample.
In the representativeness formula, $\left| \rho_{\tau} - \rho_{\tau}^{\max} \right|$ represents the density difference between the τ-th sample and its comparison sample within the neighborhood range of the p-th sample in each initial sample cluster. The larger the difference, the more the sample density varies in different directions within the neighborhood range, the more likely the p-th sample is an edge point or outlier of the initial sample cluster, and the greater its representativeness.
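As a rough illustration of the representativeness computation, the following Python sketch walks through the neighborhood of each sample in one initial sample cluster. Several points are assumptions made for the sketch and are not stated in the patent text: the density difference is taken as an absolute difference, the center sample is excluded from its own neighborhood, a sample with fewer than two neighbors is given a representativeness of 0, and the names `representativeness`, `dist`, `density`, and `cluster` are hypothetical.

```python
import numpy as np

def representativeness(dist, density, cluster, threshold=20.0):
    """Representativeness of each sample in one initial sample cluster.

    dist: pairwise distance-metric matrix over all samples.
    density: local range density of every sample.
    cluster: indices of the samples that belong to the initial cluster.
    Returns a dict {sample index: representativeness}.
    """
    dist = np.asarray(dist, dtype=float)
    density = np.asarray(density, dtype=float)
    cluster = list(cluster)
    result = {}
    for p in cluster:
        # Neighborhood of p: other cluster members closer than the threshold.
        hood = [q for q in cluster if q != p and dist[p, q] < threshold]
        if len(hood) < 2:
            result[p] = 0.0  # assumption: no comparison sample is available
            continue
        diffs = []
        for t in hood:
            # Comparison sample: the neighbor of p that is farthest from t.
            others = [q for q in hood if q != t]
            comp = max(others, key=lambda q: dist[t, q])
            diffs.append(abs(density[t] - density[comp]))
        # Average of all density differences in the neighborhood.
        result[p] = float(np.mean(diffs))
    return result

# Hypothetical toy data: three samples on a line at positions 0, 1 and 3.
positions = np.array([0.0, 1.0, 3.0])
toy_dist = np.abs(positions[:, None] - positions[None, :])
toy_rep = representativeness(toy_dist, [0.5, 0.2, 0.1], [0, 1, 2])
```

With these toy densities, sample 1 obtains the largest value because its two neighbors differ most in local range density.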
To simplify calculation and improve clustering efficiency, samples that can represent the characteristics of an initial sample cluster are selected for analysis: any preset number of samples in each initial sample cluster is taken as a group of representative points. Comparing the distances between representative points together with their representativeness gives a better picture of the structure and distribution of the cluster and helps identify the most representative samples. The larger the distance metric between representative points, the more dispersed they are, the more clearly they capture the features of the initial sample cluster, and the higher the representative degree. The representative degree of each group is therefore obtained from the representativeness of its representative points and the distance metric between them.
Preferably, in one embodiment of the present invention, the method for obtaining the representative degree includes:
For each group of representative points in each initial sample cluster, calculate the mean of the representativeness of every two representative points as a first representativeness mean; calculate the product of the first representativeness mean and the distance metric between the corresponding two representative points to obtain a first product value; and calculate the mean of the first product values over all pairs of representative points to obtain the representative degree of the group. In one embodiment of the invention, the formula for the representative degree is expressed as:

$$R = \frac{1}{n(n-1)} \sum_{a=1}^{n} \sum_{\substack{b=1 \\ b \neq a}}^{n} \frac{Z_a + Z_b}{2} \, D_{a,b} \quad \text{(formula six)}$$

wherein $R$ represents the representative degree of each group of representative points in each initial sample cluster; $Z_a$ represents the representativeness of the a-th representative point in the group; $Z_b$ represents the representativeness of the b-th representative point in the group; $D_{a,b}$ represents the distance metric between the a-th representative point and the b-th representative point; $n$ represents the preset number of representative points in each group in each initial sample cluster.
In the representative degree formula, $\frac{Z_a + Z_b}{2}$ is the first representativeness mean of the a-th and b-th representative points of each group in each initial sample cluster. The larger the first representativeness mean and the larger the distance metric between the representative points, the better the points capture the edges or anomalies of the initial sample cluster; and the more dispersed the representative points, the better they describe the shape of the initial sample cluster.
It should be noted that, in one embodiment of the present invention, the preset number of representative points in each group may be set by the implementer according to the specific situation, which is not described herein.
The representative degree measures how suitable a group of samples is as the representative of a sample cluster. The higher the representative degree, the more accurately the group reflects the characteristics of its cluster, so that the true cluster structure of the data can be found more accurately and quickly, the clustering result is easier to interpret, and the clustering precision improves. The optimal representative point group is therefore screened out according to the representative degree of each group of representative points in each initial sample cluster, and the initial sample clusters are clustered to obtain the clustering result.
Preferably, in one embodiment of the present invention, the method for acquiring the optimal representative point group includes:
Among all groups of representative points in each initial sample cluster, select the group with the largest representative degree as the optimal representative point group.
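The representative degree and the choice of the optimal representative point group can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the default group size `n=3`, and the exhaustive enumeration of all candidate groups with `itertools.combinations` are assumptions. The mean is taken over unordered pairs of representative points, which, for a symmetric distance metric, equals the mean over ordered pairs.

```python
from itertools import combinations

def representative_degree(group, rep, dist):
    """Mean over all point pairs of (mean representativeness) x (distance metric)."""
    products = [
        (rep[a] + rep[b]) / 2.0 * dist[a][b]  # first product value of one pair
        for a, b in combinations(group, 2)
    ]
    return sum(products) / len(products)

def optimal_group(cluster, rep, dist, n=3):
    """Enumerate every n-sample group of the cluster (2 <= n <= len(cluster))
    and return the group with the largest representative degree."""
    return max(
        combinations(cluster, n),
        key=lambda g: representative_degree(g, rep, dist),
    )

# Hypothetical toy data: four samples on a line with equal representativeness.
toy_pos = [0.0, 1.0, 2.0, 10.0]
toy_dist = [[abs(x - y) for y in toy_pos] for x in toy_pos]
toy_rep = [1.0, 1.0, 1.0, 1.0]
best_pair = optimal_group([0, 1, 2, 3], toy_rep, toy_dist, n=2)
```

Exhaustive enumeration grows combinatorially with the cluster size, so a practical system would probably restrict or sample the candidate groups; the patent text does not say how candidates are generated.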
Preferably, in one embodiment of the present invention, the method for obtaining the clustering result includes:
Calculate the mean of the distance metrics between all representative points of the optimal representative point groups of every two initial sample clusters as the average distance between those two initial sample clusters; according to the average distance, cluster all initial sample clusters with CURE to obtain new initial sample clusters, and repeat until the preset number of clusters is reached, thereby obtaining the clustering result.
It should be noted that, in one embodiment of the present invention, the preset number of clusters is 5; in other embodiments of the present invention, the preset number of clusters may be set by the implementer according to the specific situation, which is not limited herein.
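The merging step can be pictured as a simple agglomerative loop. The sketch below is an assumption-laden simplification: `select_reps` is a hypothetical callback that returns the optimal representative point group of a cluster, the two clusters with the smallest average distance are merged at each step, and the shrinking of representative points toward the cluster centroid used in standard CURE is omitted.

```python
from itertools import product

def average_group_distance(reps_a, reps_b, dist):
    """Mean distance metric over all cross pairs of two representative point groups."""
    pairs = list(product(reps_a, reps_b))
    return sum(dist[a][b] for a, b in pairs) / len(pairs)

def merge_clusters(clusters, select_reps, dist, target_k=5):
    """Merge initial sample clusters until target_k clusters remain."""
    clusters = [list(c) for c in clusters]
    reps = [select_reps(c) for c in clusters]
    while len(clusters) > target_k:
        # Pick the pair of clusters with the smallest average distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: average_group_distance(reps[ij[0]], reps[ij[1]], dist),
        )
        merged = clusters[i] + clusters[j]
        del clusters[j], reps[j]  # j > i, so index i stays valid
        clusters[i] = merged
        reps[i] = select_reps(merged)  # re-select representatives of the new cluster
    return clusters

# Hypothetical toy data: five singleton clusters on a line; every member is
# used as a representative point to keep the example short.
toy_pos = [0.0, 1.0, 10.0, 11.0, 50.0]
toy_dist = [[abs(x - y) for y in toy_pos] for x in toy_pos]
toy_result = merge_clusters([[0], [1], [2], [3], [4]], lambda c: list(c), toy_dist, target_k=3)
```

In the toy run the two close pairs are merged first, leaving the isolated sample as its own cluster.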
Step S5: provide services to the elderly according to the clustering result.
According to the clustering result, the elderly are divided into different groups whose members share similar characteristics and needs. Resources can then be allocated more reasonably and customized services provided for each group, including customized health management plans, recommendations of specific social activities, and personalized entertainment suggestions, so as to meet the distinct needs of different groups of elderly people.
In summary, within one dimension, the invention obtains the data structure similarity between samples from the correlation coefficients of their data sequences within each day; obtains the degree of data abnormality of each sample on each day from the correlation coefficients between the data sequences of that day and of the other days in the time neighborhood range; combines these with the relative distance between the data sequences of the corresponding samples within each day to obtain the distance metric between samples; and then obtains the local range density of each sample. In the process of performing CURE clustering on all samples, initial sample clusters are obtained; any preset number of samples in each initial sample cluster is taken as a group of representative points; the representative degree of each group is obtained from the representativeness of its representative points and the distance metric between them; the optimal representative point group is screened out; and the initial sample clusters are clustered to obtain the clustering result, according to which the elderly are served. By selecting suitable representative points within the sample clusters, the invention improves the clustering effect and provides personalized services for the elderly.
It should be noted that: the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

Claims (6)

1. A method for intelligent analysis of data for a pension information service system, the method comprising:
taking each elderly person as one sample, acquiring multidimensional physiological parameter data of each sample at each moment, and acquiring a multidimensional data sequence of each sample at each hour in each day;
Obtaining the data structure similarity among the samples according to the correlation coefficient of the data sequence among each sample in each day in one dimension; obtaining the data abnormality degree of each sample in each day according to the correlation coefficient of the data sequence of each sample in each day and other days in the time neighborhood range;
Obtaining a distance measurement between each sample according to the data structure similarity between each sample in each dimension, the data abnormality degree of the corresponding sample in each day and the relative distance between data sequences; obtaining the local range density of each sample according to the distance measurement between each sample and each other sample in the sample neighborhood range;
in the process of performing CURE clustering on all samples, obtaining initial sample clusters according to the distance metric between samples; in each initial sample cluster, obtaining the representativeness of the center sample of each sample neighborhood range according to the distance metric and the density difference characteristics between the samples in that neighborhood range; taking any preset number of samples in each initial sample cluster as a group of representative points, and obtaining the representative degree of each group according to the representativeness of its representative points and the distance metric between them; screening out the optimal representative point group according to the representative degree of each group of representative points in each initial sample cluster, and clustering the initial sample clusters to obtain a clustering result;
serving the elderly according to the clustering result;
The method for acquiring the data structure similarity comprises the following steps:
Acquiring the data sequences of each sample at all hours of each day in each dimension to form an overall data sequence of each sample for each day; calculating the correlation coefficient between the overall data sequences of every two samples within each day, and averaging the correlation coefficients over all days to obtain the data structure similarity between the two samples;
The method for acquiring the data abnormality degree comprises the following steps:
obtaining the data abnormality degree according to an obtaining formula of the data abnormality degree, wherein the obtaining formula of the data abnormality degree is as follows:
wherein the quantity being obtained represents the degree of data abnormality of the i-th sample in the l-th dimension on the r-th day; ω represents the number of other days in the time neighborhood range centered on the r-th day; the two sequence terms represent the data sequence of the i-th sample in the l-th dimension at the t-th hour on the r-th day and the data sequence of the same sample in the same dimension at the t-th hour on day r+u, respectively; the correlation term represents the correlation coefficient between the data sequences at the t-th hour on the r-th day and on day r+u for the i-th sample in the l-th dimension; min() represents the minimum function;
the distance measurement acquisition method comprises the following steps:
Obtaining a distance measure according to an obtaining formula of the distance measure, wherein the obtaining formula of the distance measure is as follows:
wherein $D_{i,j}$ represents the distance metric between the i-th sample and the j-th sample; M represents the number of dimensions of the physiological parameter data; N represents the number of days over which the physiological parameter data are acquired; the remaining terms represent, in the l-th dimension, the data structure similarity between the i-th sample and the j-th sample, the degree of data abnormality of the i-th sample on the v-th day, the degree of data abnormality of the j-th sample on the v-th day, and the relative distance between the data sequences of the i-th sample and the j-th sample on the v-th day; exp() represents the exponential function with the natural constant as its base;
The method for obtaining the representativeness comprises the following steps:
In each initial sample cluster, for each sample within each sample neighborhood range, obtaining the other sample in that neighborhood range with the largest distance metric from it, and taking that other sample as the comparison sample; calculating the density difference between each sample in the sample neighborhood range and its comparison sample, and averaging all density difference results to obtain the representativeness of the center sample of the sample neighborhood range.
2. The intelligent analysis method for data of a pension information service system according to claim 1, wherein the obtaining method for the local range density includes:
Performing negative correlation mapping on the distance measurement between each sample and each other sample in a sample neighborhood range, and accumulating all negative correlation mapping results to obtain a first accumulated value; and calculating the ratio of the first accumulated value to the preset distance threshold value to obtain the local range density of each sample.
3. The intelligent analysis method for data of a pension information service system according to claim 1, wherein the method for obtaining the representative degree comprises:
for each group of representative points in each initial sample cluster, calculating the mean of the representativeness of every two representative points as a first representativeness mean; calculating the product of the first representativeness mean and the distance metric between the corresponding two representative points to obtain a first product value;
and calculating the mean of the first product values over all pairs of representative points to obtain the representative degree of each group of representative points.
4. The intelligent data analysis method for a pension information service system according to claim 1, wherein the method for acquiring the optimal representative point group includes:
and selecting a group of representative points with the maximum representative degree corresponding to all groups of representative points in each initial sample cluster as an optimal representative point group.
5. The intelligent analysis method for data of a pension information service system according to claim 1, wherein the method for obtaining the clustering result comprises:
calculating the average value of the distance measurement between all the representative points in the optimal representative point group between each initial sample cluster as the average distance between each initial sample cluster; and clustering all initial sample clusters by adopting CURE according to the average distance to obtain new initial sample clusters until the number of the preset cluster clusters is reached, and obtaining a clustering result.
6. The intelligent analysis method for data of a pension information service system according to claim 1, wherein the correlation coefficient is the Pearson correlation coefficient.
CN202410245134.3A 2024-03-05 2024-03-05 Intelligent data analysis method for pension information service system Active CN117851836B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410245134.3A CN117851836B (en) 2024-03-05 2024-03-05 Intelligent data analysis method for pension information service system

Publications (2)

Publication Number Publication Date
CN117851836A CN117851836A (en) 2024-04-09
CN117851836B true CN117851836B (en) 2024-05-28

Family

ID=90529468

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410245134.3A Active CN117851836B (en) 2024-03-05 2024-03-05 Intelligent data analysis method for pension information service system

Country Status (1)

Country Link
CN (1) CN117851836B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118296216B (en) * 2024-06-06 2024-08-02 厦门市华林测绘信息有限公司 Association matching method and system for family spectrum information and geographic information
CN118571487A (en) * 2024-07-26 2024-08-30 深圳市万德昌创新智能有限公司 Intelligent pension health data acquisition method and humanoid pension robot

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111035366A (en) * 2019-12-30 2020-04-21 苏州普康智慧养老产业科技有限公司 Home-based aged-care intelligent service system based on Internet of things
CN113112374A (en) * 2020-12-21 2021-07-13 中国计量大学 Empty nest user electricity utilization abnormity detection method based on machine learning algorithm
CN113421400A (en) * 2021-06-15 2021-09-21 苏州普康智慧养老产业科技有限公司 Endowment service system who possesses safe information acquisition
CN116186634A (en) * 2023-04-26 2023-05-30 青岛新航农高科产业发展有限公司 Intelligent management system for construction data of building engineering
CN116344023A (en) * 2023-03-17 2023-06-27 浙江普康智慧养老产业科技有限公司 Remote monitoring system based on wisdom endowment medical treatment
CN116597954A (en) * 2023-03-15 2023-08-15 浙江普康智慧养老产业科技有限公司 Intelligent home-based remote diagnosis and treatment appointment consultation system
CN116740053A (en) * 2023-08-08 2023-09-12 山东顺发重工有限公司 Management system of intelligent forging processing production line
WO2023206888A1 (en) * 2022-04-25 2023-11-02 广东玖智科技有限公司 Ppg signal cluster center acquisition method and apparatus, and ppg signal processing method and apparatus
CN117421618A (en) * 2023-11-24 2024-01-19 上海东方低碳科技产业股份有限公司 Building energy consumption monitoring method and system
CN117542536A (en) * 2024-01-10 2024-02-09 中国人民解放军海军青岛特勤疗养中心 Intelligent nursing method and system based on physical examination data
CN117609813A (en) * 2024-01-23 2024-02-27 山东第一医科大学附属省立医院(山东省立医院) Intelligent management method for intensive patient monitoring data

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220058174A1 (en) * 2020-08-24 2022-02-24 Microsoft Technology Licensing, Llc System and method for removing exception periods from time series data

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Flow monitoring system and abnormal log traffic mode detection based on artificial intelligence;Jinghua Cao 等;《Optical and Quantum Electronics》;20231213;第1-19页 *
基于特征组合分析的主泵异常检测方法;龚安;史海涛;;科学技术与工程;20190428(12);第 228-235页 *
融入密度和距离的K-means初始簇中心优选方法研究;冯勇;张学理;王嵘冰;徐红艳;;小型微型计算机系统;20180815(08);第175-178页 *

Also Published As

Publication number Publication date
CN117851836A (en) 2024-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant