KR101462748B1 - Method for clustering health-information - Google Patents

Info

Publication number
KR101462748B1
Authority
KR
South Korea
Prior art keywords
information
label
similarity
vector
patients
Prior art date
Application number
KR1020130002601A
Other languages
Korean (ko)
Other versions
KR20140090483A (en
Inventor
이영구
팜더안
홍지혜
Original Assignee
경희대학교 산학협력단
Application filed by 경희대학교 산학협력단 filed Critical 경희대학교 산학협력단
Priority to KR1020130002601A priority Critical patent/KR101462748B1/en
Publication of KR20140090483A publication Critical patent/KR20140090483A/en
Application granted granted Critical
Publication of KR101462748B1 publication Critical patent/KR101462748B1/en

Abstract

The present invention relates to a method for clustering health information, and more particularly to a health information clustering method that groups similar health information using a health information database composed of health information having a treatment label for each patient and health information without a treatment label, and that can thereby generate training data from an ordinary health information database in which treatment labels hardly exist.

Description

Methods for clustering health information

The present invention relates to a method for clustering health information, and more particularly to a health information clustering method that groups similar health information using a health information database composed of health information having a treatment label for each patient and health information without a treatment label, and that can thereby generate training data from an ordinary health information database in which treatment labels hardly exist.

In a ubiquitous environment, a user's health information is collected through various kinds of devices, and the collected information is stored in the database of a health care server through a communication network. The user obtains treatment information for that health information from a professional consultant such as a doctor or a nurse, either on request or at set intervals, and uses it to manage his or her health.

The user health information collected through the communication network is stored in a database of the health care server. When a plurality of health care servers are operated, the health information of a plurality of users is distributed randomly across those servers. The health information stored in the health information database of a health care server usually contains only emotional label information, numerical label information, and symptom label information indicating the user's state, and most user records contain no treatment label information. This is because treatment information is generally given directly to the user through offline or telephone consultation between the professional consultant and the user, and moreover, professional consultants generally do not disclose their own treatment information to others.

Therefore, when training data is to be generated in order to provide treatment label information according to a user's health status in a health-related social network, almost all of the collected user health information lacks a treatment label, and health information with a treatment label is scarce. It is therefore difficult to generate training data that can be used in supervised learning techniques.
Korean Patent Laid-Open No. 10-2014-0064471 is a prior patent of the present invention.

It is an object of the present invention to provide a method of clustering health information using a health information database in which treatment labels hardly exist.

It is another object of the present invention to provide a treatment advice method that clusters health information data using a health information database in which treatment labels hardly exist, and provides the treatment label information of the health information of the cluster most similar to the patient's health information.

Another object of the present invention is to provide a health information clustering method that is specialized for health information whose linear distribution is not guaranteed, and that clusters such health information by a simple process based on the affinity of the health information.

In order to accomplish the above objects, the method for clustering health information according to the present invention comprises the steps of: generating, from a health information database composed of health information having a treatment label for each patient and health information without a treatment label, a label similarity matrix representing the similarity between patients for each type of label information; generating a total similarity matrix between the patients from the sum of the label similarity matrices; generating preliminary cluster information of the plurality of patients by densifying patients having similar attributes in the total similarity matrix such that the cluster density between them is smaller than a threshold value; and clustering the plurality of patients into a set number of clusters based on the distribution characteristics of the preliminary cluster information.

Here, the similarity of each label information between patients is calculated as the Euclidean distance between the patients' values for that label information. For the treatment label information, the similarity is set to 1 when the Euclidean distance exceeds zero.

Meanwhile, the types of label information include at least one of emotional label information, numerical label information, symptom label information, and treatment label information, and each of these is expressed as identification values matched to item identifiers.

Preferably, values of each element constituting the label similarity matrix for the emotion label information, the numerical label information, the symptom label information, and the treatment label information are normalized to the same first reference value.

Preferably, the step of generating the preliminary cluster information comprises the steps of: (c1) generating an initial vector normalized to a second reference value; (c2) generating a next vector from the product of the total similarity matrix and the previous vector; (c3) determining whether the difference between the previous vector and the next vector is smaller than the threshold value; and (c4) generating the preliminary cluster information from the next vector when the difference is smaller than the threshold value. If the difference between the previous vector and the next vector is larger than the threshold value, the next vector is set as the previous vector, and steps (c2) to (c3) are repeated.

On the other hand, the preliminary cluster information is clustered into a set number of clusters according to the distribution density characteristics of the preliminary cluster information through the K-means clustering algorithm.

Meanwhile, the treatment advice method according to the present invention comprises the steps of: generating, from a health information database composed of health information having a treatment label for each patient and health information without a treatment label, a label similarity matrix representing the similarity between patients for each type of label information, including the treatment label information; generating a total similarity matrix between the patients from the sum of the label similarity matrices; generating preliminary cluster information by densifying the health information of patients having similar attributes in the total similarity matrix, and clustering the plurality of patients into a set number of clusters based on the distribution density characteristics of the preliminary cluster information; generating cluster information for each cluster; and determining the similarity between the health information of a new patient and the cluster information, and providing the treatment label information of the cluster information most similar to the new patient as the treatment label information for the new patient.

In order to achieve the above objects, the apparatus for clustering health information according to the present invention comprises: a label similarity matrix generation unit that generates, from a health information database composed of health information having a treatment label for each patient and health information without a treatment label, a label similarity matrix representing the similarity between patients for each type of label information; a total similarity matrix generation unit that generates a total similarity matrix between the patients from the sum of the label similarity matrices; a preliminary cluster information generation unit that generates preliminary cluster information of the plurality of patients by densifying patients having similar attributes in the total similarity matrix such that the cluster density between them is smaller than a threshold value; and a clustering unit that clusters the plurality of patients into a set number of clusters based on the distribution density characteristics of the preliminary cluster information.

Here, the label similarity matrix generation unit includes: an emotion matrix generation unit that calculates the similarity of the emotional label information between patients and generates an emotional label similarity matrix between the patients; a numerical matrix generation unit that calculates the similarity of the numerical label information between patients and generates a numerical label similarity matrix between the patients; a symptom matrix generation unit that calculates the similarity of the symptom label information between patients and generates a symptom label similarity matrix between the patients; and a treatment matrix generation unit that calculates the similarity of the treatment label information between patients and generates a treatment label similarity matrix between the patients.

Here, the label similarity matrix generation unit may further include a matrix normalization unit that normalizes the emotional label similarity matrix, the numerical label similarity matrix, the symptom label similarity matrix, and the treatment label similarity matrix to the same first reference value.

Here, the preliminary cluster information generation unit includes: an initial vector generation unit that generates an initial vector normalized to a second reference value; a next vector generation unit that sets the initial vector as the previous vector and multiplies the previous vector by the total similarity matrix to generate a next vector; a densification unit that determines whether the difference between the previous vector and the next vector is smaller than the threshold value and, if the difference is larger than the threshold value, sets the next vector as the previous vector and controls the next vector to be generated again; and a preliminary cluster generation unit that generates the preliminary cluster information of the plurality of patients from the next vector when, according to the determination of the densification unit, the difference between the previous vector and the next vector is smaller than the threshold value.

The health information clustering method according to the present invention has various effects as compared with the conventional clustering method as follows.

First, the health information clustering method according to the present invention generates a label similarity matrix representing the similarity between patients for each type of health label information, including the treatment label information, and generates a total similarity matrix from those label similarity matrices. By using the total similarity matrix, health information can be clustered even when the health information database contains almost no treatment labels.

Second, the health information clustering method according to the present invention clusters health information data using a health information database in which treatment labels hardly exist, so that the treatment label information of the health information in the cluster most similar to a patient's health information can be obtained.

Third, the health information clustering method according to the present invention is specialized for health information: using a simple process based on the affinity of health information, it can cluster health information whose linear distribution is not guaranteed and which has almost no treatment label information.

FIG. 1 is a functional block diagram for explaining a health information clustering apparatus according to the present invention.
FIG. 2 is a functional block diagram for explaining the label similarity matrix generation unit according to the present invention in more detail.
FIG. 3 is a functional block diagram for explaining the preliminary cluster information generation unit according to the present invention in more detail.
FIG. 4 is a functional block diagram for explaining a treatment advice system that provides treatment label information to a new patient using training data of health information generated by the health information clustering apparatus according to the present invention.
FIG. 5 is a flowchart illustrating a method of clustering health information according to the present invention.
FIG. 6 is a view for explaining an example of health label information stored in the health information database.
FIG. 7 is a flowchart for explaining the step of generating a label similarity matrix in the health information clustering method according to the present invention.
FIG. 8 is a flowchart for explaining the step of generating preliminary cluster information in the health information clustering method according to the present invention.
FIG. 9 is a flowchart for explaining a treatment advice method according to the present invention.
FIG. 10 is a diagram for explaining an example of a K-means clustering algorithm.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, a health information clustering method and apparatus according to the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 is a functional block diagram for explaining a health information clustering apparatus according to the present invention.

Referring to FIG. 1, the label similarity matrix generation unit 120 generates, from the health information of a plurality of patients stored in the health information database 110, a label similarity matrix representing the similarity among the patients for each type of health label information. The health information stored in the health information database 110 is a mixture of health information that has a treatment label for the patient and health information that does not. Here, the types of label information include emotional label information indicating the emotional state of the patient, numerical label information indicating the health state of the patient, symptom label information indicating the symptoms of the patient, and treatment label information prescribed according to the health state of the patient. Typically, some of the patients have every type of label information, while others lack the treatment label information or are stored without one or more of the emotional, numerical, or symptom label information.

Depending on the field to which the present invention is applied, health label information other than the emotional, numerical, symptom, and treatment label information may additionally be stored in the health information database 110, or other health label information may be stored in place of some of these. The label similarity matrix generation unit 120 generates a label similarity matrix for each type of health label information stored in the health information database 110.

The total similarity matrix generation unit 130 combines the label similarity matrices, each representing the similarity among the plurality of patients for one type of health label information, to generate a total similarity matrix between the patients. Depending on the field to which the present invention is applied, the label similarity matrices may be added with the same weight or combined with different weights, and both fall within the scope of the present invention.

The preliminary cluster information generation unit 140 repeatedly increases the cluster density among patients having similar attributes using the generated total similarity matrix. While the difference between the cluster density at the previous step and the cluster density at the current step is larger than the threshold value, the densification is repeated; when the difference becomes smaller than the threshold value, the preliminary cluster information of the plurality of patients is generated. The clustering unit 150 clusters the plurality of patients into a set number of clusters based on the distribution density characteristics of the generated preliminary cluster information.

FIG. 2 is a functional block diagram for explaining the label similarity matrix generator according to the present invention in more detail.

Referring to FIG. 2, the emotion matrix generation unit 111 calculates the similarity between the emotional label information of the plurality of patients stored in the health information database 110 and generates an emotional label similarity matrix whose elements are the calculated similarities. The numerical matrix generation unit 113 calculates the similarity between the numerical label information of the plurality of patients stored in the health information database 110 and generates a numerical label similarity matrix whose elements are the calculated similarities. The symptom matrix generation unit 115 calculates the similarity between the symptom label information of the plurality of patients stored in the health information database 110 and generates a symptom label similarity matrix whose elements are the calculated similarities. The treatment matrix generation unit 117 calculates the similarity between the treatment label information of the plurality of patients stored in the health information database 110 and generates a treatment label similarity matrix whose elements are the calculated similarities.

Preferably, the similarity between emotional label information, between numerical label information, between symptom label information, and between treatment label information is calculated using the Euclidean distance, which is computed from the differences between the identification values of the corresponding items constituting each patient's label information.

The matrix normalization unit 119 normalizes the element values of the emotional label similarity matrix, the numerical label similarity matrix, and the symptom label similarity matrix to the same reference value. For example, the element values of these three matrices are normalized to values between 0 and 1.

FIG. 3 is a functional block diagram for explaining the preliminary cluster information generation unit according to the present invention in more detail.

Referring to FIG. 3, the initial vector generation unit 141 generates a 1 × n initial vector (v0) when the total similarity matrix is n × n. The initial vector generation unit 141 may set the elements of the initial vector to arbitrary values from 1 to 9 and then normalize them to values between 0 and 1. Alternatively, the initial vector generation unit 141 may generate the initial vector by normalizing the sum of each row of the total similarity matrix.

The next vector generation unit 143 sets the initial vector as the previous vector v and multiplies the previous vector by the total similarity matrix to generate the next vector v'. The densification unit 145 includes a difference calculator 145-1 and a densification controller 145-3. The difference calculator 145-1 calculates the difference between the next vector v' and the previous vector v. The densification controller 145-3 determines whether the difference calculated by the difference calculator 145-1 is smaller than the threshold value; if the difference is larger than the threshold value, it controls the next vector generation unit 143 to set the next vector as the previous vector and generate a next vector again. If the difference is smaller than the threshold value, the preliminary cluster determination unit 147 determines, based on the result from the densification controller 145-3, the next vector as the preliminary cluster information of the plurality of patients.

FIG. 4 is a functional block diagram for explaining a treatment advice system that provides treatment label information to a new patient using training data of health information generated by the health information clustering apparatus according to the present invention.

Referring to FIG. 4, the treatment advice system includes a health information clustering apparatus 100, a treatment advice device 200, and a user terminal 400, which are connected to a wired/wireless network 300.

As described above with reference to FIGS. 1 to 3, the health information clustering apparatus 100 clusters similar patients using a health information database in which only part of the health information has treatment label information, and generates training data. The treatment advice device 200 receives the health information of a new patient, that is, emotional label information, numerical label information, and symptom label information, through the network 300 from the user terminal 400 carried by the new patient, determines the cluster whose health information is most similar to that of the new patient, and provides the treatment label information of that cluster to the user terminal 400 through the network 300.

Depending on the field to which the present invention is applied, instead of receiving the health information of a new patient from a separate user terminal 400, the treatment advice device 200 may further include a user interface unit (not shown) through which the new patient's health information can be input.

FIG. 5 is a flowchart illustrating a method of clustering health information according to the present invention.

Referring to FIG. 5, a label similarity matrix representing the similarity between patients for each type of health label information is generated from a health information database composed of health information having a treatment label for each patient and health information without a treatment label (S100). FIG. 6 is a view for explaining an example of health label information stored in the health information database. FIG. 6(a) shows an example of the emotional label information for each patient, FIG. 6(b) shows an example of the numerical label information for each patient, FIG. 6(c) shows an example of the symptom label information for each patient, and FIG. 6(d) shows an example of the treatment label information for each patient. Preferably, each health label information consists of a plurality of items, and the value of each item is matched with an identification value of the corresponding level.

Referring to FIG. 5 again, a label similarity matrix representing similarity for each label information is multiplied by a weight, and each label similarity matrix multiplied by the weight is added to generate a total similarity matrix between the patients (S200).
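The weighted combination in step S200 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `total_similarity_matrix`, the list-of-lists matrix representation, and the example weights are assumptions made for this sketch.

```python
def total_similarity_matrix(matrices, weights=None):
    """Weighted element-wise sum of equally sized label similarity matrices."""
    if weights is None:
        # Equal weighting, one of the two options the text allows.
        weights = [1.0] * len(matrices)
    n = len(matrices[0])
    total = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * matrix[i][j]
    return total


# Hypothetical 2-patient matrices (values are illustrative only).
emotion = [[0.0, 1.0], [1.0, 0.0]]
symptom = [[0.0, 0.5], [0.5, 0.0]]
combined = total_similarity_matrix([emotion, symptom], weights=[1.0, 2.0])
```

Passing no weights adds the matrices with equal weight; how the weights should be chosen for a given field of application is not specified in the source.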

Preliminary cluster information of the plurality of patients is generated by densifying patients having similar attributes in the total similarity matrix until the change in cluster density is smaller than a threshold value (S300), and the plurality of patients are clustered into a set number of clusters based on the distribution density characteristics of the preliminary cluster information (S400). Preferably, the preliminary cluster information is clustered into the set number of clusters according to its distribution density characteristics through the K-means clustering algorithm.

FIG. 10 is a diagram for explaining an example of a K-means clustering algorithm. As shown in FIG. 10(a), when a plurality of data exist and are to be clustered into two clusters, two of the data are selected arbitrarily. When data 1 and 2 are selected as shown in FIG. 10(b), each of the remaining data 3 and 4 forms a preliminary cluster with whichever of data 1 and data 2 is closer. As shown in FIG. 10(c), center points (c1, c2) are calculated as the averages of the data constituting each preliminary cluster, and clusters are newly formed by assigning each of data 1 to 4 to the nearer center point. As shown in FIG. 10(d), new center points (c3, c4) are calculated as the averages of the data constituting the newly formed clusters, and data 1 to 4 are again assigned to the nearer of the new center points c3 and c4. Clustering is complete when the newly formed clusters are the same as the previously formed clusters.
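The K-means procedure described for FIG. 10 can be sketched for one-dimensional preliminary cluster values as follows. This is an illustrative sketch only: the function name `k_means_1d`, the choice of the first k values as initial center points, and the sample values are assumptions, not details from the patent.

```python
def k_means_1d(values, k, max_iterations=100):
    """Cluster scalar values into k groups; returns (assignment, centers)."""
    # Assumption: the first k data points serve as the initial center points,
    # mirroring the selection of data 1 and 2 in the FIG. 10 walkthrough.
    centers = list(values[:k])
    assignment = None
    for _ in range(max_iterations):
        # Assign every value to its nearest center point.
        new_assignment = [
            min(range(k), key=lambda c: abs(v - centers[c])) for v in values
        ]
        # Stop when the newly formed clusters equal the previous ones.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Recompute each center point as the average of its members.
        for c in range(k):
            members = [v for v, a in zip(values, assignment) if a == c]
            if members:
                centers[c] = sum(members) / len(members)
    return assignment, centers


# Hypothetical preliminary cluster values for four patients.
assignment, centers = k_means_1d([0.10, 0.12, 0.80, 0.82], k=2)
```

In this example the first pass groups three values with the second center point; the second pass moves the value 0.12 to the first cluster, and the third pass confirms that the clusters no longer change.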

FIG. 7 is a flowchart for explaining a step of generating a label similarity matrix in the health information clustering method according to the present invention.

Referring to FIG. 7, health label information of the same type is extracted from a health information database holding various health label information of a plurality of patients, the Euclidean distance of that health label information between patients is calculated, and the similarity of the health label information between patients is calculated from the Euclidean distance (S110).

For example, if the health information of patients 1 to 5 is stored in the health information database, the Euclidean distance is calculated from the differences for the same items of the emotional label information between patient 1 and patient 2, patient 1 and patient 3, patient 1 and patient 4, patient 1 and patient 5, and likewise for the remaining pairs of patients. The Euclidean distance between patient (i) and patient (j) for the health label information (S k ) is calculated by the following equation (1).

[Equation 1]

Figure 112013002407517-pat00001

Here, a, b, ..., j are the identification values of the items constituting the health label information (S k ) of the patient (i) and the patient (j).
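Equation 1 survives in this text only as an image reference. Given the surrounding description, it is presumably the ordinary Euclidean distance over the items of one label type. A sketch under that assumption follows; the symbols $x^{(k)}_{i,m}$ (identification value of item $m$ of label type $S_k$ for patient $i$) and $M_k$ (number of items in $S_k$) are notation introduced here for illustration and are not taken from the patent.

```latex
d_{S_k}(i, j) = \sqrt{\sum_{m=1}^{M_k} \left( x^{(k)}_{i,m} - x^{(k)}_{j,m} \right)^{2}}
```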

A label similarity matrix representing the similarity for each health label information is generated from the similarity of the health label information between patients calculated from the Euclidean distance (S120). Here, the elements of the treatment label similarity matrix take the value 0 or 1. When the treatment information of two patients coincides, the Euclidean distance is 0 and the element value is likewise 0. When the treatment information of two patients does not coincide, the Euclidean distance exceeds 0 and the element value is set to 1.
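Steps S110 and S120 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the `binary` flag, and the sample identification values are hypothetical, and the sketch assumes every patient has a value for the label type in question (the source does not say how missing labels are handled).

```python
import math


def euclidean_distance(items_i, items_j):
    """Euclidean distance between two patients' identification values."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(items_i, items_j)))


def label_similarity_matrix(records, binary=False):
    """Pairwise matrix for one label type.

    With binary=True (used here for treatment labels), any nonzero
    distance is replaced by 1, as the text describes.
    """
    n = len(records)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            distance = euclidean_distance(records[i], records[j])
            if binary:
                matrix[i][j] = 1.0 if distance > 0 else 0.0
            else:
                matrix[i][j] = distance
    return matrix


# Hypothetical identification values for three patients (not from the patent).
emotion_records = [[1, 2], [1, 2], [4, 6]]
treatment_records = [[3], [3], [7]]

emotion_matrix = label_similarity_matrix(emotion_records)
treatment_matrix = label_similarity_matrix(treatment_records, binary=True)
```

Note that, following the text, an element value of 0 means the two patients' label information coincides, so the matrix entries behave as distances.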

For example, one example of the emotion label similarity matrix, the numeric label similarity matrix, the symptom label similarity matrix, and the treatment label similarity matrix of the patients 1 to 5 is shown in the following equations (2) to (5).

[Equation 2]

Figure 112013002407517-pat00002

[Equation 3]

Figure 112013002407517-pat00003

[Equation 4]

Figure 112013002407517-pat00004

[Equation 5]

Figure 112013002407517-pat00005

The label similarity matrix for each generated health label information is normalized to the same reference value (S130). Preferably, the reference values are 0 and 1, and the elements of the label similarity matrix for each health label information are normalized to have a value between 0 and 1.
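The normalization in step S130 can be sketched as follows. The source states only that the elements are normalized to values between 0 and 1; dividing by the largest element is one simple way to do that and is an assumption of this sketch, as is the function name `normalize_matrix`.

```python
def normalize_matrix(matrix):
    """Scale all elements into the range 0..1 by dividing by the largest one."""
    largest = max(max(row) for row in matrix)
    if largest == 0:
        # All elements are already 0; return a copy unchanged.
        return [row[:] for row in matrix]
    return [[value / largest for value in row] for row in matrix]


# Hypothetical numerical label similarity matrix for three patients.
numeric_matrix = [[0.0, 2.0, 4.0], [2.0, 0.0, 1.0], [4.0, 1.0, 0.0]]
normalized = normalize_matrix(numeric_matrix)
```

Normalizing every label type to the same range keeps one label type with large raw distances from dominating the total similarity matrix.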

FIG. 8 is a flowchart illustrating a method for generating reserve cluster information in the health information clustering method according to the present invention.

Referring to FIG. 8, when the total similarity matrix is n × n, an initial vector having a size of 1 × n is generated (S310), and the generated initial vector is normalized to a reference value (S320). Preferably, the initial vector is set to arbitrary values and then normalized to the reference value, for example, to values between 0 and 1.

The initial vector is set as the previous vector, the total similarity matrix is multiplied by the previous vector to generate the next vector (S330), and the difference between the next vector and the previous vector is calculated to determine whether it is smaller than the set threshold value (S340).

If the difference between the previous vector and the next vector is smaller than the threshold value, the preliminary cluster information of the plurality of patients is generated from the next vector (S350). If the difference is larger than the threshold value, the next vector is set as the new previous vector, a next vector is generated again from it, and this regeneration is repeated until the difference between the next vector and the previous vector is smaller than the set threshold value.

Through this repeated regeneration of the next vector, patients who are closely related, that is, similar to one another, become more densely grouped among the plurality of patients.
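The iteration in steps S310 to S350 can be sketched as follows. The source does not say whether the matrix or the vector is rescaled between iterations; without some rescaling, repeated multiplication generally grows or shrinks without settling, so this sketch assumes a row-normalized matrix and rescales each new vector to sum to 1. The function name, the default threshold, the iteration cap, and the example matrix are likewise assumptions, and the sketch simply treats the matrix entries as non-negative pairwise weights.

```python
def preliminary_cluster_vector(total, threshold=1e-3, max_iterations=1000):
    """Iterate v' = W v until the largest element-wise change is below threshold."""
    n = len(total)
    row_sums = [sum(row) for row in total]
    grand_total = sum(row_sums)
    if grand_total == 0:
        return [1.0 / n] * n
    # Assumption: divide each row by its sum so every row of W adds up to 1.
    weights = [
        [value / row_sums[i] if row_sums[i] else 0.0 for value in total[i]]
        for i in range(n)
    ]
    # Initial vector from normalized row sums (the second option in the text).
    previous = [s / grand_total for s in row_sums]
    for _ in range(max_iterations):
        following = [
            sum(w * p for w, p in zip(weights[i], previous)) for i in range(n)
        ]
        scale = sum(following)
        if scale == 0:
            return previous
        # Assumption: rescale so the vector keeps summing to 1.
        following = [value / scale for value in following]
        difference = max(abs(a - b) for a, b in zip(following, previous))
        if difference < threshold:
            return following
        previous = following
    return previous


# Hypothetical total matrix: patients 0 and 1 have identical rows.
example_total = [[1.0, 1.0, 0.1], [1.0, 1.0, 0.1], [0.1, 0.1, 1.0]]
vector = preliminary_cluster_vector(example_total)
```

Under these assumptions, patients with identical rows in the matrix receive identical values in the resulting vector, and the threshold controls how early the iteration stops; a very small threshold lets all values drift toward one another, so the choice of threshold affects how separable the values remain for the later K-means step.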

FIG. 9 is a flowchart for explaining a treatment advice method according to the present invention.

More specifically, referring to FIG. 9, a label similarity matrix representing the similarity between patients for each type of label information, including the treatment label information, is generated from a health information database composed of health information having a treatment label for each patient and health information without a treatment label (S500).

A total similarity matrix between the patients is generated from the sum of the label similarity matrices representing the similarity for each label information (S600). Preliminary cluster information is generated by densifying the health information of patients having similar attributes in the total similarity matrix, and the plurality of patients are clustered into a set number of clusters based on the distribution density characteristics of the preliminary cluster information (S700).

Cluster information for each cluster is generated as a central value of health information for each type of label information of the clustered patient (S800). The cluster information for each cluster is calculated by using the average value of each label information of the patient constituting each cluster as each label information of each cluster.

The degree of similarity between the health information and the cluster information of the new patient is determined, and the treatment label information of the cluster information having the highest degree of similarity with the new patient is provided as the treatment label information for the new patient (S900).
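Steps S800 and S900 can be sketched as follows. This is an illustration under assumptions: the function names `cluster_centroid` and `advise`, the dictionary layout, the treatment identifiers "T1" and "T2", and all sample values are hypothetical, and the sketch attaches one treatment label to each cluster directly rather than deriving it from the member patients.

```python
import math


def cluster_centroid(members):
    """Average each health label item over the patients of one cluster."""
    count = len(members)
    return [sum(column) / count for column in zip(*members)]


def advise(new_patient, clusters):
    """Return the treatment label of the cluster nearest to the new patient."""
    nearest = min(clusters, key=lambda c: math.dist(new_patient, c["health"]))
    return nearest["treatment"]


# Hypothetical clusters; each patient is [emotion, numeric, symptom] values.
clusters = [
    {"health": cluster_centroid([[1, 2, 0], [1, 3, 0]]), "treatment": "T1"},
    {"health": cluster_centroid([[4, 6, 1], [5, 6, 1]]), "treatment": "T2"},
]
recommended = advise([4, 5, 1], clusters)
```

The sketch measures similarity with the Euclidean distance, in line with the rest of the description, so the most similar cluster is the one at the smallest distance from the new patient's health information.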

The above-described embodiments of the present invention can be written as a program executable by a computer and implemented on a general-purpose digital computer that runs the program using a computer-readable recording medium.

The computer-readable recording medium may be a magnetic storage medium (e.g., ROM, floppy disk, hard disk, etc.), an optical reading medium (e.g. CD ROM, Lt; / RTI > transmission).

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is evident that many alternatives, modifications and variations will be apparent to those skilled in the art. Accordingly, the true scope of the present invention should be determined by the technical idea of the appended claims.

110: Health information database
120: Label similarity matrix generation unit
130: Total similarity matrix generation unit
140:
150: Clustering unit
111: Emotion matrix generation unit
113: Numeric matrix generation unit
115: Symptom matrix generation unit
117: Treatment matrix generation unit
119: Matrix normalization unit
141: Initial vector generation unit
143: Next vector generation unit
145: Densification unit
100: Health information clustering device
200: Treatment advice device
300: Network
400: User terminal

Claims (16)

  1. A health information clustering method comprising:
    (a) generating, in a label similarity matrix generation unit, a label similarity matrix indicating the similarity between patients for each label information, from a health information database composed of health information having a treatment label for each patient and health information without a treatment label;
    (b) generating, in a total similarity matrix generation unit, a total similarity matrix between the patients from the sum of the label similarity matrices representing the similarity for each label information;
    (c) generating preliminary cluster information of a plurality of patients by densifying patients having similar attributes in the total similarity matrix such that the cluster density between them is smaller than a threshold value; and
    (d) clustering, in a clustering unit, the plurality of patients into a predetermined number of clusters based on the distribution characteristics of the preliminary cluster information.
  2. The method according to claim 1, wherein the similarity of each label information between the patients is calculated as a Euclidean distance of each label information for each patient.
  3. The method according to claim 2, wherein the similarity of the treatment label information is set to 1 when the Euclidean distance exceeds 0.
  4. The method of claim 1, wherein the types of the label information include at least one of emotion label information, numeric label information, symptom label information, and treatment label information.
  5. The method of claim 4, wherein the value of each element constituting the label similarity matrix for the emotion label information, the numeric label information, the symptom label information, and the treatment label information is normalized to the same first reference value.
  6. The method according to claim 5, wherein the emotion label information, the numeric label information, the symptom label information, and the treatment label information are stored as identification values matching an identifier of each item constituting each label information.
  7. The method as claimed in any one of claims 1 to 6, wherein the generating of the preliminary cluster information by a preliminary cluster information generation unit comprises:
    (c1) generating an initial vector normalized to a second reference value;
    (c2) setting the initial vector as a previous vector and generating a next vector from the product of the total similarity matrix and the previous vector;
    (c3) determining whether the difference between the previous vector and the next vector is smaller than a threshold value; and
    (c4) generating the next vector as the preliminary cluster information of the plurality of patients when the difference between the previous vector and the next vector is smaller than the threshold value,
    wherein, if the difference between the previous vector and the next vector is greater than the threshold value, the next vector is set as the previous vector and steps (c2) to (c3) are repeated.
  8. The method according to claim 7, wherein the plurality of patients are clustered into the set number of clusters through a K-means clustering algorithm according to the distribution density characteristics of the preliminary cluster information.
  9. A treatment advice method comprising:
    (a) generating a label similarity matrix indicating the similarity between patients for each type of label information, including treatment label information, from a health information database composed of health information with a treatment label and health information without a treatment label for a plurality of patients;
    (b) generating a total similarity matrix between the patients from the sum of the label similarity matrices indicating the similarity for each label information;
    (c) generating preliminary cluster information of the plurality of patients by densifying the health information between patients having similar attributes in the total similarity matrix, and clustering the plurality of patients into a predetermined number of clusters based on the distribution density characteristics of the preliminary cluster information;
    (d) generating cluster information for each cluster as the central value of the health information for each type of label information of the clustered patients; and
    (e) determining the similarity between the health information of a new patient and the cluster information, and providing the treatment label information of the cluster information having the highest similarity to the new patient as the treatment label information for the new patient,
    wherein steps (a) to (d) are performed in a health information clustering device, and step (e) is performed in a treatment advice device.
  10. The method of claim 9, wherein generating the preliminary cluster information comprises:
    (c1) generating an initial vector normalized to a reference value;
    (c2) setting the initial vector as a previous vector and generating a next vector from the product of the total similarity matrix and the previous vector;
    (c3) determining whether the difference between the previous vector and the next vector is smaller than a threshold value; and
    (c4) generating the preliminary cluster information of the plurality of patients based on the next vector if the difference between the previous vector and the next vector is smaller than the threshold value,
    wherein, if the difference between the previous vector and the next vector is greater than the threshold value, the next vector is set as the previous vector and steps (c2) to (c3) are repeated.
  11. The method of claim 10, wherein the types of the label information include at least one of emotion label information, numeric label information, symptom label information, and treatment label information.
  12. A health information clustering device comprising:
    a label similarity matrix generation unit for generating a label similarity matrix representing the similarity of each label information between patients, from a health information database composed of health information having a treatment label for each patient and health information without a treatment label;
    a total similarity matrix generation unit for generating a total similarity matrix between the patients from the sum of the label similarity matrices indicating the similarity for each label information;
    a preliminary cluster information generation unit for generating preliminary cluster information of a plurality of patients by densifying patients having similar attributes in the total similarity matrix such that the cluster density between them is smaller than a threshold value; and
    a clustering unit for clustering the plurality of patients into a set number of clusters based on the distribution density characteristics of the preliminary cluster information.
  13. The apparatus as claimed in claim 12, wherein the label similarity matrix generation unit comprises:
    an emotion matrix generation unit for calculating the similarity of emotion label information between the patients and generating an emotion label similarity matrix between the patients;
    a numeric matrix generation unit for calculating the similarity of numeric label information between the patients and generating a numeric label similarity matrix between the patients;
    a symptom matrix generation unit for calculating the similarity of symptom label information between the patients and generating a symptom label similarity matrix between the patients; and
    a treatment matrix generation unit for calculating the similarity of treatment label information between the patients and generating a treatment label similarity matrix between the patients.
  14. The apparatus as claimed in claim 13, wherein the label similarity matrix generation unit further comprises a matrix normalization unit for normalizing the emotion label similarity matrix, the numeric label similarity matrix, the symptom label similarity matrix, and the treatment label similarity matrix to the same first reference value.
  15. The apparatus of claim 12, wherein the preliminary cluster information generation unit comprises:
    an initial vector generation unit for generating an initial vector normalized to a second reference value;
    a next vector generation unit for setting the initial vector as a previous vector and multiplying the previous vector by the total similarity matrix to generate a next vector;
    a densification unit for determining whether the difference between the previous vector and the next vector is smaller than a threshold value and, when the determination result shows that the difference is greater than the threshold value, setting the next vector as the previous vector and controlling the next vector to be generated again; and
    a preliminary cluster determination unit for generating the next vector as the preliminary cluster information of the plurality of patients when the determination result of the densification unit shows that the difference between the previous vector and the next vector is smaller than the threshold value.
  16. The apparatus of claim 15, wherein the clustering unit clusters the plurality of patients into the set number of clusters through a K-means clustering algorithm according to the distribution density characteristics of the preliminary cluster information.
KR1020130002601A 2013-01-09 2013-01-09 Method for clustering health-information KR101462748B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020130002601A KR101462748B1 (en) 2013-01-09 2013-01-09 Method for clustering health-information

Publications (2)

Publication Number Publication Date
KR20140090483A KR20140090483A (en) 2014-07-17
KR101462748B1 true KR101462748B1 (en) 2014-11-21

Family

ID=51738097

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020130002601A KR101462748B1 (en) 2013-01-09 2013-01-09 Method for clustering health-information

Country Status (1)

Country Link
KR (1) KR101462748B1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003331055A (en) 2002-05-14 2003-11-21 Hitachi Ltd Information system for supporting operation of clinical path
US20060015369A1 (en) 2004-07-15 2006-01-19 Bachus Sonja C Healthcare provider recommendation system
JP2011501845A (en) 2007-10-12 2011-01-13 ペイシェンツライクミー, インコーポレイテッド Personal management as well as a comparison of the pathology and outcomes based on the profile of the patient's community
JP2011039653A (en) 2009-08-07 2011-02-24 Ntt Data Corp Medical information generation device, medical information generation method and program

Also Published As

Publication number Publication date
KR20140090483A (en) 2014-07-17

Legal Events

Date Code Title Description
A201 Request for examination
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20181002

Year of fee payment: 5