CN107480426A - Self-iteration medical record file clustering analysis system - Google Patents
- Publication number
- CN107480426A
- Authority
- CN
- China
- Prior art keywords
- case history
- cluster analysis
- vector
- module
- cluster
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
Abstract
The invention discloses a self-iterating medical record file cluster analysis system comprising a medical record import module, a vector processing module, and an ISODATA cluster analysis module. The medical record import module extracts the variables to be analyzed from the medical record files and standardizes them. The vector processing module converts the type and scale of the different kinds of variables in the medical record files and, once vector conversion is complete, stores each individual's space vector coordinates in a space vector library. The ISODATA cluster analysis module retrieves the space vectors to be analyzed from the space vector library in the vector processing module and performs ISODATA cluster analysis on them. Compared with hierarchical clustering, this approach reduces the amount of computation and yields more reasonable classification results. Applying cluster analysis to electronic medical records with complex, varied content produces a classification of large batches of medical record files, which facilitates subsequent processing or analysis.
Description
Technical field
The present invention relates to the field of medical technology, and in particular to a self-iterating medical record file cluster analysis system.
Background
Medical record files differ from one another yet also share common features. In clinical research, records often need to be analyzed according to certain of their features and classified accordingly, so that they can be processed or analyzed further. However, existing cluster analysis operates on specific numeric variables, so medical records, whose variables are numerous and of complex types, are difficult to compute on directly. The previously developed medical record file cluster analysis system based on hierarchical clustering also suffers from a heavy computational load and insufficiently accurate classification. The existing algorithm therefore needs to be improved to suit medical record files, which are very large in number and complex in content.
Compared with hierarchical clustering, ISODATA requires less computation and yields the clustering result directly, without further screening by the user. Compared with the K-means clustering algorithm, ISODATA can adjust the number of classes and so obtain a more reasonable classification. The medical record file clustering algorithm is therefore developed on the basis of ISODATA and adapted to the characteristics of medical record files.
Summary of the invention
In view of the problems described in the background, the object of the invention is to provide a self-iterating medical record file cluster analysis system that applies cluster analysis to electronic medical records with complex, varied content and obtains a classification of large batches of medical record files, thereby facilitating subsequent processing or analysis.
The technical solution of the invention is as follows: a self-iterating medical record file cluster analysis system comprising a medical record import module, a vector processing module, and an ISODATA cluster analysis module, wherein:
the medical record import module uses a filter to perform preliminary filtering of the medical record files imported by the user, extracts the variables to be analyzed from the files according to the initialized mapping relations, and standardizes each variable in the files in preparation for the vector abstraction of the next step;
the vector processing module converts the type and scale of the different kinds of variables in the medical record files, including continuous variable conversion, logical variable conversion, and text variable conversion, and, once vector conversion is complete, stores each individual's space vector coordinates in a space vector library for the ISODATA cluster analysis of the next step;
the ISODATA cluster analysis module retrieves the space vectors to be analyzed from the space vector library in the vector processing module and performs ISODATA cluster analysis on them.
In the above technical solution, text variable conversion is divided into special conversion and ordinary conversion.
In the above technical solution, the ISODATA cluster analysis module is divided into seven secondary modules: an initialization module, basic module I, basic module II, a judgment and iteration module, a splitting module, a merging module, and a termination module.
In the above technical solution, basic module I comprises central subset extraction, clustering by the minimum distance method, and cluster screening.
In the above technical solution, the judgment and iteration module comprises cluster centre correction, average distance calculation, and calculation of the overall mean distance of all classes.
The self-iterating medical record file cluster analysis system of the invention comprises a medical record import module, a vector processing module, and an ISODATA cluster analysis module. The specific attributes of each medical record file are first abstracted into a space vector, and these space vectors are then fed into ISODATA cluster analysis. Using the parameter values selected by the user, the ISODATA cluster analysis module iterates over the space vectors many times and finally produces the classification result. Compared with hierarchical clustering, this reduces the amount of computation and yields more reasonable classification results. Electronic medical records with complex, varied content can thus be clustered to obtain a classification of large batches of medical record files, which facilitates subsequent processing or analysis.
Brief description of the drawings
Fig. 1 is a diagram of the connections between the modules of the self-iterating medical record file cluster analysis system of the invention;
Fig. 2 is a diagram of the connections between the seven secondary modules of the ISODATA cluster analysis module in Fig. 1;
Fig. 3 is the spatial distribution map of the cluster analysis in the specific example of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the invention.
The key points of the self-iterating medical record file cluster analysis system of the invention are space vector conversion and ISODATA cluster analysis. The specific attributes of each medical record file are first abstracted into a space vector, and these space vectors are then fed into ISODATA cluster analysis. Using the parameter values selected by the user, the ISODATA cluster analysis module iterates over the space vectors many times and finally produces the classification result. The system comprises a medical record import module, a vector processing module, and an ISODATA cluster analysis module, connected as shown in Fig. 1. Each module is described in detail below.
(1) Medical record import module:
The medical record import module performs preliminary processing of the medical record files imported by the user. Its key component is the filter, which extracts the variables to be analyzed from the medical record files according to the initialized mapping relations, in preparation for the vector abstraction of the next step. After the filter has processed the files, every variable in them is standardized.
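The filter step can be pictured with a minimal sketch. The patent does not specify the form of the mapping relations, so the dictionary layout, the field names, and the function name `filter_record` below are all assumptions made for illustration only.

```python
def filter_record(record, mapping):
    """Keep only the mapped fields of one raw record and rename them to
    standardized variable names. Fields absent from the record become None."""
    return {std_name: record.get(raw_name) for raw_name, std_name in mapping.items()}

# Hypothetical mapping relation: raw field name -> standardized variable name.
MAPPING = {"Age(y)": "age", "Smoker?": "smoker", "Dx": "diagnosis"}

# Hypothetical raw record; "Ward" is not in the mapping, so it is filtered out.
raw = {"Age(y)": 54, "Smoker?": "yes", "Dx": "type 2 diabetes", "Ward": "7B"}
clean = filter_record(raw, MAPPING)
```

Keeping the mapping as data rather than code would let a user change which variables are analyzed without touching the filter itself.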
(2) Vector processing module:
The specific attributes of a medical record file cannot be used directly for cluster analysis; they must first be abstracted into a vector. The variables therefore have to be converted in type and scale according to definite rules. Different kinds of variables in the medical record files call for different conversion methods, which fall into three categories: continuous variable conversion, logical variable conversion, and text variable conversion. The details are as follows:
a. Continuous variable conversion: each continuous variable becomes one dimension of the space vector. The sample mean of the variable is mapped to the standard value 100 (another standard value can be set manually). Each individual's value is divided by the mean and multiplied by the standard value, and the result is that individual's value in the corresponding dimension of the space vector.
b. Logical variable conversion: each yes/no logical variable becomes one dimension of the space vector. "Yes" corresponds to the value 100 (another standard value can be set manually) and "no" corresponds to 0; the result is the value in the corresponding dimension.
c. Text variable conversion: text variables are converted in one of two modes, special conversion and ordinary conversion. Both modes apply a defined standard to quantify text data as numeric data.
Special conversion: the conversion standard is preset in the system module, and text is converted to specific numerical values according to that standard. A diagnosis, for example, is a character variable. The system contains a preset four-dimensional disease spectrum in which each disease has its own spatial coordinates. The disease spectrum is a four-dimensional space constructed under a defined standard from the department each disease belongs to, the relationships between diseases, and the severity of the disease.
For example, hyperthyroidism, type 1 diabetes, and type 2 diabetes are all endocrine system diseases and therefore resemble one another to some degree; type 1 and type 2 diabetes are more similar to each other, so their coordinates in the disease spectrum lie closer together. The coordinates are (102, 321, 210, 3) for hyperthyroidism, (102, 321, 211, 4) for type 1 diabetes, and (102, 321, 211, 5) for type 2 diabetes. The vector conversion module integrates the disease-spectrum coordinates of the diagnosis into the space vector. Besides the disease spectrum, there are also a surgical operation spectrum, a prescription spectrum, and others, all of which belong to special conversion.
Ordinary conversion: when importing medical records, the user sets the mapping between each text value of a text variable and a number, for example excellent, good, fair, and poor corresponding to 100, 75, 50, and 25 respectively. The vector conversion module assigns the corresponding number according to these settings and mapping relations, and the number becomes one dimension of the space vector.
After vector conversion is complete, each individual's space vector coordinates are stored in the space vector library for the ISODATA cluster analysis of the next step.
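The three conversion rules can be sketched together as follows. This is an illustration under stated assumptions, not the patented implementation: the record fields (`age`, `smoker`, `diagnosis`, `recovery`) and all function names are hypothetical, while the standard value 100, the three disease-spectrum coordinates, and the 100/75/50/25 grade mapping are the figures quoted in the text.

```python
from statistics import mean

STANDARD = 100.0  # standard value from the text; it may be set to another value

# Coordinates quoted in the text for the preset four-dimensional disease spectrum.
DISEASE_SPECTRUM = {
    "hyperthyroidism": (102, 321, 210, 3),
    "type 1 diabetes": (102, 321, 211, 4),
    "type 2 diabetes": (102, 321, 211, 5),
}
# User-supplied mapping for an ordinary text variable (example from the text).
GRADE = {"excellent": 100, "good": 75, "fair": 50, "poor": 25}

def convert_continuous(values, standard=STANDARD):
    """Scale each value so that the sample mean maps to the standard value."""
    m = mean(values)
    return [v * standard / m for v in values]

def convert_logical(flag, standard=STANDARD):
    """Yes maps to the standard value, no maps to 0."""
    return standard if flag else 0.0

def to_vector(rec, scaled_age):
    """Concatenate all converted variables into one space vector."""
    return (scaled_age, convert_logical(rec["smoker"]),
            *DISEASE_SPECTRUM[rec["diagnosis"]], GRADE[rec["recovery"]])

# Hypothetical records, used only to exercise the conversions.
records = [
    {"age": 40, "smoker": True, "diagnosis": "hyperthyroidism", "recovery": "good"},
    {"age": 60, "smoker": False, "diagnosis": "type 2 diabetes", "recovery": "poor"},
]
ages = convert_continuous([r["age"] for r in records])
vector_store = [to_vector(r, a) for r, a in zip(records, ages)]
```

Here `vector_store` plays the role of the space vector library: one tuple per individual, with one dimension per continuous or logical variable, four dimensions for the diagnosis, and one for the graded text variable.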
(3) ISODATA cluster analysis module:
The core of the ISODATA cluster analysis module is the ISODATA algorithm. The module retrieves the space vectors to be analyzed from the space vector library in the vector processing module and performs ISODATA cluster analysis on them.
The ISODATA cluster analysis module is divided into seven secondary modules, connected as shown in Fig. 2:
A. Initialization module:
a. Parameter initialization: before ISODATA cluster analysis starts, the following parameters must be initialized:
Parameter name | Meaning |
---|---|
K | Target number of clusters |
k | Initially set number of clusters |
θN | Minimum number of vectors in each cluster; a group with fewer vectors is not kept as a separate cluster |
θs | Maximum standard deviation of distance allowed within each cluster; a cluster exceeding this value is split |
θc | Minimum distance between two cluster centres; centres closer than this value are merged |
L | Maximum number of pairs of cluster centres that can be merged in one iteration |
I | Number of iterations |
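One way to carry these seven parameters around is a small record type with a sanity check. The class name, field names, and validation rules below are assumptions for illustration; the values passed in are the ones used in the worked example later in the description.

```python
from dataclasses import dataclass

@dataclass
class IsodataParams:
    K: int          # target number of clusters
    k: int          # initially set number of clusters
    theta_n: int    # minimum number of vectors per cluster
    theta_s: float  # maximum standard deviation allowed within a cluster
    theta_c: float  # minimum distance between two cluster centres
    L: int          # maximum pairs of centres merged per iteration
    I: int          # iteration limit

    def validate(self):
        """Reject obviously unusable settings (assumed rules, not from the patent)."""
        if min(self.K, self.k, self.theta_n, self.I) < 1:
            raise ValueError("K, k, theta_n and I must be at least 1")
        if self.theta_s <= 0 or self.theta_c < 0 or self.L < 0:
            raise ValueError("theta_s must be positive; theta_c and L non-negative")
        return self

params = IsodataParams(K=3, k=2, theta_n=2, theta_s=20, theta_c=20, L=2, I=20).validate()
```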
B. Basic module I:
b. Central subset extraction: k samples are drawn at random from the space vector library as the cluster centre subset {z1, z2, z3, ..., zk}.
c. Clustering by the minimum distance method: if $\lVert x - z_i \rVert = \min_{1 \le j \le k} \lVert x - z_j \rVert$, the space vector $x$ is assigned to the nearest cluster Si.
d. Cluster screening: if the number of space vectors in Si is less than the prescribed minimum θN, the cluster is cancelled and k = k - 1.
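The three steps of basic module I can be sketched as below. The function names are hypothetical, Euclidean distance is assumed, and reassigning the vectors of a cancelled cluster to the remaining centres is an assumption: the text only says the cluster is cancelled and k is reduced by one.

```python
import math
import random

def assign_to_nearest(vectors, centers):
    """Minimum distance method: put each vector in the cluster of its nearest centre."""
    clusters = [[] for _ in centers]
    for x in vectors:
        nearest = min(range(len(centers)), key=lambda j: math.dist(x, centers[j]))
        clusters[nearest].append(x)
    return clusters

def screen_clusters(vectors, centers, theta_n):
    """Cancel clusters with fewer than theta_n vectors, then reassign (assumed)."""
    clusters = assign_to_nearest(vectors, centers)
    kept = [z for z, s in zip(centers, clusters) if len(s) >= theta_n]
    if not kept or len(kept) == len(centers):
        return list(centers), clusters
    return kept, assign_to_nearest(vectors, kept)

vectors = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (50, 50)]

# Central subset extraction: k = 3 vectors drawn at random as initial centres.
random.seed(0)
initial = random.sample(vectors, 3)

# Fixed centres are used here so that the screening step is easy to follow:
# the cluster around (50, 50) holds a single vector and is cancelled.
centers, clusters = screen_clusters(vectors, [(0, 0), (10, 10), (50, 50)], theta_n=2)
```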
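C. Basic module II:
e. Cluster centre correction: for the j-th dimension of the i-th class, the centre value is revised to the mean of that dimension over the Ni vectors in Si:
$z_{ij} = \frac{1}{N_i} \sum_{x \in S_i} x_j$
f. Average distance calculation: the average distance from the space vectors in each cluster to its cluster centre is calculated:
$\bar{D}_i = \frac{1}{N_i} \sum_{x \in S_i} \lVert x - z_i \rVert$
g. The overall mean distance of all classes is calculated, where N is the total number of space vectors:
$\bar{D} = \frac{1}{N} \sum_{i=1}^{k} N_i \bar{D}_i$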
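The three quantities of basic module II can be computed as in the following sketch, which assumes Euclidean distance and the usual ISODATA definitions (centre as the per-dimension mean, overall distance as the size-weighted mean of the per-cluster distances). The function names and sample clusters are hypothetical.

```python
import math

def update_center(cluster):
    """Corrected centre: the mean of each dimension over the cluster's vectors."""
    n = len(cluster)
    return tuple(sum(x[j] for x in cluster) / n for j in range(len(cluster[0])))

def mean_distance(cluster, center):
    """Average distance from the vectors of one cluster to its centre."""
    return sum(math.dist(x, center) for x in cluster) / len(cluster)

def overall_mean_distance(clusters, centers):
    """Mean distance over all vectors, i.e. the size-weighted mean of the cluster means."""
    total = sum(len(s) for s in clusters)
    return sum(len(s) * mean_distance(s, z) for s, z in zip(clusters, centers)) / total

clusters = [[(0.0, 0.0), (2.0, 0.0)],
            [(10.0, 10.0), (10.0, 14.0), (10.0, 12.0)]]
centers = [update_center(s) for s in clusters]
per_cluster = [mean_distance(s, z) for s, z in zip(clusters, centers)]
overall = overall_mean_distance(clusters, centers)
```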
D. Judgment and iteration module:
h. The following judgments are made:
1. If the number of iterations has reached I, set θc = 0 and go to module G.
2. If k <= K/2, go to the splitting module (module E) and perform splitting;
3. If k >= 2K, go to the merging module (module F) and perform merging;
4. If K/2 < k < 2K, go to the splitting module (module E) when the iteration number is odd and to the merging module (module F) when it is even.
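The branching logic can be written as a small decision function. This sketch assumes the usual ISODATA thresholds, in which splitting is forced when k <= K/2 and merging is forced when k >= 2K; the function name and the returned step labels are hypothetical.

```python
def next_step(iteration, k, K, I, theta_c):
    """Return (step, theta_c), where step is 'terminate', 'split' or 'merge'."""
    if iteration >= I:
        return "terminate", 0.0   # last pass: theta_c is set to 0
    if k <= K / 2:
        return "split", theta_c   # too few clusters
    if k >= 2 * K:
        return "merge", theta_c   # too many clusters
    # In between: split on odd iterations, merge on even ones.
    return ("split" if iteration % 2 == 1 else "merge"), theta_c

step, theta_c = next_step(iteration=1, k=2, K=3, I=20, theta_c=20.0)
```

With the example settings K = 3 and k = 2, the first iteration falls in the in-between range and, being odd, selects splitting.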
E. Splitting module:
i. For each cluster Si, calculate the standard deviation of each dimension j within the cluster:
$\sigma_{ij} = \sqrt{\frac{1}{N_i} \sum_{x \in S_i} (x_j - z_{ij})^2}$
Find the maximum standard deviation over the dimensions of each cluster, σi max.
If σi max > θs and either of the following two conditions holds:
1. the average within-class distance of the cluster is greater than the overall mean distance of all classes, $\bar{D}_i > \bar{D}$, and the number of space vectors in the class is more than twice θN, i.e. Ni > 2θN;
2. k <= K/2;
then the cluster is split into two clusters whose centres are
$z_i^{+} = z_i + h\,\sigma_{i\max}, \qquad z_i^{-} = z_i - h\,\sigma_{i\max}$
where the offset is applied to the dimension with the largest standard deviation and h is an arbitrary value between 0 and 1, chosen so that each vector in the original cluster lies at different distances from the two new centres.
After the split, k = k + 1.
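The splitting test can be sketched as follows. It assumes the conventional ISODATA split, in which the two new centres are offset by plus and minus h times the largest standard deviation along the dimension where that maximum occurs. The function names, the default h = 0.5, and the sample cluster are hypothetical.

```python
import math

def dimension_std(cluster, center):
    """Standard deviation of each dimension of a cluster around its centre."""
    n = len(cluster)
    return [math.sqrt(sum((x[j] - center[j]) ** 2 for x in cluster) / n)
            for j in range(len(center))]

def maybe_split(cluster, center, mean_dist, overall_dist, k, K,
                theta_s, theta_n, h=0.5):
    """Return two new centres if the split conditions hold, else [center]."""
    sigma = dimension_std(cluster, center)
    j_max = max(range(len(sigma)), key=lambda j: sigma[j])
    if sigma[j_max] <= theta_s:
        return [center]
    spread_out = mean_dist > overall_dist and len(cluster) > 2 * theta_n
    too_few = k <= K / 2
    if not (spread_out or too_few):
        return [center]
    shift = h * sigma[j_max]
    plus = tuple(c + shift if j == j_max else c for j, c in enumerate(center))
    minus = tuple(c - shift if j == j_max else c for j, c in enumerate(center))
    return [plus, minus]

cluster = [(0.0, 0.0), (0.0, 100.0)]
new_centers = maybe_split(cluster, (0.0, 50.0), mean_dist=50.0, overall_dist=50.0,
                          k=1, K=3, theta_s=20, theta_n=2)
```

In this sample the second dimension has a standard deviation of 50, which exceeds the threshold of 20, and k = 1 is at most K/2, so the cluster is split along that dimension.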
F. Merging module:
j. Calculate the distance between every pair of cluster centres Ci and Cj:
$D_{ij} = d(C_i, C_j)$
k. Compare each Dij with θc and arrange the Dij that are less than θc in ascending order.
Starting from the smallest Dij, merge Ci and Cj for each Dij; the merged cluster centre is
$C^{*} = \frac{1}{N_i + N_j} (N_i C_i + N_j C_j)$
and k = k - 1;
l. Continuing from the second-smallest Dij, merge the corresponding two cluster centres only if neither has already been merged. Stop merging once the total number of merged pairs reaches L.
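A minimal sketch of the merging step follows. It assumes Euclidean distance between centres and the usual size-weighted mean for the merged centre; the function name and the `(centre, size)` return format are hypothetical choices made for the illustration.

```python
import math
from itertools import combinations

def merge_centers(centers, sizes, theta_c, L):
    """Merge up to L closest pairs of centres whose distance is below theta_c.
    Returns a list of (centre, size); each centre takes part in at most one merge."""
    pairs = sorted((math.dist(centers[i], centers[j]), i, j)
                   for i, j in combinations(range(len(centers)), 2))
    used, merged = set(), []
    for d, i, j in pairs:
        if d >= theta_c or len(merged) >= L:
            break                      # remaining pairs are too far apart, or limit hit
        if i in used or j in used:
            continue                   # one of the two centres was already merged
        n = sizes[i] + sizes[j]
        new_center = tuple((sizes[i] * a + sizes[j] * b) / n
                           for a, b in zip(centers[i], centers[j]))
        merged.append((new_center, n))
        used.update((i, j))
    rest = [(c, s) for idx, (c, s) in enumerate(zip(centers, sizes)) if idx not in used]
    return rest + merged

result = merge_centers([(0.0, 0.0), (10.0, 0.0), (100.0, 100.0)],
                       sizes=[1, 3, 5], theta_c=20, L=2)
```

Weighting by cluster size keeps the merged centre at the true mean of the combined vectors, which is why the first two centres above merge to (7.5, 0.0) rather than to their midpoint.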
G. Termination module:
m. Increment the iteration counter by one: i = i + 1 (here i denotes the iteration counter, not a cluster index).
n. If the number of iterations has reached the upper limit, terminate the iteration; otherwise return to module B.
The invention is further explained below with reference to a specific example.
Ten medical record files are to be clustered, with their parameters as shown in the table:
Their spatial distribution is shown in Fig. 3:
The parameter values are set as follows:
Parameter name | Parameter value |
---|---|
K | 3 |
k | 2 |
θN | 2 |
θs | 20 |
θc | 20 |
L | 2 |
I | 20 |
The two cluster centres initially set are (0, 20) and (25, 200). After several iterations the two clusters have split into three, with new cluster centres (2, 20), (11, 83), and (25, 250). As the figure shows, this batch of medical records can be divided into three classes, located in the lower left, the middle, and the upper right.
In summary, compared with the prior art, the self-iterating medical record file cluster analysis system of the invention has the following beneficial effects:
1. Existing cluster analysis methods include hierarchical clustering, K-means clustering, and others. A shortcoming of these methods is that they cannot adjust the number of classes automatically according to the actual distribution of the vectors. K-means, for example, requires the number of classes to be preset, and the final number of clusters equals that preset number. The defining feature of ISODATA is that it adjusts the number of clusters and the cluster centres according to the actual situation, so that the clustering result better matches the actual distribution. In practice a researcher's idea of a reasonable number of clusters may be biased, and the expected number of clusters may not match the actual distribution; ISODATA can adjust the number of clusters to the actual situation, making the classification of medical record files more reasonable.
2. Classifying a large batch of medical record files manually, especially when each file must be classified on several variables, requires the classifier to analyze the variables together and judge which class each file belongs to. This takes a great deal of time and effort and is extremely inefficient. With a hierarchical clustering system, correlation coefficients can be computed from the quantified variables and the cluster analysis result obtained from them; because this process can use a computer to handle massive amounts of data, working efficiency is greatly improved.
3. The flexibility of the ISODATA clustering system for medical record file classification lies in the fact that users can adjust the parameters of the cluster analysis to their actual needs. ISODATA requires many parameters to be set, but these parameters give the user a flexible range of choices: by selecting different values for the iteration limit, the minimum distance between clusters, the maximum standard deviation within a cluster, and so on, the user can tune the precision of the cluster analysis so that its result better matches the actual situation. After one cluster analysis, the user can also reset the parameters according to its result, bringing the analysis closer to the actual situation.
The foregoing is merely a preferred embodiment of the invention and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.
Claims (5)
- 1. A self-iterating medical record file cluster analysis system, characterized in that it comprises a medical record import module, a vector processing module, and an ISODATA cluster analysis module, wherein: the medical record import module uses a filter to perform preliminary filtering of the medical record files imported by the user, extracts the variables to be analyzed from the files according to the initialized mapping relations, and standardizes each variable in the files in preparation for the vector abstraction of the next step; the vector processing module converts the type and scale of the different kinds of variables in the medical record files, including continuous variable conversion, logical variable conversion, and text variable conversion, and, once vector conversion is complete, stores each individual's space vector coordinates in a space vector library for the ISODATA cluster analysis of the next step; the ISODATA cluster analysis module retrieves the space vectors to be analyzed from the space vector library in the vector processing module and performs ISODATA cluster analysis on them.
- 2. The self-iterating medical record file cluster analysis system according to claim 1, characterized in that the text variable conversion is divided into special conversion and ordinary conversion.
- 3. The self-iterating medical record file cluster analysis system according to claim 1, characterized in that the ISODATA cluster analysis module is divided into seven secondary modules: an initialization module, basic module I, basic module II, a judgment and iteration module, a splitting module, a merging module, and a termination module.
- 4. The self-iterating medical record file cluster analysis system according to claim 3, characterized in that basic module I comprises central subset extraction, clustering by the minimum distance method, and cluster screening.
- 5. The self-iterating medical record file cluster analysis system according to claim 3, characterized in that the judgment and iteration module comprises cluster centre correction, average distance calculation, and calculation of the overall mean distance of all classes.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710596235.5A CN107480426B (en) | 2017-07-20 | 2017-07-20 | Self-iteration medical record file clustering analysis system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107480426A true CN107480426A (en) | 2017-12-15 |
CN107480426B CN107480426B (en) | 2021-01-19 |
Family
ID=60595154
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710596235.5A Expired - Fee Related CN107480426B (en) | 2017-07-20 | 2017-07-20 | Self-iteration medical record file clustering analysis system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107480426B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN107480426B (en) | 2021-01-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210119 |