WO2021082444A1 - Multi-granularity Spark super trust fuzzy method applied to large-scale brain medical record segmentation - Google Patents
- Publication number
- WO2021082444A1 (application PCT/CN2020/094104)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- granularity
- super
- population
- center
- elite
- Prior art date
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/004—Artificial life, i.e. computing arrangements simulating life
- G06N3/006—Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
Definitions
- the medical health service big data project not only requires the construction of electronic health records and electronic medical records databases, but also a medical health management and service big data application system covering public health, medical services, medical security, drug supply, family planning and integrated management services.
- to implement the big data project of medical and health services, information technologies such as big data, cloud computing and the mobile Internet must be used fully to promote effective interoperability and sound interaction between electronic medical record databases and electronic health record databases.
- the present invention discloses a multi-granular Spark super-trust fuzzy method for large-scale brain medical record segmentation.
- the large-scale brain medical record attribute set is divided into different multi-granularity evolutionary subpopulations Granu-population i on the Spark cloud platform; a super-trust model based on multi-granularity Spark is designed to build trust among different super elites in the multi-granularity populations; the multi-granularity center threshold is adjusted, a multi-granularity sub-population balance adjustment strategy is used to update the super elites dynamically, and large-scale brain disease records
- the invention can stably segment the knowledge reduction set of large-scale brain medical records and provides an important basis for the intelligent diagnosis and assisted treatment of brain diseases.
- the specific steps of step B are as follows:
- the population trust between the h-th multi-granularity population and the u-th multi-granularity population center is calculated as follows:
- given the similarity threshold, whose range is [0,1], the multi-granularity population conforms to the sub-population trust relationship in the different granularity spaces;
- λ is the confidence factor of the direct trust between super elites.
- the value of λ depends on the number of interactions between super elites: the more interactions, the larger the value of λ, with 0 ≤ λ ≤ 1.
- the size of the large-scale brain medical record attribute set is dynamically and iteratively updated through the sub-population trust relationships in the different granularity spaces.
- a further improvement of the present invention is that the specific steps of step C are as follows:
- the distance between the granularity center c 1 and the initial granularity center c 0 after the first iteration of the granularity sub-population is d(c 1 , c 0 ), and the distance between the new granularity center c′ and the original granularity center c after the i-th iteration is d(c, c′)
- the specific steps of step E are as follows:
- the global optimal consensus probability over all super elites is obtained for t ∈ {1,2,...,s}, and the pair of optimal consistent equilibrium degree and probability degree for large-scale brain medical record attribute segmentation is constructed for t ∈ {1,2,...,s};
- the present invention constructs a multi-granularity population super-elite dynamic cooperative operation mechanism on the Spark cloud platform based on the dynamic elite dominant area, and achieves the optimal and consistent balance of large-scale brain medical record segmentation, and reduces the complexity cost of large-scale brain medical record feature segmentation. It further improves the granularity and robustness of large-scale parallel feature extraction of brain medical records on the cloud computing Spark cloud platform, and lays a good foundation for the development of intelligent services such as brain medical record feature selection, rule mining, and clinical decision support.
- Figure 1 is the overall flow chart of the system
- Figures 3-5 are diagrams of the dynamic fuzzy collaborative operation process of multi-granularity population super elites
- n is the total number of elites
- SP i is the i-th super elite
- P ij is the j-th ordinary elite in the i-th multi-granularity population
- Re ij is the credibility of the i-th super elite to the j-th super elite
- R mj is the partial trust recommended by the m-th ordinary elite in the population to the j-th super elite
- I(j) is the set of all elites in the j-th multi-granularity population GP j , and |I(j)| is the cardinality of that set;
- the invention adopts a multi-granularity Spark super-trust model to construct trust between different super elites in the multi-granularity populations, updates the super elites dynamically using different multi-granularity sub-population balance adjustment strategies, and performs global search segmentation together with local refined segmentation of large-scale brain medical records, so that the super elites can collaboratively extract knowledge reduction subsets in their respective regions; this reduces execution time and improves the accuracy of large-scale brain medical record segmentation.
- the present invention constructs a dynamic cooperative operation mechanism for multi-granularity population super elites on the Spark cloud platform based on the dynamic elite dominance region, achieves the optimal consistent equilibrium of large-scale brain medical record segmentation, reduces the complexity cost of large-scale brain medical record feature segmentation, and further improves the fine granularity and robustness of large-scale parallel feature extraction of brain medical records on the Spark cloud platform, laying a foundation for intelligent services such as brain medical record feature selection, rule mining, and clinical decision support.
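As a rough illustration of the confidence factor λ described above: claim 2 of this publication takes λ = h / H Lmt, where h is the number of interactions between two super elites and H Lmt is a preset interaction-count limit. The Python sketch below computes only that ratio. The function and parameter names are hypothetical, and clamping the result to 1 when h exceeds the limit is an assumption made here to respect 0 ≤ λ ≤ 1, not something the source states.

```python
def confidence_factor(interactions: int, interaction_limit: int) -> float:
    """Return lambda = h / H_Lmt for two super elites.

    interactions      -- h, the number of interactions between the two elites
    interaction_limit -- H_Lmt, the preset interaction-count limit
    The result is clamped to 1.0 (an assumption of this sketch) so that
    0 <= lambda <= 1 always holds.
    """
    if interaction_limit <= 0:
        raise ValueError("interaction_limit must be positive")
    if interactions < 0:
        raise ValueError("interactions must be non-negative")
    return min(1.0, interactions / interaction_limit)


if __name__ == "__main__":
    # More interactions give a larger confidence factor, up to the cap.
    for h in (0, 5, 20, 30):
        print(h, confidence_factor(h, 20))
```

How λ then weights direct trust against recommended trust depends on the trust formula, which is not reproduced in this text, so the sketch stops at the factor itself.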
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Evolutionary Computation (AREA)
- Primary Health Care (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Medical Informatics (AREA)
- Data Mining & Analysis (AREA)
- Public Health (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Epidemiology (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
- Magnetic Resonance Imaging Apparatus (AREA)
Claims (4)
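- A multi-granularity Spark super-trust fuzzy method for large-scale brain medical record segmentation, characterized in that the specific steps are as follows:
  A. On the big-data Spark cloud platform, partition the large-scale brain medical record attribute set into different multi-granularity evolutionary populations Granu-population i, i=1,2,…,n; decompose the brain medical record attribute segmentation task into multiple parallel jobs, and then compute the equivalence classes of the different candidate attribute sets within the decomposed jobs;
  B. Design a multi-granularity super-trust model in which the i-th multi-granularity evolutionary population Granu-population i performs the reduction and segmentation of the i-th attribute set of the brain medical records; construct the trust degree between different super elites within the multi-granularity populations and compute the trust deviation of the multi-granularity populations; the size of the large-scale brain medical record attribute set is dynamically and iteratively updated through the sub-population trust relationships in the different granularity spaces;
  C. Set the multi-granularity Spark super-trust center adjustment threshold for large-scale brain medical record segmentation to λ; after the i-th iteration completes, carry the multi-granularity sub-populations Granu-population i whose granularity-center adjustment exceeds the threshold λ into the next iteration of adjustment; set the adjustment threshold for granularity-center migration to ε and the adjustment threshold for the number of multi-granularity sub-populations to θ; optimize the center c tj of the multi-granularity V tj and add it to the final set of multi-granularity population centers, forming a set of k multi-granularity centers;
  D. Update the super elites in the multi-granularity sub-populations dynamically using an equilibrium adjustment strategy: place the super elites of the multi-granularity sub-populations within an isosceles right triangle and compute their respective granularity values; if two super elites have the same lower granularity, their approximation attribute values converge to an equilibrium pair; if two super elites have the same higher granularity, their approximation attribute values likewise converge to an equilibrium pair; this equilibrium adjustment strategy helps increase the optimal consistent equilibrium degree of the multi-granularity sub-populations;
  E. Construct a dynamic fuzzy collaborative segmentation strategy for the super elites of the multi-granularity sub-populations: within the dynamic elite dominance region, perform global search segmentation and local refined segmentation of the large-scale brain medical record attributes; carry out hybrid competitive and cooperative collaboration in the multi-granularity sub-populations; construct the optimal consistent equilibrium degree and probability degree of large-scale brain medical record attribute segmentation, so that the super elites collaboratively extract knowledge reduction subsets within their respective Pareto dominance regions and stably segment the different attribute regions of the large-scale brain medical records, yielding the optimal feature set of the large-scale brain medical records;
  F. Compare the segmentation accuracy RC obtained above with the preset accuracy value η; if RC ≥ η, output the optimal segmentation knowledge set of the large-scale brain medical records; otherwise, repeat steps C, D and E until the segmentation accuracy satisfies RC ≥ η;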
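- The multi-granularity Spark super-trust fuzzy method for large-scale brain medical record segmentation according to claim 1, characterized in that the specific steps of step B are as follows:
  a. Set the number of multi-granularity populations to n, with n ≥ 2, and initialize the multi-granularity populations as GP h with h ∈ {1,...,n};
  d. The trust degree of the i-th super elite within a sub-population of the same granularity is defined as follows, where n is the total number of elites, SP i is the i-th super elite, and P ij is the j-th ordinary elite in the i-th multi-granularity population;
  f. Let t, t ∈ {2,...,n-1}, be the current iteration count for the similarity between multi-granularity population centers; the trust degree of each multi-granularity population center is computed from the previous, (t-1)-th, iteration, so that the size of the large-scale brain medical record attribute set is dynamically and iteratively updated through the sub-population trust relationships in the different granularity spaces;
  g. Compute the trust deviation Diff ij between the trust degrees of different super elites SP i and SP j in the multi-granularity populations, where Re ij is the credibility that the i-th super elite assigns to the j-th super elite, R mj is the local trust in the j-th super elite recommended by an arbitrarily chosen m-th ordinary elite in the population, I(j) is the set of all elites in the j-th multi-granularity population GP j, and |I(j)| is the cardinality of that set;
  g. Construct the trust relationship formula between different super elites within the multi-granularity populations, where λ is the confidence factor of the direct trust between super elites; the value of λ depends on the number of interactions between super elites: the more interactions, the larger λ, with 0 ≤ λ ≤ 1. We take λ = h/H Lmt, where h is the number of interactions between super elite i and super elite j and H Lmt is the preset interaction-count threshold; the size of the large-scale brain medical record attribute set is dynamically and iteratively updated through the sub-population trust relationships in the different granularity spaces.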
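- The multi-granularity Spark super-trust fuzzy method for large-scale brain medical record segmentation according to claim 1, characterized in that the specific steps of step C are as follows:
  b. Let the set of multi-granularity sub-populations and the set of centers both be empty, V=Φ and C=Φ, with iteration count t=1. Compute the distance between each multi-granularity sub-population and the multi-granularity centers, and assign the large-scale brain medical record attribute set to the corresponding multi-granularity centers by the minimum-distance principle, forming k multi-granularity sub-populations; record the number of super elites in each center and set the initial adjustment labels;
  d. After the first iteration of the granularity sub-population, the distance between the granularity center c 1 and the initial granularity center c 0 is d(c 1,c 0); after the i-th iteration, the distance between the new granularity center c′ and the original granularity center c is d(c,c′); if the condition on these distances holds, where ε is the similarity threshold with range ε∈[0,1], the granularity center represented by c′ no longer takes part in the next round of iterative adjustment; otherwise the iterative adjustment continues;
  e. Compute the distance between each super elite in the multi-granularity populations with label f tj=1 and the centers of the multi-granularity populations taking part in the adjustment; assign the brain medical record attributes to the corresponding multi-granularity populations by the minimum-distance principle, forming k new multi-granularity populations {V tj}; record the number of super elites in each multi-granularity population {N tj}, and obtain the adjusted number of super elites used for large-scale brain medical record attribute segmentation, ΔN tj;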
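- The multi-granularity Spark super-trust fuzzy method for large-scale brain medical record segmentation according to claim 1, characterized in that the specific steps of step E are as follows:
  c. Perform hybrid competitive and cooperative large-scale brain medical record segmentation in the multi-granularity sub-populations. Let S i be the i-th super elite; for i=1 to |S i|, perform the following operations: (1) insert the representative S i,rep of super elite S i into P i t; (2) if n x>|S i|, select super elite P i t from the multi-granularity sub-population Granu-subpopulation i; (3) combine all S i,j with the solutions of the other multi-granularity sub-populations Granu-subpopulation i, rank them, and compute the niche count of S i,j; (4) update the super elite representative of S i to obtain the non-dominated solutions within the Pareto dominance region, determine the winning multi-granularity sub-population, and update S i=S k;
  d. The fuzzy membership degree uCh(P i) of a super elite is computed in the similar-membership manner, where the distance between the reference value P i and the super elite center C h is defined as d(P i,C h);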
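The minimum-distance assignment used in step C (claim 3) can be sketched as follows. This is a minimal single-machine Python illustration, not the patented Spark implementation: it assumes super elites and granularity centers are numeric vectors compared by Euclidean distance (the source does not name a metric), and the function name `assign_to_nearest_center` is hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def assign_to_nearest_center(
    elites: Sequence[Sequence[float]],
    centers: Sequence[Sequence[float]],
) -> Tuple[List[int], List[int]]:
    """Assign each elite to its nearest center and count elites per center.

    Returns (labels, counts): labels[i] is the index of the center closest
    to elites[i]; counts[j] is the number of elites assigned to center j.
    Euclidean distance is an assumption of this sketch.
    """
    if not centers:
        raise ValueError("at least one center is required")
    labels = []
    for elite in elites:
        distances = [math.dist(elite, center) for center in centers]
        # Ties go to the lowest-index center.
        labels.append(distances.index(min(distances)))
    tally = Counter(labels)
    counts = [tally.get(j, 0) for j in range(len(centers))]
    return labels, counts


if __name__ == "__main__":
    elites = [(0.0, 0.0), (1.0, 0.0), (9.0, 9.0), (10.0, 10.0)]
    centers = [(0.0, 0.0), (10.0, 10.0)]
    labels, counts = assign_to_nearest_center(elites, centers)
    print(labels, counts)
```

The per-center counts play the role of the recorded super-elite numbers {N tj}; how those counts feed the adjustment thresholds is left out because the corresponding formulas are not reproduced in this text.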
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU2020286320A AU2020286320B2 (en) | 2019-10-28 | 2020-06-03 | Multi-granularity spark super trust fuzzy method applied to large-scale brain medical record segmentation |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911030948.0A CN110867224B (zh) | 2019-10-28 | 2019-10-28 | Multi-granularity Spark super trust fuzzy method applied to large-scale brain medical record segmentation |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021082444A1 (zh) | 2021-05-06 |
Family
ID=69653442
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/094104 WO2021082444A1 (zh) | Multi-granularity Spark super trust fuzzy method applied to large-scale brain medical record segmentation | 2019-10-28 | 2020-06-03 |
Country Status (3)
Country | Link |
---|---|
CN (1) | CN110867224B (zh) |
AU (1) | AU2020286320B2 (zh) |
WO (1) | WO2021082444A1 (zh) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
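CN110867224B (zh) * | 2019-10-28 | 2022-02-08 | Nantong University | Multi-granularity Spark super trust fuzzy method applied to large-scale brain medical record segmentation |
CN113012775B (zh) * | 2021-03-30 | 2021-10-08 | Nantong University | Incremental attribute reduction Spark method for lesion classification of erythema electronic medical records |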
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
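US20120130929A1 (en) * | 2010-11-24 | 2012-05-24 | International Business Machines Corporation | Controlling quarantining and biasing in cataclysms for optimization simulations |
CN104462853A (zh) * | 2014-12-29 | 2015-03-25 | Nantong University | Population elite distribution cloud collaborative equilibrium method for electronic medical record feature extraction |
CN105279388A (zh) * | 2015-11-17 | 2016-01-27 | Nantong University | Integrated reduction method for gestational-age neonatal brain medical records with multi-layer cloud computing framework collaboration |
CN105719004A (zh) * | 2016-01-18 | 2016-06-29 | Hefei University of Technology | Solving multi-task problems based on a co-evolutionary particle swarm algorithm |
CN108133260A (zh) * | 2018-01-17 | 2018-06-08 | Zhejiang Sci-Tech University | Workflow scheduling method using multi-objective particle swarm optimization based on real-time state monitoring |
CN108986872A (zh) * | 2018-06-21 | 2018-12-11 | Nantong University | Multi-granularity attribute weight Spark method for big-data electronic medical record reduction |
CN109120017A (zh) * | 2017-06-22 | 2019-01-01 | Nanjing University of Science and Technology | Power system reactive power optimization method based on an improved particle swarm algorithm |
CN110867224A (zh) * | 2019-10-28 | 2020-03-06 | Nantong University | Multi-granularity Spark super trust fuzzy method applied to large-scale brain medical record segmentation |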
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
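CN201788510U (zh) * | 2010-07-13 | 2011-04-06 | Nantong University | Dynamic electronic medical record collaborative mining system fusing particle swarm and extension rough lattice |
EP2784748B1 (en) * | 2013-03-28 | 2017-11-01 | Expert Ymaging, SL | A computer implemented method for assessing vascular networks from medical images and uses thereof |
CN103838972B (zh) * | 2014-03-13 | 2016-08-24 | Nantong University | Quantum cooperative game implementation method for MRI medical record attribute reduction |
CN105069503A (zh) * | 2015-07-30 | 2015-11-18 | Chongqing University of Posts and Telecommunications | Cooperation-degree-based heterogeneous-population parallel particle swarm algorithm and implementation method of its MapReduce model |
CN106157370B (zh) * | 2016-03-03 | 2019-04-02 | Chongqing University | Triangular mesh normalization method based on a particle swarm algorithm |
US20180108430A1 (en) * | 2016-09-30 | 2018-04-19 | Board Of Regents, The University Of Texas System | Method and system for population health management in a captivated healthcare system |
CN107257307B (zh) * | 2017-06-29 | 2020-06-02 | China University of Mining and Technology | Spark-based parallel genetic algorithm method for solving multi-terminal cooperative access networks |
CN108446740B (zh) * | 2018-03-28 | 2019-06-14 | Nantong University | Multi-layer consistent collaborative method for brain image medical record feature extraction |
CN109117864B (zh) * | 2018-07-13 | 2020-02-28 | South China University of Technology | Coronary heart disease risk prediction method, model and system based on heterogeneous feature fusion |
CN109840551B (zh) * | 2019-01-14 | 2022-03-15 | Hubei University of Technology | Method for optimizing random forest parameters for machine learning model training |
CN109871995B (zh) * | 2019-02-02 | 2021-03-26 | Zhejiang University of Technology | Quantum optimization parameter tuning method for distributed deep learning under the Spark framework |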
-
2019
- 2019-10-28 CN CN201911030948.0A patent/CN110867224B/zh active Active
-
2020
- 2020-06-03 WO PCT/CN2020/094104 patent/WO2021082444A1/zh active Application Filing
- 2020-06-03 AU AU2020286320A patent/AU2020286320B2/en active Active
Non-Patent Citations (2)
Title |
---|
DING WEIPING , WANG JIANDONG , ZHANG XIAOFENG GUAN ZHIJIN: "Co-evolutionary cloud-based attribute ensemble multi-agent reduction algorithm", JOURNAL OF SOUTHEAST UNIVERSITY ( ENGLISH EDITION), vol. 32, no. 4, 15 December 2016 (2016-12-15), pages 432 - 438, XP055809178, ISSN: 1003-7985, DOI: 10.3969/j.issn.1003-7985.2016.04.007 * |
XU MING-JIE; WEI CHENG-JIAN; SHEN HANG: "Research on Parallel K-means Algorithm Based on Spark", MICROELECTRONICS & COMPUTER, vol. 35, no. 5, 1 February 2019 (2019-02-01), pages 95 - 99, XP009527906, DOI: 10.19304/j.cnki.issn1000-7180.2018.05.018 * |
Also Published As
Publication number | Publication date |
---|---|
CN110867224B (zh) | 2022-02-08 |
CN110867224A (zh) | 2020-03-06 |
AU2020286320B2 (en) | 2022-10-20 |
AU2020286320A1 (en) | 2021-05-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | ENP | Entry into the national phase | Ref document number: 2020286320; Country of ref document: AU; Date of ref document: 20200603; Kind code of ref document: A |
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20880867; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | EP: PCT application non-entry in European phase | Ref document number: 20880867; Country of ref document: EP; Kind code of ref document: A1 |