CN111192662B - Medical image processing method and storage medium based on random forest algorithm - Google Patents

Publication number
CN111192662B
CN111192662B CN202010009003.7A CN202010009003A CN111192662B CN 111192662 B CN111192662 B CN 111192662B CN 202010009003 A CN202010009003 A CN 202010009003A CN 111192662 B CN111192662 B CN 111192662B
Authority
CN
China
Prior art keywords
user
medical image
symptoms
characteristic information
symptom
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010009003.7A
Other languages
Chinese (zh)
Other versions
CN111192662A (en)
Inventor
霍颖瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Foshan University
Original Assignee
Foshan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Foshan University
Priority to CN202010009003.7A
Publication of CN111192662A
Application granted
Publication of CN111192662B
Legal status: Active (current)
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00 ICT specially adapted for the handling or processing of medical images
    • G16H30/20 ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Biomedical Technology (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The application relates to a medical image processing method and a storage medium based on a random forest algorithm, comprising the following steps: step 101, obtaining characteristic information P of a user, obtaining characteristic information Q of symptoms of a sample library, and integrating the characteristic information P and the characteristic information Q to obtain characteristic information H; step 102, constructing a random forest for matching user symptoms, and training it with the characteristic information H to obtain a random forest model for matching user symptoms; step 103, obtaining the user and the symptom to be matched, and integrating them according to step 101 to obtain integrated characteristic information; step 104, repeating the above steps until the set of matching degree indices $\{I_1, I_2, \ldots, I_m\}$ between the user and all symptoms of the sample library is obtained, taking $\mathrm{MAX}\{I_1, I_2, \ldots, I_m\}$, and pushing the symptom corresponding to that maximum to the user as the user's matching symptom. The application can save doctors' time.

Description

Medical image processing method and storage medium based on random forest algorithm
Technical Field
The application relates to the field of artificial intelligence, in particular to a medical image processing method based on a random forest algorithm and a storage medium.
Background
The word "hospital" comes from the Latin for "guest": hospitals were first set up as places of refuge that also offered lodging, so that visitors felt comfortable and welcome. Later, the hospital became a specialized institution that meets people's medical needs and provides medical services, a place where patients are admitted and treated.
Hospitals are medical institutions whose main purpose is to heal the sick and rescue the injured. In accordance with laws, regulations and industry standards, they provide patients with the necessary medical examinations, treatment, nursing, reception and consultation, rehabilitation equipment, ambulance transport and other services.
Hospitals are frequently overcrowded, and doctors in many departments cannot keep up with the large number of patients seeking consultation. For some mild, typical symptoms, the condition can be determined entirely from the image information of previously confirmed diagnoses, so having a doctor see every such patient one by one wastes time.
Disclosure of Invention
The application aims to solve the defects of the prior art and provides a medical image processing method and a storage medium based on a random forest algorithm.
In order to achieve the above purpose, the present application adopts the following technical scheme:
the application provides a medical image processing method and a storage medium based on a random forest algorithm, wherein the medical image processing method comprises the following steps:
step 101, acquiring characteristic information P of a user, acquiring characteristic information Q of symptoms of a sample library, and integrating the characteristic information P and the characteristic information Q to obtain characteristic information H;
step 102, constructing a random forest for matching user symptoms, and training it with the characteristic information H to obtain a random forest model for matching user symptoms;
step 103, obtaining the user and the symptom to be matched, integrating them according to step 101 to obtain integrated characteristic information, and inputting the integrated characteristic information into the random forest model to obtain a matching degree index I;
step 104, repeating the above steps until the set of matching degree indices $\{I_1, I_2, \ldots, I_m\}$ between the user and all symptoms of the sample library is obtained, taking $\mathrm{MAX}\{I_1, I_2, \ldots, I_m\}$, and pushing the symptom corresponding to $\mathrm{MAX}\{I_1, I_2, \ldots, I_m\}$ to the user as the matching symptom of the user.
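The loop formed by steps 103 and 104 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the function name `best_matching_symptom` and the two callables `integrate` (step 101 applied to one user and one symptom) and `matching_index` (the trained model returning the index I) are hypothetical names introduced here.

```python
from typing import Callable, Dict, Hashable, List, Tuple


def best_matching_symptom(
    user: dict,
    symptom_library: Dict[Hashable, dict],
    integrate: Callable[[dict, dict], List[float]],
    matching_index: Callable[[List[float]], float],
) -> Tuple[Hashable, float]:
    """Return the symptom whose matching degree index I is largest."""
    if not symptom_library:
        raise ValueError("symptom library is empty")
    indices = {}
    for name, symptom in symptom_library.items():
        h = integrate(user, symptom)        # step 101 for one (user, symptom) pair
        indices[name] = matching_index(h)   # step 103: matching degree index I
    best = max(indices, key=indices.get)    # step 104: MAX{I_1, ..., I_m}
    return best, indices[best]
```

Ties are resolved in favor of the symptom that appears first in the library; the application does not say how ties should be handled.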
Further, the characteristic information P of the user in step 101 includes the age, sex, disease history of the user and the medical image information of the user, which is the medical image information selected by the user for consultation.
Further, the characteristic information Q of the symptoms in the sample library in step 101 includes the morbidity corresponding to each age group for the symptom, the morbidity corresponding to each sex, and the set of medical image information from previously confirmed diagnoses.
Further, the characteristic information H in the step 101 includes the morbidity corresponding to the age of the user, the morbidity corresponding to the sex of the user, whether the user has the history of the disease, and the similarity between the medical image of the user and the medical image information set of the symptom which has been confirmed in the past.
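As an illustration of how P and Q might be integrated into H, the sketch below builds the four components listed above. The dictionary layout (`incidence_by_age`, `incidence_by_sex`, `confirmed_images`, `history`) and the function name `integrate_features` are assumptions made for this example; the application does not specify a data format. The image comparison is passed in as a callable so that any similarity measure can be used.

```python
def integrate_features(user, symptom, image_set_similarity):
    """Build H = [age morbidity, sex morbidity, history flag, image similarity]."""
    # Morbidity of the symptom for the age group that contains the user's age.
    age_rate = 0.0
    for (low, high), rate in symptom["incidence_by_age"].items():
        if low <= user["age"] <= high:
            age_rate = rate
            break
    # Morbidity of the symptom for the user's sex.
    sex_rate = symptom["incidence_by_sex"].get(user["sex"], 0.0)
    # Whether the user has a history of this disease (1.0 = yes, 0.0 = no).
    has_history = 1.0 if symptom["name"] in user["history"] else 0.0
    # Similarity between the user's image and the confirmed image set.
    similarity = image_set_similarity(user["image"], symptom["confirmed_images"])
    return [age_rate, sex_rate, has_history, similarity]
```

A user aged 45 with a symptom whose age groups are `(0, 17)`, `(18, 59)` and `(60, 150)` would receive the morbidity stored for `(18, 59)`; an age outside every group falls back to 0.0, which is a choice of this sketch.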
Further, the method of constructing the random forest model in the step 102 includes the following steps:
step 501, using the bootstrap method, randomly drawing M new bootstrap sample sets from the sample library symptoms, and constructing M classification and regression trees from these bootstrap sample sets;
step 502, letting n be the number of features in the characteristic information P, randomly extracting m features at each node of each tree, where m ≤ n, and selecting, by calculating information gain, the feature with the greatest classification ability among the m features to split the node;
step 503, growing each tree to its maximum extent without pruning;
step 504, combining the M generated trees into a random forest to produce the random forest model, wherein the random forest model matches user symptoms by voting: if the ratio of the number of trees that vote for a successful match to M is not lower than a threshold N, the user is matched with the symptom.
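The three random-forest ingredients named in steps 501, 502 and 504 can be sketched with the standard library alone. The helper names below are hypothetical, and tree construction itself is omitted; the sketch only shows bootstrap sampling, per-node feature subsampling, and the proportion-based vote.

```python
import random


def bootstrap_sample(samples, rng):
    """Step 501: draw len(samples) items with replacement (one bootstrap set)."""
    return [rng.choice(samples) for _ in samples]


def random_feature_subset(n_features, m, rng):
    """Step 502: pick m distinct feature indices out of n, with m <= n."""
    if not 0 < m <= n_features:
        raise ValueError("m must satisfy 0 < m <= n")
    return rng.sample(range(n_features), m)


def forest_matches(tree_votes, threshold_n):
    """Step 504: match if the share of trees voting 'match' is at least N."""
    if not tree_votes:
        raise ValueError("no trees voted")
    favorable = sum(1 for vote in tree_votes if vote)
    return favorable / len(tree_votes) >= threshold_n
```

For instance, `forest_matches([True, True, False, True], 0.75)` accepts the match because three of four trees agree, which meets the threshold exactly.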
Further, the method for calculating the information gain in step 502 is to calculate through an ID3 algorithm, and specifically includes the following steps:
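if a sub-feature p of the characteristic information P divides the sample library symptom set T into j subsets $T_1, T_2, \ldots, T_j$, the information gain of the sub-feature p, in the standard ID3 form, is
$$\mathrm{Gain}(p) = \mathrm{Info}(T) - \sum_{i=1}^{j} \frac{|T_i|}{M}\,\mathrm{Info}(T_i), \qquad \mathrm{Info}(S) = -\sum_{k=1}^{s} \frac{\mathrm{freq}(C_k, S)}{|S|}\,\log_2 \frac{\mathrm{freq}(C_k, S)}{|S|}$$
where $M = |T|$ is the number of samples in the symptom set T, the same as M in step 501, $|T_i|$ is the number of samples belonging to the subset $T_i$, $\mathrm{freq}(C_k, T)$ is the frequency (count) of samples in T that belong to class $C_k$, and s is the number of classes of samples in T.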
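A small worked example may help here. The sketch below computes the standard ID3 information gain (entropy of the whole set minus the size-weighted entropy of the subsets produced by a feature); the function names `info` and `information_gain` are chosen for this illustration and are not taken from the application.

```python
from collections import Counter
from math import log2


def info(labels):
    """Entropy of a label list: -sum(p_k * log2(p_k)) over the classes present."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(labels, feature_values):
    """Entropy of the full set minus the weighted entropy after the split."""
    total = len(labels)
    subsets = {}
    # Group the labels by the value the feature takes on each sample.
    for value, label in zip(feature_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / total * info(s) for s in subsets.values())
    return info(labels) - remainder
```

With four samples labeled match, match, no, no, a feature that separates the two classes perfectly has a gain of 1 bit, while a feature that leaves both subsets evenly mixed has a gain of 0, so step 502 would split on the former.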
Further, the voting in the step 504 is performed in the following manner:
defining C as the label to be pushed, then
$$C = \arg\max_{C}\ \frac{1}{M}\sum_{i=1}^{M} I\!\left(\frac{n_{h_i,C}}{n_{h_i}}\right)$$
where M is the number of trees, $I(\cdot)$ is the indicator function, $n_{h_i,C}$ is the classification result of the single tree $h_i$ for class C, and $n_{h_i}$ is the number of leaf nodes of tree $h_i$; if the weight of C is greater than the threshold H, the single tree $h_i$ is in favor of matching the user with the symptom.
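One possible reading of this voting rule is sketched below. It assumes, as an interpretation rather than a statement of the application, that each tree contributes a weight equal to its class-C result divided by its number of leaf nodes, that the tree favors the match when this weight exceeds the threshold H, and that the pushed label is the one favored by the largest share of trees. The function names and the `(class_result, leaf_count)` tuple layout are hypothetical.

```python
def tree_favors(n_class_c, n_leaves, threshold_h):
    """A single tree favors label C when its weight for C exceeds H."""
    if n_leaves <= 0:
        raise ValueError("a tree must have at least one leaf node")
    return n_class_c / n_leaves > threshold_h


def vote_share(tree_stats, threshold_h):
    """Fraction of the M trees that favor the label: (1/M) * sum of indicators."""
    if not tree_stats:
        raise ValueError("no trees")
    favorable = sum(1 for n_c, n_l in tree_stats if tree_favors(n_c, n_l, threshold_h))
    return favorable / len(tree_stats)


def pushed_label(stats_by_label, threshold_h):
    """Label with the largest vote share across the forest."""
    return max(stats_by_label, key=lambda c: vote_share(stats_by_label[c], threshold_h))
```

Each entry of `stats_by_label` maps a candidate label to one `(class_result, leaf_count)` pair per tree.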
Further, the method for calculating the similarity between the medical image of the user and the medical image information set of the symptom which is confirmed in the past comprises the steps of,
and calculating the similarity of the medical image and each medical image in the medical image set of the symptom through OpenCV, and taking the maximum value as the similarity of the medical image of the user and the medical image information set of the symptom, wherein the medical image set is selected by a doctor according to the typical medical image of the symptom.
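The "maximum over the image set" rule can be sketched as follows. The application names OpenCV but does not state which comparison is used, so this example substitutes a pure-Python histogram intersection as a stand-in; a real system might instead compare histograms with OpenCV's `cv2.compareHist`, which is an assumption here. Images are represented as flat lists of 8-bit gray values, and all function names are hypothetical.

```python
def normalized_histogram(pixels, bins=8, max_value=255):
    """Gray-level histogram whose entries sum to 1."""
    if not pixels:
        raise ValueError("image has no pixels")
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // (max_value + 1), bins - 1)] += 1
    return [c / len(pixels) for c in counts]


def histogram_intersection(hist_a, hist_b):
    """Similarity in [0, 1] between two normalized histograms."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def image_set_similarity(user_pixels, image_set, bins=8):
    """Compare the user image with every image in the set and keep the maximum."""
    if not image_set:
        raise ValueError("image set is empty")
    user_hist = normalized_histogram(user_pixels, bins)
    return max(
        histogram_intersection(user_hist, normalized_histogram(img, bins))
        for img in image_set
    )
```

Taking the maximum means a single close match among the doctor-selected typical images is enough to give a high similarity, regardless of how different the other images are.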
The application also proposes a computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the medical image processing method based on a random forest algorithm.
The beneficial effects of the application are as follows: according to the medical image processing method and the storage medium based on the random forest algorithm, the user can be matched with the corresponding symptoms through the random forest, and then the user can find the corresponding department to carry out registration diagnosis according to the matched result, so that the medical image processing method and the storage medium are very convenient, and the time of doctors is saved.
Drawings
Fig. 1 is a flowchart of a medical image processing method and a storage medium based on a random forest algorithm according to the present application.
Detailed Description
The conception, specific structure, and technical effects produced by the present application will be clearly and completely described below with reference to the embodiments and the drawings to fully understand the objects, aspects, and effects of the present application. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The same reference numbers will be used throughout the drawings to refer to the same or like parts.
Referring to fig. 1, a medical image processing method and a storage medium based on a random forest algorithm are provided, including the following steps:
step 101, acquiring characteristic information P of a user, acquiring characteristic information Q of symptoms of a sample library, and integrating the characteristic information P and the characteristic information Q to obtain characteristic information H;
step 102, constructing a random forest for matching user symptoms, and training it with the characteristic information H to obtain a random forest model for matching user symptoms;
step 103, obtaining the user and the symptom to be matched, integrating them according to step 101 to obtain integrated characteristic information, and inputting the integrated characteristic information into the random forest model to obtain a matching degree index I;
step 104, repeating the above steps until the set of matching degree indices $\{I_1, I_2, \ldots, I_m\}$ between the user and all symptoms of the sample library is obtained, taking $\mathrm{MAX}\{I_1, I_2, \ldots, I_m\}$, and pushing the symptom corresponding to $\mathrm{MAX}\{I_1, I_2, \ldots, I_m\}$ to the user as the matching symptom of the user.
As a preferred embodiment of the present solution, the characteristic information P of the user in step 101 includes the age, sex and disease history of the user, and the medical image information of the user, which is the medical image information selected by the user for consultation.
As a preferred embodiment of the present solution, the characteristic information Q of the symptoms in the sample library in step 101 includes the morbidity corresponding to each age group for the symptom, the morbidity corresponding to each sex, and the set of medical image information from previously confirmed diagnoses.
As a preferred embodiment of the present solution, the characteristic information H in step 101 includes the morbidity corresponding to the age of the user, the morbidity corresponding to the sex of the user, whether the user has a history of the disease, and the similarity between the medical image of the user and the set of medical image information from previously confirmed diagnoses of the symptom. The user information may be entered by the user through a human-computer interaction interface, or collected in any other reasonable way.
As a preferred embodiment of the present solution, the method of constructing the random forest model in step 102 includes the following steps:
step 501, using the bootstrap method, randomly drawing M new bootstrap sample sets, with replacement, from the symptom sample library, and constructing M classification and regression trees from these bootstrap sample sets;
step 502, letting n be the number of features in the characteristic information P, randomly extracting m features at each node of each tree, where m ≤ n, and selecting, by calculating information gain, the feature with the greatest classification ability among the m features to split the node;
step 503, growing each tree to its maximum extent without pruning;
step 504, combining the M generated trees into a random forest to produce the random forest model, wherein the random forest model matches user symptoms by voting: if the ratio of the number of trees that vote for a successful match to M is not lower than a threshold N, the user is matched with the symptom.
As a preferred embodiment of the present solution, the method for calculating the information gain in step 502 is performed by an ID3 algorithm, and specifically includes the following steps:
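if a sub-feature p of the characteristic information P divides the sample library symptom set T into j subsets $T_1, T_2, \ldots, T_j$, the information gain of the sub-feature p, in the standard ID3 form, is
$$\mathrm{Gain}(p) = \mathrm{Info}(T) - \sum_{i=1}^{j} \frac{|T_i|}{M}\,\mathrm{Info}(T_i), \qquad \mathrm{Info}(S) = -\sum_{k=1}^{s} \frac{\mathrm{freq}(C_k, S)}{|S|}\,\log_2 \frac{\mathrm{freq}(C_k, S)}{|S|}$$
where $M = |T|$ is the number of samples in the symptom set T, the same as M in step 501, $|T_i|$ is the number of samples belonging to the subset $T_i$, $\mathrm{freq}(C_k, T)$ is the frequency (count) of samples in T that belong to class $C_k$, and s is the number of classes of samples in T.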
As a preferred embodiment of the present solution, the voting in step 504 is performed by:
defining C as the label to be pushed, then
$$C = \arg\max_{C}\ \frac{1}{M}\sum_{i=1}^{M} I\!\left(\frac{n_{h_i,C}}{n_{h_i}}\right)$$
where M is the number of trees, $I(\cdot)$ is the indicator function, $n_{h_i,C}$ is the classification result of the single tree $h_i$ for class C, and $n_{h_i}$ is the number of leaf nodes of tree $h_i$; if the weight of C is greater than the threshold H, the single tree $h_i$ is in favor of matching the user with the symptom.
As a preferred embodiment of the present application, the method for calculating the similarity between the medical image of the user and the medical image information set of the symptom to be diagnosed in the past includes,
and calculating the similarity of the medical image and each medical image in the medical image set of the symptom through OpenCV, and taking the maximum value as the similarity of the medical image of the user and the medical image information set of the symptom, wherein the medical image set is selected by a doctor according to the typical medical image of the symptom. Preferably, a related database may also be constructed, so that once the matching symptom is obtained the database is queried automatically and the department corresponding to that symptom is pushed to the user; in this way, users with little education can quickly find the right department and register there.
When the specific scheme is implemented, the user can obtain the matched symptoms only by inputting the related data of the user, and then the user can find the corresponding department according to the matched symptoms to diagnose.
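The last hop, from matched symptom to department, amounts to a table lookup. The sketch below uses an in-memory dictionary in place of a database; the entries, the fallback department and the function name `department_for` are invented for illustration only.

```python
# Hypothetical symptom-to-department table; a real deployment would query a database.
SYMPTOM_TO_DEPARTMENT = {
    "gastritis": "Gastroenterology",
    "fracture": "Orthopedics",
}


def department_for(symptom, table=SYMPTOM_TO_DEPARTMENT, default="General outpatient"):
    """Return the department to register with for a matched symptom."""
    return table.get(symptom, default)
```

Falling back to a general department when a symptom is missing from the table is a choice of this sketch, not something the application specifies.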
The application also proposes a computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the medical image processing method based on a random forest algorithm.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module. The integrated modules may be implemented in hardware or in software functional modules.
The integrated modules, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, all or part of the flow of the method of the above embodiments may be implemented by a computer program instructing related hardware; the computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium may be appropriately expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, in accordance with legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunications signals.
While the present application has been described in considerable detail and with particularity with respect to several embodiments, it is not intended to be limited to any such detail or embodiment; rather, it is to be construed with reference to the appended claims, interpreted broadly in view of the prior art, so as to effectively cover the intended scope of the application. Furthermore, the foregoing describes the application in terms of embodiments foreseen by the inventor for the purpose of providing a useful description; insubstantial modifications of the application that are not presently foreseen may nonetheless represent equivalents of it.
The above are merely preferred embodiments of the present application, and the present application is not limited to them; any solution that achieves the technical effects of the present application by the same means falls within the scope of protection of the present application. Various modifications and variations of the technical solution and/or the embodiments are possible within that scope.

Claims (2)

1. A medical image processing method based on a random forest algorithm is characterized by comprising the following steps:
step 101, acquiring characteristic information P of a user, acquiring characteristic information Q of symptoms of a sample library, and integrating the characteristic information P and the characteristic information Q to obtain characteristic information H;
step 102, constructing a random forest for matching user symptoms, and training it with the characteristic information H to obtain a random forest model for matching user symptoms;
step 103, obtaining the user and the symptom to be matched, integrating them according to step 101 to obtain integrated characteristic information, and inputting the integrated characteristic information into the random forest model to obtain a matching degree index I;
step 104, repeating the above steps until the set of matching degree indices $\{I_1, I_2, \ldots, I_m\}$ between the user and all symptoms of the sample library is obtained, taking $\mathrm{MAX}\{I_1, I_2, \ldots, I_m\}$, and pushing the symptom corresponding to $\mathrm{MAX}\{I_1, I_2, \ldots, I_m\}$ to the user as the matching symptom of the user;
the characteristic information P of the user comprises the age, sex, disease history of the user and the medical image information of the user, wherein the medical image information is selected by the user and used for consultation;
the characteristic information Q of the symptoms of the sample library comprises morbidity corresponding to the age bracket of the symptoms, morbidity corresponding to the gender and a medical image information set of the past diagnosis;
the characteristic information H comprises morbidity corresponding to the age of the user, morbidity corresponding to the sex of the user, whether the user has the disease history, and the similarity between the medical image of the user and a medical image information set of the symptom which is diagnosed in the past;
the method for calculating the similarity between the medical image of the user and the medical image information set of the symptom which is confirmed in the past comprises the steps of,
calculating the similarity of the medical image and each medical image in the medical image set of the symptom through OpenCV, and taking the maximum value as the similarity of the medical image of the user and the medical image information set of the symptom, wherein the medical image set is selected by a doctor according to the typical medical image of the symptom;
in the step 102, the method for constructing the random forest model includes the following steps:
step 501, using the bootstrap method, randomly drawing M new bootstrap sample sets from the sample library symptoms, and constructing M classification and regression trees from these bootstrap sample sets;
step 502, letting n be the number of features in the characteristic information P, randomly extracting m features at each node of each tree, where m ≤ n, and selecting, by calculating information gain, the feature with the greatest classification ability among the m features to split the node;
step 503, growing each tree to its maximum extent without pruning;
step 504, combining the M generated trees into a random forest to produce the random forest model, wherein the random forest model matches user symptoms by voting: if the ratio of the number of trees that vote for a successful match to M is not lower than a threshold N, the user is matched with the symptom;
the method for calculating the information gain in step 502 is to calculate through an ID3 algorithm, and specifically includes the following steps:
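if a sub-feature p of the characteristic information P divides the sample library symptom set T into j subsets $T_1, T_2, \ldots, T_j$, the information gain of the sub-feature p, in the standard ID3 form, is
$$\mathrm{Gain}(p) = \mathrm{Info}(T) - \sum_{i=1}^{j} \frac{|T_i|}{M}\,\mathrm{Info}(T_i), \qquad \mathrm{Info}(S) = -\sum_{k=1}^{s} \frac{\mathrm{freq}(C_k, S)}{|S|}\,\log_2 \frac{\mathrm{freq}(C_k, S)}{|S|}$$
where $M = |T|$ is the number of samples in the symptom set T, the same as M in step 501, $|T_i|$ is the number of samples belonging to the subset $T_i$, $\mathrm{freq}(C_k, T)$ is the frequency (count) of samples in T that belong to class $C_k$, and s is the number of classes of samples in T;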
the voting in step 504 is performed in the following manner:
defining C as the label to be pushed, then
$$C = \arg\max_{C}\ \frac{1}{M}\sum_{i=1}^{M} I\!\left(\frac{n_{h_i,C}}{n_{h_i}}\right)$$
where M is the number of trees, $I(\cdot)$ is the indicator function, $n_{h_i,C}$ is the classification result of the single tree $h_i$ for class C, and $n_{h_i}$ is the number of leaf nodes of tree $h_i$; if the weight of C is greater than the threshold H, the single tree $h_i$ is in favor of matching the user with the symptom.
2. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the steps of a random forest algorithm based medical image processing method according to claim 1.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010009003.7A CN111192662B (en) 2020-01-06 2020-01-06 Medical image processing method and storage medium based on random forest algorithm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010009003.7A CN111192662B (en) 2020-01-06 2020-01-06 Medical image processing method and storage medium based on random forest algorithm

Publications (2)

Publication Number Publication Date
CN111192662A (en) 2020-05-22
CN111192662B (en) 2023-08-22

Family

ID=70710693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010009003.7A Active CN111192662B (en) 2020-01-06 2020-01-06 Medical image processing method and storage medium based on random forest algorithm

Country Status (1)

Country Link
CN (1) CN111192662B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753790B (en) * 2020-07-01 2023-12-12 武汉楚精灵医疗科技有限公司 Video classification method based on random forest algorithm

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442746A (en) * 2019-07-01 2019-11-12 佛山科学技术学院 A kind of intelligent music method for pushing and storage medium based on random forests algorithm

Also Published As

Publication number Publication date
CN111192662A (en) 2020-05-22

Similar Documents

Publication Publication Date Title
US9165116B2 (en) Patient data mining
CN110929016A (en) Intelligent question and answer method and device based on knowledge graph
US20090299977A1 (en) Method for Automatic Labeling of Unstructured Data Fragments From Electronic Medical Records
JP2014505950A (en) Imaging protocol updates and / or recommenders
US10902351B1 (en) Methods and systems for using artificial intelligence to analyze user activity data
US11775053B2 (en) Methods and systems for using artificial intelligence to analyze user activity data
US20210407638A1 (en) Methods and systems for an apparatus for an emotional pattern matching system
CN111710429A (en) Information pushing method and device, computer equipment and storage medium
EP4042377A1 (en) Synthetic generation of clinical skin images in pathology
CN116303981B (en) Agricultural community knowledge question-answering method, device and storage medium
WO2020222908A1 (en) Methods and systems for classification using expert data
CN112837772A (en) Pre-inquiry case history generation method and device
Mao et al. Automated identification of chicken distress vocalizations using deep learning models
CN111192662B (en) Medical image processing method and storage medium based on random forest algorithm
Sudeshna et al. Identifying symptoms and treatment for heart disease from biomedical literature using text data mining
CN112381598A (en) Product service information pushing method and device
CN110199354B (en) Biological system information retrieval system and method
CN117292783A (en) Medical image report generating system
Hsu Embedded grey relation theory in Hopfield neural network: application to motor imagery EEG recognition
CN116469534A (en) Hospital number calling management system and method thereof
CN110717057A (en) Digital pathology full-section image retrieval method
CN112116976A (en) Method and device for processing medicine information and computer readable storage medium
US20230032180A1 (en) Method and system for empowering cancer patient(s)
CN111667029B (en) Clustering method, device, equipment and storage medium
CN114218378A (en) Content pushing method, device, equipment and medium based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant