US20220319706A1 - A drgs automatic grouping method based on a convolutional neural network - Google Patents

A drgs automatic grouping method based on a convolutional neural network

Publication number
US20220319706A1
Authority
US
United States
Prior art keywords
neural network
data
convolutional neural
drgs
method based
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/627,622
Inventor
Jian Wu
Jintai Chen
Tingting Chen
Haochao Ying
Biwen Lei
Xuechen LIU
Qingyu SONG
Jiucheng Zhang
Xiaohong Jiang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU
Assigned to ZHEJIANG UNIVERSITY. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: CHEN, Jintai; CHEN, Tingting; JIANG, Xiaohong; LEI, Biwen; LIU, Xuechen; SONG, Qingyu; WU, Jian; YING, Haochao; ZHANG, Jiucheng
Publication of US20220319706A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/09 Supervised learning
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A DRGs automatic grouping method based on a convolutional neural network, including: collecting case data and grouping them according to major diagnostic broad categories and a core diagnosis-related grouping method; performing numerical coding on the data; constructing a shallow convolutional neural network model, using a k-means clustering method to cluster the feature vectors extracted by the convolutional network to obtain k category labels, and combining the category labels and a classifier to supervise the iterative training of the network; and, after the model is trained, applying it to data grouping. The method of the present disclosure avoids the disadvantages of manual feature selection and of additional data labeling when new grouping categories are added, and automatic learning-based grouping can be performed for data whose groupings are vague and difficult.

Description

    FIELD OF TECHNOLOGY
  • The present disclosure belongs to the field of computer medical technology, and in particular relates to a DRGs (Diagnosis Related Groups) automatic grouping method based on a convolutional neural network.
  • BACKGROUND
  • Due to the aging population and the development of new science and technology, the deficiencies of the health insurance payment systems have become apparent: the post-payment system tends to stimulate excessive medical services, while the prepayment system tends to lead providers to turn away severely ill patients and thereby reduce medical services. These deficiencies have caused total health costs to keep rising and the expenditures of the medical benefits fund to rise significantly, so that the medical benefits funds in many regions face the risk of a shortage.
  • DRGs (Diagnosis Related Groups) is a case-mix method that groups cases mainly on the principle of similar clinical courses and similar cost consumption. Payments are made and targeted treatments are performed according to the group a disease falls into, so as to avoid wasting medical resources. However, because economic development and the level of medical care are uneven, the population structure, health status and economic development level vary between regions, so it is necessary to establish a grouping system adapted to local characteristics and to adjust it according to the operation results.
  • Chinese patent document with publication number CN110289088A discloses a method and a system for intelligent big-data management based on DRGs, including: putting the inpatient case home page data of the yearly inpatient cases of a hospital in a region into a DRG grouper and grouping them according to DRG grouping principles (disease diagnosis, surgical operation, complications/comorbidities, age, severity, etc.) to obtain n DRG groups, together with the distribution of weights and case counts and the corresponding hospital days and costs for each DRG group; calculating the total weight of the inpatient cases in the hospital; calculating the case mix index (CMI) value = total weight in the hospital / total number of inpatient cases in the hospital; calculating the relative weight RWi of the ith DRG group, and analyzing the proportion of cases with relative weight RWi > 2 in the hospital to all cases in the hospital, where the average cost of cases in DRG group i represents the average cost of the ith DRG group.
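  • The CMI arithmetic described in this prior-art summary can be illustrated with a minimal Python sketch. It assumes that the total weight is the sum of each group's relative weight multiplied by its case count; the function names and the toy numbers are hypothetical and are not taken from CN110289088A.

```python
def case_mix_index(groups):
    """groups: list of (relative_weight, case_count) pairs, one per DRG group."""
    total_weight = sum(rw * n for rw, n in groups)  # assumed: weight summed over cases
    total_cases = sum(n for _, n in groups)
    if total_cases == 0:
        raise ValueError("no cases")
    return total_weight / total_cases


def high_weight_share(groups, threshold=2.0):
    """Proportion of cases that fall in groups whose relative weight exceeds threshold."""
    total_cases = sum(n for _, n in groups)
    high_cases = sum(n for rw, n in groups if rw > threshold)
    return high_cases / total_cases


# Hypothetical hospital with three DRG groups: (RW, number of cases).
toy_groups = [(0.5, 100), (1.0, 50), (2.5, 50)]
cmi = case_mix_index(toy_groups)        # 225 / 200
share = high_weight_share(toy_groups)   # 50 / 200
```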
  • Chinese patent document with publication number CN107463771A discloses a method and a system for grouping cases, including: obtaining case information, grouping the cases into corresponding basic groups according to the major diagnosis codes and operation codes in the case information, and obtaining basic group codes and basic group names; when the major diagnosis corresponding to the major diagnosis codes does not belong to the inpatient time impact type, or the basic group does not belong to a specific basic group, calculating the diagnostic complexity score corresponding to each diagnosis code based on the basic group code and each diagnosis code; calculating the disease complexity index corresponding to the case information based on the diagnostic complexity score corresponding to each diagnosis code; and dividing the case information into subgroups of the basic group based on the disease complexity index to obtain the diagnosis related groups code, diagnosis related groups name and diagnosis related groups relative weight, which completes the case grouping.
  • However, the grouping of certain disease categories may be controversial in each region, and conventional methods may yield different groupings. There is therefore an urgent need for a method that can synthesize various kinds of actual information to divide categories that are relatively difficult to group.
  • SUMMARY OF THE INVENTION
  • To solve the above-mentioned problems in the prior art, the present disclosure provides a DRGs automatic grouping method based on a convolutional neural network, which can synthesize the actual information of the data for automatic division of disease types.
  • A DRGs automatic grouping method based on a convolutional neural network, comprising the following steps:
  • (1) collecting case data and dividing cases according to major diagnostic broad categories and a core disease diagnosis related grouping method, and dividing the case data into their corresponding groups as a training data set;
  • (2) performing a numerical coding process on the case data in the training data set, and converting textual data into a corresponding numerical form;
  • (3) constructing a convolutional neural network model and performing iterative training on the model by using the data obtained from the step (2), during the training process, using a k-means clustering method to cluster feature vectors extracted from the convolutional neural network to obtain k category labels, combining the category labels and a classifier to supervise the convolutional neural network for iterative training; and
  • (4) after finishing training the model, performing numerical coding on the data to be divided and then inputting the data into the trained model for grouping.
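  • The control flow of steps (2) and (3) can be sketched as a loop that alternates between clustering the extracted features and fitting the classifier on the resulting labels. This is only a skeleton under stated assumptions: the four callables (`encode`, `extract`, `cluster`, `fit`) are hypothetical placeholders, and re-clustering once per round is one possible reading of "during the training process".

```python
def train_with_cluster_labels(records, encode, extract, cluster, fit, rounds=3):
    """Alternate between clustering features and supervised fitting.

    All four callables are hypothetical placeholders:
      encode(record)      -> list of floats in [0, 1]    (step 2)
      extract(x)          -> feature vector from the CNN (step 3)
      cluster(features)   -> list of integer labels      (k-means)
      fit(inputs, labels) -> None, updates the network   (classifier + loss)
    """
    inputs = [encode(r) for r in records]
    labels = []
    for _ in range(rounds):
        features = [extract(x) for x in inputs]  # features from the current network
        labels = cluster(features)               # pseudo-labels from clustering
        fit(inputs, labels)                      # supervise the network with them
    return labels


# Toy stand-ins so the skeleton runs end to end; none of these is a real model.
fit_calls = []
final_labels = train_with_cluster_labels(
    records=[1, 2, 9],
    encode=lambda r: [r / 10],
    extract=lambda x: x,
    cluster=lambda feats: [0 if f[0] < 0.5 else 1 for f in feats],
    fit=lambda xs, ys: fit_calls.append(list(ys)),
    rounds=2,
)
```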
  • The method of the present disclosure avoids the disadvantages of manual feature selection and of additional data labeling when new grouping categories are added, and automatic learning-based grouping can be performed for data whose groupings are vague and difficult.
  • Wherein in the step (2), when performing the numerical coding, the case data are quantified and uniformly converted to lie within a range of 0 to 1 with the following conversion formula:
  • $$s = \frac{V_c - V_{\min}}{V_{\max} - V_{\min}}$$
  • wherein Vc is the current value to be converted, and Vmin and Vmax are the minimum and maximum values of the serial numbers, respectively.
  • Because such data carry relatively little information compared with images, most popular network structures with relatively deep layers tend to overfit; therefore, in the step (3), a shallow convolutional neural network with 3 convolutional layers is used for feature extraction on the data.
  • Wherein in the step (3), the process of training the convolutional neural network model is as follows:
  • (3-1), performing feature extraction on the coded data by using the convolutional neural network.
  • The convolutional calculation formula used for extracting features is as follows:

  • $$f(x,y) * g(x,y) = \sum_{i=-m}^{m} \sum_{j=-n}^{n} g(i,j)\, f(x-i,\, y-j)$$
  • wherein f(x,y) is input data, g(x,y) is a convolution kernel function, and m and n are the convolution kernel length and width, respectively. The purpose of feature extraction is to synthesize different information of the data, and find the correlation between various information.
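  • The convolution formula above can be evaluated directly with nested loops. The sketch below is illustrative only: it reads the summation limits literally, so the kernel spans (2m+1) × (2n+1) entries centred on index (0, 0), and it assumes that input values outside the data are zero (a padding choice the text does not specify). The function name `conv2d` is hypothetical.

```python
def conv2d(f, g):
    """Discrete 2-D convolution of input f with kernel g.

    f: 2-D list of numbers.
    g: 2-D list with odd dimensions (2m+1) x (2n+1); g[m][n] is the kernel centre.
    Positions outside f are treated as zero (an assumption).
    """
    rows, cols = len(f), len(f[0])
    m, n = len(g) // 2, len(g[0]) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for x in range(rows):
        for y in range(cols):
            acc = 0.0
            for i in range(-m, m + 1):
                for j in range(-n, n + 1):
                    xi, yj = x - i, y - j
                    if 0 <= xi < rows and 0 <= yj < cols:
                        acc += g[i + m][j + n] * f[xi][yj]
            out[x][y] = acc
    return out


data = [[1, 2], [3, 4]]
identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]    # g(0, 0) = 1
shift_down = [[0, 0, 0], [0, 0, 0], [0, 1, 0]]  # g(1, 0) = 1, so out[x][y] = f[x-1][y]
same = conv2d(data, identity)
shifted = conv2d(data, shift_down)
```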
  • (3-2), introducing the feature vectors extracted by the convolutional neural network in the step (3-1) into a k-means clusterer to perform classification: calculating the distance between feature vectors by using the cosine distance and placing the closer ones in the same class cluster; measuring the distance between two class clusters by the distance between the two closest points, one taken from the members of each class cluster; finally taking the maximum distance between the class clusters as the optimum efficiency, and automatically selecting the corresponding k value according to the clustering efficiency;
  • The cosine distance is calculated by a formula as follows:
  • $$\cos\theta = \frac{a \cdot b}{\lVert a \rVert \times \lVert b \rVert}$$
  • wherein a and b are two different feature vectors.
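  • A minimal Python sketch of the two distance notions used in step (3-2) follows. The formula above yields cos θ, which is conventionally called cosine similarity; converting it to a distance as 1 − cos θ is an assumption made here, as are the function names. The inter-cluster measure takes the closest pair of members, one from each cluster.

```python
import math


def cosine_similarity(a, b):
    """cos(theta) between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def cosine_distance(a, b):
    """Assumed conversion: distance = 1 - cos(theta)."""
    return 1.0 - cosine_similarity(a, b)


def single_link_distance(cluster_a, cluster_b):
    """Distance between two clusters: the closest pair of members across them."""
    return min(cosine_distance(a, b) for a in cluster_a for b in cluster_b)


orthogonal = cosine_similarity([1, 0], [0, 1])
parallel = cosine_distance([1, 0], [2, 0])
gap = single_link_distance([[1, 0], [1, 1]], [[0, 1]])
```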
  • (3-3), using the k categories obtained in the step (3-2) as labels for the data, using a regression model and a loss metric function to measure learning efficiency of the network, so as to supervise neural network learning until the network model converges.
  • Since the data may be divided into multiple categories, the regression model uses the softmax method, which is suitable for multi-category problems, and is calculated as follows:
  • $$P(z)_j = \frac{e^{Z_j}}{\sum_{n=1}^{N} e^{Z_n}}, \quad j = 1, 2, \ldots, N$$
  • wherein Zj is the output of the jth neuron, N is the total quantity of categories, and P(z)j is the probability value of the jth category; the model outputs a probability value for each category, thereby outputting N probability values for N categories.
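  • The softmax calculation can be written in a few lines of Python. Subtracting the largest output before exponentiating is a common numerical-stability measure that is not mentioned in the text; it leaves the probabilities unchanged. The function name is illustrative.

```python
import math


def softmax(z):
    """Map N neuron outputs Z_1..Z_N to N probabilities that sum to 1."""
    shift = max(z)  # stability shift; cancels out in the ratio
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([2.0, 1.0, 0.1])
predicted = max(range(len(probs)), key=probs.__getitem__)  # most probable category
uniform = softmax([0.0, 0.0])
```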
  • The above-mentioned loss metric function is a cross entropy, and is calculated by a formula as follows:

  • $$L = \sum_{i=1}^{M} \left[ y_i \times \log \hat{y}_i + (1 - y_i) \log (1 - \hat{y}_i) \right]$$
  • wherein yi is the label of the ith category, ŷi is the probability value of being predicted as the ith category, and M is the quantity of samples.
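  • The loss can be sketched as below. Two points are assumptions rather than statements from the text: the leading minus sign, which is the conventional form of cross entropy and makes smaller values better, does not appear in the formula as printed; and the probabilities are clipped by a small `eps` so that log(0) never occurs.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross entropy between 0/1 labels and predicted probabilities.

    The negative sign is the conventional one (assumed here), so a better
    prediction gives a smaller loss.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total


good = cross_entropy([1, 0], [0.9, 0.1])
bad = cross_entropy([1, 0], [0.4, 0.6])
```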
  • Compared with the prior art, the present disclosure has the following beneficial effects:
  • The method of the present disclosure combines a convolutional neural network with a k-means clustering method. It takes advantage of the automatic feature extraction and automatic optimization of the convolutional neural network to extract the connections between various features, feeds the labels generated by the clustering method to the neural network classifier, and then supervises the training of the neural network, forming a method that automatically optimizes grouping efficiency. For situations where it is difficult to group using conventional grouping rules, the method can combine all the information of the actual data for grouping, and new data can be added to optimize the grouping efficiency without additional workload.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a schematic flow diagram of the DRGs automatic grouping method based on a convolutional neural network of the present disclosure.
  • DESCRIPTION OF THE EMBODIMENTS
  • The present disclosure is described in further detail below with reference to a drawing and an embodiment. It should be pointed out that the embodiment described below is intended to make the present disclosure easier to understand and does not limit it in any way.
  • As shown in FIG. 1, a DRGs automatic grouping method based on a convolutional neural network comprises the following steps:
  • S1, collecting case data and dividing the cases into their corresponding groups according to major diagnostic broad categories and core diagnosis-related grouping methods. In this embodiment, training is performed on the data of any one of the core diagnosis-related groups.
  • S2, performing coding on the data. The actual data are structured data described in text, which need to be coded into numerical form before being input into the convolutional network for learning; the data are quantified and uniformly limited to a range of 0 to 1.
  • S2-1, this implementation uses a 0/1 approach to code the presence or absence of a disease;
  • S2-2, for data with existing criteria such as department of consultation, blood type, surgery level and operation name, the categories are sorted and labeled with the ordinal numbers 0, 1, . . . , n, and the ordinal values are then converted to values in the range 0 to 1 by the following formula:
  • $$s = \frac{V_c - V_{\min}}{V_{\max} - V_{\min}}$$
  • wherein Vc is the current value to be converted, and Vmin and Vmax are the minimum and maximum values of the serial numbers, respectively.
  • Taking blood type as an example, the blood type column generally has 6 types of status, namely A, B, O, AB, unknown and unchecked, which can be assigned the serial numbers 1, 2, 3, 4, 5 and 0 respectively. With A corresponding to serial number 1 and B corresponding to serial number 2, the converted values are 0.2 and 0.4 respectively.
  • S2-3, for data in categories such as age and treatment, the formula in S2-2 is also applicable; the difference is that the minimum and maximum values are extracted from the data set to be trained.
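  • The three coding rules of S2 can be illustrated with a short Python sketch. The function names and the `BLOOD_TYPE_SERIAL` table are hypothetical; the serial numbers follow the blood type example (unchecked 0, A 1, B 2, O 3, AB 4, unknown 5). Returning 0.0 when the minimum and maximum coincide is an assumption made to avoid division by zero.

```python
def min_max(value, v_min, v_max):
    """s = (Vc - Vmin) / (Vmax - Vmin); returns 0.0 if the range is empty (assumed)."""
    if v_max == v_min:
        return 0.0
    return (value - v_min) / (v_max - v_min)


# Serial numbers from the blood type example (hypothetical table name).
BLOOD_TYPE_SERIAL = {"unchecked": 0, "A": 1, "B": 2, "O": 3, "AB": 4, "unknown": 5}


def encode_presence(has_disease):
    """S2-1: 0/1 coding for the absence or presence of a disease."""
    return 1.0 if has_disease else 0.0


def encode_category(value, serials):
    """S2-2: map a category to its serial number, then scale to [0, 1]."""
    numbers = list(serials.values())
    return min_max(serials[value], min(numbers), max(numbers))


def encode_numeric(value, column):
    """S2-3: scale using the minimum and maximum found in the training column."""
    return min_max(value, min(column), max(column))


a_code = encode_category("A", BLOOD_TYPE_SERIAL)
b_code = encode_category("B", BLOOD_TYPE_SERIAL)
age_code = encode_numeric(40, [20, 40, 60])  # toy ages, not real data
```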
  • S3, a convolutional neural network is constructed and iteratively trained on the data obtained from S2; k-means clustering is performed on the feature information output by the network to obtain k category labels, which are then combined with the classifier of the network to supervise the neural network training.
  • S3-1, because such data carry relatively little information compared with images, most popular network structures with relatively deep layers tend to overfit. The embodiment therefore chooses a network structure consisting of the first 3 layers of the residual block of ResNet, using a network composed of 1-dimensional convolutional kernels to perform feature extraction on the data. The convolutional approach can combine the information of the various types of data into features with better semantic information, and is calculated by the following formula:

  • $$f(x,y) * g(x,y) = \sum_{i=-m}^{m} \sum_{j=-n}^{n} g(i,j)\, f(x-i,\, y-j)$$
  • wherein f(x,y) is input data, g(x,y) is a convolution kernel function, and m and n are the convolution kernel length and width, respectively. The purpose of feature extraction is to synthesize different information of the data, and find the correlation between various information.
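  • As a rough illustration of S3-1, the sketch below chains three 1-dimensional convolutions and adds the input back through a skip connection, which is the defining feature of a residual block. It is a simplified reading, not the embodiment's network: it uses a single channel, zero padding, no batch normalization, and fixed kernels in place of learned weights. All names are hypothetical.

```python
def conv1d_same(x, kernel):
    """1-D convolution with zero padding; the kernel length must be odd."""
    half = len(kernel) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i in range(-half, half + 1):
            if 0 <= t - i < len(x):
                acc += kernel[i + half] * x[t - i]
        out.append(acc)
    return out


def relu(x):
    return [max(0.0, v) for v in x]


def residual_block(x, kernels):
    """Apply the kernels in sequence (ReLU between them), then add the input back."""
    h = x
    for idx, k in enumerate(kernels):
        h = conv1d_same(h, k)
        if idx < len(kernels) - 1:
            h = relu(h)
    return relu([a + b for a, b in zip(h, x)])  # skip connection


coded_row = [0.2, 0.4, 1.0]  # toy coded record
doubled = residual_block(coded_row, [[0, 1, 0]] * 3)    # identity kernels
unchanged = residual_block(coded_row, [[0, 0, 0]] * 3)  # zero kernels
```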
  • S3-2, the various feature information vectors output from the S3-1 are introduced into the k-means clustering method, and the distance between the various vectors is measured by using the cosine similarity method, and the clustering algorithm is optimized to classify the feature vectors into k categories.
  • Wherein the initial value of k for k-means is determined according to the grouping rules of the core disease diagnosis-related groups, for example, according to the grouping rules, the prior grouping disease and related operation grouping are initially divided into 9 groups, and when performing training on data of the group, initial value of k is set to be 9 tentatively. During the calculation of the clustering method, the k value is then adjusted by the clustering efficiency.
  • During the clustering training, using the principle of the distance between feature vectors to determine whether the feature vectors are the same class cluster, and if the distance between two feature vectors is small then they are the same class cluster, otherwise they are different class clusters. Measuring a distance between class clusters by a distance between the shortest two points between all members of a class cluster to all members of another class cluster, finally taking a maximum distance between the class clusters as optimum efficiency, and the cosine distance is used in the calculation to measure the distance between feature vectors, and is calculated by a formula as follows:
  • $$\cos\theta = \frac{a \cdot b}{\|a\| \times \|b\|}$$
  • wherein a and b are two different feature vectors.
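  • The clustering step S3-2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names (cosine_distance, kmeans_cosine, min_inter_cluster_distance, choose_k), the use of 1 − cos θ as the distance, the random initialization, the fixed iteration count, and the toy vectors are all hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """cos(theta) = (a . b) / (|a| * |b|); returns 0.0 for a zero vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Assumed distance: small when the vectors point the same way."""
    return 1.0 - cosine_similarity(a, b)


def kmeans_cosine(vectors, k, iterations=20, seed=0):
    """Plain k-means loop that assigns by cosine distance."""
    rng = random.Random(seed)
    centroids = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by cosine distance.
        labels = [
            min(range(k), key=lambda c: cosine_distance(v, centroids[c]))
            for v in vectors
        ]
        # Update step: mean of the members; empty clusters keep their centroid.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def min_inter_cluster_distance(vectors, labels):
    """Distance between the two closest points that lie in different clusters."""
    best = math.inf
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if labels[i] != labels[j]:
                best = min(best, cosine_distance(vectors[i], vectors[j]))
    return best


def choose_k(vectors, candidates):
    """Pick the candidate k whose clustering maximizes cluster separation."""
    return max(
        candidates,
        key=lambda k: min_inter_cluster_distance(vectors, kmeans_cosine(vectors, k)),
    )


# Toy feature vectors standing in for the CNN outputs (hypothetical values).
toy_vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
toy_labels = kmeans_cosine(toy_vectors, k=2)
```

  • In practice the candidate values of k would start from the number suggested by the grouping rules (9 in the example above) and be varied around it; the candidate range is a design choice that the text does not fix.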
  • S3-3, the k categories obtained in S3-2 are used as labels for the data, and a regression model and a loss metric function are used to measure the learning efficiency of the network.
  • Since the data may be divided into multiple categories, the present disclosure uses a softmax method, which is suitable for multi-category problems, and is calculated by a formula as follows:
  • $$P(z)_j = \frac{e^{Z_j}}{\sum_{n=1}^{N} e^{Z_n}}, \quad j = 1, 2, \ldots, N$$
  • wherein Zj is output of a jth neuron, N is a total quantity of categories, P(z)j is a probability value of the jth category. The model outputs a probability value for each category, thereby outputting N probability values for N categories.
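  • For illustration, the softmax formula can be computed as in the following Python sketch. The function name softmax and the sample outputs are assumptions; subtracting the maximum before exponentiating is a common numerical-stability measure that leaves the result unchanged and is not stated in the text.

```python
import math


def softmax(z):
    """Return P(z)_j = exp(Z_j) / sum_n exp(Z_n) for each of the N outputs."""
    shift = max(z)  # stability shift; cancels out in the ratio
    exps = [math.exp(v - shift) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical outputs of N = 3 neurons.
probabilities = softmax([2.0, 1.0, 0.1])
predicted_category = probabilities.index(max(probabilities))
```

  • The N returned values are non-negative and sum to 1, so the index of the largest value can be read as the predicted category.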
  • The above-mentioned loss metric function uses cross entropy, and is calculated by a formula as follows:

  • $$L = \sum_{i=1}^{M} y_i \times \log \hat{y}_i + (1 - y_i) \times \log (1 - \hat{y}_i)$$
  • wherein yi is the label of the ith category, ŷi is the probability value of being predicted as the ith category, and M is the quantity of samples. The network is trained iteratively in the direction of minimizing the loss metric function, so as to achieve the best classification efficiency for the network.
  • In the specific application, the data to be divided are numerically coded and then input into the classification model, which automatically assigns them to the corresponding groupings.
  • The above-mentioned embodiment provides a detailed description of the technical solutions and beneficial effects of the present disclosure. It should be understood that the above is only a specific embodiment of the present disclosure and is not intended to limit the present disclosure, and any modifications, additions and equivalent replacements made within the scope of the principles of the present disclosure shall be included in the scope of protection of the present disclosure.
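  • A minimal Python sketch of the cross entropy loss follows. It is an illustration under assumptions, not the disclosed implementation: the function name cross_entropy, the clamping constant eps, and the sample values are hypothetical. The sketch returns the negated sum, which is the conventional sign for a quantity that is minimized during training; that leading minus sign is an assumption relative to the formula as printed.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Negated sum of y * log(p) + (1 - y) * log(1 - p) over all terms.

    y_true: labels (0 or 1); y_pred: predicted probabilities.
    Probabilities are clamped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += y * math.log(p) + (1.0 - y) * math.log(1.0 - p)
    return -total  # assumed sign: lower values mean better predictions


# Hypothetical one-hot label and two candidate predictions.
label = [1, 0]
confident = cross_entropy(label, [0.9, 0.1])
uncertain = cross_entropy(label, [0.6, 0.4])
```

  • With this sign convention the more accurate prediction gives the smaller loss, which matches training in the direction of minimizing the loss metric function.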

Claims (8)

What is claimed is:
1. A DRGs (Diagnosis Related Groups) automatic grouping method based on a convolutional neural network, the method comprising:
(1) collecting case data and dividing cases according to major diagnostic broad categories and a core disease diagnosis related grouping method, and dividing the case data into their corresponding groups as a training data set;
(2) performing numerical coding process to the case data in the training data set, and converting textual data into a corresponding numerical form;
(3) constructing a convolutional neural network model and performing iterative training on the model by using the data obtained from the step (2), during the training process, using a k-means clustering method to cluster feature vectors extracted from the convolutional neural network to obtain k category labels, combining the category labels and a classifier to supervise the convolutional neural network for iterative training; and
(4) after finishing training the model, performing numerical coding on the data to be divided and then inputting the data into the trained model for grouping.
2. The DRGs automatic grouping method based on a convolutional neural network according to claim 1, wherein in the step (2), when performing the numerical coding, quantitating the case data and uniformly converting the case data to be within a range of 0 to 1 with the following conversion formula:
$$s = \frac{V_c - V_{\min}}{V_{\max} - V_{\min}}$$
wherein Vc is a current value to be calculated, and Vmin and Vmax are the minimum and maximum values in the serial number, respectively.
3. The DRGs automatic grouping method based on a convolutional neural network according to claim 1, wherein in the step (3), using a shallow convolutional neural network with 3 convolutional layers to perform feature extraction on the data.
4. The DRGs automatic grouping method based on a convolutional neural network according to claim 1, wherein in the step (3), the process of training the convolutional neural network model is as follows:
(3-1), performing feature extraction on the coded data by using the convolutional neural network;
(3-2), introducing the feature vectors extracted by the convolutional neural network in the step (3-1) into a k-means clusterer to perform classification, calculating a distance between two categories of vectors by using cosine distance, dividing the closer ones into a class cluster, measuring a distance between class clusters by a distance between the shortest two points between all members of a class cluster to all members of another class cluster, finally taking a maximum distance between the class clusters as optimum efficiency, and selecting a corresponding k value automatically according to clustering efficiency;
(3-3), using the k categories obtained in the step (3-2) as labels for the data, using a regression model and a loss metric function to measure learning efficiency of the network, so as to supervise neural network learning until the network model converges.
5. The DRGs automatic grouping method based on a convolutional neural network according to claim 4, wherein in the step (3-1), a convolutional calculation formula used for extracting features is as follows:
$$f(x,y) * g(x,y) = \sum_{i=-m}^{m} \sum_{j=-n}^{n} g(i,j) \cdot f(x-i,\, y-j)$$
wherein f(x,y) is input data, g(x,y) is a convolution kernel function, and m and n are the convolution kernel length and width, respectively.
6. The DRGs automatic grouping method based on a convolutional neural network according to claim 4, wherein in the step (3-2), the cosine distance is calculated by a formula as follows:
$$\cos\theta = \frac{a \cdot b}{\|a\| \times \|b\|}$$
wherein a and b are two different feature vectors.
7. The DRGs automatic grouping method based on a convolutional neural network according to claim 4, wherein in the step (3-3), the regression model is a softmax method, and is calculated by a formula as follows:
$$P(z)_j = \frac{e^{Z_j}}{\sum_{n=1}^{N} e^{Z_n}}, \quad j = 1, 2, \ldots, N$$
wherein Zj is output of a jth neuron, N is a total quantity of categories, P(z)j is a probability value of the jth category; the model outputs a probability value for each category, thereby outputting N probability values for N categories.
8. The DRGs automatic grouping method based on a convolutional neural network according to claim 4, wherein in the step (3-3), the loss metric function is a cross entropy, and is calculated by a formula as follows:
$$L = \sum_{i=1}^{M} y_i \times \log \hat{y}_i + (1 - y_i) \times \log (1 - \hat{y}_i)$$
wherein yi is the label of an ith category, ŷi is a probability value of being predicted as the ith category, and M is the quantity of samples.
US17/627,622 2019-12-18 2020-11-12 A drgs automatic grouping method based on a convolutional neural network Pending US20220319706A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
CN201911310269.9A CN111161814A (en) 2019-12-18 2019-12-18 DRGs automatic grouping method based on convolutional neural network
CN201911310269.9 2019-12-18
PCT/CN2020/128369 WO2021120934A1 (en) 2019-12-18 2020-11-12 Convolutional neural network-based method for automatically grouping drgs

Publications (1)

Publication Number Publication Date
US20220319706A1 true US20220319706A1 (en) 2022-10-06

Family

ID=70557684

Family Applications (1)

Application Number Title Priority Date Filing Date
US17/627,622 Pending US20220319706A1 (en) 2019-12-18 2020-11-12 A drgs automatic grouping method based on a convolutional neural network

Country Status (3)

Country Link
US (1) US20220319706A1 (en)
CN (1) CN111161814A (en)
WO (1) WO2021120934A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111161814A (en) * 2019-12-18 2020-05-15 浙江大学 DRGs automatic grouping method based on convolutional neural network
CN112885481A (en) * 2021-03-09 2021-06-01 联仁健康医疗大数据科技股份有限公司 Case grouping method, case grouping device, electronic equipment and storage medium
CN113109869A (en) * 2021-03-30 2021-07-13 成都理工大学 Automatic picking method for first arrival of shale ultrasonic test waveform
CN113729715A (en) * 2021-10-11 2021-12-03 山东大学 Parkinson's disease intelligent diagnosis system based on finger pressure
CN114637263B (en) * 2022-03-15 2024-01-12 中国石油大学(北京) Abnormal working condition real-time monitoring method, device, equipment and storage medium
CN114677071B (en) * 2022-05-31 2022-08-02 创智和宇信息技术股份有限公司 Probability analysis-based medical advice data quality control method and system and storage medium
CN116127402B (en) * 2022-09-08 2023-08-22 天津大学 DRG automatic grouping method and system integrating ICD hierarchical features
CN116150698B (en) * 2022-09-08 2023-08-22 天津大学 Automatic DRG grouping method and system based on semantic information fusion
CN115934661B (en) * 2023-03-02 2023-07-14 浪潮电子信息产业股份有限公司 Method and device for compressing graphic neural network, electronic equipment and storage medium
CN117093920B (en) * 2023-10-20 2024-01-23 四川互慧软件有限公司 User DRGs grouping method

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106874923A (en) * 2015-12-14 2017-06-20 阿里巴巴集团控股有限公司 A kind of genre classification of commodity determines method and device
CN106203330A (en) * 2016-07-08 2016-12-07 西安理工大学 A kind of vehicle classification method based on convolutional neural networks
CN109934719A (en) * 2017-12-18 2019-06-25 北京亚信数据有限公司 The detection method and detection device of medical insurance unlawful practice, medical insurance control charge system
CN109411082B (en) * 2018-11-08 2022-01-04 西华大学 Medical quality evaluation and treatment recommendation method
CN109817339B (en) * 2018-12-14 2023-07-04 平安医疗健康管理股份有限公司 Patient grouping method and device based on big data
CN109920501B (en) * 2019-01-24 2021-04-20 西安交通大学 Electronic medical record classification method and system based on convolutional neural network and active learning
CN110164519B (en) * 2019-05-06 2021-08-06 北京工业大学 Classification method for processing electronic medical record mixed data based on crowd-sourcing network
CN111161814A (en) * 2019-12-18 2020-05-15 浙江大学 DRGs automatic grouping method based on convolutional neural network

Also Published As

Publication number Publication date
CN111161814A (en) 2020-05-15
WO2021120934A1 (en) 2021-06-24


Legal Events

Date Code Title Description
AS Assignment

Owner name: ZHEJIANG UNIVERSITY, CHINA

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:WU, JIAN;CHEN, JINTAI;CHEN, TINGTING;AND OTHERS;REEL/FRAME:058665/0976

Effective date: 20211129

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION