CN109448808B - Abnormal prescription screening method based on multi-view theme modeling technology - Google Patents

Info

Publication number
CN109448808B
Authority
CN
China
Prior art keywords
data
prescription
feature
topic
medication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810992868.2A
Other languages
Chinese (zh)
Other versions
CN109448808A (en)
Inventor
赵俊峰
詹思延
谢冰
卓琳
唐爽
刘少钦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201810992868.2A
Publication of CN109448808A
Application granted
Publication of CN109448808B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/10 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to drugs or medications, e.g. for ensuring correct administration to patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/40 ICT specially adapted for the handling or processing of medical references relating to drugs, e.g. their side effects or intended usage

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Primary Health Care (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • Medicinal Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Toxicology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses an abnormal prescription screening method based on a multi-view topic modeling technology, comprising the following steps: 1) organizing data from the medical system into prescription data, wherein each prescription record includes diagnosis features and medication features; 2) inputting the prescription data into an MV-LDA model for training, wherein the MV-LDA model comprises K topics and each topic comprises a diagnosis feature view and a medication feature view; the diagnosis feature view of topic k consists of a diagnosis feature set and a probability value for each diagnosis feature in the set, and the medication feature view consists of a medication feature set and a probability value for each medication feature in the set; 3) using the trained MV-LDA model to perform inference on the prescription data to be identified, obtaining a topic distribution based on the diagnosis features and a topic distribution based on the medication features; the similarity of the two topic distributions is then calculated to judge whether the prescription to be identified is an abnormal prescription.

Description

Abnormal prescription screening method based on multi-view theme modeling technology
Technical Field
The invention belongs to the field of medical information processing, and relates to an abnormal prescription screening method based on a multi-view theme modeling technology.
Background
Anomaly detection algorithms in the medical field can be divided into supervised and unsupervised categories. Supervised methods typically apply machine learning to manually labeled medical data. For example, Kumar et al. used a supervised SVM method to detect recording errors in medical claim data, working on a high-quality data set labeled with sufficient abnormal instances (Kumar M, Ghani R, Mei Z S. Data mining to predict and prevent errors in health insurance claims processing: ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, July 2010 [C]). K. Heller et al. (Chandola V, Banerjee A, Kumar V. Anomaly detection: A survey [M]. ACM, 2009) assume that all instances belong to a single category, use an SVM to learn a boundary from the data set, and regard any instance that falls on the wrong side of the boundary as abnormal. However, because the high-quality labeled data sets required for supervised learning are very difficult to obtain, researchers have also proposed a series of unsupervised anomaly detection methods. Unsupervised methods typically abstract each instance as a point in a high-dimensional space and treat data points that lie far from the other points as outliers. For example, Yamanishi et al. used an unsupervised PAD method based on a probabilistic generative model to detect abnormalities in pathological data (Yamanishi K, Takeuchi J I, Williams G, et al. On-line unsupervised outlier detection using finite mixtures with discounting learning algorithms [J]. Data Mining and Knowledge Discovery, 2004, 8(3): 275-300), and M. M. Breunig et al. proposed the density-based LOF method (Breunig M. LOF: identifying density-based local outliers: ACM SIGMOD International Conference on Management of Data, May 16-18, 2000, Dallas, Texas, USA [C]).
However, in the medical field such outliers are not necessarily abnormal data. There are many rare diseases, and apart from a few common diseases the incidence of most diseases is very low, so outlier detection methods cannot handle this situation. What should be detected is instances whose features do not match one another, rather than data that is merely rare. Contextual anomaly detection (CAD) is an unsupervised method that detects anomalies using the relationship between two classes of features. CAD divides the features into context features, denoted y, and indication features, denoted x, and, assuming that most data is normal, learns a mapping function from x to y, y = f(x). If the two classes of features of a piece of test data do not conform to y = f(x), the test data is considered abnormal. CAD methods have also been applied in medicine. For example, J. Hu et al. fit a regression model between an indicative attribute and a set of context features, and then use the residuals of the test cases to identify outliers, that is, abnormal medication cases in medical records (Hu J, Wang F, Sun J, et al. A Healthcare Utilization Analysis Framework for Hot Spotting and Contextual Anomaly Detection [J]. AMIA). However, because medical data is high-dimensional and sparse, CAD methods do not work well in the medical field, and they can only detect mismatches between two types of features.
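The contextual anomaly detection idea described above can be illustrated with a minimal sketch. It assumes a single numeric indication feature x, a single numeric context feature y, and a least-squares line standing in for the mapping f. The function names, the toy data, and the tolerance are hypothetical and are not taken from the cited work.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b (a stand-in for the mapping f)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def contextual_outliers(xs, ys, tol):
    """Return the indices whose residual |y - f(x)| exceeds tol."""
    a, b = fit_line(xs, ys)
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if abs(y - (a * x + b)) > tol]

# Toy data: y roughly follows 2 * x, except for the fifth point.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.0, 4.1, 5.9, 8.0, 30.0, 12.1]
print(contextual_outliers(xs, ys, tol=10.0))
```

The flagged point is not rare in x; it is flagged because its y does not match what the other points suggest for that x, which is the kind of mismatch the text says should be detected.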
Disclosure of Invention
The invention provides an abnormal prescription detection method based on a multi-view topic model (MV-LDA). A topic model is a statistical model that describes the composition of unstructured text; in machine learning it is used to mine latent features, called topics, from a collection of texts. A conventional topic model relies on the bag-of-words assumption and treats all words as being of the same type, whereas the diagnoses and medications in a prescription are of two different types. The invention therefore proposes a multi-view topic model; the training process of the model and the inference process for data are explained below.
The technical scheme of the invention is as follows:
An abnormal prescription screening method based on a multi-view topic modeling technology comprises the following steps:
1) organizing data from the medical system into standardized prescription data, wherein each prescription record comprises the diagnosis features and the medication features of the prescription;
2) inputting the prescription data into an MV-LDA model, and training the MV-LDA model; the MV-LDA model comprises K topics, and each topic comprises a diagnosis feature view and a medication feature view; the diagnosis feature view of topic k consists of a diagnosis feature set and a probability value corresponding to each diagnosis feature in the set, and correspondingly, the medication feature view consists of a medication feature set and a probability value corresponding to each medication feature in the set;
3) for prescription data to be identified, performing inference on the prescription data with the trained MV-LDA model to obtain a topic distribution based on its diagnosis features and a topic distribution based on its medication features; then calculating the similarity of the two topic distributions, and if the similarity is lower than a set threshold, judging that the prescription data to be identified is an abnormal prescription.
Furthermore, the MV-LDA model is solved using Gibbs sampling, and the parameters of the MV-LDA model are computed to obtain the trained MV-LDA model.
Further, the method for solving the MV-LDA model using Gibbs sampling is as follows: for prescription data m, the class A features in the prescription data m are sampled, and the probability that a feature $x_a$ among the class A features is assigned topic k is:

$$p(z = k \mid z_{-i}, x_A) \propto \frac{C^{V_A K}_{x_a k} + \beta_A}{\sum_{x'} C^{V_A K}_{x' k} + V_A \beta_A} \cdot \frac{C^{MK}_{mk} + \alpha}{\sum_{k'} C^{MK}_{mk'} + K\alpha}$$

where C represents a count matrix; $V_A$ is the number of distinct class A features; $x_A$ denotes the class A features; $C^{V_A K}_{x_a k}$ is the count of $x_a$ assigned to topic k over all prescription data of the training data set, K being the number of topics and k the k-th of the K topics; $\sum_{x'} C^{V_A K}_{x' k}$ is the total count of any class A feature assigned to topic k; $\beta_A$ is a Dirichlet prior; z is the topic assigned to the given feature $x_a$, and $z_{-i}$ denotes the topics assigned to the remaining features; $C^{MK}_{mk}$ is the number of features in the prescription data m that are assigned topic k; $\sum_{k'} C^{MK}_{mk'}$ is the number of all features in the prescription data m; M is the total number of prescription data in the training data set; and α is a Dirichlet prior. The class A features are the diagnosis features or the medication features. The parameter values of the MV-LDA model are then obtained from the topics k assigned to the given $x_a$.
Further, the topic-feature distribution of the class A features is

$$\phi^{A}_{k,x} = \frac{C^{V_A K}_{x k} + \beta_A}{\sum_{x'} C^{V_A K}_{x' k} + V_A \beta_A}$$

where $\phi^{A}_{k,x}$ is the value of the topic-feature distribution of the class A features when the topic is k and the feature is x.
Further, the similarity is calculated using KL divergence, Euclidean distance, cosine similarity, Pearson correlation, or the vector dot product.
The invention uses MV-LDA to model prescriptions. Using the abstract notion of a topic as an intermediate layer, it reduces the two types of features, diagnosis and medication, from the high-dimensional word space to a low-dimensional topic space, in which the two types of features are related through topics. A topic is a group of semantically related words together with their corresponding probabilities; it describes the central idea of the corpus and is highly interpretable.
For a prescription data set, the steps for abnormal prescription detection using the present method are as follows:
1) Data preprocessing: the data from the medical system is organized into standardized prescription data, wherein each prescription record comprises the diagnosis features and the medication features of the prescription.
2) Solving the MV-LDA model: the organized prescription data is input into the model, and the model is trained according to the model training method given in step 2) of the detailed description to obtain the trained MV-LDA model.
3) Inference: using the data inference method given in step 3) of the detailed description and the model obtained in step 2), inference is performed on each instance separately with its diagnosis features and with its medication features, yielding two instance-topic distributions; the specific inference method is described under model inference in the technical scheme.
4) The similarity of the two instance-topic distributions obtained in step 3) is calculated as the degree of normality of the prescription; the lower the similarity, the more abnormal the prescription. A threshold is then set, and if the similarity is lower than the threshold, the prescription is judged to be an abnormal prescription.
Compared with the prior art, the invention has the following advantages:
Based on the MV-LDA model, the invention can detect abnormal prescriptions among a large number of prescriptions. In the experiment, 97% of the 40 prescriptions ranked as most abnormal were confirmed to be abnormal on expert review. Compared with other methods, the method has higher detection accuracy, and its detection performance remains stable when the data is extremely sparse. The method can be used not only to detect abnormal prescriptions but also to detect abnormal matching relationships among other features, so compared with anomaly detection algorithms in other medical fields it has a wider range of application and better extensibility. MV-LDA can be extended to any number of views and captures the correspondence among the views, so the method can conveniently detect abnormal matching among multiple kinds of features.
Drawings
FIG. 1 is a diagram of the MV-LDA probabilistic graphical model;
FIG. 2 is an example of an MV-LDA topic;
FIG. 3 is a flowchart of the steps for abnormal prescription detection.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below, without limiting the present invention thereto.
The method completes the detection of abnormal prescriptions in four steps: data initialization, model training, data inference, and outlier calculation. The four steps are described in detail as follows:
1) Data initialization:
Prescription data is typically stored as structured data, for example in a relational database.
Before MV-LDA can be used to extract the correspondence between diagnosis and medication in prescriptions, the data must be converted into a suitable format. For the diagnosis features, all diagnoses in each record are grouped into a diagnosis feature set. The medication information includes the drug code and the corresponding cost; the cost is treated as the dosage of the drug and used as the word frequency in the medication feature set.
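As an illustration of this grouping step, the sketch below turns flat rows, such as a relational query might return, into one diagnosis set and one drug-to-cost map per record. The row layout, the field names, and the codes are hypothetical; the patent does not specify a schema.

```python
from collections import defaultdict

# Hypothetical flat rows: (record_id, kind, code, cost).
# The cost is only meaningful for drug rows.
rows = [
    ("r1", "diagnosis", "K02", None),
    ("r1", "drug", "D001", 12.0),
    ("r1", "drug", "D002", 30.0),
    ("r2", "diagnosis", "J06", None),
    ("r2", "diagnosis", "K02", None),
    ("r2", "drug", "D001", 6.0),
]

def build_prescriptions(rows):
    """Group rows by record into a diagnosis list and a drug -> total cost map."""
    diagnoses = defaultdict(set)
    drug_costs = defaultdict(lambda: defaultdict(float))
    for record_id, kind, code, cost in rows:
        if kind == "diagnosis":
            diagnoses[record_id].add(code)
        elif kind == "drug":
            drug_costs[record_id][code] += cost
    record_ids = sorted(set(diagnoses) | set(drug_costs))
    return {
        rid: {"diagnoses": sorted(diagnoses[rid]),
              "drug_costs": dict(drug_costs[rid])}
        for rid in record_ids
    }

prescriptions = build_prescriptions(rows)
print(prescriptions["r2"])
```

Each resulting record carries the two feature sets that the model treats as two views of the same prescription.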
For a drug m, let its cost in a given diagnostic record be $c_m$; the invention normalizes it to an integer value $n_m$ by the following formula:

$$n_m = \mathrm{Round}\left(\lambda \cdot \frac{c_m}{\mathrm{Median}(m)}\right)$$

In the above formula, Median(m) is the median of the cost of drug m in the data set, Round() is a rounding function, and λ is a multiplier factor that is set manually so that the drug count $n_m$ is not less than 1. After this transformation, the invention obtains the input prescription data set for training the MV-LDA model.
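A minimal sketch of this normalization follows, assuming it takes the form n_m = Round(λ · c_m / Median(m)) as the surrounding definitions suggest. The function name and the toy costs are hypothetical. Clamping the result to at least 1 is an added safeguard; the text instead relies on choosing λ large enough.

```python
from statistics import median

def normalize_costs(prescriptions, lam=1.0):
    """Map each drug cost c_m to an integer count n_m.

    prescriptions: list of {drug_code: cost} dicts, with positive costs.
    lam: the multiplier factor (lambda in the text).
    """
    costs_by_drug = {}
    for drugs in prescriptions:
        for code, cost in drugs.items():
            costs_by_drug.setdefault(code, []).append(cost)
    med = {code: median(costs) for code, costs in costs_by_drug.items()}
    # max(1, ...) is an assumption of this sketch, not part of the formula.
    return [
        {code: max(1, round(lam * cost / med[code])) for code, cost in drugs.items()}
        for drugs in prescriptions
    ]

prescriptions = [
    {"D001": 12.0, "D002": 30.0},
    {"D001": 6.0},
    {"D001": 24.0, "D002": 10.0},
]
counts = normalize_costs(prescriptions, lam=2.0)
print(counts)
```

Note that Python's built-in `round` rounds exact halves to the nearest even integer, which may differ from the rounding function intended in the text.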
2) Model training:
The multi-view topic model (MV-LDA) is used to model two types of features extracted from the training data, $x_A$ and $x_B$ (for example, diagnosis and medication). The multi-view topic model proposed for the present method is introduced first.
MV-LDA extends the LDA topic model to multiple feature views. In LDA, the features associated with each topic belong to a single category and are interchangeable, so an LDA topic can be regarded as containing only one view. For instances described by several kinds of features, each kind of feature can be regarded as describing the instance from a different view, and the views are linked through the instance they describe. Taking prescription data as an example, an MV-LDA model is obtained by using all the diagnosis information and all the medication information of the prescriptions as the two inputs of the model. The model consists of K abstract topics (K is a preset hyper-parameter); each topic contains the two types of features, diagnosis and medication, with their corresponding probability values, and the magnitude of a probability value indicates how well a feature matches the latent meaning of the topic. For example, if the latent meaning of a topic is "teeth", the diagnoses and medications related to teeth have a high probability under that topic. Unlike training several separate LDA models, the diagnosis and medication features are generated from the same instance-topic distribution, so one topic has two views, diagnosis and medication, and the high-probability features in both views are diagnoses or medications that match the latent meaning of the topic. If two LDA models were instead trained separately on the diagnosis information and the medication information, the two types of features would not be related.
The present invention models two types of features, A and B. Each instance (i.e., each prescription record) is regarded as being described jointly from the view of the class A features and the view of the class B features, hereinafter referred to as the class A view and the class B view. The probabilistic graphical model of MV-LDA is shown in FIG. 1.
As in the LDA topic model, α in the figure is the hyper-parameter of the topic distribution, β is the hyper-parameter of the feature distribution under each topic, and θ is the topic distribution of each instance. The difference is that, because different kinds of features are regarded as describing instances from different views, each topic is also described by multiple views, and the different views have different topic-feature distributions $\phi^A$ and $\phi^B$. The topic assignment variable z, the generated feature x, and the hyper-parameter β all differ between views, while the features x in different views are linked because they are generated from the same instance-topic distribution θ.
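The generative story implied by this description can be sketched as follows: one θ is drawn per instance and shared by both views, and each view then draws its own topic assignments and features from its own topic-feature distribution. This is an illustrative reading of the graphical model, not code from the patent; the function names and the toy distributions are hypothetical.

```python
import random

def dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_instance(phi_a, phi_b, n_a, n_b, alpha, rng):
    """Draw one instance: a shared theta, then features for views A and B."""
    k = len(phi_a)
    theta = dirichlet([alpha] * k, rng)

    def draw(phi, n):
        feats = []
        for _ in range(n):
            z = rng.choices(range(k), weights=theta)[0]              # topic from the shared theta
            x = rng.choices(range(len(phi[z])), weights=phi[z])[0]   # feature from that topic's view
            feats.append(x)
        return feats

    return theta, draw(phi_a, n_a), draw(phi_b, n_b)

# Two toy topics, each with a view over 4 class A and 4 class B features.
phi_a = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
phi_b = [[0.7, 0.3, 0.0, 0.0], [0.0, 0.0, 0.4, 0.6]]
theta, x_a, x_b = generate_instance(phi_a, phi_b, n_a=3, n_b=5,
                                    alpha=1.0, rng=random.Random(0))
print(theta, x_a, x_b)
```

Because both views sample their topics from the same θ, an instance dominated by one topic tends to show that topic's high-probability features in both views, which is the correspondence the method later checks.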
Given the hyper-parameters α and β, the instance-topic distributions of all instances and the topic-feature distributions under all views are the model parameters that the invention needs to obtain; they are denoted θ and φ in the graphical model. The solution of these parameters, that is, of the MV-LDA model, is described below.
The multi-view topic model is solved with Gibbs sampling, which computes the parameters of the model. In the solving process, a topic is first assigned at random to every feature; the topic of each feature of each instance is then resampled and updated according to the current state.
For the MV-LDA model with two types of features, suppose the class A features are being sampled. Given the state at the previous moment, the probability that a feature $x_a$ among the class A features of instance m is assigned topic k is:

$$p(z = k \mid z_{-i}, x_A) \propto \frac{C^{V_A K}_{x_a k} + \beta_A}{\sum_{x'} C^{V_A K}_{x' k} + V_A \beta_A} \cdot \frac{C^{MK}_{mk} + \alpha}{\sum_{k'} C^{MK}_{mk'} + K\alpha} \qquad (3\text{-}1)$$

where C denotes a count matrix. In the first factor, $V_A$ is the number of distinct class A features, $C^{V_A K}_{x_a k}$ is the count of $x_a$ assigned to topic k over all instances (K being the number of topics and k the k-th of the K topics), $\sum_{x'} C^{V_A K}_{x' k}$ is the total count of any class A feature assigned to topic k, and $\beta_A$ is a Dirichlet prior. On the left-hand side, z is the topic assigned to the given feature $x_a$ and $z_{-i}$ denotes the topics assigned to the remaining features. The first factor on the right-hand side,

$$\frac{C^{V_A K}_{x_a k} + \beta_A}{\sum_{x'} C^{V_A K}_{x' k} + V_A \beta_A},$$

is the proportion of the feature $x_a$ among all class A features assigned topic k, i.e. $\phi^{A}_{k,x_a}$.

Similarly, in the second factor K denotes the number of topics and M the number of instances, $C^{MK}_{mk}$ is the number of features in instance m (of both class A and class B) that are assigned topic k, $\sum_{k'} C^{MK}_{mk'}$ is the number of all features in instance m, and α is a Dirichlet prior. The second factor is the proportion of the features of instance m that are assigned topic k, i.e. $\theta_{m,k}$.

After topics have been assigned to all features, the topic-feature distribution under each topic is calculated as

$$\phi^{A}_{k,x} = \frac{C^{V_A K}_{x k} + \beta_A}{\sum_{x'} C^{V_A K}_{x' k} + V_A \beta_A}$$

(and analogously for the class B view), which yields the required MV-LDA model.
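The training procedure described above can be sketched in pure Python as a collapsed Gibbs sampler over two views. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `train_mv_lda`, the data layout (integer feature ids, repeated according to their counts), the default hyper-parameters, and the fixed iteration count are all hypothetical.

```python
import random

def train_mv_lda(docs, K, V, alpha=0.1, beta=(0.01, 0.01), iters=100, seed=0):
    """Collapsed Gibbs sampling for a two-view topic model (illustrative only).

    docs: list of (tokens_A, tokens_B), each a list of integer feature ids.
    V:    (V_A, V_B), the number of distinct features in each view.
    Returns phi, where phi[v][k][x] is the probability of feature x of
    view v under topic k.
    """
    rng = random.Random(seed)
    n_mk = [[0] * K for _ in docs]                              # topic counts per instance, both views
    n_kx = [[[0] * V[v] for _ in range(K)] for v in range(2)]   # feature counts per topic, per view
    n_k = [[0] * K for _ in range(2)]                           # topic totals per view

    def bump(m, v, k, x, d):
        n_mk[m][k] += d
        n_kx[v][k][x] += d
        n_k[v][k] += d

    # Random initial topic for every feature occurrence.
    z = [[[rng.randrange(K) for _ in view] for view in doc] for doc in docs]
    for m, doc in enumerate(docs):
        for v, view in enumerate(doc):
            for x, k in zip(view, z[m][v]):
                bump(m, v, k, x, +1)

    for _ in range(iters):
        for m, doc in enumerate(docs):
            for v, view in enumerate(doc):
                for i, x in enumerate(view):
                    bump(m, v, z[m][v][i], x, -1)               # drop the current assignment
                    # Unnormalized conditional; the instance-side denominator
                    # is the same for every k, so it is omitted.
                    weights = [
                        (n_kx[v][k][x] + beta[v]) / (n_k[v][k] + V[v] * beta[v])
                        * (n_mk[m][k] + alpha)
                        for k in range(K)
                    ]
                    z[m][v][i] = rng.choices(range(K), weights=weights)[0]
                    bump(m, v, z[m][v][i], x, +1)

    return [
        [[(n_kx[v][k][x] + beta[v]) / (n_k[v][k] + V[v] * beta[v])
          for x in range(V[v])] for k in range(K)]
        for v in range(2)
    ]

# Toy data: features 0-1 of both views co-occur, as do features 2-3.
docs = [([0, 1], [0, 0, 1]), ([0], [1, 0]), ([2, 3], [2, 3, 3]), ([3], [2])] * 5
phi = train_mv_lda(docs, K=2, V=(4, 4))
```

The instance-topic counts `n_mk` are shared by both views, while the topic-feature counts are kept per view; that shared count is what ties the diagnosis and medication views of a topic together in this sketch.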
The present invention requires that the data be converted into a format suitable for processing before the MV-LDA can be used to extract the correspondence between diagnosis and medication in the prescription. For diagnostic features, no processing may be done. The medicine information comprises medicine codes and corresponding expenses, and the expenses are regarded as the dosage of the medicines and are used as word frequencies in the characteristic set.
Each record in the processed prescription data set comprises a diagnosis feature set and a medication feature set. Using this data set as the training input of MV-LDA yields K topics that preserve the association between diagnosis and medication. Each topic contains two topic-feature distributions, corresponding to the multinomial distribution of the topic over medication features and its multinomial distribution over diagnosis features, respectively. The two types of features, diagnosis and medication, correspond to the two views of the topic, so that the association between them is preserved.
3) Data inference
Data inference means inferring the distribution of the test data over the topics learned during model training; the topic distribution based on the class A features and the topic distribution based on the class B features must be inferred separately.
During model inference, each view in MV-LDA can be regarded as a separate LDA model, and each type of feature can be used independently for inference. For example, once a model containing the two types of features A and B has been trained on a data set, the topic-feature distribution $\phi^{A}$ of the class A features can be used to perform inference on an instance containing only class A features and to estimate the instance-topic distribution of those features under the model. The inference formula is:

$$p(z = k \mid z_{-i}, x_A) \propto \phi^{A}_{k,x_a} \cdot \frac{C^{MK}_{mk} + \alpha}{\sum_{k'} C^{MK}_{mk'} + K\alpha}$$

where $\phi^{A}_{k,x}$ is the probability value of the topic-feature distribution of the class A features when the topic is k and the feature is x, and $C^{MK}_{mk}$ is the number of features in the instance that are assigned topic k. The factor on the right is analogous to the corresponding factor of equation (3-1) and represents the proportion of the features in the instance that are assigned topic k.
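Single-view inference with the topic-feature distribution held fixed can be sketched as follows. The function name `infer_theta`, the default α, and the iteration count are hypothetical; the point estimate returned at the end, (count + α) / (n + Kα), is a common choice for topic models and is an assumption here rather than something stated in the text.

```python
import random

def infer_theta(tokens, phi_view, alpha=0.1, iters=100, seed=0):
    """Estimate an instance-topic distribution from one view, with phi fixed.

    tokens:   list of integer feature ids from a single view.
    phi_view: phi_view[k][x], the trained topic-feature distribution of that view.
    """
    rng = random.Random(seed)
    K = len(phi_view)
    z = [rng.randrange(K) for _ in tokens]
    n_k = [0] * K
    for k in z:
        n_k[k] += 1
    for _ in range(iters):
        for i, x in enumerate(tokens):
            n_k[z[i]] -= 1                                   # drop the current assignment
            weights = [phi_view[k][x] * (n_k[k] + alpha) for k in range(K)]
            z[i] = rng.choices(range(K), weights=weights)[0]
            n_k[z[i]] += 1
    total = len(tokens) + K * alpha
    return [(n_k[k] + alpha) / total for k in range(K)]

# Toy view: topic 0 only emits features 0-1, topic 1 only emits features 2-3.
phi_a = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
theta_a = infer_theta([0, 1, 0], phi_a)
print(theta_a)
```

Running the same function once with the diagnosis tokens and once with the medication tokens of a prescription gives the two distributions that are compared in the next step.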
The inference process uses only the topic-feature distribution associated with the class A features. Inference can be performed from each view separately, yielding multiple instance-topic distributions. Because these distributions describe the same instance, they should be very close. Specifically, for a prescription that requires anomaly detection, the trained model is used to perform inference on the diagnoses and on the medications separately, yielding two instance-topic distributions for the prescription; the similarity of the two distributions is then compared to judge whether the prescription is normal.
In this step, the invention uses the diagnosis data and the medication data separately to infer two instance-topic distributions of the prescription, $\theta^{A}$ and $\theta^{B}$. Under the model assumptions, diagnosis and medication describe the same prescription from different views, and this correspondence is reflected in the topics. If the prescription is normal, the instance-topic distributions inferred from the different views should be similar; if the distributions differ greatly, the prescription is likely to be an abnormal prescription.
4) Outlier calculation
After the instance-topic distributions $\theta^{A}$ and $\theta^{B}$ have been inferred from the two types of features, diagnosis and medication, the components of $\theta^{A}$ and $\theta^{B}$ on each topic should be relatively close when the prescription is normal. The similarity between the two vectors can be calculated with various vector similarity measures, namely KL divergence (KL), Euclidean distance (EUC), cosine similarity (COS), Pearson correlation (PS), and the vector dot product (DOT). The outlier value is either the value of the measure or its inverse, depending on whether the chosen measure takes smaller or larger values for more similar vectors.
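The five measures named above can be written with the standard library as follows. The sketch turns each into an outlier score where larger means more abnormal: distance-like measures (KL, EUC) are used directly, and similarity-like measures (COS, PS, DOT) are negated. Negation is an assumption of this sketch, chosen because a reciprocal is undefined for zero or negative values; the smoothing constant `eps` and the function names are also hypothetical.

```python
import math

def kl_divergence(p, q, eps=1e-12):
    return sum(a * math.log((a + eps) / (b + eps)) for a, b in zip(p, q))

def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def dot(p, q):
    return sum(a * b for a, b in zip(p, q))

def cosine(p, q):
    return dot(p, q) / (math.sqrt(dot(p, p)) * math.sqrt(dot(q, q)))

def pearson(p, q):
    mp, mq = sum(p) / len(p), sum(q) / len(q)
    dp, dq = [a - mp for a in p], [b - mq for b in q]
    denom = math.sqrt(dot(dp, dp) * dot(dq, dq))
    return dot(dp, dq) / denom if denom > 0 else 0.0

def outlier_score(p, q, measure="cos"):
    """Larger return value means the two topic distributions disagree more."""
    if measure == "kl":
        return kl_divergence(p, q)
    if measure == "euc":
        return euclidean(p, q)
    similarity = {"cos": cosine, "ps": pearson, "dot": dot}[measure]
    return -similarity(p, q)

theta_diag = [0.90, 0.05, 0.05]   # topics inferred from diagnoses
theta_ok = [0.85, 0.10, 0.05]     # medications agree with the diagnoses
theta_bad = [0.05, 0.05, 0.90]    # medications point to a different topic
for name in ("kl", "euc", "cos", "ps", "dot"):
    print(name, outlier_score(theta_diag, theta_ok, name),
          outlier_score(theta_diag, theta_bad, name))
```

Note that KL divergence is asymmetric, so the order of the two distributions matters for that measure.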
Then, by setting a threshold, prescriptions whose outlier value is above the threshold can be flagged as abnormal for expert review.
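A minimal sketch of the flagging step, assuming each prescription already has an outlier value where larger means more abnormal. The prescription ids, the scores, and the threshold are hypothetical.

```python
def flag_abnormal(scores, threshold):
    """Return (prescription_id, score) pairs above the threshold, most abnormal first."""
    flagged = [(pid, s) for pid, s in scores.items() if s > threshold]
    return sorted(flagged, key=lambda item: item[1], reverse=True)

scores = {"rx1": 0.02, "rx2": 0.71, "rx3": 0.35}
flagged = flag_abnormal(scores, threshold=0.3)
print(flagged)
```

Sorting by score lets reviewers start with the most suspicious prescriptions when review capacity is limited.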
The above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and those skilled in the art can make modifications or equivalent substitutions to the technical solutions of the present invention without departing from the spirit and scope of the present invention, and the scope of the present invention should be determined by the claims.

Claims (5)

1. An abnormal prescription screening method based on a multi-view theme modeling technology comprises the following steps:
1) arranging the medical data into prescription data, wherein each prescription data comprises diagnosis characteristics and medication characteristics in a prescription;
2) inputting the prescription data into an MV-LDA model, and training the MV-LDA model; the MV-LDA model comprises K topics, and each topic comprises a diagnosis feature view and a medication feature view; the diagnosis feature view of topic k consists of a diagnosis feature set and a probability value corresponding to each diagnosis feature in the set, and correspondingly, the medication feature view consists of a medication feature set and a probability value corresponding to each medication feature in the set;
3) for prescription data to be identified, performing inference on the prescription data with the trained MV-LDA model to obtain a topic distribution based on its diagnosis features and a topic distribution based on its medication features; then calculating the similarity of the two topic distributions, and if the similarity is lower than a set threshold, judging that the prescription data to be identified is an abnormal prescription.
2. The method of claim 1, wherein the MV-LDA model is solved using Gibbs sampling, and the parameters of the MV-LDA model are computed to obtain the trained MV-LDA model.
3. The method of claim 2, wherein the MV-LDA model is solved using Gibbs sampling as follows: for prescription data m, the class A features in the prescription data m are sampled, and the probability that a feature $x_a$ among the class A features is assigned topic k is:

$$p(z = k \mid z_{-i}, x_A) \propto \frac{C^{V_A K}_{x_a k} + \beta_A}{\sum_{x'} C^{V_A K}_{x' k} + V_A \beta_A} \cdot \frac{C^{MK}_{mk} + \alpha}{\sum_{k'} C^{MK}_{mk'} + K\alpha}$$

where C represents a count matrix; $V_A$ is the number of distinct class A features; $x_A$ denotes the class A features; $C^{V_A K}_{x_a k}$ is the count of $x_a$ assigned to topic k over all prescription data of the training data set, K being the number of topics and k the k-th of the K topics; $\sum_{x'} C^{V_A K}_{x' k}$ is the total count of any class A feature assigned to topic k; $\beta_A$ is a Dirichlet prior; z is the topic assigned to the given feature $x_a$, and $z_{-i}$ denotes the topics assigned to the remaining features; $C^{MK}_{mk}$ is the number of features in the prescription data m that are assigned topic k; $\sum_{k'} C^{MK}_{mk'}$ is the number of all features in the prescription data m; M is the total number of prescription data in the training data set; and α is a Dirichlet prior; the class A features are the diagnosis features or the medication features; the parameter values of the MV-LDA model are then obtained from the topics k assigned to the given $x_a$.
4. The method of claim 3, wherein the topic-feature distribution of the class A features is

$$\phi^{A}_{k,x} = \frac{C^{V_A K}_{x k} + \beta_A}{\sum_{x'} C^{V_A K}_{x' k} + V_A \beta_A}$$

where $\phi^{A}_{k,x}$ is the value of the topic-feature distribution of the class A features when the topic is k and the feature is x.
5. The method according to claim 1, wherein the similarity is calculated using KL divergence, Euclidean distance, cosine similarity, Pearson correlation, or the vector dot product.
CN201810992868.2A 2018-08-29 2018-08-29 Abnormal prescription screening method based on multi-view theme modeling technology Active CN109448808B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810992868.2A CN109448808B (en) 2018-08-29 2018-08-29 Abnormal prescription screening method based on multi-view theme modeling technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810992868.2A CN109448808B (en) 2018-08-29 2018-08-29 Abnormal prescription screening method based on multi-view theme modeling technology

Publications (2)

Publication Number Publication Date
CN109448808A CN109448808A (en) 2019-03-08
CN109448808B CN109448808B (en) 2022-05-03

Family

ID=65530200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810992868.2A Active CN109448808B (en) 2018-08-29 2018-08-29 Abnormal prescription screening method based on multi-view theme modeling technology

Country Status (1)

Country Link
CN (1) CN109448808B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111145854B (en) * 2019-12-20 2022-06-24 Kunming University of Science and Technology Chest X-ray film diagnosis report abnormity detection method based on topic model
CN111951924A (en) * 2020-08-14 2020-11-17 Jiangsu Yunnao Data Technology Co., Ltd. Abnormal medication behavior detection method and system
CN114428836A (en) * 2021-12-30 2022-05-03 Shenyang Neusoft Intelligent Medical Technology Research Institute Co., Ltd. Information processing method and device, readable storage medium and electronic equipment

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8510257B2 (en) * 2010-10-19 2013-08-13 Xerox Corporation Collapsed gibbs sampler for sparse topic models and discrete matrix factorization

Non-Patent Citations (4)

Title
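MaLDA: Medication analysis based on LDA; Zhou Jing et al.; Computer Engineering and Applications; 20160915 (No. 18); full text *
Research on medical data based on the latent Dirichlet allocation model; Xu Zhuxiang et al.; Journal of Xiamen University (Natural Science); 20130528 (No. 03); full text *
Online auxiliary diagnosis and treatment system for traditional Chinese medicine based on a latent semantic model; Zhang Ying et al.; Journal of Computer Applications; 20170615; pp. 303-307 *
Visual organization method for Chinese medical information in electronic medical records; Xu Tianming et al.; Computer Systems & Applications; 20151115 (No. 11); full text *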

Also Published As

Publication number Publication date
CN109448808A (en) 2019-03-08

Similar Documents

Publication Publication Date Title
Ahmad et al. Survey of state-of-the-art mixed data clustering algorithms
CN109036577B (en) Diabetes complication analysis method and device
CN109448808B (en) Abnormal prescription screening method based on multi-view theme modeling technology
Doquire et al. Feature selection for interpatient supervised heart beat classification
US20220036137A1 (en) Method for detecting anomalies in a data set
US11379685B2 (en) Machine learning classification system
Arbet et al. Lessons and tips for designing a machine learning study using EHR data
Guo et al. Visual anomaly detection in event sequence data
Buza Fusion methods for time-series classification
US20230395196A1 (en) Method and system for quantifying cellular activity from high throughput sequencing data
WO2017017554A1 (en) Reliability measurement in data analysis of altered data sets
Chen et al. A novel information-theoretic approach for variable clustering and predictive modeling using dirichlet process mixtures
Zhang et al. Probabilistic-mismatch anomaly detection: do one’s medications match with the diagnoses
Vateekul et al. Tree-based approach to missing data imputation
Jaffar et al. Efficient deep learning models for predicting super-utilizers in smart hospitals
EP3142026A1 (en) Computer-implemented system and method for discovering heterogeneous communities with shared anomalous components
Huo et al. Sparse embedding for interpretable hospital admission prediction
Bacci et al. Two-Tier Latent Class IRT Models in R.
Masitha et al. Preparing Dual Data Normalization for KNN Classfication in Prediction of Heart Failure
Walker et al. Acquisition and validation of knowledge from data
Jacobson et al. A Machine Learning-Based Statistical Analysis of Predictors for Spinal Cord Stimulation Success
Antil et al. Health Care Data Mining using Intelligent Classification Techniques
Zhang Supporting the Understanding of Rare Disease Diagnostics with Questionnaire-Based Data Analysis and Computer-Aided Classifier Fusion
Thanigainathan USING ENSEMBLE CLUSTERING TO IDENTIFY PHENOTYPES OF DIABETES PATIENTS FOR EVALUATING DISEASE PROGRESSION
Singh Identification of Mental Disorder based on Changes in Personal Behaviour using Machine Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant