CN112016302A - Recognition method and device for decomposing hospitalization behaviors, electronic equipment and storage medium - Google Patents

Recognition method and device for decomposing hospitalization behaviors, electronic equipment and storage medium

Info

Publication number
CN112016302A
Authority
CN
China
Prior art keywords
hospitalization
behavior
cost
information
item information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010768490.5A
Other languages
Chinese (zh)
Other versions
CN112016302B (en)
Inventor
董子坤
舒正
尹珊珊
朱波
田雅如
张骁雅
傅兆翔
艾馨
罗屿浪
王净
刘英杰
赵明
李璐璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Guoxin Health Industry Technology Co ltd
Original Assignee
Qingdao Guoxin Health Industry Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Guoxin Health Industry Technology Co ltd
Priority to CN202010768490.5A
Publication of CN112016302A
Application granted
Publication of CN112016302B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/211 Selection of the most significant subset of features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q20/00 Payment architectures, schemes or protocols
    • G06Q20/38 Payment protocols; Details thereof
    • G06Q20/40 Authorisation, e.g. identification of payer or payee, verification of customer or shop credentials; Review and approval of payers, e.g. check credit lines or negative lists
    • G06Q20/401 Transaction verification
    • G06Q20/4016 Transaction verification involving fraud or risk level assessment in transaction processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10 Services
    • G06Q50/26 Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Accounting & Taxation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Tourism & Hospitality (AREA)
  • Development Economics (AREA)
  • Educational Administration (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Computer Security & Cryptography (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Primary Health Care (AREA)
  • Probability & Statistics with Applications (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The embodiments of the invention provide an identification method and device for decomposed hospitalization behaviors, an electronic device, and a storage medium. The method comprises: acquiring information on a first hospitalization behavior and a second hospitalization behavior of a target patient; obtaining a cost feature vector of the first hospitalization behavior from the cost information of the first hospitalization behavior, and a cost feature vector of the second hospitalization behavior from the cost information of the second hospitalization behavior; and judging, according to the two cost feature vectors, whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposed hospitalization behaviors. The embodiments identify decomposed hospitalization behavior automatically, with higher execution efficiency and higher identification accuracy than the traditional manual method.

Description

Recognition method and device for decomposing hospitalization behaviors, electronic equipment and storage medium
Technical Field
The invention relates to the technical field of data processing, in particular to an identification method and device for decomposing hospitalization behaviors, electronic equipment and a storage medium.
Background
"Decomposed hospitalization" (also called split hospitalization) refers to a form of medical insurance fund fraud in which an insured patient who has not been fully cured is put through multiple discharge and readmission procedures in order to obtain more reimbursement.
As medical insurance cost control has developed, medical insurance departments have generally set maximum limits on the cost of a hospital stay. The part of a patient's hospitalization expenditure that exceeds the limit is not reimbursed and must be borne by the hospital. This practice largely prevents designated medical institutions from over-prescribing or using unnecessarily expensive drugs, but some hospitals are unwilling to bear the excess and avoid it by decomposing a hospitalization. Fraudulently drawing on the medical insurance fund in this way not only causes losses to the national medical insurance fund but also increases the burden on insured persons and worsens the doctor-patient relationship.
In the prior art, identification of decomposed hospitalization relies mainly on tips reported by the public and on simple rule-based screening. Both approaches yield suspicious leads with low coverage and low accuracy; the leads require a heavy workload of expert review, and few cases are finally confirmed. As a result, the medical insurance fund management department's work of finding and cracking down on such cases is inefficient, inaccurate, and costly.
Disclosure of Invention
Aiming at the problems in the prior art, the embodiment of the invention provides an identification method and device for decomposing hospitalization behaviors, electronic equipment and a storage medium.
The embodiment of the invention provides an identification method for decomposing hospitalization behaviors, which comprises the following steps:
acquiring information on a first hospitalization behavior and a second hospitalization behavior of a target patient; wherein the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, and the first hospitalization behavior occurs before the second hospitalization behavior; the information comprises cost information;
obtaining a cost characteristic vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtaining a cost characteristic vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior;
and judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors or not according to the cost characteristic vector of the first hospitalization behavior and the cost characteristic vector of the second hospitalization behavior.
In the above technical solution, the information includes time information;
accordingly, after the step of obtaining information on the first hospitalization behavior and the second hospitalization behavior of the target patient, the method further comprises:
and filtering out the first hospitalization behavior and the second hospitalization behavior which do not belong to the decomposed hospitalization behavior according to the time information of the first hospitalization behavior and the second hospitalization behavior.
In the above technical solution, filtering out, according to the time information of the first hospitalization behavior and the second hospitalization behavior, the first hospitalization behavior and the second hospitalization behavior that do not belong to decomposed hospitalization behavior includes:
calculating a time interval between the first hospitalization behavior and the second hospitalization behavior; when the time interval is greater than or equal to a preset time interval threshold, the first hospitalization behavior and the second hospitalization behavior do not belong to decomposed hospitalization behavior and are filtered out;
and/or,
when it is judged, according to the time information of the first hospitalization behavior and the second hospitalization behavior, that the two hospitalization behaviors overlap in time, the first hospitalization behavior and the second hospitalization behavior do not belong to decomposed hospitalization behavior and are filtered out.
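The two time-based filters can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `passes_time_filter`, the choice to measure the interval from the first discharge to the second admission, and the 15-day default threshold are all assumptions, since the text only speaks of a "preset time interval threshold".

```python
from datetime import date

def passes_time_filter(first_discharge, second_admission, max_gap_days=15):
    """Return True if the pair is still a candidate after the time filters.

    Assumptions (not from the source): the interval runs from the first
    discharge to the second admission, and the threshold is 15 days.
    """
    # Overlap filter: the second stay starts before the first one has ended.
    if second_admission < first_discharge:
        return False
    # Interval filter: a gap at or above the threshold is filtered out.
    gap_days = (second_admission - first_discharge).days
    return gap_days < max_gap_days
```

A pair that fails either check is dropped before any feature vectors are computed, which keeps the more expensive comparison limited to plausible candidates.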
In the above technical solution, the information includes medical institution information;
accordingly, after the step of obtaining information on the first hospitalization behavior and the second hospitalization behavior of the target patient, the method further comprises:
and filtering out the first hospitalization behavior and the second hospitalization behavior which do not belong to the decomposed hospitalization behavior according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior.
In the above technical solution, filtering out, according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior, the first hospitalization behavior and the second hospitalization behavior that do not belong to decomposed hospitalization behavior includes:
when it is judged, according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior, that the two hospitalization behaviors occurred in different medical institutions, the first hospitalization behavior and the second hospitalization behavior do not belong to decomposed hospitalization behavior and are filtered out.
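As a rough sketch of the institution-based filter, assuming each hospitalization record is a dictionary with a hypothetical `institution_id` key (the source does not specify a record layout):

```python
def keep_same_institution_pairs(pairs):
    """Keep only (first, second) stay pairs recorded at the same institution.

    `pairs` is a list of (first_stay, second_stay) tuples; each stay is a
    dict with an "institution_id" key (a hypothetical field name).
    """
    return [
        (first, second)
        for first, second in pairs
        if first["institution_id"] == second["institution_id"]
    ]
```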
In the above technical solution, obtaining the cost feature vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtaining the cost feature vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior, includes:
dividing all charge item information in the charge information of the first hospitalization behavior according to the category of charge items to obtain a plurality of classified first charge item information sets;
performing feature vectorization on each charging item information in the plurality of first charging item information sets to obtain a plurality of first cost feature sub-vectors corresponding to the plurality of first charging item information sets, wherein the plurality of first cost feature sub-vectors form a cost feature vector of the first hospitalization behavior;
all charging item information in the cost information of the second hospitalization behavior is divided according to the category of the charging item, so that a plurality of classified second charging item information sets are obtained;
and performing feature vectorization on each charging item information in the second charging item information sets to obtain a plurality of second cost feature sub-vectors corresponding to the second charging item information sets, wherein the second cost feature sub-vectors form a cost feature vector of the second hospitalization behavior.
In the above technical solution, the performing feature vectorization on each charge item information in the plurality of first charge item information sets includes:
calculating a TF-IDF value for each charging item information of the plurality of first charging item information sets;
taking the TF-IDF value obtained by calculation as a characteristic value corresponding to the charging item information;
and performing feature vectorization on each charging item information in the plurality of second charging item information sets, including:
calculating a TF-IDF value for each charging item information of the plurality of second charging item information sets;
and taking the TF-IDF value obtained by calculation as a characteristic value corresponding to the charging item information.
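The TF-IDF step might look like the sketch below. The source does not state which TF-IDF variant is used, so the plain form here (term frequency as a share of the set, inverse document frequency as `log(N / df)`) is an assumption, as is treating each hospitalization's charge item set for one category as a "document".

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Compute TF-IDF values for charge items.

    `documents` is a list of lists of charge item names; each inner list is
    the charge item set of one hospitalization for a single category.
    Returns one {item: tfidf} dict per document (absent items are implicitly 0).
    """
    n_docs = len(documents)
    # Document frequency: in how many sets each item appears at least once.
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            item: (count / total) * math.log(n_docs / doc_freq[item])
            for item, count in counts.items()
        })
    return vectors
```

With this variant, an item that appears in every set gets a value of 0, so routine charges contribute little and unusual items dominate the sub-vector.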
In the above technical solution, the performing feature vectorization on each charge item information in the plurality of first charge item information sets includes:
sorting the charging item information in any one first charging item information set according to a time sequence;
deleting repeated charging item information in the set from the any one first charging item information set after sorting;
inputting the any one first charging item information set which is sequenced and the repeated charging item information is deleted into a pre-trained BERT model in a sentence mode, wherein the BERT model outputs a first charge characteristic sub-vector corresponding to the any one first charging item information set;
and performing feature vectorization on each charging item information in the plurality of second charging item information sets, including:
sorting the charging item information in any one second charging item information set according to a time sequence;
deleting repeated charging item information in the set from the any one second charging item information set after sorting;
inputting the any one second charging item information set which is sequenced and the repeated charging item information is deleted into a pre-trained BERT model in a sentence mode, and outputting a second charge characteristic sub-vector corresponding to the any one second charging item information set by the BERT model.
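The preprocessing that precedes the BERT model (sort by time, remove repeats, form a sentence) can be sketched as below. The `(timestamp, item_name)` record layout, the space separator, and keeping the first occurrence of a repeated item are assumptions; the sketch stops at the sentence and does not load or call a BERT model.

```python
def build_sentence(charge_records, separator=" "):
    """Turn one category's charge records into a single input sentence.

    `charge_records` is an iterable of (timestamp, item_name) tuples
    (a hypothetical layout). Timestamps only need to be sortable.
    """
    # Sort by time; sorted() is stable, so ties keep their input order.
    ordered = sorted(charge_records, key=lambda record: record[0])
    seen = set()
    unique_items = []
    for _, item in ordered:
        # Keep only the earliest occurrence of each charge item.
        if item not in seen:
            seen.add(item)
            unique_items.append(item)
    return separator.join(unique_items)
```

The resulting string would then be passed to the pre-trained BERT encoder, whose output serves as the cost feature sub-vector for that category.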
In the above technical solution, judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposed hospitalization behaviors according to the cost feature vector of the first hospitalization behavior and the cost feature vector of the second hospitalization behavior includes:
for any first cost feature sub-vector in the cost feature vector of the first hospitalization behavior, selecting the corresponding second cost feature sub-vector from the cost feature vector of the second hospitalization behavior according to the category of the charging item, and calculating the similarity between that first cost feature sub-vector and the selected second cost feature sub-vector;
and judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposed hospitalization behaviors according to the calculated similarity.
In a second aspect, an embodiment of the present invention provides an identification device for decomposed hospitalization behaviors, including:
an information acquisition module, used for acquiring information on a first hospitalization behavior and a second hospitalization behavior of a target patient; wherein the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, and the first hospitalization behavior occurs before the second hospitalization behavior; the information comprises cost information;
the cost characteristic vector generation module is used for obtaining a cost characteristic vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior and obtaining a cost characteristic vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior;
and the judging module is used for judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors or not according to the cost characteristic vector of the first hospitalization behavior and the cost characteristic vector of the second hospitalization behavior.
In a third aspect, an embodiment of the present invention provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the computer program, it implements the steps of the identification method for decomposed hospitalization behaviors according to the embodiment of the first aspect of the present invention.
In a fourth aspect, an embodiment of the present invention provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the identification method for decomposed hospitalization behaviors according to the embodiment of the first aspect of the present invention.
According to the identification method, device, electronic device and storage medium for decomposed hospitalization behaviors provided by the embodiments of the invention, feature vectors are extracted from the cost information of two hospitalization behaviors, the similarity between the feature vectors is calculated, and whether the two hospitalization behaviors are suspected decomposed hospitalization behaviors is judged according to that similarity. Decomposed hospitalization behavior is thereby identified automatically, with higher execution efficiency and higher identification accuracy than the traditional manual method.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a flowchart of an identification method for decomposed hospitalization behaviors according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of an identification device for decomposed hospitalization behaviors according to another embodiment of the present invention;
Fig. 3 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 is a flowchart of an identification method for decomposed hospitalization behaviors according to an embodiment of the present invention. As shown in Fig. 1, the identification method for decomposed hospitalization behaviors according to the embodiment of the present invention includes:
Step 101, obtaining information on a first hospitalization behavior and a second hospitalization behavior of a target patient.
In an embodiment of the invention, the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, and the first hospitalization behavior occurs before the second hospitalization behavior.
The information of the hospitalization behavior of the patient includes at least cost information of the hospitalization behavior of the patient.
The cost of hospitalization of a patient refers to the medical-related costs incurred by the patient during hospitalization, such as: the cost of medications taken by the patient during the hospital stay, the bed cost of the patient during the hospital stay, the medical care cost of the patient during the hospital stay, the cost of consumables used by the patient during the hospital stay, and the like.
The cost information of a patient's hospitalization behavior is information related to these costs. Specifically, the cost information includes charge items and the amount of each charge item. For example, if the medical service charge for intravenous infusion is 50 yuan, then "medical service charge for intravenous infusion" is the charge item and 50 yuan is its amount.
The cost information of the hospitalization of the patient may be obtained from a database of a social security institution or a medical institution.
Step 102, obtaining a cost feature vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtaining a cost feature vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior.
The cost of a patient's hospitalization behavior refers to the medical-related costs incurred by the patient during hospitalization. As will be readily understood by those skilled in the art, a patient may incur many charge items during a hospital stay, such as the service fee for an intravenous infusion, the cost of the drugs used in the infusion, the bed fee, the cost of surgery, and the like. Because these charge items are so varied, in the embodiment of the present invention the charge items in the cost information of the patient's hospitalization behavior are first classified for ease of processing, and a corresponding feature vector is then calculated for each class of charge items.
Specifically, the method further comprises the following steps:
dividing all charge item information in the charge information of the first hospitalization behavior according to the category of charge items to obtain a plurality of classified first charge item information sets;
performing feature vectorization on each charging item information in the plurality of first charging item information sets to obtain a plurality of first cost feature sub-vectors corresponding to the plurality of first charging item information sets, wherein the plurality of first cost feature sub-vectors form a cost feature vector of the first hospitalization behavior;
all charging item information in the cost information of the second hospitalization behavior is divided according to the category of the charging item, so that a plurality of classified second charging item information sets are obtained;
and performing feature vectorization on each charging item information in the second charging item information sets to obtain a plurality of second cost feature sub-vectors corresponding to the second charging item information sets, wherein the second cost feature sub-vectors form a cost feature vector of the second hospitalization behavior.
For example, in one embodiment, the items charged may be divided into 8 broad categories, respectively: exam, drug, test, treatment, surgery, care, consumable, and others.
Then, according to the 8 preset categories, each charge item in the cost information of the patient's hospitalization behavior is mapped to its category to obtain the corresponding charge item information sets. For example, suppose the cost information for a patient's hospitalization behavior includes the following charge items: the service fee for intravenous infusion, the bed fee, the nursing fee for nursing staff, the cost of azithromycin, the cost of a CT examination, and the cost of a blood test. The cost of the CT examination can be mapped to the exam category, the cost of azithromycin to the drug category, the cost of the blood test to the test category, the intravenous infusion service fee and the nursing fee to the care category, and the bed fee to the other category. For this hospitalization, the treatment, surgery, and consumable categories have no corresponding charge items.
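The category mapping can be illustrated with a small lookup table. The table contents and the names `CATEGORIES`, `ITEM_TO_CATEGORY`, and `group_by_category` are hypothetical; in particular, assigning the blood item to the test category and sending unknown items to "other" are choices made for this sketch, not details confirmed by the source.

```python
# The 8 broad categories named in the text.
CATEGORIES = ("exam", "drug", "test", "treatment",
              "surgery", "care", "consumable", "other")

# Hypothetical lookup table; a real system would use a maintained catalogue.
ITEM_TO_CATEGORY = {
    "CT examination": "exam",
    "azithromycin": "drug",
    "blood test": "test",
    "intravenous infusion service": "care",
    "nursing care": "care",
    "bed": "other",
}

def group_by_category(charge_items):
    """Split a stay's charge items into one list per category.

    Repeated charges are kept, so an item billed daily appears several
    times in its category's list.
    """
    groups = {category: [] for category in CATEGORIES}
    for item in charge_items:
        # Unknown items fall back to "other" (an assumption of this sketch).
        groups[ITEM_TO_CATEGORY.get(item, "other")].append(item)
    return groups
```

Categories with no charges, such as surgery in the example above, simply end up with an empty list.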
It will be readily understood by those skilled in the art that some charge items may occur more than once during a patient's hospitalization. For example, a patient may receive an intravenous infusion every day of the stay, in which case the intravenous infusion service fee is charged many times. When the charge items are mapped by category, every occurrence of such a repeated charge item is mapped into the corresponding category.
After the charge items in the cost information of the patient's hospitalization behavior have been mapped by category into a plurality of charge item information sets, a corresponding feature vector can be extracted for each charge item information set. For example, if the costs incurred in a given hospitalization cover all 8 categories above, then once the charge items have been mapped to their categories, a feature vector extraction is performed once over all the charge items under each category (i.e., one charge item information set), and the cost feature sub-vectors corresponding to the 8 categories are finally obtained from the cost information of the hospitalization. The set of cost feature sub-vectors corresponding to the 8 categories is the cost feature vector corresponding to the cost information of that hospitalization.
The cost feature vector comprises a feature value for each charge item, and the feature value of a charge item reflects the amount of information, that is, the degree of importance, that the item carries within a hospitalization.
As an alternative implementation, the cost feature vector or cost feature sub-vector is expressed as follows: it comprises a plurality of feature items, each of which represents a theoretically possible charge item. For example, assuming that 3000 drugs are available to a patient during hospitalization, the cost feature sub-vector for the drug category has 3000 feature items. The feature value of a feature item depends on the costs the patient actually incurred during the hospitalization: if the patient used azithromycin during the stay, the feature value of the feature item corresponding to azithromycin is not 0; conversely, if the patient did not use penicillin during the stay, the feature value of the feature item corresponding to penicillin is 0.
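A minimal sketch of this fixed-length representation, assuming a hypothetical ordered `vocabulary` of all possible charge items in a category and a dictionary of already computed feature values:

```python
def to_dense_vector(vocabulary, feature_values):
    """Lay feature values out over a fixed, ordered vocabulary.

    `vocabulary` lists every theoretically possible charge item in one
    category; `feature_values` maps the items actually charged to their
    feature values. Items not charged get 0.0.
    """
    return [feature_values.get(item, 0.0) for item in vocabulary]
```

Because both hospitalizations use the same vocabulary order, position i refers to the same charge item in both sub-vectors, which is what makes them directly comparable.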
For feature items whose feature values are not 0, the specific value is determined by the feature vector extraction method adopted. In the embodiment of the present invention, a text feature extraction method may be used to determine the feature values of the feature items in the cost feature vector or cost feature sub-vector. For example, the TF-IDF text vectorization method can be adopted, or text feature vectorization can be carried out with the language model BERT (Bidirectional Encoder Representations from Transformers). The embodiment of the present invention does not limit the specific implementation of the text feature extraction method.
Step 103, judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors according to the cost characteristic vector of the first hospitalization behavior and the cost characteristic vector of the second hospitalization behavior.
In the previous step, corresponding cost feature vectors have been obtained for the first hospitalization and the second hospitalization, respectively. In this step, whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors is judged according to the cost feature vector.
Specifically, for any one first expense feature sub-vector in the expense feature vectors of the first hospitalization behavior, selecting a corresponding second expense feature sub-vector from the expense feature vectors of the second hospitalization behavior according to the category of the charging item, and calculating the similarity between the any one first expense feature sub-vector and the selected corresponding second expense feature sub-vector;
and judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors or not according to the calculated similarity.
A suspected decomposed hospitalization behavior is one in which the compared hospitalization behaviors are relatively likely to constitute decomposed hospitalization. An expert's judgment can subsequently be used to determine whether a suspected decomposed hospitalization behavior is a true decomposed hospitalization behavior. This further determination by an expert is outside the scope of the embodiments of the present invention and is therefore not described further herein.
According to the previous description, the cost feature vector of the first hospitalization behavior includes a plurality of first cost feature sub-vectors, and the cost feature vector of the second hospitalization behavior includes a plurality of second cost feature sub-vectors. In an embodiment of the present invention, a comparison may be made between the feature sub-vectors of corresponding categories of the two hospitalization behaviors. For example, the feature sub-vector of the drug category of the first hospitalization behavior is compared with the feature sub-vector of the drug category of the second hospitalization behavior. Comparing feature sub-vectors of the same category across different hospitalization behaviors helps to improve the accuracy of the comparison result.
When calculating the similarity between the feature sub-vectors, various calculation methods can be adopted, such as the cosine similarity method, the Manhattan distance method, and the Euclidean distance method; other calculation methods in the prior art can also be adopted.
Take the cosine similarity method as an example. Cosine similarity directly calculates the cosine of the angle between two vectors; the larger the value, the more similar the two vectors are.
The calculation formula is as follows:
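$$\cos\theta = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$
where $A$ and $B$ are the two feature sub-vectors being compared, $A_i$ and $B_i$ are their $i$-th feature values, and $n$ is their dimension;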
the larger the similarity value is, the smaller the included angle between the two vectors is, the closer the two vectors are in the vector space, namely the closer the two hospitalization behaviors are.
The calculated similarity value can be regarded as the similarity of the two hospitalizations of the patient in the corresponding category.
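A minimal sketch of the cosine similarity calculation for two feature sub-vectors of the same category is given below. Returning 0.0 when either sub-vector is all zeros (for example, when a hospitalization has no charging items in that category) is an assumption made for this sketch; the source does not specify that case.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature sub-vectors."""
    if len(a) != len(b):
        raise ValueError("sub-vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: an empty category is treated as "not similar".
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical drug-category sub-vectors of two hospitalizations.
first_drug = [0.4, 0.0, 0.2]
second_drug = [0.2, 0.0, 0.1]
print(cosine_similarity(first_drug, second_drug))
```

Because the second sub-vector is a scaled copy of the first, the printed value is 1 up to floating-point rounding, i.e., the two hospitalizations are as similar as possible in this category.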
After the characteristic sub-vectors of the cost information of the two hospitalization behaviors in each category are respectively compared, the similarity of the cost information in each category can be obtained. The set of these similarities is the similarity between the two hospitalizations.
According to the similarity between the two hospitalizations, whether the two hospitalizations are suspected to be decomposed hospitalizations can be judged.
In the specific determination, one implementation manner is to compare the similarity of the two compared hospitalization behaviors in each category with a preset similarity threshold in the corresponding category, and determine whether the two compared hospitalization behaviors are suspected decomposition hospitalization behaviors according to the comparison result.
For example, the calculation result of the similarity between the first hospitalization and the second hospitalization in the category of the drug is 0.6, and the preset similarity threshold value in the category of the drug is 0.55, so that the first hospitalization and the second hospitalization are similar in the category of the drug. Wherein, the preset similarity threshold value on the medicine category is obtained according to historical data. Similarly, a determination result of whether the first hospitalization behavior is similar to the second hospitalization behavior in the examination category, a determination result of whether the first hospitalization behavior is similar to the second hospitalization behavior in the treatment category, a determination result of whether the first hospitalization behavior is similar to the second hospitalization behavior in the operation category, a determination result of whether the first hospitalization behavior is similar to the second hospitalization behavior in the care category, a determination result of whether the first hospitalization behavior is similar to the second hospitalization behavior in the consumable category, and a determination result of whether the first hospitalization behavior is similar to the second hospitalization behavior in other categories may be obtained, respectively.
Note that the similarity threshold value in each category is obtained from historical data. The magnitude of the similarity threshold on different classes may be different.
After the judgment results for the various categories are obtained, they are combined to conclude whether the two hospitalization behaviors are suspected decomposed hospitalization behaviors. For example, in one embodiment, the preset rule requires that the two hospitalization behaviors be judged similar in all categories before they are concluded to be suspected decomposed hospitalization behaviors. In another embodiment, the preset rule requires only that the two hospitalization behaviors be judged similar in half or more of the categories.
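The per-category judgment and the two combination rules can be sketched as follows. The category names, similarity values, and thresholds are hypothetical, and treating a similarity equal to the threshold as "similar" is an assumption; the source only gives the example of 0.6 against a threshold of 0.55.

```python
from typing import Dict


def similar_by_category(sims: Dict[str, float], thresholds: Dict[str, float]) -> Dict[str, bool]:
    """Compare each category's similarity with that category's preset threshold."""
    return {cat: sims[cat] >= thresholds[cat] for cat in sims}


def is_suspected(flags: Dict[str, bool], min_fraction: float = 1.0) -> bool:
    """Combine per-category results.

    min_fraction=1.0 requires similarity in all categories;
    min_fraction=0.5 requires similarity in half or more of the categories.
    """
    return sum(flags.values()) >= min_fraction * len(flags)


# Hypothetical values; real thresholds would be derived from historical data.
sims = {"drug": 0.6, "examination": 0.3, "treatment": 0.7, "care": 0.8}
thresholds = {"drug": 0.55, "examination": 0.5, "treatment": 0.6, "care": 0.6}

flags = similar_by_category(sims, thresholds)
print(is_suspected(flags, min_fraction=1.0))  # False: examination is not similar
print(is_suspected(flags, min_fraction=0.5))  # True: 3 of 4 categories are similar
```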
Another implementation way of determining whether two hospitalizations are suspected to decompose hospitalization behaviors according to the similarity between the two hospitalization behaviors is as follows: and obtaining a comprehensive similarity according to the similarity of the compared two hospitalization behaviors in each category, comparing the comprehensive similarity with a preset comprehensive similarity threshold, and judging whether the compared two hospitalization behaviors are suspected decomposition hospitalization behaviors according to a comparison result.
Specifically, after the similarity of the two hospitalization behaviors in each category is obtained, a comprehensive similarity can be calculated according to the similarities. For example, the similarities in the respective categories are added to obtain the integrated similarity. For another example, in order to reflect the difference of importance degrees of different categories during similarity comparison, corresponding weight coefficients are set for the categories; then, the similarity of each category is multiplied by the weight coefficient of the category, and the multiplication results are added respectively to obtain the final comprehensive similarity.
After the comprehensive similarity is calculated, comparing the comprehensive similarity with a preset comprehensive similarity threshold, and if the comprehensive similarity is greater than or equal to the preset comprehensive similarity threshold, determining that the compared two hospitalization behaviors are suspected decomposition hospitalization behaviors; if the integrated similarity is less than the preset integrated similarity threshold, the compared two hospitalizations are not suspected decomposed hospitalizations. Wherein, the comprehensive similarity threshold value can be preset according to historical data.
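The comprehensive similarity approach can be sketched as a weighted sum followed by a single threshold comparison. The weights, the threshold of 0.45, and the function names are hypothetical; when no weights are given, the sketch falls back to the plain sum of the per-category similarities.

```python
from typing import Dict, Optional


def comprehensive_similarity(sims: Dict[str, float],
                             weights: Optional[Dict[str, float]] = None) -> float:
    """Weighted sum of per-category similarities (plain sum if no weights are given)."""
    if weights is None:
        weights = {cat: 1.0 for cat in sims}
    return sum(sims[cat] * weights[cat] for cat in sims)


def is_suspected_overall(score: float, threshold: float) -> bool:
    """Suspected if the comprehensive similarity is greater than or equal to the threshold."""
    return score >= threshold


# Hypothetical per-category similarities and weight coefficients.
sims = {"drug": 0.6, "examination": 0.4}
weights = {"drug": 0.5, "examination": 0.5}

score = comprehensive_similarity(sims, weights)
print(is_suspected_overall(score, threshold=0.45))  # True
```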
Yet another implementation of determining whether two hospitalization behaviors are suspected decomposed hospitalization behaviors is: calculating the similarity over the charging items of the two hospitalization behaviors without classifying them by category, according to the cost feature vector of the first hospitalization behavior and the cost feature vector of the second hospitalization behavior. The calculated total similarity value of the two hospitalization behaviors is then compared with a preset threshold value, and whether the two hospitalization behaviors are suspected decomposed hospitalization behaviors is determined according to the comparison result.
In the two previous implementations, in calculating the similarity of the two hospitalization behaviors, the similarity between the feature sub-vectors of the same category in the two hospitalization behaviors is calculated first. For example, the feature sub-vector of the drug category of the first hospitalization is compared with the feature sub-vector of the drug category of the second hospitalization, and the similarity between them is calculated. In the current implementation, rather than differentiating between specific categories, the similarity between all the items charged in the two hospitalizations is calculated. And calculating to obtain a total similarity value, comparing the total similarity value with a preset threshold value, and determining whether the two hospitalizations are suspected decomposition hospitalizations according to the comparison result. Wherein the preset threshold is calculated by historical data.
The identification method for decomposing hospitalization behaviors, provided by the embodiment of the invention, is characterized in that the characteristic vectors are extracted from the cost information of two hospitalization behaviors, the similarity between the characteristic vectors of the two hospitalization behaviors is calculated, and whether the two hospitalization behaviors belong to suspected decomposition hospitalization behaviors or not is judged according to the similarity between the characteristic vectors; the automatic identification of the hospitalization decomposition behavior is realized, and compared with the traditional manual method, the automatic identification method is high in execution efficiency and high in identification accuracy.
Based on any of the above embodiments, in an embodiment of the present invention, the information further includes time information;
accordingly, after the step of obtaining information on the first hospitalization behavior and the second hospitalization behavior of the target patient, the method further comprises:
and filtering out the first hospitalization behavior and the second hospitalization behavior which do not belong to the decomposed hospitalization behavior according to the time information of the first hospitalization behavior and the second hospitalization behavior.
Specifically, filtering out the first hospitalization and the second hospitalization that are not decomposed hospitalization further comprises:
calculating a time interval between the first hospitalization behavior and the second hospitalization behavior, wherein when the time interval is greater than or equal to a preset time interval threshold value, the first hospitalization behavior and the second hospitalization behavior do not belong to decomposition hospitalization behavior, and filtering the first hospitalization behavior and the second hospitalization behavior;
and/or,
judging, according to the time information of the first hospitalization behavior and the second hospitalization behavior, whether the first hospitalization behavior and the second hospitalization behavior overlap in time; if they overlap, the first hospitalization behavior and the second hospitalization behavior do not belong to decomposed hospitalization behaviors and are filtered out.
It is well known to those skilled in the art that decomposed hospitalization can occur only when two successive hospitalizations are close together in time; if they are far apart, decomposed hospitalization is not possible. Therefore, one situation in which the first hospitalization behavior and the second hospitalization behavior are determined to be non-decomposed hospitalization behaviors and filtered out is when the time interval between them is greater than or equal to a preset time interval threshold.
If the time interval between the first hospitalization behavior and the second hospitalization behavior is smaller than the preset time interval threshold, cost feature vectors continue to be calculated for the first hospitalization behavior and the second hospitalization behavior, and the similarity is calculated from the cost feature vectors.
The time interval between the first hospitalization and the second hospitalization can be obtained from the time information of the patient's hospitalization. The time information at least comprises the time information of patient admission and the time information of patient discharge, and can also comprise the occurrence time information of various types of examination, medication, operation and other treatment behaviors of the patient during the hospitalization period.
For example, whether the time interval between the discharge time of the first hospitalization behavior and the admission time of the second hospitalization behavior of the patient is less than 24 hours is judged, if the time interval is less than 24 hours, the subsequent steps are executed on the information of the two hospitalization behaviors, and if the time interval is greater than or equal to 24 hours, the first hospitalization behavior and the second hospitalization behavior are filtered.
It is also known to the person skilled in the art that a patient may be admitted again only after discharge, i.e. that the patient may not have two hospitalizations at the same time. If the first hospitalization is crossed with the second hospitalization in time, i.e. the time of admission of the second hospitalization is earlier than the time of discharge of the first hospitalization, then this phenomenon is obviously not logical. The information of the corresponding first hospitalization behavior and the second hospitalization behavior belongs to wrong information, so the first hospitalization behavior and the second hospitalization behavior also need to be filtered out.
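The two time-based filters can be sketched together. The 24-hour default follows the example in the text; the function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta


def passes_time_filter(first_discharge: datetime,
                       second_admission: datetime,
                       max_gap: timedelta = timedelta(hours=24)) -> bool:
    """Return True if the pair should proceed to cost feature vector comparison."""
    if second_admission < first_discharge:
        # The two hospitalizations overlap in time: treated as erroneous data.
        return False
    # Keep the pair only if the gap is below the preset time interval threshold.
    return (second_admission - first_discharge) < max_gap


discharge = datetime(2020, 7, 1, 10, 0)
print(passes_time_filter(discharge, datetime(2020, 7, 1, 18, 0)))  # True: 8-hour gap
print(passes_time_filter(discharge, datetime(2020, 7, 3, 10, 0)))  # False: 48-hour gap
print(passes_time_filter(discharge, datetime(2020, 6, 30, 9, 0)))  # False: overlap
```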
According to the recognition method for decomposing the hospitalization behaviors, provided by the embodiment of the invention, the information which is obviously impossible to decompose the hospitalization behaviors can be filtered in advance by filtering the hospitalization behavior information of the patient, so that the follow-up unnecessary operation is reduced, and the recognition efficiency of decomposing the hospitalization behaviors is improved.
Based on any one of the above embodiments, in an embodiment of the present invention, the information further includes medical institution information;
accordingly, after the step of obtaining information on the first hospitalization behavior and the second hospitalization behavior of the target patient, the method further comprises:
and filtering out the first hospitalization behavior and the second hospitalization behavior which do not belong to the decomposed hospitalization behavior according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior.
Specifically, the filtering out the first hospitalization behavior and the second hospitalization behavior which are not decomposed hospitalization behavior according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior comprises:
according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior, the first hospitalization behavior and the second hospitalization behavior are judged to occur in different medical institutions, and then the first hospitalization behavior and the second hospitalization behavior do not belong to decomposition hospitalization behaviors, and the first hospitalization behavior and the second hospitalization behavior are filtered.
Transfer treatment occurs in routine diagnosis and treatment: a patient is transferred from one hospital to another because of his or her condition. In transfer treatment, the interval between the discharge time of the first hospitalization behavior and the admission time of the second hospitalization behavior is generally short, usually less than the preset time interval threshold, so judged by the time interval alone the two hospitalization behaviors would proceed to decomposed hospitalization identification. However, transfer treatment is clearly reasonable and should not be regarded as decomposed hospitalization. The first hospitalization behavior and the second hospitalization behavior can therefore be filtered out.
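A brief sketch of the medical institution filter follows. The `Stay` record and its `institution_id` field are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stay:
    stay_id: str
    institution_id: str  # hypothetical identifier of the medical institution


def keep_same_institution(pairs: List[Tuple[Stay, Stay]]) -> List[Tuple[Stay, Stay]]:
    """Drop pairs that occurred in different medical institutions (e.g. transfers)."""
    return [(a, b) for a, b in pairs if a.institution_id == b.institution_id]


pairs = [
    (Stay("s1", "hospital-A"), Stay("s2", "hospital-A")),
    (Stay("s3", "hospital-A"), Stay("s4", "hospital-B")),  # transfer: filtered out
]
kept = keep_same_institution(pairs)
print([(a.stay_id, b.stay_id) for a, b in kept])  # [('s1', 's2')]
```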
According to the recognition method for decomposing the hospitalization behaviors, provided by the embodiment of the invention, the information which is obviously impossible to decompose the hospitalization behaviors can be filtered in advance by filtering the hospitalization behavior information of the patient, so that the follow-up unnecessary operation is reduced, and the recognition efficiency of decomposing the hospitalization behaviors is improved.
Based on any one of the above embodiments, in an embodiment of the present invention, the performing feature vectorization on each charging item information in the plurality of first charging item information sets includes:
calculating a TF-IDF value for each charging item information of the plurality of first charging item information sets;
taking the TF-IDF value obtained by calculation as a characteristic value corresponding to the charging item information;
and performing feature vectorization on each charging item information in the plurality of second charging item information sets, including:
calculating a TF-IDF value for each charging item information of the plurality of second charging item information sets;
and taking the TF-IDF value obtained by calculation as a characteristic value corresponding to the charging item information.
TF-IDF (term frequency-inverse document frequency) is a common weighting technique in information retrieval and data mining. It is often used to mine keywords in articles, and the algorithm is simple and efficient. The main idea of TF-IDF is: if a word or phrase appears frequently in one article and rarely in other articles, the word or phrase is considered to have good category-distinguishing ability and is suitable for extracting the main information of the article and for classifying the article.
The calculation formula of TF-IDF is:
TF-IDF=TF*IDF;
wherein, TF refers to the frequency of a given word appearing in the document, and the calculation formula is as follows:
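$$TF_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$
where $n_{i,j}$ is the number of times word $i$ appears in document $j$, and the denominator is the total number of occurrences of all words in document $j$.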
IDF is "Inverse Document Frequency" (abbreviated IDF) and represents the total number of documents divided by the number of documents containing the term, and the resulting quotient is logarithmized.
The calculation formula is as follows:
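$$IDF_i = \log \frac{|D|}{1 + |\{\, j : t_i \in d_j \,\}|}$$
where $|D|$ is the total number of documents in the corpus and $|\{\, j : t_i \in d_j \,\}|$ is the number of documents containing word $t_i$. Adding 1 to the denominator is a smoothing step: it prevents the denominator from being zero, and the calculation from failing, when a word does not occur in the corpus.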
As can be seen from the TF-IDF calculation formula, a high word frequency within a particular document, and a low document frequency for that word across the document set, can result in a high weighted TF-IDF.
When TF-IDF is applied to the embodiment of the invention, a specific charging item is regarded as a word and a hospitalization behavior is regarded as a document, so that the feature value of a given charging item within a hospitalization behavior can be obtained. The TF value is calculated per hospitalization behavior: the number of occurrences of a given charging item is divided by the total number of occurrences of all charging items generated by that hospitalization behavior. The IDF value of each charging item can be obtained by training on historical data, or by real-time training each time identification of decomposed hospitalization behaviors is executed. The vector space is spanned by all the charging items.
The feature value of each feature item in the cost feature vector obtained through the above operation is the TF-IDF value of the corresponding charging item.
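A minimal sketch of this calculation is shown below, treating each charging item as a word and each hospitalization as a document. The item codes `A`, `B`, `C` and the small history are hypothetical. The IDF follows the description of a logarithm of the document count over the smoothed document frequency (denominator plus 1); the natural logarithm is an assumption, since the source does not state the base.

```python
import math
from collections import Counter
from typing import Dict, List


def idf_table(history: List[List[str]], vocabulary: List[str]) -> Dict[str, float]:
    """IDF per charging item, learned from historical hospitalizations."""
    n_docs = len(history)
    doc_freq = Counter(item for stay in history for item in set(stay))
    # 1 is added to the denominator so that unseen items do not divide by zero.
    return {term: math.log(n_docs / (1 + doc_freq[term])) for term in vocabulary}


def tf_idf_vector(stay: List[str], idf: Dict[str, float], vocabulary: List[str]) -> List[float]:
    """TF-IDF feature value for every feature item in the vocabulary."""
    counts = Counter(stay)
    total = len(stay)
    if total == 0:
        return [0.0 for _ in vocabulary]
    return [(counts[term] / total) * idf[term] for term in vocabulary]


vocabulary = ["A", "B", "C"]                    # hypothetical standard item codes
history = [["A", "B"], ["A"], ["A", "C"], ["B"]]  # hypothetical past hospitalizations
idf = idf_table(history, vocabulary)
vector = tf_idf_vector(["A", "A", "C"], idf, vocabulary)
print(vector)
```

With this smoothing, an item that appears in almost every historical hospitalization (here `A`) receives an IDF of zero or below, so frequent routine items contribute little to the similarity.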
The identification method for decomposing hospitalization behaviors provided by the embodiment of the invention sets the feature value of a charging item by calculating its TF-IDF value. TF-IDF is simple and quick to calculate, easy to implement, easy to understand, and highly interpretable; in practical application, the extracted features can be manually corrected and checked by experts in related fields, which is why TF-IDF is widely applied in industry.
Based on any one of the above embodiments, in an embodiment of the present invention, the performing feature vectorization on each charging item information in the plurality of first charging item information sets includes:
sorting the charging item information in any one first charging item information set in chronological order;
deleting repeated charging item information from the sorted first charging item information set;
inputting the sorted and deduplicated first charging item information set, in the form of a sentence, into a pre-trained BERT model, wherein the BERT model outputs a first cost feature sub-vector corresponding to that first charging item information set;
and performing feature vectorization on each charging item information in the plurality of second charging item information sets, including:
sorting the charging item information in any one second charging item information set in chronological order;
deleting repeated charging item information from the sorted second charging item information set;
inputting the sorted and deduplicated second charging item information set, in the form of a sentence, into a pre-trained BERT model, wherein the BERT model outputs a second cost feature sub-vector corresponding to that second charging item information set.
In the embodiment of the invention, the charging item information is represented in the form of a standard code. The standard code for a charging item can be formulated with reference to existing related standards, such as: ICD-10 (International Classification of Diseases, 10th Revision), ICD-9-CM-3 (International Classification of Diseases, 9th Revision, Clinical Modification, Volume 3: Procedures), the "National Medical Insurance DRG Grouping and Payment Technical Specification," the medical insurance medical service item classification and code, the medical insurance drug classification and code, and the medical insurance consumable classification and code.
The BERT (Bidirectional Encoder Representations from Transformers) model is a language model developed and released by Google at the end of 2018.
BERT is essentially a two-stage NLP (natural language processing) model. The first stage is pre-training: similar to word embedding, an unlabeled corpus can be used to train the language model to obtain representation vectors of sentences. The second stage fine-tunes the model to solve downstream tasks such as text classification. The embodiments of the present invention mainly involve the first stage of BERT.
In the embodiment of the invention, the BERT model is used to obtain a feature vector for each sentence. That is, all the charging item information of one category in a hospitalization behavior is sorted and deduplicated to form a sentence, the sentence is input into the pre-trained BERT model, and the sentence vector output by the Transformer layer of the BERT model is the cost feature sub-vector corresponding to the charging item information of that category. The charging item information of each category of the first hospitalization behavior is processed in this way to obtain a plurality of first cost feature sub-vectors, which constitute the cost feature vector of the first hospitalization behavior. Similarly, the charging item information of each category of the second hospitalization behavior is processed in this way to obtain a plurality of second cost feature sub-vectors, which constitute the cost feature vector of the second hospitalization behavior.
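The sentence construction step (sorting by time, deleting repeats, joining into a sentence) can be sketched as follows. Only the preprocessing is shown; the call into a pre-trained BERT model is deliberately omitted. Keeping the first occurrence of a repeated item and joining codes with spaces are assumptions of this sketch, and the second and third item codes are hypothetical.

```python
from datetime import datetime
from typing import List, Tuple


def build_sentence(items: List[Tuple[datetime, str]]) -> str:
    """Turn one category's charging items into a 'sentence' of standard codes."""
    ordered = sorted(items, key=lambda pair: pair[0])  # chronological order
    seen = set()
    codes = []
    for _, code in ordered:
        if code not in seen:  # assumption: keep only the first occurrence
            seen.add(code)
            codes.append(code)
    return " ".join(codes)    # assumption: codes are separated by spaces


# Hypothetical charging records: (time of charge, standard item code).
records = [
    (datetime(2020, 7, 2, 9, 0), "120400006"),
    (datetime(2020, 7, 1, 8, 0), "250101015"),
    (datetime(2020, 7, 1, 9, 0), "120400006"),
    (datetime(2020, 7, 3, 9, 0), "310701008"),
]
print(build_sentence(records))  # 250101015 120400006 310701008
```

The resulting string is what would be passed to the pre-trained model as a single sentence for that category.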
The BERT model in the embodiment of the invention is obtained by carrying out unsupervised training by utilizing the existing historical charging item information. In the embodiment of the invention, the BERT model is obtained by pre-training. In combination with the processing flow of the charging item information according to the embodiment of the present invention and the common general knowledge of those skilled in the art, those skilled in the art can implement training of the BERT model by using the historical charging item information without creative labor, and therefore, the description is not repeated in the embodiment of the present invention.
The recognition method for decomposing hospitalization behaviors provided by the embodiment of the invention is based on the BERT model, which uses a large amount of data and a complex, deep network structure and can therefore be trained to produce high-quality charging item feature vectors. This can further improve the ability to discriminate between different charging items, can achieve a better effect than TF-IDF in subsequent processing, and improves the precision and recall of suspected decomposed hospitalization behavior identification.
Based on any of the above embodiments, in an embodiment of the present invention, after the step of obtaining information of the first hospitalization behavior and the second hospitalization behavior of the target patient, the method further includes:
and mapping the cost information of the first hospitalization behavior and the cost information of the second hospitalization behavior to standard codes.
Charging items are typically encoded for computer storage; for example, intravenous infusion is designated as "f12040000607". Since each region and each company has its own coding system, the representation of a charging item varies widely in the cost information of the target patient acquired from the database. To enable uniform processing, the original code of each charging item needs to be mapped to a standard code.
In the embodiment of the present invention, the standard code for a charging item can be formulated with reference to existing related standards, such as: ICD-10 (International Classification of Diseases, 10th Revision), ICD-9-CM-3 (International Classification of Diseases, 9th Revision, Clinical Modification, Volume 3: Procedures), the "National Medical Insurance DRG Grouping and Payment Technical Specification," the medical insurance medical service item classification and code, the medical insurance drug classification and code, and the medical insurance consumable classification and code.
In the process of mapping the charging item from the original code to the standard code, the core problem is to realize the correspondence between the original code and the standard code. One implementation is to match the Chinese name corresponding to the original code with the Chinese name corresponding to the standard code, so as to implement the correspondence between the original code and the standard code.
In one particular embodiment, as shown in Table 1 below, different intravenous infusion items obtain a unified standard charging item name and code after mapping, namely intravenous infusion (120400006):
TABLE 1
Original charging item ID | Original charging item name | Standard charging item ID | Standard charging item name
f12040000607 | Intravenous infusion (hospitalization/transfusion-free) | 120400006 | Intravenous infusion
zx000011 | Additional collection of venous transfusion | 120400006 | Intravenous infusion
According to the recognition method for decomposing hospitalization behaviors, provided by the embodiment of the invention, the charging items are mapped to the standard codes from the original codes, so that a foundation is provided for the subsequent recognition of decomposing hospitalization behaviors.
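The name-matching approach to mapping can be sketched as below. The normalization rule (dropping parenthetical qualifiers and ignoring case and extra spaces) is an assumption made for illustration and is not the matching rule of the source; names that do not match are returned as `None`, for example for manual review.

```python
import re
from typing import Dict, Optional

# Hypothetical excerpt of a standard catalogue: standard name -> standard code.
STANDARD_ITEMS: Dict[str, str] = {"Intravenous infusion": "120400006"}


def normalize(name: str) -> str:
    """Drop parenthetical qualifiers and normalize case and whitespace."""
    name = re.sub(r"\(.*?\)", "", name)
    return " ".join(name.lower().split())


def map_to_standard(original_name: str) -> Optional[str]:
    """Return the standard code whose name matches the original item name, if any."""
    key = normalize(original_name)
    for std_name, std_code in STANDARD_ITEMS.items():
        if normalize(std_name) == key:
            return std_code
    return None  # no match under this simple rule


print(map_to_standard("Intravenous infusion (hospitalization/transfusion-free)"))  # 120400006
print(map_to_standard("Additional collection of venous transfusion"))              # None
```

The second item in Table 1 shows why exact name matching alone is not sufficient in practice; a fuller implementation would need synonym tables or fuzzy matching.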
Based on any of the above embodiments, fig. 2 is a schematic diagram of an identification apparatus for decomposing hospitalization behaviors provided by another embodiment of the present invention, and as shown in fig. 2, the identification apparatus for decomposing hospitalization behaviors provided by another embodiment of the present invention includes:
an information acquisition module 201, configured to acquire information of a first hospitalization behavior and a second hospitalization behavior of a target patient; wherein the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, and the first hospitalization behavior occurs before the second hospitalization behavior; the information comprises cost information;
the cost characteristic vector generation module 202 is configured to obtain a cost characteristic vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtain a cost characteristic vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior;
the determining module 203 is configured to determine whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposed hospitalization behaviors according to the cost feature vector of the first hospitalization behavior and the cost feature vector of the second hospitalization behavior.
The recognition device for decomposing hospitalization behaviors extracts characteristic vectors for cost information of twice hospitalization behaviors, calculates the similarity between the characteristic vectors of the twice hospitalization behaviors, and judges whether the twice hospitalization behaviors belong to suspected decomposition hospitalization behaviors or not according to the similarity between the characteristic vectors; the automatic identification of the hospitalization decomposition behavior is realized, and compared with the traditional manual method, the automatic identification method is high in execution efficiency and high in identification accuracy.
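The three-module structure can be sketched as a small class that chains the modules. The class name, the type aliases, and the stub functions passed in below are all hypothetical placeholders; real modules would read from a database, apply TF-IDF or BERT, and compare similarities against thresholds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Stay = Dict[str, List[str]]            # category -> charging item codes
FeatureVector = Dict[str, List[float]]  # category -> cost feature sub-vector


@dataclass
class DecompositionRecognizer:
    acquire: Callable[[str], Tuple[Stay, Stay]]              # role of module 201
    vectorize: Callable[[Stay], FeatureVector]               # role of module 202
    judge: Callable[[FeatureVector, FeatureVector], bool]    # role of module 203

    def run(self, patient_id: str) -> bool:
        """Return True if the two adjacent hospitalizations are suspected."""
        first, second = self.acquire(patient_id)
        return self.judge(self.vectorize(first), self.vectorize(second))


# Stub modules, for illustration only.
recognizer = DecompositionRecognizer(
    acquire=lambda pid: ({"drug": ["a"]}, {"drug": ["a"]}),
    vectorize=lambda stay: {cat: [float(len(items))] for cat, items in stay.items()},
    judge=lambda v1, v2: v1 == v2,
)
print(recognizer.run("patient-001"))  # True
```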
Fig. 3 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 3, the electronic device may include: a processor 310, a communication interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communication interface 320 and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the following method: acquiring information of a first hospitalization behavior and a second hospitalization behavior of a target patient; wherein the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, and the first hospitalization behavior occurs before the second hospitalization behavior; the information comprises cost information; obtaining a cost feature vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtaining a cost feature vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior; and judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposed hospitalization behaviors according to the cost feature vector of the first hospitalization behavior and the cost feature vector of the second hospitalization behavior.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and, when sold or used as independent products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.
In another aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, performs the method provided by the foregoing embodiments, for example including: acquiring information of a first hospitalization behavior and a second hospitalization behavior of a target patient, wherein the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, the first hospitalization behavior occurs before the second hospitalization behavior, and the information comprises cost information; obtaining a cost characteristic vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtaining a cost characteristic vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior; and judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors according to the cost characteristic vector of the first hospitalization behavior and the cost characteristic vector of the second hospitalization behavior.
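Besides TF-IDF, the claims describe a second way to obtain a cost characteristic sub-vector: the charging items of one category are sorted by time, duplicates are removed, and the result is fed to a pre-trained BERT model as a sentence. The sketch below covers only the preparation of that sentence and does not call any model. The `(charge_time, item_name)` tuple layout, the function name, the space separator, and the choice to keep the earliest occurrence of a repeated item are assumptions.

```python
def build_category_sentence(charge_items, separator=" "):
    """Turn one category's charging items into a single input sentence.

    charge_items -- list of (charge_time, item_name) pairs; charge_time
                    must be sortable (for example ISO date strings).
    """
    # Step 1: order the items chronologically.
    ordered = sorted(charge_items, key=lambda pair: pair[0])
    # Step 2: drop repeated item names, keeping the earliest occurrence.
    seen = set()
    unique_names = []
    for _, name in ordered:
        if name not in seen:
            seen.add(name)
            unique_names.append(name)
    # Step 3: join the remaining names into one sentence-like string.
    return separator.join(unique_names)
```

The returned string would then be tokenized and encoded by whichever BERT implementation is in use; that step is outside this sketch.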
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (12)

1. An identification method for decomposing hospitalization behaviors, comprising:
acquiring information of a first hospitalization behavior and a second hospitalization behavior of a target patient; wherein the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, and the first hospitalization behavior occurs before the second hospitalization behavior; the information comprises cost information;
obtaining a cost characteristic vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtaining a cost characteristic vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior;
and judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors or not according to the cost characteristic vector of the first hospitalization behavior and the cost characteristic vector of the second hospitalization behavior.
2. The identification method for decomposing hospitalization behaviors according to claim 1, wherein the information comprises time information;
accordingly, after the step of acquiring the information of the first hospitalization behavior and the second hospitalization behavior of the target patient, the method further comprises:
filtering out, according to the time information of the first hospitalization behavior and the second hospitalization behavior, the first hospitalization behavior and the second hospitalization behavior that do not belong to decomposition hospitalization behaviors.
3. The identification method for decomposing hospitalization behaviors according to claim 2, wherein the filtering out, according to the time information of the first hospitalization behavior and the second hospitalization behavior, the first hospitalization behavior and the second hospitalization behavior that do not belong to decomposition hospitalization behaviors comprises:
calculating a time interval between the first hospitalization behavior and the second hospitalization behavior, and when the time interval is greater than or equal to a preset time interval threshold, determining that the first hospitalization behavior and the second hospitalization behavior do not belong to decomposition hospitalization behaviors and filtering them out;
and/or,
when it is determined, according to the time information of the first hospitalization behavior and the second hospitalization behavior, that the first hospitalization behavior and the second hospitalization behavior overlap in time, determining that the first hospitalization behavior and the second hospitalization behavior do not belong to decomposition hospitalization behaviors and filtering them out.
4. The identification method for decomposing hospitalization behaviors according to claim 1, wherein the information comprises medical institution information;
accordingly, after the step of acquiring the information of the first hospitalization behavior and the second hospitalization behavior of the target patient, the method further comprises:
filtering out, according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior, the first hospitalization behavior and the second hospitalization behavior that do not belong to decomposition hospitalization behaviors.
5. The identification method for decomposing hospitalization behaviors according to claim 4, wherein the filtering out, according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior, the first hospitalization behavior and the second hospitalization behavior that do not belong to decomposition hospitalization behaviors comprises:
when it is determined, according to the medical institution information of the first hospitalization behavior and the second hospitalization behavior, that the first hospitalization behavior and the second hospitalization behavior occurred in different medical institutions, determining that the first hospitalization behavior and the second hospitalization behavior do not belong to decomposition hospitalization behaviors and filtering them out.
6. The identification method for decomposing hospitalization behaviors according to any one of claims 1 to 5, wherein the obtaining a cost characteristic vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior, and obtaining a cost characteristic vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior comprises:
dividing all charging item information in the cost information of the first hospitalization behavior according to the category of the charging items, to obtain a plurality of classified first charging item information sets;
performing feature vectorization on each piece of charging item information in the plurality of first charging item information sets to obtain a plurality of first cost characteristic sub-vectors corresponding to the plurality of first charging item information sets, wherein the plurality of first cost characteristic sub-vectors form the cost characteristic vector of the first hospitalization behavior;
dividing all charging item information in the cost information of the second hospitalization behavior according to the category of the charging items, to obtain a plurality of classified second charging item information sets;
and performing feature vectorization on each piece of charging item information in the plurality of second charging item information sets to obtain a plurality of second cost characteristic sub-vectors corresponding to the plurality of second charging item information sets, wherein the plurality of second cost characteristic sub-vectors form the cost characteristic vector of the second hospitalization behavior.
7. The identification method for decomposing hospitalization behaviors according to claim 6, wherein the performing feature vectorization on each piece of charging item information in the plurality of first charging item information sets comprises:
calculating a TF-IDF value for each piece of charging item information in the plurality of first charging item information sets;
taking the calculated TF-IDF value as the feature value corresponding to that charging item information;
and the performing feature vectorization on each piece of charging item information in the plurality of second charging item information sets comprises:
calculating a TF-IDF value for each piece of charging item information in the plurality of second charging item information sets;
and taking the calculated TF-IDF value as the feature value corresponding to that charging item information.
8. The identification method for decomposing hospitalization behaviors according to claim 6, wherein the performing feature vectorization on each piece of charging item information in the plurality of first charging item information sets comprises:
sorting the charging item information in any one of the first charging item information sets in chronological order;
deleting duplicate charging item information from the sorted first charging item information set;
inputting the sorted and deduplicated first charging item information set, in the form of a sentence, into a pre-trained BERT model, wherein the BERT model outputs the first cost characteristic sub-vector corresponding to that first charging item information set;
and the performing feature vectorization on each piece of charging item information in the plurality of second charging item information sets comprises:
sorting the charging item information in any one of the second charging item information sets in chronological order;
deleting duplicate charging item information from the sorted second charging item information set;
inputting the sorted and deduplicated second charging item information set, in the form of a sentence, into the pre-trained BERT model, wherein the BERT model outputs the second cost characteristic sub-vector corresponding to that second charging item information set.
9. The identification method for decomposing hospitalization behaviors according to claim 6, wherein the judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors according to the cost characteristic vector of the first hospitalization behavior and the cost characteristic vector of the second hospitalization behavior comprises:
for any first cost characteristic sub-vector in the cost characteristic vector of the first hospitalization behavior, selecting the corresponding second cost characteristic sub-vector from the cost characteristic vector of the second hospitalization behavior according to the category of the charging items, and calculating the similarity between that first cost characteristic sub-vector and the selected second cost characteristic sub-vector;
and judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors according to the calculated similarity.
10. An identification device for decomposing hospitalization behaviors, comprising:
an information acquisition module, configured to acquire information of a first hospitalization behavior and a second hospitalization behavior of a target patient; wherein the first hospitalization behavior and the second hospitalization behavior are two adjacent hospitalizations of the target patient, and the first hospitalization behavior occurs before the second hospitalization behavior; the information comprises cost information;
the cost characteristic vector generation module is used for obtaining a cost characteristic vector of the first hospitalization behavior according to the cost information of the first hospitalization behavior and obtaining a cost characteristic vector of the second hospitalization behavior according to the cost information of the second hospitalization behavior;
and the judging module is used for judging whether the first hospitalization behavior and the second hospitalization behavior are suspected decomposition hospitalization behaviors or not according to the cost characteristic vector of the first hospitalization behavior and the cost characteristic vector of the second hospitalization behavior.
11. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the program, implements the steps of the identification method for decomposing hospitalization behaviors according to any one of claims 1 to 9.
12. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the identification method for decomposing hospitalization behaviors according to any one of claims 1 to 9.
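The pre-filters recited in claims 2 to 5 can be sketched as a single check that runs before any cost characteristic vectors are compared. This is an illustrative sketch only: the dictionary keys, the 15-day threshold, the choice to measure the interval from the first discharge to the second admission, and the decision to apply the time and institution filters together are all assumptions, since the claims leave these details open.

```python
from datetime import date


def passes_prefilters(first, second, max_gap_days=15):
    """Return True if the pair should go on to the cost-vector comparison.

    first, second -- dicts with 'admit' and 'discharge' (datetime.date)
                     and 'institution' (any comparable identifier).
    max_gap_days  -- hypothetical preset time interval threshold.
    """
    # Stays in different medical institutions are filtered out.
    if first["institution"] != second["institution"]:
        return False
    # Stays that overlap in time are filtered out.
    if second["admit"] < first["discharge"]:
        return False
    # Stays separated by the threshold or more are filtered out.
    gap_days = (second["admit"] - first["discharge"]).days
    if gap_days >= max_gap_days:
        return False
    return True
```

Only pairs for which this function returns True would be passed to the feature vectorization and similarity steps.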
CN202010768490.5A 2020-08-03 2020-08-03 Identification method and device for decomposing hospitalization behaviors, electronic equipment and storage medium Active CN112016302B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010768490.5A CN112016302B (en) 2020-08-03 2020-08-03 Identification method and device for decomposing hospitalization behaviors, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010768490.5A CN112016302B (en) 2020-08-03 2020-08-03 Identification method and device for decomposing hospitalization behaviors, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112016302A true CN112016302A (en) 2020-12-01
CN112016302B CN112016302B (en) 2024-04-30

Family

ID=73499182

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010768490.5A Active CN112016302B (en) 2020-08-03 2020-08-03 Identification method and device for decomposing hospitalization behaviors, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112016302B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100076785A1 (en) * 2008-09-25 2010-03-25 Air Products And Chemicals, Inc. Predicting rare events using principal component analysis and partial least squares
CN104182824A (en) * 2014-08-08 2014-12-03 平安养老保险股份有限公司 Rule checking system and rule checking method for recognizing medical insurance reimbursement violations
KR20180003345A (en) * 2016-06-30 2018-01-09 삼성에스디에스 주식회사 Apparatus and method for providing information of medical cost and lengh of stay of patient
CN107609980A (en) * 2017-09-07 2018-01-19 平安医疗健康管理股份有限公司 Medical data processing method, device, computer equipment and storage medium
CN109118376A (en) * 2018-08-14 2019-01-01 平安医疗健康管理股份有限公司 Medical insurance premium calculation principle method, apparatus, computer equipment and storage medium
CN109492803A (en) * 2018-10-30 2019-03-19 平安科技(深圳)有限公司 Chronic disease hospitalization cost method for detecting abnormality and relevant apparatus based on artificial intelligence
CN109545317A (en) * 2018-10-30 2019-03-29 平安科技(深圳)有限公司 The method and Related product of behavior in hospital are determined based on prediction model in hospital
CN109934723A (en) * 2019-02-27 2019-06-25 生活空间(沈阳)数据技术服务有限公司 A kind of medical insurance fraud recognition methods, device and equipment
CN109935287A (en) * 2019-02-28 2019-06-25 生活空间(沈阳)数据技术服务有限公司 A kind of similarity analysis method, device and equipment of medical record information
CN110334843A (en) * 2019-04-22 2019-10-15 山东大学 A kind of time-varying attention improves be hospitalized medial demand prediction technique and the device of Bi-LSTM

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
WEIJIA ZHANG: "An Anomaly Detection Method for Medicare Fraud Detection", IEEE, pages 309 - 314 *
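吴奎: "Application of the analytic hierarchy process in establishing weight coefficients of monitoring indicators for designated medical insurance institutions", China Health Insurance, pages 36 - 39 *
吴婧; 姚新宝; 刘金宝: "Analysis of hospitalization expenses of medical insurance patients in a tertiary grade-A hospital", Xinjiang Medical Journal, no. 03, pages 12 - 16 *
林源: "Research on fraud risk management of the New Rural Cooperative Medical Insurance", China Doctoral Dissertations Electronic Journal Network, 15 August 2016 (2016-08-15), pages 161 - 7 *
高婵: "Construction of a hospital cost control indicator system based on the balanced scorecard", Medicine and Society, no. 03, pages 65 - 67 *
高永昌: "Research on key issues of fraud detection in medical insurance big data", China Excellent Master's Theses Electronic Journal Network, pages 1 - 164 *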

Also Published As

Publication number Publication date
CN112016302B (en) 2024-04-30

Similar Documents

Publication Publication Date Title
CN111414393B (en) Semantic similar case retrieval method and equipment based on medical knowledge graph
CN107656952B (en) The modeling method of parallel intelligence case recommended models
CN106227880B (en) Method for implementing doctor search recommendation
EP3734604A1 (en) Method and system for supporting medical decision making
CN107193919A (en) The search method and system of a kind of electronic health record
CN111798941A (en) Predictive system for generating clinical queries
CN111465990B (en) Method and system for clinical trials of healthcare
CN112687397B (en) Rare disease knowledge base processing method and device and readable storage medium
CN113345577B (en) Diagnosis and treatment auxiliary information generation method, model training method, device, equipment and storage medium
US20200118683A1 (en) Medical diagnostic aid and method
CN107480131A (en) Chinese electronic health record symptom semantic extracting method and its system
CN112820416A (en) Major infectious disease queue data typing method, typing model and electronic equipment
CN112885478A (en) Medical document retrieval method, medical document retrieval device, electronic device, and storage medium
Wang et al. Multiple valued logic approach for matching patient records in multiple databases
Jiang et al. Stroke risk prediction using artificial intelligence techniques through electronic health records
Vu et al. Identifying patients with pain in emergency departments using conventional machine learning and deep learning
Liu et al. Extracting patient demographics and personal medical information from online health forums
CN113094476A (en) Risk early warning method, system, equipment and medium based on natural language processing
Loh et al. Knowledge discovery in texts for constructing decision support systems
CN112016302B (en) Identification method and device for decomposing hospitalization behaviors, electronic equipment and storage medium
CN115775635A (en) Medicine risk identification method and device based on deep learning model and terminal equipment
CN112735584B (en) Malignant tumor diagnosis and treatment auxiliary decision generation method and device
EP4226383A1 (en) A system and a way to automatically monitor clinical trials - virtual monitor (vm) and a way to record medical history
Kongburan et al. Enhancing predictive power of cluster-boosted regression with text-based indexing
Baghal et al. Agile natural language processing model for pathology knowledge extraction and integration with clinical enterprise data warehouse

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant