CN113488152A - Semantic triage method and system - Google Patents

Publication number
CN113488152A
Authority
CN
China
Prior art keywords
historical
department
training
term
classification model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110793215.3A
Other languages
Chinese (zh)
Other versions
CN113488152B (en)
Inventor
张后今
曾培基
周金龙
章昊
肖航
吴珂仪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN202110793215.3A (patent CN113488152B)
Publication of CN113488152A
Application granted
Publication of CN113488152B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • G16H40/20 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices for the management or administration of healthcare resources or facilities, e.g. managing hospital staff or surgery rooms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Primary Health Care (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Business, Economics & Management (AREA)
  • Molecular Biology (AREA)
  • Epidemiology (AREA)
  • Biophysics (AREA)
  • Business, Economics & Management (AREA)
  • Pathology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a semantic triage method and a semantic triage system. The method comprises: acquiring actual inquiry data; and inputting the actual inquiry data into a department classification model to obtain a target department. The department classification model is established by: acquiring historical inquiry data, which comprises historical disease descriptions and the corresponding historical departments; and training a long short-term memory (LSTM) network model on the historical inquiry data to obtain the department classification model. Traditional triage requires a triage guide to judge the corresponding department manually from the disease description, so its speed is limited; with the department classification model, the target department is obtained automatically from the disease description alone, which improves triage efficiency. Manual triage also relies entirely on the guide's experience and is therefore prone to error, which the model is intended to reduce.

Description

Semantic triage method and system
Technical Field
The invention relates to the technical field of department consultation, in particular to a semantic triage method and a semantic triage system.
Background
In a hospital, directing a patient to the correct department is time-consuming and labor-intensive: a triage guide must analyze the patient's condition and then assign a department. Because hospitals have few triage guides, patients often wait a long time for a triage result. Moreover, because patients present with a wide variety of conditions, a guide whose expertise covers only certain fields cannot always direct the patient to the correct department.
Disclosure of Invention
The invention aims to provide a semantic triage method and a semantic triage system so as to improve efficiency and accuracy of triage.
In order to achieve the purpose, the invention provides the following scheme:
a method of semantic triage, the method comprising:
acquiring actual inquiry data;
inputting the actual inquiry data into a department classification model to obtain a target department;
the establishment method of the department classification model comprises the following steps:
acquiring historical inquiry data; the historical inquiry data comprises historical disease descriptions and corresponding historical departments;
and training a long short-term memory (LSTM) network model according to the historical inquiry data to obtain the department classification model.
Optionally, the training of the LSTM network model according to the historical inquiry data to obtain the department classification model specifically includes:
inputting the historical disease descriptions into the LSTM network model to obtain, for each historical disease description, the probability of each department;
judging whether to stop training according to the department with the highest probability and the corresponding historical department;
if so, taking the LSTM network model at the current training iteration as the department classification model;
if not, updating the parameters of the LSTM network model and carrying out the next training iteration.
Optionally, before the step of inputting the historical disease descriptions into the LSTM network model to obtain the probability of each department for each historical disease description, the method further includes:
preprocessing the historical disease description.
Optionally, the preprocessing the historical disease description specifically includes:
performing data cleaning on the historical disease description;
performing word segmentation processing on the historical disease description after data cleaning to obtain a plurality of words;
and mapping the words to a vector space to obtain a plurality of numerical data.
Optionally, the data cleaning of the historical disease description specifically includes:
performing at least one cleaning operation on the historical disease description; the cleaning operations comprise character deletion, letter conversion, prototype conversion, space deletion and information deletion;
the character deletion is to delete, from the historical disease description, irrelevant characters, characters that are not English, Chinese or numeric, punctuation marks that are neither Chinese nor English, and hyperlinks; the irrelevant characters include: HTML tags, garbled text, special characters and tags;
the letter conversion is to convert uppercase English letters in the historical disease description into lowercase letters;
the prototype conversion is to convert English words into their base forms;
the space deletion is to delete redundant spaces;
the information deletion is to delete at least one item of text information containing personal information; the text information includes: name, contact details, and personal address.
Optionally, the judging, according to the department with the highest probability and the corresponding historical department, whether to stop training specifically includes:
for any historical disease description, judging whether the department with the highest output probability is consistent with the corresponding historical department;
counting the number of historical disease descriptions for which the department with the highest probability is consistent with the corresponding historical department, to obtain a target number;
calculating the accuracy of the LSTM network model at the current training iteration according to the target number and the total number of historical disease descriptions;
and if the accuracy is greater than a set threshold, stopping training.
A semantic triage system, the system comprising:
the actual inquiry data acquisition module is used for acquiring actual inquiry data;
the target department determining module is used for inputting the actual inquiry data into a department classification model to obtain a target department;
the establishment method of the department classification model comprises the following steps:
acquiring historical inquiry data; the historical inquiry data comprises historical disease descriptions and corresponding historical departments;
and training a long short-term memory (LSTM) network model according to the historical inquiry data to obtain the department classification model.
Optionally, the system further includes: a department classification model establishing module;
the department classification model establishing module is used for:
acquiring historical inquiry data; the historical inquiry data comprises historical disease descriptions and corresponding historical departments; training an LSTM network model according to the historical inquiry data to obtain the department classification model;
the department classification model establishing module specifically comprises:
a probability determining unit, configured to input the historical disease descriptions into the LSTM network model to obtain, for each historical disease description, the probability of each department;
a stop judgment unit, configured to judge whether to stop training according to the department with the highest probability and the corresponding historical department;
a department classification model generation unit, configured to, when the stop judgment unit judges that training is to stop, take the LSTM network model at the current training iteration as the department classification model;
and a continued training unit, configured to update the parameters of the LSTM network model and carry out the next training iteration when the stop judgment unit judges that training is to continue.
Optionally, the department classification model establishing module further includes: a preprocessing unit, configured to preprocess the historical disease descriptions before they are input into the LSTM network model to obtain the probability of each department for each historical disease description.
Optionally, the stop judgment unit specifically includes:
a consistency judgment subunit, configured to judge, for any historical disease description, whether the department with the highest output probability is consistent with the corresponding historical department;
a statistical subunit, configured to count the number of historical disease descriptions for which the department with the highest probability is consistent with the corresponding historical department, to obtain a target number;
an accuracy calculation subunit, configured to calculate the accuracy of the LSTM network model at the current training iteration according to the target number and the total number of historical disease descriptions;
and a training stopping subunit, configured to stop training when the accuracy is greater than a set threshold.
According to the specific embodiment provided by the invention, the invention discloses the following technical effects:
The invention discloses a semantic triage method and a semantic triage system. The method comprises: acquiring actual inquiry data; and inputting the actual inquiry data into a department classification model to obtain a target department. The department classification model is established by: acquiring historical inquiry data, which comprises historical disease descriptions and the corresponding historical departments; and training a long short-term memory (LSTM) network model on the historical inquiry data to obtain the department classification model. Traditional triage requires a triage guide to judge the corresponding department manually from the disease description, so its speed is limited; with the department classification model, the target department is obtained automatically from the disease description alone, which improves triage efficiency. Manual triage also relies entirely on the guide's experience and is therefore prone to error, which the model is intended to reduce.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a semantic triage method provided by an embodiment of the present invention;
FIG. 2 is a flow chart of a modified LSTM model according to an embodiment of the present invention;
fig. 3 is a structural diagram of a semantic triage system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The invention aims to provide a semantic triage method and a semantic triage system, aims to improve efficiency and accuracy of triage, and can be applied to the technical field of department diagnosis guidance.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
Fig. 1 is a flow chart of a semantic triage method provided by the embodiment of the invention. As shown in fig. 1, the semantic triage method in this embodiment includes:
Step 101: acquire actual inquiry data.
Specifically, historical inquiry data for 12 departments are collected from major open-source websites (such as GitHub), 12,845 records in total; in practical applications, more historical inquiry data can be collected as needed.
Step 102: input the actual inquiry data into a department classification model to obtain a target department.
The establishment method of the department classification model comprises the following steps:
acquiring historical inquiry data; the historical inquiry data includes historical disease descriptions and corresponding historical departments.
Training a long short-term memory (LSTM) network model according to the historical inquiry data to obtain the department classification model.
Specifically, the LSTM network model used here is a modified LSTM network model. Research shows that deeper networks can extract higher-level features. However, simply stacking LSTM (long short-term memory) layers is likely to cause overfitting: the trained network performs well on the training set but much worse on new data. Here, a Highway Network structure is used to modify the LSTM model. The Highway Network introduces two nonlinear gates, a transform gate and a carry gate. The transform gate controls the proportion of the currently flowing information that is transformed, and the carry gate controls the proportion that is carried through. The detailed procedure for modifying the LSTM with the Highway Network is as follows.
The conventional LSTM is simplified to an input-output function model:
$y = H(x, W_H)$
where $x$ is the matrix obtained by mapping the processed text into a vector space using the Robustly optimized BERT pretraining approach (RoBERTa), and is also the input matrix of the LSTM network; the number of rows of the matrix is the text length and the number of columns is the word vector length. $W_H$ denotes the parameters of the LSTM network. $y$ is the output of the LSTM network, which further extracts the semantic information of the words so that the patient's disease description is fully learned.
Two further nonlinear transformations are now defined:
$\alpha = T(x, W_T)$ and $\beta = C(x, W_C)$
where $T$ and $C$ are nonlinear transformation functions used to calculate the proportions of the current input that are transformed and carried, respectively. $T$ and $C$ can take many forms, for example a sigmoid activation function. $\alpha$ and $\beta$ both take values between 0 and 1: $\alpha$ is the proportion of the current input that is transformed, and $\beta$ is the proportion that is carried. $W_T$ is the parameter of the transform gate, which calculates the transformed proportion of the LSTM network input $x$; $W_C$ is the parameter of the carry gate, which calculates the carried proportion of the LSTM network input $x$. The modified LSTM structure is:
$y' = \alpha * y + \beta * x$
where $y'$ is the output of the modified LSTM model. To simplify the calculation, let $\beta = 1 - \alpha$. A flow chart of the modified LSTM model is shown in fig. 2.
This network structure effectively controls the information flow between the different LSTM networks, which greatly reduces the risk of overfitting. The same idea is now widely used to prevent overfitting. A network modified in this way can extract higher-level features as the number of layers increases while still effectively avoiding overfitting. The modified LSTM network can be used as a general-purpose model.
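The gating formula can be illustrated with a minimal sketch. It assumes a sigmoid gate with a hypothetical bias `b_t`, and it assumes that the input `x` and the LSTM output `y` have the same length so that they can be mixed element-wise; the names `transform_gate` and `highway_combine` are illustrative and do not come from the patent. The LSTM output itself is replaced by a fixed stand-in vector.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def transform_gate(x, w_t, b_t):
    """alpha = sigmoid(W_T x + b_T): one gate value in (0, 1) per element."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_t, b_t)]


def highway_combine(x, y, alpha):
    """y' = alpha * y + (1 - alpha) * x, applied element-wise (beta = 1 - alpha)."""
    return [a * yi + (1.0 - a) * xi for a, yi, xi in zip(alpha, y, x)]


x = [0.5, -1.0, 2.0]                  # input vector
y = [0.1, 0.3, -0.2]                  # stand-in for the LSTM output H(x, W_H)
w_t = [[0.0] * 3 for _ in range(3)]   # zero weights make every gate value 0.5
b_t = [0.0, 0.0, 0.0]

alpha = transform_gate(x, w_t, b_t)
y_prime = highway_combine(x, y, alpha)
```

With a gate value of 1 the block returns the LSTM output unchanged, and with a gate value of 0 it passes the input straight through, which is how the structure lets information skip a layer.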
Step 104: input the actual inquiry data into the department classification model to obtain the target department.
As an optional implementation, step 103 specifically includes:
inputting the historical disease descriptions into the LSTM network model to obtain, for each historical disease description, the probability of each department.
Judging whether to stop training according to the department with the highest probability and the corresponding historical department.
If so, taking the LSTM network model at the current training iteration as the department classification model.
If not, updating the parameters of the LSTM network model and carrying out the next training iteration.
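The control flow of this training procedure can be sketched as a loop. The sketch is only an illustration: `update_parameters` and `evaluate_accuracy` are hypothetical callables standing in for the real model, and the `max_rounds` cap is an added assumption, since the description gives no upper limit on the number of iterations.

```python
from typing import Callable


def train_until_accurate(update_parameters: Callable[[], None],
                         evaluate_accuracy: Callable[[], float],
                         threshold: float,
                         max_rounds: int = 100) -> int:
    """Return how many parameter updates ran before training stopped."""
    for rounds_done in range(max_rounds):
        # Judge first: stop as soon as the accuracy exceeds the threshold.
        if evaluate_accuracy() > threshold:
            return rounds_done
        # Otherwise update the parameters and go to the next iteration.
        update_parameters()
    return max_rounds


# Toy stand-in for a model whose accuracy grows by 0.2 per update.
state = {"accuracy": 0.5}


def fake_update():
    state["accuracy"] += 0.2


def fake_evaluate():
    return state["accuracy"]


rounds = train_until_accurate(fake_update, fake_evaluate, threshold=0.8)
```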
As an optional implementation, before the historical disease descriptions are input into the LSTM network model to obtain the probability of each department for each historical disease description, the method further includes:
preprocessing the historical disease description.
As an optional implementation, preprocessing the historical disease description specifically includes:
performing data cleaning on the historical disease description.
Performing word segmentation on the cleaned historical disease description to obtain a plurality of words.
Specifically, jieba (a Chinese word segmentation tool) is used to segment the cleaned disease description text into a set of meaningful words. For example, the disease description "May I ask about a blood sugar problem: does high blood sugar mean I have diabetes? Hello, how should diabetes be treated?" is segmented into: ["may I ask", "blood sugar", "problem", ",", "blood sugar", "too high", "is", "suffering from", "diabetes", "(modal particle)", "?", "hello", "diabetes", "should", "how", "treat", "(modal particle)", "?"].
Mapping the words to a vector space to obtain a plurality of numerical data.
Specifically, a neural network model cannot process text strings directly; it can only process numerical data. The text strings therefore have to be converted into numerical form.
A pre-trained model built with RoBERTa is selected to map the words to a vector space, so that the inquiry text is converted into numerical data that can be used for neural network training. Because the pre-trained model was trained on a Chinese corpus, the resulting vectors carry rich Chinese semantic information, including word similarity and text similarity: similar words or texts are mapped to similar vectors. For example, the vectors of two synonymous Chinese words for "headache" are highly similar. This makes the representation well suited to the task.
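The similarity property can be illustrated with a toy example. The sketch below does not use RoBERTa: it swaps in a hand-written lookup table of hypothetical 3-dimensional vectors (`TOY_VECTORS`) purely to show what it means for similar words to have similar vectors and how a word list becomes a numerical matrix with one row per word.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical vectors for illustration only. Real vectors would come from
# the pre-trained model and have far more dimensions.
TOY_VECTORS = {
    "headache": [0.9, 0.1, 0.0],
    "migraine": [0.8, 0.2, 0.1],
    "fracture": [0.0, 0.2, 0.9],
}


def embed(words):
    """Map a word list to a matrix: one row per word, one column per dimension."""
    return [TOY_VECTORS[word] for word in words]


matrix = embed(["headache", "fracture"])
similar = cosine_similarity(TOY_VECTORS["headache"], TOY_VECTORS["migraine"])
different = cosine_similarity(TOY_VECTORS["headache"], TOY_VECTORS["fracture"])
```

Here `similar` is close to 1 and `different` is close to 0, mirroring the behavior described for semantically related and unrelated words.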
As an optional implementation, performing data cleaning on the historical disease description specifically includes:
performing at least one cleaning operation on the historical disease description; the cleaning operations comprise character deletion, letter conversion, prototype conversion, space deletion and information deletion.
Character deletion is to delete, from the historical disease description, irrelevant characters, characters that are not English, Chinese or numeric, punctuation marks that are neither Chinese nor English, and hyperlinks; the irrelevant characters include: HTML tags, garbled text, special characters, and tags.
Letter conversion is to convert uppercase English letters in the historical disease description into lowercase letters.
Prototype conversion is to convert English words into their base forms.
Space deletion is to delete redundant spaces.
Information deletion is to delete at least one item of text information containing personal information; the text information includes: name, contact details, and personal address.
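A few of these cleaning operations can be sketched with regular expressions. This is only an approximation under stated assumptions: the character ranges, the punctuation whitelist and the 11-digit phone pattern are choices made for this example, and removing names or addresses would need more than a regular expression, so it is not attempted here. Prototype conversion is also left out.

```python
import re

HTML_TAG = re.compile(r"<[^>]+>")
# Restrict links to printable ASCII so adjacent Chinese text is not swallowed.
HYPERLINK = re.compile(r"https?://[!-~]+|www\.[!-~]+")
# Assumption: contact phone numbers appear as 11 consecutive digits.
PHONE = re.compile(r"\d{11}")
# Keep Chinese characters, English letters, digits, whitespace and a small
# set of Chinese and English punctuation marks; everything else is removed.
DISALLOWED = re.compile(r"[^\u4e00-\u9fffA-Za-z0-9\s,.?!:;，。？！：；、]")
SPACES = re.compile(r"\s+")


def clean_description(text):
    text = HTML_TAG.sub(" ", text)      # character deletion: HTML tags
    text = HYPERLINK.sub(" ", text)     # character deletion: hyperlinks
    text = PHONE.sub(" ", text)         # information deletion: phone numbers
    text = DISALLOWED.sub(" ", text)    # character deletion: other symbols
    text = text.lower()                 # letter conversion
    return SPACES.sub(" ", text).strip()  # space deletion


raw = "<p>Headache  for 3 DAYS!</p> see http://example.com/a ★ call 13800000000"
cleaned = clean_description(raw)
```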
Specifically, directly collected data may contain a lot of noise, such as Hypertext Markup Language (HTML) tags, garbled text and special characters, and inquiry data is sensitive. Before model training, the data therefore has to be cleaned, which specifically includes:
1) Removing irrelevant characters programmatically, such as HTML tags, garbled text, special characters and tags.
2) Processing English characters with the Natural Language Toolkit (NLTK). Uppercase letters are converted to lowercase, and each English word is reduced to its stem, for example an inflected form of "like" is converted to "like"; this prevents words with the same meaning from appearing in different forms, which would make the vocabulary sparse. Spelling correction is also needed: when English is typed, a letter in a word may well be wrong, so that the word cannot be recognized during word vectorization. When an English word is judged not to exist, words with the same prefix are looked up in NLTK's WordNet, and the replacement candidate is chosen by minimum edit distance (the minimum number of character insertions, deletions or substitutions needed to turn the current word into the target word). For example, for a misspelled word whose candidates include "apps", "applets" and "applications", the candidate with the smallest edit distance is substituted to restore the meaning. If several candidates have the same edit distance, the choice is further confirmed according to the surrounding semantics or the everyday frequency of the words.
3) Processing Chinese words. Characters that are not English, Chinese or numeric, punctuation marks that are neither Chinese nor English, special characters, hyperlinks, garbled text and the like are deleted so that they do not distort the meaning of the text. Redundant spaces in the Chinese word sequence are also deleted.
4) To ensure that the training data does not involve private information, text containing personal identity information is deleted, such as name, contact phone number and personal address.
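The spelling-correction step in item 2) can be sketched as follows. The sketch makes several assumptions: a small hypothetical word list stands in for NLTK's WordNet, "same prefix" is taken to mean the first two letters, and ties are broken by list order rather than by context or word frequency as the description suggests.

```python
def edit_distance(source, target):
    """Minimum number of insertions, deletions or substitutions."""
    previous = list(range(len(target) + 1))
    for i, source_char in enumerate(source, 1):
        current = [i]
        for j, target_char in enumerate(target, 1):
            current.append(min(
                previous[j] + 1,                                # deletion
                current[j - 1] + 1,                             # insertion
                previous[j - 1] + (source_char != target_char)  # substitution
            ))
        previous = current
    return previous[-1]


def correct_word(word, vocabulary, prefix_len=2):
    """Replace an unknown word by the closest known word with the same prefix."""
    if word in vocabulary:
        return word
    candidates = [w for w in vocabulary if w[:prefix_len] == word[:prefix_len]]
    if not candidates:
        return word  # nothing plausible to substitute
    return min(candidates, key=lambda w: edit_distance(word, w))


# Hypothetical vocabulary standing in for a real dictionary.
VOCABULARY = ["headache", "head", "heart", "fever"]
corrected = correct_word("headahce", VOCABULARY)
```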
As an optional implementation, judging whether to stop training according to the department with the highest probability and the corresponding historical department specifically includes:
for any historical disease description, judging whether the department with the highest output probability is consistent with the corresponding historical department.
Counting the number of historical disease descriptions for which the department with the highest probability is consistent with the corresponding historical department, to obtain a target number.
Calculating the accuracy of the LSTM network model at the current training iteration according to the target number and the total number of historical disease descriptions.
If the accuracy is greater than the set threshold, stopping training.
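The stopping criterion amounts to a plain accuracy check, sketched below. The data layout (one dictionary of department probabilities per description), the department names and the threshold value are all assumptions made for the example.

```python
def accuracy(predicted_probabilities, historical_departments):
    """Share of descriptions whose top-probability department matches the label."""
    target_number = 0
    for probabilities, expected in zip(predicted_probabilities,
                                       historical_departments):
        best_department = max(probabilities, key=probabilities.get)
        if best_department == expected:
            target_number += 1
    return target_number / len(historical_departments)


def should_stop(current_accuracy, threshold):
    return current_accuracy > threshold


# Hypothetical model outputs for three historical disease descriptions.
outputs = [
    {"endocrinology": 0.7, "cardiology": 0.2, "neurology": 0.1},
    {"endocrinology": 0.1, "cardiology": 0.3, "neurology": 0.6},
    {"endocrinology": 0.5, "cardiology": 0.4, "neurology": 0.1},
]
labels = ["endocrinology", "neurology", "cardiology"]

current_accuracy = accuracy(outputs, labels)   # two of three are correct
stop = should_stop(current_accuracy, threshold=0.9)
```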
In actual use, the goal is an application that can be embedded into the portal site of any hospital. A simple, easy-to-use web application is therefore designed to provide the prediction service.
For the Web front end, a simple website is built with HTML, CSS, JavaScript and Bootstrap, comprising a department description page, a symptom triage page and a semantic triage page. On the department description page, each hospital can modify the department descriptions to fit its own situation. The symptom triage page lets the user select the affected body part and the corresponding symptoms, and the user obtains a department prediction simply by selecting their own condition. On the semantic triage page, the patient does not need to provide any personal information: entering a description of the condition is enough to obtain a department prediction, so the user's privacy is not involved.
The Web back end is built with Django (a Python Web framework). It processes the user's input, converts the disease description text into numerical values, feeds them into the model, converts the model output into a probability distribution over departments, and finally returns the five departments with the highest probabilities to the web page, so that the user can see clearly which department to visit, which achieves the triage effect.
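The last back-end step, turning model output into the five most probable departments, can be sketched as follows. The use of a softmax to obtain the distribution is an assumption, since the description does not say how the distribution is computed, and the department names and scores are hypothetical.

```python
import math


def softmax(scores):
    highest = max(scores)  # subtract the maximum for numerical stability
    exponentials = [math.exp(score - highest) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


def top_departments(scores_by_department, k=5):
    """Return the k (department, probability) pairs with the highest probability."""
    names = list(scores_by_department)
    probabilities = softmax([scores_by_department[name] for name in names])
    ranked = sorted(zip(names, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Hypothetical raw model scores for six departments.
example_scores = {
    "endocrinology": 3.1,
    "cardiology": 1.2,
    "neurology": 0.4,
    "orthopedics": -0.5,
    "dermatology": -1.0,
    "ophthalmology": -2.3,
}
top_five = top_departments(example_scores)
```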
Fig. 3 is a structural diagram of a semantic triage system according to an embodiment of the present invention. As shown in fig. 3, the semantic triage system in this embodiment includes:
and an actual inquiry data acquiring module 201, configured to acquire actual inquiry data.
And the target department determining module 202 is used for inputting the actual inquiry data into the department classification model to obtain the target department.
The establishment method of the department classification model comprises the following steps:
acquiring historical inquiry data; the historical interrogation data includes historical disease descriptions and corresponding historical departments.
And training the long-term and short-term memory network model according to historical inquiry data to obtain a department classification model.
As an optional implementation, the system further comprises: and a department classification model building module.
The department classification model building module is used for:
acquiring historical inquiry data; the historical inquiry data comprises historical disease description and corresponding historical departments; and training the long-term and short-term memory network model according to historical inquiry data to obtain a department classification model.
The department classification model building module specifically comprises:
A probability determining unit, configured to input the historical disease descriptions into the long short-term memory network model to obtain, for each historical disease description, the probability of each department.
A stopping judgment unit, configured to judge whether to stop training according to the department with the highest probability and the corresponding historical department.
A department classification model generating unit, configured to take the long short-term memory network model at the current number of training iterations as the department classification model when the stopping judgment unit judges that training should stop.
A continuous training unit, configured to update the parameters of the long short-term memory network model and carry out the next round of training when the stopping judgment unit judges that training should continue.
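The four units above form a simple control loop: predict, check the stopping condition, then either keep the model or update it and repeat. A minimal sketch of that loop follows. It assumes a hypothetical `model` object exposing `predict_proba(text)` (returning a department-to-probability mapping) and `update(descriptions, departments)`; these method names, the `should_stop` callback and the `max_rounds` safety cap are illustrative and do not come from the patent.

```python
def train_until_stopped(model, descriptions, departments, should_stop, max_rounds=100):
    """Repeat predict -> stop check -> update until should_stop returns True."""
    for round_number in range(1, max_rounds + 1):
        # Probability determining unit: one {department: probability} dict per description.
        probabilities = [model.predict_proba(text) for text in descriptions]
        # Department with the highest probability for each description.
        predicted = [max(p, key=p.get) for p in probabilities]
        # Stopping judgment unit: compare predictions with the historical departments.
        if should_stop(predicted, departments):
            # Model generating unit: the model at this round becomes the classifier.
            return model, round_number
        # Continuous training unit: update parameters and go to the next round.
        model.update(descriptions, departments)
    # Safety cap reached (an assumption of this sketch, not part of the source).
    return model, max_rounds
```

Passing the stopping rule in as a callback keeps the loop independent of how the rule is defined.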
As an optional implementation, the department classification model building module 203 further includes a preprocessing unit, configured to preprocess the historical disease descriptions before they are input into the long short-term memory network model to obtain the probability of each department for each historical disease description.
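The claims break this preprocessing into data cleaning, word segmentation, and mapping words to numerical data. The sketch below illustrates those three stages under simplifying assumptions: the regular expressions cover only part of the cleaning operations (no lemmatization or removal of personal information), the segmentation is a stand-in that treats each Chinese character as a token (a real system would likely use a dedicated Chinese word segmenter), and words are mapped to integer ids that an embedding layer could later turn into vectors. All function names and the reserved ids are hypothetical.

```python
import re

PAD_ID, UNKNOWN_ID = 0, 1  # reserved ids (an assumption of this sketch)


def clean(text):
    """Data cleaning: drop HTML tags, hyperlinks and stray characters, lowercase, trim spaces."""
    text = re.sub(r"<[^>]+>", " ", text)       # HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # hyperlinks
    text = text.lower()                        # uppercase -> lowercase
    # Keep English letters, digits, Chinese characters, whitespace and common punctuation.
    text = re.sub(r"[^a-z0-9\u4e00-\u9fff\s.,!?;:，。！？；：]", " ", text)
    return re.sub(r"\s+", " ", text).strip()   # redundant spaces


def segment(text):
    """Stand-in segmentation: English/digit runs as words, each Chinese character alone."""
    return re.findall(r"[a-z0-9]+|[\u4e00-\u9fff]", text)


def build_vocabulary(token_lists):
    """Assign each distinct token an integer id, starting after the reserved ids."""
    vocabulary = {}
    for tokens in token_lists:
        for token in tokens:
            vocabulary.setdefault(token, len(vocabulary) + 2)
    return vocabulary


def encode(tokens, vocabulary, max_len):
    """Map tokens to ids, truncating or padding to a fixed length."""
    ids = [vocabulary.get(t, UNKNOWN_ID) for t in tokens[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))
```

Fixed-length id sequences are a common input format for sequence models, which is why `encode` pads and truncates; the source does not state what length, if any, is used.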
As an optional implementation, the stopping judgment unit specifically includes:
A consistency judgment subunit, configured to judge, for any historical disease description, whether the department with the maximum output probability is consistent with the corresponding historical department.
A counting subunit, configured to count the number of historical disease descriptions for which the department with the maximum probability is consistent with the corresponding historical department, to obtain the target number.
An accuracy calculation subunit, configured to calculate the accuracy of the long short-term memory network model at the current number of training iterations according to the target number and the total number of historical disease descriptions.
A training stopping subunit, configured to stop training when the accuracy is greater than a set threshold.
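The stopping rule described by these subunits reduces to a ratio and a comparison: accuracy equals the target number divided by the total number of historical disease descriptions, and training stops once that accuracy exceeds the threshold. A minimal sketch follows; the function names and the default threshold of 0.9 are hypothetical, since the source only speaks of "a set threshold".

```python
def accuracy(predicted, historical):
    """Fraction of descriptions whose predicted department matches the historical one."""
    if len(predicted) != len(historical):
        raise ValueError("predicted and historical must have the same length")
    if not historical:
        raise ValueError("at least one historical disease description is required")
    # Target number: count of descriptions where prediction and record agree.
    target_number = sum(1 for p, h in zip(predicted, historical) if p == h)
    return target_number / len(historical)


def should_stop(predicted, historical, threshold=0.9):
    """Stop training only when accuracy is strictly greater than the threshold."""
    return accuracy(predicted, historical) > threshold
```

Note the strict inequality: an accuracy exactly equal to the threshold does not stop training, matching the "greater than" wording of the source.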
The embodiments in this description are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the system disclosed in the embodiment corresponds to the method disclosed in the embodiment, its description is relatively brief, and the relevant points can be found in the description of the method.
The principles and embodiments of the present invention have been described herein using specific examples, which are presented solely to aid in the understanding of the apparatus and its core concepts; meanwhile, for a person skilled in the art, according to the idea of the present invention, the specific embodiments and the application range may be changed. In view of the above, the present disclosure should not be construed as limiting the invention.

Claims (10)

1. A method of semantic triage, the method comprising:
acquiring actual inquiry data;
inputting the actual inquiry data into a department classification model to obtain a target department;
the department classification model is established by:
acquiring historical inquiry data; the historical inquiry data comprises historical disease descriptions and corresponding historical departments;
and training a long short-term memory (LSTM) network model according to the historical inquiry data to obtain the department classification model.
2. The semantic triage method according to claim 1, wherein the training of the long short-term memory network model according to the historical inquiry data to obtain the department classification model specifically comprises:
inputting the historical disease descriptions into the long short-term memory network model to obtain the probability of each department corresponding to each historical disease description;
judging whether to stop training according to the department with the highest probability and the corresponding historical department;
if so, taking the long short-term memory network model at the current number of training iterations as the department classification model;
if not, updating the parameters of the long short-term memory network model and carrying out the next round of training.
3. The semantic triage method according to claim 2, further comprising, before the inputting of the historical disease descriptions into the long short-term memory network model to obtain the probability of each department corresponding to each historical disease description:
preprocessing the historical disease description.
4. The semantic triage method according to claim 3, wherein the preprocessing the historical disease description specifically comprises:
performing data cleaning on the historical disease description;
performing word segmentation processing on the historical disease description after data cleaning to obtain a plurality of words;
and mapping the words to a vector space to obtain a plurality of numerical data.
5. The semantic triage method according to claim 4, wherein the data cleaning of the historical disease description specifically comprises:
performing at least one cleaning operation on the historical disease description; the cleaning operations comprise character deletion, letter case conversion, lemmatization, space deletion and information deletion;
the character deletion deletes irrelevant characters, hyperlinks, and any characters in the historical disease description that are not English letters, Chinese characters, digits, or Chinese or English punctuation marks; the irrelevant characters include: HTML tags, garbled characters, special characters and tags;
the letter case conversion converts uppercase English letters in the historical disease description into lowercase letters;
the lemmatization converts English words into their base forms;
the space deletion deletes redundant spaces;
the information deletion deletes text information; the text information includes: name, contact address, and personal address.
6. The semantic triage method according to claim 2, wherein the step of judging whether to stop training according to the department with the highest probability and the corresponding historical department specifically comprises:
for any historical disease description, judging whether the department with the maximum output probability is consistent with the corresponding historical department;
counting the number of historical disease descriptions for which the department with the highest probability is consistent with the corresponding historical department, to obtain the target number;
calculating the accuracy of the long short-term memory network model at the current number of training iterations according to the target number and the total number of the historical disease descriptions;
and if the accuracy is greater than a set threshold, stopping training.
7. A semantic triage system, the system comprising:
the actual inquiry data acquisition module is used for acquiring actual inquiry data;
the target department determining module is used for inputting the actual inquiry data into a department classification model to obtain a target department;
the department classification model is established by:
acquiring historical inquiry data; the historical inquiry data comprises historical disease descriptions and corresponding historical departments;
and training a long short-term memory (LSTM) network model according to the historical inquiry data to obtain the department classification model.
8. The semantic triage system according to claim 7, further comprising: a department classification model establishing module;
the department classification model establishing module is used for:
acquiring historical inquiry data; the historical inquiry data comprises historical disease descriptions and corresponding historical departments; training a long short-term memory network model according to the historical inquiry data to obtain the department classification model;
the department classification model establishing module specifically comprises:
a probability determining unit, configured to input the historical disease descriptions into the long short-term memory network model to obtain the probability of each department corresponding to each historical disease description;
a stopping judgment unit, configured to judge whether to stop training according to the department with the highest probability and the corresponding historical department;
a department classification model generation unit, configured to, when the stopping judgment unit judges that training should stop, take the long short-term memory network model at the current number of training iterations as the department classification model;
and a continuous training unit, configured to update the parameters of the long short-term memory network model and carry out the next round of training when the stopping judgment unit judges that training should continue.
9. The semantic triage system of claim 8, wherein the department classification model establishing module further comprises: a preprocessing unit, configured to preprocess the historical disease descriptions before the historical disease descriptions are input into the long short-term memory network model to obtain the probability of each department corresponding to each historical disease description.
10. The semantic triage system according to claim 8, wherein the stopping judgment unit specifically includes:
the consistency judgment subunit is used for judging whether the department with the highest output probability is consistent with the corresponding historical department or not for any historical disease description;
the counting subunit is used for counting the number of historical disease descriptions for which the department with the maximum probability is consistent with the corresponding historical department, to obtain the target number;
the accuracy calculation subunit is used for calculating the accuracy of the long short-term memory network model at the current number of training iterations according to the target number and the total number of the historical disease descriptions;
and the training stopping subunit is used for stopping training when the accuracy is greater than a set threshold.
CN202110793215.3A 2021-07-14 2021-07-14 Semantic triage method and system Active CN113488152B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110793215.3A CN113488152B (en) 2021-07-14 2021-07-14 Semantic triage method and system

Publications (2)

Publication Number Publication Date
CN113488152A true CN113488152A (en) 2021-10-08
CN113488152B CN113488152B (en) 2022-06-24

Family

ID=77939068

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110793215.3A Active CN113488152B (en) 2021-07-14 2021-07-14 Semantic triage method and system

Country Status (1)

Country Link
CN (1) CN113488152B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180315182A1 (en) * 2017-04-28 2018-11-01 Siemens Healthcare Gmbh Rapid assessment and outcome analysis for medical patients
CN108922608A (en) * 2018-06-13 2018-11-30 平安医疗科技有限公司 Intelligent hospital guide's method, apparatus, computer equipment and storage medium
CN110085308A (en) * 2019-04-23 2019-08-02 挂号网(杭州)科技有限公司 A kind of diagnosis and treatment department classification method based on fusion deep learning
CN110277155A (en) * 2019-06-19 2019-09-24 秒针信息技术有限公司 Hospital guide's method and device, storage medium, electronic device
CN112700862A (en) * 2020-12-25 2021-04-23 上海钛米机器人股份有限公司 Target department determining method and device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116344009A (en) * 2023-05-22 2023-06-27 武汉盛博汇信息技术有限公司 Online diagnosis notification method and device
CN116344009B (en) * 2023-05-22 2023-08-15 武汉盛博汇信息技术有限公司 Online diagnosis notification method and device

Also Published As

Publication number Publication date
CN113488152B (en) 2022-06-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant