CN115394428A - Multi-model cooperative patient state monitoring method - Google Patents

Multi-model cooperative patient state monitoring method

Info

Publication number
CN115394428A
Authority
CN
China
Prior art keywords
patient
liquid level
network model
face
liquid
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210925053.9A
Other languages
Chinese (zh)
Inventor
任涛涛
江左文
刘博文
罗嘉楠
罗彬瑞
刘翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
College of Science and Technology of Ningbo University
Original Assignee
College of Science and Technology of Ningbo University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by College of Science and Technology of Ningbo University filed Critical College of Science and Technology of Ningbo University
Priority to CN202210925053.9A priority Critical patent/CN115394428A/en
Publication of CN115394428A publication Critical patent/CN115394428A/en
Withdrawn legal-status Critical Current


Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/766: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00: Arrangements for image or video recognition or understanding
    • G06V10/70: Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82: Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00: Scenes; Scene-specific elements
    • G06V20/35: Categorising the entire scene, e.g. birthday party or wedding scene
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V40/174: Facial expression recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a patient state monitoring method based on multi-model cooperation, comprising the following steps: S1, obtaining a trained LiquidNet network model; S2, establishing a network side, where a network camera captures doctor-patient scene images; S3, establishing a server side comprising a target localization module and a liquid level and expression recognition module; S4, the target localization module of the server side locates the face and the drip bottle in the doctor-patient scene image; S5, once the face and the drip bottle are located, the liquid level and expression recognition module of the server side recognizes the liquid level of the drip bottle and the facial expression; S6, displaying the final liquid level output and any abnormal expression state on a UI while playing a voice prompt. The method can monitor abnormal states of the liquid medicine and of the patient during infusion promptly and accurately, assist medical staff with doctor-patient management, reduce their workload, and provide better service to patients.

Description

Multi-model cooperative patient state monitoring method
Technical Field
The invention relates to the technical field of target detection, and in particular to a multi-model cooperative patient state monitoring method.
Background
Thanks to the rapid development of deep learning in recent years, speech recognition and image recognition have made important progress. For example, with the development of convolutional neural networks, strong target detection architectures such as R-CNN and YOLO have appeared and are now visible in everyday life, and these advances have driven vigorous development across many industries.
In today's hospitals, insufficient medical staff can lead to safety issues; for example, abnormal conditions of a patient during infusion may not be discovered in time.
Hospitals have long faced a situation of "many patients, difficult monitoring". Related research shows that although nursing teams have expanded in recent years, front-line nursing resources remain insufficient. From the infusion perspective, nurses face two main blind spots: 1. the infusion bottle needs to be replaced in time; if it is not, and the patient has limited mobility with no caregiver nearby, the patient suffers unnecessary trouble; 2. patients occasionally have adverse reactions during infusion; while a late bottle replacement is a minor issue, an adverse reaction that a nurse fails to notice in time can have irreparable consequences.
Because the hospital environment is complicated and changeable, traditional patient state monitoring methods struggle to adapt. Here, the patient state covers both the abnormal state of the liquid medicine during infusion, which usually means whether the liquid medicine in the infusion bottle has been fully infused, and the abnormal state of the patient, which usually means adverse reactions during infusion. With the rapid development of artificial intelligence, however, it becomes possible to provide a multi-model cooperative patient state monitoring method, with an accompanying alarm system, that can adapt to this environment. To relieve hospital staff shortages and explore practical applications of artificial intelligence, this invention combines existing methods and proposes a possible solution.
Disclosure of Invention
The invention aims to provide a multi-model cooperative patient state monitoring method that can accurately and promptly monitor the abnormal state of the liquid medicine and the abnormal state of the patient during infusion, so as to relieve hospital staff shortages.
The technical solution adopted by the invention is a multi-model cooperative patient state monitoring method comprising the following steps:
S1, establishing a DarkNet53-based network model and training it to obtain a trained network model, named the LiquidNet network model;
S2, establishing a network side comprising a Raspberry Pi connected to a local area network and a network camera connected to the Raspberry Pi; the Raspberry Pi is placed several meters away, facing the patient, and the network camera captures a doctor-patient scene image that fully contains the patient's entire face and the entire drip bottle;
S3, establishing a server side comprising a target localization module and a liquid level and expression recognition module; the target localization module comprises a support vector machine model for locating the drip bottle, an OpenCV-based cascade face localization model, and a YOLOv3-based network model; the liquid level and expression recognition module comprises a liquid level height recognition module and a patient expression recognition module, the liquid level height recognition module comprising a GoogLeNetV4 network model, a DeepLabV3Plus network model, and the LiquidNet network model obtained in step S1; the GoogLeNetV4 network model outputs the confidence that the liquid level line is below the warning line; the DeepLabV3Plus network model segments the pixel range of the bottle body without liquid, from which the percentage of liquid remaining in the drip bottle is computed; the LiquidNet network model performs regression estimation of the liquid level line; the patient expression recognition module comprises a FaceCNN network model used to recognize whether the patient's expression is abnormal;
S4, the doctor-patient scene image captured by the network side is sent to the server side, and the target localization module of the server side locates the face and the drip bottle in the image as follows: the support vector machine model locates the drip bottle, and the OpenCV-based cascade face localization model locates the face; if the support vector machine model fails to locate the drip bottle and/or the cascade face localization model fails to locate the face, the YOLOv3-based network model is started for localization;
S5, once the face and the drip bottle are located, the liquid level and expression recognition module of the server side recognizes the liquid level of the drip bottle and the facial expression. The liquid level is recognized as follows: the GoogLeNetV4 network model outputs the confidence that the liquid level line in the drip bottle is below the warning line, the DeepLabV3Plus network model outputs the percentage of liquid remaining in the drip bottle, and the LiquidNet network model estimates the position of the liquid level line; if at least two of the three models indicate that the liquid level line in the drip bottle is below the normal standard, a below-standard liquid level is taken as the final liquid level output. The facial expression is recognized as follows: the located face image is cropped and converted to grayscale; the grayscale image is fed into the FaceCNN network model, a Sigmoid operation is applied to the model's output, and the result is compared with a threshold; if it exceeds the threshold, the patient's expression state is abnormal, otherwise it is normal;
S6, displaying the final liquid level output and any abnormal expression state obtained in step S5 on a UI interface, while playing a voice prompt announcing the abnormal patient state.
The invention has the following beneficial effects: with this multi-model cooperative patient state monitoring method, abnormal states of the liquid medicine and of the patient during infusion can be monitored promptly and accurately, assisting medical staff with doctor-patient management, reducing their workload, providing better service to patients, and reducing the number of infusion accidents.
Preferably, in step S1, the DarkNet53-based network model comprises an Encoder module and a Decoder module. Let the input feature map be F, F ∈ R^{1×3×H×W}, with an n×n feature matrix. The Encoder module performs residual feature encoding on the feature map F five times, yielding a feature matrix of size (n/32)×(n/32); the Decoder module performs four decoding operations on the encoded feature matrix, finally regressing a feature matrix of size n×n, then adjusts the channel count with a 1×1 convolution and, through one linear layer and a sigmoid activation function, obtains a 1×2 feature matrix as the final output.
Preferably, the Encoder module performs the five residual feature encodings on the feature map F as follows: a. the width and height of the feature layers of F are scaled five times, each time using a 3×3 2D convolution with stride 2 that scales the feature map to 1/2 of its original size; each scaling doubles the number of feature channels, and after each scaling the resulting feature map passes through a BatchNormalize function and a LeakyReLU activation function for feature extraction, giving the final extracted feature matrix after the five scalings; b. the feature matrix obtained in step a is fed into a fixed number of residual blocks for renewed feature extraction; c. the feature matrices from steps a and b are weight-fused, completing the five residual feature encodings of F.
Preferably, the Decoder module performs the four decoding operations on the encoded feature matrix as follows: the feature map F after the five residual feature encodings is input into the Decoder module for four rounds of reverse feature extraction, each round enlarging the feature map to twice its original size; the Decoder module then fuses feature maps of the same shape among the four outputs, finally obtaining the decoded feature map.
Preferably, in step S1, training the DarkNet53-based network model to obtain the trained LiquidNet network model comprises the following steps:
S01, placing the Raspberry Pi several meters away, facing the patient, connecting it to the network camera, and capturing several doctor-patient scene simulation images with the network camera, the images fully containing the patient's entire face and the entire drip bottle;
S02, preprocessing and labeling the doctor-patient scene simulation images acquired in step S01 to obtain several data sets, as follows:
S02.1, labeling the face position and the drip bottle position on the doctor-patient scene simulation images with the labelImg tool, and transcoding the labeling results to obtain a data-augmented WIDER FACE data set;
S02.2, cropping the drip bottle data out of the WIDER FACE data set obtained in step S02.1 using OpenCV;
S02.3, marking the liquid level line with a dual-threshold marking tool to form a regression-prediction liquid level line data set;
S03, inputting the regression-prediction liquid level line data set obtained in step S02.3 into the LiquidNet network model for training to obtain the trained LiquidNet network model.
Preferably, in step S5, estimating the position of the liquid level line through the LiquidNet network model comprises the following steps:
S501, inputting the doctor-patient scene image in which the face and the drip bottle have been located into the LiquidNet network model, which outputs two thresholds;
S502, converting the color space of that image to HSV, creating a binary image with one threshold of the LiquidNet network model, finding the pixel positions whose value is 1 in the binary image, and blurring the pixels at the same positions in the original image in order to eliminate interfering pixel information on the drip bottle;
S503, from the drip bottle image with the interfering pixels removed in step S502, creating an enhanced binary image with the other threshold of the LiquidNet network model, computing vertically the maximum difference between every two adjacent rows of pixels on the enhanced binary image to find the liquid level line, and finally computing the liquid level from the position of the liquid level line according to a formula.
Drawings
FIG. 1 is a flow chart of a multi-model collaborative patient status monitoring method of the present invention;
FIG. 2 is a schematic illustration of partitioned data sets in the present invention;
FIG. 3 is a schematic diagram of an image shown in a data set in the present invention;
FIG. 4 is a structural diagram of the LiquidNet network model in the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawings in combination with specific embodiments so that those skilled in the art can practice the invention with reference to the description, and the scope of the invention is not limited to the specific embodiments.
It will be understood by those skilled in the art that in the present disclosure, the terms "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like are used in an orientation or positional relationship indicated in the drawings for ease of description and simplicity of description, and do not indicate or imply that the referenced device or element must have a particular orientation, be constructed and operated in a particular orientation, and thus, the above terms should not be construed as limiting the present invention.
The invention relates to a multi-model cooperative patient state monitoring method which, as shown in FIG. 1, comprises the following steps:
S1, establishing a DarkNet53-based network model and training it to obtain a trained network model, named the LiquidNet network model;
S2, establishing a network side comprising a Raspberry Pi connected to a local area network and a network camera connected to the Raspberry Pi; the Raspberry Pi is placed several meters away, facing the patient, and the network camera captures a doctor-patient scene image that fully contains the patient's entire face and the entire drip bottle;
S3, establishing a server side comprising a target localization module and a liquid level and expression recognition module; the target localization module comprises a support vector machine model for locating the drip bottle, an OpenCV-based cascade face localization model, and a YOLOv3-based network model; the liquid level and expression recognition module comprises a liquid level height recognition module and a patient expression recognition module, the liquid level height recognition module comprising a GoogLeNetV4 network model, a DeepLabV3Plus network model, and the LiquidNet network model obtained in step S1; the GoogLeNetV4 network model outputs the confidence that the liquid level line is below the warning line; the DeepLabV3Plus network model segments the pixel range of the bottle body without liquid, from which the percentage of liquid remaining in the drip bottle is computed; the LiquidNet network model performs regression estimation of the liquid level line; the patient expression recognition module comprises a FaceCNN network model used to recognize whether the patient's expression is abnormal;
S4, the doctor-patient scene image captured by the network side is sent to the server side, and the target localization module of the server side locates the face and the drip bottle in the image as follows: the support vector machine model locates the drip bottle, and the OpenCV-based cascade face localization model locates the face; if the support vector machine model fails to locate the drip bottle and/or the cascade face localization model fails to locate the face, the YOLOv3-based network model is started for localization (a localization fallback sketch is given after step S6 below);
S5, once the face and the drip bottle are located, the liquid level and expression recognition module of the server side recognizes the liquid level of the drip bottle and the facial expression. The liquid level is recognized as follows: the GoogLeNetV4 network model outputs the confidence that the liquid level line in the drip bottle is below the warning line, the DeepLabV3Plus network model outputs the percentage of liquid remaining in the drip bottle, and the LiquidNet network model estimates the position of the liquid level line; if at least two of the three models indicate that the liquid level line in the drip bottle is below the normal standard, a below-standard liquid level is taken as the final liquid level output. The facial expression is recognized as follows: the located face image is cropped and converted to grayscale; the grayscale image is fed into the FaceCNN network model, a Sigmoid operation is applied to the model's output, and the result is compared with a threshold; if it exceeds the threshold, the patient's expression state is abnormal, otherwise it is normal (a sketch of this two-of-three fusion and expression check also follows step S6 below);
S6, displaying the final liquid level output and any abnormal expression state obtained in step S5 on a UI interface, while playing a voice prompt announcing the abnormal patient state.
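For illustration, a minimal sketch of the step-S4 localization fallback follows, written in Python with OpenCV. The SVM bottle locator and the YOLOv3 detector are passed in as callables, since the patent does not fix their interfaces, and the Haar cascade file name is an assumption rather than something the patent specifies.

```python
# Sketch of the step-S4 fallback: try the lightweight locators first (OpenCV
# cascade for the face, an SVM-based detector for the bottle) and start the
# heavier YOLOv3 model only when either of them fails.
import cv2

# Assumed cascade file; the patent only says "cascade face localization model".
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

def locate(frame, svm_detect_bottle, yolo_detect):
    """Return (face_box, bottle_box), falling back to YOLOv3 when needed."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 5)
    bottles = svm_detect_bottle(frame)          # user-supplied SVM locator
    if len(faces) == 0 or len(bottles) == 0:
        # YOLOv3 locates both targets when either lightweight model fails.
        faces, bottles = yolo_detect(frame)
    return faces[0], bottles[0]
```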
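Likewise, a minimal sketch of the step-S5 two-of-three liquid level fusion and the sigmoid expression check; all threshold values here are illustrative assumptions, not values fixed by the patent.

```python
import math

ALERT_CONFIDENCE = 0.5    # GoogLeNetV4 confidence cutoff (assumed)
REMAINING_PCT_MIN = 10.0  # DeepLabV3Plus remaining-liquid cutoff, percent (assumed)
LINE_POS_LOW = 0.8        # LiquidNet normalized line position counted as low (assumed)
EXPR_THRESHOLD = 0.5      # sigmoid(FaceCNN output) above this means abnormal (assumed)

def liquid_level_low(conf_below_warning: float,
                     remaining_pct: float,
                     line_pos: float) -> bool:
    """Step S5: report a low level only when at least two of three models agree."""
    votes = [conf_below_warning > ALERT_CONFIDENCE,   # classification vote
             remaining_pct < REMAINING_PCT_MIN,       # segmentation vote
             line_pos > LINE_POS_LOW]                 # regression vote
    return sum(votes) >= 2

def expression_abnormal(facecnn_output: float) -> bool:
    """Step S5: apply a Sigmoid to the FaceCNN output and compare with a threshold."""
    return 1.0 / (1.0 + math.exp(-facecnn_output)) > EXPR_THRESHOLD
```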
With this multi-model cooperative patient state monitoring method, abnormal states of the liquid medicine and of the patient during infusion can be monitored promptly and accurately, assisting medical staff with doctor-patient management, reducing their workload, providing better service to patients, and reducing the number of infusion accidents.
In step S1, the DarkNet53-based network model comprises an Encoder module and a Decoder module. Let the input feature map be F, F ∈ R^{1×3×H×W}, with an n×n feature matrix. The Encoder module performs residual feature encoding on the feature map F five times, yielding a feature matrix of size (n/32)×(n/32); the Decoder module performs four decoding operations on the encoded feature matrix, finally regressing a feature matrix of size n×n, then adjusts the channel count with a 1×1 convolution and, through one linear layer and a sigmoid activation function, obtains a 1×2 feature matrix as the final output.
The Encoder module performs the five residual feature encodings on the feature map F as follows: a. the width and height of the feature layers of F are scaled five times, each time using a 3×3 2D convolution with stride 2 that scales the feature map to 1/2 of its original size; each scaling doubles the number of feature channels, and after each scaling the resulting feature map passes through a BatchNormalize function and a LeakyReLU activation function for feature extraction, giving the final extracted feature matrix after the five scalings; b. the feature matrix obtained in step a is fed into a fixed number of residual blocks for renewed feature extraction; c. the feature matrices from steps a and b are weight-fused, completing the five residual feature encodings of F.
The Decoder module performs the four decoding operations on the encoded feature matrix as follows: the feature map F after the five residual feature encodings is input into the Decoder module for four rounds of reverse feature extraction, each round enlarging the feature map to twice its original size; the Decoder module then fuses feature maps of the same shape among the four outputs, finally obtaining the decoded feature map.
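As a concrete reading of the Encoder/Decoder description above, the following PyTorch sketch implements five stride-2 encodings with channel doubling, BatchNorm and LeakyReLU, residual blocks per stage, four ×2 decodings with same-shape feature fusion, and a 1×1 convolution, linear layer and sigmoid producing the 1×2 output. The base channel width, the number of residual blocks, the LeakyReLU slope, and the pooling placed before the linear layer are assumptions the patent leaves open.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Residual(nn.Module):
    """3x3-3x3 residual block, used a fixed number of times per encoder stage."""
    def __init__(self, ch: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.LeakyReLU(0.1),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))

    def forward(self, x):
        return F.leaky_relu(x + self.body(x), 0.1)


class LiquidNet(nn.Module):
    def __init__(self, base: int = 32, blocks: int = 2):
        super().__init__()
        chans = [3] + [base * 2 ** i for i in range(5)]       # channels double per stage
        self.down = nn.ModuleList(
            nn.Sequential(nn.Conv2d(cin, cout, 3, stride=2, padding=1),  # halves H and W
                          nn.BatchNorm2d(cout), nn.LeakyReLU(0.1),
                          *[Residual(cout) for _ in range(blocks)])
            for cin, cout in zip(chans, chans[1:]))           # five residual encodings
        self.up = nn.ModuleList(
            nn.Sequential(nn.Upsample(scale_factor=2),        # doubles H and W
                          nn.Conv2d(cin, cout, 3, padding=1),
                          nn.BatchNorm2d(cout), nn.LeakyReLU(0.1))
            for cin, cout in zip(chans[:0:-1], chans[-2:0:-1]))  # four decodings
        self.squeeze = nn.Conv2d(chans[1], 8, 1)              # 1x1 channel adjustment
        self.pool = nn.AdaptiveAvgPool2d(4)                   # fixed size before the head (assumed)
        self.head = nn.Linear(8 * 4 * 4, 2)                   # 1x2 output: the two thresholds

    def forward(self, x):
        skips = []
        for stage in self.down:                               # five residual encodings
            x = stage(x)
            skips.append(x)
        for stage, skip in zip(self.up, reversed(skips[:-1])):
            x = stage(x) + skip                               # same-shape feature fusion
        return torch.sigmoid(self.head(self.pool(self.squeeze(x)).flatten(1)))


# Example: LiquidNet()(torch.randn(1, 3, 256, 256)) -> tensor of shape (1, 2)
```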
In step S1, training the DarkNet53-based network model to obtain the trained LiquidNet network model comprises the following steps:
S01, placing the Raspberry Pi several meters away, facing the patient, connecting it to the network camera, and capturing several doctor-patient scene simulation images with the network camera, the images fully containing the patient's entire face and the entire drip bottle;
S02, preprocessing and labeling the doctor-patient scene simulation images acquired in step S01 to obtain several data sets, as follows:
S02.1, labeling the face position and the drip bottle position on the doctor-patient scene simulation images with the labelImg tool, and transcoding the labeling results to obtain a data-augmented WIDER FACE data set;
S02.2, cropping the drip bottle data out of the WIDER FACE data set obtained in step S02.1 using OpenCV;
S02.3, marking the liquid level line with a dual-threshold marking tool to form a regression-prediction liquid level line data set, which is used to train the LiquidNet network model;
S03, inputting the regression-prediction liquid level line data set obtained in step S02.3 into the LiquidNet network model for training to obtain the trained LiquidNet network model.
In step S02.1, the labelImg tool is used to label the face position and the drip bottle position on the doctor-patient scene simulation images, and the labeling results are transcoded to obtain a data-augmented WIDER FACE data set, which can be used to train the YOLOv3-based network model in step S3; the YOLOv3-based network model is used to locate the drip bottle and the face.
In step S02.2, OpenCV is also used to crop the face data out of the WIDER FACE data set obtained in step S02.1, that is, to mark localization boxes on the face data; these boxes are used to train the OpenCV-based cascade face localization model in step S3. The cascade face localization model is an open-source algorithm that can be called directly, and its training follows the conventional procedure. The drip bottle localization boxes obtained in step S02.2 are used to train the support vector machine model; the support vector machine model involved in the invention is likewise an open-source algorithm that can be called directly, and its training also follows the conventional procedure.
After the face data and the drip bottle data are cropped, the face data are converted to grayscale and divided into normal and abnormal classes to form a patient state classification data set, which is used to train the FaceCNN network model; the drip bottle data are divided into high-level and low-level classes to form a liquid level state classification data set, which is used to train the GoogLeNetV4 network model; and the labelme tool is used to label the liquid regions in the drip bottle, forming an image segmentation data set used to train the DeepLabV3Plus network model. The FaceCNN, GoogLeNetV4 and DeepLabV3Plus network models are all open-source algorithms that can be called directly, and their training follows conventional procedures.
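A small OpenCV sketch of the cropping and graying described in this training pipeline follows; the file names and box coordinates are hypothetical stand-ins for values that would actually be parsed from the labelImg annotation files (Pascal VOC XML or YOLO txt).

```python
# Illustrative preprocessing for steps S01-S02: crop the annotated drip-bottle
# and face boxes out of a scene image and gray the face crop.
import cv2

scene = cv2.imread("scene_0001.jpg")                 # image from the network camera
bottle_box = (420, 60, 560, 380)                     # x1, y1, x2, y2 from labelImg (assumed)
face_box = (120, 150, 260, 320)

x1, y1, x2, y2 = bottle_box
bottle_crop = scene[y1:y2, x1:x2]                    # drip-bottle region for the level models
cv2.imwrite("bottle/scene_0001.png", bottle_crop)

x1, y1, x2, y2 = face_box
face_gray = cv2.cvtColor(scene[y1:y2, x1:x2], cv2.COLOR_BGR2GRAY)
cv2.imwrite("face_gray/scene_0001.png", face_gray)   # grayscale face for FaceCNN
```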
In step S5, estimating the position of the liquid level line through the LiquidNet network model comprises the following steps:
S501, inputting the doctor-patient scene image in which the face and the drip bottle have been located into the LiquidNet network model, which outputs two thresholds;
S502, converting the color space of that image to HSV, creating a binary image with one threshold of the LiquidNet network model, finding the pixel positions whose value is 1 in the binary image, and blurring the pixels at the same positions in the original image in order to eliminate interfering pixel information on the drip bottle;
S503, from the drip bottle image with the interfering pixels removed in step S502, creating an enhanced binary image with the other threshold of the LiquidNet network model, computing vertically the maximum difference between every two adjacent rows of pixels on the enhanced binary image to find the liquid level line, and finally computing the liquid level from the position of the liquid level line according to a formula.
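The following sketch shows one plausible OpenCV realization of steps S501 to S503. The patent does not state which HSV channel the two LiquidNet thresholds act on, and the final formula appears only as an image in the original publication, so the value-channel thresholding and the percentage computation below are assumptions.

```python
import cv2
import numpy as np

def estimate_level(bottle_bgr: np.ndarray, t_noise: float, t_liquid: float) -> float:
    """Estimate the remaining liquid level (percent) from a cropped bottle image."""
    hsv = cv2.cvtColor(bottle_bgr, cv2.COLOR_BGR2HSV)
    value = hsv[:, :, 2].astype(np.float32) / 255.0

    # S502: the first threshold marks interfering pixels (printed labels, scale
    # marks); blur those positions in the original image to suppress them.
    noise_mask = (value > t_noise).astype(np.uint8)
    blurred = cv2.GaussianBlur(bottle_bgr, (15, 15), 0)
    cleaned = np.where(noise_mask[..., None] == 1, blurred, bottle_bgr)

    # S503: the second threshold builds the enhanced binary image; the row whose
    # white-pixel count changes most from the row above is taken as the line.
    v2 = cv2.cvtColor(cleaned, cv2.COLOR_BGR2HSV)[:, :, 2].astype(np.float32) / 255.0
    binary = (v2 > t_liquid).astype(np.float32)
    row_sums = binary.sum(axis=1)
    line_y = int(np.argmax(np.abs(np.diff(row_sums)))) + 1

    # Remaining level as the fraction of the bottle below the detected line:
    # an assumed stand-in for the patent's unreproduced formula.
    return 100.0 * (binary.shape[0] - line_y) / binary.shape[0]
```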

Claims (6)

1. A multi-model cooperative patient state monitoring method, characterized by comprising the following steps:
S1, establishing a DarkNet53-based network model and training it to obtain a trained network model, named the LiquidNet network model;
S2, establishing a network side comprising a Raspberry Pi connected to a local area network and a network camera connected to the Raspberry Pi, the Raspberry Pi being placed several meters away, facing the patient, and the network camera capturing a doctor-patient scene image that fully contains the patient's entire face and the entire drip bottle;
S3, establishing a server side comprising a target localization module and a liquid level and expression recognition module, the target localization module comprising a support vector machine model for locating the drip bottle, an OpenCV-based cascade face localization model, and a YOLOv3-based network model; the liquid level and expression recognition module comprising a liquid level height recognition module and a patient expression recognition module, the liquid level height recognition module comprising a GoogLeNetV4 network model, a DeepLabV3Plus network model and the LiquidNet network model obtained in step S1, wherein the GoogLeNetV4 network model outputs the confidence that the liquid level line is below the warning line, the DeepLabV3Plus network model segments the pixel range of the bottle body without liquid so as to compute the percentage of liquid remaining in the drip bottle, and the LiquidNet network model performs regression estimation of the liquid level line; the patient expression recognition module comprising a FaceCNN network model used to recognize whether the patient's expression is abnormal;
S4, sending the doctor-patient scene image captured by the network side to the server side, where the target localization module locates the face and the drip bottle in the image as follows: the support vector machine model locates the drip bottle and the OpenCV-based cascade face localization model locates the face; if the support vector machine model fails to locate the drip bottle and/or the cascade face localization model fails to locate the face, the YOLOv3-based network model is started for localization;
S5, once the face and the drip bottle are located, the liquid level and expression recognition module of the server side recognizes the liquid level of the drip bottle and the facial expression; the liquid level is recognized as follows: the GoogLeNetV4 network model outputs the confidence that the liquid level line in the drip bottle is below the warning line, the DeepLabV3Plus network model outputs the percentage of liquid remaining in the drip bottle, and the LiquidNet network model estimates the position of the liquid level line; if at least two of the three models indicate that the liquid level line in the drip bottle is below the normal standard, a below-standard liquid level is taken as the final liquid level output; the facial expression is recognized as follows: the located face image is cropped and converted to grayscale, the grayscale image is fed into the FaceCNN network model, a Sigmoid operation is applied to the model's output, and the result is compared with a threshold; if it exceeds the threshold, the patient's expression state is abnormal, otherwise it is normal;
S6, displaying the final liquid level output and any abnormal expression state obtained in step S5 on a UI interface, while playing a voice prompt announcing the abnormal patient state.
2. The multi-model cooperative patient state monitoring method according to claim 1, characterized in that: in step S1, the DarkNet53-based network model comprises an Encoder module and a Decoder module; let the input feature map be F, F ∈ R^{1×3×H×W}, with an n×n feature matrix; the Encoder module performs residual feature encoding on the feature map F five times, yielding a feature matrix of size (n/32)×(n/32); the Decoder module performs four decoding operations on the encoded feature matrix, finally regressing a feature matrix of size n×n, then adjusts the channel count with a 1×1 convolution and, through one linear layer and a sigmoid activation function, obtains a 1×2 feature matrix as the final output.
3. The multi-model cooperative patient state monitoring method according to claim 2, characterized in that: the Encoder module performs the five residual feature encodings on the feature map F as follows: a. the width and height of the feature layers of F are scaled five times, each time using a 3×3 2D convolution with stride 2 that scales the feature map to 1/2 of its original size; each scaling doubles the number of feature channels, and after each scaling the resulting feature map passes through a BatchNormalize function and a LeakyReLU activation function for feature extraction, giving the final extracted feature matrix after the five scalings; b. the feature matrix obtained in step a is fed into a fixed number of residual blocks for renewed feature extraction; c. the feature matrices from steps a and b are weight-fused, completing the five residual feature encodings of F.
4. The multi-model cooperative patient state monitoring method according to claim 3, characterized in that: the Decoder module performs the four decoding operations on the encoded feature matrix as follows: the feature map F after the five residual feature encodings is input into the Decoder module for four rounds of reverse feature extraction, each round enlarging the feature map to twice its original size; the Decoder module then fuses feature maps of the same shape among the four outputs, finally obtaining the decoded feature map.
5. The multi-model cooperative patient state monitoring method according to claim 4, characterized in that: in step S1, training the DarkNet53-based network model to obtain the trained LiquidNet network model comprises the following steps:
S01, placing the Raspberry Pi several meters away, facing the patient, connecting it to the network camera, and capturing several doctor-patient scene simulation images with the network camera, the images fully containing the patient's entire face and the entire drip bottle;
S02, preprocessing and labeling the doctor-patient scene simulation images acquired in step S01 to obtain several data sets, as follows:
S02.1, labeling the face position and the drip bottle position on the doctor-patient scene simulation images with the labelImg tool, and transcoding the labeling results to obtain a data-augmented WIDER FACE data set;
S02.2, cropping the drip bottle data out of the WIDER FACE data set obtained in step S02.1 using OpenCV;
S02.3, marking the liquid level line with a dual-threshold marking tool to form a regression-prediction liquid level line data set;
S03, inputting the regression-prediction liquid level line data set obtained in step S02.3 into the LiquidNet network model for training to obtain the trained LiquidNet network model.
6. The multi-model cooperative patient state monitoring method according to claim 5, characterized in that: in step S5, estimating the position of the liquid level line through the LiquidNet network model comprises the following steps:
S501, inputting the doctor-patient scene image in which the face and the drip bottle have been located into the LiquidNet network model, which outputs two thresholds;
S502, converting the color space of that image to HSV, creating a binary image with one threshold of the LiquidNet network model, finding the pixel positions whose value is 1 in the binary image, and blurring the pixels at the same positions in the original image in order to eliminate interfering pixel information on the drip bottle;
S503, from the drip bottle image with the interfering pixels removed in step S502, creating an enhanced binary image with the other threshold of the LiquidNet network model, computing vertically the maximum difference between every two adjacent rows of pixels on the enhanced binary image to find the liquid level line, and finally computing the liquid level from the position of the liquid level line according to a formula.
CN202210925053.9A 2022-08-03 2022-08-03 Multi-model cooperative patient state monitoring method Withdrawn CN115394428A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210925053.9A CN115394428A (en) 2022-08-03 2022-08-03 Multi-model cooperative patient state monitoring method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210925053.9A CN115394428A (en) 2022-08-03 2022-08-03 Multi-model cooperative patient state monitoring method

Publications (1)

Publication Number Publication Date
CN115394428A true CN115394428A (en) 2022-11-25

Family

ID=84118353

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210925053.9A Withdrawn CN115394428A (en) 2022-08-03 2022-08-03 Multi-model cooperative patient state monitoring method

Country Status (1)

Country Link
CN (1) CN115394428A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116708753A (en) * 2022-12-19 2023-09-05 荣耀终端有限公司 Method, device and storage medium for determining preview blocking reason
CN116708753B (en) * 2022-12-19 2024-04-12 荣耀终端有限公司 Method, device and storage medium for determining preview blocking reason


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20221125