CN114783601A - Physiological data analysis method and device, electronic equipment and storage medium

Info

Publication number: CN114783601A
Application number: CN202210315874.0A
Authority: CN
Inventors: 何峻青
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2022-03-28
Filing date: 2022-03-28
Publication date: 2022-07-22

Abstract

The application relates to the technical field of data analysis, in particular to the technical field of artificial intelligence, and provides a physiological data analysis method, a physiological data analysis device, an electronic device and a storage medium, which are used for improving the physiological data analysis accuracy. The method comprises the following steps: acquiring original state information and at least one type of object attribute information of an object to be analyzed; performing feature extraction on the original state information based on the conditional probability configured corresponding to each preset state information to obtain a corresponding first intermediate result; respectively extracting the characteristics of at least one type of object attribute information to obtain second intermediate results corresponding to the at least one type of object attribute information; and determining the class probability of each type of preset physiological data analysis result corresponding to the object to be analyzed based on the first intermediate result and at least one second intermediate result, and determining the target physiological data analysis result of the object to be analyzed based on each class probability. Therefore, the analysis accuracy can be effectively improved by combining the conditional probability corresponding to the preset state information configuration.

Description

Physiological data analysis method and device, electronic equipment and storage medium

Technical Field

The application relates to the technical field of data analysis, in particular to the technical field of artificial intelligence, and provides a physiological data analysis method and device, electronic equipment and a storage medium.

Background

With the development of computer technology, computers are increasingly applied in the medical field. For example, a subject's physiological data is digitized for medical statistical analysis, or is analyzed by means of a neural network.

Taking intelligent inquiry as an example, a subject can enter physiological data, such as the symptoms that have appeared, into an automatic inquiry system and obtain an analysis result, output by the system, that may match those symptoms. The subject can then seek subsequent medical treatment or self-treat according to the obtained analysis result.

In the automatic inquiry systems of the related art, physiological data analysis is generally implemented by building a machine learning model or a neural network model from a large amount of labeled data; such models require large volumes of data and incur high labeling costs. Knowledge-driven methods, by contrast, generally cannot adjust knowledge automatically according to data: a large amount of manual effort is needed to adjust and correct the knowledge, the effect of each adjustment cannot be checked in real time, and misjudgments easily result, which reduces accuracy.

Therefore, how to effectively improve the accuracy of physiological data analysis is a problem that urgently needs to be solved.

Disclosure of Invention

The embodiment of the application provides a method and a device for analyzing physiological data, electronic equipment and a storage medium, which are used for improving the accuracy of analysis of the physiological data.

The embodiment of the application provides a method for analyzing physiological data, which comprises the following steps:

acquiring original state information of an object to be analyzed and at least one type of object attribute information related to the object to be analyzed; the original state information comprises physiological data of the object to be analyzed under the environment where the object to be analyzed is located;

performing feature extraction on the original state information based on the conditional probability configured corresponding to each preset state information to obtain a corresponding first intermediate result; the conditional probability represents a probability of occurrence of a preset state information under a condition of a preset physiological data analysis result;

respectively extracting the characteristics of the at least one type of object attribute information to obtain second intermediate results corresponding to the at least one type of object attribute information;

and determining the category probability of each type of preset physiological data analysis result corresponding to the object to be analyzed based on the first intermediate result and at least one second intermediate result, and determining the target physiological data analysis result of the object to be analyzed based on each category probability.

The physiological data analysis device provided by the embodiments of the present application includes:

an information acquisition unit, configured to acquire original state information of an object to be analyzed and at least one type of object attribute information related to the object to be analyzed; the original state information comprises physiological data of the object to be analyzed under the environment where the object to be analyzed is located;

a feature extraction unit, configured to perform feature extraction on the original state information based on the conditional probability configured corresponding to each preset state information to obtain a corresponding first intermediate result; the conditional probability represents a probability of occurrence of a preset state information under a condition of a preset physiological data analysis result;

the feature extraction unit is further configured to perform feature extraction on the at least one type of object attribute information, respectively, to obtain second intermediate results corresponding to the at least one type of object attribute information;

and a result analysis unit, configured to determine the category probability of each type of preset physiological data analysis result corresponding to the object to be analyzed based on the first intermediate result and at least one second intermediate result, and to determine the target physiological data analysis result of the object to be analyzed based on each category probability.

Optionally, the feature extraction unit is specifically configured to:

respectively acquiring the respective conditional probability of each preset state information under each preset physiological data analysis result, and determining a first weight matrix for feature extraction based on each acquired conditional probability;

and performing feature extraction on the original state information based on the first weight matrix to obtain a corresponding first intermediate result.
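The two steps above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the result names, state names and probability values are hypothetical, and taking the logarithm of each conditional probability as the matrix entry is an assumption made here so that summed scores behave like log-likelihoods.

```python
import math

# Hypothetical preset analysis results and preset state information (illustrative only).
RESULTS = ["cold", "rhinitis", "heat stroke"]
STATES = ["nasal obstruction", "fever", "dizziness"]

# Hypothetical conditional probabilities P(state | result); the values are made up.
COND_PROB = {
    "cold": {"nasal obstruction": 0.6, "fever": 0.5, "dizziness": 0.2},
    "rhinitis": {"nasal obstruction": 0.8, "fever": 0.1, "dizziness": 0.1},
    "heat stroke": {"nasal obstruction": 0.05, "fever": 0.6, "dizziness": 0.7},
}


def build_first_weight_matrix(cond_prob, results, states, eps=1e-6):
    """Rows index preset states, columns index preset results.

    Each entry is log P(state | result); the log transform is an assumption
    of this sketch, and eps guards against log(0).
    """
    return [[math.log(max(cond_prob[r][s], eps)) for r in results] for s in states]


def extract_first_intermediate(observed, weight, states):
    """Multiply a multi-hot encoding of the observed states by the weight matrix."""
    x = [1.0 if s in observed else 0.0 for s in states]
    n_results = len(weight[0])
    return [sum(x[i] * weight[i][j] for i in range(len(states))) for j in range(n_results)]


W1 = build_first_weight_matrix(COND_PROB, RESULTS, STATES)
first_intermediate = extract_first_intermediate({"nasal obstruction", "fever"}, W1, STATES)
```

Here `first_intermediate` holds one score per preset analysis result; a larger score indicates a result that is more strongly associated with the observed states.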

Optionally, the feature extraction unit is specifically configured to:

for each type of object attribute information in the at least one type of object attribute information, respectively executing the following operations:

for one type of object attribute information, determining a second weight matrix for performing feature extraction on the one type of object attribute information based on the matching association degree between each preset physiological data analysis result and the one type of object attribute information;

and performing feature extraction on the attribute information of the class of objects based on the second weight matrix to obtain a corresponding second intermediate result.
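A comparable sketch for one type of object attribute information is given below. The age groups, the association values and the one-hot encoding are all assumptions chosen for illustration; the application does not fix how the matching association degree is quantified.

```python
# Hypothetical preset analysis results and age groups (illustrative only).
RESULTS = ["cold", "rhinitis", "heat stroke"]
AGE_GROUPS = ["child", "adult", "senior"]

# Hypothetical matching association degrees between each age group and each
# preset analysis result, listed in the order of RESULTS; the values are made up.
AGE_ASSOCIATION = {
    "child": [0.5, 0.3, 0.2],
    "adult": [0.3, 0.4, 0.3],
    "senior": [0.3, 0.2, 0.5],
}


def build_second_weight_matrix(association, categories):
    """Rows index attribute categories, columns index preset results."""
    return [list(association[c]) for c in categories]


def extract_second_intermediate(value, weight, categories):
    """Multiply a one-hot encoding of the attribute value by the weight matrix."""
    one_hot = [1.0 if c == value else 0.0 for c in categories]
    n_results = len(weight[0])
    return [sum(one_hot[i] * weight[i][j] for i in range(len(categories))) for j in range(n_results)]


W2 = build_second_weight_matrix(AGE_ASSOCIATION, AGE_GROUPS)
second_intermediate = extract_second_intermediate("adult", W2, AGE_GROUPS)
```

The same pattern would be repeated once per attribute type (for example, sex or visit time), each with its own second weight matrix.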

Optionally, the feature extraction unit is specifically configured to:

inputting the original state information of the object to be analyzed into a first feature network in a trained target analysis model;

and performing feature extraction on the original state information based on the first feature network to obtain the first intermediate result, wherein a first weight matrix in the first feature network is determined based on the conditional probability.

Optionally, the feature extraction unit is specifically configured to:

respectively inputting at least one type of object attribute information of the object to be analyzed into corresponding second feature networks in the target analysis model, wherein each second feature network corresponds to one type of object attribute information;

and respectively extracting the characteristics of the corresponding object attribute information based on the second characteristic networks corresponding to the at least one type of object attribute information to obtain corresponding second intermediate results.

Optionally, the apparatus further comprises:

the model training unit is used for obtaining the target analysis model through training in the following modes:

acquiring a sample data set, wherein each sample data in the sample data set comprises a sample object, original state information of the sample object, at least one type of object attribute information and a physiological data analysis result label of the sample object;

performing loop iterative training on an analysis model to be trained according to the sample data set, and outputting a corresponding target analysis model; wherein, in a loop iteration process, the following operations are executed:

inputting the selected sample data into the analysis model to be trained, and obtaining the class probability of the sample object corresponding to various preset physiological data analysis results;

and performing parameter adjustment on the analysis model to be trained by adopting a target loss function constructed based on the class probability of various preset physiological data analysis results corresponding to the sample object and the physiological data analysis result labels.

Optionally, in the analysis model to be trained, the first weight matrix in the feature network corresponding to the original state information is initialized based on the conditional probabilities; and the second weight matrix in the feature network corresponding to each of the at least one type of object attribute information is initialized based on the matching association degree between each preset physiological data analysis result and that type of object attribute information;

the model training unit is specifically configured to:

and performing parameter adjustment on the first weight matrix and each second weight matrix based on the target loss function.
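One loop iteration of such parameter adjustment might look like the sketch below. It assumes a softmax cross-entropy loss and plain gradient descent, neither of which is specified by the application; the initial weights, inputs and learning rate are hypothetical. `W` stands for the first weight matrix and `V` for a single second weight matrix.

```python
import math


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(x, a, W, V):
    """Sum the two intermediate results and normalize them into class probabilities."""
    k = len(W[0])
    logits = [
        sum(x[i] * W[i][j] for i in range(len(x))) + sum(a[i] * V[i][j] for i in range(len(a)))
        for j in range(k)
    ]
    return softmax(logits)


def train_step(x, a, label, W, V, lr=0.1):
    """Adjust W and V in place with one gradient step; return the loss before the step."""
    p = forward(x, a, W, V)
    loss = -math.log(p[label])
    for j in range(len(p)):
        grad = p[j] - (1.0 if j == label else 0.0)  # d(loss)/d(logit_j)
        for i in range(len(x)):
            W[i][j] -= lr * x[i] * grad
        for i in range(len(a)):
            V[i][j] -= lr * a[i] * grad
    return loss


# Hypothetical prior-based initial weights (2 states x 2 results, 1 attribute category x 2 results).
W = [[-0.5, -0.2], [-1.0, -0.3]]
V = [[0.1, 0.4]]
losses = [train_step([1.0, 0.0], [1.0], 0, W, V) for _ in range(20)]
```

Because the weights start from prior knowledge rather than random values, training in this style acts as fine-tuning: only the entries touched by the observed inputs move away from their initial values.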

Optionally, the model training unit is specifically configured to:

dividing initial sample data into at least one sample data subset according to a preset physiological data analysis result category, and determining the number of samples in the sample data subset corresponding to each preset physiological data analysis result category;

carrying out up-sampling treatment on sample data in a sample data subset with the number of samples not reaching the preset number to obtain at least one up-sampled sample data;

and constructing the sample data set based on the upsampled sample data and the initial sample data.
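The up-sampling procedure above can be sketched as follows. Duplicating randomly chosen samples with replacement is an assumption of this sketch; the application only states that up-sampling is applied to under-represented subsets. The field name `label` and the threshold `min_count` are hypothetical.

```python
import random
from collections import defaultdict


def upsample(samples, min_count, seed=0):
    """Return the initial samples plus duplicates drawn from under-represented categories."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample["label"]].append(sample)

    extra = []
    for subset in by_label.values():
        shortfall = min_count - len(subset)
        if shortfall > 0:
            # Draw with replacement from the small subset until it reaches min_count.
            extra.extend(rng.choice(subset) for _ in range(shortfall))
    return samples + extra


initial = [
    {"label": "cold", "states": ["fever"]},
    {"label": "cold", "states": ["nasal obstruction"]},
    {"label": "cold", "states": ["fever", "nasal obstruction"]},
    {"label": "heat stroke", "states": ["dizziness"]},
]
dataset = upsample(initial, min_count=3)
```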

Optionally, the result analysis unit is specifically configured to:

determining a summation result obtained by accumulating and summing the first intermediate result and the at least one second intermediate result;

and carrying out normalization processing on the summation result to obtain the category probability of various preset physiological data analysis results corresponding to the object to be analyzed.
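A minimal sketch of the summation and normalization follows. Softmax is used as the normalization here purely as an assumption; the application does not name a specific normalization function, and a multi-label setting could equally use a per-class sigmoid. The input scores are hypothetical.

```python
import math


def class_probabilities(first, seconds):
    """Accumulate the intermediate results element-wise, then normalize with softmax."""
    summed = list(first)
    for second in seconds:
        summed = [a + b for a, b in zip(summed, second)]
    m = max(summed)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in summed]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical first intermediate result and two second intermediate results.
probs = class_probabilities([-1.2, -2.5, -3.5], [[0.3, 0.4, 0.3], [0.1, 0.0, 0.2]])
```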

Optionally, the object attribute information includes at least one of the following types: age information of the subject, time information of the visit of the subject, sex information of the subject.

Optionally, the determining, based on the class probabilities, a target physiological data analysis result of the object to be analyzed includes:

taking a preset physiological data analysis result corresponding to the class probability reaching the reference threshold value in all the class probabilities as a target physiological data analysis result of the object to be analyzed; or

sorting the class probabilities, and taking the preset physiological data analysis results whose class probabilities fall within a specified range of the sorted order as the target physiological data analysis result of the object to be analyzed.
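The two alternative selection rules can be sketched as below. The function names, the probability values, the threshold and the choice of "top k" as the specified range are illustrative assumptions.

```python
def select_by_threshold(probs, threshold):
    """Keep every preset result whose class probability reaches the reference threshold."""
    return [name for name, p in probs.items() if p >= threshold]


def select_top_k(probs, k):
    """Sort by class probability, descending, and keep the first k preset results."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical class probabilities for three preset analysis results.
probs = {"cold": 0.6, "rhinitis": 0.3, "heat stroke": 0.1}
by_threshold = select_by_threshold(probs, 0.3)
top_one = select_top_k(probs, 1)
```

The threshold rule can return several results (or none), which suits multi-label output; the ranking rule always returns a fixed number of candidates.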

An electronic device provided by an embodiment of the present application includes a processor and a memory, where the memory stores a computer program, and when the computer program is executed by the processor, the processor is caused to execute the steps of any one of the above methods for analyzing physiological data.

An embodiment of the present application provides a computer-readable storage medium, which includes a computer program, when the computer program runs on an electronic device, the computer program is configured to enable the electronic device to perform any one of the steps of the method for analyzing physiological data.

Embodiments of the present application provide a computer program product, which includes a computer program, stored in a computer readable storage medium; when the processor of the electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, so that the electronic device performs the steps of any one of the above-described physiological data analysis methods.

The beneficial effects of the present application are as follows:

The embodiments of the present application provide a physiological data analysis method and device, electronic equipment and a storage medium. In the embodiments of the present application, feature extraction is performed on the original state information based on the conditional probabilities configured for the preset state information, where each conditional probability represents the probability that one piece of preset state information occurs under the condition of one preset physiological data analysis result. Consequently, when feature extraction is performed on the original state information based on these conditional probabilities, the physiological data analysis results more strongly associated with the original state information are extracted preferentially, which improves the accuracy of feature extraction. In addition, the original state information of the object to be analyzed is combined with the object attribute information related to the physiological data analysis result to perform multi-class, multi-label classification, and the target physiological data analysis result for the object to be analyzed is determined based on the class probability corresponding to each preset physiological data analysis result, which further improves accuracy.

Additional features and advantages of the present application will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the present application. The objectives and other advantages of the application may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is an optional schematic diagram of an application scenario in an embodiment of the present application;

FIG. 2 is a schematic diagram of an intelligent inquiry interface in an embodiment of the present application;

FIG. 3 is a flowchart illustrating a method for analyzing physiological data according to an embodiment of the present application;

FIG. 4 is a schematic diagram illustrating a correspondence relationship between preset physiological data analysis results and preset state information in an embodiment of the present application;

FIG. 5 is a schematic diagram of the processing procedure after the original state information and object attribute information of an object to be analyzed are input into a model in an embodiment of the present application;

FIG. 6 is a schematic flowchart illustrating a model training method according to an embodiment of the present application;

FIG. 7 is a schematic structural diagram of an analysis model in an embodiment of the present application;

FIG. 8 is a schematic diagram illustrating a sample data expansion method according to an embodiment of the present application;

FIG. 9A is a flowchart illustrating a method for analyzing physiological data according to an embodiment of the present application;

FIG. 9B is a diagram illustrating a detailed scenario of a method for analyzing physiological data according to an embodiment of the present application;

FIG. 10 is a schematic structural diagram illustrating an analysis apparatus for physiological data according to an embodiment of the present application;

FIG. 11 is a schematic diagram of a hardware component structure of an electronic device to which an embodiment of the present application is applied;

FIG. 12 is a schematic diagram of a hardware component structure of another electronic device to which an embodiment of the present application is applied.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments, but not all embodiments, of the technical solutions of the present application. All other embodiments obtained by a person skilled in the art based on the embodiments described in the present application without any creative effort belong to the protection scope of the technical solution of the present application.

Some concepts related to the embodiments of the present application are described below.

State information: information characterizing the physiological data of a subject in a certain environment. In the embodiments of the present application, it specifically includes preset state information and the original state information of the object to be analyzed. The original state information represents the physiological data of the object to be analyzed in the environment where that object is located; the preset state information is historical physiological data of sample objects, determined by means such as statistical analysis.

Object attribute information: some characteristic information characterizing the subject itself, or some characteristic information related to the result of physiological data analysis of the subject, such as the age of the subject, the sex of the subject, the time of the visit of the subject, the history of the visit, etc.

Physiological data analysis results: results obtained by analyzing the health state, emotional state, etc. of a subject based on the subject's physiological data. In the embodiments of the present application, a physiological data analysis result can represent the emotional fluctuation of the subject, such as happiness, anger, sadness and fear; it can also characterize the physical condition of the subject, such as simply healthy, sub-healthy or unhealthy, and may be further subdivided into various types of diseases. Physiological data analysis results can also be divided into two categories: preset physiological data analysis results and target physiological data analysis results. In the embodiments of the present application, preset physiological data analysis results are physiological data analysis results of different categories that are preset by statistical analysis or other means, for example cold, heat stroke, etc.; a target physiological data analysis result is one or more physiological data analysis results screened from the preset physiological data analysis results.

Knowledge graph: also known as knowledge domain visualization or knowledge domain mapping, a family of graphs that show the development process and structural relationships of knowledge. A knowledge graph uses visualization technology to describe knowledge and to mine, analyze, construct, draw and display knowledge and its interconnections. It combines the theories and methods of disciplines such as applied mathematics, graphics, information visualization technology and information science with methods such as bibliometric citation analysis and co-occurrence analysis, and uses visual graphs to vividly display the core structure, development history, frontier fields and overall knowledge framework of a discipline, thereby achieving multi-disciplinary integration. In the embodiments of the present application, the knowledge graph can be applied to medical diagnosis and intelligent inquiry, providing effective assistance for disease diagnosis.

Conditional probability: the probability that an event A occurs given that another event B has occurred. The conditional probability is denoted P(A|B), read as "the probability of A given B". Note that this definition does not imply any causal or chronological relationship between A and B: A may occur before B, after B, or at the same time as B; A may cause B, B may cause A, or there may be no causal relationship between the two at all. Conditional probabilities can be related through Bayes' theorem, for example to update a probability in light of new information.
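For reference, the standard definition of conditional probability and Bayes' theorem can be written in LaTeX as follows. These are textbook identities rather than formulas quoted from the application.

```latex
P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \qquad P(B) > 0

P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
```

In the setting of this application, A would correspond to a piece of preset state information and B to a preset physiological data analysis result, so that P(A|B) is the probability of observing that state under that result.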

Neural Additive Models (NAM) and feature networks: for each feature, a NAM uses a neural network to predict the probability or score with which that feature maps to each final category, then directly sums these probabilities or scores and normalizes the sum to obtain the result. A feature network is the sub-network in the NAM that computes, for one feature, the probability or score of each final category.
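The additive structure described above can be illustrated with the sketch below. Each feature network is reduced to a single linear layer, which is a simplifying assumption; the feature names, weights and softmax normalization are likewise hypothetical. The per-feature contributions are returned alongside the probabilities to show why this structure is easy to inspect.

```python
import math


def make_linear_feature_net(weight):
    """Return a feature network mapping one feature vector to per-class scores."""
    def net(x):
        n_classes = len(weight[0])
        return [sum(x[i] * weight[i][j] for i in range(len(x))) for j in range(n_classes)]
    return net


def nam_predict(feature_nets, inputs):
    """Score each feature independently, sum the scores per class, then normalize."""
    contributions = {name: net(inputs[name]) for name, net in feature_nets.items()}
    n_classes = len(next(iter(contributions.values())))
    total = [sum(c[j] for c in contributions.values()) for j in range(n_classes)]
    m = max(total)
    exps = [math.exp(v - m) for v in total]
    norm = sum(exps)
    return [e / norm for e in exps], contributions


# Two hypothetical feature networks over two classes.
nets = {
    "state": make_linear_feature_net([[2.0, 0.0], [0.0, 1.0]]),
    "age": make_linear_feature_net([[0.5, 0.0], [0.0, 0.5]]),
}
probs, contributions = nam_predict(nets, {"state": [1.0, 0.0], "age": [0.0, 1.0]})
```

Because the class scores are a plain sum, the entry `contributions["state"]` shows exactly how much the state feature pushed each class, independently of the other features.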

Intermediate results: in the embodiments of the present application, an intermediate result is a result obtained before the target physiological data analysis result corresponding to the object to be analyzed is finally obtained. Intermediate results fall into two categories: a first intermediate result corresponding to the original state information and a second intermediate result corresponding to the object attribute information. The first intermediate result represents the probability or score, predicted from the original state information, that each preset physiological data analysis result is the analysis result corresponding to the object to be analyzed; the second intermediate result represents the probability or score, predicted from the object attribute information, that each preset physiological data analysis result is the analysis result corresponding to the object to be analyzed. The final target physiological data analysis result corresponding to the object to be analyzed is obtained by further processing the first intermediate result and the second intermediate result.

Fine-tuning (fine-tune): training a network whose parameters have already been initialized on data, and updating those parameters. In the embodiments of the present application, the analysis model needs to be fine-tuned.

Sample data set: the sample data set in the embodiments of the present application can be divided into a training set and a validation set. A machine learning model typically includes two kinds of parameters: model parameters and hyper-parameters. Hyper-parameters are parameters that control the behavior of the model and are not learned by the model itself; the learning rate, for example, is a hyper-parameter. The training set is used to adjust the model parameters, and the validation set is used to adjust the hyper-parameters, such as the learning rate.

The embodiment of the present application relates to Artificial Intelligence (AI), Natural Language Processing (NLP), and Machine Learning technology (ML), and is designed based on computer vision technology and Machine Learning in Artificial Intelligence.

Artificial intelligence is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence.

Artificial intelligence is the research of the design principle and the implementation method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making. The artificial intelligence technology mainly comprises a computer vision technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and other directions. With the research and progress of artificial intelligence technology, artificial intelligence is researched and applied in a plurality of fields, such as common smart homes, smart customer service, virtual assistants, smart speakers, smart marketing, unmanned driving, automatic driving, robots, smart medical treatment and the like.

Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable efficient communication between a person and a computer using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Therefore, the research in this field will involve natural language, i.e. the language that people use everyday, so it is closely related to the research of linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robotic question and answer, knowledge mapping, and the like.

Machine learning is an interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. It specializes in studying how a computer can simulate or realize human learning behavior so as to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve its own performance. Compared with data mining, which looks for shared characteristics within big data, machine learning focuses more on the design of algorithms, so that a computer can automatically learn rules from data and use those rules to predict unknown data.

Machine learning is the core of artificial intelligence, is the fundamental approach to make computers have intelligence, and is applied in various fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and the like.

The analysis method of the physiological data provided in the embodiment of the application can be divided into two parts, including a training part and an application part; the training part is used for training the analysis model through the machine learning technology, specifically, the analysis model to be trained is subjected to cyclic iterative training based on the sample data set to obtain the target analysis model. The application part is used for combining the original state information and the object attribute information of the object to be analyzed by using the target analysis model obtained by training in the training part to obtain a corresponding target physiological data analysis result.

The following briefly introduces the design concept of the embodiments of the present application:

Still taking intelligent inquiry as an example, an automatic inquiry system provides inquiry services to subjects online. A subject can enter the symptoms that have appeared into the automatic inquiry system and obtain, as its output, the diseases that may correspond to those symptoms, so that the subject can seek subsequent medical treatment or self-treat according to the diseases obtained.

In the related art, disease prediction in an automatic inquiry system is generally implemented by building a machine learning model or a neural network model from a large amount of labeled data; building such a model is costly and the training time is long. Several common machine learning models are listed below:

1) Bayesian networks and their variants. A Bayesian network uses a probabilistic graphical model to describe the relationships between variables, and a directed graph to describe the joint probability distribution of the model's variables. Variables are typically modeled with a two-layer or three-layer Bayesian network. The algorithm is as follows: first, the nodes of each layer are determined and the network structure is constructed; next, statistics are computed from the data to obtain the prior probabilities of the first-layer nodes and the conditional probability of each node given all of its connected parent nodes. At inference time, the conditional probabilities of the ancestor nodes are computed from the bottom up, which is a form of diagnostic inference. Because a Bayesian network needs the probability distribution of every combination of nodes, data sparsity may leave some node combinations with no data and therefore a probability of 0.

2) The naive Bayes method. When a Bayesian network has only two layers, the first being the classes and the second the features, it reduces to the naive Bayes method, which assumes that all features are mutually independent given each class.

In Bayesian network methods, the posterior probability is generally used as the index for analyzing or ranking physiological data, without any design specific to the disease diagnosis scenario.

3) Machine learning methods such as support vector machines (SVMs) and deep neural networks. Such methods lack interpretability and require large amounts of training data, and are therefore not widely used in the analysis of physiological data.

4) Knowledge-driven methods. Knowledge cannot be adjusted automatically according to data, so a large amount of manual effort is consumed in adjusting and correcting it, and the effect of each adjustment cannot be checked in real time. For example, manually increasing the probability of nasal obstruction given a cold makes some rhinitis cases likely to be judged as colds, causing misjudgment.
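To make item 2) of the list above concrete, a minimal naive Bayes posterior computation is sketched below. The priors and conditional probabilities are hypothetical values, and only the observed (present) features are multiplied in, which is a simplification of the full method.

```python
import math


def naive_bayes_posterior(priors, cond_prob, observed):
    """Posterior over classes, assuming features are independent given the class."""
    unnormalized = {}
    for cls, prior in priors.items():
        p = prior
        for feature in observed:
            p *= cond_prob[cls][feature]  # P(feature | class)
        unnormalized[cls] = p
    total = sum(unnormalized.values())
    if total == 0:
        # Mirrors the sparsity problem noted in item 1): every class got probability 0.
        raise ValueError("all classes have zero probability for these features")
    return {cls: p / total for cls, p in unnormalized.items()}


# Hypothetical priors and conditional probabilities.
priors = {"cold": 0.5, "rhinitis": 0.5}
cond_prob = {
    "cold": {"nasal obstruction": 0.6, "fever": 0.5},
    "rhinitis": {"nasal obstruction": 0.8, "fever": 0.1},
}
posterior = naive_bayes_posterior(priors, cond_prob, ["nasal obstruction", "fever"])
```

Raising the hypothetical value of P(nasal obstruction | cold) by hand would shift posterior mass from rhinitis to cold, which is the misjudgment risk described in item 4).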

In summary, the methods in the related art cannot be applied when only a small amount of data is available. Moreover, after testing on the verification set, the causes of errors must be analyzed manually, the knowledge modified accordingly, the test repeated, and the knowledge modified again. This manual iterative process is very slow and consumes a great deal of the physician's time.

In view of this, the present application provides a physiological data analysis method and apparatus, an electronic device, and a storage medium. In the embodiments of the application, feature extraction is performed on the original state information based on the conditional probabilities configured for the respective preset state information, where each conditional probability represents the probability that one piece of preset state information occurs given one preset physiological data analysis result. Performing feature extraction on this basis gives priority to the physiological data analysis results that are more strongly associated with the original state information, which improves the accuracy of feature extraction. In addition, the original state information of the object to be analyzed is combined with the object attribute information related to the physiological data analysis results to perform multi-class, multi-label classification, and the target physiological data analysis result for the object to be analyzed is determined based on the class probability corresponding to each preset physiological data analysis result, which further improves accuracy.

The preferred embodiments of the present application will be described in conjunction with the drawings of the specification, it should be understood that the preferred embodiments described herein are only for illustrating and explaining the present application, and are not intended to limit the present application, and the embodiments and features of the embodiments in the present application may be combined with each other without conflict.

Fig. 1 is a schematic view of an application scenario in the embodiment of the present application. The application scenario diagram includes two terminal devices 110 and a server 120.

In the embodiment of the present application, the terminal device 110 includes, but is not limited to, a mobile phone, a tablet computer, a notebook computer, a desktop computer, an e-book reader, an intelligent voice interaction device, an intelligent household appliance, a vehicle-mounted terminal, and other devices; the terminal device may be installed with a client related to analysis of physiological data, where the client may be software (e.g., a browser, intelligent inquiry software, health management software, etc.), or a web page, an applet, etc., and the server 120 is a background server corresponding to the software, or the web page, the applet, etc., or a server specially used for analyzing physiological data, and the application is not limited specifically. The server 120 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, web service, cloud communication, middleware service, domain name service, security service, CDN (Content Delivery Network), big data, and an artificial intelligence platform.

It should be noted that the physiological data analysis method in the embodiments of the present application may be performed by an electronic device, which may be the terminal device 110 or the server 120; that is, the method may be performed by the terminal device 110 or the server 120 alone, or jointly by the terminal device 110 and the server 120. For example, when the method is performed jointly, the terminal device 110 obtains the original state information input by the object to be analyzed and at least one type of object attribute information related to the object to be analyzed, and sends this information to the server 120. The server 120 performs feature extraction on the original state information based on the conditional probability configured for each piece of preset state information to obtain a corresponding first intermediate result, where each conditional probability represents the probability that one piece of preset state information occurs given one preset physiological data analysis result. The server 120 also performs feature extraction on each of the at least one type of object attribute information to obtain the corresponding second intermediate results. Finally, the server 120 determines the class probabilities of the various preset physiological data analysis results for the object to be analyzed based on the first intermediate result and the at least one second intermediate result, determines the target physiological data analysis result of the object to be analyzed based on these class probabilities, and displays it to the object to be analyzed through the terminal device 110.

In an alternative embodiment, terminal device 110 and server 120 may communicate via a communication network.

In an alternative embodiment, the communication network is a wired network or a wireless network.

It should be noted that the illustration in fig. 1 is only an example; the number of terminal devices and servers is not specifically limited in the embodiments of the present application.

In the embodiment of the application, when there are multiple servers, the servers can form a blockchain, with each server being a node on the blockchain. In the physiological data analysis method disclosed in the embodiments of the present application, data such as the original state information, the object attribute information, and the target physiological data analysis result can be saved on the blockchain.

In addition, the physiological data analysis method according to the embodiment of the present application can be applied to various scenarios related to physiological data analysis results, including but not limited to cloud technology, Artificial Intelligence (AI), intelligent transportation, and driving assistance. In these scenarios, the method can be used for a variety of tasks, including but not limited to intelligent inquiry tasks, AI-assisted diagnosis and treatment tasks, and health management detection tasks. More generally, it can be applied to any task that requires interpretability, has initial weights, has too little data to train a model directly, and needs to adjust the weights according to data.

Taking intelligent inquiry as an example, the status information may be represented as symptoms, and the physiological data analysis result may be represented as a disease, as shown in fig. 2, which is a schematic diagram of an intelligent inquiry interface in the embodiment of the present application.

In the embodiment of the present application, the terminal device may provide a disease prediction function for the object to be analyzed. When the object to be analyzed feels unwell, the disease prediction function in the terminal device may be turned on, and the terminal device then guides the object to be analyzed to enter a description of the physical discomfort in the interface, as shown in interface 2a in fig. 2. After obtaining the description input by the object to be analyzed (for example, including the original state information and at least one type of object attribute information), the terminal device may predict the disease information (i.e., the target physiological data analysis result) of the object to be analyzed according to the description, and push the predicted disease information to the object to be analyzed, as shown in interface 2b in fig. 2, where the predicted disease information includes the name of a disease that the object to be analyzed may have. For example, as shown in fig. 2, the object has symptoms of dizziness, fatigue, cough, and high body temperature, and the predicted disease is a cold.

It should be noted that interface 2a in fig. 2 only illustrates the case where the description input by the object to be analyzed serves as the original state information; fig. 2 does not show the input process of the object attribute information. In practice, the object attribute information of the object to be analyzed may also be input by the object on the current interface, or obtained based on the object's account, and so on, which is not specifically limited herein.

In the embodiment of the application, the object to be analyzed can choose a suitable doctor according to the disease information, which avoids registering with the wrong department or registering repeatedly, allows the object to be analyzed to receive treatment in time, and can greatly save time and labor costs.

It is understood that the specific embodiments of the present application involve data related to the object, such as the original state information, physiological data, object attribute information, and historical data of sample objects (for example, symptoms, age, sex, time of visit, and visit history). When the above embodiments of the present application are applied to specific products or technologies, the permission or consent of the object needs to be obtained, and the collection, use, and handling of the related data need to comply with the relevant laws, regulations, and standards of the relevant countries and regions.

The method for analyzing physiological data provided by the exemplary embodiments of the present application is described below with reference to the accompanying drawings in conjunction with the application scenarios described above, it should be noted that the application scenarios described above are only shown for the convenience of understanding the spirit and principles of the present application, and the embodiments of the present application are not limited in this respect.

Referring to fig. 3, a flowchart of an implementation of a method for analyzing physiological data according to an embodiment of the present application is shown, and the method includes the following steps S31-S34:

S31: The server obtains original state information of the object to be analyzed and at least one type of object attribute information related to the object to be analyzed.

The original state information includes physiological data of the object to be analyzed in the environment, such as symptom information, which can represent subjective abnormal feeling or some objective pathological changes of the patient caused by a series of abnormal changes of functions, metabolism and morphological structure in the organism during the disease process.

In the present embodiment, symptoms include both symptoms and signs, and refer to subjective discomfort, abnormal sensations, functional changes or significant pathological changes in a patient caused by a disease. Some of the symptoms that are common are: fever, pain, weight change, edema, dyspnea, cough, hemoptysis, anorexia, dyspepsia, dysphagia, nausea and emesis, hematemesis, anemia, shock, etc. It should be noted that the symptoms may also include information determined in combination with the subjective perception of the patient, such as the number of vomits, the color of vomits, etc., which pertain to physiological data of the subject in the environment of the subject.

It should be noted that the above is only a simple list of some symptoms, and practically any state information that can represent the physiological data of the subject under a certain environment is applicable to the embodiments of the present application, and is not specifically limited herein.

In addition, the object attribute information indicates some characteristic information related to the analysis result of the acquired physiological data, such as at least one of the object age information, the object gender information, the object visit time information, and the like, and may further include the visit history information, and the like, which is not specifically limited herein.

S32: The server performs feature extraction on the original state information based on the conditional probability configured for each piece of preset state information to obtain a corresponding first intermediate result.

The conditional probability represents a probability of occurrence of a preset state information under a condition of a preset physiological data analysis result, and since each preset state information may represent preset different state information, such as different symptoms, and each preset physiological data analysis result represents preset different categories of results, such as different diseases, the conditional probability may also represent a probability of occurrence of a symptom under a condition that a disease has occurred, such as: probability of nausea under conditions of heat stroke; probability of dizziness under heatstroke conditions; probability of coughing in the case of a cold; and so on.

In the embodiment of the present application, the conditional probability configured for each piece of preset state information may be obtained from books, doctors' experience, knowledge graphs, or other channels. For example, the medical field has many well-organized knowledge graphs, such as the International Classification of Diseases and clinical guidelines and consensus statements, which carry hierarchical information and complex association relationships consistent with human cognition. The source is not specifically limited herein.

Fig. 4 is a schematic diagram illustrating a correspondence relationship between a preset physiological data analysis result and preset status information according to an embodiment of the present application.

Fig. 4 is a simplified list of preset status information (symptoms) corresponding to several common preset physiological data analysis results (diseases). For example: common symptoms of cold are: nasal obstruction, rhinorrhea, pharynx itch, sore throat, cough, headache, dizziness, aversion to cold, low fever, etc.; common symptoms of hypoglycemia are: palpitation, trembling hands, sweating, headache, dizziness, coma, convulsion, etc.; common symptoms of acute gastroenteritis are: nausea, vomiting, abdominal pain, diarrhea, fever, thirst, etc.; common symptoms of heatstroke are: profuse sweating, dizziness, dim eyesight, nausea, palpitation, shortness of breath, body temperature often less than 37.5 degrees celsius, and the like.

It should be noted that the above listed corresponding relationships are only examples, and any corresponding relationship between the analysis result of the preset physiological data and the preset status information is applicable to the embodiments of the present application, and is not limited herein.

In addition, in the embodiment of the present application, the manner of obtaining the corresponding relationship between the analysis result of the preset physiological data and the preset state information is the same as or similar to the manner of obtaining the conditional probability listed above, and repeated parts are not described again.

In an alternative embodiment, step S32 can be further divided into the following substeps (not shown in fig. 3):

S321: Acquire the conditional probability of each piece of preset state information under each preset physiological data analysis result, and determine a first weight matrix for feature extraction based on the acquired conditional probabilities.

Specifically, a conditional probability matrix composed of the respective conditional probabilities of symptoms under each disease is obtained from the knowledge graph, a first weight matrix is obtained based on the initialization of the conditional probability matrix, the first weight matrix is updated in a machine learning mode, and finally the first weight matrix for feature extraction is obtained.

For example, given a disease set $D = \{d_1, d_2, \ldots, d_N\}$, where $N$ is the number of diseases (i.e., the number of classes) and $d_i$ is a certain disease, such as the common cold, with $i$ taking values from 1 to $N$. The symptom feature set is $F = \{f_1, f_2, \ldots, f_K\}$, where $K$ is the number of symptom features and $f_j$ is a certain feature, such as cough, with $j$ taking values from 1 to $K$. The value range of each feature is $\{-1, 0, 1\}$, representing a negative symptom, an unmentioned symptom, and a positive symptom, respectively.

A conditional probability matrix $C$ of size $K \times N$, covering each symptom feature under each disease, is obtained from books, doctors' experience, or other channels, where each element $C_{ji}$ has the value $P(f_j \mid d_i)$, with value range $[0, 1]$.

When the first weight matrix is obtained based on the conditional probability matrix, various initialization methods may be used. For example, the elements of the first weight matrix may be assigned directly from the conditional probability matrix, or a mapping may be applied, which is not specifically limited herein. The size of the first weight matrix is $K \times N$.
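The two initialization options mentioned above can be sketched as follows. The symptom and disease names, the probability values in `C`, and the choice of a logit transform as the "mapping" are all assumptions made for illustration; the source does not specify which mapping is used.

```python
import math

# Hypothetical K = 3 symptoms and N = 2 diseases.
symptoms = ["cough", "dizziness", "nausea"]
diseases = ["cold", "heatstroke"]

# C[j][i] = P(f_j | d_i). The values are made up for illustration.
C = [
    [0.80, 0.10],  # cough
    [0.30, 0.70],  # dizziness
    [0.05, 0.60],  # nausea
]


def init_direct(cond_matrix):
    """Option 1: copy the conditional probabilities directly into the weights."""
    return [row[:] for row in cond_matrix]


def init_logit(cond_matrix, eps=1e-6):
    """Option 2 (assumed mapping): map each probability p to log(p / (1 - p))."""
    def logit(p):
        p = min(max(p, eps), 1.0 - eps)  # keep the logarithm finite at 0 and 1
        return math.log(p / (1.0 - p))
    return [[logit(p) for p in row] for row in cond_matrix]


W_direct = init_direct(C)
W_logit = init_logit(C)
```

Either way, the resulting matrix keeps the $K \times N$ shape of $C$, so each weight can still be read as "how strongly symptom $j$ points to disease $i$".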

S322: Perform feature extraction on the original state information based on the first weight matrix to obtain a corresponding first intermediate result.

Specifically, the original state information of the object to be analyzed can be input into the first feature network of the trained target analysis model, and feature extraction is performed on the original state information by the first feature network to obtain the first intermediate result. The first weight matrix in the first feature network is determined based on the conditional probabilities. This process is the same as in model training; for the specific calculation, refer to formula (1-1) below, and repeated parts are not described again.

In the above embodiment, the knowledge is expressed as the conditional probability of each symptom, and the expert is invited to label the probability according to the book and the experience, and the labeled probability is used to initialize the feature network, so that the accuracy can be effectively improved.
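As a rough illustration of step S322, the sketch below computes a first intermediate result as a weighted sum of the symptom vector over the first weight matrix. The linear form is an assumption: formula (1-1) is not reproduced in this passage, so the actual calculation may differ. The function name `extract_features` and the numeric values are hypothetical.

```python
def extract_features(x, W):
    """Return one score per disease: h[i] = sum over j of x[j] * W[j][i].

    x: length-K symptom vector with entries in {-1, 0, 1}.
    W: K x N first weight matrix (list of K rows, each of length N).
    """
    if len(x) != len(W):
        raise ValueError("x must have one entry per row of W")
    n_classes = len(W[0])
    return [sum(x[j] * W[j][i] for j in range(len(W))) for i in range(n_classes)]


# Hypothetical weights, e.g. initialized from conditional probabilities.
W = [
    [0.80, 0.10],  # cough
    [0.30, 0.70],  # dizziness
    [0.05, 0.60],  # nausea
]
x = [1, 0, -1]  # cough present, dizziness not mentioned, nausea denied
h = extract_features(x, W)
```

Under this assumed form, an unmentioned symptom (value 0) contributes nothing, while a denied symptom (value -1) subtracts its weight from every disease score.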

S33: The server performs feature extraction on each of the at least one type of object attribute information to obtain the second intermediate results corresponding to the at least one type of object attribute information.

In an alternative embodiment, step S33 can be further divided into the following sub-steps (not shown in fig. 3):

For each type of object attribute information in the at least one type of object attribute information, the following operations are performed:

S331: For one type of object attribute information, determine a second weight matrix for performing feature extraction on this type of object attribute information, based on the matching association degree between each preset physiological data analysis result and this type of object attribute information.

Similar to the first weight matrix, the present application can obtain a weight matrix through initialization based on the matching association degree between each preset physiological data analysis result and one type of object attribute information, update the weight matrix by machine learning, and finally obtain the second weight matrix for feature extraction.

Taking the case where the object attribute information includes the object age information as an example, the matching association degree between each type of disease and the object age information represents the association between each type of disease and age. For example, the main affected population for hand-foot-and-mouth disease is children under 5 years old; for the elderly (over 60 years old), high-incidence diseases include coronary heart disease; and for middle-aged people (45-59 years old), high-incidence diseases include hypertension. That is, hand-foot-and-mouth disease is more closely associated with the age group of 0-5 years, coronary heart disease is more closely associated with the age group over 60 years, and hypertension is more closely associated with the age group of 45-59 years.

For another example, taking the example that the object attribute information includes the object visit time information, the matching association degree between each type of disease and the object visit time information represents the association between each type of disease and time, for example, sunstroke frequently occurs in summer; influenza frequently occurs in spring. That is, the degree of matching correlation between heatstroke and summer periods is higher, while the degree of matching correlation between influenza and spring periods is higher.

When the second weight matrix is initialized based on the matching association degree, a higher matching association degree may be given a higher weight, which may be set to a positive value such as 2, and a lower matching association degree may be given a lower weight, which may be set to a negative value such as -2.

It should be noted that the above listed manners for determining the second weight matrix based on the degree of matching correlation are only simple examples, and practically any manner for determining the second weight matrix based on the degree of matching correlation is applicable to the embodiments of the present application, and is not limited in detail herein.
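One possible way to initialize a second weight matrix for the age attribute, following the +2 / -2 example above, is sketched below. The age buckets (including the extra "6-44" bucket), the neutral weight of 0 for unlisted pairs, the single low-association pair, and the function names are assumptions made for illustration. The attribute is treated as a one-hot input, so feature extraction reduces to selecting one row.

```python
def init_attribute_weights(groups, diseases, high_pairs, low_pairs,
                           high_weight=2.0, low_weight=-2.0):
    """Build a len(groups) x len(diseases) matrix from association pairs.

    Pairs not listed in either collection keep a neutral weight of 0.0
    (an assumption; the source does not say how such pairs are handled).
    """
    W = [[0.0] * len(diseases) for _ in groups]
    for group, disease in high_pairs:
        W[groups.index(group)][diseases.index(disease)] = high_weight
    for group, disease in low_pairs:
        W[groups.index(group)][diseases.index(disease)] = low_weight
    return W


def attribute_intermediate_result(group, groups, W):
    """With a one-hot attribute input, the result is the row for that group."""
    return list(W[groups.index(group)])


age_groups = ["0-5", "6-44", "45-59", "60+"]
diseases = ["hand-foot-and-mouth disease", "hypertension", "coronary heart disease"]
high_pairs = [
    ("0-5", "hand-foot-and-mouth disease"),
    ("45-59", "hypertension"),
    ("60+", "coronary heart disease"),
]
low_pairs = [("0-5", "coronary heart disease")]  # hypothetical low-association pair

W_age = init_attribute_weights(age_groups, diseases, high_pairs, low_pairs)
h_age = attribute_intermediate_result("0-5", age_groups, W_age)
```

The same helper could be reused for other attributes, such as season of visit, by swapping in a different list of groups and pairs.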

In addition, the second weight matrix may also be initialized randomly. Compared with random initialization, determining the matrix based on the matching association degree ties the preset physiological data analysis results more closely to the object attribute information, so that analysis results obtained on this basis are more accurate.

S332: Perform feature extraction on this type of object attribute information based on the second weight matrix to obtain a corresponding second intermediate result.

Specifically, each of the at least one type of object attribute information of the object to be analyzed can be input into the corresponding second feature network in the target analysis model, where each second feature network corresponds to one type of object attribute information. Feature extraction is then performed on each type of object attribute information by its corresponding second feature network to obtain the corresponding second intermediate result. This process is the same as in model training; for the specific calculation, refer to formula (1-2) or (1-3) below, and repeated parts are not described again.

Fig. 5 is a schematic diagram illustrating a processing procedure after inputting original state information and object attribute information of an object to be analyzed into a target analysis model in the embodiment of the present application. Taking the example that the object attribute information includes the object age information and the object visit time information, the second intermediate result includes: a second intermediate result corresponding to the age information of the subject, and a second intermediate result corresponding to the time information of the visit of the subject.

As shown in fig. 5, the original state information of the object to be analyzed is input into the first feature network to obtain the first intermediate result; the object age information is input into the corresponding second feature network-1 to obtain second intermediate result-1; and the object visit time information is input into the corresponding second feature network-2 to obtain second intermediate result-2.

Next, step S34 may be executed:

S34: The server determines class probabilities of various preset physiological data analysis results corresponding to the object to be analyzed based on the first intermediate result and the at least one second intermediate result, and determines a target physiological data analysis result of the object to be analyzed based on the class probabilities.

Optionally, when the object attribute information includes object age information and object visit time information, the category probability may be determined in the following manner:

First, the first intermediate result and the at least one second intermediate result are accumulated to obtain a summation result. The summation result is then normalized to obtain the class probabilities of the various preset physiological data analysis results corresponding to the object to be analyzed.

As shown in fig. 5, the first intermediate result, second intermediate result-1, and second intermediate result-2 are accumulated, and the summation result is then normalized by the Sigmoid activation function to obtain the class probability of each type of preset physiological data analysis result corresponding to the object to be analyzed.
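The accumulate-then-normalize step can be sketched as follows. The function names and the numeric intermediate results are assumptions made for illustration; only the element-wise summation followed by a Sigmoid comes from the description above.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def class_probabilities(first_result, second_results):
    """Sum the intermediate results element-wise, then apply Sigmoid per class."""
    total = list(first_result)
    for h in second_results:
        if len(h) != len(total):
            raise ValueError("all intermediate results need one entry per class")
        total = [a + b for a, b in zip(total, h)]
    return [sigmoid(z) for z in total]


# Hypothetical intermediate results for N = 4 classes.
first = [1.5, -0.5, 0.0, -1.0]      # from the symptom feature network
second_age = [0.0, 0.0, 0.0, 0.0]   # from the age feature network
second_time = [0.5, 0.0, 0.0, 2.0]  # from the visit-time feature network
probs = class_probabilities(first, [second_age, second_time])
```

Because Sigmoid is applied to each class independently, the probabilities need not sum to 1, which is what allows more than one disease to be reported for the same object.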

For example, take the total number $N$ of preset physiological data analysis results as 4; that is, four types of preset physiological data analysis results are set in total: cold, hypoglycemia, acute gastroenteritis, and heatstroke. The original state information of the object to be analyzed includes dizziness, fatigue, cough, and high body temperature, and the age is 20 years. On this basis, the class probabilities of the object to be analyzed corresponding to cold, hypoglycemia, acute gastroenteritis, and heatstroke are predicted to be $P_1$, $P_2$, $P_3$, and $P_4$, respectively.

For another example, when the object attribute information further includes object gender information, the second intermediate results further include a second intermediate result corresponding to the object gender information. In this case, the summation result is obtained by accumulating the first intermediate result, the second intermediate result corresponding to the object age information, the second intermediate result corresponding to the object visit time information, and the second intermediate result corresponding to the object gender information. The same applies by analogy when the object attribute information changes, and the description is not repeated.

In the above manner, the class probability corresponding to each preset physiological data analysis result can be obtained, and one or more target physiological data analysis results matching the object to be analyzed can then be determined based on the class probabilities.

In an optional embodiment, the target physiological data analysis result of the object to be analyzed may be determined based on the class probabilities in either of the following two ways:

Determination mode one: among all the class probabilities, the preset physiological data analysis result corresponding to each class probability that reaches the reference threshold is taken as a target physiological data analysis result of the object to be analyzed.

For example, let the reference threshold be $tr$. When the class probability of a disease is greater than the threshold, the disease is regarded as a target physiological data analysis result matching the object to be analyzed. Assuming that, among the four class probabilities $P_1$, $P_2$, $P_3$, $P_4$ listed above, only $P_1$ is greater than $tr$, the disease that the object to be analyzed may have is a cold.
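Threshold-based selection can be sketched in a few lines. The probability values and the threshold of 0.5 are made up for illustration, and the helper name `select_by_threshold` is hypothetical. The comparison uses "greater than or equal to", reading "reaches the reference threshold" literally.

```python
def select_by_threshold(class_probs, tr):
    """Return every class whose probability reaches the reference threshold tr."""
    return [name for name, p in class_probs.items() if p >= tr]


# Hypothetical class probabilities P1..P4 for the four example diseases.
class_probs = {
    "cold": 0.82,
    "hypoglycemia": 0.21,
    "acute gastroenteritis": 0.35,
    "heatstroke": 0.10,
}
matched = select_by_threshold(class_probs, tr=0.5)
```

With this mode the number of returned diseases varies from object to object, and the list can be empty when no class probability reaches the threshold.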

Determination mode two: the class probabilities are sorted, and the preset physiological data analysis results corresponding to the class probabilities that fall within a designated rank range of the sorted result are taken as the target physiological data analysis results of the object to be analyzed.

For example, when the class probabilities are sorted from large to small, the first $m$ can be selected as the target physiological data analysis results of the object to be analyzed; when the class probabilities are sorted from small to large, the last $m$ can be selected as the target physiological data analysis results of the object to be analyzed, where $m$ is a positive integer.

For example, if $P_1 > P_3 > P_2 > P_4$ and $m = 2$, the disease that the object to be analyzed is likely to have is at least one of cold and acute gastroenteritis, of which cold is the most likely.
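Rank-based selection can be sketched in the same style. The probability values are made up so that $P_1 > P_3 > P_2 > P_4$ as in the example, and the helper name `select_top_m` is hypothetical.

```python
def select_top_m(class_probs, m):
    """Return the m classes with the highest probabilities, most likely first."""
    if m <= 0:
        raise ValueError("m must be a positive integer")
    ranked = sorted(class_probs.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:m]]


# Hypothetical probabilities ordered as P1 > P3 > P2 > P4.
class_probs = {
    "cold": 0.82,                   # P1
    "hypoglycemia": 0.21,           # P2
    "acute gastroenteritis": 0.35,  # P3
    "heatstroke": 0.10,             # P4
}
top_two = select_top_m(class_probs, m=2)
```

Unlike the threshold mode, this mode always returns exactly $m$ diseases (or all of them when fewer than $m$ classes exist), regardless of how low the probabilities are.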

It should be noted that the above-listed determining manners are only examples, and any manner of determining the analysis result of the target physiological data of the object to be analyzed based on the probability of each category is applicable to the embodiment of the present application, and repeated details are not repeated.

In the embodiment of the present application, the process of feature extraction may be implemented by machine learning, such as the analysis model proposed in the present application for performing physiological data analysis result matching.

Considering that a medical diagnosis scenario requires high interpretability and a model whose weights remain fully compatible with the meaning of conditional probabilities, a NAM model is used for modeling. Taking the analysis model being a NAM as an example, for each type of feature (namely, the original state information, the object attribute information, and so on), a feature network can be used to calculate the network output $h$ of that feature, and $N$ class probabilities are then predicted based on the outputs of all the feature networks.

In the embodiment of the application, the sample data set for training the NAM can be divided into a training set and a verification set, the NAM is trained by using the training set, for each iteration, a Loss function Loss and a gradient of the Loss are calculated, and the parameters are updated by using an optimizer. After the iterative training is completed, the model is saved. Furthermore, samples of the verification set are input, the Loss and required indexes are calculated, the learning rate is adjusted according to a certain training strategy, and whether the early termination is carried out or not is determined.

For example, suppose the preset number of iterations is 20, but after the 5th iteration the Loss calculated on the verification set no longer decreases and instead oscillates; the iteration can then be stopped early.
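A patience-based rule is one common way to decide on early termination, and is sketched below. The source does not state which stopping criterion is used, so the `patience` and `min_delta` parameters, the function name, and the validation loss values are all assumptions made for illustration.

```python
def early_stop_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on the best
    value seen so far for `patience` consecutive epochs. If that never happens,
    the total number of epochs is returned.
    """
    best = float("inf")
    stale = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    return len(val_losses)


# Made-up validation losses: improving until epoch 5, then oscillating.
val_losses = [0.90, 0.70, 0.60, 0.55, 0.50, 0.52, 0.51, 0.53, 0.50, 0.52]
stop_epoch = early_stop_epoch(val_losses, patience=3)
```

In practice the model saved at the best epoch (here epoch 5) would be the one kept, rather than the model at the stopping epoch.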

After training is completed, the probability $P$ of each class is calculated with the optimal model and used as the output of the model. The diseases are then sorted according to $P$, or the diseases whose probability $P_i$ is higher than a preset threshold are output as the result.

The following illustrates the detailed process of model training:

In an alternative embodiment, the target analysis model may be trained as follows:

Fig. 6 is a schematic diagram of the training process of the target analysis model in the embodiment of the present application. Taking a server as the execution subject, the process specifically includes the following steps S61-S62:

S61: The server obtains a sample data set, where each sample data in the sample data set includes the original state information of a sample object, at least one type of object attribute information of the sample object, and a physiological data analysis result label of the sample object.

For example, each sample data S_l (l = 1, ..., n) has a known feature sequence F_l (i.e., original state information) and a disease label D_l (i.e., a physiological data analysis result label). F_l is a subset of F, in which each element is a symptom feature observed for the sample object; in addition, there is object attribute information such as age, gender, and month of visit. Initialization and fine-tuning are described here taking symptom features as an example. D_l is a subset of the disease set D and represents the true diseases. Since the number of true diseases is not limited in the embodiments of the present application, predicting disease probabilities from a known feature sequence is a multi-label problem.

Specifically, for each sample data, the feature sequence F_l may be mapped to a vector of length K, in which a positive feature has the value 1, a negative feature has the value -1, and an unobserved feature has the value 0. The disease label is mapped to a vector of length N, in which each true disease has the value 1 and the rest are 0. This yields a vectorized representation of the training set. The same mapping is applied to the verification set, so that each sample is represented as a vector pair and a vectorized representation of the verification set is obtained.

Given that the sample data contains the above content, a training set S_train and a verification set S_valid with small data volumes are provided, each set being S = {(F_1, D_1), (F_2, D_2), ..., (F_n, D_n)}. The purpose of model training is to adjust the conditional probability matrix C so that the prediction accuracy of the model is higher.
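The vectorization just described can be sketched as follows. The feature and disease names, the dictionary-based input format, and the helper names vectorize_features and vectorize_labels are hypothetical and chosen only for illustration.

```python
def vectorize_features(observed, feature_index):
    """Map observed features to a length-K vector.

    observed: dict mapping feature name -> True (positive) or False (negative).
    Features that were not observed keep the value 0.
    """
    vec = [0] * len(feature_index)
    for name, present in observed.items():
        vec[feature_index[name]] = 1 if present else -1
    return vec


def vectorize_labels(true_diseases, disease_index):
    """Map the set of true diseases to a length-N 0/1 vector."""
    vec = [0] * len(disease_index)
    for name in true_diseases:
        vec[disease_index[name]] = 1
    return vec


# Hypothetical vocabularies (K = 4 features, N = 3 diseases).
FEATURES = ["fever", "cough", "nausea", "dizziness"]
DISEASES = ["cold", "heatstroke", "acute gastroenteritis"]
feature_index = {name: i for i, name in enumerate(FEATURES)}
disease_index = {name: i for i, name in enumerate(DISEASES)}

# One sample: fever observed as present, cough observed as absent.
x = vectorize_features({"fever": True, "cough": False}, feature_index)
y = vectorize_labels({"cold"}, disease_index)
print(x, y)  # [1, -1, 0, 0] [1, 0, 0]
```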

S62: The server performs loop iterative training on the analysis model to be trained according to the sample data set and outputs a corresponding target analysis model.

In one loop iteration process, the following operations are executed:

S621: The server inputs the selected sample data into the analysis model to be trained to obtain the class probabilities of the sample object corresponding to the various preset physiological data analysis results.

Specifically, this process is the same as the process, described above, of analyzing the physiological data of an object to be analyzed based on the target analysis model. For example, fig. 7 is a schematic structural diagram of the analysis model in an embodiment of the present application. The analysis model is an additive neural network in which each type of feature corresponds to one feature network; a feature network may have one or more layers, and the number of neurons in its last layer equals the number of diseases N. The features input to the analysis model are fed to their respective feature networks (feature network 1, ..., feature network N in fig. 7) for feature extraction; that is, the feature vector formed by the original state information, the object attribute information, and the like in fig. 7 is input in segments to the corresponding feature networks 1 to N. Finally, the hidden-layer outputs of the feature networks are summed and normalized by Sigmoid to obtain a probability vector P composed of the class probabilities. P contains N elements, each of which represents the class probability corresponding to one preset physiological data analysis result (disease).

In the embodiment of the present application, the features (i.e., original state information, object attribute information, etc.) and the labels (i.e., labels of the physiological data analysis results) of the training set and the verification set are vectorized and input to the additive neural network, as shown in fig. 7.

The following description takes the case where the original state information is symptoms and the object attribute information is age and visit time as an example:

For the symptom feature x_s, a one-layer neural network is used, in which the first weight matrix W_s is initialized with the labeled conditional probability matrix C.

During training and inference, the output h_s of the feature network, i.e. the first intermediate result, is calculated according to formula (1-1); it is a vector of length N.

$h_s = g(W_s x_s + b_s)$  Formula (1-1)

Wherein W_s is the first trainable weight matrix, of the same size as C; b_s is a bias vector of length N; and g is an activation function used to regularize the output. Commonly used activation functions include tanh, ReLU, and Leaky ReLU, and different activation functions may be selected for different tasks.
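As a minimal sketch of formula (1-1), the following example computes the output of a one-layer symptom feature network whose weights start as a copy of the conditional probability matrix. The N x K orientation of C (one row per disease, one column per symptom), the numeric values, the zero bias, and the function name feature_network are assumptions made for illustration.

```python
import math


def feature_network(W, x, b, g=math.tanh):
    """Compute h = g(W x + b) for a one-layer feature network.

    W: N x K weight matrix (list of rows), x: length-K input,
    b: length-N bias, g: element-wise activation function.
    """
    return [
        g(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
        for row, b_i in zip(W, b)
    ]


# Hypothetical conditional probability matrix C (N = 2 diseases, K = 3 symptoms);
# C[i][j] is assumed to be the probability of symptom j given disease i.
C = [
    [0.8, 0.1, 0.0],
    [0.2, 0.9, 0.3],
]
W_s = [row[:] for row in C]  # initialise the trainable weights as a copy of C
b_s = [0.0, 0.0]

# Symptom 0 positive, symptom 1 negative, symptom 2 not observed.
x_s = [1, -1, 0]
h_s = feature_network(W_s, x_s, b_s)
print(h_s)
```

Because the weights are a copy of C rather than C itself, fine-tuning can change W_s while the labeled conditional probabilities remain available for comparison.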

Specifically, the present application considers that if a rectified linear unit (ReLU) or a Sigmoid is used as g, the output of every feature network is non-negative, so the score of each category after the subsequent Sigmoid transformation is 0.5 or more. Therefore, in the present disease probability prediction task, tanh or a leaky rectified linear unit (Leaky ReLU) is used.
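The effect of the activation choice can be checked numerically. The sketch below applies a Sigmoid to the output of a single hidden unit under three activations; the pre-activation values and the Leaky ReLU slope of 0.01 are illustrative assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    return max(0.0, z)


def leaky_relu(z, slope=0.01):
    # The slope value is an assumption; any small positive slope behaves similarly.
    return z if z > 0 else slope * z


pre_activations = [-3.0, -0.5, 0.0, 0.5, 3.0]

after_relu = [sigmoid(relu(z)) for z in pre_activations]
after_tanh = [sigmoid(math.tanh(z)) for z in pre_activations]
after_leaky = [sigmoid(leaky_relu(z)) for z in pre_activations]

# ReLU output is never negative, so the probability can never fall below 0.5.
print(min(after_relu))
# tanh and Leaky ReLU can produce negative values, so probabilities below 0.5 are reachable.
print(min(after_tanh), min(after_leaky))
```

The same holds when several non-negative feature network outputs are summed before the Sigmoid, since a sum of non-negative values is still non-negative.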

Similarly, for the age feature x_a, the hidden-layer output h_a of the corresponding feature network, i.e. the second intermediate result corresponding to the object age information, is also a vector of length N and can be calculated according to formula (1-2):

$h_a = g(W_a x_a + b_a)$  Formula (1-2)

Wherein h_a is the hidden-layer output of the feature network corresponding to age, W_a is a trainable second weight matrix, and b_a is a bias vector.

For the visit time feature x_m, the hidden-layer output h_m of the corresponding feature network, i.e. the second intermediate result corresponding to the object visit time information, is also a vector of length N and can be calculated according to formula (1-3):

$h_m = g(W_m x_m + b_m)$  Formula (1-3)

Wherein h_m is the hidden-layer output of the feature network corresponding to the visit time, W_m is also a trainable second weight matrix, and b_m is a bias vector.

Then, the vectors output by the feature networks are added and normalized by a Sigmoid function to obtain the probability P of each category, as shown in formula (2):

$P = f(h_s + h_a + h_m)$  Formula (2)

Wherein P is a vector of length N, and each element P_i is the probability of belonging to the corresponding category, i.e. the category probability. f is a normalization function that maps the value range to [0, 1]. Since the embodiment of the present application is a multi-label classification, the Sigmoid function may be used as f (for a single-label classification problem, the softmax function may be used as f).
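Formula (2) can be sketched as an element-wise sum followed by a Sigmoid. The hidden-layer values below are made-up numbers for N = 3 categories, and class_probabilities is a hypothetical helper name.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def class_probabilities(*hidden_outputs):
    """Sum the length-N outputs of all feature networks, then apply Sigmoid."""
    summed = [sum(values) for values in zip(*hidden_outputs)]
    return [sigmoid(s) for s in summed]


# Hypothetical feature network outputs for symptoms, age and visit time.
h_s = [0.9, -0.4, 0.0]
h_a = [0.6, -0.6, 0.0]
h_m = [0.5, -1.0, 0.0]

P = class_probabilities(h_s, h_a, h_m)
print(P)  # roughly [0.88, 0.12, 0.5]
```

Each element of P is normalized independently, so the probabilities need not sum to 1, which is what allows several diseases to be predicted for the same object.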

S622: The server adjusts the parameters of the analysis model to be trained using a target loss function constructed from the class probabilities of the various preset physiological data analysis results corresponding to the sample objects and the physiological data analysis result labels.

For the training set, the network outputs of all samples are calculated, and the loss function is calculated against the label vectors. Common loss functions for classification problems include cross entropy and its variants. Taking cross entropy as an example, the loss function is calculated as shown in formula (3).

$\mathrm{Loss} = -\sum_i \left[ y_i \log P_i + (1 - y_i) \log(1 - P_i) \right]$  Formula (3)

In formula (3), y_i is the actual probability of each class (i.e., the physiological data analysis result label in the sample data) and is 0 or 1, and P_i is the predicted class probability, i.e. the i-th element of the probability vector P.
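The cross-entropy loss of formula (3) for a single sample can be sketched as follows. The clipping constant eps is an added numerical safeguard that is not part of the formula, and the label and probability values are illustrative.

```python
import math


def multilabel_cross_entropy(y, p, eps=1e-12):
    """Loss = -sum_i [ y_i * log(P_i) + (1 - y_i) * log(1 - P_i) ].

    y: 0/1 label vector, p: predicted class probabilities.
    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for y_i, p_i in zip(y, p):
        p_i = min(max(p_i, eps), 1.0 - eps)
        total -= y_i * math.log(p_i) + (1 - y_i) * math.log(1.0 - p_i)
    return total


# Hypothetical sample: only the first category is a true disease.
y = [1, 0, 0]
p = [0.7, 0.5, 0.4]
loss = multilabel_cross_entropy(y, p)
print(loss)  # -(ln 0.7 + ln 0.5 + ln 0.6), about 1.56
```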

Specifically, the gradient of the Loss is calculated during training, and the parameters are updated using any optimization method. After each iteration, the error and the accuracy on the verification set data are calculated and the model is saved. Training ends when the preset number of iterations is reached, or when the error no longer decreases or the accuracy no longer increases. The model with the highest accuracy or the smallest error is taken as the disease prediction model, namely the target analysis model.

It should be noted that, in the above embodiments, an interpretable high-accuracy method, i.e., an additive neural network, is used to predict the disease probability, and historical data of some sample objects is used to perform fine tuning on the initialized additive neural network, so as to improve the accuracy.

Optionally, in this embodiment of the application, after training, the influence of various features on various categories and parameters may be visualized or printed, and whether the relationship and the parameter value are reasonable or not may be checked, so as to adjust the network or the parameters, and a certain feature may also be removed, which is specifically determined according to an actual situation and is not specifically limited herein.

In the embodiment of the present application, if there is an imbalance between classes in the training set or the verification set, the classes with less data may be upsampled, so that the sample size of each class is the same.

An optional embodiment is to expand the sample data by:

First, the initial sample data is divided into at least one sample data subset according to the preset physiological data analysis result category, and the number of samples in the sample data subset corresponding to each preset physiological data analysis result category is determined. Then, up-sampling is performed on the sample data in each sample data subset whose number of samples does not reach a preset number, to obtain at least one up-sampled sample data. In this way, the sample data set can be constructed based on the up-sampled sample data and the initial sample data.

Fig. 8 is a schematic diagram illustrating a sample data expansion method according to an embodiment of the present application, in which the sample data set includes initial sample data and upsampled sample data, where the initial sample data includes four types of sample data subsets divided according to a preset physiological data analysis result category, and the four types of sample data subsets are represented by rectangles filled with different patterns in fig. 8. And finally, the sample data can be expanded in the above mode and then input into an analysis model for model training and verification, and finally a target analysis model is output.

During the up-sampling, the sample data can be randomly sampled for a plurality of times according to a certain rule, so that the data volume is increased. For example, the age, the time of visit, and the like in the subject attribute information are adjusted, and the present disclosure is not particularly limited.
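The expansion procedure can be sketched as random sampling with replacement within each under-represented category. For simplicity the sketch assumes each sample carries a single category label; the function name upsample, the fixed random seed, and the example samples are hypothetical, and the optional adjustment of attributes such as age or visit time is not shown.

```python
import random
from collections import Counter, defaultdict


def upsample(samples, target_count, seed=0):
    """Return the initial samples plus up-sampled copies.

    samples: list of (features, label) pairs.
    Every category with fewer than target_count samples is topped up
    by drawing randomly, with replacement, from its own samples.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample in samples:
        by_label[sample[1]].append(sample)

    extra = []
    for subset in by_label.values():
        shortfall = target_count - len(subset)
        if shortfall > 0:
            extra.extend(rng.choices(subset, k=shortfall))
    return samples + extra


# Hypothetical initial data: three "cold" samples and one "heatstroke" sample.
samples = [
    ([1, 0], "cold"),
    ([1, -1], "cold"),
    ([0, 1], "cold"),
    ([-1, 1], "heatstroke"),
]
expanded = upsample(samples, target_count=3)
counts = Counter(label for _, label in expanded)
print(counts)
```

The initial samples are kept unchanged at the front of the returned list, so the sample data set contains both the initial and the up-sampled sample data.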

After the sample data is expanded based on the above manner, a large amount of sample data (sample data with labeled labels) can be obtained, further, sample data extraction is performed from the sample data set, for example, the extracted sample data includes original state information of a sample object, object age information, object visit time information and a physiological data analysis result label, and further, the model is trained and verified to obtain a target analysis model.

In the above embodiment, for a scene with only a small amount of data, the method may also be used to perform sample expansion and further train a model, so as to solve the problem that the model cannot be directly trained with only a small amount of data in the related art.

In summary, in the feature network, the embodiment of the application initializes the conditional probability of each symptom, and fine-tunes the conditional probability by using the training set with a small data volume, so that the conditional probability can be automatically adjusted, the accuracy is improved, and the manual labor is reduced.

Referring to fig. 9A, which is a schematic flowchart illustrating a specific flowchart of a physiological data analysis method in an embodiment of the present application, taking a server as an execution subject, an implementation flowchart of the method is as follows:

step S901: the server acquires original state information of an object to be analyzed, age information of the object and visit time information of the object;

step S902: the server inputs the original state information into a first characteristic network in a target analysis model, and respectively inputs the age information and the visit time information of the object into corresponding second characteristic networks in the target analysis model;

step S903: the server extracts the characteristics of the original state information based on a first characteristic network to obtain a first intermediate result;

step S904: the server extracts the characteristics of the corresponding object attribute information based on the corresponding second characteristic networks respectively to obtain a second intermediate result corresponding to the object age information and a second intermediate result corresponding to the object visit time information;

step S905: the server determines a summation result obtained by performing summation based on the first intermediate result, a second intermediate result corresponding to the age information of the object and a second intermediate result corresponding to the visit time information of the object;

step S906: the server carries out normalization processing on the summation result to obtain the class probability of various preset physiological data analysis results corresponding to the object to be analyzed;

step S907: and the server takes a preset physiological data analysis result corresponding to the class probability reaching the reference threshold value in all the class probabilities as a target physiological data analysis result of the object to be analyzed.

Fig. 9B is a schematic diagram illustrating a specific scenario of a method for analyzing physiological data according to an embodiment of the present application. The original state information of the object to be analyzed includes sweating, dizziness, dim eyesight, and nausea; the object age information is 20 years old; and the object visit time information is July 2. After the object to be analyzed inputs this information through a client installed on the terminal device, the terminal device can send the information to a server on which the target analysis model is configured. Feature extraction is then performed by each feature network in the target analysis model to obtain intermediate results, including a first intermediate result corresponding to the original state information, a second intermediate result corresponding to the object age information, and a second intermediate result corresponding to the object visit time information. Further, in the manner of steps S905 to S907 above, the target physiological data analysis result of the object to be analyzed is obtained and fed back to the terminal device, which displays it to the object to be analyzed through the client.
For example, the class probabilities that the server finally obtains for the object to be analyzed through the target analysis model are: heatstroke 0.7, acute gastroenteritis 0.5, cold 0.4, and hypoglycemia 0.3. With a reference threshold of 0.6, the only class probability reaching the reference threshold is the heatstroke probability of 0.7, so the corresponding preset physiological data analysis result is heatstroke. The heatstroke result can then be fed back to the terminal device, which displays it to the object to be analyzed in the intelligent inquiry interface shown in fig. 9B.
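The selection in step S907 can be reproduced with the class probabilities from this example. The function name select_results is hypothetical, and treating "reaching the reference threshold" as greater than or equal to the threshold is an assumption.

```python
def select_results(class_probs, threshold):
    """Return the categories whose probability reaches the threshold,
    ordered from most to least probable."""
    ranked = sorted(class_probs.items(), key=lambda item: item[1], reverse=True)
    return [name for name, prob in ranked if prob >= threshold]


# Class probabilities and reference threshold taken from the example above.
probs = {
    "heatstroke": 0.7,
    "acute gastroenteritis": 0.5,
    "cold": 0.4,
    "hypoglycemia": 0.3,
}
selected = select_results(probs, threshold=0.6)
print(selected)  # ['heatstroke']
```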

Using the physiological data analysis method of the embodiment of the present application in an intelligent pediatric inquiry scenario covering 110 diseases, fine-tuning on a data set of 719 samples allows the parameters to be adjusted automatically. The precision on the verification set rises from 26.4% to 51.7%, a relative increase of 95.8%, and the improvement in P@3 is even larger.

Based on the same inventive concept, the embodiment of the application also provides a device for analyzing physiological data. As shown in fig. 10, which is a schematic structural diagram of an apparatus 1000 for analyzing physiological data, the apparatus may include:

an information obtaining unit 1001 configured to obtain original state information of an object to be analyzed and at least one type of object attribute information related to the object to be analyzed; the original state information comprises physiological data of an object to be analyzed in the environment where the object to be analyzed is located;

a feature extraction unit 1002, configured to perform feature extraction on the original state information based on the conditional probability configured for each preset state information, to obtain a corresponding first intermediate result; the conditional probability represents a probability of occurrence of a preset state information under a condition of a preset physiological data analysis result;

the feature extraction unit 1002 is further configured to perform feature extraction on the at least one type of object attribute information, respectively, to obtain second intermediate results corresponding to the at least one type of object attribute information;

the result analysis unit 1003 is configured to determine category probabilities of various types of preset physiological data analysis results corresponding to the object to be analyzed based on the first intermediate result and the at least one second intermediate result, and determine a target physiological data analysis result of the object to be analyzed based on each category probability.

Optionally, the feature extraction unit 1002 is specifically configured to:

for the first-class object attribute information, determining a second weight matrix for performing feature extraction on the first-class object attribute information based on the matching association degree between each preset physiological data analysis result and the first-class object attribute information;

and performing feature extraction on the attribute information of the first class of objects based on the second weight matrix to obtain a corresponding second intermediate result.

Optionally, the feature extraction unit 1002 is specifically configured to:

inputting original state information of an object to be analyzed into a first feature network in a trained target analysis model;

and performing feature extraction on the original state information based on a first feature network to obtain a first intermediate result, wherein a first weight matrix in the first feature network is determined based on the conditional probability.

Optionally, the feature extraction unit 1002 is specifically configured to:

respectively inputting at least one type of object attribute information of an object to be analyzed into corresponding second feature networks in the target analysis model, wherein each second feature network corresponds to one type of object attribute information;

Optionally, the apparatus further comprises:

a model training unit 1004, configured to train to obtain a target analysis model by:

performing loop iterative training on the analysis model to be trained according to the sample data set, and outputting a corresponding target analysis model; wherein, in a loop iteration process, the following operations are executed:

inputting the selected sample data into an analysis model to be trained to obtain class probabilities of various preset physiological data analysis results corresponding to the sample object;

and performing parameter adjustment on the analysis model to be trained by adopting a target loss function constructed based on class probabilities of various preset physiological data analysis results corresponding to the sample object and physiological data analysis result labels.

Optionally, in the analysis model to be trained, the first weight matrix in the feature network corresponding to the original state information is obtained by initialization based on the conditional probability; the second weight matrix in the feature network corresponding to each of the at least one type of object attribute information is obtained by initialization based on the matching association degree between each preset physiological data analysis result and the corresponding object attribute information;

the model training unit 1004 includes:

Optionally, the model training unit 1004 is specifically configured to:

dividing initial sample data into at least one sample data subset according to a preset physiological data analysis result type, and determining the number of samples in the sample data subset corresponding to each preset physiological data analysis result type;

and constructing a sample data set based on the upsampled sample data and the initial sample data.

Optionally, the result analysis unit 1003 is specifically configured to:

determining a summation result obtained by accumulating and summing the first intermediate result and at least one second intermediate result;

Optionally, the object attribute information includes at least one of the following types: age information of the subject, visit time information of the subject, and sex information of the subject.

Optionally, determining a target physiological data analysis result of the object to be analyzed based on the class probabilities includes:

In the embodiment of the application, feature extraction is performed on the original state information based on the conditional probabilities configured for the preset state information, where each conditional probability represents the probability that one preset state information occurs under the condition of one preset physiological data analysis result. When features are extracted from the original state information on this basis, the physiological data analysis results more strongly associated with the original state information are extracted preferentially, which improves the accuracy of feature extraction. In addition, the original state information of the object to be analyzed and the object attribute information related to the physiological data analysis result are combined to perform multi-class, multi-label classification, and the target physiological data analysis result for the object to be analyzed is determined based on the class probability corresponding to each preset physiological data analysis result, which further improves accuracy.

For convenience of description, the above parts are separately described as modules (or units) according to functional division. Of course, the functionality of the various modules (or units) may be implemented in the same one or more pieces of software or hardware when implementing the present application.

Having described the method and apparatus for analyzing physiological data according to the exemplary embodiment of the present application, an electronic device according to another exemplary embodiment of the present application will be described next.

As will be appreciated by one skilled in the art, aspects of the present application may be embodied as a system, method or program product. Accordingly, various aspects of the present application may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."

The electronic equipment is based on the same inventive concept as the method embodiment. In one embodiment, the electronic device may be a server, such as server 120 shown in FIG. 1. In this embodiment, the electronic device may be configured as shown in fig. 11, and include a memory 1101, a communication module 1103, and one or more processors 1102.

A memory 1101 for storing computer programs executed by the processor 1102. The memory 1101 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, programs required for running an instant messaging function, and the like; the storage data area can store various instant messaging information, operation instruction sets and the like.

The memory 1101 may be a volatile memory, such as a random-access memory (RAM); the memory 1101 may also be a non-volatile memory, such as a read-only memory (ROM), a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or the memory 1101 may be any other medium that can be used to carry or store a desired computer program in the form of instructions or data structures and that can be accessed by a computer, but is not limited to such. The memory 1101 may be a combination of the above memories.

The processor 1102 may include one or more Central Processing Units (CPUs), a digital processing unit, and the like. The processor 1102 is configured to implement the above-described physiological data analysis method when the computer program stored in the memory 1101 is called.

The communication module 1103 is used for communicating with the terminal device and other servers.

In the embodiment of the present application, the specific connection medium among the memory 1101, the communication module 1103, and the processor 1102 is not limited. In fig. 11, the memory 1101 and the processor 1102 are connected through a bus 1104, which is depicted by a thick line; the connection manner between other components is only schematically illustrated and is not limiting. The bus 1104 may be divided into an address bus, a data bus, a control bus, and the like. For ease of description, only one thick line is depicted in fig. 11, but this does not mean that there is only one bus or only one type of bus.

The memory 1101 stores a computer storage medium, and the computer storage medium stores computer-executable instructions for implementing the physiological data analysis method according to the embodiment of the present application. The processor 1102 is configured to perform the above-described physiological data analysis method, as shown in fig. 3.

In another embodiment, the electronic device may also be other electronic devices, such as the terminal device 110 shown in fig. 1. In this embodiment, the structure of the electronic device may be as shown in fig. 12, including: communications assembly 1210, memory 1220, display unit 1230, camera 1240, sensors 1250, audio circuitry 1260, bluetooth module 1270, processor 1280, and the like.

The communication component 1210 is configured to communicate with a server. In some embodiments, a Wireless Fidelity (WiFi) module may be included, the WiFi module belongs to a short-distance Wireless transmission technology, and the electronic device may help the object to send and receive information through the WiFi module.

The memory 1220 may be used for storing software programs and data. Processor 1280 performs various functions of terminal device 110 and data processing by executing software programs or data stored in memory 1220. The memory 1220 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid-state storage device. Memory 1220 stores an operating system that enables terminal device 110 to operate. The memory 1220 may store an operating system and various application programs, and may also store a computer program for executing the method for analyzing physiological data according to the embodiment of the present application.

The display unit 1230 may also be used to display information input by or provided to the object and a graphic object interface of various menus of the terminal device 110. Specifically, the display unit 1230 may include a display screen 1232 disposed on the front surface of the terminal device 110. The display 1232 may be configured in the form of a liquid crystal display, a light emitting diode, or the like. The display unit 1230 may be configured to display an application operation interface (e.g., the intelligent inquiry interface shown in fig. 2) in this embodiment of the application.

The display unit 1230 may be further configured to receive input numeric or character information and generate signal input related to object setting and function control of the terminal device 110, and specifically, the display unit 1230 may include a touch screen 1231 disposed on the front of the terminal device 110 and may collect touch operations of an object thereon or nearby, such as clicking a button, dragging a scroll box, and the like.

The touch screen 1231 may be covered on the display screen 1232, or the touch screen 1231 and the display screen 1232 may be integrated to implement the input and output functions of the terminal device 110, and after integration, the touch screen may be referred to as a touch display screen for short. The display unit 1230 may display the application programs and the corresponding operation steps in this application.

The camera 1240 may be used to capture still images and the subject may post comments on the images taken by the camera 1240 through the application. The number of the cameras 1240 may be one or plural. The object generates an optical image through the lens and projects the optical image to the photosensitive element. The photosensitive element may be a charge coupled device or a complementary metal oxide semiconductor phototransistor. The light sensing elements convert the light signals into electrical signals, which are then passed to a processor 1280 for conversion into digital image signals.

The terminal device may further comprise at least one sensor 1250, such as an acceleration sensor 1251, a distance sensor 1252, a fingerprint sensor 1253, a temperature sensor 1254. The terminal device may also be configured with other sensors such as a gyroscope, barometer, hygrometer, thermometer, infrared sensor, light sensor, motion sensor, etc.

The audio circuit 1260, speaker 1261, and microphone 1262 may provide an audio interface between the object and terminal device 110. The audio circuit 1260 may transmit the electrical signal converted from the received audio data to the speaker 1261, and the speaker 1261 converts the electrical signal into a sound signal and outputs it. Terminal device 110 may also be configured with a volume button for adjusting the volume of the sound signal. On the other hand, the microphone 1262 converts the collected sound signal into an electrical signal, which is received by the audio circuit 1260 and converted into audio data; the audio data is then output to the communication component 1210 to be transmitted to, for example, another terminal device 110, or output to the memory 1220 for further processing.

The bluetooth module 1270 is used for information interaction with other bluetooth devices having bluetooth modules through a bluetooth protocol. For example, the terminal device may establish a bluetooth connection with a wearable electronic device (e.g., a smart watch) that is also equipped with a bluetooth module through the bluetooth module 1270, so as to perform data interaction.

The processor 1280 is a control center of the terminal device, connects various parts of the entire terminal device using various interfaces and lines, and performs various functions of the terminal device and processes data by running or executing software programs stored in the memory 1220 and calling data stored in the memory 1220. In some embodiments, processor 1280 may include one or more processing units; the processor 1280 may also integrate an application processor, which mainly handles operating systems, object interfaces, applications, etc., and a baseband processor, which mainly handles wireless communications. It is to be appreciated that the baseband processor described above may not be integrated into the processor 1280. In the present application, the processor 1280 may run an operating system, an application program, an object interface display, and a touch response, and the method for analyzing physiological data according to the embodiment of the present application. Additionally, processor 1280 is coupled with display unit 1230.

In some possible embodiments, aspects of the method for analyzing physiological data provided herein may also be implemented in the form of a program product including a computer program. When the program product is run on an electronic device, the computer program causes the electronic device to perform the steps of the method for analyzing physiological data according to the various exemplary embodiments of the present application described above in this specification; for example, the electronic device may perform the steps shown in Fig. 3.
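As a non-limiting illustration, the steps that such a computer program causes the electronic device to perform may be sketched in Python as follows. The sketch assumes that the original state information is encoded as a multi-hot vector over the preset state information, that the conditional probabilities are used directly as the entries of the first weight matrix, that the normalization is a softmax, and that the target result is the class with the highest class probability. The function names (`linear`, `softmax`, `analyze`), the weight values, and the labels are hypothetical and are not taken from the application.

```python
import math


def linear(x, weight):
    """Multiply feature vector x (length n) by a weight matrix (n rows, k columns)."""
    k = len(weight[0])
    return [sum(x[i] * weight[i][j] for i in range(len(x))) for j in range(k)]


def softmax(scores):
    """Normalize scores into class probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def analyze(state_vec, attr_vecs, first_weight, second_weights, labels):
    # First intermediate result: feature extraction on the state information.
    first = linear(state_vec, first_weight)
    # One second intermediate result per type of object attribute information.
    seconds = [linear(v, w) for v, w in zip(attr_vecs, second_weights)]
    # Sum the intermediate results per class, then normalize.
    summed = [first[j] + sum(s[j] for s in seconds) for j in range(len(labels))]
    probs = softmax(summed)
    # Assumed decision rule: pick the class with the highest probability.
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs


# Hypothetical example: three preset states, two preset analysis results.
labels = ["result_A", "result_B"]
# Assumed conditional probabilities P(result | state), one row per preset state.
first_weight = [[0.8, 0.2], [0.3, 0.7], [0.5, 0.5]]
# One assumed attribute type (for example, an age bucket) encoded as one-hot.
second_weights = [[[0.1, 0.4], [0.6, 0.2]]]

state_vec = [1, 0, 1]   # states 0 and 2 are present
attr_vecs = [[0, 1]]    # second age bucket

label, probs = analyze(state_vec, attr_vecs, first_weight, second_weights, labels)
print(label, probs)
```

In a trained model the weight matrices would be adjusted from these initial values; the sketch only shows the inference path.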

The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory, a read only memory, an erasable programmable read only memory, an optical fiber, a portable compact disk read only memory, an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The program product of the embodiments of the present application may employ a portable compact disc read-only memory, include a computer program, and be run on an electronic device. However, the program product of the present application is not limited thereto. In this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A readable signal medium may include a propagated data signal with a readable computer program embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electromagnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

The computer program embodied on the readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer programs for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer program may execute entirely on the user computing device, partly on the user computing device, as a stand-alone software package, partly on the user computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user computing device through any kind of network, including a local area network or a wide area network, or may be connected to an external computing device.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having a computer-usable computer program embodied therein.

While the preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all alterations and modifications as fall within the scope of the application.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims of the present application and their equivalents, the present application is intended to include such modifications and variations as well.

Claims

1. A method of analyzing physiological data, the method comprising:

2. The method as claimed in claim 1, wherein said performing feature extraction on the original state information based on the conditional probability configured corresponding to each preset state information to obtain a corresponding first intermediate result comprises:

3. The method according to claim 1, wherein said performing feature extraction on the at least one type of object attribute information respectively to obtain second intermediate results corresponding to the at least one type of object attribute information respectively comprises:

and performing feature extraction on the class object attribute information based on the second weight matrix to obtain a corresponding second intermediate result.

4. The method according to any one of claims 1 to 3, wherein the performing feature extraction on the original state information based on the conditional probability configured for each preset state information to obtain a corresponding first intermediate result comprises:

5. The method according to claim 4, wherein said performing feature extraction on the at least one type of object attribute information respectively to obtain second intermediate results corresponding to the at least one type of object attribute information respectively comprises:

6. The method of claim 4, wherein the target analysis model is trained by:

7. The method of claim 6, wherein, in the analysis model to be trained, the first weight matrix in the feature network corresponding to the original state information is initialized based on the conditional probability; and the second weight matrix in the feature network corresponding to each of the at least one type of object attribute information is initialized based on the matching association degree between each preset physiological data analysis result and each preset physiological data analysis result;

then, the parameter adjustment of the analysis model to be trained by using the target loss function constructed based on the class probability of each type of preset physiological data analysis result corresponding to the sample object and the physiological data analysis result label includes:

8. The method of claim 6, wherein said obtaining a sample data set comprises:

performing upsampling on the sample data in each sample data subset whose number of samples does not reach the preset number, to obtain at least one piece of upsampled sample data;

9. The method according to any one of claims 1 to 3, wherein the determining the class probability of each type of preset physiological data analysis result corresponding to the object to be analyzed based on the first intermediate result and the at least one second intermediate result comprises:

and carrying out normalization processing on the summation result to obtain the class probability of various preset physiological data analysis results corresponding to the object to be analyzed.

10. The method according to any one of claims 1 to 3, wherein the object attribute information comprises at least one of the following: age information of the object, visit time information of the object, and sex information of the object.

11. The method according to any one of claims 1 to 3, wherein the determining the target physiological data analysis result of the object to be analyzed based on the respective class probabilities comprises:

12. An apparatus for analyzing physiological data, comprising:

an information acquisition unit, configured to acquire original state information of an object to be analyzed and at least one type of object attribute information related to the object to be analyzed, wherein the original state information comprises physiological data of the object to be analyzed in the environment of the object to be analyzed;

and a result analysis unit, configured to determine the class probability of each type of preset physiological data analysis result corresponding to the object to be analyzed based on the first intermediate result and at least one second intermediate result, and to determine the target physiological data analysis result of the object to be analyzed based on each class probability.

13. An electronic device, comprising a processor and a memory, wherein the memory stores a computer program which, when executed by the processor, causes the processor to carry out the steps of the method according to any one of claims 1 to 11.

14. A computer-readable storage medium, characterized in that it comprises a computer program for causing an electronic device to carry out the steps of the method according to any one of claims 1 to 11, when said computer program is run on said electronic device.

15. A computer program product, comprising a computer program stored in a computer readable storage medium; when a processor of an electronic device reads the computer program from the computer-readable storage medium, the processor executes the computer program, causing the electronic device to perform the steps of the method of any of claims 1-11.
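
By way of illustration only, the upsampling recited in claim 8 may be sketched in Python as follows. The claim text reproduced here does not specify how the upsampled sample data are generated, so the sketch assumes random duplication with replacement within each sample data subset, where the subsets are assumed to be grouped by physiological data analysis result label. The function name `upsample`, the `seed` parameter, and the sample values are hypothetical.

```python
import random
from collections import Counter, defaultdict


def upsample(samples, preset_number, seed=0):
    """Return a sample list in which every label has at least preset_number samples.

    samples: list of (features, label) pairs.
    Assumption: subsets are grouped by label, and a subset that is too small
    is topped up by drawing from it at random with replacement.
    """
    rng = random.Random(seed)
    subsets = defaultdict(list)
    for features, label in samples:
        subsets[label].append((features, label))

    balanced = []
    for subset in subsets.values():
        balanced.extend(subset)
        shortfall = preset_number - len(subset)
        if shortfall > 0:
            # Duplicate existing samples until the preset number is reached.
            balanced.extend(rng.choices(subset, k=shortfall))
    return balanced


# Hypothetical sample data: three samples labeled "A", one labeled "B".
samples = [([1, 0], "A"), ([0, 1], "A"), ([1, 1], "A"), ([0, 0], "B")]
balanced = upsample(samples, preset_number=3)
counts = Counter(label for _, label in balanced)
print(counts)
```

Duplication is the simplest choice; other upsampling schemes would fit the same interface.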