CN114121265A - Pathological data model construction method based on dynamic sliding window and template matching - Google Patents

Publication number
CN114121265A
Authority
CN
China
Prior art keywords
window
template
vector
data
pathological
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111449009.7A
Other languages
Chinese (zh)
Inventor
曹一彤
齐金鹏
任晴
钟金美
贾灿
袁傲
薛宇鑫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Donghua University
Original Assignee
Donghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Donghua University
Priority to CN202111449009.7A
Publication of CN114121265A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/318 Heart-related electrical modalities, e.g. electrocardiography [ECG]
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/40 Detecting, measuring or recording for evaluating the nervous system
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/60 ICT specially adapted for the handling or processing of medical references relating to pathologies

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Veterinary Medicine (AREA)
  • Artificial Intelligence (AREA)
  • Physiology (AREA)
  • Neurosurgery (AREA)
  • Cardiology (AREA)
  • Neurology (AREA)
  • Evolutionary Computation (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Signal Processing (AREA)
  • Investigating Or Analysing Biological Materials (AREA)

Abstract

A pathological data model construction method based on a dynamic sliding window and template matching. Electrocardiogram (ECG) signals are collected from a patient with a nervous system disease to obtain time-series pathological data, which are divided into fixed-size data segments using a sliding window model. The fluctuation amount and the mean value of each data segment are calculated to obtain two groups of disease condition segmentation thresholds. The data segments are classified by disease condition using the segmentation thresholds to form a pathological data template library. A data segment to be detected is matched against the pathological templates according to the similarity of their features, which realizes the disease diagnosis function of the expert system model. The invention combines a multi-mutation-point detection algorithm with two disease classification strategies to classify the disease state of the patient. Using the dynamic sliding window, the segmentation thresholds, and expert knowledge, it generates pathological templates of various dimensions and lengths, which enriches the template types and pathological information in the expert template library.

Description

Pathological data model construction method based on dynamic sliding window and template matching
Technical Field
The invention relates to the technical field of big data rapid analysis and anomaly detection, in particular to a method for constructing a pathological data model based on dynamic sliding window and template matching.
Background
Abnormal discharges in the human nervous system can induce a variety of paroxysmal disorders such as epilepsy, sleep disorders, and ataxia. The prominent characteristic of these diseases is that their onset is rapid and has no obvious precursor. A fast pace of life, habitually staying up late, and overuse of electronic products can all contribute to paroxysmal nervous system diseases. Currently, diagnosis of such diseases relies on analysis of bioelectric signals such as the electrocardiogram, electroencephalogram, and electromyogram.
Abnormal changes in these electrical signals reveal abnormal neuron activity while the patient is in a disease state. At present, most pathological time-series data are segmented with a fixed sliding window, but this approach is poorly suited to nervous system electrical signals: outside of an episode, the signals of a patient with a nervous system disease do not differ from those of a healthy person, and only at the onset of the disease does the signal fluctuate sharply for a short time. A fixed sliding window is slow and may miss important pathological features in the disease state. If the sliding window can change dynamically with the disease state, both the accuracy and the speed of pathological signal detection can be improved. Furthermore, accurately diagnosing a disease from an electrocardiogram requires a great deal of accumulated experience, so in less developed areas where experienced physicians are not available, the diagnosis and treatment of neurological diseases are difficult. Therefore, classifying the window vector data produced by the dynamic sliding window by disease condition to form pathological templates, and constructing a pathological data expert system model in combination with expert knowledge, is of significance for detecting nervous system diseases.
Disclosure of Invention
To solve the problems described in the background, the invention provides a method for constructing a pathological data model based on a dynamic sliding window and template matching. The construction process comprises the following steps:
Step 1: collect electrocardiogram (ECG) signals from a patient with a nervous system disease to obtain time-series pathological data, and divide the data into fixed-size data segments using a sliding window model;
Step 2: calculate the fluctuation amount and the mean value of each data segment, and obtain two groups of disease condition segmentation thresholds from them;
Step 3: dynamically segment the training data and the data to be detected using a dynamic sliding window model, and classify the data segments by disease condition using the segmentation thresholds from step 2 to form a pathological data template library;
Step 4: match the data segment to be detected against the pathological templates according to the similarity of their features, which realizes the disease diagnosis function of the expert system model;
Step 5: further improve the performance of the expert system model by providing an expert database updating mechanism, a template sorting mechanism, and a preferential selection area.
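As an illustration of the fixed-size segmentation in step 1, the following is a minimal Python sketch. The function name `split_fixed_windows` is hypothetical, and two details are assumptions not stated in the source: the windows do not overlap, and trailing samples that do not fill a whole window are dropped.

```python
from typing import List, Sequence


def split_fixed_windows(signal: Sequence[float], window_size: int) -> List[List[float]]:
    """Split a signal into consecutive windows of equal size.

    Assumptions (not from the source): windows do not overlap, and a
    trailing partial window is discarded.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    n_full = len(signal) // window_size
    return [list(signal[i * window_size:(i + 1) * window_size]) for i in range(n_full)]


# Ten samples with a window size of 4 give two full windows.
segments = split_fixed_windows(list(range(10)), 4)
```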
Preferably, in step 2 of the invention, the fluctuation amount and the mean value of each data segment are calculated and two groups of disease condition segmentation thresholds are obtained from them. The specific process is as follows:
Windows containing mutation points are identified with the TSTKS mutation point detection algorithm, and the fluctuation amount of each window is calculated from the position of the data mutation within it; if a window has no mutation point, its fluctuation amount is recorded as 0. The fluctuation amounts of all windows are combined into a multi-dimensional fluctuation vector and sorted, and the maximum fluctuation amount WF_max is taken as the fluctuation amount segmentation threshold. The mean threshold is extracted in a similar way: the mean of each window is calculated, the means are sorted, and the maximum mean Mean_max is taken as the mean segmentation threshold.
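The threshold extraction can be sketched as follows in Python. This is only an outline under stated assumptions: the TSTKS detector and the fluctuation formula are not reproduced in this text, so both are passed in as functions, and `toy_detect` and `toy_fluctuation` are hypothetical placeholders used solely to make the example runnable.

```python
from statistics import mean
from typing import Callable, List, Optional, Sequence, Tuple

Window = Sequence[float]


def extract_thresholds(
    windows: Sequence[Window],
    detect: Callable[[Window], Optional[int]],
    fluctuation: Callable[[Window, int], float],
) -> Tuple[float, float]:
    """Return (WF_max, Mean_max) over all windows."""
    fluctuations: List[float] = []
    for w in windows:
        cp = detect(w)  # index of the mutation point, or None
        # A window without a mutation point has fluctuation amount 0.
        fluctuations.append(0.0 if cp is None else fluctuation(w, cp))
    means = [mean(w) for w in windows]
    return max(fluctuations), max(means)


def toy_detect(w: Window, min_jump: float = 1.0) -> Optional[int]:
    """Placeholder detector (NOT TSTKS): largest jump between neighbours."""
    jumps = [abs(w[i] - w[i - 1]) for i in range(1, len(w))]
    if not jumps or max(jumps) <= min_jump:
        return None
    return jumps.index(max(jumps)) + 1


def toy_fluctuation(w: Window, cp: int) -> float:
    """Placeholder fluctuation (NOT the patent's formula): shift in mean."""
    return abs(mean(w[cp:]) - mean(w[:cp]))


windows = [[1.0, 1.0, 1.0, 1.0], [1.0, 1.0, 5.0, 5.0], [2.0, 2.0, 2.0, 2.0]]
wf_max, mean_max = extract_thresholds(windows, toy_detect, toy_fluctuation)
```

Passing the detector and the fluctuation measure as parameters keeps the threshold logic independent of how mutation points are actually found.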
Preferably, in step 3 of the method, the training data and the data to be detected are dynamically segmented using a dynamic sliding window model, and the data segments are classified by disease condition using the segmentation thresholds from step 2 to form a pathological data template library. The specific process is as follows:
The real ECG signals of a patient are taken as training data and segmented using the dynamic sliding window model; the positions of the mutation points in each window vector are detected, and the fluctuation ratio of the window vector is calculated according to the following formula:
[Formula for WVFR_i: equation image BDA0003382261950000031, not reproduced in this text]
where WVFR_i is the fluctuation ratio of the i-th window vector; n_cp is the number of mutation points in the window vector; wl_fcp is the window position of the first mutation point in the window vector; and wl_end is the position of the last window in the window vector. The window vector is changed dynamically according to this parameter.
When a window vector contains a mutation point, the first mutation point is located. If it does not fall in the first window, a truncation mechanism is executed: all windows before the first mutation point are merged into a template, and the WVFR is calculated on the remainder of the vector. If the first mutation point falls in the first window, the WVFR of the whole vector is calculated directly. When the WVFR is greater than or equal to 0.3, the data are re-detected with a window vector of dimension 2 and window size 128; when the WVFR is less than 0.3, they are re-detected with a window vector of dimension 2 and window size 256. The window vector thus changes dynamically.
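The truncation mechanism can be illustrated with a short Python sketch. The function name `truncate_window_vector` is hypothetical. One reading is assumed here: the window containing the first mutation point belongs to the remainder on which the WVFR is computed, and a vector with no mutation point is returned whole as the quiet part.

```python
from typing import List, Sequence, Tuple


def truncate_window_vector(has_mutation: Sequence[bool]) -> Tuple[List[int], List[int]]:
    """Split window indices at the first window that contains a mutation point.

    Returns (quiet_prefix, remainder). The prefix would be merged into one
    template; the remainder is the part on which WVFR is then calculated.
    """
    flags = list(has_mutation)
    if True not in flags:
        # No mutation point anywhere: nothing to truncate.
        return list(range(len(flags))), []
    first = flags.index(True)
    return list(range(first)), list(range(first, len(flags)))


# First mutation point in the third window of a 4-dimensional vector.
quiet, remainder = truncate_window_vector([False, False, True, False])
```

When the first mutation point is already in the first window, the prefix is empty and the whole vector is kept, which corresponds to calculating the WVFR of the whole vector directly.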
The window vectors produced by the dynamic sliding window model in step 3 are collected. For each vector, the average of the window means and the average of the window fluctuation amounts are calculated and compared with the two groups of segmentation thresholds from step 2 to obtain the morbidity degree of the window vector. The diagnosis result, the dimension of the vector, and the window width are attached to the window vector as labels to form a complete pathological template, and this is repeated until enough pathological templates have been generated.
Preferably, in step 4 of the invention, the data segment to be detected is matched against the pathological templates according to the similarity of their features, which realizes the disease diagnosis function of the expert system model. The specific process is as follows:
The pathological templates constructed in step 3 are classified by template dimension and window width and stored in an expert template library; the similarity between a template in the library and the pathological data to be detected is calculated from the information recorded on the template. The pathological data to be detected are segmented with the dynamic sliding window model. Templates with the same dimension and window length as the segmented window vector are searched first, and the closeness between the mean and fluctuation amount of each window in the vector and those of the corresponding window in the template vector is then calculated. When both the mean and the fluctuation amount lie within 80% to 120% of the values of the corresponding window in the template, the match succeeds, and the data to be detected are diagnosed directly from the diagnosis result recorded on the template.
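The 80% to 120% matching rule can be sketched in Python as follows. The `Template` class, its field names, and the helper functions are hypothetical; the handling of a zero or negative reference value (the band is simply scaled from the reference, so a reference of 0 only matches 0) is an assumption, since the source does not discuss those cases.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Template:
    window_width: int
    means: List[float]         # one mean per window
    fluctuations: List[float]  # one fluctuation amount per window
    label: str                 # diagnosis result recorded on the template


def within_band(value: float, reference: float, low: float = 0.8, high: float = 1.2) -> bool:
    """True if value lies within 80%-120% of the reference value."""
    lo, hi = sorted((low * reference, high * reference))
    return lo <= value <= hi


def matches(t: Template, means: Sequence[float], fluctuations: Sequence[float]) -> bool:
    """Every window must agree in both mean and fluctuation amount."""
    if len(means) != len(t.means) or len(fluctuations) != len(t.fluctuations):
        return False
    return all(within_band(m, tm) for m, tm in zip(means, t.means)) and all(
        within_band(f, tf) for f, tf in zip(fluctuations, t.fluctuations)
    )


def diagnose(library: Sequence[Template], window_width: int,
             means: Sequence[float], fluctuations: Sequence[float]) -> Optional[str]:
    """Return the label of the first matching template of equal shape, else None."""
    for t in library:
        if t.window_width == window_width and matches(t, means, fluctuations):
            return t.label
    return None


library = [Template(256, [1.0, 2.0], [0.0, 0.5], "mild morbidity")]
hit = diagnose(library, 256, [1.1, 1.8], [0.0, 0.55])
miss = diagnose(library, 256, [1.5, 1.8], [0.0, 0.55])
```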
Preferably, in step 5 of the invention, the performance of the expert system model is further improved by providing an expert database updating mechanism, a template sorting mechanism, and a preferential selection area. The specific process of the expert database updating mechanism is as follows:
If a window vector in the pathological data to be detected in step 4 has not been matched after all templates of equal length have been traversed, its disease condition is determined with the segmentation thresholds used in step 3. The result is recorded on the window vector as a label, and the window vector is added to the expert database as a new template, which updates the pathological template library.
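A minimal Python sketch of this updating mechanism is given below. The dictionary layout and the function name `diagnose_or_learn` are hypothetical, and the matching and threshold classification rules are passed in as functions because the sketch does not reimplement them; the lambdas in the example are toy stand-ins.

```python
from typing import Callable, Dict, List

Vector = Dict[str, object]  # hypothetical layout: dims, width, means, label


def diagnose_or_learn(
    library: List[Vector],
    vector: Vector,
    match: Callable[[Vector, Vector], bool],
    classify: Callable[[Vector], str],
) -> str:
    """Match against templates of equal shape; on failure, label and store the vector."""
    for t in library:
        same_shape = t["dims"] == vector["dims"] and t["width"] == vector["width"]
        if same_shape and match(t, vector):
            return str(t["label"])
    # No template matched: fall back to threshold classification and
    # add the labelled window vector to the library as a new template.
    label = classify(vector)
    library.append({**vector, "label": label})
    return label


library: List[Vector] = []
vec: Vector = {"dims": 2, "width": 256, "means": [1.0, 1.2]}
toy_match = lambda t, v: t["means"] == v["means"]   # stand-in for the 80%-120% rule
toy_classify = lambda v: "normal"                    # stand-in for threshold classification

first = diagnose_or_learn(library, vec, toy_match, toy_classify)   # learned as new template
second = diagnose_or_learn(library, vec, toy_match, toy_classify)  # matched from library
```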
Preferably, in step 5 of the invention, the performance of the expert system model is further improved by providing an expert database updating mechanism, a template sorting mechanism, and a preferential selection area. The specific processes of the template sorting mechanism and the preferential selection area are as follows:
A template sorting mechanism is provided in the pathological template library: templates with the same length and dimension are grouped together and sorted from high to low by their number of successful matches. When pathological data to be detected are input into the expert system model, the templates are tried in order from the most to the least frequently matched, so that the data are matched to a suitable template quickly.
A priority queue selection area that holds three templates is provided in the expert database. When a template in the sorted queue is matched successfully, it is copied into the priority queue. When the next round of template matching starts, the templates in the priority queue are traversed first. Because changes in pathological data have inertia, this shortens the matching time and improves the performance of the expert system model.
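The sorting mechanism and the three-slot preferential selection area can be sketched together in Python. The class `TemplateBank` and its method names are hypothetical. The source does not say which template leaves the priority area when it is full, so evicting the least recently matched one is an assumption of this sketch.

```python
from collections import deque
from typing import Deque, Dict, List


class TemplateBank:
    """Templates of one shape, ranked by match count, with a small priority area."""

    def __init__(self, priority_capacity: int = 3) -> None:
        self.hits: Dict[str, int] = {}  # template name -> successful matches
        # With maxlen set, appendleft on a full deque drops the rightmost item.
        self.priority: Deque[str] = deque(maxlen=priority_capacity)

    def add(self, name: str) -> None:
        self.hits.setdefault(name, 0)

    def search_order(self) -> List[str]:
        """Priority area first, then the rest from most to least matched."""
        ranked = sorted(self.hits, key=lambda n: -self.hits[n])
        first = list(self.priority)
        return first + [n for n in ranked if n not in first]

    def record_match(self, name: str) -> None:
        """Count a successful match and move the template to the front of the priority area."""
        self.hits[name] += 1
        if name in self.priority:
            self.priority.remove(name)
        self.priority.appendleft(name)


bank = TemplateBank()
for name in ("a", "b", "c", "d"):
    bank.add(name)
bank.record_match("c")
bank.record_match("c")
bank.record_match("b")
order = bank.search_order()
```

In this example the most recently matched templates `b` and `c` are tried first, followed by the remaining templates in descending order of match count.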
By adopting the above technical scheme, the invention has the following advantages:
1. The invention combines a multi-mutation-point detection algorithm with two disease classification strategies, based on the fluctuation amount and the mean value, to classify the disease state of the patient.
2. The invention uses a dynamic sliding window model; by changing the window vector dynamically, it reduces detection time in non-diseased data segments and strengthens feature extraction in diseased data segments.
3. The invention uses the dynamic sliding window and the segmentation thresholds, in combination with expert knowledge, to generate pathological templates of various dimensions and lengths, which enriches the template types and pathological information in the expert template library.
4. In template matching, each window in the window vector to be detected is compared with the corresponding window in the template vector, so the trend of the data is taken into account while finding the template whose fluctuation amount and mean value are most similar.
5. The invention speeds up template matching through the template sorting mechanism, and the expert template library updates itself through the updating mechanism.
6. The invention provides a preferential selection area in the expert database, which further improves the performance of the expert system model.
Drawings
FIG. 1 is a flow chart of the pathological data model construction of the present invention.
FIG. 2 is a schematic diagram of the change strategy of the dynamic sliding window model of the present invention.
FIG. 3a is an exemplary diagram of the truncation mechanism when the first mutation point occurs at an odd position in the window vector.
FIG. 3b is an exemplary diagram of the truncation mechanism when the first mutation point occurs at an even position in the window vector.
FIG. 4 is a diagram of the template matching strategy of the present invention.
FIG. 5 is a diagram of the disease segmentation strategy based on fluctuation amount and mean value of the present invention.
FIG. 6 is an exemplary diagram of the expert database template sorting mechanism of the present invention.
FIG. 7 is an exemplary diagram of the preferential selection area of the expert database of the present invention.
Detailed Description
The invention is further illustrated below with reference to specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
Example 1:
The pathological data model construction method based on a dynamic sliding window and template matching specifically comprises the following steps:
Step 1: collect electrocardiogram (ECG) signals from a patient with a nervous system disease to obtain time-series pathological data, and divide the data into fixed-size data segments using a sliding window model.
Step 2: calculate the fluctuation amount and the mean value of each data segment, and obtain two groups of disease condition segmentation thresholds from them.
Windows containing mutation points are identified with the TSTKS mutation point detection algorithm, and the fluctuation amount of each window is calculated from the position of the data mutation within it; if a window has no mutation point, its fluctuation amount is recorded as 0. The fluctuation amounts of all windows are combined into a multi-dimensional fluctuation vector and sorted, and the maximum fluctuation amount WF_max is taken as the fluctuation amount segmentation threshold. The mean threshold is extracted in a similar way: the mean of each window is calculated, the means are sorted, and the maximum mean Mean_max is taken as the mean segmentation threshold.
Step 3: dynamically segment the training data and the data to be detected using a dynamic sliding window model, and classify the data segments by disease condition using the segmentation thresholds from step 2 to form a pathological data template library.
The real ECG signals of a patient are taken as training data and segmented using the dynamic sliding window model; the positions of the mutation points in each window vector are detected, and the fluctuation ratio of the window vector is calculated according to the following formula:
[Formula for WVFR_i: equation image BDA0003382261950000071, not reproduced in this text]
where WVFR_i is the fluctuation ratio of the i-th window vector; n_cp is the number of mutation points in the window vector; wl_fcp is the window position of the first mutation point in the window vector; and wl_end is the position of the last window in the window vector. The window vector is changed dynamically according to this parameter.
When a window vector contains a mutation point, the first mutation point is located. If it does not fall in the first window, a truncation mechanism is executed: all windows before the first mutation point are merged into a template, and the WVFR is calculated on the remainder of the vector. If the first mutation point falls in the first window, the WVFR of the whole vector is calculated directly. When the WVFR is greater than or equal to 0.3, the data are re-detected with a window vector of dimension 2 and window size 128; when the WVFR is less than 0.3, they are re-detected with a window vector of dimension 2 and window size 256. The window vector thus changes dynamically.
The window vectors produced by the dynamic sliding window model in step 3 are collected. For each vector, the average of the window means and the average of the window fluctuation amounts are calculated and compared with the two groups of segmentation thresholds from step 2 to obtain the morbidity degree of the window vector. The diagnosis result, the dimension of the vector, and the window width are attached to the window vector as labels to form a complete pathological template, and this is repeated until enough pathological templates have been generated.
Step 4: match the data segment to be detected against the pathological templates according to the similarity of their features, which realizes the disease diagnosis function of the expert system model.
The pathological templates constructed in step 3 are classified by template dimension and window width and stored in an expert template library; the similarity between a template in the library and the pathological data to be detected is calculated from the information recorded on the template. The pathological data to be detected are segmented with the dynamic sliding window model. Templates with the same dimension and window length as the segmented window vector are searched first, and the closeness between the mean and fluctuation amount of each window in the vector and those of the corresponding window in the template vector is then calculated. When both the mean and the fluctuation amount lie within 80% to 120% of the values of the corresponding window in the template, the match succeeds, and the data to be detected are diagnosed directly from the diagnosis result recorded on the template.
Step 5: further improve the performance of the expert system model by providing an expert database updating mechanism, a template sorting mechanism, and a preferential selection area.
The specific process of the expert database updating mechanism is as follows:
If a window vector in the pathological data to be detected in step 4 has not been matched after all templates of equal length have been traversed, its disease condition is determined with the segmentation thresholds used in step 3. The result is recorded on the window vector as a label, and the window vector is added to the expert database as a new template, which updates the pathological template library.
The specific processes of the template sorting mechanism and the preferential selection area are as follows:
A template sorting mechanism is provided in the pathological template library: templates with the same length and dimension are grouped together and sorted from high to low by their number of successful matches. When pathological data to be detected are input into the expert system model, the templates are tried in order from the most to the least frequently matched, so that the data are matched to a suitable template quickly.
A priority queue selection area that holds three templates is provided in the expert database. When a template in the sorted queue is matched successfully, it is copied into the priority queue. When the next round of template matching starts, the templates in the priority queue are traversed first. Because changes in pathological data have inertia, this shortens the matching time and improves the performance of the expert system model.
Example 2:
The pathological data model construction method based on a dynamic sliding window and template matching specifically comprises the following steps:
Step 1: obtain time-series pathological data by collecting electrocardiogram (ECG) signals from a patient with a nervous system disease, then preprocess and segment the data using a sliding window model.
Step 2: calculate the fluctuation amount and the mean value of each data segment, and obtain two groups of disease condition segmentation thresholds from them. This step comprises the following:
2a) Segmentation threshold extraction based on fluctuation amount and mean value
First, the learning data are segmented using a fixed-length sliding window model, and the TSTKS algorithm is used to detect whether each window contains a mutation point. A mutation point indicates that the data in the window fluctuate abnormally, and the fluctuation amount of each window is defined as follows:
[Formula for the window fluctuation amount WF: equation image BDA0003382261950000091, not reproduced in this text]
In the formula, Var(Z_L) is the variance of the data to the left of the mutation point in the window, Var(Z_R) is the variance of the data to the right of the mutation point, Z is the whole of the data in the window including the mutation point, and max and min denote the maximum and minimum values. If a window has no mutation point, its fluctuation amount is recorded as 0. The fluctuation amounts of all windows are sorted, and the maximum fluctuation amount WF_max is taken as the fluctuation amount segmentation threshold. The mean threshold is extracted in a similar way: the mean of each window is calculated, the means are sorted, and the maximum mean Mean_max is taken as the mean segmentation threshold.
2b) Multi-threshold pathological data classification method
The multi-threshold segmentation method comprises two disease condition segmentation strategies. Strategy 1 is a multi-threshold segmentation strategy based on the fluctuation amount: using the window fluctuation amount and the maximum fluctuation amount WF_max, the disease condition is divided into four states, "normal", "suspected morbidity", "mild morbidity", and "severe morbidity". However, when the patient is in a continuous disease state, the distribution of the data may not change significantly, so no mutation point is detected, and strategy 1 then misclassifies part of the diseased data segments as "normal". Therefore, when strategy 1 judges the pathological data to be normal, strategy 2 is applied. Strategy 2 is a multi-threshold segmentation strategy based on the mean value: the mean interval [Mean_min, Mean_max] is divided evenly into 4 sub-intervals, which correspond to the four states "normal", "suspected morbidity", "mild morbidity", and "severe morbidity". The disease state of the current window is judged from the sub-interval in which the window mean falls.
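Strategy 2 and its use as a fallback for strategy 1 can be sketched in Python as follows. The function names are hypothetical. The cut points that strategy 1 applies relative to WF_max are not given in this text, so the strategy 1 result is taken as an input rather than computed; clamping means that fall outside [Mean_min, Mean_max] to the nearest state is also an assumption.

```python
STATES = ("normal", "suspected morbidity", "mild morbidity", "severe morbidity")


def classify_by_mean(window_mean: float, mean_min: float, mean_max: float) -> str:
    """Strategy 2: place the window mean in one of 4 equal sub-intervals."""
    if mean_max <= mean_min:
        raise ValueError("mean_max must be greater than mean_min")
    position = (window_mean - mean_min) / (mean_max - mean_min)
    # Clamp so that values at or beyond the ends map to the first or last state.
    index = min(3, max(0, int(position * 4)))
    return STATES[index]


def classify_window(strategy1_state: str, window_mean: float,
                    mean_min: float, mean_max: float) -> str:
    """Apply strategy 2 only when strategy 1 reported 'normal'."""
    if strategy1_state != "normal":
        return strategy1_state
    return classify_by_mean(window_mean, mean_min, mean_max)


# Strategy 1 saw no mutation point, but the window mean is high.
state = classify_window("normal", 3.5, 0.0, 4.0)
```

This shows how a sustained episode with no detectable mutation point can still be labelled as diseased through its mean value.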
Step 3: generate a sufficient number of pathology templates based on the proposed dynamic sliding window model and the segmentation thresholds from step 2. The process comprises two steps:
3a) segmenting training data using dynamic sliding windows
A fixed-length sliding window detects pathological time-series data slowly and generates only a single kind of template. To address these problems, the invention provides a dynamic adjustment strategy for the sliding window based on the window vector fluctuation ratio, which is calculated as follows:
Figure BDA0003382261950000101
In the formula, $WVFR_i$ represents the fluctuation ratio of the i-th window vector; $n_{cp}$ represents the number of mutation points in the window vector; $wl_{fcp}$ represents the window position of the first mutation point in the window vector; and $wl_{end}$ represents the position of the last window in the window vector.
The strategy groups several sliding windows into a window vector, detects the mutation points in the windows with the TSTKS algorithm, and dynamically adjusts the window width and the window vector dimension according to the proportion of windows in the vector that contain mutation points. Data segments in a non-disease state are detected with window vectors of large window width and large dimension; data segments in a disease state are detected with window vectors of small window width and small dimension, so that important pathological features are not missed.
When training data is input, the initial window vector has 4 dimensions and each window has a size of $2^8$. If no mutation point is detected in the 4 consecutive windows, the window vector is enlarged to 8 dimensions with a window size of $2^8$. If mutation points are detected in the 4 windows, the first mutation point is located and a truncation mechanism is applied: the windows before the first mutation point are generated as a separate template, the window vector fluctuation ratio is calculated for the windows after the first mutation point, and the dimension and window width of the detecting window vector are adjusted according to that ratio. When the window vector fluctuation ratio is in (0, 0.3], a 2-dimensional window vector with a window width of $2^8$ re-detects the data segment after the first mutation point. When the ratio is in (0.3, 1], the segment contains many mutation points and its data distribution is abnormal, so it is re-detected with the smallest window vector, that is, a 2-dimensional vector with a window width of $2^7$.
With the dynamic sliding window, the training data is decomposed according to the fluctuation ratio into window vectors of four dimensions: 2 dimensions with a window size of $2^7$ or $2^8$; 4 dimensions with a window size of $2^8$; 6 dimensions with a window size of $2^8$; and 8 dimensions with a window size of $2^8$.
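The adjustment rule of step 3a) can be illustrated with a short sketch. The exact expression for the fluctuation ratio is given only as a formula image in the source, so the form used below (windows containing a mutation point divided by the number of windows from the first such window to the last window) is an assumption inferred from the symbol definitions, not the patent's confirmed formula. The function names are hypothetical, and the TSTKS detector is replaced by a precomputed list of per-window flags.

```python
from typing import List, Tuple


def window_vector_fluctuation_ratio(has_cp: List[bool]) -> float:
    """ASSUMED form of WVFR: n_cp / (wl_end - wl_fcp + 1).

    has_cp[i] is True when window i contains a mutation point
    (in the source this comes from the TSTKS detector).
    """
    if True not in has_cp:
        return 0.0
    first = has_cp.index(True)          # window position of the first mutation point
    span = len(has_cp) - first          # windows from the first hit to the last window
    return sum(has_cp[first:]) / span


def next_window_vector(has_cp: List[bool]) -> Tuple[int, int]:
    """Return (dimension, window width) for the next detection pass."""
    ratio = window_vector_fluctuation_ratio(has_cp)
    if ratio == 0.0:
        return (8, 2 ** 8)   # no mutation point: larger vector, large windows
    if ratio <= 0.3:
        return (2, 2 ** 8)   # ratio in (0, 0.3]: small vector, large windows
    return (2, 2 ** 7)       # ratio in (0.3, 1]: smallest window vector
```

For example, a 4-window vector whose only mutation point lies in the first window has a ratio of 0.25 under this assumed form and is re-detected with a 2-dimensional vector of width $2^8$.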
3b) Generation of pathology templates
The training data is taken from real pathological data. It is divided by the dynamic sliding window into window vectors of various dimensions and window widths, the average mean and the average fluctuation amount of the windows in each vector are calculated, and the disease state is classified with the segmentation thresholds obtained in step 2, which gives the disease degree corresponding to each window vector. The disease degree, together with the dimension of the vector and the width of the window, serves as the label, and the disease state is judged in combination with expert knowledge to form a complete pathology template. A sufficient number of pathology templates is generated in this way, and together they form the expert template library.
Step 4: construct an expert system model from the pathology templates generated in steps 1 to 3 and the dynamic sliding window model, in combination with a template matching method. The process comprises the following three steps:
4a) pathological data template matching
In addition to the disease degree, the label of each pathology template constructed in step 3 records the mean and the fluctuation amount of every window in the template. These label features make it possible to calculate the degree of matching between a template in the expert library and the pathological data to be detected. The pathological data to be detected is first segmented with the dynamic sliding window. According to the dimension and window width of each segmented window vector, templates with the same width and dimension are searched for first in the expert library, and the mean and fluctuation amount of each window in the vector are then compared with those of the corresponding window in the pathology template vector. If both the mean and the fluctuation amount lie within 80% to 120% of the mean and the fluctuation amount of the corresponding template window, the matching succeeds; otherwise it fails.
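A minimal sketch of the matching test described above is given below. The `Template` record and its field names are hypothetical, chosen only to hold the label information the text lists (disease degree, window width, per-window means and fluctuation amounts). Sorting the band limits so that negative reference values still produce a valid interval is also an assumption, since the source does not discuss negative means.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Template:
    """Hypothetical container for one pathology template."""
    label: str                  # disease degree, e.g. "mild morbidity"
    window_width: int           # e.g. 128 or 256 samples
    means: List[float]          # per-window mean
    fluctuations: List[float]   # per-window fluctuation amount

    @property
    def dimension(self) -> int:
        return len(self.means)


def within_band(value: float, reference: float,
                low: float = 0.8, high: float = 1.2) -> bool:
    """True when value lies in the 80%-120% band around reference."""
    lower, upper = sorted((low * reference, high * reference))
    return lower <= value <= upper


def matches(template: Template, width: int,
            means: List[float], fluctuations: List[float]) -> bool:
    """Compare a window vector with a template, window by window."""
    if template.window_width != width or template.dimension != len(means):
        return False
    return all(
        within_band(m, tm) and within_band(f, tf)
        for m, tm, f, tf in zip(means, template.means,
                                fluctuations, template.fluctuations)
    )
```

Note that a template window with a fluctuation amount of 0 (no mutation point) can only be matched by a window whose fluctuation amount is also 0 under this rule.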
4b) Ranking and updating of pathology templates
To speed up template matching, the expert library includes a mechanism that sorts templates by frequency of occurrence. As shown in fig. 6, templates with the same window width and dimension are grouped together and dynamically ordered from high to low by their number of successful matches. The more often a template has matched, the more often the corresponding pathological condition appears in actual pathological data, and the more likely the template is to match again. When real pathological data is input for matching, the templates are tried in order from high frequency to low frequency, so that the data is quickly matched to a suitable template.
If a window vector in the pathological data to be detected still has no match after all templates of equal width and dimension have been traversed, its disease state is classified with the segmentation thresholds trained in step 2. The classification result is recorded on the window vector as its label, and the window vector is added to the expert library as a new template that takes part in the sorting mechanism, thereby updating the pathology template library.
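The sorting and updating mechanism can be sketched as a small library class. All names here are hypothetical, the matching predicate and the threshold classifier are passed in as callables so the sketch stays independent of how they are implemented, and the initial hit count of 0 for a newly added template is an assumption that the source does not specify.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List, Tuple

Key = Tuple[int, int]  # (vector dimension, window width)


class TemplateLibrary:
    """Templates grouped by (dimension, width), ordered by match count."""

    def __init__(self) -> None:
        self._groups: Dict[Key, List[dict]] = defaultdict(list)

    def lookup(self, key: Key, features: Any,
               is_match: Callable[[Any, Any], bool],
               classify: Callable[[Any], str]) -> str:
        group = self._groups[key]
        # Traverse from most to least frequently matched template.
        for entry in group:
            if is_match(entry["features"], features):
                entry["hits"] += 1
                # Stable sort keeps the earlier order among equal counts.
                group.sort(key=lambda e: e["hits"], reverse=True)
                return entry["label"]
        # No template matched: label by thresholds and store as a new template.
        label = classify(features)
        group.append({"features": features, "label": label, "hits": 0})
        return label

    def labels(self, key: Key) -> List[str]:
        """Labels of one group in current matching order."""
        return [entry["label"] for entry in self._groups[key]]
```

Re-sorting after every hit keeps the sketch short; a larger library might instead move the matched entry forward only when its count overtakes its neighbour.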
4c) Setting a preferred region
During the onset of a nervous system disease, changes in the patient's signals, such as electrocardiogram, electroencephalogram and electromyogram signals, exhibit inertia. Based on this principle, a priority queue selection area that can hold three templates is provided in the expert library. When a template in the sorting queue matches successfully, it is copied into the priority queue. When the next round of template matching starts, the templates in the priority queue are traversed first. Exploiting the inertia of pathological signal changes in this way raises the probability of a successful match and shortens the matching time.
As shown in fig. 7, the templates in the priority queue selection area are themselves updated. Because the area holds at most three templates, a template in the queue that fails to match in three matching rounds is pushed out of the queue by a template that has matched successfully, which completes the update of the priority queue.
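One way to sketch the preferred region is a three-slot cache with least-recently-matched eviction. This is an approximation and not the patent's stated rule: with a capacity of three and one successful match per round, an entry is evicted only after three other templates have matched since its own last match, which roughly corresponds to failing three rounds. The class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Hashable, List


class PriorityArea:
    """Holds the most recently matched templates (three by default)."""

    def __init__(self, capacity: int = 3) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[Hashable, Any]" = OrderedDict()

    def candidates(self) -> List[Any]:
        """Templates to try first, most recently matched first."""
        return list(reversed(self._items.values()))

    def record_match(self, template_id: Hashable, template: Any) -> None:
        """Copy a successfully matched template into the area."""
        if template_id in self._items:
            # Already present: refresh its position.
            self._items.move_to_end(template_id)
            return
        if len(self._items) >= self.capacity:
            # ASSUMPTION: evict the entry that has gone longest without a match.
            self._items.popitem(last=False)
        self._items[template_id] = template
```

A caller would traverse `candidates()` before falling back to the frequency-sorted groups, and call `record_match` after every successful match.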
Example 3:
As shown in fig. 1, the method for constructing a pathological data model based on a dynamic sliding window and template matching specifically includes the following steps:
(1) Electrocardiosignals of a patient with a nervous system disease are collected to obtain time-series pathological data.
(2) The data is preprocessed with a sliding window model: the pathological data is segmented with a fixed sliding window, the mutation points of each window are detected with the TSTKS algorithm, the fluctuation amount and mean of each window are calculated, and after sorting the maximum fluctuation amount and the maximum mean are used to calculate the segmentation thresholds.
(3) The training data is segmented dynamically with the dynamic sliding window model according to the window vector fluctuation ratio. Following the strategy shown in fig. 2, several sliding windows are combined into a window vector and the first mutation point in the window vector is located. The truncation mechanism combines the windows before the first mutation point and generates a separate template from them; the subsequent windows are combined into a window vector, the vector fluctuation ratio is calculated, and detection is repeated with dynamic sliding windows whose dimension and window width are chosen according to that ratio. As shown in fig. 3a and 3b, when the first mutation point appears at an even-numbered position, the window containing the first mutation point is also included in the truncated part, so that a template with an even number of dimensions is generated and the dimensions of all pathology templates stay within {2, 4, 6, 8}. This yields pathological data templates with various window widths and dimensions, and the average fluctuation amount and average mean of each pathology template are calculated. As shown in fig. 4 and 5, the disease state of each pathology template is diagnosed by combining the two sets of segmentation thresholds obtained in (2) with expert knowledge. A sufficient number of pathology templates is obtained by learning from a large amount of training data, and the templates are placed in the expert library.
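The even-dimension truncation rule in (3) can be sketched as below. The function name is hypothetical, positions are counted from 1 as in the text, and returning 0 when the vector has no mutation point is a convention of this sketch (that case is handled by enlarging the window vector rather than by truncation).

```python
from typing import List


def truncated_template_size(has_cp: List[bool]) -> int:
    """Number of leading windows emitted as a separate template.

    has_cp[i] is True when window i contains a mutation point.
    Returns 0 when nothing is truncated (first mutation point in the
    first window, or no mutation point at all).
    """
    if True not in has_cp:
        return 0
    before = has_cp.index(True)   # windows strictly before the first mutation point
    if before == 0:
        return 0                  # first mutation point is in the first window
    if before % 2 == 1:
        # First mutation point sits at an even 1-based position: include
        # its window so the template has an even number of dimensions.
        return before + 1
    return before
```

Applied to an 8-window vector, the rule can only produce truncated templates of 2, 4, 6 or 8 windows, which is consistent with the template dimensions listed in the text.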
(4) The input pathological data to be detected is segmented with the dynamic sliding window model, the fluctuation amount and mean of each window in the window vector are calculated, and templates of equal window width and dimension are searched for in the template library. The mean and fluctuation amount of each window in the vector are then compared with those of the corresponding window in the pathology template vector. Matching succeeds only if two conditions hold:
$$0.8\,WF_i \le wf_i \le 1.2\,WF_i$$
In the formula, $wf_i$ represents the fluctuation amount of the i-th window in the vector to be detected, and $WF_i$ represents the fluctuation amount of the i-th window in the template.
$$0.8\,Mean_i \le mean_i \le 1.2\,Mean_i$$
In the formula, $mean_i$ represents the mean of the i-th window in the vector to be detected, and $Mean_i$ represents the mean of the i-th window in the template. If both conditions are met, the matching succeeds; otherwise it fails.
(5) For a window vector that fails to match in step (4), the disease state is judged with the segmentation thresholds, the result is marked on the vector, and the vector is added to the expert library as a new template, so that the templates in the expert system are updated automatically.
(6) A template sorting mechanism is provided in the expert system model. As shown in fig. 6, templates of equal dimension and equal window width are sorted from high to low frequency by their number of successful matches. During template matching the templates are traversed in frequency order, so high-frequency templates are tried first and the time needed to find a suitable template is shortened.
(7) A preferred region is provided in the expert system model. As shown in fig. 7, it is a priority queue selection area that can hold three templates. When a template in the sorting queue matches successfully, it is copied into the priority queue. When the next round of template matching starts, the templates in the priority queue are traversed first. Because pathological data changes exhibit inertia, this shortens the matching time and improves the performance of the expert system model.
The invention combines a multi-mutation-point detection algorithm with two disease classification strategies, based on the fluctuation amount and on the mean, to classify the patient's disease state effectively. The dynamic sliding window model relies on the dynamic change of the window vector to reduce the detection time for disease-free data segments and to strengthen feature extraction for diseased data segments. The dynamic sliding window and the segmentation thresholds, combined with expert knowledge, generate pathology templates of various dimensions and window widths, which greatly enriches the types of templates and the pathological information in the expert template library. In template matching, each window in the window vector to be detected is compared with the corresponding window in the template vector, so that the trend of the data is taken into account while the template whose fluctuation amounts and means are most similar is found. The template sorting mechanism increases the matching speed and, together with the updating mechanism, allows the expert template library to update itself. The preferred region in the expert library further improves the performance of the expert system model.
It should be noted that the various features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. To avoid unnecessary repetition, the possible combinations are not described separately.

Claims (6)

1. A construction method of a pathology data model based on dynamic sliding window and template matching is characterized by comprising the following steps:
step 1, collecting electrocardiosignals of a patient with nervous system disease, acquiring time sequence pathological data, and dividing the time sequence pathological data into data segments with fixed sizes by using a sliding window model;
step 2, calculating the fluctuation quantity and the mean value of each data segment, and acquiring two groups of disease condition segmentation threshold values according to the fluctuation quantity and the mean value;
step 3, dynamically segmenting the training data and the data to be detected by using a dynamic sliding window model, and classifying the disease conditions of the data segments by combining the segmentation threshold in the step 2 to form a pathological data template base;
step 4, carrying out template matching on the data segment to be detected according to the characteristic similarity degree of the data segment to be detected and the pathological template, and realizing the disease condition diagnosis function of the expert system model;
and step 5, further improving the performance of the expert system model by setting an expert database updating mechanism, a template sorting mechanism and a preferential selection area.
2. The construction method according to claim 1, wherein the specific process of the step 2 is as follows:
identifying windows with mutation points by using the TSTKS mutation point detection algorithm, and calculating the fluctuation amounts of the windows according to the positions of data mutation in each window; if no mutation point exists, recording the window fluctuation as 0; combining the fluctuation amounts of the windows into a multi-dimensional fluctuation vector, sorting the fluctuation amounts in the fluctuation vector, and taking the maximum fluctuation amount $WF_{\max}$ as the fluctuation amount division threshold; the mean threshold is extracted in a manner similar to the fluctuation amount threshold: the mean of each window is calculated and sorted, and the maximum mean $Mean_{\max}$ is taken as the mean division threshold.
3. The construction method according to claim 2, wherein the specific process of the step 3 is:
taking the real electrocardiosignals of a patient as training data, segmenting the training data by using a dynamic sliding window model, detecting the position of a mutation point in a window vector, and calculating the fluctuation ratio of the window vector, wherein the calculation formula is as follows:
Figure FDA0003382261940000021
wherein $WVFR_i$ represents the fluctuation ratio of the i-th window vector; $n_{cp}$ represents the number of mutation points in the window vector; $wl_{fcp}$ represents the window position of the first mutation point in the window vector; $wl_{end}$ represents the position of the last window in the window vector; the dynamic change of the window vector is realized according to this parameter;
when a mutation point exists in a window vector, the position of a first mutation point is positioned, if the first mutation point does not appear in a first window, a truncation mechanism is executed, all windows before the first mutation point are combined into a template, and the WVFR is calculated by the rest of the vector; if the first mutation point is in the first window of the vector, the WVFR of the whole vector is directly calculated, when the WVFR is larger than or equal to 0.3, the window vector with the dimension of 2 and the window size of 128 is used for redetection, and when the WVFR is smaller than 0.3, the window vector with the dimension of 2 and the window size of 256 is used for redetection; thereby realizing dynamic change of the window vector;
summarizing window vectors segmented by the dynamic sliding window model in the step 3, calculating the average mean value and the average fluctuation amount of each window in the vectors, and comparing the average mean value and the average fluctuation amount with the two groups of segmentation threshold values in the step 2 to obtain the morbidity degree corresponding to the window vector; the disease condition diagnosis result of the vector, the dimension of the vector and the width of the window are used as labels and marked on the window vector to form a complete pathological template and generate enough pathological templates.
4. The construction method according to claim 1, wherein the specific process of the step 4 is as follows:
classifying and storing the pathology template constructed in the step 3 into an expert template library according to the dimension of the template and the width of a window, and calculating the similarity degree of the template in the expert library and the pathology data to be detected by means of information on the template; the pathological data to be detected is segmented by means of a dynamic sliding window model, templates with the same length and dimension are preferentially searched in an expert database according to the dimension and the length of a segmented window vector, and then the degree of closeness of the mean value and the fluctuation quantity of each window in the vector and the mean value and the fluctuation quantity of a corresponding window in a pathological template vector is calculated; and when the mean value and the fluctuation amount are both in the interval of 80% -120% of the mean value and the fluctuation amount of the corresponding window in the template, indicating that the matching is successful, and quickly diagnosing the data to be detected according to the disease diagnosis result corresponding to the data in the template.
5. The construction method according to claim 4, wherein the specific process of the expert database update mechanism in the step 5 is as follows:
if the window vector in the pathological data to be detected in the step 4 is not successfully matched after traversing all the templates with the equal length, dividing the disease condition through the division threshold value trained in the step 3, taking the division result as a label, recording the division result on the window vector, taking the window vector as a new template, and adding the new template into the expert database to update the pathological template database.
6. The construction method according to claim 4, wherein the specific process of the expert library template sorting mechanism and the preferential selection area in the step 5 is as follows:
arranging a template sorting mechanism in a pathology template library, sorting templates with the same length and dimension together, and sorting the templates from high to low according to the successful matching times; when pathological data to be detected are input into an expert system model, templates are sequentially matched from high frequency to low frequency, so that pathological data can be quickly matched with a proper template;
a priority queue selection area for accommodating three templates is arranged in the expert database; when a certain template in the sequencing queue is successfully matched, copying the template into a priority queue; when the next round of template matching starts, firstly traversing the templates in the priority queue; according to the characteristic that the pathological data change has inertia, the matching time is shortened, and the model performance of the expert system is improved.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111449009.7A CN114121265A (en) 2021-11-29 2021-11-29 Pathological data model construction method based on dynamic sliding window and template matching

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111449009.7A CN114121265A (en) 2021-11-29 2021-11-29 Pathological data model construction method based on dynamic sliding window and template matching

Publications (1)

Publication Number Publication Date
CN114121265A true CN114121265A (en) 2022-03-01

Family

ID=80369498

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111449009.7A Pending CN114121265A (en) 2021-11-29 2021-11-29 Pathological data model construction method based on dynamic sliding window and template matching

Country Status (1)

Country Link
CN (1) CN114121265A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115659162A (en) * 2022-09-15 2023-01-31 云南财经大学 Method, system and equipment for extracting features in radar radiation source signal pulse
CN115659162B (en) * 2022-09-15 2023-10-03 云南财经大学 Method, system and equipment for extracting intra-pulse characteristics of radar radiation source signals
CN116912248A (en) * 2023-09-13 2023-10-20 惠州市耀盈精密技术有限公司 Irregular hardware surface defect detection method based on computer vision
CN116912248B (en) * 2023-09-13 2024-01-05 惠州市耀盈精密技术有限公司 Irregular hardware surface defect detection method based on computer vision

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination