CN114121265A

CN114121265A - Pathological data model construction method based on dynamic sliding window and template matching

Info

Publication number: CN114121265A
Application number: CN202111449009.7A
Authority: CN
Inventors: 曹一彤; 齐金鹏; 任晴; 钟金美; 贾灿; 袁傲; 薛宇鑫
Original assignee: Donghua University
Current assignee: Donghua University
Priority date: 2021-11-29
Filing date: 2021-11-29
Publication date: 2022-03-01

Abstract

A method for constructing a pathological data model based on a dynamic sliding window and template matching. Electrocardiosignals of a patient with a nervous system disease are collected to obtain time-series pathological data, which are divided into fixed-size data segments using a sliding window model; the fluctuation amount and the mean value of each data segment are calculated to obtain two groups of disease condition segmentation thresholds; the data segments are classified by disease condition using the segmentation thresholds to form a pathological data template library; and template matching is performed on a data segment to be detected according to the similarity between its features and those of the pathological templates, thereby realizing the disease diagnosis function of the expert system model. The invention combines a multi-mutation-point detection algorithm and provides two disease classification strategies, which effectively classify the disease state of the patient; using the dynamic sliding window and the segmentation thresholds together with expert knowledge, pathological templates of various dimensions and lengths are generated, greatly enriching the types of templates and the pathological information in the expert template library.

Description

Pathological data model construction method based on dynamic sliding window and template matching

Technical Field

The invention relates to the technical field of big data rapid analysis and anomaly detection, in particular to a method for constructing a pathological data model based on dynamic sliding window and template matching.

Background

Abnormal discharges in the human nervous system can induce a variety of paroxysmal disorders such as epilepsy, sleep disorders, and ataxia. The prominent characteristic of these diseases is rapid onset with no obvious precursor. A fast pace of life, habitually staying up late, and overuse of electronic products can all contribute to paroxysmal nervous system diseases. Currently, diagnosis of such diseases relies on analysis of bioelectric signals such as the electrocardiogram, electroencephalogram, and electromyogram.

Abnormal neuronal activity in a patient with a nervous system disease can be detected from abnormal changes in these electrical signals. At present, most pathological time-series data are segmented with a fixed sliding window for detection, but this approach is poorly suited to nervous system electrical signals: outside of an episode, the signals of a patient with a nervous system disease do not differ from those of a healthy person, and only at onset do the signals fluctuate sharply for a short time. Using a fixed sliding window throughout makes detection slow and may miss important pathological features in the diseased state; if the sliding window could change dynamically with the disease state, the accuracy and speed of pathological signal detection could be greatly improved. Furthermore, accurately diagnosing a disease from an electrocardiogram requires a great deal of accumulated experience, so in less developed areas where experienced physicians are not available, the diagnosis and treatment of nervous system diseases are difficult. Therefore, classifying by disease condition the window vector data produced by a dynamic sliding window to form pathological templates, and constructing a pathological data expert system model in combination with expert knowledge, is of great significance for detecting nervous system diseases.

Disclosure of Invention

To solve the problems mentioned in the background, a method for constructing a pathological data model based on a dynamic sliding window and template matching is provided. The construction process comprises the following steps:

step 1, collecting electrocardiosignals of a patient with nervous system disease, acquiring time sequence pathological data, and dividing the time sequence pathological data into data segments with fixed sizes by using a sliding window model;

step 2, calculating the fluctuation quantity and the mean value of each data segment, and acquiring two groups of disease condition segmentation threshold values according to the fluctuation quantity and the mean value;

step 3, dynamically segmenting the training data and the data to be detected by using a dynamic sliding window model, and classifying the disease conditions of the data segments by combining the segmentation threshold in the step 2 to form a pathological data template base;

step 4, carrying out template matching on the data segment to be detected according to the characteristic similarity degree of the data segment to be detected and the pathological template, and realizing the disease condition diagnosis function of the expert system model;

and step 5, further improving the performance of the expert system model by setting an expert database updating mechanism, a template sorting mechanism and a preferential selection area.

Preferably, the fluctuation amount and the mean value of each data segment are calculated in step 2 of the invention, and two groups of disease condition segmentation threshold values are obtained according to the fluctuation amount and the mean value; the specific process is as follows:

windows containing mutation points are identified using the TSTKS mutation point detection algorithm, and the fluctuation amount of each window is calculated according to the position of the data mutation in that window; if a window has no mutation point, its fluctuation amount is recorded as 0; the fluctuation amounts of the windows are combined into a multi-dimensional fluctuation vector and sorted, and the maximum fluctuation amount WF_max is taken as the fluctuation amount segmentation threshold; the mean threshold is extracted in a similar way: the mean of each window is calculated and sorted, and the maximum mean Mean_max is taken as the mean segmentation threshold.

Preferably, in step 3 of the method, a dynamic sliding window model is used for dynamically segmenting the training data and the data to be detected, and the segmentation threshold in step 2 is combined to classify the disease condition of the data segment to form a pathological data template base; the specific process is as follows:

taking the real electrocardiosignals of a patient as training data, segmenting the training data by using a dynamic sliding window model, detecting the position of a mutation point in a window vector, and calculating the fluctuation ratio of the window vector, wherein the calculation formula is as follows:

wherein WVFR_i represents the fluctuation ratio of the i-th window vector; n_cp represents the number of mutation points in the window vector; wl_fcp represents the window position of the first mutation point in the window vector; wl_end represents the position of the last window in the window vector; the window vector is changed dynamically according to this parameter;

when a mutation point exists in a window vector, the position of the first mutation point is located; if the first mutation point does not appear in the first window, a truncation mechanism is executed: all windows before the first mutation point are combined into a template, and the WVFR is calculated for the rest of the vector; if the first mutation point is in the first window of the vector, the WVFR of the whole vector is calculated directly; when the WVFR is greater than or equal to 0.3, a window vector with dimension 2 and window size 128 is used for redetection, and when the WVFR is less than 0.3, a window vector with dimension 2 and window size 256 is used for redetection; dynamic change of the window vector is thereby realized;
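The truncation mechanism described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and the boolean-list input are hypothetical, and the even-position rule (the mutation window joins the truncated part so that the template keeps an even dimension) follows the description given with Figs. 3a and 3b in the embodiments.

```python
from typing import List, Optional, Tuple


def truncate_at_first_mutation(
    has_mutation: List[bool],
) -> Tuple[Optional[range], range]:
    """Split a window vector at its first mutation point.

    has_mutation[k] is True when window k (0-based) contains a mutation point.
    Returns (template_part, remaining_part) as index ranges; template_part is
    None when nothing is truncated.
    """
    n = len(has_mutation)
    if True not in has_mutation:
        return None, range(0, n)      # no mutation point: nothing to truncate
    first = has_mutation.index(True)  # 0-based index of the first mutation window
    if first == 0:
        return None, range(0, n)      # first window already holds the mutation point
    position = first + 1              # 1-based position used in the text
    # Even position: an odd number of windows precede it, so the mutation
    # window is included to keep the truncated template's dimension even.
    cut = position if position % 2 == 0 else first
    return range(0, cut), range(cut, n)
```

For example, with the first mutation point in the third of four windows, the first two windows form a template and the WVFR would be computed on the last two.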

summarizing window vectors segmented by the dynamic sliding window model in the step 3, calculating the average mean value and the average fluctuation amount of each window in the vectors, and comparing the average mean value and the average fluctuation amount with the two groups of segmentation threshold values in the step 2 to obtain the morbidity degree corresponding to the window vector; the disease condition diagnosis result of the vector, the dimension of the vector and the width of the window are used as labels and marked on the window vector to form a complete pathological template and generate enough pathological templates.

Preferably, the step 4 of the invention carries out template matching on the data segment to be detected according to the similarity degree of the characteristics of the data segment to be detected and the pathological template, thereby realizing the disease condition diagnosis function of the expert system model; the specific process is as follows:

classifying and storing the pathology template constructed in the step 3 into an expert template library according to the dimension of the template and the width of a window, and calculating the similarity degree of the template in the expert library and the pathology data to be detected by means of information on the template; the pathological data to be detected is segmented by means of a dynamic sliding window model, templates with the same length and dimension are preferentially searched in an expert database according to the dimension and the length of a segmented window vector, and then the degree of closeness of the mean value and the fluctuation quantity of each window in the vector and the mean value and the fluctuation quantity of a corresponding window in a pathological template vector is calculated; and when the mean value and the fluctuation amount are both in the interval of 80% -120% of the mean value and the fluctuation amount of the corresponding window in the template, indicating that the matching is successful, and quickly diagnosing the data to be detected according to the disease diagnosis result corresponding to the data in the template.

Preferably, step 5 of the invention further improves the performance of the expert system model by setting an expert database updating mechanism, a template sorting mechanism and a preferential selection area; the specific process of the expert database updating mechanism is as follows:

if a window vector in the pathological data to be detected in step 4 is not successfully matched after traversing all templates of equal length, the disease condition is divided using the segmentation thresholds obtained in step 2, the division result is recorded on the window vector as a label, and the window vector is added to the expert database as a new template to update the pathological template database.

Preferably, step 5 of the invention further improves the performance of the expert system model by setting an expert database updating mechanism, a template sorting mechanism and a preferential selection area; the specific processes of the expert database template sorting mechanism and the preferential selection area are as follows:

arranging a template sorting mechanism in a pathology template library, sorting templates with the same length and dimension together, and sorting the templates from high to low according to the successful matching times; when pathological data to be detected are input into an expert system model, templates are sequentially matched from high frequency to low frequency, so that pathological data can be quickly matched with a proper template;

a priority queue selection area for accommodating three templates is arranged in the expert database; when a certain template in the sequencing queue is successfully matched, copying the template into a priority queue; when the next round of template matching starts, firstly traversing the templates in the priority queue; according to the characteristic that the pathological data change has inertia, the matching time is shortened, and the model performance of the expert system is improved.

By adopting the technical scheme, the invention has the advantages that:

1. the invention combines a multi-mutation-point detection algorithm, provides two disease classification strategies based on the fluctuation quantity and the mean value, and effectively completes the classification of the disease state of the patient.

2. The invention uses the dynamic sliding window model, depends on the dynamic change of the window vector, can effectively reduce the detection time in the non-diseased data segment, and can strengthen the feature extraction of the diseased data segment.

3. The invention utilizes the dynamic sliding window and the segmentation threshold value, combines the expert knowledge to generate the pathological templates with various dimensions and lengths, and greatly enriches the types and pathological information of the templates in the expert template library.

4. In template matching, the invention compares each window of the window vector to be detected with the corresponding window of the template vector, so that it finds the template whose data fluctuation and data mean are most similar while also comparing the trend of the data.

5. The invention improves the speed of template matching by a template sorting mechanism, and realizes the self-updating of the expert template library by combining an updating mechanism.

6. The invention sets a preferential selection area in the expert database, thereby further improving the performance of the expert system model.

Drawings

FIG. 1 is a flow chart of the present invention for modeling pathology data.

Fig. 2 is a schematic diagram of a change strategy of the dynamic sliding window model according to the present invention.

Fig. 3a is an exemplary diagram of the truncation mechanism when the first mutation occurs at an odd position in the window vector.

Fig. 3b is an exemplary diagram of the truncation mechanism when the first mutation occurs at an even position in the window vector.

Fig. 4 is a diagram of a template matching strategy according to the present invention.

FIG. 5 is a graph of a disease segmentation strategy based on fluctuation and mean values in accordance with the present invention.

FIG. 6 is an exemplary diagram of the expert database template sorting mechanism in the present invention.

FIG. 7 is an exemplary diagram of the expert pool preferred area in the present invention.

Detailed Description

The invention is further illustrated below with reference to specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.

Example 1:

the pathological data model construction method based on the matching of the dynamic sliding window and the template specifically comprises the following steps:

step 1, collecting electrocardiosignals of a patient with nervous system diseases, acquiring time sequence pathological data, and dividing the time sequence pathological data into data segments with fixed sizes by using a sliding window model.

windows containing mutation points are identified using the TSTKS mutation point detection algorithm, and the fluctuation amount of each window is calculated according to the position of the data mutation in that window; if a window has no mutation point, its fluctuation amount is recorded as 0; the fluctuation amounts of the windows are combined into a multi-dimensional fluctuation vector and sorted, and the maximum fluctuation amount WF_max is taken as the fluctuation amount segmentation threshold; the mean threshold is extracted in a similar way: the mean of each window is calculated and sorted, and the maximum mean Mean_max is taken as the mean segmentation threshold.

Step 5, further improving the performance of the expert system model by setting an expert database updating mechanism, a template sorting mechanism and a preferential selection area;

the specific process of the expert database updating mechanism is as follows:

if a window vector in the pathological data to be detected in step 4 is not successfully matched after traversing all templates of equal length, the disease condition is divided using the segmentation thresholds obtained in step 2, the division result is recorded on the window vector as a label, and the window vector is added to the expert database as a new template to update the pathological template database;

the specific processes of the expert database template sorting mechanism and the preferential selection area are as follows:

Example 2:

step 1, acquiring time sequence pathological data by collecting electrocardiosignals of a patient with nervous system diseases, preprocessing the data by using a sliding window model, and segmenting the data.

Step 2, calculating the fluctuation amount and the mean value of each data segment, and obtaining two groups of disease condition segmentation thresholds from them. This step comprises the following:

2a) segmentation threshold extraction based on fluctuation amount and mean value

Firstly, the learning data are segmented using a fixed-length sliding window model, and the TSTKS algorithm is used to detect whether each window contains a mutation point; if a mutation point exists, the data in the window fluctuate abnormally, and the fluctuation amount of each window is defined as follows:

in the formula, Var(Z_L) represents the variance of the data to the left of the mutation point in the window, Var(Z_R) represents the variance of the data to the right of the mutation point in the window, Z represents all of the data in the window including the mutation point, and max and min denote the maximum and minimum values. If there is no mutation point, the window fluctuation amount is recorded as 0. The fluctuation amounts of the windows are sorted, and the maximum fluctuation amount WF_max is taken as the fluctuation amount segmentation threshold. The mean threshold is extracted in a similar way: the mean of each window is calculated and sorted, and the maximum mean Mean_max is taken as the mean segmentation threshold.
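The threshold extraction in 2a) can be illustrated with a short sketch. It assumes non-overlapping fixed-length windows and takes the per-window fluctuation amounts as an input, because computing them requires the TSTKS detector and the fluctuation formula, neither of which is modeled here. The function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def split_fixed_windows(signal: Sequence[float], width: int) -> List[List[float]]:
    """Cut the signal into consecutive non-overlapping windows of equal width.

    A trailing remainder shorter than `width` is dropped (an assumption).
    """
    return [list(signal[i:i + width])
            for i in range(0, len(signal) - width + 1, width)]


def segmentation_thresholds(
    windows: List[List[float]],
    fluctuations: Sequence[float],
) -> Tuple[float, float]:
    """Return (WF_max, Mean_max) for the given windows.

    `fluctuations` holds one fluctuation amount per window, with 0 for
    windows in which no mutation point was detected.
    """
    means = [sum(w) / len(w) for w in windows]
    return max(fluctuations), max(means)
```

Taking the maximum directly is equivalent to sorting the values and reading off the largest, which is how the text describes the step.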

2b) Multi-threshold pathological data classification method

The multi-threshold classification method comprises two disease condition segmentation strategies. Strategy 1 is a multi-threshold segmentation strategy based on the fluctuation amount: using the window fluctuation amount and the maximum fluctuation amount WF_max, the disease condition is divided into four states, 'normal', 'suspected morbidity', 'mild morbidity' and 'severe morbidity'. However, when the patient is in a continuous disease state, the distribution of the data may not change significantly, so no mutation point is detected, which causes strategy 1 to misclassify some diseased data segments as 'normal'. Therefore, when strategy 1 judges the pathological data to be normal, strategy 2 is started. Strategy 2 is a multi-threshold segmentation strategy based on the mean: the mean interval [Mean_min, Mean_max] is divided evenly into 4 sections, corresponding respectively to the four states 'normal', 'suspected morbidity', 'mild morbidity' and 'severe morbidity'. The disease state of the current window data is judged according to the section in which the window mean falls.
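The two-strategy classification can be sketched as below. The even four-way split of the mean interval follows the text. The cut points for strategy 1 are not stated in the text, so the sketch assumes, purely for illustration, that the interval [0, WF_max] is also split into four equal sections. All names are hypothetical.

```python
STATES = ["normal", "suspected morbidity", "mild morbidity", "severe morbidity"]


def _section_index(value: float, low: float, high: float) -> int:
    """Index (0..3) of the equal-width section of [low, high] containing value."""
    if high <= low:
        return 0
    fraction = (value - low) / (high - low)
    return min(3, max(0, int(fraction * 4)))


def classify_window(wf: float, mean: float, wf_max: float,
                    mean_min: float, mean_max: float) -> str:
    """Apply strategy 1, then fall back to strategy 2 when it reports 'normal'."""
    # Strategy 1: fluctuation amount relative to WF_max (equal quarters assumed).
    index = _section_index(wf, 0.0, wf_max)
    if index == 0:
        # Strategy 2: position of the window mean inside [Mean_min, Mean_max].
        index = _section_index(mean, mean_min, mean_max)
    return STATES[index]
```

The fallback matters for sustained episodes: a window with no detected mutation point has a fluctuation amount of 0, yet an elevated mean can still move it out of the 'normal' state.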

Step 3, generating a sufficient number of pathological templates based on the dynamic sliding window model proposed by the invention and the segmentation thresholds from step 2. The specific process comprises two steps:

3a) segmenting training data using dynamic sliding windows

To address the low detection speed and the lack of variety in the generated templates when a fixed-length sliding window is used to detect pathological time-series data, the invention provides a dynamic change strategy for the sliding window based on the window vector fluctuation ratio. The calculation formula is as follows:

in the formula, WVFR_i represents the fluctuation ratio of the i-th window vector; n_cp represents the number of mutation points in the window vector; wl_fcp represents the window position of the first mutation point in the window vector; wl_end represents the position of the last window in the window vector.

The strategy forms a window vector from several sliding windows, detects the mutation points in the windows using the TSTKS algorithm, and dynamically adjusts the window width and the dimension of the window vector according to the proportion of windows containing mutation points in the whole window vector. Data segments in the non-disease state are detected using window vectors with large windows and large dimensions, while data segments in the disease state are detected using small windows and window vectors of small dimension, so that important pathological features are not missed.
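As a hedged illustration of the ratio described above, the sketch below assumes that WVFR is the number of windows containing a mutation point divided by the number of windows from the first mutation point through the last window, i.e. n_cp / (wl_end - wl_fcp + 1). This reading is inferred from the variable definitions and the verbal description and is an assumption, not a confirmed formula. It also assumes at most one mutation point is counted per window.

```python
from typing import List


def window_vector_fluctuation_ratio(has_mutation: List[bool]) -> float:
    """Assumed WVFR: share of mutation windows from the first one to the end.

    has_mutation[k] is True when window k of the vector contains a mutation
    point. Returns 0.0 when the vector has no mutation point at all.
    """
    if True not in has_mutation:
        return 0.0
    first = has_mutation.index(True)        # window of the first mutation point
    n_cp = sum(has_mutation[first:])        # mutation windows counted from there
    span = len(has_mutation) - first        # wl_end - wl_fcp + 1
    return n_cp / span
```

Under this reading the ratio is always in (0, 1] once a mutation point exists, which agrees with the intervals (0, 0.3] and (0.3, 1] used when choosing the next window vector.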

When training data are input, the initial window vector is set to 4 dimensions, with each window of size 2⁸, for detection. If no mutation point is detected in the 4 consecutive windows, the dimension of the window vector is increased to 8, with window size 2⁸, for detection. If mutation points are detected in the 4 windows, the position of the first mutation point is located and the truncation mechanism is applied: the windows before the first mutation point are generated as a separate template, then the window vector fluctuation ratio of the windows after the first mutation point is calculated, and the dimension and window width of the detection window vector are adjusted according to this ratio. When the window vector fluctuation ratio belongs to (0, 0.3], a 2-dimensional window vector with each window of width 2⁸ is used to re-detect the data segment after the first mutation point. When the window vector fluctuation ratio belongs to (0.3, 1], this segment of data contains many mutation points and its distribution is abnormal, so the smallest window vector, i.e. a 2-dimensional vector with each window of width 2⁷, is used for re-detection.

Using the dynamic sliding window, the training data can be decomposed according to the fluctuation ratio into window vectors of four dimensions: 2 dimensions with window size 2⁷ or 2⁸; 4 dimensions with window size 2⁸; 6 dimensions with window size 2⁸; and 8 dimensions with window size 2⁸.
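The adjustment rule of this embodiment can be summarized in a small decision function. It is a sketch only: the names are hypothetical, and 6-dimensional vectors are not modeled because the text does not state when they are selected directly (they can arise from truncation). The interval boundaries follow this embodiment, (0, 0.3] and (0.3, 1].

```python
from typing import Tuple

# (dimension, window size) used when training data are first input.
INITIAL_CONFIG: Tuple[int, int] = (4, 2 ** 8)


def next_vector_config(any_mutation: bool, wvfr: float = 0.0) -> Tuple[int, int]:
    """Choose the (dimension, window size) of the next detection window vector.

    any_mutation: whether a mutation point was found in the current vector.
    wvfr: window vector fluctuation ratio of the part after truncation.
    """
    if not any_mutation:
        return (8, 2 ** 8)   # calm data: widen the vector to 8 windows of 256
    if wvfr <= 0.3:
        return (2, 2 ** 8)   # few mutation points: 2 windows of 256
    return (2, 2 ** 7)       # many mutation points: smallest vector, 2 windows of 128
```

The effect is that calm stretches are covered 2048 samples at a time, while stretches dense in mutation points are inspected 256 samples at a time.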

3b) Generation of pathology templates

The training data is taken from real pathological data, the training data is divided into window vectors with various dimensions and lengths by using a dynamic sliding window, the average mean value and the average fluctuation amount of the windows in the vectors are calculated, and the disease condition classification is carried out by using the segmentation threshold obtained in the step 2, so that the disease degree corresponding to the window vector can be obtained. The disease degree can be used as a label together with the dimension of the vector and the width of the window, and the disease condition is judged by combining expert knowledge to form a complete pathological template. And generating a sufficient amount of pathological templates according to the method, and finally forming an expert template library.

Step 4, constructing an expert system model based on the pathological template generated in the step 1-3 and the dynamic sliding window model in combination with a template matching method, and specifically comprising the following three steps:

4a) pathological data template matching

The label of the pathology template constructed in step 3 has the mean and fluctuation of each window in the template, in addition to the disease level. By means of the characteristic of the label of the pathological template, the calculation of the matching degree of the template and the pathological data to be detected in the expert database can be realized. The pathological data to be measured are firstly segmented by means of a dynamic sliding window. According to the dimension and the length of the segmented window vector, templates with the same length and dimension are preferentially searched in an expert library, and then the degree of closeness of the mean value and the fluctuation quantity of each window in the vector and the mean value and the fluctuation quantity of the corresponding window in the pathological template vector is calculated. And when the mean value and the fluctuation amount are both in the interval of 80% -120% of the mean value and the fluctuation amount of the corresponding window in the template, the matching is successful, otherwise, the matching is failed.
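The 80%-120% matching test can be sketched as follows, with each window represented by a hypothetical (mean, fluctuation amount) pair. Sorting the two bounds is an added assumption so that the band is still well defined when a reference mean is negative.

```python
from typing import List, Tuple

WindowFeatures = Tuple[float, float]  # (mean, fluctuation amount) of one window


def within_band(value: float, reference: float,
                low: float = 0.8, high: float = 1.2) -> bool:
    """True when value lies between 80% and 120% of the reference value."""
    lower, upper = sorted((low * reference, high * reference))
    return lower <= value <= upper


def matches(candidate: List[WindowFeatures], template: List[WindowFeatures]) -> bool:
    """Window-by-window comparison of a vector to be detected with a template."""
    if len(candidate) != len(template):
        return False  # only templates of the same dimension are comparable
    return all(
        within_band(mean, ref_mean) and within_band(wf, ref_wf)
        for (mean, wf), (ref_mean, ref_wf) in zip(candidate, template)
    )
```

Note that a template window with a fluctuation amount of 0 can only be matched by a window whose fluctuation amount is also 0, since the band collapses to a single value.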

4b) Ranking and updating of pathology templates

In order to improve the speed of template matching, a set of sorting mechanism according to the occurrence frequency of the templates is arranged in the expert database, as shown in fig. 6, the templates with the same length and dimension are sorted together, and the templates are dynamically arranged from high to low according to the successful matching times. The higher the successful frequency of template matching, i.e. the more times of the pathological condition corresponding to the template appearing in the actual pathological data, the easier the successful matching. When real pathological data are input for matching, the templates are sequentially matched from high frequency to low frequency, and therefore pathological data can be quickly matched to a proper template.

If a certain window vector in the pathological data to be detected is not successfully matched after traversing all the templates with the equal length, dividing the pathological condition through the segmentation threshold value trained in the step 2, taking the division result as a label, recording the division result on the window vector of the pathological data to be detected, taking the window vector as a new template, adding the new template into the expert database, and participating in the sequencing mechanism of the templates, thereby realizing the updating of the pathological template database.
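The sorting and updating mechanisms can be combined in one small structure, sketched below. Templates are grouped by (dimension, window width), tried in descending order of successful matches, and a vector that matches nothing is labelled by a fallback classifier and stored as a new template. The class names, the `is_match` and `fallback_label` callables, and the feature representation are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Features = List[Tuple[float, float]]  # (mean, fluctuation amount) per window


@dataclass
class Template:
    features: Features
    label: str
    hits: int = 0  # number of successful matches so far


@dataclass
class TemplateLibrary:
    buckets: Dict[Tuple[int, int], List[Template]] = field(default_factory=dict)

    def diagnose(self, features: Features, width: int,
                 is_match: Callable[[Features, Features], bool],
                 fallback_label: Callable[[Features], str]) -> str:
        """Return a label, updating hit counts or adding a new template."""
        bucket = self.buckets.setdefault((len(features), width), [])
        for template in bucket:  # bucket is kept sorted, most hits first
            if is_match(features, template.features):
                template.hits += 1
                bucket.sort(key=lambda t: t.hits, reverse=True)
                return template.label
        # No equal-length template matched: classify by thresholds and store.
        label = fallback_label(features)
        bucket.append(Template(features, label))
        return label
```

Because `list.sort` is stable, templates with equal hit counts keep their earlier relative order, so a newly added template starts at the end of its bucket.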

4c) Setting a preferred region

During the onset of a nervous system disease, changes in the patient's electrocardiogram, electroencephalogram, electromyogram and similar signals exhibit inertia. Based on this principle, a priority queue selection area capable of accommodating three templates is arranged in the expert database. When a template in the sorting queue is matched successfully, the template is copied into the priority queue. When the next round of template matching starts, the templates in the priority queue are traversed first. The inertia of pathological signal changes is thereby exploited to improve the probability of a successful match and shorten the matching time.

As shown in FIG. 7, the templates in the priority queue selection area are also updated. Because the area can hold at most three templates, a template in the queue that has not matched successfully in three matching rounds is pushed out of the queue by a template that has just matched successfully, which completes the update of the templates in the priority queue.
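A minimal sketch of the preferred area follows, assuming a least-recently-matched eviction policy as one plausible reading of the three-round rule above. The class and method names are hypothetical.

```python
from collections import OrderedDict
from typing import Any, List


class PreferredArea:
    """Holds up to `capacity` recently matched templates, tried before the library."""

    def __init__(self, capacity: int = 3) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, Any]" = OrderedDict()  # oldest match first

    def record_match(self, template_id: str, template: Any) -> None:
        """Copy a successfully matched template in, evicting the stalest one."""
        if template_id in self._items:
            self._items.move_to_end(template_id)  # refresh an existing entry
            return
        self._items[template_id] = template
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)       # drop the least recently matched

    def candidates(self) -> List[Any]:
        """Templates to try first in the next round, most recently matched first."""
        return list(self._items.values())[::-1]
```

With a capacity of three, a template that goes unmatched while three other templates match is the one that is pushed out.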

Example 3:

as shown in fig. 1, the flowchart of the method for constructing a pathology data model based on matching a dynamic sliding window with a template specifically includes the following steps:

(1) electrocardiosignals of a patient with nervous system diseases are collected, and time sequence pathological data are obtained.

(2) Preprocessing data by using a sliding window model, segmenting pathological data by using a fixed sliding window, detecting window mutation points by using a TSTKS algorithm, calculating fluctuation amounts and mean values of windows, and calculating a segmentation threshold value by taking the maximum fluctuation amount and mean value after sorting.

(3) The training data are dynamically segmented using the dynamic sliding window model according to the window vector fluctuation ratio; the specific strategy is shown in fig. 2. Several sliding windows are combined into a window vector. The position of the first mutation point in the window vector is located first; the truncation mechanism then merges the windows before the first mutation point and generates a separate template from them, the subsequent windows are combined into a window vector, the vector fluctuation ratio is calculated, and dynamic sliding windows with different dimensions and window widths are used for re-detection according to that ratio. As shown in fig. 3a and 3b, when the first mutation point appears at an even position, the window containing the first mutation point is also included in the truncated windows, so that a template with an even number of dimensions is generated separately; this ensures that the dimensions of all pathology templates are within the range of [2,4,6,8]. Pathological data templates with various window widths and dimensions are thereby obtained, and the average fluctuation amount and the average mean of each pathology template are calculated. As shown in fig. 4 and 5, the disease condition of each pathology template is diagnosed by combining the two groups of segmentation thresholds obtained in (2) with expert knowledge. By learning from a large amount of training data, sufficient pathological templates are obtained and placed into the expert database.

(4) The input pathological data to be detected are segmented using the dynamic sliding window model, the fluctuation amount and the mean of each window in the window vector are calculated, and templates of equal length are searched for in the template library for matching. The mean and fluctuation amount of each window in the vector are compared for closeness with the mean and fluctuation amount of the corresponding window in the pathological template vector. There are two conditions for a successful match:

0.8WF_i≤wf_i≤1.2WF_i

in the formula, wf_iRepresenting the amount of fluctuation of the ith window in the vector to be detected, WF_iRepresenting the amount of fluctuation of the ith window in the template.

0.8Mean_i≤mean_i≤1.2Mean_i

In the formula, mean_i represents the mean of the ith window in the vector to be detected, and Mean_i represents the mean of the ith window in the template. If both conditions are met for every window, the matching succeeds; otherwise, the matching fails.
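The two matching conditions of step (4) can be expressed as a short check. The sketch below is illustrative only: the names `within_band` and `vector_matches` are hypothetical, and each window is assumed to be summarised as a (fluctuation amount, mean) pair. The inequalities are applied literally, so a template value of 0 is matched only by 0, and the band is empty for a negative template value; the sketch therefore assumes non-negative values.

```python
from typing import Sequence, Tuple

# Assumed summary of one window: (fluctuation amount, mean value).
Window = Tuple[float, float]

LOWER, UPPER = 0.8, 1.2  # tolerance factors taken from the two conditions


def within_band(value: float, reference: float) -> bool:
    """True when 0.8 * reference <= value <= 1.2 * reference."""
    return LOWER * reference <= value <= UPPER * reference


def vector_matches(candidate: Sequence[Window],
                   template: Sequence[Window]) -> bool:
    """Check both conditions for every window of an equal-length pair."""
    if len(candidate) != len(template):
        return False  # only templates of equal length are compared
    return all(
        within_band(wf, WF) and within_band(mean, Mean)
        for (wf, mean), (WF, Mean) in zip(candidate, template)
    )
```

For example, a candidate window with fluctuation amount 11 and mean 5.5 matches a template window with fluctuation amount 10 and mean 5, whereas a fluctuation amount of 13 exceeds the upper bound of 12 and the match fails.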

(5) For a vector that fails to match in step (4), the disease condition is judged using the segmentation thresholds, the judgment result is marked on the vector, and the vector is added to the expert database as a new template, so that the templates in the expert system are updated automatically.
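The update mechanism of step (5) can be sketched as a match-or-learn routine. This is a hedged illustration rather than the implementation of the invention: `Template`, `match_or_learn`, `is_match` and `classify` are hypothetical names, and the threshold-based judgment is passed in as a callable because the classification strategies themselves are defined in (2) and figs. 4 and 5, not here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

# Assumed summary of one window: (fluctuation amount, mean value).
Window = Tuple[float, float]


@dataclass
class Template:
    windows: List[Window]
    label: str  # disease-condition judgment marked on the vector


def match_or_learn(
    candidate: Sequence[Window],
    library: List[Template],
    is_match: Callable[[Sequence[Window], Sequence[Window]], bool],
    classify: Callable[[Sequence[Window]], str],
) -> Tuple[Template, bool]:
    """Return (template, matched); learn a new template on failure."""
    for template in library:
        if is_match(candidate, template.windows):
            return template, True
    # Matching failed: judge the vector with the supplied threshold rule,
    # mark the result on it, and store it as a new template.
    learned = Template(list(candidate), classify(candidate))
    library.append(learned)
    return learned, False
```

Passing the matching test and the threshold rule as parameters keeps the sketch independent of how either is actually defined.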

(6) A template sorting mechanism is set in the expert system model. As shown in fig. 6, templates of equal dimension and equal window width are sorted from high frequency to low frequency according to their number of successful matches. During template matching, the templates are traversed in this frequency order, so that high-frequency templates are tried first and the time needed to find a suitable template is shortened.

(7) A priority selection area is set in the expert system model. As shown in fig. 7, the priority selection area is a priority queue that can hold three templates. When a template in the sorting queue is matched successfully, the template is copied into the priority queue. When the next round of template matching starts, the templates in the priority queue are traversed first. Because changes in pathological data exhibit inertia, this shortens the matching time and improves the performance of the expert system model.
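The sorting mechanism of step (6) and the priority selection area of step (7) can be sketched together for one group of templates of equal dimension and equal window width. The sketch rests on several assumptions that the text does not state: the names `Entry` and `TemplateGroup` are hypothetical, the oldest template is evicted when the three-slot priority area is full, a hit in the priority area also increments the match count, and the matching test is supplied as a callable.

```python
from collections import deque
from dataclasses import dataclass
from typing import Any, Callable, Deque, List, Optional


@dataclass(eq=False)  # entries are compared by identity
class Entry:
    template: Any
    hits: int = 0  # number of successful matches


class TemplateGroup:
    """Templates of one dimension and one window width."""

    def __init__(self, priority_size: int = 3) -> None:
        self.queue: List[Entry] = []  # kept sorted by hits, descending
        # Assumption: the oldest entry is dropped when the area is full.
        self.priority: Deque[Entry] = deque(maxlen=priority_size)

    def add(self, template: Any) -> None:
        self.queue.append(Entry(template))

    def find(self, candidate: Any,
             is_match: Callable[[Any, Any], bool]) -> Optional[Any]:
        # 1. Traverse the priority selection area first.
        hit = next((e for e in self.priority
                    if is_match(candidate, e.template)), None)
        if hit is None:
            # 2. Traverse the sorting queue from high to low frequency,
            #    skipping entries already tried in the priority area.
            hit = next((e for e in self.queue
                        if e not in self.priority
                        and is_match(candidate, e.template)), None)
            if hit is None:
                return None
            self.priority.append(hit)  # copy into the priority area
        hit.hits += 1
        # list.sort is stable, so ties keep their previous relative order.
        self.queue.sort(key=lambda e: e.hits, reverse=True)
        return hit.template
```

A `deque` with `maxlen=3` was chosen here only because it gives a fixed-capacity area with first-in, first-out eviction in one line; any other replacement policy could be substituted.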

The invention combines a multi-mutation-point detection algorithm and provides two disease classification strategies, based on the fluctuation amount and on the mean value, which effectively classify the disease state of the patient. Because the dynamic sliding window model relies on the dynamic change of the window vector, the detection time for disease-free data segments is effectively reduced and feature extraction for diseased data segments is strengthened. By using the dynamic sliding window and the segmentation thresholds together with expert knowledge, pathological templates of various dimensions and lengths are generated, which greatly enriches the template types and pathological information in the expert template library. In template matching, the data of each window in the window vector to be detected are compared for similarity with the data of the corresponding window in the template vector, so that the change trend of the data is taken into account while the template most similar in data fluctuation and data mean is found. The template sorting mechanism improves the template matching speed and, combined with the updating mechanism, enables the expert template library to update itself. A priority selection area is set in the expert database, which further improves the performance of the expert system model.

It should be noted that the features described in the above embodiments may be combined in any suitable manner without departing from the scope of the invention. To avoid unnecessary repetition, the possible combinations are not described separately.

Claims

1. A construction method of a pathology data model based on dynamic sliding window and template matching is characterized by comprising the following steps:

2. The construction method according to claim 1, wherein the specific process of the step 2 is as follows:

identifying the windows that contain mutation points by using the TSTKS mutation point detection algorithm, and calculating the fluctuation amounts of the windows according to the positions at which the data mutate in each window; if a window has no mutation point, recording its fluctuation amount as 0; combining the fluctuation amounts of the windows into a multi-dimensional fluctuation vector, sorting the fluctuation amounts in the fluctuation vector, and taking the maximum fluctuation amount WF_max as the fluctuation amount segmentation threshold; the mean threshold is extracted in a similar way to the fluctuation amount threshold: the mean value of each window is calculated and sorted, and the maximum mean value Mean_max is taken as the mean segmentation threshold.

3. The construction method according to claim 2, wherein the specific process of the step 3 is:

4. The construction method according to claim 1, wherein the specific process of the step 4 is as follows:

5. The construction method according to claim 4, wherein the specific process of the expert database update mechanism in the step 5 is as follows:

6. The construction method according to claim 4, wherein the specific process of the expert library template sorting mechanism and the preferential selection area in the step 5 is as follows: