WO2016205993A1 - GPU-based parallel ECG signal analysis method - Google Patents


Info

Publication number
WO2016205993A1
Authority
WO
WIPO (PCT)
Prior art keywords
ecg signal
interval
gpu
thread
threads
Prior art date
Application number
PCT/CN2015/082040
Other languages
English (en)
French (fr)
Inventor
李烨
樊小毛
项福如
蔡云鹏
苗芬
Original Assignee
中国科学院深圳先进技术研究院
Priority date
Filing date
Publication date
Application filed by 中国科学院深圳先进技术研究院
Priority to CN201580000415.1A priority Critical patent/CN105899268B/zh
Priority to PCT/CN2015/082040 priority patent/WO2016205993A1/zh
Priority to JP2016576010A priority patent/JP6389907B2/ja
Priority to US15/389,978 priority patent/US10258250B2/en
Publication of WO2016205993A1 publication Critical patent/WO2016205993A1/zh

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/318 Heart-related electrical modalities, e.g. electrocardiography [ECG]
    • A61B5/346 Analysis of electrocardiograms
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7203 Signal processing specially adapted for physiological signals or for diagnostic purposes for noise prevention, reduction or removal
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/318 Heart-related electrical modalities, e.g. electrocardiography [ECG]
    • A61B5/346 Analysis of electrocardiograms
    • A61B5/349 Detecting specific parameters of the electrocardiograph cycle
    • A61B5/352 Detecting R peaks, e.g. for synchronising diagnostic apparatus; Estimating R-R interval
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/318 Heart-related electrical modalities, e.g. electrocardiography [ECG]
    • A61B5/346 Analysis of electrocardiograms
    • A61B5/349 Detecting specific parameters of the electrocardiograph cycle
    • A61B5/366 Detecting abnormal QRS complex, e.g. widening
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems

Definitions

  • the present invention relates to the field of biomedical engineering technology, and in particular, to a GPU-based parallel ECG signal analysis method.
  • ECG data has gradually become a research hotspot in the field of biomedicine.
  • most automated ECG analysis techniques target the ECG data collected by a single hospital as the basic unit, so the scale of ECG data they face is quite limited.
  • the Family Health Cloud platform serves small and medium-sized urban users, and thousands of users upload long-range and short-range ECG data every day.
  • the health cloud platform uses a serial ECG data analysis algorithm, which can analyze short-range ECG data and return feedback in real time, but analyzing long-range ECG signal data still takes a lot of time, which seriously affects users.
  • on the current home health cloud platform, the average response time for 24-hour long-range ECG data, from upload to the return of the analysis results, is 35 seconds, which is a long wait.
  • the present invention provides a GPU-based parallel ECG signal analysis method to solve one or more of the above shortcomings.
  • the invention provides a GPU-based parallel ECG signal analysis method, which comprises: filtering the ECG signal by long-interval artifact rejection and short-interval artifact rejection; performing QRS detection on the filtered ECG signal by R-wave position extraction, QRS complex start and end position extraction, and QRS complex width extraction; and performing abnormal waveform classification on the ECG signal after the QRS detection by creating a template; wherein at least one of the long-interval artifact rejection, the short-interval artifact rejection, the R-wave position extraction, the QRS complex width extraction, and the template creation is executed in parallel by multiple threads on the GPU device side, each thread reading and processing its corresponding data through its uniquely corresponding index number.
  • the long-interval artifact rejection is executed in parallel by a plurality of the threads, comprising: declaring variables on the GPU device side and allocating corresponding video memory for them, and copying the ECG signal from the host side to the global memory of the GPU device side; segmenting the ECG signal according to the interval of the long-interval artifact rejection, each thread calling the kernel function of the GPU device side, reading from the global memory, according to its index number, the segment of the ECG signal corresponding to that thread, calculating the standard deviation of the segment, and storing the standard deviation, according to the index number, into a first standard deviation sequence in the global memory; rejecting the ECG signal segments whose standard deviation is not within a set threshold range; the threads calling the reduction summation kernel function of the GPU device side to calculate the sum of the standard deviations of all ECG signal segments remaining after the rejection; obtaining an average value from the sum, and generating a first threshold range according to the average value
  • the number of threads is given by a formula (not reproduced in this text); the number of sampling points of the ECG signal handled by each thread is f*T, where L is the length of the sampling point sequence of the ECG signal copied from the host side, f is the sampling frequency of the ECG signal, and T is the interval of the long-interval artifact rejection.
  • the standard deviation is given by a formula (not reproduced in this text), where p_j is the ECG signal at the j-th sampling point in the ECG signal segment, j is an integer, j ≥ 0, f is the sampling frequency of the ECG signal, and T is the interval of the long-interval artifact rejection.
  • the upper limit of the first threshold range and the lower limit of the first threshold range are respectively 3 times the mean value and 1/3.5 times the mean value.
  • the set threshold range is [0.5, 3].
  • the short-interval artifact rejection is executed in parallel by a plurality of the threads, comprising: declaring variables on the GPU device side and allocating corresponding video memory for them, and copying the ECG signal remaining after the long-interval artifact rejection
  • from the host side to the global memory of the GPU device side; segmenting the ECG signal according to the interval T_1 of the short-interval artifact rejection, each thread calling the kernel function of the GPU device side,
  • reading a segment of the ECG signal according to its index number, calculating the value of a formula, and storing that value, according to the index number, into a second noise sequence in the global memory, where sum is the sum of squares of the central signals of all the ECG signal segments; modifying the number of threads and the number of ECG sampling points each thread processes, each modified thread calling the kernel function again and reading, according to the modified thread's index number, the uniquely corresponding long-interval artifact rejection
  • the modified threads call a reduction summation kernel function on the GPU device side and calculate the sum of all ECG signal values remaining after screening;
  • the sum is averaged, and a second threshold range is generated according to the average value; the number of threads and the number of ECG signals each thread processes are modified again, and the modified threads call the kernel function once more.
  • the method further includes: the number of modified threads is given by a formula (not reproduced in this text), where L is the length of the sampling point sequence of the ECG signal copied from the host side, f is the sampling frequency of the ECG signal, T is the interval of the long-interval artifact rejection, and DataPerBlock1 is the amount of data handled by one modified thread block.
  • the number of thread blocks is BlockNum1 = (L1 + DataPerBlock1 - 1)/DataPerBlock1, where L1 is the length of the first noise sequence produced by the long-interval artifact rejection.
  • the R-wave position extraction is executed in parallel by a plurality of the threads, comprising: declaring variables on the GPU device side and allocating corresponding video memory for them, and copying the filtered ECG signal from
  • the memory of the host side to the global memory of the GPU device side; each thread calling a kernel function and reading, according to its index number, one ECG sample to be detected from the filtered ECG signal; the thread, according to a set window size w and a set gradient, applying erosion operations of different degrees to the sample it read and to the w-1 samples to be detected immediately following it.
  • each thread reading a minimum value from the first temporary sequence according to its index number; according to the set window size w and the set gradient, applying dilation operations of different degrees to the minimum value it read and to the w-1 minimum values immediately following it, and storing, according to the index number, the maximum of those values after the dilation operation into a second temporary sequence in the global memory; reading, according to the index number, the ECG sample to be detected and the maximum value, calculating their difference, and storing the difference into a third temporary sequence in the global memory; the threads calling a summation kernel function to calculate the sum of all the differences in the third temporary sequence; and obtaining an average from the sum.
  • the method further includes: storing in a register the minimum value, after the erosion operation, of the ECG sample read and the w-1 samples following it; and, when the minimum values are read, reading the thread's own minimum value from the register and the other w-1 minimum values from the global memory.
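  • The erosion, dilation and difference steps above can be illustrated with a small Python sketch in which an ordinary loop stands in for one GPU thread per sample. This is a sketch under assumptions, not the patented kernel: the function names are hypothetical, the window is flat (the "set gradient" is ignored because its exact role cannot be recovered from the text), and windows are simply shortened at the end of the signal.

```python
def erode_forward(x, w):
    """out[gid] = minimum of x[gid .. gid+w-1]; one list entry per emulated thread."""
    n = len(x)
    return [min(x[gid:min(gid + w, n)]) for gid in range(n)]


def dilate_forward(x, w):
    """out[gid] = maximum of x[gid .. gid+w-1]; one list entry per emulated thread."""
    n = len(x)
    return [max(x[gid:min(gid + w, n)]) for gid in range(n)]


def erode_dilate_difference(ecg, w):
    """Return (third temporary sequence, its mean) for a flat window of size w >= 1."""
    first_tmp = erode_forward(ecg, w)          # minima after erosion
    second_tmp = dilate_forward(first_tmp, w)  # maxima after dilation
    # Difference between each sample and the eroded-then-dilated value.
    third_tmp = [ecg[gid] - second_tmp[gid] for gid in range(len(ecg))]
    mean_diff = sum(third_tmp) / len(third_tmp) if third_tmp else 0.0
    return third_tmp, mean_diff
```

  • With a flat baseline of 1 and a single spike of 6, for example, the difference sequence keeps only the spike height above the baseline, which is the property the R-wave search relies on.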
  • the QRS complex width extraction is executed in parallel by a plurality of the threads, comprising: declaring variables on the GPU device side and allocating corresponding video memory for them, and copying the result of the QRS complex start and end position extraction from the host side to the global memory of the GPU device side; each thread calling the kernel function of the GPU device side, reading, according to its index number, a start position and an end position from the result of the QRS complex start and end position extraction, calculating the difference between the start position and the end position, and storing the difference in the global memory according to the index number, thereby generating the result of the QRS complex width extraction; and copying the result of the QRS complex width extraction from the GPU device side to the host side.
  • the starting position and the ending position of the QRS complex are obtained at the host end according to a peak group determination method.
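  • The per-thread work in the QRS complex width extraction is a single subtraction, which the following Python sketch makes explicit. The function name and the list-based inputs are assumptions for illustration; each loop iteration stands in for one GPU thread identified by its index number gid.

```python
def qrs_widths(starts, ends):
    """Width of each QRS complex: widths[gid] = ends[gid] - starts[gid]."""
    if len(starts) != len(ends):
        raise ValueError("starts and ends must have the same length")
    widths = [0] * len(starts)
    for gid in range(len(starts)):  # one emulated thread per QRS complex
        widths[gid] = ends[gid] - starts[gid]
    return widths
```

  • Because no thread reads another thread's result, the operation needs no synchronization, which is why it maps directly onto one thread per complex.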
  • the template creation is executed in parallel by a plurality of the threads, comprising: declaring variables on the GPU device side and allocating corresponding video memory for them, and copying the result of the R-wave position extraction from the host side to the global memory of the GPU device side; each thread calling the kernel function, reading, according to its index number, an RR interval from the result of the R-wave position extraction, obtaining the identifier of each RR interval according to a setting criterion, and storing the identifier in the global memory according to the index number, thereby generating the result of the template creation; and copying the result of the template creation from the GPU device side to the host side.
  • the setting criterion comprises: if the (i+1)-th RR interval RRlist[i+1], adjacent to the i-th RR interval RRlist[i], is not in the interval (0.6*RRlist[i], 1.5*RRlist[i]), the identifier of the RR interval RRlist[i] is -1, where i is an integer, i ≥ 0; if the RR interval RRlist[i+1] is in the interval (0.6*RRlist[i], 1.5*RRlist[i]) and the RR interval RRlist[i] is not in the interval (0.8*RRmean, 1.3*RRmean), the identifier of the RR interval RRlist[i] is 0, where RRmean is the average of all the RR intervals; if the RR interval RRlist[i+1] is in the interval (0.6*RRlist[i], 1.5*RRlist[i]) and the RR interval RRlist[i] is within the interval (0.8*RRmean,
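  • The setting criterion can be sketched in Python as follows, with one loop iteration standing in for one GPU thread. Two points are assumptions rather than content of the text: the identifier for the third case is cut off in the source, so it is exposed as the hypothetical parameter normal_label (default 1), and the last RR interval, which has no successor, is left unlabeled.

```python
def label_rr_intervals(rr_list, normal_label=1):
    """Label RRlist[i] for i = 0 .. n-2 using the two interval tests of the criterion."""
    n = len(rr_list)
    if n < 2:
        return []
    rr_mean = sum(rr_list) / n  # RRmean: average of all RR intervals
    labels = [0] * (n - 1)
    for i in range(n - 1):  # one emulated thread per index i
        rr, nxt = rr_list[i], rr_list[i + 1]
        if not (0.6 * rr < nxt < 1.5 * rr):
            labels[i] = -1  # successor outside (0.6*RR[i], 1.5*RR[i])
        elif not (0.8 * rr_mean < rr < 1.3 * rr_mean):
            labels[i] = 0   # RR[i] outside (0.8*RRmean, 1.3*RRmean)
        else:
            labels[i] = normal_label  # ASSUMPTION: value truncated in the source
    return labels
```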
  • the GPU-based parallel ECG signal data analysis method of the embodiment of the present invention can significantly improve the analysis speed of the ECG signal by performing one or more steps in the ECG signal analysis process in parallel on the GPU.
  • the GPU-based parallel ECG analysis method designed by the invention achieves speedups ranging from several times to several tens of times in each stage of ECG analysis, and the total ECG analysis time is 17 times faster than on an ordinary workstation server.
  • FIG. 1 is a schematic flow chart of a conventional serial ECG signal analysis method
  • FIG. 3 is a schematic flow chart of a serial execution algorithm for long-interval artifact rejection in an embodiment of the present invention.
  • FIG. 4 is a schematic flow chart of a parallel execution algorithm for long interval false positive rejection in an embodiment of the present invention
  • FIG. 5 is a schematic flow chart of a parallel execution algorithm for long interval false positive rejection in an embodiment of the present invention
  • FIG. 6 is a schematic flow chart of a serial execution algorithm for short-interval artifact rejection in an embodiment of the present invention.
  • FIG. 7 is a schematic flow chart of a parallel execution algorithm for short interval false positive culling in an embodiment of the present invention.
  • FIG. 8 is a schematic flow chart of a parallel execution algorithm for short interval false positive culling in an embodiment of the present invention.
  • FIG. 9 is a schematic flow chart of a R wave position extraction serial execution algorithm in an embodiment of the present invention.
  • FIG. 10 is a schematic flow chart of a parallel execution algorithm for R wave position extraction according to an embodiment of the present invention.
  • FIG. 11 is a flow chart showing a parallel execution algorithm of R wave position extraction according to an embodiment of the present invention.
  • FIG. 12 is a flow chart showing a serial execution algorithm of QRS complex width extraction in an embodiment of the present invention.
  • FIG. 13 is a schematic flow chart of a parallel execution algorithm for QRS complex width extraction in an embodiment of the present invention.
  • FIG. 14 is a schematic flow chart of a parallel execution algorithm for creating a template in an embodiment of the present invention.
  • FIG. 1 is a schematic flow chart of a conventional serial ECG signal analysis method.
  • the prior art uses a long-interval artifact rejection algorithm, a short-interval artifact rejection algorithm, and a simple integer-coefficient comb filter to filter the ECG signal, removing artifacts including power-line interference, baseline drift, and electromyographic interference so as to eliminate the effects of noise generated by the human body and by the signal acquisition instruments.
  • the above three algorithms of the filtering process 101 are serial algorithms, and artifact rejection is also used in the feature extraction and waveform classification stages, so these three algorithms significantly reduce the speed of ECG signal processing.
  • each algorithm accounts for 1/3 of the filtering stage. When the all-pass-network simple integer-coefficient comb filter filters the data,
  • each value depends very strongly on the data before and after it, which conflicts with the single instruction, multiple data (SIMD) principle of parallel computing under the CUDA (Compute Unified Device Architecture) programming model, so that algorithm is not suitable for parallel operation.
  • the long-interval artifact rejection algorithm and/or the short-interval artifact rejection algorithm can save 1/3 or 2/3 of the filtering time when executed in parallel.
  • a serial algorithm is used for the two processes of R-wave position extraction and QRS complex start and end position extraction in the QRS detection 102. If these algorithms can be redesigned to run in parallel, the speed of ECG signal analysis can be further improved.
  • the abnormal waveform classification 103 identifies the QRS complex with a serial algorithm. If a parallel algorithm can be designed to identify the QRS complex, ECG signal analysis can be made faster still.
  • the parallel ECG signal analysis method of the embodiment of the present invention includes the following steps:
  • S201 Filter the ECG signal by long-interval artifact rejection and short-interval artifact rejection.
  • S202 Perform QRS detection on the filtered ECG signal by R-wave position extraction, QRS complex start and end position extraction, and QRS complex width extraction.
  • S203 Perform abnormal waveform classification on the ECG signal after QRS detection by creating a template.
  • at least one of the long-interval artifact rejection, the short-interval artifact rejection, the R-wave position extraction, the QRS complex width extraction, and the template creation is executed in parallel by a plurality of threads on the GPU device side, each thread reading and processing its corresponding data through its uniquely corresponding index number.
  • by executing at least one of the long-interval artifact rejection algorithm, the short-interval artifact rejection algorithm, the R-wave position extraction, the QRS complex width extraction, and the template creation in parallel on the GPU, the parallel ECG signal analysis method of the embodiment of the present invention can speed up the analysis of ECG signals.
  • the long interval artifact culling can be performed using a serial algorithm.
  • a noise sequence noiselist[length] is generated, and the noise sequence array length is:
  • FIG. 3 is a schematic flow chart of the serial execution algorithm for long-interval artifact rejection in an embodiment of the present invention.
  • f is the sampling frequency of the ECG signal
  • p_j represents the j-th sampled data point in each segment of the ECG signal data set
  • j is a positive integer
  • Steps S302 and S303 are repeated until all ECG signal data is processed.
  • S305 According to the empirical threshold range [0.5, 30], determine whether the standard deviation of each segment of the ECG data set is within the range, and calculate the mean value temp_M of the standard deviations that fall within the range.
  • temp_l = temp_M ÷ 3.5, temp_l ⁇ 0.6.
  • S308 Traverse the segments again: if the two segments of ECG signal data adjacent to a segment (the one before and the one after) are both noise segments, that segment is also judged to be a noise segment.
  • the inventors found that the long-interval artifact judgment process contains four loop bodies, and that in the CUDA architecture threads are organized in a single instruction, multiple data fashion, so the loop bodies are suitable for parallelization.
  • ThreadsPerBlock is calculated according to the following formula (1) and formula (2):
  • BlockNum = (ecgnum + DataPerBlock - 1)/DataPerBlock (2)
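  • Formula (2) is a rounding-up integer division: it yields enough thread blocks to cover every data item even when the item count is not a multiple of the block size. The Python sketch below shows that calculation together with the usual CUDA-style global index (block index times block size plus thread index). The function names are hypothetical, and the sketch assumes each thread handles exactly one data item.

```python
def block_count(n_items, data_per_block):
    """Formula (2): BlockNum = (n + DataPerBlock - 1) / DataPerBlock, in integer arithmetic."""
    return (n_items + data_per_block - 1) // data_per_block


def global_indices(n_items, threads_per_block):
    """Enumerate the index numbers gid that own a data item."""
    ids = []
    for block_idx in range(block_count(n_items, threads_per_block)):
        for thread_idx in range(threads_per_block):
            gid = block_idx * threads_per_block + thread_idx
            if gid < n_items:  # surplus threads in the last block do nothing
                ids.append(gid)
    return ids
```

  • The bounds check matters because the last block usually contains more threads than remaining items; without it those threads would read past the end of the data.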
  • the first loop is a nested loop whose function is to calculate the standard deviation of each sub data set in the entire data set, the sub data sets being delimited by the inner loop. Given this structure, the loop is parallelized so that one thread calculates the standard deviation of one sub data set, and multiple threads execute in parallel without communicating.
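  • A minimal Python sketch of the parallelized first loop follows, with each loop iteration standing in for one thread that owns one sub data set of f*T samples. Several details are assumptions: the function name is hypothetical, the population standard deviation is used because the patent's exact formula is not reproduced in this text, and a trailing partial segment is dropped.

```python
import statistics


def segment_std(ecg, f, T):
    """st_d[gid] = standard deviation of samples gid*f*T .. (gid+1)*f*T - 1."""
    seg_len = f * T                 # samples per segment
    n_threads = len(ecg) // seg_len  # ASSUMPTION: incomplete last segment ignored
    st_d = [0.0] * n_threads
    for gid in range(n_threads):    # one emulated thread per segment
        segment = ecg[gid * seg_len:(gid + 1) * seg_len]
        st_d[gid] = statistics.pstdev(segment)  # ASSUMPTION: population form
    return st_d
```

  • Each emulated thread writes only st_d[gid] and reads only its own segment, which is why the text can say the threads need no communication.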
  • the second loop rejects part of the data based on the empirical threshold and calculates the mean of the remaining data.
  • this loop can be parallelized by first generating a new array serially on the CPU and then calling the reduction summation kernel function to calculate the sum of the data in the new array.
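  • The two-stage scheme described above (serial filtering on the host, then a reduction sum) can be sketched in Python as follows. The pairwise tree pattern mimics how a GPU reduction kernel combines elements, but it runs serially here. The function names are hypothetical, the threshold bounds are passed in as parameters because the text gives both [0.5, 3] and [0.5, 30], and treating the bounds as inclusive is an assumption.

```python
def reduce_sum(values):
    """Pairwise tree reduction: at each pass, element i absorbs element i + stride."""
    data = list(values)
    if not data:
        return 0.0
    n = len(data)
    stride = 1
    while stride < n:
        for i in range(0, n - stride, 2 * stride):
            data[i] += data[i + stride]
        stride *= 2
    return data[0]


def mean_within_threshold(st_d, low, high):
    """Keep standard deviations in [low, high] (host side), then average them."""
    kept = [s for s in st_d if low <= s <= high]  # new array built serially
    if not kept:
        raise ValueError("no standard deviation falls within the threshold range")
    return reduce_sum(kept) / len(kept)
```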
  • the third loop determines whether the data set corresponding to each entry of the result set m_noise generated by the first loop is noise.
  • specifically, one value is read from the result set m_noise at a time and compared with the above threshold, and the comparison decides whether the data corresponding to that value is noise.
  • to parallelize this loop, length threads can be started so that each thread corresponds to exactly one value in the result set m_noise and can read that value from m_noise according to its thread index number.
  • the value read according to the thread index number is then compared with the set threshold, and the comparison result is written into the array noiselist according to the thread index number.
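  • The third loop can be sketched in Python as below. The threshold range is derived from the mean as stated elsewhere in the text (upper limit 3 times the mean, lower limit 1/3.5 times the mean). The function name, the inclusive bounds, and the flag encoding (1 for noise, 0 for non-noise) are assumptions made for illustration.

```python
def classify_noise(m_noise, temp_m):
    """noiselist[gid] = 1 if m_noise[gid] lies outside [temp_m / 3.5, 3 * temp_m], else 0."""
    upper = 3.0 * temp_m
    lower = temp_m / 3.5
    noiselist = [0] * len(m_noise)
    for gid in range(len(m_noise)):  # one emulated thread per standard deviation
        inside = lower <= m_noise[gid] <= upper
        noiselist[gid] = 0 if inside else 1
    return noiselist
```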
  • the fourth loop reconfirms the noise sequence generated by the third loop.
  • the criterion for this loop is "if the data sets before and after a sub data set are both judged to be noise, then that sub data set is also noise".
  • to parallelize this loop, length threads can be started so that the index number of each thread corresponds to exactly one value in the array noiselist; the thread reads that value and its neighbouring values from noiselist according to its index number and then performs the corresponding operation on the values it read.
  • the value corresponding to the thread's index number can be denoted a, the values immediately before and after a denoted front and back, the value before front denoted b, and the value after back denoted c.
  • if the front or back value has already been modified by its own thread when it is read, then that value must originally have been 0, and either b and a are both 1 or a and c are both 1. In other words, whichever of front and back is dirty data, the value of a must be 1. This does not affect the thread's processing of a, so even if a dirty read occurs, the final result for a is not misjudged.
  • the fourth loop is therefore not affected by dirty reads during parallel execution.
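  • The dirty-read argument can be checked with a small Python sketch. The first function applies the rule to an unmodified snapshot of noiselist; the second updates the array in place in an arbitrary thread order, so a thread may see a neighbour that another thread has already changed. The function names and the flag encoding (1 for noise) are assumptions for illustration.

```python
def confirm_noise(noiselist):
    """Reference result: every decision reads the original, unmodified flags."""
    n = len(noiselist)
    result = list(noiselist)
    for gid in range(1, n - 1):
        if noiselist[gid - 1] == 1 and noiselist[gid + 1] == 1:
            result[gid] = 1
    return result


def confirm_noise_in_place(noiselist, order):
    """Same rule, but each emulated thread reads whatever is currently in the array."""
    data = list(noiselist)
    n = len(data)
    for gid in order:  # order models an arbitrary thread schedule
        if 0 < gid < n - 1 and data[gid - 1] == 1 and data[gid + 1] == 1:
            data[gid] = 1
    return data
```

  • Both versions give the same flags for any schedule, because a neighbour can only flip from 0 to 1 when the value between its own neighbours, which includes a, is already 1.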
  • FIG. 4 is a schematic flow chart of a parallel execution algorithm for long-interval artifact rejection in an embodiment of the present invention. As shown in FIG. 4, executing the long-interval artifact rejection in parallel by a plurality of the threads includes:
  • S401 Declare variables on the GPU device side, allocate corresponding video memory for them, and copy the ECG signal from the host side to the global memory of the GPU device side.
  • S402 Segment the ECG signal according to the interval of the long-interval artifact rejection; each thread calls a kernel function of the GPU device side, reads from the global memory, according to its index number, the segment of the ECG signal corresponding to that thread, calculates the standard deviation of the segment, and stores the standard deviation, according to the index number, into a first standard deviation sequence in the global memory.
  • S404 The threads call the reduction summation kernel function of the GPU device side and calculate the sum of the standard deviations of all the ECG signal segments remaining after the rejection.
  • S405 Obtain an average value from the sum, and generate a first threshold range according to the average value.
  • Each thread calls the kernel function again, reads, according to its index number, the standard deviation of an ECG signal segment remaining after the rejection, determines whether it is within the first threshold range, and stores the result of the determination, according to the index number, into a first noise sequence in the global memory.
  • the number of threads may be:
  • the number of sampling points of the ECG signal processed by each thread may be f*T, where L is the length of the sampling point sequence of the ECG signal copied from the host side, f is the sampling frequency of the ECG signal, and T is the interval of the long-interval artifact rejection.
  • in step S402, the standard deviation can be calculated with the standard deviation formula of step S303, in which:
  • p_j is the ECG signal at the j-th sampling point in the ECG signal segment
  • j is an integer
  • f is the sampling frequency of the ECG signal
  • T is the interval of the long-interval artifact rejection.
  • the set threshold range may be [0.5, 3].
  • the upper limit and the lower limit of the first threshold range are respectively 3 times and 1/3.5 times the average value obtained in step S405.
  • the long interval false positive rejection result may include ECG signal data, noise/non-noise flag, and standard deviation data.
  • FIG. 5 is a schematic flow chart of a parallel execution algorithm for long-interval artifact rejection in an embodiment of the present invention. As shown in FIG. 5, the parallel execution algorithm for long-interval artifact rejection includes the following steps:
  • S501 Declare the device-side variables, allocate corresponding video memory for them, and copy the ECG signal data required by the kernel function from the host side to the device side.
  • Each thread calls a kernel function, first reads its own segment of ECG signal data from the global memory, and calculates the standard deviation M of that segment; it then writes the result to the global memory according to the thread's index number gid, i.e. st_d[gid] = M.
  • S504 Synchronize the threads until all of them have completed the operation of step S503.
  • S505 According to the empirical threshold range [0.5, 30], determine whether the standard deviation of each segment of ECG signal data is within the range, and thereby reject the ECG signal data segments that are obviously noise.
  • S506 Call the reduction summation kernel function to calculate the sum of the remaining data.
  • S507 Calculate the average value temp_M from the sum obtained in step S506, then calculate a new upper threshold and a new lower threshold from temp_M.
  • Each thread calls a new kernel function, first reads the corresponding standard deviation from the global memory according to its index number gid, and determines, based on the upper and lower thresholds generated in step S507, whether the data segment corresponding to that standard deviation is a noise segment. The result is written to noiselist[gid] in the global memory, and the threads are then synchronized.
  • Each thread reads noiselist[gid-1] and noiselist[gid+1] from the global memory and applies the rule "if the two segments adjacent to a segment of ECG signal data are both noise segments, that segment is also judged to be a noise segment", writing the result to noiselist[gid].
  • S510 Copy the data returned by the kernel function from the device side to the host side, release the device-side variables that are no longer needed, and reclaim the device-side memory and the host-side memory.
  • executing the long-interval artifact rejection algorithm in parallel on the GPU device side can significantly reduce the time consumed by the filtering stage.
  • the short-interval artifact rejection may use a serial algorithm.
  • short-interval artifact rejection can be applied to the ECG signal after long-interval artifact rejection and comb filtering, to remove noise at a smaller interval T_1. The choice of T_1 depends on the interval T of the long-interval artifact rejection: T may be a multiple of T_1. For example, with T = 5 and T_1 = 1, artifacts are rejected at 5-second and 1-second intervals respectively.
  • FIG. 6 is a schematic flow chart of a serial execution algorithm for short-interval artifact rejection in an embodiment of the present invention. As shown in FIG. 6, the serial execution algorithm for short-interval artifact rejection includes the following steps:
  • T is an integer multiple of T_1, and p_n is the value of the n-th ECG signal data point.
  • Steps S602, S603, and S604 are repeated until all the data in the noise sequence noiselist is read.
  • the function of the first loop is to apply artifact rejection at the shorter interval to the non-noise segment data in the long-interval artifact rejection result.
  • specifically, it can be divided into two parts: first, selecting the non-noise segment data according to the noise sequence produced by the long-interval artifact rejection; second, segmenting the non-noise data at the smaller interval T_1 and performing the calculation on each segment.
  • all the original ECG signal data can be segmented according to the interval T 1 , that is, the corresponding number of thread resources are started, and the size of the data set processed by each thread is f*T 1 , and the calculated result is indexed by the thread.
  • the number is written to the video memory.
  • Threads are then started in a number equal to the length of the noise sequence in the long-interval artifact rejection result, so that each thread corresponds one-to-one, through its index number, with an entry of that noise sequence.
  • The result calculated in the previous step is corrected according to the data in the noise sequence.
  • The second loop calculates the mean of all values greater than 0 in the result sequence produced by the first loop.
  • The third loop derives a new noise sequence from the results of the first two loops.
  • In its parallel design, a corresponding number of threads is started, and each thread determines from its index number whether the corresponding value is noise and writes the result to global memory.
  • FIG. 7 is a schematic flow chart of a parallel execution algorithm for short-interval artifact rejection in an embodiment of the present invention. As shown in FIG. 7, the algorithm includes the following steps:
  • S701 Declare the variables of the GPU device and allocate corresponding video memory for them, and copy the ECG signal after long-interval artifact rejection from the host to the global memory of the GPU device.
  • S702 Segment the ECG signal according to the interval T1 of short-interval artifact rejection; each thread calls a kernel function of the GPU device, reads one segment of the ECG signal according to its index number, calculates the value of the formula S, and stores that value, according to the index number, in a second noise sequence in global memory. In the formula S (figure PCTCN2015082040-appb-000004), sum is the sum of squares of the ECG samples in the segment.
  • S703 Modify the number of threads and the number of sampling points each thread processes; each modified thread calls the kernel function again, reads, according to its index number, its uniquely corresponding marker value in the first noise sequence of the long-interval artifact rejection result, and uses a third marker to mark the ECG segments whose marker value indicates noise.
  • S704 Serially screen out, according to the third marker, the ECG segments whose marker value indicates noise.
  • S705 The modified threads invoke the reduction summation kernel function on the GPU device to calculate the sum of all ECG values remaining after screening.
  • S706 Determine the mean from the sum, and generate a second threshold range from the mean.
  • S707 Modify the number of threads and the number of ECG values each handles once more; the re-modified threads call the kernel function again, read from the second noise sequence according to their index numbers, and reject every value of the formula that is not within the second threshold range, generating the short-interval artifact rejection result.
  • S708 Copy the short-interval artifact rejection result from the GPU device to the host.
  • In step S702, the ECG signal is first segmented according to the short-interval artifact rejection interval and the segments are then processed in parallel, which simplifies the parallel program and reduces the programming effort.
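  • Steps S702 to S707 can be sketched sequentially in Python as follows. Several parts are assumptions made only for illustration: the formula S is available in the source only as an image, so a root-mean-square placeholder (rms) is used; the factors low and high for the second threshold range are borrowed from the first threshold range; and segments already flagged by long-interval rejection are marked with 0 so that only values greater than 0 enter the mean.

```python
import math


def rms(segment):
    # Placeholder for the formula S, which the source shows only as an image.
    return math.sqrt(sum(p * p for p in segment) / len(segment))


def short_interval_flags(ecg, long_noise, f, T, T1, low=1 / 3.5, high=3.0):
    """Return one flag per T1-second segment: 1 = noise, 0 = not noise."""
    n = f * T1                       # samples per short segment
    ratio = T // T1                  # short segments per long segment
    count = len(ecg) // n
    # S702: one value of S per short segment (one "thread" per segment)
    s = [rms(ecg[g * n:(g + 1) * n]) for g in range(count)]
    # S703: mark segments lying inside a long segment already judged as noise
    for g in range(count):
        if long_noise[g // ratio]:
            s[g] = 0.0
    # S704: screen out the marked segments
    kept = [v for v in s if v > 0]
    if not kept:
        return [1] * count
    # S705, S706: sum, mean, and a threshold range derived from the mean
    mean = sum(kept) / len(kept)
    lo, hi = low * mean, high * mean
    # S707: values outside the range are flagged as noise
    return [0 if lo <= v <= hi else 1 for v in s]
```

    On a GPU, the list comprehensions would become kernels indexed by thread, and the mean would come from a reduction summation.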
  • Each thread may process f*T1 sampling points of the ECG signal, and the number of threads may be given by the formula in figure PCTCN2015082040-appb-000005, where L is the length of the sampling point sequence of the ECG signal copied from the host and f is the sampling frequency of the ECG signal.
  • ThreadsPerBlock threads may be set to correspond to one thread block, and the number of sampling points of the ECG signal per thread block follows from formula (1) above, that is:
  • DataPerBlock = f*T*ThreadsPerBlock.
  • The number of thread blocks follows from formula (2) above, that is:
  • BlockNum = (L+DataPerBlock-1)/DataPerBlock.
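  • The expression (L+DataPerBlock-1)/DataPerBlock is integer ceiling division: it rounds up so that a final, partially filled block is still launched. A minimal Python sketch follows; the function name block_count and the sample numbers are illustrative assumptions.

```python
def block_count(total, per_block):
    """Integer ceiling of total / per_block, as in BlockNum above."""
    return (total + per_block - 1) // per_block


# Hypothetical sizes: 1025 data items, 256 items handled per block.
blocks = block_count(1025, 256)   # 5 blocks; the last block holds one item
```

    Because the last block may be only partly filled, each thread would normally also check that its index is below the data length before reading.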
  • The number of modified threads may be given by the formula in figure PCTCN2015082040-appb-000006, where L is the length of the sampling point sequence of the ECG signal copied from the host, f is the sampling frequency of the ECG signal, and T is the interval of long-interval artifact rejection. DataPerBlock1 modified threads may correspond to one thread block, and the number of thread blocks can be calculated according to formula (2) above:
  • BlockNum1 = (L1+DataPerBlock1-1)/DataPerBlock1,
  • where L1 is the length of the first noise sequence of the long-interval artifact rejection result.
  • The number of threads after the second modification may be given by the formula in figure PCTCN2015082040-appb-000007.
  • Each of these threads may process one value of the formula, ThreadsPerBlock2 of them may be set to correspond to one thread block, and the number of thread blocks can be calculated according to formula (2) above:
  • BlockNum2 = (n+ThreadsPerBlock2-1)/ThreadsPerBlock2,
  • where L is the length of the sampling point sequence of the ECG signal copied from the host.
  • The short-interval artifact rejection result may include the ECG data, the noise/non-noise flags, and the values of S.
  • Using a parallel algorithm for short-interval artifact rejection reduces the time spent in the filtering stage and speeds up ECG signal analysis.
  • The algorithm used for R-wave position extraction may be a mathematical morphology transform, specifically its erosion and dilation operations: the data is first eroded, and the eroded data is then dilated.
  • The embodiment of the present invention can perform the erosion operation with a serial algorithm. During erosion, the erosion window size is set to 5 and the sliding step to 1.
  • Each step reads one element of the ECG signal data ecglist and the four elements that follow it. The five elements read are then eroded to different degrees, their minimum is taken, and the minimum is written into the temporary array f0.
  • The dilation operation is performed on the temporary array f0 obtained from erosion.
  • The dilation window size can also be set to 5, with a sliding step of 1.
  • The five elements read are dilated to different degrees, their maximum is taken, and the maximum is written into the temporary array f1.
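  • A window-5 erosion followed by dilation might look like the Python sketch below. The gradient values are those listed for k[w] in the embodiments; subtracting the gradient for erosion and adding it for dilation is the usual grayscale-morphology convention and is assumed here, since the source only speaks of "different degrees" of each operation. The special handling of the last four elements is omitted.

```python
K = [0, 50, 100, 50, 0]   # gradient k[w] for window size w = 5


def erode(x, k=K):
    """Minimum over each window after subtracting the gradient (assumed form)."""
    w = len(k)
    return [min(x[i + j] - k[j] for j in range(w)) for i in range(len(x) - w + 1)]


def dilate(x, k=K):
    """Maximum over each window after adding the gradient (assumed form)."""
    w = len(k)
    return [max(x[i + j] + k[j] for j in range(w)) for i in range(len(x) - w + 1)]


f0 = erode([0, 0, 300, 0, 0, 0, 0, 0, 0])   # temporary array after erosion
f1 = dilate(f0)                             # temporary array after dilation
```

    Each output element depends only on its own window, which is what makes a one-thread-per-element parallel version possible.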
  • FIG. 8 is a flow chart of a parallel execution algorithm for short-interval artifact rejection in an embodiment of the present invention. As shown in FIG. 8, the algorithm includes the following steps:
  • S801 Declare the required variables on the device, allocate corresponding video memory for them, and then copy the data required by the kernel function from the host to the device.
  • S803 Modify the number of threads per thread block ThreadsPerBlock, the amount of data processed by each thread DataPerThread, and the total amount of data (the relevant variables are defined under long-interval artifact rejection, where n is the length of the noise array generated by long-interval artifact rejection), and calculate the number of thread blocks according to formulas (1) and (2) above.
  • S804 Serially remove the zero values from the array noise to generate a new temporary array temp_noise with no zero values, and call the reduction summation kernel function to calculate the mean of the temporary array.
  • S806 Modify the number of threads per thread block ThreadsPerBlock, calculate the number of thread blocks to start, BlockNum, from the length of the array noise, and call the kernel_3 function; each thread reads noise[gid] from global memory and compares it with the threshold range generated in step S805 to judge whether it is noise, finally generating the short-interval artifact sequence noiselist_2.
  • S807 Copy the relevant data from the device to the host, release device-side variables that are no longer needed, and free the device memory and the host memory.
  • The R-wave position extraction algorithm may employ a serial algorithm.
  • FIG. 9 is a schematic flow chart of the serial execution algorithm for R-wave position extraction in an embodiment of the present invention. As shown in FIG. 9, the algorithm includes the following steps:
  • S902 Sequentially read one element ecglist[i] from the ECG signal data ecglist together with the four following elements ecglist[i+1], ecglist[i+2], ecglist[i+3], ecglist[i+4], erode each to a different degree using the corresponding gradient from S901, and write the resulting minimum into f0[i].
  • Step S902 is repeated until only 4 elements remain in the array ecglist.
  • The last four elements of the ECG signal data ecglist, namely ecglist[num_read-w+1], ecglist[num_read-w+2], ecglist[num_read-w+3], ecglist[num_read-w+4], undergo the maximum degree of erosion, and the results are written to f0[num_read-w+1], f0[num_read-w+2], f0[num_read-w+3], f0[num_read-w+4].
  • S906 Dilate the result array f0 obtained from erosion: sequentially read one element f0[i] from the array f0 together with the following elements f0[i+1], f0[i+2], f0[i+3], f0[i+4], dilate each to the corresponding degree using the corresponding gradient from S901, and write the resulting maximum into f1[i].
  • Step S906 is repeated until only the last four elements remain in the array f0.
  • The inventors found that the above serial analysis algorithm does not apply the morphological dilation and erosion operations to all the ECG data at once, but in multiple batches. Each time, a certain amount (num_read) of ECG data is read from the filtered ECG data, dilation and erosion are applied to that portion, and logical judgments based on the results yield all the R-wave positions in that segment of ECG data. The parallel version of the R-wave position extraction algorithm adopts a "one-to-one" approach.
  • When a kernel function is called for dilation and erosion, as many threads are started as there are data points, so that each thread can find a unique ECG data point by its index number. The thread then applies erosion and dilation to that point and its four adjacent points, and writes the result to global video memory according to its index number.
  • However, the program then calls the kernel function multiple times, and before each call the data the kernel needs must be written from CPU-side memory to GPU-side memory. Frequent data transfer between the CPU and the GPU can seriously reduce execution efficiency. In the following examples, the inventors have optimized this further.
  • FIG. 10 is a flow chart of a parallel execution algorithm for R-wave position extraction according to an embodiment of the present invention. As shown in FIG. 10, the algorithm includes the following steps:
  • S1001 Declare the variables of the GPU device and allocate corresponding video memory for them, and copy the filtered ECG signal from host memory to the global memory of the GPU device.
  • S1002 Each thread invokes a kernel function and reads, according to its index number, one ECG sample to be detected from the filtered ECG signal.
  • S1003 According to the set window size w and a set gradient, the thread erodes to different degrees the sample it read and the w-1 samples immediately following it, and stores the minimum of the eroded values, according to the index number, in a first temporary sequence in global memory; w is an integer, w ≥ 2.
  • S1005 According to the set window size w and the set gradient, dilate to different degrees the minimum that was read and the w-1 minima immediately following it, and store the maximum of the dilated values, according to the index number, in a second temporary sequence in global memory.
  • S1006 Read the ECG sample to be detected and the maximum according to the index number, calculate their difference, and store the difference in a third temporary sequence in global memory.
  • S1007 The thread invokes a reduction summation kernel function to calculate the sum of all the differences in the third temporary sequence.
  • The position of the R wave can then be calculated on the host according to the mean.
  • The minimum of the eroded values of the sample read and the w-1 following samples may also be stored in a register.
  • The thread's own minimum is then read from the register, and the w-1 minima adjacent to it are read from global memory.
  • The number of threads may equal the sampling point sequence length L3 of the ECG signal copied from the host, ThreadsPerBlock3 threads may be set to correspond to one thread block, and the number of thread blocks may be:
  • BlockNum3 = (L3+ThreadsPerBlock3-1)/ThreadsPerBlock3.
  • FIG. 11 is a flow chart of a parallel execution algorithm for R-wave position extraction according to an embodiment of the present invention. As shown in FIG. 11, the algorithm includes the following steps:
  • S1101 Determine the number of thread blocks to start, BlockNum, and the number of threads per thread block, threadPerBlock, according to the data size and the GPU's own parameter limits, so that the total number of threads equals the amount of data.
  • S1102 Call the kernel_1 function, in which each thread reads its ECG data point and the following four points from global video memory according to its index number gid, erodes the five values to different degrees, and writes the resulting minimum both to the temporary array d_f0[gid] and to a register.
  • S1103 Synchronize the threads so that all threads have completed the erosion operation of step S1102.
  • S1105 In the kernel function, the thread first reads d_f0[gid] from the register, then reads the four adjacent values d_f0[gid+1], d_f0[gid+2], d_f0[gid+3], d_f0[gid+4] from memory according to its index number, dilates the five values to different degrees, and writes the maximum into the temporary array d_f1[gid] in global memory according to its index number.
  • S1108 Call the reduction summation kernel function to sum the data in the array d_s1 and write the final result to the array d_sum_s1.
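  • The reduction summation used in S1108 is commonly implemented as a tree reduction, in which the active range is halved at every step. The source does not describe the kernel's internals, so the Python sketch below is only a generic illustration; the name reduce_sum is an assumption, and the inner loop stands in for threads running in parallel.

```python
def reduce_sum(values):
    """Tree reduction: each step folds the upper half onto the lower half."""
    buf = list(values)
    n = len(buf)
    while n > 1:
        half = (n + 1) // 2
        for tid in range(n // 2):        # each tid stands for one thread
            buf[tid] += buf[tid + half]
        n = half                         # a GPU kernel would synchronize here
    return buf[0] if buf else 0


d_s1 = [3, 1, 4, 1, 5]
mean = reduce_sum(d_s1) / len(d_s1)      # the mean is then derived from the sum
```

    With n values, the loop runs about log2(n) steps instead of n sequential additions, which is where the parallel gain comes from.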
  • The QRS complex start and end positions may be extracted by the peak group determination method: find the peaks and troughs near the R wave and determine the edges of the QRS complex based on a set threshold. The feature parameters required for extracting the start and end positions depend on the R-wave position extraction.
  • The calculation of the QRS complex width depends on two arrays: QRS_startlist and QRS_endlist.
  • The array QRS_startlist stores the starting points of the QRS complexes,
  • and the array QRS_endlist stores their ending points, so the width of a QRS complex is the difference between the two.
  • The QRS complex width extraction may be performed using a serial algorithm.
  • FIG. 12 is a flow chart of the serial execution algorithm for QRS complex width extraction in an embodiment of the present invention.
  • The start position QRS_startlist[i] and the end position QRS_endlist[i] are read in sequence from QRS_startlist and QRS_endlist, their difference QRSlist is computed, and whether to record the difference is then decided according to the set threshold of 80.
  • The QRS complex width extraction may also be performed using a parallel algorithm. A number of threads corresponding to the lengths of QRS_startlist and QRS_endlist is started; each thread reads the entries at its index gid from QRS_startlist and QRS_endlist, takes their difference to obtain the width of the corresponding QRS complex, and writes the width to global memory according to its index.
  • FIG. 13 is a flow chart of a parallel execution algorithm for QRS complex width extraction in an embodiment of the present invention. As shown in FIG. 13, the algorithm includes the following steps:
  • S1301 Declare the variables of the GPU device and allocate corresponding video memory for them, and copy the result of the QRS complex start and end position extraction from the host to the global memory of the GPU device.
  • The thread invokes a kernel function of the GPU device, reads the start position and the end position from the QRS complex start and end position extraction result according to its index number, calculates their difference, and stores the difference in global memory according to the index number, generating the QRS complex width extraction result.
  • The start and end positions of the QRS complex may be obtained on the host by the peak group determination method.
  • Steps that are not easy to parallelize are performed serially on the host, which reduces the time spent copying data between the CPU and the GPU.
  • The number of thread blocks to start, blocknum, and the number of threads per thread block, threadsPerBlock, may be determined according to the data size and the GPU's own parameter limits.
  • The total number of threads started can be blocknum*threadsPerBlock.
  • Each thread reads the start position QRS_startlist[gid] and the end position QRS_endlist[gid] from the arrays QRS_startlist and QRS_endlist in global memory according to its index number gid.
  • The QRS complex width is then written to QRSlist[gid] according to the thread's index number.
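  • The per-thread work of the width extraction is a single subtraction. A Python sketch with a loop in place of the threads is shown below; the function name qrs_widths and the sample positions are illustrative assumptions, and the threshold test against 80 mentioned for the serial version is left out because the source does not state its direction.

```python
def qrs_widths(qrs_startlist, qrs_endlist):
    """Width of each QRS complex: end position minus start position."""
    n = min(len(qrs_startlist), len(qrs_endlist))
    # gid stands for the thread index; each "thread" writes QRSlist[gid]
    return [qrs_endlist[gid] - qrs_startlist[gid] for gid in range(n)]


QRSlist = qrs_widths([100, 400, 720], [180, 495, 790])
```

    Since each thread does so little arithmetic, copying the two arrays to the device can cost as much as the computation itself, which fits the speedup close to 1 reported for this stage.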
  • the start position and the end position of the QRS complex used can be calculated at the host end.
  • The serial abnormal waveform classification algorithm can be divided into four steps:
  • The first step judges one or more types of heartbeat one by one.
  • The second step handles data segments heavily polluted by noise and heartbeats of too small amplitude, and identifies atrial premature beats and ventricular premature beats.
  • The third step identifies missed beats and pauses.
  • The fourth step applies the template comparison method to all abnormal waves.
  • The template comparison method creates a QRS template near each abnormal heartbeat.
  • The most time-consuming core part of the template comparison method is generating the identification sequence.
  • The identification sequence is generated from the RR intervals RRlist.
  • An identifier takes one of three values: -1, 0, or 1.
  • FIG. 14 is a schematic flow chart of a parallel execution algorithm for creating a template in an embodiment of the present invention. As shown in FIG. 14, the algorithm includes the following steps:
  • S1401 Declare the variables of the GPU device and allocate corresponding video memory for them, and copy the result of the R-wave position extraction from the host to the global memory of the GPU device.
  • S1402 The thread invokes a kernel function, reads an RR interval from the R-wave position extraction result according to its index number, obtains the identifier of each RR interval according to a set criterion, and stores the identifier in global memory according to the index number, generating the template creation result.
  • The RR intervals can be calculated on the host from the R-wave position extraction result.
  • The set criterion may include:
  • if the (i+1)th RR interval RRlist[i+1], adjacent to the ith RR interval RRlist[i], is not within the interval (0.6*RRlist[i], 1.5*RRlist[i]), the identifier of RRlist[i] is -1, where i is an integer and i ≥ 0;
  • if RRlist[i+1] is within (0.6*RRlist[i], 1.5*RRlist[i]) and RRlist[i] is not within (0.8*RRmean, 1.3*RRmean), the identifier of RRlist[i] is 0, where RRmean is the mean of all RR intervals;
  • if RRlist[i+1] is within (0.6*RRlist[i], 1.5*RRlist[i]) and RRlist[i] is within (0.8*RRmean, 1.3*RRmean), the identifier of RRlist[i] is 1.
  • The number of thread blocks to start, blocknum, and the number of threads per thread block, threadPerBlock, may be determined according to the data size and the GPU's own parameter limits.
  • The number of threads started can be blocknum*threadPerBlock; each thread reads its RR interval from global video memory according to its index, obtains the corresponding identifier according to the three criteria above, and writes the result according to the thread index.
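  • The identifier criteria listed in the embodiments (intervals (0.6*RRlist[i], 1.5*RRlist[i]) and (0.8*RRmean, 1.3*RRmean)) can be written as a short Python function. The function name rr_identifiers is an assumption, and the last RR interval is skipped here because it has no successor; the source does not say how it is treated.

```python
def rr_identifiers(rrlist):
    """Identifier -1, 0 or 1 for each RR interval that has a successor."""
    rrmean = sum(rrlist) / len(rrlist)
    ids = []
    for i in range(len(rrlist) - 1):          # i stands for the thread index
        cur, nxt = rrlist[i], rrlist[i + 1]
        if not (0.6 * cur < nxt < 1.5 * cur):
            ids.append(-1)                    # next interval changes abruptly
        elif not (0.8 * rrmean < cur < 1.3 * rrmean):
            ids.append(0)                     # stable, but far from the mean
        else:
            ids.append(1)                     # stable and close to the mean
    return ids


ids = rr_identifiers([800, 800, 400, 800])
```

    Each identifier depends only on RRlist[i], RRlist[i+1] and the precomputed mean, so the loop body maps directly onto one thread per interval.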
  • A template is created near each abnormal heartbeat; that is, whenever an abnormal heartbeat is found, the template creation function must be called, so when there are very many abnormal waveforms this stage consumes a lot of time.
  • The identification sequence does not need to be regenerated each time a template is created. The embodiment of the present invention therefore significantly reduces the time consumed by abnormal waveform classification by optimizing identification sequence generation and template creation.
  • the parallel ECG signal analysis method of the embodiment of the present invention always maintains the same accuracy rate as the serial ECG signal analysis method.
  • The serial and parallel algorithms were each tested multiple times on 24-hour long-term ECG signal data, and the mean times were used to obtain the speedup of each ECG signal analysis stage, as shown in Table 1.
  • Table 1 lists the serial run time, the parallel run time, and the corresponding speedup.
  • the average running time is 44 milliseconds, and the obtained speedup is 5.9;
  • the serial running time of the morphological dilation and erosion operations is 1685 milliseconds, the average running time after parallel design and optimization is 160 milliseconds, and the speedup is 10.5;
  • the serial running time of the QRS complex width calculation is 7 milliseconds, the average running time after parallel acceleration is 6.8 milliseconds, and the speedup is close to 1;
  • the average running time of the arrhythmia waveform classification algorithm is 1562 milliseconds, the running time after parallel acceleration is 30.2, and the obtained speedup is 48.5.
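  • The speedup is the serial time divided by the parallel time. As a quick arithmetic check of two of the reported stages, a Python sketch (the function name speedup is an assumption):

```python
def speedup(serial_ms, parallel_ms):
    """Speedup (acceleration ratio) = serial time / parallel time."""
    return serial_ms / parallel_ms


morphology = round(speedup(1685, 160), 1)   # dilation and erosion stage
qrs_width = round(speedup(7, 6.8), 2)       # QRS complex width stage
```

    These evaluate to 10.5 and 1.03, matching the reported "10.5" and "close to 1".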
  • The parallel algorithm of the embodiment of the present invention was analyzed and tested multiple times on sets of ECG signal samples of different durations to obtain average analysis times, as shown in Table 2. Table 2 shows that the more abnormal waveforms there are, the longer the analysis takes, and that the parallel ECG signal analysis method adapts well to ECG signal data files of different durations.
  • The analysis time can be kept within 2 seconds.
  • The average analysis time for ECG signal data with a duration of 24 hours was 1.9 seconds.
  • A single "CPU+GPU" server can analyze 44,920 24-hour long-term ECG recordings in real time. An algorithm server cluster of seven "CPU+GPU" servers deployed on the home health cloud platform can therefore meet the demand of analyzing 300,000 long-term ECG recordings.
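  • The cluster size follows from dividing the demand by the per-server capacity and rounding up. A small Python check using the two figures quoted above (the function names are assumptions):

```python
def servers_needed(demand, per_server):
    """Smallest number of servers whose combined capacity covers the demand."""
    return (demand + per_server - 1) // per_server


def cluster_capacity(per_server, servers):
    return per_server * servers


needed = servers_needed(300000, 44920)
capacity = cluster_capacity(44920, 7)
```

    With these figures, needed is 7 and capacity is 314,440 recordings, which leaves a small margin above 300,000.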
  • Table 2
    ECG duration (h)     1     2     4     8     12     16     20     24
    File size (MB)       0.5   1     2.1   4.1   6.2    8.3    10.3   12.4
    Analysis time (ms)   105   183   345   676   1023   1364   1644   1932
  • The invention uses a "GPU+CPU" heterogeneous parallel system to accelerate the analysis of long-term ECG data: GPU (Graphic Processing Unit) acceleration is first used to parallelize the parallelizable, time-consuming parts of the serial ECG algorithm, accelerating ECG analysis at the instruction level, and GPU servers loaded with the parallel ECG analysis algorithm are then deployed on the home health cloud platform to meet the needs of the current home health cloud service platform.
  • the GPU-based parallel ECG data analysis method of the embodiment of the present invention can significantly improve the analysis speed of the ECG signal by performing one or more steps in the ECG signal analysis process in parallel on the GPU.
  • embodiments of the present invention can be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or a combination of software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.
  • The computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device that implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Cardiology (AREA)
  • Engineering & Computer Science (AREA)
  • Surgery (AREA)
  • Medical Informatics (AREA)
  • Veterinary Medicine (AREA)
  • Physics & Mathematics (AREA)
  • Public Health (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Physiology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • Complex Calculations (AREA)

Abstract

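A GPU-based parallel ECG signal analysis method, comprising: filtering the ECG signal through long-interval artifact rejection and short-interval artifact rejection (S201); performing QRS detection on the filtered ECG signal through R-wave position extraction, QRS complex start and end position extraction, and QRS complex width extraction (S202); and classifying abnormal waveforms in the QRS-detected ECG signal by creating templates (S203). At least one of long-interval artifact rejection, short-interval artifact rejection, R-wave position extraction, QRS complex width extraction, and template creation is executed in parallel by multiple threads on the GPU device, each thread reading and processing its corresponding data through its uniquely corresponding index number. The method increases the speed of ECG signal analysis by executing one or more steps of the analysis in parallel on the GPU.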

Description

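GPU-based parallel ECG signal analysis method
Technical Field
The present invention relates to the field of biomedical engineering, and in particular to a GPU-based parallel ECG signal analysis method.
Background
With the development of bioinformation technology, more and more wearable health and medical products collect and analyze human ECG signals to provide personalized health and medical services, so that people can understand their own health without visiting a specialized medical institution.
Automated analysis of ECG data has therefore become a research focus in biomedicine. At present, most automated ECG analysis techniques target ECG data collected with the hospital as the basic unit, so the scale of data they face is quite limited. A home health cloud platform, however, serves household users in small, medium, and even large cities, and tens of thousands of users upload long-term and short-term ECG data every day.
At present, the health cloud platform uses a serial ECG data analysis algorithm, which can analyze short-term ECG data and return feedback in real time, but analyzing long-term ECG signal data still takes a great deal of time and seriously degrades the user experience. For example, for 24-hour long-term ECG data, the current home health cloud platform has an average response time of 35 seconds from upload to returning the analysis result, which is long.
Researchers at home and abroad are actively trying to accelerate the analysis and processing of ECG data from different angles. Although there are many meaningful results on parallel processing of ECG data, they propose only coarse-grained processing flows for ECG data analysis and can hardly solve the problems currently encountered.
Summary of the Invention
The present invention provides a GPU-based parallel ECG signal analysis method to address one or more of the above deficiencies.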
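The present invention provides a GPU-based parallel ECG signal analysis method, the method comprising: filtering the ECG signal through long-interval artifact rejection and short-interval artifact rejection; performing QRS detection on the filtered ECG signal through R-wave position extraction, QRS complex start and end position extraction, and QRS complex width extraction; and classifying abnormal waveforms in the QRS-detected ECG signal by creating templates; wherein at least one of the long-interval artifact rejection, the short-interval artifact rejection, the R-wave position extraction, the QRS complex width extraction, and the template creation is executed in parallel by multiple threads on the GPU device, each thread reading and processing its corresponding data through its uniquely corresponding index number.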
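In one embodiment, the long-interval artifact rejection is executed in parallel by multiple threads, and the method comprises: declaring the variables of the GPU device and allocating corresponding video memory for them, and copying the ECG signal from the host to the global memory of the GPU device; segmenting the ECG signal according to the interval of long-interval artifact rejection, each thread calling a kernel function of the GPU device, reading its corresponding segment of the ECG signal from global memory according to the index number, calculating the standard deviation of that segment, and storing the standard deviation, according to the index number, in a first standard deviation sequence in global memory; rejecting the segments whose standard deviation is not within a preset threshold range; the threads calling the reduction summation kernel function of the GPU device to calculate the sum of the standard deviations of all segments remaining after rejection; calculating the mean from the sum, and generating a first threshold range from the mean; each thread calling the kernel function again, reading, according to the index number, the standard deviation of each remaining segment and judging whether it is within the first threshold range, and storing the judgment, according to the index number, in a first noise sequence in global memory; each thread reading from the first noise sequence, according to the index number, its judgment together with the adjacent preceding and following judgments, and, if both the preceding and the following judgments indicate a standard deviation outside the first threshold range, storing a first marker denoting noise in the first noise sequence according to the index number, and otherwise storing a second marker denoting non-noise, thereby generating the long-interval artifact rejection result; and copying the long-interval artifact rejection result from the GPU device to the host.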
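In one embodiment, the number of threads is
Figure PCTCN2015082040-appb-000001
and each thread processes f*T sampling points of the ECG signal, where L is the length of the sampling point sequence of the ECG signal copied from the host, f is the sampling frequency of the ECG signal, and T is the interval of long-interval artifact rejection.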
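In one embodiment, the standard deviation is
Figure PCTCN2015082040-appb-000002
where
Figure PCTCN2015082040-appb-000003
where pj is the ECG signal at the jth sampling point of the segment, j is an integer, j ≥ 0, f is the sampling frequency of the ECG signal, and T is the interval of long-interval artifact rejection.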
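In one embodiment, the upper and lower limits of the first threshold range are 3 times the mean and 1/3.5 times the mean, respectively.
In one embodiment, the preset threshold range is [0.5, 3].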
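As a rough illustration of how the preset range and the first threshold range might be combined, the following Python sketch emulates the per-thread work with plain loops. The function name long_interval_flags, the 0/1 flag values, and the use of the population standard deviation are assumptions made here for illustration; the source gives the standard deviation formula only as an image.

```python
import statistics


def long_interval_flags(ecg, f, T, preset=(0.5, 3.0)):
    """Return one flag per f*T-sample segment: 1 = noise, 0 = not noise."""
    n = f * T                                  # samples handled by one thread
    count = len(ecg) // n
    # One standard deviation per segment (population form is assumed)
    std = [statistics.pstdev(ecg[g * n:(g + 1) * n]) for g in range(count)]
    # Reject segments whose standard deviation is outside the preset range
    kept = [s for s in std if preset[0] <= s <= preset[1]]
    if not kept:
        return [1] * count
    # Mean of the remaining values gives the first threshold range
    mean = sum(kept) / len(kept)
    lower, upper = mean / 3.5, 3 * mean
    return [
        0 if preset[0] <= s <= preset[1] and lower <= s <= upper else 1
        for s in std
    ]
```

In the method described above, the sum behind the mean would come from the reduction summation kernel, and a final pass would apply the neighbour rule to these flags.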
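In one embodiment, the short-interval artifact rejection is executed in parallel by multiple threads, and the method comprises: declaring the variables of the GPU device and allocating corresponding video memory for them, and copying the ECG signal after long-interval artifact rejection from the host to the global memory of the GPU device; segmenting the ECG signal according to the interval T1 of short-interval artifact rejection, each thread calling a kernel function of the GPU device, reading one segment of the ECG signal according to the index number, and calculating the value of the formula
Figure PCTCN2015082040-appb-000004
and storing the value of the formula, according to the index number, in a second noise sequence in global memory, where sum is the sum of squares of the ECG samples in the segment; modifying the number of threads and the number of sampling points each thread processes, each modified thread calling the kernel function again, reading, according to its index number, its uniquely corresponding marker value in the first noise sequence of the long-interval artifact rejection result, and using a third marker to mark the ECG segments whose marker value indicates noise; serially screening out, according to the third marker, the ECG segments whose marker value indicates noise; the modified threads calling the reduction summation kernel function on the GPU device to calculate the sum of all ECG values remaining after screening; obtaining the mean from the sum, and generating a second threshold range from the mean; modifying the number of threads and the number of ECG values each handles once more, the re-modified threads calling the kernel function again, reading from the second noise sequence according to their index numbers, and rejecting every value of the formula that is not within the second threshold range, thereby generating the short-interval artifact rejection result; and copying the short-interval artifact rejection result from the GPU device to the host.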
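In one embodiment, the method further comprises: each thread processes f*T1 sampling points of the ECG signal, and the number of threads is
Figure PCTCN2015082040-appb-000005
where L is the length of the sampling point sequence of the ECG signal copied from the host and f is the sampling frequency of the ECG signal; ThreadsPerBlock threads are set to correspond to one thread block, the number of sampling points of the ECG signal per thread block is DataPerBlock=f*T*ThreadsPerBlock, and the number of thread blocks is BlockNum=(L+DataPerBlock-1)/DataPerBlock.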
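In one embodiment, the method further comprises: the number of modified threads is
Figure PCTCN2015082040-appb-000006
where L is the length of the sampling point sequence of the ECG signal copied from the host, f is the sampling frequency of the ECG signal, and T is the interval of long-interval artifact rejection; DataPerBlock1 modified threads correspond to one thread block, and the number of thread blocks is BlockNum1=(L1+DataPerBlock1-1)/DataPerBlock1, where L1 is the length of the first noise sequence of the long-interval artifact rejection result.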
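In one embodiment, the method further comprises: the number of threads after the second modification is
Figure PCTCN2015082040-appb-000007
each of these threads processes one value of the formula, ThreadsPerBlock2 of them are set to correspond to one thread block, and the number of thread blocks is BlockNum2=(n+ThreadsPerBlock2-1)/ThreadsPerBlock2, where L is the length of the sampling point sequence of the ECG signal copied from the host.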
一个实施例中,所述R波位置提取通过多个所述线程并行执行,所述方法包括:声明所述GPU设备端的变量并为其分配相应显存,将滤波处理后的所述心电信号从主机端的内存拷贝到所述GPU设备端的全局内存;每个所述线程调用kernel函数,根据其索引号对应从滤波处理后的所述心电信号中读取一个待检波心电信号;所述线程根据设定窗口大小w和一设定梯度,对读取的所述待检波心电信号及其后紧邻的w-1个待检波心电信号进行不同程度的腐蚀运算,将腐蚀运算后的读取的所述待检波心电信号及所述w-1个待检波心电信号中的最小值根据所述索引号存储至所述全局内存中的一第一临时序列,w为整数,w≥2;每个所述线程根据所述索引号从所述第一临时序列中读取一个所述最小值;根据所述设定窗口大小w和所述设定梯度,对读取的所述最小值及其后紧邻的w-1个所述最小值进行不同程度的膨胀运算,并根据所述索引号将膨胀运算后的读取的所述最小值及其后紧邻的w-1个所述最小值中的最大值存储至所述全局内存中的一第二临时序列;根据所述索引号读取并计算所述待检波心电信号和所述最大值的差值,并将所述差值存储至所述全局内存中的一第三临时序列;所述线程调用规约求和kernel函数计算所述第三临时序列中的所有所述差值的和值;根据所述和值求均值;将所述均值从所述GPU设备端拷贝至所述主机端。
一个实施例中,所述方法还包括:将腐蚀运算后的读取的所述待检波心电信号及所述w-1个待检波心电信号中的最小值存储至一寄存器中;从所述寄存器中读取的所述最小值,从所述全局内存中读取所述w-1个所述最小值。
一个实施例中,所述设定窗口大小w=5,所述设定梯度为k[w]={0,50,100,50,0}。
一个实施例中,所述线程的数量等于从所述主机端拷贝的所述心电信号的采样点序列长度L3,设定ThreadsPerBlock3个所述线程对应一个线程块,所述线程块的数量BlockNum3=(L3+ThreadsPerBlock3-1)/ThreadsPerBlock3。
一个实施例中,所述QRS波群宽度提取通过多个所述线程并行执行,所述方法包括:声明所述GPU设备端的变量并为其分配相应显存,将QRS波群起止位置提取的结果从主机端拷贝到所述GPU设备端的全局内存;所述线程调用所述GPU设备端的kernel函数,根据其索引号读取所述QRS波群起止位置提取的结果中的起始位置和终止位置,计算所述起始位置和终止位置的差值,根据所述索引号将所述差值存储至所述全局内存,生成QRS波群宽度提取的结果;将所述QRS波群宽度提取的结果从所述GPU设备端拷贝至所述主机端。
一个实施例中,在主机端根据峰值群确定法获得QRS波群的所述起始位置和所述终止位置。
一个实施例中,所述创建模板通过多个所述线程并行执行,所述方法包括:声明所述GPU设备端的变量并为其分配相应显存,将R波位置提取的结果从主机端拷贝到所述GPU设备端的全局内存;所述线程调用kernel函数,根据其索引号读取所述R波位置提取的结果中的RR间期,根据一设定准则获取每个所述RR间期的标识符,并根据所述索引号将所述标识符存储至所述全局内存中,生成创建模板的结果;将所述创建模板的结果从所述GPU设备端拷贝到所述主机端。
一个实施例中,所述设定准则包括:如果与第i个RR间期RRlist[i]相邻的第i+1个RR间期RRlist[i+1]不在区间(0.6*RRlist[i],1.5*RRlist[i])内,则所述RR间期数据RRlist[i]的标识符为-1,其中,i为整数,i≥0;如果所述RR间期RRlist[i+1]在区间(0.6*RRlist[i],1.5*RRlist[i])内,及所述RR间期RRlist[i]不在区间(0.8*RRmean,1.3*RRmean)内,则所述RR间期RRlist[i]的标识符为0,其中,RRmean为所有所述RR间期的平均值;如果所述RR间期RRlist[i+1]在区间(0.6*RRlist[i],1.5*RRlist[i])内,及所述RR间期RRlist[i]在区间(0.8*RRmean,1.3*RRmean)内,则所述RR间期RRlist[i]的标识符为1。
本发明实施例的基于GPU的并行心电信号数据分析方法通过将心电信号分析过程中的一个或多个步骤在GPU上并行执行,可以显著提高心电信号的分析速度。
本发明设计的基于GPU的并行心电分析方法在心电分析的各个阶段取得了几倍至几十倍的加速,心电分析总耗时获得了相对于普通工作站式服务器17倍的加速比。通过在家庭健康云平台的GPU服务器上部署该并行算法,可以满足现阶段对大规模长程心电信号数据快速分析的需求。
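Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for the description are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
FIG. 1 is a flowchart of an existing serial ECG signal analysis method;
FIG. 2 is a flowchart of the GPU-based parallel ECG signal analysis method according to an embodiment of the present invention;
FIG. 3 is a flowchart of the serial long-interval artifact removal algorithm in an embodiment of the present invention;
FIG. 4 is a flowchart of the parallel long-interval artifact removal algorithm in an embodiment of the present invention;
FIG. 5 is a flowchart of the parallel long-interval artifact removal algorithm in one embodiment of the present invention;
FIG. 6 is a flowchart of the serial short-interval artifact removal algorithm in an embodiment of the present invention;
FIG. 7 is a flowchart of the parallel short-interval artifact removal algorithm in an embodiment of the present invention;
FIG. 8 is a flowchart of the parallel short-interval artifact removal algorithm in one embodiment of the present invention;
FIG. 9 is a flowchart of the serial R-wave position extraction algorithm in an embodiment of the present invention;
FIG. 10 is a flowchart of the parallel R-wave position extraction algorithm in an embodiment of the present invention;
FIG. 11 is a flowchart of the parallel R-wave position extraction algorithm in one embodiment of the present invention;
FIG. 12 is a flowchart of the serial QRS complex width extraction algorithm in an embodiment of the present invention;
FIG. 13 is a flowchart of the parallel QRS complex width extraction algorithm in an embodiment of the present invention;
FIG. 14 is a flowchart of the parallel template creation algorithm in an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments are described in further detail below with reference to the drawings. The illustrative embodiments and their descriptions are used to explain the present invention and do not limit it.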
图1是现有串行心电信号分析方法的流程示意图,如图1所示,在滤波处理101过程中,现有技术采用长间期伪差剔除算法、短间期伪差剔除算法及简单整系数梳状滤波器对心电信号进行滤波处理,从而剔除包括工频干扰、基线漂移、肌电干扰的伪差干扰,以消除人体和信号采集仪器等外界环境产生的噪声影响。上述三种用于滤波处理101的算法均为串行算法,而且伪差剔除在特征提取和波形分类的阶段也都有使用,所以上述三种算法明显降低了心电信号处理的速度。
通过对包含上述三种串行算法的滤波处理过程进行分析,发明人发现:每种算法占滤波处理阶段耗时的1/3;全通网络简单整系数梳状滤波器在对数据进行滤波处理的过程中,数据的前后依赖关系非常强,这与CUDA(Compute Unified Device Architecture,统一计算设备架构)编程模型下并行计算的单指令多数据原则(SIMD)相违背,所以该算法不适合并行进行;长间期伪差剔除算法和短间期伪差剔除算法在对数据进行滤波处理的过程中,数据的关联性较弱,可以进行并行处理。
因此,若在对心电信号进行滤波处理时,长间期伪差剔除算法及/或短间期伪差剔除算法采用并行执行方法可以节省1/3或2/3的滤波处理阶段的耗时。
如图1所示,现有技术中,采用串行算法进行QRS检测102中的R波位置提取和QRS波群起止位置提取两个处理过程。若可设计出R波位置提取和QRS波群起止位置提取并行执行的算法,可进一步提高心电信号分析的速度。
再如图1所示,现有技术中,通过串行算法标识QRS波群来进行异常波形分类103。若可设计出并行算法标识QRS波群,则可获得更快的心电信号分析速度。
为了解决现有技术中存在的问题,本发明实施例提供了一种基于GPU的并行心电信号分析方法。如图2所示,本发明实施例的并行心电信号分析方法包括步骤:
S201:通过长间期伪差剔除和短间期伪差剔除对心电信号进行滤波处理。
S202:通过R波位置提取、QRS波群起止位置提取及QRS波群宽度提取对滤波处理后的所述心电信号进行QRS检波。
S203:通过创建模板对QRS检波后的所述心电信号进行异常波形分类。
其中,上述长间期伪差剔除、上述短间期伪差剔除、上述R波位置提取、上述QRS波群宽度提取及上述创建模板中至少有一个通过GPU设备端的多个线程并行执行,该线程通过其唯一对应的索引号读取并处理其所对应的数据。
本发明实施例的并行心电信号分析方法,通过对长间期伪差剔除算法、短间期伪差剔除算法、R波位置提取、QRS波群宽度提取及创建模板中的至少一个在GPU上并行执行,可在某种程度上加快心电信号的分析速度。
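In one embodiment, long-interval artifact removal may be performed by a serial algorithm. During long-interval artifact removal, a noise sequence noiselist[length] is generated, whose array length is:
length=ecgnum÷T,
where ecgnum is the array length of the raw ECG signal data and T is the interval used for long-interval artifact removal. If a segment is judged to be noise, 1 is written at the corresponding position of the noise sequence; otherwise 0 is written. FIG. 3 is a flowchart of the serial long-interval artifact removal algorithm in an embodiment of the present invention. As shown in FIG. 3, the serial long-interval artifact removal algorithm includes the following steps: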
S301:初始化心电信号的段标识符K=0。
S302:从原始心电信号中依次读取心电信号数据。
S303:当读入f*T个心电信号数据点时,根据标准差公式M计算每段心电信号数据集的标准差,并将段标识符K加1:
Figure PCTCN2015082040-appb-000008
where f is the sampling frequency of the ECG signal, pj is the j-th sample in each ECG segment, and j is an integer with j≥0.
S304:重复S302和S303步骤,直到所有心电信号数据都被处理。
S305:根据经验阈值范围[0.5,30]判断每段心电信号数据集的标准差是否在该经验阈值范围以内,并计算在该经验阈值范围内的心电信号数据集的标准差的均值temp_M。
S306:根据均值temp_M计算得到新的阈值上限temp_h和阈值下限temp_l,进而得到一个新的阈值范围,其中,
temp_h=temp_M*3,
temp_l=temp_M÷3.5,temp_l≥0.6。
S307:根据步骤S306得到的新的阈值范围判断心电信号噪声段,即如果该段心电信号数据集的标准差在该新的阈值范围以内则不是噪声段,记作noiselist[i]=0,反之,则判断为噪声段,记作noiselist[i]=1。
S308:重新遍历,如果某段心电信号数据的前后两段心电信号数据都是噪声段,则将该段心电信号数据也判断为噪声段。
S309:长间期伪差剔除结束。
通过对长间期伪差剔除串行算法的分析,发明人发现,在长间期伪差判断流程中有四个循环体,而在CUDA的架构中,线程是以单指令多数据的方式组织的,所以循环体适合进行并行化改进。
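When a kernel function is called, the number of thread blocks and the number of threads per block are specified. The number of thread blocks, BlockNum, can be computed from the length of the raw ECG data and the given number of threads per block, ThreadsPerBlock, using formulas (1) and (2):
DataPerBlock=f*T*ThreadsPerBlock   (1)
BlockNum=(ecgnum+DataPerBlock-1)/DataPerBlock   (2)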
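Formula (2) is a ceiling division written with integer arithmetic: a partially filled last block still gets a block of its own. The following is a minimal Python sketch of formulas (1) and (2). The function name and the sample values (250 Hz sampling, 256 threads per block) are illustrative assumptions, not values taken from the text.

```python
def blocks_needed(ecgnum: int, f: int, T: int, threads_per_block: int) -> int:
    """Formulas (1) and (2): samples per block, then ceiling division."""
    data_per_block = f * T * threads_per_block               # formula (1)
    return (ecgnum + data_per_block - 1) // data_per_block   # formula (2)


# Hypothetical sizes: 24 h recorded at an assumed 250 Hz, T = 5 s, 256 threads per block.
ecgnum = 24 * 3600 * 250
print(blocks_needed(ecgnum, f=250, T=5, threads_per_block=256))
```

With these assumed values each block covers 320,000 samples, so 21,600,000 samples need 67.5 blocks, which the integer formula rounds up to 68.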
在长间期伪差判断过程中,第一个循环是一个嵌套循环,其实现的功能是,按照内循环规定的子数据集的数量,计算整个数据集中每个子数据集的标准差。基于该循环的特点,将该循环进行并行化改进,可以使一个线程计算一个子数据集的标准差,多个线程同时无通信地并行执行。
第二个循环实现的功能是,依据经验阈值剔除部分数据,并计算剩余数据的均值。将该循环进行并行化改进,可采用CPU串行执行的方式,生成一个新的数组,然后调用归约求和的内核kernel函数,计算该新的数组中的数据的和。
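The second loop combines a serial filtering step with a reduction sum. The sketch below emulates a pairwise (tree) reduction on the CPU in Python, only to show the access pattern such a kernel typically uses. It is not the kernel of the embodiment: the function names are hypothetical, and the default bounds follow the empirical range [0.5, 30] quoted for the serial algorithm.

```python
def reduce_sum(values):
    """Emulate a tree reduction: each pass folds the upper half onto the lower half."""
    buf = list(values)
    n = len(buf)
    while n > 1:
        half = (n + 1) // 2            # size of the range that survives this pass
        for tid in range(n // 2):      # each "thread" adds its partner element
            buf[tid] += buf[tid + half]
        n = half
    return buf[0] if buf else 0.0


def mean_within(stds, low=0.5, high=30.0):
    """Serially keep the values inside [low, high], then reduce and average them."""
    kept = [s for s in stds if low <= s <= high]
    return reduce_sum(kept) / len(kept) if kept else 0.0
```

On a GPU each pass of the `while` loop would run its additions concurrently, so the number of passes grows with log2 of the array length rather than with the length itself.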
第三个循环实现的功能是判断第一个循环产生的结果集m_noise所对应的数据集是否是噪声。其具体实现过程是,每次从结果集m_noise中读取一个值,然后将该值与上述经验阈值比较,并以此判断该值所对应的数据是否是噪声。针对这种情形,将该循环进行并行化改进,可启动length个线程,使每个线程仅与结果集m_noise中的一个值对应,从而每个线程按照其线程索引号可以从m_noise中读取一个相应的值。然后将根据线程索引号读取的值和设定阈值比较,并按该线程索引号将比较结果写入数组noiselist中。
第四个循环是对第三个循环产生的噪声序列进行再确认。该循环所依据的判断准则是“如果一个子数据集的前后两个数据集均被判断为噪声,那么该段数据集也为噪声”。在功能的实现过程中,每次从数组noiselist中读取一个数据值,并读取与该数据值相邻的前后两个数据值,然后依据上述判断准则修改该数据值。将该循环进行并行化改进,可启动length个线程,使得每个线程的索引号仅与数组noiselist中的一个值对应,如此一来,线程就可以按照其索引号从数组noiselist中读取其所对应的值以及与该值前后相邻的值,然后根据所读取的值进行相应的运算。
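Because the improved fourth loop executes in parallel, it must be considered whether the neighbouring values that a thread reads may be dirty data.
To simplify the analysis, denote the value at a thread's index as a, the values immediately before and after a as font and back, the value before font as b, and the value after back as c. By the criterion of the fourth loop ("if the segments before and after a segment are both judged to be noise, that segment is also noise"), a write occurs only under two conditions: first, the value being written is 0; second, both of its neighbours are 1. Under these conditions, suppose font or back is modified after the thread for a has read it. Then font (or back) must have been 0, and b and a must both be 1 (or a and c must both be 1). In other words, whichever of font and back is dirty data, a must be 1. A thread never changes a value that is already 1, so the stale read does not affect how a is processed, and the final value of a cannot be misjudged even if dirty data is read.
Therefore, reading dirty data cannot change the result of the fourth loop when it is executed in parallel.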
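The argument above can be checked by brute force on small inputs. The Python sketch below compares a "snapshot" update, in which every thread reads only the original array, with in-place updates executed in two different orders, in which a thread may see a neighbour that another thread has already changed. All function names are hypothetical. The check is exhaustive over the inputs of a given length but covers only two schedules, so it illustrates the reasoning and is not a proof.

```python
from itertools import product


def confirm_snapshot(flags):
    """Every 'thread' reads the original array, so no dirty read is possible."""
    out = list(flags)
    for gid in range(1, len(flags) - 1):
        if flags[gid] == 0 and flags[gid - 1] == 1 and flags[gid + 1] == 1:
            out[gid] = 1
    return out


def confirm_in_place(flags, order):
    """Threads run in the given order on one shared array (dirty reads allowed)."""
    a = list(flags)
    for gid in order:
        if 0 < gid < len(a) - 1 and a[gid] == 0 and a[gid - 1] == 1 and a[gid + 1] == 1:
            a[gid] = 1
    return a


def schedules_agree(n):
    """True if both in-place schedules match the snapshot for all 2**n inputs."""
    for bits in product((0, 1), repeat=n):
        ref = confirm_snapshot(bits)
        fwd = confirm_in_place(bits, range(n))
        rev = confirm_in_place(bits, reversed(range(n)))
        if not (ref == fwd == rev):
            return False
    return True


print(schedules_agree(8))
```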
图4是本发明实施例中的长间期伪差剔除并行执行算法的流程示意图。如图4所示,所述长间期伪差剔除通过多个所述线程并行执行,包括:
S401:声明所述GPU设备端的变量并为其分配相应显存,将所述心电信号从主机端拷贝至所述GPU设备端的全局内存。
S402:根据长间期伪差剔除的间期对所述心电信号进行分段,每个所述线程调用所述GPU设备端的kernel函数,根据所述索引号从所述全局内存读取所述线程对应的一段心电信号,计算该段心电信号的标准差,根据所述索引号将所述标准差存储至所述全局内存中一第一标准差序列。
S403:剔除标准差不在一设定阈值范围内的所述段心电信号。
S404:所述线程调用所述GPU设备端的规约求和kernel函数,计算经过剔除后剩余的所有该段心电信号的标准差的和值。
S405:根据所述和值求均值,并根据所述均值生成第一阈值范围。
S406:每个所述线程重新调用所述kernel函数,根据所述索引号读取并判断经过剔除后剩余的所述段心电信号的标准差是否在所述第一阈值范围内,根据所述索引号将判断结果存储至所述全局内存中一第一噪声序列。
S407:每个所述线程根据所述索引号从所述第一噪声序列依次读取所述判断结果及与其相邻的前一个判断结果和后一个判断结果,如果所述前一个判断结果和所述后一个判断结果均为标准差不在所述第一阈值范围内,则将表示是噪声的第一标记根据所述索引号存储至所述第一噪声序列中,反之,则将表示不是噪声的第二标记根据所述索引号存储至所述第一噪声序列中,生成长间期伪差剔除结果。
S408:将所述长间期伪差剔除结果从所述GPU设备端拷贝至所述主机端。
在上述步骤S401中,线程的数量可为:
Figure PCTCN2015082040-appb-000009
每个线程对应处理的心电信号的采样点个数可为f*T,其中,L为从主机端拷贝的心电信号的采样点序列长度,f为心电信号的采样频率,T为长间期伪差剔除的间期。
在上述步骤S402中,标准差可采用步骤S303中的标准差计算公式进行计算,标准差:
Figure PCTCN2015082040-appb-000010
其中,pj是所述段心电信号中第j个采样点处的心电信号,j为整数,j≥0,f为心电信号的采样频率,T为所述长间期伪差剔除的间期。
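In step S403 above, the set threshold range may be [0.5,3]. In step S405 above, the upper and lower limits of the first threshold range are respectively 3 times and 1/3.5 times the mean obtained in that step.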
上述步骤S407中,长间期伪差剔除结果可以包括心电信号数据、噪声/非噪声标记、标准差数据。
图5是本发明一实施例中长间期伪差剔除并行执行算法的流程示意图。如图5所示,长间期伪差剔除并行执行算法包括步骤:
S501:声明设备端变量,并为之分配相应的显存,然后将kernel函数运行中所需要的心电信号数据从主机端拷贝到设备端。
S502:设原始心电信号数据的长度为L,启动
Figure PCTCN2015082040-appb-000011
个线程,即每个线程处理一个包含f*T个心电信号数据的数据集。
S503:每个线程调用kernel函数,首先从全局内存中读取属于该段的心电信号数据,并计算该段心电信号数据的标准差M;然后根据线程的索引号gid将计算结果写入全局内存,即st_d[gid]=M。
S504:同步线程,直到所有线程均完成了步骤S503的运算。
S505:根据经验阈值范围[0.5,30]判断每段心电信号数据集的标准差是否在上述经验阈值范围内,并以此剔除明显有噪声的心电信号数据段。
S506:调用规约求和kernel函数,计算剩余数据的和值。
S507:依据步骤S506中得到的和值求均值temp_M;然后再根据均值temp_M计算得到新的阈值上限和阈值下限。
S508:每个线程调用新的kernel函数,首先按照各自的索引号gid从全局内存中读取相应的标准差,并根据步骤S507产生的阈值上限和阈值下限确定该标准差对应的数据段是否是噪声段。并将结果写入全局内存noiselist[gid],然后同步线程操作。
S509:每个线程再从全局内存中读取noiselist[gid-1]和noiselist[gid+1],执行判断“如果某段心电信号数据的前后两段心电信号数据都是噪声段,则将该段心电信号数据也判断为噪声段”,并将结果写入noiselist[gid]。
S510:将内核kernel函数返回的相关数据从设备端拷贝到主机端,释放无用的设备端变量,回收设备端显存和主机端内存。
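Steps S503 to S509 can be sketched as a CPU-side Python emulation in which each list comprehension or loop over gid stands for one kernel launch with one thread per segment. This is a sketch under stated assumptions, not the implementation of the embodiment: the standard deviation formula is available only as an image, so the population form (division by the number of samples) is assumed; the condition temp_l≥0.6 is read as a floor on the lower threshold; and the function names are hypothetical.

```python
import math


def segment_std(ecg, gid, seg_len):
    """Per-thread work of S503: standard deviation of segment gid (population form assumed)."""
    seg = ecg[gid * seg_len:(gid + 1) * seg_len]
    mean = sum(seg) / len(seg)
    return math.sqrt(sum((p - mean) ** 2 for p in seg) / len(seg))


def long_interval_flags(ecg, f, T, low=0.5, high=30.0):
    seg_len = f * T
    n = len(ecg) // seg_len                          # one "thread" per full segment
    st_d = [segment_std(ecg, gid, seg_len) for gid in range(n)]    # S503
    kept = [m for m in st_d if low <= m <= high]                   # S505
    temp_M = sum(kept) / len(kept) if kept else 0.0                # S506, S507
    temp_h = temp_M * 3
    temp_l = max(temp_M / 3.5, 0.6)                  # assumed reading of temp_l >= 0.6
    noiselist = [0 if temp_l <= m <= temp_h else 1 for m in st_d]  # S508
    confirmed = list(noiselist)                                    # S509
    for gid in range(1, n - 1):
        if noiselist[gid - 1] == 1 and noiselist[gid + 1] == 1:
            confirmed[gid] = 1
    return st_d, confirmed
```

For example, with five 4-sample segments whose standard deviations are 1, 100, 1, 100 and 1, the mean of the kept values is 1, the new range is [0.6, 3], and the middle segment is promoted to noise because both of its neighbours are noise.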
本发明实施例中,长间期伪差剔除算法通过在GPU设备端并行执行,可以显著降低滤波处理阶段的耗时。
一个实施例中,短间期伪差剔除算法可以采用串行算法。短间期伪差剔除可在长间期伪差剔除和梳状滤波器滤波后对心电信号进行伪差剔除处理。主要是依据长间期伪差剔除产生的噪声标记数组noiselist,从心电信号数据中没有被标记为噪声的数据段中,剔除更小间期T1的噪声,间期T1的选取依赖于长间期伪差剔除的间期T,即T可以是T1的倍数,例如长间期伪差剔除的间期T的值为5,短间期伪差剔除的间期T1的值为1,即分别表示5秒间期和1秒间期伪差。在短间期伪差剔除过程中产生的噪声序列记为noise[length],数组长度length=ecgnum/T1
图6是本发明实施例中的短间期伪差剔除串行执行算法的流程示意图。如图6所示,短间期伪差剔除串行执行算法包括步骤:
S601:初始化环境变量。
S602:依次从长间期伪差剔除结果中的噪声序列noiselist中读数据noiselist[i]。
S603:如果数据noiselist[i]=1,即表示ecglist[i*T*f]~ecglist[(i+1)*T*f-1]的原始心电信号数据已经被标记为噪声段,那么在数组temp_noise中依次写入T/T1个0,即不对该噪声段所对应的心电信号数据做短间期伪差剔除处理。
S604:如果数据noiselist[i]=0,从数据noiselist[i]在数组ecglist中所对应的下标i*T*f开始依次读取数据,并计算公式Si,j的值,并将Si,j值写入temp_noise[i*T/T1+j]中:
Figure PCTCN2015082040-appb-000012
其中,
Figure PCTCN2015082040-appb-000013
T是T1的整数倍,pn是第n个心电信号数据的值。
S605:重复步骤S602、S603、S604,直到噪声序列noiselist中的数据全部读完。
S606:对新得到的噪声序列temp_noise,求其中所有大于零的值的均值temp_S,并由此得到新的阈值范围(0,5*temp_S]。
S607:根据步骤S606中所得到的新的阈值范围进行判断,如果数据temp_noise[i]在该新的阈值范围内,则数据temp_noise[i]对应的心电信号数据不是噪声,即noise[i]=0,反之,则判断为噪声,noise[i]=1;返回新的噪声序列noise;重复该步骤直到数组temp_noise中的所有数据被读完。
S608:短间期伪差剔除算法结束。
通过对短间期伪差剔除串行算法分析,发明人发现,短间期伪差剔除算法一共有三个大循环。
第一个循环的功能是对长间期伪差剔除结果中的非噪声段数据进行更短间期的伪差剔除。具体可分为两个部分:第一,从长间期伪差剔除后产生的噪声序列中筛选出非噪声段数据;第二,对非噪声段数据以更小的间期T1分段进行计算。但是,在对上述第一个循环进行并行执行改进时,可不按照上述两个部分的串行顺序执行。可首先对所有原始心电信号数据按照间期T1进行分段计算,即启动相应数量的线程资源,每个线程处理的数据集大小为f*T1,并将计算的结果按照线程的索引号写入显存。然后,启动线程资源,线程的数量与长间期伪差剔除结果中的噪声序列长度相等,每个线程按照其索引号与长间期伪差剔除所得到的噪声序列中的数据一一对应。再依据噪声序列中的数据对上一步计算的结果进行矫正。
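The second loop computes the mean of all values greater than 0 in the result sequence produced by the first loop. In the parallel design, the values greater than 0 are first filtered serially and written into a new temporary array; the reduction-sum kernel function is then called to compute the sum of that temporary array, from which the mean of all values greater than zero is obtained.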
第三个循环的功能是根据前两个循环的计算结果得到新的噪声序列,对它的并行设计是启动相应数量的线程资源,每个线程按照其索引号判断该索引号对应的数据是否为噪声,并将结果写入全局显存。
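FIG. 7 is a flowchart of the parallel short-interval artifact removal algorithm in an embodiment of the present invention. As shown in FIG. 7, the parallel short-interval artifact removal algorithm includes the following steps: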
S701:声明所述GPU设备端的变量并为其分配相应显存,将所述长间期伪差剔除后的心电信号从主机端拷贝至所述GPU设备端的全局内存。
S702:根据短间期伪差剔除的间期T1对所述心电信号进行分段,每个所述线程调用所述GPU设备端的kernel函数,根据所述索引号读取一段心电信号,计算公式S的值,根据所述索引号将所述公式的值存储至所述全局内存中一第二噪声序列,其中,公式S为:
Figure PCTCN2015082040-appb-000014
其中,sum是所有所述段心电信号中心电信号的平方和。
S703:修改所述线程的数量及其对应处理的心电信号的采样点个数,每个修改后的线程重新调用kernel函数,根据所述修改后的线程的索引号读取其唯一对应的所述长间期伪差剔除的结果的第一噪声序列中的一个标记值,并通过一第三标记标出显示是噪声的所述标记值所对应段的心电信号。
S704:根据所述第三标记串行筛除显示是噪声的所述标记值所对应段的心电信号。
S705:所述修改后的线程调用所述GPU设备端中的规约求和kernel函数,计算经过筛除后剩余的所有心电信号的和值。
S706:根据所述和值求得均值,根据所述均值生成一第二阈值范围。
S707:再次修改所述线程的数量及其对应处理的心电信号的数量,再次修改后的所述线程再次重新调用kernel函数,根据再次修改后的所述线程的索引号从所述第二噪声序列读取并剔除每个不在所述第二阈值范围内的所述公式的值,生成短间期伪差剔除结果。
S708:将所述短间期伪差剔除结果从所述GPU设备端拷贝至所述主机端。
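In step S702 above, the ECG signal is first segmented according to the short-interval artifact removal interval, and short-interval artifact removal is then performed on the segments in parallel. This reduces the complexity of the parallel program and the programming effort.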
在上述步骤S702中,每个所述线程处理的心电信号的采样点个数可为f*T1,所述线程的数量可为
Figure PCTCN2015082040-appb-000015
其中,L为从所述主机端拷贝的心电信号的采样点序列长度,f为心电信号的采样点频率。可设定ThreadsPerBlock个所述线程对应一个线程块,每个所述线程块对应的心电信号的采样点个数可由上述公式(1)得到,即可以是:
DataPerBlock=f*T*ThreadsPerBlock,
线程块的数量可由上述公式(2)得到,即可以是:
BlockNum=(L+DataPerBlock-1)/DataPerBlock。
在上述步骤S703中,修改后的线程的数量可为
Figure PCTCN2015082040-appb-000016
其中,L为从所述主机端拷贝的心电信号的采样点序列长度,f为心电信号的采样频率,T为所述长间期伪差剔除的间期,可使DataPerBlock1个修改后的线程对应一个线程块,线程块的数量可根据上述公式(2)计算得到,可以是:
BlockNum1=(L1+DataPerBlock1-1)/DataPerBlock1,
其中,L1为所有所述长间期伪差剔除的结果的第一噪声序列的长度。
在上述步骤S707中,再次修改后的线程的数量可为
Figure PCTCN2015082040-appb-000017
每个再次修改后的线程可对应处理一个所述公式的值,可设定ThreadsPerBlock2个所述再次修改后的线程对应一个线程块,所述线程块的数量可根据上述公式(2)计算得到,可以是:
BlockNum2=(n+ThreadsPerBlock2-1)/ThreadsPerBlock2,
L为从所述主机端拷贝的心电信号的采样点序列长度。短间期伪差剔除结果可以包括心电数据、噪声/非噪声标记、S的值。
本发明实施例中,短间期伪差剔除算法采用并行执行算法,可以减小滤波阶段的耗时,提高心电信号分析的速度。
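In the embodiments of the present invention, R-wave position extraction may use a mathematical morphology transform algorithm, specifically its erosion and dilation operations: the data are first eroded, and the eroded data are then dilated. The erosion may be performed by a serial algorithm. In the erosion, the erosion window size is set to 5 and the sliding step to 1. Each time, one value is read from the ECG signal data ecglist together with the four values that follow it. The five values are then eroded to different degrees, and the minimum of the results is written into the temporary array f0. The dilation is performed on the temporary array f0 produced by the erosion. In the dilation, the window size may likewise be set to 5 and the sliding step to 1. Each time, one value is read from array f0 together with the four values that follow it. The five values are then dilated to different degrees, and the maximum of the results is written into the temporary array f1.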
图8是本发明一实施例中的短间期伪差剔除的并行执行算法的流程示意图。如图8所示,短间期伪差剔除的并行执行算法包括步骤:
S801:声明设备端所需变量,并为之分配相应的显存,然后将内核函数所需相关数据从主机端拷贝到设备端。
S802:定义线程块中线程的数量ThreadsPerBlock和每个线程处理的数据量DataPerThread=f*T1,然后依据上述公式(1)和公式(2),计算得到需要启动的线程块的数量,启动kernel_1函数,每个线程按照其索引号gid从心电数据中读取下标为gid*f*T1至(gid+1)*f*T1-1之间的数据,并依据公式
Figure PCTCN2015082040-appb-000018
将计算的结果写入临时数组noise中线程索引号gid对应的下标的位置,其中sum是该线程对应的数据集中的数据的平方和。
S803:修改线程块中线程的数量ThreadsPerBlock,以及每个线程处理的数据量DataPerThread,以及数据总量
Figure PCTCN2015082040-appb-000019
(相关变量在长间期伪差剔除中已定义,这里的n为长间期伪差剔除所产生的噪声数组的长度),依据上述公式(1)、(2)计算得到线程块的数量,启动kernel_2函数,由于线程的数量等于长间期伪差剔除中产生的噪声数组noiselist的长度,即在kernel_2中,每个线程按照线程索引号可以唯一对应noiselist中的一个数据,所以kernel_2中每个线程按照线程索引号从noiselist读取相应的数据,如果noiselist[gid]=1,则将数组noise中下标为gid*T到(gid+1)*T-1值置零。
S804:串行将数组noise中的零值筛除,并产生一个新的无零值的临时数组temp_noise,调用规约求和kernel函数计算临时数组的均值。
S805:根据临时数组的均值,得到新的阈值范围。
S806:修改每个线程块的线程数量ThreadsPerBlock,依据数组noise的长度
Figure PCTCN2015082040-appb-000020
计算得到需要启动的线程块的数量BlockNum,调用kernel_3函数,每个线程从全局内存中读取noise[gid],并与S805步骤产生的阈值范围比较,判断是否噪声,最后生成短间期伪差序列noiselist_2。
S807:将相关数据从设备端拷贝到主机端,释放无用的设备端变量,回收设备端显存和主机端内存。
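The three kernels of steps S802 to S806 can be emulated on the CPU as follows. This is a hedged Python sketch, not the implementation of the embodiment. Formula S is available only as an image, so the root mean square used here (`rms`) is a hypothetical stand-in that can be replaced through the `score` parameter. The threshold range (0, 5*temp_S] is taken from step S606 of the serial algorithm, and all function names are illustrative.

```python
import math


def rms(sum_sq, n):
    """Hypothetical stand-in for formula S, which the text gives only as an image."""
    return math.sqrt(sum_sq / n)


def short_interval_flags(ecg, noiselist, f, T, T1, score=rms):
    seg = f * T1                       # samples handled by one kernel_1 thread
    ratio = T // T1                    # short segments per long segment
    n = len(ecg) // seg
    # kernel_1: one thread per short segment computes S from the sum of squares
    noise = [score(sum(p * p for p in ecg[g * seg:(g + 1) * seg]), seg)
             for g in range(n)]
    # kernel_2: one thread per long-interval flag zeroes the segments it covers
    for gid, flag in enumerate(noiselist):
        if flag == 1:
            for k in range(gid * ratio, min((gid + 1) * ratio, n)):
                noise[k] = 0.0
    # S804, S805: serial removal of zeros, mean, threshold range (0, 5 * temp_S]
    temp_noise = [s for s in noise if s > 0]
    temp_S = sum(temp_noise) / len(temp_noise) if temp_noise else 0.0
    # kernel_3: one thread per S value decides noise (1) or not noise (0)
    return [0 if 0 < s <= 5 * temp_S else 1 for s in noise]
```

Segments already marked by the long-interval stage end up flagged as noise here as well, because their zeroed value lies outside the half-open range (0, 5*temp_S].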
一个实施例中,R波位置提取算法采用串行算法。图9是本发明实施例中的R波位置提取串行执行算法的流程示意图。如图9所示,R波位置提取串行执行算法包括步骤:
S901:定义膨胀运算和腐蚀运算的窗口大小w=5和梯度k[w]={0,50,100,50,0}。
S902:依次从心电信号数据ecglist中读一个心电信号数据ecglist[i],以及其后的四个心电信号数据ecglist[i+1]、ecglist[i+2]、ecglist[i+3]、ecglist[i+4],分别与S901中对应的梯度进行不同程度的腐蚀运算,得到最小值,写入数组f0[i]中。
S903:重复步骤S902,直到数组ecglist只剩下4个数据。
S904:对于心电信号数据ecglist中的最后4个心电信号数据ecglist[num_read-w+1]、ecglist[num_read-w+2]、ecglist[num_read-w+3]、ecglist[num_read-w+4]做最大程度的腐蚀运算,并将腐蚀运算的结果写入f0[num_read-w+1]、f0[num_read-w+2]、f0[num_read-w+3]、f0[num_read-w+4]中。
S905:腐蚀运算结束。
S906:对腐蚀运算得到的结果集数组f0进行膨胀运算,依次从数组f0中读一个数据f0[i],以及其后的数据f0[i+1]、f0[i+2]、f0[i+3]、f0[i+4],分别与S901中对应的梯度进行相应程度的膨胀运算,得到最大值,写入数组f1[i]。
S907:重复步骤S906,直到数组f0中剩下最后四个数据。
S908:对于S907中剩下的最后四个数据,分别做最大程度的膨胀运算,即将ecglist[num_read-w+1]、ecglist[num_read-w+2]、ecglist[num_read-w+3]、ecglist[num_read-w+4]分别加100后写入f1[num_read-w+1]、f1[num_read-w+2]、f1[num_read-w+3]、f1[num_read-w+4]中。
S909:膨胀运算结束。
最后,计算数据ecglist[i]与数据f1[i]之间的差值,并计算所有差值的均值,将该均值写入全局显存s1[i]中。返回计算的结果,数学形态学变换算法结束。
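Analysing the serial R-wave position extraction algorithm, the inventors found that it does not apply the morphological dilation and erosion to all ECG data at once but proceeds in batches. Each time, a certain number (num_read) of samples is read from the filtered ECG data, and dilation and erosion are applied to that portion. Logical decisions based on the results then yield all R-wave positions in that portion. The parallel version of the R-wave position extraction algorithm uses a "one-to-one" mapping between threads and samples.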
每次调用内核函数进行膨胀运算和腐蚀运算时,启动与数据量等量的线程资源,使得每个线程按照其索引号可以找到唯一的心电数据点。然后,对该数据以及其后相邻的4个数据进行膨胀运算和腐蚀运算,并将计算结果按照线程的索引号写入全局显存。在该并行化执行过程中,发明人进一步发现,程序会多次调用内核函数,并且每次调用之前将本次内核函数所需的数据从CPU端内存写入GPU端显存,CPU和GPU之间频繁的数据传输会严重影响程序的执行效率。下述实施例中,发明人对此做了进一步优化。
图10是本发明实施例的R波位置提取的并行执行算法的流程示意图。如图10所示,R波位置提取的并行执行算法包括步骤:
S1001:声明所述GPU设备端的变量并为其分配相应显存,将滤波处理后的所述心电信号从主机端的内存拷贝到所述GPU设备端的全局内存。
S1002:每个所述线程调用kernel函数,根据其索引号对应从滤波处理后的所述心电信号中读取一个待检波心电信号。
S1003:所述线程根据设定窗口大小w和一设定梯度,对读取的所述待检波心电信号及其后紧邻的w-1个待检波心电信号进行不同程度的腐蚀运算,将腐蚀运算后的读取的所述待检波心电信号及所述w-1个待检波心电信号中的最小值根据所述索引号存储至所述全局内存中的一第一临时序列,w为整数,w≥2。
S1004:每个所述线程根据所述索引号从所述第一临时序列中读取一个所述最小值。
S1005:根据所述设定窗口大小w和所述设定梯度,对读取的所述最小值及其后紧邻的w-1个所述最小值进行不同程度的膨胀运算,并根据所述索引号将膨胀运算后的读取的所述最小值及其后紧邻的w-1个所述最小值中的最大值存储至所述全局内存中的一第二临时序列。
S1006:根据所述索引号读取并计算所述待检波心电信号和所述最大值的差值,并将所述差值存储至所述全局内存中的一第三临时序列。
S1007:所述线程调用规约求和kernel函数计算所述第三临时序列中的所有所述差值的和值。
S1008:根据所述和值求均值。
S1009:将所述均值从所述GPU设备端拷贝至所述主机端。
本发明实施例中,R波位置提取的并行执行算法中,将经过滤波处理的心电数据全部写入GPU显存,并设置标志位标记读位置,内核函数在执行时依据标志位从显存中读取本次所需的心电数据同时更新标志位。该方法将数据的准备从CPU端迁移到了GPU,使得在此过程中从CPU内存向GPU显存中写数据只发生一次,大大减少了数据传输耗时。
本发明实施例中,根据步骤S1008得到均值后,可以根据该均值在主机端继续计算得到R波的位置。
在上述步骤S1003中,还可以将腐蚀运算后的读取的所述待检波心电信号及所述w-1个待检波心电信号中的最小值存储至一寄存器中。在上述步骤S1004中,从该寄存器中读取的一个最小值,再从全局内存中读取与该最小值相邻的w-1个最小值。
如此一来,因为同时从全局内存和寄存器中读取数据,可以进一步提高R波位置提取的速度。
在上述步骤S1003中,设定窗口大小和设定梯度可以是多种不同设定值,例如,该设定窗口大小w=5,该设定梯度为k[w]={0,50,100,50,0}。
在上述步骤S1002中,该线程的数量可以等于从主机端拷贝的心电信号的采样点序列长度L3,可设定ThreadsPerBlock3个该线程对应一个线程块,该线程块的数量可以是:
BlockNum3=(L3+ThreadsPerBlock3-1)/ThreadsPerBlock3。
如此可降低并行算法的复杂度。
图11是本发明一实施例的R波位置提取的并行执行算法的流程示意图。如图11所示,R波位置提取的并行执行算法包括步骤:
S1101:根据数据规模和GPU本身参数限制,确定所需启动的线程块数量BlockNum和每个线程块中线程的数量threadPerBlock,使得总的线程数量等于数据量。
S1102:调用kernel_1函数,在该内核函数中,每个线程依据其索引号gid从全局显存中读取相应的心电数据以及其后的四个数据,然后对这五个数据分别进行不同程度的腐蚀运算,并将得到的最小值再次依据线程的索引号写入临时的d_f0[gid]和寄存器中。
S1103:同步线程,使得所有线程均已完成第S1102步骤中的腐蚀运算。
S1104:腐蚀运算完成,依据腐蚀运算得到的结果进行膨胀运算。
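S1105: In this kernel function, each thread first reads d_f0[gid] from the register, then reads the four adjacent values d_f0[gid+1], d_f0[gid+2], d_f0[gid+3] and d_f0[gid+4] from device memory according to its index. It dilates these five values to different degrees and writes the resulting maximum, according to its index, into the temporary array d_f1[gid] in global memory.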
S1106:膨胀运算完成,进行其他运算。
S1107:调用kernel_2函数,在该内核函数中,计算数组中d_ecglist[gid]与腐蚀膨胀运算得到的结果d_f1[gid]之间的差值,并写入d_s1[gid]中。
S1108:调用归约求和内核函数,对数组d_s1中的数据进行归约求和,并将最终的结果写入数组d_sum_s1。
S1109:计算数组s1中数据集的均值d_mean。
S1110:将计算结果从设备端拷贝到主机端,释放不用的变量回收内存。
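The erosion, dilation and difference steps (S1102 to S1109) can be emulated on the CPU with one loop iteration per thread. The Python sketch below makes several assumptions that the text does not state explicitly: erosion is taken as the minimum of x[gid+j]-k[j] and dilation as the maximum of x[gid+j]+k[j], which is the usual grayscale morphology and matches the "+100" of step S908; the last w-1 samples receive the maximum degree, max(k); and the dilation tail is applied to the eroded array. The function names are hypothetical.

```python
K = [0, 50, 100, 50, 0]        # gradient k[w] from the text, window size w = 5


def erode(x, k=K):
    w, n = len(k), len(x)
    out = []
    for gid in range(n):                      # one "thread" per sample
        if gid + w <= n:                      # full window available
            out.append(min(x[gid + j] - k[j] for j in range(w)))
        else:                                 # last w-1 samples: maximum degree
            out.append(x[gid] - max(k))
    return out


def dilate(x, k=K):
    w, n = len(k), len(x)
    out = []
    for gid in range(n):
        if gid + w <= n:
            out.append(max(x[gid + j] + k[j] for j in range(w)))
        else:
            out.append(x[gid] + max(k))
    return out


def mean_difference(ecg, k=K):
    f0 = erode(ecg, k)                        # kernel_1, erosion part
    f1 = dilate(f0, k)                        # dilation part
    s1 = [e - d for e, d in zip(ecg, f1)]     # kernel_2: per-sample difference
    return s1, sum(s1) / len(s1)              # reduction sum, then mean
```

On a flat signal the difference sequence is all zeros, while a single tall spike survives at its own index, which is why the difference sequence is useful for locating R waves.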
本发明实施例中,QRS波群起始终止位置提取可采用峰值群确定的方法获得。寻找R波附近的波峰和波谷,并根据一设定阈值,确定QRS波群的边缘位置。QRS波群起始位置的提取所需的特征参数依赖于R波位置的提取。QRS波群宽度的计算主要依靠两个数组:数组QRS_startlist和数组QRS_endlist。数组QRS_startlist存储了QRS波群的起始点,数组QRS_endlist存储了QRS波群的终止点,所以QRS波的宽度即为二者之间的差值。
一个实施例中,QRS波群宽度提取采用串行算法执行。图12是本发明实施例中的QRS波群宽度提取的串行执行算法的流程示意图。如图12所示,在串行算法执行过程中,依次从QRS波群的起始点数组QRS_startlist和QRS波群的终止点数组QRS_endlist中读取QRS波群的起始位置QRS_startlist[i]和终止位置QRS_endlist[i],求取起始位置QRS_startlist[i]和终止位置QRS_endlist[i]的差值QRSlist,再根据设定阈值80确定是否记录该差值QRSlist。
另一个实施例中,QRS波群宽度提取采用并行算法执行。依据数组QRS_startlist和数组QRS_endlist的长度启动相应数量的线程资源,然后每个线程依据其索引gid分别从数组QRS_startlist和数组QRS_endlist读取相应的数据并行计算二者之间的差值,即可得到相应的QRS波群的宽度,最后再依据线程的索引将QRS波群的宽度写入全局显存。
图13是本发明实施例中的QRS波群宽度提取的并行执行算法的流程示意图。如图13所示,QRS波群宽度提取的并行执行算法包括步骤:
S1301:声明所述GPU设备端的变量并为其分配相应显存,将QRS波群起止位置提取的结果从主机端拷贝到所述GPU设备端的全局内存。
S1302:所述线程调用所述GPU设备端的kernel函数,根据其索引号读取所述QRS波群起止位置提取的结果中的起始位置和终止位置,计算所述起始位置和终止位置的差值,根据所述索引号将所述差值存储至所述全局内存,生成QRS波群宽度提取的结果。
S1303:将所述QRS波群宽度提取的结果从所述GPU设备端拷贝至所述主机端。
本发明实施例中,可在主机端根据峰值群确定法获得QRS波群的所述起始位置和所述终止位置。以使不易于并行执行的步骤在主机端串行执行,减少在CPU和GPU之间拷贝数据的耗时。
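In step S1302 above, the number of thread blocks to launch, blocknum, and the number of threads per block, threadPerBlock, may be determined from the data size and the limits of the GPU itself. The total number of threads launched may be blocknum*threadPerBlock. According to its index gid, each thread reads the start position QRS_startlist[gid] and the end position QRS_endlist[gid] from the arrays QRS_startlist and QRS_endlist in global memory, computes the QRS complex width, and writes the result to QRSlist[gid], again according to its index.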
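Because blocknum*threadPerBlock is usually rounded up past the number of QRS complexes, surplus threads must not touch memory. The Python sketch below emulates this launch on the CPU. The bounds check and the function names are illustrative assumptions and are not quoted from the text.

```python
def qrs_width_kernel(gid, qrs_start, qrs_end, qrs_width):
    """Work of one thread: width of the gid-th QRS complex."""
    if gid < len(qrs_start):                  # guard: surplus threads do nothing
        qrs_width[gid] = qrs_end[gid] - qrs_start[gid]


def launch_qrs_width(qrs_start, qrs_end, thread_per_block=4):
    n = len(qrs_start)
    blocknum = (n + thread_per_block - 1) // thread_per_block   # ceiling division
    out = [0] * n
    for gid in range(blocknum * thread_per_block):   # emulates the whole grid
        qrs_width_kernel(gid, qrs_start, qrs_end, out)
    return out


print(launch_qrs_width([10, 100, 205], [30, 125, 228]))
```

Each thread performs a single subtraction, so the run time is dominated by the host-to-device copies. This is consistent with the speedup close to 1 reported for this step in Table 1.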
本发明实施例中,所用到的QRS波群的起始位置和终止位置可在主机端计算得到。
本发明实施例中,异常波形分类串行算法可分为四步:
第一步,对十三种或多种类型的心拍逐个判定。
第二步,针对被噪声严重污染的数据段、波幅过小的心搏以及对房性早搏和室性早搏的判定。
第三步,针对漏搏及停搏的判定。
第四步,针对所有异常波,使用模板比对方法。
通过对上述异常波形分类的四个步骤的算法进行分析,发明人发现,第四步模板对比耗时占了整个算法的95%以上,所以,可对耗时最多的部分进行并行加速。模板比对方法是在每个异常心拍附近建立QRS模板。模板比对方法的核心部分耗时最多,该核心部分是生成标识序列。标识序列是依据RR间期RRlist生成。其中标识包括-1、0、1三个值。
图14是本发明实施例中的创建模板的并行执行算法的流程示意图。如图14所示,创建模板的并行执行算法包括步骤:
S1401:声明所述GPU设备端的变量并为其分配相应显存,将R波位置提取的结果从主机端拷贝到所述GPU设备端的全局内存。
S1402:所述线程调用kernel函数,根据其索引号读取所述R波位置提取的结果中的RR间期,根据一设定准则获取每个所述RR间期的标识符,并根据所述索引号将所述标识符存储至所述全局内存中,生成创建模板的结果。
S1403:将所述创建模板的结果从所述GPU设备端拷贝到所述主机端。
本发明实施例中,RR间期可在主机端由R波位置提取结果计算得到。
在上述步骤S1402中,该设定准则可包括:
(1)如果与第i个RR间期RRlist[i]相邻的第i+1个RR间期RRlist[i+1]不在区间(0.6*RRlist[i],1.5*RRlist[i])内,则所述RR间期数据RRlist[i]的标识符为-1,其中,i为整数,i≥0。
(2)如果所述RR间期RRlist[i+1]在区间(0.6*RRlist[i],1.5*RRlist[i])内,及所述RR间期RRlist[i]不在区间(0.8*RRmean,1.3*RRmean)内,则所述RR间期RRlist[i]的标识符为0,其中,RRmean为所有所述RR间期的平均值。
(3)如果所述RR间期RRlist[i+1]在区间(0.6*RRlist[i],1.5*RRlist[i])内,及所述RR间期RRlist[i]在区间(0.8*RRmean,1.3*RRmean)内,则所述RR间期RRlist[i]的标识符为1。
在上述步骤S1402中,可根据数据规模和GPU本身参数限制,确定所需启动的线程块数量blocknum和每个线程块中线程的数量threadPerBlock。启动线程的个数可为blocknum*threadPerBlock,每个线程按照其索引从全局显存中读取相应的RR间期数据,并依据上述三条准则,得到相应的标识符,并按照线程索引将结果写入d_FLAGRlist[gid]中。
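The three criteria translate directly into a per-thread function. In the Python sketch below, the handling of the last RR interval, which has no successor, is an assumption (it is skipped), and the function names are hypothetical. The open intervals follow the text.

```python
def rr_flag(rrlist, i, rr_mean):
    """Identifier of RRlist[i] according to criteria (1) to (3)."""
    cur, nxt = rrlist[i], rrlist[i + 1]
    if not (0.6 * cur < nxt < 1.5 * cur):             # criterion (1)
        return -1
    if not (0.8 * rr_mean < cur < 1.3 * rr_mean):     # criterion (2)
        return 0
    return 1                                          # criterion (3)


def make_flags(rrlist):
    """One 'thread' per RR interval that has a successor."""
    rr_mean = sum(rrlist) / len(rrlist)
    return [rr_flag(rrlist, gid, rr_mean) for gid in range(len(rrlist) - 1)]


print(make_flags([800, 810, 400, 790, 1100, 800]))
```

Each thread reads only RRlist[gid], RRlist[gid+1] and the shared mean, so the threads do not depend on one another's output and can run without synchronization.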
模板是在异常心拍附近创建,即当发现异常心拍时,就需要调用模板创建函数创建模板,因此当异常波形非常多时,就会在该阶段消耗大量的时间。本发明实施例中,不需在每次创建模板时都重新生成标识序列,所以本发明实施例通过对标识序列的生成和模板的创建进行优化,可显著减少异常波形分类的耗时。
本发明实施例的并行心电信号分析方法始终保持与串行心电信号分析方法的正确率一致。使用24小时长程心电信号数据对串行和并行算法进行多次测试,然后取其均值得到心电信号分析各个阶段的加速比,如表1所示。
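Table 1. Serial run time, parallel run time and corresponding speedup
Algorithm | Serial run time (ms) | Parallel run time (ms) | Speedup
Long-interval artifact removal | 392 | 54 | 7.3
Short-interval artifact removal | 261 | 44 | 5.9
Mathematical morphology transform | 1685 | 160 | 10.5
QRS complex width detection | 7 | 6.8 | 1.0
Arrhythmia waveform classification and detection | 1562 | 30.2 | 48.5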
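The data in Table 1 show the following. In the filtering stage, the serial long-interval artifact removal algorithm takes 392 ms on average and the parallel, optimized version 54 ms, a speedup of 7.3; the serial short-interval artifact removal algorithm takes 261 ms on average and the parallel version 44 ms, a speedup of 5.9. In the QRS detection stage, the serial morphological dilation and erosion take 1685 ms and the parallel version 160 ms on average, a speedup of 10.5; the serial QRS complex width calculation takes 7 ms and the parallel version 6.8 ms on average, a speedup close to 1. The serial arrhythmia waveform classification algorithm takes 1562 ms on average and the parallel version 30.2 ms, a speedup of 48.5.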
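Comparing the run times of each algorithm before and after parallel acceleration shows that the speedup is larger when the serial run time of an algorithm is long; when the serial run time is very short, as for the QRS complex width calculation, the speedup is close to 1.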
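The parallel algorithm of the embodiments was run multiple times on ECG sample sets of different durations to obtain the average analysis times shown in Table 2. Table 2 shows that the more abnormal waveforms there are, the longer the analysis takes, and that the parallel ECG signal analysis method adapts well to ECG data files of different durations. The analysis time stays within 2 seconds in all cases; the average analysis time for 24 hours of ECG data is 1.9 seconds. Assuming that the ECG data uploaded to the home health cloud platform is evenly distributed over the day, one "CPU+GPU" server can analyse 44920 24-hour long-term ECG records per day in real time. An algorithm server cluster of only 7 "CPU+GPU" servers deployed on the home health cloud platform is therefore enough to meet the demand for fast analysis of 300,000 24-hour long-term ECG records per day.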
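Table 2. Analysis time for ECG data of different durations
ECG duration (h) | 1 | 2 | 4 | 8 | 12 | 16 | 20 | 24
File size (MB) | 0.5 | 1 | 2.1 | 4.1 | 6.2 | 8.3 | 10.3 | 12.4
Analysis time (ms) | 105 | 183 | 345 | 676 | 1023 | 1364 | 1644 | 1932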
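The capacity estimate that accompanies Table 2 can be rechecked with a few lines of arithmetic. The Python sketch below assumes back-to-back analysis for a full day at the 24-hour figure from Table 2 (1932 ms); the function name is hypothetical. Under this assumption one server completes about 44,720 records per day, slightly below the 44920 quoted in the text, and 7 servers are needed for 300,000 records per day either way.

```python
import math


def servers_needed(records_per_day, analysis_ms):
    """Records one server finishes per day, and servers needed for the daily load."""
    per_server = (24 * 3600 * 1000) // analysis_ms
    return per_server, math.ceil(records_per_day / per_server)


print(servers_needed(300_000, 1932))
```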
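The present invention uses a "GPU+CPU" heterogeneous parallel system to accelerate the analysis of long-term ECG data. GPU (Graphic Processing Unit) acceleration is first used to parallelize the parallelizable and time-consuming parts of the serial ECG algorithms, accelerating ECG analysis at the instruction level; GPU servers loaded with the parallel ECG analysis algorithm are then deployed on the home health cloud platform to meet the current needs of the home health cloud service platform.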
本发明实施例的基于GPU的并行心电数据分析方法通过将心电信号分析过程中的一个或多个步骤在GPU上并行执行,可以显著提高心电信号的分析速度。
本领域内的技术人员应明白,本发明的实施例可提供为方法、系统、或计算机程序产品。因此,本发明可采用完全硬件实施例、完全软件实施例、或结合软件和硬件方面的实施例的形式。而且,本发明可采用在一个或多个其中包含有计算机可用程序代码的计算机可用存储介质(包括但不限于磁盘存储器、CD-ROM、光学存储器等)上实施的计算机程序产品的形式。
本发明是参照根据本发明实施例的方法、设备(系统)、和计算机程序产品的流程图和/或方框图来描述的。应理解可由计算机程序指令实现流程图和/或方框图中的每一流程和/或方框、以及流程图和/或方框图中的流程和/或方框的结合。可提供这些计算机程序指令到通用计算机、专用计算机、嵌入式处理机或其他可编程数据处理设备的处理器以产生一个机器,使得通过计算机或其他可编程数据处理设备的处理器执行的指令产生用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的装置。
这些计算机程序指令也可存储在能引导计算机或其他可编程数据处理设备以特定方式工作的计算机可读存储器中,使得存储在该计算机可读存储器中的指令产生包括指令装置的制造品,该指令装置实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能。
这些计算机程序指令也可装载到计算机或其他可编程数据处理设备上,使得在计算机或其他可编程设备上执行一系列操作步骤以产生计算机实现的处理,从而在计算机或其他可编程设备上执行的指令提供用于实现在流程图一个流程或多个流程和/或方框图一个方框或多个方框中指定的功能的步骤。
以上所述的具体实施例,对本发明的目的、技术方案和有益效果进行了进一步详细说明,所应理解的是,以上所述仅为本发明的具体实施例而已,并不用于限定本发明的保护范围,凡在本发明的精神和原则之内,所做的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。

Claims (18)

  1. 一种基于GPU的并行心电信号分析方法,其特征在于,所述方法包括:
    通过长间期伪差剔除和短间期伪差剔除对心电信号进行滤波处理;
    通过R波位置提取、QRS波群起止位置提取及QRS波群宽度提取对滤波处理后的所述心电信号进行QRS检波;
    通过创建模板对QRS检波后的所述心电信号进行异常波形分类;
    其中,所述长间期伪差剔除、所述短间期伪差剔除、所述R波位置提取、所述QRS波群宽度提取及所述创建模板中至少有一个通过GPU设备端的多个线程并行执行,所述线程通过其唯一对应的索引号读取并处理其所对应的数据。
  2. 如权利要求1所述的基于GPU的并行心电信号分析方法,其特征在于,所述长间期伪差剔除通过多个所述线程并行执行,包括:
    声明所述GPU设备端的变量并为其分配相应显存,将所述心电信号从主机端拷贝至所述GPU设备端的全局内存;
    根据长间期伪差剔除的间期对所述心电信号进行分段,每个所述线程调用所述GPU设备端的kernel函数,根据所述索引号从所述全局内存读取所述线程对应的一段心电信号,计算该段心电信号的标准差,根据所述索引号将所述标准差存储至所述全局内存中一第一标准差序列;
    剔除标准差不在一设定阈值范围内的所述段心电信号;
    所述线程调用所述GPU设备端的规约求和kernel函数,计算经过剔除后剩余的所有该段心电信号的标准差的和值;
    根据所述和值求均值,并根据所述均值生成第一阈值范围;
    每个所述线程重新调用所述kernel函数,根据所述索引号读取并判断经过剔除后剩余的所述段心电信号的标准差是否在所述第一阈值范围内,根据所述索引号将判断结果存储至所述全局内存中一第一噪声序列;
    每个所述线程根据所述索引号从所述第一噪声序列依次读取所述判断结果及与其相邻的前一个判断结果和后一个判断结果,如果所述前一个判断结果和所述后一个判断结果均为标准差不在所述第一阈值范围内,则将表示是噪声的第一标记根据所述索引号存储至所述第一噪声序列中,反之,则将表示不是噪声的第二标记根据所述索引号存储至所述第一噪声序列中,生成长间期伪差剔除结果;
    将所述长间期伪差剔除结果从所述GPU设备端拷贝至所述主机端。
  3. 如权利要求2所述的基于GPU的并行心电信号分析方法,其特征在于,所述线程的数量
    Figure PCTCN2015082040-appb-100001
    每个所述线程对应处理的心电信号的采样点个数为f*T,其中,L为从所述主机端拷贝的心电信号的采样点序列长度,f为心电信号的采样频率,T为所述长间期伪差剔除的间期。
  4. 如权利要求2所述的基于GPU的并行心电信号分析方法,其特征在于,
    所述标准差
    Figure PCTCN2015082040-appb-100002
    其中,
    Figure PCTCN2015082040-appb-100003
    其中,pj是所述段心电信号中第j个采样点处的心电信号,j为整数,j≥0,f为心电信号的采样频率,T为所述长间期伪差剔除的间期。
  5. 如权利要求2所述的基于GPU的并行心电信号分析方法,其特征在于,所述第一阈值范围的上限和所述第一阈值范围的下限分别为所述均值的3倍和所述均值的1/3.5倍。
  6. 如权利要求2所述的基于GPU的并行心电信号分析方法,其特征在于,所述设定阈值范围为[0.5,3]。
  7. 如权利要求1所述的基于GPU的并行心电信号分析方法,其特征在于,所述短间期伪差剔除通过多个所述线程并行执行,包括:
    声明所述GPU设备端的变量并为其分配相应显存,将所述长间期伪差剔除后的心电信号从主机端拷贝至所述GPU设备端的全局内存;
    根据短间期伪差剔除的间期T1对所述心电信号进行分段,每个所述线程调用所述GPU设备端的kernel函数,根据所述索引号读取一段心电信号,计算公式
    Figure PCTCN2015082040-appb-100004
    的值,根据所述索引号将所述公式的值存储至所述全局内存中一第二噪声序列,其中,sum是所有所述段心电信号中心电信号的平方和;
    修改所述线程的数量及其对应处理的心电信号的采样点个数,每个修改后的线程重新调用kernel函数,根据所述修改后的线程的索引号读取其唯一对应的所述长间期伪差剔除的结果的第一噪声序列中的一个标记值,并通过一第三标记标出显示是噪声的所述标记值所对应段的心电信号;
    根据所述第三标记串行筛除显示是噪声的所述标记值所对应段的心电信号;
    所述修改后的线程调用所述GPU设备端中的规约求和kernel函数,计算经过筛除后剩余的所有心电信号的和值;
    根据所述和值求得均值,根据所述均值生成一第二阈值范围;
    再次修改所述线程的数量及其对应处理的心电信号的数量,再次修改后的所述线程再次重新调用kernel函数,根据再次修改后的所述线程的索引号从所述第二噪声序列读取并剔除每个不在所述第二阈值范围内的所述公式的值,生成短间期伪差剔除结果;
    将所述短间期伪差剔除结果从所述GPU设备端拷贝至所述主机端。
  8. 如权利要求7所述的基于GPU的并行心电信号分析方法,其特征在于,还包括:
    每个所述线程处理的心电信号的采样点个数为f*T1,所述线程的数量为
    Figure PCTCN2015082040-appb-100005
    其中,L为从所述主机端拷贝的心电信号的采样点序列长度,f为心电信号的采样点频率;
    设定ThreadsPerBlock个所述线程对应一个线程块,每个所述线程块对应的心电信号的采样点个数为DataPerBlock=f*T*ThreadsPerBlock,所述线程块的数量为BlockNum=(L+DataPerBlock-1)/DataPerBlock。
  9. 如权利要求7所述的基于GPU的并行心电信号分析方法,其特征在于,还包括:
    修改后的线程的数量
    Figure PCTCN2015082040-appb-100006
    其中,L为从所述主机端拷贝的心电信号的采样点序列长度,f为心电信号的采样频率,T为所述长间期伪差剔除的间期,DataPerBlock1个修改后的线程对应一个线程块,所述线程块的数量为BlockNum1=(L1+DataPerBlock1-1)/DataPerBlock1,其中,L1为所有所述长间期伪差剔除的结果的第一噪声序列的长度。
  10. 如权利要求6所述的基于GPU的并行心电信号分析方法,其特征在于,还包括:
    再次修改后的线程的数量为
    Figure PCTCN2015082040-appb-100007
    每个所述再次修改后的线程对应处理一个所述公式的值,设定ThreadsPerBlock2个所述再次修改后的线程对应一个线程块,所述线程块的数量BlockNum2=(n+ThreadsPerBlock2-1)/ThreadsPerBlock2,L为从所述主机端拷贝的心电信号的采样点序列长度。
  11. 如权利要求1所述的基于GPU的并行心电信号分析方法,其特征在于,所述R波位置提取通过多个所述线程并行执行,包括:
    声明所述GPU设备端的变量并为其分配相应显存,将滤波处理后的所述心电信号从主机端的内存拷贝到所述GPU设备端的全局内存;
    每个所述线程调用kernel函数,根据其索引号对应从滤波处理后的所述心电信号中读取一个待检波心电信号;
    所述线程根据设定窗口大小w和一设定梯度,对读取的所述待检波心电信号及其后紧邻的w-1个待检波心电信号进行不同程度的腐蚀运算,将腐蚀运算后的读取的所述待检波心电信号及所述w-1个待检波心电信号中的最小值根据所述索引号存储至所述全局内存中的一第一临时序列,w为整数,w≥2;
    每个所述线程根据所述索引号从所述第一临时序列中读取一个所述最小值;
    根据所述设定窗口大小w和所述设定梯度,对读取的所述最小值及其后紧邻的w-1个所述最小值进行不同程度的膨胀运算,并根据所述索引号将膨胀运算后的读取的所述最小值及其后紧邻的w-1个所述最小值中的最大值存储至所述全局内存中的一第二临时序列;
    根据所述索引号读取并计算所述待检波心电信号和所述最大值的差值,并将所述差值存储至所述全局内存中的一第三临时序列;
    所述线程调用规约求和kernel函数计算所述第三临时序列中的所有所述差值的和值;
    根据所述和值求均值;
    将所述均值从所述GPU设备端拷贝至所述主机端。
  12. 如权利要求11所述的基于GPU的并行心电信号分析方法,其特征在于,所述方法还包括:
    将腐蚀运算后的读取的所述待检波心电信号及所述w-1个待检波心电信号中的最小值存储至一寄存器中;
    从所述寄存器中读取的所述最小值,从所述全局内存中读取所述w-1个所述最小值。
  13. 如权利要求11所述的基于GPU的并行心电信号分析方法,其特征在于,所述设定窗口大小w=5,所述设定梯度为k[w]={0,50,100,50,0}。
  14. 如权利要求11所述的基于GPU的并行心电信号分析方法,其特征在于,所述线程的数量等于从所述主机端拷贝的所述心电信号的采样点序列长度L3,设定ThreadsPerBlock3个所述线程对应一个线程块,所述线程块的数量BlockNum3=(L3+ThreadsPerBlock3-1)/ThreadsPerBlock3。
  15. 如权利要求1所述的基于GPU的并行心电信号分析方法,其特征在于,所述QRS波群宽度提取通过多个所述线程并行执行,包括:
    声明所述GPU设备端的变量并为其分配相应显存,将QRS波群起止位置提取的结果从主机端拷贝到所述GPU设备端的全局内存;
    所述线程调用所述GPU设备端的kernel函数,根据其索引号读取所述QRS波群起止位置提取的结果中的起始位置和终止位置,计算所述起始位置和终止位置的差值,根据所述索引号将所述差值存储至所述全局内存,生成QRS波群宽度提取的结果;
    将所述QRS波群宽度提取的结果从所述GPU设备端拷贝至所述主机端。
  16. 如权利要求15所述的基于GPU的并行心电信号分析方法,其特征在于,在主机端根据峰值群确定法获得QRS波群的所述起始位置和所述终止位置。
  17. 如权利要求1所述的基于GPU的并行心电信号分析方法,其特征在于,所述创建模板通过多个所述线程并行执行,包括:
    声明所述GPU设备端的变量并为其分配相应显存,将R波位置提取的结果从主机端拷贝到所述GPU设备端的全局内存;
    所述线程调用kernel函数,根据其索引号读取所述R波位置提取的结果中的RR间期,根据一设定准则获取每个所述RR间期的标识符,并根据所述索引号将所述标识符存储至所述全局内存中,生成创建模板的结果;
    将所述创建模板的结果从所述GPU设备端拷贝到所述主机端。
  18. 如权利要求17所述的基于GPU的并行心电信号分析方法,其特征在于,所述设定准则包括:
    如果与第i个RR间期RRlist[i]相邻的第i+1个RR间期RRlist[i+1]不在区间(0.6*RRlist[i],1.5*RRlist[i])内,则所述RR间期数据RRlist[i]的标识符为-1,其中,i为整数,i≥0;
    如果所述RR间期RRlist[i+1]在区间(0.6*RRlist[i],1.5*RRlist[i])内,及所述RR间期RRlist[i]不在区间(0.8*RRmean,1.3*RRmean)内,则所述RR间期RRlist[i]的标识符为0,其中,RRmean为所有所述RR间期的平均值;
    如果所述RR间期RRlist[i+1]在区间(0.6*RRlist[i],1.5*RRlist[i])内,及所述RR间期RRlist[i]在区间(0.8*RRmean,1.3*RRmean)内,则所述RR间期RRlist[i]的标识符为1。
PCT/CN2015/082040 2015-06-23 2015-06-23 基于gpu的并行心电信号分析方法 WO2016205993A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN201580000415.1A CN105899268B (zh) 2015-06-23 2015-06-23 基于gpu的并行心电信号分析方法
PCT/CN2015/082040 WO2016205993A1 (zh) 2015-06-23 2015-06-23 基于gpu的并行心电信号分析方法
JP2016576010A JP6389907B2 (ja) 2015-06-23 2015-06-23 Gpuによる心電信号の並列解析方法
US15/389,978 US10258250B2 (en) 2015-06-23 2016-12-23 GPU-based parallel electrocardiogram signal analysis method, computer readable storage medium and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/082040 WO2016205993A1 (zh) 2015-06-23 2015-06-23 基于gpu的并行心电信号分析方法

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/389,978 Continuation US10258250B2 (en) 2015-06-23 2016-12-23 GPU-based parallel electrocardiogram signal analysis method, computer readable storage medium and device

Publications (1)

Publication Number Publication Date
WO2016205993A1 true WO2016205993A1 (zh) 2016-12-29

Family

ID=57002547

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/082040 WO2016205993A1 (zh) 2015-06-23 2015-06-23 基于gpu的并行心电信号分析方法

Country Status (4)

Country Link
US (1) US10258250B2 (zh)
JP (1) JP6389907B2 (zh)
CN (1) CN105899268B (zh)
WO (1) WO2016205993A1 (zh)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106361283A (zh) * 2016-09-06 2017-02-01 四川长虹电器股份有限公司 心音信号优化方法
CN107193657A (zh) * 2017-05-18 2017-09-22 安徽磐众信息科技有限公司 基于solaflare网卡的低延迟服务器
CN107958214A (zh) * 2017-11-21 2018-04-24 中国科学院深圳先进技术研究院 Ecg信号的并行分析装置、方法和移动终端
US10743789B2 (en) * 2017-11-21 2020-08-18 Shenzhen Institutes Of Advanced Technology, Chinese Academy Of Sciences ECG signal parallel analysis apparatus, method and mobile terminal
CN107811631A (zh) * 2017-11-27 2018-03-20 乐普(北京)医疗器械股份有限公司 心电信号质量评估方法
CN108597336B (zh) * 2018-02-28 2021-11-05 天津天堰科技股份有限公司 心电波形仿真方法
WO2019198691A1 (ja) * 2018-04-11 2019-10-17 シャープ株式会社 情報処理装置、およびウェアラブル端末
CN109222964B (zh) * 2018-07-20 2021-02-09 广州视源电子科技股份有限公司 房颤检测装置及存储介质
CN109445970A (zh) * 2018-09-18 2019-03-08 北京工业大学 一种软件可靠性时间序列预测方法及应用
JP7179570B2 (ja) * 2018-10-10 2022-11-29 エヌ・ティ・ティ・コミュニケーションズ株式会社 ノイズ情報取得装置、ノイズ情報取得方法及びコンピュータプログラム
CN109394206B (zh) * 2018-11-14 2021-08-27 东南大学 基于穿戴式心电信号中早搏信号的实时监测方法及其装置
CN109480826B (zh) * 2018-12-14 2021-07-13 东软集团股份有限公司 一种心电信号处理方法、装置及设备
CN109710488A (zh) * 2018-12-14 2019-05-03 北京工业大学 一种基于区块链技术的时间序列生成方法
TWI747057B (zh) * 2019-10-07 2021-11-21 宏碁智醫股份有限公司 心律訊號處理方法、電子裝置及電腦程式產品
CN111956201B (zh) * 2020-07-22 2022-09-06 上海数创医疗科技有限公司 基于卷积神经网络的心拍类型识别方法和装置
CN113180687B (zh) * 2021-04-29 2024-02-09 深圳邦健生物医疗设备股份有限公司 多导联动态心搏实时分类方法、装置、设备及存储介质
CN113520403B (zh) * 2021-06-21 2024-04-02 浙江好络维医疗技术有限公司 一种基于峰谷特征的心电伪差识别方法
CN113420672B (zh) * 2021-06-24 2023-03-14 清华大学 一种基于gpu对脑电信号处理过程进行并行加速的方法
CN113509187B (zh) * 2021-07-12 2023-10-24 广州市康源图像智能研究院 一种实时的单导联心电r波检测方法
CN114027853B (zh) * 2021-12-16 2022-09-27 安徽心之声医疗科技有限公司 基于特征模板匹配的qrs波群检测方法、装置、介质及设备
CN114224353B (zh) * 2022-02-21 2023-03-21 深圳泰和智能医疗科技有限公司 一种基于体温监测仪的心电检测分类方法
CN117257324B (zh) * 2023-11-22 2024-01-30 齐鲁工业大学(山东省科学院) 基于卷积神经网络和ecg信号的房颤检测方法
CN117357130B (zh) * 2023-12-07 2024-02-13 深圳泰康医疗设备有限公司 基于人工智能的心电图数字化曲线分割方法
CN117453421B (zh) * 2023-12-18 2024-03-19 北京麟卓信息科技有限公司 一种基于数据分段的gpu全片存储带宽度量方法

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2385674Y (zh) * 1999-09-30 2000-07-05 信息产业部邮电设计院 心血管血流动力-心电检测仪
CN1907214A (zh) * 2006-08-18 2007-02-07 方祖祥 具有急救及定位功能的便携式远程实时监护仪
US20080215807A1 (en) * 2007-03-02 2008-09-04 Sony Corporation Video data system
US20130267835A1 (en) * 2012-04-10 2013-10-10 Jerome Ranjeev Edwards System and method for localizing medical instruments during cardiovascular medical procedures
CN103654770A (zh) * 2013-12-03 2014-03-26 上海理工大学 移动心电信号qrs波实时波检测方法及装置
CN104462786A (zh) * 2014-11-19 2015-03-25 成都乐乐云科技有限责任公司 一种网络化智能化多参数生命体征监护装置及云平台系统

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH08117196A (ja) * 1994-10-27 1996-05-14 Tochigi Nippon Denki Kk 心電波形入力装置
US5655540A (en) * 1995-04-06 1997-08-12 Seegobin; Ronald D. Noninvasive method for identifying coronary artery disease utilizing electrocardiography derived data
JP5080348B2 (ja) * 2008-04-25 2012-11-21 フクダ電子株式会社 心電計及びその制御方法
US9504427B2 (en) * 2011-05-04 2016-11-29 Cardioinsight Technologies, Inc. Signal averaging
CN102340296B (zh) * 2011-07-21 2013-12-25 东北大学秦皇岛分校 一种基于gpu的高阶数字fir滤波器频域并行处理实现方法
CN102646271B (zh) * 2012-02-22 2014-08-06 浙江工业大学 一种基于cuda的快速图像类比合成方法
KR20140063100A (ko) * 2012-11-16 2014-05-27 삼성전자주식회사 원격 심질환 관리 장치 및 방법
CN103156599B (zh) * 2013-04-03 2014-10-15 河北大学 一种心电信号r特征波检测方法
US10368764B2 (en) * 2013-09-12 2019-08-06 Topera, Inc. System and method to select signal segments for analysis of a biological rhythm disorder
US10226197B2 (en) * 2014-04-25 2019-03-12 Medtronic, Inc. Pace pulse detector for an implantable medical device

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2385674Y (zh) * 1999-09-30 2000-07-05 信息产业部邮电设计院 心血管血流动力-心电检测仪
CN1907214A (zh) * 2006-08-18 2007-02-07 方祖祥 具有急救及定位功能的便携式远程实时监护仪
US20080215807A1 (en) * 2007-03-02 2008-09-04 Sony Corporation Video data system
US20130267835A1 (en) * 2012-04-10 2013-10-10 Jerome Ranjeev Edwards System and method for localizing medical instruments during cardiovascular medical procedures
CN103654770A (zh) * 2013-12-03 2014-03-26 上海理工大学 移动心电信号qrs波实时波检测方法及装置
CN104462786A (zh) * 2014-11-19 2015-03-25 成都乐乐云科技有限责任公司 一种网络化智能化多参数生命体征监护装置及云平台系统

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
XU, JIAQING;: "The Design of Myocardial Infarction Early Aided Diagnosis System Based on GPU", CHINA MASTER'S THESES FULL-TEXT DATABASE, 15 January 2014 (2014-01-15), ISSN: 1674-0246 *

Also Published As

Publication number Publication date
US10258250B2 (en) 2019-04-16
CN105899268A (zh) 2016-08-24
JP2017526404A (ja) 2017-09-14
US20170105643A1 (en) 2017-04-20
JP6389907B2 (ja) 2018-09-12
CN105899268B (zh) 2019-02-15

Similar Documents

Publication Publication Date Title
WO2016205993A1 (zh) GPU-based parallel ECG signal analysis method
CN109948647B (zh) Electrocardiogram classification method and system based on a deep residual network
US20200337580A1 (en) Time series data learning and analysis method using artificial intelligence
Sahu et al. FINE_DENSEIGANET: Automatic medical image classification in chest CT scan using Hybrid Deep Learning Framework
Hills et al. Classification of time series by shapelet transformation
CN113724880A (zh) Abnormal brain connectivity prediction system, method, device, and readable storage medium
Malali et al. Supervised ECG wave segmentation using convolutional LSTM
Ahmed et al. An investigative study on motifs extracted features on real time big-data signals
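JP2022516146A (ja) Cerebrovascular disease learning device, cerebrovascular disease detection device, cerebrovascular disease learning method, and cerebrovascular disease detection method
CN115359576A (zh) Multimodal emotion recognition method and apparatus, electronic device, and storage medium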
Zhao et al. Neuronal population reconstruction from ultra-scale optical microscopy images via progressive learning
Liao et al. Classify autism and control based on deep learning and community structure on resting-state fMRI
Rey-Villamizar et al. Large-scale automated image analysis for computational profiling of brain tissue surrounding implanted neuroprosthetic devices using Python
CN115363599A (zh) ECG signal processing method and system for atrial fibrillation recognition
Ilbeigipour et al. Real-time heart arrhythmia detection using apache spark structured streaming
CN115177262A (zh) Deep-learning-based combined heart sound and ECG diagnosis device and system
Liu et al. Using simulated training data of voxel-level generative models to improve 3D neuron reconstruction
Yin et al. A comprehensive evaluation of multicentric reliability of single-subject cortical morphological networks on traveling subjects
GB2603831A (en) Mobile AI
Lee et al. Adaptive ECG signal compression method based on look-ahead linear approximation for ultra long-term operating of healthcare IoT devices (SCI)
CN113283465B (zh) Diffusion tensor imaging data analysis method and device
CN114224354B (zh) Arrhythmia classification method, device, and readable storage medium
Emrich et al. Accelerated Sample-Accurate R-Peak Detectors Based on Visibility Graphs
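CN113080847B (zh) Device for diagnosing mild cognitive impairment using a graph-based bidirectional long short-term memory model
CN110688414B (zh) Time-series data processing method, device, and computer-readable storage medium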

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2016576010

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15895893

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 16.05.2018)

122 Ep: pct application non-entry in european phase

Ref document number: 15895893

Country of ref document: EP

Kind code of ref document: A1