CN113743609A - Multi-signal-oriented rapid breakpoint detection method, system, equipment and storage medium - Google Patents

Publication number
CN113743609A
Authority
CN
China
Prior art keywords
signal
breakpoint
matrix
detection method
signal matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110997289.9A
Other languages
Chinese (zh)
Other versions
CN113743609B (en)
Inventor
段君博
王青
刘轩宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University
Priority to CN202110997289.9A
Publication of CN113743609A
Application granted
Publication of CN113743609B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/12 Computing arrangements based on biological models using genetic models
    • G06N3/123 DNA computing

Abstract

The invention discloses a multi-signal-oriented rapid breakpoint detection method, system, device and storage medium. The method can quickly and accurately detect breakpoints shared at the same position across multiple signals, thereby providing reliable start and end position information for subsequent signal segmentation, fitting and parameter estimation. The technology is broadly applicable: it can be used in clinical applications such as reproductive health diagnosis, prenatal screening, genetic diagnosis of hereditary diseases in newborns and health monitoring with wearable devices, as well as in other research fields such as archaeology, biology, medicine and engineering, and it is of great significance for improving the physical health of the population and promoting scientific research.

Description

Multi-signal-oriented rapid breakpoint detection method, system, equipment and storage medium
Technical Field
The invention belongs to the technical field of signal breakpoint detection, and particularly relates to a multi-signal-oriented rapid breakpoint detection method, a multi-signal-oriented rapid breakpoint detection system, multi-signal-oriented rapid breakpoint detection equipment and a storage medium.
Background
The conventional breakpoint detection methods mainly include Circular Binary Segmentation (CBS), Optimal Partitioning (OP), and Pruned Exact Linear Time (PELT).
CBS is a copy number variation (CNV) detection method developed for gene chip data and is now also widely used for CNV detection based on high-throughput sequencing (HTS) data; the DNAcopy package in the R language is based on the CBS algorithm. If a chromosome is assumed to contain only one breakpoint and the data follow a normal distribution, a two-sample t-test on the data on either side of a candidate position determines whether a breakpoint is present there. The location of the breakpoint can be found by traversing all possible loci of the chromosome. If the chromosome contains multiple breakpoints, the position of the first breakpoint is determined first, and the segments on either side of it are then processed in the same way to locate the second and third breakpoints; repeating this process yields the so-called binary segmentation. If the signal does not follow a normal distribution, another suitable test can replace the t-test. In short, CBS repeatedly computes the maximum log-likelihood ratio and uses hypothesis testing to join regions with similar values, thereby completing the binary segmentation of the signal.
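The single-breakpoint step described above can be sketched as follows. This is a minimal illustration of scanning all candidate positions with an equal-variance two-sample t statistic, assuming normally distributed data; it is not the DNAcopy implementation, and the function name `best_single_breakpoint` and the `min_size` parameter are hypothetical.

```python
import numpy as np

def best_single_breakpoint(x, min_size=2):
    """Return (position, t) of the split with the largest |t| statistic.

    position is the index of the first sample of the right-hand segment.
    min_size (an assumption of this sketch) keeps both sides long enough
    for a sample variance to be defined.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    best_pos, best_t = None, 0.0
    for i in range(min_size, n - min_size + 1):
        left, right = x[:i], x[i:]
        # Pooled variance of the two sides (equal-variance t-test).
        pooled = ((i - 1) * left.var(ddof=1)
                  + (n - i - 1) * right.var(ddof=1)) / (n - 2)
        if pooled == 0.0:
            continue  # statistic undefined for noise-free sides; skipped here
        t = (left.mean() - right.mean()) / np.sqrt(pooled * (1.0 / i + 1.0 / (n - i)))
        if abs(t) > abs(best_t):
            best_pos, best_t = i, t
    return best_pos, best_t

# Toy signal: 20 samples around 0 followed by 20 samples around 1.
x = np.array([0.1, -0.1] * 10 + [1.1, 0.9] * 10)
pos, t = best_single_breakpoint(x)
```

A full binary segmentation would compare the best statistic against a significance threshold and, if it passes, apply the same scan recursively to `x[:pos]` and `x[pos:]`.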
The CBS method cannot prove the optimality of its segmentation from a global perspective, whereas the OP method is provably globally optimal. Using the idea of dynamic programming (DP), the OP method decomposes the segmentation of a long signal into sub-problems of segmenting shorter signals and obtains the solution of the original problem by merging the solutions of the sub-problems. Starting from the segmentation of a signal of length 1, OP applies this decomposition recursively and can therefore solve the segmentation problem for signals of arbitrary length.
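The recursion can be illustrated with a short sketch for a single signal. It assumes a sum-of-squares cost around the segment mean and a penalty `lam` per breakpoint, neither of which is fixed by the text above; the function name `optimal_partition` is hypothetical. The inner minimization over all earlier positions is what makes the cost grow quadratically with the signal length.

```python
import numpy as np

def optimal_partition(x, lam):
    """Globally optimal segmentation of a 1-D signal by dynamic programming.

    Minimizes (sum of squared deviations from segment means)
    + lam * (number of breakpoints). Returns the breakpoints as indices of
    the first sample of each new segment.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    s1 = np.concatenate(([0.0], np.cumsum(x)))       # prefix sums
    s2 = np.concatenate(([0.0], np.cumsum(x ** 2)))  # prefix sums of squares
    F = np.zeros(n + 1)
    F[0] = -lam                      # so that the first segment is not penalized
    last = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        s = np.arange(t)
        # Cost of the segment x[s:t] for every possible start s.
        seg = (s2[t] - s2[:t]) - (s1[t] - s1[:t]) ** 2 / (t - s)
        cand = F[:t] + seg + lam
        last[t] = int(np.argmin(cand))
        F[t] = cand[last[t]]
    # Backtrack from the end of the signal.
    bps, t = [], n
    while last[t] > 0:
        t = int(last[t])
        bps.append(t)
    return bps[::-1]

bps = optimal_partition([0, 0, 0, 0, 0, 1, 1, 1, 1, 1], lam=0.5)
```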
Although the OP method is globally optimal, it must traverse all possible cases during dynamic programming, so its computational cost grows quadratically with the signal length. For large-scale problems (such as signals tens of thousands of points long), the cost is enormous and practical requirements are difficult to meet. However, it can be proven that, under certain conditions, some cases can never be optimal; these cases can be eliminated during the traversal, thereby avoiding unnecessary computation. In this way, the PELT method breaks through the computational bottleneck of OP and reduces the computational cost to linear growth with the signal length, which greatly expands its range of application. Nevertheless, PELT can only detect breakpoints in a single signal.
Many scientific research and engineering applications require the detection of breakpoints in signals. A breakpoint is understood here as a position in a signal on either side of which the signal exhibits different patterns (strictly speaking, different distributions). For example, FIG. 1 shows a signal with a low step and a high step: the signal keeps the same pattern (a constant mean) within each step, but the pattern changes between steps. Once the breakpoints (thick lines) are detected, the mean value of the signal between breakpoints can easily be obtained.
The existing methods such as CBS, OP and PELT can only detect breakpoints in a single signal, but practical applications often require detecting breakpoints shared by multiple signals (as shown in FIG. 2). For example, in copy number variation detection based on high-throughput sequencing, the true positive rate (TPR) of a single signal is low and the false positive rate (FPR) is high because the read depth signal contains noise. A straightforward way to improve detection performance is to increase sequencing coverage, which however increases experimental cost. An alternative is to sequence the sample multiple times at medium or low coverage, or to sequence it on multiple platforms, i.e., multiplex sequencing. Multiplex sequencing reduces the systematic errors introduced by a single sample or platform and can improve detection performance, but it requires a technique that can detect breakpoints across multiple signals. In addition, multiple individuals in a population may share CNVs, and many complex diseases may also share CNVs, so it is necessary to detect common CNVs in multiple signals from a multi-sample perspective.
Therefore, the conventional methods such as CBS, OP, and PELT can only detect a breakpoint in a single signal, cannot detect a breakpoint common to a plurality of signals, and have a large detection calculation amount, which results in high experiment cost.
Disclosure of Invention
In order to overcome the disadvantages of the prior art, the present invention provides a multi-signal-oriented fast breakpoint detection method, system, device and storage medium, and aims to solve the technical problem of the prior art that a breakpoint detection method can only detect a breakpoint in a single signal and has low detection efficiency.
The invention provides a multi-signal-oriented rapid breakpoint detection method, which comprises the following steps:
S1, preprocessing the original signal to obtain a preprocessed signal matrix Y of size N×M;
S2, determining the number of breakpoints k or the penalty parameter λ according to actual conditions;
S3, when the penalty parameter λ is selected as the input parameter, solving the minimization optimization problem to obtain a processed signal matrix X of size N×M; when the number of breakpoints k is selected as the input parameter, calculating the maximum possible λ_max from the preprocessed signal matrix Y, searching the interval [0, λ_max] to estimate the penalty parameter λ, and solving the minimization optimization problem under the given λ such that the number of breakpoints is k, to obtain a processed signal matrix X of size N×M;
S4, calculating the signal matrix X according to the obtained breakpoint positions; performing denormalization on the signal matrix X to obtain the processed signal X_0, thereby achieving rapid breakpoint detection of the segmented signal.
Preferably, S1 specifically includes the following steps:
S1.1, storing the acquired original signals in an N×M original matrix Y_0;
S1.2, calculating the maximum absolute value c of the original matrix Y_0;
S1.3, preprocessing the original signal using the maximum absolute value c.
Preferably, in S1.3, the original signal is preprocessed by normalization, and the preprocessing result is given by formula (1):
Y = Y_0 / c (1).
Preferably, in S4, the processed signal X_0 is given by formula (2):
X_0 = cX (2).
Preferably, in S3, bisection is used to search the interval [0, λ_max] for the penalty parameter λ.
Preferably, in S3, given the preprocessed signal matrix Y and the penalty parameter λ, the minimization optimization problem as shown in equation (3) is solved, i.e. the signal matrix X is solved:
[Formula (3): published as an image, not reproduced in this text]
where Y is the N×M preprocessed signal matrix, X is the processed signal matrix of size N×M, N is the number of sampling points of each signal, M is the number of signals, λ is the penalty parameter for each breakpoint, and P(X) is the number of breakpoints in X.
Preferably, the specific operation steps of S3 are as follows:
when the number of breakpoints k is selected as the input parameter:
1) calculating the maximum possible λ_max from the preprocessed signal matrix Y, setting the minimum possible λ_min = 0, and calculating the penalty parameter as the midpoint of the search interval:
λ = (λ_min + λ_max) / 2
2) given the preprocessed signal matrix Y and the penalty parameter λ, solving the minimization optimization problem shown in formula (3), i.e., solving for the signal matrix X and the number of breakpoints P(X);
3) if the number of breakpoints k is less than P(X), letting λ_min = λ and repeating the above steps;
if the number of breakpoints k is greater than P(X), letting λ_max = λ and repeating the above steps;
if the number of breakpoints k is equal to P(X), outputting the signal matrix X;
when the penalty parameter λ is selected as the input parameter: solving the minimization optimization problem shown in formula (3), i.e., solving for the signal matrix X.
The invention also discloses a system for the multi-signal-oriented rapid breakpoint detection method, comprising:
a signal preprocessing module, used for preprocessing the acquired original signal;
a penalty parameter estimation module, used for obtaining the processed signal matrix given the number of breakpoints or the penalty parameter;
a signal processing module, used for determining the breakpoint positions of the signal matrix and the processed signal, thereby achieving rapid breakpoint detection of the segmented signal.
A computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the multi-signal-oriented rapid breakpoint detection method when executing the computer program.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the multi-signal oriented fast breakpoint detection method.
Compared with the prior art, the invention has the following beneficial effects:
The invention provides a multi-signal-oriented processing method in which the acquired signals are preprocessed and the number of breakpoints or the penalty parameter is determined according to actual conditions, so that the final breakpoint positions and the processed signals can be determined from the preprocessing result together with the number of breakpoints or the penalty parameter. The method can quickly and accurately detect breakpoints shared at the same position across the signals, thereby providing reliable start and end position information for subsequent signal segmentation, fitting and parameter estimation.
Further, bisection (binary search) is used to search for the penalty parameter. In a binary search, the target is first compared with the element in the middle of the sequence; if the target is larger than that element, the search continues in the second half of the current sequence; if it is smaller, the search continues in the first half, until a matching element is found or the search range is empty. Bisection requires few comparisons, searches quickly and has good average performance.
Furthermore, preprocessing the original signal by normalization improves the convergence rate; meanwhile, to ensure the accuracy of the processed signal, the signal matrix is denormalized after processing.
According to the system of the multi-signal-oriented rapid breakpoint detection method, breakpoint detection is decomposed by content into different, mutually independent modules, and multi-signal breakpoint detection is realized through this modular design. When a problem occurs, the module concerned can be handled on its own, and the modules are independent of and do not affect one another.
Drawings
FIG. 1 is a schematic diagram of a signal breakpoint detection including two high and low steps;
FIG. 2 is a diagram illustrating detection of common breakpoints in multiple signals;
FIG. 3 is a flowchart of a fast breakpoint detection method according to the present invention;
FIG. 4 shows the present invention applied to family-shared copy number variation detection ((a) the child inherits a variation common to both parents; (b) the child inherits a variation from one parent);
FIG. 5 shows gait analysis with a wearable device using the present invention ((a) gait detected using the three acceleration sensors in the x, y and z directions; (b) gait detected using only the acceleration sensor in the z direction);
FIG. 6 shows the improvement in computation time of the present invention over the conventional method ((a) computation time versus signal length N; (b) computation time versus signal dimension M).
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention is described in further detail below with reference to the accompanying drawings:
The invention provides a multi-signal-oriented rapid breakpoint detection method; the specific signal processing flow is shown in FIG. 3, and the method comprises the following steps:
S1, performing normalization preprocessing on the original signal to obtain a preprocessed signal matrix Y of size N×M;
S1.1, storing the acquired original signals in an N×M original matrix Y_0;
S1.2, calculating the maximum absolute value c of the original matrix Y_0;
S1.3, preprocessing the original signal using the maximum absolute value c;
S2, determining the number of breakpoints k or the penalty parameter λ according to actual conditions;
S3, if the selected input parameter is λ, directly solving the minimization optimization problem to obtain a processed signal matrix X of size N×M; if the selected input parameter is k, first calculating the maximum possible λ_max from the preprocessed signal matrix Y, then searching the interval [0, λ_max] to estimate the penalty parameter λ, and solving the minimization optimization problem under the given λ such that the number of breakpoints is k, to obtain a processed signal matrix X of size N×M;
the interval [0, λ_max] is searched by bisection for the penalty parameter λ such that the number of detected breakpoints P(X) equals k;
S4, calculating the signal matrix X according to the obtained breakpoint positions; multiplying the signal matrix X by the maximum absolute value c for denormalization to obtain the processed signal X_0, thereby achieving rapid breakpoint detection of the segmented signal.
The preprocessing result is given by formula (1):
Y = Y_0 / c (1)
The processed signal X_0 is given by formula (2):
X_0 = cX (2)
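Formulas (1) and (2) amount to a scale-and-restore pair, sketched below with NumPy. The function names `normalize` and `denormalize` are hypothetical, and the guard against an all-zero matrix is an assumption of this sketch, not something stated in the text.

```python
import numpy as np

def normalize(Y0):
    """Formula (1): divide the raw matrix by its maximum absolute value c."""
    Y0 = np.asarray(Y0, dtype=float)
    c = np.max(np.abs(Y0))
    if c == 0.0:
        # Assumption of this sketch: an all-zero input cannot be scaled.
        raise ValueError("all-zero signal matrix cannot be normalized")
    return Y0 / c, c

def denormalize(X, c):
    """Formula (2): restore the original scale of the processed matrix."""
    return c * np.asarray(X, dtype=float)

Y0 = np.array([[2.0, -4.0], [1.0, 3.0]])
Y, c = normalize(Y0)          # every entry of Y lies in [-1, 1]
X0 = denormalize(Y, c)        # round trip recovers Y0
```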
given the preprocessed signal matrix Y and the penalty parameter λ, the minimization optimization problem as shown in equation (3) is solved, i.e. the signal matrix X is solved:
[Formula (3): published as an image, not reproduced in this text]
where Y is the N×M signal matrix to be processed, X is the processed signal matrix of the same size, N is the number of sampling points (signal length) of each signal, M is the number of signals, λ is the penalty parameter for each breakpoint, and P(X) is the number of breakpoints in X. Once X is obtained, the positions and number of the breakpoints are readily known.
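As a reading aid, one form of formula (3) that is consistent with the symbol definitions above is sketched below. It is an assumption, not the published formula: the squared Frobenius norm as the data-fidelity term is inferred from the sum-of-squares computation in the algorithm steps, and the definition of P(X) as the number of rows at which X changes is inferred from the description of breakpoints shared by all signals.

```latex
\min_{X \in \mathbb{R}^{N \times M}} \; \lVert Y - X \rVert_F^2 + \lambda\, P(X),
\qquad
P(X) = \#\{\, n \in \{1,\dots,N-1\} : X_{n+1,:} \neq X_{n,:} \,\}
```

Under this reading, a larger λ yields fewer breakpoints, which is the monotonic relationship that a bisection search over [0, λ_max] relies on.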
When the number of breakpoints k is selected as the input parameter:
1) calculating the maximum possible λ_max from the preprocessed signal matrix Y, setting the minimum possible λ_min = 0, and calculating the penalty parameter as the midpoint of the search interval:
λ = (λ_min + λ_max) / 2
2) given the preprocessed signal matrix Y and the penalty parameter λ, solving the minimization optimization problem shown in formula (3), i.e., solving for the signal matrix X and the number of breakpoints P(X);
3) if the number of breakpoints k is less than P(X), letting λ_min = λ and repeating the above steps;
if the number of breakpoints k is greater than P(X), letting λ_max = λ and repeating the above steps;
if the number of breakpoints k is equal to P(X), outputting the signal matrix X;
when the penalty parameter λ is selected as the input parameter: solving the minimization optimization problem shown in formula (3), i.e., solving for the signal matrix X.
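The search over λ described in steps 1) to 3) can be sketched as a plain bisection loop. This is a minimal illustration under stated assumptions: the solver is abstracted as a callable `count_breakpoints(lam)` returning P(X), `lam_max` is passed in because the text does not reproduce how λ_max is computed from Y, and the `max_iter` cap is added here for the case where no λ yields exactly k breakpoints. All names are hypothetical.

```python
def search_penalty(count_breakpoints, k, lam_max, max_iter=60):
    """Bisection over [0, lam_max] for a penalty that yields k breakpoints.

    Relies on the number of breakpoints not increasing as lam grows.
    Returns the penalty found, or None if max_iter is exhausted.
    """
    lam_min = 0.0
    for _ in range(max_iter):
        lam = 0.5 * (lam_min + lam_max)   # midpoint of the current interval
        p = count_breakpoints(lam)
        if k < p:        # too many breakpoints: raise the lower bound
            lam_min = lam
        elif k > p:      # too few breakpoints: lower the upper bound
            lam_max = lam
        else:
            return lam
    return None

# Toy stand-in for the solver: a breakpoint survives while its gain exceeds lam.
gains = [5.0, 2.0, 0.5]

def toy_count(lam):
    return sum(g > lam for g in gains)

lam = search_penalty(toy_count, k=2, lam_max=10.0)
```

In practice `count_breakpoints` would wrap the solver for formula (3) and return the number of breakpoints in its result.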
The core step of the above technical solution is S3, and the specific operation steps of S3 are as follows:
a) input: the preprocessed signal matrix Y and the penalty parameter λ;
b) initialization: the objective function storage vector F is an all-zero vector of length N+1; the breakpoint storage array bp is a cell array whose first cell is an empty vector; the valid index list R is 1; the segment energy E is 0; and the mean Z is equal to the first column of the preprocessed signal matrix Y;
c) entering a loop starting from i = 1 that runs through step x);
d) adding the elements of the objective function storage vector F indexed by the valid index list R to the segment energy E, and storing the result in a temporary vector v;
e) finding the minimum value of the temporary vector v and its position, and storing them in a and i_1 respectively;
f) calculating a + λ and storing it in the (i+1)-th element of the objective function storage vector F;
g) reading the i_1-th element of the valid index list R and storing it in the index variable c (here c is a loop-local index, not the maximum absolute value c of S1.2);
h) reading the c-th cell of the breakpoint storage array bp, appending c to it, and storing the result in the (i+1)-th cell of the breakpoint storage array bp;
i) finding the positions in the temporary vector v whose elements are smaller than the (i+1)-th element of the objective function storage vector F, and storing these positions in i_2;
j) if i is less than the number of sampling points N, executing the following steps one by one up to step x); otherwise, jumping directly to step x);
k) storing the (i+1)-th column of the preprocessed signal matrix Y in y;
l) keeping only the elements of the valid index list R at the positions i_2 and deleting the remaining elements;
m) calculating i minus the valid index list R plus 1, and storing the result in the length vector l;
n) replicating the length vector l M times row-wise and storing the result in the matrix L;
o) replicating y n times row-wise (n being the length of the length vector l) and storing the result in the matrix B;
p) reading the rows i_2 of the mean Z and storing them in the matrix T;
q) calculating the matrix B minus the matrix T, computing the sum of squares of the elements of each row, and storing the result in the vector e;
r) multiplying the vector e element-wise by the length vector l, dividing element-wise by (l + 1), and storing the result back in the vector e;
s) calculating the sum of the vector e and the elements of the segment energy E at the positions i_2, and storing the result in the segment energy E;
t) appending 0 to the end of the segment energy E;
u) calculating the element-wise product of the matrix T and the matrix L, adding the matrix B, dividing element-wise by (L + 1), and storing the result in the mean Z;
v) appending y to the end of the mean Z;
w) appending i+1 to the end of the valid index list R;
x) adding 1 to i and returning to step c);
y) output: the breakpoint positions are the 2nd through last elements of the last cell of the breakpoint storage array bp; the processed data matrix X is produced column by column: for each column signal, the mean of that column of the data matrix Y between two adjacent breakpoints is taken as the signal value of the corresponding segment of that column of X.
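Steps a) to y) can be read as a pruned dynamic program with incrementally updated segment energies and means. The following NumPy sketch is one such reading under stated assumptions, not the patented implementation: indices are 0-based, rows of Y are sampling points, a back-pointer array replaces the cell array bp, and the function name `detect_breakpoints` is hypothetical.

```python
import numpy as np

def detect_breakpoints(Y, lam):
    """Shared-breakpoint detection for an N x M matrix Y with penalty lam.

    Returns (bps, X): bps lists the index of the first row of each new
    segment; X is the piecewise-constant fit (segment means per column).
    """
    Y = np.asarray(Y, dtype=float)
    N, M = Y.shape
    F = np.zeros(N + 1)                 # F[t]: best cost for the first t rows
    last = np.zeros(N + 1, dtype=int)   # start of the final segment of Y[:t]
    R = np.array([0])                   # surviving candidate segment starts
    E = np.zeros(1)                     # E[j]: squared error of Y[R[j]:t]
    Z = Y[:1].copy()                    # Z[j]: column means of Y[R[j]:t]
    for t in range(1, N + 1):
        v = F[R] + E                    # step d)
        j = int(np.argmin(v))           # step e)
        F[t] = v[j] + lam               # step f)
        last[t] = R[j]                  # steps g)-h), stored as a back-pointer
        if t == N:
            break
        keep = v < F[t]                 # steps i), l): prune hopeless starts
        R, E, Z = R[keep], E[keep], Z[keep]
        y = Y[t]                        # step k): next sample of all signals
        l = (t - R).astype(float)       # step m): current segment lengths
        d = y - Z                       # steps o)-q) via broadcasting
        E = E + l / (l + 1.0) * np.sum(d ** 2, axis=1)   # steps r)-s)
        Z = (Z * l[:, None] + y) / (l[:, None] + 1.0)    # step u)
        R = np.append(R, t)             # step w): new segment starting at t
        E = np.append(E, 0.0)           # step t)
        Z = np.vstack([Z, y])           # step v)
    # Step y): backtrack the breakpoints, then fill X with segment means.
    bps, t = [], N
    while last[t] > 0:
        t = int(last[t])
        bps.append(t)
    bps.reverse()
    X = np.empty_like(Y)
    edges = [0] + bps + [N]
    for s, e in zip(edges[:-1], edges[1:]):
        X[s:e] = Y[s:e].mean(axis=0)
    return bps, X

# Three noise-free signals that all step from 0 to 1 at row 50.
Y = np.vstack([np.zeros((50, 3)), np.ones((50, 3))])
bps, X = detect_breakpoints(Y, lam=1.0)
```

The per-sample update in steps r) and u) is what avoids recomputing each segment cost from scratch: adding one sample y to a segment of length l with mean z increases its squared error by l/(l+1) times the squared distance between y and z.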
The invention also discloses a system for the multi-signal-oriented rapid breakpoint detection method, comprising:
a signal preprocessing module, used for preprocessing the acquired original signal;
a penalty parameter estimation module, used for obtaining the processed signal matrix given the number of breakpoints or the penalty parameter;
a signal processing module, used for determining the breakpoint positions of the signal matrix and the processed signal, thereby achieving rapid breakpoint detection of the segmented signal.
A computer device comprises a memory and a processor, wherein the memory stores a computer program, and the processor realizes the steps of the multi-signal-oriented rapid breakpoint detection method when executing the computer program.
A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the multi-signal oriented fast breakpoint detection method.
FIG. 4 demonstrates the beneficial effects of applying the present invention to population-shared copy number variation detection based on high-throughput sequencing. The sequencing data come from a family of three (father, mother and son, M = 3), and the preprocessing of the data includes: (i) aligning the FASTQ files of raw HTS output to the reference genome hg19 using the mapping tool Bowtie; (ii) calculating the read depth (RD) signal, which is the sequencing coverage depth of the base sites within fixed-width windows on the genome; (iii) correcting the RD signal for GC content and mappability. FIG. 4(a) shows the copy number variation detected with the present invention in the 32.8-33.4 Mb interval of chromosome 22; the offspring inherits a variation common to both parents. FIG. 4(b) shows the copy number variation detected with the present invention in the 39.2-39.5 Mb region of chromosome 22; here the offspring inherits the variation from the father, while the mother has no variation. Detecting breakpoints from multiple signals therefore has important application value.
FIG. 5 illustrates the application of the present invention to health data analysis for wearable devices. The data were recorded while the subject ran for 12 seconds; the sensor was attached to the subject's right ankle joint and measured acceleration in the three directions x, y and z (M = 3). FIG. 5(a) shows the detection result using all three acceleration signals; every step is detected well. In contrast, FIG. 5(b) shows the detection result using only one acceleration signal (z direction); the gait around 8 seconds is not well distinguished. Using multiple signals thus improves the accuracy of breakpoint detection.
FIG. 6 demonstrates the improvement in computation time of the present invention over the conventional method, using simulated data. FIG. 6(a) shows the relationship between computation time and signal length N when the number of signals M is 10. As N increases, the computation time increases, and the present invention uses only about one percent of the computation time of the conventional method. For a very long signal with N = 100000 points, the present invention takes only about 10 seconds, while the conventional method takes about 1000 seconds. FIG. 6(b) shows the relationship between computation time and signal dimension M when the signal length N is 3000. As M increases, the computation time required by the present invention remains almost constant, whereas that of the conventional method increases linearly. For a large number of signals, M = 1000, the present invention takes less than 1 second, while the conventional method takes about 400 seconds. The invention can therefore greatly reduce computation time and is particularly suitable for large numbers of signals.
The invention provides a rapid signal processing method. The method can quickly and accurately detect the positions of breakpoints common to multi-dimensional signals, and thus provides reliable start and end position information for subsequent segmentation, fitting and parameter estimation of the multi-dimensional signals. The method is broadly applicable and can be used in fields such as biology, medicine and engineering, for example in population copy number variation detection based on high-throughput sequencing and in motion state detection based on wearable devices.
Abbreviations and key terms appearing and used in the present invention are defined as follows:
CNV Copy Number Variation
HTS High-Throughput Sequencing
DP Dynamic Programming
RD Read Depth
CBS Circular Binary Segmentation
OP Optimal Partitioning
PELT Pruned Exact Linear Time
TPR True Positive Rate
FPR False Positive Rate
GC Guanine-Cytosine content
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (10)

1. A multi-signal-oriented rapid breakpoint detection method is characterized by comprising the following steps:
S1, preprocessing the original signal to obtain a preprocessed signal matrix Y of size N×M;
S2, determining the number of breakpoints k or the penalty parameter λ according to actual conditions;
S3, when the penalty parameter λ is selected as the input parameter, solving the minimization optimization problem to obtain a processed signal matrix X of size N×M; when the number of breakpoints k is selected as the input parameter, calculating the maximum possible λ_max from the preprocessed signal matrix Y, searching the interval [0, λ_max] to estimate the penalty parameter λ, and solving the minimization optimization problem under the given λ such that the number of breakpoints is k, to obtain a processed signal matrix X of size N×M;
S4, calculating the signal matrix X according to the obtained breakpoint positions; performing denormalization on the signal matrix X to obtain the processed signal X_0, thereby achieving rapid breakpoint detection of the segmented signal.
2. The multi-signal-oriented fast breakpoint detection method according to claim 1, wherein S1 specifically includes the following steps:
S1.1, storing the acquired original signals in an N×M original matrix Y_0;
S1.2, calculating the maximum absolute value c of the original matrix Y_0;
S1.3, preprocessing the original signal using the maximum absolute value c.
3. The multi-signal-oriented fast breakpoint detection method according to claim 2, wherein in S1.3, the original signal is preprocessed by normalization, and the preprocessing result is given by formula (1):
Y = Y_0 / c (1).
4. The multi-signal-oriented fast breakpoint detection method according to claim 2, wherein in S4, the processed signal X_0 is given by formula (2):
X_0 = cX (2).
5. The multi-signal-oriented fast breakpoint detection method according to claim 1, wherein in S3, a binary search is used to search for the penalty parameter λ within the interval [0, λ_max].
6. The multi-signal-oriented fast breakpoint detection method according to claim 1, wherein in S3, given the preprocessed signal matrix Y and the penalty parameter λ, the minimization problem shown in formula (3) is solved for the signal matrix X:
[Equation image FDA0003234264700000021: formula (3)]
where Y is the preprocessed signal matrix of size N×M, X is the processed signal matrix of size N×M, N is the number of sampling points per signal, M is the number of signals, λ is the penalty parameter applied to each breakpoint, and P(X) is the number of breakpoints in X.
7. The multi-signal-oriented fast breakpoint detection method according to claim 6, wherein the specific operation steps of S3 are as follows:
when the number of breakpoints k is selected as the input parameter:
1) calculating the maximum possible value λ_max from the preprocessed signal matrix Y, letting the minimum possible value λ_min = 0, and calculating the penalty parameter λ as:
[Equation image FDA0003234264700000022: expression for the penalty parameter λ]
2) given the preprocessed signal matrix Y and the penalty parameter λ, solving the minimization problem shown in formula (3) to obtain the signal matrix X and the number of breakpoints P(X);
3) if the number of breakpoints k is less than P(X), letting λ_min = λ and repeating the above steps;
if the number of breakpoints k is greater than P(X), letting λ_max = λ and repeating the above steps;
if the number of breakpoints k is equal to P(X), outputting the signal matrix X;
when the penalty parameter λ is selected as the input parameter: solving the minimization problem shown in formula (3) to obtain the signal matrix X.
8. A system for implementing the multi-signal-oriented fast breakpoint detection method according to any one of claims 1 to 7, comprising:
a signal preprocessing module, configured to preprocess the acquired original signal;
a penalty parameter estimation module, configured to obtain the processed signal matrix given either a number of breakpoints or a penalty parameter;
and a signal processing module, configured to determine the breakpoint positions of the signal matrix and the processed signal, thereby realizing fast breakpoint detection of the segmented signal.
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor, when executing the computer program, implements the steps of the multi-signal-oriented fast breakpoint detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the multi-signal-oriented fast breakpoint detection method according to any one of claims 1 to 7.
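
For illustration only, and not as part of the claims, steps S1 to S4 could be sketched in Python with NumPy as follows. Several points are assumptions, because the text above does not reproduce formula (3) or the expression for λ_max: the objective is taken to be 0.5*||Y - X||_F^2 + λ*P(X) with breakpoints shared by all M signals; λ_max is taken as half the total squared deviation of Y from its column means (for that assumed objective, a solution without breakpoints is optimal at this value); and the binary search is assumed to use the midpoint of [λ_min, λ_max]. The solver is a plain O(N^2) dynamic program used as a stand-in, not the fast solver of the invention, and all function names are hypothetical.

```python
import numpy as np


def breakpoint_positions(X):
    """Row indices n where row n differs from row n-1 in at least one signal."""
    return np.flatnonzero(np.any(np.diff(X, axis=0) != 0, axis=1)) + 1


def solve_penalized(Y, lam):
    """Minimize 0.5*||Y - X||_F^2 + lam*P(X) (ASSUMED form of formula (3)).

    Plain O(N^2) dynamic programming over segment boundaries shared by all
    M signals; a stand-in, not the patent's fast solver.
    """
    N, M = Y.shape
    # Prefix sums give the squared error of any segment in O(M).
    S1 = np.vstack([np.zeros((1, M)), np.cumsum(Y, axis=0)])
    S2 = np.concatenate([[0.0], np.cumsum(np.sum(Y * Y, axis=1))])

    def seg_cost(i, j):  # rows i..j-1 fitted by their column means
        s = S1[j] - S1[i]
        return 0.5 * ((S2[j] - S2[i]) - s @ s / (j - i))

    best = np.full(N + 1, np.inf)
    best[0] = -lam  # the first segment opens without a breakpoint
    prev = np.zeros(N + 1, dtype=int)
    for j in range(1, N + 1):
        for i in range(j):
            cost = best[i] + lam + seg_cost(i, j)
            if cost < best[j]:
                best[j], prev[j] = cost, i

    X = np.empty((N, M))
    j = N
    while j > 0:  # backtrack, filling each segment with its mean
        i = prev[j]
        X[i:j] = Y[i:j].mean(axis=0)
        j = i
    return X


def search_lambda(Y, k, max_iter=60):
    """Binary search on [0, lam_max] for a penalty that yields k breakpoints."""
    lam_min = 0.0
    lam_max = 0.5 * np.sum((Y - Y.mean(axis=0)) ** 2)  # ASSUMED upper bound
    for _ in range(max_iter):
        lam = 0.5 * (lam_min + lam_max)  # ASSUMED midpoint rule
        X = solve_penalized(Y, lam)
        p = len(breakpoint_positions(X))
        if p == k:
            break
        if k < p:  # too many breakpoints: raise the penalty
            lam_min = lam
        else:      # too few breakpoints: lower the penalty
            lam_max = lam
    return X, lam


def detect_breakpoints(Y0, k=None, lam=None):
    """S1-S4: normalize, solve (given lam or k), locate breakpoints, denormalize."""
    Y0 = np.asarray(Y0, dtype=float)
    c = np.max(np.abs(Y0)) or 1.0  # formula (1); guard against an all-zero input
    Y = Y0 / c
    if lam is None:
        X, lam = search_lambda(Y, k)
    else:
        X = solve_penalized(Y, lam)
    return breakpoint_positions(X), c * X  # formula (2): X_0 = cX


# Usage: three signals of 60 samples with shared jumps at rows 20 and 40.
Y0 = np.zeros((60, 3))
Y0[20:40] = [2.0, -1.0, 0.5]
Y0[40:] = [4.0, 1.0, -2.0]
positions, X0 = detect_breakpoints(Y0, k=2)
```

Because a penalized solution path can skip some breakpoint counts, the search in this sketch is capped at max_iter iterations and returns the last candidate if no λ gives exactly k breakpoints.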
CN202110997289.9A 2021-08-27 2021-08-27 Multi-signal-oriented rapid breakpoint detection method, system, equipment and storage medium Active CN113743609B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110997289.9A CN113743609B (en) 2021-08-27 2021-08-27 Multi-signal-oriented rapid breakpoint detection method, system, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113743609A (en) 2021-12-03
CN113743609B (en) 2024-04-02

Family

ID=78733518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110997289.9A Active CN113743609B (en) 2021-08-27 2021-08-27 Multi-signal-oriented rapid breakpoint detection method, system, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113743609B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115225549A (en) * 2022-07-15 2022-10-21 中国工商银行股份有限公司 Breakpoint testing method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020144178A1 (en) * 2001-03-30 2002-10-03 Vittorio Castelli Method and system for software rejuvenation via flexible resource exhaustion prediction
EP1727072A1 (en) * 2005-05-25 2006-11-29 The Babraham Institute Signal processing, transmission, data storage and representation
CN108197428A (en) * 2017-12-25 2018-06-22 西安交通大学 A kind of next-generation sequencing technologies copy number mutation detection method of parallel Dynamic Programming
CN113295702A (en) * 2021-05-20 2021-08-24 国网山东省电力公司枣庄供电公司 Electrical equipment fault diagnosis model training method and electrical equipment fault diagnosis method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
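吴功跃: "Study on the stability of exact penalty functions in set-valued vector optimization", 知识文库 (Knowledge Library), no. 10 *
张亚靓; 纪俊卿; 孟祥川; 许同乐: "Engine bearing fault diagnosis based on exponential wavelet threshold and PSO-DP-LSSVM", 机床与液压 (Machine Tool & Hydraulics), no. 19 *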

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115225549A (en) * 2022-07-15 2022-10-21 中国工商银行股份有限公司 Breakpoint testing method and device, computer equipment and storage medium
CN115225549B (en) * 2022-07-15 2024-03-26 中国工商银行股份有限公司 Breakpoint test method, breakpoint test device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113743609B (en) 2024-04-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant