Method and Device for Identifying and Counting Single Molecules
Technical Field
The present invention relates to the field of gene sequencing, and in particular to a method for identifying and counting single molecules, a device for identifying and counting single molecules, and a processing system.
Background
In the related art, third-generation sequencing is single-molecule sequencing, and single-molecule sequencing based on imaging and optical detection is a base-calling technique that relies on optical signals and on electrical signals. For bases identified by fluorescence, the fluorescence they carry is the light emitted on the transition from the excited state to the ground state under laser illumination of a specific power. However, differences in how long different fluorophores emit, differences in emitted intensity, and the presence of background noise all cause errors in single-molecule identification. In addition, uneven distribution of DNA strands and clustering of bases reduce the number of effective single molecules.
Existing methods rely mainly on visual inspection of the acquired fluorescence images to identify and count single molecules, which is labor-intensive and slow. Approaches borrowed from speech recognition, based on hidden Markov models (HMM) and machine learning, require a large number of training samples and also run inefficiently.
Summary of the Invention
Embodiments of the present invention aim to solve at least one of the technical problems in the prior art. To this end, embodiments of the present invention provide a method for identifying single molecules, a method for counting single molecules, an identification device, and a counting device.
A method for identifying a single molecule according to an embodiment of the present invention comprises the steps of: inputting a time series of the intensity of an image bright spot; forming, from the time series, a line graph of the bright spot's intensity against time, the line graph consisting of a plurality of line segments; dividing the line graph into a plurality of grid cells arranged in an array, and counting the number of times the line segments and/or the endpoints of the line segments fall in each grid cell; grouping by intensity and tallying the frequencies of the counts to obtain a histogram; and finding the local maxima of the histogram and determining that the peak containing a local maximum corresponds to one single molecule when the following condition is met: the value of the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold. By converting the line graph of the bright spot intensity time series into an image-processing problem that yields a histogram, this method identifies single molecules quickly and with relatively high accuracy.
A method for counting single molecules according to an embodiment of the present invention comprises the steps of: inputting a time series of the intensity of an image bright spot; forming, from the time series, a line graph of the bright spot's intensity against time, the line graph consisting of a plurality of line segments; dividing the line graph into a plurality of grid cells arranged in an array, and counting the number of times the line segments and/or the endpoints of the line segments fall in each grid cell; grouping by intensity and tallying the frequencies of the counts to obtain a histogram; finding the local maxima of the histogram and determining that the peak containing a local maximum corresponds to one single molecule when the following condition is met: the value of the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold; and calculating the number S1 of single molecules. By converting the line graph of the bright spot intensity time series into an image-processing problem that yields a histogram, this method counts single molecules quickly and with relatively high accuracy.
Another method for counting single molecules according to an embodiment of the present invention comprises the steps of: inputting a time series of the intensity of an image bright spot; forming, from the time series, a line graph of the bright spot's intensity against time, the line graph consisting of a plurality of line segments; dividing the line graph into a plurality of grid cells arranged in an array, and counting the number of times the line segments and/or the endpoints of the line segments fall in each grid cell; grouping by intensity and tallying the frequencies of the counts to obtain a histogram; and finding the local maxima of the histogram and incrementing the single-molecule count by 1 whenever the following condition is met: the value of the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold. By converting the line graph of the bright spot intensity time series into an image-processing problem that yields a histogram, this method counts single molecules quickly and with relatively high accuracy.
A device for identifying a single molecule according to an embodiment of the present invention, for carrying out some or all of the steps of the single molecule identification method of the above aspect of the present invention, comprises: an input unit for inputting a time series of the intensity of an image bright spot; a conversion unit for forming, from the time series in the input unit, a line graph of the bright spot's intensity against time, the line graph consisting of a plurality of line segments; a grid statistics unit for dividing the line graph from the conversion unit into a plurality of grid cells arranged in an array and counting the number of times the line segments and/or the endpoints of the line segments fall in each grid cell; a histogram statistics unit for grouping by intensity and tallying the frequencies of the counts from the grid statistics unit to obtain a histogram; and a determination unit for finding the local maxima of the histogram from the histogram statistics unit and determining that the peak containing a local maximum corresponds to one single molecule when the following condition is met: the value of the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold. By converting the line graph of the bright spot intensity time series into an image-processing problem that yields a histogram, this device identifies single molecules quickly and with relatively high accuracy.
A device for counting single molecules according to an embodiment of the present invention, for carrying out some or all of the steps of the single molecule counting method of the above aspect of the present invention, comprises: an input unit for inputting a time series of the intensity of an image bright spot; a conversion unit for forming, from the time series in the input unit, a line graph of the bright spot's intensity against time, the line graph consisting of a plurality of line segments; a grid statistics unit for dividing the line graph from the conversion unit into a plurality of grid cells arranged in an array and counting the number of times the line segments and/or the endpoints of the line segments fall in each grid cell; a histogram statistics unit for grouping by intensity and tallying the frequencies of the counts from the grid statistics unit to obtain a histogram; a determination unit for finding the local maxima of the histogram from the histogram statistics unit and determining that the peak containing a local maximum corresponds to one single molecule when the following condition is met: the value of the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold; and a calculation unit for calculating the number S1 of single molecules. By converting the line graph of the bright spot intensity time series into an image-processing problem that yields a histogram, this device counts single molecules quickly and with relatively high accuracy.
Another device for counting single molecules according to an embodiment of the present invention, for carrying out some or all of the steps of the single molecule counting method of the above aspect of the present invention, comprises: an input unit for inputting a time series of the intensity of an image bright spot; a conversion unit for forming, from the time series in the input unit, a line graph of the bright spot's intensity against time, the line graph consisting of a plurality of line segments; a grid statistics unit for dividing the line graph from the conversion unit into a plurality of grid cells arranged in an array and counting the number of times the line segments and/or the endpoints of the line segments fall in each grid cell; a histogram statistics unit for grouping by intensity and tallying the frequencies of the counts from the grid statistics unit to obtain a histogram; and a determination unit for finding the local maxima of the histogram from the histogram statistics unit and incrementing the single-molecule count by 1 whenever the following condition is met: the value of the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold. By converting the line graph of the bright spot intensity time series into an image-processing problem that yields a histogram, this device counts single molecules quickly and with relatively high accuracy.
A single molecule processing system according to an embodiment of the present invention comprises: a data input device for inputting data; a data output device for outputting data; a storage device for storing data, the data including a computer-executable program; and a processor for executing the computer-executable program, wherein executing the computer-executable program includes carrying out the method of any of the above embodiments. The single molecule processing system can perform single molecule identification and/or single molecule counting.
A computer-readable storage medium according to an embodiment of the present invention stores a program for execution by a computer, wherein executing the program includes carrying out the method of any of the above embodiments. The computer-readable storage medium may include read-only memory, random access memory, a magnetic disk, an optical disc, and the like.
Additional aspects and advantages of embodiments of the present invention will be set forth in part in the description below, will in part become apparent from that description, or will be learned through practice of the embodiments.
Brief Description of the Drawings
The above and/or additional aspects and advantages of embodiments of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the drawings, in which:
Fig. 1 is a schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 2 is another schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 3 is a further schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 4 is a further schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 5 is a further schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 6 is a further schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 7 is a further schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 8 is a further schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 9 is a schematic curve of the Mexican hat filter used in the single molecule identification method according to an embodiment of the present invention.
Fig. 10 is yet another schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 11 is a schematic diagram of 8-connected pixels in the single molecule identification method according to an embodiment of the present invention.
Fig. 12 is a schematic diagram of the line graph in the single molecule identification method according to an embodiment of the present invention.
Fig. 13 is a schematic diagram of dividing the line graph into a grid in the single molecule identification method according to an embodiment of the present invention.
Fig. 14 is a schematic diagram of the line graph before filtering in the single molecule identification method according to an embodiment of the present invention.
Fig. 15 is a schematic diagram of the line graph after filtering in the single molecule identification method according to an embodiment of the present invention.
Fig. 16 is another schematic diagram of the line graph in the single molecule identification method according to an embodiment of the present invention.
Fig. 17 is a schematic diagram of the equalized histogram in the single molecule identification method according to an embodiment of the present invention.
Fig. 18 is still another schematic flowchart of the single molecule identification method according to an embodiment of the present invention.
Fig. 19 is a schematic diagram of the line erosion process in the single molecule identification method according to an embodiment of the present invention.
Fig. 20 is another schematic diagram of the line erosion process in the single molecule identification method according to an embodiment of the present invention.
Fig. 21 is a schematic diagram of the 8-connected window in the single molecule identification method according to an embodiment of the present invention.
Fig. 22 is a schematic diagram of labeling connected regions in the single molecule identification method according to an embodiment of the present invention.
Fig. 23 is a schematic flowchart of the single molecule counting method according to an embodiment of the present invention.
Fig. 24 is another schematic flowchart of the single molecule counting method according to an embodiment of the present invention.
Fig. 25 is a further schematic flowchart of the single molecule counting method according to an embodiment of the present invention.
Fig. 26 is yet another schematic flowchart of the single molecule counting method according to an embodiment of the present invention.
Fig. 27 is a schematic block diagram of the single molecule identification device according to an embodiment of the present invention.
Fig. 28 is another schematic block diagram of the single molecule identification device according to an embodiment of the present invention.
Fig. 29 is a further schematic block diagram of the single molecule identification device according to an embodiment of the present invention.
Fig. 30 is yet another schematic block diagram of the single molecule identification device according to an embodiment of the present invention.
Fig. 31 is a schematic block diagram of the single molecule counting device according to an embodiment of the present invention.
Fig. 32 is another schematic block diagram of the single molecule counting device according to an embodiment of the present invention.
Fig. 33 is a further schematic block diagram of the single molecule counting device according to an embodiment of the present invention.
Fig. 34 is a further schematic block diagram of the single molecule counting device according to an embodiment of the present invention.
Fig. 35 is a further schematic block diagram of the single molecule counting device according to an embodiment of the present invention.
Fig. 36 is yet another schematic block diagram of the single molecule counting device according to an embodiment of the present invention.
Fig. 37 is a schematic block diagram of the single molecule processing system according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it.
In the description of the present invention, it is to be understood that the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of the present invention, "a plurality of" means two or more, unless expressly and specifically limited otherwise.
In the description of the present invention, it should be noted that, unless otherwise expressly specified and limited, the term "connected" is to be understood broadly: it may be a fixed connection, a detachable connection, or an integral connection; it may be a mechanical connection, an electrical connection, or a connection allowing mutual communication; it may be a direct connection or an indirect connection through an intermediate medium; and it may be internal communication between two elements or an interaction between two elements.
The single molecule identification and counting methods of embodiments of the present invention may be applied in gene sequencing. "Gene sequencing" as used herein is synonymous with nucleic acid sequence determination and includes DNA sequencing and/or RNA sequencing, and long-fragment sequencing and/or short-fragment sequencing.
Referring to Fig. 1, a method for identifying a single molecule according to an embodiment of the present invention comprises the steps of: S01, inputting a time series of the intensity of an image bright spot; S02, forming, from the time series, a line graph of the bright spot's intensity against time, the line graph consisting of a plurality of line segments; S03, dividing the line graph into a plurality of grid cells arranged in an array, and counting the number of times the line segments and/or the endpoints of the line segments fall in each grid cell; S04, grouping by intensity and tallying the frequencies of the counts to obtain a histogram; S05, finding the local maxima of the histogram and determining that the peak containing a local maximum corresponds to one single molecule when the following condition is met: the value of the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold. By converting the line graph of the bright spot intensity time series into an image-processing problem that yields a histogram, this method identifies single molecules quickly and with relatively high accuracy. This histogram-based identification method can accurately identify single molecules from the time series of a bright spot's intensity, and is particularly suitable where a bright spot contains more than 3 single molecules.
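Steps S02 to S05 can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the embodiment: each grid column is assumed to span one time step, the histogram is assumed to be the sum of the cell counts along each intensity row, and the peak width is assumed to be the number of contiguous bins at or above half the peak value. The names `intensity_histogram` and `find_single_molecule_peaks` and their parameters are hypothetical.

```python
from typing import List


def intensity_histogram(series: List[float], n_bins: int) -> List[int]:
    """S02-S04 (sketch): count, per intensity row, the grid cells crossed by the line graph.

    Assumes one grid column per time step, so each segment lies in one column and
    crosses every intensity row between the rows of its two endpoints.
    """
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant series

    def row(value: float) -> int:
        return min(n_bins - 1, int((value - lo) / span * n_bins))

    hist = [0] * n_bins
    for a, b in zip(series, series[1:]):  # consecutive samples form one segment
        r0, r1 = sorted((row(a), row(b)))
        for r in range(r0, r1 + 1):
            hist[r] += 1
    return hist


def find_single_molecule_peaks(hist: List[int], min_height: float,
                               min_width: int) -> List[int]:
    """S05 (sketch): return bin indices of local maxima that pass both thresholds.

    Width is measured at half the peak value, which is an assumption of this sketch.
    """
    peaks = []
    n = len(hist)
    i = 0
    while i < n:
        j = i
        while j + 1 < n and hist[j + 1] == hist[i]:  # walk across a flat plateau
            j += 1
        left_lower = i == 0 or hist[i - 1] < hist[i]
        right_lower = j == n - 1 or hist[j + 1] < hist[i]
        if left_lower and right_lower and hist[i] > min_height:
            half = hist[i] / 2
            left, right = i, j
            while left > 0 and hist[left - 1] >= half:
                left -= 1
            while right < n - 1 and hist[right + 1] >= half:
                right += 1
            if right - left + 1 > min_width:
                peaks.append((i + j) // 2)
        i = j + 1
    return peaks


if __name__ == "__main__":
    # Synthetic two-level trace: a spot that dwells near 100, then near 200.
    trace = [100 + (k % 3) for k in range(30)] + [200 + (k % 3) for k in range(30)]
    histogram = intensity_histogram(trace, n_bins=20)
    print(find_single_molecule_peaks(histogram, min_height=5, min_width=0))
```

Each returned index marks one intensity level at which the trace dwells, so the length of the returned list plays the role of the single molecule count in this sketch. Both thresholds would need to be tuned to real data.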
Specifically, in step S01, the image bright spots are formed as follows: the test sample is illuminated with a laser of a specific wavelength so that it fluoresces, and a camera captures the fluorescence to form an image. The image contains bright spots corresponding to the fluorescing parts of the test sample (nucleic acid molecules). A "bright spot" is a luminous point in the image, and one luminous point occupies at least one pixel. The terms "pixel point" and "pixel" are used interchangeably.
In one embodiment of the present invention, the image comes from a single-molecule sequencing platform, such as those of Helicos or Pacific Biosciences (PacBio). The raw input data are the parameters of the image's pixels, and detecting a "bright spot" amounts to detecting a single-molecule optical signal.
In some embodiments, referring to Fig. 2, the single molecule identification method further comprises: an image preprocessing step S31, which analyzes the input image to be processed to obtain a first image, the image to be processed containing at least one image bright spot and each image bright spot having at least one pixel; and a bright spot detection step S32, which comprises the steps of: S321, analyzing the first image to calculate a bright spot determination threshold; S322, analyzing the first image to obtain candidate bright spots; S323, judging, according to the bright spot determination threshold, whether a candidate bright spot is an image bright spot; if so, S324, acquiring the time series of the image bright spot's intensity; if not, S325, discarding the candidate bright spot.
Denoising the image to be processed in the image preprocessing step thus reduces the computation required by the bright spot detection step, while judging candidate bright spots against the bright spot determination threshold improves the accuracy with which image bright spots are identified.
Specifically, in one example, the input image to be processed may be a 512*512 or 2048*2048 image in 16-bit TIFF format, and the TIFF image may be a grayscale image. This simplifies the processing of the single molecule identification method.
In some embodiments, referring to Fig. 3, the image preprocessing step S31 comprises: performing background subtraction on the image to be processed to obtain the first image. This further reduces the noise in the image to be processed and makes the single molecule identification and/or counting method more accurate.
In some embodiments, referring to Fig. 4, the image preprocessing step S31 comprises: simplifying the background-subtracted image to be processed to obtain the first image. This reduces the computation required by the subsequent single molecule identification and/or counting method.
In some embodiments, referring to Fig. 5, the image preprocessing step S31 comprises: filtering the image to be processed to obtain the first image. Filtering yields the first image while preserving as much image detail as possible, which improves the accuracy of the single molecule identification and/or counting method.
In some embodiments, referring to Fig. 6, the image preprocessing step S31 comprises: performing background subtraction on the image to be processed and then filtering it to obtain the first image. Filtering after background subtraction further reduces the noise in the image to be processed and makes the single molecule identification and/or counting method more accurate.
In some embodiments, referring to Fig. 7, the image preprocessing step S31 comprises: simplifying the image to be processed after it has undergone background subtraction and then filtering, to obtain the first image. This reduces the computation required by the subsequent image processing.
In some embodiments, referring to Fig. 8, the image preprocessing step S31 comprises: simplifying the image to be processed to obtain the first image. This reduces the computation required by the subsequent single molecule identification and/or counting method.
In some embodiments, background subtraction of the image to be processed comprises: determining the background of the image by an opening operation, and subtracting that background from the image. The opening operation removes small objects, separates objects at thin connections, and smooths the boundaries of larger objects without noticeably changing their area, so the background-subtracted image is obtained more accurately.
Specifically, in an embodiment of the present invention, an a*a window (for example, a 15*15 window) is moved over the image to be processed f(x,y) (for example, a grayscale image), and the background of the image is estimated by an opening operation (erosion followed by dilation), as shown in Equations 1 and 2 below:
g(x,y) = erode[f(x,y), B] = min{ f(x+x', y+y') - B(x',y') | (x',y') ∈ D_b }    Equation 1,
where g(x,y) is the eroded grayscale image, f(x,y) is the original grayscale image, and B is the structuring element.
g(x,y) = dilate[f(x,y), B] = max{ f(x-x', y-y') - B(x',y') | (x',y') ∈ D_b }    Equation 2,
where g(x,y) is the dilated grayscale image, f(x,y) is the original grayscale image, and B is the structuring element.
The background noise is therefore g = imopen(f(x,y), B) = dilate[erode(f(x,y), B), B]    Equation 3.
The background is then subtracted from the original image:
f = f - g = { f(x,y) - g(x,y) | (x,y) ∈ D }    Equation 4.
It will be understood that the specific background subtraction method of this embodiment is applicable to the background subtraction step mentioned in any of the above embodiments.
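The opening-based background subtraction of Equations 1 to 4 can be sketched as follows. This is a minimal illustration, not the embodiment's implementation: it assumes a flat structuring element (B = 0), a square a*a window, and borders handled by ignoring pixels outside the image. The function names are hypothetical.

```python
def _window_reduce(img, a, reduce_fn):
    """Apply reduce_fn (min or max) over the a*a window centred on each pixel.

    Pixels outside the image are ignored (an assumed border policy).
    """
    h, w = len(img), len(img[0])
    r = a // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = reduce_fn(
                img[yy][xx]
                for yy in range(max(0, y - r), min(h, y + r + 1))
                for xx in range(max(0, x - r), min(w, x + r + 1))
            )
    return out


def subtract_background(img, a=15):
    """Estimate the background by opening (erode, then dilate) and subtract it."""
    eroded = _window_reduce(img, a, min)         # Equation 1 with B = 0
    background = _window_reduce(eroded, a, max)  # Equation 2 with B = 0 (Equation 3)
    return [                                     # Equation 4
        [img[y][x] - background[y][x] for x in range(len(img[0]))]
        for y in range(len(img))
    ]
```

Because an opening never exceeds the original image, the subtracted result is non-negative, and a spot narrower than the window survives while a slowly varying background is removed.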
In some embodiments, the filtering is Mexican hat filtering. Mexican hat filtering is easy to implement, which lowers the cost of the single-molecule identification and/or counting method; it also raises the contrast between foreground and background, making the foreground brighter and the background darker.
When performing Mexican hat filtering, the image to be processed is first Gaussian filtered with an m*m window, and the Gaussian-filtered image is then sharpened with a two-dimensional Laplacian, where m is a natural number and an odd number greater than 1. Mexican hat filtering is thus realized in two steps.
Specifically, referring to FIG. 9, the Mexican hat kernel can be expressed as:
where x and y denote the coordinates of a pixel.
First, the image to be processed is Gaussian filtered with an m*m window, as shown in Equation 6 below:
where t1 and t2 denote the position within the filter window, and w_{t1,t2} denotes the Gaussian filter weight.
The image to be processed is then sharpened with a two-dimensional Laplacian, as shown in Equation 7 below:
where K and k both denote Laplacian operators and depend on the sharpening target; to strengthen or weaken the sharpening, modify K and k.
In one example, m=3, so m*m=3*3, and for Gaussian filtering Equation 6 becomes:
It will be understood that the specific Mexican hat filtering method of this embodiment is applicable to the filtering step mentioned in any of the above embodiments.
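The two-step Mexican hat filtering (Gaussian smoothing, then Laplacian sharpening) might be sketched as below for m=3. Everything numeric here is an assumption for illustration only: the binomial 3*3 Gaussian weights, the 4-neighbour form of the Laplacian with gains K and k, the default values K=4 and k=1, and the clamped borders. The function names are hypothetical.

```python
# Assumed 3*3 Gaussian weights (binomial approximation); they sum to 16.
GAUSS_3X3 = [[1, 2, 1], [2, 4, 2], [1, 2, 1]]


def _pixel(img, y, x):
    """Read img[y][x], clamping coordinates so borders repeat the edge pixel."""
    h, w = len(img), len(img[0])
    return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]


def gaussian_3x3(img):
    """Step 1: smooth with the m*m (here 3*3) Gaussian window."""
    return [
        [
            sum(
                GAUSS_3X3[dy + 1][dx + 1] * _pixel(img, y + dy, x + dx)
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ) / 16
            for x in range(len(img[0]))
        ]
        for y in range(len(img))
    ]


def laplacian_sharpen(img, K=4, k=1):
    """Step 2: K * centre minus k * sum of the 4 direct neighbours (assumed form)."""
    return [
        [
            K * _pixel(img, y, x)
            - k * (
                _pixel(img, y - 1, x) + _pixel(img, y + 1, x)
                + _pixel(img, y, x - 1) + _pixel(img, y, x + 1)
            )
            for x in range(len(img[0]))
        ]
        for y in range(len(img))
    ]


def mexican_hat(img, K=4, k=1):
    """Gaussian smoothing followed by Laplacian sharpening."""
    return laplacian_sharpen(gaussian_3x3(img), K, k)
```

With K=4 and k=1 a flat region maps to zero while an isolated spot stays positive at its centre and turns negative around it, which matches the stated effect of a brighter foreground and a darker background.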
In some embodiments, the simplified image is a binarized image. A binarized image is easy to process and widely applicable.
Specifically, in one example, the binarized image may contain the two values 0 and 1, which characterize different attributes of a pixel, and the binarized image can be expressed as:
In some embodiments, the simplification obtains a signal-to-noise ratio (SNR) matrix from the image to be processed before simplification, and simplifies that image according to the SNR matrix to obtain the first image.
In a specific example, background subtraction may first be performed on the image to be processed, and the SNR matrix is then obtained from the background-subtracted image. Information is thus taken from an image with less noise, which makes the result of the single-molecule identification and/or counting method more accurate.
Specifically, in one example, the SNR matrix can be expressed as: Equation 8, where x and y denote the coordinates of a pixel, h denotes the height of the image, w denotes the width of the image, i∈w, and j∈h.
In one example, the simplified image is a binarized image obtained from the SNR matrix, as shown in Equation 9:
When calculating the SNR matrix, background subtraction and/or filtering may first be performed on the image to be processed, as in the background subtraction and filtering steps of the above embodiments. Background subtraction yields Equation 4, and the ratio matrix of the background-subtracted image to the background is then obtained:
R = f/g = { f(x,y)/g(x,y) | (x,y) ∈ D }    Equation 10, where D denotes the dimensions (height*width) of the image f.
From this the SNR matrix can be obtained:
In some embodiments, the step of analyzing the first image to calculate the bright-spot determination threshold includes: processing the first image by Otsu's method to calculate the bright-spot determination threshold. The threshold is thus found by a mature and simple method, which improves the accuracy and lowers the cost of the single-molecule identification and/or counting method. Searching for the threshold on the first image also improves the efficiency and accuracy of the method.
Specifically, Otsu's method (the OTSU algorithm), also called the maximum between-class variance method, segments the image by maximizing the between-class variance, which corresponds to the smallest probability of misclassification and hence high accuracy. Let T be the threshold separating foreground from background in the image to be processed; let ω0 be the proportion of pixels in the whole image that belong to the foreground, with mean gray level μ0, and ω1 the proportion that belong to the background, with mean gray level μ1. Denoting the overall mean gray level of the image by μ and the between-class variance by var:
μ = ω0*μ0 + ω1*μ1    Equation 11;
var = ω0(μ0-μ)^2 + ω1(μ1-μ)^2    Equation 12.
Substituting Equation 11 into Equation 12, and using ω0+ω1=1, gives the equivalent Equation 13:
var = ω0ω1(μ1-μ0)^2    Equation 13.
The threshold T that maximizes the between-class variance is found by traversing the candidate values, and is the required bright-spot determination threshold T.
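The traversal over candidate thresholds can be sketched as follows, using Equation 13 directly. This is a minimal sketch under stated assumptions: pixel values are non-negative integers below `levels`, pixels with value less than or equal to T are treated as background, and ties are resolved in favour of the smallest T. The function name is hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold T that maximises the between-class variance."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * c for i, c in enumerate(hist))

    best_t, best_var = 0, -1.0
    count_bg, sum_bg = 0, 0
    for t in range(levels):
        count_bg += hist[t]          # pixels with value <= t (background)
        sum_bg += t * hist[t]
        count_fg = total - count_bg  # pixels with value > t (foreground)
        if count_bg == 0 or count_fg == 0:
            continue                 # one class is empty, variance undefined
        w1 = count_bg / total        # omega_1, background proportion
        w0 = count_fg / total        # omega_0, foreground proportion
        mu1 = sum_bg / count_bg
        mu0 = (total_sum - sum_bg) / count_fg
        var = w0 * w1 * (mu1 - mu0) ** 2  # Equation 13
        if var > best_var:
            best_var, best_t = var, t
    return best_t
```

For a 16-bit image, `levels=65536` would be passed instead of the default.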
In some embodiments, referring to FIG. 10, the step of determining whether a candidate bright spot is an image bright spot according to the bright-spot determination threshold includes:
Step S41: search the first image for pixels with greater than (h*h-1) connectivity and take each pixel found as the center of a candidate bright spot, where h*h corresponds one-to-one with a bright spot, each value in h*h corresponds to one pixel, and h is a natural number and an odd number greater than 1;
Step S42: determine whether the center of the candidate bright spot satisfies the condition I_max*A_BI*ceof_guass > T, where I_max is the strongest central intensity of the h*h window, A_BI is the proportion of pixels in the h*h window of the first image that equal a set value, ceof_guass is the correlation coefficient between the pixels of the h*h window and a two-dimensional Gaussian distribution, and T is the bright-spot determination threshold.
If the condition is satisfied, S43: the bright spot corresponding to the center of the candidate bright spot is determined to be an image bright spot contained in the image to be processed;
If the condition is not satisfied, S44: the bright spot corresponding to the center of the candidate bright spot is discarded. Image bright spots are thus detected.
Specifically, I_max can be understood as the strongest central intensity of the candidate bright spot. In one example, h=3 and pixels with greater than 8-connectivity are searched for, as shown in FIG. 11. The pixels found are taken as the pixels of candidate bright spots. I_max is the strongest central intensity of the 3*3 window, A_BI is the proportion of pixels in the 3*3 window of the first image that equal the set value, and ceof_guass is the correlation coefficient between the pixels of the 3*3 window and a two-dimensional Gaussian distribution.
The first image is the image after simplification; for example, the first image may be a binarized image, in which case the set value may be the value a pixel takes when it satisfies a set condition. In another example, the binarized image may contain the two values 0 and 1, which characterize different attributes of a pixel; the set value is 1, and A_BI is the proportion of 1s in the h*h window of the binarized image. For example, referring to Equation 9, BI=1 when SNR<=mean(SNR).
In addition, in some embodiments, h may equal the value of m chosen for the Mexican hat filtering, that is, h=m.
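The decision in steps S42 to S44 might be sketched as below. Several points are assumptions made only for illustration: I_max is taken as the largest pixel value in the h*h window, the correlation coefficient is the Pearson correlation against an isotropic Gaussian template with an assumed `sigma`, and the set value in the binarized window is 1. All function and parameter names are hypothetical.

```python
import math


def gaussian_template(h, sigma=1.0):
    """Flattened h*h samples of an isotropic 2-D Gaussian centred on the window."""
    r = h // 2
    return [
        math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
        for dy in range(-r, r + 1)
        for dx in range(-r, r + 1)
    ]


def pearson(a, b):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def is_bright_spot(window, binary_window, T, sigma=1.0):
    """Check I_max * A_BI * ceof_guass > T for one h*h candidate window."""
    flat = [v for row in window for v in row]
    flat_bin = [v for row in binary_window for v in row]
    i_max = max(flat)                                     # assumed meaning of I_max
    a_bi = sum(1 for v in flat_bin if v == 1) / len(flat_bin)
    coef = pearson(flat, gaussian_template(len(window), sigma))
    return i_max * a_bi * coef > T
```

A window that looks like a Gaussian blob and is mostly set in the binarized image scores close to its peak intensity, while a flat or ring-shaped window scores near zero or below and is discarded.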
In some embodiments, when the above images are acquired, the camera captures fluorescence from multiple fields of view (FOV) sequentially in time. The image bright-spot intensities contained in the resulting image data therefore correspond to the camera's acquisition time series.
In step S02, after the required image bright spots are obtained, the bright-spot intensities at adjacent acquisition times are joined point to point to form a line graph of bright-spot intensity against time, as shown in FIG. 12. In FIG. 12, the horizontal axis is the fluorescence acquisition time in milliseconds (ms) and the vertical axis is the bright-spot intensity. In one example, the interval between two adjacent fluorescence acquisitions is 20 ms.
The vertical axis gives the corresponding bright-spot intensity value. In embodiments of the present invention, the bright-spot intensity value is the bright-spot pixel value: for a 16-bit TIFF image the pixel value lies in the range 0-65535, and for an 8-bit grayscale image it lies in the range 0-255. Embodiments of the present invention use 16-bit TIFF images.
In step S03, the waveform of the line graph is converted into a form suitable for image processing so that histogram statistics can be computed subsequently. Image processing of the line graph includes dividing it into a grid.
In some embodiments, the line graph is divided into a grid according to the number of time frames at which intensity was acquired and the magnitude of the intensity. The grid is thus obtained by fairly simple processing of the line graph, which lowers the cost of the single-molecule identification method. Specifically, the graph may be divided into M parts by time frame and N parts by intensity, forming M*N grid cells. A time frame of intensity acquisition is the interval between two adjacent fluorescence acquisitions. In one embodiment, the horizontal-axis direction of a grid cell is called its length direction and the vertical-axis direction its height direction. The length of a cell may be set to a multiple of the time frame, such as 1, 2, or 2.5 times. The height of a cell can be set flexibly; for example, for a 16-bit TIFF image the vertical axis ranges over 0-65535, and when dividing the grid the vertical-axis values can be normalized and split evenly into 50 parts, so that the height of one cell is 0.02, that is, N=50.
In one example, the interval between two adjacent fluorescence acquisitions is 20 ms, the length of one cell equals one interval, and the height is 0.02. Referring to FIG. 16, in such an example the number of line segments falling in one cell may be 0, 1, or 2. The black dots in FIG. 16 represent the time series of bright-spot intensity.
In one example, referring to FIG. 13, the line graph is divided into an 8*6 grid, and the number of line segments and/or line-segment endpoints falling in each cell is counted. In FIG. 13, the number of line segments falling in each cell (that is, the number of times each cell is crossed by a line segment) is counted, and the number in each cell is that count. The black dots in FIG. 13 represent the time series of bright-spot intensity.
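Counting how many line segments cross each grid cell might be sketched as follows. The layout is an assumption chosen so that a cell can receive 0, 1, or 2 crossings: each column is one time interval wide and centred on a sampling instant, intensities are normalised to [0, 1], and a value lying exactly on a row boundary is assigned to the upper row. The function name is hypothetical.

```python
def count_grid_crossings(series, n_rows):
    """Return an n_rows x len(series) grid of segment-crossing counts.

    series: normalised intensities in [0, 1], one per acquisition frame.
    Row 0 is the lowest intensity band; column j is centred on sample j.
    """
    def row_of(v):
        return min(int(v * n_rows), n_rows - 1)  # clamp v == 1.0 into the top row

    n_cols = len(series)
    grid = [[0] * n_cols for _ in range(n_rows)]

    def mark(col, y_a, y_b):
        # A piece of a segment running from y_a to y_b inside one column
        # crosses every row between the two end rows.
        lo, hi = sorted((row_of(y_a), row_of(y_b)))
        for i in range(lo, hi + 1):
            grid[i][col] += 1

    for j in range(n_cols - 1):
        mid = (series[j] + series[j + 1]) / 2
        mark(j, series[j], mid)          # first half lies in column j
        mark(j + 1, mid, series[j + 1])  # second half lies in column j + 1
    return grid
```

Under this layout a cell containing an interior sample point is touched by both the incoming and the outgoing segment, which is where a count of 2 arises.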
In some embodiments, step S04 includes: dividing into N groups by intensity and counting the frequency of the counts falling in the N groups: n_i = Σ_{j=1}^{M} g_ij, where n_i denotes the sum of the frequencies of the counts falling in row i of the grid, j denotes the time frame, g_ij denotes the frequency of the count falling in cell (i,j), and M denotes the number of time frames. The counts are thus converted into a histogram of intensity against count, which simplifies the subsequent single-molecule identification computation.
Specifically, in one example, the horizontal axis of the histogram is the group number and the vertical axis is the frequency of the counts falling in the corresponding group. Note that N equals the N of the M*N grid formed above, and M equals the M of that grid.
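The grouping by intensity amounts to summing each grid row over the M time frames. A minimal sketch, assuming the grid is stored as a list of N rows each holding M per-cell counts (the function name is hypothetical):

```python
def row_frequencies(grid):
    """Return [n_1, ..., n_N], where n_i is the sum of g_ij over the M columns."""
    return [sum(row) for row in grid]


# Usage: a 2 x 3 grid (N = 2 intensity groups, M = 3 time frames).
histogram = row_frequencies([[1, 2, 0], [0, 0, 1]])
```

The resulting list is the histogram: its index is the group number and its value is the frequency for that group.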
In some embodiments, the step of grouping by intensity and computing frequency statistics of the counts to obtain the histogram includes: performing histogram equalization over an L window: where n_p denotes the equalization of n_i, n_i' denotes the sum of the equalization results of n_i, and p is an integer related to the size of window L and to row i. This makes the distribution of the histogram more uniform and easier to identify. The L window is used for histogram equalization, and the value of L depends on how fast the single-molecule fluorescence decays; in general, if the single-molecule fluorescence quenches quickly, L should not be too large. The precision of the histogram statistics is affected by the size of the L window, and L can be set flexibly to choose a suitable precision. In one example, L takes values in the range [5,15].
Referring to FIG. 17, which shows the equalized histogram, the horizontal axis of the histogram is the group number and the vertical axis is the frequency of the counts falling in the corresponding group.
In step S05, all maximum points of the histogram can be found from its derivative. The first set threshold Q and the second set threshold H depend on the shape of the peaks of the line graph: the sharper the peak, the larger Q and the smaller H; the broader the peak, the smaller Q and the larger H. In one example, Q takes values in the range [2,6] and H takes values in the range [4,10].
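Finding maxima from the discrete derivative and applying the two thresholds might look like the sketch below. How the peak width is measured is an assumption here: the width is taken as the number of contiguous bins around the maximum whose value is at least half the peak value. The function name is hypothetical.

```python
def count_peaks(hist, Q, H):
    """Count maxima whose value exceeds Q and whose width exceeds H."""
    count = 0
    for i in range(1, len(hist) - 1):
        rising = hist[i] - hist[i - 1] > 0        # derivative positive before i
        not_rising = hist[i + 1] - hist[i] <= 0   # derivative non-positive after i
        if not (rising and not_rising):
            continue
        half = hist[i] / 2
        left = i
        while left > 0 and hist[left - 1] >= half:
            left -= 1
        right = i
        while right < len(hist) - 1 and hist[right + 1] >= half:
            right += 1
        width = right - left + 1                  # assumed half-maximum width
        if hist[i] > Q and width > H:
            count += 1
    return count
```

A plateau is counted once, at its first bin, because only that bin is preceded by a strictly positive derivative.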
In some embodiments, before dividing the line graph into a grid, the single-molecule identification method further includes the step of filtering the line graph. This removes abrupt errors caused by flickering light intensity and camera sampling, and smooths the waveform of the line graph. Specifically, the waveform may be smoothed by median filtering over a window of size L2: R=median(Z_i). In one example, referring to FIG. 14 and FIG. 15 together, FIG. 14 is the line graph before filtering and FIG. 15 is the line graph after filtering; the filtered waveform is visibly smoother, which helps improve the accuracy and efficiency of single-molecule identification.
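Median filtering of the intensity time series can be sketched as follows. The window size default and the border policy (the window is simply truncated at the ends of the series) are assumptions for illustration, and the function name is hypothetical.

```python
from statistics import median


def median_filter(series, window=3):
    """Replace each sample by the median of the window centred on it."""
    r = window // 2
    return [
        median(series[max(0, i - r): i + r + 1])  # truncated at the borders
        for i in range(len(series))
    ]
```

A single-frame spike such as the 9 in `[1, 1, 9, 1, 1]` is removed, while a step that persists for longer than half the window is preserved.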
In embodiments of the present invention, a maximum point is a peak point, that is, the apex (turning point) of a peak; in other words, the peak containing one maximum point that satisfies the conditions is judged to correspond to one single molecule.
In some embodiments, referring to FIG. 18, the single-molecule identification method further includes the steps of: S51, performing line erosion on the gridded line graph according to the count of each cell, to convert the gridded line graph into a simplified map; S52, performing run-length encoding on the simplified map to label connected regions; S53, calculating the area of each connected region and determining that one connected region satisfying the following condition corresponds to one single molecule: the area of the connected region is greater than a third set threshold.
This widens the range of application of the single-molecule identification and/or counting method. The run-length-encoding-based identification method can accurately identify single molecules from the time-series data of bright-spot intensity, and is particularly suitable when one bright spot contains no more than 3 single molecules. In this embodiment, combining the histogram-statistics-based and run-length-encoding-based methods allows single molecules to be accurately identified in line graphs (time series of bright-spot intensity) of various waveforms.
During line erosion, the morphological erosion can be performed with the following formula: g(x,y) = erode[f(x,y), B] = min{ f(x+x', y+y') - B(x',y') | (x',y') ∈ D_b }. Preferably, a straight-line structuring element is chosen, such as a window of size W*1; if the count of the cells in the window exceeds a threshold T, the cell is marked with a first value, and otherwise with a second value. The gridded line graph is thus converted into a simplified map consisting of the first value and the second value. In some embodiments, the simplified map is a binarized map; for example, the first value may be 1 and the second value 0.
In one example, referring to FIG. 19, the length of one cell is L1, W=2*L1, and T=2. FIG. 19 shows 5 cells arranged along the length direction, and the number in each cell is its count. During line erosion the window is aligned with the cells, and after line erosion the 5 cells are marked 0, 1, 0, 0, 0 respectively.
In another example, referring to FIG. 20, the length of one cell is L1, W=2*L1, and T=2. FIG. 20 shows 5 cells arranged along the length direction, and the number in each cell is its count. During line erosion the window is offset from the cells, and after line erosion the 5 cells are marked 0, 1, 0, 0, 0 respectively.
Note that W must be greater than or equal to the length of one cell; preferably, W is an integer multiple of the cell length. In the examples, W>=L1, and preferably W is an integer multiple of L1.
In other examples, the threshold T takes values in the range [6,8]; its choice depends on the fluctuation of the waveform of the line graph: the smaller the fluctuation, the larger T.
For ease of understanding, run-length encoding is described below using the 1 and 0 of a binarized map as an example. It will be understood that those skilled in the art can adapt the following description to other types of simplified map and other choices of the first and second values.
Run-length encoding may use 8-connectivity. On the grid, the cells of each connected region are joined recursively according to the 8-connectivity rule, and the connected regions are then labeled by run-length encoding. Specifically, using 8-connectivity (for example, the 3*3 window shown in FIG. 21) and starting from a non-zero cell Q, if the cells in the 8 directions around Q are non-zero, those cells are labeled with the same value as Q, and so on. After the whole simplified map has been processed, a label map such as that shown in FIG. 22 is obtained.
In FIG. 22, different connected regions are labeled with different values. When calculating the area of each connected region, the number of occurrences of a given value is taken as the area of that region; in FIG. 22, the value 9 occurs 9 times, so the region labeled 9 has area 9, and the value 7 occurs 20 times, so the region labeled 7 has area 20.
The above example uses a recursive algorithm; in other examples, a traversal algorithm may be used instead to find the connected regions.
If the area of a connected region is greater than the third set threshold P, that connected region corresponds to one single molecule. The value of P depends on the decay time of the single-molecule fluorescence. In one example, P takes values in the range [5,10].
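Labeling 8-connected regions and applying the area threshold P can be sketched as below. Note the substitution: this sketch uses an iterative flood fill (a traversal algorithm) rather than run-length encoding, and labels regions 1, 2, 3, ... in scan order. The function name is hypothetical.

```python
def count_molecules_by_area(binary, P):
    """Return (number of regions with area > P, {label: area}) for a 0/1 map."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    areas = {}
    next_label = 1
    for y in range(h):
        for x in range(w):
            if binary[y][x] == 0 or labels[y][x]:
                continue                      # background or already labeled
            labels[y][x] = next_label
            stack = [(y, x)]
            area = 0
            while stack:
                cy, cx = stack.pop()
                area += 1
                for dy in (-1, 0, 1):         # visit the 8 neighbours
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
            areas[next_label] = area
            next_label += 1
    return sum(1 for a in areas.values() if a > P), areas
```

The area of a region is simply the number of cells carrying its label, as in the counting of repeated values described for the label map.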
Referring to FIG. 23, a single-molecule counting method according to an embodiment of the present invention includes the steps of: S81, inputting a time series of image bright-spot intensity; S82, forming, from the time series, a line graph of bright-spot intensity against time, the line graph consisting of multiple line segments; S83, dividing the line graph into a grid to form multiple cells arranged in an array, and counting the number of line segments and/or line-segment endpoints falling in each cell; S84, grouping by intensity and computing frequency statistics of the counts to obtain a histogram; S85, finding the maximum points of the histogram and determining that the peak containing one maximum point satisfying the following conditions corresponds to one single molecule: the value of the maximum point is greater than a first set threshold and the width of the peak containing it is greater than a second set threshold; S86, calculating the number of single molecules S1. By converting the line graph of the bright-spot intensity time series into a form suitable for image processing and obtaining a histogram, this counting method counts single molecules quickly and with high precision. Note that the descriptions of the technical features and advantages of the single-molecule identification and/or counting methods in any of the above embodiments and examples, including the explanations of steps, parameter settings, image preprocessing, and bright-spot detection, also apply to the counting method of this embodiment and, to avoid redundancy, are not repeated here.
For example, in some embodiments, before the line graph is divided into a grid in step S83, the single-molecule counting method further includes the step of filtering the line graph. As another example, in some embodiments, referring to FIG. 24, the single-molecule counting method further includes the steps of: S91, performing line erosion on the gridded line graph according to the count of each cell, to convert the gridded line graph into a simplified map; S92, performing run-length encoding on the simplified map to label connected regions; S93, calculating the area of each connected region and determining that one connected region satisfying the following condition corresponds to one single molecule: the area of the connected region is greater than a third set threshold; S94, calculating the number of single molecules S2; S95, taking the smaller of S1 and S2 as the final number of single molecules. The histogram-statistics-based counting method is particularly suited to accurately finding single molecules when a bright spot contains more than 3 of them, whereas the run-length-encoding-based counting method is particularly suited to the case where a bright spot contains 3 or fewer. In this embodiment, combining the two methods allows single molecules in line graphs of various waveforms to be accurately found and counted. In some embodiments, the simplified map is a binarized map.
请参图25,本发明实施方式的一种单分子的计数方法,包括步骤:S61,输入图像亮点强度的时间序列;S62,根据时间序列,形成图像亮点的时间与强度的折线图,折线图由多条线段组成;S63,对折线图进行网格划分以形成阵列排布的多个网格,统计落在每个网格的线段和/或线段的端点的次数;S64,基于强度的大小进行分组,对次数进行频数统计,以获得直方图;S65,查找直方图的极大值点,判定满足以下条件时,对单分子的计数加1:极大值点的值大于第一设定阈值且极大值点所在的峰的宽度大于第二设定阈值。上述单分子的计数方法,通过对亮点强度的时间序列的折线图转化为图像处理以得到直方图,能够快速地对单分子进行计数,而计数的精度也较高。Referring to FIG. 25, a single molecule counting method according to an embodiment of the present invention includes the steps of: S61, inputting a time series of image brightness intensity; S62, forming a line graph of time and intensity of an image bright point according to a time series, a line chart Consisting of a plurality of line segments; S63, meshing the line graphs to form a plurality of grids arranged in the array, counting the number of times of the line segments and/or end points of each line segment; S64, based on the intensity Perform grouping, perform frequency statistics on the number of times to obtain a histogram; S65, find the maximum point of the histogram, and determine that the count of the single molecule is increased by 1 when the following conditions are satisfied: the value of the maximum point is greater than the first setting The width of the peak at which the threshold and the maximum point are located is greater than the second set threshold. The counting method of the single molecule described above is converted into image processing by a line graph of a time series of bright spot intensity to obtain a histogram, and the single molecule can be quickly counted, and the counting accuracy is also high.
It should be noted that the descriptions of the technical features and advantages of the single-molecule identification and/or counting methods in any of the above embodiments and examples, including the explanations of the steps, parameter settings, image preprocessing, and bright-spot detection, also apply to the single-molecule counting method of this embodiment and, to avoid redundancy, are not repeated in detail here.
For example, in some embodiments, before the line graph is divided into a grid in step S63, the single-molecule counting method further includes a step of filtering the line graph. As another example, in some embodiments, referring to FIG. 26, the single-molecule counting method further includes the steps of: S71, performing line erosion on the gridded line graph according to the count corresponding to each grid cell, so as to convert the gridded line graph into a simplified image; S72, performing run-length encoding on the simplified image to label connected regions; S73, calculating the area of each connected region, and incrementing the single-molecule count by 1 when the following condition is satisfied: the area of the connected region is greater than a third set threshold; S74, taking the smaller of the histogram-based single-molecule count and the run-length-encoding-based single-molecule count as the final number of single molecules. In this way, the single-molecule counting method can be applied more widely and can yield a more accurate number of single molecules.
The histogram-based single-molecule counting method is particularly suited to accurately finding single molecules when a bright spot contains more than 3 single molecules, whereas the run-length-encoding-based single-molecule counting method is particularly suited to the case where a bright spot contains 3 or fewer single molecules. In this embodiment, combining the two methods makes it possible to accurately find and count single molecules in line graphs of various waveforms. For example, let the number of single molecules obtained from the histogram be S1 and the number obtained from run-length encoding be S2; S1 and S2 are compared, and the smaller of the two is taken as the final number of single molecules.
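The run-length-encoding branch and the final "take the smaller of S1 and S2" rule can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patent's implementation: the simplified image is assumed to be a binary matrix, regions are assumed to be 4-connected, and the helper names (`region_areas`, `count_by_regions`) and the example values are hypothetical. Each row is encoded as runs of foreground pixels, and runs that overlap a run in the previous row are merged with a small union-find structure.

```python
def region_areas(binary):
    """Return the pixel area of every 4-connected foreground region."""
    parent = []  # union-find parent per run label
    area = []    # accumulated area, valid at root labels

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    prev_runs = []
    for row in binary:
        runs = []
        c = 0
        while c < len(row):
            if not row[c]:
                c += 1
                continue
            start = c
            while c < len(row) and row[c]:
                c += 1
            label = len(parent)          # one new label per run [start, c)
            parent.append(label)
            area.append(c - start)
            for p_start, p_end, p_label in prev_runs:
                if p_start < c and start < p_end:  # column ranges overlap
                    ra, rb = find(p_label), find(label)
                    if ra != rb:
                        parent[rb] = ra
                        area[ra] += area[rb]
            runs.append((start, c, label))
        prev_runs = runs
    return [area[i] for i in range(len(parent)) if parent[i] == i]


def count_by_regions(binary, third_threshold):
    """S2: number of connected regions whose area exceeds the threshold."""
    return sum(1 for a in region_areas(binary) if a > third_threshold)


# Hypothetical simplified (binarized) image with regions of area 3, 2 and 1.
simplified = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
s2 = count_by_regions(simplified, third_threshold=1)
s1 = 3  # hypothetical histogram-based count
final_count = min(s1, s2)
```

Taking the minimum is the conservative choice described in the text: whichever branch over-counts on a given waveform, the smaller value is kept.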
Referring to FIG. 27, a single-molecule identification device 200 according to an embodiment of the present invention is used to implement all or some of the steps of the single-molecule identification method in any of the above embodiments or examples. The single-molecule identification device 200 includes: a first input unit 202, configured to input a time series of the intensity of an image bright spot; a first conversion unit 204, configured to form, from the time series in the first input unit 202, a line graph of time versus intensity for the image bright spot, the line graph consisting of multiple line segments; a first grid statistics unit 206, configured to divide the line graph from the first conversion unit 204 into a grid of cells arranged in an array and to count the number of times line segments and/or segment endpoints fall in each cell; a first histogram statistics unit 208, configured to group by intensity and tally the frequencies of the counts from the first grid statistics unit 206 to obtain a histogram; and a first determination unit 210, configured to find the local maxima of the histogram from the first histogram statistics unit 208 and to determine that the peak containing a local maximum that satisfies the following condition corresponds to one single molecule: the value at the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold. By converting the line graph of the bright-spot intensity time series into an image-processing problem to obtain a histogram, the single-molecule identification device 200 can identify single molecules quickly and with relatively high accuracy.
It should be noted that the explanations of the technical features and beneficial effects of the single-molecule identification method in any of the above embodiments and examples also apply to the single-molecule identification device 200 of this embodiment and, to avoid redundancy, are not repeated in detail here.
For example, in some embodiments, referring to FIG. 28, the single-molecule identification device 200 further includes a first filtering unit 212, connected to the first grid statistics unit 206 and configured to filter the line graph from the first conversion unit 204 before the line graph is divided into a grid.
In some embodiments, in the first grid statistics unit 206, the line graph is divided into a grid according to the time-frame number at which the intensity was acquired and the magnitude of the intensity.
In some embodiments, in the first histogram statistics unit 208, grouping by intensity and tallying the frequencies of the counts to obtain a histogram includes: dividing the intensity range into N groups and tallying the frequency with which the counts fall in each of the N groups: $n_i = \sum_{j=1}^{M} g_{ij}$, where $n_i$ denotes the sum of the frequencies of the counts falling in row i of the grid, j denotes the time-frame number, $g_{ij}$ denotes the frequency of the counts falling in grid cell (i, j), and M denotes the number of time frames.
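The row-sum definition of n_i can be made concrete with a short sketch. The grid construction here is an assumption made for illustration: each segment between two consecutive frames occupies one time column, and it is counted once in every intensity row it passes through. The names `segment_grid`, `row_sums`, `n_rows` and `row_height` are hypothetical.

```python
def segment_grid(series, n_rows, row_height):
    """Return g[i][j]: 1 if the segment in time column j crosses intensity row i."""
    def row_of(value):
        # Map an intensity to its grid row, clamped to the grid.
        return max(0, min(n_rows - 1, int(value / row_height)))

    n_cols = len(series) - 1  # assumption: one column per segment
    g = [[0] * n_cols for _ in range(n_rows)]
    for j in range(n_cols):
        a, b = row_of(series[j]), row_of(series[j + 1])
        for i in range(min(a, b), max(a, b) + 1):
            g[i][j] += 1
    return g


def row_sums(g):
    """n_i = sum over j of g[i][j], one value per intensity row."""
    return [sum(row) for row in g]


# Hypothetical three-frame trace: a steep rise followed by a plateau.
g = segment_grid([0.5, 2.5, 2.2], n_rows=3, row_height=1.0)
n = row_sums(g)
```

A steep segment touches many rows once, while a plateau keeps adding to the same row, which is why plateaus show up as peaks in n_i.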
In some embodiments, in the first histogram statistics unit 208, grouping by intensity and tallying the frequencies of the counts to obtain a histogram includes: performing histogram equalization over a window L: $n_i' = \sum_{p} n_p$, where $n_p$ denotes the terms used to equalize $n_i$, $n_i'$ denotes the sum of the equalization results for $n_i$, and p is an integer that depends on the size of the window L and on the row i concerned.
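The text does not state the exact range of p, so the following is only one plausible reading: a moving-window sum in which p runs over a window of L rows centered on row i and clipped at the edges of the histogram. The centered window and the name `window_sum` are assumptions.

```python
def window_sum(n, window):
    """Return n_i' as the sum of n_p over a centered window of rows around i."""
    half = window // 2
    return [sum(n[max(0, i - half): i + half + 1]) for i in range(len(n))]


# Hypothetical row sums with isolated spikes; the window merges neighbors.
smoothed = window_sum([1, 0, 2, 0, 1], window=3)
```

Summing over neighboring rows spreads narrow spikes across adjacent bins, which makes the later peak-width test less sensitive to how the intensity axis happens to be binned.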
In some embodiments, referring to FIG. 29, the single-molecule identification device 200 further includes: a first simplification unit 214, configured to perform line erosion on the gridded line graph according to the count corresponding to each grid cell, so as to convert the gridded line graph into a simplified image; and a first labeling unit 216, configured to perform run-length encoding on the simplified image to label connected regions. The first determination unit 210 calculates the area of each connected region and determines that a connected region satisfying the following condition corresponds to one single molecule: the area of the connected region is greater than a third set threshold. In some embodiments, the simplified image is a binarized image.
In some embodiments, referring to FIG. 30, the single-molecule identification device 200 further includes: a first image preprocessing unit 218, configured to analyze an input image to be processed to obtain a first image, the image to be processed containing at least one image bright spot and each image bright spot having at least one pixel; and a first bright-spot detection unit 220, configured to: analyze the first image to calculate a bright-spot determination threshold; analyze the first image to obtain candidate bright spots; and determine, according to the bright-spot determination threshold, whether each candidate bright spot is an image bright spot. If so, the time series of the image bright-spot intensity is obtained; if not, the candidate bright spot is discarded.
In some embodiments, the first image preprocessing unit 218 includes a first background subtraction unit 226, configured to perform background subtraction on the image to be processed to obtain the first image.
In some embodiments, the first image preprocessing unit 218 includes a first image simplification unit 222, configured to simplify the background-subtracted image to be processed to obtain the first image.
In some embodiments, the first image preprocessing unit 218 includes a first image filtering unit 224, configured to filter the image to be processed to obtain the first image.
In some embodiments, the first image preprocessing unit 218 includes a first background subtraction unit 226 and a first image filtering unit 224; the first background subtraction unit 226 is configured to perform background subtraction on the image to be processed, and the first image filtering unit 224 is configured to then filter the background-subtracted image to be processed to obtain the first image.
In some embodiments, the first image preprocessing unit 218 includes a first image simplification unit 222, configured to simplify the image to be processed after it has been background-subtracted and then filtered, to obtain the first image.
In some embodiments, the first image preprocessing unit 218 includes a first image simplification unit 222, configured to simplify the image to be processed to obtain the first image.
In some embodiments, in the first background subtraction unit 226, performing background subtraction on the image to be processed includes: determining the background of the image to be processed using a morphological opening operation, and subtracting that background from the image to be processed. In some embodiments, the filtering is Mexican hat filtering. In some embodiments, the simplification is binarization.
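Background estimation by morphological opening can be sketched in plain Python as an erosion (local minimum) followed by a dilation (local maximum) with the same square window. The square structuring element, its size `k`, the clipped handling of image borders and the function names are assumptions for illustration; the text does not specify them.

```python
def _window_filter(img, k, pick):
    """Apply `pick` (min or max) over a k x k window clipped at the borders."""
    height, width = len(img), len(img[0])
    r = k // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            window = [img[yy][xx]
                      for yy in range(max(0, y - r), min(height, y + r + 1))
                      for xx in range(max(0, x - r), min(width, x + r + 1))]
            out[y][x] = pick(window)
    return out


def opening(img, k):
    """Morphological opening: erosion (min) followed by dilation (max)."""
    return _window_filter(_window_filter(img, k, min), k, max)


def subtract_background(img, k):
    """Estimate the background by opening and subtract it from the image."""
    background = opening(img, k)
    return [[p - b for p, b in zip(row, b_row)]
            for row, b_row in zip(img, background)]


# Hypothetical 5x5 frame: flat background of 10 with one bright pixel of 50.
frame = [[10] * 5 for _ in range(5)]
frame[2][2] = 50
cleaned = subtract_background(frame, k=3)
```

Because the bright pixel is smaller than the window, the opening removes it from the background estimate, so the subtraction leaves the spot and flattens everything else. The window should therefore be larger than the spots of interest.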
In some embodiments, the first image simplification unit 222 is configured to obtain a signal-to-noise ratio matrix from the image to be processed as it stands before simplification, and to simplify that image according to the signal-to-noise ratio matrix to obtain the first image.
In some embodiments, in the first bright-spot detection unit 220, analyzing the first image to calculate the bright-spot determination threshold includes: processing the first image with Otsu's method to calculate the bright-spot determination threshold.
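Otsu's method picks the gray level that maximizes the between-class variance of the two pixel classes it separates. The sketch below is a plain-Python version for integer pixel values, assuming 256 gray levels and a flattened pixel list; how the patent applies the result to the first image is not detailed in the text, and the example data are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance (class 0: p <= t)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * c for i, c in enumerate(hist))

    w0 = 0     # number of pixels in class 0
    sum0 = 0   # intensity sum of class 0
    best_t, best_var = 0, -1.0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


# Hypothetical bimodal pixel list: dark background and bright spots.
pixels = [10] * 50 + [200] * 50
t = otsu_threshold(pixels)
foreground = [p for p in pixels if p > t]
```

For a cleanly bimodal image, any threshold between the two modes gives the same variance; this sketch returns the lowest such level.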
In some embodiments, in the first bright-spot detection unit 220, determining whether a candidate bright spot is an image bright spot according to the bright-spot determination threshold includes: searching the first image for pixels that are more than (h*h-1)-connected and taking each pixel found as the center of a candidate bright spot, where h is a natural number and an odd number greater than 1; and determining whether the center of the candidate bright spot satisfies the condition $I_{max} \cdot A_{BI} \cdot ceof_{guass} > T$, where $I_{max}$ is the strongest intensity at the center of the h*h window, $A_{BI}$ is the proportion of pixels in the h*h window of the first image that equal the set value, $ceof_{guass}$ is the correlation coefficient between the pixels of the h*h window and a two-dimensional Gaussian distribution, and T is the bright-spot determination threshold. If the condition is satisfied, the bright spot corresponding to the center of the candidate bright spot is determined to be an image bright spot; if not, the bright spot corresponding to the center of the candidate bright spot is discarded.
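The bright-spot criterion, the product of I_max, A_BI and ceof_guass compared against T, can be sketched as follows. Several details are assumptions, because the text does not fix them: I_max is read as the center pixel of the h*h window, the correlation is computed as a Pearson coefficient against an isotropic Gaussian with a hypothetical width `sigma`, and `mask` stands for the h*h window of the binarized first image with `set_value` marking foreground.

```python
import math


def gaussian_window(h, sigma):
    """Flattened h x h samples of an isotropic 2D Gaussian centered in the window."""
    r = h // 2
    return [math.exp(-((x - r) ** 2 + (y - r) ** 2) / (2 * sigma ** 2))
            for y in range(h) for x in range(h)]


def pearson(a, b):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def is_bright_spot(window, mask, threshold, sigma=1.0, set_value=1):
    """Evaluate I_max * A_BI * ceof_guass > T for one h x h candidate window."""
    h = len(window)
    flat = [p for row in window for p in row]
    i_max = window[h // 2][h // 2]  # assumption: intensity at the window center
    a_bi = sum(1 for row in mask for m in row if m == set_value) / (h * h)
    coef = pearson(flat, gaussian_window(h, sigma))
    return i_max * a_bi * coef > threshold


# Hypothetical 3x3 candidates: a Gaussian-shaped spot and a flat patch.
shape = gaussian_window(3, 1.0)
spot = [[100 * shape[y * 3 + x] for x in range(3)] for y in range(3)]
flat_patch = [[5.0] * 3 for _ in range(3)]
mask = [[1] * 3 for _ in range(3)]
spot_ok = is_bright_spot(spot, mask, threshold=50.0)
flat_ok = is_bright_spot(flat_patch, mask, threshold=50.0)
```

The three factors penalize different failure modes: a dim center lowers I_max, a sparse binarized window lowers A_BI, and a shape unlike a point-spread function lowers the correlation.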
Referring to FIG. 31, a single-molecule counting device 400 according to an embodiment of the present invention is used to implement all or some of the steps of the single-molecule counting method in any of the above embodiments and examples of the present invention. The single-molecule counting device 400 includes: a second input unit 402, configured to input a time series of the intensity of an image bright spot; a second conversion unit 404, configured to form, from the time series in the second input unit 402, a line graph of time versus intensity for the image bright spot, the line graph consisting of multiple line segments; a second grid statistics unit 406, configured to divide the line graph from the second conversion unit 404 into a grid of cells arranged in an array and to count the number of times line segments and/or segment endpoints fall in each cell; a second histogram statistics unit 408, configured to group by intensity and tally the frequencies of the counts from the second grid statistics unit 406 to obtain a histogram; a second determination unit 410, configured to find the local maxima of the histogram from the second histogram statistics unit 408 and to determine that the peak containing a local maximum that satisfies the following condition corresponds to one single molecule: the value at the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold; and a calculation unit 412, configured to calculate the number of single molecules S1. By converting the line graph of the bright-spot intensity time series into an image-processing problem to obtain a histogram, the single-molecule counting device 400 can count single molecules quickly and with relatively high accuracy.
It should be noted that the explanations of the technical features and beneficial effects of the single-molecule counting method in any of the above embodiments and examples also apply to the single-molecule counting device 400 of this embodiment and, to avoid redundancy, are not repeated in detail here.
For example, in some embodiments, referring to FIG. 32, the single-molecule counting device 400 further includes a second filtering unit 414, connected to the second grid statistics unit 406 and configured to filter the line graph from the second conversion unit 404 before the line graph is divided into a grid.
In some embodiments, referring to FIG. 33, the single-molecule counting device 400 further includes: a second simplification unit 416, configured to perform line erosion on the gridded line graph according to the count corresponding to each grid cell, so as to convert the gridded line graph into a simplified image; and a second labeling unit 418, configured to perform run-length encoding on the simplified image to label connected regions. The second determination unit 410 calculates the area of each connected region and determines that a connected region satisfying the following condition corresponds to one single molecule: the area of the connected region is greater than a third set threshold. The calculation unit 412 calculates the number of single molecules S2 and takes the smaller of S1 and S2 as the final number of single molecules.
Referring to FIG. 34, a single-molecule counting device 600 according to an embodiment of the present invention includes: a third input unit 602, configured to input a time series of the intensity of an image bright spot; a third conversion unit 604, configured to form, from the time series in the third input unit 602, a line graph of time versus intensity for the image bright spot, the line graph consisting of multiple line segments; a third grid statistics unit 606, configured to divide the line graph from the third conversion unit 604 into a grid of cells arranged in an array and to count the number of times line segments and/or segment endpoints fall in each cell; a third histogram statistics unit 608, configured to group by intensity and tally the frequencies of the counts from the third grid statistics unit 606 to obtain a histogram; and a third determination unit 610, configured to find the local maxima of the histogram from the third histogram statistics unit 608 and to increment the single-molecule count by 1 when the following condition is satisfied: the value at the local maximum is greater than a first set threshold and the width of the peak containing the local maximum is greater than a second set threshold. By converting the line graph of the bright-spot intensity time series into an image-processing problem to obtain a histogram, the single-molecule counting device 600 can count single molecules quickly and with relatively high accuracy.
It should be noted that the explanations of the technical features and beneficial effects of the single-molecule counting method in any of the above embodiments and examples also apply to the single-molecule counting device 600 of this embodiment and, to avoid redundancy, are not repeated in detail here.
For example, in some embodiments, referring to FIG. 35, the single-molecule counting device 600 further includes a third filtering unit 612, connected to the third grid statistics unit 606 and configured to filter the line graph from the third conversion unit 604 before the line graph is divided into a grid.
In some embodiments, referring to FIG. 36, the single-molecule counting device 600 further includes: a third simplification unit 614, configured to perform line erosion on the gridded line graph according to the count corresponding to each grid cell, so as to convert the gridded line graph into a simplified image; and a third labeling unit 616, configured to perform run-length encoding on the simplified image to label connected regions. The third determination unit 610 calculates the area of each connected region and increments the single-molecule count by 1 when the following condition is satisfied: the area of the connected region is greater than a third set threshold. The third determination unit 610 is further configured to take the smaller of the histogram-based single-molecule count and the run-length-encoding-based single-molecule count as the final number of single molecules.
Referring to FIG. 37, a single-molecule processing system 300 according to an embodiment of the present invention includes: a data input device 302, configured to input data; a data output device 304, configured to output data; a storage device 306, configured to store data, the data including a computer-executable program; and a processor 308, configured to execute the computer-executable program, where executing the computer-executable program includes carrying out the method of any of the above embodiments.
A computer-readable storage medium according to an embodiment of the present invention is used to store a program for execution by a computer, where executing the program includes carrying out the method of any of the above embodiments. The computer-readable storage medium may include read-only memory, random-access memory, a magnetic disk, an optical disc, or the like.
In the description of this specification, reference to the terms "one embodiment", "some embodiments", "illustrative embodiment", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of the present invention. In this specification, such expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing module, may each exist as a separate physical unit, or two or more units may be integrated into one module. The integrated module may be implemented in hardware or as a software functional module.
Although embodiments of the present invention have been shown and described above, it should be understood that these embodiments are exemplary and are not to be construed as limiting the present invention. Those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present invention.