WO2023123291A1 - Time sequence signal identification method and apparatus, and computer readable storage medium - Google Patents

Time sequence signal identification method and apparatus, and computer readable storage medium

Info

Publication number
WO2023123291A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
time series
identified
dimensional image
series signal
Prior art date
Application number
PCT/CN2021/143406
Other languages
French (fr)
Chinese (zh)
Inventor
颜旭
黎宇翔
章文蔚
徐讯
曾涛
Original Assignee
深圳华大生命科学研究院
Priority date
Filing date
Publication date
Application filed by 深圳华大生命科学研究院
Priority to PCT/CN2021/143406
Publication of WO2023123291A1

Definitions

  • the present application relates to the field of sequencing, in particular, to a time series signal identification method, device, computer-readable storage medium, processor and system.
  • time-series signals, that is, sequence signals (such as nanopore electrical signals)
  • existing solutions generally follow traditional time-series data analysis: they use statistical analysis methods, manually extracted sequence features and the like to perform similarity calculation and threshold filtering, and thereby detect the target signal in the time-series signal.
  • the existing technology has great limitations: it can only identify target signals whose features are obvious and easy to recognize; its robustness is poor; and a separate design is needed for each type of target signal, which is inefficient.
  • the main purpose of this application is to provide a time-series signal identification method, device, computer-readable storage medium, processor and system, to solve the problem that time-series signal identification methods in the prior art can only identify target signals with obvious characteristics.
  • a time series signal identification method, including: acquiring the time series signal to be identified; converting the time series signal to be identified into a two-dimensional image; and determining a recognition result according to the two-dimensional image, the recognition result including at least one of the following: whether the time-series signal to be recognized includes a target signal, the type of the target signal, and the position of the target signal in the time-series signal to be recognized.
  • determining the recognition result according to the two-dimensional image includes: constructing an artificial intelligence model, the artificial intelligence model being obtained through training using multiple sets of training data, each set of training data including historical two-dimensional images corresponding to historical time series signals and historical recognition results corresponding to the historical two-dimensional images; and inputting the two-dimensional image into the artificial intelligence model for calculation to obtain the recognition result.
  • the artificial intelligence model includes a DBL module and/or a residual module
  • the DBL module includes a convolution layer, a batch normalization layer, and an activation layer
  • the residual module includes the DBL module.
  • determining the position of the target signal in the time series signal to be identified includes: determining the position of a sub-image corresponding to the target signal in the two-dimensional image; and determining, according to the position of the sub-image in the two-dimensional image, the position of the target signal in the time-series signal to be identified.
  • determining the position of the target signal in the time series signal to be identified includes: acquiring the width of the two-dimensional image; obtaining the pixel coordinates of the sub-image corresponding to the target signal in the two-dimensional image; obtaining the total length of the time series signal; and determining the position of the target signal in the time series signal to be identified according to the width of the two-dimensional image, the pixel coordinates and the total length of the time series signal.
  • before converting the time series signal to be identified into a two-dimensional image, the method further includes: performing filtering processing on the time series signal to be identified; and scaling the time axis of the time series signal to be identified.
  • the time series signal is a sequencing time series.
  • a time-series signal identification device, including: an acquisition unit, configured to acquire a time-series signal to be identified; a conversion unit, configured to convert the time-series signal to be identified into a two-dimensional image; and a first determining unit, configured to determine a recognition result based on the two-dimensional image, the recognition result including at least one of the following: whether the time series signal to be recognized includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
  • a computer-readable storage medium includes a stored program, wherein when the program runs, the device where the computer-readable storage medium is located is controlled to execute any one of the methods described above.
  • a system including a single-channel nanopore sequencing device, one or more processors, memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing any one of the methods described above.
  • Fig. 1 shows a flow chart of a time series signal identification method according to an embodiment of the present application
  • FIG. 2 shows a schematic diagram of a time series signal according to an embodiment of the present application
  • FIG. 3 shows a preprocessed time series signal according to an embodiment of the present application
  • Fig. 4 shows a schematic diagram of a two-dimensional image according to an embodiment of the present application
  • FIG. 5 shows a schematic diagram of a target signal according to an embodiment of the present application
  • Fig. 6 shows the target detection framework YOLOv3 according to an embodiment of the present application
  • FIG. 7 shows a schematic diagram of a DBL module according to an embodiment of the present application.
  • FIG. 8 shows a schematic diagram of a residual module according to an embodiment of the present application.
  • Fig. 9 shows a schematic diagram of a time-series signal identification device according to an embodiment of the present application.
  • the identification method of time series signals in the prior art can only identify target signals with obvious characteristics.
  • the embodiments of the present application provide a time series signal identification method, device, computer-readable storage medium, processor and system.
  • a time series signal identification method is provided.
  • Fig. 1 is a flowchart of a time series signal identification method according to an embodiment of the present application. As shown in Figure 1, the method includes the following steps:
  • Step S101: acquire the time series signal to be identified;
  • Step S102: convert the time series signal to be identified into a two-dimensional image;
  • Step S103: determine the recognition result according to the two-dimensional image, the recognition result including at least one of the following: whether the time-series signal to be recognized includes the target signal, the type of the target signal, and the position of the target signal in the time-series signal to be recognized.
  • the above-mentioned time-series signal may be a time-series electrical signal of nanopore sequencing.
  • the nanopore sequencer sequentially generates a corresponding time-series electrical signal.
  • the above-mentioned nucleic acid sequence may include one or more nucleic acid subsequences, each of which may include one or more nucleotides, each nucleotide including a nitrogenous base. Understandably, the position of the target signal in the time-series signal to be identified is the position, within the nucleic acid sequence corresponding to the time-series electrical signal, of the nucleic acid subsequence corresponding to the target signal.
  • the above-mentioned time-series signal is one-dimensional time-series data
  • the above-mentioned target signal is one-dimensional time-series data
  • the time series signal to be identified is shown in FIG. 2; it includes 5 repeated target signal segments. It should be noted that the time series signal in FIG. 2 can be colored.
  • in the above solution, the time series signal to be identified is acquired and converted into a two-dimensional image, and the two-dimensional image is then used to identify whether the time series signal includes the target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, and identification is no longer limited to target signals with obvious, easily recognized features. Compared with manual identification, the identification efficiency is higher.
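  • The conversion of a one-dimensional signal into a two-dimensional image can be sketched as follows. This is only an illustrative rasterization, not the method disclosed in the application: the function name `signal_to_image`, the default 400x100 size (borrowed from the example size used elsewhere in this description), the min/max column rendering and the single-channel binary output are all assumptions. The application itself mentions three-channel image data as model input.

```python
from typing import List, Sequence


def signal_to_image(signal: Sequence[float], width: int = 400,
                    height: int = 100) -> List[List[int]]:
    """Rasterize a 1-D signal into a height x width binary image.

    Hypothetical helper: each pixel column covers a contiguous run of
    samples and is filled between the run's minimum and maximum value,
    similar to how a line plot would look. Row 0 is the top of the image.
    """
    if not signal:
        raise ValueError("signal must not be empty")
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    n = len(signal)
    image = [[0] * width for _ in range(height)]
    for col in range(width):
        # Samples that fall into this pixel column (always at least one).
        start = col * n // width
        end = max(start + 1, (col + 1) * n // width)
        chunk = signal[start:end]
        top = round((hi - max(chunk)) / span * (height - 1))
        bottom = round((hi - min(chunk)) / span * (height - 1))
        for row in range(top, bottom + 1):
            image[row][col] = 1
    return image
```

Because every column maps to a fixed, contiguous range of samples, a horizontal pixel coordinate found in the image can later be mapped back linearly to a sample index.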
  • determining the recognition result based on the two-dimensional image includes: constructing an artificial intelligence model, the artificial intelligence model being obtained through training using multiple sets of training data, each set of training data including historical two-dimensional images corresponding to historical time series signals and historical recognition results corresponding to the historical two-dimensional images; and inputting the two-dimensional image into the artificial intelligence model for calculation to obtain the recognition result. That is, by building an artificial intelligence model, the recognition result can be determined more accurately from the two-dimensional image: whether the time-series signal to be identified includes the target signal, the type of the target signal, and the position of the target signal in the time-series signal to be identified.
  • the above-mentioned artificial intelligence model includes a DBL module and/or a residual module
  • the above-mentioned DBL module includes a convolution layer, a batch normalization layer and an activation layer
  • the above-mentioned residual module includes the above-mentioned DBL module. More specifically, the residual module is obtained by passing the input through two DBL modules in series and adding the original input to the result.
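  • A minimal pure-Python sketch of the DBL and residual structure described above. It is a toy, single-channel illustration under several assumptions: the function names are hypothetical, the normalization is computed over one feature map (what batch normalization reduces to for a batch of one and a single channel, without learned scale and shift), and the leaky slope of 0.1 is an assumed value. A real model would use a deep learning framework with multi-channel tensors and trained weights.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_same(x: Matrix, kernel: Matrix) -> Matrix:
    """Single-channel 2-D convolution with zero padding (output size = input size)."""
    h, w = len(x), len(x[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    r, c = i + di - ph, j + dj - pw
                    if 0 <= r < h and 0 <= c < w:
                        acc += x[r][c] * kernel[di][dj]
            out[i][j] = acc
    return out


def normalize(x: Matrix, eps: float = 1e-5) -> Matrix:
    """Zero-mean, unit-variance normalization over the whole feature map."""
    values = [v for row in x for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = (var + eps) ** -0.5
    return [[(v - mean) * scale for v in row] for row in x]


def leaky_relu(x: Matrix, slope: float = 0.1) -> Matrix:
    """Leaky ReLU: pass positives through, scale negatives by `slope`."""
    return [[v if v > 0 else slope * v for v in row] for row in x]


def dbl(x: Matrix, kernel: Matrix) -> Matrix:
    """DBL block: convolution, then normalization, then Leaky ReLU."""
    return leaky_relu(normalize(conv2d_same(x, kernel)))


def residual(x: Matrix, k1: Matrix, k2: Matrix) -> Matrix:
    """Residual block: two DBL blocks in series, plus the original input."""
    y = dbl(dbl(x, k1), k2)
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]
```

The skip connection means that when the two DBL blocks contribute nothing (for example with all-zero kernels), the block simply returns its input, which is the property that makes deep stacks of such blocks easier to train.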
  • determining the position of the target signal in the time series signal to be identified includes: determining the position of the sub-image corresponding to the target signal in the two-dimensional image; and determining, according to the position of the sub-image in the two-dimensional image, the position of the target signal in the time-series signal to be identified. That is, the position of the target signal in the time series signal to be identified can be determined from the position of the corresponding sub-image in the two-dimensional image, and the position of the base corresponding to the target signal in the original sequencing sequence can then be determined.
  • determining the position of the target signal in the time series signal to be identified includes: acquiring the width of the two-dimensional image; obtaining the pixel coordinates of the sub-image corresponding to the target signal in the two-dimensional image; obtaining the total length of the time-series signal; and determining the position of the target signal in the time series signal to be identified according to the width of the two-dimensional image, the pixel coordinates and the total length of the time-series signal.
  • the position of the target signal in the two-dimensional image can be identified according to the two-dimensional image, and then the position of the target signal in the time-series signal can be determined according to the correspondence between the two-dimensional image and the time-series signal to be identified.
  • the size of the two-dimensional image is 400x100, that is, the width of the two-dimensional image is 400.
  • the abscissa of the sub-image corresponding to the target signal is 150 pixels.
  • the length of the time series signal before conversion into an image is 10000, so the position of the target signal in the time series signal is 10000 × 150 / 400 = 3750.
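  • The worked example above (image width 400, abscissa 150, signal length 10000) amounts to a linear mapping from pixel column to sample index. A small sketch, assuming the time axis was scaled uniformly when the image was produced; the function names are hypothetical:

```python
from typing import Tuple


def pixel_to_sample(pixel_x: float, image_width: int, signal_length: int) -> int:
    """Map a horizontal pixel coordinate back to a sample index in the signal."""
    if image_width <= 0:
        raise ValueError("image_width must be positive")
    if not 0 <= pixel_x <= image_width:
        raise ValueError("pixel_x must lie within the image")
    return int(signal_length * pixel_x / image_width)


def box_to_sample_range(x_min: float, x_max: float, image_width: int,
                        signal_length: int) -> Tuple[int, int]:
    """Map the left and right edges of a detected box to a sample range."""
    return (pixel_to_sample(x_min, image_width, signal_length),
            pixel_to_sample(x_max, image_width, signal_length))


# The example from the description: 10000 * 150 / 400
position = pixel_to_sample(150, 400, 10000)  # 3750
```

Mapping both edges of a detected box gives the start and end of the target signal rather than a single point, which is what is needed to cut the segment out of the original signal.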
  • the above-mentioned artificial intelligence model is a deep learning model
  • constructing the above-mentioned artificial intelligence model includes: obtaining relevant parameters for model training, the relevant parameters including the optimizer, the learning rate and the number of training iterations; and, with these parameters as the training settings, training the artificial intelligence model using multiple sets of the training data. That is, by setting relevant parameters such as the optimizer, learning rate and number of training iterations, the accuracy and generalization of the trained artificial intelligence model are improved.
  • the deep learning network is mainly divided into three modules: a preprocessing layer, a feature mapping and fusion layer, and a prediction output layer, as shown in Table 1.
  • in the preprocessing layer, three-channel image data is used as the input data of the model. The data is first sent to a 3*3 convolutional layer, then passes through multiple serially connected convolution and bottleneck-layer module units, is fed into a pooling layer, and is finally output to the feature mapping and fusion layer.
  • each bottleneck-layer module unit is formed by splicing three convolutional layers and N residual network units, and each convolutional layer is followed by data normalization and a Leaky ReLU activation function.
  • in the feature mapping and fusion layer, the feature data of three different scales output by the preprocessing layer are spliced with one another after a series of pooling, convolution and upsampling operations; after further convolution, feature maps of three different scales are output and fed to the prediction output layer.
  • the convolution and data splicing in the feature mapping and fusion layer serve two purposes: they let the model capture finer features when training on targets of different sizes, which ensures the classification and prediction performance for different targets, and they preserve the spatial information of the features, which helps locate the target accurately.
  • the three features output by the feature mapping and fusion layer are each processed by convolution, data normalization, an activation function and a further convolution, and the results serve as output feature vectors for classification prediction and coordinate point calculation.
  • three types of losses are used as loss functions, covering whether a target is present, the classification of the target, and the coordinate points of the target. The losses for target presence and target classification are calculated with cross-entropy loss; the loss for the target coordinates is calculated with GIoU, which measures the distance between the predicted box and the ground-truth box.
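  • The two loss ingredients named above can be sketched with their standard definitions. The box format `(x1, y1, x2, y2)`, the function names and the use of `1 - GIoU` as the loss value are assumptions for illustration; the application does not state how the three losses are weighted or combined.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), x1 < x2 and y1 < y2


def giou(a: Box, b: Box) -> float:
    """Generalized IoU of two axis-aligned boxes, in the range (-1, 1]."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    # Smallest axis-aligned box that encloses both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (enclose - union) / enclose


def giou_loss(pred: Box, truth: Box) -> float:
    """Coordinate loss: 0 for a perfect match, approaching 2 for distant boxes."""
    return 1.0 - giou(pred, truth)


def binary_cross_entropy(p: float, y: int, eps: float = 1e-12) -> float:
    """Cross-entropy for a single probability p and a 0/1 label y."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
```

Unlike plain IoU, GIoU still gives a useful signal when the predicted and ground-truth boxes do not overlap at all, because the enclosing-box term grows as the boxes move apart.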
  • the above-mentioned historical recognition results are obtained by annotating the historical two-dimensional images with an image annotation tool, and building the artificial intelligence model includes: dividing multiple groups of the historical two-dimensional images and the corresponding historical recognition results into a training set and a test set; training the artificial intelligence model using the training set; and testing the artificial intelligence model using the test set. That is, a rich training set is used to train the model, and the test set is then used to test the accuracy of the resulting artificial intelligence model. If the accuracy does not meet the requirements, the training set is adjusted and training is repeated until the accuracy of the trained artificial intelligence model is high.
  • an image annotation tool is used to annotate the above-mentioned historical two-dimensional images
  • the target sequence signal can be selected with a rectangle
  • a file format that the model can read, such as XML
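  • As an illustration of storing a rectangular annotation in XML, the sketch below writes one or more labelled boxes for an image. The application only says "xml"; the element names used here follow a Pascal VOC-like layout, which is an assumption, as are the function name, the file name and the label in the usage example.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

LabeledBox = Tuple[str, Tuple[int, int, int, int]]  # (label, (xmin, ymin, xmax, ymax))


def annotation_to_xml(filename: str, width: int, height: int,
                      boxes: Iterable[LabeledBox]) -> str:
    """Serialize rectangular annotations of one image as an XML string."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for label, coords in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), coords):
            ET.SubElement(bndbox, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical usage: one target segment boxed in a 400x100 image.
xml_text = annotation_to_xml("signal_0001.png", 400, 100,
                             [("target", (150, 10, 200, 90))])
```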
  • dividing multiple groups of the historical two-dimensional images and the corresponding historical recognition results into a training set and a test set includes: determining the division ratio; and, based on the division ratio, dividing the groups of historical two-dimensional images and corresponding historical recognition results into the training set and the test set. For example, 60% of the data is taken as the training set and 40% as the test set. Of course, in practical applications, those skilled in the art can select an appropriate division ratio according to actual needs.
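  • A short sketch of the ratio-based division, using the 60/40 example above as the default. Shuffling before the split and the fixed random seed are assumptions added for reproducibility; the function name is hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], train_ratio: float = 0.6,
                  seed: int = 0) -> Tuple[List[T], List[T]]:
    """Shuffle (image, annotation) pairs and split them into train and test sets."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local RNG, global state untouched
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Each element of `samples` would pair one historical two-dimensional image with its annotation, so that an image and its recognition result always land in the same set.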
  • before converting the time-series signal to be identified into a two-dimensional image, the method further includes: performing filtering processing on the time-series signal to be identified; and scaling the time axis of the time-series signal to be identified. That is, so that the artificial intelligence model can make an accurate determination, the time series signal to be recognized is preprocessed first; the preprocessed sequencing sequence is shown in Figure 3. It is then converted into a two-dimensional image, as shown in Figure 4, and the recognition results are shown in Figure 5.
  • the filtering process includes smoothing or denoising the time series signal.
  • scaling the time axis of the time series signal to be identified makes the target signal in the two-dimensional image easier to identify; specifically, the easier the target is to identify by the naked eye, the better.
  • a combination of a downsampling algorithm and a filtering algorithm may also be used to perform smoothing and denoising processing on the above-mentioned time series signal to be identified.
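  • A minimal sketch of combining a filtering step with a downsampling step. The application does not name specific algorithms; the centered moving average, the block-mean downsampling, the default window and factor, and the function names are all stand-ins chosen for illustration.

```python
from typing import List, Sequence


def moving_average(signal: Sequence[float], window: int = 5) -> List[float]:
    """Smooth with a centered moving average; the window shrinks at the edges."""
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    n = len(signal)
    out = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def downsample_mean(signal: Sequence[float], factor: int) -> List[float]:
    """Shrink the time axis by averaging consecutive blocks of `factor` samples."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    return [sum(signal[i:i + factor]) / len(signal[i:i + factor])
            for i in range(0, len(signal), factor)]


def preprocess(signal: Sequence[float], window: int = 5,
               factor: int = 4) -> List[float]:
    """Denoise first, then compress the time axis."""
    return downsample_mean(moving_average(signal, window), factor)
```

Smoothing before downsampling reduces the chance that a noise spike dominates a block; the downsampling factor plays the role of the time-axis scaling and would be tuned so that target segments remain visually distinct in the image.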
  • the time series signal is a sequencing time series.
  • the sequencing time series are electrical signal time series and optical signal time series.
  • the embodiment of the present application also provides a time-series signal identification device. It should be noted that the time-series signal identification device in the embodiment of the present application can be used to implement the time-series signal identification method provided in the embodiment of the present application. The time series signal identification device provided by the embodiment of the present application is introduced below.
  • Fig. 9 is a schematic diagram of a time-series signal identification device according to an embodiment of the present application. As shown in Figure 9, the device includes:
  • An acquisition unit 10 configured to acquire a time series signal to be identified
  • a conversion unit 20 configured to convert the above-mentioned time series signal to be identified into a two-dimensional image
  • the first determination unit 30 is configured to determine a recognition result based on the two-dimensional image, the recognition result including at least one of the following: whether the time-series signal to be recognized includes a target signal, the type of the target signal, and the position of the target signal in the time-series signal to be recognized.
  • the above-mentioned time-series signal may be a time-series electrical signal of nanopore sequencing.
  • the nanopore sequencer sequentially generates a corresponding time-series electrical signal.
  • the above-mentioned nucleic acid sequence may include one or more nucleic acid subsequences, each of which may include one or more nucleotides, each nucleotide including a nitrogenous base. Understandably, the position of the target signal in the time-series signal to be identified is the position, within the nucleic acid sequence corresponding to the time-series electrical signal, of the nucleic acid subsequence corresponding to the target signal.
  • the above-mentioned time-series signal is one-dimensional time-series data
  • the above-mentioned target signal is one-dimensional time-series data
  • in the above device, the acquisition unit acquires the time series signal to be identified, the conversion unit converts it into a two-dimensional image, and the first determination unit identifies, according to the two-dimensional image, whether the time series signal includes the target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, and identification is no longer limited to target signals with obvious, easily recognized features. Compared with manual identification, the identification efficiency is higher.
  • the first determination unit includes a construction module and a calculation module. The construction module is used to build an artificial intelligence model, which is obtained through training using multiple sets of training data; each set of training data includes historical two-dimensional images corresponding to historical time series signals and historical recognition results corresponding to the historical two-dimensional images. The calculation module is used to input the two-dimensional image into the artificial intelligence model for calculation to obtain the recognition result. That is, by building an artificial intelligence model, the recognition result can be determined more accurately from the two-dimensional image: whether the time-series signal to be identified includes the target signal, the type of the target signal, and the position of the target signal in the time-series signal to be identified.
  • the device further includes a second determination unit, which is used to determine the position of the target signal in the time series signal to be identified and includes a first determination module and a second determination module. The first determination module is used to determine the position of the sub-image corresponding to the target signal in the two-dimensional image; the second determination module is used to determine, according to the position of the sub-image in the two-dimensional image, the position of the target signal in the time series signal to be identified. That is, the position of the target signal in the time series signal to be identified can be determined from the position of the corresponding sub-image in the two-dimensional image, and the position of the base corresponding to the target signal in the original sequencing sequence can then be determined.
  • the second determination module includes a second acquisition submodule, a third acquisition submodule, a fourth acquisition submodule and a second determination submodule. The second acquisition submodule is used to acquire the width of the two-dimensional image; the third acquisition submodule is used to obtain the pixel coordinates of the sub-image corresponding to the target signal in the two-dimensional image; the fourth acquisition submodule is used to obtain the total length of the time series signal; and the second determination submodule is used to determine the position of the target signal in the time series signal to be identified according to the width of the two-dimensional image, the pixel coordinates and the total length of the time series signal.
  • the position of the target signal in the two-dimensional image can be identified according to the two-dimensional image, and then the position of the target signal in the time-series signal can be determined according to the correspondence between the two-dimensional image and the time-series signal to be identified.
  • the size of the two-dimensional image is 400x100, that is, the width of the two-dimensional image is 400.
  • the abscissa of the sub-image corresponding to the target signal is 150 pixels.
  • the length of the time series signal before conversion into an image is 10000, so the position of the target signal in the time series signal is 10000 × 150 / 400 = 3750.
  • the above-mentioned artificial intelligence model is a deep learning model
  • the construction module includes a first acquisition submodule and a first training submodule. The first acquisition submodule is used to obtain relevant parameters for model training, the relevant parameters including an optimizer, a learning rate and the number of training iterations; the first training submodule is used to train the artificial intelligence model with multiple sets of the training data, using these parameters as the training settings. That is, by setting relevant parameters such as the optimizer, learning rate and number of training iterations, the accuracy and generalization of the trained artificial intelligence model are improved.
  • the above-mentioned historical recognition results are obtained by annotating the historical two-dimensional images with an image annotation tool. A division submodule is used to divide multiple groups of the historical two-dimensional images and the corresponding historical recognition results into a training set and a test set; the second training submodule is used to train the artificial intelligence model using the training set; and the test submodule is used to test the artificial intelligence model using the test set.
  • the division submodule includes a first determination submodule and a processing submodule. The first determination submodule is used to determine the division ratio; the processing submodule is used to divide, based on the division ratio, the groups of historical two-dimensional images and corresponding historical recognition results into the training set and the test set. For example, 60% of the data is taken as the training set and 40% as the test set. Of course, in practical applications, those skilled in the art can select an appropriate division ratio according to actual needs.
  • the device further includes a filtering unit and a scaling unit. The filtering unit is used to filter the time-series signal to be recognized before it is converted into a two-dimensional image; the scaling unit is used to scale the time axis of the time series signal to be identified. That is, so that the artificial intelligence model can make an accurate determination, the time series signal to be recognized is preprocessed first; the preprocessed sequencing sequence is shown in Figure 3. It is then converted into a two-dimensional image, as shown in Figure 4, and the recognition results are shown in Figure 5.
  • the filtering process includes smoothing or denoising the time series signal.
  • scaling the time axis of the time series signal to be identified makes the target signal in the two-dimensional image easier to identify; specifically, the easier the target is to identify by the naked eye, the better.
  • the time-series signal recognition device includes a processor and a memory. The acquisition unit, conversion unit and first determination unit are all stored in the memory as program units, and the processor executes these program units stored in the memory to realize the corresponding functions.
  • the processor includes a kernel, and the kernel fetches corresponding program units from the memory.
  • One or more kernels can be set, and accurate identification of time series signals can be achieved by adjusting kernel parameters.
  • The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
  • An embodiment of the present invention provides a computer-readable storage medium. The computer-readable storage medium includes a stored program, wherein when the program runs, the device where the computer-readable storage medium is located is controlled to execute the above time series signal identification method.
  • An embodiment of the present invention provides a processor, the processor is used to run a program, wherein the time series signal identification method is executed when the program is running.
  • An embodiment of the present invention provides a system, including a single-channel nanopore sequencing device, one or more processors, memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing any one of the above methods.
  • An embodiment of the present invention provides a device.
  • The device includes a processor, a memory, and a program stored in the memory and executable on the processor.
  • When the processor executes the program, at least the following steps are implemented:
  • Step S101: acquire the time series signal to be identified;
  • Step S102: convert the time series signal to be identified into a two-dimensional image;
  • Step S103: determine a recognition result according to the two-dimensional image, the recognition result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
  • The devices described herein may be servers, PCs, PADs, mobile phones, etc.
  • The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with at least the following method steps:
  • Step S101: acquire the time series signal to be identified;
  • Step S102: convert the time series signal to be identified into a two-dimensional image;
  • Step S103: determine a recognition result according to the two-dimensional image, the recognition result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
  • The embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing apparatus to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising an instruction device, and the instruction device realizes the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • A computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
  • Memory may include non-permanent storage in computer readable media, in the form of random access memory (RAM) and/or nonvolatile memory such as read only memory (ROM) or flash RAM.
  • Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology.
  • Information may be computer readable instructions, data structures, modules of a program, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • Computer-readable media exclude transitory computer-readable media, such as modulated data signals and carrier waves.
  • This embodiment relates to a specific time series signal recognition system, including a hardware environment and a software environment, as shown in Table 2.
  • The system also includes a single-channel nanopore sequencing device and a PC. The nanopore sequencing device collects data from the target library, and the collected signal data and other information are saved in the h5 file format and written to the hard disk of the PC in real time.
  • This embodiment uses the target detection framework yolov3, whose network architecture is shown in Figure 6. The DBL module is composed of a convolutional layer (conv), a batch normalization layer (BN) and an activation function (Leaky ReLU), as shown in Figure 7. The residual module is obtained by adding the input to the output of two DBL modules, as shown in Figure 8.
  • A 416*416*3 image is input. Preliminary feature extraction is completed by the DBL module and 3 residual modules, further feature learning is then performed at multiple scales, and finally the coordinate information of the possible prediction frames is output (that is, the coordinate information of the targets to be detected in the picture).
  • The three outputs (y1/y2/y3) follow FPN (feature pyramid networks): multiple scales are used to detect targets of different sizes, and the finer the grid, the finer the objects that can be detected. Finally, a threshold is set and the prediction frames are filtered by their probability values, leaving the most likely coordinate positions.
  • YOLOv3 can be implemented in PyTorch or TensorFlow, and open-source code can be used as a reference.
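The threshold filtering of prediction frames mentioned above can be sketched as follows. This is a minimal illustration, not the filing's implementation: the `Prediction` record, its field names and the default threshold of 0.5 are assumptions made for the example. Practical YOLOv3 implementations usually also apply non-maximum suppression, which is omitted here.

```python
from typing import List, NamedTuple


class Prediction(NamedTuple):
    """One predicted frame; the field names are illustrative assumptions."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    probability: float  # confidence score in [0, 1]
    label: str          # predicted target-signal type


def filter_predictions(preds: List[Prediction], threshold: float = 0.5) -> List[Prediction]:
    """Keep frames whose probability reaches the threshold, most confident first."""
    kept = [p for p in preds if p.probability >= threshold]
    return sorted(kept, key=lambda p: p.probability, reverse=True)


if __name__ == "__main__":
    candidates = [
        Prediction(10, 20, 60, 80, 0.92, "target_a"),
        Prediction(12, 22, 58, 79, 0.31, "target_a"),
        Prediction(200, 15, 260, 85, 0.74, "target_b"),
    ]
    for p in filter_predictions(candidates, threshold=0.5):
        print(p.label, p.probability)
```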
  • Data labeling and preprocessing: before model training, the point signal data of the analyzed samples must be converted into an image format, labeled manually, and passed to the deep learning model for training and testing. First, the signal data is rendered as pictures of the same size and saved. Given the size of the signal data, the data may be downsampled and filtered during picture generation so that its overall shape is preserved. Then a tool such as roLabelImg (an open source image labeling tool) is used to label the type and relative coordinates of the classification signals contained in each picture, and all result files are saved.
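The filtering, downsampling and picture generation described in this step can be sketched as follows, using only the Python standard library. The moving-average filter, the bin-averaging downsampler, the window of 5 and the single-channel binary raster are all assumptions chosen for brevity; the filing does not specify the filter or the rendering method, and a real pipeline would more likely plot the trace and save a three-channel picture.

```python
from typing import List


def moving_average(signal: List[float], window: int) -> List[float]:
    """Smooth the signal with a simple trailing moving-average filter."""
    out = []
    running = 0.0
    for i, value in enumerate(signal):
        running += value
        if i >= window:
            running -= signal[i - window]  # drop the sample that left the window
        out.append(running / min(i + 1, window))
    return out


def downsample(signal: List[float], width: int) -> List[float]:
    """Scale the time axis to `width` points by averaging equal-sized bins."""
    n = len(signal)
    bins = []
    for col in range(width):
        start = col * n // width
        stop = max((col + 1) * n // width, start + 1)  # every bin holds >= 1 sample
        chunk = signal[start:stop]
        bins.append(sum(chunk) / len(chunk))
    return bins


def signal_to_image(signal: List[float], width: int = 400, height: int = 100) -> List[List[int]]:
    """Rasterize a 1-D signal into a height x width binary image (1 = trace pixel)."""
    if not signal:
        raise ValueError("signal must not be empty")
    columns = downsample(moving_average(signal, window=5), width)
    lo, hi = min(columns), max(columns)
    span = (hi - lo) or 1.0  # avoid division by zero for a flat signal
    image = [[0] * width for _ in range(height)]
    for x, value in enumerate(columns):
        # Row 0 is the top of the picture, so larger values map to smaller row indices.
        y = int(round((hi - value) / span * (height - 1)))
        image[y][x] = 1
    return image
```

Because every picture is produced with the same `width` and `height`, signals of different lengths end up as same-sized images, which is what the labeling and training steps expect.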
  • Model training and testing: after data labeling is completed, the data images used for training and the corresponding labeling results are divided into a training set and a test set.
  • When producing the training set, random sampling with a user-chosen ratio can be used to create the data set for deep learning training. The parameters of model training, such as the optimizer, learning rate and number of training iterations, are then set according to the application scenario, and the model is trained and tested. If the test results are not satisfactory, the amount of data can be increased or the training parameters modified, and this process is repeated until the training results meet the required metrics.
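The random split into training and test sets can be sketched as follows. The function name `split_dataset`, the default ratio of 0.8 and the fixed seed are assumptions for illustration; the filing leaves the ratio to the user.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(samples: Sequence[T], train_ratio: float = 0.8, seed: int = 0) -> Tuple[List[T], List[T]]:
    """Randomly split labeled samples into (training set, test set)."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Each sample would normally pair an image file with its label file.
    pairs = [(f"signal_{i}.png", f"signal_{i}.xml") for i in range(10)]
    train, test = split_dataset(pairs, train_ratio=0.8)
    print(len(train), len(test))
```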
  • A two-dimensional image converted from the sequencing electrical signal of the present invention is shown in FIG. 4. During sequencing, custom special base sequence fragments are mixed with other fragments, and the special base fragments produce special electrical signals, as shown in Figure 4. Manually screening the special target signals out of all the sequencing electrical signals may take a lot of time and effort, so this model can be used to screen out one or more target signals that need to be analyzed from a large amount of chaotic signal data, as shown in Figure 5.
  • The model's detection output includes the classification information and relative coordinates of the target signal, which also helps in further analysis of the signal.
  • The time-series signal identification method of the present application acquires the time-series signal to be identified, converts it into a two-dimensional image, and finally identifies, according to the two-dimensional image, whether the time-series signal includes the target signal, the type of the target signal, and the position of the target signal in the time-series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, not only those whose features are obvious and easy to recognize, and identification is more efficient than manual identification.
  • The acquisition unit acquires the time-series signal to be identified, the conversion unit converts the time-series signal to be identified into a two-dimensional image, and the first determination unit identifies, according to the two-dimensional image, whether the time-series signal includes the target signal, the type of the target signal, and the position of the target signal in the time-series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, not only those whose features are obvious and easy to recognize, and identification is more efficient than manual identification.

Landscapes

  • Inspection Of Paper Currency And Valuable Securities (AREA)

Abstract

The present application provides a time sequence signal identification method and apparatus, and a computer readable storage medium. The method comprises: obtaining a time sequence signal to be identified; converting said time sequence signal into a two-dimensional image; and determining an identification result according to the two-dimensional image, the identification result comprising at least one of the following: whether said time sequence signal comprises a target signal, the type of the target signal, and the position of the target signal in said time sequence signal. Because image identification technology is adopted, different types of target signals can be identified, not only target signals whose features are obvious and easy to identify, and the identification efficiency is higher than that of manual identification.

Description

Time series signal identification method, device and computer-readable storage medium
Technical field
The present application relates to the field of sequencing, and in particular to a time series signal identification method, device, computer-readable storage medium, processor and system.
Background
Existing solutions for identifying time series signals, that is, sequential signals (such as nanopore electrical signals), generally follow traditional time-series data analysis: similarity calculation and threshold filtering based on statistical analysis, manually extracted sequence features and the like are used to detect the target signal in the time series signal.
The existing technology is severely limited. It can only identify target signals whose features are obvious and easy to recognize, its robustness is poor, and a separate design is needed for each type of target signal, which is inefficient.
Summary of the invention
The main purpose of the present application is to provide a time series signal identification method, device, computer-readable storage medium, processor and system, so as to solve the problem that time series signal identification methods in the prior art can only identify target signals with obvious features.
To achieve the above purpose, according to one aspect of the present application, a time series signal identification method is provided, including: acquiring a time series signal to be identified; converting the time series signal to be identified into a two-dimensional image; and determining a recognition result according to the two-dimensional image, the recognition result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
Optionally, determining the recognition result according to the two-dimensional image includes: constructing an artificial intelligence model, the artificial intelligence model being obtained by training with multiple sets of training data, each set of which includes a historical two-dimensional image corresponding to a historical time series signal and a historical recognition result corresponding to the historical two-dimensional image; and inputting the two-dimensional image into the artificial intelligence model for calculation to obtain the recognition result.
Optionally, the artificial intelligence model includes a DBL module and/or a residual module, the DBL module includes a convolution layer, a batch normalization layer and an activation layer, and the residual module includes the DBL module.
Optionally, determining the position of the target signal in the time series signal to be identified includes: determining the position, in the two-dimensional image, of the sub-image corresponding to the target signal; and determining the position of the target signal in the time series signal to be identified according to the position of that sub-image in the two-dimensional image.
Optionally, determining the position of the target signal in the time series signal to be identified according to the position of the sub-image corresponding to the target signal in the two-dimensional image includes: acquiring the width of the two-dimensional image; acquiring the pixel coordinates, in the two-dimensional image, of the sub-image corresponding to the target signal; acquiring the total length of the time series signal; and determining the position of the target signal in the time series signal to be identified according to the width of the two-dimensional image, the pixel coordinates and the total length of the time series signal.
Optionally, before the time series signal to be identified is converted into a two-dimensional image, the method further includes: filtering the time series signal to be identified; and scaling the time axis of the time series signal to be identified.
Optionally, the time series signal is a sequencing time series.
According to one aspect of the present application, a time series signal identification device is provided, including: an acquisition unit, configured to acquire a time series signal to be identified; a conversion unit, configured to convert the time series signal to be identified into a two-dimensional image; and a first determination unit, configured to determine a recognition result according to the two-dimensional image, the recognition result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
According to one aspect of the present application, a computer-readable storage medium is provided. The computer-readable storage medium includes a stored program, wherein, when the program runs, the device in which the computer-readable storage medium is located is controlled to execute any one of the described methods.
According to one aspect of the present application, a system is provided, including a single-channel nanopore sequencing device, one or more processors, a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for executing any one of the described methods.
With the technical solution of the present application, the time series signal to be identified is acquired and converted into a two-dimensional image, and the two-dimensional image is then used to identify whether the time series signal includes the target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, not only those whose features are obvious and easy to recognize, and identification is more efficient than manual identification.
Description of drawings
The accompanying drawings, which form a part of the present application, are provided for further understanding of the present application. The schematic embodiments of the present application and their descriptions are used to explain the present application and do not unduly limit it. In the drawings:
Fig. 1 shows a flowchart of a time series signal identification method according to an embodiment of the present application;
Fig. 2 shows a schematic diagram of a time series signal according to an embodiment of the present application;
Fig. 3 shows a preprocessed time series signal according to an embodiment of the present application;
Fig. 4 shows a schematic diagram of a two-dimensional image according to an embodiment of the present application;
Fig. 5 shows a schematic diagram of a target signal according to an embodiment of the present application;
Fig. 6 shows the target detection framework yolov3 according to an embodiment of the present application;
Fig. 7 shows a schematic diagram of a DBL module according to an embodiment of the present application;
Fig. 8 shows a schematic diagram of a residual module according to an embodiment of the present application;
Fig. 9 shows a schematic diagram of a time series signal identification device according to an embodiment of the present application.
Detailed description
It should be noted that, where no conflict arises, the embodiments in the present application and the features in the embodiments can be combined with each other. The present application is described in detail below with reference to the accompanying drawings and in combination with the embodiments.
To enable those skilled in the art to better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by persons of ordinary skill in the art on the basis of the embodiments in the present application without creative effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first", "second" and the like in the description and claims of the present application and in the above drawings are used to distinguish similar objects, and not necessarily to describe a specific sequence or order. It should be understood that data so used are interchangeable under appropriate circumstances for the embodiments of the present application described herein. Furthermore, the terms "comprising" and "having", and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
It should be understood that when an element (such as a layer, film, region or substrate) is described as being "on" another element, it can be directly on the other element, or intervening elements may be present. Likewise, in the description and claims, when an element is described as being "connected" to another element, it may be "directly connected" to the other element or "connected" to the other element through a third element.
As described in the background section, time series signal identification methods in the prior art can only identify target signals with obvious features. To solve this problem, the embodiments of the present application provide a time series signal identification method, device, computer-readable storage medium, processor and system.
According to an embodiment of the present application, a time series signal identification method is provided.
Fig. 1 is a flowchart of a time series signal identification method according to an embodiment of the present application. As shown in Fig. 1, the method includes the following steps:
Step S101: acquire the time series signal to be identified;
Step S102: convert the time series signal to be identified into a two-dimensional image;
Step S103: determine a recognition result according to the two-dimensional image, the recognition result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
Specifically, the time series signal may be a time series electrical signal from nanopore sequencing. When a nucleic acid sequence passes through a nanopore in a nanopore sequencer, the sequencer generates the corresponding time series electrical signal in order. The nucleic acid sequence may include one or more nucleic acid subsequences, each nucleic acid subsequence may include one or more nucleotides, and each nucleotide includes a nitrogenous base. It follows that the position of the target signal in the time series signal to be identified is the position of the nucleic acid subsequence corresponding to the target signal within the nucleic acid sequence corresponding to the time series electrical signal.
Optionally, the time series signal is one-dimensional time series data, and the target signal is one-dimensional time series data.
Specifically, the time series signal to be identified is shown in Fig. 2 and contains 5 repeated target signal segments. It should be noted that the time series signal in Fig. 2 can be colored.
Specifically, after the time series signal to be identified is acquired, it is saved to a file on the hard disk.
In the above solution, the time series signal to be identified is acquired and converted into a two-dimensional image, and the two-dimensional image is then used to identify whether the time series signal includes the target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, not only those whose features are obvious and easy to recognize, and identification is more efficient than manual identification.
It should be noted that the steps shown in the flowcharts of the accompanying drawings may be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in a different order.
In an embodiment of the present application, determining the recognition result according to the two-dimensional image includes: constructing an artificial intelligence model, the artificial intelligence model being obtained by training with multiple sets of training data, each set of which includes a historical two-dimensional image corresponding to a historical time series signal and a historical recognition result corresponding to the historical two-dimensional image; and inputting the two-dimensional image into the artificial intelligence model for calculation to obtain the recognition result. In other words, constructing an artificial intelligence model allows the recognition result to be determined more accurately from the two-dimensional image, that is, whether the time series signal to be identified includes the target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
In a specific embodiment, the artificial intelligence model includes a DBL module and/or a residual module, the DBL module includes a convolution layer, a batch normalization layer and an activation layer, and the residual module includes the DBL module. More specifically, the residual module is obtained by passing the input through two DBL modules and adding the result to the input.
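The structure of the DBL module and the residual module can be illustrated with a toy sketch. This is a deliberately simplified, dependency-free version and not the network of the filing: it works on a batch of single-channel 1-D signals instead of multi-channel 2-D feature maps, the kernels are fixed rather than learned, batch normalization uses batch statistics with no learned scale or shift, and the Leaky ReLU slope of 0.1 is an assumed value.

```python
import math
from typing import List

Batch = List[List[float]]  # batch of single-channel 1-D signals (toy stand-in for feature maps)


def conv1d_same(batch: Batch, kernel: List[float]) -> Batch:
    """Slide an odd-length kernel over each signal with zero padding (length preserved)."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = len(kernel) // 2
    out = []
    for x in batch:
        padded = [0.0] * pad + x + [0.0] * pad
        out.append([sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(x))])
    return out


def batch_norm(batch: Batch, eps: float = 1e-5) -> Batch:
    """Normalize with the mean and variance taken over the whole batch (one channel)."""
    values = [v for x in batch for v in x]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in x] for x in batch]


def leaky_relu(batch: Batch, slope: float = 0.1) -> Batch:
    """Pass positive values through and scale negative values by `slope`."""
    return [[v if v > 0 else slope * v for v in x] for x in batch]


def dbl(batch: Batch, kernel: List[float]) -> Batch:
    """DBL module: convolution, then batch normalization, then Leaky ReLU."""
    return leaky_relu(batch_norm(conv1d_same(batch, kernel)))


def residual(batch: Batch, kernel_a: List[float], kernel_b: List[float]) -> Batch:
    """Residual module: two DBL modules in series, with the input added to their output."""
    y = dbl(dbl(batch, kernel_a), kernel_b)
    return [[a + b for a, b in zip(x_row, y_row)] for x_row, y_row in zip(batch, y)]
```

The element-wise addition in `residual` requires the two DBL modules to preserve the input shape, which is why the convolution here pads its input.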
In an embodiment of the present application, determining the position of the target signal in the time series signal to be identified includes: determining the position, in the two-dimensional image, of the sub-image corresponding to the target signal; and determining the position of the target signal in the time series signal to be identified according to the position of that sub-image in the two-dimensional image. That is, the position of the target signal in the time series signal to be identified can be determined from the position of its sub-image in the two-dimensional image, and the position of the bases corresponding to the target signal in the original sequencing sequence can then be determined.
In an embodiment of the present application, determining the position of the target signal in the time series signal to be identified according to the position of the sub-image corresponding to the target signal in the two-dimensional image includes: acquiring the width of the two-dimensional image; acquiring the pixel coordinates, in the two-dimensional image, of the sub-image corresponding to the target signal; acquiring the total length of the time series signal; and determining the position of the target signal in the time series signal to be identified according to the width of the two-dimensional image, the pixel coordinates and the total length of the time series signal.
Specifically, the position of the target signal in the two-dimensional image can be identified from the two-dimensional image, and the position of the target signal in the time series signal is then determined from the correspondence between the two-dimensional image and the time series signal to be identified. For example, suppose the two-dimensional image is 400x100, that is, its width is 400, target detection gives an abscissa of 150 pixels for the sub-image corresponding to the target signal, and the time series signal had a length of 10000 before being converted into a picture. The position of the target signal in the time series signal is then 10000*150/400 = 3750.
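The proportional mapping in this example can be written as a small helper. The function name is hypothetical, and the sketch assumes that the full image width spans the full signal with no margins, which the example above implies but the filing does not state in general.

```python
def image_x_to_signal_index(x_pixel: float, image_width: int, signal_length: int) -> int:
    """Map a horizontal pixel coordinate back to an index in the original signal."""
    if image_width <= 0:
        raise ValueError("image_width must be positive")
    if not 0 <= x_pixel <= image_width:
        raise ValueError("x_pixel must lie within the image")
    # position = total signal length * pixel abscissa / image width
    return int(signal_length * x_pixel / image_width)


if __name__ == "__main__":
    # Values from the example above: 400-pixel-wide image, abscissa 150, signal length 10000.
    print(image_x_to_signal_index(150, 400, 10000))  # prints 3750
```

Applying the same function to the left and right edges of a detected frame would give the start and end of the target signal in the original series.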
To determine the artificial intelligence model accurately, in an embodiment of the present application the artificial intelligence model is a deep learning model, and constructing the artificial intelligence model includes: acquiring parameters for model training, the parameters including an optimizer, a learning rate, and a number of training iterations; and training the artificial intelligence model on multiple sets of the training data with these parameters. That is, setting parameters such as the optimizer, the learning rate, and the number of training iterations gives the trained artificial intelligence model higher accuracy and better generalization.
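One way to keep these three training parameters together is a small validated record. The sketch below is only an illustration: the class name `TrainingConfig` and the example values are placeholders, since the application does not state concrete settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Training parameters named in the text; all values are supplied by the user."""
    optimizer: str        # name of the optimizer, e.g. "adam" (placeholder)
    learning_rate: float  # step size used by the optimizer
    iterations: int       # number of training iterations

    def __post_init__(self):
        # Reject settings that cannot describe a valid training run.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if self.iterations < 1:
            raise ValueError("iterations must be at least 1")


# Placeholder values for illustration only; not taken from the application.
config = TrainingConfig(optimizer="adam", learning_rate=1e-3, iterations=300)
```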
In a specific embodiment of the present application, the deep learning network is divided into three main modules: a preprocessing layer, a feature mapping and fusion layer, and a prediction output layer, as shown in Table 1.
In the preprocessing layer, three-channel image data is the input to the model. After slicing and merging, the data is fed into a 3*3 convolutional layer, passes through several convolution and bottleneck-layer module units connected in series, enters a pooling layer, and is finally output to the feature mapping and fusion layer. Each bottleneck-layer module unit is formed by concatenating three convolutional layers and N residual network units, and every convolutional layer is followed by data normalization and a Leaky ReLU activation function. Besides the output of the pooling layer, the outputs of the bottleneck-layer module units are also used, so that feature data of three different sizes in total enter the feature fusion layer for the next processing step.
In the feature mapping and fusion layer, the feature data of three different scales output by the preprocessing layer are concatenated with one another after a series of pooling, convolution, and upsampling operations; after further convolution, feature maps of three different sizes are output to the prediction output layer. Convolution and concatenation in this layer serve two purposes: they let the model capture finer features when training on targets of different sizes, which supports classification and prediction for different targets, and they preserve the spatial information carried by the features, which helps locate targets accurately.
In the output layer, the three features output by the feature mapping and fusion layer each undergo convolution, data normalization, an activation function, and a further convolution, and the resulting feature vectors are used for classification prediction and coordinate calculation. The model uses three kinds of loss as its loss function, covering whether a target is present, the class of the target, and the coordinates of the target. Target presence and target class are computed with a cross-entropy loss; the coordinate loss uses GIoU to measure the distance between the predicted box and the ground-truth box.
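The GIoU-based coordinate loss mentioned above can be sketched as follows. This is a minimal illustration assuming axis-aligned boxes given as `(x_min, y_min, x_max, y_max)` and the common formulation loss = 1 - GIoU; the application does not spell out the exact form it uses, and the function names are hypothetical.

```python
def giou(box_a, box_b):
    """Generalized IoU of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero area when the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest box enclosing both inputs.
    enclose = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if enclose <= 0:
        return iou
    # Penalize the part of the enclosing box not covered by the union.
    return iou - (enclose - union) / enclose


def giou_loss(predicted_box, true_box):
    """Distance-style loss: 0 for a perfect match, up to 2 for distant boxes."""
    return 1.0 - giou(predicted_box, true_box)
```

Unlike plain IoU, the enclosing-box term still gives a useful signal when the predicted and ground-truth boxes do not overlap at all.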
Table 1. Deep learning network model
Figure PCTCN2021143406-appb-000001
Specifically, a deep learning model based on a convolutional neural network is built with PyTorch or TensorFlow, trained on the generated training set, and the resulting model file is saved.
In yet another embodiment of the present application, the historical identification results are obtained by annotating the historical two-dimensional images with an image annotation tool, and constructing the artificial intelligence model includes: dividing the multiple sets of historical two-dimensional images and their corresponding historical identification results into a training set and a test set; training the artificial intelligence model on the training set; and testing the artificial intelligence model on the test set. That is, the model is trained on a rich training set and its accuracy is then measured on the test set; if the accuracy does not meet the requirement, the training set is adjusted and training is repeated until the trained model is sufficiently accurate.
Specifically, the historical two-dimensional images are annotated with an image annotation tool: the target sequence signal is selected with a rectangular box, and a file in a format the model can read, such as XML, is then generated, giving the training data required by the model.
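As an illustration of what such an XML annotation might contain, the sketch below writes one rectangular box per target with Python's standard `xml.etree.ElementTree` module. The element names are an assumption modeled loosely on the Pascal VOC layout; the application only says the format is XML, and the file name and label used here are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_annotation(filename, width, height, boxes):
    """Return an XML string describing labelled boxes in one image.

    boxes is a list of (label, (x_min, y_min, x_max, y_max)) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    for label, (x_min, y_min, x_max, y_max) in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in (("xmin", x_min), ("ymin", y_min),
                           ("xmax", x_max), ("ymax", y_max)):
            ET.SubElement(bndbox, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical image name and label, for illustration only.
annotation_xml = build_annotation(
    "read_0001.png", 400, 100, [("target", (150, 10, 200, 90))]
)
```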
In yet another embodiment of the present application, dividing the multiple sets of historical two-dimensional images and their corresponding historical identification results into a training set and a test set includes: determining a division ratio; and dividing the multiple sets of historical two-dimensional images and their corresponding historical identification results into the training set and the test set based on the division ratio. For example, 60% of the data is used as the training set and 40% as the test set. In practice, those skilled in the art can select a suitable division ratio according to actual needs.
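A ratio-based split such as the 60/40 example can be sketched as below. Shuffling before the split and the fixed seed are assumptions added for reproducibility; the application does not say how samples are assigned to each set, and `split_dataset` is a hypothetical name.

```python
import random


def split_dataset(samples, train_ratio=0.6, seed=0):
    """Shuffle samples and split them into (training set, test set)."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


# Each sample would be an (image, annotation) pair; integers stand in here.
train_set, test_set = split_dataset(range(10), train_ratio=0.6)
```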
In yet another embodiment of the present application, before the time series signal to be identified is converted into a two-dimensional image, the method further includes: filtering the time series signal to be identified; and scaling the time axis of the time series signal to be identified. That is, to determine the artificial intelligence model accurately, the time series signal to be identified is preprocessed first. The sequencing sequence after preprocessing is shown in Figure 3; it is then converted into a two-dimensional image, shown in Figure 4, and the identification result is shown in Figure 5. Specifically, the filtering includes smoothing or denoising the time series signal. Scaling the time axis makes the target signal in the two-dimensional image easier to identify; specifically, the easier the scaled signal is to recognize with the naked eye, the better.
Specifically, a downsampling algorithm may also be combined with a filtering algorithm to smooth and denoise the time series signal to be identified.
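The combination of filtering and downsampling can be illustrated with two simple operations. The application does not name specific algorithms, so the moving-average filter and block-mean downsampling below are assumptions chosen for clarity, and the function names are hypothetical.

```python
def moving_average(signal, window):
    """Smooth a signal with a sliding mean over fully covered windows."""
    if window < 1 or window > len(signal):
        raise ValueError("window must be between 1 and len(signal)")
    running = sum(signal[:window])
    smoothed = [running / window]
    for i in range(window, len(signal)):
        # Slide the window by one sample: add the new value, drop the oldest.
        running += signal[i] - signal[i - window]
        smoothed.append(running / window)
    return smoothed


def downsample(signal, factor):
    """Shrink the time axis by averaging consecutive blocks of samples."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    blocks = (signal[i:i + factor] for i in range(0, len(signal), factor))
    return [sum(block) / len(block) for block in blocks]


smoothed = moving_average([1, 2, 3, 4, 5], window=3)
reduced = downsample([1, 3, 5, 7, 9], factor=2)
```

The downsampling factor also acts as a time-axis scaling factor, so it has to be taken into account when a detected position is mapped back to the raw signal.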
In a specific embodiment of the present application, the time series signal is a sequencing time series. Sequencing time series include electrical-signal time series and optical-signal time series.
An embodiment of the present application further provides a time series signal identification apparatus. It should be noted that this apparatus can be used to execute the time series signal identification method provided in the embodiments of the present application. The apparatus is described below.
Figure 9 is a schematic diagram of a time series signal identification apparatus according to an embodiment of the present application. As shown in Figure 9, the apparatus includes:
an acquisition unit 10, configured to acquire a time series signal to be identified;
a conversion unit 20, configured to convert the time series signal to be identified into a two-dimensional image;
a first determination unit 30, configured to determine an identification result according to the two-dimensional image, the identification result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
Specifically, the time series signal may be a time-series electrical signal from nanopore sequencing. As a nucleic acid sequence passes through a nanopore in a nanopore sequencer, the sequencer successively generates the corresponding time-series electrical signal. The nucleic acid sequence may include one or more nucleic acid subsequences, each of which may include one or more nucleotides, and each nucleotide includes a nitrogenous base. Understandably, the position of the target signal in the time series signal to be identified is the position of the nucleic acid subsequence corresponding to the target signal within the nucleic acid sequence corresponding to the time-series electrical signal.
Optionally, the time series signal is one-dimensional time series data, and the target signal is one-dimensional time series data.
In the above solution, the acquisition unit acquires the time series signal to be identified, the conversion unit converts it into a two-dimensional image, and the first determination unit identifies from the two-dimensional image whether the time series signal includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified. Because image recognition is used, different types of target signals can be identified, not only target signals with obvious, easily recognized features, and identification is more efficient than manual identification.
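The conversion step performed by the conversion unit can be pictured as rasterizing the one-dimensional signal onto a pixel grid. The sketch below is one possible rendering under stated assumptions: a linear time axis, amplitude scaled to the image height, and one lit pixel per sample. The application does not specify how the image is drawn, and `signal_to_image` is a hypothetical name.

```python
def signal_to_image(signal, width, height):
    """Rasterize a 1D signal into a height x width grid of 0/1 pixels.

    Column index follows time; row 0 is the top of the image, so larger
    amplitudes appear higher up.
    """
    if not signal or width < 1 or height < 1:
        raise ValueError("signal must be non-empty and image size positive")
    low, high = min(signal), max(signal)
    span = (high - low) or 1.0  # avoid division by zero for a flat signal
    image = [[0] * width for _ in range(height)]
    n = len(signal)
    for i, value in enumerate(signal):
        col = min(width - 1, i * width // n)
        level = int(round((value - low) / span * (height - 1)))
        image[(height - 1) - level][col] = 1
    return image


image = signal_to_image([0, 1, 2, 3], width=4, height=4)
```

Because columns are assigned linearly in time, a detected column can be mapped back to a sample index by the same proportion in reverse.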
In an embodiment of the present application, the first determination unit includes a construction module and a calculation module. The construction module is configured to construct an artificial intelligence model, which is obtained by training on multiple sets of training data; each set of training data includes a historical two-dimensional image corresponding to a historical time series signal and the historical identification result corresponding to that historical two-dimensional image. The calculation module is configured to input the two-dimensional image into the artificial intelligence model for calculation to obtain the identification result. That is, constructing an artificial intelligence model allows the identification result to be determined more accurately from the two-dimensional image, namely whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
In an embodiment of the present application, the apparatus further includes a second determination unit, configured to determine the position of the target signal in the time series signal to be identified. The second determination unit includes a first determination module and a second determination module. The first determination module is configured to determine the position, in the two-dimensional image, of the sub-image corresponding to the target signal; the second determination module is configured to determine, according to that position, the position of the target signal in the time series signal to be identified. That is, the position of the target signal in the time series signal to be identified can be determined from the position of its corresponding sub-image in the two-dimensional image. The position of the bases corresponding to the target signal in the original sequencing sequence can then be further determined.
In an embodiment of the present application, the second determination module includes a second acquisition submodule, a third acquisition submodule, a fourth acquisition submodule, and a second determination submodule. The second acquisition submodule is configured to acquire the width of the two-dimensional image; the third acquisition submodule is configured to acquire the pixel coordinates, in the two-dimensional image, of the sub-image corresponding to the target signal; the fourth acquisition submodule is configured to acquire the total length of the time series signal; and the second determination submodule is configured to determine the position of the target signal in the time series signal to be identified according to the width of the two-dimensional image, the pixel coordinates, and the total length of the time series signal. Specifically, the position of the target signal in the two-dimensional image can be identified from the two-dimensional image, and the position of the target signal in the time series signal is then determined from the correspondence between the two-dimensional image and the time series signal to be identified. For example, suppose the two-dimensional image is 400x100, that is, its width is 400 pixels; target detection gives an abscissa of 150 pixels for the sub-image corresponding to the target signal; and the time series signal had a length of 10000 before being converted into the image. The position of the target signal in the time series signal is then 10000*150/400 = 3750.
To determine the artificial intelligence model accurately, in an embodiment of the present application the artificial intelligence model is a deep learning model, and the construction module includes a first acquisition submodule and a first training submodule. The first acquisition submodule is configured to acquire parameters for model training, the parameters including an optimizer, a learning rate, and a number of training iterations; the first training submodule is configured to train the artificial intelligence model on multiple sets of the training data with these parameters. That is, setting parameters such as the optimizer, the learning rate, and the number of training iterations gives the trained artificial intelligence model higher accuracy and better generalization.
In yet another embodiment of the present application, the historical identification results are obtained by annotating the historical two-dimensional images with an image annotation tool, and the construction module includes a division submodule, a second training submodule, and a test submodule. The division submodule is configured to divide the multiple sets of historical two-dimensional images and their corresponding historical identification results into a training set and a test set; the second training submodule is configured to train the artificial intelligence model on the training set; and the test submodule is configured to test the artificial intelligence model on the test set. That is, the model is trained on a rich training set and its accuracy is then measured on the test set; if the accuracy does not meet the requirement, the training set is adjusted and training is repeated until the trained model is sufficiently accurate.
In yet another embodiment of the present application, the division submodule includes a first determination submodule and a processing submodule. The first determination submodule is configured to determine a division ratio; the processing submodule is configured to divide the multiple sets of historical two-dimensional images and their corresponding historical identification results into the training set and the test set based on the division ratio. For example, 60% of the data is used as the training set and 40% as the test set. In practice, those skilled in the art can select a suitable division ratio according to actual needs.
In yet another embodiment of the present application, the apparatus further includes a filtering unit and a scaling unit. The filtering unit is configured to filter the time series signal to be identified before it is converted into a two-dimensional image; the scaling unit is configured to scale the time axis of the time series signal to be identified. That is, to determine the artificial intelligence model accurately, the time series signal to be identified is preprocessed first. The sequencing sequence after preprocessing is shown in Figure 3; it is then converted into a two-dimensional image, shown in Figure 4, and the identification result is shown in Figure 5. Specifically, the filtering includes smoothing or denoising the time series signal. Scaling the time axis makes the target signal in the two-dimensional image easier to identify; specifically, the easier the scaled signal is to recognize with the naked eye, the better.
The time series signal identification apparatus includes a processor and a memory. The acquisition unit, the conversion unit, the first determination unit, and so on are stored in the memory as program units, and the processor executes these program units to implement the corresponding functions.
The processor contains a kernel, which retrieves the corresponding program unit from the memory. One or more kernels may be provided, and the time series signal is identified accurately by adjusting kernel parameters.
The memory may include non-persistent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM); the memory includes at least one memory chip.
An embodiment of the present invention provides a computer-readable storage medium that includes a stored program, wherein, when the program runs, the device on which the computer-readable storage medium resides is controlled to execute the time series signal identification method.
An embodiment of the present invention provides a processor configured to run a program, wherein the time series signal identification method is executed when the program runs.
An embodiment of the present invention provides a system including a single-channel nanopore sequencing device, one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing any one of the methods described above.
An embodiment of the present invention provides a device including a processor, a memory, and a program stored in the memory and executable on the processor. When executing the program, the processor implements at least the following steps:
Step S101: acquiring a time series signal to be identified;
Step S102: converting the time series signal to be identified into a two-dimensional image;
Step S103: determining an identification result according to the two-dimensional image, the identification result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
The device herein may be a server, a PC, a tablet (PAD), a mobile phone, or the like.
The present application also provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with at least the following method steps:
Step S101: acquiring a time series signal to be identified;
Step S102: converting the time series signal to be identified into a two-dimensional image;
Step S103: determining an identification result according to the two-dimensional image, the identification result including at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include non-persistent memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include persistent and non-persistent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
Example
This example concerns a specific time series signal identification system comprising a hardware environment and a software environment, as shown in Table 2.
Table 2. Hardware environment and software environment
Figure PCTCN2021143406-appb-000002
Because the algorithm module involves reading, writing, and storing large amounts of data as well as training and testing the model, it requires a large number of data computations and operations. Running it in a hardware and software environment with a high-performance CPU or with a GPU can significantly improve efficiency and stability.
The system further includes a single-channel nanopore sequencing device and a PC. The nanopore sequencing device collects data from the target library; the collected signal data and other information are saved in the h5 file structure and written to the PC's hard disk in real time.
This example uses the object detection framework YOLOv3; the network architecture is shown in Figure 6. The DBL module consists of a convolutional layer (conv), a batch normalization layer (BN), and an activation function (Leaky ReLU), as shown in Figure 7. The residual module passes its input through two DBL modules and adds the result to the input, as shown in Figure 8.
As shown in Figure 6, a 416*416*3 image is fed to the input. A DBL module and 3 residual modules perform preliminary feature extraction, further feature learning is then carried out at multiple scales, and the network finally outputs the coordinate information of all candidate prediction boxes directly (that is, the coordinates in the image of the targets to be detected). The three outputs (y1/y2/y3) borrow from FPN (feature pyramid networks): multiple scales are used to detect targets of different sizes, and a finer grid can detect finer objects. Finally, a threshold is set and the prediction boxes are filtered by their probability values, leaving the most probable coordinate positions. As for building the network, YOLOv3 can be implemented with PyTorch or TensorFlow, and existing open-source code can serve as a reference.
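The multi-scale output and threshold filtering described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the strides 32, 16 and 8 are the usual YOLOv3 downsampling factors and are not stated in the text, the `Box` type, its `label` field and the 0.5 threshold are hypothetical, and the non-maximum suppression that YOLOv3 implementations normally apply is omitted.

```python
from dataclasses import dataclass

INPUT_SIZE = 416        # input image side length, taken from the text
STRIDES = (32, 16, 8)   # assumed downsampling factors for outputs y1, y2, y3


def grid_sizes(input_size=INPUT_SIZE, strides=STRIDES):
    """Return the side length of the prediction grid at each output scale."""
    return [input_size // s for s in strides]


@dataclass
class Box:
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float   # probability assigned to the box by the detector
    label: str     # hypothetical class name of the target signal


def filter_boxes(boxes, threshold=0.5):
    """Keep boxes whose score reaches the threshold, highest score first."""
    kept = [b for b in boxes if b.score >= threshold]
    return sorted(kept, key=lambda b: b.score, reverse=True)
```

With a 416-pixel input, `grid_sizes()` gives grids of 13, 26 and 52 cells per side; the 52-cell grid is the finest and is the one suited to the smallest targets.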
Data annotation and preprocessing: before model training, the point-signal data from sample analysis must be converted to image format, manually annotated, and then passed to the deep learning model for training and testing. First, the signal data are rendered as images of a uniform size and saved. Given the volume of signal data, downsampling and filtering can be applied during image generation so that the shape of the data is preserved. Then a tool such as roLabelImg (an open-source image annotation tool) is used to annotate the type and relative coordinates of the classified signals contained in each image, and all result files are saved.
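One possible way to turn a one-dimensional signal into a fixed-size image is sketched below. The text does not specify the filter, the downsampling factor, or the rendering method, so everything here is an assumption: `downsample` uses block averaging as a stand-in for "downsampling and filtering", and `signal_to_image` draws one pixel per column into a single-channel binary image rather than the three-channel picture mentioned earlier.

```python
def downsample(signal, factor):
    """Average consecutive groups of `factor` samples (a crude low-pass filter)."""
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return [
        sum(signal[i:i + factor]) / len(signal[i:i + factor])
        for i in range(0, len(signal), factor)
    ]


def signal_to_image(signal, width=416, height=416):
    """Rasterize a 1-D signal into a height x width binary image (list of rows)."""
    if not signal:
        raise ValueError("signal is empty")
    lo, hi = min(signal), max(signal)
    span = (hi - lo) or 1.0   # avoid division by zero for a flat signal
    image = [[0] * width for _ in range(height)]
    n = len(signal)
    for col in range(width):
        # Pick the sample that falls on this column of the time axis.
        idx = min(n - 1, col * n // width)
        # Row 0 is the top of the image, so larger values are drawn higher up.
        row = round((hi - signal[idx]) / span * (height - 1))
        image[row][col] = 1
    return image
```

Because every signal is stretched or compressed onto the same number of columns, all images share one size regardless of the length of the underlying read.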
Model training and testing: after annotation is complete, the images used for training and their corresponding annotation results must be divided into a training set and a test set. When building the training set, random sampling with a user-defined ratio can be used to produce the dataset for deep learning training. The training parameters, such as the optimizer, learning rate, and number of training iterations, are then set according to the application scenario, and the model is trained and tested. If the test results are unsatisfactory, training can be adjusted by, for example, enlarging the dataset or modifying the training parameters; this process is repeated until the training results meet the required metrics.
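The random split with a user-defined ratio can be illustrated with a short sketch. The function name, the default ratio of 0.8, and the fixed seed are assumptions made for the example; the text only says that the ratio is set by the user.

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle (image, annotation) pairs and split them into train and test lists."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be between 0 and 1")
    shuffled = list(samples)                 # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)    # seeded so the split is repeatable
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

Keeping each image paired with its annotation file before shuffling ensures that the two never end up in different subsets.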
Use of the model: once trained, the model can be deployed and used in different scenarios. Figure 4 shows a two-dimensional image obtained by converting a sequencing electrical signal according to the present invention. During sequencing, some custom special base sequence fragments are mixed with other fragments, and the special fragments produce electrical signals with a distinctive shape, as shown in Figure 4. Manually screening the special target signals out of all the sequencing electrical signals could take a great deal of time and effort, so the model can be used to pick out one or more kinds of target signal to be analyzed from a large volume of disordered signal data, as shown in Figure 5. In addition, the detection output includes the classification of each target signal and its relative coordinates, which substantially assists further analysis of the signal.
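The relative coordinates reported by the model can be mapped back to positions in the original signal using the image width, the pixel coordinates of the detected sub-image, and the total signal length. The sketch below assumes a simple linear relationship between image columns and sample indices; the application does not give the exact formula, and the function name and arguments are hypothetical.

```python
def pixel_to_signal_range(x_min, x_max, image_width, signal_length):
    """Map a box's horizontal pixel extent to a [start, end) sample range."""
    if image_width <= 0 or signal_length <= 0:
        raise ValueError("image_width and signal_length must be positive")
    if not 0 <= x_min <= x_max <= image_width:
        raise ValueError("box must lie inside the image")
    # Each pixel column covers signal_length / image_width samples.
    start = int(x_min * signal_length / image_width)
    end = int(x_max * signal_length / image_width)
    return start, min(end, signal_length)
```

For example, a box spanning columns 104 to 208 of a 416-pixel-wide image of a 40000-sample signal covers the second quarter of the signal.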
From the above description, it can be seen that the above embodiments of the present application achieve the following technical effects:
1) In the time series signal identification method of the present application, the time series signal to be identified is acquired and converted into a two-dimensional image, and the two-dimensional image is then used to identify whether the time series signal includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, identification is no longer limited to target signals with obvious, easily recognized features, and identification is more efficient than manual identification.
2) In the time series signal identification device of the present application, the acquisition unit acquires the time series signal to be identified, the conversion unit converts the time series signal to be identified into a two-dimensional image, and the first determination unit uses the two-dimensional image to identify whether the time series signal includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified. Because image recognition technology is used, different types of target signals can be identified, identification is no longer limited to target signals with obvious, easily recognized features, and identification is more efficient than manual identification.
The above are merely preferred embodiments of the present application and are not intended to limit it. For those skilled in the art, the present application may have various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

Claims (10)

  1. A method for identifying a time series signal, characterized by comprising:
    acquiring a time series signal to be identified;
    converting the time series signal to be identified into a two-dimensional image;
    determining an identification result according to the two-dimensional image, the identification result comprising at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
  2. The method according to claim 1, characterized in that determining the identification result according to the two-dimensional image comprises:
    constructing an artificial intelligence model, the artificial intelligence model being obtained by training with multiple sets of training data, each set of training data comprising: a historical two-dimensional image corresponding to a historical time series signal, and a historical identification result corresponding to the historical two-dimensional image;
    inputting the two-dimensional image into the artificial intelligence model for computation to obtain the identification result.
  3. The method according to claim 2, characterized in that the artificial intelligence model comprises a DBL module and/or a residual module, the DBL module comprising a convolutional layer, a batch normalization layer, and an activation layer, and the residual module comprising the DBL module.
  4. The method according to claim 1, characterized in that determining the position of the target signal in the time series signal to be identified comprises:
    determining the position, in the two-dimensional image, of the sub-image corresponding to the target signal;
    determining the position of the target signal in the time series signal to be identified according to the position, in the two-dimensional image, of the sub-image corresponding to the target signal.
  5. The method according to claim 4, characterized in that determining the position of the target signal in the time series signal to be identified according to the position, in the two-dimensional image, of the sub-image corresponding to the target signal comprises:
    acquiring the width of the two-dimensional image;
    acquiring the pixel coordinates, in the two-dimensional image, of the sub-image corresponding to the target signal;
    acquiring the total length of the time series signal;
    determining the position of the target signal in the time series signal to be identified according to the width of the two-dimensional image, the pixel coordinates, and the total length of the time series signal.
  6. The method according to any one of claims 1 to 5, characterized in that, before converting the time series signal to be identified into a two-dimensional image, the method further comprises:
    filtering the time series signal to be identified;
    scaling the time axis of the time series signal to be identified.
  7. The method according to any one of claims 1 to 5, characterized in that the time series signal is a sequencing time series.
  8. A device for identifying a time series signal, characterized by comprising:
    an acquisition unit, configured to acquire a time series signal to be identified;
    a conversion unit, configured to convert the time series signal to be identified into a two-dimensional image;
    a first determination unit, configured to determine an identification result according to the two-dimensional image, the identification result comprising at least one of the following: whether the time series signal to be identified includes a target signal, the type of the target signal, and the position of the target signal in the time series signal to be identified.
  9. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a stored program, wherein, when the program runs, the device in which the computer-readable storage medium is located is controlled to perform the method according to any one of claims 1 to 7.
  10. A system, characterized by comprising a single-channel nanopore sequencing device, one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing the method according to any one of claims 1 to 7.
PCT/CN2021/143406 2021-12-30 2021-12-30 Time sequence signal identification method and apparatus, and computer readable storage medium WO2023123291A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/143406 WO2023123291A1 (en) 2021-12-30 2021-12-30 Time sequence signal identification method and apparatus, and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/143406 WO2023123291A1 (en) 2021-12-30 2021-12-30 Time sequence signal identification method and apparatus, and computer readable storage medium

Publications (1)

Publication Number Publication Date
WO2023123291A1 true WO2023123291A1 (en) 2023-07-06

Family

ID=86997101

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/143406 WO2023123291A1 (en) 2021-12-30 2021-12-30 Time sequence signal identification method and apparatus, and computer readable storage medium

Country Status (1)

Country Link
WO (1) WO2023123291A1 (en)


