CN116230094A - DNA encoding method, system, equipment and storage medium based on waveform characteristic storage


Publication number
CN116230094A
Authority
CN
China
Prior art keywords
information
waveform
signals
storage
image information
Legal status
Pending
Application number
CN202310129992.7A
Other languages
Chinese (zh)
Inventor
戴俊彪
强薇
黄小罗
Current Assignee
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3059 Digital compression and data reduction techniques where the original information is represented by a subset or similar information, e.g. lossy compression
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Computational Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Pure & Applied Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Biotechnology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Algebra (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to a DNA encoding method and system based on waveform feature storage. The method comprises the following steps: converting image information and audio information into waveform signals; segmenting the waveform signals converted from the audio information; performing a Fourier transform on the segmented audio waveform signals and on the waveform signals converted from the image information to obtain frequency-domain information; and converting the obtained frequency-domain information into a final base sequence for DNA information storage. Because the encoding does not depend on a silicon-based system, the invention can fundamentally and effectively address the shortage of silicon-based storage resources.

Description

DNA encoding method, system, equipment and storage medium based on waveform characteristic storage
Technical Field
The invention relates to the technical field of data storage, and in particular to a DNA encoding method, system, device and storage medium based on waveform feature storage.
Background
The development of the internet has led to explosive growth of information in human society, while the capacity of existing storage media is being exhausted rapidly.
DNA information storage is a new type of storage medium developed to resolve the conflict between this explosive growth of information and increasingly scarce silicon-based storage.
However, most DNA information storage methods developed so far build on the existing silicon-based computer storage system, so in practical applications they cannot fundamentally solve the shortage of storage resources.
That is, the prior art remains tied to the established silicon-based storage system and approaches the problem from the standpoint of the silicon-based computer, and therefore cannot fundamentally solve the shortage of storage resources.
Disclosure of Invention
In view of the foregoing, it is necessary to provide a DNA encoding method, system, device and storage medium based on waveform feature storage that starts from the most original physical form in which information is transmitted and does not use the output of a silicon-based computer as the information source, thereby actually replacing the silicon-based storage medium.
The invention provides a DNA encoding method based on waveform feature storage, which comprises the following steps:
a. converting image information and audio information into waveform signals;
b. segmenting the waveform signals converted from the audio information;
c. performing a Fourier transform on the segmented audio waveform signals and on the waveform signals converted from the image information to obtain frequency-domain information;
d. converting the obtained frequency-domain information into a final base sequence for DNA information storage.
Specifically, step a includes:
for image information: dividing the image information to be stored into a number of equal small regions, the initial region size being chosen according to the image size; and analyzing each region to obtain a spectrum, or obtaining a waveform diagram with an instrument;
for audio information: collecting the vibration waveform with an acoustic wave collector.
Specifically, step b includes:
performing sliding-window segmentation on the acoustic wave signal, dividing the long signal into several short segments.
Specifically, step c includes:
for audio information:
performing an independent Fourier transform on the signal in each window and recording the resulting frequency-domain information in the form $[(f_1, r_1), (f_2, r_2), \ldots, (f_m, r_m)]$, where $f$ is the frequency and $r$ is the amplitude;
for image information there are two cases:
(1) The collected waveform signal is a spectrum. The m wavelengths with the highest signal intensity are selected, where m is user-defined: the larger m is, the closer the record is to the original signal and the higher the precision, but the more information must be stored and the more storage space is occupied;
the selected wavelengths are then converted in turn into frequencies using Equation 1, giving a record (f, r) for each wavelength;
several groups of wavelength information are selected and recorded in each region, and the list for the whole image consists of all the selected wavelength information;
Equation 1:
$$f = \frac{v}{\lambda} \tag{1}$$
where $f$ is the frequency, $v$ is the speed of light, and $\lambda$ is the wavelength;
(2) The collected signal is a raw waveform signal. The waveform signal is Fourier transformed in the same way as each audio window, and the result is again (f, r);
several groups of frequency information are selected and recorded in each region, and the list for the whole image signal consists of all the selected frequency information.
Specifically, step d includes:
converting the result obtained in step c into quaternary numbers, whose four digits correspond to the four DNA bases A/T/C/G, to obtain a DNA sequence that stores the information.
The invention also provides a DNA encoding system based on waveform feature storage, which comprises a conversion module, a segmentation module, a transformation module and a storage module, wherein: the conversion module is used for converting image information and audio information into waveform signals; the segmentation module is used for segmenting the waveform signals converted from the audio information; the transformation module is used for performing a Fourier transform on the segmented audio waveform signals and on the waveform signals converted from the image information to obtain frequency-domain information; and the storage module is used for converting the obtained frequency-domain information into a final base sequence for DNA information storage.
Specifically, the conversion module is configured to:
for image information: divide the image information to be stored into a number of equal small regions, the initial region size being chosen according to the image size; and analyze each region to obtain a spectrum, or obtain a waveform diagram with an instrument;
for audio information: collect the vibration waveform with an acoustic wave collector.
Specifically, the segmentation module is configured to:
perform sliding-window segmentation on the acoustic wave signal, dividing the long signal into several short segments.
Specifically, the transformation module is configured as follows:
for audio information:
an independent Fourier transform is performed on the signal in each window, and the resulting frequency-domain information is recorded in the form $[(f_1, r_1), (f_2, r_2), \ldots, (f_m, r_m)]$, where $f$ is the frequency and $r$ is the amplitude;
for image information there are two cases:
(1) The collected waveform signal is a spectrum. The m wavelengths with the highest signal intensity are selected, where m is user-defined: the larger m is, the closer the record is to the original signal and the higher the precision, but the more information must be stored and the more storage space is occupied;
the selected wavelengths are then converted in turn into frequencies using Equation 1, giving a record (f, r) for each wavelength;
several groups of wavelength information are selected and recorded in each region, and the list for the whole image consists of all the selected wavelength information;
Equation 1:
$$f = \frac{v}{\lambda} \tag{1}$$
where $f$ is the frequency, $v$ is the speed of light, and $\lambda$ is the wavelength;
(2) The collected signal is a raw waveform signal. The waveform signal is Fourier transformed in the same way as each audio window, and the result is again (f, r);
several groups of frequency information are selected and recorded in each region, and the list for the whole image signal consists of all the selected frequency information.
Specifically, the storage module is configured to:
convert the result obtained by the transformation module into quaternary numbers, whose four digits correspond to the four DNA bases A/T/C/G, to obtain a DNA sequence that stores the information.
The method starts from the most original physical form in which information is transmitted and does not use the output of a silicon-based computer as the information source, thereby truly replacing the silicon-based storage medium. Because the encoding does not depend on a silicon-based system, it can fundamentally and effectively address the shortage of silicon-based storage resources. The method can also be used for lossy compression, which improves coding density.
Drawings
FIG. 1 is a flowchart of the DNA encoding method based on waveform feature storage in an embodiment of the present application;
FIG. 2 is a schematic diagram of the processing procedure of the DNA encoding method based on waveform feature storage according to an embodiment of the present application;
FIG. 3 is a schematic diagram of image region division provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of image segmentation using a user-defined fixed-size region according to an embodiment of the present application;
FIG. 5 is a schematic diagram of the sliding window applied to an audio signal according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a spectrum output by a monochromator according to an embodiment of the present application;
FIG. 7 is a schematic diagram of the DNA encoding system based on waveform feature storage according to an embodiment of the present application;
FIG. 8 is a schematic diagram of text rendered as an image, provided in an embodiment of the present application;
FIG. 9 is a schematic diagram of part of the result of collecting, with a monochromator, the spectrum of the light reflected from each pixel and converting the wavelength into frequency according to an embodiment of the present application;
FIG. 10 is a schematic diagram of part of the result after simplification according to an embodiment of the present application;
FIG. 11 is a schematic diagram of converting the numbers into quaternary according to an embodiment of the present application;
FIG. 12 is a schematic diagram of the quaternary digit string obtained after digit padding according to an embodiment of the present application;
FIG. 13 is a schematic diagram of the DNA sequence obtained with a selected mapping according to an embodiment of the present application;
FIG. 14 is a schematic diagram of audio split into windows according to an embodiment of the present application;
FIG. 15 is a schematic diagram of the Fourier transform performed on each window according to an embodiment of the present application;
FIG. 16 is a schematic diagram of selecting the 3 strongest signals in each window according to an embodiment of the present application;
FIG. 17 is a schematic diagram of a device structure according to an embodiment of the present application;
FIG. 18 is a schematic diagram of the structure of a storage medium according to an embodiment of the present application.
Detailed Description
The present invention is described in detail below so that the above objects, features and advantages become more apparent; the description should not be construed as limiting the scope of the present invention.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise. Some specific embodiments of the present invention are described below with reference to the accompanying drawings.
Example 1
Referring to FIG. 1 and FIG. 2, a flowchart of the DNA encoding method based on waveform feature storage according to a preferred embodiment of the present invention is shown.
Step S1: converting the image information and the audio information into waveform signals. Specifically:
the information encountered in daily life falls mainly into three categories: text, images and audio. Text is essentially an image to which human society has assigned a fixed meaning, so text information can be merged into image information, and only image information and audio information need to be stored.
Image information:
everything seen in daily life comes from reflected light, which lies in the visible band of the electromagnetic spectrum, with wavelengths from 400 nm to 780 nm. The image information to be stored is therefore divided into a number of equal small regions, with the initial region size chosen according to the image size. There are two ways to do this:
(1) The image information is divided into regions of equal size, such as squares, rectangles or triangles. Taking square regions as an example, the initial side length is $l_0$, and the side length is reduced step by step, by $\Delta l$ each time, until each region reflects light of only one color, as shown in FIG. 3;
(2) A minimum region of fixed size is defined directly, similar to the definition of a pixel, as shown in FIG. 4.
Each region is then analyzed with an instrument such as a spectrum analyzer or a monochromator to obtain a spectrum, or an instrument similar to an oscilloscope is used to obtain a waveform diagram.
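The first division mode can be sketched in a few lines. This is only an illustration under stated assumptions: the patent works from physical measurements, whereas the sketch stands in a small grid of color labels for the scene, and the names `is_uniform` and `find_region_side` are hypothetical. It shrinks the side length from `l0` by `delta` until every square region holds a single color.

```python
def is_uniform(image, top, left, side):
    """Return True if the square block holds exactly one colour.

    Blocks that overhang the image edge are clipped to the image.
    """
    colours = {
        image[r][c]
        for r in range(top, min(top + side, len(image)))
        for c in range(left, min(left + side, len(image[0])))
    }
    return len(colours) == 1


def find_region_side(image, l0, delta):
    """Shrink the side length from l0 in steps of delta until all blocks are uniform.

    `image` is a list of rows of colour labels (an assumed digital stand-in
    for the physical scene); `delta` is assumed to be a positive integer.
    """
    side = l0
    while side > 1:
        all_blocks_uniform = all(
            is_uniform(image, top, left, side)
            for top in range(0, len(image), side)
            for left in range(0, len(image[0]), side)
        )
        if all_blocks_uniform:
            return side
        side = max(1, side - delta)
    return 1  # a single cell always holds one colour


grid = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
print(find_region_side(grid, l0=4, delta=1))  # 2
```

The second mode corresponds to skipping the search and fixing `side` in advance.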
Audio information:
sound waves are mechanical waves generated by the vibration of objects; an acoustic wave collector is used to collect the vibration waveform.
Step S2: segmenting the waveform signal converted from the audio information. Specifically:
this step applies to the acoustic wave signal of the audio information. Because the waveform of an acoustic wave signal varies greatly and its periodicity is hard to observe, the signal is segmented with a sliding window to reduce the difficulty of subsequent calculation and processing, dividing the long signal into several short segments. The abscissa of the acoustic wave signal is time (as shown in FIG. 5); the window length is n seconds and the step size is also n seconds, so consecutive windows do not overlap. The short signal within a window is less complex, which makes its periodicity easier to observe and also improves computational accuracy.
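As a minimal sketch of this segmentation, assuming the collected waveform has already been sampled into a list of values at a known rate (the sampling itself and the names `split_windows`, `sample_rate` and `window_seconds` are assumptions for illustration), a window length equal to the step size reduces to cutting the signal into consecutive, non-overlapping chunks:

```python
def split_windows(samples, sample_rate, window_seconds):
    """Cut a sampled signal into consecutive windows of window_seconds each.

    The step equals the window length, so windows do not overlap.
    The final window may be shorter if the signal length is not a multiple
    of the window size.
    """
    size = int(round(sample_rate * window_seconds))
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


# Ten samples at 2 samples per second, cut into 2-second windows.
windows = split_windows(list(range(10)), sample_rate=2, window_seconds=2)
print(windows)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```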
Step S3: performing a Fourier transform on the segmented audio waveform signals and on the waveform signals converted from the image information to obtain frequency-domain information. Specifically:
Audio information:
for audio information, the signal in each window is Fourier transformed independently, and the resulting frequency-domain information is recorded in the form $[(f_1, r_1), (f_2, r_2), \ldots, (f_m, r_m)]$, where $f$ is the frequency and $r$ is the amplitude. Which signal peaks are recorded in a window is chosen according to the actual situation.
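The per-window transform and peak selection might look like the following sketch. It uses a plain discrete Fourier transform written with the standard library so that it stays self-contained; the function name `dft_peaks`, the choice of keeping the `m` largest amplitudes, and the omission of the 0 Hz and Nyquist bins are assumptions for illustration, not the patent's prescribed procedure.

```python
import cmath
import math


def dft_peaks(window, sample_rate, m):
    """Return the m strongest (frequency, amplitude) pairs of one window.

    A direct O(n^2) discrete Fourier transform is used for clarity.
    The 0 Hz bin is skipped, and the Nyquist bin is omitted for simplicity.
    """
    n = len(window)
    records = []
    for k in range(1, (n + 1) // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(window)
        )
        amplitude = 2 * abs(coeff) / n      # one-sided amplitude
        frequency = k * sample_rate / n     # bin index to Hz
        records.append((frequency, amplitude))
    records.sort(key=lambda fr: fr[1], reverse=True)
    return records[:m]


# A one-second window containing a 5 Hz tone and a weaker 12 Hz tone.
rate = 64
window = [
    math.sin(2 * math.pi * 5 * t / rate) + 0.5 * math.sin(2 * math.pi * 12 * t / rate)
    for t in range(rate)
]
for f, r in dft_peaks(window, rate, m=2):
    print(round(f, 3), round(r, 3))
```

Only amplitudes are kept here, matching the (f, r) record format; phase is discarded, which is one reason the stored record is a lossy description of the window.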
Image information:
the image information falls into two cases:
(1) The collected waveform signal is a spectrum, such as the monochromator output shown in FIG. 6. The m wavelengths with the highest signal intensity are selected, where m is user-defined: the larger m is, the closer the record is to the original signal and the higher the precision, but the more information must be stored and the more storage space is occupied.
The selected wavelengths are then converted in turn into frequencies using Equation 1, giving a record (f, r) for each wavelength.
Several groups of wavelength information are selected and recorded in each region according to the actual situation, and the list for the whole image consists of all the selected wavelength information.
Equation 1:
$$f = \frac{v}{\lambda} \tag{1}$$
where $f$ is the frequency, $v$ is the speed of light, and $\lambda$ is the wavelength.
(2) The collected signal is a raw waveform signal. The waveform signal is Fourier transformed in the same way as each audio window, and the result is again (f, r); the details are not repeated here.
Several groups of frequency information are selected and recorded in each region according to the actual situation, and the list for the whole image signal consists of all the selected frequency information.
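For case (1), the selection of the m strongest wavelengths and their conversion with Equation 1 can be sketched as follows. The spectrum is assumed to arrive as a list of (wavelength in nanometres, intensity) pairs, and the name `strongest_lines` and the sample values are hypothetical.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s, the v of Equation 1 (vacuum value)


def strongest_lines(spectrum, m):
    """Pick the m most intense wavelengths and convert each to an (f, r) record.

    `spectrum` is a list of (wavelength_nm, intensity) pairs; the result is a
    list of (frequency_hz, intensity) pairs, strongest first.
    """
    top = sorted(spectrum, key=lambda wi: wi[1], reverse=True)[:m]
    return [
        (SPEED_OF_LIGHT / (wavelength_nm * 1e-9), intensity)  # f = v / lambda
        for wavelength_nm, intensity in top
    ]


# Illustrative (made-up) spectrum of one region in the visible band.
region_spectrum = [(400.0, 0.2), (550.0, 0.9), (700.0, 0.5)]
for f, r in strongest_lines(region_spectrum, m=2):
    print(f"{f:.3e} Hz, intensity {r}")
```

A larger `m` keeps more of the spectrum per region and therefore lengthens the record, which is the precision versus storage trade-off described above.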
Step S4: converting the obtained frequency-domain information into a final base sequence for DNA information storage. Specifically:
the result obtained in step S3 is converted into quaternary numbers, whose four digits correspond to the four DNA bases A/T/C/G, to obtain a DNA sequence that stores the information.
For example, assuming the list obtained in step S3 is [(13, 45), (22, 11)], conversion to quaternary gives [(31, 231), (112, 23)]. To simplify decoding from DNA back to the original information, every number with fewer digits is left-padded with zeros to the length j of the longest number in the quaternary list (here j = 3), and the numbers are concatenated into a single sequence, 031231112023. A fixed encoding table, e.g. {0:A, 1:C, 2:G, 3:T}, is then selected from Table 1, giving the DNA sequence ATCGTCCCGAGT.
Table 1 (encoding tables mapping the quaternary digits to bases; provided as an image in the original publication)
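The conversion in step S4 can be traced with a short sketch. The helper names `to_quaternary` and `encode` are hypothetical, the inputs are assumed to be non-negative integers, and the default mapping is the {0:A, 1:C, 2:G, 3:T} table used in the example above.

```python
def to_quaternary(n):
    """Return the base-4 digit string of a non-negative integer."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    digits = ""
    while True:
        n, remainder = divmod(n, 4)
        digits = str(remainder) + digits
        if n == 0:
            return digits


def encode(records, mapping=None):
    """Encode a non-empty list of (f, r) integer pairs as a base sequence.

    Returns (padded quaternary string, DNA sequence). Every number is
    left-padded with zeros to the width j of the longest quaternary number.
    """
    mapping = mapping or {"0": "A", "1": "C", "2": "G", "3": "T"}
    numbers = [to_quaternary(value) for pair in records for value in pair]
    width = max(len(q) for q in numbers)
    padded = "".join(q.zfill(width) for q in numbers)
    return padded, "".join(mapping[d] for d in padded)


print(encode([(13, 45), (22, 11)]))  # ('031231112023', 'ATCGTCCCGAGT')
```

Fixed-width fields cost a few extra bases for small numbers but let a decoder split the sequence without separators.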
Example 2
Referring to FIG. 7, a hardware architecture diagram of the DNA encoding system 10 based on waveform feature storage according to the present invention is shown. The system comprises a conversion module 101, a segmentation module 102, a transformation module 103 and a storage module 104, wherein:
The conversion module 101 is configured to convert image information and audio information into waveform signals.
Specifically:
the information encountered in daily life falls mainly into three categories: text, images and audio. Text is essentially an image to which human society has assigned a fixed meaning, so text information can be merged into image information, and only image information and audio information need to be stored.
Image information:
everything seen in daily life comes from reflected light, which lies in the visible band of the electromagnetic spectrum, with wavelengths from 400 nm to 780 nm. The conversion module 101 therefore divides the image information to be stored into a number of equal small regions, with the initial region size chosen according to the image size. There are two ways to do this:
(1) The image information is divided into regions of equal size, such as squares, rectangles or triangles. Taking square regions as an example, the initial side length is $l_0$, and the side length is reduced step by step, by $\Delta l$ each time, until each region reflects light of only one color, as shown in FIG. 3;
(2) A minimum region of fixed size is defined directly, similar to the definition of a pixel, as shown in FIG. 4.
Each region is then analyzed with an instrument such as a spectrum analyzer or a monochromator to obtain a spectrum, or an instrument similar to an oscilloscope is used to obtain a waveform diagram.
Audio information:
sound waves are mechanical waves generated by the vibration of objects; the conversion module 101 collects the vibration waveform with an acoustic wave collector.
The segmentation module 102 is configured to segment the waveform signal converted from the audio information. Specifically:
this module operates on the acoustic wave signal of the audio information. Because the waveform of an acoustic wave signal varies greatly and its periodicity is hard to observe, the signal is segmented with a sliding window to reduce the difficulty of subsequent calculation and processing, dividing the long signal into several short segments. The abscissa of the acoustic wave signal is time (as shown in FIG. 5); the window length is n seconds and the step size is also n seconds, so consecutive windows do not overlap. The short signal within a window is less complex, which makes its periodicity easier to observe and also improves computational accuracy.
The transformation module 103 is configured to perform a Fourier transform on the segmented audio waveform signals and on the waveform signals converted from the image information to obtain frequency-domain information. Specifically:
Audio information:
for audio information, the signal in each window is Fourier transformed independently, and the resulting frequency-domain information is recorded in the form $[(f_1, r_1), (f_2, r_2), \ldots, (f_m, r_m)]$, where $f$ is the frequency and $r$ is the amplitude. Which signal peaks are recorded in a window is chosen according to the actual situation.
Image information:
the image information falls into two cases:
(1) The collected waveform signal is a spectrum, such as the monochromator output shown in FIG. 6. The m wavelengths with the highest signal intensity are selected, where m is user-defined: the larger m is, the closer the record is to the original signal and the higher the precision, but the more information must be stored and the more storage space is occupied.
The selected wavelengths are then converted in turn into frequencies using Equation 1, giving a record (f, r) for each wavelength.
Several groups of wavelength information are selected and recorded in each region according to the actual situation, and the list for the whole image consists of all the selected wavelength information.
Equation 1:
$$f = \frac{v}{\lambda} \tag{1}$$
where $f$ is the frequency, $v$ is the speed of light, and $\lambda$ is the wavelength.
(2) The collected signal is a raw waveform signal. The waveform signal is Fourier transformed in the same way as each audio window, and the result is again (f, r); the details are not repeated here.
Several groups of frequency information are selected and recorded in each region according to the actual situation, and the list for the whole image signal consists of all the selected frequency information.
The storage module 104 is configured to convert the obtained frequency-domain information into a final base sequence for DNA information storage. Specifically:
the storage module 104 converts the result obtained by the transformation module 103 into quaternary numbers, whose four digits correspond to the four DNA bases A/T/C/G, to obtain a DNA sequence that stores the information.
For example, assuming the list obtained by the transformation module 103 is [(13, 45), (22, 11)], conversion to quaternary gives [(31, 231), (112, 23)]. To simplify decoding from DNA back to the original information, every number with fewer digits is left-padded with zeros to the length j of the longest number in the quaternary list (here j = 3), and the numbers are concatenated into a single sequence, 031231112023. A fixed encoding table, e.g. {0:A, 1:C, 2:G, 3:T}, is then selected from Table 1, giving the DNA sequence ATCGTCCCGAGT.
Table 1 (encoding tables mapping the quaternary digits to bases; provided as an image in the original publication)
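The contents of Table 1 are not available in this text. Assuming it lists one-to-one assignments of the quaternary digits 0 to 3 to the bases A, C, G and T (an assumption, not something the text confirms), there are 4! = 24 possible tables, and they can be enumerated as in the sketch below; `all_mappings` is a hypothetical helper name.

```python
from itertools import permutations

BASES = "ACGT"


def all_mappings():
    """Enumerate every one-to-one mapping from quaternary digits to bases."""
    return [dict(zip("0123", perm)) for perm in permutations(BASES)]


tables = all_mappings()
print(len(tables))  # 24
print(tables[0])    # {'0': 'A', '1': 'C', '2': 'G', '3': 'T'}
```

Whichever table is chosen has to be known to the decoder, since the same base sequence reads as a different digit string under a different table.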
Example 3
(1) This embodiment takes the image information of FIG. 8 as an example. Because the full results are extensive, only part of them is shown:
the region size is defined as 1 pixel, and the text in the figure is divided so that each region contains a single color.
The spectrum of the light reflected from each pixel is then collected with a monochromator, and the wavelength is converted into frequency using Equation 1. Part of the result is shown in FIG. 9; each pixel has two elements, the first value representing the amplitude and the second the frequency.
In this embodiment the values have similar orders of magnitude, so the result can be simplified: all amplitudes are divided by $5 \times 10^{-10}$ and all frequencies by $10^{14}$, giving the partial result shown in FIG. 10.
The numbers are converted to quaternary, as shown in FIG. 11.
The numbers are then padded so that they all have the same number of digits, giving a quaternary digit string, as shown in FIG. 12.
A mapping, e.g. {0:A, 1:C, 2:G, 3:T}, is selected from Table 1 to obtain the DNA sequence, as shown in FIG. 13.
(2) This embodiment takes audio information as an example:
a piece of audio is split into 3 windows; the signal in each window is shown in FIG. 14;
each window is Fourier transformed into the frequency domain, as shown in FIG. 15;
from the result, the 3 strongest signals in each window (excluding 0 Hz) are selected, giving the result shown in FIG. 16;
the amplitudes are multiplied by 100 and converted to positive integers to simplify the subsequent steps. The resulting list is:
[(120,100),(50,68),(179,4),(30,100),(180,100),(50,71),(120,200),(50,100),(164,4)]
converting into quaternary numbers to obtain the following list:
[1320,1210,302,1010,2303,10,132,1210,2310,1210,302,1013,1320,3020,302,1210,2210,10]
taking the largest number of digits among all the numbers as the width, each number is left-padded with zeros and the results are merged into one quaternary string:
132012100302101023030010013212102310121003021013132030200302121022100010
selecting a base mapping relation {0:A,1:C,2:G,3:T }, and obtaining a base sequence:
CTGACGCAATAGCACAGTATAACAACTGCGCAGTCACGCAATAGCACTCTGATAGAATAGCGCAGGCAAACA
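The quaternary conversion, zero padding, and base mapping steps above can be sketched as follows. This is a minimal sketch rather than the applicants' implementation; the function names `to_quaternary` and `encode_pairs` are hypothetical. Run on the first two pairs of the list, it gives the first 16 bases of the sequence above.

```python
from typing import Dict, Iterable, Tuple

# Base mapping chosen in the embodiment: {0:A, 1:C, 2:G, 3:T}.
BASE_MAP: Dict[str, str] = {"0": "A", "1": "C", "2": "G", "3": "T"}


def to_quaternary(n: int) -> str:
    """Return the base-4 representation of a non-negative integer."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    if n == 0:
        return "0"
    digits = []
    while n:
        n, remainder = divmod(n, 4)
        digits.append(str(remainder))
    return "".join(reversed(digits))


def encode_pairs(pairs: Iterable[Tuple[int, int]],
                 base_map: Dict[str, str] = BASE_MAP) -> str:
    """Flatten the pairs, convert to quaternary, zero-pad, and map to bases."""
    numbers = [to_quaternary(value) for pair in pairs for value in pair]
    width = max(len(q) for q in numbers)            # longest number sets the width
    padded = "".join(q.zfill(width) for q in numbers)
    return "".join(base_map[digit] for digit in padded)


sequence = encode_pairs([(120, 100), (50, 68)])
print(sequence)  # CTGACGCAATAGCACA
```

Because every number is padded to the same width, a decoder can split the sequence back into fixed-length groups without any separator.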
Please refer to fig. 17, which is a schematic diagram of a device structure according to an embodiment of the present application. The device 50 includes a processor 51 and a memory 52 coupled to the processor 51.
The memory 52 stores program instructions for implementing the DNA encoding method based on waveform characteristic storage described above.
The processor 51 is configured to execute the program instructions stored in the memory 52 to carry out the DNA encoding.
The processor 51 may also be referred to as a CPU (Central Processing Unit). The processor 51 may be an integrated circuit chip with signal processing capabilities. The processor 51 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, or discrete hardware components. A general-purpose processor may be a microprocessor or any conventional processor.
Please refer to fig. 18, which is a schematic diagram of the structure of a storage medium according to an embodiment of the present application. The storage medium stores a program file 61 capable of implementing all the methods described above. The program file 61 may be stored in the storage medium in the form of a software product and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods in the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code, or a device such as a computer, a server, a mobile phone, or a tablet.
It is noted that the embodiment of the present application converts image information into a waveform signal, where the waveform may be obtained from a spectrum, or may be obtained from an electrical signal after photoelectric conversion.
In the embodiment of the present application, the relationship between the quaternary numbers and the bases is not fixed.
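Since the digit-to-base relationship is not fixed, the encoder and decoder only need to agree on which mapping is used. The sketch below enumerates every one-to-one assignment of the digits 0 to 3 to the four bases and checks that encoding followed by decoding returns the original string. It assumes that any one-to-one assignment is acceptable (Table 1 is not reproduced here), and the names `ALL_MAPPINGS`, `encode`, and `decode` are hypothetical.

```python
from itertools import permutations
from typing import Dict, List

BASES = "ACGT"

# All 4! = 24 one-to-one assignments of quaternary digits to bases.
ALL_MAPPINGS: List[Dict[str, str]] = [
    dict(zip("0123", ordering)) for ordering in permutations(BASES)
]


def encode(quaternary: str, mapping: Dict[str, str]) -> str:
    """Replace each quaternary digit with its assigned base."""
    return "".join(mapping[digit] for digit in quaternary)


def decode(sequence: str, mapping: Dict[str, str]) -> str:
    """Invert the mapping to recover the quaternary string."""
    inverse = {base: digit for digit, base in mapping.items()}
    return "".join(inverse[base] for base in sequence)


quaternary = "13201210"
for mapping in ALL_MAPPINGS:
    # Every mapping is a bijection, so the round trip is lossless.
    assert decode(encode(quaternary, mapping), mapping) == quaternary
```

The first entry of `ALL_MAPPINGS` is the mapping {0:A,1:C,2:G,3:T} used in the examples above.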
In addition, each functional unit in each embodiment of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The embodiments described above may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When the computer program instructions are loaded and executed on a computer, the steps and methods described in the embodiments of the invention are carried out in whole or in part.
It will be appreciated that the systems, apparatus, and methods described herein may be implemented in other ways. The apparatus embodiments described above are merely illustrative: the functional units may be re-divided according to actual needs, provided the functions and steps of the present invention described above are still fulfilled. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. The units of the above devices may be combined or re-divided according to the storage and decoding methods, or additional functional units may be added according to actual needs to meet the requirements of the above steps and methods.
It should be noted that the present application introduces the Fourier transform to change the domain in which the information to be stored is represented, with the aim of representing the original information with fewer bases and thereby improving coding density. In some cases the representation after the Fourier transform may contain more data than the original information; the application can then be used as lossy compression: in the transformed domain, the components with high signal intensity are selected for encoding and those with low signal intensity are discarded. In general, most of the discarded components are noise or have little influence on the quality of the original signal, so the lossy compression does not lose the essential content of the information to be stored.
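The selection of high-intensity components can be sketched as follows. This is an illustration under stated assumptions, not the applicants' implementation: it uses a naive discrete Fourier transform from the standard library, non-overlapping windows, a synthetic test signal, and hypothetical function names. Only the idea of keeping the m strongest non-zero-frequency components per window comes from the text.

```python
import cmath
import math
from typing import List, Tuple


def split_windows(signal: List[float], window_size: int) -> List[List[float]]:
    """Cut the signal into consecutive non-overlapping windows (an assumption)."""
    last_start = len(signal) - window_size
    return [signal[i:i + window_size] for i in range(0, last_start + 1, window_size)]


def dft_magnitudes(samples: List[float]) -> List[float]:
    """Naive DFT; returns the magnitudes of bins 0 .. N//2."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(samples))
        magnitudes.append(abs(total))
    return magnitudes


def strongest_components(samples: List[float], sample_rate: float,
                         m: int = 3) -> List[Tuple[float, float]]:
    """Return the m strongest (frequency, magnitude) pairs, skipping 0 Hz."""
    magnitudes = dft_magnitudes(samples)
    n = len(samples)
    bins = sorted(range(1, len(magnitudes)),
                  key=lambda k: magnitudes[k], reverse=True)[:m]
    return [(k * sample_rate / n, magnitudes[k]) for k in bins]


# Synthetic signal: a 5 Hz tone plus a weaker 12 Hz tone, sampled at 64 Hz.
SAMPLE_RATE = 64
signal = [math.sin(2 * math.pi * 5 * t / SAMPLE_RATE)
          + 0.5 * math.sin(2 * math.pi * 12 * t / SAMPLE_RATE)
          for t in range(3 * SAMPLE_RATE)]

windows = split_windows(signal, SAMPLE_RATE)
per_window = [strongest_components(w, SAMPLE_RATE, m=2) for w in windows]
```

Everything outside the selected bins is discarded, which is where the loss in the lossy compression comes from; a larger m keeps more of the original signal at the cost of more bases.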
While the invention has been described with reference to the presently preferred embodiments, it will be understood by those skilled in the art that the foregoing is by way of illustration and not of limitation, and that any modifications, equivalents, variations and the like which fall within the spirit and scope of the principles of the invention are intended to be included within the scope of the appended claims.

Claims (12)

1. A DNA encoding method based on waveform characteristic storage, the method comprising the steps of:
a. converting the image information and the audio information into waveform signals;
b. dividing waveform signals converted from audio information;
c. performing Fourier transform on the waveform signals converted from the segmented audio information and the waveform signals converted from the image information to obtain frequency domain information;
d. the obtained frequency domain information is converted into a final base sequence for DNA information storage.
2. The DNA encoding method based on waveform characteristics storage as claimed in claim 1, wherein said step a specifically comprises:
for image information: dividing the image information to be stored into a plurality of equal small areas, and customizing the original area according to the image size; analyzing each region to obtain a spectrum, or obtaining a waveform chart by using an instrument;
for audio information: collecting vibration waveforms using an acoustic wave collector.
3. The DNA encoding method based on waveform characteristics storage as claimed in claim 2, wherein said step b specifically comprises:
carrying out sliding-window segmentation on the acoustic wave signals, dividing the long signal into a plurality of short segments.
4. The DNA encoding method based on waveform characteristics storage as claimed in claim 3, wherein said step c specifically comprises:
for audio information:
carrying out an independent Fourier transform on the information in each window, and then recording the resulting frequency domain information in the form $[(f_1, r_1), (f_2, r_2), \ldots, (f_m, r_m)]$, where $f$ is frequency and $r$ is amplitude;
there are two cases for image information:
(1) The collected waveform signals are spectra, and the first m wavelengths with the highest signal intensity are selected, where m is user-defined: the larger its value, the closer the result is to the original signal and the higher the precision, but the more information must be stored and the larger the storage space occupied;
then, the selected wavelengths are converted in turn into frequency signals using Formula 1, finally giving a record (f, r) for each wavelength;
several groups of wavelength information are selected and recorded in each region, and the list for the whole image information is finally obtained as all the selected wavelength information;
Formula 1:
$f = v/\lambda$ (1)
where f is the frequency, v is the speed of light, and λ is the wavelength;
(2) The collected signals are raw waveform signals; a Fourier transform is applied to them in the same way as for each audio window, and the result obtained is (f, r);
several groups of frequency information are selected and recorded in each area, and the list for the whole image signal is finally obtained as all the selected frequency information.
5. The method of DNA encoding based on waveform characteristics storage as claimed in claim 4, wherein said step d specifically comprises:
converting the result obtained in step c into quaternary numbers corresponding to the four bases in DNA (A/T/C/G), thereby obtaining a DNA sequence that stores the information.
6. A DNA encoding system based on waveform characteristic storage, the system comprising a conversion module, a segmentation module, a transformation module, and a storage module, wherein:
the conversion module is used for converting the image information and the audio information into waveform signals;
the segmentation module is used for segmenting the waveform signals converted by the audio information;
the transformation module is used for carrying out Fourier transformation on the waveform signals converted from the segmented audio information and the waveform signals converted from the image information to obtain frequency domain information;
the storage module is used for converting the obtained frequency domain information into a final base sequence so as to store DNA information.
7. The DNA encoding system based on waveform characteristics storage of claim 6, wherein said conversion module is specifically configured to:
for image information: dividing the image information to be stored into a plurality of equal small areas, and customizing the original area according to the image size; analyzing each region to obtain a spectrum, or obtaining a waveform chart by using an instrument;
for audio information: collecting vibration waveforms using an acoustic wave collector.
8. The DNA encoding system based on waveform characteristics storage of claim 7, wherein said segmentation module is specifically configured to:
carrying out sliding-window segmentation on the acoustic wave signals, dividing the long signal into a plurality of short segments.
9. The DNA encoding system based on waveform characteristics storage of claim 8, wherein said transformation module is specifically configured to:
for audio information:
carrying out an independent Fourier transform on the information in each window, and then recording the resulting frequency domain information in the form $[(f_1, r_1), (f_2, r_2), \ldots, (f_m, r_m)]$, where $f$ is frequency and $r$ is amplitude;
there are two cases for image information:
(1) The collected waveform signals are spectra, and the first m wavelengths with the highest signal intensity are selected, where m is user-defined: the larger its value, the closer the result is to the original signal and the higher the precision, but the more information must be stored and the larger the storage space occupied;
then, the selected wavelengths are converted in turn into frequency signals using Formula 1, finally giving a record (f, r) for each wavelength;
several groups of wavelength information are selected and recorded in each region, and the list for the whole image information is finally obtained as all the selected wavelength information;
Formula 1:
$f = v/\lambda$ (1)
where f is the frequency, v is the speed of light, and λ is the wavelength;
(2) The collected signals are raw waveform signals; a Fourier transform is applied to them in the same way as for each audio window, and the result obtained is (f, r);
several groups of frequency information are selected and recorded in each area, and the list for the whole image signal is finally obtained as all the selected frequency information.
10. The DNA encoding system based on waveform characteristics storage of claim 9, wherein said storage module is specifically configured to:
converting the result obtained by the transformation module into quaternary numbers corresponding to the four bases in DNA (A/T/C/G), thereby obtaining a DNA sequence that stores the information.
11. An apparatus comprising a processor, a memory coupled to the processor, wherein,
the memory stores program instructions for implementing the DNA encoding method based on waveform characteristic storage of any one of claims 1-5;
the processor is used for executing the program instructions stored by the memory to carry out the DNA encoding.
12. A storage medium storing program instructions executable by a processor for performing the DNA encoding method based on waveform characteristic storage according to any one of claims 1 to 5.
CN202310129992.7A 2023-02-09 2023-02-09 DNA encoding method, system, equipment and storage medium based on waveform characteristic storage Pending CN116230094A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310129992.7A CN116230094A (en) 2023-02-09 2023-02-09 DNA encoding method, system, equipment and storage medium based on waveform characteristic storage

Publications (1)

Publication Number Publication Date
CN116230094A true CN116230094A (en) 2023-06-06

Family

ID=86576382

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310129992.7A Pending CN116230094A (en) 2023-02-09 2023-02-09 DNA encoding method, system, equipment and storage medium based on waveform characteristic storage

Country Status (1)

Country Link
CN (1) CN116230094A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination