GB2619782A - DNA storage coding optimization method based on double-strategy back spider algorithm - Google Patents

Info

Publication number
GB2619782A
Authority
GB
United Kingdom
Prior art keywords
sequence set
coding sequence
dna coding
dna
policy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
GB2211537.2A
Other versions
GB202211537D0 (en)
Inventor
Zhang Qiang
Wang Bin
Wang Pengfei
Wu Jieqiong
Wei Xiaopeng
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian University of Technology
Original Assignee
Dalian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202111101673.2A (CN113792877B)
Application filed by Dalian University of Technology
Publication of GB202211537D0
Publication of GB2619782A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/12Computing arrangements based on biological models using genetic models
    • G06N3/123DNA computing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Genetics & Genomics (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The present invention relates to a method for optimizing coding for DNA storage based on a dual-policy black widow optimization (BWO) algorithm. The method comprises: constructing a DNA coding sequence set that satisfies a current combined constraint, evaluating the fitness of the sequences in the set, and sorting them by fitness; optimizing the set with the dual-policy BWO algorithm to obtain optimized DNA coding sequences with high fitness; screening the optimized sequences against the combined constraint and retaining those that satisfy it; and merging the retained sequences into the DNA coding sequence set and outputting an optimal coding sequence set that satisfies the combined constraint. Applying the dual-policy BWO algorithm to the DNA coding sequence set optimizes the sequences, so that the optimized set performs better under the fitness function. The optimized set is then screened by a terminal constraint to construct a DNA coding sequence set with stable physical and thermodynamic properties.

Description

METHOD FOR OPTIMIZING CODING FOR DEOXYRIBONUCLEIC ACID (DNA) STORAGE BASED ON DUAL-POLICY BLACK WIDOW OPTIMIZATION (BWO) ALGORITHM
CROSS REFERENCE TO RELATED APPLICATION
[0001] This patent application claims the benefit and priority of Chinese Patent Application No. 202111101673.2, entitled "A method for optimizing coding for deoxyribonucleic acid (DNA) storage based on a dual-policy black widow optimization (BWO) algorithm", filed on September 18, 2021, the disclosure of which is incorporated by reference herein in its entirety as part of the present application.
TECHNICAL FIELD
[0002] The present disclosure relates to the field of coding design in DNA storage, and specifically, to a method for optimizing coding for DNA storage based on a BWO algorithm.
BACKGROUND ART
[0003] In the context of data explosion, DNA is considered an ideal carrier for information storage due to its advantages of high storage density, abundant resources, easy access, long storage time, and low energy consumption. In recent years, DNA storage technology has been continuously developed and applied, and various coding methods for DNA coding sets have also emerged.
[0004] Currently, a widely used method is to combine intelligent algorithms and population-based algorithms with a DNA coding method. However, the update manners and logic of some populations are simple, which makes them prone to insufficient population diversity and local optima. Due to the special biological properties of DNA sequences, sequences with low similarity can avoid non-specific hybridization as far as possible. Meanwhile, sequence sets with relatively stable physical and thermodynamic properties are less error-prone during storage. A high-quality DNA coding sequence set can reduce error rates during storage. Therefore, in addition to improving algorithm efficiency and quality and expanding storage sequence sets, how to improve the stability of the physical and thermodynamic properties of sequences is also an urgent issue to be addressed.
SUMMARY
[0005] The present disclosure aims to perform search and optimization on a DNA coding sequence set through a meta-heuristic algorithm, to finally obtain a DNA coding sequence set with more sequences and more stable physical and thermodynamic properties, thereby improving the quality of the DNA coding sequence set and ensuring the stability of DNA storage.
[0006] To achieve the above objectives, the present disclosure adopts the following technical solutions:
[0007] Firstly, a DNA coding sequence set that satisfies a current combined constraint is constructed, fitness of sequences in the DNA coding sequence set is evaluated, and the sequences are sorted by fitness; secondly, the DNA coding sequence set is optimized by using a dual-policy BWO algorithm, to obtain optimized DNA coding sequences with high fitness; thirdly, the optimized DNA coding sequences are screened again through a combined constraint, and sequences that satisfy the combined constraint are retained; and finally, the retained sequences are added into the DNA coding sequence set, and an optimal coding sequence set that satisfies the combined constraint is output.
[0008] The foregoing technical solution can achieve the following technical effects:
[0009] 1. A random swap policy and a weight-based selection policy are introduced to improve the development and exploration capabilities of the algorithm. The random swap policy improves the diversity of sequences. The weight-based selection policy selects different update methods when generating next-generation sequences: in the first half of the iterations, it searches for an optimal solution around the current sequences to improve the development capability of the algorithm; in the second half, it searches for alternative solutions far away from the current optimal solution to improve the exploration capability of the algorithm and prevent the sequence selection from falling into a local optimum.
[0010] 2. The optimized dual-policy BWO algorithm is applied to a DNA coding sequence set, to optimize the sequences. The optimized sequence set performs better in a fitness function.
[0011] 3. The optimized sequence set is screened through a terminal constraint to construct a DNA coding sequence set with stable physical and thermodynamic properties.
[0012] 4. The storage sequence set with stable physical and thermodynamic properties is applied to DNA storage to improve the stability and reliability of storage.
BRIEF DESCRIPTION OF THE DRAWINGS
[0013] FIG. 1 is a flowchart of a method for optimizing coding for DNA storage based on a dual-policy BWO algorithm.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0014] The technical solution in an embodiment of the present disclosure is now described clearly and completely with reference to the accompanying drawings for examples of the present disclosure. It will be understood that the described examples are merely a part of, rather than all, examples of the present disclosure. All other examples derived from the examples of the present disclosure by those skilled in the art without creative work shall fall within the protection scope of the present disclosure.
[0015] In the present disclosure, the algorithm is a dual-policy BWO algorithm, the constraints are Hamming distance, non-consecutiveness constraint, guanine-cytosine (GC) content, and terminal constraint, and the indicators used for measuring sequence stability are hairpin structure, melting temperature, and minimum free energy.
[0016] The two policies in the dual-policy BWO algorithm in the present disclosure are a random swap policy and a weight-based selection policy, which are used to improve the exploration and development capabilities of the algorithm. The random swap policy is used to enhance the diversity of sequences, and the weight-based selection policy is used to prevent results from falling into a local optimum and to enhance the optimization. The entire sequence update is divided into four stages: initialization, offspring generation, cannibalism, and mutation.
[0017] In the present disclosure, the Hamming distance constraint means that any two sequences in a sequence set must differ in at least n bases to maintain stability, and the GC content is the percentage of guanine (G) and cytosine (C) bases among all bases in a sequence. Because different base pairs form different numbers of hydrogen bonds, a relatively stable state can usually be achieved when the GC content of the storage sequence remains at 40% to 60%. In order to maintain stable biological properties of the sequence, the GC content in this embodiment is 50%. The non-consecutiveness constraint means that there cannot be more than two consecutive identical bases in a storage sequence. The terminal constraint means that there cannot be three or more G or C bases among the last five bases at the terminal of the storage sequence, and its expression is as follows:
[0018] $|G_{S_{Last5}}| + |C_{S_{Last5}}| < 3$ (1)
[0019] where $|G_{S_{Last5}}|$ and $|C_{S_{Last5}}|$ respectively represent the number of guanine (G) bases and the number of cytosine (C) bases among the last five bases of a sequence S.
[0020] Referring to FIG. 1, the specific steps are as follows:
[0021] Step 1: Initialize a DNA coding sequence set Pop and parameters required for updating a DNA coding sequence.
[0022] For example, the parameters required for updating a DNA coding sequence include nPop, nvar, pMutation, pCannibalism, and MIter, where nPop represents the number of sequences in the set; nvar represents the number of sequence dimensions; pMutation represents a mutation rate; pCannibalism represents a cannibalism rate; and MIter represents the maximum number of iterations.
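As an illustrative sketch only, the per-sequence checks of the combined constraint described in paragraph [0017], together with the pairwise Hamming distance, could be written as follows. The function names, the representation of a sequence as a string over A, C, G, T, and the default GC target of 50% with zero tolerance are assumptions made for this example and are not taken from the disclosure.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a non-empty sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def has_no_long_run(seq: str, max_run: int = 2) -> bool:
    """Non-consecutiveness: no more than max_run identical bases in a row."""
    run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        if run > max_run:
            return False
    return True


def satisfies_terminal(seq: str) -> bool:
    """Terminal constraint, formula (1): fewer than 3 G/C in the last 5 bases."""
    tail = seq[-5:]
    return tail.count("G") + tail.count("C") < 3


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def satisfies_sequence_constraints(seq: str, gc_target: float = 0.5,
                                   gc_tolerance: float = 0.0) -> bool:
    """Per-sequence part of the combined constraint (assumed parameters)."""
    gc_ok = abs(gc_content(seq) - gc_target) <= gc_tolerance + 1e-9
    return gc_ok and has_no_long_run(seq) and satisfies_terminal(seq)
```

The Hamming distance constraint is pairwise, so it is checked separately against the other members of the set rather than inside `satisfies_sequence_constraints`.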
[0023] Step 2: Use Hamming distance as a fitness function to perform fitness-based sorting on the initialized DNA coding sequence set Pop to obtain a current optimal solution.
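The disclosure does not state how a per-sequence fitness value is derived from the Hamming distance. The sketch below assumes, purely for illustration, that the fitness of a sequence is its minimum Hamming distance to any other sequence in Pop, so that the first element after sorting plays the role of the current optimal solution. The function names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def min_distance_fitness(index: int, pop: list[str]) -> int:
    """Assumed fitness: smallest Hamming distance from pop[index] to the rest.

    Requires at least two sequences in pop.
    """
    return min(hamming(pop[index], other)
               for j, other in enumerate(pop) if j != index)


def sort_by_fitness(pop: list[str]) -> list[str]:
    """Return the sequences ordered from highest to lowest assumed fitness."""
    scored = [(min_distance_fitness(i, pop), seq) for i, seq in enumerate(pop)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [seq for _, seq in scored]
```

Other definitions, such as the sum of distances to all other sequences, would fit the wording of Step 2 equally well.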
[0024] Step 3: Obtain a parameter S in a weight-based selection policy.
[0025] For example, the parameter S is initialized as 0; if the fitness of a current solution is smaller than that of the current optimal solution, the parameter S is divided by 2 to control its size; and if the fitness of the current solution is greater than that of the current optimal solution, the parameter S is increased by 1.
[0026] Step 4: Determine, during update of the DNA coding sequence set Pop, whether to use a random swap policy.
[0027] For example, during the sequence update, whether to use the random swap policy is determined by using formulas (2) and (3). If the parameter α is less than the parameter β, the random swap policy will be used to replace a dimension in Pop with a dimension of the current optimal solution; otherwise, the policy will not be used, and the current solution and optimal solution in Pop are maintained.
[0028] $\alpha = \tan(\pi \times (\mathrm{rand} - 0.5))$ (2)
[0029] $\beta = 1 - \mathrm{CIter}/\mathrm{MIter}$ (3)
[0030] where CIter is the current number of iterations, and rand is a random number in the interval (0, 1).
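A minimal sketch of the random swap decision in formulas (2) and (3) is given below. It assumes that a sequence is held as a list of per-position values and that the swapped dimension is chosen uniformly at random; the disclosure does not specify how the dimension is chosen, and the function names are hypothetical.

```python
import math
import random


def use_random_swap(c_iter: int, m_iter: int, rng=random) -> bool:
    """Return True when alpha < beta, following formulas (2) and (3)."""
    alpha = math.tan(math.pi * (rng.random() - 0.5))  # formula (2)
    beta = 1.0 - c_iter / m_iter                      # formula (3)
    return alpha < beta


def random_swap(current: list, best: list, rng=random) -> list:
    """Copy one dimension of the current optimal solution into a solution.

    Assumption: the dimension index is drawn uniformly at random.
    """
    d = rng.randrange(len(current))
    updated = list(current)
    updated[d] = best[d]
    return updated
```

Because β decreases from 1 towards 0 as CIter approaches MIter, the swap is triggered more often in early iterations than in late ones under this reading.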
[0031] Step 5: Obtain parameters Wa and Wb in the weight-based selection policy, and update the DNA coding sequence set Pop based on the parameters Wa and Wb.
[0032] For example, the parameters Wa and Wb used in the weight-based selection policy are generated by using formulas (4) and (5).
[0033] $W_a = (1 - \mathrm{CIter}/\mathrm{MIter})^{1 - (\mathrm{rand} - 0.5) \times S/\mathrm{MIter}}$ (4)
[0034] $W_b = (2 - 2 \times \mathrm{CIter}/\mathrm{MIter})^{1 - (\mathrm{rand} - 0.5) \times S/\mathrm{MIter}}$ (5)
[0035] The parameter S is generated in step 3. Wa and Wb have different value ranges. Wa is used to improve the development capability of the algorithm, and Wb is used to improve the exploration capability of the algorithm.
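The update of S from Step 3 and the two weights can be sketched as follows. The sketch reads formulas (4) and (5) as Wa = (1 − CIter/MIter)^(1 − (rand − 0.5) × S/MIter) and Wb = (2 − 2 × CIter/MIter)^(1 − (rand − 0.5) × S/MIter), which is an interpretation of the published text. Drawing rand independently for each weight and leaving S unchanged when the two fitness values are equal are further assumptions, and the function names are hypothetical.

```python
import random


def update_s(s: float, current_fitness: float, best_fitness: float) -> float:
    """Step 3: halve S on a worse solution, increment it on a better one."""
    if current_fitness < best_fitness:
        return s / 2
    if current_fitness > best_fitness:
        return s + 1
    return s  # assumption: the tie case is not described in the source


def weights(c_iter: int, m_iter: int, s: float, rng=random):
    """Return (Wa, Wb) under the assumed exponent form of formulas (4), (5).

    Intended for 0 <= c_iter < m_iter, so that both bases stay positive.
    """
    progress = c_iter / m_iter
    exponent_a = 1 - (rng.random() - 0.5) * s / m_iter
    exponent_b = 1 - (rng.random() - 0.5) * s / m_iter
    wa = (1 - progress) ** exponent_a
    wb = (2 - 2 * progress) ** exponent_b
    return wa, wb
```

With S = 0 the exponents equal 1, so Wa falls linearly from 1 and Wb falls linearly from 2 as the iterations proceed, which is consistent with the statement that the two weights have different value ranges.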
[0036] Step 6: Use the Hamming distance as the fitness function to obtain fitness of sequences in the updated DNA coding sequence set Pop, perform fitness-based sorting, retain a sequence that satisfies the Hamming distance constraint, and add the sequence to a DNA coding sequence set Pop2.
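[0037] For example, formula (6) is used to determine usage of the two parameters. In the first half of a sequence loop, Wa is used to generate offspring, that is, formula (7); in the second half of the sequence loop, Wb is used to generate offspring, that is, formula (8).
[0038] $\mathrm{CIter}/\mathrm{MIter} < 0.5$ (6)
[0039] $\begin{cases} y_1 = W_a \times \mathrm{rand} \mathbin{.\times} x_1 + (1 - \mathrm{rand}) \mathbin{.\times} x_2 \\ y_2 = W_a \times \mathrm{rand} \mathbin{.\times} x_2 + (1 - \mathrm{rand}) \mathbin{.\times} x_1 \end{cases} \quad \mathrm{CIter}/\mathrm{MIter} < 0.5$ (7)
[0040] $\begin{cases} y_1 = \mathrm{rand} \mathbin{.\times} x_1 + W_b \times (1 - \mathrm{rand}) \mathbin{.\times} x_2 \\ y_2 = \mathrm{rand} \mathbin{.\times} x_2 + W_b \times (1 - \mathrm{rand}) \mathbin{.\times} x_1 \end{cases} \quad \mathrm{CIter}/\mathrm{MIter} > 0.5$ (8)
[0041] Wa and Wb are generated by using formulas (4) and (5) respectively; x1 and x2 are sequences in the original DNA sequence set Pop; and y1 and y2 are sequences in the updated DNA sequence set Pop.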
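The offspring update of formulas (7) and (8) can be sketched as below. The sketch assumes that each sequence is encoded as a numeric vector of length nvar, that rand is a vector with one independent draw per dimension applied element-wise, and that the boundary case CIter/MIter = 0.5 is handled by the second branch. None of these details is stated explicitly in the source, and the function name is hypothetical.

```python
import random


def generate_offspring(x1, x2, c_iter, m_iter, wa, wb, rng=random):
    """Produce two offspring vectors from parents x1 and x2."""
    r = [rng.random() for _ in x1]  # one random coefficient per dimension
    if c_iter / m_iter < 0.5:
        # Formula (7): Wa scales the rand-weighted term.
        y1 = [wa * ri * a + (1 - ri) * b for ri, a, b in zip(r, x1, x2)]
        y2 = [wa * ri * b + (1 - ri) * a for ri, a, b in zip(r, x1, x2)]
    else:
        # Formula (8): Wb scales the (1 - rand)-weighted term.
        y1 = [ri * a + wb * (1 - ri) * b for ri, a, b in zip(r, x1, x2)]
        y2 = [ri * b + wb * (1 - ri) * a for ri, a, b in zip(r, x1, x2)]
    return y1, y2
```

Mapping the resulting real-valued vectors back to bases (for example by rounding to the nearest base index) is left out because the source does not specify the encoding.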
[0042] Fitness-based sorting is performed on the updated DNA sequence set Pop, a DNA sequence with low fitness is replaced, a DNA sequence with high fitness is retained in the current sequence set, and the retained sequences are stored in Pop2.
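One possible way to retain sequences that satisfy the Hamming distance constraint, as Step 6 requires, is a greedy pass over the fitness-sorted set. This is a sketch under assumptions: the greedy strategy, the threshold name `d_min`, and the function names are not taken from the disclosure, which does not describe the retention procedure in detail.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def retain_by_hamming(sorted_pop: list[str], d_min: int) -> list[str]:
    """Greedily build Pop2 from sequences ordered best-first.

    A sequence is kept only if it is at least d_min away from every
    sequence that has already been kept.
    """
    pop2: list[str] = []
    for seq in sorted_pop:
        if all(hamming(seq, kept) >= d_min for kept in pop2):
            pop2.append(seq)
    return pop2
```

Processing the sequences in descending fitness order means that, when two sequences conflict, the one with the higher fitness is the one that survives.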
[0043] Step 7: Obtain, based on the number of sequences in the current DNA coding sequence set Pop and a mutation rate, a partial data set that should be mutated, randomly swap values in any two dimensions in the partial data set, and store the mutation result to a DNA coding sequence set Pop3.
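The mutation of Step 7 can be sketched as follows, taking the size of the partial data set as the product of the set size and the mutation rate, as formula (9) specifies. Truncating that product to an integer, sampling the sequences to mutate without replacement, and the function name are assumptions for illustration.

```python
import random


def mutate(pop: list[str], p_mutation: float, rng=random) -> list[str]:
    """Return Pop3: mutated copies of a random subset of pop.

    Each selected sequence has the values at two distinct positions swapped.
    """
    n_mutation = int(len(pop) * p_mutation)  # formula (9), truncated
    pop3 = []
    for seq in rng.sample(pop, n_mutation):
        i, j = rng.sample(range(len(seq)), 2)
        bases = list(seq)
        bases[i], bases[j] = bases[j], bases[i]
        pop3.append("".join(bases))
    return pop3
```

A swap preserves the base composition of a sequence, so the GC content is unchanged by this mutation, while the non-consecutiveness and terminal constraints may be violated and need to be rechecked afterwards.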
[0044] For example, the number of sequences in the mutated partial data set is nMutation, as shown in formula (9). Values in two dimensions of a sequence in the partial data set are randomly swapped, and the mutation result is stored in Pop3.
[0045] $\mathrm{nMutation} = \mathrm{nPop} \times \mathrm{pMutation}$ (9)
[0046] Step 8: Filter out, by using the fitness function, sequences that do not satisfy a combined constraint from the DNA coding sequence set Pop2 and the DNA coding sequence set Pop3, and remove the sequences.
[0047] For example, the physical and thermodynamic properties of a sequence are separately measured by using the hairpin structure, melting temperature, and free energy. A calculation formula for the hairpin structure is as follows:
[0048] $\mathrm{Hairpin}(S) = \sum_{k} \mathrm{Hairpin}(S, k)$ (10)
[0049] where r represents a length of a shortest subsequence required to form a hairpin loop, and pinlen represents a length of a subsequence that forms a hairpin stem. In the sequence S, if a hairpin structure is generated at the k-th base of the sequence, and the number of complementary bases in the hairpin stem is more than half of the number of bases that make up a stem length, a value of Hairpin(S,k) is set to 1; otherwise, the value is set to 0.
[0050] For the DNA sequence set S with m sequences, $F_{Tm}(S)$ is used to represent a difference between Tm values of sequences, and a formula is as follows:
[0051] $F_{Tm}(S) = \sum_{i=1}^{m} \left(Tm(S_i) - \overline{Tm}(S)\right)^2$ (11)
[0052] where $Tm(S_i)$ represents a melting temperature of the i-th sequence in the DNA sequence set S, and $\overline{Tm}(S)$ represents a mean value of melting temperatures of the DNA sequence set S.
[0053] Step 9: Store the DNA coding sequence set Pop2 and the DNA coding sequence set Pop3 to the DNA coding sequence set Pop.
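Formula (11), which Step 8 uses to measure how consistent the melting temperatures of a set are, can be sketched as below. The disclosure does not state which melting-temperature model is used; the Wallace rule, Tm = 2 × (A + T) + 4 × (G + C) in degrees Celsius, is substituted here only as a simple stand-in for short sequences, and both function names are hypothetical.

```python
def wallace_tm(seq: str) -> int:
    """Rule-of-thumb melting temperature (Wallace rule), an assumed stand-in."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def tm_spread(tm_values: list[float]) -> float:
    """Formula (11): sum of squared deviations from the mean Tm of the set."""
    mean_tm = sum(tm_values) / len(tm_values)
    return sum((tm - mean_tm) ** 2 for tm in tm_values)
```

A usage example would be `tm_spread([wallace_tm(s) for s in pop])`; a smaller value indicates that the sequences in the set melt at more similar temperatures.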
[0054] Step 10: Determine whether the number of updates of the current DNA coding sequence set Pop has reached the maximum number of iterations MIter; and if yes, output the DNA coding sequence set Pop; otherwise, go to step 2.
[0055] Embodiment 1
[0056] In this embodiment, the length of a DNA code is 20, and the sequences have a Hamming distance n greater than or equal to 17 and also satisfy the combined constraint.
[0057] Step 1: Initialize parameters required for updating a DNA coding sequence, where nPop=2000, nvar=20, pMutation=0.4, pCannibalism=0.5, and MIter=2500. Perform filtering on an initialized DNA sequence set with the parameter settings based on a combined constraint, and add a sequence that satisfies the combined constraint to a sequence set Pop as an initial DNA coding sequence set. With the settings, a size of the initial sequence set is 130.
[0058] Step 2: Use Hamming distance as a fitness function to sort the sequences in Pop to obtain a current optimal solution.
[0059] Step 3: Obtain a parameter S in a weight-based selection policy by comparing a current solution with the optimal solution. If fitness of the current solution is less than that of the current optimal solution, divide S by 2; and if the fitness of the current solution is greater than that of the current optimal solution, increase S by 1.
[0060] Step 4: Use a random swap policy in the current sequence set Pop to perform sequence mutation.
[0061] Step 5: Use formulas (4) and (5) to generate parameters Wa and Wb used in the weight-based selection policy; and use formula (6) to determine usage of the two parameters, where in the first half of a sequence loop, Wa is used to generate offspring, that is, formula (7); in the second half of the sequence loop, Wb is used to generate offspring, that is, formula (8).
[0062] Step 6: Perform fitness-based sorting on Pop, replace a DNA sequence with low fitness, and retain a DNA sequence with high fitness in the current sequence set, and store the retained sequence to Pop2.
[0063] Step 7: Randomly swap values in two dimensions of each of nMutation DNA sequences, and store the mutation result to Pop3.
[0064] Step 8: Filter out, by using the fitness function, sequences that do not satisfy a constraint from Pop2 and Pop3, and remove the sequences.
[0065] Step 9: Store Pop2 and Pop3 to Pop.
[0066] Step 10: Determine whether the number of updates of the current DNA coding sequence set Pop has reached the maximum number of iterations; and if yes, output the DNA coding sequence set Pop; otherwise, go to step 2.
[0067] The present disclosure proposes a method for combining the dual-policy BWO algorithm with the combined constraint to construct a stable DNA coding sequence set. The random swap policy and weight-based selection policy are introduced into the dual-policy BWO algorithm, which improves the development and exploration capabilities of the algorithm while improving the sequence diversity. The experiments of the present disclosure are completed on a desktop computer with Intel(R) CPU 3.6 GHz, 4.0 GB RAM, and Windows 8.
[0068] Table 1 Results when n=10 and d>7
TCTGCAGAGT    GCAGATCTAG    TCGCACTATG
TGTCGTAGTC    AGCAGTCAGA    TGCGAGATCA
CTCACTACAG    GTACGCATGA    CTGTCGTACT
CGTGTCTCTA    AGAGCATGAC    CGATGAGATG
[0069] Table 2 Physical and thermodynamic performance of a sequence set when n=9 and d>7
Number of hairpin structures    FTm(S)    6(0)
38                              3.65      0.34
[0070] Table 1 and Table 2 respectively show the results of a sequence set when the sequence length n=10 and Hamming distance d>7 and the physical and thermodynamic performance when the sequence length n=9 and Hamming distance d>7. According to comparison between the experiment results and other known coding methods, the present disclosure can not only increase the code quantity, but also improve the physical and thermodynamic stability of the coding set.
[0071] The above are merely descriptions of preferred embodiments, but are not intended to limit the present disclosure. It should be noted that many modifications and variations can be made by those of ordinary skill in the art without departing from the technical principle of the present disclosure. These modifications and variations should also be deemed as falling within the protection scope of the present disclosure.

Claims (10)

WHAT IS CLAIMED IS:
  1. A method for optimizing coding for deoxyribonucleic acid (DNA) storage based on a dual-policy black widow optimization (BWO) algorithm, comprising the following steps: step 1: initializing a DNA coding sequence set Pop and parameters required for updating a DNA coding sequence; step 2: using Hamming distance as a fitness function to perform fitness-based sorting on the initialized DNA coding sequence set Pop to obtain a current optimal solution; step 3: obtaining a parameter S in a weight-based selection policy; step 4: determining, during update of the DNA coding sequence set Pop, whether to use a random swap policy; step 5: obtaining parameters Wa and Wb in the weight-based selection policy, and updating the DNA coding sequence set Pop based on the parameters Wa and Wb; step 6: using the Hamming distance as the fitness function to obtain fitness of sequences in the updated DNA coding sequence set Pop, performing fitness-based sorting, retaining a sequence that satisfies the Hamming distance constraint, and adding the sequence to a DNA coding sequence set Pop2; step 7: obtaining, based on the number of sequences in the current DNA coding sequence set Pop and a mutation rate, a partial data set that should be mutated, randomly swapping values in any two dimensions in the partial data set, and storing the mutation result to a DNA coding sequence set Pop3; step 8: filtering out, by using the fitness function, sequences that do not satisfy a combined constraint from the DNA coding sequence set Pop2 and the DNA coding sequence set Pop3, and removing the sequences; step 9: storing the DNA coding sequence set Pop2 and the DNA coding sequence set Pop3 to the DNA coding sequence set Pop; and step 10: determining whether the update times of the current DNA coding sequence set Pop reach the maximum number of iterations; and if yes, outputting the DNA coding sequence set Pop; otherwise, going to step 2.
  2. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 1, wherein the parameters required for updating a DNA coding sequence comprise: sequence size nPop, number of sequence dimensions nvar, mutation rate pMutation, cannibalism rate pCannibalism, and maximum number of iterations MIter.
  3. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 1, wherein the combined constraint comprises a Hamming distance constraint, guanine-cytosine (GC) content, a non-consecutiveness constraint, and a terminal constraint, wherein the terminal constraint is as follows: $|G_{S_{Last5}}| + |C_{S_{Last5}}| < 3$ (1) wherein $|G_{S_{Last5}}|$ and $|C_{S_{Last5}}|$ respectively represent the number of guanine (G) bases and the number of cytosine (C) bases among the last five bases of a sequence S.
  4. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 1, wherein the determining, during update of the DNA coding sequence set Pop, whether to use a random swap policy is specifically: $\alpha = \tan(\pi \times (\mathrm{rand} - 0.5))$ (2) $\beta = 1 - \mathrm{CIter}/\mathrm{MIter}$ (3) wherein rand is a random number in the interval (0, 1), CIter is the current number of iterations, and MIter is the maximum number of iterations; and if the parameter α is less than the parameter β, a random swap policy will be used to replace a dimension in the DNA coding sequence set Pop with a dimension of the current optimal solution; otherwise, the random swap policy will not be used, and a current solution and the optimal solution in the DNA coding sequence set Pop are maintained.
  5. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 1, wherein the manners for obtaining parameters Wa and Wb in the weight-based selection policy are: $W_a = (1 - \mathrm{CIter}/\mathrm{MIter})^{1 - (\mathrm{rand} - 0.5) \times S/\mathrm{MIter}}$ (4) $W_b = (2 - 2 \times \mathrm{CIter}/\mathrm{MIter})^{1 - (\mathrm{rand} - 0.5) \times S/\mathrm{MIter}}$ (5) wherein the parameter S is generated in step 3 and initialized as 0; if fitness of a current solution is smaller than that of the current optimal solution, the parameter S is divided by 2; and if the fitness of the current solution is greater than that of the current optimal solution, S is increased by 1.
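  6. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 5, wherein the manners for updating the DNA coding sequence set Pop based on the parameters Wa and Wb are: $\mathrm{CIter}/\mathrm{MIter} < 0.5$ (6) $\begin{cases} y_1 = W_a \times \mathrm{rand} \mathbin{.\times} x_1 + (1 - \mathrm{rand}) \mathbin{.\times} x_2 \\ y_2 = W_a \times \mathrm{rand} \mathbin{.\times} x_2 + (1 - \mathrm{rand}) \mathbin{.\times} x_1 \end{cases} \quad \mathrm{CIter}/\mathrm{MIter} < 0.5$ (7) $\begin{cases} y_1 = \mathrm{rand} \mathbin{.\times} x_1 + W_b \times (1 - \mathrm{rand}) \mathbin{.\times} x_2 \\ y_2 = \mathrm{rand} \mathbin{.\times} x_2 + W_b \times (1 - \mathrm{rand}) \mathbin{.\times} x_1 \end{cases} \quad \mathrm{CIter}/\mathrm{MIter} > 0.5$ (8) wherein Wa and Wb are generated by using formulas (4) and (5) respectively, x1 and x2 are sequences in the original DNA sequence set Pop, and y1 and y2 are sequences in the updated DNA sequence set Pop.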
  7. A method for optimizing coding for DNA storage based on a dual-policy BWO algorithm, comprising: step 1: initializing a DNA coding sequence set Pop and parameters required for updating a DNA coding sequence, to obtain the initialized DNA coding sequence set Pop; step 2: using Hamming distance as a fitness function to perform fitness-based sorting on the initialized DNA coding sequence set Pop to obtain a current optimal solution; step 3: obtaining parameters Wa and Wb in a weight-based selection policy, and updating the DNA coding sequence set Pop based on the parameters Wa and Wb to obtain the updated DNA coding sequence set Pop; step 4: using the Hamming distance as the fitness function to obtain fitness of sequences in the updated DNA coding sequence set Pop, performing fitness-based sorting, retaining a sequence that satisfies the Hamming distance constraint, and adding the sequence to a DNA coding sequence set Pop2; step 5: determining, based on the number of sequences in the updated DNA coding sequence set Pop and a mutation rate, a partial data set to be mutated, randomly swapping values in any two dimensions in the partial data set, and storing the mutation result to a DNA coding sequence set Pop3; step 6: filtering out, by using the fitness function, sequences that do not satisfy a combined constraint from the DNA coding sequence set Pop2 and the DNA coding sequence set Pop3, and removing the sequences to obtain the processed DNA coding sequence set Pop2 and the processed DNA coding sequence set Pop3; step 7: storing the processed DNA coding sequence set Pop2 and the processed DNA coding sequence set Pop3 to the updated DNA coding sequence set Pop to obtain a DNA coding sequence set Pop4; and step 8: determining whether the update times of the DNA coding sequence set Pop4 reach the maximum number of iterations; and if yes, outputting the DNA coding sequence set Pop4; otherwise, going to step 2.
  8. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 7, wherein before the obtaining parameters Wa and Wb in a weight-based selection policy, the method further comprises: obtaining a parameter S in the weight-based selection policy; and initializing the parameter S as 0, wherein if fitness of a current solution is smaller than that of the current optimal solution, the parameter S is divided by 2; and if the fitness of the current solution is greater than that of the current optimal solution, S is increased by 1.
  9. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 8, wherein before the obtaining parameters Wa and Wb in a weight-based selection policy, the method further comprises: determining, during update of the DNA coding sequence set Pop, whether to use a random swap policy, which is specifically: $\alpha = \tan(\pi \times (\mathrm{rand} - 0.5))$ (2) $\beta = 1 - \mathrm{CIter}/\mathrm{MIter}$ (3) wherein rand is a random number in the interval (0, 1), CIter is the current number of iterations, and MIter is the maximum number of iterations; and if the parameter α is less than the parameter β, the random swap policy will be used to replace a dimension in the initialized DNA coding sequence set Pop with a dimension of the current optimal solution; otherwise, the random swap policy will not be used, and the current solution and the optimal solution in the initialized DNA coding sequence set Pop are maintained.
  10. The method for optimizing coding for DNA storage based on a dual-policy BWO algorithm according to claim 9, wherein the manners for obtaining parameters Wa and Wb in a weight-based selection policy are: $W_a = (1 - \mathrm{CIter}/\mathrm{MIter})^{1 - (\mathrm{rand} - 0.5) \times S/\mathrm{MIter}}$ (4) $W_b = (2 - 2 \times \mathrm{CIter}/\mathrm{MIter})^{1 - (\mathrm{rand} - 0.5) \times S/\mathrm{MIter}}$ (5).
GB2211537.2A 2021-09-18 2022-05-27 DNA storage coding optimization method based on double-strategy back spider algorithm Pending GB2619782A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111101673.2A CN113792877B (en) 2021-09-18 2021-09-18 DNA storage coding optimization method based on double-strategy black spider algorithm
PCT/CN2022/095523 WO2023040343A1 (en) 2021-09-18 2022-05-27 Dna storage coding optimization method based on double-strategy black spider algorithm

Publications (2)

Publication Number Publication Date
GB202211537D0 GB202211537D0 (en) 2022-09-21
GB2619782A true GB2619782A (en) 2023-12-20

Family

ID=88874458

Family Applications (1)

Application Number Title Priority Date Filing Date
GB2211537.2A Pending GB2619782A (en) 2021-09-18 2022-05-27 DNA storage coding optimization method based on double-strategy back spider algorithm

Country Status (1)

Country Link
GB (1) GB2619782A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109389206A (en) * 2018-09-26 2019-02-26 大连大学 The DNA encoding sequence optimisation method of mixing bat algorithm based on non-dominated ranking
CN109559782A (en) * 2018-11-08 2019-04-02 武汉科技大学 A kind of DNA sequence encoding method based on multi-objective genetic algorithm
CN110533096A (en) * 2019-08-27 2019-12-03 大连大学 The DNA of multiverse algorithm based on K-means cluster stores Encoding Optimization
CN111292808A (en) * 2020-02-14 2020-06-16 大连大学 DNA storage coding optimization method based on improved Harris eagle algorithm
CN113792877A (en) * 2021-09-18 2021-12-14 大连大学 DNA storage coding optimization method based on dual-strategy black spider algorithm

Also Published As

Publication number Publication date
GB202211537D0 (en) 2022-09-21
