WO2015084704A1 - Computational tools for genomic sequencing and macromolecular analysis - Google Patents

Publication number
WO2015084704A1
Authority
WO
WIPO (PCT)
Prior art keywords
undefined
semantic network
database
false
values
Prior art date
Application number
PCT/US2014/067868
Other languages
French (fr)
Inventor
Roger MIDMORE
Original Assignee
Midmore Roger
Priority date
Filing date
Publication date
Priority claimed from US14/095,416 (US9672466B2)
Application filed by Midmore Roger
Priority to CN201480066317.3A (CN105793858B)
Priority to EP14867303.1A (EP3077941A4)
Publication of WO2015084704A1

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B: BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00: ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30: Data warehousing; Computing architectures

Abstract

A four valued, parallelized simulation is disclosed that enables computer systems to combine a mixture of computer intensive techniques in DNA analysis and molecular crystallography. Disclosed systems and methods aid in the design of drugs, pharmaceutical research, genetically modified organisms and the detection of genetic sequences for gene therapy.

Description

Invention Title
Computational Tools For Genomic Sequencing and Macromolecular Analysis
[0001] Cross Reference to Related Applications
[0002] This application is a continuation-in-part of U.S. Patent Application 14/095,416, filed on December 3, 2013, the contents of which are incorporated herein by reference.
[0003] Copyright and Trademark Notice
[0004] This application includes material which is subject or may be subject to copyright and/or trademark protection. The copyright and trademark owner(s) has no objection to the facsimile reproduction by anyone of the patent disclosure, as it appears in the Patent and Trademark Office files or records, but otherwise reserves all copyright and trademark rights whatsoever.
[0005] Background of the Invention
[0006] (1) Field of the Invention
[0007] The invention generally relates to computational tools for genomic sequencing and macromolecular analysis.
[0008] (2) Description of the Related Art
[0009] In the related art, various computational tools and machines for genome sequencing and analysis have been disclosed. However, the prior art lacks the efficiency of the presently disclosed embodiments.
[0010] Brief Summary of the Invention
[0011] The present invention overcomes shortfalls in the related art by presenting unobvious and unique combinations, configurations and uses of methods, systems and means that reduce the time and computational costs traditionally associated with the testing, manipulation and analysis of data in computer architectures.
[0012] Disclosed embodiments overcome the shortfalls in the related art by presenting a notation that allows for the encoding of both syntactic and semantic information into a two bit vector notation associated with a semantic node in a semantic network. Disclosed embodiments also overcome shortfalls in the art by encoding the property each feature assumes in recursive predicate analysis.
[0013] Brief Description of the Drawings
[0014] Fig. 1 depicts a disclosed logic
[0015] Fig. 2 depicts a machine implementation
[0016] Fig. 3 depicts graphical representation of a semantic network
[0017] Fig. 4 depicts the assignment of a property to a particular index within array
[0018] Fig. 5 depicts a disclosed general layout of principled data structures
[0019] Fig. 6 depicts computations of complex analogies
[0020] Fig. 7 is a continuation of Fig. 6
[0021] Fig. 8 is a continuation of Fig. 6
[0022] Fig. 9 depicts several properties that may be computed within a disclosed system
[0023] These and other aspects of the present invention will become apparent upon reading the following detailed description in conjunction with the associated drawings.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
[0024] The following detailed description is directed to certain specific embodiments of the invention. However, the invention can be embodied in a multitude of different ways as defined and covered by the claims and their equivalents. In this description, reference is made to the drawings wherein like parts are designated with like numerals throughout.
[0025] Unless otherwise noted in this specification or in the claims, all of the terms used in the specification and the claims will have the meanings normally ascribed to these terms by workers in the art.
[0026] Unless the context clearly requires otherwise, throughout the description and the claims, the words "comprise," "comprising" and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in a sense of "including, but not limited to." Words using the singular or plural number also include the plural or singular number, respectively. Additionally, the words "herein," "above," "below," and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of this application.
[0027] The above detailed description of embodiments of the invention is not intended to be exhaustive or to limit the invention to the precise form disclosed above. While specific embodiments of, and examples for, the invention are described above for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize. For example, while steps are presented in a given order, alternative embodiments may perform routines having steps in a different order. The teachings of the invention provided herein can be applied to other systems, not only the systems described herein. The various embodiments described herein can be combined to provide further embodiments. These and other changes can be made to the invention in light of the detailed description.
[0028] All the above references and U.S. patents and applications are incorporated herein by reference. Aspects of the invention can be modified, if necessary, to employ the systems, functions and concepts of the various patents and applications described above to provide yet further embodiments of the invention.
[0029] Reference Numbers
[0030] 100 non-transitory machine readable medium, sometimes containing machine readable instructions
[0031] 200 a general or specialized processor
[0032] 300 memory, sometimes non-volatile
[0033] 410 database of one or more semantic networks
[0034] 420 database of vector arrays
[0035] 430 database of logical connectives
[0036] 440 database of grammar phrase structure implementations
[0037] 450 database of system reports
[0038] 500 semantic network
[0039] 510 objects
[0040] 520 relations
[0041] 600 runtime stack and heap
[0042] 700 system clock
[0043] 800 top down / bottom up parser
[0044] 900 hash table of constructive primitive formulas
[0045] 910 hash table of the functors and terms
[0046] 920 properties that are immediately passed securable
[0047] 921 general chart parser
[0048] 922 solution state at time t
[0049] 923 solution state at time t plus one
[0050] 930 hash table of the lexicon (class)
[0051] 940 symbol table
[0052] 1000 depicts an analogical example using the four valued logic and reformulating the prior art of Sheldon Klein from his paper "Culture, Mysticism and Social Structure and the Calculation of Behavior," Computer Sciences Technical Report #462, Dec. 1981.
[0053] 1010 depicts a continuing example of 1000
[0054] 1020 depicts a continuing example of 1000
[0055] 1030 depicts an accompanying pictorial analogy of 1000
[0056] 1040 depicts an accompanying pictorial analogy of 1010
[0057] 1100 depicts a continuing analogical computation from 1000
[0058] 1110 depicts a continuing analogical computation from 1000
[0059] 1120 depicts a continuing analogical computation from 1000
[0060] 1130 depicts a continuing pictorial computation from 1000
[0061] 1140 depicts a continuing pictorial computation from 1000
[0062] 1200 depicts a continuing analogical computation from 1000
[0063] 1210 depicts a continuing analogical computation from 1000
[0064] 1220 depicts a continuing pictorial computation from 1000
[0065] 1230 depicts a continuing pictorial computation from 1000
[0066] 1240 depicts a question mark which represents a complex analogy to have been computed
[0067] 1300 depicts a bond angle bend
[0068] 1310 depicts a bond stretch
[0069] 1320 depicts a torsional strain
[0070] 1340 depicts DNA
[0071] Referring to Fig. 1, a diagram of the basic binary operators and negation for a four valued logic is described, ignoring monotonic arguments for negation. These operators are used in proving the completeness of a family of logics. These logics can be derived from a variety of different arguments: from considerations of Boolean groupings on the truth values, from a pre-ordering of the truth tables into a lattice structure, or from set theoretic and recursive definitions. All are constructed to preserve some of the primary axioms of classical logic. Explicitly modeling in the semantic network the recursive values that the truth values assume simplifies the testing of conditionals and the quantification of variables. The undefined value, the default value for growth of the system, allows for dynamic, benign encoding into the network, a logical property attributable to many Kleene logics. The fourth property allows for the proper quantification and binding of variables, eliminating the effects of the newer truth values in subsequent steps of the calculation. It also provides the possibility of introducing a constructively acceptable "tertium non datur" for decision procedures for modeling Markov processes in the logic.
[0072] By encoding each property with a specific bit in the bit vector, linear scaling may be maintained. This system is a departure from the prior art in compiler design for creating symbol tables and testing features, and it aids extended stack compiler implementations.
[0073] In the first column of Fig. 1, the logical NOT operator is shown; in the second column of Fig. 1, the AND operator is shown as Λ; and in the third column of Fig. 1, the OR operator is shown as V. The first column shows the values before application of the NOT operator. For example, in the first row of the first column, the value F is shown before application of the NOT operator and T is shown as the result.
[0074] In the second column, the AND operator takes one value from the first column and one value from the first row and shows the result of the logical operation where the column value and row value intersect. In the third column, the OR operator is applied in a similar manner. For example, in the third column, selecting the last element of the first row (D) and the second element of the first column (F) results in a value of D.
[0075] Referring to Fig. 2, a machine implementation is shown using a machine readable, non-transitory media 100, the media 100 having machine readable instructions sent to a general or specialized processor 200. The processor 200 may be in communication with memory 300, a plurality of databases and other components, such as a network, user interfaces and other implements. The plurality of databases may include a database 410 of one or more semantic networks, such as the network system of Fig. 3; a database 420 of vector arrays, in which the arrays may be associated with each semantic node or other network component; a database 430 of logical connectives, such as the connectives of Fig. 1; a database 440 of grammar phrase structure implementations; and a database of other disclosed components. Fig. 5 also depicts a system clock 700, top down / bottom up parser 800 and runtime stack and heap 600.
[0076] Referring to Fig. 3, a graphical representation of a semantic network 500 is shown with objects 510 and relations 520, with all objects and relations being nodes in memory or in a database.
[0077] Fig. 4 depicts a graphical representation of the two bit vector arrays associated with a semantic node in memory. Fig. 4 further shows the assignment of the truth value across the two arrays, with X being a specific index into the array. The word size in the figure is a consequence of word size limitations in computer architecture. This causes a chunking factor for implementations of the array.
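The arrangement of Fig. 4 can be illustrated with a minimal sketch, assuming each semantic node carries two bit vectors and that the bit pair at index X uses the encoding given later in the description (True [1,1], False [0,1], Undefined [0,0], Defined [1,0]). The class name `NodeVectors` and its methods are hypothetical and are not taken from the disclosure; Python integers stand in for the arrays, so the word-size chunking mentioned above is not modeled here.

```python
from dataclasses import dataclass

# Encoding from paragraph [0082]: (first bit, second bit) for each value.
ENCODING = {"T": (1, 1), "F": (0, 1), "U": (0, 0), "D": (1, 0)}
DECODING = {bits: symbol for symbol, bits in ENCODING.items()}


@dataclass
class NodeVectors:
    """Hypothetical pair of bit vectors attached to one semantic node."""

    first: int = 0   # bit X holds the first bit of the property at index X
    second: int = 0  # bit X holds the second bit of the property at index X

    def set_value(self, index: int, symbol: str) -> None:
        """Assign one of F, T, U, D to the property stored at `index`."""
        a, b = ENCODING[symbol]
        mask = 1 << index
        self.first = (self.first & ~mask) | (a << index)
        self.second = (self.second & ~mask) | (b << index)

    def get_value(self, index: int) -> str:
        """Read back the symbol stored at `index`."""
        a = (self.first >> index) & 1
        b = (self.second >> index) & 1
        return DECODING[(a, b)]


node = NodeVectors()
node.set_value(3, "T")
node.set_value(70, "D")  # an index past 64 bits; Python integers grow as needed
```

Under this encoding an index that has never been written reads back as U, which agrees with the description of undefined as the default value.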
[0078] Fig. 5 is a simple diagram depicting a general layout of the data structures, assumed to be all contained in the same space of random access memory or RAM in a constructive formalization to highlight disclosed system diagnostics. 930 is a hash table of a lexicon for enforcing class membership. 910 is a hash table for functors and terms. 900 is a hash table for formulas. 920 is shared memory for a chart parser and a solution state of a computation controlling what is immediately securable in a simulation. 940 depicts a symbol hash table responsible for mapping properties in their assignment in a bit vector.
[0079] Fig. 5 highlights the logical divisions in constructive analysis that are important for analysis of the system in general and of how it uses computer resources for specific algorithms in specific environments. Functors and terms are logical terms from Kleene's The Foundations of Intuitionistic Mathematics and are equivalent to the objects and relations in the writings of Klein. By restricting Klein's semantic triple to its 2-tuple subset, one may model formulas consisting of Kleene's primitive recursive functions in the system. The notion of immediate securability is taken care of by the chart parser and its control of the solution state for the simulation. The system may be seen as allowing a timed memory access (i.e. a massive sequential write from memory to the processes on the solution state (blackboard)) as the parser switches the blackboard from time T to time T+1. It is this timing by the parser with the system clock that allows the use of all system resources to be determined, as this write may be given to a distributed system (i.e. a network) and be seen as the timing of transmission in information theory for analysis of the system. The arrows and boxes represent the linkages (pointers) between related entries in the hash tables for the formulas (2-tuples and 3-tuple triples), terms/formulas (objects/relations) and lexicon (class) for lookup (search).
[0080] Kleene's formalization is very restrictive, and one may loosen the logical standards to include general recursive formulas (allowing Klein's triple semantic notation in its fullest) as well as the notion of class, which is taken care of by the lexicon in Klein's theories. Equating lambda definability and Kleene's notion of special realizability with Markov's notion of algorithm in The Theory of Algorithms allows for a more colloquial presentation of the system. All that is needed in the diagram is to replace Lexicon with Markov's syllables, Objects/Relations with Markov's words, and formulas with Markov's notion of normal algorithms. His general rewrite system can then be assumed, with its abilities in general pattern matching and replacement of strings in DNA sequences or in strings more generally.
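As an illustration of the kind of string rewriting a Markov normal algorithm performs, the following is a minimal sketch, not the disclosed implementation. The function name `run_normal_algorithm` and both rule sets are hypothetical examples chosen for this sketch. At each step the first rule whose pattern occurs in the string is applied to the leftmost occurrence; a terminal rule stops the run, as does the absence of any applicable rule.

```python
from typing import List, Tuple

# A rule is (pattern, replacement, is_terminal).
Rule = Tuple[str, str, bool]


def run_normal_algorithm(rules: List[Rule], text: str, max_steps: int = 10_000) -> str:
    """Apply an ordered rule list in the manner of a Markov normal algorithm."""
    for _ in range(max_steps):
        for pattern, replacement, terminal in rules:
            position = text.find(pattern)  # leftmost occurrence, or -1
            if position == -1:
                continue
            text = text[:position] + replacement + text[position + len(pattern):]
            if terminal:
                return text
            break  # restart from the first rule after every rewrite
        else:
            return text  # no rule applies, so the algorithm halts
    raise RuntimeError("step limit reached; the rule set may not terminate")


# Hypothetical rule sets for illustration only.
TRANSCRIBE = [("T", "U", False)]          # rewrite every T to U
MARK_SITE = [("GAATTC", "G|AATTC", True)]  # mark the first GAATTC motif, then stop

rna = run_normal_algorithm(TRANSCRIBE, "GATTACA")
marked = run_normal_algorithm(MARK_SITE, "CCGAATTCGGGAATTC")
```

The step limit is a practical guard added for the sketch, since an arbitrary rule set need not terminate.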
[0081] Fig. 6 is a diagram of a reformulation of a three valued analogical example given by Prof. Klein. It maps the value true to [1,1] and uses the strong equivalence operator for analogical relations. The exclusive OR is preferred in implementation, so as not to make the system machine dependent and because the strong equivalence operator is absent from major programming languages. Strong equivalence is nevertheless shown in the figure because it is the operator preferred by logicians; one may interchange the four-valued strong equality operator with its two-valued counterpart and use the notion of traditional equality when reviewing the logical literature.
[0082] The metamathematical values in figure 6 are True mapped to [1,1], False mapped to [0,1], Undefined [0,0], and Defined [1,0].
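The negation and the two binary connectives tabulated under Item 1 can be written down as plain lookup tables. The sketch below is one hedged way to do so; the names `NOT`, `AND_ROWS`, `OR_ROWS`, `f_and`, `f_or` and `AS_SET` are assumptions of this example. Rows are read as the left operand and columns as the right operand, in the order F, T, U, D.

```python
VALUES = "FTUD"

# Negation as listed in Item 1: -F is T, -T is F, -U is D, -D is U.
NOT = {"F": "T", "T": "F", "U": "D", "D": "U"}

# Each string is one table row; its characters are the columns F, T, U, D.
AND_ROWS = {"F": "FFFF", "T": "FTUD", "U": "FUUF", "D": "FDFD"}
OR_ROWS = {"F": "FTUD", "T": "TTTT", "U": "UTUT", "D": "DTTD"}


def f_and(x: str, y: str) -> str:
    """Four valued AND, looked up in the Item 1 table."""
    return AND_ROWS[x][VALUES.index(y)]


def f_or(x: str, y: str) -> str:
    """Four valued OR, looked up in the Item 1 table."""
    return OR_ROWS[x][VALUES.index(y)]


# Set theoretic reading from Item 1: {} undefined, {T} true, {F} false, {T, F} defined.
AS_SET = {
    "U": frozenset(),
    "T": frozenset({"T"}),
    "F": frozenset({"F"}),
    "D": frozenset({"T", "F"}),
}

example = f_or("D", "F")  # the D and F case discussed for the third column of Fig. 1
```

A table lookup keeps the sketch independent of any particular bit level trick, so it can serve as a reference when checking faster implementations.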
[0083] Fig. 7 is a continuation of Fig. 6.
[0084] Fig. 8 is a continuation of Fig. 6.
[0085] Fig. 9 is a diagram of some of the macromolecular properties capable of being modelled in the system. 1310 is a diagram of bond stretching. 1300 is a diagram of angle bends. 1320 is a diagram of rotational or torsional stretching. 1340 depicts a DNA molecule.
[0086] These are some of the general molecular mechanical properties that are used in modeling. These properties may be interchanged with quantum mechanics, but the computation time will increase significantly with this switch away from classical mechanics.
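The disclosure does not give functional forms for these properties. As a hedged illustration only, the sketch below uses the harmonic and cosine terms commonly found in classical force fields; the function names and all parameter values are assumptions of this example, not values from the specification.

```python
import math


def bond_stretch_energy(r: float, r0: float, k: float) -> float:
    """Harmonic bond stretch: 0.5 * k * (r - r0)^2 (assumed form)."""
    return 0.5 * k * (r - r0) ** 2


def angle_bend_energy(theta: float, theta0: float, k: float) -> float:
    """Harmonic angle bend: 0.5 * k * (theta - theta0)^2, angles in radians (assumed form)."""
    return 0.5 * k * (theta - theta0) ** 2


def torsion_energy(phi: float, barrier: float, periodicity: int, phase: float = 0.0) -> float:
    """Cosine torsion term: 0.5 * barrier * (1 + cos(n * phi - phase)) (assumed form)."""
    return 0.5 * barrier * (1.0 + math.cos(periodicity * phi - phase))


# Hypothetical parameters, chosen only to exercise the three terms.
total = (
    bond_stretch_energy(r=1.10, r0=1.00, k=300.0)
    + angle_bend_energy(theta=math.radians(112.0), theta0=math.radians(109.5), k=80.0)
    + torsion_energy(phi=math.radians(60.0), barrier=2.0, periodicity=3)
)
```

Each term is a closed form expression evaluated per bond, angle or dihedral, which is why a classical treatment is far cheaper than a quantum mechanical one.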
[0087] These and other changes can be made to the invention in light of the above detailed description. In general, the terms used in the following claims should not be construed to limit the invention to the specific embodiments disclosed in the specification, unless the above detailed description explicitly defines such terms. Accordingly, the actual scope of the invention encompasses the disclosed embodiments and all equivalent ways of practicing or implementing the invention under the claims.
[0088] While certain aspects of the invention are presented below in certain claim forms, the inventors contemplate the various aspects of the invention in any number of claim forms.
[0089] Disclosed embodiments include the following Items:
[0090] Item 1. A machine implemented method comprising a semantic network for genome sequencing and analysis, the method comprising:
[0091] a) using symbols comprising (F, T, U, D) to represent the values false, true, undefined, and defined, mapped into a two vector dynamic array; the values further mapped into indexes within the two vector dynamic arrays and stored as nodes within a semantic network for representing inputted genetic sequences;
[0092] b) for F, T, U, D, defined into set theory, such as {} for undefined, {T} for true, {F} for false and {T, F} for defined, these values are interpreted as properties {P} for true, {-P} for false, {} for undefined and {P, -P} for defined, which are the properties used for testing the conditionals and quantifying variables for successive recursive steps in the predicate calculus;
[0093] c) defining a logic with a negation, ignoring monotonic argumentations, with the following binary connectives: for the logical AND (Λ), NOT (-); and logical OR (V) connectives as follows used to prove the completeness of the logics:
[0094] -F is T
[0095] -T is F
[0096] -U is D
[0097] -D is U;
[0098] d) for the Λ connective
[0099] Λ F T U D
[00100] F F F F F
[00101] T F T U D
[00102] U F U U F
[00103] D F D F D;
[00104] e) for the V connective
[00105] V F T U D
[00106] F F T U D
[00107] T T T T T
[00108] U U T U T
[00109] D D T T D;
[00110] f) optimizing short term memory maximizing long term storage by the linear encoding of syntactic and semantic information into the semantic network;
[00111] g) in a parallel context optimizing short term memory to maximize long term storage becomes optimizing communication and memory between different knowledge sources (processes) and;
[00112] h) using defined and undefined to help separate asset classes in the simulation.
[00113] Item 2. The method of item 1 further comprising the use of a phrase structure rewrite rule associated with a node within the semantic network for the testing and passing of the rewrite rule.
[00114] Item 3. The method of item 2 implementing a top/down, bottom/up parser capable of a plurality of syntactic parses of a grammar.
[00115] Item 4. The method of item 3 using a system clock, runtime stack and heap, a processor, machine readable instructions contained on non-transitory media and a database of rewrite rules, a database of the semantic network and a database of syntactic and semantic information.
[00116] Item 5. The system of item 4 implementing a top/down, bottom/up parser capable of a plurality of syntactic parses of a grammar to provide syntactic pattern matching abilities for modeling pattern matching for DNA sequences.
[00117] Item 6. The system of item 5 implemented for dynamic modeling of DNA in Monte Carlo simulations, for the use of whole genomic sequences.
[00118] Item 7. The system of item 5 using a specialized processor.
[00119] Item 8. A machine implemented method comprising a semantic network for macromolecular analysis, the method comprising:
[00120] using symbols comprising (F, T, U, D) to represent the values false, true, undefined, and defined, mapped into a two vector dynamic array; the values further mapped into indexes within the two vector dynamic arrays and stored as nodes within a semantic network for representing inputted macro molecular mechanics;
[00121] for F, T, U, D, defined into set theory,
[00122] such as {} for undefined, {T} for true, {F} for false and {T, F} for defined, these values are interpreted as properties {P} for true, {-P} for false, {} for undefined and {P, -P} for defined, which are the properties used for testing the conditionals and quantifying variables for successive recursive steps in the predicate calculus;
[00123] c) defining a logic with a negation, ignoring monotonic argumentations, with the following binary connectives: for the logical AND (Λ), NOT (-); and logical OR (V) connectives as follows used to prove the completeness of the logics:
[00124] -F is T
[00125] -T is F
[00126] -U is D
[00127] -D is U;
[00128] d) for the Λ connective
[00129] Λ F T U D
[00130] F F F F F
[00131] T F T U D
[00132] U F U U F
[00133] D F D F D;
[00134] e) for the V connective
[00135] V F T U D
[00136] F F T U D
[00137] T T T T T
[00138] U U T U T
[00139] D D T T D;
[00140] f) optimizing short term memory maximizing long term storage by the linear encoding of syntactic and semantic information into the semantic network;
[00141] g) in a parallel context optimizing short term memory to maximize long term storage becomes optimizing communication and memory between different knowledge sources (processes) and;
[00142] h) using defined and undefined to help separate genetic types in the simulation.
[00143] Item 9. A system for the hybrid modeling of genetic sequences and macromolecular structures for chemical discoveries in key lock systems and induced fit systems the system comprising:
[00144] machine readable instructions stored upon a nonvolatile computer readable medium, a central processing unit, a runtime stack and heap, semantic network, top down / bottom up parser, a system clock, database with historical economic information;
[00145] the system using a Boolean encoding comprising (F, T, U, D) to represent the values false, true, undefined, and defined, mapped into a two vector dynamic array; the values further mapped into indexes within the two vector dynamic arrays and associated with nodes in a semantic network;
[00146] for {F, T, U, D} defined into set theory, such as {} for undefined, {T} for true, {F} for false, and {T,F} for defined, these values are interpreted as properties {P} for T, {-P} for false, {} for undefined and {P, -P} for defined, which are the properties used for the testing of conditionals and quantifying of variables in the predicate calculus;
[00147] the system defining a logic with a negation with the following binary connectives: for the logical AND (Λ), NOT (-); and logical OR (V) connectives as follows used to prove the completeness of the logics:
[00148] -F is T
[00149] -T is F
[00150] -U is D
[00151] -D is U;
[00152] e) for the Λ connective
[00153] Λ F T U D
[00154] F F F F F
[00155] T F T U D
[00156] U F U U F
[00157] D F D F D;
[00158] f) for the V connective
[00159] V F T U D
[00160] F F T U D
[00161] T T T T T
[00162] U U T U T
[00163] D D T T D;
[00164] g) the system optimizing short term memory maximizing long term storage by the linear encoding of the information into the semantic network;
[00165] h) the system integrating memory in a parallel context to optimize communication and memory between different knowledge databases.
[00166] Item 10. The system of item 9 further comprising the use of a phrase structure rewrite rule associated with a node within the semantic network for the testing and passing of the rewrite rule, the word size of the system imposing a chunking factor in the testing of conditionals in theoretic time O(C).
[00167] Item 11. The system of item 9 further comprising a database of vector arrays, with each array associated with each semantic node, a database of the semantic network and a database of a grammar phrase structure implementations and a database of logical connectives.
[00168] Item 12. The system of item 9 implementing a top/down, bottom/up parser capable of a plurality of syntactic parses of a grammar to efficiently model the growth of the statistical summation in the search space.
[00169] Item 13. The system of item 9 used for the dynamic macromolecular modeling of DNA in Monte Carlo simulations, with the physical properties of the DNA.
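Item 10 ties the cost of testing conditionals to the word size. One possible reading, sketched below under stated assumptions, is that a whole machine word of property values can be combined with a handful of bitwise operations. The bit level formulas here are not given in the disclosure; they were derived for this sketch from the listed tables and the encoding True [1,1], False [0,1], Undefined [0,0], Defined [1,0]. The names `WORD_BITS`, `pack`, `unpack`, `word_not`, `word_and` and `word_or` are hypothetical.

```python
WORD_BITS = 64  # assumed word size; longer vectors would be split into chunks
WORD_MASK = (1 << WORD_BITS) - 1

ENCODING = {"T": (1, 1), "F": (0, 1), "U": (0, 0), "D": (1, 0)}
DECODING = {bits: symbol for symbol, bits in ENCODING.items()}


def pack(symbols: str):
    """Pack up to WORD_BITS symbols into a (first, second) pair of words."""
    if len(symbols) > WORD_BITS:
        raise ValueError("more symbols than fit in one word")
    first = second = 0
    for i, symbol in enumerate(symbols):
        a, b = ENCODING[symbol]
        first |= a << i
        second |= b << i
    return first, second


def unpack(word, count: int) -> str:
    """Read back the first `count` symbols of a packed word pair."""
    first, second = word
    return "".join(DECODING[((first >> i) & 1, (second >> i) & 1)] for i in range(count))


def _xnor(x: int, y: int) -> int:
    return ~(x ^ y) & WORD_MASK


def word_not(word):
    """Negation flips the first bit of every position and keeps the second."""
    first, second = word
    return first ^ WORD_MASK, second


def word_and(left, right):
    """AND of every position at once, via an auxiliary word q = xnor(first, second)."""
    p = left[0] & right[0]
    q = _xnor(*left) & _xnor(*right)
    return p, _xnor(p, q)


def word_or(left, right):
    """OR of every position at once, using the same auxiliary word."""
    p = left[0] | right[0]
    q = _xnor(*left) | _xnor(*right)
    return p, _xnor(p, q)


result = unpack(word_and(pack("FTUD"), pack("DDDD")), 4)  # "FDFD"
```

Each connective costs a fixed number of word operations regardless of how many of the 64 positions are in use, which is one way to read the constant time bound per chunk.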

Claims

What is claimed is:
[Claim 1] A machine implemented method comprising a semantic network for genome sequencing and analysis, the method comprising:
a) using symbols comprising (F, T, U, D) to represent the values false, true, undefined, and defined, mapped into a two vector dynamic array; the values further mapped into indexes within the two vector dynamic arrays and stored as nodes within a semantic network for representing inputted genetic sequences;
b) for F, T, U, D, defined into set theory, such as {} for undefined, {T} for true, {F} for false, {} for undefined and {T, F} for defined, these values are interpreted as properties {P} for T and, {-P} false, {} for undefined and {P, -P} for defined, which are the properties used for testing the conditionals and quantifying variables for successive recursive steps in the predicate calculus;
c) defining a logic with a negation, ignoring monotonic argumentations, with the following binary connectives: for the logical AND (Λ), NOT (-); and logical OR (V) connectives as follows used to prove the completeness of the logics:
-F is T
-T is F
-U is D
-D is U;
d) for the Λ connective
Λ F T U D
F F F F F
T F T U D
U F U U F
D F D F D;
e) for the V connective
V F T U D
F F T U D
T T T T T
U U T U T
D D T T D;
f) optimizing short term memory maximizing long term storage by the linear encoding of syntactic and semantic information into the semantic network;
g) in a parallel context optimizing short term memory to maximize long term storage becomes optimizing communication and memory between different knowledge sources (processes) and;
h) using defined and undefined to help separate asset classes in the simulation.
[Claim 2] The method of claim 1 further comprising using the use of a phrase structure rewrite rule associated with a node within the semantic network for the testing and passing of the rewrite rule.
[Claim 3] The method of claim 2 implementing a top/down, bottom/up parser capable of a plurality of syntactic parses of a grammar.
[Claim 4] The method of claim 3 using a system clock, runtime stack and heap, a processor, machine readable instructions contained on non-transitory media and a database of rewrite rules, a database of the semantic network and a database of syntactic and semantic information.
[Claim 5] The system of claim 4 implementing a top/down, bottom/up parser capable of a plurality of syntactic parses of a grammar to provide syntactic pattern matching abilities for modeling pattern matching for DNA sequences.
[Claim 6] The system of claim 5 implemented for dynamic modeling of DNA in Monte Carlo simulations, for the use of whole genomic sequences.
[Claim 7] The system of claim 5 using a specialized processor.
[Claim 8] A machine implemented method comprising a semantic network for macromolecular analysis, the method comprising:
using symbols comprising (F, T, U, D) to represent the values false, true, undefined, and defined, mapped into a two vector dynamic array; the values further mapped into indexes within the two vector dynamic arrays and stored as nodes within a semantic network for representing inputted macro molecular mechanics;
for F, T, U, D, defined into set theory, such as {} for undefined, {T} for true, {F} for false, {} for undefined and {T, F} for defined, these values are interpreted as properties {P} for T and, {-P} false, {} for undefined and {P, -P} for defined, which are the properties used for testing the conditionals and quantifying variables for successive recursive steps in the predicate calculus;
c) defining a logic with a negation, ignoring monotonic argumentations, with the following binary connectives: for the logical AND (Λ), NOT (-); and logical OR (V) connectives as follows used to prove the completeness of the logics:
-F is T
-T is F
-U is D
-D is U;
d) for the Λ connective
Λ F T U D
F F F F F
T F T U D
U F U U F
D F D F D;
e) for the V connective
V F T U D
F F T U D
T T T T T
U U T U T
D D T T D;
f) optimizing short term memory maximizing long term storage by the linear encoding of syntactic and semantic information into the semantic network;
g) in a parallel context, optimizing short term memory to maximize long term storage becomes optimizing communication and memory between different knowledge sources (processes); and
h) using defined and undefined to help separate genetic types and molecular structures in the simulation.
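The connective tables recited in steps c) through e) can be written down directly as lookup tables. The following is a minimal Python sketch, not the applicant's implementation: the names `v_not`, `v_and`, `v_or`, `set_not`, `set_and` and `set_or` are hypothetical. It also checks an assumed reading of the set interpretation in step b), namely that negation is the complement within {T, F}, that a conjunction contains T only when both operands do and contains F when either operand does, and that disjunction is the mirror image. That reading reproduces the tables as recited, but the claim itself does not state it.

```python
# Four-valued logic over F, T, U, D, transcribed from the claim's tables.
VALUES = "FTUD"

# Negation as recited: -F is T, -T is F, -U is D, -D is U.
NOT = {"F": "T", "T": "F", "U": "D", "D": "U"}

# Each row is the left operand; columns follow VALUES order (F, T, U, D).
AND_ROWS = {"F": "FFFF", "T": "FTUD", "U": "FUUF", "D": "FDFD"}
OR_ROWS = {"F": "FTUD", "T": "TTTT", "U": "UTUT", "D": "DTTD"}


def v_not(a: str) -> str:
    return NOT[a]


def v_and(a: str, b: str) -> str:
    return AND_ROWS[a][VALUES.index(b)]


def v_or(a: str, b: str) -> str:
    return OR_ROWS[a][VALUES.index(b)]


# Set reading from step b): {} undefined, {T} true, {F} false, {T, F} defined.
AS_SET = {
    "U": frozenset(),
    "T": frozenset("T"),
    "F": frozenset("F"),
    "D": frozenset("TF"),
}
FROM_SET = {s: v for v, s in AS_SET.items()}


def set_not(a: str) -> str:
    # Assumption: negation is the complement within {T, F}.
    return FROM_SET[frozenset("TF") - AS_SET[a]]


def set_and(a: str, b: str) -> str:
    # Assumption: T survives only if both sides hold T; F if either holds F.
    sa, sb = AS_SET[a], AS_SET[b]
    out = set()
    if "T" in sa and "T" in sb:
        out.add("T")
    if "F" in sa or "F" in sb:
        out.add("F")
    return FROM_SET[frozenset(out)]


def set_or(a: str, b: str) -> str:
    # Assumption: mirror image of set_and.
    sa, sb = AS_SET[a], AS_SET[b]
    out = set()
    if "T" in sa or "T" in sb:
        out.add("T")
    if "F" in sa and "F" in sb:
        out.add("F")
    return FROM_SET[frozenset(out)]


if __name__ == "__main__":
    for a in VALUES:
        print(a, [v_and(a, b) for b in VALUES], [v_or(a, b) for b in VALUES])
```

Under this reading the set-based functions agree with the recited tables on all sixteen operand pairs, and the De Morgan identity -(a Λ b) = (-a) V (-b) holds for every pair.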
[Claim 9] A system for the hybrid modeling of genetic sequences and macromolecular structures for chemical discoveries in key lock systems and induced fit systems the system comprising:
a) machine readable instructions stored upon a nonvolatile computer readable medium, a central processing unit, a runtime stack and heap, semantic network, top down / bottom up parser, a system clock, database with historical economic information;
b) the system using a Boolean encoding comprising (F, T, U, D) to represent the values false, true, undefined, and defined, mapped into a two vector dynamic array; the values further mapped into indexes within the two vector dynamic arrays and associated with nodes in a semantic network;
c) for {F, T, U, D} defined into set theory, such as {} for undefined, {T} for true, {F} for false, and {T,F} for defined, these values are interpreted as properties {P} for T, {-P} for false, {} for undefined and {P, -P} for defined, which are the properties used for the testing of conditionals and quantifying of variables in the predicate calculus;
d) the system defining a logic with a negation with the following binary connectives: for the logical AND (Λ), NOT (-); and logical OR (V) connectives as follows used to prove the completeness of the logics:
-F is T
-T is F
-U is D
-D is U;
e) for the Λ connective
Λ | F T U D
F | F F F F
T | F T U D
U | F U U F
D | F D F D;
f) for the V connective
V | F T U D
F | F T U D
T | T T T T
U | U T U T
D | D T T D;
g) the system optimizing short term memory maximizing long term storage by the linear encoding of the information into the semantic network;
h) the system integrating memory in a parallel context to optimize communication and memory between different knowledge databases.
[Claim 10] The system of claim 9 further comprising the use of a phrase structure rewrite rule associated with a node within the semantic network for the testing and passing of the rewrite rule, the word size of the system imposing a chunking factor in the testing of conditionals in theoretic time O(C).
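One way to picture the "two vector" mapping and the word-size chunking factor is to hold a sequence of four-valued symbols in two bit vectors, so that a single bitwise operation evaluates a connective for many positions at once. The sketch below is only an illustration under stated assumptions: the bit assignment (one bit for "contains T", one for "contains F"), the `Packed` class and the function names are hypothetical and do not appear in the claims. Python integers are arbitrary precision, so they stand in here for fixed-width machine words.

```python
from dataclasses import dataclass

# Assumed encoding: (t bit, f bit) per symbol. Not specified by the claims.
CODE = {"U": (0, 0), "T": (1, 0), "F": (0, 1), "D": (1, 1)}
SYMBOL = {bits: s for s, bits in CODE.items()}


@dataclass(frozen=True)
class Packed:
    t: int  # bit i is set when value i contains T
    f: int  # bit i is set when value i contains F
    n: int  # number of packed values


def pack(symbols: str) -> Packed:
    t = f = 0
    for i, s in enumerate(symbols):
        tb, fb = CODE[s]
        t |= tb << i
        f |= fb << i
    return Packed(t, f, len(symbols))


def unpack(p: Packed) -> str:
    return "".join(
        SYMBOL[((p.t >> i) & 1, (p.f >> i) & 1)] for i in range(p.n)
    )


def p_and(a: Packed, b: Packed) -> Packed:
    # One AND and one OR evaluate the Λ connective for every position.
    return Packed(a.t & b.t, a.f | b.f, a.n)


def p_or(a: Packed, b: Packed) -> Packed:
    return Packed(a.t | b.t, a.f & b.f, a.n)


def p_not(a: Packed) -> Packed:
    # Flip both vectors, masked to the n positions in use.
    mask = (1 << a.n) - 1
    return Packed(~a.t & mask, ~a.f & mask, a.n)


if __name__ == "__main__":
    left = pack("FFFFTTTTUUUUDDDD")   # each left operand repeated four times
    right = pack("FTUD" * 4)          # every right operand for each of them
    print(unpack(p_and(left, right)))
    print(unpack(p_or(left, right)))
    print(unpack(p_not(pack("FTUD"))))
```

With this layout the number of bitwise operations per connective is fixed, and in a fixed-width language each operation would cover one machine word of positions, which is one plausible interpretation of the chunking factor mentioned in claim 10.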
[Claim 11] The system of claim 9 further comprising a database of vector arrays, with each array associated with each semantic node, a database of the semantic network and a database of grammar phrase structure implementations and a database of logical connectives.
[Claim 12] The system of claim 9 implementing a top/down, bottom/up parser capable of a plurality of syntactic parses of a grammar to efficiently model the growth of the statistical summation in the search space.
[Claim 13] The system of claim 9 used for the dynamic macromolecular modeling of DNA in Monte Carlo simulations, with the physical properties of the DNA.
PCT/US2014/067868 2013-12-03 2014-12-01 Computational tools for genomic sequencing and macromolecular analysis WO2015084704A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201480066317.3A CN105793858B (en) 2013-12-03 2014-12-01 Calculating instrument for genome sequence and macromolecular analysis
EP14867303.1A EP3077941A4 (en) 2013-12-03 2014-12-01 Computational tools for genomic sequencing and macromolecular analysis

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US14/095,416 2013-12-03
US14/095,416 US9672466B2 (en) 2013-09-03 2013-12-03 Methods and systems of four-valued genomic sequencing and macromolecular analysis

Publications (1)

Publication Number Publication Date
WO2015084704A1 (en) 2015-06-11

Family

ID=53274000

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2014/067868 WO2015084704A1 (en) 2013-12-03 2014-12-01 Computational tools for genomic sequencing and macromolecular analysis

Country Status (3)

Country Link
EP (1) EP3077941A4 (en)
CN (1) CN105793858B (en)
WO (1) WO2015084704A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090018996A1 (en) * 2007-01-26 2009-01-15 Herbert Dennis Hunt Cross-category view of a dataset using an analytic platform
US20100275189A1 (en) * 2009-02-27 2010-10-28 Cooke Daniel E Method, Apparatus and Computer Program Product for Automatically Generating a Computer Program Using Consume, Simplify & Produce Semantics with Normalize, Transpose & Distribute Operations
US20120101736A1 (en) * 2010-10-25 2012-04-26 Dudley Joel T Method and System for Computing and Integrating Genetic and Environmental Health Risks for a Personal Genome
US20130184161A1 (en) * 2009-10-22 2013-07-18 Stephen F. Kingsmore Methods and Systems for Medical Sequencing Analysis

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020001810A1 (en) * 2000-06-05 2002-01-03 Farrell Michael Patrick Q-beta replicase based assays; the use of chimeric DNA-RNA molecules as probes from which efficient Q-beta replicase templates can be generated in a reverse transcriptase dependent manner
US20030157489A1 (en) * 2002-01-11 2003-08-21 Michael Wall Recursive categorical sequence assembly
EP1623996A1 (en) * 2004-08-06 2006-02-08 Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts Improved method of selecting a desired protein from a library
JP4953046B2 (en) * 2005-07-29 2012-06-13 独立行政法人産業技術総合研究所 A novel functional peptide creation system that combines a random peptide library or a peptide library that mimics an antibody hypervariable region and an in vitro peptide selection method using RNA-binding proteins
KR20100083133A (en) * 2007-10-04 2010-07-21 도소 가부시키가이샤 Primer for amplification of rrna of bacterium belonging to the genus legionella, detection method, and detection kit
CN101492736A (en) * 2008-12-08 2009-07-29 南京市第二医院 Opened protein diversity display method
CN101831489B (en) * 2009-03-11 2013-10-02 孙星江 Method for detecting desoxyribonucleic acid anti-counterfeiting maker by utilizing loop-mediated isothermal amplification technology

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
See also references of EP3077941A4 *
TSOUKIÀS: "A FIRST ORDER, FOUR-VALUED, WEAKLY PARACONSISTENT LOGIC AND ITS RELATION WITH ROUGH SETS SEMANTICS", LAMSADE - CNRS, UNIVERSITE PARIS DAUPHINE, 25 November 2006 (2006-11-25), pages 1 - 20, XP055347745, Retrieved from the Internet <URL:HTTPS://PDFS.SEMANTICSCHOLAR.ORG/A981/76E2E8DE0BCAC0920E1A98EAEA52698CDB8A.PDF> [retrieved on 20150204] *

Also Published As

Publication number Publication date
CN105793858B (en) 2018-09-21
CN105793858A (en) 2016-07-20
EP3077941A4 (en) 2017-08-30
EP3077941A1 (en) 2016-10-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14867303

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

REEP Request for entry into the european phase

Ref document number: 2014867303

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2014867303

Country of ref document: EP