CN105793858B - Calculating instrument for genome sequence and macromolecular analysis - Google Patents


Publication number
CN105793858B
Authority
CN
China
Prior art keywords
value
row
logic
mapped
undefined
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201480066317.3A
Other languages
Chinese (zh)
Other versions
CN105793858A (en
Inventor
Roger Midmore (罗杰·密德茂尔)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/095,416 external-priority patent/US9672466B2/en
Application filed by Individual filed Critical Individual
Publication of CN105793858A publication Critical patent/CN105793858A/en
Application granted granted Critical
Publication of CN105793858B publication Critical patent/CN105793858B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B50/00 ICT programming tools or database systems specially adapted for bioinformatics
    • G16B50/30 Data warehousing; Computing architectures

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biophysics (AREA)
  • Databases & Information Systems (AREA)
  • Bioethics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

A four-valued parallel logic enables a computer system that can combine computationally intensive techniques for DNA analysis and for macromolecular crystals. The disclosed systems and methods assist drug design, pharmaceutical research, work on genetically modified organisms, and the testing of gene sequences for gene therapy.

Description

Calculating instrument for genome sequence and macromolecular analysis
This application is a continuation-in-part of U.S. Patent Application 14/095,416, filed December 3, 2013, the contents of which are incorporated herein by reference.
Copyright and trademark notice
This application includes material that is or may be subject to copyright and/or trademark protection. The copyright and trademark owner has no objection to the facsimile reproduction of this patent disclosure as it appears in the files or records of the Patent and Trademark Office, but otherwise reserves all copyright and trademark rights.
Technical field
The present invention relates generally to computing tools for genome sequence and macromolecular analysis.
Background art
Various computing tools and mechanisms for genome sequencing and analysis have been disclosed in the prior art. However, the prior art lacks the efficiency of the presently disclosed embodiments.
Summary of the invention
The present invention overcomes shortfalls in the prior art by presenting an unobvious and unique combination, configuration, and use of methods, systems, and means that reduce the time and computational cost traditionally associated with testing, manipulating, and analyzing data in computer architectures.
Disclosed embodiments overcome shortfalls in the prior art by presenting a notation of dual bit-vector symbols, associated with semantic nodes, that allows both syntactic and semantic information to be encoded in a semantic network. Disclosed embodiments also overcome shortfalls in the prior art by encoding the properties that each feature assumes in a recursive predicate analysis.
Description of the drawings
Fig. 1 depicts disclosed logic;
Fig. 2 depicts machine implementation;
Fig. 3 depicts the graphic representation of semantic network;
Fig. 4 depicts the distribution of the attribute for the particular index in array;
Fig. 5 depicts the general layout of the disclosed data structures;
Fig. 6 depicts a complex analogy calculation;
Fig. 7 is the continuation of Fig. 6;
Fig. 8 is the continuation of Fig. 6;
Fig. 9 describes several attributes that can be calculated in disclosed system.
These and other aspects of the present invention will become apparent upon reading the following detailed description in conjunction with the associated drawings.
Detailed description
The following detailed description is directed to certain specific embodiments of the invention. However, the invention can be embodied in a multitude of different ways as defined and covered by the claims and their equivalents. In this description, reference is made to the drawings, wherein like parts are designated with like numerals throughout.
Unless otherwise noted in this specification or in the claims, all terms used in the specification and the claims have the meanings normally ascribed to them by those skilled in the art.
Unless the context clearly requires otherwise, throughout the specification and the claims, the words "comprise", "comprising", and the like are to be construed in an inclusive sense rather than an exclusive or exhaustive sense; that is, in the sense of "including, but not limited to". Words using the singular or plural number also include the plural or singular number, respectively. Additionally, the words "herein", "above", "below", and words of similar import, when used in this application, refer to this application as a whole and not to any particular portion of it.
The above detailed description of embodiments of the invention is not intended to be exhaustive or to limit the invention to the precise form disclosed. While specific embodiments of, and examples for, the invention are described above for illustrative purposes, various equivalent modifications are possible within the scope of the invention, as those skilled in the relevant art will recognize. For example, while steps are presented in a given order, alternative embodiments may perform routines having the steps in a different order. The teachings of the invention provided herein can be applied to other systems, not only the systems described herein. The various embodiments described herein can be combined to provide further embodiments. These and other changes can be made to the invention in light of the detailed description.
All of the above references and U.S. patents and applications are incorporated herein by reference. Aspects of the invention can be modified, if necessary, to employ the systems, functions, and concepts of the various patents and applications described above to provide yet further embodiments of the invention.
Reference numerals
100 non-transitory machine-readable medium, sometimes containing machine-readable instructions
200 general-purpose or special-purpose processor
300 memory, sometimes non-volatile memory
410 database of one or more semantic networks
420 database of vector arrays
430 database of logical connectives
440 database of grammatical phrase structure implementations
450 database of system reports
500 semantic network
510 object
520 relation
600 run-time stack and heap
700 system clock
800 top-down/bottom-up parser
900 hash table of presumed prime formulas
910 hash table of functors and terms
920 attributes of immediate safety
921 standard chart parser
922 solution state at time t
923 solution state at time t+1
930 hash table of the lexicon (categories)
940 symbol table
1000 depicts an example that uses four-valued logic to reproduce the prior-art analogy from Sheldon Klein's paper "Culture, Mysticism and Social Structure and the Calculation of Behavior" (Computer Sciences Technical Report #462, December 1981)
1010 depicts a continuation of the example of 1000
1020 depicts a continuation of the example of 1000
1030 depicts a drawn analogy accompanying 1010
1040 depicts a drawn analogy accompanying 1010
1100 depicts the analogy calculation continued from 1000
1110 depicts the analogy calculation continued from 1000
1120 depicts the analogy calculation continued from 1000
1130 depicts the drawing calculation continued from 1000
1140 depicts the drawing calculation continued from 1000
1200 depicts the analogy calculation continued from 1000
1210 depicts the analogy calculation continued from 1000
1220 depicts the drawing calculation continued from 1000
1230 depicts the drawing calculation continued from 1000
1240 depicts a question mark representing the complex analogy calculation
1300 depicts bond angle bending
1310 depicts bond stretch
1320 depicts torsional strain
1340 depicts DNA
Referring to Fig. 1, a schematic of the basic binary operators and logical NOT for the four-valued logic is depicted (monotonic argument is ignored for logical NOT). These operators are used to prove the completeness of the logic class. The logic can be derived by several different arguments: from Boolean groupings of truth values, or from set theory and recursive definitions, with the truth tables arranged in advance as a lattice. All of these constructions preserve some of the main axioms of classical logic. Because recursive values are modeled, explicitly assumed truth values simplify the testing of conditions and the quantification of variables in the semantic network. For values that are undefined in the system, growing default values allow benign dynamic encoding of the network, and the logical properties can be ascribed to many of the Kleene logics. The fourth property allows variables to be quantified and constrained properly, which eliminates effects on truth values updated in subsequent steps of a calculation. It also makes it possible to introduce Markov process modeling into the decision procedure of the logic in order to estimate an acceptable "law of the excluded middle" (tertium non datur).
By encoding properties at specific bits of bit vectors, linear scaling can be maintained. The system differs from the prior art in how its compiler implementation creates symbol tables, tests features, and supports an auxiliary extended heap.
In the first panel of Fig. 1, the logical NOT symbol is shown as ¬. In the second panel of Fig. 1, the AND operator is shown as ∧, and in the third panel of Fig. 1, the OR operator is shown as ∨. The first panel shows each value before the negation symbol is applied, followed by the result. For example, in the first row of the first panel, the value F is shown before negation is applied and T is shown as the result.
In the second panel, the AND operator takes one value from the first column and one value from the first row, and the result of the operator is shown where that row and column intersect. In the third panel, the OR operator is applied in the same way. For example, in the third panel, selecting the last element D of the first row and the second element F of the first column gives the result D.
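The table lookup described for Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names `q_not`, `q_and`, and `q_or` are hypothetical, the AND and OR rows are copied from the connective tables listed later in the description, and the negation entries for U and D (U to D, D to U) are an assumption read from the itemized list, since the description text only spells out that negating F gives T.

```python
# Four truth values in the order used throughout the text:
# False, True, Undefined, Defined.
VALUES = "FTUD"

# Negation table. F->T is stated in the text; the other entries are
# read from the itemized list and are an assumption for U and D.
NOT = {"F": "T", "T": "F", "U": "D", "D": "U"}

# Each row is keyed by one operand; the characters of the row string
# are the results for the other operand in the order F, T, U, D.
AND_ROWS = {"F": "FFFF", "T": "FTUD", "U": "FUUF", "D": "FDFD"}
OR_ROWS = {"F": "FTUD", "T": "TTTT", "U": "UTUT", "D": "DTTD"}


def q_not(a):
    """Return the negation of a single four-valued symbol."""
    return NOT[a]


def q_and(a, b):
    """Look up a AND b at the intersection of row a and column b."""
    return AND_ROWS[a][VALUES.index(b)]


def q_or(a, b):
    """Look up a OR b at the intersection of row a and column b."""
    return OR_ROWS[a][VALUES.index(b)]


if __name__ == "__main__":
    # The worked example from the text: F OR D.
    print(q_or("F", "D"))
```

Both tables are symmetric, so it does not matter which operand is taken from the first row and which from the first column.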
Referring to Fig. 2, a machine implementation is shown with a machine-readable, non-transitory medium 100 holding machine-readable instructions that are sent to a general-purpose or special-purpose processor 200. Processor 200 can communicate with memory 300, a plurality of databases, and other components (such as a network, user interfaces, and other implementations). The plurality of databases may include a database 410 of one or more semantic networks (such as the network system of Fig. 3), a database 420 of vector arrays (a vector array can be associated with each semantic node or other network component), a database 430 of logical connectives (such as the connectives of Fig. 1), a database 440 of grammatical phrase structure implementations (such as database 440 of Fig. 2), and databases of other disclosed components. Fig. 5 further depicts system clock 700, top-down/bottom-up parser 800, and run-time stack and heap 600.
Referring to Fig. 3, a graphical representation of semantic network 500 is shown with objects 510 and relations 520, where all objects and relations are nodes in memory or in a database.
Fig. 4 depicts a graphical representation of the dual bit-vector arrays associated with a semantic node in memory. Fig. 4 also shows the truth values distributed across the two arrays, where X is a particular index into the arrays. The word size in the figure follows from the word-size limit of the computer architecture. This produces a chunking factor in the implementation of the arrays.
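A small sketch may help picture the two arrays of Fig. 4. It assumes a 64-bit word, a hypothetical class name `DualBitVector`, and that the first bit of each pair lives in one array and the second bit in the other; none of these details is fixed by the text. The bit pairs themselves are the ones the description gives for Fig. 6 (True [1,1], False [0,1], Undefined [0,0], Defined [1,0]).

```python
WORD_BITS = 64  # assumed word size; the text only ties it to the architecture

# (first bit, second bit) pairs as given in the description of Fig. 6.
ENCODE = {"T": (1, 1), "F": (0, 1), "U": (0, 0), "D": (1, 0)}
DECODE = {bits: name for name, bits in ENCODE.items()}


class DualBitVector:
    """Two parallel bit arrays; index x selects one bit in each array."""

    def __init__(self, size):
        # The chunking factor: one word holds WORD_BITS truth values.
        n_words = (size + WORD_BITS - 1) // WORD_BITS
        self.size = size
        self.first = [0] * n_words   # all zeros, so every slot starts
        self.second = [0] * n_words  # as [0,0], i.e. Undefined

    def set(self, x, value):
        """Store one of 'F', 'T', 'U', 'D' at index x."""
        if not 0 <= x < self.size:
            raise IndexError(x)
        word, bit = divmod(x, WORD_BITS)
        b1, b2 = ENCODE[value]
        mask = 1 << bit
        self.first[word] = (self.first[word] & ~mask) | (b1 << bit)
        self.second[word] = (self.second[word] & ~mask) | (b2 << bit)

    def get(self, x):
        """Read back the symbol stored at index x."""
        if not 0 <= x < self.size:
            raise IndexError(x)
        word, bit = divmod(x, WORD_BITS)
        b1 = (self.first[word] >> bit) & 1
        b2 = (self.second[word] >> bit) & 1
        return DECODE[(b1, b2)]


if __name__ == "__main__":
    node_values = DualBitVector(130)  # needs three 64-bit words per array
    node_values.set(129, "D")
    print(node_values.get(129), node_values.get(0))
```

Because an all-zero word decodes to Undefined, a freshly allocated array needs no initialization pass, which fits the text's remark about default values for undefined entries.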
Fig. 5 is a schematic of the general layout of the data structures. All data structures are presumed to reside in the same space of random access memory (RAM), in order to emphasize the disclosed system diagnostics. 930 is the hash table of the lexicon, used to enforce category checking. 910 is the hash table for functors and terms. 900 is the hash table for formulas. 920 is the memory shared by the chart parser and the computed solution state; the computed solution state controls what is saved with immediate safety in the simulation. 940 depicts the symbol table, in which the properties assigned to each symbol are mapped to bit vectors.
Fig. 5 highlights the important logical partitions in the presumed analysis. These partitions matter both for general analysis of the system and for analysis of how a particular algorithm uses computer resources in a particular environment. Functors and terms come from Kleene's "The Foundations of Intuitionistic Mathematics" and are equivalent to the objects and relations used in Klein's paper. By restricting Klein's semantic triples to their binary subset, formulas composed of Kleene's primitive recursive functions can be modeled in the system. The concept of immediate safety is handled through the chart parser and its control of the solution state of the simulation. When the parser switches the blackboard from time T to time T+1, the switch can be regarded as a timed memory access (that is, a large run of consecutive writes from memory to the process (blackboard) concerning the solution state). Because these writes can be applied to a distributed system (that is, a network) and can be regarded as the timing of transmissions in an information-theoretic analysis of the system, clocking the parser with the system clock allows the use of all system resources to be determined. Arrows and boxes indicate the links (pointers) between particular entries, relating the formulas (binary and triple), the terms (objects and relations), and the lexicon (categories) in the hash tables.
Kleene's formulation is very strict. The logical criteria can be relaxed to include general recursive formulas (which permits Klein's most complete triple semantic notation) and the concept of categories that Klein's theory handles through the lexicon. Equating Kleene's lambda-definability and the concept of specific realizability with the concept of Markov algorithms in the theory of algorithms gives the system a more widely known expression. All that is needed in the figure is to replace the lexicon with Markov's syllables, replace objects and relations with Markov's words, and replace formulas with Markov's concept of the normal algorithm. The resulting general rewriting system can then be assumed to have a general pattern-matching capability, for strings in DNA sequences or, more generally, for the substitution of character strings.
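To make the reference to Markov algorithms concrete, the sketch below shows a tiny string rewriting loop of that kind applied to a DNA string. It is only an illustration of the general technique: the function name `markov_rewrite`, the rule format, the step limit, and the TATA motif rule are all assumptions chosen for the example and do not come from the patent.

```python
def markov_rewrite(rules, text, max_steps=1000):
    """Run an ordered list of rewrite rules in the style of a Markov algorithm.

    Each rule is (pattern, replacement, is_terminal). At every step the
    first rule whose pattern occurs in the text is applied to the leftmost
    occurrence. The run halts when no rule applies or after a terminal rule.
    """
    for _ in range(max_steps):
        for pattern, replacement, is_terminal in rules:
            pos = text.find(pattern)
            if pos != -1:
                text = text[:pos] + replacement + text[pos + len(pattern):]
                if is_terminal:
                    return text
                break  # restart from the first rule
        else:
            return text  # no rule matched: the algorithm halts
    raise RuntimeError("step limit reached without halting")


# Hypothetical rule set: mark every TATA motif. The replacement is in
# lower case so that it cannot be matched again by the same rule.
RULES = [("TATA", "[tata]", False)]

if __name__ == "__main__":
    print(markov_rewrite(RULES, "GGTATAAACGTATACC"))
```

The rule order and the leftmost-match policy are what distinguish this scheme from an unordered search and replace; a richer rule set would simply add more triples to `RULES`.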
Fig. 6 is a diagram that reproduces the three-valued analogy example given by Professor Klein. It maps the truth value to [1,1] and uses the strong biconditional operator for the analogy relation. Exclusive OR is preferable, because it keeps the system from being machine dependent. The use of strong equivalence is shown for cases where the main programming language lacks a strong biconditional operator, because logicians prefer that operator, because the four-valued strong biconditional operator is interchangeable with its two-valued counterpart, and because the traditional concept of equivalence is the one used when reviewing the logic literature.
The metamathematical values in Fig. 6 map "True" to [1,1], "False" to [0,1], "Undefined" to [0,0], and "Defined" to [1,0].
Fig. 7 is the continuation of Fig. 6.
Fig. 8 is the continuation of Fig. 6.
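With the bit pairs given for Fig. 6, the connectives can also be computed on whole machine words at once instead of one table lookup per value. The formulas below are derived here from that encoding and from the AND and OR tables in the description; they are not stated in the source and should be read as one possible sketch. The names `w_not`, `w_and`, `w_or`, and the 64-bit `MASK` are assumptions. In each function `a` is a word of first bits and `b` is a word of second bits.

```python
MASK = (1 << 64) - 1  # assumed 64-bit word


def w_not(a, b):
    """Negation flips the first bit: [1,1]<->[0,1] and [0,0]<->[1,0]."""
    return (~a) & MASK, b


def w_and(a1, b1, a2, b2):
    """Word-parallel AND on encoded values.

    In this encoding the first bit says 'contains true' and
    first XOR second says 'contains false'.
    """
    has_true = a1 & a2
    has_false = (a1 ^ b1) | (a2 ^ b2)
    return has_true, has_true ^ has_false


def w_or(a1, b1, a2, b2):
    """Word-parallel OR on encoded values, dual to w_and."""
    has_true = a1 | a2
    has_false = (a1 ^ b1) & (a2 ^ b2)
    return has_true, has_true ^ has_false


if __name__ == "__main__":
    # One slot per word here: T is (1, 1) and D is (1, 0); T AND D gives D.
    print(w_and(1, 1, 1, 0))
```

Each call handles 64 independent truth values with a handful of bitwise instructions, which is one way to read the text's remark that the word size imposes a chunking factor on the arrays.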
Fig. 9 is a diagram of some macromolecular properties that can be modeled in the system. 1310 is a diagram of bond stretch. 1300 is a diagram of bond angle bending. 1320 is a diagram of torsional strain. 1340 depicts a DNA molecule.
These are some of the general molecular mechanics properties used in modeling. They can be exchanged for quantum mechanical treatments, but computation time increases substantially with the move away from classical mechanics.
These and other changes can be made to the invention in light of the above detailed description. In general, the terms used in the claims should not be construed to limit the invention to the specific embodiments disclosed in the specification, unless the above detailed description explicitly defines such terms. Accordingly, the actual scope of the invention encompasses the disclosed embodiments and all equivalent ways of practicing or implementing the invention under the claims.
While certain aspects of the invention are presented below in particular claim forms, the invention contemplates its various aspects in any number of claim forms.
The disclosed embodiments include the following items:
Item 1: A machine-implemented method for a semantic network for genome sequencing and analysis, the method comprising:
a) using symbols including (F, T, U, D) to represent the values "false", "true", "undefined", and "defined" mapped onto two vector dynamic arrays; these values are also mapped to an index in the two vector dynamic arrays and stored as nodes in the semantic network, to represent an input genome sequence;
b) defining F, T, U, D in set-theoretic terms, where { } is "undefined", {T} is "true", {F} is "false", and {T, F} is "defined"; these values are interpreted as properties, where {P} is "true", {¬P} is "false", { } is "undefined", and {P, ¬P} is "defined"; these properties are used, integrated in predicates, to test conditions over successive recursive steps and to quantify variables;
c) ignoring monotonic argument, defining the logic in negation form using the following binary connectives, where logical AND (∧), NOT (¬), and logical OR (∨) are used to prove the completeness of the logic, as follows:
¬F is T
¬T is F
¬U is D
¬D is U;
d) for the connective ∧
∧ F T U D
F F F F F
T F T U D
U F U U F
D F D F D;
e) for the connective ∨
∨ F T U D
F F T U D
T T T T T
U U T U T
D D T T D;
f) optimizing short-term memory and maximizing long-term memory by linearly encoding syntactic and semantic information in the semantic network;
g) in a parallel environment, optimizing short-term memory to maximize long-term memory becomes optimizing the communication and storage between different knowledge sources (processes);
h) using "defined" and "undefined" in simulations to help separate asset classes.
Item 2: The method of Item 1, further comprising using phrase structure rewriting rules associated with nodes in the semantic network, for testing and passing the rewriting rules.
Item 3: The method of Item 2, implementing a top-down/bottom-up parser capable of parsing a grammar in multiple ways.
Item 4: The method of Item 3, using a system clock, a run-time stack and heap, a processor, machine-readable instructions contained in a non-transitory medium, a database of rewriting rules, a database of semantic networks, and a database of syntactic and semantic information.
Item 5: The method of Item 4, implementing a top-down/bottom-up parser capable of parsing a grammar in multiple ways, to provide a grammatical pattern-matching capability for modeling patterns that match DNA sequences.
Item 6: The method of Item 5, implementing dynamic modeling of DNA in a Monte Carlo simulation, for whole genome sequences.
Item 7: The method of Item 5, using a special-purpose processor.
Item 8: A machine-implemented method for a semantic network for macromolecular analysis, the method comprising:
a) using symbols including (F, T, U, D) to represent the values "false", "true", "undefined", and "defined" mapped onto two vector dynamic arrays; these values are also mapped to an index in the two vector dynamic arrays and stored as nodes in the semantic network, to represent an input macromolecular structure;
b) defining F, T, U, D in set-theoretic terms, where { } is "undefined", {T} is "true", {F} is "false", and {T, F} is "defined"; these values are interpreted as properties, where {P} is "true", {¬P} is "false", { } is "undefined", and {P, ¬P} is "defined"; these properties are used, integrated in predicates, to test conditions over successive recursive steps and to quantify variables;
c) ignoring monotonic argument, defining the logic in negation form using the following binary connectives, where logical AND (∧), NOT (¬), and logical OR (∨) are used to prove the completeness of the logic, as follows:
¬F is T
¬T is F
¬U is D
¬D is U;
d) for the connective ∧
∧ F T U D
F F F F F
T F T U D
U F U U F
D F D F D;
e) for the connective ∨
∨ F T U D
F F T U D
T T T T T
U U T U T
D D T T D;
f) optimizing short-term memory and maximizing long-term memory by linearly encoding syntactic and semantic information in the semantic network;
g) in a parallel environment, optimizing short-term memory to maximize long-term memory becomes optimizing the communication and storage between different knowledge sources (processes);
h) using "defined" and "undefined" in simulations to help separate gene types.
Item 9: A system for hybrid modeling of gene sequences and macromolecular structures, for chemical exploration in lock-and-key systems and induced-fit systems, the system comprising:
a) machine-readable instructions stored in a non-volatile computer-readable medium, a central processing unit, a run-time stack and heap, a semantic network, a top-down/bottom-up parser, a system clock, and a database with historical economic information;
b) the system uses a Boolean-type encoding including (F, T, U, D) to represent the values "false", "true", "undefined", and "defined" mapped onto two vector dynamic arrays; these values are also mapped to an index in the two vector dynamic arrays and associated with nodes in the semantic network;
c) {F, T, U, D} is defined in set-theoretic terms, where { } is "undefined", {T} is "true", {F} is "false", and {T, F} is "defined"; these values are interpreted as properties, where {P} is "true", {¬P} is "false", { } is "undefined", and {P, ¬P} is "defined"; these properties are used, integrated in predicates, for condition testing and variable quantification;
d) the system defines the logic in negation form using the following binary connectives (logical AND (∧), NOT (¬), and logical OR (∨)), which are used to prove the completeness of the logic:
¬F is T
¬T is F
¬U is D
¬D is U;
e) for the connective ∧
∧ F T U D
F F F F F
T F T U D
U F U U F
D F D F D;
f) for the connective ∨
∨ F T U D
F F T U D
T T T T T
U U T U T
D D T T D;
g) the system optimizes short-term memory and maximizes long-term memory by linearly encoding information into the semantic network;
h) in a parallel environment, the system consolidates memory to optimize the communication and storage between different knowledge databases.
Item 10: The system of Item 9, further comprising using phrase structure rewriting rules associated with nodes in the semantic network, for testing and passing the rewriting rules, wherein the word size of the system imposes a chunking factor on condition testing in theoretical time O(C).
Item 11: The system of Item 9, further comprising a database of vector arrays (each array associated with a semantic node), a database of semantic networks, a database of grammatical phrase structure implementations, and a database of logical connectives.
Item 12: The system of Item 9, implementing a top-down/bottom-up parser capable of parsing a grammar in multiple ways, to model efficiently the growth of the statistical summation over the search space.
Item 13: The system of Item 9, used for dynamic macromolecular modeling of DNA in a Monte Carlo simulation using the physical properties of DNA.
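The set-theoretic reading in step b) of the itemized methods can be checked against the connective tables with a short sketch. The derivation rule used here is an assumption that happens to reproduce the listed tables: a conjunction contains "true" when both operands do and contains "false" when either operand does, a disjunction is the dual, and negation is the complement within {T, F}. The names `s_not`, `s_and`, and `s_or` are hypothetical.

```python
T, F = "T", "F"
FULL = frozenset({T, F})

# Subsets of {T, F} and the symbols the text assigns to them.
NAMES = {
    frozenset(): "U",        # { }     undefined
    frozenset({T}): "T",     # {T}     true
    frozenset({F}): "F",     # {F}     false
    FULL: "D",               # {T, F}  defined
}
SETS = {name: subset for subset, name in NAMES.items()}


def s_not(a):
    """Negation as set complement within {T, F}."""
    return NAMES[FULL - SETS[a]]


def s_and(a, b):
    """'true' must be in both operands; 'false' in at least one."""
    x, y = SETS[a], SETS[b]
    out = set()
    if T in x and T in y:
        out.add(T)
    if F in x or F in y:
        out.add(F)
    return NAMES[frozenset(out)]


def s_or(a, b):
    """'true' in at least one operand; 'false' must be in both."""
    x, y = SETS[a], SETS[b]
    out = set()
    if T in x or T in y:
        out.add(T)
    if F in x and F in y:
        out.add(F)
    return NAMES[frozenset(out)]


if __name__ == "__main__":
    for a in "FTUD":
        print(a, "".join(s_and(a, b) for b in "FTUD"),
              "".join(s_or(a, b) for b in "FTUD"))
```

Under these rules F behaves as the bottom element and T as the top, with U and D as complementary values in between, so `s_or(a, s_not(a))` is "T" for every value.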

Claims (13)

1. A machine-implementable method for a semantic network for genome sequencing and analysis, the method comprising:
a) using symbols including F, T, U, and D to represent the values "false", "true", "undefined", and "defined" mapped onto two vector dynamic arrays, wherein F represents the value "false" mapped onto the two vector dynamic arrays, T represents the value "true" mapped onto the two vector dynamic arrays, U represents the value "undefined" mapped onto the two vector dynamic arrays, and D represents the value "defined" mapped onto the two vector dynamic arrays; the values are also mapped to an index in the two vector dynamic arrays and stored as nodes in the semantic network, to represent an input genome sequence;
b) defining F, T, U, and D in set-theoretic terms, wherein { } is "undefined", {T} is "true", {F} is "false", and {T, F} is "defined"; these values are interpreted as follows: property {P} is "true", property {¬P} is "false", property { } is "undefined", and property {P, ¬P} is "defined"; these properties are used, integrated in predicates, to test conditions over successive recursive steps and to quantify variables;
c) ignoring monotonic argument, defining the logic in negation form using the following binary connectives, wherein the connectives logical AND ∧, logical NOT ¬, and logical OR ∨ are used to prove the completeness of the logic, as follows:
¬F is T
¬T is F
¬U is D
¬D is U;
d) for the connective ∧
∧ F T U D
F F F F F
T F T U D
U F U U F
D F D F D
wherein the values in the first row and the values in the first column are the values before the logical AND operation, and the logical AND of the i-th value of the first row and the j-th value of the first column gives the value at row i, column j, where 2≤i≤5 and 2≤j≤5;
e) for the connective ∨
∨ F T U D
F F T U D
T T T T T
U U T U T
D D T T D
wherein the values in the first row and the values in the first column are the values before the logical OR operation, and the logical OR of the i-th value of the first row and the j-th value of the first column gives the value at row i, column j, where 2≤i≤5 and 2≤j≤5;
f) optimizing short-term memory and maximizing long-term memory by linearly encoding syntactic and semantic information in the semantic network;
g) in a parallel environment, the process of optimizing short-term memory to maximize long-term memory becomes optimizing the communication and storage between different knowledge sources, wherein a knowledge source is a process;
h) using "defined" and "undefined" in simulations to help separate asset classes.
2. The method of claim 1, further comprising: using phrase structure rewriting rules associated with nodes in the semantic network, for testing and passing the rewriting rules.
3. The method of claim 2, wherein the method implements a top-down/bottom-up parser capable of parsing a grammar in multiple ways.
4. The method of claim 3, wherein the method uses a system clock, a run-time stack and heap, a processor, machine-readable instructions contained in a non-transitory medium, a database of rewriting rules, a database of semantic networks, and a database of syntactic and semantic information.
5. The method of claim 4, wherein the method implements a top-down/bottom-up parser capable of parsing a grammar in multiple ways, to provide a grammatical pattern-matching capability for modeling patterns that match DNA sequences.
6. The method of claim 5, wherein the method implements dynamic modeling of DNA in a Monte Carlo simulation, for whole genome sequences.
7. The method of claim 5, wherein the method uses a special-purpose processor.
8. A machine-implementable method of a semantic network for macromolecular analysis, the method comprising:
A) using symbols including F, T, U and D to represent the values "false", "true", "undefined" and "defined" mapped to a two-vector dynamic array, wherein F represents the value "false" mapped to the two-vector dynamic array, T represents the value "true" mapped to the two-vector dynamic array, U represents the value "undefined" mapped to the two-vector dynamic array, and D represents the value "defined" mapped to the two-vector dynamic array; these values are also mapped to indices of the two-vector dynamic array and stored as nodes in the semantic network, for representing the input macromolecular structure;
B) defining F, T, U and D set-theoretically, wherein { } is "undefined", {T} is "true", {F} is "false" and {T, F} is "defined"; these values are interpreted as follows: the attribute {P} is "true", {¬P} is "false", { } is "undefined" and {P, ¬P} is "defined"; these attributes are used, in combination with predicates, to test conditions and quantify variables over successive recursive steps;
C) setting aside monotonic reasoning, defining the logic in terms of negation with the following connectives, namely logical AND (∧), logical NOT (¬) and logical OR (∨), to prove the completeness of the logic, as follows:
¬F is T
¬T is F
¬U is D
¬D is U;
D) for the connective ∧:
(truth table for ∧; not reproduced in this text)
wherein the values in the first row and the first column are the operands of the logical AND operation, and the logical AND of the i-th value of the first row and the j-th value of the first column gives the value at row i, column j, where 2≤i≤5 and 2≤j≤5;
E) for the connective ∨:
(truth table for ∨; not reproduced in this text)
wherein the values in the first row and the first column are the operands of the logical OR operation, and the logical OR of the i-th value of the first row and the j-th value of the first column gives the value at row i, column j, where 2≤i≤5 and 2≤j≤5;
F) optimizing short-term memory and maximizing long-term memory by linearly encoding the syntactic and semantic information of the semantic network;
G) in a parallel environment, the processing that optimizes short-term memory and maximizes long-term memory serves to optimize communication between, and storage for, different knowledge sources, wherein a knowledge source is a process;
H) using "defined" and "undefined" in simulations to help separate genotypes from macromolecular structures.
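The set-theoretic reading in step B and the negation results in step C can be sketched as follows. This is a minimal illustration under two assumptions that the claim text does not state explicitly: that negation behaves as set complement over {T, F} (which yields T, F, D, U for the operands F, T, U, D taken in that order), and that the "two-vector" array can be modeled as two parallel Boolean lists, one recording membership of T and one recording membership of F. All names are hypothetical.

```python
# Four values as subsets of {T, F}: { } = U, {T} = T, {F} = F, {T, F} = D.
UNIVERSE = frozenset({"T", "F"})
SETS = {
    "F": frozenset({"F"}),
    "T": frozenset({"T"}),
    "U": frozenset(),
    "D": UNIVERSE,
}
NAMES = {members: name for name, members in SETS.items()}


def negate(symbol: str) -> str:
    """Negation modeled as set complement over {T, F} (an assumption)."""
    return NAMES[UNIVERSE - SETS[symbol]]


def encode(symbols):
    """Map a sequence of F/T/U/D symbols onto two parallel Boolean vectors."""
    has_t = ["T" in SETS[s] for s in symbols]
    has_f = ["F" in SETS[s] for s in symbols]
    return has_t, has_f


def decode(has_t, has_f):
    """Recover the F/T/U/D symbols from the two Boolean vectors."""
    symbols = []
    for t, f in zip(has_t, has_f):
        members = frozenset(x for x, flag in (("T", t), ("F", f)) if flag)
        symbols.append(NAMES[members])
    return symbols


def negate_encoded(has_t, has_f):
    """Under this encoding, complementing a value flips both of its bits."""
    return [not t for t in has_t], [not f for f in has_f]


if __name__ == "__main__":
    for s in "FTUD":
        print(s, "->", negate(s))
```

With this layout, negating a whole sequence is a single pass that flips every entry in both vectors, and negating twice returns the original values.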
9. A system for hybrid modeling of gene sequences and macromolecular structures, for chemical exploration in lock-and-key and induced-fit systems, the system comprising:
A) machine-readable instructions stored in a non-volatile computer-readable medium, a central processing unit, a run-time stack and heap, a semantic network, a top-down/bottom-up parser, a system clock, and a database with historical economic information;
B) the system uses a Boolean-type encoding including F, T, U and D to represent the values "false", "true", "undefined" and "defined" mapped to a two-vector dynamic array, wherein F represents the value "false" mapped to the two-vector dynamic array, T represents the value "true" mapped to the two-vector dynamic array, U represents the value "undefined" mapped to the two-vector dynamic array, and D represents the value "defined" mapped to the two-vector dynamic array; these values are also mapped to indices in the two-vector dynamic array and associated with nodes in the semantic network;
C) F, T, U and D are defined set-theoretically, wherein { } is "undefined", {T} is "true", {F} is "false" and {T, F} is "defined"; these values are interpreted as follows: the attribute {P} is "true", {¬P} is "false", { } is "undefined" and {P, ¬P} is "defined"; these attributes are used, in combination with predicates, for condition testing and variable quantification;
D) the system defines the logic in terms of negation using the following connectives, namely logical AND (∧), logical NOT (¬) and logical OR (∨), to prove the completeness of the logic:
¬F is T
¬T is F
¬U is D
¬D is U;
E) for the connective ∧:
(truth table for ∧; not reproduced in this text)
wherein the values in the first row and the first column are the operands of the logical AND operation, and the logical AND of the i-th value of the first row and the j-th value of the first column gives the value at row i, column j, where 2≤i≤5 and 2≤j≤5;
F) for the connective ∨:
(truth table for ∨; not reproduced in this text)
wherein the values in the first row and the first column are the operands of the logical OR operation, and the logical OR of the i-th value of the first row and the j-th value of the first column gives the value at row i, column j, where 2≤i≤5 and 2≤j≤5;
G) the system optimizes short-term memory and maximizes long-term memory by linearly encoding information into the semantic network;
H) in a parallel environment, the system integrates memory to optimize communication between, and storage for, different knowledge databases.
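The table layout described in steps E and F (operands in the first row and first column, results in rows and columns 2 to 5) can be sketched as a header-indexed lookup grid. The actual table entries are not reproduced in this text, so the AND operation below is only a stand-in: it assumes a Belnap-style conjunction over the (has-T, has-F) bit pairs, which may differ from the patent's table. All names are hypothetical.

```python
SYMBOLS = ["F", "T", "U", "D"]

# (has_T, has_F) bit pairs; this encoding is an assumption for illustration.
BITS = {"F": (0, 1), "T": (1, 0), "U": (0, 0), "D": (1, 1)}
FROM_BITS = {bits: name for name, bits in BITS.items()}


def placeholder_and(a: str, b: str) -> str:
    """Stand-in conjunction (Belnap-style); NOT the patent's table."""
    (t1, f1), (t2, f2) = BITS[a], BITS[b]
    return FROM_BITS[(t1 & t2, f1 | f2)]


def build_table(op, label: str):
    """Build a 5x5 grid: header row and column hold operands, cells hold results."""
    header = [label] + SYMBOLS
    rows = [[a] + [op(a, b) for b in SYMBOLS] for a in SYMBOLS]
    return [header] + rows


def lookup(table, a: str, b: str) -> str:
    """Find operand a in the first column and b in the first row, return the cell."""
    row = [r[0] for r in table].index(a)
    col = table[0].index(b)
    return table[row][col]


if __name__ == "__main__":
    and_table = build_table(placeholder_and, "∧")
    for line in and_table:
        print(" ".join(line))
```

Because the grid is built once and then only indexed, evaluating a connective at run time is a constant-time lookup; substituting the patent's own tables would mean replacing `placeholder_and` or filling the grid by hand.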
10. The system of claim 9, further comprising using phrase-structure rewrite rules associated with nodes in the semantic network, for testing and passing the rewrite rules, wherein the word size of the system imposes a chunking factor on condition testing in theoretical time O(C).
11. The system of claim 9, further comprising a database of vector arrays, a database of semantic network data and grammatical phrase-structure implementations, and a database of logical connectives, wherein each vector array is associated with a respective semantic node.
12. The system of claim 9, wherein the system implements a top-down/bottom-up parser capable of multivariate syntactic parsing of a grammar, to efficiently model the growth of the search space for statistical summation.
13. The system of claim 9, wherein the system uses the physical properties of DNA to perform dynamic macromolecular modeling of DNA in a Monte Carlo simulation.
CN201480066317.3A 2013-12-03 2014-12-01 Calculating instrument for genome sequence and macromolecular analysis Active CN105793858B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US14/095,416 2013-12-03
US14/095,416 US9672466B2 (en) 2013-09-03 2013-12-03 Methods and systems of four-valued genomic sequencing and macromolecular analysis
PCT/US2014/067868 WO2015084704A1 (en) 2013-12-03 2014-12-01 Computational tools for genomic sequencing and macromolecular analysis

Publications (2)

Publication Number Publication Date
CN105793858A CN105793858A (en) 2016-07-20
CN105793858B true CN105793858B (en) 2018-09-21

Family

ID=53274000

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201480066317.3A Active CN105793858B (en) 2013-12-03 2014-12-01 Calculating instrument for genome sequence and macromolecular analysis

Country Status (3)

Country Link
EP (1) EP3077941A4 (en)
CN (1) CN105793858B (en)
WO (1) WO2015084704A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1623996A1 (en) * 2004-08-06 2006-02-08 Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts Improved method of selecting a desired protein from a library
JP2007029061A (en) * 2005-07-29 2007-02-08 National Institute Of Advanced Industrial & Technology New functional peptide-creating system by combining random peptide library or peptide library imitating hypervariable region of antibody, with in vitro peptide selection method using rna-binding protein
CN101492736A (en) * 2008-12-08 2009-07-29 南京市第二医院 Opened protein diversity display method
CN101831489A (en) * 2009-03-11 2010-09-15 孙星江 Method for detecting desoxyribonucleic acid anti-counterfeiting maker by utilizing loop-mediated isothermal amplification technology
CN101883852A (en) * 2007-10-04 2010-11-10 东曹株式会社 Legionella rRNA amplification primers, detection method and detection kit thereof

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020001810A1 (en) * 2000-06-05 2002-01-03 Farrell Michael Patrick Q-beta replicase based assays; the use of chimeric DNA-RNA molecules as probes from which efficient Q-beta replicase templates can be generated in a reverse transcriptase dependent manner
US20030157489A1 (en) * 2002-01-11 2003-08-21 Michael Wall Recursive categorical sequence assembly
EP2111593A2 (en) * 2007-01-26 2009-10-28 Information Resources, Inc. Analytic platform
US8549496B2 (en) * 2009-02-27 2013-10-01 Texas Tech University System Method, apparatus and computer program product for automatically generating a computer program using consume, simplify and produce semantics with normalize, transpose and distribute operations
WO2011050341A1 (en) * 2009-10-22 2011-04-28 National Center For Genome Resources Methods and systems for medical sequencing analysis
US8700337B2 (en) * 2010-10-25 2014-04-15 The Board Of Trustees Of The Leland Stanford Junior University Method and system for computing and integrating genetic and environmental health risks for a personal genome

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1623996A1 (en) * 2004-08-06 2006-02-08 Deutsches Krebsforschungszentrum Stiftung des öffentlichen Rechts Improved method of selecting a desired protein from a library
JP2007029061A (en) * 2005-07-29 2007-02-08 National Institute Of Advanced Industrial & Technology New functional peptide-creating system by combining random peptide library or peptide library imitating hypervariable region of antibody, with in vitro peptide selection method using rna-binding protein
CN101883852A (en) * 2007-10-04 2010-11-10 东曹株式会社 Legionella rRNA amplification primers, detection method and detection kit thereof
CN101492736A (en) * 2008-12-08 2009-07-29 南京市第二医院 Opened protein diversity display method
CN101831489A (en) * 2009-03-11 2010-09-15 孙星江 Method for detecting desoxyribonucleic acid anti-counterfeiting maker by utilizing loop-mediated isothermal amplification technology

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Sensitive and fast mapping of di-base encoded reads"; F. HORMOZDIARI et al.; BIOINFORMATICS; 2011-05-17; Vol. 27 (No. 14); pp. 1915-1921 *

Also Published As

Publication number Publication date
EP3077941A1 (en) 2016-10-12
CN105793858A (en) 2016-07-20
WO2015084704A1 (en) 2015-06-11
EP3077941A4 (en) 2017-08-30

Similar Documents

Publication Publication Date Title
Wang et al. Mathdqn: Solving arithmetic word problems via deep reinforcement learning
Yin et al. Neural enquirer: Learning to query tables with natural language
Immerman Descriptive complexity
Khoussainov et al. Automata theory and its applications
CN105706092B (en) The method and system of four values simulation
CN105706091B (en) The method and system of the symbol of the four value analogy translation operations used in natural language processing and other application
Onan SRL-ACO: A text augmentation framework based on semantic role labeling and ant colony optimization
Agarwala et al. One network fits all? modular versus monolithic task formulations in neural networks
CN116561251A (en) Natural language processing method
Song et al. Taxonprompt: Taxonomy-aware curriculum prompt learning for few-shot event classification
Balaji et al. Text summarization using NLP technique
Wu et al. Memory-aware attentive control for community question answering with knowledge-based dual refinement
Wang et al. Deep Semantics Sorting of Voice-Interaction-Enabled Industrial Control System
Correia et al. Quantum computations for disambiguation and question answering
Sun et al. Computational modeling of hierarchically polarized groups by structured matrix factorization
CN105793858B (en) Calculating instrument for genome sequence and macromolecular analysis
US20230177261A1 (en) Automated notebook completion using sequence-to-sequence transformer
Kim et al. Improving a Graph-to-Tree Model for Solving Math Word Problems.
Ishdorj et al. Replicative–distribution rules in P systems with active membranes
Mohammadi et al. A comprehensive survey on multi-hop machine reading comprehension approaches
Ramanujan et al. Control words of transition P systems
Xie et al. Match matrix aggregation enhanced transition-based neural network for sql parsing
Bandyopadhyay et al. DrugDBEmbed: Semantic queries on relational database using supervised column encodings
Correia et al. Grover's Algorithm for Question Answering
Maupomé et al. Position Encoding Schemes for Linear Aggregation of Word Sequences.

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant