WO2007041966A1 - Method and system for designing a sample library

Info

Publication number
WO2007041966A1
WO2007041966A1 PCT/CN2006/002691 CN2006002691W WO2007041966A1 WO 2007041966 A1 WO2007041966 A1 WO 2007041966A1 CN 2006002691 W CN2006002691 W CN 2006002691W WO 2007041966 A1 WO2007041966 A1 WO 2007041966A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
distribution
variable
samples
pseudo
Prior art date
Application number
PCT/CN2006/002691
Other languages
English (en)
Chinese (zh)
Inventor
Xinlei Hua
Xichen Feng
Original Assignee
Accelergy Shanghai R & D Center Co., Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Accelergy Shanghai R & D Center Co., Ltd
Publication of WO2007041966A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G16C20/62 Design of libraries
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides

Definitions

  • The present invention relates to high-throughput experimentation, an efficient testing method, and more particularly to the design of combined sample libraries within that field.
  • the present invention provides a method for integrating empirical knowledge to design a combined sample library.
  • This empirical knowledge can be embodied as the components of the sample, the variables associated with the components, and the constraints on these variables.
  • the invention also provides a method of designing a sample library comprising the following steps:
  • the constraint of the variable is the relationship between variables determined by experience or previously known knowledge.
  • the empirical knowledge is a physical or chemical natural law represented by a variable.
  • the pseudo sample can be generated by random sampling.
  • the random sampling described above is performed by employing a set of variables, each of which corresponds to a component and randomly takes values within a certain interval.
  • An example of random sampling includes Monte Carlo simulations.
  • the random values are generated by a random number generator.
  • The random values are related to a probability distribution or probability density. The probability distribution is a uniform distribution or a non-uniform distribution.
  • The non-uniform distribution includes the Bernoulli distribution, the beta distribution, the chi-square distribution, the exponential distribution, the F distribution, the gamma distribution, the Gaussian distribution, the normal distribution (e.g., lognormal, multivariate normal, and univariate normal distributions), the non-central chi-square distribution, the non-central F distribution, the binomial distribution, the negative binomial distribution, the multinomial distribution, the Pareto distribution, the Poisson distribution, Student's t distribution, and the Tsallis distribution.
  • the probability distribution includes a uniform distribution, a normal distribution, and a Gaussian distribution.
  • Yet another aspect of the present invention provides a method of obtaining a desired number of samples in a sample library, the method comprising the steps of:
  • the present invention provides a method of measuring the optimal number of samples that need to be designed and/or synthesized, the method comprising the steps of:
  • the method further includes the step of determining a ratio of qualified samples divided by the number of pseudo samples to obtain a qualified sample ratio.
  • the invention also provides a computer product comprising computer software.
  • Once the computer software is running, the methods and calculations of the present invention can be performed.
  • the computer software can perform random sampling.
  • Figure 1 is a schematic representation of two-component samples produced by the Monte Carlo simulation method, each sample consisting of cerium (Ce) and iron (Fe). All the dots (hollow, gray, and black) in the figure represent pseudo samples consisting of independent, uniformly distributed, randomly generated cerium and iron variables; there is no constraint between the cerium variable and the iron variable.
  • The gray dots and black dots in the figure represent pseudo samples that satisfy the first constraint.
  • The black dots represent pseudo samples that satisfy both the first and second constraints (see Example 1 for details).
  • Figure 2 is a three-dimensional view of four-component samples produced by the Monte Carlo simulation method, each sample consisting of cerium (Ce), iron (Fe), tungsten (W), and nickel (Ni). All points represent pseudo samples in which each of the four variables is randomly and independently distributed, without any constraint between them.
  • Figure 3 is a schematic illustration of the pseudo samples in Figure 2 that satisfy the first constraint.
  • Figure 4 is a schematic illustration of the pseudo samples of Figure 3 that further satisfy the second constraint.
  • Figure 5 is a graphical user interface (GUI) that allows a user to design a multi-component sample library that provides a variable, a range of variables, and a desired segmentation for each component.
  • Figure 6 is a graphical user interface given after selecting a component and corresponding variables.
  • Figure 7 is a graphical user interface that allows a user to specify one or more constraints on a variable.
  • the graphical user interface shown in Figure 8 allows the user to choose 1) whether to perform the Monte Carlo simulation method; 2) how to perform the Monte Carlo simulation method.
  • the graphical user interface shown in Figure 9 allows the user to enter the specified number of samples to be obtained and the specified number of components for the input sample.
  • the graphical user interface shown in Figure 10 allows the user to specify each component.
  • the graphical user interface shown in Figure 11 allows the user to define constraints using variables.
  • The present invention relates to a design strategy for a combined sample library that is to be designed, synthesized, screened, and measured.
  • One aspect of the invention provides a method of designing a sample library comprising providing a plurality of components of a sample.
  • "Combined sample library" refers to a collection comprising a plurality of samples; "sample" refers to a material comprising a plurality of components.
  • "Component" refers to a substance, such as an element, a molecule, a compound, or another substance, or a combination of these substances.
  • A sample comprises n different components $C_1, C_2, C_3, \ldots, C_i, \ldots, C_n$, where n is an integer and refers to the number of different components in the sample.
  • The mass of each component $C_i$ is expressed as $\mathrm{MW}_i$, where $i \in \{0, 1, 2, \ldots, n\}$; the composition number in the sample is expressed as $X_i$, and the corresponding composition ratio is expressed as $R_i$.
  • Mass $\mathrm{MW}_i$ refers to the molecular weight or atomic weight of the component.
  • Composition number $X_i$ refers to the count of the i-th component in the sample, so the sample can be expressed as $(C_1)_{X_1}(C_2)_{X_2}\ldots(C_i)_{X_i}\ldots(C_n)_{X_n}$, where $i \in \{0, 1, 2, \ldots, n\}$. The composition ratio can be characterized as the relative weight of one component in the sample, which can be expressed by Equation 1: $R_i = \dfrac{X_i\,\mathrm{MW}_i}{\sum_{j} X_j\,\mathrm{MW}_j}$
  • Composition number $X_i$ can also refer to the molar ratio of the i-th component in the sample.
  • The composition ratio can also be expressed as the mole fraction of a component in the sample, which is between 0 and 1 and can be defined by Equation 2: $R_i = \dfrac{X_i}{\sum_{j} X_j}$
  • The composition ratio may further be expressed as the percentage of a component in the sample, with a value between 0% and 100%.
  • The glucose molecule $\mathrm{C_6H_{12}O_6}$ can be considered a sample containing three components: carbon (C), hydrogen (H), and oxygen (O). Each component has a composition number: 6 for C, 12 for H, and 6 for O.
  • The mass (MW) of each component can be derived from the mass of each atom: 12 for C, 1 for H, and 16 for O. Therefore, the (weight) composition ratio of C is 0.4 or 40% (12*6/(12*6+1*12+16*6)); that of H is 0.067 or 6.7%; that of O is 0.533 or 53.3%.
  • the sum of the composition ratios of the components is 1.
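  • The glucose arithmetic above can be checked with a short script. This is a minimal sketch, not part of the patent: the function names `weight_ratios` and `mole_fractions` are hypothetical, and the rounded atomic masses (12, 1, 16) are the ones quoted in the text.

```python
def weight_ratios(counts, masses):
    """Weight-based composition ratio: X_i * MW_i / sum_j(X_j * MW_j)."""
    total = sum(counts[c] * masses[c] for c in counts)
    return {c: counts[c] * masses[c] / total for c in counts}


def mole_fractions(counts):
    """Mole-fraction composition ratio: X_i / sum_j(X_j)."""
    total = sum(counts.values())
    return {c: counts[c] / total for c in counts}


# Glucose, C6H12O6, with the rounded atomic masses used in the text.
glucose_counts = {"C": 6, "H": 12, "O": 6}
atomic_masses = {"C": 12, "H": 1, "O": 16}

w = weight_ratios(glucose_counts, atomic_masses)
x = mole_fractions(glucose_counts)
print({k: round(v, 3) for k, v in w.items()})  # weight ratios
print({k: round(v, 3) for k, v in x.items()})  # mole fractions
```

  • Both dictionaries sum to 1, which matches the statement that the composition ratios of the components add up to 1.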
  • Another feature of the combined sample library is that each sample in the sample library consists of the same type of components, but these components have different composition ratios.
  • a method of designing a combined sample library includes providing a variable for each component of the multi-component sample.
  • the variable corresponds to the component in the sample.
  • The variable V is a random value in the interval $[v_{\min}, v_{\max}]$, where $v_{\min}$ is not less than 0, $v_{\max}$ is not more than 1, and $v_{\min} \le v_{\max}$.
  • For example, the interval is [0, 1]. If the variable V takes its value from a discrete set $\{V_1, V_2, \ldots, V_x\}$, then V is discrete, and each discrete value falls within the interval $[v_{\min}, v_{\max}]$ (e.g., [0, 1]).
  • If V is a random value in the interval $[v_{\min}, v_{\max}]$, it may be continuous.
  • The random value set for a first variable does not depend on the value assumed by a second variable.
  • the process of setting a random value from the interval of the same variable depends on the probability or probability density of the possible values of the variable.
  • the setting of the random value depends on the particular probability of the respective discrete value in the interval.
  • $V_i$ is the variable of a first component $C_i$.
  • $V_j$ is the variable of a second component $C_j$.
  • $V_i$ can be set to a random value in the interval $[V_{i,\min}, V_{i,\max}]$.
  • $V_j$ can be set to a random value in the interval $[V_{j,\min}, V_{j,\max}]$.
  • The value of $V_i$ is independent of $V_j$.
  • $C_i$ and $C_j$ are components in a sample consisting of $C_1, C_2, \ldots, C_i, \ldots, C_j, \ldots, C_n$, where $i, j \in \{0, 1, 2, \ldots, n\}$.
  • Upon synthesis, the variable $V_i$ becomes the composition ratio $R_i$ of the component $C_i$.
  • Likewise, the variable $V_j$ becomes the composition ratio $R_j$, and the sum of the variables of all the components in the sample satisfies Equation 4: $\sum_{i} V_i = 1$
  • Another aspect of the invention consists in: providing or setting at least one constraint for at least one variable in the sample in the provided method of designing the combined sample library.
  • the term "constraint" refers to the condition of at least one variable or the relationship between variables.
  • A constraint is a condition that one or more variables in a sample must satisfy.
  • The set of variables $\{V_i\}$ in a valid or qualified sample must satisfy at least one constraint or a specific set of constraints. For example, assume that the sample includes components $C_1, C_2, \ldots, C_n$ and each component $C_i$ has a variable $V_i$, where $i \in \{0, 1, 2, \ldots, n\}$; then, in a valid sample, the sum of the component variables must satisfy the following constraint, Equation 5: $1 - \Delta \le \sum_{i} V_i \le 1 + \Delta$
  • Here $\Delta$ is the error (such as the constraint tolerance or the constraint deviation), a value between 0 and 0.2.
  • $\Delta$ may be 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, or 0.10.
  • the constraint approximates the equation as shown in Equation 5.
  • empirical knowledge about the components of the sample can be embodied as a relationship between multiple variables, which can also be understood as a constraint.
  • The ratio of the composition ratio of $C_i$ to that of $C_j$ should be 2:1, where $i \in \{0, 1, 2, \ldots, n\}$, $j \in \{0, 1, 2, \ldots, n\}$, and $i \ne j$.
  • The sum of the composition ratios of $C_i$ and $C_j$ is X, where X is a value between 0 and 1.
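  • Constraints of the kinds just described (a sum close to 1 within a tolerance, a 2:1 ratio between two components, a fixed pairwise sum) can be written as simple predicates over a list of variable values. The sketch below is illustrative only: the function names, the default tolerance of 0.05, and the relative tolerance on the ratio are assumptions, not values taken from the patent.

```python
def sum_constraint(values, tolerance=0.05):
    """True if the variables sum to 1 within +/- tolerance."""
    return 1 - tolerance <= sum(values) <= 1 + tolerance


def ratio_constraint(values, i, j, ratio=2.0, rel_tol=0.05):
    """True if values[i] : values[j] is approximately ratio : 1.

    rel_tol is a hypothetical relative tolerance on the ratio.
    """
    if values[j] <= 0:
        return False
    return abs(values[i] / values[j] - ratio) <= rel_tol * ratio


def pair_sum_constraint(values, i, j, target, tolerance=0.05):
    """True if values[i] + values[j] is within tolerance of target."""
    return abs(values[i] + values[j] - target) <= tolerance


candidate = [0.6, 0.3, 0.1]
print(sum_constraint(candidate))                   # sums to about 1
print(ratio_constraint(candidate, 0, 1))           # 0.6 : 0.3 is 2 : 1
print(pair_sum_constraint(candidate, 0, 1, 0.9))   # 0.6 + 0.3 is about 0.9
```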
  • The method of designing the combined sample library provided by the present invention involves generating a pseudo sample.
  • "Pseudo sample" refers to a hypothetical multi-component sample in which each component has an independent variable, such that the assignment of a value to one variable is an event independent of the assignment of a value to any other variable, and the variables of the pseudo sample are not subject to any restriction. In other words, its variables may or may not satisfy the constraints.
  • A pseudo sample includes components $C_1, C_2, \ldots, C_i, \ldots, C_j, \ldots, C_n$, where $i, j \in \{0, 1, 2, \ldots, n\}$.
  • If $V_i$ and $V_j$ are random values in the interval [0, 1], $V_i$ may take one value in [0, 1] and $V_j$ may take another value in [0, 1].
  • The sum of the values of these variables does not need to meet the requirements of Equations 4 and 5.
  • A pseudo sample may or may not correspond to a real physical sample; a pseudo sample is a sample point whose variable values are assumed independently. A large number of pseudo samples therefore constitutes a sample space.
  • Pseudo samples can be generated using random sampling.
  • the random sampling method is a method of generating a sample point by randomly assigning a value to each component of a sample point.
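  • Generating pseudo samples by random sampling can be sketched as follows. This is a minimal illustration assuming uniformly distributed variables and using Python's standard `random` module; the helper name `pseudo_sample` and the choice of four components are hypothetical.

```python
import random


def pseudo_sample(bounds, rng):
    """Draw one value per component, each independently of the others.

    bounds is a list of (v_min, v_max) pairs, one per component.
    """
    return [rng.uniform(lo, hi) for lo, hi in bounds]


rng = random.Random(0)              # seeded for reproducibility
bounds = [(0.0, 1.0)] * 4           # four components, each in [0, 1]
samples = [pseudo_sample(bounds, rng) for _ in range(1000)]

# No constraint is applied here, so the values of a pseudo sample
# generally do not sum to 1.
print(samples[0], sum(samples[0]))
```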
  • Methods of generating random values in the interval $[V_{\min}, V_{\max}]$ are well established, where $V_{\min}$ is not less than 0 and $V_{\max}$ is not greater than 1.
  • The algorithms and programs for random number generators are well known in the field of computer science. See D. E. Knuth, "The Art of Computer Programming: Seminumerical Algorithms" (Vol.
  • the randomly generated value is the occurrence of an event or a possible assigned variable (V) in a particular interval [v min , v max ], where the probability of occurrence of the event depends on the probability density or probability distribution of the variable. Therefore, the variable is further defined by the probability function assigning the value contained in the variable interval. For example, a discrete variable can be defined by assigning a correlation probability to each discrete value in the interval. Continuous variables can be defined by assigning a probability distribution to intervals that include all possible values of the variable.
  • the probability distribution used here refers to the arrangement of the values of the variables that reflect the frequency of their observations or theoretical occurrences.
  • Probability distributions well known in the art include uniform distributions and non-uniform distributions. Non-uniform distributions include the Bernoulli distribution, beta distribution, chi-square distribution, exponential distribution, F distribution, gamma distribution, Gaussian distribution, normal distribution (e.g., lognormal, multivariate normal, and univariate normal distributions), non-central chi-square distribution, non-central F distribution, binomial distribution, negative binomial distribution, multinomial distribution, Pareto distribution, Poisson distribution, Student's t distribution, Tsallis distribution, and any combination of the above distributions.
  • The random variables are assigned by a non-uniform probability distribution, for example a normal distribution, a Poisson distribution, or a Gaussian distribution.
  • The randomly generated value of the variable is assigned according to a uniform distribution; a uniformly distributed random variable can therefore take any value in the interval $[V_{\min}, V_{\max}]$ ($V_{\min} \ge 0$, $V_{\max} \le 1$) with equal probability.
  • Uniformly distributed random values can be generated by a random number generator (e.g., a linear congruential generator).
  • In the generator's recurrence $X_{k+1} = (a X_k + c) \bmod m$, a, c, and m are pre-set constants: a is the multiplier, c is the increment, and m is the modulus.
  • See Park and Miller, "Random Number Generators: Good Ones Are Hard to Find", Comm. ACM 31: 1192-1201, 1988.
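  • The generator with pre-set constants a, c, and m mentioned above is commonly called a linear congruential generator and follows the recurrence X_{k+1} = (a * X_k + c) mod m. The sketch below is an illustration only. The default constants (a = 16807, c = 0, m = 2^31 - 1) are the "minimal standard" choice discussed by Park and Miller; the patent text does not state which constants it uses, so treat them as an assumption.

```python
import itertools


def lcg(seed, a=16807, c=0, m=2**31 - 1):
    """Yield uniform values in (0, 1) from X_{k+1} = (a * X_k + c) mod m.

    With c = 0 the seed must be an integer in 1 .. m - 1.
    """
    state = seed
    while True:
        state = (a * state + c) % m
        yield state / m


# First five uniform values for seed 1.
first_five = list(itertools.islice(lcg(1), 5))
print(first_five)
```

  • In practice a library generator (for example Python's `random.Random`) is the safer choice; the sketch only makes the roles of the multiplier, increment, and modulus concrete.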
  • The random number generator includes the shift-register sequence generator described by Kirkpatrick and Stoll in "A Very Fast Shift-Register Sequence Random Number Generator" (Journal of Computational Physics 40: 517-526, 1981). Furthermore, the random number generator also includes quasi-random number generators; see Press and Teukolsky, "Quasi Random Numbers" (Computers in Physics 3: 76-79, 1989).
  • Non-uniform random values, such as normal or Gaussian random values, can be generated by several methods.
  • One of the methods uses a transformation function, such as the well-known Box-Muller transform, to convert uniformly distributed random variables into a new set of non-uniformly distributed random variables (for example, Gaussian or normally distributed); see Box and Muller, "A Note on the Generation of Random Deviates", Annals Math. Stat. 29: 610-611, 1958.
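  • The Box-Muller transform referred to above maps two independent uniform values to two independent standard normal values. The following is a minimal sketch of the basic (trigonometric) form; the function name and the sample size are arbitrary choices for illustration.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniform values (u1 in (0, 1], u2 in [0, 1)) to two
    independent standard normal values."""
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)


rng = random.Random(1)
normals = []
for _ in range(5000):
    # 1 - random() lies in (0, 1], which keeps log() finite.
    z0, z1 = box_muller(1.0 - rng.random(), rng.random())
    normals.extend((z0, z1))

mean = sum(normals) / len(normals)
var = sum((z - mean) ** 2 for z in normals) / len(normals)
print(mean, var)  # should be close to 0 and 1 respectively
```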
  • the random sampling method comprises a Monte Carlo method or simulation.
  • The term "Monte Carlo method" or "Monte Carlo simulation" here refers to a random sampling method used to study a problem and obtain an approximate, probabilistic solution to it.
  • the term “Monte Carlo method” or “Monte Carlo simulation” as used herein refers in particular to the process of generating a random event (such as a randomly occurring value of any given variable). This process is usually done by computer algorithms, which are repeated multiple times, and all test results are analyzed and calculated to provide an approximate solution.
  • For Monte Carlo simulations, see Metropolis and Ulam, "The Monte Carlo Method", Journal of the American Statistical Association 44: 335-341, 1949; Sobol, "The Monte Carlo Method", The University of Chicago Press, 1974; and Mooney, "Monte Carlo Simulation", Sage University Paper, 1997.
  • The Monte Carlo method is constantly evolving in the field. For example, the method was initially applied to estimating the value of π by throwing darts at a target (a circle circumscribed by a square).
  • The ratio of the number of darts that land inside the circle to the number that land inside the square approximates a fraction of π, namely π/4; see Ross, "A First Course in Probability", 2nd Edition, Macmillan, 1976.
  • Equation 6 Equation 7 below:
  • A is the number of points falling within V(x),
  • B is the number of all points produced in the range box,
  • C is the area of the range box.
  • The ratio A/B approximates the proportion of the range box's area that is occupied by V(x).
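  • The dart-throwing estimate and the range-box ratio described above are both instances of "hit-or-miss" Monte Carlo: the fraction of random points that fall inside a region, multiplied by the area of the enclosing box, approximates the region's area. The sketch below is an illustration with hypothetical names; it estimates π from a unit circle inside the square [-1, 1] x [-1, 1].

```python
import random


def hit_or_miss_area(inside, box, n_points, rng):
    """Estimate the area of a region enclosed by a rectangular box.

    inside(x, y) returns True when the point lies in the region;
    box is ((x_min, x_max), (y_min, y_max)).
    """
    (x0, x1), (y0, y1) = box
    box_area = (x1 - x0) * (y1 - y0)          # area of the range box
    hits = 0                                   # points inside the region
    for _ in range(n_points):                  # all points produced in the box
        x, y = rng.uniform(x0, x1), rng.uniform(y0, y1)
        if inside(x, y):
            hits += 1
    return hits / n_points * box_area


rng = random.Random(42)
pi_estimate = hit_or_miss_area(
    lambda x, y: x * x + y * y <= 1.0,   # unit circle, area pi
    ((-1.0, 1.0), (-1.0, 1.0)),          # enclosing square, area 4
    200_000,
    rng,
)
print(pi_estimate)
```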
  • Monte Carlo simulation may involve generating random variable values that are weighted by empirical knowledge. For example, empirical knowledge about components (or component ratios) may require assigning a variable different probability densities in different specific intervals (continuous variables), or assigning it different probabilities at different values (discrete variables).
  • Another example of Monte Carlo simulation includes Markov chain operations. A Markov chain is a sequence of random values in which the probability of each event depends on the value generated at the previous step. See Frenkel and Smit, "Understanding Molecular Simulation: From Algorithms to Applications", Academic Press, 1996.
  • Another aspect of the invention relates to a method of selecting a qualified sample from a pseudo sample.
  • "Qualified sample" refers to a pseudo sample, produced by the method described in the present invention, whose variables satisfy one or more specific constraints; a pseudo sample that does not satisfy the constraints is referred to as an unqualified sample.
  • The pseudo samples are produced by a random sampling method (e.g., Monte Carlo simulation), i.e., by a large number of trials.
  • Each pseudo sample is examined (inspected), for example, using a computer algorithm to determine if it meets a particular constraint or constraints.
  • a pseudo sample that satisfies the constraint is selected and stored as a qualified sample.
  • The values associated with each qualified sample are recorded as a vector and associated with the composition ratios used to design and synthesize a qualified sample in the sample library, since in a qualified sample the values are the composition ratios.
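  • The selection step described above (examine each pseudo sample, keep those that satisfy every constraint, store each as a vector) can be sketched as a simple filter. The names, the three-component setup, and the 0.05 tolerance are assumptions made for illustration.

```python
import random


def select_qualified(pseudo_samples, constraints):
    """Keep the pseudo samples that satisfy every constraint."""
    return [s for s in pseudo_samples if all(check(s) for check in constraints)]


rng = random.Random(7)
# 10,000 three-component pseudo samples, each variable uniform in [0, 1).
pseudo = [(rng.random(), rng.random(), rng.random()) for _ in range(10_000)]

# A single constraint: the variables sum to 1 within a tolerance of 0.05.
constraints = [lambda s: abs(sum(s) - 1.0) <= 0.05]

qualified = select_qualified(pseudo, constraints)
print(len(pseudo), len(qualified))
```

  • Each tuple in `qualified` is a vector whose entries can be read as composition ratios, because they sum to approximately 1.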
  • Another aspect of the invention provides a method of producing a given number of samples in a sample library.
  • the method includes the following steps:
  • Another aspect of the present invention discloses a method of calculating the qualified sample ratio, where the term "qualified sample ratio ($R_{qs}$)" means the proportion of pseudo samples generated by a random sampling method whose variables satisfy one or more constraints.
  • The qualified sample ratio ($R_{qs}$) can be estimated by dividing the number of qualified samples ($N_{qs}$) in the random sampling method by the number of pseudo samples ($N_{ps}$), Equation 8: $R_{qs} = \dfrac{N_{qs}}{N_{ps}}$
  • As $N_{ps}$ increases, the error of the calculation becomes smaller, following Equation 9: $\text{Accuracy} \propto \dfrac{1}{\sqrt{N}}$, where N is the number of random simulations (such as Monte Carlo trials). When a large number of Monte Carlo trials are performed, $1/\sqrt{N}$ continues to decrease, the variation in the qualified sample ratio decreases, and the accuracy increases. In other words, when a sufficient number of trials are performed, the qualified sample ratio can reach a relatively high degree of accuracy and precision. For example, for the constraint $1 - \Delta \le \sum_{i} V_i \le 1 + \Delta$, the more qualified samples the Monte Carlo simulation obtains, the more accurate the estimate:
  • when $N_{qs}$ reaches 10, its accuracy is between -30% and 30%; when $N_{qs}$ reaches $10^2$, its accuracy is between -10% and 10%; when $N_{qs}$ reaches $10^3$, its accuracy is between -3% and 3%; when $N_{qs}$ reaches $10^4$, its accuracy is between -1% and 1%.
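  • The qualified sample ratio and its approximate accuracy can be estimated in a few lines. This sketch assumes uniform variables in [0, 1) and reads the quoted accuracy figures (about 30% at 10 qualified samples, 10% at 100, and so on) as a relative error of roughly 1/sqrt(N_qs); the function and parameter names are hypothetical.

```python
import math
import random


def qualified_ratio(n_pseudo, is_qualified, n_vars, rng):
    """Return (R_qs, N_qs, approximate relative error) for n_pseudo trials."""
    n_qualified = sum(
        1
        for _ in range(n_pseudo)
        if is_qualified([rng.random() for _ in range(n_vars)])
    )
    ratio = n_qualified / n_pseudo
    # Relative error shrinks like 1 / sqrt(N_qs); undefined with no hits.
    rel_error = 1.0 / math.sqrt(n_qualified) if n_qualified else float("inf")
    return ratio, n_qualified, rel_error


rng = random.Random(3)
ratio, n_qs, rel_error = qualified_ratio(
    20_000,
    lambda v: abs(sum(v) - 1.0) <= 0.05,   # sum constraint, tolerance 0.05
    2,                                     # two components
    rng,
)
print(ratio, n_qs, rel_error)
```

  • For two uniform variables and a tolerance of 0.05, the exact probability of satisfying the sum constraint is 0.0975, so the printed ratio should lie close to that value.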
  • the invention also discloses a method of estimating the optimal number of acceptable samples.
  • optimal number of qualified samples herein refers to the number of samples that satisfy a particular one or more constraints and that properly represent the sample space.
  • the optimal number of qualified samples is obtained by detecting all possible pseudo samples produced by discrete variables and identifying pseudo samples that satisfy a particular constraint.
  • A set of discrete values is generated for a specific variable $V_i$ by dividing (uniformly or non-uniformly) the interval $[V_{i,\min}, V_{i,\max}]$ into M parts or cells, thereby generating a set of defined values for the interval.
  • M is a positive integer and can be any number between 1 and 1,000,000.
  • The total number of pseudo samples (Z) is determined by the number of cells of each variable.
  • Each component $C_i$ has a variable $V_i$, where $i \in \{0, 1, 2, \ldots, n\}$, and each $V_i$ is divided into $M_i$ parts, cells, or points, so $V_i$ takes a set of discrete values in the interval $[V_{\min}, V_{\max}]$ ($V_{\min} \ge 0$ and $V_{\max} \le 1$). This division is determined by experience with the variables corresponding to the components and can be uniform or non-uniform. If no constraints on the variables are considered or provided, the total number of pseudo samples (Z) based on the set $\{M_i\}$ represents the number of sample points in the n-dimensional sample space, and Z can be derived by Equation 10: $Z = \prod_{i} M_i$
  • all pseudo samples can be detected, a pseudo sample whose variables satisfy the constraint is selected and stored in the vector to form a qualified sample set.
  • the number of eligible samples i.e., the number of vectors described above, is the optimum number of eligible samples in the sample space corresponding to a given set of discrete values.
  • As the number of components (or variables) increases and the number of segments per component also increases, a full search through the sample space can become quite cumbersome. For example, for a sample library consisting of five components in which the specific interval of each component variable is divided into 100 cells, the sample space is a five-dimensional space, and the total number of all pseudo samples (Z) is $100^5$ (or $10^{10}$, i.e., 10,000,000,000). When one or more constraints are introduced, the calculation becomes more complicated. Although it is possible to check whether each of the pseudo samples satisfies the one or more constraints, another method may be employed to provide an approximate estimate of the optimal value by performing random sampling as described herein (e.g., Monte Carlo simulation).
  • the random number can be generated based on a set of discrete values, where each discrete value is located in the interval assumed by the variable and has a certain probability.
  • The random number may also be any value in a certain interval having a specific probability distribution. The optimal number of qualified samples is therefore given by the product of Z and $R_{qs}$. The optimal number of qualified samples is expected to vary with a set of parameters. Examples of these parameters include the number of variable segments (M) in the sample, the method of generating random numbers, the statistical distribution of the variables, the Monte Carlo simulation method, the number of Monte Carlo trials, the variable constraints, the tolerance limits, and the required accuracy or precision.
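  • The relationship between the total number of pseudo samples Z, the qualified sample ratio, and the optimal number of qualified samples can be illustrated on a small grid where exhaustive enumeration is still feasible. This is a sketch under stated assumptions: uniform segmentation of [0, 1] into evenly spaced points, a sum constraint with tolerance 0.05, and hypothetical function names.

```python
import itertools
import math
import random


def total_pseudo_samples(segments):
    """Z = M_1 * M_2 * ... * M_n."""
    return math.prod(segments)


def grid(m):
    """m evenly spaced values in [0, 1] (uniform segmentation is assumed)."""
    return [k / (m - 1) for k in range(m)]


def exact_count(segments, is_qualified):
    """Full search: check every grid point in the sample space."""
    grids = [grid(m) for m in segments]
    return sum(1 for s in itertools.product(*grids) if is_qualified(s))


def estimated_count(segments, is_qualified, n_trials, rng):
    """Monte Carlo estimate: Z times the sampled qualified ratio."""
    grids = [grid(m) for m in segments]
    hits = sum(
        1 for _ in range(n_trials)
        if is_qualified([rng.choice(g) for g in grids])
    )
    return total_pseudo_samples(segments) * hits / n_trials


def sums_to_one(s):
    return abs(sum(s) - 1.0) <= 0.05


segments = [11, 11, 11]                       # three components, 11 points each
exact = exact_count(segments, sums_to_one)
estimate = estimated_count(segments, sums_to_one, 20_000, random.Random(5))
print(total_pseudo_samples(segments), exact, estimate)
print(total_pseudo_samples([100] * 5))        # the five-component case in the text
```

  • For the small grid the full search is cheap, so the estimate can be compared with the exact count; for the five-component, 100-cell case only the sampling route is practical.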
  • a computer system in the present invention refers to a computer or computer readable medium that is designed and configured to perform some or all of the methods described herein.
  • The computer (e.g., a server) employed herein can be any of a variety of general purpose computers, such as personal computers, network servers, workstations, or other computer platforms that exist now or are developed in the future.
  • computers include, in particular, some or all of the components such as processors, operating systems, computer memories, input devices, and output devices.
  • The computer can further include, for example, a cache, a data backup unit, and other equipment.
  • A processor as used herein may include one or more microprocessors, field-programmable gate arrays, or one or more application-specific integrated circuits.
  • Processors include, but are not limited to, Intel's Pentium series processors, S-Chip's microprocessors, Sun's workstation system processors, Motorola's personal desktop processors, MIPS Technologies' MIPS processors, Xilinx's highest range of field-programmable gate arrays, and other processors.
  • the operating system employed herein includes machine code that, through execution of the processor, coordinates and performs functions of other portions of the computer, and assists the processor in performing functions of various computer programs that may be written in various programming languages.
  • the operating system also provides scheduling, input and output control, file data management, memory management and communication control, and related services, all of which are prior art.
  • Typical operating systems include Windows operating systems from Microsoft Corporation, Unix or Linux operating systems from a variety of vendors, other existing or future operating systems, and combinations of these operating systems.
  • The computer memory used herein can be any of a variety of memory storage devices. Examples include random access memory, magnetic media such as fixed hard disks or tapes, optical media such as read/write optical discs, or other storage devices.
  • The memory storage device can be any existing or future device, including a compact disc drive, a tape drive, a removable hard drive, or a floppy disk drive. These types of memory storage devices typically read from or write to a computer program storage medium, such as an optical disc, magnetic tape, removable hard disk, or floppy disk. All of these computer program storage media can be considered computer program products.
  • the products of these computer programs typically store computer software programs and/or data. Computer software programs are typically stored in system memory and/or memory storage.
  • the computer software program of the present invention can be executed by loading it into a system memory and/or memory storage device using some type of input device.
  • all or part of the software program may also be present in a read only memory or similar memory storage device, such device not requiring the software program to be loaded first through the input device.
  • The software program, or portions thereof, can be loaded by the processor into system memory, a cache, or a combination of both in a conventional manner in order to execute it and perform random sampling.
  • The substance handling device may be the substance treatment apparatus disclosed in the applicant's international patent application PCT/CN2005/002177, "Material Treatment Apparatus and Application thereof"; the reaction system disclosed in the applicant's Chinese Patent Application No. 200510029727.3; the parallel reaction system disclosed in the applicant's Chinese Patent Application No. 200610085162.5; the reaction system disclosed in the Chinese patent application "Reaction System" filed by the applicant on September 30, 2006; or the system of another application filed by the applicant on September 30, 2006.
  • the software is stored in a computer server that is coupled to the user terminal, input device, or output device via a data line, wireless line, or network system.
  • network systems include hardware and software that are electrically coupled together in a computer or device.
  • The network system may include the Internet, 10/1000 Ethernet, IEEE 802.11x, IEEE 1394, xDSL, Bluetooth, LAN, WLAN, GSP, CDMA, 3G, PACS, or equipment based on any other ANSI-recognized standard medium.
  • On the one hand, a researcher can access the computer server from anywhere through the above network system to design formulations and processes. On the other hand, researchers at site A can access the computer server through the network system to design formulations and processes, and researchers at site B can access the server to obtain the formulation and process design data, thus achieving collaborative research across regions, facilitating centralized management of experimental equipment, and enabling R&D and implementation in different regions.
  • Chinese Patent Application No. 200610100921.0, “Computer Aided Graphical Experimental Design System and Method”.
  • the data obtained from the formulation and process design using the combined sample library design method of the present invention can also be stored on a server for sharing by different researchers.
  • data sharing can be done through the Portal system designed by the applicant.
  • researchers can also communicate through the Portal system and use the combined sample library design method of the present invention for formulation and process design.
  • Chinese Patent Application No. 200610100921.0 “Computer Aided Graphical Experimental Design System and Method”.
  • This example shows how to select qualified samples from pseudo samples consisting of two components (cerium and iron) generated by Monte Carlo simulation.
  • the cerium variable V_Ce takes a value between 0 and 1
  • the iron variable V_Fe also takes a value between 0 and 1.
  • the Monte Carlo simulation draws the V_Ce and V_Fe values at random from a uniform distribution between 0 and 1; the values of V_Ce are generated independently of the values of V_Fe, and at this stage no relationship or constraint is imposed between the two variables.
  • The result of the Monte Carlo simulation is a population of pseudo samples. All of the points shown in Figure 1 (hollow, gray, and black) together make up the set of pseudo samples.
  • the first constraint is defined as 0.2 ≤ V_Ce ≤ 0.8 and 0.2 ≤ V_Fe ≤ 0.8.
  • when the selection process applies the first constraint, the selected pseudo samples are displayed as black or gray dots in Figure 1.
  • the second constraint is defined as 1 − Δ ≤ V_Ce + V_Fe ≤ 1 + Δ.
  • the pseudo samples that satisfy both constraints simultaneously are displayed as black dots in Figure 1.
  • through the two constraints, the Monte Carlo simulation brings prior experience into the design and yields definite information about the sample library being designed. For example, because the number of pseudo samples satisfying both constraints is known, the number of qualified samples is known, and, as shown in Table I, so is the composition ratio of each qualified sample. Table I lists the pseudo sample values produced by the Monte Carlo simulation: the numbers in italics are pseudo sample values that satisfy the first constraint, and the numbers in boxes are those that satisfy both the first and second constraints.
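The selection in this example can be illustrated with a short Python sketch. It is not the applicant's program: the function name, the sample count of 1000, the seed, the tolerance value `delta=0.05`, and the use of non-strict inequalities are all assumptions made for illustration. It draws independent uniform pairs, then sorts them into the three groups described for Figure 1 (points rejected by the first constraint, points passing only the first constraint, and points passing both).

```python
import random

def classify_pseudo_samples(n, delta, seed=0):
    """Draw n independent (V_Ce, V_Fe) pairs and sort them by constraint.

    Returns three lists: pairs failing the first constraint (hollow points),
    pairs passing only the first constraint (gray points), and pairs passing
    both constraints (black points, the qualified samples).
    """
    rng = random.Random(seed)
    rejected, first_only, qualified = [], [], []
    for _ in range(n):
        # Unconstrained pseudo sample: two independent uniform draws in [0, 1).
        v_ce, v_fe = rng.random(), rng.random()
        # First constraint: each variable inside the 0.2 to 0.8 window.
        in_box = 0.2 <= v_ce <= 0.8 and 0.2 <= v_fe <= 0.8
        # Second constraint: the sum lies within delta of 1 (delta is assumed).
        near_one = 1 - delta <= v_ce + v_fe <= 1 + delta
        if not in_box:
            rejected.append((v_ce, v_fe))
        elif not near_one:
            first_only.append((v_ce, v_fe))
        else:
            qualified.append((v_ce, v_fe))
    return rejected, first_only, qualified

rejected, first_only, qualified = classify_pseudo_samples(n=1000, delta=0.05)
print(len(rejected), len(first_only), len(qualified))
```

The `qualified` list holds the composition of every qualified sample, which is the information the example attributes to Table I.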
  • This example shows how to select qualified samples from pseudo samples composed of four components (cerium, iron, tungsten, and nickel) generated by Monte Carlo simulation.
  • the variables for cerium, iron, tungsten, and nickel (V_Ce, V_Fe, V_W, and V_Ni) each take a value between 0 and 1.
  • in the Monte Carlo simulation, each variable is assigned a randomly generated value drawn from a uniform distribution between 0 and 1. The randomly generated values are independent of each other and are not subject to any constraint.
  • the result of the Monte Carlo simulation is a sample point (pseudo sample) in a four-dimensional space, and the projection of the four-dimensional sample point in three-dimensional space is as shown in FIG.
  • this Monte Carlo simulation which takes into account two constraints, provides deterministic information about the design of a sample library consisting of four components with these two constraints.
  • the qualified sample ratio can be calculated by dividing the number of pseudo samples that satisfy the two constraints by the total number of pseudo samples. Because the number of pseudo samples satisfying the two constraints is known, the number of qualified samples is known. The component ratio of the variables in each qualified sample is recorded, so the composition of every qualified sample is known as well. If each variable is divided at specified grid points, the optimal number of samples can likewise be obtained with the method provided by the present invention.
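The qualified sample ratio is a simple quotient, sketched below in Python for four variables. The two constraints used in this example are not reproduced in the text, so the ones in the sketch (a 0.1 to 0.6 window on every variable and a sum within `DELTA` of 1) are purely hypothetical, as are the function name, the seed, and the count of 20000 pseudo samples.

```python
import random

def qualified_ratio(constraints, n_vars, n_pseudo, seed=0):
    """Return (ratio, qualified) for n_pseudo unconstrained pseudo samples.

    ratio is the number of pseudo samples passing every constraint divided
    by the total number of pseudo samples; qualified lists those samples.
    """
    rng = random.Random(seed)
    qualified = []
    for _ in range(n_pseudo):
        # One pseudo sample: n_vars independent uniform values in [0, 1).
        sample = tuple(rng.random() for _ in range(n_vars))
        if all(check(sample) for check in constraints):
            qualified.append(sample)  # record the component ratios
    return len(qualified) / n_pseudo, qualified

# Hypothetical constraints, NOT taken from the patent text.
DELTA = 0.1
constraints = [
    lambda v: all(0.1 <= x <= 0.6 for x in v),   # window on each variable
    lambda v: 1 - DELTA <= sum(v) <= 1 + DELTA,  # sum close to 1
]

ratio, qualified = qualified_ratio(constraints, n_vars=4, n_pseudo=20000)
print(f"qualified ratio ~ {ratio:.4f} ({len(qualified)} of 20000)")
```

Passing the constraints as a list of predicates keeps the sampler unchanged when the number of components or the constraints themselves change.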
  • This embodiment illustrates a computer program that allows a user to enter information through a graphical user interface and perform calculations and simulations (including Monte Carlo simulation) to design a qualified sample library.
  • the graphical user interface allows the user to select the components required to design the sample.
  • component A may be any one of the group consisting of vanadium (V), niobium (Nb), and molybdenum (Mo); the variable of component A (V_a) ranges from 0 to 1 (shown in FIG. 5 as 0.00 to 1.00), and this range is divided into 10 segments (as shown in FIG. 5).
  • component A is assigned a variable (V_a) that takes a value between 0 and 1 (see Figure 6); likewise, components B and C are given corresponding variables (V_b and V_c) (see Figure 6).
  • the graphical user interface allows the user to specify constraints on a single variable or among multiple variables.
  • the constraint includes an error term (the constraint tolerance), which is given as 0.01 in this example (see Figure 6).
  • the graphical user interface further allows the user to decide how to estimate the optimal number of eligible samples.
  • the graphical user interface provides calculations at six different levels of accuracy. In the exact calculation, pseudo samples are first generated according to Formula 10 of the present invention without any constraint; the computer then checks each pseudo sample and selects only those that satisfy the constraints. In this example, 198 samples satisfy the constraints, and the component ratios of all 198 qualified samples are obtained.
  • for the other levels, an accuracy between -100% and 100% is considered very low, between -30% and 30% low, between -10% and 10% medium, between -3% and 3% high, and between -1.0% and 1.0% very high.
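One way to picture the exact calculation is as an exhaustive check of every grid point, sketched below in Python. Formula 10 and the constraints of Figure 6 are not reproduced in this text, so the sketch makes its own assumptions: three variables, each divided into 10 segments, and a single hypothetical constraint that the variables sum to 1 within a tolerance of 0.01. Its count therefore need not match the 198 samples reported for this example. Exact fractions are used so that grid values such as 0.1 do not suffer floating-point rounding.

```python
from fractions import Fraction
from itertools import product

def enumerate_grid(n_components, segments, constraint):
    """Return every grid point that satisfies `constraint`.

    Each variable takes the values 0, 1/segments, ..., 1, and all
    combinations are generated before any constraint is applied.
    """
    levels = [Fraction(k, segments) for k in range(segments + 1)]
    return [p for p in product(levels, repeat=n_components) if constraint(p)]

TOLERANCE = Fraction(1, 100)  # the 0.01 constraint tolerance

def sums_to_one(point):
    # Hypothetical constraint: component fractions sum to 1 within TOLERANCE.
    return abs(sum(point) - 1) <= TOLERANCE

qualified = enumerate_grid(n_components=3, segments=10, constraint=sums_to_one)
print(len(qualified))  # prints 66
```

Because the full grid has (segments + 1) ** n_components points, this exhaustive approach grows quickly with the number of components, which is presumably why the interface also offers faster, less accurate estimates.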
  • This embodiment shows a computer program that obtains a specified number of eligible sample points.
  • the computer program allows the user to enter information through a graphical user interface and perform calculations and simulations (including Monte Carlo simulations) to derive the component ratios for each of these sample points.
  • Figure 9 shows that the graphical user interface allows the user to enter a specified total of 125 sample points, each with four components.
  • the graphical user interface also allows the user to specify the desired component (see Figure 10) and define the constraints for the variable (see Figure 11).
  • the simulation is then started, and each pseudo sample is checked for compliance with the specified constraints. The simulation stops when the number of qualified samples reaches 125, yielding 125 sample points with four components (elements Pd, …)
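The stop-at-target behaviour of this embodiment can be sketched as a loop that keeps generating pseudo samples until the requested number of qualified samples has been collected. This is a minimal Python illustration, not the applicant's program: the function names, the seed, the `max_trials` safety limit, and the constraint itself (a sum within 0.05 of 1) are assumptions, since the constraints of Figure 11 are not reproduced in the text.

```python
import random

def design_library(target, n_vars, is_qualified, max_trials=1_000_000, seed=0):
    """Generate pseudo samples until `target` qualified samples are found.

    Returns (library, trials). The loop also stops after max_trials pseudo
    samples so that an unsatisfiable constraint cannot run forever.
    """
    rng = random.Random(seed)
    library, trials = [], 0
    while len(library) < target and trials < max_trials:
        trials += 1
        # Generate one unconstrained pseudo sample.
        candidate = tuple(rng.random() for _ in range(n_vars))
        # Keep it only if it meets the constraints.
        if is_qualified(candidate):
            library.append(candidate)
    return library, trials

def example_constraint(v):
    # Hypothetical constraint: the four component fractions sum to about 1.
    return abs(sum(v) - 1.0) <= 0.05

library, trials = design_library(target=125, n_vars=4,
                                 is_qualified=example_constraint)
print(len(library), "qualified samples after", trials, "pseudo samples")
```

Each tuple in `library` gives the component ratios of one sample point; the ratio of `len(library)` to `trials` is also an estimate of the qualified sample ratio.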

Landscapes

  • Chemical & Material Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computing Systems (AREA)
  • Crystallography & Structural Chemistry (AREA)
  • Medicinal Chemistry (AREA)
  • Library & Information Science (AREA)
  • Physics & Mathematics (AREA)
  • Biochemistry (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biotechnology (AREA)
  • Evolutionary Biology (AREA)
  • Medical Informatics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method and system for designing a combinatorial sample library that makes the most of existing knowledge, reduces the time spent on sample experiments, or increases the useful information obtained from sample experiments within a given time. The method comprises the following steps: (1) gathering the components that make up the sample; (2) assigning to each component a variable that takes values within certain intervals; (3) setting at least one constraint condition on at least one variable; (4) generating a pseudo sample; (5) checking whether the pseudo sample satisfies the constraints; (6) repeating steps (4) and (5) until at least one qualified sample is obtained. The method is efficient and accurate and avoids systematic deviations in the design.
PCT/CN2006/002691 2005-10-13 2006-10-13 Procede et systeme de conception d'une bibliotheque d'echantillons WO2007041966A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200510030502.X 2005-10-13
CNB200510030502XA CN100558948C (zh) 2005-10-13 2005-10-13 组合样品库的设计方法和系统

Publications (1)

Publication Number Publication Date
WO2007041966A1 true WO2007041966A1 (fr) 2007-04-19

Family

ID=37942319

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2006/002691 WO2007041966A1 (fr) 2005-10-13 2006-10-13 Procede et systeme de conception d'une bibliotheque d'echantillons

Country Status (2)

Country Link
CN (1) CN100558948C (fr)
WO (1) WO2007041966A1 (fr)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107844985A (zh) * 2016-09-21 2018-03-27 腾讯科技(深圳)有限公司 一种概率产品数据处理方法、系统及终端

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1603789A (zh) * 2004-11-08 2005-04-06 武汉大学 一种测定镍基高温合金相含量的方法
US20050182572A1 (en) * 2004-02-13 2005-08-18 Wollenberg Robert H. High throughput screening methods for fuel compositions

Also Published As

Publication number Publication date
CN1948559A (zh) 2007-04-18
CN100558948C (zh) 2009-11-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06791257

Country of ref document: EP

Kind code of ref document: A1