WO2007041966A1 - Method and system for designing a composite sample library - Google Patents

Method and system for designing a composite sample library

Publication number
WO2007041966A1
Authority
WO
WIPO (PCT)
Prior art keywords
sample
distribution
variable
samples
pseudo
Application number
PCT/CN2006/002691
Other languages
French (fr)
Chinese (zh)
Inventor
Xinlei Hua
Xichen Feng
Original Assignee
Accelergy Shanghai R & D Center Co., Ltd
Application filed by Accelergy Shanghai R & D Center Co., Ltd

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B35/00 ICT specially adapted for in silico combinatorial libraries of nucleic acids, proteins or peptides
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16C COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
    • G16C20/00 Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
    • G16C20/60 In silico combinatorial chemistry
    • G16C20/62 Design of libraries

Definitions

  • the present invention relates to high-throughput experimentation, an efficient testing methodology, and more particularly to the design of combined sample libraries within that field.
  • the present invention provides a method for integrating empirical knowledge to design a combined sample library.
  • this empirical knowledge can be embodied as components of the sample, variables associated with the components, and constraints on these variables.
  • the invention also provides a method of designing a sample library comprising the following steps:
  • the constraint of the variable is the relationship between variables determined by experience or previously known knowledge.
  • the empirical knowledge is a physical or chemical natural law represented by a variable.
  • the pseudo sample can be generated by random sampling.
  • the random sampling described above is performed by employing a set of variables, each of which corresponds to a component and randomly takes values within a certain interval.
  • An example of random sampling includes Monte Carlo simulations.
  • the random values are generated by a random number generator.
  • the random values are related to a probability distribution or probability density. The probability distribution is a uniform distribution or a non-uniform distribution.
  • the non-uniform distribution includes the Bernoulli distribution, the beta distribution, the chi-square distribution, the exponential distribution, the F distribution, the gamma distribution, the Gaussian distribution, the normal distribution (e.g., lognormal, multivariate normal, and univariate normal distributions), the non-central chi-square distribution, the non-central F distribution, the binomial distribution, the negative binomial distribution, the multinomial distribution, the Pareto distribution, the Poisson distribution, the Student's t distribution, and the Tsallis distribution.
  • the probability distribution includes a uniform distribution, a normal distribution, and a Gaussian distribution.
  • Yet another aspect of the present invention provides a method of obtaining a desired number of samples in a sample library, the method comprising the steps of:
  • the present invention provides a method of measuring the optimal number of samples that need to be designed and/or synthesized, the method comprising the steps of:
  • the method further includes the step of determining a ratio of qualified samples divided by the number of pseudo samples to obtain a qualified sample ratio.
  • the invention also provides a computer product comprising computer software.
  • once the computer software is running, the methods and calculations of the present invention can be performed.
  • the computer software can perform random sampling.
  • Figure 1 is a schematic representation of two-component samples produced by the Monte Carlo simulation method, each sample consisting of cerium (Ce) and iron (Fe). All the dots (hollow, gray, and black) in the figure represent pseudo samples whose cerium and iron variables are generated independently and randomly from a uniform distribution; there is no constraint between the cerium variable and the iron variable.
  • the gray dots and black dots in the figure represent pseudo samples that satisfy the first constraint.
  • the black dots represent pseudo samples that satisfy both the first and second constraints (detailed reference to Example 1).
  • Figure 2 is a three-dimensional view of four-component samples produced by the Monte Carlo simulation method, each sample consisting of cerium (Ce), iron (Fe), tungsten (W), and nickel (Ni). All points represent pseudo samples in which each of the four variables is generated randomly and independently, without any constraint among them.
  • Figure 3 is a schematic illustration of the pseudo sample in Figure 2 that satisfies the first constraint.
  • Figure 4 is a schematic illustration of the pseudo samples of Figure 3 that further satisfy the second constraint.
  • Figure 5 is a graphical user interface (GUI) that allows a user to design a multi-component sample library that provides a variable, a range of variables, and a desired segmentation for each component.
  • Figure 6 is a graphical user interface given after selecting a component and corresponding variables.
  • Figure 7 is a graphical user interface that allows a user to specify one or more constraints on a variable.
  • the graphical user interface shown in Figure 8 allows the user to choose 1) whether to perform the Monte Carlo simulation method; 2) how to perform the Monte Carlo simulation method.
  • the graphical user interface shown in Figure 9 allows the user to enter the specified number of samples to be obtained and the specified number of components for the input sample.
  • the graphical user interface shown in Figure 10 allows the user to specify each component.
  • the graphical user interface shown in Figure 11 allows the user to define constraints using variables.
  • the present invention relates to a strategy for designing a combined sample library, so that the sample library can be designed, synthesized, screened, and measured.
  • One aspect of the invention provides a method of designing a sample library comprising providing a plurality of components of a sample.
  • "combined sample library" refers to a collection comprising a plurality of samples; "sample" refers to a material comprising a plurality of components.
  • "component" refers to a substance, such as an element, a molecule, a compound, or a material, or a combination of these substances.
  • a sample comprises n different components, C_1, C_2, C_3, ..., C_i, ..., C_n, where n is an integer and refers to the number of different components in the sample.
  • the mass of each component C_i is expressed as MW_i, where i ∈ {0, 1, 2, ..., n}; the composition number in the sample is expressed as X_i, and the corresponding composition ratio is expressed as R_i.
  • Mass MWi refers to the molecular weight or atomic weight of the component.
  • composition number refers to the number of the i-th component in the sample, so the sample can be expressed as (C_1)_{X_1}(C_2)_{X_2}...(C_i)_{X_i}...(C_n)_{X_n}, where i ∈ {0, 1, 2, ..., n}. The composition ratio can be characterized as the relative weight of one component in the sample, which can be expressed by Equation 1: $R_i = \frac{MW_i X_i}{\sum_{j} MW_j X_j}$ (Equation 1)
  • composition number X_i can also refer to the molar ratio of the i-th component in the sample.
  • the composition ratio can also be expressed as the mole fraction of a component in the sample, which is between 0 and 1 and can be defined by the following Equation 2: $R_i = \frac{X_i}{\sum_{j} X_j}$ (Equation 2)
  • the composition ratio may further be expressed as the percentage of a component in the sample, with a value between 0% and 100%.
  • the glucose molecule C6H12O6 can be considered a sample containing three components: carbon (C), hydrogen (H), and oxygen (O), each component having a composition number: C is 6, H is 12, and O is 6.
  • the mass (MW) of each component can be taken from the atomic mass of each atom: C is 12, H is 1, and O is 16. Therefore, the (weight) composition ratio of C is 0.4 or 40% (12*6/(12*6 + 1*12 + 16*6)); that of H is 0.067 or 6.7%; and that of O is 0.533 or 53.3%.
  • the sum of the composition ratios of the components is 1.
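The glucose arithmetic above can be checked with a short sketch. It assumes the weight composition ratio is each component's mass times its composition number, divided by the same product summed over all components, as in the worked numbers; the function names `weight_ratios` and `mole_fractions` are illustrative and not part of the source.

```python
def weight_ratios(masses, counts):
    """Weight composition ratio of each component: MW_i * X_i / sum_j(MW_j * X_j)."""
    total = sum(masses[c] * counts[c] for c in counts)
    return {c: masses[c] * counts[c] / total for c in counts}


def mole_fractions(counts):
    """Mole fraction of each component: X_i / sum_j(X_j)."""
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


# Glucose, C6H12O6, with the rounded atomic masses used in the text.
glucose_counts = {"C": 6, "H": 12, "O": 6}
atomic_masses = {"C": 12, "H": 1, "O": 16}

ratios = weight_ratios(atomic_masses, glucose_counts)
fractions = mole_fractions(glucose_counts)

for element in glucose_counts:
    print(element, round(ratios[element], 3), round(fractions[element], 3))
```

Both dictionaries sum to 1, which is the property the sum constraint on composition ratios relies on.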
  • Another feature of the combined sample library is that each sample in the sample library consists of the same type of components, but these components have different composition ratios.
  • a method of designing a combined sample library includes providing a variable for each component of the multi-component sample.
  • the variable corresponds to the component in the sample.
  • the variable V is a random value in the interval [v_min, v_max], where v_min is not less than 0, v_max is not more than 1, and v_min < v_max.
  • the interval may be [0, 1]. If the variable V takes a value from a discrete set {V_1, V_2, ..., V_x}, then V is discrete, and each discrete value falls within the interval [v_min, v_max] (e.g., [0, 1]).
  • if V is a random value in the interval [v_min, v_max], it may also be a continuous value.
  • the random value assigned to a first variable does not depend on the value assumed by a second variable.
  • the process of setting a random value from the interval of the same variable depends on the probability or probability density of the possible values of the variable.
  • the setting of the random value depends on the particular probability of the respective discrete value in the interval.
  • V_i is the variable of a first component C_i
  • V_j is the variable of a second component C_j.
  • V_i can be set to a random value in the interval [V_i,min, V_i,max]
  • V_j can be set to a random value in the interval [V_j,min, V_j,max]
  • the value of V_i is independent of V_j.
  • C_i and C_j are components in a sample consisting of C_1, C_2, ..., C_i, ..., C_j, ..., C_n, where i, j ∈ {0, 1, 2, ..., n}
  • the resulting variable V_i becomes the composition ratio R_i of the component C_i
  • the resulting variable V_j becomes the composition ratio R_j, and the sum of the variables of all the components in the sample satisfies the following Equation 4: $\sum_{i} V_i = 1$ (Equation 4)
  • Another aspect of the invention is that the provided method of designing the combined sample library includes providing or setting at least one constraint for at least one variable in the sample.
  • the term "constraint" refers to the condition of at least one variable or the relationship between variables.
  • a constraint is a condition that one or more variables in a sample must satisfy.
  • the set of variables {V_i} in a valid or qualified sample must satisfy at least one constraint or a specific set of constraints. For example, assume that the sample includes components C_1, C_2, ..., C_n, and each component C_i has a variable V_i, where i ∈ {0, 1, 2, ..., n}; then, in a valid sample, the sum of the component variables must satisfy the following constraint, Equation 5: $1 - \Delta \le \sum_{i} V_i \le 1 + \Delta$ (Equation 5)
  • Δ is the error (such as the constraint tolerance or the constraint deviation), and Δ is a value between 0 and 0.2.
  • Δ is 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, or 0.10.
  • the constraint approximates the equation as shown in Equation 5.
  • empirical knowledge about the components of the sample can be embodied as a relationship between multiple variables, which can also be understood as a constraint.
  • the ratio of the composition ratio of C_i to that of C_j should be 2:1, where i ∈ {0, 1, 2, ..., n}, j ∈ {0, 1, 2, ..., n}, and i ≠ j.
  • the sum of the composition ratios of C_i and C_j is required to be X, where X is a value between 0 and 1.
  • the method of designing the combined sample library provided by the present invention involves generating a pseudo sample.
  • "Pseudo sample" refers to a hypothetical multi-component sample in which each component has an independent variable, such that assigning a value to one variable is an event independent of the value assigned to any other variable, and none of the variables of the pseudo sample is subject to any restriction. In other words, its variables may or may not satisfy the constraints.
  • a pseudo sample includes C_1, C_2, ..., C_i, ..., C_j, ..., C_n, where i, j ∈ {0, 1, 2, ..., n}
  • V_i and V_j are random values in the interval [0, 1]; V_i may take one value in [0, 1], and V_j may take another value in [0, 1].
  • the sum of the values of these variables does not need to meet the requirements of Equations 4 and 5.
  • a pseudo sample may or may not correspond to a real physical sample; a pseudo sample is a sample point whose variables take independent, assumed values. A large number of pseudo samples therefore constitutes a sample space.
  • Pseudo samples can be generated using random sampling.
  • the random sampling method is a method of generating a sample point by randomly assigning a value to each component of a sample point.
  • methods of generating random values in the interval [V_min, V_max] are well known, where V_min is not less than 0 and V_max is not greater than 1.
  • the algorithms and programs for random number generators are well known in the field of computer science; see D. E. Knuth, The Art of Computer Programming: Seminumerical Algorithms (Vol.
  • the randomly generated value is the outcome of an event, namely a possible value assigned to the variable (V) within a particular interval [v_min, v_max], where the probability of that event depends on the probability density or probability distribution of the variable. The variable is therefore further defined by the probability function that assigns probabilities to the values contained in its interval. For example, a discrete variable can be defined by assigning a probability to each discrete value in the interval, and a continuous variable can be defined by assigning a probability distribution over the interval that includes all possible values of the variable.
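The two ways of defining a variable described above (a probability per discrete value, or a distribution over an interval) can be sketched with Python's standard `random` module. The helper names, the five discrete values, and their probabilities are hypothetical choices for illustration only.

```python
import random


def sample_continuous(rng, v_min=0.0, v_max=1.0):
    """Continuous variable: uniform distribution over [v_min, v_max]."""
    return rng.uniform(v_min, v_max)


def sample_discrete(rng, values, probabilities):
    """Discrete variable: each allowed value carries its own probability."""
    return rng.choices(values, weights=probabilities, k=1)[0]


def generate_pseudo_samples(rng, n_samples, n_components):
    """Each variable is drawn independently; no constraint is applied."""
    return [
        tuple(sample_continuous(rng) for _ in range(n_components))
        for _ in range(n_samples)
    ]


rng = random.Random(42)  # fixed seed so the sketch is reproducible
pseudo_samples = generate_pseudo_samples(rng, n_samples=1000, n_components=4)

# Hypothetical discrete variable with non-uniform probabilities.
allowed_values = [0.0, 0.25, 0.5, 0.75, 1.0]
allowed_probs = [0.1, 0.2, 0.4, 0.2, 0.1]
discrete_value = sample_discrete(rng, allowed_values, allowed_probs)
```

Because every variable is drawn on its own, the values in one pseudo sample generally do not sum to 1; that condition is only imposed later, when qualified samples are selected.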
  • the probability distribution used here refers to the arrangement of the values of the variables that reflect the frequency of their observations or theoretical occurrences.
  • Probability distributions well known in the art include uniform distributions and non-uniform distributions. Non-uniform distributions include the Bernoulli distribution, beta distribution, chi-square distribution, exponential distribution, F distribution, gamma distribution, Gaussian distribution, normal distribution (e.g., lognormal, multivariate normal, and univariate normal distributions), non-central chi-square distribution, non-central F distribution, binomial distribution, negative binomial distribution, multinomial distribution, Pareto distribution, Poisson distribution, Student's t distribution, Tsallis distribution, and any combination of the above distributions.
  • the random variables are assigned according to a non-uniform probability distribution, for example, a normal distribution, a Poisson distribution, or a Gaussian distribution.
  • the randomly generated value of the variable is assigned according to a uniform distribution; a uniformly distributed random variable can therefore take any value in the interval [V_min, V_max] (V_min ≥ 0, V_max ≤ 1) with the same probability.
  • uniformly distributed random values can be generated by a random number generator (e.g., a linear congruential generator), which produces a sequence by the recurrence $X_{k+1} = (a X_k + c) \bmod m$.
  • a, c and m are pre-set constants: a is the multiplier, c is the increment, and m is the modulus.
  • see Park and Miller, "Random Number Generators: Good Ones Are Hard to Find", Comm. ACM 31: 1192-1201, 1988.
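A linear congruential generator is small enough to write out in full. The sketch below uses the "minimal standard" constants from the Park and Miller paper cited above (a = 16807, c = 0, m = 2^31 - 1); the source does not say which constants the invention uses, so these are an assumption, and the function names are illustrative.

```python
from itertools import islice

MODULUS = 2**31 - 1  # m: a Mersenne prime
MULTIPLIER = 16807   # a
INCREMENT = 0        # c


def lcg(seed, a=MULTIPLIER, c=INCREMENT, m=MODULUS):
    """Yield integers from the recurrence X_{k+1} = (a * X_k + c) mod m."""
    x = seed
    while True:
        x = (a * x + c) % m
        yield x


def lcg_uniform(seed, n, a=MULTIPLIER, c=INCREMENT, m=MODULUS):
    """Scale the first n integers of the sequence into the interval (0, 1)."""
    return [x / m for x in islice(lcg(seed, a, c, m), n)]


first_two = list(islice(lcg(seed=1), 2))
uniforms = lcg_uniform(seed=1, n=5)
print(first_two, uniforms)
```

With c = 0 the seed must be non-zero, otherwise the sequence stays at zero.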
  • the random number generator also includes the shift-register sequence generator described by Kirkpatrick and Stoll, "A Very Fast Shift-Register Sequence Random Number Generator" (Journal of Computational Physics 40: 517-526, 1981). Furthermore, random number generators also include quasi-random number generators; see Press and Teukolsky, "Quasi Random Numbers" (Computers in Physics 3: 76-79, 1989).
  • Non-uniform random values, such as normal or Gaussian random values, can be generated by several methods.
  • One of the methods uses a transformation function, such as the well-known Box-Muller transform, to convert uniformly distributed random variables into a new set of non-uniformly distributed random variables (for example, Gaussian or normal); see Box & Muller, "A Note on the Generation of Random Deviates", Annals Math. Stat. 29: 610-611, 1958.
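As a minimal sketch of the Box-Muller transform mentioned above: two independent uniform values are mapped to two independent standard normal values. The helper names and the sample size are assumptions made for illustration.

```python
import math
import random


def box_muller(u1, u2):
    """Map two uniforms (u1 in (0, 1], u2 in [0, 1)) to two standard normals."""
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


def normal_samples(rng, n):
    """Draw n standard normal values from a uniform generator."""
    values = []
    while len(values) < n:
        u1 = 1.0 - rng.random()  # shift [0, 1) to (0, 1] so log(u1) is defined
        u2 = rng.random()
        values.extend(box_muller(u1, u2))
    return values[:n]


rng = random.Random(0)
z = normal_samples(rng, 20000)
mean = sum(z) / len(z)
variance = sum((v - mean) ** 2 for v in z) / len(z)
print(round(mean, 3), round(variance, 3))
```

A normal value with a different mean and standard deviation follows by scaling and shifting; values falling outside a variable's interval [V_min, V_max] would still need to be rejected or clipped, which the source does not specify.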
  • the random sampling method comprises a Monte Carlo method or simulation.
  • the term "Monte Carlo method" or "Monte Carlo simulation" here refers to a random sampling method used to study a problem and obtain an approximate, probabilistic solution to it.
  • the term “Monte Carlo method” or “Monte Carlo simulation” as used herein refers in particular to the process of generating a random event (such as a randomly occurring value of any given variable). This process is usually done by computer algorithms, which are repeated multiple times, and all test results are analyzed and calculated to provide an approximate solution.
  • for Monte Carlo simulations, see Metropolis and Ulam, "The Monte Carlo Method", Journal of the American Statistical Association 44: 335-341, 1949; Sobol, "The Monte Carlo Method", The University of Chicago Press, 1974; and Mooney, "Monte Carlo Simulation", Sage University Paper, 1997.
  • the Monte Carlo method is constantly evolving in the field. For example, the method was initially applied to estimating the value of π by throwing darts at a standard board (a circle circumscribed by a square).
  • the ratio of the number of darts landing inside the circle to the number landing inside the square approximates a fraction of π (namely π/4); see Ross, "A First Course in Probability", 2nd Edition, Macmillan, 1976.
  • Equation 6 and Equation 7 below:
  • A is the number of points in V(x)
  • B is the number of all points produced in the range box
  • C is the area of the range box.
  • the ratio A/B relates V(x) to the area occupied by the range box in proportion, i.e. V(x) ≈ (A/B) · C.
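The dart-throwing estimate can be sketched as a hit-or-miss area calculation: random points are scattered over a range box, and the fraction that lands inside the region, multiplied by the box area, approximates the region's area. The function name `estimate_area` and the point count are illustrative assumptions; the variable comments map onto the A, B (taken here to be the total number of points), and C quantities described above.

```python
import random


def estimate_area(inside, box, n_points, rng):
    """Hit-or-miss Monte Carlo estimate of the area of a region inside a box."""
    (x_low, x_high), (y_low, y_high) = box
    box_area = (x_high - x_low) * (y_high - y_low)  # C: area of the range box
    hits = 0                                        # A: points inside the region
    for _ in range(n_points):                       # B: all points in the box
        x = rng.uniform(x_low, x_high)
        y = rng.uniform(y_low, y_high)
        if inside(x, y):
            hits += 1
    return box_area * hits / n_points


def in_unit_circle(x, y):
    return x * x + y * y <= 1.0


rng = random.Random(1)
# A unit circle inscribed in the square [-1, 1] x [-1, 1] has area pi.
pi_estimate = estimate_area(in_unit_circle, ((-1.0, 1.0), (-1.0, 1.0)), 200_000, rng)
print(pi_estimate)
```

The same routine works for any region for which membership can be tested, which is how it carries over to counting pseudo samples that satisfy constraints.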
  • Monte Carlo simulation may involve generating random variable values that are weighted by empirical knowledge. For example, empirical knowledge about components (or component ratios) may require assigning variables different probability densities in different specific intervals (continuous variables), or assigning different probabilities to different values (discrete variables).
  • Another example of Monte Carlo simulation includes Markov chain operations. A Markov chain is a sequence of random values in which the probability of each event depends on the value generated at the previous step; see Frenkel & Smit, "Understanding Molecular Simulation: From Algorithms to Applications", Academic Press, 1996.
  • Another aspect of the invention relates to a method of selecting a qualified sample from a pseudo sample.
  • "qualified sample" refers to a pseudo sample, produced by the method described in the present invention, whose variables satisfy one or more specific constraints; a pseudo sample that does not satisfy them is referred to as a non-qualified sample.
  • the pseudo samples are produced by a random sampling method (e.g., Monte Carlo simulation), i.e. by a large number of trials.
  • Each pseudo sample is examined (inspected), for example, using a computer algorithm to determine if it meets a particular constraint or constraints.
  • a pseudo sample that satisfies the constraint is selected and stored as a qualified sample.
  • the values associated with each qualified sample are recorded as a vector and used as the component ratios for designing and synthesizing that qualified sample in the sample library, since in a qualified sample the values are the composition ratios.
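The selection step can be sketched as a filter over the pseudo samples. The four components follow the Ce/Fe/W/Ni illustration in the figures, but the tolerance value and the second constraint (a minimum share for the first component) are hypothetical stand-ins for empirical knowledge, as are the function names.

```python
import random

COMPONENTS = ("Ce", "Fe", "W", "Ni")
DELTA = 0.05  # hypothetical constraint tolerance


def sum_constraint(sample, delta=DELTA):
    """The variables must sum to 1 within the tolerance."""
    return 1 - delta <= sum(sample) <= 1 + delta


def min_share_constraint(sample, index=0, minimum=0.1):
    """Hypothetical empirical rule: the first component is at least 10%."""
    return sample[index] >= minimum


def select_qualified(pseudo_samples, constraints):
    """Keep the pseudo samples (stored as vectors) that satisfy every constraint."""
    return [s for s in pseudo_samples if all(check(s) for check in constraints)]


rng = random.Random(7)
pseudo_samples = [tuple(rng.random() for _ in COMPONENTS) for _ in range(5000)]
qualified = select_qualified(pseudo_samples, [sum_constraint, min_share_constraint])
print(len(qualified), "qualified out of", len(pseudo_samples))
```

Each retained tuple can be read directly as the composition ratios of one sample to synthesize.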
  • Another aspect of the invention provides a method of producing a given number of samples in a sample library.
  • the method includes the following steps:
  • Another aspect of the present invention discloses a method of calculating the qualified sample ratio, where the term "qualified sample ratio (R_qs)" means the fraction of pseudo samples, in a random sampling method, whose variables satisfy the one or more constraints.
  • the qualified sample ratio (R_qs) can be estimated by dividing the number of qualified samples (N_qs) in the random sampling method by the number of pseudo samples (N_ps) (Equation 8): $R_{qs} = \frac{N_{qs}}{N_{ps}}$ (Equation 8)
  • when N_ps increases, the estimation error becomes smaller, following Equation 9: $\text{Accuracy} \propto \frac{1}{\sqrt{N}}$ (Equation 9), where N is the number of random simulation trials (such as Monte Carlo trials). When a large number of Monte Carlo trials are performed, 1/√N continues to decrease, so the variation in the qualified sample ratio decreases and the accuracy increases. In other words, when a sufficient number of trials are performed, the qualified sample ratio can be estimated with relatively high accuracy and precision. For example, for a constraint 1 - Δ ≤ Σ V_i ≤ 1 + Δ, the more qualified samples the Monte Carlo simulation obtains, the more accurate the estimate.
  • when only one qualified sample is obtained, the accuracy is on the order of 100%; when N_qs reaches 10, the accuracy is between -30% and 30%; when N_qs reaches 10^2, the accuracy is between -10% and 10%; when N_qs reaches 10^3, the accuracy is between -3% and 3%; when N_qs reaches 10^4, the accuracy is between -1% and 1%.
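The accuracy figures quoted above are consistent with a relative error of roughly one over the square root of the number of qualified samples. The sketch below reproduces them under that assumption; treating N in Equation 9 as the qualified-sample count, and the function names, are interpretive choices rather than statements from the source.

```python
import math


def qualified_sample_ratio(n_qualified, n_pseudo):
    """R_qs = N_qs / N_ps."""
    return n_qualified / n_pseudo


def relative_accuracy(n_qualified):
    """Approximate relative error of the estimate, taken as 1 / sqrt(N_qs)."""
    return 1.0 / math.sqrt(n_qualified)


accuracy_table = {n: relative_accuracy(n) for n in (1, 10, 100, 1000, 10000)}
for n, err in accuracy_table.items():
    print(f"N_qs = {n:>5}: about +/-{err:.1%}")

example_ratio = qualified_sample_ratio(n_qualified=250, n_pseudo=10_000)
```

Halving the error therefore takes about four times as many qualified samples, which is the practical cost of tightening the estimate.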
  • the invention also discloses a method of estimating the optimal number of acceptable samples.
  • optimal number of qualified samples herein refers to the number of samples that satisfy a particular one or more constraints and that properly represent the sample space.
  • the optimal number of qualified samples is obtained by detecting all possible pseudo samples produced by discrete variables and identifying pseudo samples that satisfy a particular constraint.
  • a set of discrete values is generated for a specific variable V_i by dividing (uniformly or non-uniformly) the interval [V_i,min, V_i,max] into M parts or cells, thereby producing a defined set of values within the interval.
  • M is a positive integer and can be any number between 1 and 1,000,000.
  • the total number of pseudo samples (Z) can be calculated from the number of grid cells of each variable.
  • each component C_i has a variable V_i, where i ∈ {0, 1, 2, ..., n}, and each V_i is divided into M_i parts, cells, or points, so V_i takes a set of discrete values in the interval [V_min, V_max] (V_min ≥ 0 and V_max ≤ 1). This division is determined by experience with the variables corresponding to the components and can be uniform or non-uniform. If no constraints on the variables are considered or provided, the total number of pseudo samples (Z) based on the set {M_i} represents the sample points of the n-dimensional sample space, and Z can be derived by the following Equation 10: $Z = \prod_{i} M_i$ (Equation 10)
  • all pseudo samples can then be examined; each pseudo sample whose variables satisfy the constraints is selected and stored as a vector, forming the set of qualified samples.
  • the number of eligible samples i.e., the number of vectors described above, is the optimum number of eligible samples in the sample space corresponding to a given set of discrete values.
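For a small grid, the exhaustive search can be written directly. This sketch assumes three components, each taking 11 evenly spaced values from 0 to 1, and a sum constraint with a hypothetical tolerance of 0.05; the function names are illustrative.

```python
from itertools import product
from math import prod


def grid(v_min, v_max, m):
    """Return m evenly spaced values from v_min to v_max inclusive (m >= 2)."""
    step = (v_max - v_min) / (m - 1)
    return [v_min + k * step for k in range(m)]


def enumerate_qualified(grids, constraints):
    """Visit every grid point and keep those that satisfy all constraints."""
    return [s for s in product(*grids) if all(check(s) for check in constraints)]


DELTA = 0.05  # hypothetical tolerance


def sum_constraint(sample):
    return 1 - DELTA <= sum(sample) <= 1 + DELTA


grids = [grid(0.0, 1.0, 11) for _ in range(3)]
z_total = prod(len(g) for g in grids)  # Z is the product of the M_i
qualified = enumerate_qualified(grids, [sum_constraint])
print(z_total, len(qualified))
```

The loop visits all Z points, so its cost grows as the product of the M_i; that growth is what makes a sampling-based estimate attractive for larger libraries.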
  • as the number of components (or variables) increases and the number of segments per component increases, a full search through the sample space can become quite cumbersome. For example, for a sample library consisting of five components in which the interval of each component variable is divided into 100 cells, the sample space is five-dimensional, and the total number of pseudo samples (Z) is 100^5 (or 10^10, i.e. 10,000,000,000). When one or more constraints are introduced, the calculation becomes more complicated. Although it is possible to check whether each of the pseudo samples satisfies the one or more constraints, another approach is to obtain an approximate estimate of the optimal number by performing random sampling as described herein (e.g., Monte Carlo simulation).
  • the random number can be generated based on a set of discrete values, where each discrete value is located in the interval assumed by the variable and has a certain probability.
  • the random number may also be any value in a certain interval having a specific probability distribution. The optimal number of qualified samples therefore depends on the product of Z and R_qs. The optimal number of qualified samples is expected to vary with a set of parameters, including the number of variable segments (M) in the sample, the method of generating random numbers, the statistical distribution of the variables, the Monte Carlo simulation method, the number of Monte Carlo trials, the variable constraints, the tolerance limits, and the required accuracy or precision.
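When the grid is too large to enumerate, the product of Z and R_qs can be estimated by sampling grid points at random. The sketch below uses the five-component, 100-cell case from the text; the cell values, the tolerance, the number of trials, and the function name are assumptions made for illustration.

```python
import random
from math import prod


def estimate_qualified_count(grids, constraints, n_trials, rng):
    """Estimate the number of qualified grid points as Z * R_qs."""
    z_total = prod(len(g) for g in grids)
    hits = 0
    for _ in range(n_trials):
        # Pick one discrete value per variable, each independently.
        sample = tuple(rng.choice(g) for g in grids)
        if all(check(sample) for check in constraints):
            hits += 1
    r_qs = hits / n_trials
    return z_total * r_qs, r_qs, z_total


DELTA = 0.02  # hypothetical tolerance


def sum_constraint(sample):
    return abs(sum(sample) - 1.0) <= DELTA


cell_values = [k / 100 for k in range(1, 101)]  # 100 discrete values per variable
grids = [cell_values] * 5                       # five components, Z = 100^5
rng = random.Random(123)
estimate, r_qs, z_total = estimate_qualified_count(
    grids, [sum_constraint], n_trials=100_000, rng=rng
)
print(z_total, r_qs, round(estimate))
```

Only 100,000 points are tested instead of 10^10, at the price of a statistical error that shrinks as more trials produce more qualified hits.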
  • a computer system in the present invention refers to a computer or computer readable medium that is designed and configured to perform some or all of the methods described herein.
  • the computer employed herein (e.g., a server) can be any of a variety of general-purpose computers, such as personal computers, network servers, workstations, or other computer platforms existing now or developed in the future.
  • computers include, in particular, some or all of the components such as processors, operating systems, computer memories, input devices, and output devices.
  • the computer can further include, for example, a cache, a data backup unit, and other devices.
  • a processor as used herein may include one or more microprocessors, field-programmable gate arrays, or one or more application-specific integrated circuits.
  • processors include, but are not limited to, Intel's Pentium series processors, S-Chip's microprocessors, Sun's workstation system processors, Motorola's personal desktop processors, MIPS Technologies' MIPS processors, Xilinx's top-range field-programmable gate arrays, and other processors.
  • the operating system employed herein includes machine code that, through execution of the processor, coordinates and performs functions of other portions of the computer, and assists the processor in performing functions of various computer programs that may be written in various programming languages.
  • the operating system also provides scheduling, input and output control, file data management, memory management and communication control, and related services, all of which are prior art.
  • Typical operating systems include Microsoft Corporation's Windows operating systems, Unix or Linux operating systems from a variety of vendors, other existing or future operating systems, and combinations of these operating systems.
  • the computer memory used herein can be any of a variety of different types of memory storage devices. Examples include random access memory, magnetic media storage such as fixed hard disks or tapes, optical media such as read/write optical discs, or other accessible storage devices.
  • the memory storage device can be any existing or future device, including a compact disc drive, a tape drive, a removable hard drive, or a disk drive. These types of memory storage devices typically read from or write to a computer program storage medium, such as an optical disc, magnetic tape, removable hard disk, or floppy disk. All of these computer program storage media can be considered computer program products.
  • these computer program products typically store computer software programs and/or data. Computer software programs are typically stored in system memory and/or memory storage.
  • the computer software program of the present invention can be executed by loading it into a system memory and/or memory storage device using some type of input device.
  • all or part of the software program may also be present in a read only memory or similar memory storage device, such device not requiring the software program to be loaded first through the input device.
  • the software program, or portions thereof, can be loaded by the processor into system memory, a cache, or a combination of both in a conventional manner to facilitate execution and the performance of random sampling.
  • the substance handling device may be the material treatment apparatus disclosed in the applicant's international patent application PCT/CN2005/002177, "Material Treatment Apparatus and Application thereof"; the reaction system disclosed in the applicant's Chinese Patent Application No. 200510029727.3; the parallel reaction system disclosed in the applicant's Chinese Patent Application No. 200610085162.5; the reaction system disclosed in the Chinese patent application "Reaction System" filed by the applicant on September 30, 2006; or that of another application filed by the applicant on September 30, 2006.
  • the software is stored in a computer server that is coupled to the user terminal, input device, or output device via a data line, wireless line, or network system.
  • network systems include hardware and software that are electrically coupled together in a computer or device.
  • the network system may include equipment based on the Internet, 10/1000 Ethernet, IEEE 802.11x, IEEE 1394, xDSL, Bluetooth, LAN, WLAN, GSP, CDMA, 3G, PACS, or any other ANSI-recognized standard medium.
  • on the one hand, a researcher can access the computer server from anywhere through the above network system to design formulations and processes; on the other hand, researchers at site A can access the computer server through the network system for formulation and process design, and researchers at site B can access the server through the network system to obtain the formulation and process design data, thus achieving collaborative research across regions, facilitating centralized management of experimental equipment, and enabling R&D and implementation in different regions.
  • Chinese Patent Application No. 200610100921.0, "Computer Aided Graphical Experimental Design System and Method".
  • the data obtained from the formulation and process design using the combined sample library design method of the present invention can also be stored on a server for sharing by different researchers.
  • data sharing can be done through the Portal system designed by the applicant.
  • researchers can also communicate through the Portal system and use the combined sample library design method of the present invention for formulation and process design.
  • This example shows how to select qualified samples from pseudo samples consisting of two components (cerium and iron) produced by Monte Carlo simulation.
  • the variable V_Ce of cerium takes a value between 0 and 1
  • the variable V_Fe of iron also takes a value between 0 and 1.
  • the Monte Carlo simulation was performed using uniformly distributed, randomly generated values of V_Ce and V_Fe between 0 and 1, in which the randomly generated values of V_Ce were independent of the randomly generated values of V_Fe. In this simulation, V_Ce and V_Fe are not subject to any mandatory relationship or constraint.
  • the result of the Monte Carlo simulation is a population of pseudo samples. All of the points (hollow, gray, and black) shown in Figure 1 constitute the set of pseudo samples.
  • the first constraint is defined as 0.2 ≤ V_Ce ≤ 0.8 and 0.2 ≤ V_Fe ≤ 0.8.
  • when the selection process considers the first constraint, the selected set of pseudo samples is displayed as the black or gray dots in Figure 1.
  • the second constraint is defined as 1 − Δ ≤ V_Ce + V_Fe ≤ 1 + Δ.
  • the set of pseudo samples that simultaneously satisfy both constraints is displayed as the black dots in Figure 1.
  • the Monte Carlo simulation incorporates empirical knowledge into the design through the two constraints and yields definite information about the designed sample library. For example, since the number of pseudo samples satisfying both constraints is known, the number of qualified samples is known. As shown in Table I, the composition ratio of each qualified sample is also known. Table I lists the pseudo sample values produced by the Monte Carlo simulation; the numbers in italics are the pseudo sample values that satisfy the first constraint, and the numbers in boxes are those that satisfy both the first and second constraints.
  • This example shows how to select qualified samples from pseudo samples composed of four components (cerium, iron, tungsten, and nickel) generated by Monte Carlo simulation. The variables of cerium, iron, tungsten, and nickel (V_Ce, V_Fe, V_W, and V_Ni) each take values between 0 and 1.
  • in the Monte Carlo simulation, a uniformly distributed, randomly generated value between 0 and 1 is used for each variable. The randomly generated values in this simulation are independent of each other and are not subject to any constraints.
  • the result of the Monte Carlo simulation is a set of sample points (pseudo samples) in a four-dimensional space; the projection of these four-dimensional sample points into three-dimensional space is shown in Figure 2.
  • this Monte Carlo simulation, which takes two constraints into account, provides definite information about the design of a sample library consisting of four components subject to these two constraints.
  • a qualified sample ratio can be calculated by dividing the number of pseudo samples satisfying the two constraints by the total number of pseudo samples. Since the number of pseudo samples satisfying the two constraints is known, the number of qualified samples is known. The component ratios of the variables in each qualified sample are recorded, so the component ratio of every qualified sample is known. If the variables are segmented by particular grid points, the optimal number of samples can likewise be obtained according to the method provided by the present invention.
  • This embodiment illustrates a computer program that allows a user to enter information through a graphical user interface and perform calculations and simulations (including Monte Carlo simulation) to design a qualified sample library.
  • the graphical user interface allows the user to select the components required to design the sample.
  • component A may be any one of the group consisting of vanadium (V), niobium (Nb), and molybdenum (Mo); the variable of component A (V_a) ranges between 0 and 1 (see the range of 0.00 to 1.00 in Figure 5), and the variable's range is divided into 10 segments (as shown in Figure 5).
  • component A is assigned a variable (V_a) that takes a value between 0 and 1 (see Figure 6); likewise, components B and C are assigned corresponding variables (V_b and V_c) (see Figure 6).
  • the graphical user interface allows the user to specify constraints on a single variable or among multiple variables.
  • Δ is the error (or constraint tolerance) and is given as 0.01 in this example (see Figure 6).
  • the graphical user interface further allows the user to decide how to estimate the optimal number of eligible samples.
  • the graphical user interface provides calculations for six different levels of accuracy. In the exact calculation therein, the pseudo sample is produced according to the formula 10 without any constraint shown in the present invention, and then the pseudo sample is detected by a computer and only the portion satisfying the constraint is selected. In this example, 198 samples satisfy the constraints. In addition, the component ratios of all 198 eligible samples were obtained.
  • a calculation accuracy between -100% and 100% is considered very low; between -30% and 30%, low; between -10% and 10%, medium; between -3% and 3%, high; and between -1.0% and 1.0%, very high.
  • This embodiment shows a computer program that obtains a specified number of eligible sample points.
  • the computer program allows the user to enter information through a graphical user interface and perform calculations and simulations (including Monte Carlo simulations) to derive the component ratios for each of these sample points.
  • Figure 9 shows that the graphical user interface allows the user to enter a designated total of 125 sample points, with each sample point having four components.
  • the graphical user interface also allows the user to specify the desired component (see Figure 10) and define the constraints for the variable (see Figure 11).
  • the simulation is started, and each pseudo sample is checked for compliance with the set constraints. The simulation stops when the number of qualified samples reaches 125, yielding 125 sample points of four components (elements Pd,
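
The two-component cerium/iron example in the list above can be sketched roughly as follows. This is a minimal illustration, not the applicant's program: the function name `simulate_two_component`, the pseudo sample count, the seed, and the tolerance `delta = 0.05` are all assumptions chosen for the sketch (the example does not state the value of Δ it used).

```python
import random


def simulate_two_component(n_pseudo=10_000, delta=0.05, seed=0):
    """Generate (V_Ce, V_Fe) pseudo samples and filter them by two constraints.

    n_pseudo, delta and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Pseudo samples: both variables are independent and uniform on [0, 1).
    pseudo = [(rng.random(), rng.random()) for _ in range(n_pseudo)]
    # First constraint: each variable lies between 0.2 and 0.8.
    first = [(ce, fe) for ce, fe in pseudo
             if 0.2 <= ce <= 0.8 and 0.2 <= fe <= 0.8]
    # Second constraint: the sum of the two variables lies within delta of 1.
    qualified = [(ce, fe) for ce, fe in first
                 if 1 - delta <= ce + fe <= 1 + delta]
    return pseudo, first, qualified


pseudo, first, qualified = simulate_two_component()
# Counts of all points, gray-or-black points, and black points of Figure 1.
print(len(pseudo), len(first), len(qualified))
```

The three lists correspond to all dots, the gray and black dots, and the black dots of Figure 1; dividing `len(qualified)` by `len(pseudo)` gives the qualified sample ratio.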

Abstract

A method for designing a composite sample library is provided, which makes the best use of existing knowledge or a given hypothesis to reduce the number of sample experiments or to increase the effective information obtained from a fixed number of sample experiments. The method includes the following steps: (1) providing multiple components composing the sample; (2) providing a variable for each component, the variable taking values within a certain interval; (3) setting at least one constraint condition for at least one variable; (4) producing a pseudo sample; (5) checking the pseudo sample to determine whether it is a qualified sample; (6) repeating steps (4) and (5) until at least one qualified sample is obtained. The method of the invention avoids systematic bias in the design and is efficient and accurate.

Description

Design method and system for a combined sample library
Technical field
The present invention relates to an efficient experimental method, the high-throughput experimental method, and more particularly to the field of combined sample library design within it.
Background
For many properties of materials, such as thermal conductivity, luminescence, and catalytic activity, combinatorial material discovery methods and systems can be used to identify new materials or to optimize existing ones. Current combinatorial research methods search over grid points in the sample space, synthesizing a large number of samples by "brute force" and then screening these samples for the desired properties. However, this approach takes almost no account of known empirical knowledge about the components of the samples. Even when such empirical knowledge is considered, there is no suitable method for designing a sample library that is sufficiently randomized in the sample space.
Therefore, it is necessary to develop a new combinatorial experimental design system and method that effectively integrates empirical knowledge into the sample library design, while sampling outside this empirical knowledge should be completely random to avoid interference from human factors. In addition, we need to know which samples are to be synthesized, what the total number of the most representative samples is, and similar information.
Summary of the invention
It is an object of the present invention to provide a new combinatorial experimental design system and method for effectively integrating empirical knowledge into the design of a sample library.
The present invention provides a method for designing a combined sample library that integrates empirical knowledge. Specifically, this empirical knowledge can be embodied as the components of the sample, the variables associated with the components, and the constraints on these variables.
The invention also provides a method of designing a sample library comprising the following steps:
(1) providing the components of the target sample;
(2) setting a variable for each component;
(3) setting at least one constraint for the above variables;
(4) generating a pseudo sample library;
(5) selecting, from the pseudo samples, the qualified samples that satisfy the constraints.
In one embodiment, the constraints on the variables are relationships between variables determined by experience or previously known knowledge. In yet another embodiment, the empirical knowledge is a physical or chemical law of nature expressed in terms of the variables.
In one embodiment, the pseudo samples can be generated by random sampling. In one embodiment, the random sampling is performed using a set of variables, each of which corresponds to one component and takes random values within a certain interval. One example of random sampling is Monte Carlo simulation. In yet another embodiment, the random values are generated by a random number generator. In yet another embodiment, the random values are associated with a probability distribution or probability density. The probability distribution is a uniform distribution or a non-uniform distribution. Non-uniform distributions include the Bernoulli distribution, beta distribution, chi-square distribution, exponential distribution, F distribution, gamma distribution, Gaussian distribution, normal distribution (e.g., log-normal, multivariate normal, and univariate normal distributions), non-central chi-square distribution, non-central F distribution, binomial distribution, negative binomial distribution, multinomial distribution, Pareto distribution, Poisson distribution, Student's t distribution, and Tsallis distribution. The probability distribution includes the uniform distribution, the normal distribution, and the Gaussian distribution.
Yet another aspect of the present invention provides a method of obtaining a desired number of samples in a sample library, the method comprising the following steps:
(1) providing the components constituting a sample;
(2) setting a variable for each component;
(3) setting at least one constraint for the variables;
(4) providing the required number of samples;
(5) generating a random pseudo sample;
(6) determining whether the pseudo sample is a qualified sample according to whether its variables satisfy the constraints;
(7) repeating steps (5) and (6) until the number of qualified samples reaches the required number.
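The loop in steps (5) to (7) can be sketched as below. This is only an illustration under stated assumptions: the name `collect_qualified`, the representation of constraints as Python predicates, the uniform sampling on [0, 1), the tolerance 0.05, and the `max_trials` safety cap are all hypothetical choices, not details given by the source.

```python
import random


def collect_qualified(n_components, constraints, required,
                      rng=None, max_trials=1_000_000):
    """Draw pseudo samples until `required` of them satisfy every constraint.

    `constraints` is a list of predicates taking a tuple of variable values.
    `max_trials` is an assumed safety cap so the loop always terminates.
    """
    rng = rng or random.Random()
    qualified = []
    trials = 0
    while len(qualified) < required and trials < max_trials:
        trials += 1
        # Step (5): one random pseudo sample, each variable uniform on [0, 1).
        sample = tuple(rng.random() for _ in range(n_components))
        # Step (6): the sample qualifies only if all constraints hold.
        if all(check(sample) for check in constraints):
            qualified.append(sample)
    # Step (7) is the loop condition: stop once enough samples are collected.
    return qualified, trials


# Illustrative use: 125 four-component samples whose variables sum to 1 +/- 0.05.
delta = 0.05
constraints = [lambda v: 1 - delta < sum(v) < 1 + delta]
samples, trials = collect_qualified(4, constraints, required=125,
                                    rng=random.Random(1))
print(len(samples), trials)
```

The ratio `len(samples) / trials` also gives a rough estimate of the qualified sample ratio for the chosen constraints.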
In still another aspect, the present invention provides a method of estimating the optimal number of samples that need to be designed and/or synthesized, the method comprising the following steps:
(1) providing the components constituting a sample;
(2) setting a variable for each component;
(3) setting at least one constraint for the variables;
(4) providing the required number of segments for each variable within a given interval;
(5) generating pseudo samples;
(6) selecting the pseudo samples that satisfy the constraints as qualified samples;
(7) determining the qualified sample ratio by dividing the number of qualified samples by the number of pseudo samples;
(8) calculating the number of samples; and
(9) determining the optimal number of samples, wherein the optimal number of samples is calculated by multiplying the number of samples by the qualified sample ratio.
The method further includes the step of determining the qualified sample ratio by dividing the number of qualified samples by the number of pseudo samples.
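A rough sketch of steps (5) to (9) follows. The source does not say how "the number of samples" in step (8) is counted; the sketch assumes it is the number of grid points when every variable is divided into the same number of segments, i.e. `(segments + 1) ** n_components`. That formula, the function name `estimate_optimal_count`, and the default pseudo sample count are assumptions made for illustration.

```python
import random


def estimate_optimal_count(n_components, segments, constraints,
                           n_pseudo=50_000, seed=0):
    """Estimate the optimal sample count as (grid size) x (qualified ratio)."""
    rng = random.Random(seed)
    n_qualified = 0
    for _ in range(n_pseudo):
        # Steps (5)-(6): unconstrained pseudo sample, then constraint check.
        sample = tuple(rng.random() for _ in range(n_components))
        if all(check(sample) for check in constraints):
            n_qualified += 1
    # Step (7): qualified samples divided by pseudo samples.
    ratio = n_qualified / n_pseudo
    # Step (8): ASSUMED grid size, segments + 1 grid points per variable.
    grid_count = (segments + 1) ** n_components
    # Step (9): optimal number = number of samples x qualified sample ratio.
    return ratio, grid_count, ratio * grid_count


delta = 0.05
constraints = [lambda v: 1 - delta < sum(v) < 1 + delta]
ratio, grid_count, optimal = estimate_optimal_count(3, 10, constraints)
print(ratio, grid_count, round(optimal))
```

Because the ratio is a Monte Carlo estimate, its accuracy improves as `n_pseudo` grows, which is one way to read the accuracy levels offered by the graphical user interface.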
In another aspect, the invention also provides a computer product comprising computer software. Once the computer software is run, it carries out the methods and calculations of the present invention. For example, the computer software can perform random sampling.
Brief description of the drawings
Figure 1 is a schematic diagram of two-component samples produced by Monte Carlo simulation, each sample consisting of cerium (Ce) and iron (Fe). All dots in the figure (hollow, gray, and black) represent pseudo samples formed from cerium and iron variables that are each uniformly distributed and independently randomly generated; there is no constraint between the cerium variable and the iron variable. The gray and black dots represent pseudo samples that satisfy the first constraint. The black dots represent pseudo samples that satisfy both the first and second constraints (for details, see Example 1).
Figure 2 is a three-dimensional plot of four-component samples produced by Monte Carlo simulation, each sample consisting of cerium (Ce), iron (Fe), tungsten (W), and nickel (Ni). All points represent pseudo samples whose four variables are each uniformly distributed and independently randomly generated, without any constraint among them.
Figure 3 is a schematic diagram of the pseudo samples in Figure 2 that satisfy the first constraint.
Figure 4 is a schematic diagram of the pseudo samples in Figure 3 that further satisfy the second constraint.
Figure 5 is a graphical user interface (GUI) that allows a user to design a multi-component sample library by providing, for each component, a variable, the range of the variable, and the desired number of segments.
Figure 6 is the graphical user interface shown after a component and its corresponding variable have been selected.
Figure 7 is a graphical user interface that allows a user to specify one or more constraints on the variables.
Figure 8 shows a graphical user interface that allows the user to choose (1) whether to run the Monte Carlo simulation and (2) how to run it.
Figure 9 shows a graphical user interface that allows the user to enter the specified number of samples to be obtained and the specified number of components per sample.
Figure 10 shows a graphical user interface that allows the user to specify each component.
Figure 11 shows a graphical user interface that allows the user to define constraints in terms of the variables.
Figure 12 shows a graphical user interface that allows the user to specify the constraint tolerance.
Detailed description
The present invention relates to a design strategy for a combined sample library, so that the sample library can be designed, synthesized, screened, and measured according to sample properties.
One aspect of the invention provides a method of designing a sample library, comprising providing multiple components of a sample. The term "combined sample library" as used herein refers to a collection comprising a plurality of samples, and "sample" refers to a material comprising multiple components. "Component" refers to a substance, including, for example, an element, a molecule, a compound, a material, or a bulk object, or a combination of these substances.
In one embodiment of the invention, a sample comprises n different components, C_1, C_2, C_3, ..., C_i, ..., C_n, where n is an integer denoting the number of different components in the sample. Each component C_i has a mass denoted MW_i, where i ∈ {1, 2, ..., n}; its composition quantity in the sample is denoted x_i, and the corresponding composition ratio is denoted R_i. The mass MW_i is the molecular weight or atomic weight of the component. The composition quantity x_i is the amount of the i-th component in the sample, so the sample can be written as (C_1)_{x_1}(C_2)_{x_2}...(C_i)_{x_i}...(C_n)_{x_n}. The composition ratio can be characterized as the relative amount by weight of one component in the sample, which can be expressed by Formula 1:
$$R_i = \frac{MW_i \, x_i}{\sum_{j=1}^{n} MW_j \, x_j} \qquad \text{(Formula 1)}$$
The composition quantity x_i can also refer to the molar ratio of the i-th component in the sample. In this case, the composition ratio can also be expressed as the mole fraction of a component in the sample, which takes a value between 0 and 1 and is defined by Formula 2:
$$R_i = \frac{x_i}{\sum_{j=1}^{n} x_j} \qquad \text{(Formula 2)}$$
The composition ratio can further be expressed as the percentage of a component in the sample, taking a value between 0% and 100%.
In any sample of the library, the composition ratios of all components sum to 1, as shown in Formula 3:
$$\sum_{i=1}^{n} R_i = 1 \qquad \text{(Formula 3)}$$
For example, the glucose molecule C6H12O6 can be regarded as a sample containing three components: carbon (C), hydrogen (H), and oxygen (O). Each component has a composition quantity: 6 for C, 12 for H, and 6 for O. The mass (MW) of each component is given by its atomic mass: 12 for C, 1 for H, and 16 for O. Therefore, the (weight) composition ratio of C is 0.4 or 40% (12*6/(12*6+1*12+16*6)); that of H is 0.067 or 6.7%; and that of O is 0.533 or 53.3%. The composition ratios of the three components sum to 1.
Another feature of a combined sample library is that every sample in the library consists of the same types of components, but with different composition ratios.
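The glucose arithmetic above can be checked with a few lines of code. The sketch below computes the weight-based composition ratio (each component's mass times its quantity, divided by the total over all components) and the mole fraction (each quantity divided by the total quantity); the function names `weight_ratios` and `mole_fractions` are hypothetical, and the rounded atomic masses are those used in the text.

```python
def weight_ratios(masses, quantities):
    """Weight-based composition ratio: MW_i * x_i / sum_j(MW_j * x_j)."""
    weights = {c: masses[c] * quantities[c] for c in quantities}
    total = sum(weights.values())
    return {c: w / total for c, w in weights.items()}


def mole_fractions(quantities):
    """Mole-fraction composition ratio: x_i / sum_j(x_j)."""
    total = sum(quantities.values())
    return {c: x / total for c, x in quantities.items()}


glucose = {"C": 6, "H": 12, "O": 6}   # composition quantities x_i
masses = {"C": 12, "H": 1, "O": 16}   # rounded atomic masses MW_i

by_weight = weight_ratios(masses, glucose)
print({c: round(r, 3) for c, r in by_weight.items()})
# → {'C': 0.4, 'H': 0.067, 'O': 0.533}
print(mole_fractions(glucose))
# → {'C': 0.25, 'H': 0.5, 'O': 0.25}
```

In both conventions the ratios of all components sum to 1, which is the property every qualified sample in the library is later required to satisfy within a tolerance.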
In another aspect, the method of designing a combined sample library provided by the invention includes providing a variable for each component of a multi-component sample. In other words, the variables correspond one-to-one with the components of the sample. Suppose the variable V takes a random value in the interval [v_min, v_max], where v_min is not less than 0, v_max is not greater than 1, and v_min < v_max. In one embodiment, the interval is [0, 1]. If V is assumed to take a value in the discrete set {V_1, V_2, ..., V_x}, then V is discrete, where each discrete value falls within the interval [v_min, v_max] (e.g., [0, 1]). If V is set to a random value anywhere in the interval [v_min, v_max], it is continuous. When a variable is associated with a component without any constraint and independently of the other variables, the random value assigned to a first variable is not restricted by the value assumed by a second variable. If the variable is continuous, the process of assigning a random value within its interval depends on the probability distribution or probability density of the values the variable may take. If the variable is discrete, the assignment of a random value depends on the specific probability of each discrete value in the interval.
For example, V_i is the variable of a first component C_i, and V_j is the variable of a second component C_j. V_i can be set to a random value in the interval [V_{i,min}, V_{i,max}], and V_j can be set to a random value in the interval [V_{j,min}, V_{j,max}]; the value of V_i is independent of that of V_j. When C_i and C_j are components of a sample consisting of C_1, C_2, ..., C_i, ..., C_j, ..., C_n, where i, j ∈ {1, 2, ..., n}, the variable V_i becomes the composition ratio R_i of component C_i, the variable V_j becomes the composition ratio R_j, and the sum of the variables of all components in the sample satisfies Formula 4:
$$\sum_{i=1}^{n} V_i = 1 \qquad \text{(Formula 4)}$$
Another aspect of the invention is that, in the provided method of designing a combined sample library, at least one constraint is provided or set for at least one variable of the sample. The term "constraint" refers to a condition on at least one variable or a relationship between variables. In particular, a constraint is a restriction that one or more variables of a sample must satisfy. In other words, in a valid or qualified sample, the set of variable values {V_i} must satisfy at least one constraint or a specific set of constraints. For example, suppose a sample includes components C_1, C_2, ..., C_n, and each component C_i has a variable V_i, where i ∈ {1, 2, ..., n}; then, in a valid sample, the sum of the component variables must satisfy the following constraint, Formula 5:
$$1 - \Delta < \sum_{i=1}^{n} V_i < 1 + \Delta \qquad \text{(Formula 5)}$$
where Δ is the error (e.g., the constraint tolerance or constraint deviation), a value between 0 and 0.2. In preferred embodiments, Δ is 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, or 0.10. As Δ approaches 0, the constraint shown in Formula 5 approaches an equality.
In addition, in sample library design, empirical knowledge about the components of the sample can be embodied as relationships among multiple variables, which can likewise be treated as constraints. In other words, relationships among several components known from experience can be implemented by setting constraints. For example, based on previous experience, in a sample consisting of C_1, C_2, ..., C_n, the composition ratios of C_i and C_j should perhaps be in the ratio 2:1, where i, j ∈ {1, 2, ..., n} and i ≠ j. In that case, in addition to the inherent constraint shown in Formula 4, the component variables of a valid sample must satisfy the second constraint V_i : V_j = 2:1. As another example, the sum of the composition ratios of C_i and C_j may be required to equal x, where x is a value between 0 and 1. Here, the component variables of a valid sample must satisfy the second constraint V_i + V_j = x, where i, j ∈ {1, 2, ..., n} and i ≠ j.
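The constraints discussed above (the sum-to-one constraint with tolerance Δ, a 2:1 ratio between two variables, and a fixed sum of two variables) can be expressed as simple predicates, as in the sketch below. The function names are hypothetical, and the tolerances on the ratio and pair-sum constraints are an added assumption: the source states those two constraints as exact equalities, which randomly drawn continuous values would essentially never satisfy exactly.

```python
import math


def sum_within_tolerance(delta):
    """Sum constraint with tolerance: 1 - delta < sum(V) < 1 + delta."""
    return lambda v: 1 - delta < sum(v) < 1 + delta


def ratio_constraint(i, j, ratio, rel_tol=0.05):
    """V_i : V_j equals `ratio`, up to an ASSUMED relative tolerance."""
    return lambda v: v[j] > 0 and math.isclose(v[i] / v[j], ratio,
                                               rel_tol=rel_tol)


def pair_sum_constraint(i, j, target, abs_tol=0.01):
    """V_i + V_j equals `target`, up to an ASSUMED absolute tolerance."""
    return lambda v: abs(v[i] + v[j] - target) <= abs_tol


def is_qualified(v, constraints):
    """A pseudo sample qualifies only if it satisfies every constraint."""
    return all(check(v) for check in constraints)


# Indices are 0-based here, whereas the text numbers components from 1.
constraints = [sum_within_tolerance(0.01), ratio_constraint(0, 1, 2.0)]
print(is_qualified((0.4, 0.2, 0.4), constraints))   # sums to 1, ratio 2:1
print(is_qualified((0.5, 0.5, 0.5), constraints))   # sums to 1.5
```

Keeping each piece of empirical knowledge as a separate predicate makes it easy to add or remove constraints without changing the sampling code.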
The method of designing a combined sample library provided by the invention includes generating pseudo samples. A "pseudo sample" is a hypothetical multi-component sample in which each component has an independent variable, so that the arbitrary assignment of one variable is an event independent of the arbitrary assignment of any other variable, and the variables of the pseudo sample as a whole are not restricted by any constraint. In other words, its variables may or may not satisfy the constraints. For example, suppose a pseudo sample includes C_1, C_2, ..., C_i, ..., C_j, ..., C_n, where i, j ∈ {1, 2, ..., n} and i ≠ j, and V_i and V_j are random values in the interval [0, 1]; V_i may take one value in [0, 1] and V_j may take another value in [0, 1]. The sum of these variable values need not meet the requirements of Formulas 4 and 5.
A pseudo sample may or may not correspond to a real physical sample; it is a sample point whose hypothetical variable values are independent and unconstrained. A large number of pseudo samples therefore make up a sample space.
Pseudo samples can be generated by random sampling. Random sampling is a method of generating a sample point by randomly assigning a value to each component of the sample point. Methods for generating random values in the interval [V_min, V_max], where V_min is not less than 0 and V_max is not greater than 1, are well established; see Carter, "The Generation and Application of Random Numbers (Fourth Dimensions)", Vol. XVI, 1994. Algorithms and programs for random number generators are well known in computer science; see D. E. Knuth, "The Art of Computer Programming - Seminumerical Algorithms", Vol. 2, 2nd Ed., Addison-Wesley, 1981; Press et al., "Numerical Recipes: The Art of Scientific Computing", Cambridge University Press, 1986, and "Numerical Recipes (FORTRAN)", pages 1 1-225, 1988; and S. L. Anderson, "Random Number Generators on Vector Supercomputers and Other Advanced Architectures", SIAM Rev., 32:221-225, 1990.
A randomly generated value is the occurrence of an event, namely a possible assignment of the variable (V) within a particular interval [v_min, v_max], where the probability of the event depends on the probability density or probability distribution of the variable. A variable is therefore further defined by a probability function that assigns probabilities to the values contained in the variable's interval. For example, a discrete variable can be defined by assigning a probability to each discrete value in the interval. A continuous variable can be defined by assigning a probability distribution over the interval that includes all values the variable may take.
The term "probability distribution" as used here refers to an arrangement of the values of a variable that reflects their observed or theoretical frequency of occurrence. Probability distributions well known in the art include uniform and non-uniform distributions. Non-uniform distributions include the Bernoulli, beta, chi-square, exponential, F, gamma, Gaussian, and normal (e.g., lognormal, multivariate normal, and univariate normal) distributions, the non-central chi-square and non-central F distributions, the binomial, negative binomial, multinomial, Pareto, Poisson, Student's t, and Tsallis distributions, and any combination of the above.
In one embodiment, the random variable follows a non-uniform probability distribution, for example a normal, Poisson, or Gaussian distribution. In another embodiment, the randomly generated value of the variable follows, or is associated with, a uniform distribution, so that a uniformly distributed random variable takes any value in the interval $[V_{\min}, V_{\max}]$ ($V_{\min} \ge 0$, $V_{\max} \le 1$) with equal probability.
As one of ordinary skill in the art will appreciate, non-uniformly distributed random values (or numbers) can be generated with the aid of a random number generator such as a linear congruential generator. The general formula for a linear congruential generator is $V_i = (a V_{i-1} + c) \bmod m$, where a, c, and m are preset constants: a is the multiplier, c is the increment, and m is the modulus. See Park and Miller, "Random Number Generators: Good Ones Are Hard to Find", Comm. ACM 31:1192-1201, 1988. Random number generators also include shift-register sequence generators, as described in Kirkpatrick and Stoll, "A Very Fast Shift-Register Sequence Random Number Generator", Journal of Computational Physics 40:517-526, 1981. Random number generators further include quasi-random number generators; see Press and Teukolsky, "Quasi Random Numbers", Computers in Physics 3:76-79, 1989.
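As a minimal sketch of the linear congruential recurrence $V_i = (a V_{i-1} + c) \bmod m$, the following Python generator scales each state to a value in [0, 1). The constants a = 16807, c = 0, and m = 2^31 - 1 are an assumption chosen for illustration (the "minimal standard" parameters discussed by Park and Miller); this document does not prescribe particular constants.

```python
def lcg(seed, a=16807, c=0, m=2**31 - 1):
    """Yield values in [0, 1) from the recurrence V_i = (a * V_{i-1} + c) mod m.

    The default constants are illustrative assumptions, not values from this document.
    """
    state = seed
    while True:
        state = (a * state + c) % m   # next integer state
        yield state / m               # scale to the unit interval


gen = lcg(seed=1)
values = [next(gen) for _ in range(3)]
```

Values produced this way are approximately uniform; a non-uniform distribution requires a further transformation of these values.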
Non-uniformly distributed random values, such as normally or Gaussian distributed values, can also be generated by methods well known in the art; see Rubinstein, "Simulation and the Monte Carlo Method", John Wiley & Sons, 1981. One such method uses a transformation function, such as the well-known Box-Muller transform, to convert uniformly distributed random variables into a new set of non-uniformly distributed random variables (e.g., Gaussian or normal); see Box and Muller, "A Note on the Generation of Random Deviates", Annals Math. Stat. 29:610-611, 1958.

In one embodiment, the random sampling method comprises a Monte Carlo method or simulation. The term "Monte Carlo method" or "Monte Carlo simulation" here refers to a class of stochastic techniques, or random sampling methods, used to study a problem and obtain a probabilistic approximation to its solution. In particular, the term as used here refers to a process that generates random events (such as randomly generated values of any given variable). The process is usually carried out by a computer algorithm and is repeated many times, and the results of all trials are analyzed and computed to provide an approximate solution. For Monte Carlo simulation, see Metropolis and Ulam, "The Monte Carlo Method", Journal of the American Statistical Association 44:335-341, 1949; Sobol, "The Monte Carlo Method", The University of Chicago Press, 1974; and Mooney, "Monte Carlo Simulation", Sage University Paper, 1997.
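The Box-Muller transform cited above can be sketched as follows. This is an illustrative Python version that uses the standard library generator as the source of uniform values; the function name and the sample size are assumptions made for the example.

```python
import math
import random


def box_muller(rng):
    """Convert two independent uniform values into two independent standard normal values."""
    u1 = 1.0 - rng.random()   # shift [0, 1) to (0, 1] so that log(u1) is defined
    u2 = rng.random()
    radius = math.sqrt(-2.0 * math.log(u1))
    angle = 2.0 * math.pi * u2
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(0)
samples = [z for _ in range(5000) for z in box_muller(rng)]
mean = sum(samples) / len(samples)
variance = sum((z - mean) ** 2 for z in samples) / len(samples)
```

For a standard normal distribution the sample mean should be close to 0 and the sample variance close to 1.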
The Monte Carlo method has continued to develop in the art. For example, the method was initially applied to estimating the value of π by throwing darts at a standard target (a circle circumscribed by a square). Over a large number of trials, the numbers of darts landing in the circle and in the square are found to be proportional, with considerable accuracy, to the area of the circle and the area of the square respectively. Accordingly, the ratio of hits in the circle to hits in the square approximates a fraction of π; see Ross, "A First Course in Probability", 2nd Edition, Macmillan, 1976.
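The dart experiment can be sketched in a few lines of Python. For a circle of radius 1 inscribed in the square [-1, 1] x [-1, 1], the area ratio is π/4, so four times the hit fraction estimates π. The function name, trial count, and seed below are assumptions made for illustration.

```python
import random


def estimate_pi(trials, rng):
    """Estimate pi from the fraction of uniform points in the square that land inside the circle."""
    hits = 0
    for _ in range(trials):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:      # the dart landed inside the unit circle
            hits += 1
    return 4.0 * hits / trials        # circle area / square area = pi / 4


pi_estimate = estimate_pi(100_000, random.Random(42))
```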
As another example, Monte Carlo simulation can be applied to estimate the integral of Equation 6:

[Equation 6: the integral of V(x); reproduced only as an image in the source (Figure imgf000010_0001)]

In this example, a bounding box is drawn around the function V(x), and the integral of V(x) can be understood as the part of the box that lies under V(x). If points in the box are chosen at random, the probability that a point falls under V(x) is determined by the fraction of the box area that V(x) occupies. The Monte Carlo simulation therefore generates a large number of random points (randomly generated values) in the box and counts the points under V(x) to obtain the area. As a result, the integral of Equation 6 can be expressed as Equation 7:

$$\int V(x)\,dx \approx \frac{A}{B}\,C \qquad \text{(Equation 7)}$$

where A is the number of points under V(x), B is the total number of points generated in the box, and C is the area of the bounding box. The ratio A/B thus corresponds to the fraction of the bounding box area occupied by V(x).
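The estimate (A/B)·C can be illustrated with a short hit-or-miss sketch. It assumes that V(x) is non-negative, that the box spans [x_min, x_max] horizontally and [0, v_max] vertically, and that points are drawn uniformly over the box, which the area argument requires. The function name and the test integrand x² are hypothetical choices for the example.

```python
import random


def hit_or_miss_integral(v, x_min, x_max, v_max, trials, rng):
    """Approximate the integral of v(x) >= 0 on [x_min, x_max] as (A / B) * C."""
    box_area = (x_max - x_min) * v_max     # C: area of the bounding box
    hits = 0                               # A: points that fall under the curve
    for _ in range(trials):                # B: all points generated in the box
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(0.0, v_max)
        if y <= v(x):
            hits += 1
    return hits / trials * box_area


estimate = hit_or_miss_integral(lambda x: x * x, 0.0, 1.0, 1.0, 100_000, random.Random(7))
```

For this integrand the exact value is 1/3, so the estimate should lie close to 0.333.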
Another example of Monte Carlo simulation involves generating random variable values that are weighted by empirical knowledge. For example, empirical knowledge about the components (or component ratios) may require assigning a variable different probability densities in different intervals (continuous variable), or different probabilities at different values (discrete variable). A further example of Monte Carlo simulation is a Markov chain process. A Markov process is a sequence of random values in which the probability of each event depends on the value generated at the preceding step. See Frenkel and Smit, "Understanding Molecular Simulation: From Algorithms to Applications", Academic Press, 1996.
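Weighting a discrete variable by empirical knowledge can be sketched with the standard library's weighted choice. The candidate values and weights below are hypothetical; they simply express a prior belief that compositions near 0.5 are the most promising.

```python
import random

# Hypothetical discrete values of one component variable and their relative weights.
values = [0.1, 0.3, 0.5, 0.7, 0.9]
weights = [1, 2, 4, 2, 1]          # 0.5 is drawn with probability 4/10

rng = random.Random(3)
draws = rng.choices(values, weights=weights, k=10_000)
share_of_centre = draws.count(0.5) / len(draws)
```

With these weights the value 0.5 should account for roughly 40% of the draws.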
Another aspect of the invention relates to a method of selecting qualified samples from pseudo samples. The term "qualified sample" here refers to a pseudo sample, generated by the methods described herein, whose variables satisfy one or more specified constraints; a pseudo sample that is not a qualified sample is called an unqualified sample. In one embodiment of the invention, pseudo samples are generated by a random sampling method (e.g., Monte Carlo simulation): over a large number of trials, values generated at random from a uniform distribution on the interval $[V_{\min}, V_{\max}]$ ($V_{\min} \ge 0$ and $V_{\max} \le 1$) are assigned to the component variables to produce pseudo samples that are not subject to any constraint. Each pseudo sample is then examined, for example by a computer algorithm, to determine whether it satisfies the specified constraint or constraints. Pseudo samples that satisfy the constraints are selected and stored as qualified samples. At the same time, the values associated with each qualified sample are recorded as a vector and mapped to the component ratios, so that the qualified samples can be designed into and synthesized in the sample library, because in a qualified sample these values are precisely the component ratios.
Another aspect of the invention provides a method for generating a given number of samples in a sample library. The method comprises the following steps:
(1) providing the components that make up a sample;
(2) assigning a variable to each component;
(3) setting at least one constraint on the variables;
(4) specifying the required number of samples;
(5) generating a pseudo sample;
(6) determining that the pseudo sample is a qualified sample if its variables satisfy the constraint;
(7) repeating steps (5) and (6) until the number of qualified samples reaches the required number.
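Steps (1) to (7) can be sketched as a simple generate-and-test loop. This is an illustrative sketch, not the applicant's implementation: the function name, the uniform sampling on [0, 1), the `max_trials` safety limit, and the tolerance value are assumptions introduced for the example.

```python
import random


def generate_qualified_samples(n_components, constraints, n_required, rng, max_trials=1_000_000):
    """Draw pseudo samples until n_required of them satisfy every constraint."""
    qualified = []
    trials = 0
    while len(qualified) < n_required and trials < max_trials:
        trials += 1
        pseudo = [rng.random() for _ in range(n_components)]   # step (5): one pseudo sample
        if all(check(pseudo) for check in constraints):        # step (6): test the constraints
            qualified.append(pseudo)
    return qualified, trials                                   # step (7) is the loop itself


delta = 0.05   # hypothetical tolerance on the sum of the component ratios


def sum_near_one(v):
    """Constraint: the component ratios add up to 1 within +/- delta."""
    return 1 - delta <= sum(v) <= 1 + delta


library, trials_used = generate_qualified_samples(3, [sum_near_one], 20, random.Random(1))
```

The ratio of `len(library)` to `trials_used` gives a rough idea of how restrictive the constraint is.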
Another aspect of the invention discloses a method of calculating the qualified sample ratio. The term "qualified sample ratio" ($R_{qs}$) here refers to the proportion of pseudo samples, in a random sampling method, whose variables satisfy one or more constraints. In one embodiment, the qualified sample ratio $R_{qs}$ can be estimated by dividing the number of qualified samples ($N_{qs}$) by the number of pseudo samples ($N_{ps}$) in the random sampling method (Equation 8):

$$R_{qs} \approx \frac{N_{qs}}{N_{ps}} \qquad \text{(Equation 8)}$$

As $N_{ps}$ increases, the error of the estimate decreases according to Equation 9:

$$\text{Accuracy} \sim \pm\frac{1}{\sqrt{N}} \qquad \text{(Equation 9)}$$

where N is the number of random simulation (e.g., Monte Carlo) trials. After a large number of Monte Carlo trials, as $1/\sqrt{N}$ keeps decreasing, the fluctuation of the qualified sample ratio shrinks and its accuracy improves. In other words, once a sufficient number of trials has been performed, the qualified sample ratio can be determined with a high degree of accuracy and precision. For example, for the constraint $1-\Delta \le \sum_{i=1}^{n} V_i \le 1+\Delta$, the more samples ($V_i$) the Monte Carlo simulation generates, the higher the accuracy.
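The estimate of Equation 8 can be sketched directly: draw $N_{ps}$ pseudo samples, count the $N_{qs}$ that satisfy the constraint, and divide. The function name, the two-component setup, and the tolerance of 0.1 are assumptions made for the example.

```python
import random


def qualified_sample_ratio(n_components, constraint, n_pseudo, rng):
    """Estimate R_qs = N_qs / N_ps from n_pseudo uniformly drawn pseudo samples."""
    n_qualified = sum(
        1
        for _ in range(n_pseudo)
        if constraint([rng.random() for _ in range(n_components)])
    )
    return n_qualified / n_pseudo


delta = 0.1   # hypothetical tolerance
ratio = qualified_sample_ratio(
    2, lambda v: 1 - delta <= sum(v) <= 1 + delta, 50_000, random.Random(5)
)
```

For two independent uniform variables and this tolerance the exact ratio is 0.19, so the estimate should approach that value as the number of pseudo samples grows.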
In one embodiment, in Monte Carlo trials that used Microsoft's random number generator (C++ compiler version 7.1.3091, 2003) as the random number function, it was observed that when a single qualified sample is obtained (the example condition appears only as an image in the source, Figure imgf000012_0001), the accuracy is on the order of 100%; when $N_{qs}$ reaches 10, the accuracy is between -30% and 30%; when $N_{qs}$ reaches $10^2$, between -10% and 10%; when $N_{qs}$ reaches $10^3$, between -3% and 3%; and when $N_{qs}$ reaches $10^4$, between -1% and 1%.
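The accuracy figures quoted above are consistent with a relative error that scales as one over the square root of the count. The short sketch below tabulates that scale for the same counts; reading Equation 9 as a $1/\sqrt{N}$ law is an interpretation adopted for this illustration, and the function name is hypothetical.

```python
import math


def relative_accuracy(n):
    """Return the 1/sqrt(n) scale of the relative error after n counted events."""
    return 1.0 / math.sqrt(n)


# Counts of qualified samples matching those quoted in the text.
table = {n: relative_accuracy(n) for n in (1, 10, 100, 1000, 10000)}
```

The tabulated values are 1, about 0.32, 0.1, about 0.032, and 0.01, which correspond to the quoted ranges of roughly ±100%, ±30%, ±10%, ±3%, and ±1%.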
In another aspect, the invention discloses a method of estimating the optimal number of qualified samples. The term "optimal number of qualified samples" here refers to the number of randomly sampled samples that satisfy one or more specified constraints and adequately represent the sample space.
In one embodiment of the invention, the optimal number of qualified samples is obtained by examining all possible pseudo samples generated from discrete variables and identifying those that satisfy the specified constraints. For a discrete variable, a set of discrete values is generated for a particular variable $V_i$ by dividing the interval $[V_{i,\min}, V_{i,\max}]$, uniformly or non-uniformly, into $M_i$ parts or cells, which yields a set of defined values on that interval. If the interval is divided uniformly into $M_i$ parts, the discrete value of the variable can be any value in $\{V_i\}$, where $V_i = V_{i,\min} + l\,(V_{i,\max} - V_{i,\min})/M_i$ and $l \in \{0, 1, 2, \ldots, M_i\}$. $M_i$ is a positive integer and can be any number between 1 and 1,000,000. In one embodiment, $M_i \in \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, \ldots, 20, \ldots, 25, \ldots, 30, \ldots, 40, \ldots, 50, \ldots, 10^2, \ldots, 10^3, \ldots, 10^4, \ldots, 10^5, \ldots, 10^6\}$.
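The uniform division of an interval can be sketched as a small helper that returns the grid values $V_{i,\min} + l\,(V_{i,\max} - V_{i,\min})/M_i$ for $l = 0, 1, \ldots, M_i$. The function name and the example interval are assumptions made for illustration.

```python
def grid_values(v_min, v_max, m):
    """Return the discrete values v_min + l * (v_max - v_min) / m for l = 0, 1, ..., m."""
    step = (v_max - v_min) / m
    return [v_min + l * step for l in range(m + 1)]


# Example: the interval [0.2, 0.8] divided uniformly into 3 parts.
levels = grid_values(0.2, 0.8, 3)
```

Dividing the interval into 3 parts yields 4 grid values, from 0.2 up to 0.8 in steps of 0.2 (subject to floating-point rounding).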
If a variable takes only discrete values in its interval, the total number of pseudo samples (Z) follows from the number of cells of each variable. For example, in any sample containing components $c_1, c_2, \ldots, c_n$, each component $c_i$ has a variable $V_i$, where $i \in \{1, 2, \ldots, n\}$, and each $V_i$ is divided into $M_i$ parts (cells, or points), so that $V_i$ takes a set of discrete values in the interval $[V_{\min}, V_{\max}]$ ($V_{\min} \ge 0$ and $V_{\max} \le 1$). The division is chosen on the basis of experience with the variable that corresponds to each component, and may be uniform or non-uniform. If constraints on the variables are not considered or not provided, the total number of pseudo samples (Z) based on a set $\{M_i\}$ represents the sample points of the n-dimensional sample space, and Z can be calculated from Equation 10:

$$Z = \prod_{i=1}^{n} M_i \qquad \text{(Equation 10)}$$
When at least one constraint is given, all pseudo samples can be examined, and those whose variables satisfy the constraint are selected and stored as vectors that together form the set of qualified samples. Once the entire sample space (all Z pseudo samples) has been examined, the number of qualified samples, that is, the number of stored vectors, is the optimal number of qualified samples in the sample space for the given set of discrete values.
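An exhaustive search of a small discrete sample space can be sketched with `itertools.product`. The two-component grid, the tolerance, and the function name are hypothetical; each component takes ten discrete values, so the space holds 10 x 10 = 100 pseudo samples.

```python
from itertools import product


def enumerate_qualified(levels_per_component, constraint):
    """Visit every pseudo sample of the discrete space and keep those that satisfy the constraint."""
    total = 0
    qualified = []
    for pseudo in product(*levels_per_component):
        total += 1
        if constraint(pseudo):
            qualified.append(pseudo)      # each stored vector holds the component ratios
    return total, qualified


levels = [i / 10 for i in range(1, 11)]   # ten hypothetical discrete values: 0.1, 0.2, ..., 1.0
delta = 0.05                              # hypothetical tolerance
total, qualified = enumerate_qualified(
    [levels, levels], lambda v: 1 - delta <= sum(v) <= 1 + delta
)
```

Here `total` is 100 and `len(qualified)` is 9: the nine pairs whose two values add up to 1.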
As the number of components (or variables) grows, and the number of divisions per component grows with it, an exhaustive search of the sample space becomes very expensive. For example, for a sample library of five components in which the interval of each component variable is divided into 100 cells, the sample space is five-dimensional and the total number of pseudo samples (Z) is $100^5$ (that is, $10^{10}$, or 10,000,000,000). Introducing one or more constraints makes the computation more complicated still. Although each pseudo sample could be examined to determine whether it satisfies the constraint or constraints, an alternative is to obtain an approximate estimate of the optimal number by random sampling as described herein (e.g., Monte Carlo simulation).
Therefore, in one embodiment, pseudo samples are generated by random simulation, each pseudo sample is examined to determine whether it satisfies the constraints, and the qualified samples and the qualified sample ratio are obtained by the methods described herein. In this simulation, the random numbers can be drawn from a set of discrete values, each of which lies in the interval assumed for the variable and has a certain probability. The random numbers can also be arbitrary values in an interval, drawn according to a specified probability distribution. The optimal number of qualified samples is then determined by the product of Z and $R_{qs}$. The optimal number of qualified samples can be expected to vary with a set of parameters, examples of which include the number of divisions (M) of each variable in the sample, the method of generating random numbers, the statistical distribution of the variables, the type of Monte Carlo simulation, the number of Monte Carlo trials, the constraints on the variables, the tolerance limits, and the required accuracy or precision.
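The product of Z and $R_{qs}$ can be sketched as follows: Z is computed from the number of discrete values per component, and $R_{qs}$ is estimated from randomly drawn grid points. The three-component grid, the tolerance, the trial count, and the function name are assumptions made for the example; three components are used rather than five to keep the run short.

```python
import random


def estimate_optimal_count(levels_per_component, constraint, n_trials, rng):
    """Estimate the number of qualified grid points as Z * R_qs."""
    z = 1
    for levels in levels_per_component:
        z *= len(levels)                      # Z: size of the discrete sample space
    hits = sum(
        1
        for _ in range(n_trials)
        if constraint([rng.choice(levels) for levels in levels_per_component])
    )
    return z * hits / n_trials                # Z * (N_qs / N_ps)


levels = [i / 100 for i in range(1, 101)]     # 100 hypothetical values: 0.01, 0.02, ..., 1.00
delta = 0.005                                 # hypothetical tolerance
estimate = estimate_optimal_count(
    [levels] * 3, lambda v: 1 - delta <= sum(v) <= 1 + delta, 200_000, random.Random(11)
)
```

For this grid an exhaustive count would visit 1,000,000 points and find 4851 whose values add up to 1; the sampled estimate should fall near that figure while examining only a fifth of the space.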
It will be understood that random sampling (e.g., Monte Carlo simulation), the generation of random numbers associated with a probability distribution, and the selection and calculation of qualified samples according to the methods provided herein are usually carried out by a computer system or server system.
A computer system (e.g., a server system) in the present invention refers to a computer or computer-readable medium designed and configured to perform some or all of the methods described herein. The computer (e.g., server) used here can be any of various types of general-purpose computer, such as a personal computer, network server, workstation, or other computer platform existing now or developed in the future. As is well known in the art, a computer typically includes some or all of the following components: a processor, an operating system, computer memory, input devices, and output devices. The computer may further include a cache memory, a data backup unit, and other devices. Those of ordinary skill in the art will appreciate that many other configurations of these computer components are possible.
The processor used here may include one or more microprocessors, field-programmable logic arrays, or one or more application-specific integrated circuits. Examples of processors include, but are not limited to, Intel's Pentium series processors, microprocessors from "S皿" (company name garbled in the source), Sun's workstation system processors, Motorola's personal desktop processors, MIPS Technologies' MIPS processors, Xilinx's top-of-the-line series of field-programmable logic arrays, and other processors.
The operating system used here comprises machine code that, when executed by the processor, coordinates and carries out the functions of the other components of the computer and helps the processor execute the functions of various computer programs, which may be written in a variety of programming languages. Besides managing the flow of data among the other components of the computer, the operating system also provides scheduling, input/output control, file and data management, memory management, communication control, and related services, all of which are known in the art. Typical operating systems include Microsoft's Windows operating systems, Unix or Linux operating systems available from many vendors, other operating systems existing now or developed in the future, and combinations of these.
The computer memory used here can be any of a variety of memory storage devices. Examples include commonly available random access memory, magnetic media such as a resident hard disk or tape, optical media such as a read/write compact disc, and other memory storage devices. The memory storage device can be any existing or future device, including a compact disc drive, tape drive, removable hard disk drive, or diskette drive. Memory storage devices of these types typically read from, or write to, a computer program storage medium such as a compact disc, magnetic tape, removable hard disk, or floppy diskette. Any of these program storage media can be regarded as a computer program product. Such computer program products typically store computer software programs and/or data. Computer software programs are typically stored in system memory and/or the memory storage device.
Those of ordinary skill in the art will readily appreciate that the computer software programs of the present invention can be executed after being loaded into system memory and/or a memory storage device through an input device. Alternatively, all or part of a software program may reside in read-only memory or a similar memory storage device, in which case the program need not first be loaded through an input device. Those of ordinary skill in the relevant art will understand that the software program, or portions of it, can be loaded by the processor in a known manner into system memory, cache memory, or both, as is advantageous for execution and for performing the random sampling.
Furthermore, the data obtained from formulation and process design using the combinatorial sample library design method of the present invention can be input directly into the computer control system of a material processing apparatus, so that the apparatus prepares samples according to the data. The material processing apparatus can be the one disclosed in the applicant's International Patent Application No. PCT/CN2005/002177, "Material Processing Apparatus and Its Application"; the reaction system disclosed in the applicant's Chinese Patent Application No. 200510029727.3; the parallel reaction system disclosed in the applicant's Chinese Patent Application No. 200610085162.5; the reaction system disclosed in the applicant's Chinese patent application "Reaction System", filed September 30, 2006; or the reactor disclosed in the applicant's Chinese patent application "Reactor", filed September 30, 2006, among others.
In one embodiment of the invention, the software is stored on a computer server that is connected to user terminals, input devices, or output devices through a data line, a wireless link, or a network system. As is generally known in the art, a network system comprises the hardware and software that electronically link computers or devices together. For example, a network system may include the Internet, 10/1000 Ethernet, IEEE 802.11x, IEEE 1394, xDSL, Bluetooth, a local area network, a wireless local area network, GSP, CDMA, 3G, PACS, or any other equipment based on an ANSI-approved standard medium.
Furthermore, in one respect, researchers can connect to the computer server through the network system from any location to design formulations and processes. In another respect, researchers at one site can connect to the computer server through the network system to design formulations and processes, while researchers at another site connect to the server through the network system to retrieve the formulation and process design data. This enables collaborative research across regions, facilitates centralized management of experimental equipment, and allows development and implementation to take place in different places. For details, see the applicant's Chinese Patent Application No. 200610100921.0, "Computer-Aided Graphical Experimental Design System and Method".
Furthermore, the data obtained from formulation and process design using the combinatorial sample library design method of the present invention can also be stored on a server and shared among researchers, for example through the Portal system designed by the applicant. Researchers can also communicate through the Portal system and then use the combinatorial sample library design method of the present invention for formulation and process design. For details, see the applicant's Chinese Patent Application No. 200610100921.0, "Computer-Aided Graphical Experimental Design System and Method".
The invention has been described above in general terms. Some specific examples are given below to describe it further and to aid understanding.
Example 1
This example shows how qualified samples are selected from pseudo samples, generated by Monte Carlo simulation, that consist of two components (cerium and iron). The cerium variable $V_{Ce}$ takes values between 0 and 1, as does the iron variable $V_{Fe}$. The Monte Carlo simulation uses uniformly distributed, randomly generated values of $V_{Ce}$ and $V_{Fe}$ between 0 and 1, and the random values of $V_{Ce}$ are generated independently of those of $V_{Fe}$. No relationship or constraint is imposed between $V_{Ce}$ and $V_{Fe}$ in this simulation. The result of the Monte Carlo simulation is a population of pseudo samples: all of the points shown in Figure 1 (hollow, gray, and dark) together make up the set of pseudo samples.
Empirical knowledge can be used to reduce the number of qualified samples by introducing constraints. The first constraint is defined as $0.2 < V_{Ce} < 0.8$ and $0.2 < V_{Fe} < 0.8$. When the selection process takes the first constraint into account, the selected pseudo samples are those shown as black or gray points in Figure 1. The second constraint is defined as $1-\Delta < V_{Ce} + V_{Fe} < 1+\Delta$. When the second constraint is applied in addition to the first, the pseudo samples that satisfy both constraints are those shown as black points in Figure 1.
Thus, in designing a sample library composed of Ce and Fe, the Monte Carlo simulation brings experience into the design through the two constraints and yields definite information for designing the library. For example, since the number of pseudo samples that satisfy both constraints is known, the number of qualified samples is known. The component ratios of each qualified sample are likewise known, as shown in Table I. Table I lists the pseudo sample values generated by the Monte Carlo simulation; values in italics are pseudo samples that satisfy the first constraint, and boxed values are pseudo samples that satisfy both the first and second constraints.
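The two-stage selection of Example 1 can be sketched as follows. The number of pseudo samples, the seed, and the tolerance value are hypothetical, since the example does not state them; only the form of the two constraints is taken from the text.

```python
import random

rng = random.Random(2006)
delta = 0.05   # hypothetical tolerance; the example gives no numeric value

# Pseudo samples: independent uniform values (V_Ce, V_Fe) with no constraint applied.
pseudo = [(rng.random(), rng.random()) for _ in range(2000)]

# First constraint: both variables lie strictly between 0.2 and 0.8.
first = [(ce, fe) for ce, fe in pseudo if 0.2 < ce < 0.8 and 0.2 < fe < 0.8]

# Second constraint, applied on top of the first: the two ratios add up to about 1.
both = [(ce, fe) for ce, fe in first if 1 - delta < ce + fe < 1 + delta]
```

In terms of Figure 1, `pseudo` corresponds to all plotted points, `first` to the black and gray points, and `both` to the black points.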
If the variables are discretized on specific grid points, the optimal number of samples can be obtained by the method provided by the present invention.

Table I. Values of V_Ce and V_Fe in the Monte Carlo simulation
Fe          Ce
0.040000    0.160000
0.120000    0.340000
0.440000    0.120000
0.620000    0.340000
0.160000    0.040000
0.460000    0.020000
0.460000    0.040000
0.460000    1.000000
0.560000    0.060000
0.860000    0.040000
0.960000    0.020000
0.000000    0.180000
0.000000    0.500000
[Figures imgf000017_0001 and imgf000018_0001: continuation of Table I; the values on these pages are not recoverable from the text]
0.380000    0.100000
0.380000    0.700000
0.480000    0.800000
0.580000    0.600000
0.120000    0.080000
0.520000    1.000000
0.720000    1.000000
0.920000    0.020000
0.020000    0.180000
0.620000    0.160000
Example 2
This example shows how to select qualified samples from pseudo samples composed of four components (cerium, iron, tungsten, and nickel) generated by Monte Carlo simulation. The variables for cerium, iron, tungsten, and nickel, V_Ce, V_Fe, V_W, and V_Ni, all take values between 0 and 1. In the Monte Carlo simulation, each variable is given uniformly distributed, randomly generated values between 0 and 1; the randomly generated values are independent of one another and are not subject to any constraint. The result of the Monte Carlo simulation is a set of sample points (pseudo samples) in four-dimensional space; the projection of these four-dimensional sample points into three-dimensional space is shown in Figure 2.
To select the pseudo samples that correspond to physically real samples, a first constraint is introduced, defined as V_Ce + V_Fe + V_W + V_Ni = 1. When the selection process applies the first constraint, the selected pseudo samples that satisfy it (the first pseudo samples) are those shown in Figure 3.
Suppose some experience leads us to conclude that the sum of the cerium and iron components always equals the sum of the tungsten and nickel components. We can then introduce a second constraint, defined as V_Ce + V_Fe = V_W + V_Ni. When the second constraint is applied in addition to the first, the selected pseudo samples that satisfy both constraints (the second pseudo samples) lie on a two-dimensional plane in three-dimensional space, as shown in Figure 4. The qualified sample points in Figure 4 are viewed from one side of that two-dimensional plane.
Thus, the Monte Carlo simulation with the two constraints provides definite information for designing a sample library that consists of four components and is subject to these two constraints. For example, the qualified sample ratio can be calculated by dividing the number of pseudo samples that satisfy both constraints by the total number of pseudo samples. Since the number of pseudo samples satisfying both constraints is known, the number of qualified samples is known. By recording the component ratios of the variables in each qualified sample, the component ratios of the qualified samples are known. If the variables are discretized on specific grid points, the optimal number of samples can likewise be obtained by the method provided by the present invention.
In the above Monte Carlo simulation, a total of 28561 pseudo samples were obtained, of which 460 satisfy the first constraint and 47 satisfy both constraints; the qualified sample ratio for the two constraints is therefore 47/28561 ≈ 0.0016456.
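A hedged sketch of this four-component selection follows. Because an exact equality such as V_Ce + V_Fe + V_W + V_Ni = 1 can only be met reliably on a grid, the sketch draws integer grid indices rather than floating-point values. The choice of 13 grid levels per variable is an assumption, suggested only by the fact that 13^4 = 28561; the function names and the seed are likewise hypothetical, and the counts the sketch prints are not expected to match the 460 and 47 reported above.

```python
import random

LEVELS = 13  # assumption: 13 grid values per variable, since 13**4 == 28561

def draw_pseudo_sample(rng, levels=LEVELS):
    """One pseudo sample: independent grid indices for (Ce, Fe, W, Ni)."""
    return tuple(rng.randrange(levels) for _ in range(4))

def meets_first(s, levels=LEVELS):
    """V_Ce + V_Fe + V_W + V_Ni = 1, where each value is index / (levels - 1)."""
    return sum(s) == levels - 1

def meets_second(s):
    """V_Ce + V_Fe = V_W + V_Ni, compared exactly on grid indices."""
    return s[0] + s[1] == s[2] + s[3]

def qualified_ratio(n_qualified, n_pseudo):
    """Qualified sample ratio: qualified count divided by pseudo sample count."""
    return n_qualified / n_pseudo

rng = random.Random(0)  # hypothetical seed
pseudo = [draw_pseudo_sample(rng) for _ in range(28561)]
first = [s for s in pseudo if meets_first(s)]
both = [s for s in first if meets_second(s)]
print(len(first), len(both), qualified_ratio(len(both), len(pseudo)))

# Ratio computed from the counts reported in the text (47 of 28561):
print(round(qualified_ratio(47, 28561), 7))
```

Working in integer indices avoids floating-point equality tests; a grid index i corresponds to the component value i / (LEVELS - 1).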
Example 3
This example describes a computer program that allows a user to enter information through a graphical user interface and to perform calculations and simulations (including Monte Carlo simulation) in order to design a library of qualified samples.
As shown in Figure 5, the graphical user interface allows the user to select the components required for the sample design. For example, for a sample composed of components A, B, and C, component A may be any one of the element group consisting of vanadium (V), niobium (Nb), and molybdenum (Mo); the variable of component A (V_a) ranges from 0 to 1 (see the range 0.00 to 1.00 in Figure 5), and this range is divided into 10 parts (the 10 segments shown in Figure 5). As a result, component A is assigned a variable (V_a) taking values between 0 and 1 (see Figure 6); likewise, components B and C are assigned corresponding variables (V_b and V_c) (see Figure 6).
As one way of incorporating empirical knowledge into the sampling design, the graphical user interface allows the user to specify constraints on one variable or among several variables. The default, or first implicit, constraint on the variables V_a, V_b, and V_c is V_a + V_b + V_c = 1 ± Δ, where Δ is the error (or constraint tolerance), given as 0.01 in this example (see Figure 6). The required second constraint, V_a : V_b = 2 : 1, is entered through the graphical user interface (see Figure 7).
The graphical user interface further allows the user to decide how the optimal number of qualified samples is estimated. As shown in Figure 8, it offers calculations at six different accuracy levels. In the exact calculation, pseudo samples are generated according to Equation 10 of the present invention without any constraints; the computer then checks the pseudo samples and selects only those that satisfy the constraints. In this example, 198 samples satisfy the constraints, and the component ratios of all 198 qualified samples are also obtained.
A calculation accuracy between -100% and 100% is considered very low; between -30% and 30%, low; between -10% and 10%, medium; between -3% and 3%, high; and between -1.0% and 1.0%, very high.
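The generate-then-filter idea behind the exact calculation can be illustrated with a small exhaustive enumeration. The sketch below is an assumption-laden illustration, not the program of this example: it does not implement Equation 10, it assumes each variable takes the 11 grid values 0.0, 0.1, ..., 1.0, it compares the 2 : 1 ratio exactly on grid indices, and it excludes the degenerate case V_a = V_b = 0. It therefore does not reproduce the figure of 198 qualified samples; the function name `enumerate_qualified` is hypothetical.

```python
from itertools import product

def enumerate_qualified(segments=10, delta=0.01):
    """Enumerate every grid point (Va, Vb, Vc) and keep those meeting both constraints.

    Constraint 1: Va + Vb + Vc = 1 within the tolerance delta.
    Constraint 2: Va : Vb = 2 : 1, tested exactly on the integer grid indices.
    """
    qualified = []
    for i, j, k in product(range(segments + 1), repeat=3):
        va, vb, vc = i / segments, j / segments, k / segments
        sum_ok = abs(va + vb + vc - 1.0) <= delta
        ratio_ok = j > 0 and i == 2 * j  # assumed reading of the 2 : 1 constraint
        if sum_ok and ratio_ok:
            qualified.append((va, vb, vc))
    return qualified

for sample in enumerate_qualified():
    print(sample)
```

Each printed tuple gives the component ratios of one qualified sample on this assumed grid; a finer grid or a tolerance on the ratio constraint would admit more samples.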
Example 4
This example shows a computer program that obtains a specified number of qualified sample points. The program allows the user to enter information through a graphical user interface and to perform calculations and simulations (including Monte Carlo simulation), thereby obtaining the component ratios of each of these sample points.
Figure 9 shows that the graphical user interface allows the user to enter the specified total number of sample points, 125, where each sample point has four components. The graphical user interface also allows the user to specify the desired components (see Figure 10) and to define constraints on the variables (see Figure 11). After the constraint tolerance is defined (see Figure 12), the simulation is started, during which each pseudo sample is checked against the specified constraints. The simulation stops when the number of qualified samples reaches 125, yielding the four components (elements Pd, …) for the 125 sample points.
[Figures imgf000022_0001, imgf000023_0001, imgf000023_0002, imgf000024_0001, and imgf000024_0002: tabulated component ratios of the sample points; the values on these pages are not recoverable from the text]
0.157895    0.039474    0.184211    0.618421
0.157895    0.052632    0.684211    0.105263
0           0.171053    0.789474    0.039474
0.013158    0.039474    0.144737    0.802632
0.052632    0.039474    0.473684    0.434211
0.052632    0.065789    0.723684    0.157895
0.065789    0.078947    0.789474    0.065789
0.092105    0.171053    0.736842    0
0.105263    0.065789    0.105263    0.723684
0.118421    0.013158    0.842105    0.026316
0.131579    0.078947    0.434211    0.355263
0.157895    0.026316    0.513158    0.302632
0.026316    0.078947    0.736842    0.157895
0.039474    0.052632    0.210526    0.697368
0.065789    0.013158    0.315789    0.605263
0.092105    0.078947    0.118421    0.710526
0.092105    0.078947    0.210526    0.618421
0.105263    0.039474    0.592105    0.263158
0.184211    0.013158    0.210526    0.592105
0           0.157895    0.223684    0.618421
0           0.184211    0.618421    0.197368
0.013158    0.078947    0.315789    0.592105
0.065789    0.078947    0.802632    0.052632
0.078947    0.026316    0.381579    0.513158
0.105263    0.065789    0.684211    0.144737
0.131579    0.013158    0.618421    0.236842
0.157895    0.092105    0.539474    0.210526
0.157895    0.144737    0.078947    0.618421
0.013158    0.118421    0.868421    0
0.105263    0.013158    0.223684    0.657895
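The stopping rule of Example 4 (keep generating and checking pseudo samples until a target number of qualified samples is reached) can be sketched as follows. This is an illustration under assumptions, not the program described above: the actual constraints of Example 4 are defined in Figure 11 and are not reproduced in the text, so the sketch uses only a sum-to-one check as a placeholder. The grid of 77 levels is also an assumption, suggested by the legible tabulated values appearing to be multiples of 1/76, and all names, the seed, and the `max_draws` safeguard are hypothetical.

```python
import random

def sums_to_one(idx, levels):
    """Placeholder constraint: the component values (index / (levels - 1)) sum to 1."""
    return sum(idx) == levels - 1

def collect_qualified(target, is_qualified, n_components=4, levels=77,
                      seed=0, max_draws=10_000_000):
    """Generate pseudo samples until `target` qualified samples are found.

    Returns (qualified samples as component ratios, number of pseudo samples drawn).
    max_draws is a safeguard so that the loop always terminates.
    """
    rng = random.Random(seed)
    qualified, draws = [], 0
    while len(qualified) < target and draws < max_draws:
        idx = tuple(rng.randrange(levels) for _ in range(n_components))
        draws += 1
        if is_qualified(idx, levels):
            qualified.append(tuple(i / (levels - 1) for i in idx))
    return qualified, draws

samples, draws = collect_qualified(125, sums_to_one)
print(len(samples), draws)
print(samples[0])
```

The ratio of qualified samples found to pseudo samples drawn, `len(samples) / draws`, gives an estimate of the qualified sample ratio for whatever constraint is supplied.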
The papers and patents cited in this application are incorporated by reference. The descriptions, examples, and data in the embodiments above are for demonstration and illustration only and do not limit the scope of the invention. Any insubstantial modification made in accordance with the invention falls within the scope of the claims. Accordingly, the spirit and scope of the appended claims are not limited to the description of the invention given in this application.

Claims

1. A method of designing a composite sample library, comprising the following steps:
(1) providing the components that make up the sample;
(2) providing a variable for each of said components, the variable taking values within a certain interval;
(3) setting at least one constraint for at least one of said variables;
(4) generating a pseudo sample;
(5) testing the pseudo sample to determine whether it is a qualified sample; and
(6) repeating steps (4) and (5) until at least one said qualified sample is determined.
2. The method of claim 1, wherein the value interval of the component variable is determined by empirical knowledge.
3. The method of claim 1, wherein the constraint is a relationship among said variables determined by empirical knowledge.
4. The method of claim 1, further comprising the step of determining a qualified sample ratio, the qualified sample ratio being obtained by dividing the number of said qualified samples by the number of said pseudo samples.
5. The method of claim 1, wherein the pseudo sample is generated by random sampling.
6. The method of claim 5, wherein the random sampling is a Monte Carlo simulation.
7. The method of claim 5, wherein the random sampling is performed using variable values having a certain probability distribution.
8. The method of claim 7, wherein the probability distribution is a uniform distribution.
9. The method of claim 7, wherein the probability distribution is a non-uniform distribution.
10. The method of claim 9, wherein the non-uniform distribution is one or more selected from the group consisting of: Bernoulli distribution, beta distribution, chi-square distribution, exponential distribution, F distribution, gamma distribution, Gaussian distribution, normal distribution (for example, lognormal, multivariate normal, and univariate normal distributions), noncentral chi-square distribution, noncentral F distribution, binomial distribution, negative binomial distribution, multinomial distribution, Pareto distribution, Poisson distribution, Student's t distribution, and Tsallis distribution.
11. The method of claim 7, wherein the variable values are randomly generated by a random number generator.
12. The method of claim 11, wherein the random number generator is selected from the group consisting of linear congruential generators, shift-register sequence generators, and quasi-random number generators.
13. A method of providing a sample library having a specified number of qualified samples, comprising the following steps:
(1) providing the number of samples to be designed;
(2) providing the components that make up said samples;
(3) providing a variable for each of said components, wherein the variable takes values within a certain interval;
(4) setting at least one constraint for at least one of said variables;
(5) generating a pseudo sample;
(6) testing the pseudo sample to determine whether it is a qualified sample; and
repeating steps (4) and (5) until the number of qualified samples specified in (1) is reached.
14. The method of claim 13, wherein the value interval of the component variable is determined by empirical knowledge.
15. The method of claim 13, wherein the constraint is a relationship among said variables determined by empirical knowledge.
16. A method of determining the optimal number of samples in a composite sample library, comprising the following steps:
(1) providing the components that make up said samples;
(2) providing a variable for each of said components, wherein the variable takes values within a certain interval;
(3) setting at least one constraint for at least one of said variables;
(4) generating pseudo samples;
(5) determining the qualified samples among the pseudo samples;
(6) determining a qualified sample ratio, the qualified sample ratio being obtained by dividing the number of qualified samples by the number of pseudo samples;
(7) determining the number of samples; and
(8) calculating the optimal number of experiments, wherein said optimal number is the product of said qualified sample ratio and said number of samples.
17. The method of claim 16, wherein the value interval of the variable is determined by empirical knowledge.
18. The method of claim 16, wherein the constraint is a relationship among said variables determined by empirical knowledge.
19. A computer-readable medium storing instructions which, when executed by one or more processors, cause said one or more processors to perform the method of any one of claims 1-18.
20. A system for designing a composite sample library, comprising:
(1) a means for providing the components that make up said sample;
(2) a means for providing a variable for each of said components, the variable taking values within an interval;
(3) a means for setting at least one constraint for at least one of said variables;
(4) a means for generating a pseudo sample;
(5) a means for testing the pseudo sample to determine whether it is a qualified sample; and
(6) a means for determining at least one said qualified sample.
PCT/CN2006/002691 2005-10-13 2006-10-13 Method and system for designing a composite sample library WO2007041966A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNB200510030502XA CN100558948C (en) 2005-10-13 2005-10-13 The method of design of combined sample library and system
CN200510030502.X 2005-10-13

Publications (1)

Publication Number Publication Date
WO2007041966A1 true WO2007041966A1 (en) 2007-04-19

Family

ID=37942319

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2006/002691 WO2007041966A1 (en) 2005-10-13 2006-10-13 Method and system for designing a composite sample library

Country Status (2)

Country Link
CN (1) CN100558948C (en)
WO (1) WO2007041966A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107844985A (en) * 2016-09-21 2018-03-27 腾讯科技(深圳)有限公司 A kind of probability product data processing method, system and terminal

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1603789A (en) * 2004-11-08 2005-04-06 武汉大学 Method for measuring phase content of Ni-base superalloy
US20050182572A1 (en) * 2004-02-13 2005-08-18 Wollenberg Robert H. High throughput screening methods for fuel compositions

Also Published As

Publication number Publication date
CN1948559A (en) 2007-04-18
CN100558948C (en) 2009-11-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 06791257

Country of ref document: EP

Kind code of ref document: A1