WO1997017659A1 - Biological neural network - Google Patents
Biological neural network
- Publication number
- WO1997017659A1 (PCT/JP1996/003283)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- vector
- function element
- discriminant function
- pattern
- random number
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Definitions
- the present invention provides a basic principle of advanced or ultimate artificial intelligence, including new technologies for using digital information. Industrial applicability: the present invention can be used in a wide range of fields that require a pattern recognition ability equal to or greater than that of a human.
- the present invention can be used for a full-scale handwritten character recognition system including kanji, a general object recognition system, and a voice recognition system for an unspecified speaker.
- the present invention can also be used as a general prediction system. Typical examples include weather forecasting and earthquake forecasting, as well as forecasting of changes in the state of general physical systems, and forecasting of various economic indicators including stock prices and exchange rates.
- the present invention can be used as a new function of a general data processing system itself represented by a computer, especially as a general hashing function of a large-scale database.
- the present invention can also be used as an image search system including an associative memory system.
- the present invention can also be used as a sense function or access function of a highly integrated digital memory or analog memory.
- the invention can be used for high speed searching of large databases.
- This invention can be used for the automatic circuit design of the whole information processing apparatus.
- the present invention can also be used for building an advanced automatic control system.
- the present invention can also be used for building an advanced autonomous driving system.
- the invention can also be used to build sophisticated target tracking systems.
- the present invention can also be used to build an advanced target identification system.
- the present invention can also be used for constructing an advanced automatic diagnostic system.
- the present invention can also be used for constructing an advanced artificial intelligence system.
- the present invention can also be used as an advanced encryption communication system.
- the present invention can also be used to construct the brain of a sophisticated robot.
- Background art:
- pattern recognition is the essential problem that artificial intelligence has to solve. Artificial intelligence that ignores it can never exceed the ability of living things.
- the fact that knowledge fragments are collected by experts and the fact that "feature extraction" is performed by "researchers" in pattern recognition are considered to be equivalent problems.
- the final target is complex information (a category that integrates categories distributed across multiple pattern spaces).
- the functional form of the evaluation function is not generally determined, and for the multiple variables that are added together at predetermined ratios in the evaluation function, fundamental problems, such as the fact that their attributes and unit systems could not be unified, were not solved in a form that could be physically understood.
- the arithmetic of probability has not been adapted to the organizational structure of an artificial intelligence system, which requires a large-scale and complex hierarchy.
- probability is a quantity defined as a "ratio" and requires its own specialized arithmetic, whereas it became clear that the metric essentially required for pattern recognition should be a quantity on which the four ordinary arithmetic operations can be performed.
- in the connectionist model, ideas such as backpropagation have been devised in recent years, but the large number of times the learning patterns must be input (the number of passes) during learning remains a problem.
- because the ordering of the system components is reversed relative to their ordering in the brains of living organisms, the system is always "clogged" (that is, it becomes unbearably slow).
- the coarse "pattern filter" precedes the fine "pattern filter", and in some cases the coarse "pattern filter" alone may be enough to cut noise.
- current artificial intelligence lacks a general approach by which a system that can perform such processing (that is, "trivial processing" or global judgment) could be built automatically, without human intervention.
- a first object of the present invention is to eliminate the above-mentioned problems and questions and to provide a strong basic principle, together with a concrete construction method, for a simple and practical artificial intelligence system that includes a pattern recognition function.
- a second object of the present invention is to provide a high-speed information retrieval system, including an image retrieval system, for large-scale databases. It is intended to realize a macro information hashing mechanism and a content search mechanism (content-addressable memory).
- a third object of the present invention is to provide a simple and powerful pattern recognition or prediction method that applies to all kinds of objects (including non-equilibrium systems and complex systems), whether natural, artificial, physical, or mathematical, and that can be realized by a unified method. Solving this problem is expected to qualify artificial intelligence to fully enter the so-called linguistic domain, such as "translation".
- a machine could then be built that reads newspapers and magazines aloud with fluent pronunciation after "understanding" their meaning, which would be good news for those who have lost their sight.
- the present invention may seem purely mathematical insofar as it appears in the following claims.
- the principle of the present invention (which may look simple but appears to be a deep one) can be explained fully and rigorously only by relying on a physical model. It is described below step by step.
- an input pattern or a portion thereof (hereinafter referred to as an "input pattern") is used.
- the system has, as system components, elements (hereinafter referred to as "discriminant function elements") that act as determining means for deciding whether the input pattern belongs to a given or predetermined category. Each element calculates the inner product of a pattern vector corresponding to the input pattern and a predetermined weight vector, and performs a magnitude comparison between the calculated inner product and a predetermined threshold. When the system components are clustered, new elements are generated by division through self-proliferation means.
- the input pattern is defined as an n-dimensional vector (where n is finite) whose component in each dimension is either logical ternary or logical binary. Each component of the corresponding pattern vector is assigned a real value: -1, 0, or +1 when the components of the n-dimensional vector are ternary, and -1 or +1 when they are binary.
- the space based on the entire set of pattern vectors and the results of algebraic operations on them (hereinafter referred to as the "pattern space") is defined as the vector space of all n-dimensional vectors whose component in each dimension takes a real value with absolute value not exceeding 1.
- the "target vector" is the arithmetic mean vector determined by calculating, for each dimension, the arithmetic mean (or an approximate value of it) of the components of all pattern vectors corresponding to the set of representative patterns of the given or predetermined category.
- the "reference vector" is the arithmetic mean vector determined in the same way from all pattern vectors belonging to a set that is arbitrarily defined in the pattern space as the criterion for similarity.
- as the weight vector, the discriminant function element uses the result of adding a variation of the target vector (hereinafter referred to as the "mutation vector") to the target vector and then subtracting the reference vector.
- the above-mentioned mutation vector has no direct relationship with the basic principle of the present invention described below, and is treated below as a zero vector.
- the mutation vector will be described in detail in a later example.
- the mutation vector is an internal parameter that is formally added because it is needed when a discriminant function element divides its own work vertically with the "descendant" discriminant function elements that it generates.
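The decision rule described above (weight vector, inner product, threshold comparison) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the example vectors, and the choice of a strict "greater than" comparison are all assumptions made here.

```python
def weight_vector(target, reference, mutation=None):
    """Weight = target + mutation - reference (mutation defaults to the zero vector)."""
    if mutation is None:
        mutation = [0.0] * len(target)
    return [t + m - r for t, m, r in zip(target, mutation, reference)]


def discriminate(pattern, weight, threshold):
    """Return True when the inner product of pattern and weight exceeds the threshold.

    The strict comparison is an assumption; the text only says a magnitude
    comparison is performed.
    """
    inner = sum(p * w for p, w in zip(pattern, weight))
    return inner > threshold


# Hypothetical target and reference vectors in a 3-dimensional pattern space.
target = [1.0, 0.5, -0.5]
reference = [0.0, 0.5, 0.5]
w = weight_vector(target, reference)                      # [1.0, 0.0, -1.0]
accepted = discriminate([+1, -1, -1], w, threshold=0.5)   # inner product 2.0
rejected = discriminate([-1, +1, +1], w, threshold=0.5)   # inner product -2.0
```

Note that the second dimension, where target and reference agree, drops out of the weight vector entirely.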
- the pattern space is defined separately within each discriminant function element, in correspondence with that element.
- the physical model that provides the basis of the present invention is described next.
- the above-described n-dimensional vector in which the components of each dimension are expressed by a logical binary or ternary value is input to the discriminant function element arranged in the system of the present invention.
- the above logical values are not necessarily values taken by mathematical logical variables; they may be information obtained when the states and events of real-world objects are detected by an observation mechanism, digitized, and transmitted according to predetermined rules.
- the above digital information may be treated physically as an ordinary quantity, that is, a physical quantity. More precisely, this applies to the signal level representing the digital information; when its content (that is, its meaning) is handled, the logical value is inferred or decoded from the signal level according to a predetermined rule.
- in that decoding, the physical correspondence between the signal level and the logical value is broken.
- in the discriminant function element provided in the system of the present invention, the above-mentioned decoding is performed while preserving the physical correspondence.
- a logical value is determined, based on a predetermined criterion, from a signal representing input digital information (that is, an input pattern). (The predetermined criterion is generally known only to the discriminant function element.)
- the logical value is then converted, for convenience, into a physical and normalized quantity, that is, a pattern vector.
- the above-mentioned conversion can be omitted.
- the above-mentioned decoding treats the recognized physical and normalized quantity (the pattern vector) as a predetermined physical value.
- the recognition target (input pattern) that is recognized first according to a predetermined rule is a logical value, and the signal level of the signal expressing that logical value is generally not questioned.
- a normalized physical quantity is extracted from the above-mentioned signal.
- the meaning of the above two points is simply the premise that, in the discriminant function element provided in the system of the present invention, the significant recognition target (the pattern vector) is only a physical quantity that is quantized from the beginning.
- the pattern vector is considered to represent one physical mass point of unit mass that exists in a physical space (an n-dimensional space) corresponding to the pattern space. The mass point is further considered to have a velocity vector expressed by the pattern vector, which is obtained by converting the input pattern according to the rule. Therefore, taking the unit mass as 1, the mass point is regarded as having one unit of momentum expressed by the pattern vector. (At this time, the position of the mass point is not questioned.) Proceeding with this modeling, for any given category one can consider the system of mass points corresponding to all the patterns included in the category, and the moving speed of its center of gravity.
- the moving speed of the center of gravity of the mass point system can actually be calculated from a simple law of dynamics. The result is the sum of all the pattern vectors corresponding to the patterns belonging to the category, divided by the number of patterns belonging to the category.
- the arithmetic mean vector was given above as an algorithm; the result here gives it a physical definition. (Accordingly, the target vector and the reference vector can each be considered to indicate the moving speed of the center of gravity of a mass point system.)
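The center-of-gravity argument above can be written out as a one-line derivation. The symbols $N$ (number of patterns in the category), $m$ (mass of each point), $p^{(k)}$ (the k-th pattern vector), and $V$ (velocity of the center of gravity) are notation chosen here for illustration and are not taken from the patent's own equations.

```latex
\[
V \;=\; \frac{\sum_{k=1}^{N} m\,p^{(k)}}{\sum_{k=1}^{N} m}
  \;=\; \frac{1}{N}\sum_{k=1}^{N} p^{(k)} \qquad (m = 1)
\]
```

Total momentum divided by total mass gives the velocity of the center of gravity; with unit masses this is exactly the arithmetic mean of the pattern vectors.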
- the moving speed of the center of gravity can be regarded as the "feature vector" of a category. A single input pattern whose identification with a given or predetermined category is to be determined can likewise be regarded as a category consisting of a single element.
- the feature vectors of that single-element category and of the given or predetermined category can then be compared, and the identification performed according to their similarity.
- n-dimensional random walk: the mass point of the random walk is usually placed at the origin as its initial position, and the amount by which one random walk trial moves the mass point along each coordinate axis of the n-dimensional random walk space is equal to the corresponding component of the pattern vector for the input pattern.
- the moving speed of the center of gravity of the mass point system (the target vector or reference vector) corresponding to a category is only the moving speed seen from the coordinate origin (the absolute speed). A similarity based on it is therefore always judged from the absolute origin and, on close examination, is essentially useless for solving the pattern recognition problem. In the method of the present invention, when the similarity between a pattern and a category is calculated, another category serving as a reference is prepared, and the relative similarity with respect to that reference category is calculated. The vector used in this calculation is the difference between the target vector and the reference vector. Relative similarity also appears routinely in everyday conversation.
- the feature vector of a category is not by itself directly useful for solving the pattern recognition problem. For example, evaluating the coordinates of the tip of a feature vector is not the right answer. Because a feature vector is nothing more than a skeleton that summarizes a category, the pattern recognition problem itself remains unsolved no matter how the summarization is done; it cannot be solved without the concept of relative similarity. Even if significant common features are found between a pattern and a candidate category, overestimating them is sometimes wrong, because the same features may be just as prominent in other candidate categories. Features such as "people with prominent noses" or "blue-eyed people" may help to infer that a person is a Westerner, but they would not be very useful for identifying a particular person among Westerners. Conversely, even a small, quantitatively insignificant feature may have to be valued highly if it is a unique feature found nowhere else among the candidate categories.
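The contrast between absolute and relative similarity can be illustrated with a small sketch. The two categories, the pattern, and all function names below are hypothetical, and the similarity is taken to be an inner product with the difference vector, as the surrounding text suggests; this is an illustration under those assumptions rather than the patent's own calculation.

```python
def mean_vector(patterns):
    """Per-dimension arithmetic mean of a list of equal-length pattern vectors."""
    n = len(patterns[0])
    return [sum(p[i] for p in patterns) / len(patterns) for i in range(n)]


def inner(a, b):
    return sum(x * y for x, y in zip(a, b))


def relative_similarity(pattern, target, reference):
    """Inner product of the pattern with (target - reference)."""
    diff = [t - r for t, r in zip(target, reference)]
    return inner(pattern, diff)


# Dimensions 0 and 1 are features shared by both categories;
# only dimension 2 distinguishes them.
category_a = [[+1, +1, +1], [+1, +1, +1]]
category_b = [[+1, +1, -1], [+1, +1, -1]]
va = mean_vector(category_a)   # [1.0, 1.0, 1.0]
vb = mean_vector(category_b)   # [1.0, 1.0, -1.0]

p = [+1, +1, -1]               # actually a member of category_b
absolute_to_a = inner(p, va)                   # 1.0: shared features make p look like A
relative_to_a = relative_similarity(p, va, vb) # -2.0: only the distinctive feature counts
```

The shared features inflate the absolute similarity to category A, while the relative similarity cancels them and is decided by the one distinctive dimension.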
- FIG. 1 is an explanatory diagram for explaining the gist of the present invention.
- the image of the pattern space (an n-dimensional vector space) is shown in a two-dimensional coordinate space for convenience, so the figure is not exact. In particular, because the components of each pattern vector place it at a vertex of a hypercube, the sets are not drawn exactly in this figure.
- the ratio of the scale lengths of the coordinate axes is not 1: 1 because the aspect ratio is taken into consideration for visibility.
- 10 indicates the coordinate origin
- 11 indicates the Y axis of the coordinate
- 12 indicates the X axis of the coordinate
- 1 is the set of pattern vectors corresponding to the representative patterns (the patterns to be identified) of the given or predetermined category
- 2 is the set of pattern vectors corresponding to the representative patterns belonging to the reference category
- 21 is the target vector
- 22 is the reference vector
- 23 is the mutation vector
- 4 is the pattern vector corresponding to the input pattern
- 3 is the weight vector, obtained by subtracting the reference vector 22 from the sum of the target vector 21 and the mutation vector 23.
- the mutation vector 23 is treated here as the zero vector, so it is omitted from the mathematical expressions.
- the case where the mutation vector 23 is not the zero vector will be described in an embodiment below.
- FIG. 2 is a set-theoretic diagram illustrating the relationship between the set 1 of pattern vectors corresponding to the given or predetermined category and the set 2 of pattern vectors corresponding to the reference category.
- (a) in the figure shows the case where there is no inclusion relationship between the two
- (b) shows the case where there is a partial inclusion relationship between them
- (c) shows the case where set 1 is a proper subset of set 2.
- These relationships are extremely general, and in every case the existence of the reference category is effective. Such relative relationships might be regarded as meaningless or merely formal concepts, but they have been found to be important, so they are shown in the figure.
- the case of (c) can be compared to a certain physical model. It is the situation of the diffusion of a gas of a specific component in the mixed gas. For example, it is considered that the vector 3 in FIG. 1 shows a kind of diffusion rate.
- the moving speed of the center of gravity is the zero vector when the pattern vector corresponding to any element of the category virtually always has exactly one complementary pattern vector in one-to-one correspondence, that is, a vector whose components all equal the corresponding components of the pattern vector in absolute value, with only the signs inverted.
- incidentally, the absolute similarity conventionally used is a relative similarity in which this zero vector is always used as the reference vector.
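The complement argument can be checked numerically. The sketch below assumes, purely for illustration, a category that contains every binary pattern in three dimensions, so that each pattern has its sign-inverted partner; the helper name `mean_vector` is hypothetical.

```python
from itertools import product


def mean_vector(patterns):
    """Per-dimension arithmetic mean of a list of equal-length pattern vectors."""
    n = len(patterns[0])
    return [sum(p[i] for p in patterns) / len(patterns) for i in range(n)]


# All 2**3 binary patterns with components -1 or +1.
all_patterns = [list(p) for p in product((-1, +1), repeat=3)]

# Every pattern has exactly one complement (all signs inverted) in the set.
has_complement = all([-c for c in p] in all_patterns for p in all_patterns)

# The contributions of each pattern and its complement cancel.
centroid = mean_vector(all_patterns)   # [0.0, 0.0, 0.0]
```

With this zero vector as the reference, the weight vector reduces to the target vector alone, which is the absolute-similarity case mentioned above.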
- the above-mentioned absolute similarity is calculated in discriminant function elements, such as those in the first stage of the system, whose reference vector can be determined a priori or empirically to be the zero vector.
- this applies to the n-dimensional flat space corresponding to the set of pattern vectors whose component in each dimension takes the two or three real values corresponding to logical inputs in the n-dimensional pattern space, when the moving speed of the center of gravity of the mass point system is not otherwise known.
- the expression of the similarity used in the present invention will be mathematically derived using the image of FIG.
- This distance is also known as the city block or Manhattan metric and has been used frequently in the field of pattern recognition.
- the reason why the above distance is used in the present invention is that the above distance is the easiest to handle.
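One way to see why the city block distance is convenient here is the following short derivation. It assumes binary components $p_i \in \{-1, +1\}$ and a vector $v$ in the pattern space, so $|v_i| \le 1$; the symbols $d$, $v$, $t$ (target vector), and $r$ (reference vector) are notation chosen for this sketch, and it is not claimed to reproduce the patent's own numbered equations.

```latex
\[
|p_i - v_i| = 1 - p_i v_i
\quad\Longrightarrow\quad
d(p, v) = \sum_{i=1}^{n} |p_i - v_i| = n - p \cdot v
\]
\[
d(p, r) - d(p, t) = (n - p \cdot r) - (n - p \cdot t) = p \cdot (t - r)
\]
```

Under these assumptions, the difference between the distances to the reference vector and to the target vector is exactly the inner product of the pattern vector with the weight vector $t - r$.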
- Equation (6) may be used as the similarity as it stands. In general, however, the right side of equation (6) may be multiplied by an arbitrary positive real number in consideration of the signal level on the hardware. To make this treatment physically meaningful, the multiplier is taken to be the mass m of the (unit) mass point in the Euclidean space corresponding to the pattern vector p, and equation (6) is rewritten accordingly.
- the above-mentioned target vector and reference vector can each be stored in n signed binary counters, which execute the random walk, and one signed or unsigned counter, which counts the number of random walk trials. As will be described later, the division needed to calculate the velocity components of the mass point can be omitted by using a suitably devised physical model.
- the target vector and the reference vector can be obtained by inputting the learning patterns only once (one-pass input).
- a learning pattern is either taken into the discriminant function element unconditionally or taken in on the instruction of a teacher (who may come from inside or outside the system). In either case, the relative similarity calculated for the learning pattern may be compared with a predetermined value and the pattern taken in only when the relative similarity exceeds that value, or the pattern may be taken in without such a comparison.
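The counter-based, one-pass learning described above can be sketched as follows. The class name `RandomWalkCounters` and its methods are hypothetical; the sketch simply keeps n signed counters and one trial counter, and performs the division explicitly even though the text says it can be avoided.

```python
class RandomWalkCounters:
    """n signed position counters plus one trial counter (illustrative only)."""

    def __init__(self, n):
        self.position = [0] * n   # one signed counter per dimension
        self.trials = 0           # number of random walk trials so far

    def step(self, pattern):
        """One random walk trial: move by the pattern's component on each axis."""
        for i, component in enumerate(pattern):
            self.position[i] += component
        self.trials += 1

    def velocity(self):
        """Mean displacement per trial, i.e. the arithmetic mean vector."""
        if self.trials == 0:
            return [0.0] * len(self.position)
        return [x / self.trials for x in self.position]


# Each learning pattern is presented exactly once (one-pass input).
target = RandomWalkCounters(3)
for learning_pattern in ([+1, -1, +1], [+1, +1, -1]):
    target.step(learning_pattern)

v = target.velocity()   # [1.0, 0.0, 0.0]
```

A second instance fed with the reference patterns would give the reference vector in the same single pass.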
- Table 1 summarizes calculation examples by the method of the present invention.
- the input pattern vectors are binary, and their component values are generated from random numbers.
- V(0) is the zero vector.
- the parenthesized superscript at the upper right of a pattern vector is the management number of the corresponding input pattern.
- the parenthesized superscripts at the upper right of the velocity vector V and of the relative similarity values correspond to these management numbers.
- the elements of $\alpha$ are $p^{(1)}$, $p^{(2)}$, $p^{(3)}$, and the elements of $\beta$ are $p^{(4)}$, $p^{(5)}$, $p^{(6)}$.
- the calculation suggests two facts. First, the relative similarity of $\alpha$ with the zero vector as the reference is zero or more for all pattern vectors belonging to $\alpha$. Second, the relative similarity of $\alpha$ with $\beta$ as the reference takes a positive value for all pattern vectors belonging to $\alpha$ and a negative value for all pattern vectors belonging to $\beta$.
- the "template" in the method of the present invention can be created automatically and instantaneously simply by inputting test patterns; it is flexible, and adding to it afterward is easy.
- FIG. 3 shows one model in which the implementation of the system of the present invention is performed by an analog method or biological means.
- the figure gives reference information for implementing the system of the present invention by an analog method or a biological method.
- 3 is the above-mentioned substance corresponding to an arbitrary one-dimensional component of the input pattern vector.
- 22 i is a component of the reference vector corresponding to the dimension.
- 21 i is a component of the target vector corresponding to the dimension.
- in each of these small chambers, the injected amount and the contents of the chamber immediately before the injection are quickly averaged, and the excess is released outside the cell.
- as a result of the averaging, the intensity of the physical property of the surplus material also takes the average value.
- the moving speed of the mass point in this case is obtained as an approximate value.
- 32 i is a substance that contributes to the above-mentioned dimension in the relative similarity.
- 32 n indicates a substance that contributes to the relative similarity corresponding to other dimensions.
- 33 is a small chamber in which the sum of the contributions of these dimensions, that is, the relative similarity, is calculated. (In practice the arithmetic mean of these contributions is calculated, which amounts to multiplying the relative similarity by a constant coefficient.)
- Reference numeral 36 denotes a mechanism that has no direct relation to the present invention: a storage mechanism for holding the above calculation results. At least part of the surplus released from each of the small chambers described above is used in this storage mechanism.
- FIG. 3 is an intermediate stage in explaining the present invention; it does not go far beyond the classical concept and is a rather idealized conceptual diagram. It does not disclose why this method determines the average relative similarity. Also, the average relative similarity can be determined only after the weight vector has been determined. Furthermore, determining it uses one pass of data input, so a total of two passes is required, including the pass for obtaining the weight vector. Further, it does not disclose a specific mechanism for comparing each relative similarity with a threshold. Resolving these questions requires going through the steps of the later examples.
- the pattern vector components to be identified by the discriminant function element in the present invention are defined as binary or ternary values as described above, but they can be extended to multi-valued components with more than three values. With this extension, input pattern vectors whose components take four or more values can be handled directly. (Although a detailed description of the algorithm is omitted here, one value of the multi-valued component, the zero component, is usually used to carry "non-significant" information.)
- the output of the discriminant function element according to the present invention is basically binary (whether or not there is a pulse) as will be disclosed in the embodiments described later.
- the outputs of the plurality of discriminant function elements may be combined by a predetermined logic.
- a pattern vector of a ternary or binary component can be assembled in a subsequent discriminant function element.
- the zero component is considered to have two roles. One, which applies universally, is to give a kind of "forgetting" function to the subsequent discriminant function element, that is, the discriminant function element that receives the output of this discriminant function element as its input.
- in that subsequent discriminant function element, a random walk trial does not change the position of the mass point in the dimension of the zero component, but because the number of trials increases by 1, the absolute value of the moving speed of the mass point naturally decreases.
- as a matter of course, the dimension of the zero component then automatically contributes nothing to the relative similarity in the subsequent discriminant function element. (Consequently, if the zero component is assigned to significant information, that information is ignored.) If only binary (-1, +1) components are used for the pattern vector, data that the preceding discriminant function element judged to be insignificant (that is, data that may be a noise pattern) cannot be represented by the pattern vector.
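The "forgetting" effect of a zero component can be seen in a few lines. The helper `velocity_after` is a hypothetical name, and the patterns are made up for illustration; the point is only that a zero leaves the position unchanged while the trial count still grows.

```python
def velocity_after(patterns):
    """Mean displacement per trial after presenting the given ternary patterns."""
    n = len(patterns[0])
    position = [0] * n
    for p in patterns:
        for i, component in enumerate(p):
            position[i] += component
    return [x / len(patterns) for x in position]


before = velocity_after([[+1, +1], [+1, -1]])            # [1.0, 0.0]

# A third pattern with a zero in dimension 0: the position in that dimension
# stays at 2, but the trial count rises to 3, so the speed shrinks to 2/3.
after = velocity_after([[+1, +1], [+1, -1], [0, +1]])
```

Repeated zero inputs therefore pull the stored velocity component toward zero, which is one way to read the "forgetting" described above.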
- as is clear from the above description, the system treats ternary input patterns as the norm for the data that is generated inside the system of the present invention and flows through it.
- the discriminant function elements provided in the system of the present invention may take as their input pattern the output of an existing neural network, and/or the output of an observation mechanism, and/or the output of another system, and/or the result of combining these outputs by a given or predetermined logical operation. That is, the input pattern of a discriminant function element may be data obtained by extending the output data of the preceding system via an appropriate logic circuit. All extended data, except data that are tautological with (that is, logically equivalent to) the data before extension, can be taken as the input of the discriminant function element.
- pairs may be formed by combining two input terminals arbitrarily selected from the input terminals, and at least one dimensional component of the pattern vector may be formed from the signals transmitted by at least one such pair.
- the signal at one input terminal of a pair is positive (so a pulse there means the arrival of a +1 component), and the signal at the other input terminal is negative (so a pulse there means the arrival of a -1 component).
- the random walk may be executed between the two within the discriminant function element, and the reference vector and the target vector (hence, the weight vector) of the discriminant function element may be obtained from the result of the random walk.
- This way of forming the pattern vector that a discriminant function element identifies directly is the most standard one for the discriminant function elements provided in the system of the present invention.
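The paired-terminal encoding can be sketched as a small mapping from pulses to ternary components. The function names and the example wiring are hypothetical, and the handling of simultaneous pulses on both terminals (mapped to 0 here) is an assumption that the text does not address.

```python
def component_from_pair(positive_pulse, negative_pulse):
    """Map one terminal pair to a ternary component.

    Pulse on the positive terminal only  -> +1
    Pulse on the negative terminal only  -> -1
    No pulse on either terminal          ->  0
    Pulses on both (assumed to cancel)   ->  0
    """
    return int(positive_pulse) - int(negative_pulse)


def pattern_from_terminals(pulses, pairs):
    """Build a pattern vector from terminal pulses and (positive, negative) index pairs."""
    return [component_from_pair(pulses[pos], pulses[neg]) for pos, neg in pairs]


# Six input terminals grouped into three pairs (hypothetical wiring).
pulses = [True, False, False, True, False, False]
pairs = [(0, 1), (2, 3), (4, 5)]
pattern = pattern_from_terminals(pulses, pairs)   # [1, -1, 0]
```

Each resulting component can then drive one axis of the random walk described above.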
- the basic principle of the system of the present invention is not to extract the features of the recognition target directly, but only to extract the features of (that is, to calculate the weight vector for) the data transmitted through the observation mechanism or the preceding layer. However, the time required is dramatically shorter than with the conventional method (it is achieved with one-pass data input), so the extended data can be tested repeatedly, by trial and error, in a short time. Therefore, an automatic data extension mechanism is provided in the preceding layer (and/or the layer preceding the current layer) (the data extension function described above is at least one candidate), and somewhere in the current layer.
- a dimension of the weight vector can be truncated when the absolute value of its component does not reach a given or predetermined size. This measure saves the hardware (materials) that would otherwise make up that dimension.
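Truncating weight dimensions can be sketched as a simple filter. This assumes the intended reading is that dimensions whose weight magnitude stays below the given size are the ones dropped; the function name `truncate_dimensions`, the parameter `min_abs`, and the example values are hypothetical.

```python
def truncate_dimensions(weight, min_abs):
    """Keep only dimensions whose weight magnitude reaches min_abs.

    Returns the indices of the kept dimensions and the reduced weight vector.
    """
    kept = [i for i, w in enumerate(weight) if abs(w) >= min_abs]
    return kept, [weight[i] for i in kept]


def reduced_inner_product(pattern, kept, reduced_weight):
    """Inner product computed over the kept dimensions only."""
    return sum(pattern[i] * w for i, w in zip(kept, reduced_weight))


kept, reduced = truncate_dimensions([0.9, 0.05, -0.6, 0.0], min_abs=0.1)
# kept == [0, 2], reduced == [0.9, -0.6]
score = reduced_inner_product([+1, -1, -1, +1], kept, reduced)   # 0.9 + 0.6
```

Only two of the four dimensions need to be wired up in this example, at the cost of ignoring the small contribution from dimension 1.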
- the output of the neural network of the present invention described in an embodiment described later has a different paradigm from the output of the conventional neural network.
- the output of each layer of the conventional neural network becomes the output of the decision mechanism from the first layer, and the number of signal lines also decreases gradually (this means reduction of the pattern space).
- the number of output signal lines is generally smaller than the number of input signal lines, for unavoidable reasons.
- the system of the present invention also has such a decision process at a later stage, but it is closer to the merging process than to the decision.
- the content of the signal line is the hierarchy.
- each of these hierarchies outputs non-firing, ie, zero, as output data corresponding to an input pattern (ie, a noise pattern) that should not be identified in that hierarchy, but in a later hierarchy,
- an input pattern ie, a noise pattern
- zero can be assigned as the component of the corresponding dimension. It can be said that this is one form of the above-mentioned identification determination.
- the system of this embodiment performs the above data compression concurrently while performing the above kind of identification decision.
- the results (data) can be aggregated into one bit of information per category to indicate whether or not the input pattern should be identified in the category in charge.
- the system of this embodiment functions as a kind of pattern filter for removing noise patterns up to a certain layer.
- When the system of the present invention is realized by an analog method or biological means and the random walk is executed using electrolyte ions, the electrolyte ions have roles beyond performing the random walk. By appropriately mixing and diffusing the electrolyte ions of each dimension, they can also play a part in complex data expansion in the process. (Of course, the data expansion is not limited to this, and also depends on the selection of the electrolyte ion channel.) In addition, the electrolyte ions can play a role in raising the potential of the output pulse on the output line of the output section. In this case, the electrolyte ions are, so to speak, one element forming the power supply system of the discrimination function element.
- the mechanism for executing the random walk and the threshold mechanism described later will be much smaller than the data expansion mechanism and the power supply mechanism described above. (Incidentally, it can be said that conventional research on the nervous system of living bodies was mainly research on the power supply system.)
- the principle of the system of the present invention including the execution of the random walk is based on the information of the discriminant function element.
- In biological information systems, some parts are built from mechanisms so small that, even if the need for research is recognized, anatomical research is difficult in practice. Anatomical methods in brain research may also raise ethical issues.
- the mass of the mass point in the n-dimensional Euclidean space corresponding to the above-mentioned pattern vector used in the system of the present invention can be set arbitrarily in a specific implementation.
- the only requirement is that an equal mass be used in all dimensions of the pattern vector (normalization).
- the object to be recognized by the system of the present invention is an entity having a physical quantity
- the property of the measured value of the object (for example, whether it is equal to a certain predetermined value, or whether it falls within a certain predetermined interval)
- the discriminant function element used by the system of the present invention is itself a linear discriminant function element. However, since its input is binary or ternary digital information that conveys the occurrence of an event, all physical quantities of all attributes are quantized and converted into the above-mentioned digital information, which is woven into multiple dimensions, so that the recognition target of the system of the present invention can be extended to objects generated by nonlinear factors.
- the object of recognition of the system of the present invention is not limited to the limitations revealed in the conventional quantum theory. It is a strict fact revealed in the conventional quantum theory that there are combinations that cannot be measured at the same time in physical quantities of multiple attributes. However, this does not mean that the type of prediction problem of extracting observational data from such physical objects does not go any further. In other words, for a combination of physical quantities of multiple attributes that cannot be measured at the same time, if necessary, the physical quantities of these attributes (but need not be completely independent) are added to the time at which each can be measured. It may be measured separately.
- the combination of the measurement values and measurement time (or measurement order) is expressed as a distinction of dimensions in the pattern vector used by the system of the present invention.
- the measurement is performed in two stages, a learning stage and an execution stage, and the same observation system is used in both stages.
- In the learning stage, the observation system is used as one element of the entire system. If the behavior (result category) of the non-observed system after the end of the measurement, including the measurement itself, is given to the observed system in association with the input pattern, the effect of the observed system on the non-observed system is recognized by the observed system.
- the result may be woven.
- the patterns correspond to one of the above physical events n : 1. From these facts, one physical event can be regarded as equivalent to one category having these patterns as elements. The task of determining the occurrence or disappearance of the above event can then be reduced to the task of identifying these patterns in that one category. If the relevant data before the occurrence of the event (leading indicators) are adopted as learning patterns, the task is a prediction that the event will occur in the future; if the relevant data after the occurrence of the event (lagging indicators) are adopted as learning patterns, the task is an estimation that the event occurred in the past.
BRIEF DESCRIPTION OF THE FIGURES
- FIG. 1 is an explanatory diagram showing the basic principle of the present invention.
- FIG. 2 is an explanatory diagram showing three cases to which the present invention can be applied.
- FIG. 3 shows a reference example in which the basic principle of the present invention is realized by an analog device or a bio device, and optimized using the principle of a calculation chart.
- FIG. 4 is a template for explaining the vertical clustering in the present invention.
- FIG. 5 is an explanatory diagram showing a method of forming a fusion discriminant function device according to the present invention.
- FIG. 6 is an explanatory diagram showing the magnitude of a threshold and its operation when the basic principle of the present invention is realized with an analog device or a bio device.
- FIG. 8 shows a configuration method of a weight vector for giving a variation to a target vector, a magnitude of a threshold, and an operation thereof when the basic principle of the present invention is realized by an analog device or a bio device.
- FIG. 9 shows a configuration method of a weight vector for giving a variation to a reference vector, a magnitude of a threshold, and an effect thereof when the basic principle of the present invention is realized by an analog element or a bioelement.
- FIG. 10 is an explanatory diagram showing how a subset in the pattern space is extracted by an identification function element or a child identification function element located in a certain hierarchy in the system of the present invention.
- FIG. 11 is an explanatory diagram showing the relationship between a subset in the pattern space extracted by a child identification function element located in a certain hierarchy in the system of the present invention and the method of configuring the input pattern vector in the identification function element located in the stage after that hierarchy.
- FIG. 12 is an explanatory diagram showing an ideal combination method of input terminals having polarities in a discriminant function element when the basic principle of the present invention is realized by an analog element or a bio element.
- the system of the present invention can also be constructed using conventional information processing devices on the hardware side. Specifically, the whole or a part thereof can be manufactured using analog devices, and / or digital devices, and / or computers, and bio devices formed by using artificial means in at least part of the process, as well as an integrated circuit element in which at least a part of the above devices is integrated in one or more circuits, and / or a circuit including the integrated circuit as an element. Further, as a software embodiment, a portion described as a function of a discrimination function element or a child discrimination function element can be realized as, for example, a procedure of a computer program. In any case, all such implementations belong to the scope of the system of the present invention.
- This embodiment mainly describes the structuring of the system of the present invention, specifically the clustering of discriminant function elements (the term “clustering” herein includes network peaking).
- the inventor of the present invention distinguishes three kinds of clustering: (1) clustering for macro identification, (2) clustering for micro identification, and (3) clustering for combined macro and micro identification. These will be explained in the specific examples section below, but the first thing to note is that many of these examples are described assuming the case of (c) in Fig. 2. However, clustering is still possible in the cases (a) and (b) of FIG. 2.
- discriminant function elements provided in the basic system and described as the principle of the present invention will be mainly described. These discriminant function elements may work by themselves, or may become the "original" discriminant function elements described later and generate several "descendant" discriminant function elements. In the case of an "original" discriminant function element, the above-mentioned mutation vector may be generally considered to be a zero vector.
- FIG. 4 is a basic explanatory diagram of the present invention. In the following embodiments, this diagram is used as a template.
- 21 is the target vector of the discriminant function element
- 41 is a set having the input patterns to be identified as elements
- 1 is the set of pattern vectors corresponding to the above set 41
- 22 is the reference vector of the discriminant function element
- 42 is the category (that is, an arbitrarily defined set of patterns used as the basis for calculating the relative similarity) that forms the basis of the reference vector
- 2 is the set of pattern vectors corresponding to the above set 42
- 411 is an explanatory arrow indicating that the arithmetic mean of all elements of the set 1 of pattern vectors is the target vector 21.
- in the discriminant function element provided in the system, the arithmetic mean vector of all pattern vectors corresponding to representative input patterns that are indicated, by an instruction from inside or outside the system, as to be identified in a given or predetermined category is set as the target vector.
- the set 41 is a set that includes a representative input pattern to be identified in a given or predetermined category, which is indicated by an instruction from inside or outside the system.
- the process of obtaining such a target vector is a learning process of the discriminant function element.
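The learning step described above can be sketched as follows. The target vector is the arithmetic mean of the pattern vectors of set 1, as the text states. The sketch additionally assumes that the reference vector is the arithmetic mean of the pattern vectors of set 2, that the weight vector is the difference between target and reference vectors, and that the relative similarity is the inner product of weight and pattern; these last points and all function names are assumptions made for illustration.

```python
def mean_vector(vectors):
    """Arithmetic mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / n for d in range(dim)]


def learn_element(target_patterns, reference_patterns):
    """One-pass learning: each vector is a plain arithmetic mean."""
    target = mean_vector(target_patterns)        # target vector 21
    reference = mean_vector(reference_patterns)  # reference vector 22
    # Assumption: the weight is the difference vector target - reference.
    weight = [t - r for t, r in zip(target, reference)]
    return target, reference, weight


def relative_similarity(weight, pattern):
    """Assumed form of the relative similarity: an inner product."""
    return sum(w * x for w, x in zip(weight, pattern))


# The reference category contains the target category as a subset.
target_patterns = [[1, 1, 0], [1, 0, 0]]
reference_patterns = target_patterns + [[0, 0, 1], [0, 1, 1]]
target, reference, weight = learn_element(target_patterns, reference_patterns)
```

Because every quantity is a running mean, each pattern needs to be presented only once, which fits the one-pass data input mentioned earlier in the description.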
- This embodiment can be combined with all other embodiments except those involved in setting the target vector.
- FIG. 5 is an embodiment required when designing a function similar to the function of an eye of a living body or the like.
- the target vector of the discriminant function element is created corresponding to the union of a plurality of categories, and FIG. 5 is an explanatory diagram illustrating this state.
- the discriminant function elements each having the target vector at the center of mass of the mass system corresponding to each of the plurality of categories are fused. That is, the arithmetic mean vector of the target vectors and the arithmetic mean vector of the reference vectors of the plurality of discriminant function elements are calculated, and the calculated vectors are set as the target vector and the reference vector of the discriminant function element newly constructed by the fusion.
- the plurality of categories are represented by 51, 52, and 53 for the sake of explanation. (The category corresponding to the reference vector is not shown.)
- To realize the function of the living body's eyes it is necessary to absorb large-scale deformation of patterns and fluctuations in size and position.
- the pattern space is divided into a plurality of reduced pattern spaces, and a certain reduced pattern space and a plurality of reduced pattern spaces in the vicinity thereof are standardized and correctly input.
- a fusion means of the discriminant function element is a means in which a plurality of discriminant function elements share information with each other.
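A minimal sketch of the fusion step, assuming each discriminant function element is represented only by its (target, reference) pair and that fusion takes the component-wise arithmetic mean of each. The representation and the name `fuse_elements` are hypothetical.

```python
def fuse_elements(elements):
    """Fuse elements given as (target_vector, reference_vector) pairs.

    The fused target is the arithmetic mean of the targets, and the fused
    reference is the arithmetic mean of the references.
    """
    n = len(elements)
    dim = len(elements[0][0])
    fused_target = [sum(t[d] for t, _ in elements) / n for d in range(dim)]
    fused_reference = [sum(r[d] for _, r in elements) / n for d in range(dim)]
    return fused_target, fused_reference


# Three elements standing in for the categories 51, 52 and 53 of FIG. 5.
elements = [
    ([1.0, 0.0], [0.0, 0.0]),
    ([0.0, 1.0], [0.0, 0.0]),
    ([2.0, 2.0], [3.0, 3.0]),
]
fused_target, fused_reference = fuse_elements(elements)
```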
- the most typical use of the fusion discriminant function element of this embodiment is to use it as an “original” discriminant function element, which will be described later.
- This embodiment can be combined with all the other embodiments except those related to the setting of the target vector or the reference vector.
- the system of the present invention can be used to correct errors in an existing neural network (hereinafter, regardless of whether or not that network is itself a system of the present invention). It can also be used under a policy of treating the existing network as an observation mechanism.
- the following is one embodiment.
- in the discriminant function element used in this case, the set 41 in FIG. 4 is a set whose elements are representative input patterns that the existing neural network determined should not be identified, even though they should be identified in a given or predetermined category.
- the meaning of the existing neural network here is not limited to the system of the present invention.
- an additional correction will be made to the decision result of the existing neural network.
- This embodiment can also be combined with all the other embodiments except those involved in setting the target vector.
- the following embodiment can also be used mainly for correcting errors in an existing neural network but, as above, it is also a means that can be used under a policy of treating the existing neural network as an observation mechanism.
- in the discriminant function element used in this case, the set 41 of FIG. 4 is a set whose elements are representative patterns that should not be identified in the given or predetermined category, but that the existing neural network determined should be identified.
- the existing neural network mentioned here is not limited to the system of the present invention. In the case of the present embodiment, a correction will be made so that a part of the decision result of the existing neural network is eventually invalidated and further narrowed down.
- This embodiment can also be combined with all other embodiments except those involved in setting the target vector.
- the following embodiment can also be used mainly for correcting errors in an existing neural network but, as above, it is also a means that can be used under a policy of treating the existing neural network as an observation mechanism.
- the reference category 42 in FIG. 4 is a set whose elements are representative input patterns that should not be identified, but that the existing neural network erroneously determined should be identified.
- the set is used as a set of noise patterns or a subset thereof, and usually, the target vector 1a includes input patterns to be identified in a given or predetermined category.
- the existing neural network mentioned here is not limited to the system of the present invention.
- the reference vector 22 is set to the arithmetic mean vector of all representative pattern vectors belonging to an arbitrary set in the pattern space that is known a priori to include the set in the pattern space corresponding to the input patterns to be identified in a given or predetermined category.
- a priori inclusion between categories.
- the category "characters" includes the category "Japanese characters"
- the category "Japanese characters" includes the category "Kanji"
- the category "Kanji" includes the category "Kanji with a radical", and the category "Kanji with a radical" includes the category "Kanji with the person radical"
- the category "characters" includes the category "26 letters of the alphabet", etc.
- the target vector 21 is set to the target vector corresponding to the category "Kanji with the person radical". Further, in the discriminant function element of the next subsequent stage, the reference vector corresponding to the category "Kanji with the person radical" is set as the reference vector 22, and a target vector corresponding to the category "Kanji" is set as the target vector 21.
- the final result of the judgment is determined by the logical product of the outputs of the discriminant function elements of all the layers. (Usually, the above AND operation can be distributed to each layer by constructing, in the identification function element, a gate circuit that operates only when there is an output from the previous layer.)
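The distributed AND can be sketched as a gated cascade. The sketch assumes that each layer is a single element described by a (weight, threshold) pair and that an element fires when the inner product of its weight and the pattern reaches the threshold; the firing rule and all names are illustrative assumptions.

```python
def fires(weight, threshold, pattern):
    """Assumed firing rule: inner product reaches the threshold."""
    return sum(w * x for w, x in zip(weight, pattern)) >= threshold


def cascade_decision(layers, pattern):
    """Logical product of all layer outputs, evaluated layer by layer.

    Each layer acts as a gate: a later layer is evaluated only if the
    previous one fired, so the AND is distributed across the hierarchy.
    """
    for weight, threshold in layers:
        if not fires(weight, threshold, pattern):
            return False  # gate closed, later layers never operate
    return True


layers = [([1, 0], 0.5), ([0, 1], 0.5)]
accepted = cascade_decision(layers, [1, 1])
rejected = cascade_decision(layers, [1, 0])
```

The early return gives the same result as computing every layer and taking the logical product, while skipping the work of layers behind a closed gate.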
- the above method can be generally applied to categories that have been artificially configured to have, and their inclusion relations.
- the arithmetic mean vector of all representative pattern vectors belonging to the union of the set in the pattern space corresponding to the input patterns designated to be identified in a given or predetermined category and the set in the pattern space corresponding to the input patterns designated not to be identified in that category is set. Another case of clustering using an a priori inclusion relation is one in which the output of the system itself is a priori. If the threshold of the discriminant function element is appropriately selected, the discrimination of the discriminant function element can be loosened: the threshold is set so that the element identifies all the patterns that should be identified in the category, together with some of the patterns that should not be identified.
- the set of patterns identified by the discriminant function element includes a given or predetermined category.
- in the subsequent discriminant function element, the arithmetic mean vector of all pattern vectors corresponding to representative input patterns identified by the existing neural network in a given or predetermined category can be set as the reference vector 22 in Fig. 4.
- as the target vector 21 in Fig. 4, the target vector corresponding to the given or predetermined category is set. Note that, in each of the above cases, the target vectors of the discriminant function elements in several successive layers may correspond to the same category. However, also in this case, it goes without saying that the category corresponding to the reference vector of each discriminant function element must include the category corresponding to the target vector.
- in the discriminant function elements required for clustering of macro discrimination, before learning at each layer, the target vector and the reference vector are both zero vectors, at least in a first-generation system. The required number of discriminant function elements structurally identical to such an element, whose target and reference vectors are set equal to each other, can be duplicated and used.
- a discriminant function element itself is also an embodiment of the present invention, and can be combined with all other embodiments except for the embodiment relating to the setting of the target vector or the reference vector.
- this is not the case for systems of the second generation or later, and the internal parameters of the discriminant function elements distributed to each layer as "seed" may have initial values.
- a discriminant function element that has undergone a certain learning process for each hierarchy may be used as a “seed”. Clustering of macro classification is particularly important in prediction problems and search problems and, since it elucidates the essence of the hierarchical structure that a recognition system should realize, it enables advanced recognition that is incomparable with conventional connectionist methods. However, this alone does not allow a high degree of recognition. The reason is that the reference vector of the discriminant function element approaches the target vector as the hierarchy goes to a later stage, and finally enters a phase in which the two coincide. In such a situation the weight vector becomes zero, so the identification function of the identification function element is clearly disabled.
- the mutation vector of the generated discriminant function element is generally set to zero vector.
- the output data of the discriminant function element is the identification result, and the discriminant function element at the subsequent stage cannot depend on the input only for the output data, and most of the input data is still the original data from the observation mechanism. The fact that the system must rely on a short time input suggests that an ideal system is not complete.
- the discriminating function of the discriminating function element is disabled, the given or predetermined category 1 is used, and in some cases, the size or variation of the pattern.
- one countermeasure is to artificially construct lower-level categories (hereinafter referred to as “sub-categories”), including the distribution of the target vectors, and to use the target vectors corresponding to these sub-categories.
- the clustering of the micro-discrimination is a mechanical and formal solution to the general situation in which the discrimination function of the discrimination function element is disabled in the above-described embodiment (however, the AND operation described above is required between the “parent” and “child” discriminant function elements).
- the category corresponding to the target vector is set as a sub-category composed formally from the first stage where the discrimination function of the discriminant function element is not disabled.
- the following embodiment specifically shows a function of generating a system component in a discriminant function element arranged in a basic neural network, which is shown as a principle of the present invention.
- a system component newly generated by the self-propagation function includes a finite number of elements (hereinafter, “child discriminant function element”
- the child identification function element includes at least one and a finite number of predetermined sets in a pattern space corresponding to a given or predetermined category to be identified by the identification function element.
- the process of generating a child identification function element that is given, and plays, the role of identifying a subset includes the above-described identification method and internal parameters (in general, this refers to the target vector, the transcendental vector, the weight vector, the threshold, and the values of the history vector, mutation vector, etc. described later) until a given or predetermined condition is satisfied.
- the phrase “until a given or predetermined condition is satisfied” imposes a certain limit on a period during which the self-proliferating function is exerted.
- the phrase “giving the role of identifying one or more and a finite number of predetermined subsets of the set in the pattern space corresponding to the given or predetermined category” is based on the judgment that the above subset is very arbitrary, being defined using random numbers, and that it is inappropriate to express it with a concept like the sub-category described above.
- the point to be noted in the above embodiment is that the target vector and the reference vector of the generated child discriminant function element are both inherited unchanged from the "parent" discriminant function element. Therefore, these are also inherited by the "descendant" discriminant function elements. (However, since the mutation vector is updated, the weight vector is automatically updated.)
- in another embodiment specifically showing the function of generating system components, the discrimination function element has a child identification function element duplication step and a random number vector construction step under the same conditions and the same processing procedure as the discrimination function element described in the preceding embodiment. After the random number vector construction step, it further has a step of adding each of the random number vectors obtained from that step to its own target vector, setting each resulting vector as the target vector of the duplicated child identification function element corresponding to that random number vector, and setting a zero vector as the mutation vector of each of the child identification function elements.
- a child discriminant function element that is also a "self-owner" of the "original" discriminant function element is generated, and the vertical division of the work of the "original" discriminant function element is made finer (that is, , Preventing incapacitation).
- the end of the target vector or the reference vector can randomly walk in the n-dimensional pattern space.
- the discriminant function element includes a child discriminant function element duplication step and a random number vector construction step under the same conditions and the same processing procedure as the discrimination function element described in the preceding embodiment. After the random number vector construction step, it further includes a step of subtracting each of the random number vectors obtained from that step from its own reference vector, setting each resulting vector as the reference vector of the duplicated child identification function element corresponding to that random number vector, and setting a zero vector as the mutation vector of each of the child identification function elements.
- the formula for determining the weight vector is formally constructed by focusing on the fact that it contains the difference vector between the target vector and the reference vector.
- since the target vector follows that of the "parent", the weight vector is substantially equivalent to that of the above embodiment.
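The equivalence of the three ways of placing a random number vector can be checked with a short sketch. It assumes that the weight vector has the form target - reference + mutation, which is consistent with the description above but is not stated as a formula in this section, and that in the first variant the random number vector is stored as the child's mutation vector. All function names are hypothetical.

```python
def weight_vector(target, reference, mutation):
    # Assumed form: difference vector plus mutation vector.
    return [t - r + m for t, r, m in zip(target, reference, mutation)]


def child_by_mutation(target, reference, r):
    """Child inherits target and reference; r becomes its mutation vector."""
    return target, reference, list(r)


def child_by_target(target, reference, r):
    """r is added to the target; the mutation vector is the zero vector."""
    return [t + x for t, x in zip(target, r)], reference, [0] * len(r)


def child_by_reference(target, reference, r):
    """r is subtracted from the reference; the mutation vector is zero."""
    return target, [q - x for q, x in zip(reference, r)], [0] * len(r)


parent_target, parent_reference = [2, 0, 1], [1, 1, 1]
r = [1, -1, 0]
weights = [
    weight_vector(*make(parent_target, parent_reference, r))
    for make in (child_by_mutation, child_by_target, child_by_reference)
]
```

Under the assumed formula all three children end up with the same weight vector, namely the parent's difference vector shifted by `r`; they differ only in which internal parameter carries the shift.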
- network peaking is performed by the images of (b) and (b)
- in another embodiment specifically showing the function of generating system components, the identification function element has a child identification function element duplication step and a random number vector construction step under the same conditions and the same processing procedure as the discrimination function element described in the preceding embodiment.
- for each of the random number vectors obtained from the random number vector construction step, a set of three new vectors is prepared as a set of internal parameters corresponding to that random number vector, the three vectors being determined so that their sum is equal to the corresponding random number vector. For each child identification function element replicated corresponding to a random number vector, the target vector is set to the result of adding any one vector of the corresponding set to the element's own target vector, the reference vector is set to the result of subtracting one of the remaining vectors from its own reference vector, and the mutation vector is set to the result of adding the other remaining vector to its own mutation vector.
- This embodiment includes the above three embodiments.
- the directional components of the weight vector diversify more rapidly, but the variants can be adjusted smoothly.
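A minimal sketch of the three-way split, again assuming the weight vector has the form target - reference + mutation. Integer components are used so that the sum condition holds exactly; the names `split_into_three` and `child_by_split`, and the range of the random integers, are illustrative assumptions.

```python
import random


def weight_vector(target, reference, mutation):
    # Assumed form: difference vector plus mutation vector.
    return [t - r + m for t, r, m in zip(target, reference, mutation)]


def split_into_three(r, rng):
    """Split r into vectors a, b, c whose sum equals r component-wise."""
    a = [rng.randint(-3, 3) for _ in r]
    b = [rng.randint(-3, 3) for _ in r]
    c = [ri - ai - bi for ri, ai, bi in zip(r, a, b)]
    return a, b, c


def child_by_split(target, reference, mutation, r, rng):
    """a goes to the target, b to the reference, c to the mutation vector."""
    a, b, c = split_into_three(r, rng)
    child_target = [t + x for t, x in zip(target, a)]
    child_reference = [q - x for q, x in zip(reference, b)]
    child_mutation = [m + x for m, x in zip(mutation, c)]
    return child_target, child_reference, child_mutation


rng = random.Random(0)
parent = ([2, 0, 1], [1, 1, 1], [0, 0, 0])
r = [1, -1, 0]
child = child_by_split(*parent, r, rng)
```

However the random number vector is split, the child's weight under this assumed formula is the parent's weight shifted by `r`; choosing two of the three parts to be zero recovers each of the three earlier embodiments.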
- the embodiments disclosed below relate to the above threshold of the discriminant function element.
- the threshold of the discriminant function element provided in the system of the present invention is not limited to the following embodiments.
- the inner product of the element's own target vector and its own weight vector, or an approximate value thereof, is calculated, and the element's own threshold is set using the calculated value as a guide.
- the threshold disclosed above is determined by the identification function element (where no misunderstanding can arise, the "parent" discriminant function element) and the target vector of the child discriminant function element that is installed (three cases exist, as shown in the previous embodiments).
- the threshold has the function of dividing the set in the corresponding pattern space almost equally (that is, so that the number of elements is almost equal) by each child identification function element.
- each of the above-mentioned child discriminant function elements performs the above-mentioned division using a multidimensional vector.
- the cohesion of the "class" mentioned above is sufficiently redundant that, even if there is some correlation between the dimensions used, the independence of the above division is guaranteed overall.
- One of the generated child identification function elements divides the set into two: the set corresponding to the patterns identified by itself, and the other set (the set corresponding to the patterns to be identified by the other child identification function elements and the noise patterns).
- Each of the remaining child discriminant function elements generated above likewise divides the above set into the set corresponding to the patterns identified by itself and the other set (the set corresponding to the patterns not identified by itself, that is, the patterns to be identified by the other child discriminant function elements and the noise patterns). As a result, a pattern not identified by any of the child identification function elements is definitively determined to be a noise pattern. (Note that among the patterns identified by each of these generated child discriminant function elements there are also “confusing” noise patterns that are incorrectly identified, but these are gradually removed in the subsequent hierarchy.)
- each element has the function of exerting the above-mentioned "class” cohesive force
- each of the above-mentioned child discriminant function elements does not simply divide the above set into two, but forms a pattern to be identified by itself.
- the above set is divided into two parts while applying cohesive force. Therefore, the above-mentioned noise pattern is rejected as a cooperative function of the whole child identification function element. (That is, none of the output signal lines "fires.")
- the expression "using the calculated value as a guide" indicates that the threshold to be set is equal to the calculated value or may deviate from it within a certain width.
- the calculated value is used as the threshold as it is, but the scope of the patent is not limited to this.
- the use of the expression "guide" does not mean that the calculated value is only a rough value; it merely reflects the customary practice of giving a threshold a certain width. In fact, if the value is used as it is as the threshold of the child discriminant function element, the element can divide the set of pattern vectors corresponding to its target vector almost evenly in two. Various other guides for the threshold of the child discriminant function element are possible. For example, a method using the inner product of the weight vector and the reference vector can be considered.
- the threshold can be regarded as the average of the relative similarities obtained by inputting to the child discriminant function element all the pattern vectors belonging to the set in the pattern space that corresponds to the element's target vector. That is, it is the average of the inner products between each pattern vector and the weight vector of the child discriminant function element, where the weight vector is constant.
- the average value is equal to the inner product of the arithmetic mean vector of all the pattern vectors and the weight vector of the child discriminant function element.
- the arithmetic mean vector of all the pattern vectors is nothing but the target vector of the child discriminant function element itself.
- the above threshold calculation method means that the threshold can be obtained without additional data input once the weight vector has been obtained with one-pass data input.
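The equality described in the preceding items follows from the linearity of the inner product. As a short worked derivation, write the N pattern vectors of the set as x_1, ..., x_N, the weight vector as w, the target vector as t, and the threshold as θ (these symbols are introduced here only for illustration and do not appear in the original text):

```latex
\theta
  = \frac{1}{N}\sum_{k=1}^{N} \mathbf{w}\cdot\mathbf{x}_k
  = \mathbf{w}\cdot\Bigl(\frac{1}{N}\sum_{k=1}^{N}\mathbf{x}_k\Bigr)
  = \mathbf{w}\cdot\mathbf{t},
\qquad
\mathbf{t} = \frac{1}{N}\sum_{k=1}^{N}\mathbf{x}_k .
```

Because w is constant over the set, the average of the inner products can be computed from the mean vector alone, which is why no second pass over the data is needed.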
- comparing the relative similarity with the threshold is equivalent to evaluating the value of the expression (weight vector of the child discriminant function element) · (input pattern vector − target vector of the child discriminant function element), where · denotes the inner product. (Hereinafter, this expression is referred to as the "judgment formula.") Note that the judgment formula must be evaluated as a whole: the difference (input pattern vector − target vector) cannot be evaluated on its own, because it is a vector, not a scalar.
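As a minimal sketch of the judgment formula, the following Python code computes the per-dimension terms and their sum, and compares the result with the relative similarity minus the threshold. The function names, the sample vectors, and the convention that the element fires when the value is greater than or equal to zero are assumptions made for illustration; the source does not say how a value of exactly zero is treated.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def judgment_value(weight, pattern, target):
    """Evaluate w . (x - t) as a sum of per-dimension terms w_i * (x_i - t_i)."""
    return sum(w * (x - t) for w, x, t in zip(weight, pattern, target))


def fires(weight, pattern, target):
    """Assumed convention: +1 ("fire") when the judgment value is >= 0, else 0."""
    return 1 if judgment_value(weight, pattern, target) >= 0 else 0


# Hypothetical values, chosen only for illustration.
weight = [0.5, -0.25, 1.0]
target = [0.75, 0.75, -0.75]
threshold = dot(weight, target)      # threshold = w . t

pattern_a = [1, 1, 0]                # relative similarity above the threshold
pattern_b = [0, 1, -1]               # relative similarity below the threshold

for pattern in (pattern_a, pattern_b):
    similarity = dot(weight, pattern)
    print(similarity - threshold, judgment_value(weight, pattern, target),
          fires(weight, pattern, target))
```

Evaluating the formula term by term mirrors the per-dimension structure of a hardware judgment mechanism, where each dimension contributes one signed quantity to a final sum.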
- a threshold calculated in the same way can also be considered. It applies when, for example, only the patterns having a certain tendency form one category, and about half of the patterns belonging to that category are to be extracted in order from the one with the strongest (most typical) tendency. This is one embodiment of the present invention.
- FIG. 6 shows an embodiment of a judgment mechanism for evaluating the above judgment formula.
- 62 i is a multiplier corresponding to the dimension i of the vector
- 60 i is a mechanism for holding the component of the dimension i of the weight vector
- 63 i is a mechanism corresponding to the dimension i of the vector.
- 64 i is the physical entity that carries, for dimension i, the result of subtracting the threshold from the relative similarity
- 64 n is the physical entity that carries the same result for another dimension
- 65 is a summation mechanism that calculates the sum of the above values (physical quantities) over all dimensions, and 66 denotes the output of the summation mechanism 65 (proportional to the summation result).
- the target vector is formed using a vector obtained by inverting the input pattern vector 31 i, and this can be realized as an algebraic sum of charges.
- the summation mechanism 65 is a mechanism for calculating the sum of inner products corresponding to each dimension.
- the inner-product term for each dimension is output with its sign as a mechanical force such as a Coulomb force, these forces are added together with their signs by applying Pascal's principle, and the result is finally converted into an electric potential using a piezoelectric element. Alternatively, a method of calculating the sum of the inner-product terms more directly (i.e., electrically) may be used.
- if the output 66 of the above judgment mechanism is very small, there is also a method in which it is used as a "seed" to actuate the switch of a power supply mechanism (a full-scale pulse output unit).
- the following embodiment relates to the random number generation process described in the above embodiment.
- one or more history vectors, which are predetermined control factors for controlling the random number generation process, are set before the first random number is generated. In the random number generation process, each history vector is referred to in a given or predetermined manner in order to limit the range of the random number values to be generated, and the history vectors are updated by a given or predetermined method after the random number generation process.
- the control factor, namely the above history vector, is rooted in the strategy of the discriminant function element that the way the random numbers are generated should not be completely free. In other words, it is designed so that random numbers generated in the past have some influence on the random number generation at the present time. Normally, the strategy ensures that, for a dimension in which the absolute value of a component of a child discriminant function element's target vector has once taken a large value, such a large absolute value is never taken again in subsequent generations of child discriminant function elements.
- Such consideration is nothing other than making the ways in which the child discriminant function elements extract their subsets as independent of one another as possible. (Hereinafter, the set whose elements are the pattern vectors identified by a child discriminant function element is sometimes simply referred to as its "subset".) The purpose of this consideration is to avoid lowering the cost/performance of the system. Under it, a sufficient number of child discriminant function elements generated by the method described later normally transmit enough information for the subsequent layers to perform their identification function.
- the following embodiment further embodies the random number generation method described in the preceding embodiment.
- the random number generation process described in the above embodiment generates a random number corresponding to each history vector and for each dimension of the history vector.
- a control means limits the range of the generated random number: the lower limit is set to zero, and the upper limit is set to the length of the section from the position on the coordinate axis indicating the corresponding component of the history vector to the end point (+1) of the coordinate axis.
- the weight vector effectively undergoes no mutation in the dimension corresponding to the random number, and that component of the weight vector of the child discriminant function element is equal to the component of the corresponding dimension of the "parent" discriminant function element. This means that the dimension does not take part in the extraction of the subset by the child discriminant function element. (However, in clustering for macro-micro association discrimination, even when a dimension is in this state, it continues to contribute to macro-discrimination as before.)
- the time limit related to the self-propagation function of the discriminant function element shown in the above embodiment can be associated with the above state of a dimension. That is, there may be embodiments in which, once a given or predetermined dimension, or all dimensions, reach the above state, the subsequent self-propagation function of the discriminant function element is stopped.
- the following embodiment specifically shows a method of updating the history vector described above.
- the process of updating each history vector described in the above embodiment is a process in which the random number generated for each history vector and each dimension by the method of the previous embodiment is added to the corresponding component of that history vector.
- FIG. 7 is an embodiment that comprehensively shows the relationship between the above-described random number generation process and the above-described one history vector updating process.
- 70 i is a line segment indicating the coordinate axis (positive direction only) of the history vector corresponding to dimension i.
- 71 1i is a line segment whose length corresponds to the dimension i component of the history vector when the first random number is generated (that is, the initial value of the dimension i component of the history vector).
- 72 1i is a line segment whose length corresponds to the upper limit of the random number when the first random number is generated.
- 73 1i is a line segment whose length corresponds to the value of the first random number generated.
- 71 2i is a line segment whose length corresponds to the dimension i component of the history vector when the second random number is generated.
- 72 2i is a line segment whose length corresponds to the upper limit of the random number when the second random number is generated.
- 73 2i is a line segment whose length corresponds to the value of the second random number generated.
- 71 ki is a line segment whose length corresponds to the dimension i component of the history vector when the k-th random number is generated.
- 72 ki is a line segment whose length corresponds to the upper limit of the random number when the k-th random number is generated.
- 73 ki is a line segment whose length corresponds to the value of the k-th random number generated.
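The relationship between random number generation and history vector updating can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the uniform distribution, the function names, the seed, and the initial history values are all hypothetical, since the text specifies only the lower limit (zero), the upper limit (the remaining length up to +1), and the additive update.

```python
import random


def generate_randoms(history, rng):
    """One random number per dimension, drawn from [0, 1 - h_i].

    A uniform distribution is assumed; the source only fixes the limits.
    """
    return [rng.uniform(0.0, 1.0 - h) for h in history]


def update_history(history, randoms):
    """Add each generated random number to the matching history component."""
    return [h + r for h, r in zip(history, randoms)]


rng = random.Random(0)          # fixed seed so that the run is repeatable
history = [0.2, 0.9, 0.0]       # hypothetical initial history vector
trace = [history]

for _ in range(3):              # three successive random number generations
    randoms = generate_randoms(history, rng)
    history = update_history(history, randoms)
    trace.append(history)

for step, h in enumerate(trace):
    print(step, h)
```

Each component can only grow toward +1, so the room left for later random numbers in that dimension shrinks with every generation. When a component reaches +1, the upper limit becomes zero and the dimension no longer mutates.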
- the updating process of at least one history vector is a no-operation (that is, a process that leaves the content of the history vector after the update the same as before the update).
- This embodiment can also be combined with all the other embodiments related to the history vector.
- the following embodiment further embodies the generation process of the child identification function element described above.
- the number of child discriminant function elements to be generated in the generation process described in the above embodiment is set to two, the number of history vectors referred to in the random number generation process described in the above embodiment is set to one, and the random number generation process and any of the history vector update processes described in the above embodiments are used.
- after the history vector update process, the sign of each random number generated in the random number generation process is replaced with a sign determined by a given or predetermined rule or generated at random.
- two random number vectors are obtained: the random number vector composed of the random numbers with updated signs, and the vector obtained by multiplying that random number vector by −1.
- This embodiment is generally used in combination with the above-described embodiment regarding the threshold.
- the number of generated child discriminant function elements is limited to two because the subset cut out by one child discriminant function element using the threshold described in the above embodiment is one of the two parts obtained when that element divides the set of pattern vectors corresponding to its target vector almost equally.
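The generation of the two complementary child elements can be sketched as follows. The weight vector formula, (target vector + mutation vector) − reference vector, is taken from the abstract; the random choice of signs, the function names, and all numeric values are assumptions made for illustration.

```python
import random


def signed_vector(magnitudes, rng):
    """Attach a sign to each non-negative random number.

    The signs are chosen at random here; a predetermined rule is equally allowed.
    """
    return [m * rng.choice((-1, 1)) for m in magnitudes]


def weight_vector(target, mutation, reference):
    """Weight vector = (target + mutation) - reference."""
    return [t + m - r for t, m, r in zip(target, mutation, reference)]


rng = random.Random(1)
target = [0.5, -0.25, 0.0]        # hypothetical target vector
reference = [0.25, 0.0, 0.0]      # hypothetical reference vector
magnitudes = [0.125, 0.5, 0.25]   # hypothetical output of the random number generation process

mutation_a = signed_vector(magnitudes, rng)
mutation_b = [-m for m in mutation_a]      # the same vector multiplied by -1

weight_a = weight_vector(target, mutation_a, reference)
weight_b = weight_vector(target, mutation_b, reference)

print(mutation_a, weight_a)
print(mutation_b, weight_b)
```

Because the two mutation vectors cancel, the two weight vectors are symmetric about (target − reference), which is one way to picture why the two child elements are described as complementary.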
- FIG. 8 shows an embodiment in which this embodiment is implemented using an analog element or a biological element.
- FIG. 9 shows another embodiment when the above embodiment is implemented using an analog element or a bio element, and is a diagram corresponding to one of the above child identification function elements.
- 9 i indicates a subtraction or addition operator.
- whereas the mutation vector was previously added to the target vector, here it is shown as being subtracted from the reference vector; as far as the weight vector is concerned, the two are mathematically equivalent.
- FIG. 10 is an explanatory diagram for assisting the contents described in the above embodiment as an image.
- the case (a) in the figure shows an image in which the “parent” discriminant function element cuts out a subset using the above threshold.
- 101 denotes the subset that the "parent" discriminant function element cuts out using the threshold.
- case (b) shows an image of the two complementary child discriminant function elements described above cutting out their respective subsets using the above threshold in clustering for micro-discrimination.
- 100 represents the set of pattern vectors corresponding to the noise patterns
- 10a represents the subset cut out by one of the above two child discriminant function elements
- 23a denotes the mutation vector of the child discriminant function element that cuts out the subset 10a.
- 10b indicates the subset cut out by the other child discriminant function element
- 23b indicates the mutation vector of the child discriminant function element that cuts out the subset 10b.
- in clustering for micro-discrimination, the target vector 21 and the reference vector 22 are identical (and therefore overlap in the figure).
- the case (c) is used for clustering macro-micro association discrimination.
- 3a indicates the weight vector of the one child discriminant function element described above
- 3b indicates the weight vector of the other child discriminant function element described above.
- note that the state shown in case (b) may also occur in clustering for macro-micro association discrimination.
- the subsets 10a and 10b each contain pattern vectors corresponding to noise patterns other than the above noise pattern, but these are progressively removed in later layers.
- the target vector 21 normally approaches the origin of the coordinates as the hierarchy in the system goes up.
- set 1 is almost equally divided into two to form the output.
- the reference vector 22 approaches the target vector 21 as the hierarchy in the system goes up.
- this is because the set of patterns corresponding to the reference vector is usually the union of the set of patterns corresponding to the target vector and the set of noise patterns, and the noise patterns are gradually removed as the hierarchy goes up. (However, the noise patterns include patterns to be identified by other discriminant function elements.)
- before the first child discriminant function element is generated, the component of each dimension of every history vector of the element is set equal to the absolute value of the corresponding component of the element's own target vector after the learning period.
- This embodiment can be combined with any other embodiment that involves a history vector.
- This embodiment can be combined with all the other embodiments relating to the generation of the child discriminant function element.
- an embodiment may be applied in which all functions of the element itself are stopped and/or the element itself is eliminated.
- This is an embodiment that can be additionally applied to all the embodiments related to the generation of the element.
- the following example shows how the above-described child discriminant function element is used in the system of the present invention.
- the child discriminant function element generated by the self-propagating means of a discriminant function element can be used as a component of the present system, as an element having the same functions as that discriminant function element.
- FIG. 11 shows an embodiment of a method of forming the input information for a subsequent layer.
- 11a indicates the subset extracted by a certain child discriminant function element generated in the previous stage.
- 11b indicates the subset extracted by another child discriminant function element that is a "sibling" of the above child discriminant function element and complementary to it.
- the output of each of the above child discriminant function elements is either "fire", that is, a +1 output, or zero.
- a ternary output is obtained from the pair of outputs of the two child discriminant function elements.
- 111 is the set of pattern vectors corresponding to the patterns identified by both of the above child discriminant function elements.
- a quaternary output can be obtained from the pair of outputs of the two child discriminant function elements. In other words, it is a four-valued output corresponding to the four pairs (+1, 0), (0, 0), (0, +1), and (+1, +1).
- the pair (+1, +1) indicates that one pattern was input and that it was identified by both of the above child discriminant function elements.
- the number of the corresponding mass points in the n-dimensional grid space may be one. (What is input is one pattern, as described above.)
- in this case the coordinate component of the mass point of the random walk moves in neither the + direction nor the − direction on the coordinate axis, so the output may be regarded as substantially equivalent to the pair output (0, 0). The above four values are therefore aggregated into three values (−1, 0, +1), as in case (a) above, and this ternary value is used as the component of one dimension of the input pattern vector of a discriminant function element in the subsequent layer.
- 11e and 11f are the subsets of two child discriminant function elements generated in the previous stage. When aggregated by the same mechanism as in (b) above, they also form a ternary component. However, the input pattern vectors of these two child discriminant function elements may belong to different pattern spaces, which is the difference from cases (a) and (b) above.
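The aggregation of a pair of child outputs into one ternary input component can be sketched as below. The assignment of the first output to a positive-polarity terminal and the second to a negative-polarity terminal is an assumption made for illustration, as are the function and variable names; the text only states that the four possible pairs collapse to the three values (−1, 0, +1), with (+1, +1) treated like (0, 0).

```python
def ternary_component(out_positive, out_negative):
    """Combine two binary outputs (each 0 or +1) into one ternary component.

    Assumed wiring: out_positive drives a positive-polarity input terminal and
    out_negative a negative-polarity one, so the component is their difference.
    """
    if out_positive not in (0, 1) or out_negative not in (0, 1):
        raise ValueError("each output must be 0 or +1")
    return out_positive - out_negative


# The four possible output pairs of two "sibling" child elements.
pairs = [(1, 0), (0, 0), (0, 1), (1, 1)]
components = [ternary_component(a, b) for a, b in pairs]
print(components)   # (+1, 0) -> +1, (0, 0) -> 0, (0, +1) -> -1, (+1, +1) -> 0
```

A subsequent-layer element would collect one such component per pair of wired inputs to build its own input pattern vector.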
- FIG. 12 shows an embodiment in which the above embodiment is realized by an analog device or a bio device.
- An inconvenient embodiment is first shown by (a) and (b) in FIG. 12, and then a preferred embodiment is shown by (c) and (d).
- in FIG. 12, 121 i and 121 j are input terminals of positive polarity provided in the discriminant function element. A +1 output from a discriminant function element in the preceding stage that is input to such a terminal is taken in as a +1 component in one dimension of the input pattern vector.
- the white circles without reference symbols in FIG. 12 are also input terminals of the above positive polarity.
- 120 i is an input terminal that is also provided in the discriminant function element.
- FIG. 12 (a) shows one arrangement of the above input terminals.
- the input terminal 121 i and the input terminal 120 i are paired, which shows (in solid lines) that one dimension of the input pattern vector can be constructed. (That is, although omitted in the figure, a mechanism is provided so that the random walk described above is attempted between the input from the input terminal 121 i and the input from the input terminal 120 i.)
- the arrangement of the input terminals is the same as that in (a). In such an arrangement of the input terminals, for example, a pair of the input terminal 121 i and the input terminal 121 j is shown.
- the embodiment described below is to set up the internal parameters of the discriminant function element or the calculation mechanism for it.
- the term "internal mechanism" does not mean only a physical inclusion relation; it also includes the mechanisms of all other systems that share the functions of the discriminant function element. This embodiment can be combined with all other embodiments.
Industrial applicability
- the present invention realizes (mechanizes) the pattern recognition function of a human being or of a certain kind of organism, or a more advanced pattern recognition or prediction function that was difficult to realize in the past. Artificial intelligence such as expert systems and translation systems can also be constructed easily. That is, the main advantages or effects of the present invention are as follows.
- since the information input to the system is logical ternary or binary information, the system can handle not only non-coded information such as image information and audio information but also the general knowledge information needed to construct artificial intelligence and the like. (Explained as a basic principle)
- the recognition algorithm can be unified without depending on the recognition target. Also, since only one kind of element is needed, the element can be standardized and is suitable for mass production.
- the input of data during the learning phase only needs to be performed once per layer of the system, so the learning period is completed in a short time. (In other words, one-pass input is sufficient.)
- the scale of a map is generally switched from a large scale to a smaller one, and recognition or search based on this principle is possible. This contributes to higher recognition accuracy and higher recognition speed. (Function of reference vector)
- Wiring of the signals (input signals) to be transmitted to the element is arbitrary, providing high flexibility. (See Figure 11)
- the noise pattern can be eliminated at a shallow stage of the hierarchy. In other words, you can make a global judgment. (Function as a pattern filter)
- the layers of the system can be constructed automatically, from macro-discrimination to micro-discrimination, in a quasi-static manner (with smooth switching).
- information of various reduced pattern spaces, that is, local information
- the elements can be distributed to local regions again, so the system can respond to large-scale fluctuations in patterns.
- the internal data (image pattern) can be constructed, and the materials required for system construction can be saved. (Generation of "offspring" discriminant function elements using mutation vectors, etc.)
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Image Analysis (AREA)
Abstract
A neural network comprises discriminant function elements that calculate the inner product of an input pattern vector and a predefined weight vector to find the similarity, in order to perform hierarchical recognition of input patterns, proceeding from coarse classification to fine classification. The discriminant function element uses, as the weight vector described above, the vector (3) obtained by subtracting the arithmetic mean vector (reference vector) (22) of all the pattern vectors belonging to the set that serves as the reference in a pattern space from the result of adding a mutation vector (23), as a variation, to the arithmetic mean vector (target vector) (21) of all the pattern vectors to be identified. The discriminant function element constitutes the mutation vector using a random number generation process and a history vector that defines the upper limit of the random numbers generated during that process, and then generates "offspring" discriminant function elements using the mutation vector thus constituted.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP51807697A JP3796755B2 (ja) | 1995-11-09 | 1996-11-08 | Biological neural network |
AU75067/96A AU7506796A (en) | 1995-11-09 | 1996-11-08 | Biological neural network |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP32496495 | 1995-11-09 | ||
JP7/324964 | 1995-11-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO1997017659A1 (fr) | 1997-05-15 |
Family
ID=18171597
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP1996/003283 WO1997017659A1 (fr) | Biological neural network | 1995-11-09 | 1996-11-08 |
Country Status (3)
Country | Link |
---|---|
JP (1) | JP3796755B2 (fr) |
AU (1) | AU7506796A (fr) |
WO (1) | WO1997017659A1 (fr) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2012084117A (ja) * | 2010-09-13 | 2012-04-26 | Tokyo Institute Of Technology | Attribute learning and transfer system, recognizer generation device, recognizer generation method, and recognition device |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH01263859A (ja) * | 1988-04-15 | 1989-10-20 | Fujitsu Ltd | Learning method for neural network |
JPH0769942B2 (ja) * | 1990-10-18 | 1995-07-31 | 彰 岩田 | Pattern identification device |
-
1996
- 1996-11-08 WO PCT/JP1996/003283 patent/WO1997017659A1/fr active Application Filing
- 1996-11-08 AU AU75067/96A patent/AU7506796A/en not_active Abandoned
- 1996-11-08 JP JP51807697A patent/JP3796755B2/ja not_active Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
INT. JT. CONF. NEURAL NETW., Vol. 6, (1994), YOSHIMASA KIMURA, "A New Scheme Which Incrementally Generates Neural Networks for Distorted Handprinted Kanji Pattern Recognition", pages 3852-3854. * |
Also Published As
Publication number | Publication date |
---|---|
AU7506796A (en) | 1997-05-29 |
JP3796755B2 (ja) | 2006-07-12 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | AK | Designated states | Kind code of ref document: A1. Designated state(s): AU CA JP US |
| | AL | Designated countries for regional patents | Kind code of ref document: A1. Designated state(s): AT BE CH DE DK ES FI FR GB GR IE IT LU MC NL PT SE |
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
| | 122 | Ep: pct application non-entry in european phase | |