CN111553467B - Method for realizing general artificial intelligence - Google Patents

Method for realizing general artificial intelligence

Info

Publication number
CN111553467B
CN111553467B (application CN202010370939.2A)
Authority
CN
China
Prior art keywords
machine
memory
data
information
activation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010370939.2A
Other languages
Chinese (zh)
Other versions
CN111553467A (en)
Inventor
陈永聪
曾婷
Other inventors have requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to CN202010370939.2A
Priority to PCT/CN2020/000108 (WO2021217282A1)
Publication of CN111553467A
Priority to PCT/CN2021/086573 (WO2021218614A1)
Application granted
Publication of CN111553467B
Priority to US17/565,449 (US11715291B2)
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/08 - Learning methods
    • G06N3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for realizing general artificial intelligence. By extracting only the similarity, temporal, and spatial relationships between things, and optimizing them with a memory and forgetting mechanism, the method establishes a relationship network among things. It further proposes treating the machine's instincts and its gain and loss assessments as input information that is kept in memory. By organizing the relationships in memory, recombining memories through the relationship network, selecting responses with a gain and loss evaluation system, and realizing the selected responses through imitation, the machine can establish human-like general artificial intelligence.

Description

Method for realizing general artificial intelligence
Technical Field
The invention relates to the field of artificial intelligence, in particular to a method for realizing general artificial intelligence.
Background
Artificial intelligence is currently still at the stage of narrow, dedicated artificial intelligence. Such artificial intelligence can be applied only to a single field; it is difficult to transfer learned skills to other scenarios, and it cannot produce general intelligence similar to that of human beings. There are great differences between the current training and response processes of artificial intelligence and the learning, thinking, and decision processes of human beings. Deep learning, for example, optimizes coefficients to find the multi-layer mapping with the smallest error. The intermediate-layer features the machine selects are essentially arbitrary, constrained only by an error function. To ensure that the machine selects reasonable intermediate-layer features, an extremely large amount of training data is needed, and the trained model is difficult to migrate outside its training domain. Although the currently popular deep convolutional neural networks remove part of the detail through filtering and help the machine obtain more reasonable intermediate-layer features, they still require a large amount of training data. The final basis of the machine's judgment may rest on details that humans do not notice, so the trained model is easily spoofed. Current knowledge-graph engineering extracts associations between texts or concepts from big data and helps connect different things in machine searches. However, these relationships lack quantification, and there is no method that helps the machine use them to learn and generalize by itself, or to apply the learned knowledge in daily life to achieve its own goals. These methods are very different from human learning, and they do not produce human-like general intelligence.
The present application recognizes that the intelligence of a machine should be built on information itself, not on the data processing methods that serve information. The learning method provided by the invention therefore simulates the human learning process: driven by its own motivations, the machine gradually builds responses from input to output, from simple to complex, through memory organization, recombination of memory with reality, and simulation of the recombined information, thereby exhibiting general intelligence similar to that of humans. This shows that the machine learning method proposed by the present invention is very different from the machine learning methods existing in the industry, and no similar learning method is available in the industry.
Disclosure of Invention
The present application provides a new learning method and its implementation steps, which are specifically described as follows:
Speech and text are acquired products of human beings, while information other than language is a natural learning tool. Recognizing the world through images, for example, has natural advantages. First, images have natural similarity: through similarity comparison, the machine can classify images by itself. Second, images carry natural logical relationships in daily life, such as water and rivers. Third, image learning naturally provides large amounts of data, since images are ubiquitous in daily life. The machine can therefore learn naturally through daily life, in a process similar to human learning. Although learning through images is one of the gifts evolution has given us, it also has natural disadvantages. First, the amount of data is too large. Second, too much detail leads to poor generalization. Third, images alone carry no association with the input of other sensors, such as voice, text, touch, and smell. Fourth, many concepts are not represented by images, such as abstract concepts like love, fear, and morality.
In order to fully utilize the advantages of image learning and overcome its disadvantages, a learning method is provided that extracts a feature map from an image and learns on the basis of that feature map. As with images, we also extract features from the other sensors and treat those features in the same way as image feature maps.
To describe the complicated relationships between things, the present application extracts only three relationships: similarity, temporal relationship, and spatial relationship. The extraction of the complicated relationships among things is thereby greatly simplified. The machine considers that similar things are related; that information appearing simultaneously in the same space is mutually related; that the relationships among information appearing in the same space form a lateral relationship network; and that the whole relationship network is formed by connecting similar information across different lateral relationship networks. Relationships in the network strengthen as the number of repetitions increases and weaken as time passes. Through this mechanism, relationships that recur are summarized and form cognition, and what represents this cognition is the relationship network.
Therefore, in the present invention, information processing consists of translating the input information into a sequence of feature maps that the machine can understand, processing these feature map sequences using the relationship network and the memory bank, and then translating the processed feature map sequences into the desired output form, such as speech, text, or motion output.
(I) Definitions of concepts referred to in the present application.
In order to briefly explain the main steps of the present application, we first define the concepts involved. These concepts will not be explained again when they are used later.
Bottom-layer features: features that commonly exist across things, obtained by finding local similarities between things. For graphics, for example, the underlying geometric features mainly include local edges, local curvature, texture, hue, ridges, vertices, angles, parallels, intersections, sizes, dynamic patterns, and other local features commonly found in graphics. For speech, they are the syllable features that are ubiquitous in speech. Other sensor inputs are processed similarly. The bottom-layer features are built autonomously by the machine through local similarity. While using them, the machine can add or remove bottom-layer features through a relationship extraction mechanism (such as the memory and forgetting mechanism) or through human intervention.
Feature map: on the basis of the ability to extract bottom-layer features, the common combinations of bottom-layer features found across many similar things, similar scenes, and similar processes are extracted through a relationship extraction mechanism; these common feature combinations are feature maps. A feature map can be an image bottom-layer feature map, a language bottom-layer feature map, or a feature map of any other sensor, and it can be static or dynamic. For example, after retaining the bottom-layer feature combination extracted each time, the machine applies the memory and forgetting mechanism: the memory values of bottom-layer features that recur across extractions are increased, while bottom-layer features that do not recur are gradually forgotten, so that among the many diagrams formed by the successive extractions only the common combinations of bottom-layer features are kept.
Concept: a local network formed by several feature maps is a concept. A concept contains multiple feature maps and the relationships between them. The feature maps within a concept are not necessarily similar to each other; they may be different feature maps linked together by the memory and forgetting mechanism.
Connection value: in the present application, connections can be established between two feature maps in the cognitive network. These connections have direction and magnitude. For example, the connection value from feature map A to its associated feature map B is Tab; likewise, the connection value from feature map B to feature map A is Tba. Tab and Tba are both real numbers, and their values may be the same or different.
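Purely as an illustration (not part of the patent text), the directional connection values Tab and Tba can be held in a simple adjacency map in which the two directions are stored independently; the class and method names below are hypothetical.

```python
from collections import defaultdict

class CognitiveNetwork:
    """Hypothetical sketch: feature maps as nodes, directional connection values as edges."""

    def __init__(self):
        # connection[a][b] is the directional connection value T_ab (a real number);
        # T_ab and T_ba are stored separately and may differ.
        self.connection = defaultdict(dict)

    def set_connection(self, a, b, value):
        self.connection[a][b] = value

    def get_connection(self, a, b):
        # 0 means there is no connection in the direction a -> b
        return self.connection[a].get(b, 0.0)

# Example: the connection A -> B need not equal the connection B -> A.
net = CognitiveNetwork()
net.set_connection("A", "B", 0.8)   # T_ab
net.set_connection("B", "A", 0.3)   # T_ba
print(net.get_connection("A", "B"), net.get_connection("B", "A"))
```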
Relationship extraction mechanism: any mechanism capable of extracting the common feature combinations found in many similar images, similar scenes, and similar processes is a relationship extraction mechanism. Relationship extraction mechanisms include, but are not limited to, the various forms of multi-layer neural networks, rule-based methods, logic analysis, and supervised or semi-supervised learning methods known in the art, as well as the memory and forgetting mechanism proposed in the present application.
Memory function: some data increase as the number of repetitions increases. The specific manner of increase can be expressed by a function, and that function is the memory function. Note that different memory functions may be used for different types of data.
Forgetting function: some data decrease as time passes. The specific manner of decrease can be expressed by a function, and that function is the forgetting function. Note that different forgetting functions may be used for different types of data.
Memory and forgetting mechanism: in the present application, applying the memory function and the forgetting function to data is the memory and forgetting mechanism. The memory and forgetting mechanism is the relationship extraction mechanism used most widely in the present application.
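A minimal sketch of one possible memory and forgetting function pair. The application does not specify the functional forms; the saturating increase, the exponential decay, and the parameter names below are assumptions for illustration only.

```python
import math

def memory_increase(value, repetitions=1, gain=0.5):
    """Memory function (assumed form): the memory value grows with each repetition,
    approaching 1.0 instead of growing without bound."""
    for _ in range(repetitions):
        value = value + gain * (1.0 - value)
    return value

def forget_decay(value, elapsed_time, half_life=10.0):
    """Forgetting function (assumed form): the memory value decays exponentially
    as the time since the last use increases."""
    return value * math.exp(-math.log(2) * elapsed_time / half_life)

# Example: a relation repeated three times, then left unused for 20 time units.
v = memory_increase(0.2, repetitions=3)   # about 0.9 after reinforcement
v = forget_decay(v, elapsed_time=20.0)    # about 0.22 after two half-lives
print(round(v, 3))
```

Different data types (feature maps, connection values, memory frames) could use different parameters or entirely different curves, as the definition above allows.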
Cognitive network: a cognitive network is a network formed by different concepts through shared feature maps. It is a bi-directionally connected, multi-center star network. In essence, it is the network the machine forms by consolidating all of its past memories. In the present application the cognitive network may exist as a separate network, or as relationships implicit in the entire memory bank.
Bi-directional connection values are used because the connection relationships between feature maps are not symmetric, so we express them with two directional connection values. Another reason is that during chain activation, after node A passes an activation value to node B and activates it, reverse transmission from B back to A is prohibited within the same chain activation process, in order to avoid repeated activation between the two nodes.
Chain activation: when information is input, the machine searches the cognitive network and the memory bank for the corresponding bottom-layer features and assigns them activation values according to its motivation. When a node (i) is assigned an activation value (a real number) greater than its preset activation threshold Va(i), node (i) is activated and passes its activation value on to the other feature map nodes connected to it. The transfer coefficient is a function of the connection value in the cognitive network, or a function of the memory values at both ends of the transfer line in the memory bank. If the activation values a node receives, accumulated with its own initial activation value, exceed its preset activation threshold, that node is activated in turn and passes activation values on to the feature maps connected to it. This activation proceeds in a chain until no new activation occurs and the whole transfer process stops; this is called the chain activation process.
Chain activation is a search method: a method of finding the feature maps most relevant to a combination of bottom-layer features, the concepts most relevant to certain feature maps, the one or more memories (experiences) most relevant to certain concepts, and the concepts most relevant to a given motivation. The chain activation method is essentially a search or lookup method, and it can be replaced by other search or lookup methods that perform similar functions.
The connection values in the cognitive network are real numbers between 0 and 1. 0 represents no connection; 1 represents a peer-to-peer connection, for example, the connection value between the name of an object and its feature map is usually 1. Each connection value independently expresses how well a node represents the central feature map; the values do not constrain one another. For example, there is no requirement that the connection values around a concept node sum to 1. Real numbers between 0 and 1 are used as connection values here to prevent the chain activation process from failing to converge, because in our embodiment the simplest multiplication is used as the transfer function. If other transfer functions are used, the connection values may lie in other ranges, but the overall constraint remains: the activation value passed on must be smaller than the activation value of the node initiating the activation. This ensures that the chain activation process eventually stops.
Highlighting: after the input bottom-layer features are searched in the cognitive network or the memory bank, if one or more feature maps receive one or more marks, those feature maps are highlighted in the cognitive network or memory bank. The machine treats these feature maps as possible recognition results. The input features are then grouped and segmented with reference to the candidate feature maps, and the overall similarity between the input feature combination and each candidate feature map is compared as a further criterion of similarity. For example, when chain activation is used as the search method, feature maps whose activation values exceed the activation-value noise floor of the whole cognitive network by a preset threshold are considered "highlighted". The noise floor can be calculated in different ways: the machine may use the activation values of a large number of background feature map nodes in the scene, the average activation value of the currently activated nodes, or a preset number. The specific calculation method needs to be chosen in practice. These calculations involve only basic statistics and are well known to practitioners in the art; the particular choice does not affect the framework of the method and steps claimed in the present application.
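To make the chain activation and highlighting steps concrete, the following is a minimal sketch under the simplest choices mentioned above: multiplicative transfer, connection values in [0, 1], no reverse transmission along an edge within one activation pass, and the mean activation value used as the noise floor. All names and the noise_margin parameter are illustrative assumptions, not taken from the patent.

```python
from collections import deque

def chain_activation(connections, thresholds, initial, noise_margin=0.2):
    """connections[i][j]: directional connection value from node i to node j (0..1).
    thresholds[i]: activation threshold Va(i).  initial[i]: initial activation value.
    Returns the accumulated activation values and the 'highlighted' nodes."""
    activation = dict(initial)
    fired = set()                      # nodes that have already propagated activation
    used_edges = set()                 # (i, j) pairs already used; blocks j -> i reverse transfer
    queue = deque(n for n, v in initial.items() if v > thresholds.get(n, 0.0))

    while queue:
        i = queue.popleft()
        if i in fired:
            continue
        fired.add(i)
        for j, t_ij in connections.get(i, {}).items():
            if (j, i) in used_edges:   # reverse transmission prohibited within the same pass
                continue
            used_edges.add((i, j))
            passed = activation[i] * t_ij           # multiplicative transfer, always < source value
            activation[j] = activation.get(j, 0.0) + passed
            if j not in fired and activation[j] > thresholds.get(j, 0.0):
                queue.append(j)

    # "Highlighting": nodes whose activation exceeds the noise floor by a preset margin.
    floor = sum(activation.values()) / max(len(activation), 1)
    highlighted = [n for n, v in activation.items() if v > floor + noise_margin]
    return activation, highlighted

# Tiny example network: "water" strongly connected to "river", weakly to "fire".
conn = {"water": {"river": 0.9, "fire": 0.1}, "river": {"water": 0.8}}
act, hot = chain_activation(conn, thresholds={"water": 0.2, "river": 0.3, "fire": 0.3},
                            initial={"water": 1.0})
print(act, hot)   # "water" and "river" end up highlighted; "fire" stays near the noise floor
```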
Mirror space: after the machine enters an environment, it identifies specific things, scenes, and processes by extracting the underlying features of images, language, and the other sensor inputs. The same thing, scene, and process features found in memory are overlaid onto the similar parts of reality, so that the machine can infer the parts of the current thing, scene, or process it cannot currently see, including occluded parts of things, occluded parts of a scene, and the earlier and later parts of a process the machine has not observed. Since the size of an object is part of its feature map, the machine also compares the apparent size of a specific object in the field of view with the normal size of that object in the feature map, helping it establish depth of field in the environment. This is also a process of using memory to help understand information. By overlapping its own first-person view of the real environment with the similar parts of the third-person view in the remembered environment, the machine determines the positional relationship between itself and the environment; it thus holds both a first-person and a third-person perspective of its own position. This is why such an overlapping space is called a mirror space.
After recognizing the feature maps of the input information, the machine recalls related memories through those feature maps and establishes a mirror space. The machine then recombines memory and input information by segmented imitation, composing a new information sequence in order to understand the input and create an output response. This is also the creation of new memory. When the machine stores the new memory, it also stores the mirror space; what is stored is not a recording of the input information, but the extracted bottom-layer features and their updated memory values.
Frame memory: in the mirror space, every time an event occurs the machine takes a snapshot of the mirror space and saves it. The saved content includes the underlying features in the mirror space and their memory values; this is a memory frame. An event occurring in the mirror space means that, compared with the previous mirror space, the similarity of the combination of bottom-layer features has changed beyond a preset value, or the memory value of a bottom-layer feature has changed beyond a preset value.
Memory storage: memory storage means that the machine stores the whole mirror space, including all extracted underlying features, their combination relationships (including relative positional relationships), and the memory values the underlying features carry.
Memory bank: the database formed by the memory storage is a memory bank.
Temporary memory bank: the memory bank may be a combination of several subordinate memory banks, and these subordinate banks may use different memory and forgetting curves. The temporary memory bank may be one of these subordinate banks; its purpose is to buffer memory storage and to screen the material that needs to enter long-term memory.
In the present application, the capacity of the temporary memory bank is limited by using a stack of finite capacity, and the temporary memory bank is maintained by memory and forgetting. The temporary memory bank generally uses fast memorizing and fast forgetting to screen the material to be placed into the long-term memory bank. When the machine faces a large amount of input information, the things, scenes, and processes it already knows well, or those far from its point of interest, give it no motivation for deep analysis, so the machine may not recognize these data, or may assign them only low activation values. When the machine stores information into the temporary memory bank according to the event-driven method, the memory value it assigns to a new feature or new feature combination is positively correlated with the activation value. Memories with low memory values are quickly forgotten from the temporary memory bank and never enter the long-term memory bank. In this way, only the information we care about enters long-term memory, and the daily trivia from which no relationships need to be extracted is not memorized. In addition, because the capacity of the temporary memory bank is limited, it also passively accelerates forgetting when the stack approaches saturation.
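A possible minimal realization of such a capacity-limited temporary memory bank is sketched below; the capacity, half-life, and promotion threshold are assumed parameters, and the class name is hypothetical.

```python
import math
from collections import OrderedDict

class TemporaryMemory:
    """Sketch of a capacity-limited temporary memory bank with fast forgetting.
    Entries whose memory value stays high are promoted into long-term memory."""

    def __init__(self, capacity=100, half_life=5.0, promote_at=0.8, forget_below=0.05):
        self.capacity = capacity
        self.half_life = half_life            # short half-life: fast forgetting
        self.promote_at = promote_at
        self.forget_below = forget_below
        self.items = OrderedDict()            # key -> (memory value, time of last update)
        self.long_term = {}

    def store(self, key, activation, now):
        # the memory value assigned on entry is positively correlated with the activation value;
        # storing the same key again reinforces it (memory function)
        old, _ = self.items.get(key, (0.0, now))
        self.items[key] = (min(1.0, old + activation), now)
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:   # stack near saturation: oldest entries are dropped
            self.items.popitem(last=False)

    def maintain(self, now):
        # forgetting function applied to every entry; strong survivors are promoted
        for key, (value, t) in list(self.items.items()):
            value *= math.exp(-math.log(2) * (now - t) / self.half_life)
            if value >= self.promote_at:
                self.long_term[key] = value
                del self.items[key]
            elif value < self.forget_below:
                del self.items[key]
            else:
                self.items[key] = (value, now)

tm = TemporaryMemory(capacity=3)
tm.store("wolf howl", activation=0.9, now=0.0)     # salient: high activation
tm.store("leaf color", activation=0.1, now=0.0)    # trivial: low activation
tm.store("wolf howl", activation=0.9, now=0.5)     # repetition reinforces the memory value
tm.maintain(now=1.0)
print(sorted(tm.long_term))                        # ['wolf howl'] promoted to long-term memory
print(sorted(tm.items))                            # ['leaf color'] still fading in temporary memory
```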
Relationship network: the relationship network is the network formed by the relationships between feature maps that exist in memory. It is the product obtained when the machine extracts the similarity, temporal, and spatial relationships of the input information and optimizes them through the memory and forgetting mechanism. It can be expressed as a cognitive network carrying connection values, as a memory network carrying memory values, or as a mixture of the two.
Point of interest: a point of interest is one or more feature maps in the relationship network that the machine finds, through the input information, to be most relevant to that input. For example, when the chain activation search method is used, the points of interest are the one or more feature maps with the highest activation values that can be highlighted.
Target point of interest: the feature map the machine selects, according to its own motivation, for organizing output is the target point of interest.
Segmented imitation: the essence of segmented imitation is a process of recombination using memory and input information; it is a creative process. It takes segments and parts from memory and organizes them, together with the input information, into one or more reasonable processes. The content that persists in memory is usually content that is often used, such as common phrases, common actions, and common ways of organizing expression. These frequently used combinations are process frameworks of things, scenes, and processes, formed by the selective action of the memory and forgetting mechanism. The machine borrows these process frameworks and fills in details to form a new, concrete process. The machine then imitates this new process step by step, in segments, to understand the input information and to organize its output response.
(II) Relationships between the concepts in the present application.
The whole intelligent system is divided into three levels. The first level is the perception layer, which establishes feature maps using similarity as the criterion and simplifies the input information. The second level is the cognitive layer, which identifies the parts and the relationships that recur across similar things, scenes, and processes; it is the process of establishing the temporal and spatial relationships that, together with similarity, form the relationship network. The third level is the application layer: it uses the relationship network as a dictionary to translate feature maps; it uses the relationship network as a grammar to translate input and output information from one form to another; it uses the relationship network to recombine memory and reality in order to understand the input information and organize output responses; and it uses the relationship network and memory to weigh the advantages and disadvantages of the many possible output responses and make a selection. It is also the layer in which the memory and forgetting mechanism is carried out.
In the present application, the machine's instincts are treated as continuously input information; when the machine processes information, its instincts are default input information, and its instinctive motivations are preset motivations. The machine also, by default, takes the results of its gain and loss evaluations as output, represents gains and losses with gain and loss symbols respectively, and stores them in memory. In each memory segment, the memory value assigned to a particular gain or loss symbol is positively correlated with the gain or loss value obtained at that time.
(III) Implementation steps of general artificial intelligence in the present invention.
FIG. 1 shows the main steps for implementing general artificial intelligence. These steps are the first aspect of the present application, and the steps in FIG. 1 are described here in further detail:
Step S1: establish a feature map library and build the extraction model. The machine builds a library of underlying feature maps by finding local similarities and builds an algorithmic model for extracting these underlying feature maps. This is a preparatory process for data processing.
Step S2: extract the bottom-layer features. The machine extracts the bottom-layer features of the input information from all sensors, adjusts the positions, angles, and sizes of these bottom-layer features to those most similar to the original data, and places them so that they overlap the original data; the relative temporal and spatial positions of the bottom-layer features are thereby preserved and a mirror space is established. This step simplifies the input information.
Step S3: identify the input information. The machine looks for points of interest. This is a process of recognizing the input information, removing ambiguity, and performing feature map translation. It is similar to the recognition stage of language translation, in which context is used to identify the words coming from the information source and the identified words are translated into the vocabulary of another language.
Step S4: understand the input information. The machine organizes the points of interest into one or more understandable sequences. This process is similar to language translation, in which the words of the target language are reorganized, using grammar, into understandable language structures. The specific method employed in this step is segmented imitation.
Step S5: select a response. The machine adds the translated input information to its own motivations and searches for target points of interest. Using the relationship network and memory, the machine constructs responses to the input information and evaluates them with the gain and loss evaluation system, until a response is found that passes the evaluation. Under the principle of pursuing benefit and avoiding harm, the machine tentatively proposes various outputs and evaluates their gains and losses.
Step S6: convert the response to an output form. The machine converts the selected sequence into an output form through segmented imitation.
Step S7: update the databases. According to how the data were used in steps S1 through S6, the machine updates the feature maps, concepts, relationship network, and memory following the memory and forgetting mechanism.
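As an organizational aid only (not an implementation of the patent), the flow of steps S2 through S7 can be sketched as a single processing loop; every function named below is a hypothetical placeholder, and the stub class exists only so that the skeleton runs.

```python
class MachineStub:
    """Trivial placeholders so the skeleton below runs; in the patent each step is a full subsystem."""
    def extract_bottom_features(self, raw): return raw
    def build_mirror_space(self, feats): return {"features": feats}
    def identify(self, space): return space["features"]
    def segment_imitation(self, points): return list(points)
    def select_response(self, seqs): return seqs[:1]
    def realize(self, resp): return f"output: {resp}"
    def update_memory(self, space, resp): pass

def process_input(raw_inputs, machine):
    features = machine.extract_bottom_features(raw_inputs)   # S2: simplify input into bottom-layer features
    mirror_space = machine.build_mirror_space(features)      # S2: preserve temporal and spatial relations
    points = machine.identify(mirror_space)                  # S3: find points of interest
    understood = machine.segment_imitation(points)           # S4: organize into understandable sequences
    response = machine.select_response(understood)           # S5: choose via gain and loss evaluation
    output = machine.realize(response)                       # S6: convert the response to an output form
    machine.update_memory(mirror_space, response)            # S7: memory and forgetting update
    return output

print(process_input(["edge", "texture", "tone"], MachineStub()))
```

Step S1 (building the bottom-layer feature library and extraction model) is assumed to have been completed before this loop runs.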
In the above steps, S1 and S2 are simplifications of information. Their essence is the basic assumption of the similarity relationship: things that are similar in some respects may also be similar in other respects. Our brain uses exactly this kind of similarity to classify things; it is a natural human ability. The role of classification is the generalization of experience. For example, if something is edible, then something that looks and smells similar to it may also be edible. Without this ability, intelligence could not develop. So in steps S1 and S2 we establish underlying features by looking for local similarities among similar things, and use them to compare the similarity between things.
Since no two things in the world are identical, similarity comparison is a process of comparing core information with details removed. It is therefore usually necessary to preprocess the input information and to extract and compare the outline (specific edges), dynamics (pattern of change), texture, and so on of an object. This, too, is a gift evolution has given life, because in the world we inhabit, things with similar attributes such as contour, dynamic pattern, and texture are likely to be similar in other respects as well. If we lived in a world where the shape, texture, and dynamic pattern of an object could change arbitrarily, we might need to develop different ways of thinking, including different methods of extracting underlying features; in that world we would likewise need to develop a different artificial intelligence. On top of this innate ability to compare similarity, human beings established language symbols through learning and continually summarized experience, pushing the human ability to classify toward both the more general and the more detailed, and assigning language symbols to represent the classifications. Language symbols are also used to express the relationships between these concepts, and this is where we surpass the animals.
Human intelligence is a result of evolution. Before language symbols arose, our ancestors exploring the world had to rely on the information given by basic senses such as images, sounds, and smells to recognize the world and compare similarities. The present application adopts the same method: all input information is reduced to the way of thinking of our ancestors for processing. The reason is that evolution takes a very long time; compared with our ancestors we have not evolved a new way of processing underlying information, but have merely added, through language, a conversion tool between underlying information and language on top of that processing. Through the layer of language, we establish associations with information that is not similar in appearance, sound, smell, or taste, and we pass those associations on to our descendants.
People may speak different languages and even belong to different ethnic groups, yet they have similar ways of thinking and similar behaviors; all of this indicates that our underlying thinking is independent of language. This is also the purpose of building underlying feature maps in the present invention. The underlying feature maps extract similarities in images, sounds, and other sensor inputs to establish classifications, and use these classifications to represent different categories of information. They are language independent, and their purpose is to simplify the input information as preprocessing for subsequent information processing.
We regard the time and space in which things appear as a relationship. Such relationships are easy to see, because things related in time often belong to the same process. For example, when a wild animal rushed toward an ancestor, there were not only images of the animal but also its motion pattern, specific sounds, and specific environmental information; all of this entered the ancestor's information processing system, and after many similar occurrences the ancestor associated the information that could recur, appear simultaneously, or follow a temporal order, using it as experience to better adapt to the environment and survive. Likewise, the space in which things appear together is also a relationship, such as fish and water, caves and animals, the sun and daytime. These pieces of information can appear simultaneously because they do have an inherent relationship, and that is why they occur together. Our ancestors summarized these relationships through memory and forgetting, storing in long-term memory the links between information that could recur, thereby improving their chances of survival. So the memory and forgetting mechanism is also a gift that evolution has given us.
Therefore, in the present application, we use the mirror space to store information. "Mirror" means that mirror-image data of the outside world are stored: the bottom-layer features replace the original data and are placed in the positions most similar to the original data, so the similarity relationships are preserved. The mirror space also stores some information about the machine itself, such as its motivations and its gain and loss calculations. In the present application we represent this information with underlying feature maps as well, so it is treated the same as the other underlying features.
When information is stored in the form of a mirror space, the temporal and spatial relationships of the information, as obtained and perceived, are stored together with the information. The similar information across memory segments then links the segments together, and the temporal and spatial relationships attached to that similar information are linked as well, forming a three-dimensional relationship network. But we must also look for the commonalities and remove the interfering information that cannot recur. The method for completing this step is the memory and forgetting mechanism, a process of discarding the false and keeping the true.
In this way we are not troubled by the complicated relationships between things: we quantify the relationships directly through the three elements and their repeatability, simplify the complicated relationships among things, and establish a relationship network among them.
If we regard memory as a volume containing countless underlying features, then the relationship network is the texture running through this volume. These veins appear because of the memory and forgetting mechanism: relationships that do not recur are forgotten, and relationships that recur are strengthened. The feature maps connected by thick relationship veins constitute concepts. A concept is a local network that links images, speech, text, or any other representation of the same kind of information; because these representations frequently appear together and are frequently translated into one another, the connections between them are tighter. There are also combinations in the relationship network that recur but whose connections are not as tight as those of a concept; we can use them by imitating such combinations, and we call them process frameworks. If memory is viewed as a three-dimensional warehouse of products, then concepts are the small parts used frequently across products, process frameworks are intermediate components, and a specific memory is a product. The small parts, the intermediate components, and the various other parts together constitute all the products in the whole memory warehouse, and they exist widely across those products. What the memory and forgetting mechanism distills from them is the embodiment of the relationship network.
In step S2, to improve efficiency, we only need to identify the regions we care about, and only need to adopt a degree of recognition refinement that suits our expectations. This is the purpose of iteratively extracting the underlying features of the data with data extraction windows of different sizes in S1 and S2. In S2, the regions of interest and the recognition accuracy adopted derive from the machine's instinctive and acquired motivations, which the machine forms under the dual influence of its own needs and its current activity targets, as described in detail later.
Because we need to preserve the similarity, temporal, and spatial relationships between things, we use the method called mirror space to build a three-dimensional space out of a large number of underlying features. These underlying features include the sensor input of all external information, including but not limited to video, audio, touch, smell, and temperature, and all internal information, including the instinct states, the results of gain and loss evaluations, and gravity and attitude sensing information. The different states of the instincts may be represented by emotions. In each recall, the instinct state is an underlying feature that assigns the initial activation values to the input information. The motivation is preset, but its parameters are adjusted by the outcomes of the gain and loss evaluations. Its different states, which reflect the machine's emotions, are also stored in the mirror space. Then, when we recombine several mirror spaces, each space carries its own emotion and its own evaluation of gain and loss. The machine can naturally use weighted summation to predict the emotional response the recombined mirror space would produce, and the gains and losses it would bring.
It is through association that the components used in recombination are linked to the original memories from which they came, since those memories are chain-activated by the activation of the components. When the machine recalls a mirror space, it processes the memory information in the same way as sensor information; it can therefore also use parallax, and the relative sizes of objects, to establish depth of field and build the data into a sequence of stereoscopic images. The machine views these memories from a third-person perspective, so it can bring itself or someone else into a role in the virtual mirror space it has created. The way to do this is: 1, to handle the situation it itself faces in the virtual space; 2, to handle the situation faced by others in the virtual space. The processing method is to take these situations as hypothetical input information and run them through the same process ordinarily used for similar sensor input.
Since gravity sensing is a continuously input signal, it exists in all memories. It has connection relationships with everything in memory, and these relationships are optimized by the memory and forgetting mechanism. The directional relationship between images and gravity sensing is widely present in these memories, which is why we are very sensitive to things being upside down but much less sensitive to left-right reversal. An upside-down image prevents us from finding the familiar feature map combinations, so we must devote more attention to a second recognition, in which we may find the corresponding feature map by enlarging the memory search range or by rotating the image; this requires more attention, and that is why we are so sensitive to upside-down images.
When we are in a real environment, we recall a mirror space and overlap it with the real space, or recall parts of several mirror spaces and overlap them with the real space. In this way, from the referenced parts of the mirror space we can infer the parts of the real space that cannot currently be seen. This includes: 1, parts of the space that are temporarily invisible, which the machine can complete by imagination (mirror space recall), such as an image of the inside of a cabinet; 2, parts that are invisible in time, for example, food from one's hometown arousing memories of the hometown. This is a process of using memory. In steps S4, S5, and S6, we use such methods to understand the input information and to select the response that meets our goals, thereby establishing the output response.
The specific way data are stored in the mirror space is that, each time an event occurs, the data are stored in the combination in which the bottom-layer features best match the original data. The underlying features can be regarded approximately as compression of two-dimensional data, while the event-driven storage mechanism is compression of data over time. They may also be replaced, wholly or partly, by other data compression methods; but whatever the method, the similarity, temporal, and spatial relationships of things must be preserved. At the same time, the machine's internal information at the corresponding moment, such as its instinct state, the results of its gain and loss evaluation, and its gravity and attitude sensing, is also stored. The information stored in the mirror space, both external and internal, carries its own memory values, and this information follows the memory and forgetting mechanism. The large number of mirror spaces stored in temporal order constitutes memory. The machine records in an event-driven manner, that is, it records the mirror space again only when an "event" occurs in it; an event means that, compared with the previous mirror space, the similarity of the bottom-layer feature combination has changed beyond a preset value, or the memory value of a bottom-layer feature has changed beyond a preset value. When the machine recalls a memory, it reconstructs a stereogram of appropriate size through binocular parallax, the relative sizes in the feature maps, and the size of the attention region.
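A minimal sketch of the event-driven recording rule just described. The change measures (a simple fraction of differing features and the largest memory value change) and the threshold values are assumptions chosen only for illustration.

```python
def feature_change(prev, curr):
    """Fraction of bottom-layer features that differ between two mirror-space snapshots
    (a stand-in for the similarity comparison; prev and curr map feature id -> memory value)."""
    keys = set(prev) | set(curr)
    if not keys:
        return 0.0
    changed = sum(1 for k in keys if k not in prev or k not in curr)
    return changed / len(keys)

def memory_value_change(prev, curr):
    """Largest change in memory value among features present in both snapshots."""
    common = set(prev) & set(curr)
    return max((abs(curr[k] - prev[k]) for k in common), default=0.0)

def record_if_event(history, snapshot, sim_threshold=0.3, mem_threshold=0.2):
    """Event-driven recording: store the snapshot only if an 'event' occurred, i.e. the feature
    combination or the memory values changed beyond the preset thresholds."""
    if not history:
        history.append(dict(snapshot))
        return True
    prev = history[-1]
    if feature_change(prev, snapshot) > sim_threshold or memory_value_change(prev, snapshot) > mem_threshold:
        history.append(dict(snapshot))
        return True
    return False

frames = []
record_if_event(frames, {"wolf": 0.9, "grass": 0.2})                 # first frame is always stored
record_if_event(frames, {"wolf": 0.9, "grass": 0.21})                # negligible change: not stored
record_if_event(frames, {"wolf": 0.9, "grass": 0.2, "cave": 0.7})    # new feature appears: stored
print(len(frames))                                                    # 2
```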
The purpose of step S3 is to find the points of interest. There are many ways to find them. For example, the underlying feature maps can be searched in memory by similarity comparison, with a mark added each time a match is found. When the marks accumulated by some combination of bottom-layer features in memory reach a preset threshold, that combination is treated as a feature map candidate. The machine then refers to the candidate feature map as a whole to segment the input underlying features and further compares the similarity of the feature combination patterns between the two. This process continues until all feature map candidates are found. Then, based on how tightly the candidates are connected to other information, when several candidates correspond to one input, the feature map most closely connected with the other information is selected as the most probable one; this feature map is the point of interest. This process can either determine the points of interest from the marks and connection relationships after all the underlying features have been processed, or give priority to recognition as soon as any feature map reaches a predetermined criterion.
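A small sketch of this marking method. The example vocabulary, the threshold, and the function names are illustrative assumptions; ties among candidates would then be resolved by connection tightness as described above.

```python
from collections import Counter

def find_candidates(input_features, memory_combinations, mark_threshold=3):
    """Each input bottom-layer feature that also occurs in a stored feature combination adds one
    mark to that combination; combinations whose mark count reaches the threshold become
    feature map candidates."""
    marks = Counter()
    for feat in input_features:
        for name, combo in memory_combinations.items():
            if feat in combo:
                marks[name] += 1
    return [name for name, count in marks.items() if count >= mark_threshold]

memory = {
    "dog": {"fur", "four_legs", "tail", "bark"},
    "cat": {"fur", "four_legs", "tail", "meow"},
    "fish": {"scales", "fins", "tail"},
}
print(find_candidates(["fur", "four_legs", "tail", "bark"], memory))
# ['dog', 'cat'] both reach the threshold; the tie is then broken by connection tightness
```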
In addition to similarity comparison, another method is proposed in the present application: the chain activation method, a method of searching for feature maps, concepts, and related memories on the basis of the relationship network. In the relationship network, when feature map i is assigned an initial activation value greater than its preset activation threshold Va(i), feature map i is activated and passes its activation value to the other feature map nodes connected to it. If the activation values a feature map receives, accumulated with its own initial activation value, exceed the preset activation threshold of that node, the feature map is activated and passes activation values to the other feature maps connected to it. This activation is transmitted in a chain, and the whole transfer process stops when no new activation occurs; this is called the chain activation process. Within a single chain activation process, once activation has been transferred from feature map i to feature map j, reverse transfer from j to i is prohibited.
When chain activation is needed, the machine assigns initial activation values to the extracted bottom-layer feature maps according to its own motivation. These initial activation values may all be equal within a single assignment, which simplifies the assignment system. Once the nodes obtain their initial activation values, the chain activation process begins. After it completes, the machine selects the feature maps that are most strongly activated and can be highlighted, and takes them as the points of interest. This method makes full use of the relationships in the relationship network and is an efficient search method.
It should be noted, however, that because of the activation threshold, even if the transfer coefficients between feature maps are linear and the accumulation function is linear, the final distribution of activation values differs whenever the order of activation differs, whether within a single chain activation process or across multiple processes, even with the same feature maps and the same initial activation values. This is the nonlinearity introduced by the activation threshold: different transmission paths lose different amounts of information. A preference in the order of activation corresponds to a difference in machine personality, so different machines produce different thinking results from the same input information, a phenomenon consistent with human beings.
In addition, the strength of a relationship in the relationship network is correlated with its most recent memory value (or connection value), so the machine shows a recency effect. For example, suppose two machines with the same relationship network face the same feature maps and the same initial activation values, but one of them has just processed an extra piece of input information concerning those feature maps and has updated the relevant part of its relationship network afterwards. One of its relationship lines may have increased according to the memory curve, and this increased memory value does not subside in a short time. Therefore, facing the same feature maps and the same initial activation values, the machine that processed the extra information will propagate more activation value along the just-strengthened relationship line, so that what was most recently reinforced comes to mind first.
In addition, in order to handle the sequence of information input reasonably and to ensure that the activation values brought by later input are not masked by earlier input, in the present application the activation values in chain activation decay with time. If the activation values in the relationship network did not fade over time, the change in activation caused by later information would not be significant enough, causing interference between pieces of information: the later input would be strongly interfered with by the earlier input, and the machine could not correctly find the points of interest. But if the activation of the earlier information were cleared completely, the connection that may exist between the earlier and later information would be lost. Therefore, in the present invention we propose a progressive fading method to balance the isolation and the concatenation of successive segments of information; the fading parameter needs to be tuned in practice. This, however, raises the problem of maintaining the active state of a piece of information. If the machine finds the points of interest in S3 but cannot complete information understanding in S4, or cannot find a response scheme that satisfies its gain and loss evaluation system in S5, the activation values fade as time passes, causing the machine to forget the points of interest and forget what it was going to do with them. The machine then needs to refresh the activation values of these points of interest. One way to refresh is to convert the points of interest into virtual output and feed that output back in as input, repeating the process to emphasize the points of interest; this is why people like to talk to themselves or mutter when thinking hard and not yet understanding something or not yet finding an idea. Moreover, if new input information arrives at this moment, the machine has to interrupt its train of thought to process it; from an energy-saving perspective, machines therefore tend to avoid having their thinking interrupted. At such a moment the machine may actively emit buffering filler words, producing output that signals it is thinking and should not be disturbed. It is also possible that the machine is given only limited thinking time, or is overloaded with information and needs to complete a response as soon as possible; in that case the machine can likewise use output-as-input to emphasize useful information and suppress interfering information. These modes are commonly used by humans, and in the present application we also introduce them into the machine's thinking. The machine can determine, according to a built-in program, its own experience, or a mixture of the two, whether the current thinking time has exceeded the normal time and it needs to refresh the attention information, or to tell others it is thinking, or to emphasize key points in order to suppress interfering information.
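A minimal sketch of such progressive fading and of refreshing a point of interest by re-exciting it (as a virtual output fed back in would do). The exponential decay, half-life, and boost value are assumptions made for illustration.

```python
import math

class ActivationState:
    """Sketch of progressive fading: activation values decay over time so that later input is not
    masked by earlier input, yet the earlier activation is not cleared outright."""

    def __init__(self, fade_half_life=3.0):
        self.fade_half_life = fade_half_life
        self.values = {}          # node -> (activation value, time of last refresh)

    def excite(self, node, value, now):
        old = self.current(node, now)
        self.values[node] = (old + value, now)

    def current(self, node, now):
        value, t = self.values.get(node, (0.0, now))
        return value * math.exp(-math.log(2) * (now - t) / self.fade_half_life)

    def refresh_attention(self, nodes, now, boost=0.5):
        """Re-emphasize points of interest, e.g. after feeding them back as virtual output."""
        for node in nodes:
            self.excite(node, boost, now)

state = ActivationState()
state.excite("goal", 1.0, now=0.0)
print(round(state.current("goal", now=6.0), 3))   # faded to about 0.25 after two half-lives
state.refresh_attention(["goal"], now=6.0)
print(round(state.current("goal", now=6.0), 3))   # refreshed back up to about 0.75
```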
In addition, in order to correctly determine the connection strength between feature maps, one method is as follows: the connection values emitted by the same feature map do not constrain one another, but the activation value transfer function of the feature map may use normalized transfer, so that the relationship between a feature map and its attributes is handled correctly during activation. Assuming that the activation value of feature map X is a, that the sum of the connection values over all of X's outgoing directions is H, and that the connection value toward feature map Y is Txy, a simple normalized activation value transfer is Yxy = a * Txy / H, where Yxy is the activation value passed from feature map X to feature map Y.
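A one-line sketch of this normalized transfer; the reconstruction Yxy = a * Txy / H is inferred from the definitions in the preceding paragraph, and the function name is hypothetical.

```python
def normalized_transfer(a, t_xy, outgoing_connections):
    """Activation passed from X to Y, normalized by the sum H of all of X's outgoing connection values."""
    h = sum(outgoing_connections)
    return a * t_xy / h if h else 0.0

# X has activation 1.0 and outgoing connection values 0.6, 0.3, 0.1 (so H = 1.0);
# the transfer toward the node with connection value 0.6 carries 0.6 of the activation.
print(normalized_transfer(1.0, 0.6, [0.6, 0.3, 0.1]))
```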
Since speech and text are the most frequent forms of human interaction, in the local network of a concept the speech and text are usually connected to all of the concept's attributes. The attributes of a concept are all of its feature maps, which may include many similar images in memory, various voices, smells, touches, and so on, all linked to one type of image. These feature maps obtain activation values from every branch of the relationship network and all transmit them to the speech or text, so the usual points of interest fall on the speech and text of concepts. Therefore, in the machine's method of filtering or emphasizing its own information, the virtual output is usually speech, because it is the most common output mode and the one the machine can output with the least energy. This is, of course, closely related to an individual's process of growing up; for example, a person who learns mainly from books may instead convert information into text and input it again.
After the machine finds the points of interest through step S3, it proceeds to step S4. In step S4, the machine needs to convert the points of interest into image feature maps. This conversion process is concept translation. A concept is a tightly connected local relationship network, in which speech, text, and other forms of information may all represent the same concept. For human beings, information other than language is retained in its original form, such as images, feelings, and emotions. What the machine mainly translates is language: the machine replaces the language with the feature maps most closely connected to it, translating the language into the corresponding feature maps. For example, the speech "happiness" can be converted into a typical remembered scene representing the concept of happiness.
The machine then needs to combine these feature maps into an understandable sequence. In essence, step S4 arranges the image feature maps representing the input information (including static feature maps, scene feature maps, and process feature maps) in a proper order, and forms a reasonable sequence by adding or removing partial content. The basis for this adjustment is to imitate how this information was combined in memory.
The process is as if a warehouse manager looked for the corresponding parts based on the input drawings and then combined those parts by imitating previous products (i.e., multiple segments of memory). Understanding the purpose of the drawing then means finding the required parts according to the drawing (this realizes concept translation) and then looking at how these parts were combined before (this is searching for related memory). Among the previous products, the machine may find common feature map combinations, that is, combinations of parts that appeared frequently (in memory, these are what the memory and forgetting mechanisms retain from similar things). The machine then preferentially selects the large components that contain the most input information, and combines the other parts with reference to maximum probability. Some parts may combine into another large component; some parts may be attached to a large component. These combinations are made, with reference to the relationship network, in the way that gives the strongest connections among parts, components, and large components, finally forming a product (analogous to the virtual process in memory).
Facing this self-created virtual process, the machine takes the virtual process as information input and uses the relationship network to search for memories associated with it. By bringing these memories into the selection of the target response, the machine can, through gain-and-loss evaluation, select responses that meet its own motivations. The memories related to the virtual process include: when facing similar processes before, what the machine's state was and how it responded; and when the machine itself initiated similar processes before, what the state of others was and how they responded. All of these can be found through memory and brought into the scope of organizing the target response. The method specifically comprises: 1, understanding the implicit information beyond the literal information by recalling the state of the information source when it sent similar information before; the state in which the information source sends this information includes why it sent it. 2, inferring the purpose of the information source from the responses the machine itself made after receiving similar information before; the information source must send this information based on the machine's past responses to it, and those responses are what the information source expects; otherwise the information source would not necessarily send this information. 3, using the machine's own memories of the states in which it sent similar information, to put itself in the source's position and further understand more possible implicit information of the information source. 4, evaluating the gains and losses that would result from satisfying the expectation of the information source, through the feedback the machine received after it sent similar information itself. The machine brings all four types of memory into the related memory pool, uses the components in the related memory pool to assemble its various possible responses, and uses the gain-and-loss system to evaluate those responses so as to select a response that meets its goals.
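As a hedged illustration of gathering these four kinds of memory into a related memory pool, the sketch below assumes a hypothetical record structure with fields such as 'sender', 'information', 'source_state', 'own_response', 'own_state', and 'feedback'; none of these names appear in the application itself.

```python
def build_memory_pool(memory_bank, virtual_process, is_similar):
    """Collect the four categories of related memory described above (illustrative only)."""
    pool = {1: [], 2: [], 3: [], 4: []}
    for m in memory_bank:
        # Only memories involving information similar to the current virtual process qualify.
        if not is_similar(m['information'], virtual_process):
            continue
        if m['sender'] == 'other':
            pool[1].append(m['source_state'])   # 1: the source's state when it sent similar info
            pool[2].append(m['own_response'])   # 2: the machine's past response (what the source expects)
        else:  # the machine itself sent similar information
            pool[3].append(m['own_state'])      # 3: the machine's own state when sending similar info
            pool[4].append(m['feedback'])       # 4: feedback received afterwards (gain/loss estimate)
    return pool
```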
In the relationship network, the combination relationships of components are important contexts. They are the common words, phrases, and sentence patterns of a language; they are the common key steps of an action, that is, of a process, such as buying a ticket, going to the airport, passing the security check, and boarding. These key steps are formed by the memory and forgetting mechanism, which, from one learning instance to the next, forgets details and remembers common features. The key steps contain temporal and spatial information; they constitute the framework of the process, which the machine can imitate when it establishes a response. Human beings attach language symbols to many process frameworks through language, so these frameworks appear, on the surface, to be organized by language, but the underlying organization is still image feature maps. The machine needs to expand the process frameworks represented by these languages (some process frameworks may not be represented by a single concept, but by multiple concepts, that is, by information that needs a sentence or a passage to express) into corresponding process features (the feature maps of the key steps of the process) to be imitated. For example, the feature maps expanded from the concept of "going to the airport" may be a few symbolic pictures left over, by the memory and forgetting mechanism, from the processes of driving or riding to the airport after the specific details are forgotten. These symbolic feature maps activate the associated memories and allow the concept to be expanded further, for example to imitate past memories, to call a ride-hailing service, or to start preparing luggage. When preparing luggage, previous memories used a suitcase, but this time there is no suitcase; the machine then needs to search all memories of organizing luggage under similar circumstances, and if it cannot establish, in the subsequent steps, a response that satisfies its own gain-and-loss evaluation, it expands the memory again to widen the range of imitation. Under a large process framework, the machine handles every concept in a similar way and combines them with reference to the temporal and spatial information of the same parts in each memory. If the sequence cannot be organized, the memory is expanded further and the concepts are expanded further. This process iterates and finally forms a tower-shaped imitation structure, which is the virtual process. The machine decides whether to select a tower-shaped imitation structure for imitation and response by evaluating the gains and losses of that structure. The components used to assemble the tower-shaped structure come, in the initial stage, from the related memory pool obtained by recalling memory from the 4 aspects mentioned earlier; afterwards, the content of the memory pool keeps growing as the concepts are expanded, and the content of attention keeps changing. A memory that enters the memory pool is considered to have been used once, and its memory value increases according to the memory curve. These parts are called upon often because they are the key steps of various processes; on the other hand, because their memory values are high and they are not easily forgotten, they are also easy to find. This is therefore a positive-feedback reinforcement process. This process is the segmented imitation (piecewise simulation) process proposed in the present application.
It should be noted that language output is also a piecewise simulation process. When the machine imitates previous language experience to make a language response, it can only use part of that previous experience (its parts), because the specific scenes differ. The language experiences that can be imitated frequently are common sentences, common phrases, and idioms, because they are the common parts found across a large number of utterances, such as conjunctions, particles, interjections, common words, and common sentence patterns, which can be imitated in many situations. Each use of these objects increases their activation value according to the memory curve, and eventually they too become process frameworks. In responding, the machine imitates these frameworks and then expands memory to fit the details into the frameworks, thereby constituting the speech output.
If the machine fails to establish a reasonable response in the subsequent step S5, it is possible that incorrect information was organized in step S4, or that an error occurred in any of the preceding steps. At this point the machine enters the processing flow for "unintelligible information". That is, "unintelligible information" is itself one kind of understanding of the information. The machine builds a response to "unintelligible information" based on its own experience. These responses may be to leave it alone, to re-extract underlying features, to re-identify feature maps and re-establish points of interest, to reselect a response, and so on.
In step S5, the machine needs to add its own motivation on top of its understanding of the information, and select a satisfactory response from the various possible responses according to the principle of seeking benefit and avoiding harm. This is the most complex step in the machine's thinking, and most of the machine's thinking time is spent here. Based on its understanding of the input information (the purpose and state of the information source, the state of the environment) and on the initial memory pool established by searching memory from the 4 aspects, the machine creates various possible responses and then selects a reasonable one to output.
The selection method is based on the instincts preset for the machine and the machine's gain-and-loss evaluation of the various responses, selecting responses according to the principle of seeking benefit and avoiding harm. A person's instinctive motivation is to maintain a good state of existence. Whatever is beneficial to this instinct is a "gain"; whatever is harmful to this goal is a "loss". After a human being is born, this instinctive monitoring system starts to operate and constantly judges "gain" and "loss". For example, "milk" satisfies a child's instinctive needs, so it is a "gain"; being scolded implies a threat to survival, so it is a "loss"; receiving hugs and attention satisfies the "need for safety" and is a "gain", while being ignored leaves the "need for safety" unmet and is a "loss". As learning progresses, children further conclude that "food" is a gain, "money" is a gain, "dominance" is a gain, and "good interpersonal relationships" are gains; these are all developed on the basis of, and in service of, the instincts. We likewise introduce this mechanism into machine intelligence, letting the machine observe a "machine convention" as the basic criterion for gain-and-loss evaluation. In the machine's learning process, each time the machine stores a memory (mirror-image space), it synchronously stores the gain-and-loss evaluation results for that memory: a gain value and a loss value, as two numbers.
It is not feasible to tell the machine explicitly what it may do and what it may not do; the machine must learn these things itself. We only need to preset in the machine an expression representing gain and an expression representing loss, or further, to add different intensities; during learning, the machine is informed, by a preset method, whether something is a gain or a loss and roughly how strong it is. Of course, gains and losses related to the machine's own state sensors can be preset, such as the losses of collision, low battery, water intrusion, or loss of information connection, and the gains of charging when the battery is low or keeping its own data within a safe interval. The machine stores the two symbols representing gain and loss in the memories to which they are assigned, and assigns the gain and loss values, in positive correlation, as their memory values. Since things in the same mirror space have relationships with each other, and these relationships are functions of their mutual memory values, the connections between gains and the specific feature maps appearing in the same memory are continuously strengthened by the memory and forgetting mechanism; losses are treated similarly. Obviously, since the memory value obtained by a gain is proportional to the magnitude of the gain, and the memory value obtained by a loss is proportional to the magnitude of the loss, large gains and terrible losses will be remembered by the machine for life, while small gains and losses will be forgotten over time. Things that frequently bring gains become more closely connected to gain, and the same holds for losses. When the machine evaluates its own response, the virtual response is used as input to the relationship network, and the gain value accumulated on the gain symbol and the loss value accumulated on the loss symbol are obtained naturally, and then evaluated. This evaluation routine may be preset, or may be adjusted based on feedback obtained during learning. In this way the machine can sacrifice a small gain to seek a larger subsequent gain, or accept a small loss to avoid a larger one. This also provides a way for humans to ensure that the machine thinks according to their wishes, for example by making compliance with the "machine convention" a gain, helping its owner a gain, and violating the law a loss.
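The following sketch illustrates, under stated assumptions, how a virtual response could be scored by reading the activation accumulated on the gain and loss symbols; relation_network.chain_activate, GAIN_SYMBOL, and LOSS_SYMBOL are hypothetical placeholders, and the simple gain-minus-loss rule stands in for whatever evaluation routine is preset or learned.

```python
def evaluate_response(relation_network, virtual_response, initial_value=1.0):
    """Estimate the gain and loss of one candidate (virtual) response.

    The virtual response is fed back in as input; chain activation (assumed to be
    provided by the hypothetical relation_network.chain_activate method) propagates
    activation values, and the values accumulated on the gain and loss symbols are
    read out as the gain value and loss value of this response.
    """
    activations = relation_network.chain_activate(virtual_response, initial_value)
    gain = activations.get('GAIN_SYMBOL', 0.0)
    loss = activations.get('LOSS_SYMBOL', 0.0)
    # Simple trade-off rule; the real evaluation routine may be preset or adjusted by learning.
    return gain - loss
```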
In addition, the machine also records, in the corresponding memory, the parameter settings used when its motivation assigns values to the input information. These motivation parameters represent emotions of the machine, each assigned a magnitude, such as alertness, vigilance, and degree of trust. They are regulated in two ways. One is by the machine's own safety-state parameters, which are innate emotions; the innate emotions are preset. The other is by environmental factors, including the responses after facing gains and losses and the emotions brought by the environment itself, which are obtained through acquired learning. The machine attempts to expand gains and avoid losses by continually adjusting its instinct evaluation parameters, and it gradually associates satisfactory evaluation parameters with external stimuli, because they reside in the same memories, so this can be done using the memory and forgetting mechanism. The motivational state of a machine can also be revealed outwardly, providing an additional means of communication: expression.
When the machine is ready to respond, it first searches memory from the 4 aspects mentioned previously and creates a related memory pool. The machine builds various possible responses through piecewise simulation and evaluates the possible gains and losses these responses may bring. To evaluate gains and losses, the machine only needs to make one virtual input of the response it has established; after the input, by giving the information an initial activation value, the gain and loss values are obtained naturally once activation is finished. The machine decides the trade-off based on these gain and loss values. Of course, the gains and losses may be in an intermediate, transitional state; when it is difficult to accept or reject, the machine needs to bring more memory into the input information, thereby breaking the equilibrium so that a decision can be made. This process may be performed iteratively.
After the machine completes step S5, it proceeds to step S6. Step S6 is a translation process. If the machine selected speech output in step S5, the process is simple: convert the image feature maps to be output into speech, and then adjust their order by using the relationship network and memory to imitate similar language memories. This is the process of organizing words into sentences with reference to a grammar book (the relationship network). The machine then invokes its experience of pronouncing each word and its experience of expressing emotion to send the information out. By analogy, this is equivalent to the warehouse manager packaging the assembled product as required by the customer and then shipping it out directly by air.
If in S5 the machine chooses motion output, the problem becomes much more complicated. This corresponds to the product delivered to the customer being an organized activity. In S5 the warehouse manager delivers, as the product, only an activity plan, which may contain the major steps and the end goal; the rest must be improvised according to the actual situation.
1, the machine takes the sequence of image feature maps to be output as targets (intermediate targets and a final target), which involve different times and spaces. The machine needs to divide them in time and space in order to coordinate execution efficiently. The method used is to group together objects that are closely related in time and objects that are closely related in space; since the mirror space in memory carries time and space information, this step can perform the classification. (This step corresponds to going from the overall script to detailed scenes.)
2, the machine needs to combine the intermediate targets of each link with the actual situation again, use the piecewise simulation method to form several possible image sequences, and then use the gain-and-loss system again to select the sequence that meets its own goals. The machine then takes this selected sequence as a new output. The new output is a subdivided realization link under the original large output framework, only one small link in the whole output. (This is like executing a sub-script; organizing an activity follows the same flow, only with intermediate goals.)
3, this process iterates continuously, and the method used each time is the same: find possible solutions by piecewise simulation, then select the one that fits by the gain-and-loss system. In this way a large target is decomposed into smaller targets, and those into still smaller targets, subdivided layer by layer until they resolve into the machine's underlying experience, which can be achieved directly. (This is analogous to finding, during execution of a script, a step that cannot be realized directly, so that a further sub-script is needed; the activity-organizing process is applied again to smaller intermediate targets, iterating until the final target is completed.) Subdividing down to underlying experience means, for language, mobilizing the muscles to produce syllables; for an action, decomposing it into drive commands issued to the relevant "muscles". In this way, the machine may ultimately implement and complete a response process.
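One possible reading of this layer-by-layer subdivision is a recursive decomposition, sketched below; find_framework, is_primitive, and evaluate are placeholder functions, and the greedy choice of the best-scoring framework is only an assumption of this illustration.

```python
def decompose(goal, find_framework, is_primitive, evaluate):
    """Layer-by-layer goal subdivision ("piecewise simulation"), illustrative only.

    goal           -- a target feature-map sequence
    find_framework -- returns candidate process frameworks (sub-goal sequences) from memory
    is_primitive   -- True when a sub-goal maps directly onto underlying experience
                      (e.g. a drive command to a "muscle")
    evaluate       -- gain-and-loss score used to choose among candidate frameworks
    """
    if is_primitive(goal):
        return [goal]
    candidates = find_framework(goal)
    if not candidates:
        return [goal]                      # no framework found; sketch simplification
    best = max(candidates, key=evaluate)   # pick the framework that best fits the machine's goals
    plan = []
    for sub_goal in best:                  # each intermediate target is subdivided in turn
        plan.extend(decompose(sub_goal, find_framework, is_primitive, evaluate))
    return plan
```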
In this process, new information may be encountered at any time, so the machine may need to process several kinds of information, and the original targets become inherited motivations. (This is equivalent to constantly encountering new situations while organizing an activity, which must be resolved immediately, otherwise the activity cannot be organized.)
Step S7, establishing new memory and updating the relationship network, runs throughout all the steps. It is not a separate step; it is the process of maintaining the memory system within every step. Its core is the memory and forgetting mechanism.
It should be further noted that the steps are divided here for convenience of describing the whole process. The subdivision of the above method into other steps still falls within the scope of the claims of the present patent application.
In the above steps, the following are involved: the establishment, identification, and optimization of feature maps; the establishment, identification, and optimization of concepts; the finding of points of interest; the finding of the most relevant memories through the points of interest; the piecewise simulation of one or more segments of memory; and the screening and storage of memory data. These are the specific means for implementing the first aspect of the present application, and they constitute the second aspect disclosed in the present application.
The disclosure of the second aspect of the present application comprises:
the present application provides a process for establishing a feature map, comprising:
the machine establishes underlying features by comparing local similarities in step S1; an underlying feature is also a feature map. In step S3, if the machine finds underlying features for which no matching feature map can be found in the relationship network, it combines these features into a simple map, stores the simple map in temporary memory, and gives it a memory value positively correlated with its activation value. The feature maps established by these two methods are not necessarily common features of similar things or processes; they need to be retained by learning a large number of similar things or processes, finally becoming long-term memory with the help of the relationship extraction mechanism.
In the present application, a feature map identification process is provided, including:
the machine finds a relevant feature map in the relationship network by searching the underlying features, and then marks the relevant feature map. Those feature maps that are labeled multiple times are likely candidates. The machine uses the candidates in the relational network to segment the input underlying features and compares the overall similarity of the two. If the similarity reaches a preset standard, the machine considers that the characteristic diagram is recognized. Another feature pattern recognition process is the use of chain activation. After initial activation values are given to the underlying features, the feature maps with high activation values are selected as candidates. The machine also uses the candidates in the relational network to segment the input underlying features and compare the overall similarity of the two. If the similarity reaches a preset standard, the machine considers that the characteristic diagram is recognized. Compared with the method for finding the attention point proposed by the present application, the difference is that the attention point is found by directly finding the most relevant feature map through the underlying features, and the feature maps may also include the feature map of the underlying features (such as the feature image of the desk), and may also be directly the voice or the text (such as the pronunciation of the desk).
In the present application, a process for optimizing a feature map is provided, including:
assuming that a connection exists between a bottom-layer feature map A and an upper-layer feature map W that contains A, each time this connection is used, the weight of A in W increases according to the memory curve. Meanwhile, the weights of all underlying features in feature map W decrease with time according to the forgetting curve. In this case, if A is a common feature of the things, scenes, and processes represented by feature map W, it is likely to be found repeatedly and thereby obtain more weight. This process continues until the common feature combinations become long-term memory, while the non-common features gradually lose weight. This is one of the methods of optimizing feature maps using the memory and forgetting mechanism. Specifically, in the memory bank, each time a feature map is found, its memory value increases according to the memory curve; in the cognitive network, each time a relation line is used to transmit an activation value, its connection value increases according to the memory curve.
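A minimal sketch of one plausible memory-curve and forgetting-curve pair; the specific curve shapes are not prescribed by the application, so the diminishing-returns reinforcement and exponential decay below are assumptions.

```python
import math

def reinforce(memory_value, strength=1.0):
    # Memory curve: each use increases the memory value (with diminishing returns assumed here).
    return memory_value + strength / (1.0 + memory_value)

def forget(memory_value, elapsed, rate=0.01):
    # Forgetting curve: memory values decay over time when not used.
    return memory_value * math.exp(-rate * elapsed)
```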
It should be noted that, since the machine uses windows of different sizes to extract underlying features, underlying features are not tied to any particular size, and very large features may also be underlying features. For example, a table as a whole may itself be an underlying feature map; it is not necessarily a combination of the local feature maps it contains. When extracting feature maps with small windows, we see local features; with large windows, we see the features of the whole. A table may therefore be judged from one overall underlying feature, from multiple local features, or from a combination of both. It is also possible to identify with a large window first and then use small windows for further identification, and this order can be reversed. In addition, size scaling and angular rotation need to be considered when comparing the similarity of underlying features. These are mature algorithms in current image processing and are not described here.
In the present application, chain activation can also be used to search for concepts and related memories in the relationship network. The machine gives initial activation values to the input information according to its own motivation and starts chain activation. Because chain activation propagates activation values through the relationship network, and the activation values a feature map receives from multiple sources accumulate, if multiple pieces of source information initiating chain activation all transmit activation to one feature map, that feature map may obtain a high activation value through accumulation; the feature maps with high activation values are the points of interest. By assigning an initial activation value to a single point of interest and starting chain activation, the local network formed by the nodes with high activation values is the related concept, and the memories containing feature maps of that concept are the related memories. Thus, the machine can use chain activation as a search method to find the memories associated with input information, including virtual input information. For example, the points of interest of the input information are obtained by assigning an activation value to each information unit of the input. A single point of interest is then assigned a value and chain activation is initiated to find multiple related concepts. Then each feature map in the related concepts is assigned an initial activation value; the memories containing high-activation-value feature maps, and those containing multiple activated feature maps, are the ones to be put into the memory pool.
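The sketch below shows one possible chain-activation procedure over a relationship network, assuming normalized transfer, a propagation decay factor, and an activation threshold; all of these parameters are choices of this illustration rather than requirements of the method.

```python
def chain_activate(network, initial, threshold=0.05, decay=0.5):
    """Propagate activation values through a relationship network (illustrative only).

    network   -- dict: node -> {neighbor: connection value}
    initial   -- dict: node -> initial activation value (the input information)
    threshold -- activation below this value is not propagated further
    decay     -- fraction of a node's activation passed on to its neighbors
    Returns accumulated activation per node; high-value nodes are points of interest,
    and memories containing them can be put into the related-memory pool.
    """
    activation = dict(initial)
    frontier = dict(initial)
    while frontier:
        next_frontier = {}
        for node, value in frontier.items():
            if value < threshold:
                continue
            out = network.get(node, {})
            total = sum(out.values()) or 1.0
            for nbr, conn in out.items():
                passed = value * decay * conn / total                  # normalized transfer
                activation[nbr] = activation.get(nbr, 0.0) + passed    # values accumulate
                next_frontier[nbr] = next_frontier.get(nbr, 0.0) + passed
        frontier = next_frontier
    return activation
```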
In the above steps, a relational network is involved. The specific form and establishment process of the relationship network are the third aspect of the present invention.
In the present application, a relational network organization method is provided, including:
A, cognitive network and memory base.
The cognitive network can be regarded as the commonly used part of the relationship network in the memory base, stored separately for the purpose of fast search. Together with the memory base it forms the whole relationship network. This approach fits the organization pattern of local and central brains: the local brain uses the cognitive network to respond quickly and seeks help from the central brain when needed. The local brain is more like a local fast-reacting nerve center, for example for autonomous driving.
B, only memory bank.
In this form of organization, there is no separate cognitive network. All relationships are contained in the memory base. This approach is suitable for individual robots.
C, distributed cognitive network, memory base or their combination.
The machine can use distributed data storage to establish the cognitive network or the memory base. This approach is suitable for large service-type knowledge centers.
D, a shared cognitive network, a memory base or a combination thereof.
The machine may use shared data storage to establish the cognitive network or the memory base. This approach is suitable for an open-source knowledge center built and shared jointly.
In the application of the present invention, a method for establishing a relationship network is provided, which includes:
although the relationships between things appear complicated and are difficult to classify and describe exhaustively, in the present application we propose a method for describing them: only the similarity relationships between things and the temporal and spatial relationships between things need to be extracted; other relationships do not need further analysis. The machine compares similarities to establish its own self-built classification, which is the feature map. The machine extracts the temporal and spatial relationships between things through the memory and forgetting mechanism, which yields the relationship network within memory frames. The local relationship networks in the memory frames are connected through similar things between the networks (including specific things, concepts, languages, and so on) to form the whole relationship network.
1, the extraction of similarity relationships can be performed using a similarity comparison algorithm or a trained neural network (including the neural network with a memory and forgetting mechanism proposed in the present application), and is not described in detail here.
2, the temporal and spatial relationships between things are extracted by organizing memory. The machine considers that the feature maps within the same memory frame have relationships with each other, and the strength of the relationship between two feature maps is a function of their two memory values. The feature maps here include the instinct, gain, and loss feature maps, emotional memories, and all other sensor data. The machine does not need to distinguish the classification and closeness of the various relationships, nor establish a specific relationship network; it only needs to maintain the memory values of the feature maps in each memory frame according to the memory and forgetting mechanism.
3, the cognitive network is an extraction of the relationship network in the memory base. The extraction method is as follows: first, connection lines are established between the feature maps within each memory frame, and the connection value of each line is a function of the memory values of the feature maps at its two ends. The connection values emitted by each feature map are then normalized. As a result, the connection values between two feature maps are not symmetric.
4, similar feature maps between memory frames are connected according to their degree of similarity, and the connection value is the similarity.
After these steps, the resulting network is the cognitive network extracted from the memory base. In the following, we do not distinguish the relationships in the memory base from the cognitive network; together they are referred to as the relationship network.
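As an illustrative sketch only, the extraction described above could be implemented roughly as follows, assuming the connection value within a frame is the product of the two memory values (one possible choice of function) and that cross-frame similarity links are given.

```python
def extract_cognitive_network(memory_frames, similarity_links):
    """Extract a cognitive network from the memory base (illustrative assumptions only).

    memory_frames    -- list of dicts: feature map id -> memory value in that frame
    similarity_links -- list of (map_a, map_b, similarity) tuples across frames
    """
    raw = {}
    for frame in memory_frames:
        for a, ma in frame.items():
            for b, mb in frame.items():
                if a == b:
                    continue
                outs = raw.setdefault(a, {})
                outs[b] = outs.get(b, 0.0) + ma * mb   # connection value as a function of memory values
    # Normalize the connection values emitted by each feature map (connections become asymmetric).
    network = {}
    for a, outs in raw.items():
        total = sum(outs.values()) or 1.0
        network[a] = {b: v / total for b, v in outs.items()}
    # Similar feature maps across frames are linked with the similarity as the connection value.
    for a, b, sim in similarity_links:
        network.setdefault(a, {})[b] = sim
        network.setdefault(b, {})[a] = sim
    return network
```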
(IV) other descriptions in the disclosure of the present application.
It should be noted that in the present disclosure, the machine's learning material can also be obtained from sources other than its own memory, including but not limited to expert systems, knowledge graphs, dictionaries, and network big data. These materials can be input through the machine's sensors or implanted directly by manual methods; in machine learning they are all handled as memory. This is not inconsistent with the machine using memory to learn.
It should be noted that none of the learning steps proposed in the present disclosure have clear dividing lines in time; they are interleaved with each other, and the steps are not strictly sequential. The steps are divided for convenience of explanation, and the whole process may also be divided into other steps.
It should also be noted that the machine's identification of and response to input information is related not only to the relationship network but also to its "personality". "Personality" here refers to the machine's preset parameters. For example, a machine with a low activation threshold may be more prone to associations, think for longer, consider things more comprehensively, and possibly be more humorous. A machine with a large temporary memory store tends to remember many "details". Another example is the "highlight" threshold: how far an activation value must be above the activation noise floor before a decision is made. A machine with a high threshold may be hesitant and indecisive, while a machine with a low threshold may follow intuition more readily. Yet another example is the similarity threshold at which two node feature maps (which may be specific things, pronunciations, text, or dynamic processes) are judged similar; this determines the machine's capacity for analogical thinking, and hence whether it has an ordinary temperament or a humorous one. Different memory and forgetting curves and different activation value transfer curves all bring different learning effects to the machine.
It should also be noted that, under the method of the present application, the knowledge a machine develops is closely related to its learning experience. Even if the learning materials are the same and the learning parameters are set the same, different learning experiences may lead to very different knowledge. For example, our native language may be connected directly to feature maps, while a second language may first be connected to the native language and only then, indirectly, to the feature maps. Without proficiency in the second language, it may even be necessary to go from the second language to the native language and then to the feature maps; with such a flow, the time required increases greatly, so the machine is not skilled in applying the second language. The machine therefore also has the issue of a "native language" (of course, a machine can directly obtain the ability to use multiple languages by artificial implantation). The machine learning method disclosed in the present application is thus closely related to the machine's learning materials and to the order in which the machine learns them.
On the basis of the present application, the following are all specific, preferred ways of implementing the general artificial intelligence framework provided herein; they can be realized with knowledge known in the industry and do not affect the claims of the present application: whether different memory and forgetting curves are adopted; whether chain activation is adopted as the search method; whether different activation value transfer functions are adopted; whether different activation value accumulation modes are adopted; whether relationship extraction mechanisms other than the memory and forgetting mechanism are adopted; whether the data storage form described herein is adopted; whether different activation thresholds are used in chain activation; whether different "highlight" thresholds are used; whether different methods of calculating the activation noise floor are used; whether different time sequences are adopted for the nodes in multiple chain activations, or within a single chain activation; how many points of interest are selected each time; the specific way initial activation values are assigned according to different motivations; and even different hardware configurations (such as computing capacity and memory capacity), which native language is used for learning, and whether learning is assisted by manual intervention.
Drawings
FIG. 1 shows the main steps of the present invention for implementing general artificial intelligence.
FIG. 2 is a method of building an underlying feature map and extracting an underlying feature map algorithm model.
FIG. 3 is a step of extracting an underlying feature map.
Fig. 4 is a flow of finding points of interest using chain activation.
Fig. 5 is an understanding process of input information.
FIG. 6 is a process for machine organization and selection of responses.
Fig. 7 is an organization of a cognitive network.
Detailed Description
The invention is further described below with reference to the figures and specific examples. It should be appreciated that this document mainly proposes new methods for implementing general artificial intelligence and the main steps for implementing them. Each of these main steps can be implemented using presently known structures and techniques. The present application is therefore directed to the novel methods and their implementation steps, and is not limited to the specific details of implementing each main step. The description of these embodiments is merely exemplary and is in no way intended to limit the scope of the disclosure. In the following description, descriptions of well-known structures and techniques are omitted so as not to obscure the focus of this text unnecessarily. All other embodiments obtained by a person skilled in the art without inventive effort fall within the scope of protection of the present application.
The specific implementation of step S1 is as follows:
fig. 2 shows a method of implementing step S1. S101 divides the input data into multiple channels by filters. For images, these channels include filtering for contours, textures, tones, variation patterns, and so on. For speech, these channels include filtering for audio frequency, pitch, and other aspects of speech recognition. These processing methods are the same as the image and speech processing methods existing in the industry and are not described again here.
S102 searches the input data for local similarities. This step finds common local features in the data of each channel and ignores the overall information. In step S102, the machine first slides a local window W1 to search for local features that are ubiquitous in the data within the window. For images, local features refer to locally similar graphics that are commonly present, including but not limited to local edges, local curvature, textures, tones, ridges, vertices, angles, parallels, intersections, sizes, dynamic patterns, and so on. For speech, they are similar syllables. Other sensor data are treated in the same way; the criterion for judgment is similarity.
The machine places the locally similar features it finds into a temporary memory bank. Each new local feature put in is given an initial memory value; each time an existing local feature is found again, its memory value in the temporary memory bank is increased according to the memory curve. The information in the temporary memory bank follows the temporary memory bank's own memory and forgetting mechanism. The underlying features that survive in the temporary memory bank, once they reach the threshold for entering long-term memory, can be put into the feature map library as underlying features of long-term memory. There may be multiple long-term memory banks, each also following its own memory and forgetting mechanism.
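A hedged sketch of such a temporary memory bank with promotion to long-term memory; the initial value, reinforcement step, decay rate, and promotion threshold are arbitrary illustrative numbers.

```python
class TemporaryMemory:
    """Local features surviving a temporary store become long-term features (illustrative)."""

    def __init__(self, initial_value=1.0, reinforce=1.0, decay=0.05, promote_at=10.0):
        self.values = {}               # local feature -> memory value
        self.long_term = set()         # features promoted to the feature map library
        self.initial_value = initial_value
        self.reinforce = reinforce
        self.decay = decay
        self.promote_at = promote_at   # threshold for entering long-term memory

    def observe(self, feature):
        if feature in self.values:
            self.values[feature] += self.reinforce      # memory-curve increase on reuse
        else:
            self.values[feature] = self.initial_value   # new feature gets an initial memory value
        if self.values[feature] >= self.promote_at:
            self.long_term.add(feature)                 # survives into the long-term feature library

    def tick(self):
        # Forgetting mechanism of the temporary store: values decay, weak ones are dropped.
        self.values = {f: v * (1.0 - self.decay)
                       for f, v in self.values.items()
                       if v * (1.0 - self.decay) > 0.1}
```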
S103 repeats step S102 using, in succession, the local windows W2, W3, …, Wn, where W1 < W2 < W3 < … < Wn, to obtain the underlying features. In S102 and S103, one local feature extraction algorithm is the similarity comparison algorithm; this is a well-established algorithm and is not expanded here.
At S1, the machine needs to build not only a database of the underlying features but also a model that can extract them. In S104, the machine establishes an underlying feature extraction algorithm model A. Such algorithm models are in fact already used in S102 and S103: they are similarity comparison algorithms.
In S105, there is another algorithm model B for extracting the underlying features: an algorithm model based on a multi-layer neural network. After the model is trained, its computational efficiency is higher than that of the similarity algorithm.
In S105, the machine trains the multi-layer neural network using the underlying features in the feature map library as outputs. The window for selecting input data and the window for selecting output data during training need to be of comparable size. The neural network may take the form of various deep learning networks, including convolutional neural networks, and also the neural network with a memory and forgetting mechanism proposed in the present application. The process of training the neural network algorithm model in S105 is as follows:
in S105, the machine first trains the neural network algorithm model using the local window W1 to extract data.
In S106, the machine trains the algorithm model again using the local windows W2, W3, …, Wn in succession, where W1 < W2 < W3 < … < Wn.
In the optimization, one method is, after each increase in window size, to add zero to L (L a natural number) neural network layers on top of the corresponding previous network model. When optimizing these added layers, there are two options:
1, optimize only the added zero-to-L (L a natural number) layers each time; the machine can then stack all the network models into a single overall network with intermediate outputs. This is the most computationally efficient.
2, optimize all layers each time. The machine thus obtains n algorithm network models, all of which need to be used when extracting the underlying features.
Therefore, in S107 there are two kinds of algorithm network models. One is a single algorithm network with multiple output layers, which has the advantage of low computational resource requirements but extracts fewer features than the latter. The other is multiple single-output algorithm network models, which require more computation but extract features more accurately.
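Sketch of the first option (optimize only the added layers per window), with every function treated as a placeholder; nothing here reflects a specific deep-learning framework API.

```python
def train_multiwindow(windows, data, build_on_top, train_only):
    """Illustration of option 1 above; all arguments are hypothetical placeholders.

    windows      -- window sizes W1 < W2 < ... < Wn
    data         -- dict: window size -> training pairs (local data, target underlying features)
    build_on_top -- function(prev_model, num_new_layers) -> (new_model, new_layers)
    train_only   -- function(model, samples, trainable_layers), optimizing only those layers
    """
    model = None
    stage_models = []
    for w in windows:
        model, new_layers = build_on_top(model, num_new_layers=2)  # L = 2 chosen arbitrarily here
        train_only(model, data[w], trainable_layers=new_layers)    # freeze earlier layers (option 1)
        stage_models.append(model)                                 # each stage keeps an intermediate output
    return model, stage_models
```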
It should be noted that the above method can be used for processing images and voice, and can also be used for processing information of any other sensor by adopting a similar method.
It should also be noted that, because windows of different sizes are selected, the extracted underlying features may also be of different sizes. Some underlying features may be as large as the entire image. Such underlying features are typically a background feature map or a specific scene feature map of some images. Some underlying feature maps may be dynamic processes, since dynamic patterns are also an underlying feature.
The specific implementation of step S2 is as follows:
fig. 3 shows the process of implementing step S2. Step S2 needs to achieve two purposes: first, the underlying features contained in the input data need to be extracted; second, the underlying features need to maintain their original temporal and spatial relationships.
When facing input information, the machine performs underlying feature extraction on the input from all sensors using the algorithm model A or algorithm model B obtained in step S1 (S201). The machine selects the interval to be recognized and the size of the recognition window W1 according to its own motivation. The machine identifies the environment with a purpose; these purposes are usually target points of interest that were not completed in the previous process, and they become inherited motivations in the recognition of new information. These inherited motivations are feature maps whose attributes the machine partly knows, so only with a purpose can the machine determine a particular recognition interval and select the size of the data window W1 according to the expected size of the object. Of course, the machine may also monitor the environment without a specific purpose, in which case it can randomly select the recognition interval and the size of the recognition window.
S202 extracts local data by moving the window W1. These local data are input to the algorithm model A or algorithm model B obtained in step S1, and the machine obtains the underlying features through these algorithm models. Meanwhile, because a window is used to detect local data, the position of each extracted underlying feature is also determined. The machine needs to use a similar method to extract underlying features synchronously from the input data of all sensors and to maintain the relative temporal and spatial relationships of all the input information.
After the underlying feature maps extracted in S202 enter the subsequent information processing, the machine's response to the information may be that the information is not yet certain and recognition must continue. In that case, the machine initiates the information recognition action again by piecewise simulation, recognizing the same interval but using a smaller or larger window. This process iterates until the response produced by the machine's information processing is no longer "continue recognition".
The specific implementation of step S3 is as follows:
fig. 4 is a flowchart for implementing step S3 using chain activation as the lookup method, including:
S301: after the machine extracts data in the region of interest using a window W, it extracts the underlying features using the similarity comparison algorithm A or the neural network model B from step S1. It then uses similarity comparison in the relationship network to find the corresponding underlying features.
S302: the machine assigns an initial activation value to each underlying feature found, according to its motivation. The initial activation value obtained by each underlying feature may be the same. This value can be adjusted by the machine's motivation, for example by how strongly the machine wants to recognize the information.
S303: after each underlying feature is assigned an initial activation value, if its activation value exceeds a preset activation threshold, it initiates chain activation.
S304: the machine initiates a chain activation in the cognitive network.
S305: after the chain activation of all underlying features is completed, the feature maps that receive activation from one or more sources, accumulate the highest activation values, and stand out ("are highlighted") are the points of interest.
S306: the machine initiates a chain activation in the memory bank.
S307: after the chain activation of all underlying features is completed, the feature maps that accumulate the highest activation values and stand out are the points of interest.
The processes S304/S305 and S306/S307 are two parallel alternatives, of which one is chosen: they correspond, respectively, to the case where a separate cognitive network is used as the relationship network and the case where there is no separate cognitive network.
The above steps use the "distance" between pieces of information in the cognitive network to let related information support each other by transmitting activation values; the activation values accumulate, and the supported information stands out. This is similar to the speech recognition process, but with two differences: 1, the points of interest here may include multiple sides of a concept, such as speech, text, and images, or other feature maps highly correlated with multiple input feature maps, which may be activated synchronously by the underlying features; 2, the relationship network contains a great deal of common sense, and this common sense helps the machine identify points of interest rather than relying only on the relationships between linguistic units.
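For illustration, one possible way to pick the "highlighted" feature maps once activation values have accumulated; the noise floor is estimated here as the mean activation, and the margin parameter corresponds to the "highlight" threshold discussed elsewhere in the application among the preset personality parameters. Both choices are assumptions of this sketch.

```python
def pick_focus(activations, highlight_margin=2.0):
    """Select points of interest from accumulated activation values (illustrative only).

    activations      -- dict: feature map -> accumulated activation value
    highlight_margin -- how far above the noise floor a value must be to be "highlighted"
    """
    if not activations:
        return []
    noise_floor = sum(activations.values()) / len(activations)   # simple noise-floor estimate
    return [fm for fm, v in activations.items() if v >= noise_floor * highlight_margin]
```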
When comparing the similarity of the underlying features and feature maps in the input with those in the relationship network, the machine needs to deal with size scaling and angle matching. One processing method comprises:
(1) the machine memorizes feature maps of various angles: the feature maps in memory are simplified maps created by extracting the underlying features from each input. They are the common features that retain their similarities under the relationship extraction mechanism. Although similar to each other, they may have different viewing angles. The machine memorizes feature maps of the same thing seen from different angles in life as different feature maps, but through learning these feature maps can come to belong to the same concept.
(2) the common parts of these feature maps are overlaid across all the angle views, imitating their original data, and combined to form a stereo feature map.
(3) a view-changing program for scaling and spatially rotating the stereo image is embedded in the machine. This step uses well-established techniques in the industry and is not described in detail here.
(4) when the machine searches memory for similar underlying features, the method includes searching memory for feature maps that can be matched after spatial rotation. Meanwhile, the machine stores the feature map of the current angle into memory, retaining the original viewing angle, so that later inputs of underlying features with similar viewing angles can be found quickly. In this method, therefore, the machine combines memories of different viewing angles with spatial angle rotation to search for similar feature maps, which gives rise to the phenomenon that familiar viewing angles are recognized more quickly. Of course, the machine may also use only similarity comparison after spatial angle rotation.
The machine also needs to deal with stereo depth of field when comparing the similarity of underlying features and feature maps in the input and in memory, because the machine needs to reconstruct the memory data into a mirror space, and establishing a three-dimensional mirror space requires depth information. One processing method comprises:
the machine establishes a stereoscopic depth of field by differences in multi-channel inputs (e.g., binocular, binaural inputs).
Meanwhile, the machine also adopts the size comparison of the input characteristic diagram and the memorized characteristic diagram to assist in establishing the stereoscopic depth of field.
In implementing step S3, it may happen that some underlying feature combinations cannot be found in memory. The machine then stores these underlying feature combinations in memory according to their original spatial and temporal positions, and takes their activation values, in positive correlation, as their memory values. These underlying feature combinations are optimized through the relationship extraction mechanism in subsequent learning. This is also the creation process of feature maps.
Thus the feature maps come from multiple channels. Channel 1 consists of the underlying features created in step S1 (which are also feature maps) and the underlying features extracted through windows of different sizes in step S2; these feature maps are optimized by the memory and forgetting mechanism. Channel 2 consists of unrecognizable underlying feature combinations encountered and memorized in step S3. In all steps, the feature maps can be optimized by the memory and forgetting mechanism.
Step S1 establishes the ability to extract underlying features, which is preliminary preparation for the ability to understand information. Step S2 extracts the underlying features, which is the start of information understanding; the purpose of extracting underlying features is to remove part of the redundant information from the input. In step S3, the implicit connections among language, text, images, the environment, memory, and other sensor inputs are used to transmit activation values to each other, so that the related feature maps, concepts, and memories are supported and highlighted. The difference from the traditional use of "context" to identify information is that traditional recognition methods require a "context" relation library to be built manually in advance, whereas the present application starts from the basic assumption that similarity, and co-occurrence in time and space, imply connections. On the basis of this assumption, the myriad relationships between things are simplified, so that the machine can build its relationship network by itself; the network contains not only semantics but also instinct and common sense.
In step S4, the input information is translated into a language the machine understands, mainly using the relationship network, and organized to form an image sequence. The machine can then use this sequence to find in memory the memories associated with similar sequences: the responses it made after receiving similar sequences and its state when receiving them, as well as the responses it received after it sent out similar sequences and its state when sending them. This is the process of understanding information from experience and understanding it further by putting itself in the other's position. These memories enter the memory pool as the raw material from which the machine organizes its output response. In a conversation, the sender and the receiver are likely to omit much information that both parties already know, such as shared knowledge, shared experience, and things already discussed; the missing information can be supplemented by the memory search from the 4 aspects above.
Fig. 5 is a schematic diagram of input information for achieving information translation and information understanding.
In S401, the machine searches memory for the feature maps obtained by translating each point of interest and establishes a memory pool. One implementation is to assign activation values to the input-information feature maps found in the memory base and then initiate chain activation. After chain activation is completed, the memories with the higher sums of activation values are the memories to be put into the memory pool.
S402 finds possible process frameworks. The specific method is to use the memories with the highest sums of activation values first and extract process frameworks from them. This can be done as follows: remove the feature maps with low memory values; these are usually details, to be supplemented later with details more consistent with the current input information. After the low-memory-value feature maps are removed, the remaining feature maps represent key steps, and these key-step feature maps form a process framework according to their original temporal and spatial relationships. The machine repeats this process over the memories in the memory pool, in order of total activation value from high to low, and finally obtains several process frameworks. This step is equivalent to the warehouse manager searching for ready-made intermediate components that can match the input drawings.
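A minimal sketch of extracting a process framework from one memory by keeping the high-memory-value feature maps in their original time order; the record fields 'memory_value' and 'time' are assumptions of this illustration.

```python
def extract_process_frame(memory, min_memory_value):
    """Keep only the key-step feature maps of one memory, in their original order.

    memory           -- list of feature-map records, each with 'memory_value' and 'time'
    min_memory_value -- feature maps below this value are treated as forgettable details
    """
    frame = [fm for fm in memory if fm['memory_value'] >= min_memory_value]
    return sorted(frame, key=lambda fm: fm['time'])   # preserve the original temporal order
```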
In step S403, the machine combines the imitable parts obtained in step S402 into one large imitable framework. Since the framework extracted from each segment of memory may contain several feature maps corresponding to the points of interest, and the temporal and spatial relationships between these feature maps already exist in those memories, the machine can form a large process framework by overlapping similar feature maps in time and space. This step corresponds to the warehouse manager finding the interfaces that connect the intermediate components to each other. It may also happen that some components cannot be connected to the others.
In S404, the machine's strategy is to expand the concepts representing the process framework by the piecewise simulation method, so that the expanded framework contains more details. The machine finds the connections between similar feature maps, again by overlaying them. To illustrate by analogy: if two frameworks cannot be connected, the warehouse manager opens the shell of each intermediate component (because the shell has few interface points); this is the process of expanding the concepts representing the intermediate components into more detailed concepts. For example: when the machine receives the instruction "drive to the airport and pick up the owner to go home", it may be shopping in a store. The machine chain-activates these inputs, and the points of interest obtained may include the feature maps, dynamic feature maps, or language of "driving", "going", "airport", "pick up", "owner", and "go home", together with environmental information such as the feature maps of "store", "no other arrangements", and "things bought should be taken along". The memories activated when directly searching for the points of interest can be used for the memory pool, or the related memories can be searched again after new activation values are assigned to the found points of interest (for example, after the recognized information is virtually input again, the points of interest are given new activation values and chain activation is run again); usually the two memory ranges largely overlap. By removing low-memory-value elements from the related memories, the machine obtains some rather general process frameworks that are nevertheless widespread in life: "… start driving …", "go to the airport", "pick up the owner …", "go home", "at the store …", and so on. These process frameworks may be key feature maps within a sequence of feature maps rather than language. Obviously, by referring to the organization in different memories and expanding them, "… drive …", "… go to the airport", and "pick up the owner and go home" can easily be connected into a sequence of images representing "drive to the airport and pick up the owner to go home". From past memories, the images of "going to …" follow "driving", as they may appear in memories such as "driving to work"; "go to the airport and pick up the owner to go home" may draw on past memories such as "taking the subway home", so "go home" comes after the "airport" feature map. However, there remains one piece of information, that the machine is currently in the store, which cannot yet be linked to the other information. The machine then expands the memories associated with "store …" again to find a connection point with "drive to the airport and pick up the owner to go home". The machine recalls its memory of going from the store to the garage and finds that "car" is the common point of the two memories. Referring to memory again, the machine obtains "first go to the garage to get the car" and then "drive". In this way the machine connects the whole process. If the machine needs to execute this process, each intermediate step becomes a target, so the same piecewise simulation is used to find the process framework for each link and subdivide it again. For example, for "go to the garage", a lower-level process framework is established from previous memories of going to the garage, or by reference to memories of going to the garage from other places.
Under this framework it may be necessary to subdivide "go to the garage" further into the lower-level process frameworks "find the elevator", "take the elevator" and "find the car". The basis for each subdivision is piecewise simulation. Piecewise simulation has two cores: first, finding a framework and unfolding it, a process that can be carried out iteratively; second, imitating memory, that is, replacing the details in memory with details from reality in a similar-substitution manner. In this way the machine can build a tower-like sequence of feature maps by gradually refining a handful of large concepts.
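Purely for illustration, the iterative unfolding described above can be pictured as a recursive decomposition loop. The sketch below is not part of the claimed method; the goal names, the FRAMEWORKS lookup table and the PRIMITIVES set are hypothetical stand-ins for memory retrieval and bottom-level experience.

```python
# Minimal sketch of piecewise (segmented) simulation as recursive goal decomposition.
# All names and data here are illustrative placeholders, not the patent's structures.

# Hypothetical "memory": maps a coarse goal concept to a framework of sub-goals.
FRAMEWORKS = {
    "go to garage":  ["find elevator", "take elevator", "find car"],
    "find elevator": ["stand up", "walk to elevator"],
    "take elevator": ["press button", "enter elevator", "exit at floor -1"],
}

# Goals the machine can execute directly from bottom-level experience.
PRIMITIVES = {"stand up", "walk to elevator", "press button",
              "enter elevator", "exit at floor -1", "find car"}

def expand(goal, depth=0):
    """Recursively unfold a coarse goal into executable bottom-level steps."""
    indent = "  " * depth
    if goal in PRIMITIVES:
        print(f"{indent}execute: {goal}")
        return
    print(f"{indent}expand:  {goal}")
    for sub in FRAMEWORKS.get(goal, []):   # each sub-goal is unfolded the same way
        expand(sub, depth + 1)

expand("go to garage")
```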
The specific implementation of step S5 is as follows:
the machine needs to respond after it has understood the input information. The drive behind the response comes from the machine's motivations. Like humans, machines are driven by "desire" to respond to external stimuli; the "desire" of a machine consists of the innate motivations that humans preset for it. These preset motivations, such as "safety requirement", "goal achievement", "gaining dominance" and "curiosity", can omit "reproduction" and add motivations that humans want the machine to have, such as "comply with human law" and "comply with machine conventions". In the present application, the motivation of a machine is treated as a default input message, and motivations participate in every aspect of the relationship network. Based on information supplied by a preset monitoring system (such as battery monitoring or the machine's self-inspection system), for example "the battery is low", the machine only needs to give the corresponding innate motivation an initial activation value through a preset algorithm. The activation value of the innate motivation then propagates through the relationship network. It can change the latest distribution of activation values in the network, so that the same input information may produce different activation values. If the innate motivation is assigned a high value, it may change the point of interest obtained from the same information input; the point of interest obtained in this way is the target point of interest. The target point of interest reflects the information to which the machine reacts instinctively. Because there are few types of innate motivation and their assignment is relatively simple, they can be implemented with a preset algorithm, and the tuning experience can be obtained through learning.
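As a rough illustration only, a preset rule of this kind could turn a monitored quantity (such as remaining battery) into an initial activation value for the "safety requirement" motivation. The thresholds and the linear mapping in the sketch below are assumptions made for the example, not values taken from the application.

```python
# Illustrative sketch: a preset program converts a machine state reading into an
# initial activation value for an instinctive motivation ("safety requirement").
# The thresholds and the mapping are assumed for demonstration only.

def safety_activation(battery_level: float) -> float:
    """Map remaining battery (0.0-1.0) to an initial activation value (0.0-1.0)."""
    if battery_level >= 0.5:          # plenty of power: motivation barely activated
        return 0.05
    # below 50% the activation rises linearly toward 1.0 as the battery empties
    return min(1.0, 0.05 + (0.5 - battery_level) * 1.9)

for level in (0.9, 0.4, 0.1):
    print(level, round(safety_activation(level), 2))
```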
Innate motivation is preset motivation; the size of the initial activation value it obtains reflects the machine's attitude toward processing the input information. These initial activation values reflect the current state of the machine, such as alert, relaxed, willing to handle things, or refusing to handle information, and they affect the breadth and depth of the machine's memory search, thereby producing differences in thinking. This is an emotional response. The different states, each reflecting an emotion of the machine, are stored together in the mirror spaces. When the machine recombines several mirror spaces, each space carries its own emotion as well as its own assessment of gain and loss, so the machine can naturally use weighted summation to predict the emotional response of the recombined mirror space. In this way decisions are influenced: besides innate motivation, the machine also has emotion, and its "reason" is the machine's gain-and-loss assessment system. Another type of motivation is inherited motivation. Inherited motivations are targets the machine has not yet completed. For example, while the machine is completing these targets, new information arrives, so the machine must temporarily interrupt the ongoing process to handle the newly input information. While handling the new information, the machine still carries the original unfinished targets, and these are the inherited motivations. Inherited motivation is handled like an ordinary input message and needs no special treatment.
Fig. 6 shows the main steps of step S5:
in S501 the machine searches for memories related to similar input information. In this step the information recognized in step S4 can be used as a virtual input; as this is input as actively recognized information, the machine may assign a larger activation value to the motivation through the preset assignment system. The points of interest produced by these activation values may differ from the points of interest in step S4; this time they are the target points of interest.
Using the target points of interest, the machine builds a memory pool in a manner similar to step S4. The machine's responses to similar target points of interest in memory can take many forms: it may ignore the input information, confirm it again, recall a memory referred to by the input, respond verbally, respond with an action, or infer the unstated intent of the information source through "empathic" thinking.
S502 is to establish a virtual response, taking the memory (experience) with the highest memory value as the framework; this is the machine's instinctive response to the input information.
S503 is to find the memories associated with the instinctive response for the gain-and-loss assessment.
In S504, the machine evaluates the gain and loss of the instinctive response.
S505 is a judgment process. If the evaluation passes, the machine outputs the response. If it does not pass, the machine must find the feature map with the largest gain and the feature map with the largest loss, expand the related memories, and organize the response flow again so as to keep the largest gain and eliminate the largest loss. "Keep the largest gain, eliminate the largest loss" becomes a temporary target (seeking benefit and avoiding harm) that must be completed first (finding a way to eliminate the loss while keeping the gain), and the original target becomes an inherited target. After finding how to eliminate the loss and keep the gain, the machine continues to organize the virtual output process and enters the gain-and-loss evaluation again, until a selection is completed. If the machine has not found a suitable choice after some time, it may give a temporary response such as "hmm" or "oh" to tell the outside world that it is thinking and should not be disturbed. If thinking takes rather long, the machine needs to input the information understood in step S4 to itself again to refresh the points of interest in the relationship network, so that it does not forget what it is thinking about. The machine may also re-input the information understood in step S4 to suppress the activation values of other information in the relationship network and avoid their interference; those activation values may be left over from previous thinking processes. If the machine still cannot select an appropriate response in this way, it builds a response to the situation "no response available": "no response" itself becomes an input message, and the machine goes through the same S5 flow to establish an appropriate response.
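For illustration only, the S501-S506 flow can be read as a loop: propose a virtual response from the highest-memory-value experience, evaluate its gain and loss, and either output it or turn "keep the largest gain, remove the largest loss" into a temporary goal and iterate. The sketch below mirrors only that control flow; the candidate responses, numeric scores and pass/fail rule are invented for the example.

```python
# Control-flow sketch of steps S501-S506: propose a virtual response, evaluate
# gain and loss, and iterate until an acceptable response is found.
# The candidate list and the numeric scores are invented for illustration.

candidates = [
    {"response": "go out and buy a bottle of cola", "gain": 0.6, "loss": 0.7},
    {"response": "remind the owner there is cola in the fridge", "gain": 0.7, "loss": 0.1},
]

def acceptable(c, max_loss=0.3):
    """Hypothetical pass/fail test standing in for the gain-and-loss evaluation (S504/S505)."""
    return c["gain"] > c["loss"] and c["loss"] <= max_loss

def choose_response(cands):
    # S502: the instinctive response is the experience with the highest memory value
    # (here simply the first candidate); if it fails, reorganize and try the next.
    for c in cands:
        if acceptable(c):             # S505 passed: output the response (S506)
            return c["response"]
        # S505 failed: "keep the largest gain, remove the largest loss" would become
        # a temporary goal, and the original goal would be inherited while we iterate.
    return "temporary reply ('hmm...') while thinking continues"

print(choose_response(candidates))
```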
We illustrate the above process briefly. Suppose that in a hotel room in an unfamiliar city, the machine receives an order from its owner: "go and buy a bottle of cola and bring it back". Through step S2 the machine extracts the underlying features of many input syllables and much environmental information. After step S3, the points of interest found may be "hotel", "go", "buy", "bottle", "cola", "bring back", "evening", "battery low", "room bill not yet paid", and so on, and the machine translates these feature maps into a form it can conveniently process, such as image feature maps. In step S4 the machine begins to understand this information. By organizing the input, it establishes an understanding sequence comprising the image feature maps for "go", "buy", "one bottle", "cola", and so on, in chronological order. In step S5 the machine's initial-value assignment system queries the machine's state (for example, whether it has entered a frustrated state because of previous experience), assigns initial activation values to the information sequence from S4, and then finds the related memories. This is the chain-activation search method; related memories can also be found by similarity comparison, and a memory pool is established. In this step the machine may increase the initial activation value of the motivation, so that the motivation-driven response is identified preferentially.
The machine realizes that the owner needs it to make a response similar to the responses of the owner or of others under similar instructions in memory. It recognizes the owner's state by comparing it with the owner's state in memory when similar instructions were given. From the state associated with such commands, the machine can infer by analogy the owner's current source of demand (a physiological need, possibly thirst) and emotion; this kind of thinking is "empathy".
The machine begins to evaluate the instinctive response "go and buy a bottle of cola and bring it back" and finds it marginal in terms of gain and loss (because its own battery is not sufficient), so it looks for other possible responses. It may recall that the owner previously took cola out of the refrigerator, and so it establishes a possible virtual output process, "take cola out of the refrigerator for the owner", which evaluates very well in terms of gain and loss. The machine then continues to evaluate in more depth; the way to do so is to pass the virtual output back in as input. The machine takes this goal as a new S5 flow and turns the sequence of goals contained in the previous goal, "go and buy a bottle of cola and bring it back", into inherited goals. To realize the new goal "take cola from the refrigerator for the owner", it must be decomposed into further target sequences such as "find the refrigerator", "take the cola" and "give the cola to the owner". The machine again takes "find the refrigerator" as a new S5 flow and evaluates which option to choose. This time the best ratio of gain to loss may be obtained "by looking around". This is a goal that can be resolved directly into bottom-level experience, so the machine can begin to perform the action of looking for a refrigerator, since according to experience summarized from past memories this is the first target that must be achieved.
Assuming there is a refrigerator in the room, "take the cola" becomes the second goal after the refrigerator has been found. Based on past memory, the machine recalls that when it says "Owner, there is …", the owner responds by turning attention to the indicated thing. There are other memories in which the owner goes to the place he has noticed, and memories in which the owner takes cola from the refrigerator. By stringing these memories together, the machine organizes a virtual output: let the owner go to the refrigerator and take the cola himself. This response has the best gain and the lowest cost (because it consumes the least power). Drawing on experience again, the machine recalls that in similar situations it could turn the owner's attention to something by reminding the owner itself, usually by pointing with its fingers.
After this layer-by-layer recall of memory and gain-and-loss evaluation, the machine finally determines the output plan in step S506 by imitating memory: point to the refrigerator with its hand and say "Owner, there is a refrigerator here."
The specific implementation of step S6 is as follows:
step S6 is the machine's output. If the output is language, this is a translation process plus a simple action-imitation process (imitating past experience of uttering syllables or writing text). If the output is a course of action, the whole process is much more complicated; it is like a director organizing a large production and involves many aspects, as illustrated below. Assuming, in the above example, that the machine's response is to go out, buy a bottle of cola and bring it back, we use this example to analyze the machine's brief workflow under action output.
The machine has no experience of going out to buy cola and bringing it back in this city, this hotel, this room; it has no complete memory available for imitation. Even if it had such a memory, when imitating it the machine would find that the memory does not match reality, because external conditions have changed (for example, the time is different) or internal conditions have changed (for example, the machine's own state).
The machine then begins to build a script. The criterion for scripting is to divide the task in time and space so that imitation can be performed efficiently. The machine divides the script by taking each of the planned targets (including intermediate targets) as a separate objective in order to determine what should be imitated at the moment. The determination can be made by a new chain activation or by comparing the similarity between memory and reality. Clearly, in this sequence of planned targets, the target that matches the current environment (the hotel room) is "go out". The machine therefore begins to realize "go out" as a goal. The method is to put "go out" back, as understood information, into step S5 to find the various possible solutions and to make a decision based on motivation and on gain and loss. Steps S5 and S6 may be performed continuously and alternately, because realizing a series of targets is a process of continually subdividing and realizing targets; each stage is handled in the same way, but the process is iterative, subdividing layer by layer until the target can be carried out directly by the machine's bottom-level experience.
For example, the following steps are carried out: the first notion of emulation of this instruction by the machine is the concept of "going out". When a machine mimics the concept of "go", which is a very simplified framework, the machine needs to subdivide the concept of "go". The subdivision method comprises the following steps: the machine uses the concept of "go" as a single input command to find out the memory associated with the image feature map of "go" similar to the current situation. Thus, the machine creates a secondary framework that can be modeled: go out of the door. The machine then begins to simulate this secondary framework.
While imitating this secondary framework, the machine may find that the first intermediate target to imitate is "walk to the door". It then takes the concept "walk to the door" as a single input command to look for memories, similar to the current situation, associated with the image feature map of "walk to the door". The machine then creates a third-level framework that can be imitated: where the door is.
In mimicking "walk-to-gate," gate "becomes an intermediate target. The machine needs to locate the "door" position. The machine searches for gates in the environment through various feature maps contained under the concept of "gates". The machine may search for memory about this room or may start the search directly at the environment using step S2, depending on whether the machine has performed a feature map extraction for the entire environment.
After the "door" position is located, the machine continues to use piecewise simulation, combining its own position, door position, and where it left the door as input information, with the environmental information, and as a whole input, begins to find the most relevant signatures, concepts, and memory. The machine may find "walk" when simulating the framework "walk to gate". When the "walk" is simulated, a mismatch is found. Since it is seated. Therefore, through the same process, the first concept in the four-level framework to be simulated is established by 'walking': "standing". The machine then needs to subdivide the concept of "standing". The command "stand" is changed to a simulated five-level frame. The machine then begins to simulate this five-level framework.
In imitating the fifth-level framework "stand", the machine may find that the concept to imitate is "stand up from the sofa", which it then needs to subdivide. In imitating "stand up from the sofa", the machine may find that the concepts to imitate are a string of more detailed targets such as "exert force with the legs", "lean the body forward", "keep balance" and "spread the hands to protect itself". The machine then subdivides each detailed target again and begins piecewise simulation of these sixth-level frameworks.
In imitating the sixth-level framework created by the detailed target "exert force with the legs", the machine turns this target into a series of drive commands to each muscle by looking for related experiences in memory under similar conditions and combining them. These drive commands are themselves memories formed through many imitations in similar environments, via reinforcement learning and the memory-and-forgetting mechanism. Through repeated imitation these memories have become permanent memories; they are experience. We are basically unaware of this process, both when searching for such memories and when using them.
After the above steps the machine stands up from the sofa, but it has not yet completed the imitation of the concept "go out". Through imitation of memory, the machine finds that in all memories "going out" happens through the "door". Via the "door", the machine continues the process of imitating "go out" from memory. These processes may contain an "open the door" process feature, so "open the door" becomes the target of imitation. The machine has no experience of opening the door inside this room, so it searches memory using "open the door" as a concept. The highest point of interest obtained may be a simplified "open the door" process feature, whose images may in turn be based on images of opening the door of the machine's own home: in that memory the door is opened by pressing the handle, turning it, and pulling back. But the machine does not find the same door handle on the door of this hotel room.
The machine then has to use piecewise simulation again. It combines the concept of a door handle with the current real environment as a whole input and searches memory for the most relevant feature maps and memories. The machine may then find, on the door of this room, a feature map that receives a very high activation value and becomes a point of interest: something similar to a door handle in position and shape, possibly the handle of the room door. This is how the machine finds the door handle by piecewise simulation, applying previous experience with door handles to the real situation. Then, through the concept of the door handle, the past method of using a door handle is transplanted by imitation of memory onto the newly found handle, which is a process of knowledge generalization. The machine opens the door and walks out according to the handle-using method in memory, and the imitation of the concept "go out" is completed.
The above process is how the machine, through continuous iteration of piecewise simulation, gradually fills a framework process composed of concepts with details that conform to reality, until it finally becomes a rich response process. The essence of piecewise simulation is the machine's expansion of and analogy with concepts. Concepts are extracted from life and taken from life. Applying concepts means expanding them within their own framework and replacing the details in memory with details from reality in order to imitate them. Concepts include local networks of feature maps, process features, language, and so on. They are the components the machine uses to compose processes, and they are widely reusable. A concept may or may not correspond to language; it may correspond to a word, a common phrase, a sentence, or even a passage, and this correspondence may differ between languages.
During piecewise simulation the machine may also encounter various new information inputs. For example, after planning the path to the door, the machine begins to imitate the "walk" action to accomplish the going-out process. In doing so it may discover a new situation: "there is an obstacle on the planned route". Faced with this new input, the machine pauses the original targets, keeps them, and enters the process of handling the new information; the original targets become inherited targets of the new process.
The machine encounters new input information whenever it faces a mismatch between the imitated framework and the real situation. It has to process the new information starting from step S2; this information is the basis for finding a solution afterwards. For example, the machine needs to analyze various attributes of the obstacle (such as size, weight and safety). This requires the whole information-understanding process from S2 to S4. The machine then selects and implements a solution based on its own motivation, which requires the processes of S5 and S6.
The specific implementation of step S7 is as follows:
step S7 runs throughout steps S1 to S6; it is not a separate step but the relationship-extraction mechanism applied in the preceding steps.
In step S1, the underlying features are established mainly by the memory-and-forgetting mechanism. When the machine finds a similar local feature through a local window and the feature map library already contains a similar underlying feature or feature map, its memory value is increased according to the memory curve. If there is no similar local feature in the feature map library, the feature is stored in the library and given an initial memory value. The memory values of all feature maps gradually decrease according to the forgetting curve with time or training time (as the number of training samples increases). In the end, those simple features that occur widely across many things and are common to them retain high memory values and become the underlying features or feature maps.
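A minimal numeric sketch of this memory-and-forgetting mechanism is given below: every re-appearance of a feature raises its memory value along a memory curve, while all values decay along a forgetting curve as samples accumulate. The exponential forms and constants are assumptions chosen only to make the behaviour visible; the application only requires that such curves exist.

```python
# Sketch of the memory-and-forgetting mechanism for a feature map library.
# The curve shapes (saturating increase, exponential decay) and constants are assumed.
import math

MAX_VALUE = 100.0

def reinforce(value: float) -> float:
    """Memory curve: each re-appearance pushes the value toward a ceiling."""
    return value + (MAX_VALUE - value) * 0.3

def forget(value: float, steps: int = 1) -> float:
    """Forgetting curve: values decay with time / number of training samples."""
    return value * math.exp(-0.05 * steps)

value = 10.0                      # initial memory value of a new underlying feature
for step in range(1, 11):
    value = forget(value)         # decay at every step
    if step % 3 == 0:             # the feature re-appears every third sample
        value = reinforce(value)
    print(step, round(value, 1))
```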
In step S2, every time an underlying feature or feature map is found, if a similar underlying feature or feature map already exists in the temporary memory library or feature map library, its memory value is increased according to the memory curve; all underlying features or feature maps in the temporary memory library and feature map library follow the memory-and-forgetting mechanism. In step S2 the machine first stores the mirror space in temporary memory. When the machine stores mirror spaces in the memory library, the feature maps in those spaces and their memory values are stored at the same time, and the initial memory value of a feature map is positively correlated with its activation value at the time of storage. Within a mirror space, the memory value is updated only when a feature map's activation value changes by more than a preset threshold, and a new mirror space is created only when the similarity of the current mirror space to the previous one changes by more than a preset threshold. We call such a change the occurrence of an event; this is the event mechanism by which memory is stored.
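For illustration, the event-driven storage rule just described can be sketched as follows; the similarity measure, the thresholds and the data are placeholders invented for the example, not values from the application.

```python
# Sketch of event-driven mirror-space storage (step S2). The similarity measure
# and the two thresholds are placeholders, not values from the patent.

ACTIVATION_DELTA = 0.2   # update a stored feature only if its activation moved this much
SIMILARITY_FLOOR = 0.6   # open a new mirror space if scene similarity falls below this

def update_mirror_space(spaces, current, new_activations, scene_similarity):
    if scene_similarity < SIMILARITY_FLOOR:
        # the scene changed enough to count as a new event: store a new mirror space
        spaces.append(dict(new_activations))
        return spaces, spaces[-1]
    for feature, activation in new_activations.items():
        old = current.get(feature, 0.0)
        if abs(activation - old) > ACTIVATION_DELTA:
            current[feature] = activation   # only significant changes are written back
    return spaces, current

spaces = [{"window": 0.5, "curtain": 0.4}]
spaces, current = update_mirror_space(spaces, spaces[-1],
                                      {"window": 0.9, "curtain": 0.45}, 0.8)
print(current)   # the window activation is updated, the small curtain change is ignored
```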
In steps S3, S4, S5 and S6, the connection relationships between nodes of the cognitive network (including underlying features and feature maps) follow the memory-and-forgetting mechanism, and the memory values of the underlying features and feature maps in the memory library likewise follow the memory-and-forgetting mechanism.
The above steps involve the organization of the cognitive network and the memory library, the specific chain-activation process, and the memory-and-forgetting mechanism. The specific implementation of these matters is a further refinement of the methods proposed in the first and second aspects of the present application; they constitute the third aspect of the present application.
The invention provides a method for assigning an initial activation value to a feature map according to motivation, which comprises the following steps:
in the present application, step S2 is the step in which the machine extracts underlying features. The machine needs to select the recognition area and the window size to use based on motivation. The motivation here comes from inherited motivation. For example, if in the previous activity the machine's response to some information was "identify further information in a specific area", then that specific area is the recognition area selected by the machine. When the machine further identifies information in these specific areas, the expected size of the object to be identified determines the size of the window the machine selects. The machine assigns initial activation values to the extracted underlying features according to the motivation, and adjusts these initial activation values according to the expected gain-and-loss attributes. In addition, in the present application motivation is treated as an underlying feature that is frequently activated and that has connection relationships with other feature maps in memory. For example, "safety requirement" is a motivation preset in the machine, and in experience it can extend to experiences such as protecting family members from injury and protecting one's own property.
Therefore, the method by which the machine assigns an initial activation value to a feature map according to motivation has two parts. First, the machine's inherited motivations are feature maps that already carry activation values in the relationship network; when searching for target points of interest the machine may or may not select them, depending on their activation values. Second, in step S3, the initial activation values given to the underlying features by motivation in fact come from two sources: one is the initial activation value assigned to the input information by the motivation, usually a uniform initial value given to the input by the machine according to the strength of the motivation; the other is the activation values propagated from the motivation, which are not initial values but accumulate with the initial values, so that different pieces of input information end up with different activation values.
The specific implementation mode of the cognitive network is as follows:
Fig. 7 is a schematic diagram of cognitive network formation. Assume that the feature map of an apple is numbered S42; that the apple's texture is feature 1, with feature map number S69; that a certain curve of the apple's shape is feature 2, with feature map number S88; and so on, down to underlying geometric feature N of the apple, with feature map number Snn. In Fig. 7, S42 is the central feature map, and S69, S88 through Snn are the other feature maps connected to S42. S42_S69, S42_S88 and S42_Snn denote the connection values from S42 to S69, S88 and Snn respectively.
In Fig. 7, the first central node is S42; the connection value from the central node S42 to S69 is numbered S42_S69, and the connection value from S42 to S88 is numbered S42_S88. In the data entry centered on S69, S42 is one of its features, and the connection value from S69 to S42 is S69_S42. Likewise, in the data entry centered on S88, S42 is one of its features, and the connection value from S88 to S42 is S88_S42. Thus S42, S69 and S88 establish bidirectional connections. Because we use data entries of the kind shown in Fig. 7 to store the cognitive network, we sometimes call the feature map at the center of a data entry an index of the cognitive database, call all of its features attributes, and call their corresponding connection values the connection values of those attributes. A large number of such data entries form a cognitive network, and the feature map numbers can be associated by means of a table.
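Purely as an illustration, the data-entry organization of Fig. 7, a central feature map indexed with its attribute feature maps and per-attribute connection values, plus the mirror entries that make the links bidirectional, could be stored as nested dictionaries. The identifiers below reuse the numbering of the figure; the numeric connection values are invented.

```python
# Sketch of the cognitive-network data entries of Fig. 7 as nested dictionaries.
# Keys reuse the feature-map numbers from the figure; connection values are invented.

cognitive_network = {
    "S42": {"S69": 0.8, "S88": 0.6, "Snn": 0.3},   # apple -> texture / curve / geometry
    "S69": {"S42": 0.7},                           # texture -> apple (reverse direction)
    "S88": {"S42": 0.5},                           # curve   -> apple (reverse direction)
}

def connection(a: str, b: str) -> float:
    """Connection value from feature map a to feature map b (0 if unconnected)."""
    return cognitive_network.get(a, {}).get(b, 0.0)

print(connection("S42", "S69"))   # S42_S69
print(connection("S69", "S42"))   # S69_S42 -- may differ; the link is bidirectional
```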
One embodiment of establishing a relationship network is as follows:
the relationship extraction mechanism is applied at three levels of the intelligence hierarchy:
1, Perception layer: the only criterion by which the perception layer establishes relationships is similarity. By comparing similarity, the machine treats combinations of data that appear repeatedly and are similar as underlying features. Therefore, in step S1 the relationship-extraction mechanism used by the machine to extract underlying features may be a similarity-comparison algorithm between data. Whether for images, language or other data, many similarity-comparison algorithms are mature and are not described here. In step S1 the obtained underlying features are put into the feature map library and discarded according to the memory-and-forgetting mechanism. In step S2 the machine may likewise extract underlying features from the input data with a similarity-comparison algorithm. Another option in step S2 is to use a neural network model, which may be any mainstream neural network algorithm or the neural network algorithm with a memory-and-forgetting mechanism proposed in the present application.
2, Cognitive layer: on the basis of the feature maps established by the perception layer, the connection relationships between feature maps are established through learning. The basis for establishing relationships is therefore memory and forgetting: relationships are acquired through repeated memory, and correct relationships are retained through forgetting.
3, Application layer: the application layer continuously applies the results produced by the perception layer and the cognitive layer, and optimizes them according to the memory-and-forgetting mechanism. In the feature map library, each time the machine finds an underlying feature or feature map, if a similar underlying feature or feature map already exists, its memory value is increased according to the memory curve; the memory values of all underlying features or feature maps in the library gradually decrease along the forgetting curve with time or training time (as the number of training samples increases). In the cognitive network, every time a connection relationship between nodes is used, the corresponding connection value is increased according to the memory curve, while all connection values of the cognitive network decay over time according to the forgetting curve. In the memory library, each time an underlying feature or feature map is used, its memory value is increased according to the memory curve, while the memory values of all underlying features or feature maps decay over time along the forgetting curve.
In the present application, various methods for improving existing neural networks are also provided; the specific implementations are as follows:
the invention provides a method for understanding the working principle of a multilayer neural network, as follows:
we can regard the input data as the coefficients of coordinate components on an impulse-function coordinate basis. Each inter-layer transformation is then a transformation of the way the information is expressed. For example, the first transformation linearly maps the input impulse-function coordinate basis onto another coordinate basis. This coordinate basis is implicit and can change. The coordinate-component coefficients on this basis are the linear outputs of the first intermediate layer (before the nonlinear activation function is applied). If the two dimensions are the same, the information-expression capabilities of the two coordinate bases are the same, and no information is lost from the input layer to the first intermediate layer.
However, the purpose of a multilayer neural network is to remove interference information (redundant information) from the input and retain the core information (useful information), so the network as a whole must remove the interference. The way to do this is to transform the core information and the interference information onto different coordinate bases through basis transformations, so that they become components on different bases; the machine then removes the interference by discarding the components that represent it. This is a process of reducing the dimensionality of the information representation.
One convenient way to achieve this is to use a nonlinear activation function. A nonlinear activation function such as ReLU zeroes out the information components on part of the coordinate basis, i.e. removes the information on half of the basis. Deformed ReLU variants and other activation functions likewise remove part of the coordinate basis or compress the information on it in order to remove interference information; leaky ReLU, for example, removes redundant information by compressing the information on half of the coordinate components.
Each intermediate-layer neuron output can be viewed as the projection of the information onto a component of the corresponding implicit coordinate basis, and the optimization of a multilayer neural network is the optimization of the coordinate bases corresponding to the intermediate layers. Because of the nonlinear activation function, each layer loses some information components on part of the basis. The nonlinearity of the activation function, the number of intermediate-layer neurons and the number of layers constrain each other: the stronger the nonlinearity, the greater the information loss, and the fewer layers and the more intermediate-layer neurons are needed to ensure that the inter-layer transfer of the core information is not lost. Assuming the input carries an amount of information X, the output carries an amount of information Y, and the loss rate of information-expression capability per intermediate-layer mapping is D, the required number of layers L satisfies L > ln(Y/X)/ln(1-D).
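As a quick numeric illustration of the constraint L > ln(Y/X)/ln(1-D), read here as the number of layers needed for the expressed information to shrink from X to Y, the values of X, Y and D below are assumed example figures, not numbers from the application.

```python
# Numeric illustration of the layer-count constraint L > ln(Y/X) / ln(1 - D).
# X, Y and D are assumed example values.
import math

X = 1000.0   # information content carried by the input
Y = 10.0     # information content that should remain at the output
D = 0.5      # expression-capability loss rate of each intermediate-layer mapping

L_min = math.log(Y / X) / math.log(1 - D)
print(f"L must exceed {L_min:.2f}, i.e. at least {math.ceil(L_min)} layers")
```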
If the coordinate-basis dimension is reduced to 1/K, the information-expression capability of each layer is reduced to 1/K². Note that this refers to the loss rate of information-expression capability, not the information loss rate: in many cases a coordinate basis of too high a dimension has redundant dimensions for a particular piece of information, and when the information is converted from a high-dimensional to a low-dimensional basis, removing only the redundant dimensions loses nothing of the information itself. If the coordinate-basis dimension is unchanged and R is the ratio of the input to output value ranges of the nonlinear activation function, the information-expression capability of each layer is reduced to 1/R². Therefore, in the present application, the information-expression loss rate from each layer to the next can be constrained by the above conditions, which determine the activation function, the number of neurons and the number of layers to use.
Based on this analysis of the working principle, the invention proposes several schemes for improving existing multilayer neural networks; the specific implementations are as follows:
(A) linear transformation + dimensionality reduction.
The invention proposes the following method: use linear transformations between layers, but reduce the number of neurons in each layer step by step, which is also a process of step-by-step dimensionality reduction. A linear activation function combined with the removal of some neurons is still essentially equivalent to a nonlinear activation function, but the equivalent activation function may be a new one, even a nonlinear function that is difficult to express mathematically.
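For illustration only, scheme (A) amounts to stacking purely linear layers whose widths shrink layer by layer; dropping dimensions is what supplies the effective nonlinearity. The layer widths and random weights in the sketch below are arbitrary example values.

```python
# Sketch of scheme (A): linear inter-layer transformations with layer widths that
# shrink step by step. The widths and random weights are arbitrary examples.
import numpy as np

rng = np.random.default_rng(0)
widths = [64, 32, 16, 8]                 # progressively reduced dimensionality
weights = [rng.standard_normal((widths[i], widths[i + 1])) / np.sqrt(widths[i])
           for i in range(len(widths) - 1)]

x = rng.standard_normal(widths[0])       # one input vector
for w in weights:
    x = x @ w                            # purely linear mapping, no activation function;
                                         # the width reduction itself discards components
print(x.shape)                           # (8,)
```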
(B) Apply nonlinear preprocessing to the input data to remove the components in some dimensions.
Since the coordinate basis of the input data is known (it can be regarded as a multidimensional impulse-function basis), the machine can directly apply a linear basis transformation to the input data and then discard the components in some dimensions according to a preset method. This preprocessing can be seen as passing the data through a nonlinear filter, the nonlinearity resulting from the active discarding of some components. Its purpose is to select certain aspects of the data's characteristics. The outputs of different filters can be regarded as data with different emphases and can enter the underlying-feature extraction model of step S1 separately.
The specific form of the coordinate-basis transformation needs to be chosen according to practice, and so does the data to be discarded. The specific form of the nonlinear filter can be set manually (convolution, for example, is one such transformation), and its range can be limited so that the machine optimizes it by trial. Since linear transformation itself is a very well-established calculation, it is not described further here.
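Scheme (B) can be pictured, for illustration only, as a fixed linear change of basis followed by discarding some components before the data reaches the network. The random orthogonal basis and the "keep the largest-magnitude components" rule below are only one possible choice, invented for the example.

```python
# Sketch of scheme (B): a linear coordinate-basis transform on the raw input,
# followed by actively discarding part of the components (a nonlinear filter).
# The orthogonal basis and the "keep the K largest components" rule are assumed choices.
import numpy as np

rng = np.random.default_rng(1)
dim, keep = 16, 6

basis, _ = np.linalg.qr(rng.standard_normal((dim, dim)))   # an arbitrary orthogonal basis

def prefilter(x: np.ndarray) -> np.ndarray:
    coeffs = basis.T @ x                       # coordinates of x in the new basis
    drop = np.argsort(np.abs(coeffs))[:-keep]  # indices of the components to discard
    coeffs[drop] = 0.0                         # discarding components is the nonlinearity
    return coeffs                              # would feed the feature extraction of step S1

x = rng.standard_normal(dim)
print(np.count_nonzero(prefilter(x)))          # only `keep` components survive
```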
(C) Forget the mapping path.
1, A forgetting mechanism is introduced by randomly forgetting some samples:
the total sample library is randomly divided into several groups, and each group randomly discards some samples from the full set. Each group is optimized with the same parameters. Among all samples, those that introduce non-common feature maps are necessarily a minority, because they carry non-common features. Within some groups, the problematic samples may happen to be among those randomly discarded, so the proportion of problematic samples in such a group may drop sharply. The network obtained from such a group is then, during parameter optimization, most likely to turn the coordinate basis used by the intermediate layers into an orthogonal basis and finally produce sparse neuron-layer outputs. The optimization then continues from this group's result, incorporating all or the remaining samples.
2, a forgetting mechanism is introduced by randomly forgetting some mapping paths:
some neuron output weight coefficients w may be zeroed at random, or particular neurons may be randomly silenced by setting their bias terms b so that the corresponding neuron outputs are forced to zero.
3, Progressive forgetting of mapping paths, introducing a forgetting mechanism:
when a nonlinear activation function is used, weight coefficients w are applied to the neuron outputs; after each update of the neural network coefficients, the absolute values of all weight coefficients w are reduced by a value delta, and the weights are then optimized again. It is also possible to act on the bias terms b of the neuron outputs: after each coefficient update, all bias terms b are changed by a value delta so that the neural network outputs move toward zero, and the coefficients are then optimized again. Here delta is a real number greater than or equal to zero, and the delta used for each reduction may differ. Items 2 and 3 are sketched in code after this list.
4, The machine can also randomly forget some neurons, which is the Drop-out method. The Drop-out method is not claimed in the present application and is not described here.
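As noted above, the following sketch illustrates items 2 and 3 only. The matrix shapes, the forgetting rate and the delta value are example figures invented for the demonstration, not parameters of the claimed method.

```python
# Sketch of the weight-level forgetting mechanisms of scheme (C): randomly zeroing
# some output weights (item 2) and shrinking all weights toward zero by a small
# delta after every update (item 3). Shapes, rates and delta are example values.
import numpy as np

rng = np.random.default_rng(2)
w = rng.standard_normal((8, 4))

# Item 2: randomly "forget" mapping paths by zeroing a fraction of the weights.
forget_mask = rng.random(w.shape) < 0.2        # ~20% of paths forgotten this step
w_randomly_forgotten = np.where(forget_mask, 0.0, w)

# Item 3: progressive forgetting -- after each coefficient update, move every
# weight's absolute value toward zero by delta, then let optimization restore
# the paths that really matter.
delta = 0.01
w_progressively_forgotten = np.sign(w) * np.maximum(np.abs(w) - delta, 0.0)

print(np.count_nonzero(w_randomly_forgotten), np.count_nonzero(w_progressively_forgotten))
```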
(D) Orthogonalizing the optimization gradients.
One to several linear transformation layers (or weakly nonlinear transformation layers) are inserted into the multilayer neural network. These layers may use different linear or weakly nonlinear activation functions, and they can be inserted before optimization begins or during the optimization process.
The purpose of introducing these linear (or weakly nonlinear) transformation layers is that adding neuron layers (equivalent to adding coordinate-basis transformations) gives the model the opportunity to choose orthogonal coordinate bases while keeping the information unchanged (or almost unchanged). Since the components of an orthogonal basis are independent of each other, if during optimization the information has the opportunity to be placed on an orthogonal coordinate basis, the optimization of each dimension of the information becomes independent of the others, giving the optimization a chance to reach the global optimum.
Another approach is to let the machine select intermediate orthogonal coordinate bases as far as possible by limiting the representation dimensions. Since the dimensions of an orthogonal basis are mutually orthogonal, the coordinate outputs in many dimensions will be zero once redundant information has been removed. Sparse neuron-layer outputs therefore usually indicate that the implicit basis they represent has been chosen to be orthogonal. By constraining the neuron outputs and rewarding sparse outputs, the network is in effect rewarded for choosing an implicit orthogonal coordinate basis.
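One common way to "reward" sparse, and hence more nearly orthogonal, intermediate representations is an L1 penalty on the layer outputs added to the training loss; the sketch below shows only that penalty term, with invented activations and an invented penalty weight, and is not presented as the claimed method itself.

```python
# Sketch of rewarding sparse intermediate-layer outputs (scheme D): an L1 penalty
# on neuron activations pushes the model toward representations in which many
# components are zero, i.e. toward an implicit near-orthogonal coordinate basis.
# The activations and the penalty weight are invented example values.
import numpy as np

def sparsity_penalty(layer_output: np.ndarray, weight: float = 1e-3) -> float:
    return weight * np.abs(layer_output).sum()

dense  = np.array([0.9, -0.7, 0.8, -0.6, 0.5])
sparse = np.array([1.2,  0.0, 0.0,  0.0, 0.9])
print(sparsity_penalty(dense), sparsity_penalty(sparse))   # the sparse output costs less
```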
In current neural networks, the number of layers is necessarily limited because each layer loses information, which reduces the probability of the information being mapped onto an orthogonal coordinate basis. When the coordinate bases are not orthogonal, changing the coefficient of one coordinate component simultaneously affects the coefficients of the components on the other bases, which can cause optimization problems, drive the optimization into local optima, or cause the loss of useful information. It should be noted that combinations of the above methods also fall within the scope of the claims of the present invention.
In the present application, we use the following example to explain how the method proposed here can be used to realize general artificial intelligence:
for example, one afternoon the machine mother and the machine child are at home, and the machine child is getting ready to go out and find friends to play football with. Below are their dialogue and the thinking steps during the course of the dialogue.
The environmental information is: the time is an afternoon; the setting is the family living room; the weather is clear and the temperature is 20 °C; two people, a mother and a child, are at home; the child is wearing sneakers …
Through step S1 the mother has acquired the ability to extract underlying features. This ability manifests itself in two ways: 1, windows of different sizes can be used to select input data, and within these windows the underlying features are extracted by comparing the similarity between the input data and the underlying-feature data in the feature library; or 2, windows of different sizes can be used to select input data, and a trained neural network can be applied to the window data to extract the underlying features.
In a family environment, at a leisurely moment, the machine mother's instinctive "safety requirement" can, according to a preset program, periodically give a certain activation value to the feature map representing this instinct in the relationship network. The size of this activation value is an empirical value, obtained through reinforcement learning via the "response and feedback" reward-and-penalty mechanism in the mother's life, or it may be a preset experience given to her by humans.
Because the mother is in a home environment at a leisurely time, the activation value obtained by the instinctive "safety requirement" is not high. In the mother's relationship network, "safety requirement" is usually tied closely to "observing the environment". Therefore, with only the motivation as input, the motivation is obtained in step S2, no input information needs to be recognized in step S3, and the possible point of interest obtained in step S4 is "observe the environment". In step S5, if the mother's built-in self-test system reports "tired, need a rest", the preset program inside the mother issues a rest instruction, which is also a preset "safety requirement"; this motivation likewise propagates activation values in the relationship network. The mother then has to interrupt the current information processing because new information has been input ("tired, need a rest"). She turns to process the new information; this time the gain-and-loss system judges that continuing to observe the environment brings more benefit, so she continues to observe the environment.
Through the relationship network and piecewise simulation, the machine mother may begin to execute "look around casually" and "listen casually". Since the output of these two steps takes the form of actions, they may need to be decomposed into specific bottom-level experience through piecewise simulation. The machine mother begins to segment and imitate the two concepts "look" and "listen". She needs to subdivide the concept "look" down to bottom-level experience. The bottom-level experience of "looking" consists of issuing commands to many muscles and to certain nerves; the parameters of these commands are a continuing summary of past experience, or may be preset experience. "Listening" is the same kind of process.
The machine mother then proceeds to the next round of information processing. She begins processing both visual and auditory input and enters a new step S2. In step S2 it must first be determined which region needs to be recognized and how large a window to use to recognize the underlying features. The machine selects the recognition area and the window size based on motivation, which here may come from inherited motivation. For example, if in the previous activity the machine's response to some information was "identify further information in a specific area", then that specific area is the recognition area the machine selects, and the expected size of the object to be identified determines the window size. Here the machine mother has no specific purpose; she is only looking and listening at random, so she is likely to choose a region at random and a window size at random to extract underlying features. This behavior is similar to human behavior in the same environment.
Since the mother has been in this environment before, she may already have established a mirror space of it. In step S2 she extracts the underlying features and places them at the size, angle and position that best match the original data, thereby preserving the temporal and spatial information of the original data. Suppose the mother's input video data contains a window and a curtain. Through the underlying-feature extraction algorithm established in step S1 and built into her information processing center, she extracts the underlying features of the window (possibly several local contour features of different sizes and several overall frame features) and of the curtain (possibly several local contour features of different sizes, several local texture features of different sizes and several overall frame features), and possibly underlying features of the window and curtain taken as a whole (because window plus curtain appear together so often in the data, when the machine extracts local similarities within windows of different sizes it may extract them as one overall similarity). That combined underlying feature may be a combination of part of the window's underlying features and part of the curtain's, or a simplified version of such a combination.
At this point the machine mother proceeds to step S3. Using each extracted feature, she searches for similar features in the relationship network with a similarity-comparison algorithm. Once found, they are given initial activation values, assigned by a preset program according to the strength of the current motivation. Along with this video input there are also underlying features that represent the machine's instinctive motivations; these accompany any input information and obtain their initial activation values directly from the preset program.
It is important to note that, even if the transfer coefficients are linear and the accumulation function is linear, the presence of an activation threshold means that, for the same feature maps and the same initial activation values, the final distribution of activation values differs depending on the order in which activation is carried out, whether in a single chain-activation pass or across multiple passes. This is the nonlinearity brought about by the activation threshold: the information loss differs along different propagation paths, and the phenomenon occurs in both single and multiple chain-activation processes. A preference in the choice of activation order corresponds to a difference in machine personality, so that the same input information produces different thinking results, which is consistent with humans.
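A toy illustration of this order dependence is given below: a node only relays activation once its accumulated value crosses the threshold, so visiting the nodes in a different order, here in a single pass, ends in a different final distribution. The graph, the transfer coefficients and the threshold are all invented for the example.

```python
# Toy sketch of chain activation with an activation threshold. The graph, the
# transfer coefficients and the threshold are invented; the point is only that
# the order of activation changes the final activation distribution.

EDGES = {"A": [("B", 0.6), ("C", 0.6)], "B": [("C", 0.5)], "C": [("B", 0.5)]}
THRESHOLD = 0.5

def chain_activate(initial, order):
    values = dict(initial)
    fired = set()
    for node in order:                            # single pass in the chosen order
        if values.get(node, 0.0) >= THRESHOLD and node not in fired:
            fired.add(node)
            for target, coeff in EDGES.get(node, []):
                values[target] = values.get(target, 0.0) + values[node] * coeff
    return values

start = {"A": 1.0, "B": 0.3, "C": 0.3}
print(chain_activate(start, ["A", "B", "C"]))
# Visiting B first, while it is still below the threshold, means B never relays
# its activation, so the final distribution differs for the same input:
print(chain_activate(start, ["B", "A", "C"]))
```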
In addition, the strength of a relationship in the relationship network is correlated with its latest memory value (or connection value), so earlier impressions tend to dominate. For example, suppose two machines with identical relationship networks face the same feature map and the same initial activation value, but one of them has just processed an input message concerning that feature map and has updated the relevant part of its relationship network. One of its relationship links will have increased according to the memory curve, and this increase does not subside in a short time. Faced with the same feature map and the same initial activation value, the machine that processed the additional information will therefore propagate more activation along the just-strengthened link, producing a "first impressions dominate" phenomenon.
By analogy, the machine mother's processing of video can be roughly compared to the currently popular convolutional neural network (CNN). Extracting underlying features from the input data can be approximately viewed as convolution; propagating activation values from the underlying features and finally finding the point of interest can be approximately viewed as the mapping process of a multilayer neural network; and the memory-and-forgetting mechanism can be approximately viewed as gradient optimization. But the differences are also significant. The relationship network has no explicit hierarchy of neurons; it is a single overall network in which every element is meaningful and visible, so every step of image processing in the relationship network is understandable and visible to humans. To summarize the difference more essentially: a current multilayer neural network is a relationship network in which only the input and output feature maps are visible, whereas the relationship network resembles a network trained layer by layer from underlying features to concepts (from simple to complex material). The machine adds network layers and retrains each time it adds material, the inter-layer mapping weights are bidirectional rather than unidirectional, and the intermediate layers can be exported. Neural networks are optimized with error back-propagation, while the relationship network is optimized with the memory-and-forgetting mechanism. A neural network is trained on all training data and is divided into a training phase and an application phase; the relationship network makes no such distinction, and the number of learning samples it requires is far smaller than for a neural network.
After the machine mother extracts the underlying features of the window and curtain and assigns them initial activation values, the machine's instinct also propagates activation values to these underlying features. Obviously the propagated values are low, because the window and curtain are not tightly connected to the "safety requirement" in the relationship network. The activation values of these underlying features of the window and curtain in memory are therefore not high, and the range of chain activation they can initiate is limited.
After chain activation is completed, the machine mother looks for points of interest in the relationship network; the result is likely to be that the feature maps of the window and curtain are points of interest, so step S3 is an input-information recognition process. The machine mother then proceeds to step S4: understanding the input information with the piecewise-simulation approach. The specific process is as follows. The machine mother uses both the window and the curtain to search for the most relevant memories, and may find several segments of memory associated with them; the search method is to query the memory library using the window and curtain feature maps. Clearly, understanding the two points of interest "window" and "curtain" does not require much memory to assist. Whether more memory is introduced is determined by many factors, for example whether, in the latest relationship network, the window and curtain are connected to memories with high memory values or to related inherited targets. The machine mother may simply invoke a long-term memory in the relationship network. Long-term memories are pruned by the memory-and-forgetting mechanism: only the parts that recur are retained, and what remains is, for instance, a memory frame containing the combination of window and curtain and their spoken names, whose memory values are relatively high. The machine mother thus understands the input information and proceeds to step S5. In step S5, since she is in a safe and leisurely environment, her motivation-initialization program assigns low motivation values; she may have no particular motivation, or only "safety requirement" may be activated, with a low activation value. At this point the machine mother may, out of habit, imitate memory and silently recite the words "window" and "curtain" in her mind, or she may produce no output at all.
During this casual watching and listening, video and audio data stream in continuously, and a preset program of the machine mother may invoke the "power saving" requirement. The machine mother may then use a large recognition window and largely ignore environmental details. She assigns a uniform preset initial activation value to the extracted underlying features, while the underlying motivations remain active and are periodically given activation values that propagate through the machine's relational network.
Suppose that while the machine mother is casually observing the environment, she suddenly extracts bottom-layer features such as the outline of a person bending over, the outline of clothing, and colors. These bottom-layer features are processed directly in the relational network without translation. The machine assigns initial activation values to them, the activation propagates through the relational network, and the final points of interest may be feature maps such as "person", "bending over", and "red clothes". These feature maps can be combined directly by starting piecewise simulation: by imitating similar memory segments, "a person in red clothes is bending over" may be the result of understanding the information.
Having recognized the information, the machine mother responds. She imitates one or more memories in which the common action is to gather further information. This is an empirical motivation associated with the "safety need" that has recurred throughout the machine mother's growth, so this kind of memory has become permanent and can be invoked without deliberate intent; it is effectively an instinctive reaction.
The machine mother then imitates previous experience (which may also be preset instinctive experience): identifying more specific information within a region. Following these experiences, she issues muscle commands to turn her eyes and ears toward that area. The recognition region she selects is the region containing the person, and the recognition window she uses is the one usually used to recognize a person at a similar distance.
Suppose the machine mother then notices a particular hair style and a hand reaching toward a shoe. Following the same information-processing procedure, she assigns activation values to the underlying features related to that hair style and to the hand reaching toward the shoe, and the final points of interest may be the "my child" and "putting on shoes" feature maps. Under a similar motivation, and after evaluation by the gain-and-loss system, the machine mother continues to identify information, using a smaller window to examine the images in the relevant region carefully.
The new input may now include the trademark on a pair of "Nike" shoes. After the brand feature map of the "Nike" shoe is assigned an activation value and activated, the machine mother's attention turns to the "Nike" sneaker. Integrating this with the previous information, the mother's relational network now holds high activations for "my child", "wearing shoes", "Nike trademark", and so on; under her "identify information" motivation she combines them by piecewise simulation into "my child is putting on Nike shoes". With this information, and driven by the "safety need", protecting the child is a strong empirical motivation, so the target point of interest may become "protect the child". Through piecewise simulation the machine mother finds that, in the present environment, the usual next step is to identify risk factors further. She therefore adjusts the parameters assigned by the motivation system and enlarges the recognition range. This time she finds a football next to the child. The child and the football give each other relatively high activation values, so their activations stand above the others and they become points of interest. Searching her memory, the machine mother finds several best-matching memory segments: perhaps the memory of the child kicking a ball in the afternoon, perhaps the memory of the pitch near home, perhaps experience she has previously summarized herself, such as "the child usually goes out to play ball in the afternoon when the weather is good". Self-summarized experience is also part of memory, since summarizing is itself an activity, and the machine may deliberately repeat such activity to add these summaries to memory. This is one way a machine teaches itself to adapt better to its environment.
It should be noted that the activation values in the relational network also decrease with time; points of interest that are not dealt with for a long time may be forgotten. If activation values faded too slowly, too much activation information would interfere with itself and the machine could not reasonably locate the target point of interest. At this point the machine mother may instinctively perform an activation-value refresh, motivated by conserving energy. The method is to convert the current key information into output information; this output need not actually be emitted, but is fed back in as input information, so the key information is activated again and the non-key information is forgotten more quickly. This is the process of arranging one's thoughts, and it is one way for a machine to highlight important information. It is similar to the way a person uses auxiliary filler words to buffer the thinking process; it gives the machine a kind of action performed within thought. These outputs may not be real outputs, or they may be actual outputs such as murmured self-talk. Because the machine encounters such behavior in its own and other people's learning and daily life, it can interpret correctly when it meets filler words that buffer thinking or a person talking to themselves.
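A minimal sketch of this activation refresh, assuming simple per-feature activation values (decay_activations and refresh are hypothetical names), might look as follows: everything fades, and re-entering the key information as virtual input re-activates only the points of interest, so non-key information falls away.

```python
# Hypothetical "thought arrangement": global decay plus selective re-activation
# of key information fed back in as (virtual) input.
def decay_activations(activations, rate=0.2):
    return {f: max(0.0, a - rate) for f, a in activations.items()}


def refresh(activations, key_features, boost=0.5):
    """Re-entering key information as input re-activates only those features."""
    return {f: a + (boost if f in key_features else 0.0) for f, a in activations.items()}


state = {"child": 0.9, "football": 0.8, "weather": 0.3, "sofa": 0.2}
state = decay_activations(state)                  # everything fades
state = refresh(state, {"child", "football"})     # key information re-emphasized
print(state)                                      # non-key features now stand out less
```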
Since human interaction most often takes the form of speech and text, speech and text are usually linked, within a concept's local network, to all of the concept's attributes. The attributes of a concept are all of its feature maps; these feature maps receive activation values from every branch of the relational network and pass them on to the speech or text, so the usual points of interest are the speech and text of concepts. For this reason, in the machine's self-filtering of information the intermediate output is usually speech, because speech is the most common output mode and costs the machine the least energy to produce. This, of course, is closely tied to the way a person grows up.
Suppose that after piecewise simulation the machine mother combines the relevant feature maps and concludes that the child wants to go out to kick the ball. This information is a sequence of image feature maps and remains subconscious until it is organized into an output form. At this moment many feature maps are activated in the machine mother's relational network and many memories have been recalled, so she is holding too much information and her processing efficiency is low. She therefore needs a thought buffer, guided by experience (which may be learned or preset); this is a form of self-protection for thinking.
To highlight important information, the machine typically re-emphasizes it one or more times, so that the activation values and recalled memories of unimportant information fade. Suppose the machine mother feeds the information "the child wants to go out to kick the ball" back to herself once as self-output turned input, so that the activation values of other information drop relatively. After passing through this virtual output flow as speech information, "the child wants to go out to kick the ball" becomes prominent in the relational network. Given her instinctive values, the "safety need" inclines the mother not to let the child kick the ball, because he might get hurt; but another memory recalls experts saying that more exercise is good for teenagers, which is consistent with the goal of "building up health". Which response to select must then be evaluated by the gain-and-loss system.
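One possible reading of this gain-and-loss evaluation, sketched with hypothetical names (evaluate_responses, gains, losses, pass_threshold), is that each candidate response inherits gain and loss values from the memories it activates, and the response with the highest net gain is selected, provided it clears a threshold:

```python
# Hypothetical gain-and-loss evaluation over candidate responses.
def evaluate_responses(candidates, pass_threshold=0.0):
    """candidates: list of {"response": str, "gains": [...], "losses": [...]}."""
    best, best_net = None, float("-inf")
    for c in candidates:
        net = sum(c["gains"]) - sum(c["losses"])
        if net > best_net:
            best, best_net = c["response"], net
    return (best, best_net) if best_net >= pass_threshold else (None, best_net)


candidates = [
    {"response": "do not allow kicking", "gains": [0.3], "losses": [0.6]},      # child unhappy
    {"response": "agree, but take umbrella", "gains": [0.7, 0.4], "losses": [0.2]},
]
print(evaluate_responses(candidates))
```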
Assume that neither response passes the mother's gain-and-loss assessment. She therefore builds another response by piecewise simulation, choosing a compromise set of target points of interest: agree, let him kick the ball, pay attention, stay safe, come back on time, and so on. While imitating and organizing these target concerns, the word "safe" brings up a memory of "the child caught a cold because of rain". The mother thus needs to search for responses again. Analyzing the memory, the cold produced the greatest loss, but it was the result of rain; she therefore needs to add "ruling out rain" to the goals to be achieved, so that the overall gain of the process can be maximized.
For the goal of "ruling out rain" there is obviously ready-made experience that piecewise simulation quickly finds: "check the weather" and "take an umbrella". These experiences become intermediate goals. Driven by the recalled information, the machine mother can check the weather, and to achieve the intermediate goal of keeping the rain off she may settle on a response scheme involving an umbrella.
The response the mother finally composes is "I hope the child takes an umbrella when he goes out to kick the ball". But based on past experience she omits the subject, since only she and the child are in the room; she also knows the child is aware he is going out to kick a ball, so from experience she need not repeat that information either. In previous memories, when the mother gives an instruction the child follows it, so this time, again by reference to past experience, she expects the child to comply. (Had experience indicated otherwise, her motivation would drive her not to give an instruction but to adopt another method.) Since experience indicates the child will comply, the option with the highest activation value in her motivation-driven selection is "give him an instruction". The machine mother may therefore finally output the speech "take an umbrella".
After receiving the information, the child determines the correct words of the speech through information recognition. Through translation and piecewise simulation against his past memories of his mother, he understands the intention behind the information she sent. Driven perhaps by his instinct to comply with his mother's wishes, his first mental response is to agree to take the umbrella, and he organizes, by piecewise simulation, an assessment of the gains and losses of taking it.
The child constructs the image of himself with the umbrella, combining partial memories of kicking the ball with memories of his friends, and he also imagines, as input, how the others would react to someone bringing an umbrella to a football game. This is the method of "empathy", a form of perspective-taking, and it is also a characteristic of the machine intelligence proposed in this application. The imagined information is converted into input information in parallel, in series, or in a mixed mode, and through chain activation this input passes its gain values and loss values to the gain symbol and the loss symbol. When this input is finished, the child sees a high loss value.
Since the amount of information being processed may be fairly large, he needs a thought buffer and highlights the important information. He then previews the possible situation once more as an input message, which amounts to analyzing the possible outcome more comprehensively while emphasizing the information again, and re-searches for his own and other people's responses under similar conditions. This time the gain-and-loss assessment shows that taking the umbrella carries a significant loss. He then, now deliberately, chooses again on the basis of a new motivation whose goal is to exclude the largest loss: he imitates past experience, organizes the output, and utters the word "no".
After receiving the child's feedback, and following the same processing flow, the mother recognizes that the child has rejected her request. Driven by her motivation, her target points of interest may now include "take umbrella", "why", "confirm further", "dissatisfaction", "maintain authority", "protect the child", and so on. She needs to achieve these goals through piecewise simulation, but they are too numerous to accomplish in a single complete process. She therefore again looks for experience in dealing with this problem: she may group the goals and achieve them step by step, while the remaining goals are temporarily converted into "inherited targets" that stay in her memory as goals to be achieved by later planning.
In the piecewise simulation she may rely on long-accumulated experience, by which the first step should be to "clarify the situation". She then imitates these experiences, asks the child why, and expresses her dissatisfaction in words, while "maintain authority", "take the umbrella", and "protect the child" become inherited targets that she will keep trying to achieve later.
After receiving this information, the child, by recognizing it and imitating both the purpose of sending such information and the common responses to receiving it, understands that his mother wants him to explain why he will not take the umbrella. Based on his motivation he believes compliance is good, but he also runs the gain-and-loss assessment, and so he organizes his language and begins to express his own reason. The child responds with the intention of making his mother understand him. According to long-term experience, when he has explained himself before, his mother has understood him. If, in the child's life, his mother had rarely understood him, he would find from experience that explaining does not serve his purpose and the gain is low, and the response he chose might instead be silence.
Assume that the feedback the child gives is: "My friends would think I look silly …".
This feedback is not what the mother expected. After receiving it she translates and processes the information, drawing on her own experience and on other people's experience (such as advice heard from a child-care expert), and, driven by the instinct to protect her child, decides to reason with him so that he will understand correctly. This is a large goal, so she begins searching her memory and breaking it down into a series of smaller goals. Imitating memories (perhaps something seen on television, or a child-care expert's lecture on handling similar situations), she feels that "it is time to have a proper talk with the child …"
The mother then takes a breath and, imitating habitual actions formed by long practice, begins to organize her own thoughts …
The present application has thus demonstrated how general intelligence can be implemented using the method and steps proposed herein. The essential difference between the dialogue above and current speech-interaction systems is whether the information exchanged is genuinely understood or merely mechanically imitated. The method and steps proposed in this application can therefore realize a human-like thinking process, built on three elements: established information summarization, information imitation, and motivation drive.

Claims (16)

1. A method for realizing general artificial intelligence, characterized by comprising the following steps:
S1: establishing a feature map library and an extraction model; the feature map library is a bottom-layer feature map library established by the machine by searching for local similarity, and the extraction model is an algorithm model for extracting bottom-layer feature maps;
S2: extracting underlying features, comprising: the machine extracts bottom-layer features from the sensor input information, adjusts the position, angle and size of each bottom-layer feature to those at which it is most similar to the original data so that the bottom-layer features overlap the original data, preserves the relative temporal and spatial positions of the bottom-layer features, and establishes a mirror space to simplify the input information;
S3: identifying the input information, comprising: the machine searches for points of interest to identify the input information, removes ambiguity, and performs a feature-map translation process;
S4: understanding the input information, comprising: the machine organizes the points of interest into one or more intelligible sequences and uses grammar to reorganize the vocabulary of the target language into an intelligible language structure, the specific method employed in step S4 being piecewise simulation;
S5: selecting a response, comprising: the machine combines the translated input information with a motivation to search for target points of interest; the machine establishes a response to the input information by using the relational network and memory; and the response is evaluated by a gain-and-loss evaluation system until a response is found that passes the evaluation;
S6: converting the response into an output format, comprising: the machine converts the sequence selected in S5 into an output form through piecewise simulation;
S7: updating the databases, comprising: the machine updates the underlying feature maps, concepts, relational network and memory according to the memory and forgetting mechanism, based on how the data were used in steps S1, S2, S3, S4, S5 and S6;
wherein the relational network between things is obtained by collating the memory; an output response is built by interpreting the input information through reorganization of the memory together with the input information; different reorganization results are selected under the drive of seeking benefit and avoiding harm, and responses are made by imitating or drawing analogies from the selected memory reorganization results.
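For orientation only, the following skeleton sketches how steps S1 to S7 of claim 1 might be chained in code; every function is a trivial placeholder standing in for the corresponding step, and none of the names are taken from the application.

```python
# Hypothetical skeleton of the S1-S7 cycle; every step is a stub so the flow runs.
def extract_underlying_features(sensor_input, feature_library):          # S2 (S1 built feature_library)
    return [f for f in feature_library if f in sensor_input]

def identify_input(features, relation_network, motivation):              # S3: stand-in for chain activation
    return sorted(features)

def piecewise_simulate(points_of_interest, memory_bank):                 # S4: stand-in for segment imitation
    return " ".join(points_of_interest)

def select_response(understood, memory_bank, motivation):                # S5: stand-in for gain/loss evaluation
    return f"respond to: {understood}"

def convert_to_output(response):                                         # S6: convert to an output form
    return response.upper()

def update_databases(feature_library, memory_bank, used):                # S7: memory-and-forgetting update (sketch)
    memory_bank.append(used)

def general_ai_cycle(sensor_input, feature_library, relation_network, memory_bank, motivation):
    features = extract_underlying_features(sensor_input, feature_library)
    poi = identify_input(features, relation_network, motivation)
    understood = piecewise_simulate(poi, memory_bank)
    response = select_response(understood, memory_bank, motivation)
    output = convert_to_output(response)
    update_databases(feature_library, memory_bank, used=(features, poi, understood, response))
    return output

print(general_ai_cycle("window curtain", ["window", "curtain"], {}, [], "safety"))
```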
2. The method for implementing general artificial intelligence according to claim 1, wherein the method for establishing the relational network comprises extracting the following three basic relationships: the similarity relationship of information; the temporal relationship of information; and the spatial relationship of information; the strength of a relationship increases each time that relationship in the relational network is used, and decreases over time; the machine considers data stored in the same time interval to be related to one another, wherein the strength of the relationship between any two data items is related to their memory values; the machine considers similar feature maps to be related, the relationship being related to their similarity; wherein,
step S1 includes:
S101: dividing the input data into a plurality of channels by filters;
S102: searching for local similarity in the input data, looking for common local features within each channel's data while ignoring the overall information; in step S102 the machine first slides a local window W1 to search for local features that occur widely within the window;
S103: successively using local windows W2, W3, …, Wn, wherein W1 < W2 < W3 < … < Wn, and repeating step S102 to obtain the underlying features; S104: the machine establishes a bottom-layer feature extraction algorithm model A, used in S102 and S103, which adopts a similarity comparison algorithm;
S105: training a neural network using the bottom-layer features extracted with the local window W1;
S106: training the neural network using the underlying features extracted with the local windows W2, W3, …, Wn; S107: another bottom-layer feature extraction algorithm model B established by the machine is used in S106; its output is n independent neural networks, or a single neural network with n output layers, and algorithm model B is an algorithm model based on a multilayer neural network; or/and,
step S2 includes:
S201: the machine selects a data interval and the local window W1, and extracts bottom-layer features using algorithm model A or algorithm model B from step S1;
S202: the machine moves the local window W1 to extract the underlying features while maintaining their original relative temporal and spatial relationships; or/and,
step S3 includes:
S301: after the machine extracts the data in the region of interest using a window, it extracts the bottom-layer features using similarity comparison algorithm A or neural network model B from step S1, and then searches for the corresponding bottom-layer features in the relational network by similarity comparison;
S302: the machine assigns each underlying feature found an initial activation value in accordance with the motivation, this value being adjusted by the machine's motivation;
S303: for each underlying feature assigned an initial activation value, if its activation value exceeds a preset activation threshold, chain activation is initiated;
S304: the machine starts chain activation in the cognitive network;
S305: after chain activation of all bottom-layer features is completed, one or more features are activated the most, and the feature maps that stand out are the points of interest;
S306: the machine starts chain activation in the memory bank;
S307: after chain activation of all bottom-layer features is completed, one or more features are activated the most, and the feature maps that stand out are the points of interest;
wherein the processes S304/S305 and S306/S307 are parallel alternatives, one of the two being chosen; or/and,
step S4 includes:
S401: the machine searches the memory for the feature map corresponding to each translated point of interest and establishes a memory pool; the specific implementation is that an activation value is assigned to each input-information feature map found in memory, chain activation is then started, and after chain activation is completed the memories with higher activation values and memory values are placed into the memory pool;
S402: searching for possible process frameworks, wherein the specific method is that the memory with the highest sum of activation values is used preferentially, process frameworks are extracted from these memories, and feature maps with low memory values are removed from the process frameworks;
S403: the machine combines the imitable segments obtained in step S402 into a larger imitable framework;
S404: the machine expands the concepts representing the process framework by piecewise simulation, the expanded process framework containing more details; the machine finds relations between similar feature maps by overlapping them again; the two cores of piecewise simulation are: finding a framework and expanding it, iterating in the process; and imitating memory, replacing details in the memory with details from reality by similarity substitution, so as to build a pyramid-shaped feature map sequence through gradual refinement; or/and,
step S5 includes:
S501: the machine searches for memories related to information similar to the input, using the information recognized in step S4 as a virtual input;
S502: establishing a virtual response, using the memory with the highest memory value as a framework, as the machine's instinctive response to the input information;
S503: searching for memories related to the instinctive response for gain-and-loss assessment;
S504: the machine evaluates the gains and losses of the instinctive response;
S505: judging whether the response passes the evaluation; if so, the machine takes the response as output, and if not, the machine expands the search for related memories around the feature map with the largest gain and the feature map with the largest loss, and organizes the response flow again.
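As one possible illustration of S101 to S103 above (and of the window movement in S201 and S202), the sketch below slides local windows of increasing size W1 < W2 < … < Wn over a one-dimensional stand-in for sensor data and keeps only the local patterns that recur; the function name, the data, and the min_count parameter are assumptions, not the claimed algorithm.

```python
# Hypothetical local-similarity search with windows of increasing size.
from collections import Counter


def extract_local_features(data, window_sizes, min_count=2):
    features = {}
    for w in sorted(window_sizes):                 # W1 < W2 < ... < Wn
        counts = Counter(tuple(data[i:i + w]) for i in range(len(data) - w + 1))
        # keep only patterns that recur: these stand in for "ubiquitous" local features
        features[w] = {pat: c for pat, c in counts.items() if c >= min_count}
    return features


signal = [1, 2, 3, 1, 2, 3, 4, 1, 2, 3]
print(extract_local_features(signal, window_sizes=[2, 3]))
```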
3. The method for implementing general artificial intelligence according to claim 2, wherein establishing the mirror space in S2 includes storing the input information; when the machine stores data, the original similarity, spatial relationships and temporal relationships among the data are preserved; the machine uses values or symbols as memory values to represent how long the data can remain in the database; the temporal and spatial relations among things are extracted by collating the memory; the machine considers the feature maps within the same memory frame to be related to one another, the relationship strength between two feature maps being a function of their two memory values; the machine considers similar things to be related, and information appearing simultaneously in the same space to be mutually related; the relationships of information appearing in the same space form a transverse relational network, and the different transverse relational networks are joined into the overall relational network by connecting similar information; the memory values increase each time the data are used and decrease as time passes; data stored in the same time interval are related to one another, wherein the strength of the relationship between any two data items is related to their memory values.
4. The method for implementing general artificial intelligence according to claim 3, wherein the input information is filtered when stored, and the filtering method comprises:
the machine stores the data in a temporary memory bank; the temporary memory bank adopts an independent memory-forgetting curve;
after the memory value of data in the temporary memory bank reaches a preset standard, the input information is transferred to a long-term memory bank; the machine needs to store data again and establish new data storage only when the current input data differs from the previous input data by more than a preset threshold; and the machine needs to update the memory values only when the memory values corresponding to the data have changed by more than a preset threshold.
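One hedged reading of this filtering, using hypothetical names (TwoStageMemory, promote_at, change_threshold), is a two-stage store: data first enter a temporary bank with its own forgetting curve, are promoted to the long-term bank once their memory value reaches a preset standard, and are re-stored only when they change beyond a threshold.

```python
# Hypothetical two-stage memory filter with independent forgetting curves.
class TwoStageMemory:
    def __init__(self, promote_at=3.0, change_threshold=0.5, temp_decay=0.5, long_decay=0.05):
        self.temporary, self.long_term = {}, {}
        self.promote_at = promote_at
        self.change_threshold = change_threshold
        self.temp_decay, self.long_decay = temp_decay, long_decay

    def observe(self, key, value):
        old_value, memory_value = self.temporary.get(key, (None, 0.0))
        # re-store the data only if it changed beyond the preset threshold
        if old_value is None or abs(value - old_value) > self.change_threshold:
            old_value = value
        self.temporary[key] = (old_value, memory_value + 1.0)
        if self.temporary[key][1] >= self.promote_at:      # promotion to long-term memory
            self.long_term[key] = self.temporary.pop(key)

    def tick(self):
        """Temporary and long-term stores forget on independent curves."""
        for store, decay in ((self.temporary, self.temp_decay), (self.long_term, self.long_decay)):
            for k in list(store):
                v, m = store[k]
                m -= decay
                if m <= 0:
                    del store[k]
                else:
                    store[k] = (v, m)


mem = TwoStageMemory()
for _ in range(3):
    mem.observe("window_scene", 1.0)   # repetition raises the memory value
print(mem.long_term)
```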
5. The method for implementing general artificial intelligence according to claim 1, wherein the method for organizing the stored input information in S4 comprises:
organizing the data stored by the machine, wherein the data comprise the external input data, or data features extracted by the machine from the external input data during the same time period, data supplied by the machine's internal monitoring system, analysis data produced by the machine from these data, and the memory values of the data; the stored data are considered to be related to one another, and the relationship strength between any two data items is related to their memory values.
6. The method of claim 5, wherein the organizing method further comprises:
when storing in memory the analysis data concerning internal and external input information, the machine associates the analysis data with the value of that analysis data itself.
7. The method for implementing general artificial intelligence according to claim 1, wherein the data feature selection method for the underlying features in S2 includes:
the machine uses a window to select data and looks for local similarity within the selected data; data segments similar to one another are taken as the data feature; when storing the data feature in a data feature library, a preset memory value is assigned to represent how long it can remain in the database; this preset memory value increases with the number of times the local similarity recurs and decreases with time; the machine repeats the above operations on the same data using windows of different sizes.
8. The method for implementing general artificial intelligence according to claim 1, wherein in S3, the method for training the neural network to recognize whether the input data includes the data features includes:
the machine selects input data using a window and marks the data features contained in that data; the marking is done by similarity comparison between the data in the selected window and the data features; the machine trains a neural network to identify the marked data; the machine repeats the operation on the same data with windows from small to large, and each time the data-selection window is enlarged, the machine adds zero to several layers of neurons to the previously trained neural network to form a new neural network; in the new training, if the machine trains only the newly added neuron layers, it finally obtains a single neural network in which some intermediate neuron layers are also output layers; if the machine trains all neuron layers, it retains the previous neural network, and finally each window has its own corresponding neural network.
9. The method for implementing general artificial intelligence according to claim 1, wherein the step of extracting data features from the input data in S2 includes:
the machine selects input data using a window, determines whether the selected data contain data features from the data feature library, and locates those features within the input data by the position of the window; the machine repeats the above operations on the same data using windows of different sizes.
10. The method for implementing general artificial intelligence according to claim 1, wherein in step S5 the method of adding the machine's own motivation and performing chain activation in the relational network comprises:
in the relational network, when feature map i is given an initial activation value, if that value is greater than the preset activation threshold Va(i), feature map i is activated and transmits an activation value to the other feature-map nodes connected to it; if a feature map's received activation values, accumulated with its own initial activation value, exceed that node's preset activation threshold, it is activated in turn and transmits activation values to the feature maps connected to it; the activation thus spreads in a chain, and the whole activation-transfer process stops when no new activation occurs, this being called the chain activation process; within a single chain activation, once activation has been transmitted from feature map i to feature map j, reverse transmission from j to i is prohibited.
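A minimal sketch of this chain activation, under the assumption of scalar activation values and thresholds Va(i) (the name chain_activation and the data layout are illustrative, not from the claim), is given below; the transfer coefficient also follows claim 11 in growing with the relationship strength and with that strength's share of all of feature map i's relationships.

```python
# Hypothetical chain activation: fire above threshold, propagate along relations,
# never traverse a used link in reverse within the same chain.
def chain_activation(strengths, thresholds, initial):
    """strengths[i][j]: relation strength from i to j; thresholds[i]: Va(i);
    initial: dict of initial activation values."""
    activation = dict(initial)
    fired, used_links = set(), set()
    frontier = [i for i, a in activation.items() if a > thresholds.get(i, 0.0)]
    while frontier:
        i = frontier.pop()
        if i in fired:
            continue
        fired.add(i)
        total_out = sum(strengths.get(i, {}).values()) or 1.0
        for j, s in strengths.get(i, {}).items():
            if (j, i) in used_links:          # no reverse transfer within one chain
                continue
            used_links.add((i, j))
            # transfer coefficient grows with the strength and its share of i's links
            activation[j] = activation.get(j, 0.0) + activation[i] * (s / total_out)
            if activation[j] > thresholds.get(j, 0.0) and j not in fired:
                frontier.append(j)
    return activation


strengths = {"window": {"curtain": 2.0, "wall": 0.5}, "curtain": {"window": 2.0}}
print(chain_activation(strengths, {"window": 0.1, "curtain": 0.1, "wall": 0.1}, {"window": 1.0}))
```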
11. The method of claim 10, wherein the transfer coefficient of activation values from feature map A to feature map B is positively correlated with the relationship strength between feature map A and feature map B, and is also positively correlated with the weight of that relationship strength among all relationship strengths of feature map A.
12. The method for implementing general artificial intelligence according to claim 1, wherein in S5, propagation of activation values is performed in the relationship network, and the propagation method includes:
the activation values of the nodes propagate through the relational network, and the activation values in the relational network decrease over time.
13. The method for implementing general artificial intelligence according to claim 1, wherein the method by which the machine uses memory data in S4 comprises:
the machine recombines local information from different memory data and adds information related to the input information to form a new information sequence; the machine responds to the input information by imitating this new information sequence; during the imitation the machine uses the same method to organize lower-layer new information sequences and achieves intermediate targets by imitating them; this method is applied iteratively.
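Sketched with hypothetical names (imitate, memory_segments, primitive_actions), the iterative use of memory data in this claim can be read as recursive expansion: any step of the recombined sequence that cannot be executed directly becomes an intermediate target and is expanded from memory in the same way.

```python
# Hypothetical recursive imitation of recombined memory segments.
def imitate(target, memory_segments, primitive_actions, depth=0, max_depth=5):
    if target in primitive_actions or depth >= max_depth:
        return [target]                         # directly executable (or stop expanding)
    plan = []
    for step in memory_segments.get(target, [target]):
        plan += imitate(step, memory_segments, primitive_actions, depth + 1, max_depth)
    return plan                                  # lower-layer sequences, expanded iteratively


segments = {
    "go out to kick ball": ["take umbrella", "walk to field", "kick ball"],
    "take umbrella": ["find umbrella", "pick up umbrella"],
}
actions = {"find umbrella", "pick up umbrella", "walk to field", "kick ball"}
print(imitate("go out to kick ball", segments, actions))
```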
14. The method for implementing general artificial intelligence according to claim 1, wherein the method by which the machine uses memory data in S4 further comprises:
in the process of recombining the memorized information with the input information, the machine takes the partially recombined new information sequence as its own input in order to adjust the recombination process.
15. A method for implementing general artificial intelligence based on a neural network, applied within a method for implementing general artificial intelligence according to any one of claims 1-14, the method comprising:
introducing a weight coefficient w for the neuron output, reducing the absolute value of all weight coefficients w by a delta value after each update of the neural network coefficients, and then optimizing the weight coefficients again; introducing a bias term b for the neuron output, changing all bias terms b by a delta value after each update of the neural network coefficients so that the output of the neural network approaches zero, and then optimizing the neural network coefficients again; wherein delta is a real number greater than or equal to zero, and the delta value used at each reduction may differ.
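A small sketch of this optimization variant, assuming NumPy arrays of coefficients (shrink_towards_zero and delta are illustrative names, and the bias handling is one possible reading of "enabling the output to approach zero"), applies the shrink step after each ordinary coefficient update:

```python
# Hypothetical post-update shrink step: pull |w| (and b) toward zero by delta.
import numpy as np


def shrink_towards_zero(weights, biases, delta):
    """delta >= 0; intended to run after each normal coefficient update."""
    weights = np.sign(weights) * np.maximum(np.abs(weights) - delta, 0.0)
    biases = biases - np.sign(biases) * np.minimum(np.abs(biases), delta)
    return weights, biases


w = np.array([0.8, -0.03, 1.5])
b = np.array([0.2, -0.4])
w, b = shrink_towards_zero(w, b, delta=0.05)   # would follow each gradient step
print(w, b)
```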
16. The method of claim 15, wherein one or more linear transformation layers are inserted into the multilayer neural network; the linear transformation layers adopt different linear or weakly nonlinear activation functions; and the linear transformation layers are inserted before the optimization begins or during the optimization.
CN202010370939.2A 2020-04-30 2020-04-30 Method for realizing general artificial intelligence Active CN111553467B (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
CN202010370939.2A CN111553467B (en) 2020-04-30 2020-04-30 Method for realizing general artificial intelligence
PCT/CN2020/000108 WO2021217282A1 (en) 2020-04-30 2020-05-15 Method for implementing universal artificial intelligence
PCT/CN2021/086573 WO2021218614A1 (en) 2020-04-30 2021-04-12 Establishment of general artificial intelligence system
US17/565,449 US11715291B2 (en) 2020-04-30 2021-12-29 Establishment of general-purpose artificial intelligence system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010370939.2A CN111553467B (en) 2020-04-30 2020-04-30 Method for realizing general artificial intelligence

Publications (2)

Publication Number Publication Date
CN111553467A CN111553467A (en) 2020-08-18
CN111553467B true CN111553467B (en) 2021-06-08

Family

ID=72000250

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010370939.2A Active CN111553467B (en) 2020-04-30 2020-04-30 Method for realizing general artificial intelligence

Country Status (2)

Country Link
CN (1) CN111553467B (en)
WO (1) WO2021217282A1 (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021218614A1 (en) * 2020-04-30 2021-11-04 陈永聪 Establishment of general artificial intelligence system
CN112016664A (en) * 2020-09-14 2020-12-01 陈永聪 Method for realizing humanoid universal artificial intelligence machine
CN112231870B (en) * 2020-09-23 2022-08-02 西南交通大学 Intelligent generation method for railway line in complex mountain area
WO2022109759A1 (en) * 2020-11-25 2022-06-02 陈永聪 Method for implementing humanlike artificial general intelligence
CN113626616B (en) * 2021-08-25 2024-03-12 中国电子科技集团公司第三十六研究所 Aircraft safety early warning method, device and system
CN115359166B (en) * 2022-10-20 2023-03-24 北京百度网讯科技有限公司 Image generation method and device, electronic equipment and medium
CN118503893A (en) * 2024-06-06 2024-08-16 浙江大学 Time sequence data anomaly detection method and device based on space-time characteristic representation difference

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016132569A1 (en) * 2015-02-17 2016-08-25 オーナンバ株式会社 Method for predicting future time at which current value or amount of generated power in a photovoltaic power generation system will decrease
CN109202921A (en) * 2017-07-03 2019-01-15 北京光年无限科技有限公司 The man-machine interaction method and device based on Forgetting Mechanism for robot
CN110909153A (en) * 2019-10-22 2020-03-24 中国船舶重工集团公司第七0九研究所 Knowledge graph visualization method based on semantic attention model
CN111050219A (en) * 2018-10-12 2020-04-21 奥多比公司 Spatio-temporal memory network for locating target objects in video content

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170360411A1 (en) * 2016-06-20 2017-12-21 Alex Rothberg Automated image analysis for identifying a medical parameter
CN107609563A (en) * 2017-09-15 2018-01-19 成都澳海川科技有限公司 Picture semantic describes method and device
CN107818306B (en) * 2017-10-31 2020-08-07 天津大学 Video question-answering method based on attention model
CN110163233A (en) * 2018-02-11 2019-08-23 陕西爱尚物联科技有限公司 A method of so that machine is competent at more complex works
WO2019232335A1 (en) * 2018-06-01 2019-12-05 Volkswagen Group Of America, Inc. Methodologies, systems, and components for incremental and continual learning for scalable improvement of autonomous systems
US10885395B2 (en) * 2018-06-17 2021-01-05 Pensa Systems Method for scaling fine-grained object recognition of consumer packaged goods
EP3617947A1 (en) * 2018-08-30 2020-03-04 Nokia Technologies Oy Apparatus and method for processing image data
CN109492679A (en) * 2018-10-24 2019-03-19 杭州电子科技大学 Based on attention mechanism and the character recognition method for being coupled chronological classification loss
CN109740419B (en) * 2018-11-22 2021-03-02 东南大学 Attention-LSTM network-based video behavior identification method
CN109657791A (en) * 2018-12-14 2019-04-19 中南大学 It is a kind of based on cerebral nerve cynapse memory mechanism towards open world successive learning method
CN110070188B (en) * 2019-04-30 2021-03-30 山东大学 Incremental cognitive development system and method integrating interactive reinforcement learning
CN110705692B (en) * 2019-09-25 2022-06-24 中南大学 Nonlinear dynamic industrial process product prediction method of space-time attention network

Also Published As

Publication number Publication date
WO2021217282A1 (en) 2021-11-04
CN111553467A (en) 2020-08-18

Similar Documents

Publication Publication Date Title
CN111553467B (en) Method for realizing general artificial intelligence
Daniele et al. AI+ art= human
Marsella et al. Computationally modeling human emotion
Vernon Artificial cognitive systems: A primer
Barsalou Situated conceptualization
Khan et al. Emotion Based Signal Enhancement Through Multisensory Integration Using Machine Learning.
US10853986B2 (en) Creative GAN generating art deviating from style norms
WO2021223042A1 (en) Method for implementing machine intelligence similar to human intelligence
WO2021226731A1 (en) Method for imitating human memory to realize universal machine intelligence
Reva Logic, Reasoning, Decision-Making
US11715291B2 (en) Establishment of general-purpose artificial intelligence system
CN112215346B (en) Method for realizing humanoid universal artificial intelligence machine
Yu Robot behavior generation and human behavior understanding in natural human-robot interaction
CN113962353A (en) Method for establishing strong artificial intelligence
Augello et al. A social practice oriented signs detection for human-humanoid interaction
CN112016664A (en) Method for realizing humanoid universal artificial intelligence machine
Ehresmann et al. Emergence processes up to consciousness using the multiplicity principle and quantum physics
WO2022109759A1 (en) Method for implementing humanlike artificial general intelligence
Da Silva et al. Modelling shared attention through relational reinforcement learning
Silver et al. The Roles of Symbols in Neural-based AI: They are Not What You Think!
Wong et al. Robot emotions generated and modulated by visual features of the environment
Höppner Posthuman embodiment: On the functions of things in embodiment processes
US20080243750A1 (en) Human Artificial Intelligence Software Application for Machine & Computer Based Program Function
KR102183310B1 (en) Deep learning-based professional image interpretation device and method through expertise transplant
Cid et al. A new paradigm for learning affective behavior: Emotional affordances in human robot interaction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant