CN110598534B - Bionic visual image target identification method fusing dotted line memory information

Info

Publication number
CN110598534B
CN110598534B · CN201910699779.3A · CN201910699779A
Authority
CN
China
Prior art keywords
target
image
stimulation
array
point
Prior art date
Legal status: Active
Application number
CN201910699779.3A
Other languages
Chinese (zh)
Other versions
CN110598534A (en)
Inventor
余伶俐
金鸣岳
周开军
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University
Priority to CN201910699779.3A
Publication of CN110598534A
Application granted
Publication of CN110598534B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a bionic visual image target recognition method fusing dotted line memory information. The method comprises: constructing a visually driven grid cell set, constructing a distance cell model, and calculating the displacement vectors between the positions encoded by the grid cell population vectors; calculating the response of all sensory neurons to each foveal pixel k through a Gaussian kernel for target recognition; computing the fovea of the current target image with the Gaussian-kernel sensory cells, taking the feature tag unit with the strongest response as the next saccade point, and accumulating the corresponding stimulus identity cells; selecting the next saccade viewpoint and updating the foveal displacement vector through the distance cell model; and cyclically repeating the current-position calculation, next-saccade selection, and vector calculation during target recognition until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target. The invention achieves a high recognition rate for images subject to position change, scaling, and occlusion.

Description

Bionic visual image target identification method fusing dotted line memory information
Technical Field
The invention relates to the fields of visual perception, memory recognition, and biological information, and in particular to a method for recognizing scaled and occluded targets based on grid cells.
Background
Recognition of scaled and occluded targets is a hot problem in the current vision field: when a target image is scaled or occluded by an obstacle, traditional machine learning methods struggle to recognize it. Traditional target recognition models emphasize massively parallel processing of low-level features, with high-order representation arising only in a post-processing stage. Visual perception, however, is closely coupled with eye movement: human visual perception and eyeball movement rapidly and continuously scan the current target image, and no matter how the target image is scaled or occluded, the human visual system can effectively and correctly recognize it (the target images in this patent mainly refer to faces and traffic signs).
At present, target identification methods mainly include the following three. The first, following the visual perception process, proposes bipolar filters in the horizontal and vertical directions fused with a Gabor filter to realize directional edge detection, and designs a spatial-interval detection operator to objectively describe line spatial frequency and response intensity; it handles the detection of scaled and translated target images with strong linearity well, but cannot identify occluded target images. The second is two-dimensional principal component analysis (2DPCA), a principal component analysis method proposed on the basis of PCA; its main process converts the 2D face image matrix into a one-dimensional vector and obtains the eigenvalues and corresponding orthonormal eigenvectors of the covariance matrix derived from the mean image; it is mainly applied to face recognition and cannot accurately recognize blurred or occluded feature regions. The third is the scale-invariant feature transform (SIFT), which extracts distinctive invariant features from the target image by searching for extreme points in scale space and extracting their position, scale, and rotation invariants, realizing reliable matching of objects or scenes across viewing angles; it achieves target recognition through feature-point matching but cannot accurately identify occluded target images.
Achieving recognition of scaled and occluded targets at the neural level is a complex problem for the brain. LeCun, in Nature, proposed an unsupervised learning system that classifies visual stimuli into different classes and is used to identify specific familiar stimuli within those classes. These methods of identifying stimuli typically focus on parallel processing of low-level visual features. Since Yarbus' pioneering research on eye movements, it has been understood that perception also depends on motor behavior, for example the perception hypothesis Friston proposed in 2012 based on saccadic sampling: the currently foveated feature of the stimulus is processed, and the saccade position of the stimulus' next feature is computed according to the hypothesized stimulus identity; a series of saccades can then be viewed as a complex trajectory on a two-dimensional plane, analogous to spatial navigation. In 2005 the Mosers discovered grid cells in the entorhinal cortex of the brain, recording the rule that specific nerve cells are activated as a rat moves, with firing fields arranged hexagonally. These studies indicate that entorhinal cortical cells can exhibit grid-cell-like neural responses in visual space; grid cells support path integration and spatial vector navigation. According to the perception mechanism of grid cells and vector navigation, when a visual target image appears, visually driven grid cells vector-encode the salient stimulus features in visual space to drive the saccades of visual recognition, and the known properties of grid cells can endow a model with size and position invariance. Recognizing occluded target images by combining grid cells, recognition memory, and vector navigation is the problem this invention sets out to solve.
Disclosure of Invention
The invention aims to overcome the insufficiency of the prior art: it provides a bionic visual image target identification method fusing dotted line memory information, solves the problem of the low recognition rate for occluded targets of traditional methods, and improves the recognition rate for occluded target images by combining grid cells, recognition memory, and vector navigation.
To solve the above technical problems, the technical solution adopted by the invention is as follows: a bionic visual image target recognition method fusing dotted line memory information, comprising the following steps:
1) Constructing a visually driven grid cell set, wherein each layer's grid cell map of the set is composed of matrices of the same size, and M modules and N offsets are adopted to form four-dimensional grid cells; the grid cell map size of each layer is the image size S × S, so the four-dimensional grid cells have size S × S × N × M;
2) Constructing a distance cell model, and calculating displacement vectors between positions coded by grid cell population vectors;
3) For each training image, manually and randomly selecting 5-13 stimulation points and storing them in a matrix cGC; recording, for each stimulation point with coordinates (x, y), the pixel value at coordinates (x, y) in each layer of the grid cells into the matrix tGC; expanding each stimulation point into a foveal array of s × s pixels and storing it in a matrix FPC, wherein S = m × s, m ∈ (7.2, 7.5); and calculating the response of all sensory neurons to each foveal pixel k through a Gaussian kernel and storing the resulting matrix array in a matrix SC;
4) Selecting the first stimulation point in the target image with a top-down attention mechanism; computing the fovea of the current target image with the Gaussian-kernel sensory cells; taking the feature tag unit with the strongest response as the next saccade point and accumulating the corresponding stimulus identity cells; and, for the start and end point of each saccade, updating the foveal displacement vector through the distance cell model so as to judge the pixel difference between the saccade point and the stimulation point and ensure that it remains within 1%, otherwise resetting the starting point; this continues until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target.
In step 1), the visually driven grid cell set is constructed through firing rate maps, the grid cell firing rate map being calculated by the following formulas:

r_GC = max(0, cos(z_0) + cos(z_1) + cos(z_2));
z_i = F · b_i · (x − c), i = 0, 1, 2, with b_i = (cos(iπ/3), sin(iπ/3)) the three cosine-wave normal vectors 60° apart;

wherein b_0, b_1 and b_2 are the cosine-wave normal vectors and b_i denotes any of them; F is an array of M grid spatial frequencies from 0.0014 × 2π to 0.0214 × 2π; c is the vector representation of the current offset; z_i denotes the result computed for each cosine-wave normal vector; and r_GC represents the firing rate map of the grid cell layer corresponding to the current offset and the current module. For each grid scale, the N offsets are uniformly sampled along the principal axes of two adjacent equilateral triangles on the grid.
The specific implementation process of step 3) comprises: gray-adjusting a training image M1 to obtain an image M2; integrally processing M2 to obtain M3; selecting the stimulation points of the training image such that each stimulation point (x, y) satisfies 31 ≤ x, y ≤ 410; selecting the fovea based on image M3 to obtain the foveal array, which is stored in the matrix cGC; and calculating the response of all sensory neurons to each foveal pixel k through the Gaussian kernel and storing the resulting matrix array in the matrix SC.
The Gaussian kernel takes the form

SC = exp(−(x − ref)² / (2σ²))

wherein x represents the array obtained by reshaping each 61 × 61 foveal array into a 3721 × 1 array and copying it 256 times, ref represents an array running from 0 to 255 copied 3721 times, and σ is the tuning width of the sensory neurons.
The specific implementation process of step 4) comprises: respectively calculating the values of the stimulus identity cells (PRC) and the feature tag cells (PLC); randomly selecting stimulation points in the training set and expanding them into 61 × 61 foveal arrays O; calculating the neural response of the Gaussian kernel to the foveal array of the target image; and multiplying SC_1 of the training set with the SC array of the current position: FC = SC_1 × SC. If max(FC) reaches the preset matching threshold, then PLC = FC/max(FC) is calculated, the values in the PLC array smaller than the preset cut-off are set to 0, and the value of the current PLC is recorded; if not, the current feature tag unit is reselected. The size of PLC is (n × 9, 1). The calculation formula for PRC is: PRC = PRC + PRC_2 × PLC.
The specific implementation process of step 5) comprises: setting a weak-noise parameter w_n and calculating PLC = w_n × PRC_2 × PRC; the index corresponding to the maximum value in PLC is the next selected feature tag unit, i.e. the feature tag unit with the strongest response is chosen by this calculation as the next saccade point, the target feature tag unit; the vector distance between the starting feature tag unit and the target feature tag unit is calculated and marked on the target image; if the pixel difference between the target pixel point obtained through the vector calculation and the stimulation point in the training set is within 1%, the target feature unit is taken as the starting point of the next saccade; otherwise, the target feature unit is reselected; recognition is considered successful once PRC(i) ≥ 0.9.
Compared with the prior art, the beneficial effects of the invention are:
1) The constructed visually driven grid cell set supports recognition memory through the movement vectors between encoded features and captures the distribution of stimulation points in a stimulus-specific coordinate system; it also allows exploring visual recognition and the memory-guided attention to the position information of the target object; and the properties of grid cells provide size and position invariance while helping to handle occlusion;
2) Foveal processing of the image simulates human vision, which comprises central vision and peripheral vision: central vision provides accurate and detailed visual content, while peripheral vision covers the wide-angle field of view. The difference in visual acuity between the two motivates foveated rendering, which tracks the human eye and renders the peripheral picture at low image quality to highlight what central vision fixates, coming closer to the real picture seen by human eyes;
3) The selection of saccades and saccade points during recognition is based on the movement behavior of the eye. Fast object-driven eye movement is the basis for complex pattern recognition: the part of the stimulus currently involved in recognition is used to calculate the next saccade point, where the next feature of the stimulus should be located according to the hypothesized stimulus identity. If the saccade point deviates too far from the stimulation point, the next saccade point must be recalculated, because stimulation points are chosen at the parts that present important features. For example, if attention currently rests on the nose of a familiar face and the face is to be identified, the eye and mouth positions must be confirmed by saccades; if a saccade lands on a point that is not a stimulation point in the training set, the face will not be recognized correctly. The invention achieves a high recognition rate for images subject to position change, scaling, and occlusion.
Drawings
FIG. 1 is a framework diagram of bionic visual target image recognition based on fused dotted line memory information according to the invention; (a) is the overall framework, (b) is the recognition-process framework;
FIG. 2 shows some of the modules and offsets used to construct the visually driven grid cell set in an embodiment of the invention; (a) the grid cell sets of 9 modules for one offset; (b)(c) the grid cell sets of 10 offsets for the first and fifth modules respectively;
FIG. 3 shows the stimulation points and foveal images of a training image in an embodiment of the invention; (a), (b), (c) are the original training image, the gray-processed training image, and the integrally processed training image respectively, and (d) demonstrates the stimulation points and foveae of the training image;
FIG. 4 illustrates identification of an image occluded by non-physical noise in an embodiment of the invention; (a)(d)(g)(j) show the incremental change of the stimulus identity cells; (b)(e)(h)(k) show the feature cells used to select the next saccade point; (c)(f)(i) show the change of the current-position fovea and the target-position fovea during image recognition; (l) shows the increment plot of the recognized stimulus cells, the dotted line marking the recognition threshold of 0.9;
FIG. 5 shows different types of recognition experiments in an embodiment of the invention; (a)(b) recognition of a reduced target image; (c)(d)(e) recognition of a noise-occluded target image; (f)(g) recognition of an object-occluded target image.
Detailed Description
The overall framework of the recognition process of the method of the invention is shown in FIG. 1; the method specifically comprises the following steps:
Step one: construct a visually driven grid cell set and perform memory recognition by encoding the motion vectors between features. Each layer of the grid cell map is composed of matrices of the same size (440 × 440 pixels); 9 modules and 100 offsets are adopted, forming 440 × 440 × 100 × 9 four-dimensional grid cells. FIG. 2 shows the grid cell sets of 9 modules for one offset and the grid cell sets of 10 offsets for the first and fifth modules;
Step two: construct a distance cell model and calculate the displacement vectors between the positions encoded by the grid cell population vectors;
Step three: after the grid cell and distance cell models are constructed, train on images. In this embodiment each input training image is 440 × 440 pixels. For each training image (one is taken as an example), 9 stimulation points are selected manually at random (the 9 stimulation points of an image represent one stimulus identity) and stored in the matrix cGC; each stimulation point is expanded into a foveal array of 61 × 61 pixels and stored in the matrix FPC; and the matrix array obtained by computing the response of all sensory neurons (feature detectors) to each foveal pixel k through the Gaussian kernel is stored as output in the matrix SC. Ten training images are selected in this embodiment, so the SC matrix has size (3721, 10 × 9). Meanwhile, the training model must learn the Hebbian associations among several cell types, including the responses of the feature detectors to single-component features, the responses of the feature detectors to the feature tag units, and the bidirectional associations between all feature tag cells and stimulus cells. FIG. 3 shows a training image with 9 manually selected stimulation points and the expanded 61 × 61 foveal images;
Step four: after the training phase the model has learned the necessary associations between cells, and stimuli from the training set are used to recognize memory, i.e. to identify the target. Select the first stimulation point in the target image with a top-down attention mechanism and calculate the current position, i.e. compute the fovea of the current target image with the Gaussian-kernel sensory cells; the feature tag unit with the strongest response becomes the next saccade point, and the corresponding stimulus identity cells are accumulated. Likewise, the next saccade viewpoint is continuously selected by calculation, and at the start and end point of each saccade the foveal displacement vector is updated through the distance cell model. This process of step four is repeated cyclically until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target. FIG. 4 illustrates identification of a noise-occluded image: (a)(d)(g)(j) show the stimulus identity cells; in this example there are 10 classes of training images, each representing one stimulus identity; (b)(e)(h)(k) show the feature cells; 9 stimulation points are manually selected for each class of image, 90 in total, and the feature cell with the strongest response serves as the selection of the next saccade point; (c)(f)(i) show the change of the current-position fovea and the target-position fovea during image recognition; (l) shows the increment plot of the object-recognition stimulus cells, the dotted line marking the recognition threshold of 0.9;
In the first step, the visually driven grid cell set is constructed. The grid cells can be realized by firing rate maps; each firing rate map is 440 × 440 pixels and is calculated by superposing cosine waves whose normal vectors are offset by 60°, using the following system of equations:

z_i = F · b_i · (x − c), i = 0, 1, 2, with b_i = (cos(iπ/3), sin(iπ/3)) the three cosine-wave normal vectors 60° apart;
r_GC = max(0, cos(z_0) + cos(z_1) + cos(z_2))

wherein b_0, b_1 and b_2 are the cosine-wave normal vectors and b_i denotes any of them; c is the vector representation of the current offset; z_i denotes the result computed for each cosine-wave normal vector; and r_GC represents the grid cell layer corresponding to the current offset and the current module. In this embodiment, 9 modules with unchanged orientations are used. F is the spatial frequency of the grid, starting from 0.0028 × 2π, with the scales of successive grids related by the scale factor; the grid patterns of the different cells in a module are offset relative to each other, together covering the entire visual area. For each grid scale, 100 offsets are uniformly sampled along the principal axes of two adjacent equilateral triangles on the grid. Thus, the grid cell set consists of 9 modules, each with 100 offsets. FIG. 2 shows the grid cell sets of 9 modules for one offset and the grid cell sets of the first 10 offsets for the first and fifth modules;
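For illustration, the following minimal Python/NumPy sketch computes one such firing rate map under the reconstructed form z_i = F · b_i · (x − c) with the three cosine-wave normal vectors 60° apart; the linear frequency spacing across modules (0.0014 × 2π to 0.0214 × 2π, taken from the claims) and the zero offset c = (0, 0) are assumptions, not the patented implementation:

import numpy as np

S = 440                                    # firing-rate map size in pixels
n_modules, n_offsets = 9, 100              # M modules, N offsets per module
F = np.linspace(0.0014, 0.0214, n_modules) * 2 * np.pi   # grid spatial frequencies

angles = np.deg2rad([0.0, 60.0, 120.0])    # cosine-wave normals 60 degrees apart
B = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # b_0, b_1, b_2 as rows

ys, xs = np.mgrid[0:S, 0:S]
P = np.stack([xs, ys], axis=-1).astype(float)            # pixel coordinates x

def firing_rate_map(freq, offset):
    """r_GC = max(0, sum_i cos(F * b_i . (x - c))) for one module and one offset."""
    z = freq * np.tensordot(P - offset, B, axes=([-1], [1]))  # shape (S, S, 3)
    return np.maximum(0.0, np.cos(z).sum(axis=-1))

# one layer of the S x S x N x M grid-cell set (offset c = (0, 0))
layer = firing_rate_map(F[0], offset=np.zeros(2))
print(layer.shape)   # (440, 440)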
In step two, a distance cell model is constructed in order to calculate the displacement vector between the positions encoded by the grid cell population vectors. In each module, the grid cells project with the appropriate phases onto cells encoding the respective distances in each of four distance-cell arrays, the arrays corresponding in pairs to two non-collinear axes. The two distance-cell arrays belonging to the same axis project onto two output units: one output unit receives weights that increase monotonically along the distance axis, while the other receives weights that decrease monotonically; for the second output unit the connections increase/decrease in the opposite direction along the axis. The relative difference between the two output neurons encodes the displacement between the starting position and the target position along the given axis. Since the resolution of the image and the grid maps is limited to 440 × 440 pixels, a small tolerance of 1% in pixel coordinates is allowed for the translation vectors resulting from the vector calculation. The vectors shown in FIG. 4 (d)(e)(f) are calculated by this distance cell model;
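The following toy sketch illustrates this distance-cell readout along a single axis under strong simplifying assumptions (one-hot position coding over 440 distance cells and linear weight ramps, both assumed here): the relative difference between the two output units is linear in position, so differencing it between target and start recovers the displacement.

import numpy as np

n_dist = 440                               # distance cells along one axis
w_inc = np.linspace(0.0, 1.0, n_dist)      # monotonically increasing weights
w_dec = w_inc[::-1]                        # monotonically decreasing weights

def axis_code(pos):
    """One-hot distance-cell activity for a (0-based) position along the axis."""
    a = np.zeros(n_dist)
    a[pos] = 1.0
    return a

def readout(activity):
    """Relative difference between the two output units."""
    return activity @ w_inc - activity @ w_dec

start, target = 120, 305
gain = 2.0 / (n_dist - 1)                  # slope of the readout per pixel
displacement = (readout(axis_code(target)) - readout(axis_code(start))) / gain
print(round(displacement))                 # 185 = target - start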
In the third step, the input training image M1 is 440 × 440 pixels, as shown in FIG. 3 (a). M1 is first gray-adjusted to obtain image M2, as shown in FIG. 3 (b), and M2 is then integrally processed to obtain M3, as shown in FIG. 3 (c), which achieves the effect of blurring the image. The stimulation points of the training image are selected manually, preferring stimulation areas that show strong gradients under the illumination conditions; for facial stimulation points, for example, the selected positions include the eye corners, the nose tip, or the corners of the mouth. Since each stimulation point is to be expanded into a 61 × 61 foveal array, the stimulation points (x, y) are selected such that 31 ≤ x, y ≤ 410. The fovea selection is based on the blurred image of FIG. 3 (c), so that the foveal image carries the preferred gray values; the resulting foveal array is stored in the matrix cGC and used later to calculate the responses of the sensory neurons.
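A minimal sketch of this preprocessing and fovea extraction follows, assuming a standard luminance conversion for the gray adjustment and a box blur standing in for the integral processing (the actual operators are not specified in the text); stimulation-point coordinates are treated as 1-based to honour the 31 ≤ x, y ≤ 410 constraint:

import numpy as np

def to_gray(img_rgb):
    """M1 -> M2: luminance-weighted gray conversion (weights assumed)."""
    return img_rgb @ np.array([0.299, 0.587, 0.114])

def box_blur(img, k=9):
    """M2 -> M3: blur via a k x k box filter; the kernel size is an assumption."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def fovea(img, x, y, s=61):
    """Extract the s x s foveal array around a 1-based stimulation point (x, y)."""
    assert 31 <= x <= 410 and 31 <= y <= 410, "stimulation point out of range"
    h = s // 2
    r, c = y - 1, x - 1                    # convert to 0-based indices
    return img[r - h:r + h + 1, c - h:c + h + 1]

M1 = np.random.rand(440, 440, 3)           # stand-in for a 440 x 440 training image
M3 = box_blur(to_gray(M1))
print(fovea(M3, x=200, y=180).shape)       # (61, 61)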
In the third step, the response of all sensory neurons (feature detectors) to each foveal pixel k is calculated by a Gaussian kernel of the form

SC = exp(−(x − ref)² / (2σ²))

wherein x represents each 61 × 61 foveal array reshaped into a 3721 × 1 array and copied 256 times, ref represents an array running from 0 to 255 copied 3721 times, and σ is the tuning width of the detectors. After the computation of all foveae by all sensory neurons is completed, the result is stored in the matrix SC_1 and normalized before it is used for the identification of the target. Meanwhile, in the training stage each stimulation point is selected with a small random fluctuation, computed as (1 − rand/10) and stored in PRC_2.
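The following minimal sketch mirrors this tiled computation, with each of the 3721 foveal pixels compared against the 256 reference gray values; the tuning width σ is not given in the text, so the value below is an assumption:

import numpy as np

SIGMA = 8.0                                # assumed tuning width, in gray levels

def sensory_response(fovea_patch):
    """exp(-(x - ref)^2 / (2 sigma^2)) for all pixel/reference pairs."""
    x = np.tile(fovea_patch.reshape(3721, 1), (1, 256))   # fovea copied 256 times
    ref = np.tile(np.arange(256.0), (3721, 1))            # 0..255 copied 3721 times
    return np.exp(-(x - ref) ** 2 / (2 * SIGMA ** 2))     # shape (3721, 256)

patch = np.random.randint(0, 256, size=(61, 61)).astype(float)
SC = sensory_response(patch)
print(SC.shape)                            # (3721, 256)

NumPy broadcasting would give the same result without the explicit tiling; the tiled form is kept only to mirror the description above.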
In the fourth step, calculating the current feature unit means calculating the values of the stimulus identity cells PRC and the feature tag cells PLC respectively. First, stimulation points in the training set are selected at random and expanded into 61 × 61 foveal arrays O, such as the 9 foveal images indicated by arrows in FIG. 3 (d), and the neural response of the Gaussian kernel to the foveal array of the target image is calculated with the same kernel as above, SC = exp(−(O − ref)² / (2σ²)). Then SC_1 of the training set is multiplied with the SC array of the current position: FC = SC_1 × SC. If max(FC) reaches the preset matching threshold, PLC = FC/max(FC) is calculated, the values in the PLC array smaller than the preset cut-off are set to 0, and the value of the current PLC is recorded; if not, the current feature cell is reselected. The size of PLC is (10 × 9, 1). PRC is calculated as PRC = PRC + PRC_2 × PLC; the calculation of PRC is a superposition process, and recognition is considered successful once the computed PRC reaches the threshold of 0.9, the size of PRC being (10, 1). The recognition threshold can be adjusted according to the number and reliability of the input stimulation points; a lower recognition threshold allows faster recognition but may reduce accuracy, so 0.9 is an appropriate choice.
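A minimal sketch of one such matching step follows, with the assumptions flagged: the trained responses SC_1 are collapsed to one unit-norm column per stimulation point, FC = SC_1ᵀ · sc is read as a per-point match score, the PRC_2 weighting is applied on the label-to-identity connections, and theta and cutoff stand in for the two preset thresholds whose values are not recoverable from the text:

import numpy as np

rng = np.random.default_rng(0)
n_classes, n_points, n_feat = 10, 9, 3721
labels = np.repeat(np.arange(n_classes), n_points)      # identity of each point

SC_1 = rng.random((n_feat, n_classes * n_points))       # stand-in trained responses
SC_1 /= np.linalg.norm(SC_1, axis=0, keepdims=True)     # normalized, one column/point
PRC_2 = 1 - rng.random(n_classes * n_points) / 10       # stored (1 - rand/10) weights

def recognition_step(sc, PRC, theta=0.5, cutoff=0.95):
    """One PLC/PRC update; theta and cutoff are assumed threshold values."""
    FC = SC_1.T @ sc                       # match current fovea against all points
    if FC.max() < theta:                   # insufficient match: reselect elsewhere
        return None, PRC
    PLC = FC / FC.max()                    # feature-tag activity, size (90,)
    PLC = np.where(PLC < cutoff, 0.0, PLC) # suppress weak feature-tag units
    for c in range(n_classes):             # PRC = PRC + PRC_2 x PLC, per identity
        PRC[c] += np.sum(PRC_2[labels == c] * PLC[labels == c])
    return PLC, PRC

PRC = np.zeros(n_classes)
sc = SC_1[:, 42].copy()                    # pretend the current fovea matches point 42
PLC, PRC = recognition_step(sc, PRC)
print(PRC.argmax(), PRC.max() >= 0.9)      # 4 True: point 42 belongs to class 4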
In the fourth step, to calculate the next feature tag unit in the target image, a weak-noise parameter w_n is first set, and PLC = w_n × PRC_2 × PRC is calculated. The index corresponding to the maximum value in PLC is the next selected feature tag unit: the feature tag unit with the strongest response is chosen by this calculation as the next saccade point, i.e. the target feature tag unit. The vector distance between the starting feature tag unit and the target feature tag unit is calculated and marked on the target image; if the pixel difference between the target pixel point obtained by the vector calculation and the stimulation point in the training set is within 1%, the target feature unit becomes the starting point of the next saccade; otherwise, the target feature unit is reselected. As shown in FIG. 5 (e)(f)(g), when the saccade viewpoint is not within the 1% pixel difference, recognition is restarted. Step four is repeated until PRC(i) ≥ 0.9, at which point recognition is considered successful. As shown in FIG. 4 (l), the solid black line plots the increment of the current target image's PRC, with the corresponding stimulus identity cell PRC(5) ≥ 0.9 as shown in FIG. 4 (j); the identified target image therefore corresponds to the fifth class of images.
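A minimal sketch of the saccade selection and tolerance check follows, with the assumptions flagged: the drive PLC = w_n × PRC_2 × PRC is broadcast to each feature tag unit through its identity, the weak-noise form is assumed, the landing point is simulated rather than computed from the distance cells, and the 1% tolerance is taken relative to the 440-pixel image (about 4.4 pixels):

import numpy as np

rng = np.random.default_rng(1)
n_classes, n_points = 10, 9
labels = np.repeat(np.arange(n_classes), n_points)
PRC_2 = 1 - rng.random(n_classes * n_points) / 10       # stored training weights
stim_xy = rng.integers(31, 411, size=(n_classes * n_points, 2))  # stored points

def next_saccade(PRC, w_n=0.01):
    """Feature-tag unit with the strongest drive PLC = w_n x PRC_2 x PRC."""
    PLC = w_n * PRC_2 * PRC[labels]        # top-down drive per feature-tag unit
    PLC = PLC + 1e-6 * rng.random(PLC.size)   # weak noise to break ties (assumed)
    return int(PLC.argmax())

def saccade_ok(landing_xy, unit, tol=0.01 * 440):
    """Accept the saccade if the landing point is within the 1% tolerance."""
    return bool(np.all(np.abs(landing_xy - stim_xy[unit]) <= tol))

PRC = np.zeros(n_classes); PRC[4] = 0.6    # accumulated evidence favours class 4
unit = next_saccade(PRC)
print(labels[unit])                        # 4: the next saccade targets a class-4 point
landing = stim_xy[unit] + rng.integers(-3, 4, size=2)   # simulated landing point
print(saccade_ok(landing, unit))           # True: within the 1% pixel tolerance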

Claims (6)

1. A bionic visual image target recognition method fused with dotted line memory information is characterized by comprising the following steps:
1) Constructing a visually driven grid cell set, wherein each layer's grid cell map of the set is composed of matrices of the same size, and M modules and N offsets are adopted to form four-dimensional grid cells; the grid cell map size of each layer is the image size S × S, so the four-dimensional grid cells have size S × S × N × M;
2) Constructing a distance cell model, and calculating displacement vectors between positions coded by grid cell population vectors;
3) For each training image, manually and randomly selecting 5-13 stimulation points and storing them in a matrix cGC; recording, for each stimulation point with coordinates (x, y), the pixel value at coordinates (x, y) in each layer of the grid cells into the matrix tGC; expanding each stimulation point into a foveal array of s × s pixels and storing it in a matrix FPC, wherein S = m × s, m ∈ (7.2, 7.5); and calculating the response of all sensory neurons to each foveal pixel k through a Gaussian kernel and storing the resulting matrix array in a matrix SC;
4) Selecting the first stimulation point in the target image with a top-down attention mechanism; computing the fovea of the current target image with the Gaussian-kernel sensory cells; taking the feature tag unit with the strongest response as the next saccade point and accumulating the corresponding stimulus identity cells; and, for the start and end point of each saccade, updating the foveal displacement vector through the distance cell model to judge the pixel difference between the saccade viewpoint and the stimulation point and ensure that it remains within 1%, otherwise resetting the starting point, until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target.
2. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 1, wherein in step 1) the visually driven grid cell set is constructed through firing rate maps, the grid cell firing rate map being calculated by the following formulas:
r_GC = max(0, cos(z_0) + cos(z_1) + cos(z_2));
z_i = F · b_i · (x − c), i = 0, 1, 2, with b_i = (cos(iπ/3), sin(iπ/3)) the three cosine-wave normal vectors 60° apart;
wherein b_0, b_1 and b_2 are the cosine-wave normal vectors and b_i denotes any of them; F is an array of M grid spatial frequencies from 0.0014 × 2π to 0.0214 × 2π; c is the vector representation of the current offset; z_i denotes the result computed for each cosine-wave normal vector; r_GC represents the firing rate map of the grid cell layer corresponding to the current offset and the current module; and for each grid scale, the N offsets are uniformly sampled along the principal axes of two adjacent equilateral triangles on the grid.
3. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 1, wherein the specific implementation process of step 3) comprises: gray-adjusting a training image M1 to obtain an image M2; integrally processing M2 to obtain M3; selecting the stimulation points of the training image such that each stimulation point (x, y) satisfies 31 ≤ x, y ≤ 410; selecting the fovea based on image M3 to obtain the foveal array, which is stored in the matrix cGC; and calculating the response of all sensory neurons to each foveal pixel k through the Gaussian kernel and storing the resulting matrix array in the matrix SC.
4. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 3, wherein the Gaussian kernel takes the form SC = exp(−(x − ref)² / (2σ²)), wherein x represents the array obtained by reshaping each 61 × 61 foveal array into a 3721 × 1 array and copying it 256 times, ref represents an array running from 0 to 255 copied 3721 times, and σ is the tuning width of the sensory neurons.
5. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 1, wherein the specific implementation process of step 4) comprises: respectively calculating the values of the stimulus identity cells (PRC) and the feature tag cells (PLC); randomly selecting stimulation points in the training set and expanding them into 61 × 61 foveal arrays O; calculating the neural response of the Gaussian kernel to the foveal array of the target image; multiplying SC_1 of the training set with the SC array of the current position: FC = SC_1 × SC; if max(FC) reaches the preset matching threshold, calculating PLC = FC/max(FC), setting the values in the PLC array smaller than the preset cut-off to 0 and recording the value of the current PLC, and otherwise reselecting the current feature tag unit, the size of PLC being (n × 9, 1); and calculating PRC by the formula PRC = PRC + PRC_2 × PLC.
6. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 5, wherein the specific implementation process of step 5) comprises: setting a weak-noise parameter w_n and calculating PLC = w_n × PRC_2 × PRC; taking the index corresponding to the maximum value in PLC as the next selected feature tag unit, i.e. choosing the feature tag unit with the strongest response as the next saccade point, the target feature tag unit; calculating the vector distance between the starting feature tag unit and the target feature tag unit and marking it on the target image; if the pixel difference between the target pixel point obtained through the vector calculation and the stimulation point in the training set is within 1%, taking the target feature unit as the starting point of the next saccade, and otherwise reselecting the target feature unit; and considering recognition successful once PRC(i) ≥ 0.9.
CN201910699779.3A 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information Active CN110598534B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910699779.3A CN110598534B (en) 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910699779.3A CN110598534B (en) 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information

Publications (2)

Publication Number Publication Date
CN110598534A (en) 2019-12-20
CN110598534B (en) 2022-12-09

Family

ID=68853261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910699779.3A Active CN110598534B (en) 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information

Country Status (1)

Country Link
CN (1) CN110598534B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598110B (en) * 2020-05-11 2023-04-28 重庆大学 HOG algorithm image recognition method based on grid cell memory

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1991016686A1 (en) * 1990-04-26 1991-10-31 John Sutherland Artificial neural device
CN104794732A (en) * 2015-05-12 2015-07-22 西安电子科技大学 Artificial immune network clustering based grayscale image segmentation method
CN109886384A (en) * 2019-02-15 2019-06-14 北京工业大学 A kind of bionic navigation method based on the reconstruct of mouse cerebral hippocampal gitter cell

Also Published As

Publication number Publication date
CN110598534A (en) 2019-12-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant