CN110598534B - Bionic visual image target identification method fusing dotted line memory information

Info

Publication number
CN110598534B
CN110598534B · CN201910699779.3A · CN201910699779A
Authority
CN
China
Prior art keywords
target
image
stimulation
array
point
Prior art date
Legal status: Active
Application number
CN201910699779.3A
Other languages
Chinese (zh)
Other versions
CN110598534A (en)
Inventor
余伶俐
金鸣岳
周开军
Current Assignee
Central South University
Original Assignee
Central South University
Priority date
Filing date
Publication date
Application filed by Central South University
Priority to CN201910699779.3A
Publication of CN110598534A
Application granted
Publication of CN110598534B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a bionic visual image target recognition method fusing dotted line memory information. The method comprises: constructing a visually driven grid cell set, constructing a distance cell model, and calculating the displacement vectors between the positions encoded by the grid cell population vectors; calculating the response of all sensory neurons to each foveal pixel k through a Gaussian kernel for target recognition; computing the fovea of the current target image with the Gaussian-kernel sensory cells, taking the feature tag unit with the strongest response as the next saccade point, and accumulating the corresponding stimulus identity cells; selecting the next saccade viewpoint and updating the foveal displacement vector through the distance cell model; and cyclically repeating the current-position calculation, next-saccade selection, and vector calculation during target recognition until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target. The invention achieves a high recognition rate for images subject to position change, scaling, and occlusion.

Description

Bionic visual image target identification method fusing dotted line memory information
Technical Field
The invention relates to the fields of visual perception, memory recognition, and biological information, and in particular to a method for recognizing scaled and occluded targets based on grid cells.
Background
Recognition of scaled and occluded targets is a hot problem in the current vision field: when a target image is scaled or occluded by an obstacle, traditional machine learning methods struggle to recognize it. Traditional target recognition models emphasize massively parallel processing of low-level features, with high-order representation arising only in a post-processing stage. Visual perception, however, is closely coupled with eye movement: human visual perception and eyeball movement rapidly and continuously scan the current target image, and no matter how the target image is scaled or occluded, the human visual system can effectively and correctly recognize it (the target images in this patent mainly refer to faces and traffic signs).
At present, target identification methods mainly include the following three. The first, following the visual perception process, proposes bipolar filters in the horizontal and vertical directions fused with a Gabor filter to realize directional edge detection, and designs a spatial-interval detection operator to objectively describe line spatial frequency and response intensity; it handles the detection of scaled and translated target images with strong linearity well, but cannot identify occluded target images. The second is two-dimensional principal component analysis (2DPCA), a principal component analysis method proposed on the basis of PCA; its main process converts the 2D face image matrix into a one-dimensional vector and obtains the eigenvalues and corresponding orthonormal eigenvectors of the covariance matrix derived from the mean image; it is mainly applied to face recognition and cannot accurately recognize blurred or occluded feature regions. The third is the scale-invariant feature transform (SIFT), which extracts distinctive invariant features from the target image by searching for extreme points in scale space and extracting their position, scale, and rotation invariants, realizing reliable matching of objects or scenes across viewing angles; it achieves target recognition through feature-point matching but cannot accurately identify occluded target images.
Achieving recognition of scaled and occluded targets at the neural level is a complex problem for the brain. LeCun, in Nature, proposed an unsupervised learning system that classifies visual stimuli into different classes and is used to identify specific familiar stimuli within those classes. These methods of identifying stimuli typically focus on parallel processing of low-level visual features. Since Yarbus' pioneering research on eye movements, it has been understood that perception also depends on motor behavior, for example the perception hypothesis Friston proposed in 2012 based on saccadic sampling: the currently foveated feature of the stimulus is processed, and the saccade position of the stimulus' next feature is computed according to the hypothesized stimulus identity; a series of saccades can then be viewed as a complex trajectory on a two-dimensional plane, analogous to spatial navigation. In 2005 the Mosers discovered grid cells in the entorhinal cortex of the brain, recording the rule that specific nerve cells are activated as a rat moves, with firing fields arranged hexagonally. These studies indicate that entorhinal cortical cells can exhibit grid-cell-like neural responses in visual space; grid cells support path integration and spatial vector navigation. According to the perception mechanism of grid cells and vector navigation, when a visual target image appears, visually driven grid cells vector-encode the salient stimulus features in visual space to drive the saccades of visual recognition, and the known properties of grid cells can endow a model with size and position invariance. Recognizing occluded target images by combining grid cells, recognition memory, and vector navigation is the problem this invention sets out to solve.
Disclosure of Invention
The invention aims to overcome the insufficiency of the prior art: it provides a bionic visual image target identification method fusing dotted line memory information, solves the problem of the low recognition rate for occluded targets of traditional methods, and improves the recognition rate for occluded target images by combining grid cells, recognition memory, and vector navigation.
To solve the above technical problems, the technical solution adopted by the invention is as follows: a bionic visual image target recognition method fusing dotted line memory information, comprising the following steps:
1) Constructing a visually driven grid cell set, wherein each layer's grid cell map of the set is composed of matrices of the same size, and M modules and N offsets are adopted to form four-dimensional grid cells; the grid cell map size of each layer is the image size S × S, so the four-dimensional grid cells have size S × S × N × M;
2) Constructing a distance cell model, and calculating displacement vectors between positions coded by grid cell population vectors;
3) For each training image, manually and randomly selecting 5-13 stimulation points and storing them in a matrix cGC; recording, for each stimulation point with coordinates (x, y), the pixel value at coordinates (x, y) in each layer of the grid cells into the matrix tGC; expanding each stimulation point into a foveal array of s × s pixels and storing it in a matrix FPC, wherein S = m × s, m ∈ (7.2, 7.5); and calculating the response of all sensory neurons to each foveal pixel k through a Gaussian kernel and storing the resulting matrix array in a matrix SC;
4) Selecting the first stimulation point in the target image with a top-down attention mechanism; computing the fovea of the current target image with the Gaussian-kernel sensory cells; taking the feature tag unit with the strongest response as the next saccade point and accumulating the corresponding stimulus identity cells; and, for the start and end point of each saccade, updating the foveal displacement vector through the distance cell model so as to judge the pixel difference between the saccade point and the stimulation point and ensure that it remains within 1%, otherwise resetting the starting point; this continues until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target.
In step 1), the visually driven grid cell set is constructed through firing rate maps, the grid cell firing rate map being calculated by the following formulas:

r_GC = max(0, cos(z_0) + cos(z_1) + cos(z_2));
z_i = F · b_i · (x − c), i = 0, 1, 2, with b_i = (cos(iπ/3), sin(iπ/3)) the three cosine-wave normal vectors 60° apart;

wherein b_0, b_1 and b_2 are the cosine-wave normal vectors and b_i denotes any of them; F is an array of M grid spatial frequencies from 0.0014 × 2π to 0.0214 × 2π; c is the vector representation of the current offset; z_i denotes the result computed for each cosine-wave normal vector; and r_GC represents the firing rate map of the grid cell layer corresponding to the current offset and the current module. For each grid scale, the N offsets are uniformly sampled along the principal axes of two adjacent equilateral triangles on the grid.
The specific implementation process of step 3) comprises: gray-adjusting a training image M1 to obtain an image M2; integrally processing M2 to obtain M3; selecting the stimulation points of the training image such that each stimulation point (x, y) satisfies 31 ≤ x, y ≤ 410; selecting the fovea based on image M3 to obtain the foveal array, which is stored in the matrix cGC; and calculating the response of all sensory neurons to each foveal pixel k through the Gaussian kernel and storing the resulting matrix array in the matrix SC.
The Gaussian kernel takes the form

SC = exp(−(x − ref)² / (2σ²))

wherein x represents the array obtained by reshaping each 61 × 61 foveal array into a 3721 × 1 array and copying it 256 times, ref represents an array running from 0 to 255 copied 3721 times, and σ is the tuning width of the sensory neurons.
The specific implementation process of step 4) comprises: respectively calculating the values of the stimulus identity cells (PRC) and the feature tag cells (PLC); randomly selecting stimulation points in the training set and expanding them into 61 × 61 foveal arrays O; calculating the neural response of the Gaussian kernel to the foveal array of the target image; and multiplying SC_1 of the training set with the SC array of the current position: FC = SC_1 × SC. If max(FC) reaches the preset matching threshold, then PLC = FC/max(FC) is calculated, the values in the PLC array smaller than the preset cut-off are set to 0, and the value of the current PLC is recorded; if not, the current feature tag unit is reselected. The size of PLC is (n × 9, 1). The calculation formula for PRC is: PRC = PRC + PRC_2 × PLC.
The specific implementation process of step 5) comprises: setting a weak-noise parameter w_n and calculating PLC = w_n × PRC_2 × PRC; the index corresponding to the maximum value in PLC is the next selected feature tag unit, i.e. the feature tag unit with the strongest response is chosen by this calculation as the next saccade point, the target feature tag unit; the vector distance between the starting feature tag unit and the target feature tag unit is calculated and marked on the target image; if the pixel difference between the target pixel point obtained through the vector calculation and the stimulation point in the training set is within 1%, the target feature unit is taken as the starting point of the next saccade; otherwise, the target feature unit is reselected; recognition is considered successful once PRC(i) ≥ 0.9.
Compared with the prior art, the beneficial effects of the invention are:
1) The constructed visually driven grid cell set supports recognition memory through the movement vectors between encoded features and captures the distribution of stimulation points in a stimulus-specific coordinate system; it also allows exploring visual recognition and the memory-guided attention to the position information of the target object; and the properties of grid cells provide size and position invariance while helping to handle occlusion;
2) Foveal processing of the image simulates human vision, which comprises central vision and peripheral vision: central vision provides accurate and detailed visual content, while peripheral vision covers the wide-angle field of view. The difference in visual acuity between the two motivates foveated rendering, which tracks the human eye and renders the peripheral picture at low image quality to highlight what central vision fixates, coming closer to the real picture seen by human eyes;
3) The selection of saccades and saccade points during recognition is based on the movement behavior of the eye. Fast object-driven eye movement is the basis for complex pattern recognition: the part of the stimulus currently involved in recognition is used to calculate the next saccade point, where the next feature of the stimulus should be located according to the hypothesized stimulus identity. If the saccade point deviates too far from the stimulation point, the next saccade point must be recalculated, because stimulation points are chosen at the parts that present important features. For example, if attention currently rests on the nose of a familiar face and the face is to be identified, the eye and mouth positions must be confirmed by saccades; if a saccade lands on a point that is not a stimulation point in the training set, the face will not be recognized correctly. The invention achieves a high recognition rate for images subject to position change, scaling, and occlusion.
Drawings
FIG. 1 is a framework diagram of bionic visual target image recognition based on fused dotted line memory information according to the invention; (a) is the overall framework, (b) is the recognition-process framework;
FIG. 2 shows some of the modules and offsets used to construct the visually driven grid cell set in an embodiment of the invention; (a) the grid cell sets of 9 modules for one offset; (b)(c) the grid cell sets of 10 offsets for the first and fifth modules respectively;
FIG. 3 shows the stimulation points and foveal images of a training image in an embodiment of the invention; (a), (b), (c) are the original training image, the gray-processed training image, and the integrally processed training image respectively, and (d) demonstrates the stimulation points and foveae of the training image;
FIG. 4 illustrates identification of an image occluded by non-physical noise in an embodiment of the invention; (a)(d)(g)(j) show the incremental change of the stimulus identity cells; (b)(e)(h)(k) show the feature cells used to select the next saccade point; (c)(f)(i) show the change of the current-position fovea and the target-position fovea during image recognition; (l) shows the increment plot of the recognized stimulus cells, the dotted line marking the recognition threshold of 0.9;
FIG. 5 shows different types of recognition experiments in an embodiment of the invention; (a)(b) recognition of a reduced target image; (c)(d)(e) recognition of a noise-occluded target image; (f)(g) recognition of an object-occluded target image.
Detailed Description
The overall framework of the recognition process of the method of the invention is shown in FIG. 1; the method specifically comprises the following steps:
Step one: construct a visually driven grid cell set and perform memory recognition by encoding the motion vectors between features. Each layer of the grid cell map is composed of matrices of the same size (440 × 440 pixels); 9 modules and 100 offsets are adopted, forming 440 × 440 × 100 × 9 four-dimensional grid cells. FIG. 2 shows the grid cell sets of 9 modules for one offset and the grid cell sets of 10 offsets for the first and fifth modules;
Step two: construct a distance cell model and calculate the displacement vectors between the positions encoded by the grid cell population vectors;
Step three: after the grid cell and distance cell models are constructed, train on images. In this embodiment each input training image is 440 × 440 pixels. For each training image (one is taken as an example), 9 stimulation points are selected manually at random (the 9 stimulation points of an image represent one stimulus identity) and stored in the matrix cGC; each stimulation point is expanded into a foveal array of 61 × 61 pixels and stored in the matrix FPC; and the matrix array obtained by computing the response of all sensory neurons (feature detectors) to each foveal pixel k through the Gaussian kernel is stored as output in the matrix SC. Ten training images are selected in this embodiment, so the SC matrix has size (3721, 10 × 9). Meanwhile, the training model must learn the Hebbian associations among several cell types, including the responses of the feature detectors to single-component features, the responses of the feature detectors to the feature tag units, and the bidirectional associations between all feature tag cells and stimulus cells. FIG. 3 shows a training image with 9 manually selected stimulation points and the expanded 61 × 61 foveal images;
Step four: after the training phase the model has learned the necessary associations between cells, and stimuli from the training set are used to recognize memory, i.e. to identify the target. Select the first stimulation point in the target image with a top-down attention mechanism and calculate the current position, i.e. compute the fovea of the current target image with the Gaussian-kernel sensory cells; the feature tag unit with the strongest response becomes the next saccade point, and the corresponding stimulus identity cells are accumulated. Likewise, the next saccade viewpoint is continuously selected by calculation, and at the start and end point of each saccade the foveal displacement vector is updated through the distance cell model. This process of step four is repeated cyclically until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target. FIG. 4 illustrates identification of a noise-occluded image: (a)(d)(g)(j) show the stimulus identity cells; in this example there are 10 classes of training images, each representing one stimulus identity; (b)(e)(h)(k) show the feature cells; 9 stimulation points are manually selected for each class of image, 90 in total, and the feature cell with the strongest response serves as the selection of the next saccade point; (c)(f)(i) show the change of the current-position fovea and the target-position fovea during image recognition; (l) shows the increment plot of the object-recognition stimulus cells, the dotted line marking the recognition threshold of 0.9;
In the first step, the visually driven grid cell set is constructed. The grid cells can be realized by firing rate maps; each firing rate map is 440 × 440 pixels and is calculated by superposing cosine waves whose normal vectors are offset by 60°, using the following system of equations:

z_i = F · b_i · (x − c), i = 0, 1, 2, with b_i = (cos(iπ/3), sin(iπ/3)) the three cosine-wave normal vectors 60° apart;
r_GC = max(0, cos(z_0) + cos(z_1) + cos(z_2))

wherein b_0, b_1 and b_2 are the cosine-wave normal vectors and b_i denotes any of them; c is the vector representation of the current offset; z_i denotes the result computed for each cosine-wave normal vector; and r_GC represents the grid cell layer corresponding to the current offset and the current module. In this embodiment, 9 modules with unchanged orientations are used. F is the spatial frequency of the grid, starting from 0.0028 × 2π, with the scales of successive grids related by the scale factor; the grid patterns of the different cells in a module are offset relative to each other, together covering the entire visual area. For each grid scale, 100 offsets are uniformly sampled along the principal axes of two adjacent equilateral triangles on the grid. Thus, the grid cell set consists of 9 modules, each with 100 offsets. FIG. 2 shows the grid cell sets of 9 modules for one offset and the grid cell sets of the first 10 offsets for the first and fifth modules;
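For illustration, the following minimal Python/NumPy sketch computes one such firing rate map under the reconstructed form z_i = F · b_i · (x − c) with the three cosine-wave normal vectors 60° apart; the linear frequency spacing across modules (0.0014 × 2π to 0.0214 × 2π, taken from the claims) and the zero offset c = (0, 0) are assumptions, not the patented implementation:

import numpy as np

S = 440                                    # firing-rate map size in pixels
n_modules, n_offsets = 9, 100              # M modules, N offsets per module
F = np.linspace(0.0014, 0.0214, n_modules) * 2 * np.pi   # grid spatial frequencies

angles = np.deg2rad([0.0, 60.0, 120.0])    # cosine-wave normals 60 degrees apart
B = np.stack([np.cos(angles), np.sin(angles)], axis=1)   # b_0, b_1, b_2 as rows

ys, xs = np.mgrid[0:S, 0:S]
P = np.stack([xs, ys], axis=-1).astype(float)            # pixel coordinates x

def firing_rate_map(freq, offset):
    """r_GC = max(0, sum_i cos(F * b_i . (x - c))) for one module and one offset."""
    z = freq * np.tensordot(P - offset, B, axes=([-1], [1]))  # shape (S, S, 3)
    return np.maximum(0.0, np.cos(z).sum(axis=-1))

# one layer of the S x S x N x M grid-cell set (offset c = (0, 0))
layer = firing_rate_map(F[0], offset=np.zeros(2))
print(layer.shape)   # (440, 440)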
In step two, a distance cell model is constructed in order to calculate the displacement vector between the positions encoded by the grid cell population vectors. In each module, the grid cells project with the appropriate phases onto cells encoding the respective distances in each of four distance-cell arrays, the arrays corresponding in pairs to two non-collinear axes. The two distance-cell arrays belonging to the same axis project onto two output units: one output unit receives weights that increase monotonically along the distance axis, while the other receives weights that decrease monotonically; for the second output unit the connections increase/decrease in the opposite direction along the axis. The relative difference between the two output neurons encodes the displacement between the starting position and the target position along the given axis. Since the resolution of the image and the grid maps is limited to 440 × 440 pixels, a small tolerance of 1% in pixel coordinates is allowed for the translation vectors resulting from the vector calculation. The vectors shown in FIG. 4 (d)(e)(f) are calculated by this distance cell model;
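The following toy sketch illustrates this distance-cell readout along a single axis under strong simplifying assumptions (one-hot position coding over 440 distance cells and linear weight ramps, both assumed here): the relative difference between the two output units is linear in position, so differencing it between target and start recovers the displacement.

import numpy as np

n_dist = 440                               # distance cells along one axis
w_inc = np.linspace(0.0, 1.0, n_dist)      # monotonically increasing weights
w_dec = w_inc[::-1]                        # monotonically decreasing weights

def axis_code(pos):
    """One-hot distance-cell activity for a (0-based) position along the axis."""
    a = np.zeros(n_dist)
    a[pos] = 1.0
    return a

def readout(activity):
    """Relative difference between the two output units."""
    return activity @ w_inc - activity @ w_dec

start, target = 120, 305
gain = 2.0 / (n_dist - 1)                  # slope of the readout per pixel
displacement = (readout(axis_code(target)) - readout(axis_code(start))) / gain
print(round(displacement))                 # 185 = target - start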
In the third step, the input training image M1 is 440 × 440 pixels, as shown in FIG. 3 (a). M1 is first gray-adjusted to obtain image M2, as shown in FIG. 3 (b), and M2 is then integrally processed to obtain M3, as shown in FIG. 3 (c), which achieves the effect of blurring the image. The stimulation points of the training image are selected manually, preferring stimulation areas that show strong gradients under the illumination conditions; for facial stimulation points, for example, the selected positions include the eye corners, the nose tip, or the corners of the mouth. Since each stimulation point is to be expanded into a 61 × 61 foveal array, the stimulation points (x, y) are selected such that 31 ≤ x, y ≤ 410. The fovea selection is based on the blurred image of FIG. 3 (c), so that the foveal image carries the preferred gray values; the resulting foveal array is stored in the matrix cGC and used later to calculate the responses of the sensory neurons.
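A minimal sketch of this preprocessing and fovea extraction follows, assuming a standard luminance conversion for the gray adjustment and a box blur standing in for the integral processing (the actual operators are not specified in the text); stimulation-point coordinates are treated as 1-based to honour the 31 ≤ x, y ≤ 410 constraint:

import numpy as np

def to_gray(img_rgb):
    """M1 -> M2: luminance-weighted gray conversion (weights assumed)."""
    return img_rgb @ np.array([0.299, 0.587, 0.114])

def box_blur(img, k=9):
    """M2 -> M3: blur via a k x k box filter; the kernel size is an assumption."""
    pad = k // 2
    p = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    for dy in range(k):
        for dx in range(k):
            out += p[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def fovea(img, x, y, s=61):
    """Extract the s x s foveal array around a 1-based stimulation point (x, y)."""
    assert 31 <= x <= 410 and 31 <= y <= 410, "stimulation point out of range"
    h = s // 2
    r, c = y - 1, x - 1                    # convert to 0-based indices
    return img[r - h:r + h + 1, c - h:c + h + 1]

M1 = np.random.rand(440, 440, 3)           # stand-in for a 440 x 440 training image
M3 = box_blur(to_gray(M1))
print(fovea(M3, x=200, y=180).shape)       # (61, 61)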
In the third step, the response of all sensory neurons (feature detectors) to each foveal pixel k is calculated by a Gaussian kernel of the form

SC = exp(−(x − ref)² / (2σ²))

wherein x represents each 61 × 61 foveal array reshaped into a 3721 × 1 array and copied 256 times, ref represents an array running from 0 to 255 copied 3721 times, and σ is the tuning width of the detectors. After the computation of all foveae by all sensory neurons is completed, the result is stored in the matrix SC_1 and normalized before it is used for the identification of the target. Meanwhile, in the training stage each stimulation point is selected with a small random fluctuation, computed as (1 − rand/10) and stored in PRC_2.
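The following minimal sketch mirrors this tiled computation, with each of the 3721 foveal pixels compared against the 256 reference gray values; the tuning width σ is not given in the text, so the value below is an assumption:

import numpy as np

SIGMA = 8.0                                # assumed tuning width, in gray levels

def sensory_response(fovea_patch):
    """exp(-(x - ref)^2 / (2 sigma^2)) for all pixel/reference pairs."""
    x = np.tile(fovea_patch.reshape(3721, 1), (1, 256))   # fovea copied 256 times
    ref = np.tile(np.arange(256.0), (3721, 1))            # 0..255 copied 3721 times
    return np.exp(-(x - ref) ** 2 / (2 * SIGMA ** 2))     # shape (3721, 256)

patch = np.random.randint(0, 256, size=(61, 61)).astype(float)
SC = sensory_response(patch)
print(SC.shape)                            # (3721, 256)

NumPy broadcasting would give the same result without the explicit tiling; the tiled form is kept only to mirror the description above.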
In the fourth step, calculating the current feature unit means calculating the values of the stimulus identity cells PRC and the feature tag cells PLC respectively. First, stimulation points in the training set are selected at random and expanded into 61 × 61 foveal arrays O, such as the 9 foveal images indicated by arrows in FIG. 3 (d), and the neural response of the Gaussian kernel to the foveal array of the target image is calculated with the same kernel as above, SC = exp(−(O − ref)² / (2σ²)). Then SC_1 of the training set is multiplied with the SC array of the current position: FC = SC_1 × SC. If max(FC) reaches the preset matching threshold, PLC = FC/max(FC) is calculated, the values in the PLC array smaller than the preset cut-off are set to 0, and the value of the current PLC is recorded; if not, the current feature cell is reselected. The size of PLC is (10 × 9, 1). PRC is calculated as PRC = PRC + PRC_2 × PLC; the calculation of PRC is a superposition process, and recognition is considered successful once the computed PRC reaches the threshold of 0.9, the size of PRC being (10, 1). The recognition threshold can be adjusted according to the number and reliability of the input stimulation points; a lower recognition threshold allows faster recognition but may reduce accuracy, so 0.9 is an appropriate choice.
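A minimal sketch of one such matching step follows, with the assumptions flagged: the trained responses SC_1 are collapsed to one unit-norm column per stimulation point, FC = SC_1ᵀ · sc is read as a per-point match score, the PRC_2 weighting is applied on the label-to-identity connections, and theta and cutoff stand in for the two preset thresholds whose values are not recoverable from the text:

import numpy as np

rng = np.random.default_rng(0)
n_classes, n_points, n_feat = 10, 9, 3721
labels = np.repeat(np.arange(n_classes), n_points)      # identity of each point

SC_1 = rng.random((n_feat, n_classes * n_points))       # stand-in trained responses
SC_1 /= np.linalg.norm(SC_1, axis=0, keepdims=True)     # normalized, one column/point
PRC_2 = 1 - rng.random(n_classes * n_points) / 10       # stored (1 - rand/10) weights

def recognition_step(sc, PRC, theta=0.5, cutoff=0.95):
    """One PLC/PRC update; theta and cutoff are assumed threshold values."""
    FC = SC_1.T @ sc                       # match current fovea against all points
    if FC.max() < theta:                   # insufficient match: reselect elsewhere
        return None, PRC
    PLC = FC / FC.max()                    # feature-tag activity, size (90,)
    PLC = np.where(PLC < cutoff, 0.0, PLC) # suppress weak feature-tag units
    for c in range(n_classes):             # PRC = PRC + PRC_2 x PLC, per identity
        PRC[c] += np.sum(PRC_2[labels == c] * PLC[labels == c])
    return PLC, PRC

PRC = np.zeros(n_classes)
sc = SC_1[:, 42].copy()                    # pretend the current fovea matches point 42
PLC, PRC = recognition_step(sc, PRC)
print(PRC.argmax(), PRC.max() >= 0.9)      # 4 True: point 42 belongs to class 4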
In the fourth step, to calculate the next feature tag unit in the target image, a weak-noise parameter w_n is first set, and PLC = w_n × PRC_2 × PRC is calculated. The index corresponding to the maximum value in PLC is the next selected feature tag unit: the feature tag unit with the strongest response is chosen by this calculation as the next saccade point, i.e. the target feature tag unit. The vector distance between the starting feature tag unit and the target feature tag unit is calculated and marked on the target image; if the pixel difference between the target pixel point obtained by the vector calculation and the stimulation point in the training set is within 1%, the target feature unit becomes the starting point of the next saccade; otherwise, the target feature unit is reselected. As shown in FIG. 5 (e)(f)(g), when the saccade viewpoint is not within the 1% pixel difference, recognition is restarted. Step four is repeated until PRC(i) ≥ 0.9, at which point recognition is considered successful. As shown in FIG. 4 (l), the solid black line plots the increment of the current target image's PRC, with the corresponding stimulus identity cell PRC(5) ≥ 0.9 as shown in FIG. 4 (j); the identified target image therefore corresponds to the fifth class of images.
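A minimal sketch of the saccade selection and tolerance check follows, with the assumptions flagged: the drive PLC = w_n × PRC_2 × PRC is broadcast to each feature tag unit through its identity, the weak-noise form is assumed, the landing point is simulated rather than computed from the distance cells, and the 1% tolerance is taken relative to the 440-pixel image (about 4.4 pixels):

import numpy as np

rng = np.random.default_rng(1)
n_classes, n_points = 10, 9
labels = np.repeat(np.arange(n_classes), n_points)
PRC_2 = 1 - rng.random(n_classes * n_points) / 10       # stored training weights
stim_xy = rng.integers(31, 411, size=(n_classes * n_points, 2))  # stored points

def next_saccade(PRC, w_n=0.01):
    """Feature-tag unit with the strongest drive PLC = w_n x PRC_2 x PRC."""
    PLC = w_n * PRC_2 * PRC[labels]        # top-down drive per feature-tag unit
    PLC = PLC + 1e-6 * rng.random(PLC.size)   # weak noise to break ties (assumed)
    return int(PLC.argmax())

def saccade_ok(landing_xy, unit, tol=0.01 * 440):
    """Accept the saccade if the landing point is within the 1% tolerance."""
    return bool(np.all(np.abs(landing_xy - stim_xy[unit]) <= tol))

PRC = np.zeros(n_classes); PRC[4] = 0.6    # accumulated evidence favours class 4
unit = next_saccade(PRC)
print(labels[unit])                        # 4: the next saccade targets a class-4 point
landing = stim_xy[unit] + rng.integers(-3, 4, size=2)   # simulated landing point
print(saccade_ok(landing, unit))           # True: within the 1% pixel tolerance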

Claims (6)

1. A bionic visual image target recognition method fused with dotted line memory information is characterized by comprising the following steps:
1) Constructing a visually driven grid cell set, wherein each layer's grid cell map of the set is composed of matrices of the same size, and M modules and N offsets are adopted to form four-dimensional grid cells; the grid cell map size of each layer is the image size S × S, so the four-dimensional grid cells have size S × S × N × M;
2) Constructing a distance cell model, and calculating displacement vectors between positions coded by grid cell population vectors;
3) For each training image, manually and randomly selecting 5-13 stimulation points and storing them in a matrix cGC; recording, for each stimulation point with coordinates (x, y), the pixel value at coordinates (x, y) in each layer of the grid cells into the matrix tGC; expanding each stimulation point into a foveal array of s × s pixels and storing it in a matrix FPC, wherein S = m × s, m ∈ (7.2, 7.5); and calculating the response of all sensory neurons to each foveal pixel k through a Gaussian kernel and storing the resulting matrix array in a matrix SC;
4) Selecting the first stimulation point in the target image with a top-down attention mechanism; computing the fovea of the current target image with the Gaussian-kernel sensory cells; taking the feature tag unit with the strongest response as the next saccade point and accumulating the corresponding stimulus identity cells; and, for the start and end point of each saccade, updating the foveal displacement vector through the distance cell model to judge the pixel difference between the saccade viewpoint and the stimulation point and ensure that it remains within 1%, otherwise resetting the starting point, until the accumulation of some stimulus identity cell reaches the threshold of 0.9, whereupon that stimulus identity is taken as the finally recognized target.
2. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 1, wherein in step 1) the visually driven grid cell set is constructed through firing rate maps, the grid cell firing rate map being calculated by the following formulas:
r_GC = max(0, cos(z_0) + cos(z_1) + cos(z_2));
z_i = F · b_i · (x − c), i = 0, 1, 2, with b_i = (cos(iπ/3), sin(iπ/3)) the three cosine-wave normal vectors 60° apart;
wherein b_0, b_1 and b_2 are the cosine-wave normal vectors and b_i denotes any of them; F is an array of M grid spatial frequencies from 0.0014 × 2π to 0.0214 × 2π; c is the vector representation of the current offset; z_i denotes the result computed for each cosine-wave normal vector; r_GC represents the firing rate map of the grid cell layer corresponding to the current offset and the current module; and for each grid scale, the N offsets are uniformly sampled along the principal axes of two adjacent equilateral triangles on the grid.
3. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 1, wherein the specific implementation process of step 3) comprises: gray-adjusting a training image M1 to obtain an image M2; integrally processing M2 to obtain M3; selecting the stimulation points of the training image such that each stimulation point (x, y) satisfies 31 ≤ x, y ≤ 410; selecting the fovea based on image M3 to obtain the foveal array, which is stored in the matrix cGC; and calculating the response of all sensory neurons to each foveal pixel k through the Gaussian kernel and storing the resulting matrix array in the matrix SC.
4. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 3, wherein the Gaussian kernel takes the form SC = exp(−(x − ref)² / (2σ²)), wherein x represents the array obtained by reshaping each 61 × 61 foveal array into a 3721 × 1 array and copying it 256 times, ref represents an array running from 0 to 255 copied 3721 times, and σ is the tuning width of the sensory neurons.
5. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 1, wherein the specific implementation process of step 4) comprises: respectively calculating the values of the stimulus identity cells (PRC) and the feature tag cells (PLC); randomly selecting stimulation points in the training set and expanding them into 61 × 61 foveal arrays O; calculating the neural response of the Gaussian kernel to the foveal array of the target image; multiplying SC_1 of the training set with the SC array of the current position: FC = SC_1 × SC; if max(FC) reaches the preset matching threshold, calculating PLC = FC/max(FC), setting the values in the PLC array smaller than the preset cut-off to 0 and recording the value of the current PLC, and otherwise reselecting the current feature tag unit, the size of PLC being (n × 9, 1); and calculating PRC by the formula PRC = PRC + PRC_2 × PLC.
6. The bionic visual image target recognition method fusing dotted line memory information as claimed in claim 5, wherein the specific implementation process of step 5) comprises: setting a weak-noise parameter w_n and calculating PLC = w_n × PRC_2 × PRC; taking the index corresponding to the maximum value in PLC as the next selected feature tag unit, i.e. choosing the feature tag unit with the strongest response as the next saccade point, the target feature tag unit; calculating the vector distance between the starting feature tag unit and the target feature tag unit and marking it on the target image; if the pixel difference between the target pixel point obtained through the vector calculation and the stimulation point in the training set is within 1%, taking the target feature unit as the starting point of the next saccade, and otherwise reselecting the target feature unit; and considering recognition successful once PRC(i) ≥ 0.9.
CN201910699779.3A 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information Active CN110598534B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910699779.3A CN110598534B (en) 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910699779.3A CN110598534B (en) 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information

Publications (2)

Publication Number Publication Date
CN110598534A (en) 2019-12-20
CN110598534B (en) 2022-12-09

Family

ID=68853261

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910699779.3A Active CN110598534B (en) 2019-07-31 2019-07-31 Bionic visual image target identification method fusing dotted line memory information

Country Status (1)

Country Link
CN (1) CN110598534B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111598110B (en) * 2020-05-11 2023-04-28 重庆大学 HOG algorithm image recognition method based on grid cell memory

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1991016686A1 (en) * 1990-04-26 1991-10-31 John Sutherland Artificial neural device
CN104794732A (en) * 2015-05-12 2015-07-22 西安电子科技大学 Artificial immune network clustering based grayscale image segmentation method
CN109886384A (en) * 2019-02-15 2019-06-14 北京工业大学 A kind of bionic navigation method based on the reconstruct of mouse cerebral hippocampal gitter cell

Also Published As

Publication number Publication date
CN110598534A (en) 2019-12-20


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant