CN101159043B - System and method for encoding the contextual spatial relationships of visual targets - Google Patents

System and method for encoding the contextual spatial relationships of visual targets

Info

Publication number
CN101159043B
CN101159043B, CN2007101776560A, CN200710177656A
Authority
CN
China
Prior art keywords
coding
response
target
layer
neuron
Prior art date
Legal status
Expired - Fee Related
Application number
CN2007101776560A
Other languages
Chinese (zh)
Other versions
CN101159043A (en)
Inventor
Miao Jun (苗军)
Qing Laiyun (卿来云)
Duan Lijuan (段立娟)
Chen Xilin (陈熙霖)
Gao Wen (高文)
Qiao Yuanhua (乔元华)
Current Assignee
Beijing University of Technology
Institute of Computing Technology of CAS
Graduate School of CAS
University of Chinese Academy of Sciences
Original Assignee
Beijing University of Technology
Institute of Computing Technology of CAS
University of Chinese Academy of Sciences
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology, Institute of Computing Technology of CAS, and University of Chinese Academy of Sciences
Priority to CN2007101776560A
Publication of CN101159043A
Application granted
Publication of CN101159043B
Status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a system and a method for encoding the contextual spatial relationships of visual targets. The system is realized as a neural network and comprises a visual-image primitive coding neuron layer, a visual-image target coding neuron layer, a visual logical-relationship coding neuron layer, and a visual-target spatial-relationship coding neuron layer. The connection weights between the neurons of adjacent layers constitute the code for the image content; each coding neuron and its connection weights respectively encode image primitives, image targets, binary logical relationships between targets, and spatial relationships between targets. The invention offers great flexibility and adaptability.

Description

A system and method for encoding the contextual spatial relationships of visual targets
Technical field
The present invention relates to the fields of image recognition and neural networks, and in particular to a system and method, realized in the form of a neural network, for encoding and expressing the contextual spatial relationships among targets in a visual image.
Background technology
The contextual relationship of visual targets refers to the relative spatial relationship between any two targets in an image. A target can be any content in the image, whether simple or complex: a simple target may be a segment of edge, a shape, a section of contour, or a compact region of uniform texture; a complex target is composed of simple ones, such as the individual facial features of a human face, or the face as a whole.
Unless otherwise specified, the spatial relationship between two targets is usually expressed by the direction and length of the line connecting their respective center points.
In the prior art, encoding the contextual relationships of visual targets mainly comprises two parts: coding the two visual targets, and coding their relative spatial relationship.
For expressing the contextual relationships of visual targets, model methods based on a probabilistic framework are generally adopted at present, for example Bayesian networks or Markov random field models.
European patent application WO2004111931 discloses a visual attentional selection system and method (A System And Method For Attentional Selection). It is based on bottom-up visual attention and can automatically select and separate salient regions that may contain objects. Its main work is to accept an input image, automatically segment salient regions, and obtain a saliency map that can directly locate the positions of salient objects; it can therefore produce mask images containing only the salient objects and display these separation results to the user. A recognition system can then perform object recognition on an image containing only the salient objects, having discarded irrelevant, unimportant, or distracting factors.
Meanwhile, US patent publications US5664065, US2002154833, and US2005047647, Japanese patent publication JP2002373333, and Chinese patent publications with application numbers 99810425.6, 200380103136.5, and 200410035084 also disclose existing systems and methods for expressing the contextual relationships of visual targets, mainly techniques for visual selective attention and for expressing and tracking image targets. In the prior art, however, there is no coded representation of visual contextual spatial relationships in the form of a neural network, so visual contextual relationships cannot be expressed well.
Summary of the invention
The problem to be solved by the present invention is to provide a system and method for encoding the contextual spatial relationships of visual targets that has great flexibility and adaptability.
To achieve the object of the invention, a system for encoding the contextual spatial relationships of visual targets is provided, realized in the form of a neural network;
It comprises a visual-image primitive coding neuron layer, a visual-image target coding neuron layer, a visual-image target logical-relationship coding neuron layer, and a visual-target spatial-relationship coding neuron layer;
The connection weights between the neurons of each pair of adjacent layers constitute the code for the image content; each coding neuron and its connection weights respectively encode image primitives, image targets, binary logical relationships between targets, and spatial relationships between targets;
The target binary logical relationships express the pairwise relations formed between the image targets.
The system also comprises an image sensing input neuron layer for image input.
The neurons of the image sensing input neuron layer correspond one-to-one with evenly spaced pixel samples of the image, and each neuron's response takes the corresponding pixel value.
The visual-image primitive coding, visual-image target coding, visual-image target logical-relationship coding, and visual-target spatial-relationship coding neuron layers adopt, respectively, sparse features, Hebb-learned weight features, connection features, and distance features as the basis of their codes, and each of the four coding layers is composed of sparse coding neurons.
To achieve the object of the invention, a method for encoding the contextual spatial relationships of visual targets is also provided, comprising the following steps:
Step A: according to the pixel values sensed by the local image sensing neurons, calculate the code values and responses of the visual-image primitive coding neurons;
Step B: according to the responses of the visual-image primitive coding neurons, calculate the code values and responses of the visual-image target coding neurons;
Step C: according to the responses of any two related visual-image target coding neurons, which pair up to form a binary logical relationship between the corresponding image targets, calculate the code values and responses of the visual-image target logical-relationship coding neurons;
Step D: according to the spatial relationships between the visual-image targets, calculate the code values of the visual-target spatial-relationship coding neurons.
Step D also comprises calculating the responses of the visual-target spatial-relationship coding neurons.
In step A, the code values $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ of the 15 primitive coding neurons are the weights corresponding to the 15 possible 2×2 pixel combinations, obtained through normalization;
Given the responses $x_1, x_2, x_3, x_4$ from the visual-image sensing input neurons, the response $R_i^2$ of primitive coding neuron $B_i$ is determined by the response function

$R_i^2 = I_i^2$ if $I_i^2 \ge T$, and $R_i^2 = 0$ otherwise,

where $I_i^2 = \sum_{k=1}^{4} w_{ik} x_k$, $T$ is a threshold, and $w_{ik}$ is one of the four code values of primitive $B_i$.
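The step-A computation can be sketched as follows; the threshold value is illustrative, and the example code $(0.5, 0.5, -0.5, -0.5)$ is the one given for Fig. 2 later in the description:

```python
# Response of one primitive coding neuron B_i (step A): weighted sum of the
# 2x2 patch, truncated at a threshold T. T = 0.1 is an illustrative value.

def primitive_response(w, x, T=0.1):
    """I_i^2 = sum_k w_ik * x_k; response is I_i^2 if it reaches T, else 0."""
    I = sum(wk * xk for wk, xk in zip(w, x))
    return I if I >= T else 0.0

w_edge = (0.5, 0.5, -0.5, -0.5)     # the primitive code from Fig. 2
bright_top = (1.0, 1.0, 0.0, 0.0)   # 2x2 patch with a bright upper row
flat = (0.5, 0.5, 0.5, 0.5)         # uniform patch

print(primitive_response(w_edge, bright_top))  # 1.0 (strong response)
print(primitive_response(w_edge, flat))        # 0.0 (suppressed by threshold)
```

The threshold simply keeps a primitive neuron silent for weak matches, as the claim states.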
In step B, calculating the code values of the visual-image target coding neurons comprises the following steps:
Suppose the image target region contains $M$ subregions. For each subregion $X_m$, $1 \le m \le M$, let the responses of primitive coding neurons $B_0$ and $B_k$ be $R_{m0}^2$ and $R_{mk}^2$, where $1 \le k \le 14$. The corresponding weights $w_{m0,j}^{23}$ and $w_{mk,j}^{23}$ are determined by

$w_{mi,j} = \dfrac{w'_{mi,j}}{\sqrt{\sum_{m=1}^{M} \left( w'^{2}_{m0,j} + w'^{2}_{mk,j} \right)}}$, where $i = 0, k$; $1 \le k \le 14$

and $w'_{mi,j}$ is set by the Hebb learning rule $w'_{mi,j} = \alpha_2 R_{mi}^2$, where $\alpha_2$ is a coefficient. That is, a connection weight is first computed according to the Hebb rule and then normalized, yielding the connection weights from the second-layer primitive coding neurons to the third-layer image target coding neurons;
Given the second-layer primitive coding neuron responses $R_1^2, R_2^2, \ldots, R_i^2, \ldots, R_{2M}^2$, the input value $I_j^3$ of third-layer image target coding neuron $O_j$ is

$I_j^3 = \sum_{i=1}^{2M} w_{ij}^{23} R_i^2$

The optimal response $R_j^3$ is then obtained through competition among the target coding neurons, determined by a response function that highlights the winning target's response.
In step C, the code values and responses of the visual-image target logical-relationship coding neurons, calculated according to the Hebb rule, are all represented by the same constant;
Given the responses $R_{i1}^3, R_{i2}^3$ of third-layer image target coding neurons $O_{i1}, O_{i2}$, the input value $I_j^4$ of fourth-layer target logical-relationship coding neuron $P_j$ is determined by

$I_j^4 = w_{i1,j} R_{i1}^3 + w_{i2,j} R_{i2}^3$

where $w_{i1,j}$ and $w_{i2,j}$ are equal constants.
The optimal response $R_j^4$ is then obtained through competition, determined by a response function that highlights the winning response.
In step D, calculating the code values of the visual-target spatial-relationship coding neurons comprises the following steps:
The connection weights $w_{ij}^{45}$ from the fourth-layer target logical-relationship coding neurons to the fifth-layer target spatial-relationship coding neurons ($w_{left}$ or $w_{right}$, $w_{up}$ or $w_{down}$) are computed according to the Hebb rule $w_{ij} = \alpha_3 R_i R_j$, where $\alpha_3$ is a coefficient, $R_i$ is the response of the fourth-layer target logical-relationship coding neuron, set to 1, and $R_j$ is the response of the fifth-layer spatial-relationship coding neuron, equal to the horizontal or vertical distance $|\Delta x|$ or $|\Delta y|$ between the two targets. They are calculated as follows:

$w_{left} = \alpha_3 |\Delta x|$, where $\Delta x < 0$
$w_{right} = \alpha_3 |\Delta x|$, where $\Delta x > 0$
$w_{up} = \alpha_3 |\Delta y|$, where $\Delta y < 0$
$w_{down} = \alpha_3 |\Delta y|$, where $\Delta y > 0$

Given the response $R_i^4 = 1$ of fourth-layer target logical-relationship coding neuron $P_i$, the responses $s_{left}, s_{right}, s_{up}, s_{down}$ of the fifth-layer target spatial-relationship coding neurons $S_{left}, S_{right}, S_{up}, S_{down}$ are determined by the response functions:

$s_{left} = w_{left} R_i^4 = \alpha_3 |\Delta x|$ and $s_{right} = 0$, where $\Delta x < 0$
$s_{right} = w_{right} R_i^4 = \alpha_3 |\Delta x|$ and $s_{left} = 0$, where $\Delta x > 0$
$s_{up} = w_{up} R_i^4 = \alpha_3 |\Delta y|$ and $s_{down} = 0$, where $\Delta y < 0$
$s_{down} = w_{down} R_i^4 = \alpha_3 |\Delta y|$ and $s_{up} = 0$, where $\Delta y > 0$
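The step-D weight computation can be sketched as follows. The sign conventions match the claims (image coordinates, so negative $\Delta y$ means "above"); $\alpha_3$ and the example displacement are illustrative values:

```python
# Layer 4 -> 5 direction weights from the displacement (dx, dy) of target B
# relative to target A, per the Hebb rule w = alpha3 * R_i * R_j with R_i = 1
# and R_j equal to the horizontal or vertical distance.

def direction_weights(dx, dy, alpha3=1.0):
    w = {"left": 0.0, "right": 0.0, "up": 0.0, "down": 0.0}
    if dx < 0:
        w["left"] = alpha3 * abs(dx)
    elif dx > 0:
        w["right"] = alpha3 * abs(dx)
    if dy < 0:                 # image coordinates: negative dy = B above A
        w["up"] = alpha3 * abs(dy)
    elif dy > 0:
        w["down"] = alpha3 * abs(dy)
    return w

print(direction_weights(-3, 5))
# {'left': 3.0, 'right': 0.0, 'up': 0.0, 'down': 5.0}
```

For each axis, exactly one of the two opposing direction neurons receives a nonzero weight, which is what lets the fifth-layer responses be read back as a signed displacement.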
The beneficial effects of the invention are as follows: in the system and method for encoding the contextual spatial relationships of visual targets of the present invention, the artificial neurons of the proposed coding network can correspond one-to-one with physical devices in a hardware implementation; the network can expand dynamically while learning and expressing the spatial relationships between image targets; it shows great flexibility and adaptability in expressing the many spatial relationships between every pair of targets; and it can be applied to the expression and understanding of visual images, to the motion control of a viewpoint, and to the search, detection, and recognition of targets.
Description of drawings
Fig. 1 is a schematic diagram of the neural-network coding structure for target context spatial relationships of the present invention;
Fig. 2 is a schematic diagram of the neuron coding of a visual-image primitive of the present invention;
Fig. 3 is a schematic diagram of the image primitive categories.
Embodiment
To make the purpose, technical scheme, and advantages of the present invention clearer, the system and method for encoding the contextual spatial relationships of visual targets are further elaborated below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.
The system and method of the present invention are realized in the form of a neural network and express the coding of the contextual spatial relationships among targets in a visual image.
A neural network, also called an artificial neural network (ANN), is an engineering system that simulates the structure and intelligent behavior of the human brain on the basis of an understanding of its organization and operating mechanisms. As early as the early 1940s, the psychologist McCulloch and the mathematician Pitts proposed the first mathematical model of a neural network, opening the era of theoretical research in neural computation. Later, scholars such as Rosenblatt, Widrow, and Hopfield proposed a series of perceptual learning models, and neural network technology flourished.
A neural network is a system formed by a large number of widely interconnected neurons, a structural feature that gives it high-speed information-processing capability. Each neuron of the human brain has about 10^3 to 10^4 dendrites and corresponding synapses, and one person's brain forms roughly 10^14 to 10^15 synapses in total. In neural-network terms, the human brain has 10^14 to 10^15 interconnection storage potentials. Although each neuron's computational function is very simple and its signaling rate is low (about 100 firings per second), the extreme parallelism among the neurons lets an ordinary human brain complete, in about one second, tasks that would take an existing computer at least several billion processing steps.
The system of the present invention is a neural network for encoding the contextual spatial relationships of visual targets and comprises four coding neuron layers: a visual-image primitive coding neuron layer, a visual-image target coding neuron layer, a visual-image target logical-relationship coding neuron layer, and a visual-image target spatial-relationship coding neuron layer. The four layers adopt, respectively, sparse features, Hebb-learned weight features, connection features, and distance features as the basis of their codes; each coding layer is composed of sparse coding neurons; neurons within a layer are sparsely and locally connected, and adjacent layers connect seamlessly, making the structure both sparse and compact.
As shown in Figure 1, the system of the present invention comprises an image sensing input neuron layer and four coding neuron layers.
The image sensing input neuron layer is used for image input; its neurons correspond one-to-one with evenly spaced pixel samples of the image, and each neuron's response takes the corresponding pixel value.
The four coding neuron layers are the visual-image primitive coding neuron layer, the visual-image target coding neuron layer, the visual-image target logical-relationship coding neuron layer, and the visual-target spatial-relationship coding neuron layer. In each coding layer, a neuron's response is the weighted sum of the responses of the neurons it connects to in the layer below, truncated by a threshold to guarantee that it is non-negative.
The connection weights between each pair of adjacent layers constitute the code values of the image content; from the second layer to the fifth, the coding neurons and their connection weights respectively represent image primitives, image targets, binary logical relationships between targets, and spatial relationships between targets.
Besides storing and memorizing a target (or primitive) code in the weights of the synapses connected to it, a coding neuron is also responsible for the weighted summation of the lower layer's responses and, through competition, for responding to a target or primitive represented in the current image, thereby playing the role of recognition and judgment.
In an embodiment of the present invention, in the second-layer visual-image primitive coding neuron layer, each primitive coding neuron accepts connection inputs from the 2×2 input neurons of one subregion of the first-layer image sensing input neuron layer, i.e. from the 2×2 pixels of that subregion. The embodiment uses connection weights $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ to represent the $i$-th basis of the 2×2-pixel subimage $(x_1, x_2, x_3, x_4)$ in the first layer; each basis also represents an intrinsic image feature, such as brightness or an edge feature, as shown in Figures 2 and 3, and is referred to as an image primitive.
These connection weights $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ are called the primitive code of basis $i$. Connection weights can be computed for the 15 combinations $B_0$ to $B_{14}$ (i.e. $2^4 - 1$ kinds) of the 2×2 pixels of the subimage $(x_1, x_2, x_3, x_4)$. Schematics of the 15 primitive codes are shown in Fig. 3; each primitive is represented by four weights $(w_1, w_2, w_3, w_4)$ corresponding to a 2×2 grid of cells, each cell representing a real number. Grey cells represent positive reals and black cells negative reals. The computation is as follows: if a primitive has $n$ grey cells among its four, it has $4-n$ black cells; each of the $n$ grey cells gets weight $1/n$ and each of the $4-n$ black cells gets weight $-1/(4-n)$; the weights are finally normalized, giving the code values $(w_1, w_2, w_3, w_4)$ shown in Table 1.
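The weight rule above can be sketched as follows. Two assumptions are made here: the "normalized" step is taken to be L2 normalization (which reproduces the Fig. 2 example code $(0.5, 0.5, -0.5, -0.5)$), and the 15 primitives are taken to be every 2×2 pattern with at least one grey cell; the actual index assignment $B_0 \ldots B_{14}$ is fixed by Fig. 3, which is not reproduced in this text:

```python
import itertools
import math

def primitive_code(pattern):
    """pattern: 4 bools for a 2x2 patch, True = grey cell, False = black cell.
    Grey cells get weight 1/n, black cells -1/(4-n), then L2-normalize.
    (L2 normalization is an assumption consistent with the Fig. 2 example.)"""
    n = sum(pattern)
    raw = [1.0 / n if g else -1.0 / (4 - n) for g in pattern]
    norm = math.sqrt(sum(w * w for w in raw))
    return tuple(w / norm for w in raw)

# 15 candidate patterns: every 2x2 pattern with at least one grey cell
# (assumed enumeration; the patent's ordering comes from Fig. 3).
patterns = [p for p in itertools.product([True, False], repeat=4) if any(p)]
codes = [primitive_code(p) for p in patterns]

print(len(codes))                                  # 15
print(primitive_code((True, True, False, False)))  # (0.5, 0.5, -0.5, -0.5)
```

Note that the two-grey/two-black patterns come out already unit-norm, so normalization leaves the 1/n and −1/(4−n) values unchanged in that case.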
Table 1. The image primitive codes $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ (reproduced as an image in the original).
As shown in Figure 1, the set of connection weights $W_{ij}^{12} = \{w_{ij}^{12}\}$ between the $i$-th neuron of the first layer and the $j$-th neuron of the second layer constitutes the code of all image primitives of all subregions in the image.
The response $R_i^2$ of each primitive coding neuron of this layer is computed as follows.
A primitive coding neuron $B_i$ ($0 \le i \le 14$) extracts the essential feature of the subimage $(x_1, x_2, x_3, x_4)$ through the weighted summation with $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ of formula (1); the extracted value $I_i^2$ is called the input value of this primitive coding neuron:

$I_i^2 = \sum_{k=1}^{4} w_{ik} x_k$   (1)

In the present invention the input value is further truncated by a threshold, giving the second-layer coding neuron response of formula (2):

$R_i^2 = I_i^2$ if $I_i^2 \ge T$, and $R_i^2 = 0$ otherwise   (2)

where $T$ is a threshold whose role is to keep a neuron from responding to small weighted inputs.
In the third layer, the visual-image target coding neuron layer, each neuron corresponds to one target in the image; each visual-image target coding neuron accepts connection inputs from the primitive coding neurons of all subregions within the target region in the second layer and is used for the expression, or coding, of that image target. The code values are embodied in the connection weights $W_{ij}^{23}$ from the second layer to the third; the target coding neuron realizes the expression of and response to the image target through the weighted summation of the responses of all primitive coding neurons in the target region.
The connection weights $W_{ij}^{23}$ from the second layer to the third are computed according to the Hebb rule $w_{ij} = \alpha_1 R_i R_j$, where $\alpha_1$ is a coefficient, $R_i$ is the response of the $i$-th second-layer neuron, and $R_j$ is the response of the $j$-th third-layer neuron. When these weights are computed, the third-layer responses are unknown, so in the present invention the response of the third layer, i.e. the visual-image target coding neuron layer, is set to 1, and the weights become $W_{ij}^{23} = \alpha_1 R_i$, where $R_i$ is the response of the $i$-th second-layer neuron.
As shown in Figure 1, the connection weights $W_{ij}^{23}$ are calculated as follows:
Suppose the image target region contains $M$ subregions. For each subregion $X_m$ ($1 \le m \le M$), let the responses of primitive coding neurons $B_0$ and $B_k$ ($1 \le k \le 14$) be $R_{m0}^2$ and $R_{mk}^2$. The corresponding weights $W_{m0,j}^{23}$ and $W_{mk,j}^{23}$ are determined by formula (3):

$w_{mi,j} = \dfrac{w'_{mi,j}}{\sqrt{\sum_{m=1}^{M} \left( w'^{2}_{m0,j} + w'^{2}_{mk,j} \right)}}$, $(i = 0, k;\ 1 \le k \le 14)$   (3)

where $w'_{mi,j}$ is set by the Hebb learning rule $w'_{mi,j} = \alpha_1 R_{mi}^2$ $(i = 0, k;\ 1 \le k \le 14)$, $\alpha_1$ being a coefficient. That is, a connection weight is first computed according to the Hebb rule and then normalized, giving the connection weights from the second layer to the third.
The set of all these connection weights, or codes, $W_{ij}^{23} = \{w_{ij}^{23}\}$ constitutes the expression of all relevant visual-image target codes in the image.
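The Hebb-then-normalize computation of formula (3) can be sketched as follows; the subregion responses and the value of $\alpha_1$ are illustrative, and the normalization is taken to be by the L2 norm of the whole weight vector, which is what the square root of summed squares in formula (3) suggests:

```python
import math

# Layer 2 -> 3 weights (formula (3)): Hebb update w' = alpha1 * R for the
# B_0 and B_k responses of each subregion, followed by L2 normalization over
# all subregions feeding target neuron O_j.

def target_weights(responses, alpha1=0.5):
    """responses: list of (R_m0, R_mk) response pairs, one per subregion m."""
    raw = [(alpha1 * r0, alpha1 * rk) for r0, rk in responses]
    norm = math.sqrt(sum(a * a + b * b for a, b in raw))
    return [(a / norm, b / norm) for a, b in raw]

resp = [(1.0, 0.5), (0.8, 0.2)]   # two subregions (M = 2), illustrative
w = target_weights(resp)
# the resulting weight vector has unit L2 norm
print(math.isclose(sum(a * a + b * b for a, b in w), 1.0))  # True
```

One consequence of this normalization is that $\alpha_1$ cancels out of the final weights, so it only scales the intermediate Hebb values.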
Meanwhile, given the second-layer primitive coding neuron responses $(R_1^2, R_2^2, \ldots, R_i^2, \ldots, R_{2M}^2)$, the input value $I_j^3$ of third-layer target coding neuron $O_j$ is given by formula (4):

$I_j^3 = \sum_{i=1}^{2M} w_{ij}^{23} R_i^2$   (4)

The optimal response $R_j^3$ is then obtained through competition, determined by the response function of formula (5), so that the matching target's response stands out.
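The response function of formula (5) is only reproduced as an image in the original, so the sketch below assumes the common winner-take-all reading of "competition that makes the target response stand out":

```python
# Layer 3 input (formula (4)) followed by competition. Winner-take-all is an
# assumed reading of the response function (5); the weight and response
# values are illustrative.

def layer3_inputs(weights, responses):
    """weights[j][i] = w_ij^23, responses[i] = R_i^2; returns I_j^3 per neuron."""
    return [sum(w * r for w, r in zip(row, responses)) for row in weights]

def compete(inputs):
    """Only the best-matching target neuron keeps its response value."""
    best = max(range(len(inputs)), key=inputs.__getitem__)
    return [v if j == best else 0.0 for j, v in enumerate(inputs)]

W23 = [[0.9, 0.1, 0.0],   # target neuron O_1's weights (illustrative)
       [0.1, 0.2, 0.9]]   # target neuron O_2's weights
R2 = [1.0, 0.0, 0.3]      # primitive-layer responses
print(compete(layer3_inputs(W23, R2)))  # [0.9, 0.0]
```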
In the fourth layer, the visual-image target logical-relationship coding neuron layer, each target logical-relationship coding neuron connects to any two target coding neurons in the third layer and thereby expresses a binary pair relation between the two corresponding image targets; it encodes the binary logical relationship between the two targets and provides the index for further expressing the spatial relationship between them. The code values are embodied in the connection weights $w_{ij}^{34}$ from the third layer to the fourth; the target logical-relationship coding neuron realizes its response to the pair relation through the weighted summation of the responses of the two target coding neurons.
The connection weights $w_{ij}^{34}$ from the third layer to the fourth are computed according to the Hebb rule $w_{ij} = \alpha_2 R_i R_j$, where, in this embodiment, $\alpha_2$ is a constant, $R_i$ is the response of the $i$-th third-layer neuron, and $R_j$ is the response of the $j$-th fourth-layer neuron. When these weights are computed, the fourth-layer responses are unknown, so in the present invention the response of the fourth layer, i.e. the target logical-relationship coding neuron layer, is set to 1, and the weights become $w_{ij}^{34} = \alpha_2 R_i$; and since the third-layer neuron responses are $R_i = 1$, the fourth-layer connection weights are $w_{ij}^{34} = \alpha_2$.
Preferably, $\alpha_2 = 1/2$, which makes the fourth-layer neuron responses equal to 1 and simplifies subsequent calculations.
The set of all these connection weights, or codes, $W_{ij}^{34} = \{w_{ij}^{34}\}$ constitutes the expression of all relevant target pair relations in the image.
Given the responses $(R_{i1}^3, R_{i2}^3)$ of third-layer target coding neurons $(O_{i1}, O_{i2})$, the input value $I_j^4$ of fourth-layer binary logical-relationship coding neuron $P_j$ is determined by formula (6):

$I_j^4 = w_{i1,j} R_{i1}^3 + w_{i2,j} R_{i2}^3$   (6)

where $w_{i1,j}$ and $w_{i2,j}$ are equal constants (e.g. 1/2).
The optimal response $R_j^4$ is then obtained through competition, determined by the response function of formula (7), which highlights the winning response.
As shown in Figure 1, the connections between the fourth layer and the fifth layer of the neural-network coding structure are the coded representation of the spatial relationship between image targets, i.e. the displacement $(\Delta x, \Delta y)$ of one target relative to another in the horizontal and vertical directions.
The fifth layer consists of neurons for four directions (left, right, up, down). The responses of the left and right neurons represent the horizontal displacement $\Delta x$ of a target B relative to another target A: when $\Delta x < 0$, target B is to the left of target A at distance $|\Delta x|$, the left neuron $S_{left}$ responds with $R_{left} = |\Delta x|$, and the right neuron $S_{right}$ responds with $R_{right} = 0$; when $\Delta x > 0$, target B is to the right of target A at distance $|\Delta x|$, and $R_{left} = 0$, $R_{right} = |\Delta x|$. Likewise, the responses of the up and down neurons represent the vertical displacement $\Delta y$ of target B relative to target A: when $\Delta y < 0$, target B is above target A at distance $|\Delta y|$, the up neuron $S_{up}$ responds with $R_{up} = |\Delta y|$, and the down neuron $S_{down}$ responds with $R_{down} = 0$; when $\Delta y > 0$, target B is below target A at distance $|\Delta y|$, and $R_{up} = 0$, $R_{down} = |\Delta y|$.
The connection weights $W_{ij}^{45}$ from the fourth layer to the fifth ($w_{left}$ or $w_{right}$, $w_{up}$ or $w_{down}$) are computed according to the Hebb rule $w_{ij} = \alpha_3 R_i R_j$, where $\alpha_3$ is a coefficient, $R_i$ is the fourth-layer neuron's response, equal to 1, and $R_j$ is the fifth-layer spatial-relationship coding neuron's response, equal to the horizontal or vertical distance $|\Delta x|$ or $|\Delta y|$ between the two targets. They are calculated as follows:

$w_{left} = \alpha_3 |\Delta x|$ $(\Delta x < 0)$   (8)
$w_{right} = \alpha_3 |\Delta x|$ $(\Delta x > 0)$   (9)
$w_{up} = \alpha_3 |\Delta y|$ $(\Delta y < 0)$   (10)
$w_{down} = \alpha_3 |\Delta y|$ $(\Delta y > 0)$   (11)
As shown in Figure 1, the set of all these connection weights, or codes, $W_{ij}^{45} = \{w_{ij}^{45}\}$ constitutes the expression of the spatial relationship between any two relevant targets in the image.
For any two targets, if their $\Delta x$ and $\Delta y$ are nonzero, then in each of the horizontal direction (left, right) and the vertical direction (up, down) one neuron accepts input from the non-zero-response target logical-relationship coding neuron in the fourth layer. These two spatial-relationship coding neurons realize, by weighting the response of the target binary logical-relationship coding neuron, the response to the spatial relationship of the pair of image targets (i.e. the offset distances in the horizontal and vertical directions). The other two direction neurons receive no input, so their responses are zero.
Therefore, given the response $R_i^4 = 1$ of the fourth-layer target binary logical-relationship coding neuron $P_i$, the responses $(s_{left}, s_{right}, s_{up}, s_{down})$ of the fifth-layer spatial-relationship coding neurons $(S_{left}, S_{right}, S_{up}, S_{down})$ are determined by the response functions of formulas (12) to (15):

$s_{left} = w_{left} R_i^4 = \alpha_3 |\Delta x|$, $s_{right} = 0$ $(\Delta x < 0)$   (12)
$s_{right} = w_{right} R_i^4 = \alpha_3 |\Delta x|$, $s_{left} = 0$ $(\Delta x > 0)$   (13)
$s_{up} = w_{up} R_i^4 = \alpha_3 |\Delta y|$, $s_{down} = 0$ $(\Delta y < 0)$   (14)
$s_{down} = w_{down} R_i^4 = \alpha_3 |\Delta y|$, $s_{up} = 0$ $(\Delta y > 0)$   (15)
As can be seen above, the responses of the spatial-relationship coding neurons $(S_{left}, S_{right}, S_{up}, S_{down})$ are proportional to $|\Delta x|$ or $|\Delta y|$ and thus reflect the spatial relationship between the targets.
The method of the present invention for encoding the contextual spatial relationships of visual targets is described in detail below; it comprises the following steps:
Step S100: according to the pixel values sensed by the local image sensing neurons, calculate the code values and responses of the visual-image primitive coding neurons.
Fig. 1 is the schematic diagram of the neural-network coding structure for target context spatial relationships. The connections between the first layer and the second layer of the coding structure are the expression of the image primitive codes.
Fig. 2 is a schematic diagram of the neural-network code $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ of an image primitive. As an example, for a local image $(x_1, x_2, x_3, x_4)$, i.e. the responses of the corresponding first-layer sensing neurons, one of its primitives $B_i$ can be expressed as the code $(w_{i1}, w_{i2}, w_{i3}, w_{i4}) = (0.5, 0.5, -0.5, -0.5)$.
Fig. 3 is a schematic diagram of the image primitive categories: each image primitive $B_i$ is expressed by four code values $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$, and there are 15 such primitives. Each primitive is represented by four cells, each cell representing a real number; grey cells represent positive reals and black cells negative reals.
The code values of the 15 primitives in Fig. 3 are computed as shown in Table 1.
Therefore, given the responses $(x_1, x_2, x_3, x_4)$ of the first-layer local image sensing neurons, i.e. the local pixel values $(x_1, x_2, x_3, x_4)$, the input value of second-layer primitive coding neuron $B_i$ is determined by the input function defined in formula (1).
In the present invention, the input is truncated by the threshold to obtain the response $R_i^2$ of formula (2), guaranteeing that it is non-negative.
Step S200 according to the neuronic response of visual pattern primitive coding, calculates neuronic encoded radio of visual pattern target code and response;
As shown in Figure 1, the connection between the second layer of neuroid coding structure and the 3rd layer is the expression that image object is encoded.
Image object coding neuron among Fig. 1 in the 3rd layer adopts the sparse coding strategy process, and promptly any one neuron of this layer is not connected with all primitive coding neurons in the second layer, and only in wherein sub-fraction is continuous.
Specifically, for any sub-region image (x_i1, x_i2, x_i3, x_i4) within an image target, the image-target coding neuron accepts input for this sub-region only from two primitive coding neurons, B_0 and B_k (1≤k≤14), where k is the index of the primitive coding neuron with the maximum response other than B_0, as shown in Figure 3.
The connection weights w_0j and w_kj from B_0 and B_k to the target coding neuron O_j constitute the code of the target coding neuron for this sub-region. The sum of the codes over all such sub-regions within the image target area constitutes the code of the target coding neuron for this image target.
The connection weights w_0j and w_kj are obtained as follows: let the image target area contain M sub-regions; for each sub-region X_m (1≤m≤M), let the responses of the primitive coding neurons B_0 and B_k be R_m0^2 and R_mk^2 (1≤k≤14); then the weights w_m0,j^23 and w_mk,j^23 connecting to the target coding neuron O_j are determined by formula (3).
Here the value of w'_mi,j is determined by the Hebb learning rule: w'_mi,j = α_2 R_mi^2 (i = 0, k; 1≤k≤14), where α_2 is a coefficient; that is, a connection weight is first computed according to the Hebb learning rule and then normalized, yielding the connection weights from the second layer to the third layer.
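The Hebb-then-normalize computation of the layer-2 to layer-3 weights can be sketched as below. Since formula (3) is not reproduced in the text, the choice of L1 normalization is an assumption, as are α_2 and the example responses.

```python
import numpy as np

def hebb_normalized_weights(responses, alpha2=0.5):
    """Layer-2 -> layer-3 weights: Hebb step w'_mi,j = alpha_2 * R_mi^2,
    followed by a normalization step (assumed here to be division by the
    sum of the raw weights)."""
    raw = alpha2 * np.asarray(responses, dtype=float)  # Hebb learning rule
    return raw / raw.sum()                             # normalization

# responses R_m0^2, R_mk^2 of B_0 and B_k over M = 2 sub-regions (assumed values)
R = [0.7, 0.3, 0.5, 0.5]
w = hebb_normalized_weights(R)   # weights sum to 1 after normalization
```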
For the responses (R_1^2, R_2^2, …, R_i^2, …, R_2M^2) of the second-layer primitive coding neurons, the input value I_j^3 of the third-layer target coding neuron O_j is given by formula (4).
The optimal response value R_j^3 is then obtained through competition among the responses; it is determined by the response function of formula (5), which makes the target's response prominent.
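The competitive selection of the optimal response can be sketched as a winner-take-all step. Formula (5) is not reproduced in the text, so this interpretation — only the neuron with the maximal input keeps its response — is an assumption.

```python
import numpy as np

def compete(inputs):
    """Winner-take-all competition: the target coding neuron with the
    maximal input value keeps its response (the optimal response R_j^3);
    all other responses are suppressed to zero."""
    inputs = np.asarray(inputs, dtype=float)
    out = np.zeros_like(inputs)
    j = int(np.argmax(inputs))
    out[j] = inputs[j]
    return out

R3 = compete([0.2, 0.9, 0.4])   # neuron 1 wins the competition
```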
Step S300: according to the responses of any two associated image-target coding neurons, calculate the code values and responses of the target logic-relation coding neurons.
As shown in Figure 1, the connections between the third and fourth layers of the neural network coding structure represent the coding of the pairwise binary logic relations between image targets. For example, the third-layer neurons coding the three image targets A, B and C form connections with the fourth-layer neurons AB, AC and BC, which respectively represent the binary logic relations (A, B), (A, C) and (B, C) formed by pairing targets A, B and C. This coding provides an index for the fourth part, which represents the concrete spatial relationship between any two targets. According to the Hebb rule w_ij = αR_iR_j, the connection weights used for this part of the coding are all represented by the same constant (e.g. α = 1/2): the response of a logic-relation coding neuron is set to 1, and the input response of an image-target coding neuron is also 1, so w_ij = α.
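The pairwise relation layer described above can be sketched as follows: one relation neuron per unordered pair of targets, each connected with the same constant Hebb weight w_ij = α·R_i·R_j with R_i = R_j = 1. The target labels are illustrative.

```python
from itertools import combinations

def pair_relations(targets, alpha=0.5):
    """Fourth-layer binary-logic-relation coding: one relation neuron per
    unordered target pair, with the constant connection weight
    w_ij = alpha * R_i * R_j, where both responses equal 1."""
    return {(a, b): alpha * 1 * 1 for a, b in combinations(targets, 2)}

rel = pair_relations(["A", "B", "C"])   # relation neurons AB, AC, BC
```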
For the responses (R_i1^3, R_i2^3) of the third-layer target coding neurons (O_i1, O_i2), the input value I_j^4 of the fourth-layer binary-logic-relation coding neuron P_j is determined by formula (6).
The optimal response value R_j^4 is further obtained through competition among the responses; it is determined by the response function of formula (7), making its response prominent.
Step S400: according to the spatial relationships between the image targets, calculate the code values of the target spatial-relationship coding neurons, and further obtain their corresponding responses.
As shown in Figure 1, the connections between the fourth and fifth layers of the neural network coding structure represent the coding of the spatial relationship between image targets, i.e. the displacement (Δx, Δy) of one target relative to another in the horizontal and vertical directions.
The fifth layer is composed of neurons for four directions (left, right, up, down), with one neuron for each direction in the horizontal (left, right) and vertical (up, down) dimensions; their connection weights (w_left or w_right, w_up or w_down) with the fourth-layer target binary-logic-relation neurons code the spatial relationship of one target relative to another in the horizontal and vertical directions.
According to the Hebb rule, the magnitudes of these connection weights are proportional to the horizontal and vertical distances (|Δx|, |Δy|) between the two targets, as shown in formulas (8)–(11).
Therefore, for the response R_i^4 of the fourth-layer target binary-logic-relation coding neuron P_i, whose response is 1, the responses (s_left, s_right, s_up, s_down) of the fifth-layer spatial-relationship coding neurons (S_left, S_right, S_up, S_down) are determined by the response functions of formulas (12)–(15).
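Following the sign conditions given in claim 10 (w_left for Δx < 0, w_right for Δx > 0, w_up for Δy < 0, w_down for Δy > 0), the fifth-layer coding can be sketched as below. The coefficient α_3 = 1 and the example displacement are assumptions; with R_i^4 = 1, each response equals the corresponding weight.

```python
def spatial_code(dx, dy, alpha3=1.0):
    """Fifth-layer spatial-relationship responses: the direction weight is
    alpha_3 * |dx| or alpha_3 * |dy|, active only for the matching sign of
    the displacement; the opposite direction's response stays 0."""
    s = {"left": 0.0, "right": 0.0, "up": 0.0, "down": 0.0}
    if dx < 0:
        s["left"] = alpha3 * abs(dx)    # w_left, per formula (8)
    elif dx > 0:
        s["right"] = alpha3 * abs(dx)   # w_right, per formula (9)
    if dy < 0:
        s["up"] = alpha3 * abs(dy)      # w_up, per formula (10)
    elif dy > 0:
        s["down"] = alpha3 * abs(dy)    # w_down, per formula (11)
    return s

# one target 3 pixels to the right of and 2 pixels below another
s = spatial_code(dx=3, dy=2)
```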
The system and method of the present invention for coding the contextual spatial relationship of visual targets realize, in the form of a neural network, the core technology and method for representing the contextual spatial relationships of image targets. The simulated neurons in the proposed coding network can correspond one-to-one with physical devices in a hardware implementation; the network can be expanded dynamically during the learning and representation of image-target spatial relationships; and it shows great flexibility and adaptability in expressing the spatial relationship of each of arbitrarily many target pairs. It can be applied to the representation and understanding of visual images, the motion control of the viewpoint, and the search, detection and recognition of targets.
For example, a facial image is composed of target images such as the hair, the face contour and the sensory organs; coding each target in the face and its spatial relationships enables the representation and understanding of the component content and spatial structure of the facial image. The spatial-relationship coding neurons in the fifth layer of the neural network of the present invention directly model the four muscle neurons that control the rotation of the human eye: the responses of the four coding neurons are equivalent to the contraction responses of the muscle neurons and the resulting change of the viewpoint position, giving the network a viewpoint motion-control function. In addition, with this coding neural network, the spatial relationship of a target at any viewpoint relative to an image target, or of a target at a first moment relative to a second moment, can be coded, so that the viewpoint motion-control mechanism can be used to realize target detection and tracking. By expressing each sub-target of different image targets and its spatial relationships with different codes, targets can be distinguished and recognized according to the magnitude of the computed response of the overall target coding neuron.
From the above description of the specific embodiments of the invention in conjunction with the accompanying drawings, other aspects and features of the present invention will be apparent to those skilled in the art.
The specific embodiments of the invention have been described and illustrated above; these embodiments should be considered exemplary and are not intended to limit the invention, which should be interpreted according to the appended claims.

Claims (10)

1. A system for coding the contextual spatial relationship of visual targets, characterized in that it is implemented in the form of a neural network;
it comprises an image-primitive coding neuron layer, an image-target coding neuron layer, a target logic-relation coding neuron layer and a target spatial-relationship coding neuron layer;
the connection weights between the neurons of every two adjacent layers constitute the coding of the image content, and the coding neurons with their connection weights respectively code the image primitives, the image targets, the target binary logic relations and the target spatial relations;
a target binary logic relation denotes the binary logic relation formed by pairing two of said image targets.
2. The system for coding the contextual spatial relationship of visual targets according to claim 1, characterized in that it further comprises an image-sensing input neuron layer for image input.
3. The system for coding the contextual spatial relationship of visual targets according to claim 2, characterized in that the neurons of the image-sensing input neuron layer correspond one-to-one with equally spaced pixel samples of the image, and the response of each neuron takes the value of the corresponding pixel.
4. The system for coding the contextual spatial relationship of visual targets according to any one of claims 1 to 3, characterized in that the image-primitive coding neuron layer, the image-target coding neuron layer, the target logic-relation coding neuron layer and the target spatial-relationship coding neuron layer respectively adopt a sparse feature, a Hebb-learning weight feature, a connection feature and a distance feature as the basis of their coding, and each of the four coding layers is composed of sparse coding neurons.
5. the method for a visible sensation target context spatial relationship encode is characterized in that, comprises the following steps:
Steps A according to the neuronic pixel value of topography's sensing, calculates neuronic encoded radio of visual pattern primitive coding and response;
Step B according to the neuronic response of visual pattern primitive coding, calculates neuronic encoded radio of visual pattern target code and response;
Step C, the binary logic of matching in twos between the formation visual pattern target according to any two neuronic responses of related visual pattern target code concerns, calculates visual pattern target logic relation neuronic encoded radio of coding and response;
Step D according to the spatial relationship between the visual pattern target, calculates the sensation target spatial relationship neuronic encoded radio of encoding.
6. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, among the described step D, also comprises calculating the encode step of neuronic response of sensation target spatial relationship.
7. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, in the described steps A, and 15 neuronic encoded radio w of visual pattern primitive coding I1, w I2, w I3, w I4Be 15 kinds of weights that combination is corresponding, obtain through normalized according to 2 * 2 pixels;
To from the neuronic response of visual pattern sensing input x 1, x 2, x 3, x 4, described visual pattern primitive coding neuron B iResponse R i 2By determining with minor function:
Figure FA20183203200710177656001C00021
Wherein,
Figure FA20183203200710177656001C00022
T is a threshold value, w IkBe picture element B iFour codings in one.
8. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, among the described step B, the described neuronic encoded radio of visual pattern target code that calculates comprises the following steps:
If image target area comprises the M sub regions, to each subregion X m, 1≤m≤M wherein, primitive coding neuron B 0And B kResponse be R M0 2And R Mk 2, wherein 1≤k≤14, then Dui Ying weight w M0, j 23And w Mk, j 23Determine by following formula:
Figure FA20183203200710177656001C00023
I=0 wherein, k; 1≤k≤14
W ' wherein Mi, jValue decide according to conspicuous cloth learning rules:
Figure FA20183203200710177656001C00024
I=0 wherein, k; 1≤k≤14, wherein, α 2Be a coefficient, promptly at first calculate one and connect weights, carry out normalization again and calculate, the image object that obtains three layers of the picture element coding neurons to the of the second layer neuronic connection weights of encoding according to conspicuous cloth learning rules;
To picture element coding neuron response R from the second layer 1 2, R 2 2... R i 2... R 2m 2, the 3rd layer image object coding neuron O iInput value I j 3, be shown below:
Figure FA20183203200710177656001C00031
Through further obtaining optimal response value R through the competition response j 3, determine by the following formula response function:
Figure FA20183203200710177656001C00032
9. the method for visible sensation target context spatial relationship encode according to claim 5, it is characterized in that, among the described step C, described visual pattern target logic relation neuronic encoded radio of coding and the response of calculating, according to Hebb law, all use identical constant to represent;
To image object coding neuron O from the 3rd layer I1, O I2Response R I1 3, R I2 3, the 4th layer target logic relation coding neuron P jInput value I j 4By determining with minor function:
Figure FA20183203200710177656001C00033
W wherein I1, jAnd w I2, jBe the numerical constant such as grade,
Further obtain optimal response value R through the competition response j 4, by following response function decision, make and give prominence to its response:
Figure FA20183203200710177656001C00034
10. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, among the described step D, the described sensation target spatial relationship neuronic encoded radio of encoding that calculates comprises the following steps:
The 4th layer target logic relation coding neuron is to the neuronic connection weight w of object space relation coding of layer 5 Ij 45, w wherein LeftOr w To the right, w UpwardsOr w DownwardsBe according to Hebb law w Ij3R iR jCalculate, wherein, α 3Be a coefficient, R iBeing the 4th layer the neuronic response of target logic relation coding, is 1; R jBe the neuronic response of object space relation coding of layer 5, its be two between the target level or the distance of vertical direction | Δ x| or | Δ y|; Be calculated as follows:
w Left3| Δ x|, wherein Δ x<0
w To the right3| Δ x|, wherein Δ x>0
w Upwards3| Δ y|, wherein Δ y<0
w Downwards3| Δ y|, wherein Δ y>0
To target logic relation coding neuron P from the 4th layer iResponse R i 4, its response is 1, the object space relation coding neuron S of layer 5 Left, S To the right, S Upwards, S DownwardsResponse s Left, s To the right, s Upwards, s DownwardsDetermine by following response function:
Figure FA20183203200710177656001C00041
s To the right=0, Δ x<0 wherein
Figure FA20183203200710177656001C00042
s Left=0, Δ x>0 wherein
Figure FA20183203200710177656001C00043
s Downwards=0, Δ y<0 wherein
s Upwards=0, Δ y>0 wherein.
CN2007101776560A 2007-11-19 2007-11-19 System and method for visible sensation target context spatial relationship encode Expired - Fee Related CN101159043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2007101776560A CN101159043B (en) 2007-11-19 2007-11-19 System and method for visible sensation target context spatial relationship encode


Publications (2)

Publication Number Publication Date
CN101159043A CN101159043A (en) 2008-04-09
CN101159043B true CN101159043B (en) 2010-12-15

Family

ID=39307129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007101776560A Expired - Fee Related CN101159043B (en) 2007-11-19 2007-11-19 System and method for visible sensation target context spatial relationship encode

Country Status (1)

Country Link
CN (1) CN101159043B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923575B (en) * 2010-08-31 2012-10-10 中国科学院计算技术研究所 Target image searching method and system
CN106126688B (en) * 2016-06-29 2020-03-24 厦门趣处网络科技有限公司 Intelligent network information acquisition system and method based on WEB content and structure mining
TWI625681B (en) * 2017-05-11 2018-06-01 國立交通大學 Neural network processing system
CN110222770B (en) * 2019-06-10 2023-06-02 成都澳海川科技有限公司 Visual question-answering method based on combined relationship attention network
CN110687929B (en) * 2019-10-10 2022-08-12 辽宁科技大学 Aircraft three-dimensional space target searching system based on monocular vision and motor imagery



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20101215

Termination date: 20191119

CF01 Termination of patent right due to non-payment of annual fee