CN101159043B - System and method for encoding the contextual spatial relationships of visual targets - Google Patents

System and method for encoding the contextual spatial relationships of visual targets

Info

Publication number
CN101159043B
CN101159043B, CN2007101776560A, CN200710177656A
Authority
CN
China
Prior art keywords
coding
response
target
layer
neuron
Prior art date
Legal status
Expired - Fee Related
Application number
CN2007101776560A
Other languages
Chinese (zh)
Other versions
CN101159043A (en)
Inventor
Miao Jun (苗军)
Qing Laiyun (卿来云)
Duan Lijuan (段立娟)
Chen Xilin (陈熙霖)
Gao Wen (高文)
Qiao Yuanhua (乔元华)
Current Assignee
Beijing University of Technology
Institute of Computing Technology of CAS
Graduate School of CAS
University of Chinese Academy of Sciences
Original Assignee
Beijing University of Technology
Institute of Computing Technology of CAS
University of Chinese Academy of Sciences
Priority date
Filing date
Publication date
Application filed by Beijing University of Technology, Institute of Computing Technology of CAS, and University of Chinese Academy of Sciences
Priority to CN2007101776560A
Publication of CN101159043A
Application granted
Publication of CN101159043B
Status: Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a system and a method for encoding the contextual spatial relationships of visual targets. The system is realized as a neural network and comprises a visual-image primitive coding neuron layer, a visual-image target coding neuron layer, a visual logical-relationship coding neuron layer, and a visual-target spatial-relationship coding neuron layer. The connection weights between the neurons of adjacent layers constitute the code for the image content; each coding neuron and its connection weights respectively encode image primitives, image targets, binary logical relationships between targets, and spatial relationships between targets. The invention offers great flexibility and adaptability.

Description

A system and method for encoding the contextual spatial relationships of visual targets
Technical field
The present invention relates to the fields of image recognition and neural networks, and in particular to a system and method, realized in the form of a neural network, for encoding and expressing the contextual spatial relationships among targets in a visual image.
Background technology
The contextual relationship of visual targets refers to the relative spatial relationship between any two targets in an image. A target can be any content in the image, whether simple or complex: a simple target may be a segment of edge, a shape, a section of contour, or a compact region of uniform texture; a complex target is composed of simple ones, such as the individual facial features of a human face, or the face as a whole.
Unless otherwise specified, the spatial relationship between two targets is usually expressed by the direction and length of the line connecting their respective center points.
In the prior art, encoding the contextual relationships of visual targets mainly comprises two parts: coding the two visual targets, and coding their relative spatial relationship.
For expressing the contextual relationships of visual targets, model methods based on a probabilistic framework are generally adopted at present, for example Bayesian networks or Markov random field models.
European patent application WO2004111931 discloses a visual attentional selection system and method (A System And Method For Attentional Selection). It is based on bottom-up visual attention and can automatically select and separate salient regions that may contain objects. Its main work is to accept an input image, automatically segment salient regions, and obtain a saliency map that can directly locate the positions of salient objects; it can therefore produce mask images containing only the salient objects and display these separation results to the user. A recognition system can then perform object recognition on an image containing only the salient objects, having discarded irrelevant, unimportant, or distracting factors.
Meanwhile, US patent publications US5664065, US2002154833, and US2005047647, Japanese patent publication JP2002373333, and Chinese patent publications with application numbers 99810425.6, 200380103136.5, and 200410035084 also disclose existing systems and methods for expressing the contextual relationships of visual targets, mainly techniques for visual selective attention and for expressing and tracking image targets. In the prior art, however, there is no coded representation of visual contextual spatial relationships in the form of a neural network, so visual contextual relationships cannot be expressed well.
Summary of the invention
The problem to be solved by the present invention is to provide a system and method for encoding the contextual spatial relationships of visual targets that has great flexibility and adaptability.
To achieve the object of the invention, a system for encoding the contextual spatial relationships of visual targets is provided, realized in the form of a neural network;
It comprises a visual-image primitive coding neuron layer, a visual-image target coding neuron layer, a visual-image target logical-relationship coding neuron layer, and a visual-target spatial-relationship coding neuron layer;
The connection weights between the neurons of each pair of adjacent layers constitute the code for the image content; each coding neuron and its connection weights respectively encode image primitives, image targets, binary logical relationships between targets, and spatial relationships between targets;
The target binary logical relationships express the pairwise relations formed between the image targets.
The system also comprises an image sensing input neuron layer for image input.
The neurons of the image sensing input neuron layer correspond one-to-one with evenly spaced pixel samples of the image, and each neuron's response takes the corresponding pixel value.
The visual-image primitive coding, visual-image target coding, visual-image target logical-relationship coding, and visual-target spatial-relationship coding neuron layers adopt, respectively, sparse features, Hebb-learned weight features, connection features, and distance features as the basis of their codes, and each of the four coding layers is composed of sparse coding neurons.
To achieve the object of the invention, a method for encoding the contextual spatial relationships of visual targets is also provided, comprising the following steps:
Step A: according to the pixel values sensed by the local image sensing neurons, calculate the code values and responses of the visual-image primitive coding neurons;
Step B: according to the responses of the visual-image primitive coding neurons, calculate the code values and responses of the visual-image target coding neurons;
Step C: according to the responses of any two related visual-image target coding neurons, which pair up to form a binary logical relationship between the corresponding image targets, calculate the code values and responses of the visual-image target logical-relationship coding neurons;
Step D: according to the spatial relationships between the visual-image targets, calculate the code values of the visual-target spatial-relationship coding neurons.
Step D also comprises calculating the responses of the visual-target spatial-relationship coding neurons.
In step A, the code values $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ of the 15 primitive coding neurons are the weights corresponding to the 15 possible 2×2 pixel combinations, obtained through normalization;
Given the responses $x_1, x_2, x_3, x_4$ from the visual-image sensing input neurons, the response $R_i^2$ of primitive coding neuron $B_i$ is determined by the response function

$R_i^2 = I_i^2$ if $I_i^2 \ge T$, and $R_i^2 = 0$ otherwise,

where $I_i^2 = \sum_{k=1}^{4} w_{ik} x_k$, $T$ is a threshold, and $w_{ik}$ is one of the four code values of primitive $B_i$.
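The step-A computation can be sketched as follows; the threshold value is illustrative, and the example code $(0.5, 0.5, -0.5, -0.5)$ is the one given for Fig. 2 later in the description:

```python
# Response of one primitive coding neuron B_i (step A): weighted sum of the
# 2x2 patch, truncated at a threshold T. T = 0.1 is an illustrative value.

def primitive_response(w, x, T=0.1):
    """I_i^2 = sum_k w_ik * x_k; response is I_i^2 if it reaches T, else 0."""
    I = sum(wk * xk for wk, xk in zip(w, x))
    return I if I >= T else 0.0

w_edge = (0.5, 0.5, -0.5, -0.5)     # the primitive code from Fig. 2
bright_top = (1.0, 1.0, 0.0, 0.0)   # 2x2 patch with a bright upper row
flat = (0.5, 0.5, 0.5, 0.5)         # uniform patch

print(primitive_response(w_edge, bright_top))  # 1.0 (strong response)
print(primitive_response(w_edge, flat))        # 0.0 (suppressed by threshold)
```

The threshold simply keeps a primitive neuron silent for weak matches, as the claim states.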
In step B, calculating the code values of the visual-image target coding neurons comprises the following steps:
Suppose the image target region contains $M$ subregions. For each subregion $X_m$, $1 \le m \le M$, let the responses of primitive coding neurons $B_0$ and $B_k$ be $R_{m0}^2$ and $R_{mk}^2$, where $1 \le k \le 14$. The corresponding weights $w_{m0,j}^{23}$ and $w_{mk,j}^{23}$ are determined by

$w_{mi,j} = \dfrac{w'_{mi,j}}{\sqrt{\sum_{m=1}^{M} \left( w'^{2}_{m0,j} + w'^{2}_{mk,j} \right)}}$, where $i = 0, k$; $1 \le k \le 14$

and $w'_{mi,j}$ is set by the Hebb learning rule $w'_{mi,j} = \alpha_2 R_{mi}^2$, where $\alpha_2$ is a coefficient. That is, a connection weight is first computed according to the Hebb rule and then normalized, yielding the connection weights from the second-layer primitive coding neurons to the third-layer image target coding neurons;
Given the second-layer primitive coding neuron responses $R_1^2, R_2^2, \ldots, R_i^2, \ldots, R_{2M}^2$, the input value $I_j^3$ of third-layer image target coding neuron $O_j$ is

$I_j^3 = \sum_{i=1}^{2M} w_{ij}^{23} R_i^2$

The optimal response $R_j^3$ is then obtained through competition among the target coding neurons, determined by a response function that highlights the winning target's response.
In step C, the code values and responses of the visual-image target logical-relationship coding neurons, calculated according to the Hebb rule, are all represented by the same constant;
Given the responses $R_{i1}^3, R_{i2}^3$ of third-layer image target coding neurons $O_{i1}, O_{i2}$, the input value $I_j^4$ of fourth-layer target logical-relationship coding neuron $P_j$ is determined by

$I_j^4 = w_{i1,j} R_{i1}^3 + w_{i2,j} R_{i2}^3$

where $w_{i1,j}$ and $w_{i2,j}$ are equal constants.
The optimal response $R_j^4$ is then obtained through competition, determined by a response function that highlights the winning response.
In step D, calculating the code values of the visual-target spatial-relationship coding neurons comprises the following steps:
The connection weights $w_{ij}^{45}$ from the fourth-layer target logical-relationship coding neurons to the fifth-layer target spatial-relationship coding neurons ($w_{left}$ or $w_{right}$, $w_{up}$ or $w_{down}$) are computed according to the Hebb rule $w_{ij} = \alpha_3 R_i R_j$, where $\alpha_3$ is a coefficient, $R_i$ is the response of the fourth-layer target logical-relationship coding neuron, set to 1, and $R_j$ is the response of the fifth-layer spatial-relationship coding neuron, equal to the horizontal or vertical distance $|\Delta x|$ or $|\Delta y|$ between the two targets. They are calculated as follows:

$w_{left} = \alpha_3 |\Delta x|$, where $\Delta x < 0$
$w_{right} = \alpha_3 |\Delta x|$, where $\Delta x > 0$
$w_{up} = \alpha_3 |\Delta y|$, where $\Delta y < 0$
$w_{down} = \alpha_3 |\Delta y|$, where $\Delta y > 0$

Given the response $R_i^4 = 1$ of fourth-layer target logical-relationship coding neuron $P_i$, the responses $s_{left}, s_{right}, s_{up}, s_{down}$ of the fifth-layer target spatial-relationship coding neurons $S_{left}, S_{right}, S_{up}, S_{down}$ are determined by the response functions:

$s_{left} = w_{left} R_i^4 = \alpha_3 |\Delta x|$ and $s_{right} = 0$, where $\Delta x < 0$
$s_{right} = w_{right} R_i^4 = \alpha_3 |\Delta x|$ and $s_{left} = 0$, where $\Delta x > 0$
$s_{up} = w_{up} R_i^4 = \alpha_3 |\Delta y|$ and $s_{down} = 0$, where $\Delta y < 0$
$s_{down} = w_{down} R_i^4 = \alpha_3 |\Delta y|$ and $s_{up} = 0$, where $\Delta y > 0$
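The step-D weight computation can be sketched as follows. The sign conventions match the claims (image coordinates, so negative $\Delta y$ means "above"); $\alpha_3$ and the example displacement are illustrative values:

```python
# Layer 4 -> 5 direction weights from the displacement (dx, dy) of target B
# relative to target A, per the Hebb rule w = alpha3 * R_i * R_j with R_i = 1
# and R_j equal to the horizontal or vertical distance.

def direction_weights(dx, dy, alpha3=1.0):
    w = {"left": 0.0, "right": 0.0, "up": 0.0, "down": 0.0}
    if dx < 0:
        w["left"] = alpha3 * abs(dx)
    elif dx > 0:
        w["right"] = alpha3 * abs(dx)
    if dy < 0:                 # image coordinates: negative dy = B above A
        w["up"] = alpha3 * abs(dy)
    elif dy > 0:
        w["down"] = alpha3 * abs(dy)
    return w

print(direction_weights(-3, 5))
# {'left': 3.0, 'right': 0.0, 'up': 0.0, 'down': 5.0}
```

For each axis, exactly one of the two opposing direction neurons receives a nonzero weight, which is what lets the fifth-layer responses be read back as a signed displacement.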
The beneficial effects of the invention are as follows: in the system and method for encoding the contextual spatial relationships of visual targets of the present invention, the artificial neurons of the proposed coding network can correspond one-to-one with physical devices in a hardware implementation; the network can expand dynamically while learning and expressing the spatial relationships between image targets; it shows great flexibility and adaptability in expressing the many spatial relationships between every pair of targets; and it can be applied to the expression and understanding of visual images, to the motion control of a viewpoint, and to the search, detection, and recognition of targets.
Description of drawings
Fig. 1 is a schematic diagram of the neural-network coding structure for target context spatial relationships of the present invention;
Fig. 2 is a schematic diagram of the neuron coding of a visual-image primitive of the present invention;
Fig. 3 is a schematic diagram of the image primitive categories.
Embodiment
To make the purpose, technical scheme, and advantages of the present invention clearer, the system and method for encoding the contextual spatial relationships of visual targets are further elaborated below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described herein serve only to explain the present invention and are not intended to limit it.
The system and method of the present invention are realized in the form of a neural network and express the coding of the contextual spatial relationships among targets in a visual image.
A neural network, also called an artificial neural network (ANN), is an engineering system that simulates the structure and intelligent behavior of the human brain on the basis of an understanding of its organization and operating mechanisms. As early as the early 1940s, the psychologist McCulloch and the mathematician Pitts proposed the first mathematical model of a neural network, opening the era of theoretical research in neural computation. Later, scholars such as Rosenblatt, Widrow, and Hopfield proposed a series of perceptual learning models, and neural network technology flourished.
A neural network is a system formed by a large number of widely interconnected neurons, a structural feature that gives it high-speed information-processing capability. Each neuron of the human brain has about 10^3 to 10^4 dendrites and corresponding synapses, and one person's brain forms roughly 10^14 to 10^15 synapses in total. In neural-network terms, the human brain has 10^14 to 10^15 interconnection storage potentials. Although each neuron's computational function is very simple and its signaling rate is low (about 100 firings per second), the extreme parallelism among the neurons lets an ordinary human brain complete, in about one second, tasks that would take an existing computer at least several billion processing steps.
The system of the present invention is a neural network for encoding the contextual spatial relationships of visual targets and comprises four coding neuron layers: a visual-image primitive coding neuron layer, a visual-image target coding neuron layer, a visual-image target logical-relationship coding neuron layer, and a visual-image target spatial-relationship coding neuron layer. The four layers adopt, respectively, sparse features, Hebb-learned weight features, connection features, and distance features as the basis of their codes; each coding layer is composed of sparse coding neurons; neurons within a layer are sparsely and locally connected, and adjacent layers connect seamlessly, making the structure both sparse and compact.
As shown in Figure 1, the system of the present invention comprises an image sensing input neuron layer and four coding neuron layers.
The image sensing input neuron layer is used for image input; its neurons correspond one-to-one with evenly spaced pixel samples of the image, and each neuron's response takes the corresponding pixel value.
The four coding neuron layers are the visual-image primitive coding neuron layer, the visual-image target coding neuron layer, the visual-image target logical-relationship coding neuron layer, and the visual-target spatial-relationship coding neuron layer. In each coding layer, a neuron's response is the weighted sum of the responses of the neurons it connects to in the layer below, truncated by a threshold to guarantee that it is non-negative.
The connection weights between each pair of adjacent layers constitute the code values of the image content; from the second layer to the fifth, the coding neurons and their connection weights respectively represent image primitives, image targets, binary logical relationships between targets, and spatial relationships between targets.
Besides storing and memorizing a target (or primitive) code in the weights of the synapses connected to it, a coding neuron is also responsible for the weighted summation of the lower layer's responses and, through competition, for responding to a target or primitive represented in the current image, thereby playing the role of recognition and judgment.
In an embodiment of the present invention, in the second-layer visual-image primitive coding neuron layer, each primitive coding neuron accepts connection inputs from the 2×2 input neurons of one subregion of the first-layer image sensing input neuron layer, i.e. from the 2×2 pixels of that subregion. The embodiment uses connection weights $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ to represent the $i$-th basis of the 2×2-pixel subimage $(x_1, x_2, x_3, x_4)$ in the first layer; each basis also represents an intrinsic image feature, such as brightness or an edge feature, as shown in Figures 2 and 3, and is referred to as an image primitive.
These connection weights $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ are called the primitive code of basis $i$. Connection weights can be computed for the 15 combinations $B_0$ to $B_{14}$ (i.e. $2^4 - 1$ kinds) of the 2×2 pixels of the subimage $(x_1, x_2, x_3, x_4)$. Schematics of the 15 primitive codes are shown in Fig. 3; each primitive is represented by four weights $(w_1, w_2, w_3, w_4)$ corresponding to a 2×2 grid of cells, each cell representing a real number. Grey cells represent positive reals and black cells negative reals. The computation is as follows: if a primitive has $n$ grey cells among its four, it has $4-n$ black cells; each of the $n$ grey cells gets weight $1/n$ and each of the $4-n$ black cells gets weight $-1/(4-n)$; the weights are finally normalized, giving the code values $(w_1, w_2, w_3, w_4)$ shown in Table 1.
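The weight rule above can be sketched as follows. Two assumptions are made here: the "normalized" step is taken to be L2 normalization (which reproduces the Fig. 2 example code $(0.5, 0.5, -0.5, -0.5)$), and the 15 primitives are taken to be every 2×2 pattern with at least one grey cell; the actual index assignment $B_0 \ldots B_{14}$ is fixed by Fig. 3, which is not reproduced in this text:

```python
import itertools
import math

def primitive_code(pattern):
    """pattern: 4 bools for a 2x2 patch, True = grey cell, False = black cell.
    Grey cells get weight 1/n, black cells -1/(4-n), then L2-normalize.
    (L2 normalization is an assumption consistent with the Fig. 2 example.)"""
    n = sum(pattern)
    raw = [1.0 / n if g else -1.0 / (4 - n) for g in pattern]
    norm = math.sqrt(sum(w * w for w in raw))
    return tuple(w / norm for w in raw)

# 15 candidate patterns: every 2x2 pattern with at least one grey cell
# (assumed enumeration; the patent's ordering comes from Fig. 3).
patterns = [p for p in itertools.product([True, False], repeat=4) if any(p)]
codes = [primitive_code(p) for p in patterns]

print(len(codes))                                  # 15
print(primitive_code((True, True, False, False)))  # (0.5, 0.5, -0.5, -0.5)
```

Note that the two-grey/two-black patterns come out already unit-norm, so normalization leaves the 1/n and −1/(4−n) values unchanged in that case.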
Table 1. The image primitive codes $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ (reproduced as an image in the original).
As shown in Figure 1, the set of connection weights $W_{ij}^{12} = \{w_{ij}^{12}\}$ between the $i$-th neuron of the first layer and the $j$-th neuron of the second layer constitutes the code of all image primitives of all subregions in the image.
The response $R_i^2$ of each primitive coding neuron of this layer is computed as follows.
A primitive coding neuron $B_i$ ($0 \le i \le 14$) extracts the essential feature of the subimage $(x_1, x_2, x_3, x_4)$ through the weighted summation with $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ of formula (1); the extracted value $I_i^2$ is called the input value of this primitive coding neuron:

$I_i^2 = \sum_{k=1}^{4} w_{ik} x_k$   (1)

In the present invention the input value is further truncated by a threshold, giving the second-layer coding neuron response of formula (2):

$R_i^2 = I_i^2$ if $I_i^2 \ge T$, and $R_i^2 = 0$ otherwise   (2)

where $T$ is a threshold whose role is to keep a neuron from responding to small weighted inputs.
In the third layer, the visual-image target coding neuron layer, each neuron corresponds to one target in the image; each visual-image target coding neuron accepts connection inputs from the primitive coding neurons of all subregions within the target region in the second layer and is used for the expression, or coding, of that image target. The code values are embodied in the connection weights $W_{ij}^{23}$ from the second layer to the third; the target coding neuron realizes the expression of and response to the image target through the weighted summation of the responses of all primitive coding neurons in the target region.
The connection weights $W_{ij}^{23}$ from the second layer to the third are computed according to the Hebb rule $w_{ij} = \alpha_1 R_i R_j$, where $\alpha_1$ is a coefficient, $R_i$ is the response of the $i$-th second-layer neuron, and $R_j$ is the response of the $j$-th third-layer neuron. When these weights are computed, the third-layer responses are unknown, so in the present invention the response of the third layer, i.e. the visual-image target coding neuron layer, is set to 1, and the weights become $W_{ij}^{23} = \alpha_1 R_i$, where $R_i$ is the response of the $i$-th second-layer neuron.
As shown in Figure 1, the connection weights $W_{ij}^{23}$ are calculated as follows:
Suppose the image target region contains $M$ subregions. For each subregion $X_m$ ($1 \le m \le M$), let the responses of primitive coding neurons $B_0$ and $B_k$ ($1 \le k \le 14$) be $R_{m0}^2$ and $R_{mk}^2$. The corresponding weights $W_{m0,j}^{23}$ and $W_{mk,j}^{23}$ are determined by formula (3):

$w_{mi,j} = \dfrac{w'_{mi,j}}{\sqrt{\sum_{m=1}^{M} \left( w'^{2}_{m0,j} + w'^{2}_{mk,j} \right)}}$, $(i = 0, k;\ 1 \le k \le 14)$   (3)

where $w'_{mi,j}$ is set by the Hebb learning rule $w'_{mi,j} = \alpha_1 R_{mi}^2$ $(i = 0, k;\ 1 \le k \le 14)$, $\alpha_1$ being a coefficient. That is, a connection weight is first computed according to the Hebb rule and then normalized, giving the connection weights from the second layer to the third.
The set of all these connection weights, or codes, $W_{ij}^{23} = \{w_{ij}^{23}\}$ constitutes the expression of all relevant visual-image target codes in the image.
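The Hebb-then-normalize computation of formula (3) can be sketched as follows; the subregion responses and the value of $\alpha_1$ are illustrative, and the normalization is taken to be by the L2 norm of the whole weight vector, which is what the square root of summed squares in formula (3) suggests:

```python
import math

# Layer 2 -> 3 weights (formula (3)): Hebb update w' = alpha1 * R for the
# B_0 and B_k responses of each subregion, followed by L2 normalization over
# all subregions feeding target neuron O_j.

def target_weights(responses, alpha1=0.5):
    """responses: list of (R_m0, R_mk) response pairs, one per subregion m."""
    raw = [(alpha1 * r0, alpha1 * rk) for r0, rk in responses]
    norm = math.sqrt(sum(a * a + b * b for a, b in raw))
    return [(a / norm, b / norm) for a, b in raw]

resp = [(1.0, 0.5), (0.8, 0.2)]   # two subregions (M = 2), illustrative
w = target_weights(resp)
# the resulting weight vector has unit L2 norm
print(math.isclose(sum(a * a + b * b for a, b in w), 1.0))  # True
```

One consequence of this normalization is that $\alpha_1$ cancels out of the final weights, so it only scales the intermediate Hebb values.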
Meanwhile, given the second-layer primitive coding neuron responses $(R_1^2, R_2^2, \ldots, R_i^2, \ldots, R_{2M}^2)$, the input value $I_j^3$ of third-layer target coding neuron $O_j$ is given by formula (4):

$I_j^3 = \sum_{i=1}^{2M} w_{ij}^{23} R_i^2$   (4)

The optimal response $R_j^3$ is then obtained through competition, determined by the response function of formula (5), so that the matching target's response stands out.
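The response function of formula (5) is only reproduced as an image in the original, so the sketch below assumes the common winner-take-all reading of "competition that makes the target response stand out":

```python
# Layer 3 input (formula (4)) followed by competition. Winner-take-all is an
# assumed reading of the response function (5); the weight and response
# values are illustrative.

def layer3_inputs(weights, responses):
    """weights[j][i] = w_ij^23, responses[i] = R_i^2; returns I_j^3 per neuron."""
    return [sum(w * r for w, r in zip(row, responses)) for row in weights]

def compete(inputs):
    """Only the best-matching target neuron keeps its response value."""
    best = max(range(len(inputs)), key=inputs.__getitem__)
    return [v if j == best else 0.0 for j, v in enumerate(inputs)]

W23 = [[0.9, 0.1, 0.0],   # target neuron O_1's weights (illustrative)
       [0.1, 0.2, 0.9]]   # target neuron O_2's weights
R2 = [1.0, 0.0, 0.3]      # primitive-layer responses
print(compete(layer3_inputs(W23, R2)))  # [0.9, 0.0]
```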
In the fourth layer, the visual-image target logical-relationship coding neuron layer, each target logical-relationship coding neuron connects to any two target coding neurons in the third layer and thereby expresses a binary pair relation between the two corresponding image targets; it encodes the binary logical relationship between the two targets and provides the index for further expressing the spatial relationship between them. The code values are embodied in the connection weights $w_{ij}^{34}$ from the third layer to the fourth; the target logical-relationship coding neuron realizes its response to the pair relation through the weighted summation of the responses of the two target coding neurons.
The connection weights $w_{ij}^{34}$ from the third layer to the fourth are computed according to the Hebb rule $w_{ij} = \alpha_2 R_i R_j$, where, in this embodiment, $\alpha_2$ is a constant, $R_i$ is the response of the $i$-th third-layer neuron, and $R_j$ is the response of the $j$-th fourth-layer neuron. When these weights are computed, the fourth-layer responses are unknown, so in the present invention the response of the fourth layer, i.e. the target logical-relationship coding neuron layer, is set to 1, and the weights become $w_{ij}^{34} = \alpha_2 R_i$; and since the third-layer neuron responses are $R_i = 1$, the fourth-layer connection weights are $w_{ij}^{34} = \alpha_2$.
Preferably, $\alpha_2 = 1/2$, which makes the fourth-layer neuron responses equal to 1 and simplifies subsequent calculations.
The set of all these connection weights, or codes, $W_{ij}^{34} = \{w_{ij}^{34}\}$ constitutes the expression of all relevant target pair relations in the image.
Given the responses $(R_{i1}^3, R_{i2}^3)$ of third-layer target coding neurons $(O_{i1}, O_{i2})$, the input value $I_j^4$ of fourth-layer binary logical-relationship coding neuron $P_j$ is determined by formula (6):

$I_j^4 = w_{i1,j} R_{i1}^3 + w_{i2,j} R_{i2}^3$   (6)

where $w_{i1,j}$ and $w_{i2,j}$ are equal constants (e.g. 1/2).
The optimal response $R_j^4$ is then obtained through competition, determined by the response function of formula (7), which highlights the winning response.
As shown in Figure 1, the connections between the fourth layer and the fifth layer of the neural-network coding structure are the coded representation of the spatial relationship between image targets, i.e. the displacement $(\Delta x, \Delta y)$ of one target relative to another in the horizontal and vertical directions.
The fifth layer consists of neurons for four directions (left, right, up, down). The responses of the left and right neurons represent the horizontal displacement $\Delta x$ of a target B relative to another target A: when $\Delta x < 0$, target B is to the left of target A at distance $|\Delta x|$, the left neuron $S_{left}$ responds with $R_{left} = |\Delta x|$, and the right neuron $S_{right}$ responds with $R_{right} = 0$; when $\Delta x > 0$, target B is to the right of target A at distance $|\Delta x|$, and $R_{left} = 0$, $R_{right} = |\Delta x|$. Likewise, the responses of the up and down neurons represent the vertical displacement $\Delta y$ of target B relative to target A: when $\Delta y < 0$, target B is above target A at distance $|\Delta y|$, the up neuron $S_{up}$ responds with $R_{up} = |\Delta y|$, and the down neuron $S_{down}$ responds with $R_{down} = 0$; when $\Delta y > 0$, target B is below target A at distance $|\Delta y|$, and $R_{up} = 0$, $R_{down} = |\Delta y|$.
The connection weights $W_{ij}^{45}$ from the fourth layer to the fifth ($w_{left}$ or $w_{right}$, $w_{up}$ or $w_{down}$) are computed according to the Hebb rule $w_{ij} = \alpha_3 R_i R_j$, where $\alpha_3$ is a coefficient, $R_i$ is the fourth-layer neuron's response, equal to 1, and $R_j$ is the fifth-layer spatial-relationship coding neuron's response, equal to the horizontal or vertical distance $|\Delta x|$ or $|\Delta y|$ between the two targets. They are calculated as follows:

$w_{left} = \alpha_3 |\Delta x|$ $(\Delta x < 0)$   (8)
$w_{right} = \alpha_3 |\Delta x|$ $(\Delta x > 0)$   (9)
$w_{up} = \alpha_3 |\Delta y|$ $(\Delta y < 0)$   (10)
$w_{down} = \alpha_3 |\Delta y|$ $(\Delta y > 0)$   (11)
As shown in Figure 1, the set of all these connection weights, or codes, $W_{ij}^{45} = \{w_{ij}^{45}\}$ constitutes the expression of the spatial relationship between any two relevant targets in the image.
For any two targets, if their $\Delta x$ and $\Delta y$ are nonzero, then in each of the horizontal direction (left, right) and the vertical direction (up, down) one neuron accepts input from the non-zero-response target logical-relationship coding neuron in the fourth layer. These two spatial-relationship coding neurons realize, by weighting the response of the target binary logical-relationship coding neuron, the response to the spatial relationship of the pair of image targets (i.e. the offset distances in the horizontal and vertical directions). The other two direction neurons receive no input, so their responses are zero.
Therefore, given the response $R_i^4 = 1$ of the fourth-layer target binary logical-relationship coding neuron $P_i$, the responses $(s_{left}, s_{right}, s_{up}, s_{down})$ of the fifth-layer spatial-relationship coding neurons $(S_{left}, S_{right}, S_{up}, S_{down})$ are determined by the response functions of formulas (12) to (15):

$s_{left} = w_{left} R_i^4 = \alpha_3 |\Delta x|$, $s_{right} = 0$ $(\Delta x < 0)$   (12)
$s_{right} = w_{right} R_i^4 = \alpha_3 |\Delta x|$, $s_{left} = 0$ $(\Delta x > 0)$   (13)
$s_{up} = w_{up} R_i^4 = \alpha_3 |\Delta y|$, $s_{down} = 0$ $(\Delta y < 0)$   (14)
$s_{down} = w_{down} R_i^4 = \alpha_3 |\Delta y|$, $s_{up} = 0$ $(\Delta y > 0)$   (15)
As can be seen above, the responses of the spatial-relationship coding neurons $(S_{left}, S_{right}, S_{up}, S_{down})$ are proportional to $|\Delta x|$ or $|\Delta y|$ and thus reflect the spatial relationship between the targets.
The method of the present invention for encoding the contextual spatial relationships of visual targets is described in detail below; it comprises the following steps:
Step S100: according to the pixel values sensed by the local image sensing neurons, calculate the code values and responses of the visual-image primitive coding neurons.
Fig. 1 is the schematic diagram of the neural-network coding structure for target context spatial relationships. The connections between the first layer and the second layer of the coding structure are the expression of the image primitive codes.
Fig. 2 is a schematic diagram of the neural-network code $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$ of an image primitive. As an example, for a local image $(x_1, x_2, x_3, x_4)$, i.e. the responses of the corresponding first-layer sensing neurons, one of its primitives $B_i$ can be expressed as the code $(w_{i1}, w_{i2}, w_{i3}, w_{i4}) = (0.5, 0.5, -0.5, -0.5)$.
Fig. 3 is a schematic diagram of the image primitive categories: each image primitive $B_i$ is expressed by four code values $(w_{i1}, w_{i2}, w_{i3}, w_{i4})$, and there are 15 such primitives. Each primitive is represented by four cells, each cell representing a real number; grey cells represent positive reals and black cells negative reals.
The code values of the 15 primitives in Fig. 3 are computed as shown in Table 1.
Therefore, given the responses $(x_1, x_2, x_3, x_4)$ of the first-layer local image sensing neurons, i.e. the local pixel values $(x_1, x_2, x_3, x_4)$, the input value of second-layer primitive coding neuron $B_i$ is determined by the input function defined in formula (1).
In the present invention, the input is truncated by the threshold to obtain the response $R_i^2$ of formula (2), guaranteeing that it is non-negative.
Step S200 according to the neuronic response of visual pattern primitive coding, calculates neuronic encoded radio of visual pattern target code and response;
As shown in Figure 1, the connection between the second layer of neuroid coding structure and the 3rd layer is the expression that image object is encoded.
Image object coding neuron among Fig. 1 in the 3rd layer adopts the sparse coding strategy process, and promptly any one neuron of this layer is not connected with all primitive coding neurons in the second layer, and only in wherein sub-fraction is continuous.
Specifically, for any sub-region image (x_i1, x_i2, x_i3, x_i4) within an image target, the image-target coding neuron accepts input for this sub-region only from two primitive coding neurons, B_0 and B_k (1≤k≤14), where k is the index of the primitive coding neuron with the maximum response other than B_0, as shown in Figure 3.
The connection weights w_0j and w_kj from B_0 and B_k to the target coding neuron O_j constitute the code of the target coding neuron for this sub-region. The sum of the codes over all such sub-regions within the image target area constitutes the code of the target coding neuron for this image target.
The connection weights w_0j and w_kj are obtained as follows: let the image target area contain M sub-regions; for each sub-region X_m (1≤m≤M), let the responses of the primitive coding neurons B_0 and B_k be R_m0^2 and R_mk^2 (1≤k≤14); then the weights w_m0,j^23 and w_mk,j^23 connecting to the target coding neuron O_j are determined by formula (3).
Here the value of w'_mi,j is determined by the Hebb learning rule: w'_mi,j = α_2 R_mi^2 (i = 0, k; 1≤k≤14), where α_2 is a coefficient; that is, a connection weight is first computed according to the Hebb learning rule and then normalized, yielding the connection weights from the second layer to the third layer.
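The Hebb-then-normalize computation of the layer-2 to layer-3 weights can be sketched as below. Since formula (3) is not reproduced in the text, the choice of L1 normalization is an assumption, as are α_2 and the example responses.

```python
import numpy as np

def hebb_normalized_weights(responses, alpha2=0.5):
    """Layer-2 -> layer-3 weights: Hebb step w'_mi,j = alpha_2 * R_mi^2,
    followed by a normalization step (assumed here to be division by the
    sum of the raw weights)."""
    raw = alpha2 * np.asarray(responses, dtype=float)  # Hebb learning rule
    return raw / raw.sum()                             # normalization

# responses R_m0^2, R_mk^2 of B_0 and B_k over M = 2 sub-regions (assumed values)
R = [0.7, 0.3, 0.5, 0.5]
w = hebb_normalized_weights(R)   # weights sum to 1 after normalization
```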
For the responses (R_1^2, R_2^2, …, R_i^2, …, R_2M^2) of the second-layer primitive coding neurons, the input value I_j^3 of the third-layer target coding neuron O_j is given by formula (4).
The optimal response value R_j^3 is then obtained through competition among the responses; it is determined by the response function of formula (5), which makes the target's response prominent.
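The competitive selection of the optimal response can be sketched as a winner-take-all step. Formula (5) is not reproduced in the text, so this interpretation — only the neuron with the maximal input keeps its response — is an assumption.

```python
import numpy as np

def compete(inputs):
    """Winner-take-all competition: the target coding neuron with the
    maximal input value keeps its response (the optimal response R_j^3);
    all other responses are suppressed to zero."""
    inputs = np.asarray(inputs, dtype=float)
    out = np.zeros_like(inputs)
    j = int(np.argmax(inputs))
    out[j] = inputs[j]
    return out

R3 = compete([0.2, 0.9, 0.4])   # neuron 1 wins the competition
```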
Step S300: according to the responses of any two associated image-target coding neurons, calculate the code values and responses of the target logic-relation coding neurons.
As shown in Figure 1, the connections between the third and fourth layers of the neural network coding structure represent the coding of the pairwise binary logic relations between image targets. For example, the third-layer neurons coding the three image targets A, B and C form connections with the fourth-layer neurons AB, AC and BC, which respectively represent the binary logic relations (A, B), (A, C) and (B, C) formed by pairing targets A, B and C. This coding provides an index for the fourth part, which represents the concrete spatial relationship between any two targets. According to the Hebb rule w_ij = αR_iR_j, the connection weights used for this part of the coding are all represented by the same constant (e.g. α = 1/2): the response of a logic-relation coding neuron is set to 1, and the input response of an image-target coding neuron is also 1, so w_ij = α.
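The pairwise relation layer described above can be sketched as follows: one relation neuron per unordered pair of targets, each connected with the same constant Hebb weight w_ij = α·R_i·R_j with R_i = R_j = 1. The target labels are illustrative.

```python
from itertools import combinations

def pair_relations(targets, alpha=0.5):
    """Fourth-layer binary-logic-relation coding: one relation neuron per
    unordered target pair, with the constant connection weight
    w_ij = alpha * R_i * R_j, where both responses equal 1."""
    return {(a, b): alpha * 1 * 1 for a, b in combinations(targets, 2)}

rel = pair_relations(["A", "B", "C"])   # relation neurons AB, AC, BC
```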
For the responses (R_i1^3, R_i2^3) of the third-layer target coding neurons (O_i1, O_i2), the input value I_j^4 of the fourth-layer binary-logic-relation coding neuron P_j is determined by formula (6).
The optimal response value R_j^4 is further obtained through competition among the responses; it is determined by the response function of formula (7), making its response prominent.
Step S400: according to the spatial relationships between the image targets, calculate the code values of the target spatial-relationship coding neurons, and further obtain their corresponding responses.
As shown in Figure 1, the connections between the fourth and fifth layers of the neural network coding structure represent the coding of the spatial relationship between image targets, i.e. the displacement (Δx, Δy) of one target relative to another in the horizontal and vertical directions.
The fifth layer is composed of neurons for four directions (left, right, up, down), with one neuron for each direction in the horizontal (left, right) and vertical (up, down) dimensions; their connection weights (w_left or w_right, w_up or w_down) with the fourth-layer target binary-logic-relation neurons code the spatial relationship of one target relative to another in the horizontal and vertical directions.
According to the Hebb rule, the magnitudes of these connection weights are proportional to the horizontal and vertical distances (|Δx|, |Δy|) between the two targets, as shown in formulas (8)–(11).
Therefore, for the response R_i^4 of the fourth-layer target binary-logic-relation coding neuron P_i, whose response is 1, the responses (s_left, s_right, s_up, s_down) of the fifth-layer spatial-relationship coding neurons (S_left, S_right, S_up, S_down) are determined by the response functions of formulas (12)–(15).
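Following the sign conditions given in claim 10 (w_left for Δx < 0, w_right for Δx > 0, w_up for Δy < 0, w_down for Δy > 0), the fifth-layer coding can be sketched as below. The coefficient α_3 = 1 and the example displacement are assumptions; with R_i^4 = 1, each response equals the corresponding weight.

```python
def spatial_code(dx, dy, alpha3=1.0):
    """Fifth-layer spatial-relationship responses: the direction weight is
    alpha_3 * |dx| or alpha_3 * |dy|, active only for the matching sign of
    the displacement; the opposite direction's response stays 0."""
    s = {"left": 0.0, "right": 0.0, "up": 0.0, "down": 0.0}
    if dx < 0:
        s["left"] = alpha3 * abs(dx)    # w_left, per formula (8)
    elif dx > 0:
        s["right"] = alpha3 * abs(dx)   # w_right, per formula (9)
    if dy < 0:
        s["up"] = alpha3 * abs(dy)      # w_up, per formula (10)
    elif dy > 0:
        s["down"] = alpha3 * abs(dy)    # w_down, per formula (11)
    return s

# one target 3 pixels to the right of and 2 pixels below another
s = spatial_code(dx=3, dy=2)
```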
The system and method of the present invention for coding the contextual spatial relationship of visual targets realize, in the form of a neural network, the core technology and method for representing the contextual spatial relationships of image targets. The simulated neurons in the proposed coding network can correspond one-to-one with physical devices in a hardware implementation; the network can be expanded dynamically during the learning and representation of image-target spatial relationships; and it shows great flexibility and adaptability in expressing the spatial relationship of each of arbitrarily many target pairs. It can be applied to the representation and understanding of visual images, the motion control of the viewpoint, and the search, detection and recognition of targets.
For example, a facial image is composed of target images such as the hair, the face contour and the sensory organs; coding each target in the face and its spatial relationships enables the representation and understanding of the component content and spatial structure of the facial image. The spatial-relationship coding neurons in the fifth layer of the neural network of the present invention directly model the four muscle neurons that control the rotation of the human eye: the responses of the four coding neurons are equivalent to the contraction responses of the muscle neurons and the resulting change of the viewpoint position, giving the network a viewpoint motion-control function. In addition, with this coding neural network, the spatial relationship of a target at any viewpoint relative to an image target, or of a target at a first moment relative to a second moment, can be coded, so that the viewpoint motion-control mechanism can be used to realize target detection and tracking. By expressing each sub-target of different image targets and its spatial relationships with different codes, targets can be distinguished and recognized according to the magnitude of the computed response of the overall target coding neuron.
From the above description of the specific embodiments of the invention in conjunction with the accompanying drawings, other aspects and features of the present invention will be apparent to those skilled in the art.
The specific embodiments of the invention have been described and illustrated above; these embodiments should be considered exemplary and are not intended to limit the invention, which should be interpreted according to the appended claims.

Claims (10)

1. A system for coding the contextual spatial relationship of visual targets, characterized in that it is implemented in the form of a neural network;
it comprises an image-primitive coding neuron layer, an image-target coding neuron layer, a target logic-relation coding neuron layer and a target spatial-relationship coding neuron layer;
the connection weights between the neurons of every two adjacent layers constitute the coding of the image content, and the coding neurons with their connection weights respectively code the image primitives, the image targets, the target binary logic relations and the target spatial relations;
a target binary logic relation denotes the binary logic relation formed by pairing two of said image targets.
2. The system for coding the contextual spatial relationship of visual targets according to claim 1, characterized in that it further comprises an image-sensing input neuron layer for image input.
3. The system for coding the contextual spatial relationship of visual targets according to claim 2, characterized in that the neurons of the image-sensing input neuron layer correspond one-to-one with equally spaced pixel samples of the image, and the response of each neuron takes the value of the corresponding pixel.
4. The system for coding the contextual spatial relationship of visual targets according to any one of claims 1 to 3, characterized in that the image-primitive coding neuron layer, the image-target coding neuron layer, the target logic-relation coding neuron layer and the target spatial-relationship coding neuron layer respectively adopt a sparse feature, a Hebb-learning weight feature, a connection feature and a distance feature as the basis of their coding, and each of the four coding layers is composed of sparse coding neurons.
5. the method for a visible sensation target context spatial relationship encode is characterized in that, comprises the following steps:
Steps A according to the neuronic pixel value of topography's sensing, calculates neuronic encoded radio of visual pattern primitive coding and response;
Step B according to the neuronic response of visual pattern primitive coding, calculates neuronic encoded radio of visual pattern target code and response;
Step C, the binary logic of matching in twos between the formation visual pattern target according to any two neuronic responses of related visual pattern target code concerns, calculates visual pattern target logic relation neuronic encoded radio of coding and response;
Step D according to the spatial relationship between the visual pattern target, calculates the sensation target spatial relationship neuronic encoded radio of encoding.
6. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, among the described step D, also comprises calculating the encode step of neuronic response of sensation target spatial relationship.
7. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, in the described steps A, and 15 neuronic encoded radio w of visual pattern primitive coding I1, w I2, w I3, w I4Be 15 kinds of weights that combination is corresponding, obtain through normalized according to 2 * 2 pixels;
To from the neuronic response of visual pattern sensing input x 1, x 2, x 3, x 4, described visual pattern primitive coding neuron B iResponse R i 2By determining with minor function:
Figure FA20183203200710177656001C00021
Wherein,
Figure FA20183203200710177656001C00022
T is a threshold value, w IkBe picture element B iFour codings in one.
8. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, among the described step B, the described neuronic encoded radio of visual pattern target code that calculates comprises the following steps:
If image target area comprises the M sub regions, to each subregion X m, 1≤m≤M wherein, primitive coding neuron B 0And B kResponse be R M0 2And R Mk 2, wherein 1≤k≤14, then Dui Ying weight w M0, j 23And w Mk, j 23Determine by following formula:
Figure FA20183203200710177656001C00023
I=0 wherein, k; 1≤k≤14
W ' wherein Mi, jValue decide according to conspicuous cloth learning rules:
Figure FA20183203200710177656001C00024
I=0 wherein, k; 1≤k≤14, wherein, α 2Be a coefficient, promptly at first calculate one and connect weights, carry out normalization again and calculate, the image object that obtains three layers of the picture element coding neurons to the of the second layer neuronic connection weights of encoding according to conspicuous cloth learning rules;
To picture element coding neuron response R from the second layer 1 2, R 2 2... R i 2... R 2m 2, the 3rd layer image object coding neuron O iInput value I j 3, be shown below:
Figure FA20183203200710177656001C00031
Through further obtaining optimal response value R through the competition response j 3, determine by the following formula response function:
Figure FA20183203200710177656001C00032
9. the method for visible sensation target context spatial relationship encode according to claim 5, it is characterized in that, among the described step C, described visual pattern target logic relation neuronic encoded radio of coding and the response of calculating, according to Hebb law, all use identical constant to represent;
To image object coding neuron O from the 3rd layer I1, O I2Response R I1 3, R I2 3, the 4th layer target logic relation coding neuron P jInput value I j 4By determining with minor function:
Figure FA20183203200710177656001C00033
W wherein I1, jAnd w I2, jBe the numerical constant such as grade,
Further obtain optimal response value R through the competition response j 4, by following response function decision, make and give prominence to its response:
Figure FA20183203200710177656001C00034
10. the method for visible sensation target context spatial relationship encode according to claim 5 is characterized in that, among the described step D, the described sensation target spatial relationship neuronic encoded radio of encoding that calculates comprises the following steps:
The 4th layer target logic relation coding neuron is to the neuronic connection weight w of object space relation coding of layer 5 Ij 45, w wherein LeftOr w To the right, w UpwardsOr w DownwardsBe according to Hebb law w Ij3R iR jCalculate, wherein, α 3Be a coefficient, R iBeing the 4th layer the neuronic response of target logic relation coding, is 1; R jBe the neuronic response of object space relation coding of layer 5, its be two between the target level or the distance of vertical direction | Δ x| or | Δ y|; Be calculated as follows:
w Left3| Δ x|, wherein Δ x<0
w To the right3| Δ x|, wherein Δ x>0
w Upwards3| Δ y|, wherein Δ y<0
w Downwards3| Δ y|, wherein Δ y>0
To target logic relation coding neuron P from the 4th layer iResponse R i 4, its response is 1, the object space relation coding neuron S of layer 5 Left, S To the right, S Upwards, S DownwardsResponse s Left, s To the right, s Upwards, s DownwardsDetermine by following response function:
Figure FA20183203200710177656001C00041
s To the right=0, Δ x<0 wherein
Figure FA20183203200710177656001C00042
s Left=0, Δ x>0 wherein
Figure FA20183203200710177656001C00043
s Downwards=0, Δ y<0 wherein
s Upwards=0, Δ y>0 wherein.
CN2007101776560A 2007-11-19 2007-11-19 System and method for visible sensation target context spatial relationship encode Expired - Fee Related CN101159043B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2007101776560A CN101159043B (en) 2007-11-19 2007-11-19 System and method for visible sensation target context spatial relationship encode


Publications (2)

Publication Number Publication Date
CN101159043A CN101159043A (en) 2008-04-09
CN101159043B true CN101159043B (en) 2010-12-15

Family

ID=39307129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007101776560A Expired - Fee Related CN101159043B (en) 2007-11-19 2007-11-19 System and method for visible sensation target context spatial relationship encode

Country Status (1)

Country Link
CN (1) CN101159043B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101923575B (en) * 2010-08-31 2012-10-10 中国科学院计算技术研究所 Target image searching method and system
CN106126688B (en) * 2016-06-29 2020-03-24 厦门趣处网络科技有限公司 Intelligent network information acquisition system and method based on WEB content and structure mining
TWI625681B (en) * 2017-05-11 2018-06-01 國立交通大學 Neural network processing system
CN110222770B (en) * 2019-06-10 2023-06-02 成都澳海川科技有限公司 Visual question-answering method based on combined relationship attention network
CN110687929B (en) * 2019-10-10 2022-08-12 辽宁科技大学 Aircraft three-dimensional space target searching system based on monocular vision and motor imagery



Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20101215

Termination date: 20191119

CF01 Termination of patent right due to non-payment of annual fee