CN105005788B

CN105005788B - Target perception method simulating human low-level vision

Info

Publication number: CN105005788B
Application number: CN201510377158.5A
Authority: CN
Inventors: 潘晨
Original assignee: China Jiliang University
Current assignee: China Jiliang University
Priority date: 2015-06-25
Filing date: 2015-06-25
Publication date: 2018-08-28
Anticipated expiration: 2035-06-25
Also published as: CN105005788A

Abstract

The invention discloses a target perception method simulating human low-level vision, comprising the following steps: 1) performing saliency detection on a target image by a frequency-domain method to obtain a corresponding pixel saliency map; 2) ranking the salient points in the pixel saliency map by saliency; 3) selecting the top N salient points as gaze points and taking the minimal rectangle containing these gaze points as the gaze region; 4) randomly sampling pixels inside the gaze region and randomly sampling an equal number of pixels outside the gaze region; 5) using a support vector machine (SVM) training strategy to train a two-class SVM model, classifying all pixels of the target image with the model, and taking the pixel region classified as positive as the first gaze target area. Following the process of human visual gaze, the invention simulates human vision through gaze-point ranking and a neural network model, so that a target scene is gazed at and perceived quickly and effectively.

Description

Target perception method simulating human low-level vision

Technical field

The present invention relates to the technical field of human vision simulation, and specifically to a target perception method simulating human low-level vision.

Background art

With the development of information technology, computer vision has been widely applied in fields such as low-level feature detection and description, pattern recognition, artificial intelligence reasoning and machine learning algorithms. However, computer vision methods are task-driven: many conditions must be restricted and a corresponding algorithm designed for each actual task, and general-purpose algorithms are lacking. Such methods therefore often run into problems such as high-dimensional nonlinear feature spaces, very large data volumes, and the need for real-time processing, so that research and application face great challenges.

The human visual system, by contrast, works efficiently and reliably in varied environments and has the following advantages: it has attention mechanisms, saliency detection, and the associated selectivity and purposefulness in visual processing; it can use prior knowledge from low-level visual processing onward, so that data-driven bottom-up processing and top-down knowledge guidance coordinate with each other; and context information plays an important role at every level of visual processing, with information from the various modalities in the environment used in combination. However, since the mechanism of human visual perception is not fully understood, constructing machine vision with the characteristics of human vision remains difficult. If human vision could be simulated to perceive targets, this would have an important impact on applications such as target recognition and perception.

Summary of the invention

In view of this, the technical problem to be solved by the present invention is to provide a target perception method simulating human low-level vision that can simulate human vision and gaze at a target scene quickly and effectively.

The technical solution of the present invention is to provide a target perception method simulating human low-level vision, comprising the following steps:

1) performing saliency detection on a target image by a frequency-domain method to obtain a corresponding pixel saliency map, the pixel saliency map being consistent with the target image in pixel position information;

2) ranking the salient points in the pixel saliency map by saliency;

3) selecting the top N salient points as gaze points, and taking the minimal rectangle containing these gaze points as the gaze region;

4) randomly sampling pixels inside the gaze region, and randomly sampling an equal number of pixels outside the gaze region; the sampled pixels inside the gaze region serve as positive samples, and those outside the gaze region serve as negative samples;

5) using a support vector machine (SVM) training strategy to train a two-class SVM model, classifying all pixels of the target image with the model, and taking the pixel region classified as positive as the first gaze target area.
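Step 4) can be illustrated with a short Python sketch. It is not taken from the patent: the function name `sample_gaze_pixels`, the `(top, left, bottom, right)` rectangle convention with inclusive bounds, and the fixed random seed are all assumptions made for illustration.

```python
import numpy as np

def sample_gaze_pixels(shape, rect, n_samples, seed=0):
    """Draw equal numbers of pixel coordinates inside and outside a rectangle.

    shape: (height, width) of the image.
    rect: (top, left, bottom, right), inclusive bounds of the gaze region
          (an assumed convention, not specified by the patent).
    Returns (positives, negatives), each an array of (row, col) pairs.
    """
    h, w = shape
    top, left, bottom, right = rect
    inside = np.zeros((h, w), dtype=bool)
    inside[top:bottom + 1, left:right + 1] = True

    pos_all = np.argwhere(inside)    # candidate positive pixels
    neg_all = np.argwhere(~inside)   # candidate negative pixels

    # Cap the count at what is available so both classes stay the same size.
    k = min(n_samples, len(pos_all), len(neg_all))
    rng = np.random.default_rng(seed)
    pos = pos_all[rng.choice(len(pos_all), size=k, replace=False)]
    neg = neg_all[rng.choice(len(neg_all), size=k, replace=False)]
    return pos, neg
```

Sampling without replacement keeps the two classes balanced, which matches the "equal number" requirement of step 4); how many samples to draw is left to the caller.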

Compared with the prior art, the method of the present invention has the following advantages. Saliency detection by the frequency-domain method quickly produces a pixel saliency map that is consistent with the target image in pixel position information, and the salient points are ranked by saliency. The minimal rectangle formed by the selected gaze points serves as the gaze region; pixels sampled inside it, together with samples from outside, are fed into the neural network, and the region of high saliency is taken as the first gaze target area. On the basis of the first gaze target area, the gaze range can be expanded again to form a corresponding gaze target area, which is compared with the first gaze target area to judge whether the result for the first gaze target area is stable. Following the process of human visual gaze, the invention simulates human vision through gaze-point ranking and a neural network model, so that a target scene is gazed at and perceived quickly and effectively.

As an improvement, the top N+M salient points are selected as gaze points, a gaze region is formed according to step 3), and a corresponding second gaze target area is obtained through steps 4) and 5). The degree of overlap between the first gaze target area and the second gaze target area is compared: a large overlap indicates strong visual perception of the target; a small overlap indicates that sufficient visual perception of the target has not yet formed, in which case the above process is repeated until sufficient visual perception intensity is reached. The final gaze target area is the superposition of all gaze target areas obtained in the above process. This design accelerates the generation and output of the visually perceived target and yields a more stable gaze target area, so the gaze result is more reliable.

As an improvement, after a gaze target area is obtained, the region is cleared to zero in the target image and in the pixel saliency map; the salient points in the updated pixel saliency map are ranked by saliency again, and steps 3), 4) and 5) are repeated to obtain a new gaze target area, so that multiple target areas in the image are obtained in turn. In this way the effective information of the whole image can be gazed at, identified and read, improving the accuracy and completeness of gazing.

As an improvement, the frequency-domain method refers to the following: by means of the hypercomplex Fourier transform, the red, green and blue components of a colour image take part in the Fourier transform as the three imaginary parts of a hypercomplex number; only the phase spectrum information is retained, and the pixel saliency map is obtained by the inverse Fourier transform. This design addresses the problem that the prior art can only handle recognition of black-and-white images, and improves the specific steps of the frequency-domain method accordingly for colour images.

Description of the drawings

Fig. 1 is a flow chart of the target perception method simulating human low-level vision according to the present invention.

Detailed description of the embodiments

The invention is further described below with reference to specific embodiments, but the invention is not limited to these embodiments.

The present invention covers any alternatives, modifications, equivalent methods and schemes made within its spirit and scope. To give the public a thorough understanding of the invention, specific details are described in the following preferred embodiments; a person skilled in the art can nevertheless fully understand the invention without these details. In addition, the drawings are schematic and are not drawn exactly to scale, which is noted here.

As shown in Fig. 1, the target perception method simulating human low-level vision of the present invention comprises the following steps:

3) selecting the top N salient points as gaze points, and taking the minimal rectangle containing these gaze points as the gaze region; choosing the minimal rectangle both ensures accurate sampling and improves the accuracy and stability of the gaze target area;

For simulating human visual perception, the image corresponds to the scene at which human vision gazes: whatever the size of the scene, the extent of its image on the retina is constant, and the same holds for the image in machine vision.

Saliency detection on the target image by the frequency-domain method can be implemented by the following steps: a two-dimensional discrete Fourier transform $F[I(i, j)]$ is applied to the target image $I(i, j)$, converting the image from the spatial domain to the frequency domain, and the phase information $P(u, v)$ is obtained:

$$P(u, v) = \varphi\big(F[I(i, j)]\big) \tag{1}$$

where $F$ denotes the two-dimensional discrete Fourier transform and $\varphi$ denotes the phase operation. After the phase information is passed through the inverse Fourier transform, the saliency map Sa_Map is obtained in the spatial domain:

$$\mathrm{Sa\_Map}(i, j) = \big|F^{-1}[\exp\{jP(u, v)\}]\big|^{2} \tag{2}$$
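For a single-channel image, the phase-spectrum saliency computation described here can be sketched with NumPy's FFT routines. The function name `phase_saliency` is an assumption, and no smoothing or normalisation is applied because the text does not mention any.

```python
import numpy as np

def phase_saliency(gray):
    """Phase-spectrum saliency of a 2-D grayscale image."""
    spectrum = np.fft.fft2(gray.astype(float))   # F[I(i, j)]
    phase = np.angle(spectrum)                   # P(u, v), the phase spectrum
    recon = np.fft.ifft2(np.exp(1j * phase))     # F^-1[exp{jP(u, v)}]
    return np.abs(recon) ** 2                    # Sa_Map(i, j)
```

The returned array has the same shape as the input, so each saliency value lines up with the pixel position it describes, as step 1) requires.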

In Fig. 1, the training data, classification model, results and so on correspond to the implementation of the support vector machine (SVM) training strategy. The specific implementation is as follows:

Let the training set contain $l$ samples $\{(x_k, y_k)\}_{k=1}^{l}$, where $x_k$ is the input vector and $y_k \in \{-1, +1\}$ is the negative or positive class label. The SVM first learns a model from the training set, the aim being to find the optimal separating hyperplane in feature space so that test data are classified as correctly as possible. Considering the general case in which the training set is nonlinearly separable, a Gaussian radial basis kernel function is first selected:

$$K(x, x_i) = \exp\{-q\,\|x - x_i\|^{2}\} \tag{3}$$

The training data $x_i$ are mapped into a high-dimensional linear feature space, in which the optimal separating hyperplane is constructed. Here $q$ is the radial basis kernel parameter, and the discriminant function of the classifier is

$$f(x) = \operatorname{sgn}\Big(\sum_{x_i \in \mathrm{SV}} \alpha_i^{*}\, y_i\, K(x, x_i) + b^{*}\Big) \tag{4}$$

In training, given the training set $\{(x_k, y_k)\}_{k=1}^{l}$, $q$ and the other conditions, a quadratic programming solver is used to obtain $b^{*}$, $\alpha_i^{*}$ and the support vectors (SV) in formula (4), which together form the trained SVM model. In testing, unknown data $x$ are substituted into formula (4) using this SVM model to obtain the predicted class.

Using the kernel trick, the SVM avoids the curse of dimensionality faced by traditional learning algorithms. It is based on the structural risk minimization principle; its classification performance is determined by only a small number of support vectors (SV), and it generalizes well. In practical problems this makes it convenient to use prior knowledge to select a small number of samples and construct a classifier through SVM learning. It overcomes the defect of traditional learning algorithms based on the empirical risk minimization principle, whose performance is theoretically guaranteed only as the number of samples tends to infinity; because it solves a quadratic programming problem, it avoids drawbacks of traditional neural network algorithms such as the empirical nature of network construction and the tendency to become trapped in local minima; and it is suited to segmenting image targets that are complex and difficult to describe quantitatively.
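The training and classification stage can be sketched with scikit-learn's `SVC`, whose RBF kernel has the form exp(-gamma * ||x - x'||^2), so the kernel parameter q maps onto `gamma`. Several things here are assumptions rather than statements from the patent: the per-pixel feature vector is taken to be the pixel's channel values, and the function name `gaze_target_mask` and the default `q=1.0` are illustrative only.

```python
import numpy as np
from sklearn.svm import SVC

def gaze_target_mask(image, pos, neg, q=1.0):
    """Train a two-class RBF SVM on sampled pixels and label every pixel.

    image: (H, W, C) float array; each pixel's C channel values are used as
           its feature vector (an assumption, the patent does not fix this).
    pos, neg: arrays of (row, col) coordinates of positive/negative samples.
    q: RBF kernel parameter, passed to scikit-learn as `gamma`.
    """
    h, w, c = image.shape
    x_train = np.vstack([image[pos[:, 0], pos[:, 1]],
                         image[neg[:, 0], neg[:, 1]]])
    y_train = np.concatenate([np.ones(len(pos)), -np.ones(len(neg))])

    model = SVC(kernel="rbf", gamma=q)   # K(x, x_i) = exp(-q * ||x - x_i||^2)
    model.fit(x_train, y_train)

    labels = model.predict(image.reshape(-1, c))
    return labels.reshape(h, w) == 1     # True where a pixel is classed positive
```

The boolean mask returned here plays the role of the gaze target area: every pixel of the image is classified, not only the sampled ones.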

To optimize the invention, it is necessary to judge whether the first gaze target area obtained is stable; in the flow chart this appears as the decision on whether the output is stable. A further target area therefore needs to be formed:

The top N+M salient points are selected as gaze points, a gaze region is formed according to step 3), and a corresponding second gaze target area is obtained through steps 4) and 5). The degree of overlap between the first gaze target area and the second gaze target area is compared: a large overlap indicates strong visual perception of the target; a small overlap indicates that sufficient visual perception of the target has not yet formed, in which case the above process is repeated until sufficient visual perception intensity is reached. The final gaze target area is the superposition of all gaze target areas obtained in the above process.
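The stability check can be sketched as below. The patent does not define how the degree of overlap is measured or what counts as sufficient, so this sketch makes explicit assumptions: overlap is measured as intersection-over-union of boolean masks, successive target areas are compared, and `threshold`, `max_rounds` and the caller-supplied `make_mask` function are hypothetical.

```python
import numpy as np

def overlap_degree(mask_a, mask_b):
    """Intersection-over-union of two boolean masks (0.0 when both are empty)."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)

def stable_target(make_mask, n, m, threshold=0.8, max_rounds=10):
    """Add m gaze points per round until successive target masks agree.

    make_mask(k) is a caller-supplied function returning the boolean target
    mask obtained from the top-k gaze points (steps 3 to 5).
    """
    previous = make_mask(n)
    merged = previous.copy()
    for round_index in range(1, max_rounds + 1):
        current = make_mask(n + round_index * m)
        merged |= current                  # superpose every target area so far
        if overlap_degree(previous, current) >= threshold:
            break                          # overlap is large enough: stable
        previous = current
    return merged
```

Returning the union of all masks follows the statement that the final gaze target area is the superposition of all gaze target areas produced along the way.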

After a gaze target area is obtained, the region is cleared to zero in the target image and in the pixel saliency map; the salient points in the updated pixel saliency map are ranked by saliency again, and steps 3), 4) and 5) are repeated to obtain a new gaze target area, so that multiple target areas in the image are obtained in turn. In this way the information of all effective gaze regions can be segmented from the image, constructing machine vision that simulates human vision.
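The clear-and-repeat loop can be sketched as follows. For brevity this sketch clears only the saliency map, whereas the text also clears the region in the target image. The stopping rule (`max_targets`, `min_saliency`) and the caller-supplied `find_target` function are assumptions, since the patent does not say when to stop.

```python
import numpy as np

def extract_targets(sal_map, find_target, max_targets=5, min_saliency=1e-6):
    """Collect target masks one at a time, clearing each from the saliency map.

    find_target(sal) is a caller-supplied function returning a boolean mask
    for the most salient remaining target (steps 3 to 5 applied to `sal`).
    """
    sal = sal_map.astype(float).copy()   # work on a copy; the input is untouched
    targets = []
    for _ in range(max_targets):
        if sal.max() <= min_saliency:    # nothing salient is left
            break
        mask = find_target(sal)
        if not mask.any():
            break
        targets.append(mask)
        sal[mask] = 0.0                  # clear the region before re-ranking
    return targets
```

Because each found region is zeroed before the next pass, the next ranking is driven by the remaining salient points, so successive calls move on to different targets.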

The frequency-domain method refers to the following: by means of the hypercomplex Fourier transform, the red, green and blue components of a colour image take part in the Fourier transform as the three imaginary parts of a hypercomplex number; only the phase spectrum information is retained, and the pixel saliency map is obtained by the inverse Fourier transform. This design addresses the problem that the prior art can only handle recognition of black-and-white images, and improves the specific steps of the frequency-domain method accordingly for colour images. A hypercomplex number consists of four parts and is expressed as

$$q = a + bi + cj + dk \tag{5}$$

where $a$, $b$, $c$, $d$ are real numbers and $i$, $j$, $k$ are imaginary units with the following properties: $i^{2} = j^{2} = k^{2} = ijk = -1$, $ij = -ji = k$, $ki = -ik = j$, $jk = -kj = i$.
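The multiplication rules of the imaginary units can be checked with a small Python sketch of the Hamilton product. Representing a hypercomplex number as a tuple `(a, b, c, d)` and the names `qmul`, `I`, `J`, `K` and `MINUS_ONE` are choices made for this illustration.

```python
def qmul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,   # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,   # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,   # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,   # k component
    )

# The three imaginary units and the real number -1 in tuple form.
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
MINUS_ONE = (-1, 0, 0, 0)
```

For example, `qmul(I, J)` gives `K` while `qmul(J, I)` gives the negative of `K`, which shows that the product is not commutative.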

The RGB model of a colour image can be described as a pure hypercomplex number with no real part:

$$f = R(m, n)\,i + G(m, n)\,j + B(m, n)\,k \tag{6}$$

where $R(m, n)$, $G(m, n)$, $B(m, n)$ denote the three RGB components of the image, and $m$, $n$ are the pixel coordinates. If $q = f$, then $a = 0$, $b = R(m, n)$, $c = G(m, n)$, $d = B(m, n)$. According to formula (6), the hypercomplex Fourier transform of the colour vector image can be carried out:

where fft2() denotes the conventional two-dimensional Fourier transform, real() takes the real part, and imag() takes the imaginary part; $\mu$ is a unit pure imaginary vector. Here, only the phase spectrum $P(f)$ of $F^{R}(v, u)$ is needed:

Let: $$A = e^{jP(f)} \tag{9}$$

The hypercomplex inverse Fourier transform can be obtained by combining conventional two-dimensional inverse fast Fourier transforms (ifft2), as in formula (10):

where B = fft2(b), C = fft2(c), D = fft2(d).

$$\mathrm{Sa\_Map}(m, n) = \mathrm{real}\big(F^{-R}(v, u)\big) \tag{11}$$

This is the saliency map sought. Because the integrity of each colour pixel is preserved before and after data processing, colour distortion caused by transforming or exchanging vector components is avoided.
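A colour phase-only saliency map can be sketched in Python with a substituted technique. Rather than the combination of fft2 and ifft2 calls described in this text, the sketch uses the symplectic decomposition of a quaternion into two complex planes, f = f1 + f2 j with f1 = R i and f2 = G + B i, so that a quaternion Fourier transform with axis i reduces to two ordinary complex FFTs. It also returns the squared modulus of the reconstruction rather than its real part. The function name and the `eps` guard are assumptions.

```python
import numpy as np

def quaternion_phase_saliency(rgb, eps=1e-12):
    """Phase-only saliency of an (H, W, 3) RGB image as a pure quaternion field.

    A sketch using the symplectic decomposition, not the patent's formulas.
    """
    r, g, b = (rgb[..., k].astype(float) for k in range(3))
    # f = R i + G j + B k = f1 + f2 j, with f1 = R i and f2 = G + B i.
    f1 = 1j * r
    f2 = g + 1j * b
    F1, F2 = np.fft.fft2(f1), np.fft.fft2(f2)

    # Quaternion modulus of the spectrum; dividing by it keeps only the phase.
    modulus = np.sqrt(np.abs(F1) ** 2 + np.abs(F2) ** 2) + eps
    q1 = np.fft.ifft2(F1 / modulus)
    q2 = np.fft.ifft2(F2 / modulus)

    # Squared modulus of the reconstructed quaternion at each pixel.
    return np.abs(q1) ** 2 + np.abs(q2) ** 2
```

All three colour channels are normalised by one shared modulus, so each pixel is processed as a whole vector rather than channel by channel, which is the property the text credits with avoiding colour distortion.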

Only preferred embodiments of the present invention have been described above, and they should not be construed as limiting the scope of the claims. The invention is not limited to the above embodiments, and its specific structure may vary. In short, all changes made within the scope of protection of the independent claims of the invention fall within the scope of protection of the invention.

Claims

1. A target perception method simulating human low-level vision, characterized by comprising the following steps:

1) performing saliency detection on a target image by a frequency-domain method to obtain a corresponding pixel saliency map, the pixel saliency map being consistent with the target image in pixel position information;

4) randomly sampling pixels inside the gaze region, and randomly sampling an equal number of pixels outside the gaze region; the sampled pixels inside the gaze region serve as positive samples, and those outside the gaze region serve as negative samples;

5) using a support vector machine (SVM) training strategy to train a two-class SVM model, classifying all pixels of the target image with the model, and taking the pixel region classified as positive as the first gaze target area;

selecting the top N+M salient points as gaze points, forming a gaze region according to step 3), and obtaining a corresponding second gaze target area through steps 4) and 5);

comparing the degree of overlap between the first gaze target area and the second gaze target area, wherein a large overlap indicates strong visual perception of the target, and a small overlap indicates that sufficient visual perception of the target has not yet formed, in which case the above process is repeated until sufficient visual perception intensity is reached; the final gaze target area is the superposition of all gaze target areas obtained in the above process.

2. The target perception method simulating human low-level vision according to claim 1, characterized in that: after a gaze target area is obtained, the region is cleared to zero in the target image and in the pixel saliency map; the salient points in the updated pixel saliency map are ranked by saliency again, and steps 3), 4) and 5) are repeated to obtain a new gaze target area; multiple target areas in the image are thereby obtained in turn.

3. The target perception method simulating human low-level vision according to claim 1 or 2, characterized in that: the frequency-domain method refers to the following: by means of the hypercomplex Fourier transform, the red, green and blue components of a colour image take part in the Fourier transform as the three imaginary parts of a hypercomplex number; only the phase spectrum information is retained, and the pixel saliency map is obtained by the inverse Fourier transform.