CN105005788B - Target perception method simulating human low-level vision - Google Patents

Target perception method simulating human low-level vision

Info

Publication number
CN105005788B
Authority
CN
China
Prior art keywords
target
pixel
area
fixation object
saliency map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510377158.5A
Other languages
Chinese (zh)
Other versions
CN105005788A (en)
Inventor
潘晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Jiliang University
Original Assignee
China Jiliang University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Jiliang University
Priority to CN201510377158.5A
Publication of CN105005788A
Application granted
Publication of CN105005788B
Current legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10 Character recognition
    • G06V30/19 Recognition using electronic means
    • G06V30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V30/194 References adjustable by an adaptive method, e.g. learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection

Abstract

The invention discloses a target perception method simulating human low-level vision, comprising the following steps: 1) performing saliency detection on a target image by a frequency-domain method to obtain the corresponding pixel saliency map; 2) sorting the salient points in the pixel saliency map by saliency; 3) selecting the top N salient points as fixation points and taking the smallest rectangle containing these fixation points as the fixation region; 4) randomly sampling pixels inside the fixation region and randomly sampling an equal number of pixels outside it; 5) training a two-class SVM model using a support vector machine training strategy, classifying all pixels of the target image with this model, and taking the pixel region classified as positive as the first fixation target area. Following the process of human visual fixation, the invention simulates human vision through fixation-point ranking and a neural network model, so as to fixate on and perceive a target scene quickly and effectively.

Description

Target perception method simulating human low-level vision
Technical field
The present invention relates to the technical field of human vision simulation, and in particular to a target perception method simulating human low-level vision.
Background art
With the development of information technology, computer vision has been widely applied in fields such as low-level feature detection and description, pattern recognition, artificial-intelligence reasoning and machine learning. However, computer vision is task-driven: many conditions must be constrained and a dedicated algorithm designed for each practical task, and general-purpose algorithms are lacking. It therefore frequently runs into problems such as high-dimensional nonlinear feature spaces, very large data volumes and the need for real-time processing, so that both its research and its application face great challenges.
The human visual system, by contrast, works efficiently and reliably in a wide variety of environments. It has the following advantages: it has an attention mechanism, saliency detection, and the associated selectivity and purposefulness in visual processing; it can exploit prior knowledge from low-level visual processing onward, so that data-driven bottom-up processing and knowledge-guided top-down processing cooperate; and contextual information plays an important role at every level of visual processing, allowing information of various modalities in the environment to be used together. However, because the mechanisms of human visual perception are not yet fully understood, constructing machine vision with the characteristics of human vision remains difficult. If human vision could be simulated to achieve perception of a target, this would have an important impact on applications such as target recognition and perception.
Summary of the invention
In view of this, the technical problem to be solved by the present invention is to provide a target perception method simulating human low-level vision that can simulate human vision and fixate on a target scene quickly and effectively.
The technical solution of the present invention is a target perception method simulating human low-level vision, comprising the following steps:
1) performing saliency detection on the target image by a frequency-domain method to obtain the corresponding pixel saliency map, whose pixel positions correspond to those of the target image;
2) sorting the salient points in the pixel saliency map by saliency;
3) selecting the top N salient points as fixation points and taking the smallest rectangle containing these fixation points as the fixation region;
4) randomly sampling pixels inside the fixation region and randomly sampling an equal number of pixels outside it; the sampled pixels inside the fixation region serve as positive samples and those outside it as negative samples;
5) training a two-class SVM model using a support vector machine training strategy, classifying all pixels of the target image with this model, and taking the pixel region classified as positive as the first fixation target area.
Compared with the prior art, the method of the present invention has the following advantages. Saliency detection by the frequency-domain method quickly produces a pixel saliency map whose pixel positions correspond to those of the target image. The salient points are sorted by saliency, the smallest rectangle enclosing the selected fixation points is taken as the fixation region and sampled, and these samples enter the neural network together with the samples from outside the region, so that the highly salient region is obtained as the first fixation target area. On the basis of the first fixation target area, the fixation range can then be enlarged to form a further fixation target area, which is compared with the first to judge whether the first result is stable. Following the process of human visual fixation, the invention simulates human vision through fixation-point ranking and a neural network model, so as to fixate on and perceive a target scene quickly and effectively.
As an improvement, the top N+M salient points are selected as fixation points, a fixation region is formed according to step 3), and a corresponding second fixation target area is obtained through steps 4) and 5). The degree of overlap between the first and second fixation target areas is compared: a large overlap indicates a strong visual perception of the target; a small overlap indicates that a sufficiently strong visual perception of the target has not yet formed, and the above process is repeated until sufficient perception strength is reached. The final fixation target area is the superposition of all the fixation target areas obtained in this process. This design speeds up the generation and output of the visually perceived target and yields a more stable fixation target area, making the fixation result more reliable.
As an improvement, after a fixation target area is obtained, that area is cleared to zero in both the target image and the pixel saliency map; the salient points in the updated pixel saliency map are sorted by saliency again, and steps 3), 4) and 5) are repeated to obtain a new fixation target area. In this way multiple target areas in the image are obtained one after another. This allows the useful information across the whole image to be fixated on, recognized and read, improving the accuracy and completeness of fixation.
As an improvement, the frequency-domain method is a hypercomplex Fourier transform: the red, green and blue components of a colour image take part in the Fourier transform as the three imaginary parts of a hypercomplex number, only the phase-spectrum information is retained, and the pixel saliency map is obtained by the inverse Fourier transform. This design addresses the problem that the prior art can only handle black-and-white images; it adapts the specific steps of the frequency-domain method to colour images.
Brief description of the drawings
Fig. 1 is a flowchart of the target perception method simulating human low-level vision according to the present invention.
Detailed description of the embodiments
The invention is further described below with reference to specific embodiments, but it is not limited to these embodiments.
The present invention covers any replacement, modification, equivalent method or scheme made within its spirit and scope. To give the public a thorough understanding of the invention, specific details are described in the following preferred embodiments, although a person skilled in the art can fully understand the invention without them. In addition, the drawings are schematic and are not drawn exactly to scale.
As shown in Fig. 1, the target perception method simulating human low-level vision according to the present invention comprises the following steps:
1) performing saliency detection on the target image by a frequency-domain method to obtain the corresponding pixel saliency map, whose pixel positions correspond to those of the target image;
2) sorting the salient points in the pixel saliency map by saliency;
3) selecting the top N salient points as fixation points and taking the smallest rectangle containing these fixation points as the fixation region; choosing the smallest rectangle both ensures accurate sampling and improves the accuracy and stability of the fixation target area;
4) randomly sampling pixels inside the fixation region and randomly sampling an equal number of pixels outside it; the sampled pixels inside the fixation region serve as positive samples and those outside it as negative samples;
5) training a two-class SVM model using a support vector machine training strategy, classifying all pixels of the target image with this model, and taking the pixel region classified as positive as the first fixation target area.
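Steps 2) to 4) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `fixation_region` and `sample_pixels`, the sample count `k` and the toy saliency map are all hypothetical, and every pixel is treated as a candidate salient point.

```python
import numpy as np

def fixation_region(sal_map, n):
    """Steps 2-3: inclusive bounding rectangle (r0, r1, c0, c1) of the n most salient pixels."""
    order = np.argsort(sal_map, axis=None)[::-1]          # flat indices, most salient first
    rows, cols = np.unravel_index(order[:n], sal_map.shape)
    return tuple(int(v) for v in (rows.min(), rows.max(), cols.min(), cols.max()))

def sample_pixels(shape, rect, k, rng):
    """Step 4: k random pixel coordinates inside rect (positive) and k outside (negative)."""
    r0, r1, c0, c1 = rect
    inside = np.zeros(shape, dtype=bool)
    inside[r0:r1 + 1, c0:c1 + 1] = True
    pos = np.argwhere(inside)
    neg = np.argwhere(~inside)
    k = min(k, len(pos), len(neg))                        # keep both classes the same size
    pos = pos[rng.choice(len(pos), size=k, replace=False)]
    neg = neg[rng.choice(len(neg), size=k, replace=False)]
    return pos, neg

# Toy saliency map with three salient points (hypothetical values).
rng = np.random.default_rng(0)
sal = np.zeros((20, 30))
sal[5, 8], sal[9, 12], sal[7, 10] = 3.0, 2.0, 1.0
rect = fixation_region(sal, n=3)
pos, neg = sample_pixels(sal.shape, rect, k=10, rng=rng)
```

The returned coordinates would then be turned into feature vectors (for example the colour values at those pixels, which the text does not specify) before training the classifier in step 5).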
In simulating human visual perception, the image corresponds to the scene at which human vision is gazing. Whatever the size of the scene, the extent of its image on the retina is constant, and the same holds for an image in machine vision.
Saliency detection on the target image by the frequency-domain method can be implemented as follows. A two-dimensional discrete Fourier transform F[I(i, j)] is applied to the target image I(i, j), taking the image from the spatial domain to the frequency domain, and the phase information P(u, v) is obtained:
$$P(u, v) = \varphi\big(F[I(i, j)]\big) \tag{1}$$
where F denotes the two-dimensional discrete Fourier transform and $\varphi(\cdot)$ denotes the phase operation. After the inverse Fourier transform of the phase information, the saliency map Sa_Map is obtained in the spatial domain:
$$\mathrm{Sa\_Map}(i, j) = \left| F^{-1}\left[\exp\{ jP(u, v) \}\right] \right|^{2} \tag{2}$$
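Formulas (1) and (2) map directly onto NumPy's FFT routines. The sketch below assumes a single-channel image; the function name `phase_saliency` and the toy image are illustrative only.

```python
import numpy as np

def phase_saliency(image):
    """Formulas (1)-(2): keep only the Fourier phase, invert, square the modulus."""
    spectrum = np.fft.fft2(image)                 # F[I(i, j)]
    phase = np.angle(spectrum)                    # P(u, v)
    recon = np.fft.ifft2(np.exp(1j * phase))      # F^{-1}[exp{jP(u, v)}]
    return np.abs(recon) ** 2                     # Sa_Map(i, j)

# A single bright pixel on a uniform background: its spectrum already has
# unit amplitude everywhere, so the phase-only reconstruction peaks at that pixel.
img = np.zeros((32, 32))
img[10, 20] = 1.0
sal_map = phase_saliency(img)
peak = np.unravel_index(np.argmax(sal_map), sal_map.shape)
```

Because the amplitude spectrum is discarded, the output has the same size as the input and each value refers to the same pixel position, which is the property step 1) relies on.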
In Fig. 1, the training data, classification model, results and so on belong to the implementation process that uses the support vector machine (SVM) training strategy. The specific implementation is as follows.
Let the training set contain l samples $(x_k, y_k)$, $k = 1, \dots, l$, where $x_k$ is the input vector and $y_k \in \{-1, +1\}$ is the class label. The SVM first learns a model from the training set, the aim being to find the optimal separating hyperplane in feature space so that test data are classified as correctly as possible. In the general case, where the training set is not linearly separable, a Gaussian radial basis kernel function is first chosen:
$$K(x, x_i) = \exp\{-q \lVert x - x_i \rVert^{2}\} \tag{3}$$
It maps the training data $x_i$ into a high-dimensional linear feature space in which the optimal separating hyperplane is constructed, where q is the parameter of the radial basis kernel. The discriminant function of the classifier is then
$$f(x) = \operatorname{sgn}\Big( \sum_{i \in \mathrm{SV}} \alpha_i^{*} y_i K(x, x_i) + b^{*} \Big) \tag{4}$$
In the training process, given the training set, q and the other parameters, quadratic programming is used to solve for $b^{*}$, $\alpha_i^{*}$ and the support vectors (SV) in formula (4), which together form the trained SVM model. In the test process, unknown data x are substituted into formula (4) using this SVM model to obtain the predicted class.
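The training and test phases can be illustrated with a deliberately simplified solver. The sketch below is not the quadratic-programming method referred to in the text: it maximizes the SVM dual by projected gradient ascent with the bias fixed at zero, which removes the equality constraint of the full problem. The names `rbf_kernel`, `train_svm`, `predict`, the parameters `c` and `iters`, and the two-dimensional toy features are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, q):
    """Formula (3): K(x, x_i) = exp(-q * ||x - x_i||^2) for all pairs of rows."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-q * d2)

def train_svm(x, y, q=1.0, c=10.0, iters=2000):
    """Simplified dual solver: projected gradient ascent, bias b fixed at 0."""
    h = (y[:, None] * y[None, :]) * rbf_kernel(x, x, q)
    step = 1.0 / np.linalg.eigvalsh(h)[-1]         # 1 / largest eigenvalue keeps the ascent stable
    alpha = np.zeros(len(y))
    for _ in range(iters):
        alpha = np.clip(alpha + step * (1.0 - h @ alpha), 0.0, c)
    sv = alpha > 1e-8                              # support vectors have alpha > 0
    return x[sv], y[sv], alpha[sv]

def predict(x_new, sv_x, sv_y, sv_alpha, q=1.0, b=0.0):
    """Formula (4): sign of the kernel expansion over the support vectors."""
    score = rbf_kernel(x_new, sv_x, q) @ (sv_alpha * sv_y) + b
    return np.where(score >= 0, 1, -1)

# Toy data: positive samples near (2, 2), negative samples near (-2, -2).
rng = np.random.default_rng(0)
x = np.vstack([2.0 + rng.uniform(-0.5, 0.5, size=(20, 2)),
               -2.0 + rng.uniform(-0.5, 0.5, size=(20, 2))])
y = np.concatenate([np.ones(20), -np.ones(20)])
sv_x, sv_y, sv_alpha = train_svm(x, y)
labels = predict(np.array([[2.0, 2.0], [-2.0, -2.0]]), sv_x, sv_y, sv_alpha)
```

In the method itself the rows of `x` would be the feature vectors of the sampled pixels, and `predict` would be applied to every pixel of the target image to obtain the fixation target area. A production system would more likely rely on an established SVM library that solves the full problem including the bias term.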
By using the kernel trick, the SVM avoids the curse of dimensionality faced by traditional learning algorithms. It is based on the structural risk minimization principle; its classification performance is determined by only a small number of support vectors (SV), and it generalizes well. In practical problems this makes it possible to select a small number of samples using prior knowledge and to construct the classifier by SVM learning. It overcomes the drawback of traditional learning algorithms based on empirical risk minimization, whose performance is theoretically guaranteed only as the number of samples tends to infinity. Because it solves a quadratic programming problem, it avoids the drawbacks of traditional neural network algorithms, such as the empirical nature of network construction and the tendency to become trapped in local minima. It is well suited to segmenting image targets that are complex and hard to describe quantitatively.
To optimize the invention, it is necessary to judge whether the first fixation target area obtained is stable; in the block diagram this corresponds to judging whether the output is stable. A further target area therefore needs to be formed:
The top N+M salient points are selected as fixation points, a fixation region is formed according to step 3), and a corresponding second fixation target area is obtained through steps 4) and 5). The degree of overlap between the first and second fixation target areas is compared: a large overlap indicates a strong visual perception of the target; a small overlap indicates that a sufficiently strong visual perception of the target has not yet formed, and the above process is repeated until sufficient perception strength is reached. The final fixation target area is the superposition of all the fixation target areas obtained in this process.
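The text does not define how the degree of overlap is measured or what counts as large. One plausible reading, shown below purely as an assumption, is the intersection-over-union of the two target-area masks compared against a threshold; the threshold value 0.6 and the names `overlap_degree`, `first`, `second` are hypothetical.

```python
import numpy as np

def overlap_degree(mask_a, mask_b):
    """Intersection over union of two boolean target-area masks (an assumed overlap measure)."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)

first = np.zeros((10, 10), dtype=bool)
first[2:6, 2:6] = True                        # first fixation target area, 16 pixels
second = np.zeros((10, 10), dtype=bool)
second[2:6, 2:8] = True                       # second fixation target area, 24 pixels

degree = overlap_degree(first, second)        # 16 shared pixels out of 24 in the union
stable = degree >= 0.6                        # hypothetical stability threshold
final_area = np.logical_or(first, second)     # superposition of the target areas
```

If `stable` were false, the loop would enlarge the fixation-point set again and accumulate the new area into `final_area` in the same way.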
After a fixation target area is obtained, that area is cleared to zero in both the target image and the pixel saliency map; the salient points in the updated pixel saliency map are sorted by saliency again, and steps 3), 4) and 5) are repeated to obtain a new fixation target area. In this way multiple target areas in the image are obtained one after another, the information of all valid fixation regions can be segmented out of the image, and machine vision that simulates human vision is constructed.
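The clearing step can be sketched as a simple masking operation. The helper name `clear_target` and the toy values are assumptions; the same masking would be applied to the target image itself.

```python
import numpy as np

def clear_target(sal_map, target_mask):
    """Zero the saliency of an already perceived target area so the next ranking skips it."""
    updated = sal_map.copy()
    updated[target_mask] = 0.0
    return updated

sal = np.array([[0.1, 0.9, 0.2],
                [0.0, 0.8, 0.3],
                [0.7, 0.1, 0.0]])
target = np.zeros(sal.shape, dtype=bool)
target[0:2, 1] = True                         # first target area covers the 0.9 and 0.8 pixels
sal2 = clear_target(sal, target)
next_peak = np.unravel_index(np.argmax(sal2), sal2.shape)   # now the 0.7 pixel at (2, 0)
```

Re-ranking `sal2` therefore starts the next fixation from the strongest salient point outside the areas already found.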
The frequency-domain method is a hypercomplex Fourier transform: the red, green and blue components of a colour image take part in the Fourier transform as the three imaginary parts of a hypercomplex number, only the phase-spectrum information is retained, and the pixel saliency map is obtained by the inverse Fourier transform. This design addresses the problem that the prior art can only handle black-and-white images; it adapts the specific steps of the frequency-domain method to colour images. A hypercomplex number consists of four parts and is written
$$q = a + bi + cj + dk \tag{5}$$
where a, b, c, d are real numbers and i, j, k are imaginary units with the following properties: $i^2 = j^2 = k^2 = ijk = -1$, $ij = -ji = k$, $ki = -ik = j$, $jk = -kj = i$.
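The multiplication rules above are those of the quaternions, and they can be checked with a small Hamilton product. The tuple representation and the function name `qmul` are illustrative choices.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

i = (0, 1, 0, 0)
j = (0, 0, 1, 0)
k = (0, 0, 0, 1)
minus_one = (-1, 0, 0, 0)

ij = qmul(i, j)      # equals k
ji = qmul(j, i)      # equals -k, so multiplication is not commutative
```

The non-commutativity is why a hypercomplex Fourier transform has to fix the side on which the exponential kernel multiplies the image.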
The RGB model of a colour image can be described as a pure hypercomplex number with no real part:
$$f = R(m, n)\,i + G(m, n)\,j + B(m, n)\,k \tag{6}$$
where R(m, n), G(m, n) and B(m, n) are the three RGB components of the image and m, n are the pixel coordinates. Setting q = f gives a = 0, b = R(m, n), c = G(m, n), d = B(m, n). The hypercomplex Fourier transform $F^{R}(v, u)$ of the colour vector image in formula (6) is given by formula (7):
[Formula (7) is given as an image in the original publication and is not reproduced in this text.]
where fft2() denotes the conventional two-dimensional Fourier transform, real() takes the real part and imag() takes the imaginary part; the formula also involves a unit pure imaginary vector. Here only the phase spectrum P(f) of $F^{R}(v, u)$ is taken:
$$P(f) = \varphi\big(F^{R}(v, u)\big) \tag{8}$$
Let
$$A = e^{jP(f)} \tag{9}$$
The hypercomplex inverse Fourier transform can be obtained by combining conventional two-dimensional inverse fast Fourier transforms (ifft2), as in formula (10):
[Formula (10) is given as an image in the original publication and is not reproduced in this text.]
where B = fft2(b), C = fft2(c), D = fft2(d). Then
$$\mathrm{Sa\_Map}(m, n) = \operatorname{real}\big(F^{-R}(v, u)\big) \tag{11}$$
is the resulting saliency map. Because each colour pixel is kept whole before and after processing, colour distortion caused by transforming or exchanging vector components is avoided.
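One way to realize a hypercomplex transform with ordinary FFTs is sketched below. It is an assumption-laden illustration rather than the formulas of the publication: it uses a left-sided transform whose axis is taken to be the unit pure quaternion (i + j + k)/√3, stores a quaternion image as an array of shape (4, H, W), and keeps the phase by normalizing each spectral coefficient to unit modulus. All function names are hypothetical.

```python
import numpy as np

MU = np.ones(3) / np.sqrt(3.0)   # assumed transform axis mu = (i + j + k) / sqrt(3)

def _embed(z):
    """Map a complex array x + iy to the quaternion array x + mu*y, shape (4, H, W)."""
    return np.stack([z.real, MU[0] * z.imag, MU[1] * z.imag, MU[2] * z.imag])

def _times_unit(q, unit):
    """Right-multiply a quaternion array q = (w, x, y, z) by 1, i, j or k (unit = 0..3)."""
    w, x, y, z = q
    if unit == 0:
        return np.stack([w, x, y, z])
    if unit == 1:
        return np.stack([-x, w, z, -y])      # q * i
    if unit == 2:
        return np.stack([-y, -z, w, x])      # q * j
    return np.stack([-z, y, -x, w])          # q * k

def qft(q, inverse=False):
    """Left-sided hypercomplex Fourier transform built from four ordinary 2-D FFTs."""
    fft = np.fft.ifft2 if inverse else np.fft.fft2
    return sum(_times_unit(_embed(fft(q[n])), n) for n in range(4))

def phase_only(rgb):
    """Pure quaternion image -> unit-modulus spectrum -> inverse transform."""
    h, w, _ = rgb.shape
    f = np.stack([np.zeros((h, w)), rgb[..., 0], rgb[..., 1], rgb[..., 2]])
    spec = qft(f)
    modulus = np.sqrt((spec ** 2).sum(axis=0))
    unit_spec = spec / np.maximum(modulus, 1e-12)    # keep the phase, drop the amplitude
    return qft(unit_spec, inverse=True)

rng = np.random.default_rng(0)
rgb = rng.random((16, 16, 3))
g = phase_only(rgb)
sal_real = g[0]                      # real part, as in formula (11)
sal_energy = (g ** 2).sum(axis=0)    # squared modulus, by analogy with formula (2)
```

Both `sal_real` and `sal_energy` are shown because the text takes the real part in formula (11) while the greyscale case in formula (2) uses the squared modulus; which of the two is thresholded or ranked in practice is left open here.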
The above describes only preferred embodiments of the invention and is not to be construed as limiting the scope of the claims. The invention is not limited to the above embodiments, and its specific structure may vary. In short, all changes made within the scope of protection of the independent claims fall within the scope of protection of the invention.

Claims (3)

1. A target perception method simulating human low-level vision, characterized by comprising the following steps:
1) performing saliency detection on the target image by a frequency-domain method to obtain the corresponding pixel saliency map, whose pixel positions correspond to those of the target image;
2) sorting the salient points in the pixel saliency map by saliency;
3) selecting the top N salient points as fixation points and taking the smallest rectangle containing these fixation points as the fixation region;
4) randomly sampling pixels inside the fixation region and randomly sampling an equal number of pixels outside it; the sampled pixels inside the fixation region serve as positive samples and those outside it as negative samples;
5) training a two-class SVM model using a support vector machine training strategy, classifying all pixels of the target image with this model, and taking the pixel region classified as positive as the first fixation target area;
selecting the top N+M salient points as fixation points, forming a fixation region according to step 3), and obtaining a corresponding second fixation target area through steps 4) and 5);
comparing the degree of overlap between the first fixation target area and the second fixation target area, wherein a large overlap indicates a strong visual perception of the target, and a small overlap indicates that a sufficiently strong visual perception of the target has not yet formed, in which case the above process is repeated until sufficient perception strength is reached; the final fixation target area is the superposition of all the fixation target areas obtained in this process.
2. The target perception method simulating human low-level vision according to claim 1, characterized in that: after a fixation target area is obtained, that area is cleared to zero in both the target image and the pixel saliency map; the salient points in the updated pixel saliency map are sorted by saliency again, and steps 3), 4) and 5) are repeated to obtain a new fixation target area; multiple target areas in the image are thereby obtained one after another.
3. The target perception method simulating human low-level vision according to claim 1 or 2, characterized in that: the frequency-domain method is a hypercomplex Fourier transform in which the red, green and blue components of a colour image take part in the Fourier transform as the three imaginary parts of a hypercomplex number, only the phase-spectrum information is retained, and the pixel saliency map is obtained by the inverse Fourier transform.
CN201510377158.5A 2015-06-25 2015-06-25 Target perception method simulating human low-level vision Active CN105005788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510377158.5A CN105005788B (en) 2015-06-25 2015-06-25 Target perception method simulating human low-level vision

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510377158.5A CN105005788B (en) 2015-06-25 2015-06-25 Target perception method simulating human low-level vision

Publications (2)

Publication Number Publication Date
CN105005788A (en) 2015-10-28
CN105005788B (en) 2018-08-28

Family

ID=54378453

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510377158.5A Active CN105005788B (en) 2015-06-25 2015-06-25 Target perception method simulating human low-level vision

Country Status (1)

Country Link
CN (1) CN105005788B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107392120B (en) * 2017-07-06 2020-04-14 电子科技大学 Attention intelligent supervision method based on sight line estimation
CN112418296B (en) * 2020-11-18 2024-04-02 中国科学院上海微系统与信息技术研究所 Bionic binocular target identification and tracking method based on human eye visual attention mechanism

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7940985B2 (en) * 2007-06-06 2011-05-10 Microsoft Corporation Salient object detection
CN104574335A (en) * 2015-01-14 2015-04-29 西安电子科技大学 Infrared and visible image fusion method based on saliency map and interest point convex hulls

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Saliency Detection: A Spectral Residual Approach;Xiaodi Hou;《Computer Vision and Pattern Recognition(CVPR),2007IEEE Conference on》;IEEE;20070716;全文 *
基于全局颜色对比的显著性目标检测;杨军 等;《计算机应用研究》;20140131;第31卷(第1期);全文 *
基于视觉注意的SVM彩色图像分割方法;郭涛 等;《计算机工程与应用》;20111221;第47卷(第36期);第175页第1栏第47行-第2栏第24行 *
模拟人类视觉的自动图像分割技术研究;侯庆岑;《中国优秀硕士学位论文全文数据库 信息科技辑》;20150215;I138-932第15-17页、28页、35-36页 *

Also Published As

Publication number Publication date
CN105005788A (en) 2015-10-28

Similar Documents

Publication Publication Date Title
CN104992183B (en) The automatic testing method of well-marked target in natural scene
Guo et al. Weighted-RXD and linear filter-based RXD: Improving background statistics estimation for anomaly detection in hyperspectral imagery
CN111898523A (en) Remote sensing image special vehicle target detection method based on transfer learning
Tong et al. A new genetic method for subpixel mapping using hyperspectral images
CN104992452B (en) Airbound target automatic tracking method based on thermal imaging video
CN106778687A (en) Method for viewing points detecting based on local evaluation and global optimization
CN106548169A (en) Fuzzy literal Enhancement Method and device based on deep neural network
CN104933691B (en) Image interfusion method based on the detection of phase spectrum vision significance
Lange et al. The influence of sampling methods on pixel-wise hyperspectral image classification with 3D convolutional neural networks
Kowkabi et al. Enhancing hyperspectral endmember extraction using clustering and oversegmentation-based preprocessing
CN110689000B (en) Vehicle license plate recognition method based on license plate sample generated in complex environment
Wilde et al. Detecting gravitational lenses using machine learning: exploring interpretability and sensitivity to rare lensing configurations
CN105005788B (en) The target apperception method of simulated human Low Level Vision
Xu et al. Using linear spectral unmixing for subpixel mapping of hyperspectral imagery: A quantitative assessment
CN104933435B (en) Machine vision construction method based on simulation human vision
CN105716609B (en) Vision positioning method in a kind of robot chamber
Müller et al. Simulating optical properties to access novel metrological parameter ranges and the impact of different model approximations
Sayama Seeking open-ended evolution in swarm chemistry II: Analyzing long-term dynamics via automated object harvesting
CN109711420A (en) The detection and recognition methods of alveolar hydalid target based on human visual attention mechanism
CN104933725B (en) Simulate the image partition method of human vision
CN105023016B (en) Target apperception method based on compressed sensing classification
CN116310568A (en) Image anomaly identification method, device, computer readable storage medium and equipment
CN106228553A (en) High-resolution remote sensing image shadow Detection apparatus and method
CN106650824B (en) Moving object classification method based on support vector machines
CN104933724A (en) Automatic image segmentation method of trypetid magnetic resonance image

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant