CN109344818A - A light field salient target detection method based on a deep convolutional network - Google Patents

A light field salient target detection method based on a deep convolutional network

Info

Publication number
CN109344818A
Authority
CN
China
Prior art keywords
light field
image
layer
target detection
salient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811141315.2A
Other languages
Chinese (zh)
Other versions
CN109344818B (en)
Inventor
张骏
刘亚美
刘紫薇
张钊
郑顺源
郑彤
王程
张旭东
高隽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology filed Critical Hefei University of Technology
Priority to CN201811141315.2A priority Critical patent/CN109344818B/en
Publication of CN109344818A publication Critical patent/CN109344818A/en
Application granted granted Critical
Publication of CN109344818B publication Critical patent/CN109344818B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/10: Image acquisition
    • G06V 10/12: Details of acquisition arrangements; Constructional details thereof
    • G06V 10/14: Optical characteristics of the device performing the acquisition or on the illumination arrangements
    • G06V 10/145: Illumination specially adapted for pattern recognition, e.g. using gratings
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462: Salient features, e.g. scale invariant feature transforms [SIFT]
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00: Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07: Target detection

Abstract

The invention discloses a light field salient target detection method based on a deep convolutional network. The steps are: 1. convert the light field data acquired with a light field acquisition device into sub-aperture images of all views; 2. reorganize the sub-aperture images of the different views into a microlens image; 3. apply data augmentation to the microlens image; 4. starting from the pre-trained weights of the Deeplab-V2 network, build a salient target detection model that operates on the microlens image, and train it on the data set; 5. use the trained salient target detection model to perform salient target detection on the light field data to be processed. The method of the invention can effectively improve the accuracy of salient target detection for complex scene images.

Description

A light field salient target detection method based on a deep convolutional network
Technical field
The invention belongs to the fields of computer vision, image processing and image analysis, and specifically relates to a light field salient target detection method based on a deep convolutional network.
Background art
Salient target detection reproduces a perceptual capability of the human visual system. When a person observes an image, the visual system quickly locates the regions and targets of interest in the image; the process of obtaining these regions and targets is salient target detection. With the development of computer technology and the internet and the spread of mobile smart devices, the number of images people acquire from the outside world has grown explosively. Salient target detection selects a very small portion of the large amount of input visual information for subsequent complex processing, such as object detection and recognition, image retrieval and image segmentation, and thereby effectively reduces the computational load of a vision system. Salient target detection has therefore become one of the research hotspots in the field of computer vision.
According to the image data that can be used, current salient target detection methods can be divided into three classes: two-dimensional salient target detection, three-dimensional salient target detection and light field salient target detection.
Two-dimensional salient target detection methods acquire a two-dimensional image with a conventional camera and, within a local-contrast or global-contrast framework, use traditional or learning-based methods to extract and fuse color, brightness, position and texture features in order to distinguish salient from non-salient content.
Three-dimensional salient target detection methods use a two-dimensional image together with the depth information of the scene to realize salient target detection. The depth of the scene is obtained with a three-dimensional sensor; depth plays an equally important role in the human visual system, since it reflects the distance between an object and the observer. Using depth information for salient target detection compensates for the deficiencies of conventional two-dimensional images: the final saliency map is obtained by exploiting the complementarity of color and depth, which improves the accuracy of salient target detection to a certain extent.
Light field salient target detection methods process the light field data acquired with a light field camera to realize salient target detection. Light field imaging is a new computational imaging technique that records, with a single exposure, both the positional and the angular information of the light radiance in a scene, and the acquired light field reflects the geometry and reflectance characteristics of the natural scene. Current traditional methods improve the performance of salient target detection in challenging scenes by fusing saliency features computed on different light field representations.
Although some salient target detection methods with excellent performance have appeared in the field of computer vision, these methods still have shortcomings:
1. In two-dimensional salient target detection, a two-dimensional image is the integral of the light arriving at the camera sensor and only records the light intensity from a specific direction. Two-dimensional salient target detection is therefore overly sensitive to high-frequency content and noise, and is easily affected by factors such as similar color and texture between foreground and background or a cluttered background.
2. In three-dimensional salient target detection, the accuracy of the scene depth depends on the depth camera. Current depth cameras suffer from low resolution, a narrow measurement range, high noise, the inability to measure transmissive materials, and susceptibility to interference from sunlight and reflections from smooth surfaces.
3. In three-dimensional salient target detection, cues such as color, depth and position are processed independently and only fused afterwards, without synthetically considering their complementarity.
4. Most salient target detection methods based on two-dimensional and three-dimensional images rest on assumptions such as an obvious difference between target and background or a simple background. As image data grow massively and image content becomes more complex, these methods show clear limitations.
5. For light field salient target detection, research on using light field data for saliency is still at an early stage; the available data sets are few and of poor image quality. Existing light field saliency methods are all based on traditional hand-crafted saliency features and model cues such as color, depth and refocusing separately, which leads to insufficient feature representation power and poor robustness of the detection results.
Summary of the invention
To overcome the above shortcomings of the prior art, the present invention proposes a light field salient target detection method based on a deep convolutional network, so that the spatial and angular information of the light field data can be fully exploited and the accuracy of salient target detection for complex scene images can be effectively improved.
The present invention adopts the following technical scheme to solve the technical problem:
A light field salient target detection method based on a deep convolutional network according to the present invention is characterized by being carried out as follows:
Step 1: obtain the microlens image I_d;
Step 1.1: acquire a light field file with a light field device and decode it to obtain a light field data set denoted L = (L_1, L_2, ..., L_d, ..., L_D), where L_d denotes the d-th light field datum and is written L_d(u, v, s, t); u and v denote any horizontal and vertical pixel of the spatial information, and s and t denote any horizontal and vertical view of the angular information; d ∈ [1, D], and D denotes the total number of light field data;
Step 1.2: fix the horizontal view s and the vertical view t, and traverse all horizontal and vertical pixels of the d-th light field datum L_d(u, v, s, t) to obtain the sub-aperture image of L_d(u, v, s, t) at the view in row t and column s; the height and width of the sub-aperture image are denoted V and U, with v ∈ [1, V] and u ∈ [1, U];
Step 1.3: traverse all horizontal and vertical views of the light field datum L_d(u, v, s, t) to obtain the set N_d of sub-aperture images of all views of the d-th light field datum, where s ∈ [1, S], t ∈ [1, T], and S and T denote the maximum horizontal and vertical view indices;
Step 1.4: let the number of selected views be m × m, and use formula (1) to select from the set N_d of sub-aperture images of all views the d-th image set M_d centered on the central view:
In formula (1), ⌊·⌋ denotes rounding down to an integer;
Step 1.5: according to x = (v-1) × m + t and y = (u-1) × m + s, obtain the pixel I_d(x, y) in row x and column y of the d-th microlens image I_d, thereby obtaining the d-th microlens image I_d whose height and width are H and W respectively, where x ∈ [1, H], y ∈ [1, W], H = V × m and W = U × m;
Step 2: from the d-th image set M_d, select the sub-aperture image of the d-th central view; label the salient region of this central-view sub-aperture image, set the pixels of the salient region to 1 and the pixels of the non-salient region to 0, thereby obtaining the d-th ground-truth saliency map G_d of the d-th microlens image I_d; the height and width of G_d are V and U respectively;
Step 3: apply data augmentation to the d-th microlens image I_d to obtain the d-th augmented microlens image set I_d′; apply the corresponding geometric transformations to the d-th ground-truth saliency map G_d to obtain the d-th transformed ground-truth saliency map set G_d′;
Step 4: repeat step 1.2 to step 3 to obtain the D augmented microlens image sets of the light field data set L, denoted I′ = (I_1′, I_2′, ..., I_d′, ..., I_D′), and the D transformed ground-truth saliency map sets, denoted G′ = (G_1′, G_2′, ..., G_d′, ..., G_D′);
Step 5: build the salient target detection model for the d-th light field datum L_d(u, v, s, t);
Step 5.1: obtain a Deeplab-V2 convolutional neural network with c layers; the Deeplab-V2 convolutional neural network includes convolutional layers, pooling layers and dropout layers;
Step 5.2: modify the c-layer Deeplab-V2 convolutional neural network to obtain the modified LFnet convolutional neural network;
Step 5.2.1: before the first layer of the Deeplab-V2 convolutional neural network, add a convolutional layer LF_conv1_1 with kernel size m × m and a ReLU activation function LF_relu1_1;
Set the moving stride of the convolution kernel of the convolutional layer LF_conv1_1 to m when the convolution operation is performed;
The mathematical expression of the ReLU activation function LF_relu1_1 is φ(a) = max(0, a), where a denotes the output of the convolutional layer LF_conv1_1 and serves as the input of LF_relu1_1, and φ(a) denotes the output of LF_relu1_1;
Step 5.2.2: except for the convolutional layer LF_conv1_1 and the convolutional layers of the Deeplab-V2 convolutional neural network that are already followed by a dropout layer, add a dropout layer after every other convolutional layer of the Deeplab-V2 convolutional neural network;
Step 5.2.3: set the number of output channels of the (c-1)-th layer of the Deeplab-V2 convolutional neural network to b, where b is the number of pixel classes;
Step 5.2.4: add an upsampling layer after the c-th layer of the Deeplab-V2 convolutional neural network, and use the upsampling layer to upsample the feature map F_d(q, r, b) output by the c-th layer of the Deeplab-V2 convolutional neural network, obtaining the upsampled feature map F_d′(q, r, b); q, r and b denote the width, height and number of channels of the feature map F_d(q, r, b);
Step 5.2.5: add a crop layer after the upsampling layer; according to the height V and width U of the d-th ground-truth saliency map G_d, use the crop layer to crop the feature map F_d′(q, r, b) and obtain the pixel-class prediction probability map F_d″(q, r, b) of the microlens image I_d;
Step 5.3: take the augmented microlens image set I′ as the input of the LFnet convolutional neural network and the transformed ground-truth saliency map set G′ as the labels, use the cross-entropy loss function and train the LFnet convolutional neural network with a gradient descent algorithm, thereby obtaining the salient target detection model for light field data; salient target detection on light field data is realized with this salient target detection model.
Compared with the prior art, the beneficial effects of the present invention are:
1. The present invention uses a second-generation light field camera to acquire light field data of complex and varied scenes. These scenes contain difficult cases such as salient targets of different sizes, various light sources, salient targets similar to the background and cluttered backgrounds, which substantially supplements current light field saliency data sets in both quantity and difficulty and improves the quality of current light field saliency data.
2. The present invention uses the powerful image processing capability of deep convolutional networks to extract image features, fuses the spatial and angular information of the light field data, and captures the contextual information of the microlens image with an atrous spatial pyramid pooling network in order to detect the salient targets in the image scene. This overcomes the defect that current two-dimensional or three-dimensional salient target detection methods cannot use angular information, and improves the precision and robustness of salient target detection in complex scene images.
3. The multi-view information in the microlens image used herein reflects the spatial geometry of the scene. The microlens image is fed directly into the convolutional neural network to realize salient target detection, which overcomes the shortcoming of current light field salient target detection methods that process depth and color information independently; depth perception and visual saliency are considered jointly, the complementarity of depth and color is effectively exploited, and the accuracy of image salient target detection is improved.
Brief description of the drawings
Fig. 1 is the workflow diagram of the salient target detection method of the present invention;
Fig. 2 shows the sub-aperture images obtained by the method of the present invention;
Fig. 3 shows the microlens image obtained by the method of the present invention;
Fig. 4 shows part of the scenes of the data set acquired by the method of the present invention and the corresponding ground-truth saliency maps;
Fig. 5 is the detailed flow diagram of feeding the microlens image into the network model in the method of the present invention;
Fig. 6 shows the structure of the Deeplab-V2 model used in the method of the present invention;
Fig. 7 compares part of the salient target detection results obtained by the method of the present invention and by other light field salient target detection methods on the data set acquired with the second-generation light field camera;
Fig. 8 is a quantitative comparison, in terms of precision-recall curves, between the method of the present invention and other current light field saliency extraction methods on the data set acquired with the second-generation light field camera.
Specific embodiment
In this embodiment, a light field salient target detection method based on a deep convolutional network, whose flowchart is shown in Fig. 1, is carried out as follows:
Step 1: obtain the microlens image I_d;
Step 1.1: acquire a light field file with a light field device and decode it to obtain a light field data set denoted L = (L_1, L_2, ..., L_d, ..., L_D), where L_d denotes the d-th light field datum and is written L_d(u, v, s, t); u and v denote any horizontal and vertical pixel of the spatial information, and s and t denote any horizontal and vertical view of the angular information; d ∈ [1, D], and D denotes the total number of light field data;
In this embodiment, the light field file is acquired with a second-generation light field camera and decoded with the Lytro Power Tool Beta to obtain the light field data L_d(u, v, s, t). The light field data L_d(u, v, s, t) are represented with the two-plane parameterization: in the four-dimensional (u, v, s, t) coordinate space, one ray corresponds to one sample of the light field, the (u, v) plane is the spatial information plane and the (s, t) plane is the angular information plane. In the experiments of the present invention, 640 light field data were acquired in total and divided evenly into 5 parts; 1 part is selected in turn as the test set and the remaining 4 parts are used as the training set. D in step 1.1 denotes the size of the training data set, D = 512;
Step 1.2: fix the horizontal view s and the vertical view t, and traverse all horizontal and vertical pixels of the d-th light field datum L_d(u, v, s, t) to obtain the sub-aperture image of L_d(u, v, s, t) at the view in row t and column s; the height and width of the sub-aperture image are denoted V and U, with v ∈ [1, V] and u ∈ [1, U]; in this experiment, V = 375 and U = 540;
Step 1.3: traverse all horizontal and vertical views of the light field datum L_d(u, v, s, t) to obtain the set N_d of sub-aperture images of all views of the d-th light field datum, where s ∈ [1, S], t ∈ [1, T], and S and T denote the maximum horizontal and vertical view indices; in the specific implementation, S = 14 and T = 14. As shown in Fig. 2, the left image of Fig. 2 is the set of sub-aperture images of all views, and the right image of Fig. 2 is the sub-aperture image at the view in row 6, column 11;
Step 1.4: let the number of selected views be m × m, and use formula (1) to select from the set N_d of sub-aperture images of all views the d-th image set M_d centered on the central view. In the specific implementation, m = 9, so 81 view images are selected in total. Experiments show that more views provide more information and can further improve the performance of the salient target detection model, but more views also consume a large amount of storage and computation time and increase the difficulty of the experiments;
In formula (1), ⌊·⌋ denotes rounding down to an integer;
Step 1.5: according to x = (v-1) × m + t and y = (u-1) × m + s, obtain the pixel I_d(x, y) in row x and column y of the d-th microlens image I_d, thereby obtaining the d-th microlens image I_d whose height and width are H and W respectively, as shown in Fig. 3, where x ∈ [1, H], y ∈ [1, W], H = V × m and W = U × m; in this embodiment, H = 3375 and W = 4860. The left image of Fig. 3 is the microlens image I_d, and the right image of Fig. 3 is a partial enlargement of I_d; all the pixels inside one square of the partial enlargement share the same spatial information but carry different angular information. A code sketch of this reorganization follows below.
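As an illustration of steps 1.2 to 1.5, the following minimal Python/NumPy sketch (not part of the original disclosure; the array layout of the decoded light field is an assumption) shows how the m × m central views could be selected from a 4-D light field and interleaved into a microlens image according to x = (v-1) × m + t and y = (u-1) × m + s:

    import numpy as np

    def microlens_image(L, m):
        """L: decoded light field with assumed shape (V, U, T, S, 3); m: views per axis."""
        V, U, T, S, _ = L.shape
        # pick the m x m views centred on the central view (the selection of formula (1), 0-based)
        t0 = (T - m) // 2
        s0 = (S - m) // 2
        M = L[:, :, t0:t0 + m, s0:s0 + m, :]               # shape (V, U, m, m, 3)
        # interleave the views inside each spatial cell: x = v*m + t, y = u*m + s (0-based)
        I = M.transpose(0, 2, 1, 3, 4).reshape(V * m, U * m, 3)
        return I

    # with m = 9, V = 375 and U = 540 this yields a 3375 x 4860 microlens image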
Step 2: from the d-th image set M_d, select the sub-aperture image of the d-th central view; label the salient region of this central-view sub-aperture image, set the pixels of the salient region to 1 and the pixels of the non-salient region to 0, thereby obtaining the d-th ground-truth saliency map G_d of the d-th microlens image I_d; the height and width of G_d are V and U respectively; in the specific implementation, V = 375 and U = 540. As shown in Fig. 4, the first and third rows of Fig. 4 are microlens images, and the second and fourth rows are the corresponding ground-truth saliency maps.
Step 3: apply data augmentation to the d-th microlens image I_d to obtain the d-th augmented microlens image set I_d′; apply the corresponding geometric transformations to the d-th ground-truth saliency map G_d to obtain the d-th transformed ground-truth saliency map set G_d′. In this embodiment, the d-th microlens image I_d is rotated, flipped, and processed by increasing the chroma, increasing the contrast, increasing the brightness, decreasing the brightness and adding Gaussian noise, which realizes the data augmentation; data augmentation improves the generalization ability of the salient target detection model. A sketch of these operations is given below.
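A minimal sketch of the augmentation operations named above, assuming the Pillow library and a single microlens image / ground-truth pair; the rotation angle, enhancement factors and noise level are illustrative and are not specified in the original text:

    import numpy as np
    from PIL import Image, ImageEnhance

    def augment(img, gt):
        """img, gt: PIL images (microlens image and its ground-truth saliency map)."""
        pairs = []
        pairs.append((img.rotate(90, expand=True), gt.rotate(90, expand=True)))   # rotation
        pairs.append((img.transpose(Image.FLIP_LEFT_RIGHT),
                      gt.transpose(Image.FLIP_LEFT_RIGHT)))                       # flip
        # photometric changes leave the ground truth unchanged
        pairs.append((ImageEnhance.Color(img).enhance(1.5), gt))                  # increase chroma
        pairs.append((ImageEnhance.Contrast(img).enhance(1.5), gt))               # increase contrast
        pairs.append((ImageEnhance.Brightness(img).enhance(1.3), gt))             # increase brightness
        pairs.append((ImageEnhance.Brightness(img).enhance(0.7), gt))             # decrease brightness
        noisy = np.asarray(img, np.float32) + np.random.normal(0, 10, np.asarray(img).shape)
        pairs.append((Image.fromarray(np.clip(noisy, 0, 255).astype(np.uint8)), gt))  # Gaussian noise
        return pairs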
Step 4: repeat step 1.2 to step 3 to obtain the D augmented microlens image sets of the light field data set L, denoted I′ = (I_1′, I_2′, ..., I_d′, ..., I_D′), and the D transformed ground-truth saliency map sets, denoted G′ = (G_1′, G_2′, ..., G_d′, ..., G_D′);
Step 5: build the salient target detection model for the d-th light field datum L_d(u, v, s, t);
Step 5.1: obtain a Deeplab-V2 convolutional neural network with c layers; the Deeplab-V2 convolutional neural network includes convolutional layers, pooling layers, dropout layers and a fusion layer. In the specific implementation, c = 24; Deeplab-V2 is a deep convolutional neural network composed of 16 convolutional layers, 5 pooling layers, 2 dropout layers and 1 fusion layer and is used for semantic segmentation; its detailed structure is shown in Fig. 6. Deeplab-V2 contains an atrous spatial pyramid pooling structure that captures the image context at multiple ratios and realizes salient target detection at multiple scales, as sketched below.
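For reference, the atrous spatial pyramid pooling idea used by Deeplab-V2 can be sketched in PyTorch-style code as follows; this is an illustrative reconstruction rather than the exact layer configuration of Fig. 6, and the dilation rates are the commonly used Deeplab-V2 values, assumed here:

    import torch
    import torch.nn as nn

    class ASPP(nn.Module):
        """Parallel dilated (atrous) convolutions capture context at several ratios."""
        def __init__(self, in_ch, out_ch, rates=(6, 12, 18, 24)):
            super().__init__()
            self.branches = nn.ModuleList([
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=r, dilation=r)
                for r in rates
            ])

        def forward(self, x):
            # the branch outputs are summed (fused) into a single prediction map
            return sum(branch(x) for branch in self.branches)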
Step 5.2: modify the c-layer Deeplab-V2 convolutional neural network to obtain the modified LFnet convolutional neural network; the detailed structure of the LFnet convolutional neural network is shown in Fig. 5;
Step 5.2.1: before the first layer of the Deeplab-V2 convolutional neural network, add a convolutional layer LF_conv1_1 with kernel size m × m and a ReLU activation function LF_relu1_1;
Set the moving stride of the convolution kernel of LF_conv1_1 to m when the convolution operation is performed; in the specific implementation, m = 9. When the microlens image I_d is constructed in steps 1.4 and 1.5, the number of selected views is 9 × 9, so the kernel size of LF_conv1_1 is set to 9 × 9 with stride 9 so that the network can better extract and fuse the multi-view information;
The mathematical expression of the ReLU activation function LF_relu1_1 is φ(a) = max(0, a), where a denotes the output of the convolutional layer LF_conv1_1 and serves as the input of LF_relu1_1, and φ(a) denotes the output of LF_relu1_1;
Step 5.2.2: except for the convolutional layer LF_conv1_1 and the convolutional layers of the Deeplab-V2 convolutional neural network that are already followed by a dropout layer, add a dropout layer after every other convolutional layer of the Deeplab-V2 convolutional neural network; in this embodiment, adding dropout layers effectively prevents over-fitting and improves the generalization ability of the salient target detection model;
Step 5.2.3: set the number of output channels of the (c-1)-th layer of the Deeplab-V2 convolutional neural network to b, where b is the number of pixel classes; in the specific implementation, c-1 = 23 and b = 2; the salient target detection model classifies each pixel into two classes, salient and non-salient;
Step 5.2.4: add an upsampling layer after the c-th layer of the Deeplab-V2 convolutional neural network, and use the upsampling layer to upsample the feature map F_d(q, r, b) output by the c-th layer, obtaining the upsampled feature map F_d′(q, r, b); q, r and b denote the width, height and number of channels of the feature map F_d(q, r, b);
Step 5.2.5: add a crop layer after the upsampling layer; according to the height V and width U of the d-th ground-truth saliency map G_d, use the crop layer to crop the feature map F_d′(q, r, b) and obtain the pixel-class prediction probability map F_d″(q, r, b) of the microlens image I_d. A sketch of the modifications of steps 5.2.1 to 5.2.5 is given below.
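A minimal PyTorch-style sketch of the modifications of steps 5.2.1 to 5.2.5; this is an assumption-laden illustration: deeplab_v2_backbone stands for the pre-trained c-layer network with the dropout layers of step 5.2.2 and the b = 2 output channels of step 5.2.3 already applied, and the 64 output channels of LF_conv1_1 and the upsampling factor are not specified by the original text and are assumptions:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LFnet(nn.Module):
        def __init__(self, deeplab_v2_backbone, m=9, out_size=(375, 540)):
            super().__init__()
            # step 5.2.1: view-fusion layer, kernel m x m, stride m, followed by ReLU
            # (the 64 output channels are an assumption)
            self.lf_conv1_1 = nn.Conv2d(3, 64, kernel_size=m, stride=m)
            self.lf_relu1_1 = nn.ReLU(inplace=True)
            self.backbone = deeplab_v2_backbone   # steps 5.2.2 and 5.2.3 assumed inside
            self.out_size = out_size              # (V, U), used by the crop of step 5.2.5

        def forward(self, x):
            x = self.lf_relu1_1(self.lf_conv1_1(x))       # fuse the m x m views per lenslet
            f = self.backbone(x)                          # feature map F_d(q, r, b)
            f = F.interpolate(f, scale_factor=8, mode='bilinear',
                              align_corners=False)        # step 5.2.4: upsampling (factor assumed)
            V, U = self.out_size
            return f[:, :, :V, :U]                        # step 5.2.5: crop to V x U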
Step 5.3: take the augmented microlens image set I′ as the input of the LFnet convolutional neural network and the transformed ground-truth saliency map set G′ as the labels, use the cross-entropy loss function and train the LFnet convolutional neural network with a gradient descent algorithm, thereby obtaining the salient target detection model for light field data; salient target detection on light field data is realized with this salient target detection model.
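Step 5.3 can be summarized by the following training sketch; the deeplab_v2_backbone and train_loader objects are assumed to be defined elsewhere, the learning rate, momentum and number of epochs are illustrative, and stochastic gradient descent stands in for the gradient descent algorithm of the text:

    import torch
    import torch.nn as nn

    model = LFnet(deeplab_v2_backbone, m=9)            # from the sketch above
    criterion = nn.CrossEntropyLoss()                   # cross-entropy loss of step 5.3
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

    for epoch in range(50):                             # number of epochs is illustrative
        for images, labels in train_loader:             # microlens images and ground-truth maps in {0, 1}
            optimizer.zero_grad()
            logits = model(images)                      # shape (batch, b = 2, V, U)
            loss = criterion(logits, labels.long())     # per-pixel two-class cross entropy
            loss.backward()
            optimizer.step()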
The test set is processed according to step 1.1 to step 2 to obtain the microlens images of the test set. The microlens images of the test set are fed into the salient target detection model to obtain the pixel-class prediction probability map F_test″(q, r, b) of the test set. The saliency map F_s″ is extracted with formula (2), where F_test″(q, r, 2) denotes the values of the second channel of the probability map F_test″(q, r, b); the saliency map F_s″ is then normalized to obtain the final saliency map F_s.
F_s″ = F_test″(q, r, 2)   (2)
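A sketch of formula (2) and the normalization that follows it, assuming the trained model outputs a two-channel map whose second channel corresponds to the salient class and that a softmax turns the network output into probabilities:

    import torch
    import torch.nn.functional as F

    def predict_saliency(model, image):
        """image: microlens image tensor of shape (1, 3, H, W)."""
        with torch.no_grad():
            prob = F.softmax(model(image), dim=1)       # F_test''(q, r, b) with b = 2
        s = prob[0, 1]                                  # formula (2): take the second channel
        s = (s - s.min()) / (s.max() - s.min() + 1e-8)  # normalize to [0, 1] -> final map F_s
        return s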
To evaluate the performance of the salient target detection model obtained by the method of the present invention more fairly, the training set and the test set are selected in rotation, and the average of the 5 test results is taken as the final index for evaluating the performance of the salient target detection model.
Fig. 7 qualitatively compares the salient target detection method of the present invention based on a deep convolutional network with other current light field salient target detection methods, where Ours denotes the salient target detection method of the present invention based on a deep convolutional network; Multi-cue denotes the light field salient target detection method based on focusness, view, depth and color cues; DILF denotes the light field salient target detection method based on color, depth and background priors; WSC denotes the light field salient target detection method based on sparse coding theory; and LFS denotes the salient target detection method based on modeling the target and the background. The four compared methods are tested on the real-scene data set acquired with the second-generation light field camera used in the present invention.
Table 1 quantitatively compares the salient target detection method of the present invention based on a deep convolutional network with other current light field salient target detection methods on the data set acquired with the second-generation light field camera, using "F-measure", "WF-measure", "average precision AP" and "mean absolute error MAE" as the evaluation indices. "F-measure" is a statistical index derived from the precision-recall curve, and a value closer to 1 indicates a better salient target detection result; "WF-measure" is a statistical index derived from the weighted precision-recall curve, and a value closer to 1 indicates a better result; "AP" measures the average precision of the salient target detection result, and a value closer to 1 indicates a better result; "MAE" measures the mean absolute difference between the salient target detection result and the ground truth, and a value closer to 0 indicates a better result. A sketch of how such indices can be computed is given below.
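The MAE and F-measure indices of Table 1 could be computed as in the following sketch; the adaptive threshold and the β² = 0.3 weight follow common practice in the saliency literature and are assumptions, since the original text does not give the formulas:

    import numpy as np

    def mae(sal, gt):
        """Mean absolute error between a saliency map and the ground truth, both in [0, 1]."""
        return np.abs(sal - gt).mean()

    def f_measure(sal, gt, beta2=0.3):
        """F-measure with an adaptive threshold (twice the mean saliency, a common choice)."""
        thr = min(2 * sal.mean(), 1.0)
        binary = sal >= thr
        tp = np.logical_and(binary, gt > 0.5).sum()
        precision = tp / (binary.sum() + 1e-8)
        recall = tp / ((gt > 0.5).sum() + 1e-8)
        return (1 + beta2) * precision * recall / (beta2 * precision + recall + 1e-8)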
Fig. 8 quantitatively compares the salient target detection method of the present invention based on a deep convolutional network with other current light field salient target detection methods in terms of the precision-recall (PR) curve; if one PR curve is completely "enclosed" by another PR curve, the performance of the latter is better than that of the former.
Table 1
Salient target detection method    Ours      Multi-cue   DILF      WSC       LFS
F-measure                          0.8118    0.6649      0.6395    0.6452    0.6108
WF-measure                         0.7541    0.5420      0.4844    0.5946    0.3597
AP                                 0.9124    0.6593      0.6922    0.5960    0.6193
MAE                                0.0551    0.1198      0.1390    0.1093    0.1698
From the quantitative analysis in Table 1 it can be seen that the "F-measure", "WF-measure", "AP" and "MAE" obtained by the method of the present invention are all better than those of the other light field salient target detection methods. From the PR curves in Fig. 8 it can be seen that the precision-recall curve of the method of the present invention is close to the upper-right corner and encloses the PR curves of the other methods, and its miss probability is lower at the same recall.

Claims (1)

1. A light field salient target detection method based on a deep convolutional network, characterized by being carried out as follows:
Step 1: obtain the microlens image I_d;
Step 1.1: acquire a light field file with a light field device and decode it to obtain a light field data set denoted L = (L_1, L_2, ..., L_d, ..., L_D), where L_d denotes the d-th light field datum and is written L_d(u, v, s, t); u and v denote any horizontal and vertical pixel of the spatial information, and s and t denote any horizontal and vertical view of the angular information; d ∈ [1, D], and D denotes the total number of light field data;
Step 1.2: fix the horizontal view s and the vertical view t, and traverse all horizontal and vertical pixels of the d-th light field datum L_d(u, v, s, t) to obtain the sub-aperture image of L_d(u, v, s, t) at the view in row t and column s; the height and width of the sub-aperture image are denoted V and U, with v ∈ [1, V] and u ∈ [1, U];
Step 1.3: traverse all horizontal and vertical views of the light field datum L_d(u, v, s, t) to obtain the set N_d of sub-aperture images of all views of the d-th light field datum, where s ∈ [1, S], t ∈ [1, T], and S and T denote the maximum horizontal and vertical view indices;
Step 1.4: let the number of selected views be m × m, and use formula (1) to select from the set N_d of sub-aperture images of all views the d-th image set M_d centered on the central view:
In formula (1), ⌊·⌋ denotes rounding down to an integer;
Step 1.5: according to x = (v-1) × m + t and y = (u-1) × m + s, obtain the pixel I_d(x, y) in row x and column y of the d-th microlens image I_d, thereby obtaining the d-th microlens image I_d whose height and width are H and W respectively, where x ∈ [1, H], y ∈ [1, W], H = V × m and W = U × m;
Step 2: from the d-th image set M_d, select the sub-aperture image of the d-th central view; label the salient region of this central-view sub-aperture image, set the pixels of the salient region to 1 and the pixels of the non-salient region to 0, thereby obtaining the d-th ground-truth saliency map G_d of the d-th microlens image I_d; the height and width of G_d are V and U respectively;
Step 3: apply data augmentation to the d-th microlens image I_d to obtain the d-th augmented microlens image set I_d′; apply the corresponding geometric transformations to the d-th ground-truth saliency map G_d to obtain the d-th transformed ground-truth saliency map set G_d′;
Step 4: repeat step 1.2 to step 3 to obtain the D augmented microlens image sets of the light field data set L, denoted I′ = (I_1′, I_2′, ..., I_d′, ..., I_D′), and the D transformed ground-truth saliency map sets, denoted G′ = (G_1′, G_2′, ..., G_d′, ..., G_D′);
Step 5: build the salient target detection model for the d-th light field datum L_d(u, v, s, t);
Step 5.1: obtain a Deeplab-V2 convolutional neural network with c layers; the Deeplab-V2 convolutional neural network includes convolutional layers, pooling layers and dropout layers;
Step 5.2: modify the c-layer Deeplab-V2 convolutional neural network to obtain the modified LFnet convolutional neural network;
Step 5.2.1: before the first layer of the Deeplab-V2 convolutional neural network, add a convolutional layer LF_conv1_1 with kernel size m × m and a ReLU activation function LF_relu1_1;
Set the moving stride of the convolution kernel of the convolutional layer LF_conv1_1 to m when the convolution operation is performed;
The mathematical expression of the ReLU activation function LF_relu1_1 is φ(a) = max(0, a), where a denotes the output of the convolutional layer LF_conv1_1 and serves as the input of LF_relu1_1, and φ(a) denotes the output of LF_relu1_1;
Step 5.2.2: except for the convolutional layer LF_conv1_1 and the convolutional layers of the Deeplab-V2 convolutional neural network that are already followed by a dropout layer, add a dropout layer after every other convolutional layer of the Deeplab-V2 convolutional neural network;
Step 5.2.3: set the number of output channels of the (c-1)-th layer of the Deeplab-V2 convolutional neural network to b, where b is the number of pixel classes;
Step 5.2.4: add an upsampling layer after the c-th layer of the Deeplab-V2 convolutional neural network, and use the upsampling layer to upsample the feature map F_d(q, r, b) output by the c-th layer of the Deeplab-V2 convolutional neural network, obtaining the upsampled feature map F_d′(q, r, b); q, r and b denote the width, height and number of channels of the feature map F_d(q, r, b);
Step 5.2.5: add a crop layer after the upsampling layer; according to the height V and width U of the d-th ground-truth saliency map G_d, use the crop layer to crop the feature map F_d′(q, r, b) and obtain the pixel-class prediction probability map F_d″(q, r, b) of the microlens image I_d;
Step 5.3: take the augmented microlens image set I′ as the input of the LFnet convolutional neural network and the transformed ground-truth saliency map set G′ as the labels, use the cross-entropy loss function and train the LFnet convolutional neural network with a gradient descent algorithm, thereby obtaining the salient target detection model for light field data; salient target detection on light field data is realized with this salient target detection model.
CN201811141315.2A 2018-09-28 2018-09-28 Light field significant target detection method based on deep convolutional network Active CN109344818B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811141315.2A CN109344818B (en) 2018-09-28 2018-09-28 Light field significant target detection method based on deep convolutional network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811141315.2A CN109344818B (en) 2018-09-28 2018-09-28 Light field significant target detection method based on deep convolutional network

Publications (2)

Publication Number Publication Date
CN109344818A true CN109344818A (en) 2019-02-15
CN109344818B CN109344818B (en) 2020-04-14

Family

ID=65307539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811141315.2A Active CN109344818B (en) 2018-09-28 2018-09-28 Light field significant target detection method based on deep convolutional network

Country Status (1)

Country Link
CN (1) CN109344818B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110441271A (en) * 2019-07-15 2019-11-12 清华大学 Light field high-resolution deconvolution method and system based on convolutional neural networks
CN111369522A (en) * 2020-03-02 2020-07-03 合肥工业大学 Light field significance target detection method based on generation of deconvolution neural network
CN111445465A (en) * 2020-03-31 2020-07-24 江南大学 Light field image snowflake or rain strip detection and removal method and device based on deep learning
CN111931793A (en) * 2020-08-17 2020-11-13 湖南城市学院 Saliency target extraction method and system
CN113343822A (en) * 2021-05-31 2021-09-03 合肥工业大学 Light field saliency target detection method based on 3D convolution

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105701813A (en) * 2016-01-11 2016-06-22 深圳市未来媒体技术研究院 Significance detection method of light field image
US20160203689A1 (en) * 2015-01-08 2016-07-14 Kenneth J. Hintz Object Displacement Detector
CN105913070A (en) * 2016-04-29 2016-08-31 合肥工业大学 Multi-thread significance method based on light field camera
CN106981080A (en) * 2017-02-24 2017-07-25 东华大学 Night unmanned vehicle scene depth method of estimation based on infrared image and radar data
WO2018072858A1 (en) * 2016-10-18 2018-04-26 Photonic Sensors & Algorithms, S.L. Device and method for obtaining distance information from views
CN107993260A (en) * 2017-12-14 2018-05-04 浙江工商大学 A kind of light field image depth estimation method based on mixed type convolutional neural networks

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160203689A1 (en) * 2015-01-08 2016-07-14 Kenneth J. Hintz Object Displacement Detector
CN105701813A (en) * 2016-01-11 2016-06-22 深圳市未来媒体技术研究院 Significance detection method of light field image
CN105913070A (en) * 2016-04-29 2016-08-31 合肥工业大学 Multi-thread significance method based on light field camera
WO2018072858A1 (en) * 2016-10-18 2018-04-26 Photonic Sensors & Algorithms, S.L. Device and method for obtaining distance information from views
CN106981080A (en) * 2017-02-24 2017-07-25 东华大学 Night unmanned vehicle scene depth method of estimation based on infrared image and radar data
CN107993260A (en) * 2017-12-14 2018-05-04 浙江工商大学 A kind of light field image depth estimation method based on mixed type convolutional neural networks

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HAO SHENG et al.: "Occlusion-aware depth estimation for light field using multi-orientation EPIs", Pattern Recognition *
JUN ZHANG et al.: "Saliency Detection on Light Field: A Multi-Cue Approach", ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) *
王丽娟: "Research on calibration methods and depth estimation for light field cameras" (光场相机的标定方法及深度估计研究), Wanfang Data Knowledge Service Platform *
罗姚翔: "Research on depth estimation techniques for light field images based on convolutional neural networks" (基于卷积神经网络的光场图像深度估计技术研究), Wanfang Data Knowledge Service Platform *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110441271A (en) * 2019-07-15 2019-11-12 清华大学 Light field high-resolution deconvolution method and system based on convolutional neural networks
CN111369522A (en) * 2020-03-02 2020-07-03 合肥工业大学 Light field significance target detection method based on generation of deconvolution neural network
CN111445465A (en) * 2020-03-31 2020-07-24 江南大学 Light field image snowflake or rain strip detection and removal method and device based on deep learning
CN111931793A (en) * 2020-08-17 2020-11-13 湖南城市学院 Saliency target extraction method and system
CN111931793B (en) * 2020-08-17 2024-04-12 湖南城市学院 Method and system for extracting saliency target
CN113343822A (en) * 2021-05-31 2021-09-03 合肥工业大学 Light field saliency target detection method based on 3D convolution

Also Published As

Publication number Publication date
CN109344818B (en) 2020-04-14

Similar Documents

Publication Publication Date Title
CN109344818A A light field salient target detection method based on a deep convolutional network
Chen et al. Learned feature embeddings for non-line-of-sight imaging and recognition
CN108549891B Multi-scale diffusion salient target detection method based on background and target priors
Lin et al. Line segment extraction for large scale unorganized point clouds
CN108596108B (en) Aerial remote sensing image change detection method based on triple semantic relation learning
Romaszko et al. Vision-as-inverse-graphics: Obtaining a rich 3d explanation of a scene from a single image
CN111612807A (en) Small target image segmentation method based on scale and edge information
CN110297232A (en) Monocular distance measuring method, device and electronic equipment based on computer vision
Li et al. Neulf: Efficient novel view synthesis with neural 4d light field
CN112784782B (en) Three-dimensional object identification method based on multi-view double-attention network
Li et al. Target detection based on dual-domain sparse reconstruction saliency in SAR images
CN113159232A (en) Three-dimensional target classification and segmentation method
CN112990010A (en) Point cloud data processing method and device, computer equipment and storage medium
CN114998566A (en) Interpretable multi-scale infrared small and weak target detection network design method
Dey et al. Mip-NeRF RGB-d: Depth assisted fast neural radiance fields
CN114299405A (en) Unmanned aerial vehicle image real-time target detection method
CN114463736A (en) Multi-target detection method and device based on multi-mode information fusion
Agresti et al. Stereo and ToF data fusion by learning from synthetic data
CN106886754B Object identification method and system in a three-dimensional scene based on triangular patches
Xu et al. Light field distortion feature for transparent object classification
Chen et al. Scene segmentation of remotely sensed images with data augmentation using U-net++
Wang et al. Buried target detection method for ground penetrating radar based on deep learning
CN104217430A (en) Image significance detection method based on L1 regularization
CN113011359A (en) Method for simultaneously detecting plane structure and generating plane description based on image and application
Balado et al. Multi feature-rich synthetic colour to improve human visual perception of point clouds

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant