CN111461125A - Continuous segmentation method of panoramic image - Google Patents
- Publication number
- CN111461125A (application CN202010198068.0A; granted publication CN111461125B)
- Authority
- CN
- China
- Prior art keywords
- image
- panoramic
- segmentation
- encoder
- characteristic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a method for continuously segmenting a panoramic image. The method is trained on a segmentation data set of conventional pinhole images; when the segmentation model generated by training is deployed by the disclosed method, the output segmentation image is continuous, seamless, smooth and reliable over 360 degrees.
Description
Technical Field
The invention belongs to the technical fields of image segmentation, scene perception, pattern recognition, image processing and computer vision, and relates to a method for continuously segmenting a panoramic image.
Background
Panoramic vision refers to acquiring, in a single capture, all visual information of a three-dimensional space larger than a hemispherical field of view (360 degrees × 180 degrees). Because of this wide field of view, panoramic vision is of great significance to the many industries in the civil, military and aerospace fields that rely on visual information for decision-making.
Image segmentation provides pixel-level classification of a scene and can simultaneously accomplish the detection of various scene elements. Image segmentation methods, including semantic segmentation, have been widely applied in fields such as intelligent vehicles, robots, visual aids, and augmented reality systems.
However, current segmentation techniques are typically designed for conventional pinhole cameras and can therefore only acquire information within a limited angle of view. Segmentation based on convolutional neural networks also requires large amounts of labeled data for training, and most large-scale data sets in the industry consist of images acquired by ordinary pinhole cameras. Semantic segmentation network models trained on such image data sets are not suitable for panoramic images and cannot be directly applied to panoramic cameras to realize 360-degree segmentation.
Disclosure of Invention
The invention aims to provide a panoramic image continuity segmentation method, which adopts a segmentation model F comprising N encoders and decoders and performs continuity processing at the feature image boundaries. Specifically, instead of the default zero padding, the boundaries are padded with element values taken from the feature images produced by the convolutional layers of the encoders Fi+1 and Fi-1 that process the adjacent panorama segment images Pi+1 and Pi-1. By modifying the padding mode of the convolutional layers, the prediction for each panorama segment image takes the information of the adjacent images into account, so that continuous and seamless semantic prediction is achieved, gaps caused by segmenting are avoided, and blind areas are eliminated.
Another object of the present invention is to provide a panoramic image continuity segmentation method that segments images using a segmentation model F, comprising N encoders and decoders, trained on existing segmentation data sets, thereby eliminating the need to label a panoramic image data set and reducing the time and cost of data preparation.
In order to achieve the above object, the present invention comprises the steps of:
(1) unfolding the panoramic image to obtain an image Pu; the unfolding may employ the existing OCamCalib tool (Scaramuzza, D., Martinelli, A. and Siegwart, R., 2006. A toolbox for easily calibrating omnidirectional cameras. In 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 5695-5701). IEEE).
(2) Averagely dividing the image Pu into N segments along the unfolding direction to obtain the panorama segment images P1, P2, …, Pi, …, PN, i = 1, 2, …, N;
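As a non-limiting illustration, the equal-width split of step (2) can be sketched in plain Python as follows; the function name and the list-of-rows image representation are illustrative assumptions, not part of the specification:

```python
def split_panorama(pu, n):
    """Split the unfolded image Pu (a list of pixel rows) into n
    equal-width segments along the unfolding (horizontal) direction.

    Assumes the image width is divisible by n, as in the averaged
    division described in step (2).
    """
    w = len(pu[0]) // n
    return [[row[i * w:(i + 1) * w] for row in pu] for i in range(n)]
```

For example, a 2×4 image split into N = 2 segments yields two 2×2 segments.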
(3) Using an image segmentation data set, an encoder-decoder type segmentation network is trained to obtain the segmentation model F. The N panorama segment images are respectively input into the encoders of the segmentation model F to obtain the feature images T1, T2, …, Ti, …, TN corresponding to the panorama segment images P1, P2, …, Pi, …, PN, i = 1, 2, …, N;
Wherein, for a panorama segment image Pi, the encoder Fi pads the boundary of the feature image output by the k-th convolutional layer with adjacent element values: the left boundary of the feature image is padded with the right-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FL, and the right boundary is padded with the left-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FR, with k ≥ 1; the feature image output by the 0-th convolutional layer is the original image, i.e., the panorama segment image Pi.
That is, the boundary of the feature image output by the 1st convolutional layer of encoder Fi is padded with the right-boundary element values of the original image on the left and the left-boundary element values of the original image on the right; and for every convolutional layer k ≥ 2 of encoder Fi, the left boundary of its output feature image is padded with the right-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FL, and the right boundary with the left-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FR.
The encoders Fi, FL and FR denote the encoders that process the panorama segment images Pi, PL and PR, respectively.
Subscripts L, R satisfy: L = i − 1 (taking L = N when i = 1) and R = i + 1 (taking R = 1 when i = N), i.e., the panorama segments are treated as circularly adjacent.
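As a non-limiting illustration, the neighbor-value padding described above can be sketched in plain Python; the function name, the list-of-rows feature representation, and the circular modulo indexing of the neighbors are illustrative assumptions:

```python
def pad_with_neighbors(feats, i, pad=1):
    """Pad segment i's feature map horizontally with its circular
    neighbors' boundary columns instead of zeros.

    feats: list of N feature maps, each a list of rows (H x W).
    Returns an H x (W + 2*pad) map whose leftmost columns come from
    the right edge of segment i-1 and whose rightmost columns come
    from the left edge of segment i+1; indices wrap around, giving
    360-degree continuity across all segment boundaries.
    """
    n = len(feats)
    left, right = feats[(i - 1) % n], feats[(i + 1) % n]
    padded = []
    for r, row in enumerate(feats[i]):
        padded.append(left[r][-pad:] + row + right[r][:pad])
    return padded
```

For example, with three 1×2 segments, padding segment 0 draws its left column from segment 2 (the circular left neighbor) and its right column from segment 1.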
(4) The feature images T1 to TN are stitched along the unfolding direction of the panoramic image to obtain the stitched feature image T.
(5) The stitched feature image T is pooled along the unfolding direction of the panoramic image with pooling ratio N to obtain the pooled feature image Tp.
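As a non-limiting illustration, steps (4) and (5) can be sketched together in plain Python; the function name, the list-of-rows representation, and the use of average pooling are illustrative assumptions:

```python
def stitch_and_pool(feats, n):
    """Stitch the N segment feature maps along the width (unfolding)
    direction, then average-pool along that direction with ratio n.

    feats: list of N feature maps, each a list of rows (H x W).
    Returns an H x W map (stitched width N*W divided by ratio n == N
    when W columns remain after pooling each group of n columns).
    """
    # stitch: concatenate corresponding rows of every segment
    t = [sum((f[r] for f in feats), []) for r in range(len(feats[0]))]
    # pool: collapse every n consecutive columns into their mean
    return [[sum(row[j:j + n]) / n for j in range(0, len(row), n)]
            for row in t]
```

With two 1×2 segments and n = 2, stitching gives one 1×4 row and pooling averages each pair of columns.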
(6) The feature image Tp is input into the decoder and up-sampled to the resolution of the unfolded panoramic image Pu to obtain the panoramic segmentation image Ps. The up-sampling may use bilinear interpolation.
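As a non-limiting illustration, the bilinear up-sampling mentioned in step (6) can be sketched in plain Python; the function name, the list-of-rows representation, and the align-corners sampling convention are illustrative assumptions:

```python
def bilinear_upsample(img, out_h, out_w):
    """Up-sample a 2D grid (list of lists) to (out_h, out_w) using
    bilinear interpolation, mapping grid corners to corners."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y); y1 = min(y0 + 1, in_h - 1); fy = y - y0
        row = []
        for j in range(out_w):
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x); x1 = min(x0 + 1, in_w - 1); fx = x - x0
            # interpolate along x on the two bracketing rows, then along y
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bot = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bot * fy)
        out.append(row)
    return out
```

Up-sampling a 2×2 grid to 3×3 fills the new center cell with the mean of its four neighbors.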
Further, the segmentation is semantic. Correspondingly, the segmentation model F is trained on a semantic segmentation data set of pinhole camera images. The training set may be Cityscapes or Mapillary Vistas; encoder-decoder type semantic segmentation networks such as ERFNet, SegNet or ERF-PSPNet can be used. Cityscapes, Mapillary Vistas and the semantic segmentation networks ERFNet, SegNet and ERF-PSPNet are common knowledge in the field; specifically:
Data set Cityscapes: Cordts, M., Omran, M., Ramos, S., Rehfeld, T., Enzweiler, M., Benenson, R., Franke, U., Roth, S. and Schiele, B., 2016. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3213-3223).
Data set Mapillary Vistas: Neuhold, G., Ollmann, T., Rota Bulò, S. and Kontschieder, P., 2017. The Mapillary Vistas dataset for semantic understanding of street scenes. In Proceedings of the IEEE International Conference on Computer Vision (pp. 4990-4999).
Semantic segmentation network ERFNet: Romera, E., Alvarez, J.M., Bergasa, L.M. and Arroyo, R., 2017. ERFNet: Efficient residual factorized ConvNet for real-time semantic segmentation. IEEE Transactions on Intelligent Transportation Systems, 19(1), pp. 263-272.
Semantic segmentation network SegNet: Badrinarayanan, V., Kendall, A. and Cipolla, R., 2017. SegNet: A deep convolutional encoder-decoder architecture for image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(12), pp. 2481-2495.
Semantic segmentation network ERF-PSPNet: Yang, K., Wang, K., Bergasa, L.M., Romera, E., Hu, W., Sun, D., Sun, J., Cheng, R., Chen, T. and López, E., 2018. Unifying terrain awareness for the visually impaired through real-time semantic segmentation. Sensors, 18(5), p. 1506.
Compared with conventional panoramic image semantic segmentation, the invention has the following advantages:
1. Low cost, compact system and low delay. The invention completes 360-degree semantic segmentation with only one panoramic camera and one small processor. Existing approaches to acquiring 360-degree semantic information require multiple pinhole cameras or multiple fisheye cameras; the invention therefore saves equipment and cost, keeps the system compact, and suits systems such as intelligent vehicles, robots and vision assistance. Moreover, multiple cameras must be synchronized, and the images they acquire and their segmentation results must be fused, which increases delay; the invention performs 360-degree semantic segmentation by processing the image collected by a single panoramic camera, reducing redundancy and delay.
2. No new images need to be labeled, saving data preparation time and cost. The invention requires only a semantic segmentation data set of conventional pinhole camera images for training and does not require labeling a panoramic image data set, reducing the time and cost of data preparation.
3. The generated semantic segmentation model is highly reliable. Because training uses semantic segmentation data sets of conventional pinhole camera images, the invention can exploit the abundant and diverse data already existing in the industry to train a reliable model.
4. 360-degree continuous and seamless semantic segmentation. Because the padding mode of the convolutional layers is modified at deployment and the prediction of each panorama segment image takes the information of adjacent images into account, continuous and seamless semantic prediction is achieved, gaps caused by segmenting are avoided, and blind areas are eliminated.
5. Smooth semantic segmentation. At deployment, the feature images of the different segments are stitched and pooled, which filters out noise and yields smooth semantic predictions.
Drawings
FIG. 1 is a schematic diagram of module connections;
FIG. 2 is a panoramic image;
FIG. 3 is a panoramic expansion image;
FIG. 4 shows the result of semantic segmentation of a panoramic image.
Detailed Description
The implementation of the present invention and its technical effects are described in detail below with reference to examples.
In the following embodiments, a panoramic camera is used to acquire a panoramic image as shown in fig. 2, and the image is segmented according to the following steps:
(1) the encoder-decoder type semantic segmentation network ERF-PSPNet is trained with the Cityscapes data set to obtain a semantic segmentation model F; as shown in the following table, layers 1 to 16 are the encoder part and layers 17 to 20 are the decoder part.
(2) The panoramic image shown in fig. 2 is expanded, specifically:
A plane coordinate system is set with the center of the annular panoramic image as the origin O(0,0) and with X and Y axes; the inner radius of the panoramic image is r, the outer radius is R, the radius of the middle circle is r1 = (R + r)/2, and the azimuth angle is β = tan⁻¹(y*/x*). The panoramic image is unfolded into a cylindrical expansion image starting from the intersection (r, 0) of the inner circle with the X axis and proceeding along the azimuth direction β. The correspondence between a pixel coordinate (x**, y**) of the cylindrical expansion image and the pixel coordinate (x*, y*) of the panoramic image is:
x* = (y** + r)·cos β
y* = (y** + r)·sin β
β = 360°·x**/(π(R + r))
In the formulas, x**, y** are pixel coordinate values of the panoramic cylindrical expansion image, x*, y* are pixel coordinate values of the panoramic image, R is the outer radius of the annular panoramic image, r is the inner radius, and β is the azimuth angle of the annular panoramic image coordinates.
The image obtained after unfolding is shown in fig. 3 and named Pu.
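As a non-limiting illustration, the coordinate mapping above can be sketched in Python; the function name and the radian-based azimuth computation are illustrative assumptions (the formulas state β in degrees, which is equivalent):

```python
import math

def unwrap_coords(x_u, y_u, R, r):
    """Map a pixel (x_u, y_u) of the cylindrical expansion image back
    to its source pixel (x_p, y_p) in the annular panoramic image.

    The unfolded width is taken as pi*(R + r), the circumference of
    the middle circle of radius (R + r)/2, so a full traversal of x_u
    sweeps the full 360-degree azimuth.
    """
    beta = 2.0 * math.pi * x_u / (math.pi * (R + r))  # azimuth, radians
    rho = y_u + r                                     # radial distance
    return rho * math.cos(beta), rho * math.sin(beta)
```

At x_u = 0 the mapping lands on the inner circle's intersection with the X axis, matching the stated unfolding origin.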
Those skilled in the art can also use the OCamCalib tool to unfold the panoramic image shown in fig. 2; see Scaramuzza, D., Martinelli, A. and Siegwart, R., 2006. A toolbox for easily calibrating omnidirectional cameras. In 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 5695-5701). IEEE.
(3) The image Pu is divided into 4 segments along the unfolding direction, i.e. the horizontal direction in the figure, obtaining the panorama segment images P1, P2, P3, P4.
(4) The 4 panorama segment images P1, P2, P3, P4 are respectively input into the encoders of the segmentation model F to obtain the corresponding feature images T1, T2, T3, T4; one encoder corresponds to one panorama segment image.
For a panorama segment image Pi, the encoder Fi pads the boundary of the feature image output by the k-th convolutional layer with adjacent element values: the left boundary of the feature image is padded with the right-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FL, and the right boundary is padded with the left-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FR, with k ≥ 2.
The encoders Fi, FL and FR denote the encoders that process the panorama segment images Pi, PL and PR, respectively.
Subscripts L, R satisfy: L = i − 1 (taking L = 4 when i = 1) and R = i + 1 (taking R = 1 when i = 4), i.e., the panorama segments are treated as circularly adjacent.
Specifically, in this embodiment, for layers 1 to 8 of the corresponding table, one element is taken from the boundary of the adjacent feature image each time padding is performed in the boundary convolution calculation; for layers 9 to 16, the number of elements taken matches the dilation rate of the dilated convolution.
(5) The feature images T1 to T4 are stitched along the unfolding direction of the panoramic image to obtain the stitched feature image T.
(6) The stitched feature image T is pooled along the unfolding direction of the panoramic image with pooling ratio 4 to obtain the pooled feature image Tp.
(7) The feature image Tp is input into the decoder and up-sampled with bilinear interpolation to the resolution of the unfolded panoramic image Pu, obtaining the panoramic segmentation image Ps, as shown in fig. 4, where different gray values represent different segmentation categories.
Comparing fig. 4 with fig. 2 manually, it can be seen that the segments transition smoothly, with no discontinuity caused by segmenting the panoramic image; elements such as roads, vehicles, houses, and even finer street lamps and pedestrians are clearly and accurately segmented with high precision; and because the trained semantic segmentation model is used, the results are highly reliable.
In addition, the pipeline from the image input of step 2 to the image output of step 7 runs at up to 40 frames per second on an Nvidia Titan RTX, and the accuracy (mean intersection-over-union) improves by more than 25% over existing segmentation methods that directly input the whole panoramic image without adaptation.
Claims (4)
1. A method for continuous segmentation of a panoramic image, the method comprising at least:
(1) unfolding the panoramic image to obtain an image Pu;
(2) averagely dividing the image Pu into N segments along the unfolding direction to obtain panorama segment images, sequentially denoted from left to right as P1, P2, …, Pi, …, PN, i = 1, 2, …, N;
(3) training an encoder-decoder type segmentation network using an image segmentation data set to obtain a segmentation model F, and respectively inputting the N panorama segment images into the encoders of the segmentation model F to obtain feature images T1, T2, …, Ti, …, TN corresponding to the panorama segment images P1, P2, …, Pi, …, PN, i = 1, 2, …, N;
wherein, for a panorama segment image Pi, the encoder Fi pads the boundary of the feature image output by the k-th convolutional layer with adjacent element values: the left boundary of the feature image is padded with the right-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FL, and the right boundary is padded with the left-boundary element values of the feature image output by the (k-1)-th convolutional layer of encoder FR, with k ≥ 1; the feature image output by the 0-th convolutional layer is the original image, i.e., the panorama segment image Pi;
the encoders Fi, FL, FR respectively denote the encoders for processing the panorama segment images Pi, PL, PR, and the subscripts L, R satisfy: L = i − 1 (taking L = N when i = 1) and R = i + 1 (taking R = 1 when i = N);
(4) stitching the feature images T1 to TN along the unfolding direction of the panoramic image to obtain a stitched feature image T;
(5) pooling the stitched feature image T along the unfolding direction of the panoramic image with pooling ratio N to obtain a pooled feature image Tp;
(6) inputting the feature image Tp into a decoder and up-sampling to the resolution of the unfolded panoramic image Pu to obtain a panoramic segmentation image Ps.
2. The method of claim 1, wherein the segmenting is based on semantics.
3. The method of claim 2, wherein the segmentation model F is trained using a semantic segmentation dataset of pinhole camera images.
4. The method of claim 1, wherein in step 6, the upsampling method may use bilinear interpolation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010198068.0A CN111461125B (en) | 2020-03-19 | 2020-03-19 | Continuous segmentation method of panoramic image |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010198068.0A CN111461125B (en) | 2020-03-19 | 2020-03-19 | Continuous segmentation method of panoramic image |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111461125A true CN111461125A (en) | 2020-07-28 |
CN111461125B CN111461125B (en) | 2022-09-20 |
Family
ID=71683586
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010198068.0A Active CN111461125B (en) | 2020-03-19 | 2020-03-19 | Continuous segmentation method of panoramic image |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111461125B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10191361A (en) * | 1996-10-24 | 1998-07-21 | Matsushita Electric Ind Co Ltd | Supplement method for image signal, image signal coder and image signal decoder |
WO2014043814A1 (en) * | 2012-09-21 | 2014-03-27 | Tamaggo Inc. | Methods and apparatus for displaying and manipulating a panoramic image by tiles |
CN109285168A (en) * | 2018-07-27 | 2019-01-29 | 河海大学 | A kind of SAR image lake boundary extraction method based on deep learning |
CN109961427A (en) * | 2019-03-12 | 2019-07-02 | 北京羽医甘蓝信息技术有限公司 | The method and apparatus of whole scenery piece periapical inflammation identification based on deep learning |
CN110188817A (en) * | 2019-05-28 | 2019-08-30 | 厦门大学 | A kind of real-time high-performance street view image semantic segmentation method based on deep learning |
CN110197529A (en) * | 2018-08-30 | 2019-09-03 | 杭州维聚科技有限公司 | Interior space three-dimensional rebuilding method |
CN110503651A (en) * | 2019-08-09 | 2019-11-26 | 北京航空航天大学 | A kind of significant object segmentation methods of image and device |
CN110675401A (en) * | 2018-07-02 | 2020-01-10 | 浙江大学 | Panoramic image pixel block filtering method and device |
- 2020-03-19 CN CN202010198068.0A patent/CN111461125B/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10191361A (en) * | 1996-10-24 | 1998-07-21 | Matsushita Electric Ind Co Ltd | Supplement method for image signal, image signal coder and image signal decoder |
WO2014043814A1 (en) * | 2012-09-21 | 2014-03-27 | Tamaggo Inc. | Methods and apparatus for displaying and manipulating a panoramic image by tiles |
CN110675401A (en) * | 2018-07-02 | 2020-01-10 | 浙江大学 | Panoramic image pixel block filtering method and device |
CN109285168A (en) * | 2018-07-27 | 2019-01-29 | 河海大学 | A kind of SAR image lake boundary extraction method based on deep learning |
CN110197529A (en) * | 2018-08-30 | 2019-09-03 | 杭州维聚科技有限公司 | Interior space three-dimensional rebuilding method |
CN109961427A (en) * | 2019-03-12 | 2019-07-02 | 北京羽医甘蓝信息技术有限公司 | The method and apparatus of whole scenery piece periapical inflammation identification based on deep learning |
CN110188817A (en) * | 2019-05-28 | 2019-08-30 | 厦门大学 | A kind of real-time high-performance street view image semantic segmentation method based on deep learning |
CN110503651A (en) * | 2019-08-09 | 2019-11-26 | 北京航空航天大学 | A kind of significant object segmentation methods of image and device |
Non-Patent Citations (3)
Title |
---|
AMER, Y.Y. et al.: "An Efficient Segmentation Algorithm for Panoramic Dental Images", 《PROCEDIA COMPUTER SCIENCE》 *
KAILUN YANG等: "PASS: Panoramic Annular Semantic Segmentation", 《IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS》 * |
TANG, Yiping et al.: "Research on face recognition technology in unconstrained environments", 《Journal of Zhejiang University of Technology》 *
Also Published As
Publication number | Publication date |
---|---|
CN111461125B (en) | 2022-09-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111462329B (en) | Three-dimensional reconstruction method of unmanned aerial vehicle aerial image based on deep learning | |
CN111325794B (en) | Visual simultaneous localization and map construction method based on depth convolution self-encoder | |
CN110570371A (en) | image defogging method based on multi-scale residual error learning | |
CN105550995A (en) | Tunnel image splicing method and system | |
CN110298884B (en) | Pose estimation method suitable for monocular vision camera in dynamic environment | |
CN111950477A (en) | Single-image three-dimensional face reconstruction method based on video surveillance | |
CN112767467B (en) | Double-image depth estimation method based on self-supervision deep learning | |
CN111612825B (en) | Image sequence motion shielding detection method based on optical flow and multi-scale context | |
CN113222124B (en) | SAUNet + + network for image semantic segmentation and image semantic segmentation method | |
KR102157610B1 (en) | System and method for automatically detecting structural damage by generating super resolution digital images | |
CN113284251B (en) | Cascade network three-dimensional reconstruction method and system with self-adaptive view angle | |
JP4772789B2 (en) | Method and apparatus for determining camera pose | |
CN112509106A (en) | Document picture flattening method, device and equipment | |
KR101915540B1 (en) | Compose System on Image Similarity Analysis And Compose Method on Image Similarity Analysis | |
CN115115859A (en) | Long linear engineering construction progress intelligent identification and analysis method based on unmanned aerial vehicle aerial photography | |
CN113283525A (en) | Image matching method based on deep learning | |
CN111654621B (en) | Dual-focus camera continuous digital zooming method based on convolutional neural network model | |
CN116778288A (en) | Multi-mode fusion target detection system and method | |
CN114526728B (en) | Monocular vision inertial navigation positioning method based on self-supervision deep learning | |
CN113506342B (en) | SLAM omni-directional loop correction method based on multi-camera panoramic vision | |
CN111461125B (en) | Continuous segmentation method of panoramic image | |
CN117315169A (en) | Live-action three-dimensional model reconstruction method and system based on deep learning multi-view dense matching | |
CN115063717B (en) | Video target detection and tracking method based on real scene modeling of key area | |
CN116109778A (en) | Face three-dimensional reconstruction method based on deep learning, computer equipment and medium | |
CN111080533A (en) | Digital zooming method based on self-supervision residual error perception network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||