CN117196958B - Picture splicing method, device, equipment and storage medium based on deep learning - Google Patents


Info

Publication number
CN117196958B
CN117196958B (application CN202311452275.4A)
Authority
CN
China
Prior art keywords
picture
pictures
homography matrix
matching
group
Prior art date
Legal status
Active
Application number
CN202311452275.4A
Other languages
Chinese (zh)
Other versions
CN117196958A (en)
Inventor
周斌
丁明
陈应文
徐洪亮
许洁斌
Current Assignee
Guangzhou Xuanwu Wireless Technology Co Ltd
Original Assignee
Guangzhou Xuanwu Wireless Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Xuanwu Wireless Technology Co Ltd filed Critical Guangzhou Xuanwu Wireless Technology Co Ltd
Priority to CN202311452275.4A priority Critical patent/CN117196958B/en
Publication of CN117196958A publication Critical patent/CN117196958A/en
Application granted granted Critical
Publication of CN117196958B publication Critical patent/CN117196958B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a picture splicing method, device, equipment and storage medium based on deep learning. The method comprises the following steps: acquiring an atlas formed by a plurality of pictures; performing homography matrix estimation on the pictures of the atlas two by two through a homography matrix estimation model obtained by deep learning training, determining the matching relation between pictures according to the estimation results, and determining a matching picture group; forming an undirected graph according to the matching relation between the pictures in the matching picture group; setting a reference picture, and determining paths from the remaining pictures in the matching picture group to the reference picture through a search algorithm; calculating the homography matrix from each remaining picture to the reference picture according to its path; and splicing all the pictures in the matching picture group according to the homography matrices from the remaining pictures to the reference picture to obtain the spliced picture. The invention solves the technical problem that traditional picture splicing methods are limited by the overlap between pictures and by the quality of the pictures themselves.

Description

Picture splicing method, device, equipment and storage medium based on deep learning
Technical Field
The present invention relates to the field of image stitching technologies, and in particular, to a method, an apparatus, a device, and a storage medium for image stitching based on deep learning.
Background
In the fast-moving consumer goods field, a salesperson needs to take stock of the company's products sold in multiple stores in order to understand sales and market demand. Traditional intelligent stocktaking technology mainly comprises recognition based on a single picture and recognition based on splicing multiple pictures. In the single-picture method, the salesperson photographs the products in a store and a back-end intelligent algorithm recognizes the picture; because shelves are often too long or aisles too narrow, a single photograph frequently fails to show all the products, while photographing several pictures for separate recognition may produce repeated regions between pictures and thus double-counting of products. In the multi-picture splicing method, the salesperson photographs several pictures with overlapping regions in the store, the pictures are spliced into one picture by a picture splicing method, and intelligent recognition is performed on the spliced picture. However, since traditional splicing algorithms obtain the homography matrix between pictures by matching and fitting picture feature points, feature point matching cannot be completed for pictures without overlapping regions, nor when the pictures contain large solid-color or blurred regions. It can be seen that the traditional picture splicing method is limited by the overlap between pictures and by the quality of the pictures themselves.
Disclosure of Invention
The invention provides a picture splicing method, device, equipment and storage medium based on deep learning, which are used for solving the technical problems that the traditional picture splicing method is limited by the overlapping condition among pictures and the quality of the pictures.
In order to solve the above technical problems, an embodiment of the present invention provides a method for splicing pictures based on deep learning, including: acquiring an atlas formed by a plurality of pictures;
homography matrix estimation is carried out on the images of the atlas pairwise through a homography matrix estimation model, and a homography matrix estimation result is obtained; wherein the homography matrix estimation model is obtained through deep learning training;
determining the matching relation between pictures according to the homography matrix estimation result, and determining a matching picture group according to the matching relation between the pictures; any picture in the matching picture group has a matching relation with at least one other picture in the matching picture group;
forming an undirected graph according to the matching relation between the pictures in the matching picture group; nodes corresponding to pictures with a matchable relation in the matching picture group are adjacent in the undirected graph;
setting one picture in the matching picture group as a reference picture, and determining paths from the rest pictures in the matching picture group to the reference picture through a search algorithm; the paths are paths among nodes corresponding to each picture in the undirected graph;
calculating homography matrixes from the rest pictures to the reference pictures according to the transfer relation between homography matrixes of the corresponding pictures on the paths;
and splicing all the pictures in the matching picture group according to the homography matrix from the rest pictures in the matching picture group to the reference picture to obtain spliced pictures.
Preferably, the training process of the homography matrix estimation model includes:
acquiring a plurality of positive samples and a plurality of negative samples; wherein each positive sample comprises: a first positive sample picture, a positive sample homography matrix, and a second positive sample picture generated based on the first positive sample picture and the positive sample homography matrix; each of the negative samples comprises: the first negative sample picture, the second negative sample picture and the negative sample homography matrix;
constructing an initial neural network model, taking the positive sample and the negative sample as inputs, taking a prediction homography matrix corresponding to each sample as output, and training the neural network model;
and taking the trained neural network model as the homography matrix estimation model.
Preferably, the generating of the positive sample includes:
acquiring a plurality of pictures to be processed, and aiming at each picture to be processed, performing the following operations:
randomly generating a first frame in the picture to be processed, and matting out a rectangular area of the first frame in the picture to be processed to obtain the first positive sample picture;
randomly displacing three vertexes of the first frame to obtain a second frame;
according to the vertex pixel coordinates of the first frame and the second frame, calculating to obtain a homography matrix from the second frame to the first frame, wherein the homography matrix is used as the positive sample homography matrix;
according to the positive sample homography matrix, homography transformation is carried out on the to-be-processed picture and the second frame at the same time, a transformed to-be-processed picture and a third frame are respectively obtained, and a rectangular area of the third frame in the transformed to-be-processed picture is scratched out to obtain a second positive sample picture;
and generating a positive sample according to the first positive sample picture, the second positive sample picture and the positive sample homography matrix.
Preferably, the generating of the negative sample includes:
acquiring a plurality of graph groups to be processed; each to-be-processed image group comprises two images which do not have a matching relationship;
for each group of graphs to be processed, the following operations are performed:
generating a fourth frame in a picture in the to-be-processed picture group, and matting out a rectangular area of the fourth frame in the picture to obtain a first negative sample picture; generating a fifth frame with the same shape as the fourth frame in another picture in the to-be-processed picture group, and digging out a rectangular area of the fifth frame in the other picture to obtain a second negative sample picture;
generating a negative sample according to the first negative sample picture, the second negative sample picture and a preset negative sample homography matrix; wherein the preset negative sample homography matrix is the zero matrix.
As a preferable scheme, during training, the positive sample and the negative sample are input into the homography matrix estimation network to obtain corresponding prediction output;
calculating the loss function values of the predicted output and the corresponding target output of the positive sample and the negative sample according to a preset loss function; the target output of the positive sample is the positive sample homography matrix, and the target output of the negative sample is the negative sample homography matrix;
and optimizing the homography matrix estimation network according to the loss function value so as to enable the loss function value to be converged.
Preferably, the search algorithm includes: depth-first search, breadth-first search, or Dijkstra's algorithm.
As a preferred scheme, when the pictures in the matching picture group are spliced, linear fusion is performed on the intermediate spliced picture each time the splicing of two pictures is completed.
On the basis of the above embodiment, another embodiment of the present invention provides a deep learning-based picture stitching apparatus, including: the system comprises a picture acquisition module, a homography matrix estimation module and a picture splicing module;
the image acquisition module is used for acquiring an atlas formed by a plurality of images and transmitting the atlas to the homography matrix estimation module;
the homography matrix estimation module is used for carrying out homography matrix estimation on the pictures of the atlas two by two through a homography matrix estimation model to obtain a homography matrix estimation result, and transmitting the homography matrix estimation result to the picture splicing module; wherein the homography matrix estimation model is obtained through deep learning training;
The picture splicing module is used for determining the matching relation between pictures according to the homography matrix estimation result and determining a matching picture group according to the matching relation between the pictures; forming an undirected graph according to the matching relation between the pictures in the matching picture group; setting one picture in the matching picture group as the reference picture, and determining paths from the remaining pictures in the matching picture group to the reference picture through a search algorithm; calculating the homography matrices from the remaining pictures to the reference picture according to the transfer relation between the homography matrices of the corresponding pictures on the paths; and splicing all the pictures in the matching picture group according to the homography matrices from the remaining pictures in the matching picture group to the reference picture to obtain the spliced picture. Any picture in the matching picture group has a matching relation with at least one other picture in the matching picture group; nodes corresponding to pictures with a matchable relation in the matching picture group are adjacent in the undirected graph; the paths are paths between the nodes corresponding to the pictures in the undirected graph.
On the basis of the foregoing embodiment, another embodiment of the present invention provides a deep learning-based picture stitching apparatus, including: a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor; the image splicing method based on the deep learning according to the embodiment of the invention is realized when the processor executes the computer program.
On the basis of the foregoing embodiments, an embodiment of the present invention further provides a storage medium, including: a stored computer program; and controlling equipment where the storage medium is located to execute the image splicing method based on the deep learning according to the embodiment of the invention when the computer program runs.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
The homography matrix estimation model obtained by deep learning self-supervised training performs homography matrix estimation on the pictures of the atlas, and the matching relation between pictures is determined according to the homography matrix estimation results; a matching picture group is determined according to the matching relation between the pictures; an undirected graph is formed according to the matching relation between the pictures in the matching picture group; one picture in the matching picture group is set as the reference picture, and paths from the other pictures to the reference picture are determined through a search algorithm; the homography matrices from the other pictures to the reference picture are calculated according to the transfer relation between the homography matrices of the corresponding pictures on each path; and all the pictures in the matching picture group are spliced according to the homography matrices from the remaining pictures to the reference picture. For a picture in the matching picture group that has no overlapping region with the reference picture, the invention searches for a path from that picture to the reference picture through a search algorithm, thereby determining the homography matrix from that picture to the reference picture, so that pictures without an overlapping region with the reference picture can still be spliced to the reference picture. In addition, the splicing method obtains the homography matrix estimation model by deep learning self-supervised training and does not need to match feature points when estimating the homography matrix between pictures, so the quality requirements on the pictures are not strict. In conclusion, the invention solves the technical problem that the traditional picture splicing method is limited by the overlap between pictures and by the quality of the pictures themselves.
Drawings
Fig. 1 is a schematic flow chart of a picture stitching method based on deep learning according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a neural network model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of a positive sample generation process according to an embodiment of the present invention;
FIG. 4 is an undirected graph of matching relationships between pictures in a matching graph group according to one embodiment of the present invention;
FIG. 5 is a schematic diagram of a linear fusion according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a picture stitching device based on deep learning according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to fig. 1, a flow chart of a picture stitching method based on deep learning according to an embodiment of the present invention includes:
s1, acquiring an atlas formed by a plurality of pictures.
In practical applications, an atlas formed by several pictures of the same scene at different angles is usually obtained.
S2, carrying out homography matrix estimation on the images of the atlas two by two through a homography matrix estimation model to obtain a homography matrix estimation result; the homography matrix estimation model is obtained through deep learning training.
In a preferred embodiment, the training process of the homography matrix estimation model includes:
acquiring a plurality of positive samples and a plurality of negative samples; wherein each positive sample comprises: a first positive sample picture, a positive sample homography matrix, and a second positive sample picture generated based on the first positive sample picture and the positive sample homography matrix; each of the negative samples comprises: the first negative sample picture, the second negative sample picture and the negative sample homography matrix;
constructing an initial neural network model, taking the positive sample and the negative sample as inputs, taking a prediction homography matrix corresponding to each sample as output, and training the neural network model;
and taking the trained neural network model as the homography matrix estimation model.
When the initial neural network is constructed, an end-to-end network is built from a fully convolutional neural network followed by two fully connected layers; the input of the network is two grayscale images, and the output is a homography matrix formed by 9 values.
Referring to fig. 2, a schematic structural diagram of a neural network model according to an embodiment of the present invention is provided, where the structure of the neural network model includes four stages:
first stage (stage 1): a feature extraction structure formed by 10 convolutional layers; the dimension of the input picture (width × height × number of channels) is 64 × 64 × 2, and the output feature dimension is 64 × 64 × 64;
second stage (stage 2): a feature extraction structure formed by 10 convolutional layers; the feature output by stage 1 is downsampled by a max pooling layer (Max Pooling) to a dimension of 32 × 32 × 64 and used as the input of stage 2; the output feature dimension is 32 × 32 × 128;
third stage (stage 3): a feature extraction structure formed by 10 convolutional layers; the feature output by stage 2 is downsampled by a max pooling layer (Max Pooling) to a dimension of 16 × 16 × 128 and used as the input of stage 3; the output feature dimension is 16 × 16 × 256;
fourth stage (stage 4): 2 fully connected layers; the feature output by stage 3 is reshaped into a one-dimensional feature of length 65536 as the input of stage 4; the first fully connected layer outputs 1024 values, and the second fully connected layer outputs a feature of dimension 9.
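As a cross-check of the dimensions described above, the shapes can be walked through with a short sketch (the halving of the spatial size by each max pooling step is assumed from the description; the function name is hypothetical, not from the patent):

```python
# Sketch of the tensor shapes through the four stages described above.
# Assumes each max-pooling step halves the spatial size.

def stage_shapes(h=64, w=64):
    shapes = {"input": (h, w, 2)}
    shapes["stage1"] = (h, w, 64)             # 10 conv layers, channels -> 64
    h, w = h // 2, w // 2                     # max pooling before stage 2
    shapes["stage2"] = (h, w, 128)
    h, w = h // 2, w // 2                     # max pooling before stage 3
    shapes["stage3"] = (h, w, 256)
    flat = h * w * 256                        # reshape for the fully connected layers
    shapes["fc"] = (flat, 1024, 9)            # fc1 -> 1024, fc2 -> 9 (homography)
    return shapes
```

Running the sketch reproduces the 16 × 16 × 256 stage-3 output, the 65536-length flattened feature, and the final 9-value output.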
The neural network model is trained by self-supervision, so that the training cost is low and the model precision is high.
In a preferred embodiment, the generating of the positive sample includes:
acquiring a plurality of pictures to be processed, and aiming at each picture to be processed, performing the following operations:
randomly generating a first frame in the picture to be processed, and matting out a rectangular area of the first frame in the picture to be processed to obtain the first positive sample picture;
randomly displacing three vertexes of the first frame to obtain a second frame;
according to the vertex pixel coordinates of the first frame and the second frame, calculating to obtain a homography matrix from the second frame to the first frame, wherein the homography matrix is used as the positive sample homography matrix;
according to the positive sample homography matrix, homography transformation is carried out on the to-be-processed picture and the second frame at the same time, a transformed to-be-processed picture and a third frame are respectively obtained, and a rectangular area of the third frame in the transformed to-be-processed picture is scratched out to obtain a second positive sample picture;
and generating a positive sample according to the first positive sample picture, the second positive sample picture and the positive sample homography matrix.
Referring to fig. 3, a schematic diagram of a positive sample generation process according to an embodiment of the present invention is shown, and the operations performed on each picture to be processed specifically are:
(1) Randomly generating a rectangular first frame in the picture to be processed, i.e. the white frame in the first picture of the first row in FIG. 3; the picture of the rectangular area of the white frame is cut out and taken as the first positive sample picture, i.e. the first picture of the second row in FIG. 3;
(2) Randomly displacing the three vertices other than the upper-left corner of the first frame by a certain distance to obtain the second frame, i.e. the gray frame in the second picture of the first row in FIG. 3;
if the first frame has a width W and a length L, and the pixel coordinates of a vertex of the frame are (x, y), randomly displacing the vertex can be expressed as:

x' = x + Δx,  y' = y + Δy

where Δx and Δy are random offsets whose magnitudes are bounded by fractions of the frame width W and length L, respectively.
(3) Calculating a positive sample homography matrix H according to the corresponding matching relation of the pixel coordinates of the four vertexes of the first frame and the second frame; the calculation of the positive sample homography matrix specifically comprises the following steps:
A. Constructing the coefficient matrix
If the pixel coordinates of the four vertices of the first frame are (x1, y1), (x2, y2), (x3, y3) and (x4, y4), and the corresponding four vertices of the second frame are (x1', y1'), (x2', y2'), (x3', y3') and (x4', y4'), each correspondence (xi', yi') → (xi, yi) contributes two rows to the 8 × 9 coefficient matrix A:

[ xi'  yi'  1   0    0    0   −xi·xi'  −xi·yi'  −xi ]
[ 0    0    0   xi'  yi'  1   −yi·xi'  −yi·yi'  −yi ]

so that the flattened homography vector h satisfies A·h = 0.
B. Performing singular value decomposition on the matrix A to obtain its singular value decomposition result:

A = U·Σ·Vᵀ

where U and V are orthogonal matrices and Σ is a diagonal matrix of singular values.
C. Taking the last column of V, i.e. the singular vector corresponding to the minimum singular value, as the vector h, and reshaping h into a 3 × 3 matrix.
D. Normalizing the matrix, i.e. dividing all of its elements by its last element, to obtain the positive sample homography matrix H.
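The four steps above are the standard direct linear transform (DLT); a minimal NumPy sketch (an illustration, not the patent's implementation — the function name is hypothetical):

```python
import numpy as np

def dlt_homography(src_pts, dst_pts):
    """Estimate H mapping src_pts -> dst_pts from 4 point pairs via SVD (steps A-D)."""
    rows = []
    for (xs, ys), (xd, yd) in zip(src_pts, dst_pts):
        rows.append([xs, ys, 1, 0, 0, 0, -xd * xs, -xd * ys, -xd])
        rows.append([0, 0, 0, xs, ys, 1, -yd * xs, -yd * ys, -yd])
    A = np.array(rows, dtype=float)   # 8x9 coefficient matrix (step A)
    _, _, Vt = np.linalg.svd(A)       # A = U @ S @ Vt (step B)
    h = Vt[-1]                        # singular vector of the minimum singular value (step C)
    H = h.reshape(3, 3)
    return H / H[2, 2]                # normalize by the last element (step D)
```

For example, four corners displaced by a pure translation recover the corresponding translation homography exactly.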
(4) According to the positive sample homography matrix H, homography transformation is performed on the picture to be processed and the second frame simultaneously, obtaining the transformed picture to be processed (i.e. the third picture of the first row in fig. 3) and the third frame (i.e. the gray frame in the third picture of the first row in fig. 3); the rectangular area of the third frame in the transformed picture to be processed is cut out to obtain the second positive sample picture, i.e. the second picture of the second row in fig. 3;
when training the neural network model, the preprocessing of the positive sample picture is required:
scaling the first positive sample picture, the second positive sample picture, the first negative sample picture and the second negative sample picture so that the long side becomes 64 pixels, padding the short side with black to obtain a scaled and padded picture of size 64 × 64, converting it to grayscale, and stacking the two grayscale pictures along the channel dimension into input data of size 64 × 64 × 2; the scale factor is calculated as:

scale = 64 / max(W, H)

where W and H are the width and height of the original picture.
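This preprocessing can be sketched for one picture as follows (a sketch assuming, per the description, that the long side is scaled to 64 and the short side padded with black; the function name is hypothetical):

```python
def preprocess_size(w, h, target=64):
    """Scale so the long side becomes `target`, then pad the short side with black."""
    scale = target / max(w, h)                     # scale factor from the formula above
    new_w, new_h = round(w * scale), round(h * scale)
    pad_w, pad_h = target - new_w, target - new_h  # black padding on the short side
    return scale, (new_w, new_h), (pad_w, pad_h)
```

For a 128 × 64 picture this gives a scale of 0.5, a resized picture of 64 × 32, and 32 rows of black padding.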
(5) Generating a positive sample according to the first positive sample picture, the second positive sample picture and the positive sample homography matrix;
firstly, generating a pseudo tag GT_H2 according to a positive sample homography matrix H, wherein the specific steps are as follows:
A. Calculating the scaling matrix scaleH, which maps original pixel coordinates to the scaled 64 × 64 input coordinates:

scaleH = [ scale  0      0 ]
         [ 0      scale  0 ]
         [ 0      0      1 ]

B. Calculating the pseudo tag GT_H2 by expressing H in the scaled coordinate system:

GT_H2 = scaleH · H · scaleH⁻¹
and then forming a positive sample according to the first positive sample picture, the second positive sample picture and the pseudo tag GT_H2.
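The pseudo-tag construction can be sketched as follows (the diagonal form of scaleH and the similarity-transform formula are assumptions consistent with expressing H in the scaled coordinates; the function name is hypothetical):

```python
import numpy as np

def pseudo_label(H, scale):
    """Express H in the 64x64 input coordinate system: GT_H2 = S @ H @ inv(S)."""
    S = np.diag([scale, scale, 1.0])   # scaling matrix scaleH
    return S @ H @ np.linalg.inv(S)
```

Under this transform a translation (tx, ty) in the original picture becomes (scale·tx, scale·ty) in the scaled picture, as expected.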
In a preferred embodiment, the generating of the negative sample includes:
acquiring a plurality of graph groups to be processed; each to-be-processed image group comprises two images which do not have a matching relationship;
for each group of graphs to be processed, the following operations are performed:
generating a fourth frame in a picture in the to-be-processed picture group, and matting out a rectangular area of the fourth frame in the picture to obtain a first negative sample picture; generating a fifth frame with the same shape as the fourth frame in another picture in the to-be-processed picture group, and digging out a rectangular area of the fifth frame in the other picture to obtain a second negative sample picture;
generating a negative sample according to the first negative sample picture, the second negative sample picture and a preset negative sample homography matrix; wherein the preset negative sample homography matrix is zero matrix.
It should be noted that, when training the neural network model, the negative sample picture needs to be preprocessed, and the processing manner is the same as that of the positive sample picture, which is not described herein again.
When generating a negative sample, the pseudo tag GT_H2 of the negative sample is the preset negative sample homography matrix, i.e. the zero matrix.
and forming a negative sample according to the first negative sample picture, the second negative sample picture and the pseudo tag GT_H2.
In a preferred embodiment, during training, the positive samples and the negative samples are input into the homography matrix estimation network to obtain corresponding prediction output;
calculating the loss function values of the predicted output and the corresponding target output of the positive sample and the negative sample according to a preset loss function; the target output of the positive sample is the positive sample homography matrix, and the target output of the negative sample is the negative sample homography matrix;
and optimizing the homography matrix estimation network according to the loss function value so as to enable the loss function value to be converged.
The preset loss function is the squared Euclidean distance between the predicted vector and the pseudo tag:

Loss = Σᵢ (pᵢ − gᵢ)²,  i = 1, …, 9

where p is the one-dimensional vector (9 values) output by the model prediction and g is the one-dimensional vector of the pseudo tag.
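A minimal sketch of such a loss between the two 9-value vectors (the squared-Euclidean form is an assumption; the original formula was lost in extraction):

```python
def homography_loss(pred, target):
    """Squared Euclidean distance between the predicted 9-vector and the pseudo tag."""
    return sum((p - g) ** 2 for p, g in zip(pred, target))
```

The loss is zero when the prediction equals the pseudo tag, and grows with the elementwise deviation.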
The optimization strategy for the loss function may be gradient descent, stochastic gradient descent, momentum optimization, or an adaptive learning rate method. Training continues until the loss function converges, i.e. no longer decreases, at which point the trained model is obtained and training is complete.
In the training process, the generation ratio of positive samples to negative samples may be set to 6:4, and the number of training rounds set to at least 100.
S3, determining the matching relation between pictures according to the homography matrix estimation results, and determining a matching picture group according to the matching relation between the pictures; any picture in the matching picture group has a matching relation with at least one other picture in the matching picture group.
Homography matrix estimation is performed directly on the pictures in the atlas two by two using the homography matrix estimation model to obtain homography matrix estimation results. If all 9 values of an estimation result are close to 0 — that is, if the loss function value Loss computed between the estimation result and the pseudo tag of a negative sample (the zero vector) is smaller than 0.4 — the two pictures corresponding to that estimation result are considered to have no matching relationship; otherwise, the two pictures are considered to have a matching relationship. All pictures having a matchable relationship are combined into a matching picture group, in which any picture can find one or more other pictures with which it has a matchable relationship.
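The matching decision can be sketched as follows (the 0.4 threshold comes from the text; the squared-distance loss to the zero pseudo tag is an assumption, and the function name is hypothetical):

```python
def has_match(pred, threshold=0.4):
    """Two pictures match unless the predicted 9-vector is close to the zero pseudo tag."""
    zero_tag = [0.0] * 9
    loss_to_zero = sum((p - g) ** 2 for p, g in zip(pred, zero_tag))
    return loss_to_zero >= threshold   # small loss -> near zero matrix -> no match
```

A near-zero prediction is declared a non-match, while a prediction resembling a genuine homography (e.g. near the identity) is declared a match.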
S4, forming an undirected graph according to the matching relation between the pictures in the matching picture group; the nodes corresponding to pictures with a matchable relation in the matching picture group are adjacent in the undirected graph.
And taking all pictures in the matching graph group as nodes of the undirected graph, and taking the matchable relationship among the pictures as edges among the corresponding nodes to form the undirected graph.
Referring to fig. 4, an undirected graph formed by matching relationships between pictures in a matching graph group according to an embodiment of the present invention is shown; wherein, node 1, node 2, node 3, node 4 and node 5 represent picture 1, picture 2, picture 3, picture 4 and picture 5 in the matching picture group respectively; and if a matchable relationship exists between the picture 1 and the picture 5, the node 1 is adjacent to the node 5, and the adjacent relationship among other nodes is the same.
S5, setting one picture in the matching picture group as a reference picture, and determining paths from the rest pictures in the matching picture group to the reference picture through a search algorithm; the paths are paths among nodes corresponding to each picture in the undirected graph.
The invention does not limit the scheme for selecting the reference picture; for example, the pictures may be numbered and the picture with the middle number selected as the reference picture, and a path leading to the node corresponding to the reference picture can be found from the node corresponding to any picture in the undirected graph.
Assuming that the total number of pictures is N, the intermediate number m is: m = ⌈N/2⌉ (for example, N = 5 gives m = 3).
For illustration, referring to fig. 4, the reference picture is determined as picture 3, and the path from picture 1 to picture 3 is determined as: 1 → 5 → 4 → 3.
In a preferred embodiment, the search algorithm comprises: depth-first, breadth-first, or dijkstra's algorithm.
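Step S5 can be sketched with breadth-first search, one of the algorithms named above. The graph literal reproduces fig. 4, and the middle-numbered reference picture is computed as ⌈N/2⌉, an assumed reading of "the picture with the middle number":

```python
from collections import deque

def bfs_path(graph, start, goal):
    """Return one path from start to goal, or None if they are not connected."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in sorted(graph[node]):   # deterministic visiting order
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None

# Adjacency of fig. 4 (assumed pair list consistent with the description).
graph = {1: {5}, 2: {3}, 3: {2, 4}, 4: {3, 5}, 5: {1, 4}}
n = 5
reference = -(-n // 2)   # ceil(N / 2) -> picture 3
path = bfs_path(graph, 1, reference)
```

With this graph the search recovers the path 1 → 5 → 4 → 3 used in the illustration above.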
S6, calculating homography matrixes from the rest pictures to the reference picture according to the transfer relation between homography matrixes of the corresponding pictures on the paths.
For illustration, referring to fig. 4, the homography matrix from picture 1 to picture 3 is calculated as follows:
firstly, according to the path from picture 1 to picture 3, the homography matrix H15 from picture 1 to picture 5, the homography matrix H54 from picture 5 to picture 4, and the homography matrix H43 from picture 4 to picture 3 are obtained;
then, according to the transfer relation between homography matrices, the homography matrix H13 from picture 1 to picture 3 is calculated by the following formula: H13 = H43 · H54 · H15.
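Composing homographies along the path 1 → 5 → 4 → 3 is plain 3×3 matrix multiplication, applied right to left (H15 acts first). The translation matrices below are illustrative values, not values from the patent:

```python
def matmul3(A, B):
    """3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def translation(tx, ty):
    """Homography that only translates by (tx, ty)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]

H15 = translation(10.0, 0.0)   # picture 1 -> picture 5 (illustrative)
H54 = translation(0.0, 5.0)    # picture 5 -> picture 4 (illustrative)
H43 = translation(-3.0, 0.0)   # picture 4 -> picture 3 (illustrative)

# Transfer relation: H13 = H43 . H54 . H15
H13 = matmul3(H43, matmul3(H54, H15))
```

For these translations the composite is itself a translation by (7, 5), which is a quick sanity check on the multiplication order.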
and S7, splicing all the pictures in the matching picture group according to the homography matrix from the rest pictures in the matching picture group to the reference picture to obtain spliced pictures.
Referring to fig. 4, according to the homography matrix H13 from the picture 1 to the picture 3, the picture 1 can be spliced to the picture 3, and similarly, the picture 2, the picture 4 and the picture 5 can be spliced to the picture 3 to obtain the spliced picture.
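The warping step behind this splicing maps each pixel coordinate of picture 1 into picture 3's frame by multiplying with H13 in homogeneous coordinates and dividing by the third component. The H13 value below is an assumed translation for illustration:

```python
def apply_homography(H, x, y):
    """Map a point (x, y) through the 3x3 homography H."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w  # perspective divide

# Illustrative H13: a pure translation by (7, 5).
H13 = [[1.0, 0.0, 7.0], [0.0, 1.0, 5.0], [0.0, 0.0, 1.0]]
```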
In a preferred embodiment, when all the pictures in the matching picture group are spliced, each time the splicing of two pictures is completed, the method linearly fuses the two spliced pictures.
The specific method for linear fusion comprises the following steps:
determining an overlapping area of two pictures;
for each pixel in the overlapping area, calculating a fused pixel value according to a linear fusion formula; the linear fusion formula is:

output = α · f1 + (1 − α) · f2, where α = d1 / (d1 + d2)

where output is the output pixel value; α is the weight coefficient representing the fusion proportion of the two pictures, with a value range of (0, 1); f1 and f2 are the pixel values of the pixel points at the corresponding positions of picture 1 and picture 2, respectively; d1 is the pixel distance from the pixel point to the boundary of picture 1; d2 is the pixel distance from the pixel point to the boundary of picture 2.

Referring to fig. 5, a linear fusion diagram is provided according to an embodiment of the present invention: output is the star point at the center of fig. 5, f1 is the left picture of fig. 5, f2 is the right picture of fig. 5, d1 is the pixel distance from the star point to the boundary of f1, and d2 is the pixel distance from the star point to the boundary of f2.
assigning the fused pixel values to the pixel points in the overlapping area;
splicing the non-overlapping areas of the two pictures together directly.
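The linear fusion can be sketched for a single grayscale row of the overlap as follows. The weighting α = d1/(d1 + d2), with d1 and d2 taken as the pixel's distances to the two sides of the overlap, is an assumed reading of the feathering scheme described above; near the picture-1 side the blend favors picture 1 and shifts smoothly toward picture 2:

```python
def fuse_pixel(f1, f2, d1, d2):
    """Linear fusion of two pixel values with weight alpha in (0, 1)."""
    alpha = d1 / (d1 + d2)
    return alpha * f1 + (1 - alpha) * f2

def fuse_overlap_row(row1, row2):
    """Blend one row of the overlap region of width w.

    x runs from the picture-1 side (x = 0) to the picture-2 side (x = w - 1);
    d1 = w - x is the distance to picture 1's boundary of the overlap,
    d2 = x + 1 the distance to picture 2's boundary (assumed convention).
    """
    w = len(row1)
    return [fuse_pixel(row1[x], row2[x], w - x, x + 1) for x in range(w)]

fused = fuse_overlap_row([100.0] * 4, [0.0] * 4)
```

For a bright picture-1 row over a dark picture-2 row, the fused values decrease monotonically across the overlap, which is the seam-hiding effect the fusion is meant to produce.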
Example two
Referring to fig. 6, a schematic structural diagram of a deep learning-based picture stitching device according to an embodiment of the present invention includes: the system comprises a picture acquisition module, a homography matrix estimation module and a picture splicing module;
the image acquisition module is used for acquiring an atlas formed by a plurality of images and transmitting the atlas to the homography matrix estimation module;
the homography matrix estimation module is used for carrying out homography matrix estimation on the pictures of the atlas two by two through a homography matrix estimation model to obtain a homography matrix estimation result, and transmitting the homography matrix estimation result to the picture splicing module; wherein the homography matrix estimation model is obtained through deep learning training;
the picture splicing module is used for determining the matching relation between pictures according to the homography matrix estimation result and determining a matching picture group according to the matching relation between the pictures; forming an undirected graph according to the matchable relationships between the pictures in the matching picture group; setting one picture in the matching picture group as a reference picture, and determining paths from the rest pictures in the matching picture group to the reference picture through a search algorithm; calculating homography matrices from the rest pictures to the reference picture according to the transfer relation between homography matrices of the corresponding pictures on the paths; and splicing all the pictures in the matching picture group according to the homography matrices from the rest pictures in the matching picture group to the reference picture to obtain a spliced picture; wherein any picture in the matching picture group has a matchable relationship with at least one other picture in the matching picture group; nodes corresponding to pictures with a matchable relationship in the matching picture group are adjacent in the undirected graph; and the paths are paths among the nodes corresponding to the pictures in the undirected graph.
It should be noted that the above-described apparatus embodiment is merely illustrative; the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, in the drawings of the apparatus embodiment provided by the present invention, the connection relationship between modules indicates that they have communication connections, which may be specifically implemented as one or more communication buses or signal lines. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
It will be clearly understood by those skilled in the art that, for convenience and brevity, the specific working process of the apparatus described above may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Example III
Accordingly, an embodiment of the present invention provides a deep learning-based picture stitching apparatus, where the apparatus includes a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor, where the processor implements the deep learning-based picture stitching method described in the embodiment of the present invention when the processor executes the computer program.
The device may be a computing device such as a desktop computer, a notebook, a palm computer, a cloud server, etc. The device may include, but is not limited to, a processor, a memory.
The processor may be a central processing unit (Central Processing Unit, CPU), but may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the device, connecting the various parts of the overall device using various interfaces and lines.
The memory may be used to store the computer program, and the processor implements the various functions of the device by running or executing the computer program stored in the memory and invoking the data stored in the memory. The memory may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required for at least one function, and the like, and the data storage area may store data created according to the use of the device, and the like. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one magnetic disk storage device, a flash memory device, or other non-volatile solid-state storage device.
Example IV
Accordingly, an embodiment of the present invention provides a storage medium, where the storage medium includes a stored computer program, and when the computer program runs, the device where the computer readable storage medium is located is controlled to execute the image stitching method based on deep learning according to the embodiment of the present invention.
The storage medium is a computer readable storage medium, and the computer program is stored in the computer readable storage medium; when executed by a processor, the computer program can implement the steps of the above-mentioned method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content contained in the computer readable medium may be added to or removed from as appropriate according to the requirements of legislation and patent practice in a given jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunication signals.
The foregoing embodiments have been provided for the purpose of illustrating the general principles of the present invention, and are not to be construed as limiting the scope of the invention. It should be noted that any modifications, equivalent substitutions, improvements, etc. made by those skilled in the art without departing from the spirit and principles of the present invention are intended to be included in the scope of the present invention.

Claims (10)

1. The image splicing method based on the deep learning is characterized by comprising the following steps of:
acquiring an atlas formed by a plurality of pictures;
homography matrix estimation is carried out on the images of the atlas pairwise through a homography matrix estimation model, and a homography matrix estimation result is obtained; wherein the homography matrix estimation model is obtained through deep learning training;
determining a matching relation between pictures according to the homography matrix estimation result, and determining a matching picture group according to the matching relation between the pictures; wherein any picture in the matching picture group has a matchable relationship with at least one other picture in the matching picture group;
forming an undirected graph according to the matchable relationships between the pictures in the matching picture group; wherein nodes corresponding to pictures with a matchable relationship in the matching picture group are adjacent in the undirected graph;
setting one picture in the matching picture group as a reference picture, and determining paths from the rest pictures in the matching picture group to the reference picture through a search algorithm; the paths are paths among nodes corresponding to each picture in the undirected graph;
calculating homography matrixes from the rest pictures to the reference pictures according to the transfer relation between homography matrixes of the corresponding pictures on the paths;
and splicing all the pictures in the matching picture group according to the homography matrix from the rest pictures in the matching picture group to the reference picture to obtain spliced pictures.
2. The deep learning based picture stitching method according to claim 1, wherein the training process of the homography matrix estimation model includes:
acquiring a plurality of positive samples and a plurality of negative samples; wherein each positive sample comprises: a first positive sample picture, a positive sample homography matrix, and a second positive sample picture generated based on the first positive sample picture and the positive sample homography matrix; each of the negative samples comprises: the first negative sample picture, the second negative sample picture and the negative sample homography matrix;
constructing an initial neural network model, taking the positive sample and the negative sample as inputs, taking a prediction homography matrix corresponding to each sample as output, and training the neural network model;
and taking the trained neural network model as the homography matrix estimation model.
3. The deep learning based picture stitching method according to claim 2, wherein the generating of the positive samples comprises:
acquiring a plurality of pictures to be processed, and aiming at each picture to be processed, performing the following operations:
randomly generating a first frame in the picture to be processed, and matting out a rectangular area of the first frame in the picture to be processed to obtain the first positive sample picture;
randomly displacing three vertexes of the first frame to obtain a second frame;
according to the vertex pixel coordinates of the first frame and the second frame, calculating to obtain a homography matrix from the second frame to the first frame, wherein the homography matrix is used as the positive sample homography matrix;
according to the positive sample homography matrix, homography transformation is carried out on the to-be-processed picture and the second frame at the same time, a transformed to-be-processed picture and a third frame are respectively obtained, and a rectangular area of the third frame in the transformed to-be-processed picture is scratched out to obtain a second positive sample picture;
and generating a positive sample according to the first positive sample picture, the second positive sample picture and the positive sample homography matrix.
4. The deep learning based picture stitching method of claim 2, wherein the generating of the negative samples comprises:
acquiring a plurality of graph groups to be processed; each to-be-processed image group comprises two images which do not have a matching relationship;
for each group of graphs to be processed, the following operations are performed:
generating a fourth frame in a picture in the to-be-processed picture group, and matting out a rectangular area of the fourth frame in the picture to obtain a first negative sample picture; generating a fifth frame with the same shape as the fourth frame in another picture in the to-be-processed picture group, and digging out a rectangular area of the fifth frame in the other picture to obtain a second negative sample picture;
generating a negative sample according to the first negative sample picture, the second negative sample picture and a preset negative sample homography matrix; wherein the preset negative sample homography matrix is zero matrix.
5. The deep learning-based picture stitching method according to claim 2, wherein the positive sample and the negative sample are input into the homography matrix estimation model to obtain corresponding prediction output during training;
calculating the loss function values of the predicted output and the corresponding target output of the positive sample and the negative sample according to a preset loss function; the target output of the positive sample is the positive sample homography matrix, and the target output of the negative sample is the negative sample homography matrix;
and optimizing the homography matrix estimation model according to the loss function value so as to enable the loss function value to be converged.
6. The deep learning based picture stitching method according to claim 1, wherein the search algorithm comprises: depth-first, breadth-first, or dijkstra's algorithm.
7. The method for splicing pictures based on deep learning as set forth in claim 1, wherein when all pictures in the matched picture group are spliced, each time the splicing of two pictures is completed, the pictures obtained by the splicing are linearly fused.
8. Picture splicing apparatus based on degree of depth study, characterized by comprising: the system comprises a picture acquisition module, a homography matrix estimation module and a picture splicing module;
the image acquisition module is used for acquiring an atlas formed by a plurality of images and transmitting the atlas to the homography matrix estimation module;
the homography matrix estimation module is used for carrying out homography matrix estimation on the pictures of the atlas two by two through a homography matrix estimation model to obtain a homography matrix estimation result, and transmitting the homography matrix estimation result to the picture splicing module; wherein the homography matrix estimation model is obtained through deep learning training;
the picture splicing module is used for determining the matching relation between pictures according to the homography matrix estimation result and determining a matching picture group according to the matching relation between the pictures; forming an undirected graph according to the matchable relationships between the pictures in the matching picture group; setting one picture in the matching picture group as a reference picture, and determining paths from the rest pictures in the matching picture group to the reference picture through a search algorithm; calculating homography matrices from the rest pictures to the reference picture according to the transfer relation between homography matrices of the corresponding pictures on the paths; and splicing all the pictures in the matching picture group according to the homography matrices from the rest pictures in the matching picture group to the reference picture to obtain a spliced picture; wherein any picture in the matching picture group has a matchable relationship with at least one other picture in the matching picture group; nodes corresponding to pictures with a matchable relationship in the matching picture group are adjacent in the undirected graph; and the paths are paths among the nodes corresponding to the pictures in the undirected graph.
9. Picture stitching device based on deep learning, characterized by comprising: a processor, a memory, and a computer program stored in the memory and configured to be executed by the processor; the processor, when executing the computer program, implements a deep learning based picture stitching method as claimed in any one of claims 1 to 7.
10. A storage medium, comprising: a stored computer program; wherein the computer program, when run, controls a device in which the storage medium is located to perform the deep learning based picture stitching method according to any one of claims 1 to 7.
CN202311452275.4A 2023-11-03 2023-11-03 Picture splicing method, device, equipment and storage medium based on deep learning Active CN117196958B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311452275.4A CN117196958B (en) 2023-11-03 2023-11-03 Picture splicing method, device, equipment and storage medium based on deep learning

Publications (2)

Publication Number Publication Date
CN117196958A CN117196958A (en) 2023-12-08
CN117196958B true CN117196958B (en) 2024-04-05

Family

ID=88987223

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311452275.4A Active CN117196958B (en) 2023-11-03 2023-11-03 Picture splicing method, device, equipment and storage medium based on deep learning

Country Status (1)

Country Link
CN (1) CN117196958B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101930603A (en) * 2010-08-06 2010-12-29 华南理工大学 Method for fusing image data of medium-high speed sensor network
WO2016165016A1 (en) * 2015-04-14 2016-10-20 Magor Communications Corporation View synthesis-panorama
CN106780303A (en) * 2016-12-02 2017-05-31 上海大学 A kind of image split-joint method based on local registration
CN113128518A (en) * 2021-03-30 2021-07-16 西安理工大学 Sift mismatch detection method based on twin convolution network and feature mixing
CN115115522A (en) * 2022-08-15 2022-09-27 浙江工业大学 Goods shelf commodity image splicing method and system
CN115601672A (en) * 2022-12-14 2023-01-13 广州市玄武无线科技股份有限公司(Cn) VR intelligent shop patrol method and device based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on Enhancement and Stitching of Shallow-Water Video Images Based on Pose; Zhang Quan; China Master's Theses Full-text Database, Information Science and Technology; pp. I138-2785 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant