CN107689035A

CN107689035A - Method and device for determining a homography matrix based on a convolutional neural network

Info

Publication number: CN107689035A
Application number: CN201710763497.6A
Authority: CN
Inventors: 庄晓滨; 周俊明; 戴长军
Original assignee: Guangzhou Huaduo Network Technology Co Ltd
Current assignee: Guangzhou Cubesili Information Technology Co Ltd
Priority date: 2017-08-30
Filing date: 2017-08-30
Publication date: 2018-02-13
Anticipated expiration: 2037-08-30
Also published as: CN107689035B

Abstract

This application discloses a method and device for determining a homography matrix based on a convolutional neural network. The method includes: inputting a pair of rectangular images that are related by a homography into a pre-established convolutional neural network model; determining the four vertex coordinates of the other rectangular image of the pair from the deviation between the four vertex coordinates of the pair, as output by the model, and the known four vertex coordinates of one rectangular image of the pair; and determining the homography matrix corresponding to the pair from the known four vertex coordinates of the one rectangular image and the four vertex coordinates of the other. Because the training image set used to train the convolutional neural network model has been perturbed in brightness, blur, noise, and sub-image position, the influence of image quality on the accuracy of training and using the model is fully taken into account. This improves the robustness and adaptability of the model and yields higher accuracy.

Description

Method and device for determining a homography matrix based on a convolutional neural network

Technical field

This application relates to the field of computer technology, and in particular to a method and device for determining a homography matrix based on a convolutional neural network.

Background art

In computer vision, any two images that contain the same object are related by a homography. Determining the homography matrix of two images has wide practical application, for example in image rectification, image alignment, and camera stabilization.

At present, images of the same object captured under different camera poses differ in content, but some of their pixels still correspond. These corresponding pixels can be used to determine the homography matrix for any two images that contain the same object.

Specifically, the prior art mainly uses pictures from the MS-COCO dataset to generate the 128×128 image data needed for the experiment, takes the relative displacements of the four pairs of corresponding vertices of the two images (8 horizontal and vertical coordinate values) as labels, and trains part of the parameters of a VGG-style network based on a convolutional neural network. The trained VGG-style network is then used to determine the homography matrix corresponding to two images that contain the same object.

However, when generating data, the prior art does not fully consider changes within the image, including common cases such as brightness changes and internal disturbances. As a result, the homography matrix that the VGG-style network determines for two images containing the same object has relatively low accuracy.

Summary of the invention

The embodiments of this application provide a method and device for determining a homography matrix based on a convolutional neural network. They address the problem that the prior art, when generating data, does not fully consider changes within the image, including common cases such as brightness changes and internal disturbances, so that the homography matrix determined with a VGG-style network for two images containing the same object has relatively low accuracy.

A method for determining a homography matrix based on a convolutional neural network, provided by an embodiment of this application, includes:

inputting a pair of rectangular images that are related by a homography into a pre-established convolutional neural network model;

determining the four vertex coordinates of the other rectangular image of the pair from the deviation between the four vertex coordinates of the pair of rectangular images, as output by the convolutional neural network model, and the known four vertex coordinates of one rectangular image of the pair;

determining the homography matrix corresponding to the pair of rectangular images from the known four vertex coordinates of the one rectangular image and the four vertex coordinates of the other rectangular image.
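The last two steps can be illustrated with a minimal sketch. It assumes the model output is an 8-value vector ordered as (dx, dy) for each of the four vertices, and it solves the four-point homography with a plain linear system; the function names `corners_from_offsets` and `homography_from_corners`, and the example numbers, are hypothetical and not taken from this application.

```python
import numpy as np

def corners_from_offsets(known_corners, offsets):
    """Add the 8-value deviation to the four known vertices (assumed order: dx, dy per vertex)."""
    known = np.asarray(known_corners, dtype=float).reshape(4, 2)
    return known + np.asarray(offsets, dtype=float).reshape(4, 2)

def homography_from_corners(src, dst):
    """Solve the 3x3 matrix H (with H[2, 2] fixed to 1) that maps each src vertex to its dst vertex."""
    A, b = [], []
    for (x, y), (u, v) in zip(np.asarray(src, float), np.asarray(dst, float)):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), and likewise for v
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.extend([u, v])
    h = np.linalg.solve(np.array(A), np.array(b))
    return np.append(h, 1.0).reshape(3, 3)

# Hypothetical known vertices of one image and a hypothetical model output.
known = [(96, 56), (224, 56), (224, 184), (96, 184)]
offsets = [5, -3, -4, 6, 2, 7, -6, -2]
other = corners_from_offsets(known, offsets)
H = homography_from_corners(known, other)
```

Four vertex pairs give exactly eight equations for the eight unknowns of H, which is why an 8-value deviation is enough to recover the matrix, provided no three vertices are collinear.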

Preferably, before the pair of rectangular images related by a homography is input into the pre-established convolutional neural network model, the method further includes:

producing a training image set, where the training image set includes at least one pair of rectangular images related by a homography; initializing each weight parameter in the convolutional neural network model to be trained; inputting the at least one pair of rectangular images into the convolutional neural network model to be trained; and training each weight parameter in the model to be trained according to the vertex-coordinate deviation of the at least one pair of rectangular images output by the model and the vertex coordinates of the at least one pair of rectangular images, to obtain the convolutional neural network model.

Preferably, the at least one pair of rectangular images related by a homography are grayscale images, and/or the at least one pair of rectangular images contain the center point of the image and are of the same size.

Preferably, the method includes: perturbing at least one of the brightness, blur, noise, and sub-image position of one rectangular image in the at least one pair of rectangular images related by a homography.

Preferably, the kernel size of the last pooling layer in the convolutional neural network model is 4x4, and the number of convolution kernel channels of the convolutional layer is 64.

Preferably, using stochastic gradient descent, the rectangular images related by a homography in the training image set are input into the convolutional neural network model to be trained, and a loss function is built from the vertex-coordinate deviation that the model outputs for those rectangular images and the difference between the vertex coordinates of those rectangular images; training continues until the loss function meets a preset model accuracy value.
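The text does not state the exact form of the loss function. As a rough sketch only, the code below assumes a Euclidean loss between the predicted 8-value deviation and the true deviation obtained by subtracting the vertex coordinates of the two images; `true_offsets` and `offset_loss` are hypothetical names.

```python
import numpy as np

def true_offsets(corners_a, corners_b):
    """True deviation: vertex coordinates of image B minus those of image A, flattened to 8 values."""
    return (np.asarray(corners_b, float) - np.asarray(corners_a, float)).reshape(-1)

def offset_loss(pred, true):
    """Assumed loss: half the squared Euclidean distance, averaged over the batch."""
    pred = np.asarray(pred, float).reshape(-1, 8)
    true = np.asarray(true, float).reshape(-1, 8)
    return 0.5 * np.mean(np.sum((pred - true) ** 2, axis=1))
```

A training loop would evaluate this value after each stochastic gradient descent step and stop once it reaches the preset threshold.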

Preferably, the brightness of one rectangular image in the at least one pair of rectangular images related by a homography is perturbed as follows: for a rectangular image to be perturbed, a random number r is generated, and the new gray value of each pixel in the image is determined by the formula p' = p × (1.0 + r), where p' is the new gray value, p is the original gray value, and r is the random number. The blur of one rectangular image in the at least one pair is perturbed as follows: for a rectangular image to be perturbed, a random number a is generated, and Gaussian blur is applied to the image with a as the blur radius. The noise of one rectangular image in the at least one pair is perturbed as follows: for a rectangular image to be perturbed, a density random number and an intensity random number are generated, and salt-and-pepper noise is generated in the image according to the density random number and the intensity random number. The sub-image position of one rectangular image in the at least one pair is perturbed as follows: for a rectangular image to be perturbed, two sub-images of the same size at different positions are randomly selected within the image, and all pixels of the two sub-images are exchanged.

Preferably, the decay strategy used by the stochastic gradient descent method is: [formula not reproduced in this text], where lr is the current learning rate, iter is the current iteration number, max_iter is the maximum number of iterations, power is a parameter that controls how fast the learning rate decays, and base_lr is the base learning rate; and/or the model accuracy is calculated according to the following formula: [formula not reproduced in this text], with s_i = p_i - r_i, where M is the number of samples in the test set, p_i is the predicted deviation of the vertex coordinates of rectangular image pair i, and r_i is the true deviation of the vertex coordinates of rectangular image pair i.
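The decay formula itself does not survive in this text. The variable names (base_lr, iter, max_iter, power) match the polynomial decay policy found in common deep-learning frameworks, so the sketch below assumes that form, lr = base_lr × (1 − iter / max_iter)^power. This is an assumption, not a formula confirmed by the text, and `poly_learning_rate` is a hypothetical name.

```python
def poly_learning_rate(base_lr, iteration, max_iter, power):
    """Assumed polynomial decay: starts at base_lr and reaches 0 at max_iter."""
    return base_lr * (1.0 - iteration / max_iter) ** power

# Hypothetical settings, for illustration only.
schedule = [poly_learning_rate(0.01, it, 1000, 0.5) for it in (0, 250, 500, 1000)]
```

Under this assumed form, a larger power makes the learning rate fall faster early in training, which is consistent with the description of power as the parameter that controls the decay speed.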

A device for determining a homography matrix based on a convolutional neural network, provided by an embodiment of this application, includes:

an input module, configured to input a pair of rectangular images that are related by a homography into a pre-established convolutional neural network model;

a coordinate determining module, configured to determine the four vertex coordinates of the other rectangular image of the pair from the deviation between the four vertex coordinates of the pair of rectangular images, as output by the convolutional neural network model, and the known four vertex coordinates of one rectangular image of the pair;

a matrix determining module, configured to determine the homography matrix corresponding to the pair of rectangular images from the known four vertex coordinates of the one rectangular image and the four vertex coordinates of the other rectangular image.

Preferably, the device further includes:

a model training module, configured to, before the input module inputs the pair of rectangular images related by a homography into the pre-established convolutional neural network model: produce a training image set, where the training image set includes at least one pair of rectangular images related by a homography; initialize each weight parameter in the convolutional neural network model to be trained; input the at least one pair of rectangular images into the convolutional neural network model to be trained; and train each weight parameter in the model to be trained according to the vertex-coordinate deviation of the at least one pair of rectangular images output by the model and the vertex coordinates of the at least one pair of rectangular images, to obtain the convolutional neural network model.

Preferably, the device further includes:

a perturbation module, configured to perturb at least one of the brightness, blur, noise, and sub-image position of one rectangular image in the at least one pair of rectangular images related by a homography.

Preferably, the model training module is further configured to, using stochastic gradient descent, input the rectangular images related by a homography in the training image set into the convolutional neural network model to be trained, and build a loss function from the vertex-coordinate deviation that the model outputs for those rectangular images and the difference between the vertex coordinates of those rectangular images, until the loss function meets a preset model accuracy value.

Preferably, the perturbation module is specifically configured as follows. The brightness of one rectangular image in the at least one pair of rectangular images related by a homography is perturbed as follows: for a rectangular image to be perturbed, a random number r is generated, and the new gray value of each pixel in the image is determined by the formula p' = p × (1.0 + r), where p' is the new gray value, p is the original gray value, and r is the random number. The blur of one rectangular image in the at least one pair is perturbed as follows: for a rectangular image to be perturbed, a random number a is generated, and Gaussian blur is applied to the image with a as the blur radius. The noise of one rectangular image in the at least one pair is perturbed as follows: for a rectangular image to be perturbed, a density random number and an intensity random number are generated, and salt-and-pepper noise is generated in the image according to the density random number and the intensity random number. The sub-image position of one rectangular image in the at least one pair is perturbed as follows: for a rectangular image to be perturbed, two sub-images of the same size at different positions are randomly selected within the image, and all pixels of the two sub-images are exchanged.

The embodiments of this application provide a method and device for determining a homography matrix based on a convolutional neural network. The method includes: inputting a pair of rectangular images that are related by a homography into a pre-established convolutional neural network model; determining the four vertex coordinates of the other rectangular image of the pair from the deviation between the four vertex coordinates of the pair, as output by the model, and the known four vertex coordinates of one rectangular image of the pair; and determining the homography matrix corresponding to the pair from the known four vertex coordinates of the one rectangular image and the four vertex coordinates of the other. Because the training image set used to train the convolutional neural network model has been perturbed in brightness, blur, noise, and sub-image position, the influence of image quality on the accuracy of training and using the model is fully taken into account. This improves the robustness and adaptability of the model, and the accuracy is higher than when a VGG-style network is used to determine the homography matrix corresponding to two images that contain the same object.

Brief description of the drawings

The drawings described here are provided for further understanding of this application and form a part of it. The illustrative embodiments of this application and their descriptions are used to explain the application and do not unduly limit it. In the drawings:

Fig. 1 is a schematic diagram of the process for determining a homography matrix based on a convolutional neural network, provided by an embodiment of this application;

Fig. 2 shows an embodiment of establishing a convolutional neural network model, provided by an embodiment of this application;

Fig. 3A is a schematic diagram of the model structure of the convolutional neural network model to be trained, provided by an embodiment of this application;

Fig. 3B is a schematic structural diagram of the Inception module, provided by an embodiment of this application;

Fig. 4 shows an embodiment of producing a training image set, provided by an embodiment of this application;

Fig. 5 is a schematic diagram of a rectangular image before and after its sub-image positions are perturbed, provided by an embodiment of this application;

Fig. 6 is a schematic diagram of the per-frame picture trajectory after correction of footage captured by a camera, provided by an embodiment of this application;

Fig. 7 is a schematic diagram of the process of video stabilization based on a convolutional neural network, provided by an embodiment of this application;

Fig. 8 is a schematic view of pictures captured by a camera before and after correction, provided by an embodiment of this application;

Fig. 9 is a schematic structural diagram of the device for determining a homography matrix based on a convolutional neural network, provided by an embodiment of this application;

Fig. 10 is a block diagram of the structure of a system for determining a homography matrix based on a convolutional neural network, provided by an embodiment of this application.

Detailed description

To make the purpose, technical solutions, and advantages of this application clearer, the technical solutions are described clearly and completely below with reference to specific embodiments of the application and the corresponding drawings. The described embodiments are only some of the embodiments of this application, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments in this application without creative effort fall within the scope of protection of this application.

Fig. 1 shows the process for determining a homography matrix based on a convolutional neural network, provided by an embodiment of this application, which specifically includes the following steps:

S101: Input a pair of rectangular images that are related by a homography into a pre-established convolutional neural network model.

In practice, determining the homography matrix of any two images that contain the same object has wide application, for example in image rectification, image alignment, and camera stabilization.

To determine the homography matrix of any two images that contain the same object, a convolutional neural network model must first be established. The established model can then be used to determine the homography matrix of any two images that contain the same object.

Further, this application provides an embodiment of establishing the convolutional neural network model, as shown in Fig. 2:

S201: Produce a training image set, where the training image set includes at least one pair of rectangular images related by a homography.

S202: Initialize each weight parameter in the convolutional neural network model to be trained.

S203: Input the at least one pair of rectangular images related by a homography into the convolutional neural network model to be trained.

S204: Train each weight parameter in the convolutional neural network model to be trained according to the vertex-coordinate deviation of the at least one pair of rectangular images output by the model and the vertex coordinates of the at least one pair of rectangular images, to obtain the convolutional neural network model.

It should be noted here that a pair of rectangular images related by a homography means that the two images of the pair contain the same object. In addition, before the convolutional neural network model to be trained is trained, the number of its convolutional layers, the number of convolution kernels in each convolutional layer, and the number of channels of the convolution kernels are usually already set and are not modified during training. These settings also determine the size and shape of the pair of images input into the model; that is, the size and shape of the input pair must meet the model's input requirements. Therefore, each pair of images related by a homography in the training image set has a fixed size and a fixed shape.

It should also be noted that this application provides a model structure for the convolutional neural network model to be trained, shown in Fig. 3A. The model consists of an input layer (Input), convolutional layers (Convolution Layer), activation functions (Activation Function), pooling layers (Pooling Layer), and a fully connected layer (Full Connection Layer), and may also include other custom layers that accelerate network training. A convolutional layer can extract abstract image features; in general, the more layers there are, the more abstract the features and the higher the level of semantic features that can be learned. In this application, a convolutional layer is labeled in the form WxHxC+S, where W is the kernel width, H is the kernel height, C is the number of kernels, and S is the stride. An activation function is a way to increase the nonlinearity of the network; by default, a ReLU activation function follows each convolutional layer. A pooling layer is a data down-sampling method that can improve the nonlinearity of the model and prevent over-fitting. This application uses two pooling methods, Max Pooling and Avg Pooling: Max Pooling takes the maximum value in the receptive field as the output of the pooling layer, and Avg Pooling takes the mean of the values in the receptive field as the output. In this application, a pooling layer is labeled in the form WxH+S, where W is the kernel width, H is the kernel height, and S is the stride. The fully connected layer acts as the "classifier" of the whole convolutional neural network and outputs an 8-dimensional vector.
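The difference between the two pooling methods can be seen in a small sketch. The function `pool2d` and the 4×4 input are hypothetical and serve only to illustrate a pooling layer labeled 2x2+2 in the notation above.

```python
import numpy as np

def pool2d(x, kernel, stride, mode="max"):
    """Slide a kernel-sized window over x and keep the max or the mean of each window."""
    kh, kw = kernel
    out_h = (x.shape[0] - kh) // stride + 1
    out_w = (x.shape[1] - kw) // stride + 1
    out = np.empty((out_h, out_w), dtype=float)
    for i in range(out_h):
        for j in range(out_w):
            window = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = window.max() if mode == "max" else window.mean()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
max_out = pool2d(x, (2, 2), 2, "max")   # [[5, 7], [13, 15]]
avg_out = pool2d(x, (2, 2), 2, "avg")   # [[2.5, 4.5], [10.5, 12.5]]
```

Both variants halve the width and height here; they differ only in which statistic of each receptive field is passed on.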

In addition, because a local response normalization (LRN) layer smooths the current feature map along the depth dimension and has been shown to effectively improve accuracy in classification tasks, this application also uses LRN layers in the convolutional neural network model. The application also uses Inception modules in the model; Inception modules effectively increase the width of the network and improve its adaptability to scale. The number of Inception modules in the model can be determined according to the actual situation, for example, 9. Fig. 3B shows a schematic structural diagram of the Inception module.

It should be noted here that, in the convolutional neural network model to be trained, the data passed between layers is called a feature map. A feature map can be regarded as a three-dimensional matrix with width, height, and depth. The size of the convolution kernel determines the size of the receptive field on the current feature map, the number of convolution kernels determines the depth of the next feature map, and the stride determines the width and height of the next feature map.
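These relationships can be written as simple arithmetic. The sketch below uses the standard output-size formula for a convolution; the `padding` argument and the example layer 7x7x64+2 are assumptions for illustration, since the text does not list the padding or the individual layers.

```python
def conv_output_shape(in_h, in_w, kernel_h, kernel_w, num_kernels, stride, padding=0):
    """Height, width, and depth of the next feature map for a WxHxC+S convolutional layer."""
    out_h = (in_h + 2 * padding - kernel_h) // stride + 1
    out_w = (in_w + 2 * padding - kernel_w) // stride + 1
    return out_h, out_w, num_kernels  # depth equals the number of kernels

# Hypothetical layer 7x7x64+2 with padding 3, applied to a 128x128 input.
shape = conv_output_shape(128, 128, 7, 7, 64, 2, padding=3)  # (64, 64, 64)
```

Doubling the stride roughly halves the width and height, while the depth depends only on the number of kernels.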

Further, because a training image set is needed to train the convolutional neural network model, in this application the training image set must be produced before the model is trained. In practice, two images that are related by a homography and whose size and shape meet the model's input requirements are usually difficult to obtain directly. Therefore, in this application the training image set can be produced in the manner shown in Fig. 4:

S401: Obtain an original image set.

S402: For any original image in the original image set, scale the original image to a preset size.

S403: Determine a first rectangular image on the original image according to a preset length and a preset width, and record the first positions of the four vertices of the first rectangular image in the original image.

S404: Randomly perturb the four vertices of the first rectangular image, and record the second positions of the four vertices after the random perturbation.

S405: Solve the homography matrix between the first positions and the second positions from the first positions of the four vertices and the second positions of the four vertices.

S406: Transform the original image with the homography matrix.

S407: On the transformed image, find the four vertex pixels of the quadrilateral enclosed by the second positions in the original image, scale the quadrilateral enclosed by the four vertex pixels according to the preset length and the preset width, and take the scaled quadrilateral as the second rectangular image.

It should be noted here that the first rectangular image and the second rectangular image are a pair of rectangular images related by a homography.

For example, an original image set is obtained. This example uses one original image X from the set; the other original images are processed in the same way as X. The original image X is scaled to 320*240 (the preset size). According to the preset length 128 and the preset width 128, a first rectangular image A is determined on the original image X, and the first positions of the four vertices of A in X are recorded. Eight random numbers n are generated, the four vertices of the first rectangular image are randomly perturbed, and the second positions of the four vertices after the perturbation are recorded. From the first positions and the second positions of the four vertices, the homography matrix H between the first positions and the second positions is solved. The original image X is transformed with the homography matrix H to obtain an image Y. On the transformed image (image Y), the four vertex pixels of the quadrilateral enclosed by the second positions in the original image X are found, and the quadrilateral enclosed by these four vertex pixels is scaled according to the preset length 128 and the preset width 128. The scaled quadrilateral is taken as the second rectangular image B. The first rectangular image A and the second rectangular image B are a pair of rectangular images related by a homography.

It should be noted here that, when the first rectangular image A is determined on the original image X according to the preset length 128 and the preset width 128, the center of the original image can be used as the center point of A, and the four sides of A are then determined from the preset length 128 and the preset width 128. Another point of the original image can also be used as the center point of A, with the four sides determined in the same way; which point is used can be decided according to the actual situation. In addition, because each of the four vertices of the first rectangular image has an abscissa and an ordinate, eight random numbers n must be generated. These eight random numbers may all be the same, partly the same, or all different. The four vertices of the first rectangular image are then randomly perturbed according to the eight random numbers, and the second positions of the four vertices are recorded as follows: suppose the first position of vertex 1 is (x, y), the random number corresponding to x is n1, and the random number corresponding to y is n2; vertex 1 is perturbed according to n1 and n2, and the second position of vertex 1 after the perturbation is recorded as (x+n1, y+n2).
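The centering and vertex perturbation described here can be sketched as follows. The text gives no range for the eight random numbers, so `max_shift=32` is an assumption, as are the function names and the vertex order (top-left, top-right, bottom-right, bottom-left).

```python
import numpy as np

def centered_rectangle(img_w, img_h, rect_w, rect_h):
    """Four vertices of a rect_w x rect_h rectangle centered in an img_w x img_h image."""
    x0 = (img_w - rect_w) // 2
    y0 = (img_h - rect_h) // 2
    return np.array([[x0, y0], [x0 + rect_w, y0],
                     [x0 + rect_w, y0 + rect_h], [x0, y0 + rect_h]], dtype=float)

def perturb_vertices(first, max_shift, rng):
    """Draw eight random numbers (n1..n8) and move vertex (x, y) to (x + n1, y + n2), and so on."""
    n = rng.integers(-max_shift, max_shift + 1, size=(4, 2))
    return first + n, n.reshape(-1)

rng = np.random.default_rng(0)
first = centered_rectangle(320, 240, 128, 128)    # first positions in the 320*240 image
second, label = perturb_vertices(first, 32, rng)  # second positions and the 8-value deviation
```

The eight random numbers are exactly the vertex-coordinate deviation between the two positions, so under this reading they can serve directly as the training label for the pair.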

Further, to reduce the scale of the convolutional neural network model, when the training image set is produced in this application, the original image can be converted to grayscale before it is scaled to the preset size. Alternatively, the first rectangular image and the second rectangular image can be converted to grayscale after the second rectangular image is determined.

Further, to improve the robustness and adaptability of the algorithm, in this application, after the first rectangular image and the second rectangular image are determined, that is, after a pair of rectangular images related by a homography is determined, at least one of the brightness, blur, noise, and sub-image position of one rectangular image of the pair is perturbed.

Further, this application provides the following way to perturb the brightness of one rectangular image in the at least one pair of rectangular images related by a homography:

For a rectangular image to be perturbed, a random number r is generated, and the new gray value of each pixel in the image is determined by the formula p' = p × (1.0 + r), where p' is the new gray value, p is the original gray value, and r is the random number.

It should be noted here that, in practice, the random number r can lie in the interval [-0.1, 0.1].
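A minimal sketch of this brightness perturbation is given below. It assumes 8-bit grayscale images and clips the result to [0, 255]; the clipping, the rounding, and the name `perturb_brightness` are assumptions not stated in the text.

```python
import numpy as np

def perturb_brightness(img, rng, low=-0.1, high=0.1):
    """Scale every gray value by (1.0 + r), with r drawn uniformly from [low, high]."""
    r = rng.uniform(low, high)
    out = img.astype(np.float64) * (1.0 + r)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8), r

rng = np.random.default_rng(0)
image = np.full((128, 128), 100, dtype=np.uint8)  # hypothetical flat gray image
brighter_or_darker, r = perturb_brightness(image, rng)
```

Because a single r is applied to the whole image, the perturbation changes overall exposure without altering the relative contrast between pixels.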

Further, this application provides the following way of perturbing the blur level of one rectangular image in the at least one pair of rectangular images having a homography correspondence：

For a rectangular image to be perturbed, a random number a is generated, and Gaussian blur is applied to that rectangular image with a as the blur radius.

It should be noted that, in practice, the random number a may lie in the interval [1, 5].
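
A possible sketch of the blur perturbation is given below. The text only says that a is used as the blur radius; treating a as an integer kernel half-width, setting the standard deviation to a / 2, and clamping at the image border are all assumptions made for illustration.

```python
import math
import random

def gaussian_kernel(radius, sigma):
    """Normalized 1-D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma)) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve each row with the kernel, clamping indices at the border."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [sum(k * row[min(width - 1, max(0, x + i - radius))] for i, k in enumerate(kernel))
         for x in range(width)]
        for row in image
    ]

def transpose(image):
    return [list(col) for col in zip(*image)]

def gaussian_blur(image, radius):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(radius, sigma=radius / 2.0)  # assumed radius-to-sigma mapping
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

a = random.randint(1, 5)                     # random blur radius in [1, 5]
flat = [[100.0] * 6 for _ in range(6)]
blurred_flat = gaussian_blur(flat, a)        # a constant image stays constant
```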

Further, this application provides the following way of perturbing the noise of one rectangular image in the at least one pair of rectangular images having a homography correspondence：

For a rectangular image to be perturbed, a density random number and an intensity random number are generated, and salt-and-pepper noise is generated in that rectangular image according to the density random number and the intensity random number.
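
The text does not define how the density and intensity random numbers are used. The sketch below assumes that density is the fraction of pixels corrupted and that intensity in [0, 1] controls how far a corrupted pixel is pushed toward black (pepper) or white (salt); the sampling ranges in the usage lines are likewise hypothetical.

```python
import random

def add_salt_pepper(image, density, intensity, rng=random):
    """Corrupt roughly `density` of the pixels of a nested-list gray image."""
    noisy = [row[:] for row in image]
    for row in noisy:
        for x, p in enumerate(row):
            if rng.random() < density:
                target = 255 if rng.random() < 0.5 else 0   # salt or pepper
                row[x] = round(p + intensity * (target - p))
    return noisy

rng = random.Random(1)
image = [[128] * 8 for _ in range(8)]
density = rng.uniform(0.0, 0.05)     # hypothetical range
intensity = rng.uniform(0.5, 1.0)    # hypothetical range
noisy = add_salt_pepper(image, density, intensity, rng)
```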

Further, this application provides the following way of perturbing the sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence：

For a rectangular image to be perturbed, two sub-images of the same size at different positions are randomly selected in that image, and all pixels of the two sub-images are exchanged, as shown in Fig. 5. In Fig. 5, the leftmost image is the first rectangular image, the middle image is the second rectangular image, and the rightmost image is the second rectangular image after its sub-image positions have been perturbed.
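
The sub-image exchange can be sketched as below. The text requires only two different positions and equal size; rejecting overlapping pairs and using square sub-images are assumptions added so that the exchange is well defined.

```python
import random

def swap_subimages(image, size, rng=random):
    """Exchange two randomly placed, non-overlapping size x size blocks."""
    height, width = len(image), len(image[0])
    while True:
        y1, x1 = rng.randrange(height - size + 1), rng.randrange(width - size + 1)
        y2, x2 = rng.randrange(height - size + 1), rng.randrange(width - size + 1)
        # Assumption: keep only pairs whose blocks do not overlap.
        if abs(y1 - y2) >= size or abs(x1 - x2) >= size:
            break
    out = [row[:] for row in image]
    for dy in range(size):
        for dx in range(size):
            out[y1 + dy][x1 + dx] = image[y2 + dy][x2 + dx]
            out[y2 + dy][x2 + dx] = image[y1 + dy][x1 + dx]
    return out

image = [[y * 8 + x for x in range(8)] for y in range(8)]   # unique values
swapped = swap_subimages(image, size=2, rng=random.Random(0))
```

The image must be large enough to hold two non-overlapping blocks of the chosen size, otherwise the selection loop would not terminate.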

It should be noted that, when two or more perturbations are applied to one rectangular image in the at least one pair of rectangular images having a homography correspondence, the order of the perturbations may be chosen according to the actual situation. For example, the brightness perturbation may be applied first and the blur perturbation second, or the blur perturbation first and the brightness perturbation second.

In addition, the perturbed rectangular images are used as the rectangular images of the final training image set.

Further, to reduce the size of the convolutional neural network model, the kernel size of the last pooling layer in the convolutional neural network model is set to 4x4, and the number of channels of the convolution kernels of the convolutional layer is set to 64.

Further, in this application, when the weight parameters of the convolutional neural network model to be trained are trained, the rectangular images having a homography correspondence in the training image set may be input to the model to be trained and optimized by stochastic gradient descent. A loss function is built from the difference between the vertex-coordinate deviation that the model outputs for a pair of rectangular images in the training image set and the actual deviation between the vertex coordinates of that pair, and training continues until the loss function meets the preset model accuracy value.

It should be noted that, to improve the accuracy of the model, the loss function in this application may use the Euclidean distance; of course, other types of loss function may also be used in practice.
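
As a small illustration of a Euclidean-distance loss on the vertex deviations, the sketch below compares a predicted 8-value deviation (four vertices, x and y each) with the true one. The function name and the absence of any batch averaging or scaling factor are assumptions; training frameworks often use a squared and averaged variant.

```python
import math

def euclidean_loss(predicted, target):
    """Euclidean distance between two 8-value vertex-deviation vectors."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(predicted, target)))

predicted = [3.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
target = [0.0] * 8
loss = euclidean_loss(predicted, target)
```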

Further, in the embodiments of this application, the decay strategy used by the stochastic gradient descent method may be：lr = base_lr × (1 - iter / max_iter)^power, where lr is the current learning rate, iter is the current iteration number, max_iter is the maximum number of iterations, power is the parameter that controls how fast the learning rate decays, and base_lr is the base learning rate.

It should be noted that, in practice, the number of training samples participating in each gradient update may be set to 64, the maximum number of iterations max_iter to 400000, the parameter power that controls how fast the learning rate decays to 0.5, and the base learning rate base_lr to 0.001.
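
The parameter names lr, iter, max_iter, power and base_lr match the widely used polynomial decay policy lr = base_lr × (1 - iter / max_iter)^power. Assuming that is the intended strategy, the schedule with the values listed above could be sketched as:

```python
def poly_learning_rate(iteration, base_lr=0.001, max_iter=400000, power=0.5):
    """Polynomial decay: base_lr at iteration 0, falling to 0 at max_iter."""
    return base_lr * (1.0 - iteration / max_iter) ** power

start_lr = poly_learning_rate(0)
late_lr = poly_learning_rate(300000)   # three quarters of the way through training
final_lr = poly_learning_rate(400000)
```

With power = 0.5 the rate falls slowly at first and more steeply near max_iter.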

In addition, this application also provides a way of calculating the model accuracy, specifically according to the following formula：s_i = p_i - r_i, where M is the number of samples in the test set, p_i is the predicted deviation of the vertex coordinates of the i-th pair of rectangular images, and r_i is the true deviation of the vertex coordinates of the i-th pair of rectangular images.

After the weight parameters of the convolutional neural network model have been trained by the above method, the convolutional neural network model is obtained.

Subsequently, when the homography matrix between two images needs to be determined, the two images can each be cropped to the size and shape required by the input of the convolutional neural network model. For example, if the model requires a 128*128 rectangle as input, each of the two images must be cropped to a 128*128 rectangle. The two cropped rectangular images must of course correspond to each other, that is, they must contain the same object. The pair of rectangular images having a homography correspondence is then input to the pre-established convolutional neural network model.

S102：Determine the four vertex coordinates of the other rectangular image in the pair of rectangular images according to the deviation between the four vertex coordinates of the pair of rectangular images output by the convolutional neural network model and the four known vertex coordinates of one rectangular image in the pair.

In the embodiments of this application, after the pair of rectangular images having a homography correspondence is input to the pre-established convolutional neural network model, the final output of the model is the deviation between the four vertex coordinates of the pair of rectangular images.

The homography matrix calculation formula is specifically：[u′, v′, 1]^T = H · [u, v, 1]^T (in homogeneous coordinates, up to a scale factor), where H is the homography matrix of the two images, and (u′, v′) and (u, v) are the coordinates of the same pixel in the two images. According to this formula, determining the homography matrix of two images requires four pairs of corresponding coordinates in the two images. The four vertex coordinates of one rectangular image in the pair can be determined directly: taking the center of that rectangular image as the coordinate origin, its four vertex coordinates are fixed and therefore known. For the other rectangular image in the pair, each of the four known vertex coordinates is added to the corresponding deviation output by the model, which gives the vertex coordinates in the other rectangular image that correspond to the four known vertex coordinates.
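
The step of adding the model output to the known vertices can be sketched as follows. The vertex order (top-left, top-right, bottom-right, bottom-left) and the layout of the 8 output values as (dx1, dy1, ..., dx4, dy4) are assumptions; the text does not specify either.

```python
def centered_corners(width, height):
    """Four vertex coordinates of a rectangle whose center is the origin."""
    hw, hh = width / 2.0, height / 2.0
    return [(-hw, -hh), (hw, -hh), (hw, hh), (-hw, hh)]

def apply_offsets(corners, offsets):
    """Add the per-vertex deviation (dx, dy) to each known vertex."""
    return [(x + offsets[2 * i], y + offsets[2 * i + 1]) for i, (x, y) in enumerate(corners)]

known = centered_corners(128, 128)
model_output = [1.0, 2.0, 0.0, 0.0, -3.0, 0.0, 0.0, 4.0]   # hypothetical model output
other = apply_offsets(known, model_output)
```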

S103：Determine the homography matrix corresponding to the pair of rectangular images according to the four known vertex coordinates of the one rectangular image and the four vertex coordinates of the other rectangular image.

In the embodiments of this application, after the four pairs of coordinates are determined, the homography matrix of the two images can be determined according to the homography matrix calculation formula.
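
One common way to obtain a homography from four point pairs is to fix the bottom-right entry of H to 1 and solve the resulting 8×8 linear system. The sketch below does this in plain Python; the normalization h33 = 1 and all function names are choices made here, not details stated in the text.

```python
def solve_linear(a, b):
    """Gauss-Jordan elimination with partial pivoting for a square system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography_from_points(src, dst):
    """3x3 matrix H mapping each (u, v) in src to (u', v') in dst, with h33 = 1."""
    a, b = [], []
    for (u, v), (up, vp) in zip(src, dst):
        a.append([u, v, 1, 0, 0, 0, -up * u, -up * v])
        b.append(up)
        a.append([0, 0, 0, u, v, 1, -vp * u, -vp * v])
        b.append(vp)
    h = solve_linear(a, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def warp_point(h, u, v):
    """Apply H to (u, v) and divide by the homogeneous coordinate."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)

src = [(-64.0, -64.0), (64.0, -64.0), (64.0, 64.0), (-64.0, 64.0)]
dst = [(-60.0, -70.0), (66.0, -61.0), (59.0, 68.0), (-67.0, 62.0)]
H = homography_from_points(src, dst)
```

Here `src` plays the role of the four known vertices and `dst` the vertices obtained by adding the model's deviations; the numbers are made up for the example.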

With the above method, because the training image set used to train the convolutional neural network model has undergone brightness, blur, noise, and sub-image position perturbation, the influence of image quality on the accuracy of training and using the model is fully taken into account. The robustness and adaptive ability of the model are therefore improved, and the accuracy is higher than when a VGG-style network is used to determine the homography matrix of two images containing the same object.

It should be noted that, according to actual experiments, the convolutional neural network model used in this application has a size of 12.52M and a mean error of 5.3, whereas the VGG-style network model used in the prior art has a size of 260.91M and a mean error of 9.2.

The above describes how the convolutional neural network model is established and how the homography matrix of any two images containing the same object is determined according to the model. In practice, this approach can be widely applied in everyday life. The application now gives some applications based on the homography matrix to illustrate how determining the homography matrix of any two images containing the same object with the convolutional neural network model is used in practice.

The first application：

In practice, a camera may shake during shooting, so that the captured picture suddenly changes with a violent jitter. To prevent this, so that each frame of the captured picture changes smoothly even when the camera shakes, this application can determine the coordinate deviation of the four vertices of two adjacent images with the trained convolutional neural network model and correct the picture according to that deviation.

Specifically：

Starting from the frame captured just before the shake, each previous frame and its adjacent next frame (that is, the picture sequence in Fig. 7) are input in turn to the convolutional neural network model established above, which outputs the deviation between the four vertex coordinates of the previous frame and those of the adjacent next frame (that is, the picture sequence offset in Fig. 7). From the deviations output by the model for each pair, the deviation between the four vertex coordinates of every frame and those of the first frame is determined (that is, the camera motion trajectory in Fig. 7). From these deviations, the corrected deviation between the four vertex coordinates of every frame and those of the first frame is determined (the camera motion trajectory smoothing in Fig. 7). From the corrected deviations and the four known vertex coordinates of the first frame, the corrected coordinates of the four vertices of every frame are determined. Finally, from the corrected coordinates of the four vertices of every frame, the homography matrix between two specified frames is determined (that is, the homography matrix transformation in Fig. 7), and every frame is corrected.

It should be noted that determining the deviation between the four vertex coordinates of every frame and those of the first frame (the camera motion trajectory in Fig. 7) from the deviations output by the convolutional neural network model for each pair of adjacent frames may specifically be done as follows. For any frame, the deviations output by the model for every pair of adjacent frames preceding that frame are determined, and their sum is taken as the deviation between the four vertex coordinates of that frame and those of the first frame, i.e. p_t = Σ_{i≤t} Δ_i, where p_t is the deviation between the four vertex coordinates of frame t and those of the first frame, and Δ_i is the deviation between the four vertex coordinates of frame i and those of frame i-1. This yields the deviation between the four vertex coordinates of every frame and those of the first frame, as shown in Fig. 6.
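
The accumulation of pairwise deviations into a trajectory relative to the first frame amounts to a running sum. A minimal sketch, with each deviation stored as a list of 8 values and a hypothetical function name:

```python
from itertools import accumulate

def cumulative_offsets(pairwise):
    """pairwise[k] is the 8-value deviation between two adjacent frames;
    the result holds the running sums, i.e. the deviation of each later
    frame relative to the first frame."""
    def add(total, delta):
        return [a + b for a, b in zip(total, delta)]
    return list(accumulate(pairwise, add))

pairwise = [[1.0] * 8, [2.0] * 8, [-1.0] * 8]   # made-up model outputs
trajectory = cumulative_offsets(pairwise)
```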

In addition, it should be noted that in this application, determining the corrected deviation between the four vertex coordinates of every frame and those of the first frame (the camera motion trajectory smoothing in Fig. 7) from the deviation between the four vertex coordinates of every frame and those of the first frame may specifically be done as follows. Starting from the frame captured just before the shake, the corrected deviation of the next frame is determined in turn from the corrected deviation between the four vertex coordinates of the previous frame and those of the first frame and the deviation between the four vertex coordinates of the next frame and those of the first frame, by the correction formula p′_t = argmin_p(α‖p - p′_{t-1}‖ + (1-α)‖p - p_t‖). Here p′_t is the corrected deviation between the four vertex coordinates of the next frame and those of the first frame, p_t is the deviation between the four vertex coordinates of the next frame and those of the first frame, p′_{t-1} is the corrected deviation between the four vertex coordinates of the previous frame and those of the first frame, and α is a weight coefficient that balances how strongly the picture is stabilized against how much of the original motion is retained. This is repeated until the corrected deviations of all frames relative to the first frame have been determined, which yields the corrected deviation between the four vertex coordinates of every frame and those of the first frame, as shown in Fig. 6.
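
If the norms in the correction formula are taken as squared Euclidean norms (an assumption; the text does not say), the minimizer has the closed form p′_t = α·p′_{t-1} + (1-α)·p_t, which is exponential smoothing of the trajectory. The sketch below implements that assumed form and initializes the smoothed trajectory with the first deviation, which is also an assumption.

```python
def smooth_trajectory(trajectory, alpha=0.9):
    """Exponential smoothing: p'_t = alpha * p'_{t-1} + (1 - alpha) * p_t."""
    smoothed = [list(trajectory[0])]           # assumed initialization
    for p_t in trajectory[1:]:
        prev = smoothed[-1]
        smoothed.append([alpha * a + (1.0 - alpha) * b for a, b in zip(prev, p_t)])
    return smoothed

trajectory = [[0.0] * 8, [10.0] * 8, [10.0] * 8]   # a sudden jump of 10 pixels
smoothed = smooth_trajectory(trajectory, alpha=0.5)
```

A larger α keeps the corrected trajectory closer to the previous frame (more stabilization); a smaller α follows the measured motion more closely.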

Further, in this application, determining the homography matrix between the two specified frames (that is, the homography matrix transformation in Fig. 7) from the corrected coordinates of the four vertices of every frame, and correcting every frame, may specifically be done as follows. When the two specified frames are a previous frame and its adjacent next frame, in order to determine the corrected position of the next frame (which comprises the coordinates of its four vertices after correction), for any frame, the homography matrix between that frame and its adjacent previous frame is determined by the homography matrix formula from the corrected coordinates of the four vertices of both frames, and the frame is corrected into the previous frame by that homography matrix. This process is repeated until the frame is corrected to be consistent with the first frame, and finally all frames are corrected to be consistent with the first frame. When the two specified frames are the first frame and another frame, for any frame, the homography matrix between the first frame and that frame is determined by the homography matrix formula from the corrected coordinates of the four vertices of that frame and of the first frame, and the frame is corrected into the first frame by that homography matrix. This process is repeated until all frames are corrected to be consistent with the first frame.

Further, after all frames have been corrected to be consistent with the first frame, the content common to all frames is cropped out (that is, the scene cropping output in Fig. 7), which gives a smoother, more stable video in which each frame of the captured picture changes smoothly. The whole process is shown in Fig. 7.

It should be noted that, for every corrected frame, the largest inscribed rectangle of the region without black borders is computed, and the aspect ratio of that rectangle should equal that of the display. In general, at least 80% of the frame should be retained after cropping.

For example, as shown in Fig. 8, (a) and (b) are two adjacent frames shot on location by the same camera. To illustrate the scheme of this application simply, only the two frames (a) and (b) are used here; in practice a shake spans multiple frames, but the principle is the same as for two frames. Frame (a) is the frame just before the shake (that is, the first frame), and frame (b) is the adjacent next frame. Frames (a) and (b) are input to the convolutional neural network model established above, which outputs the deviation between the four vertex coordinates of (a) and those of (b) (that is, the deviation between the four vertex coordinates of (b) and those of (a)). For frame (b), the deviations output by the model for every pair of frames preceding (b) are determined, and their sum is taken as the deviation between the four vertex coordinates of (b) and those of (a). Starting from the frame just before the shake, the corrected deviation between the four vertex coordinates of (a) and those of (b) is determined by the correction formula p′_t = argmin_p(α‖p - p′_{t-1}‖ + (1-α)‖p - p_t‖) from the corrected deviation of the previous frame and the deviation between the four vertex coordinates of (a) and those of (b). The corrected coordinates of the four vertices of (b) are then determined from the corrected coordinates of the four vertices of (a) and the corrected deviation between the vertex coordinates of (a) and (b). Finally, the homography matrix between (a) and (b) is determined by the homography matrix formula from the corrected coordinates of the four vertices of (b) and of (a), frame (b) is corrected into frame (a) by that homography matrix, and the content common to all frames is cropped; cropping (b) gives picture (c) in Fig. 8, which replaces the original frame (b).

In this way, the captured picture is effectively prevented from changing with a sudden violent jitter, and each frame of the captured picture changes smoothly.

The second application：

In practice, watching live video has gradually become an important form of entertainment in daily life. To strengthen the interaction between the virtual and the real during a live video broadcast, a preset virtual object may be displayed on the screen when the camera's current image reaches a specified position, and not displayed when the camera's current image has not reached the specified position. In this application, whether the camera's current image has reached the specified position can be judged with the convolutional neural network model established above, from the current image and the image of the specified position.

The detailed process is as follows：

The m frames at the current moment are extracted from the live video. Each of the m frames in turn is input together with the target picture to the convolutional neural network model established above, which outputs the deviation between the four vertex coordinates of that frame and those of the target picture, that is, the position deviation of the four pairs of vertices. This is repeated until the deviation between the four vertex coordinates of every one of the m frames and those of the target picture has been determined. Then, from the deviation V_i between the four vertex coordinates of each of the m frames and those of the target picture, whether the camera's current image has reached the specified position is determined by the formula.

It should be noted that the target picture is the picture corresponding to the specified position and is known in advance. In addition, m in the formula is the number of frames, and T and S are preset thresholds; the two thresholds may be the same or different.

In addition, it should be noted that, after the deviation V_i between the four vertex coordinates of each of the m frames and those of the target picture has been determined, whether the camera's current image has reached the specified position may also be determined by the formula ‖V‖ ＜ S, where ‖·‖ may be the 0-norm, the 1-norm, the 2-norm, or a similar distance measure, S is a preset threshold, and V is the position deviation matrix [V_ij]_m×8 of the current m frames. Of course, other formulas may also be used, provided that the formula requires the position deviation between the current image and the target picture to be as small as possible.
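
The test ‖V‖ ＜ S can be sketched as below, with V stored as m rows of 8 values. Treating the norms entrywise (the 2-norm then being the Frobenius norm of the matrix) is an assumption, as are the function names.

```python
import math

def deviation_norm(v, order=2):
    """Entrywise 0-, 1- or 2-norm of an m x 8 deviation matrix."""
    flat = [x for row in v for x in row]
    if order == 0:
        return sum(1 for x in flat if x != 0)   # number of non-zero entries
    if order == 1:
        return sum(abs(x) for x in flat)
    return math.sqrt(sum(x * x for x in flat))

def reached_target(v, threshold, order=2):
    """True when the deviation from the target picture is below the threshold S."""
    return deviation_norm(v, order) < threshold

V = [[3.0, 4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
     [0.0] * 8]                                  # made-up deviations for m = 2 frames
at_position = reached_target(V, threshold=6.0)
```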

Further, when the deviations between the four vertex coordinates of each of the m frames and those of the target picture satisfy the formula, it is determined that the camera's current image has reached the specified position; when they do not satisfy the formula, it is determined that the camera's current image has not reached the specified position.

Further, when it is determined that the camera's current image has reached the specified position, the preset virtual object can be displayed on the screen. When it is determined that the camera's current image has not reached the specified position, the preset virtual object is not displayed, and the camera must continue to be moved until the camera's current image and the target picture are matched successfully by the formula, which indicates that the camera's current image has reached the specified position.

In this way, whether the camera's current image has reached the specified position can be determined effectively.

In addition, when a panorama is shot with a camera, the camera inevitably cannot be kept on the same horizontal line, so that the stitching of adjacent previous and next frames may be unstable. In this application, the homography matrix of two adjacent frames can therefore be determined with the trained convolutional neural network model, and the next frame adjusted to the angle of the previous frame, so that adjacent previous and next frames are stitched stably.

Specifically：

The previous frame and the adjacent next frame are input to the convolutional neural network model established above, which outputs the deviation between the four vertex coordinates of the previous frame and those of the adjacent next frame. The four vertex coordinates of the next frame are determined from the four vertex coordinates of the previous frame and the output deviation. From the coordinates of the four pairs of vertices of the previous frame and the adjacent next frame, the homography matrix between the two frames is determined by the homography matrix calculation formula. Finally, each pixel of the adjacent next frame is transformed by the determined homography matrix into a corrected picture, and the corrected picture is stitched with the previous frame.

In this way, adjacent previous and next frames can be stitched stably.

The above is the homography matrix determination method based on a convolutional neural network provided by the embodiments of this application. Based on the same idea, the embodiments of this application also provide a homography matrix determination device based on a convolutional neural network.

As shown in Fig. 9, a homography matrix determination device based on a convolutional neural network provided by the embodiments of this application includes：

An input module 901, configured to input a pair of rectangular images having a homography correspondence to a pre-established convolutional neural network model；

A coordinate determination module 902, configured to determine the four vertex coordinates of the other rectangular image in the pair of rectangular images according to the deviation between the four vertex coordinates of the pair of rectangular images output by the convolutional neural network model and the four known vertex coordinates of one rectangular image in the pair；

A matrix determination module 903, configured to determine the homography matrix corresponding to the pair of rectangular images according to the four known vertex coordinates of the one rectangular image and the four vertex coordinates of the other rectangular image.

The device also includes：

A model training module 904, configured to, before the input module 901 inputs the pair of rectangular images having a homography correspondence to the pre-established convolutional neural network model, make a training image set, where the training image set includes at least one pair of rectangular images having a homography correspondence; initialize the weight parameters of the convolutional neural network model to be trained; input the at least one pair of rectangular images having a homography correspondence to the convolutional neural network model to be trained; and train the weight parameters of the model according to the vertex-coordinate deviation that the model outputs for the at least one pair of rectangular images and the vertex coordinates of the at least one pair of rectangular images, to obtain the convolutional neural network model.

The at least one pair of rectangular images having a homography correspondence are grayscale images.

The device also includes：

A perturbation module 905, configured to perturb at least one of the brightness, blur level, noise, and sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence.

The kernel size of the last pooling layer in the convolutional neural network model is 4x4, and the number of channels of the convolution kernels of the convolutional layer is 64.

The model training module 904 is further configured to input the rectangular images having a homography correspondence in the training image set to the convolutional neural network model to be trained and optimize it by stochastic gradient descent, building a loss function from the difference between the vertex-coordinate deviation that the model outputs for a pair of rectangular images in the training image set and the actual deviation between the vertex coordinates of that pair, until the loss function meets the preset model accuracy value.

The perturbation module 905 is specifically configured as follows. The brightness of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed by generating, for a rectangular image to be perturbed, a random number r and determining the new gray value of each pixel in that image by the formula p′ = p × (1.0 + r), where p′ is the new gray value, p is the original gray value, and r is the random number. The blur level is perturbed by generating, for a rectangular image to be perturbed, a random number a and applying Gaussian blur to that image with a as the blur radius. The noise is perturbed by generating, for a rectangular image to be perturbed, a density random number and an intensity random number and generating salt-and-pepper noise in that image according to them. The sub-image position is perturbed by randomly selecting, in a rectangular image to be perturbed, two sub-images of the same size at different positions and exchanging all pixels of the two sub-images.

The decay strategy used by the stochastic gradient descent method is：lr = base_lr × (1 - iter / max_iter)^power, where lr is the current learning rate, iter is the current iteration number, max_iter is the maximum number of iterations, power is the parameter that controls how fast the learning rate decays, and base_lr is the base learning rate; and/or the model accuracy is calculated according to the following formula：s_i = p_i - r_i, where M is the number of samples in the test set, p_i is the predicted deviation of the vertex coordinates of the i-th pair of rectangular images, and r_i is the true deviation of the vertex coordinates of the i-th pair of rectangular images.

In addition, an embodiment of the present application further provides a system for determining a homography matrix based on a convolutional neural network, the system including:

a processor, a computer-readable memory, and a computer-readable storage medium; and

a program configured to: input a pair of rectangular images having a homography correspondence into a pre-established convolutional neural network model; determine the four vertex coordinates of the other rectangular image in the pair according to the deviation between the four vertex coordinates of the pair of rectangular images output by the convolutional neural network model and the known four vertex coordinates of one rectangular image in the pair; and determine the homography matrix corresponding to the pair of rectangular images according to the known four vertex coordinates of the one rectangular image and the four vertex coordinates of the other rectangular image.
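The last two steps of the program can be illustrated with a short, dependency-free sketch: the second image's vertices are obtained by adding the predicted deviations to the known vertices, and the homography is then solved from the four point correspondences. Several choices here are assumptions rather than details from the text: the deviation is taken as (other vertex − known vertex), the matrix maps known vertices to the other image's vertices, the bottom-right entry is fixed to 1, and all function names are hypothetical.

```python
def other_vertices(known, offsets):
    """Add the predicted per-vertex deviations to the four known vertices."""
    return [(x + dx, y + dy) for (x, y), (dx, dy) in zip(known, offsets)]

def solve_linear(a, b):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate vertex configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def homography_from_vertices(src, dst):
    """Return the 3x3 matrix H (with H[2][2] = 1) mapping each src vertex to its dst vertex."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], h[6:8] + [1.0]]

def apply_homography(h, point):
    """Map a point through H using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

For example, with known vertices at the corners of a 100 × 100 square and a predicted deviation of (5, 5) at every vertex, the recovered matrix is a pure translation by (5, 5).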

The program is stored on the computer-readable storage medium and is executed by the processor via the computer-readable memory.

The processor, the computer-readable memory, and the computer-readable storage medium can be implemented by the processor, the internal memory, and the external memory in Figure 10, respectively.

Figure 10 is a structural block diagram of the system for determining a homography matrix based on a convolutional neural network, showing the main components of the system. In Figure 10, the processor 1010, the internal memory 1005, the bus bridge 1020, and the network interface 1015 are connected to the system bus 1040; the bus bridge 1020 bridges the system bus 1040 and the I/O bus 1045; the I/O interface is connected to the I/O bus 1045; and the USB interface and the external memory are connected to the I/O interface. The processor 1010 may be one or more processors, and each processor may have one or more processor cores. The internal memory 1005 is volatile memory, such as registers, caches, and various types of random access memory. When the system is running, the data stored in the internal memory 1005 includes the operating system and application programs. The network interface 1015 may be an Ethernet interface, an optical fiber interface, or the like. The system bus 1040 may be used to carry data, address, and control information. The bus bridge 1020 may be used for protocol conversion, converting the system bus protocol into the I/O protocol or the I/O protocol into the system bus protocol to enable data transfer. The I/O bus 1045 carries data and control information, and may use bus termination resistors or circuits to reduce signal reflection interference. The I/O interface 1030 is mainly connected to various external devices, such as a keyboard, a mouse, and sensors; flash memory can access the I/O bus through the USB interface; and the external memory is non-volatile memory, such as a hard disk or an optical disc. After the system starts running, the processor can read the data stored in the external memory into the internal memory and process the system instructions held in the internal memory, thereby carrying out the functions of the operating system and the application programs. The example system for determining a homography matrix based on a convolutional neural network may reside in a desktop computer, a notebook computer, a tablet computer, a smartphone, or the like.

Preferably, the program is further configured to, before the pair of rectangular images having a homography correspondence is input into the pre-established convolutional neural network model: make a training image set, where the training image set includes at least one pair of rectangular images having a homography correspondence; initialize each weight parameter in the convolutional neural network model to be trained; input the at least one pair of rectangular images having a homography correspondence into the convolutional neural network model to be trained; and train each weight parameter in the convolutional neural network model to be trained according to the deviation of the vertex coordinates of the at least one pair of rectangular images output by the convolutional neural network model to be trained and the vertex coordinates of the at least one pair of rectangular images, thereby obtaining the convolutional neural network model.

Preferably, the at least one pair of rectangular images having a homography correspondence used by the program are grayscale images.

Preferably, the program is further configured to perturb at least one of the brightness, blurriness, noise, and sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence.

Preferably, in the convolutional neural network model used by the program, the kernel size of the last pooling layer is 4x4, and the number of channels of the convolution kernels of the convolutional layers is 64.

Preferably, the program is further configured to: input the rectangular images having a homography correspondence in the training image set into the convolutional neural network model to be trained according to the stochastic gradient descent method; and build a loss function from the deviation of the vertex coordinates of those rectangular images output by the convolutional neural network model to be trained and the difference between the vertex coordinates of those rectangular images, training until the loss function meets a preset model accuracy value.
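The text does not state the form of the loss function. As a hedged illustration only, the sketch below assumes a squared Euclidean loss between the eight predicted vertex deviations and the eight true deviations, where the true deviations are the differences between corresponding vertex coordinates of the two images. The function names and the choice of loss are assumptions, not details from the source.

```python
def true_offsets(vertices_a, vertices_b):
    """Flatten the per-vertex coordinate differences (b - a) into an 8-element list."""
    out = []
    for (xa, ya), (xb, yb) in zip(vertices_a, vertices_b):
        out.extend([xb - xa, yb - ya])
    return out

def offsets_loss(predicted, target):
    """Mean over the batch of half the squared Euclidean distance (assumed loss form)."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("batches must be non-empty and the same length")
    total = 0.0
    for p, t in zip(predicted, target):
        total += 0.5 * sum((pi - ti) ** 2 for pi, ti in zip(p, t))
    return total / len(predicted)
```

Training would then repeat stochastic gradient descent updates until this value falls to the preset threshold.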

Preferably, the program is further configured as follows. The brightness of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a random number r is generated, and the new gray value of each pixel in the image is determined by the formula p′ = p × (1.0 + r), where p′ is the new gray value, p is the original gray value, and r is the random number. The blurriness of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a random number a is generated, and Gaussian blur is applied to the image with a as the blur radius. The noise of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a density random number and an intensity random number are generated, and salt-and-pepper noise is generated in the image according to the density random number and the intensity random number. The sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, two sub-images of the same size at different positions are randomly selected in the image, and all pixels in the two sub-images are exchanged.

Preferably, in the program, the learning-rate decay strategy used by the stochastic gradient descent method is given by a formula in which lr is the current learning rate, iter is the current iteration number, max_iter is the maximum number of iterations, power is the parameter that controls how fast the learning rate decays, and base_lr is the base learning rate; and/or the model accuracy is calculated according to a formula based on s_i = p_i - r_i, where M is the number of samples in the test set, p_i is the predicted deviation of the vertex coordinates of rectangular image pair i, and r_i is the true deviation of the vertex coordinates of rectangular image pair i.

In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.

The memory may include computer-readable media in the form of volatile memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprise", "include", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements that are not expressly listed, or elements inherent to such a process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, commodity, or device that includes the element.

Those skilled in the art will understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code.

The foregoing describes only embodiments of the present application and is not intended to limit the present application. For those skilled in the art, the present application may have various modifications and variations. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present application shall fall within the scope of the claims of the present application.

Claims

1. A method for determining a homography matrix based on a convolutional neural network, characterized by comprising:

inputting a pair of rectangular images having a homography correspondence into a pre-established convolutional neural network model;

determining the four vertex coordinates of the other rectangular image in the pair of rectangular images according to the deviation between the four vertex coordinates of the pair of rectangular images output by the convolutional neural network model and the known four vertex coordinates of one rectangular image in the pair of rectangular images;

determining the homography matrix corresponding to the pair of rectangular images according to the known four vertex coordinates of the one rectangular image and the four vertex coordinates of the other rectangular image.

2. The method of claim 1, characterized in that, before the pair of rectangular images having a homography correspondence is input into the pre-established convolutional neural network model, the method further comprises:

making a training image set, where the training image set includes at least one pair of rectangular images having a homography correspondence;

initializing each weight parameter in the convolutional neural network model to be trained;

inputting the at least one pair of rectangular images having a homography correspondence into the convolutional neural network model to be trained;

training each weight parameter in the convolutional neural network model to be trained according to the deviation of the vertex coordinates of the at least one pair of rectangular images having a homography correspondence output by the convolutional neural network model to be trained and the vertex coordinates of the at least one pair of rectangular images having a homography correspondence, thereby obtaining the convolutional neural network model.

3. The method of claim 2, characterized in that the at least one pair of rectangular images having a homography correspondence are grayscale images, and/or the at least one pair of rectangular images having a homography correspondence each contain the center point of the image and are of the same size.

4. The method of claim 2, characterized in that the method comprises: perturbing at least one of the brightness, blurriness, noise, and sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence.

5. The method of any one of claims 1-4, characterized in that the kernel size of the last pooling layer in the convolutional neural network model is 4x4, and the number of channels of the convolution kernels of the convolutional layers is 64.

6. The method of claim 2, characterized in that training each weight parameter in the convolutional neural network model to be trained comprises:

inputting the rectangular images having a homography correspondence in the training image set into the convolutional neural network model to be trained according to the stochastic gradient descent method;

building a loss function from the deviation of the vertex coordinates of the rectangular images having a homography correspondence in the training image set, as output by the convolutional neural network model to be trained, and the difference between the vertex coordinates of those rectangular images, until the loss function meets a preset model accuracy value.

7. The method of claim 4, characterized in that:

the brightness of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a random number r is generated, and the new gray value of each pixel in the image is determined by the formula p′ = p × (1.0 + r), where p′ is the new gray value, p is the original gray value, and r is the random number;

the blurriness of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a random number a is generated, and Gaussian blur is applied to the image with a as the blur radius;

the noise of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a density random number and an intensity random number are generated, and salt-and-pepper noise is generated in the image according to the density random number and the intensity random number;

the sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, two sub-images of the same size at different positions are randomly selected in the image, and all pixels in the two sub-images are exchanged.

8. The method of claim 6, characterized in that the learning-rate decay strategy used by the stochastic gradient descent method is given by a formula in which lr is the current learning rate, iter is the current iteration number, max_iter is the maximum number of iterations, power is the parameter that controls how fast the learning rate decays, and base_lr is the base learning rate; and/or

the model accuracy is calculated according to a formula based on s_i = p_i - r_i, where M is the number of samples in the test set, p_i is the predicted deviation of the vertex coordinates of rectangular image pair i, and r_i is the true deviation of the vertex coordinates of rectangular image pair i.

9. A device for determining a homography matrix based on a convolutional neural network, characterized by comprising:

an input module, configured to input a pair of rectangular images having a homography correspondence into a pre-established convolutional neural network model;

a coordinate determining module, configured to determine the four vertex coordinates of the other rectangular image in the pair of rectangular images according to the deviation between the four vertex coordinates of the pair of rectangular images output by the convolutional neural network model and the known four vertex coordinates of one rectangular image in the pair of rectangular images;

a matrix determining module, configured to determine the homography matrix corresponding to the pair of rectangular images according to the known four vertex coordinates of the one rectangular image and the four vertex coordinates of the other rectangular image.

10. The device of claim 9, characterized in that the device further comprises:

a model training module, configured to, before the input module inputs the pair of rectangular images having a homography correspondence into the pre-established convolutional neural network model: make a training image set, where the training image set includes at least one pair of rectangular images having a homography correspondence; initialize each weight parameter in the convolutional neural network model to be trained; input the at least one pair of rectangular images having a homography correspondence into the convolutional neural network model to be trained; and train each weight parameter in the convolutional neural network model to be trained according to the deviation of the vertex coordinates of the at least one pair of rectangular images output by the convolutional neural network model to be trained and the vertex coordinates of the at least one pair of rectangular images, thereby obtaining the convolutional neural network model.

11. The device of claim 10, characterized in that the at least one pair of rectangular images having a homography correspondence are grayscale images, and/or the at least one pair of rectangular images having a homography correspondence each contain the center point of the image and are of the same size.

12. The device of claim 10, characterized in that the device further comprises:

a perturbation module, configured to perturb at least one of the brightness, blurriness, noise, and sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence.

13. The device of any one of claims 9-12, characterized in that the kernel size of the last pooling layer in the convolutional neural network model is 4x4, and the number of channels of the convolution kernels of the convolutional layers is 64.

14. The device of claim 10, characterized in that the model training module is further configured to: input the rectangular images having a homography correspondence in the training image set into the convolutional neural network model to be trained according to the stochastic gradient descent method; and build a loss function from the deviation of the vertex coordinates of those rectangular images output by the convolutional neural network model to be trained and the difference between the vertex coordinates of those rectangular images, until the loss function meets a preset model accuracy value.

15. The device of claim 14, characterized in that the perturbation module is specifically configured as follows: the brightness of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a random number r is generated, and the new gray value of each pixel in the image is determined by the formula p′ = p × (1.0 + r), where p′ is the new gray value, p is the original gray value, and r is the random number; the blurriness of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a random number a is generated, and Gaussian blur is applied to the image with a as the blur radius; the noise of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, a density random number and an intensity random number are generated, and salt-and-pepper noise is generated in the image according to the density random number and the intensity random number; the sub-image position of one rectangular image in the at least one pair of rectangular images having a homography correspondence is perturbed as follows: for a rectangular image to be perturbed, two sub-images of the same size at different positions are randomly selected in the image, and all pixels in the two sub-images are exchanged.

16. The method of claim 14, characterized in that the learning-rate decay strategy used by the stochastic gradient descent method is given by a formula in which lr is the current learning rate, iter is the current iteration number, max_iter is the maximum number of iterations, power is the parameter that controls how fast the learning rate decays, and base_lr is the base learning rate; and/or the model accuracy is calculated according to a formula based on s_i = p_i - r_i, where M is the number of samples in the test set, p_i is the predicted deviation of the vertex coordinates of rectangular image pair i, and r_i is the true deviation of the vertex coordinates of rectangular image pair i.