CN111127327B - Picture inclination detection method and device - Google Patents

Publication number
CN111127327B
Authority
CN
China
Prior art keywords
picture
sample
training
tilt detection
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911113009.2A
Other languages
Chinese (zh)
Other versions
CN111127327A (en)
Inventor
刘欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beike Technology Co Ltd
Original Assignee
Beike Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beike Technology Co Ltd
Priority to CN201911113009.2A
Publication of CN111127327A
Application granted
Publication of CN111127327B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformation in the plane of the image
    • G06T3/60 Rotation of a whole image or part thereof
    • G06T3/608 Skewing or deskewing, e.g. by two-pass or three-pass rotation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/24 Aligning, centring, orientation detection or correction of the image
    • G06V10/245 Aligning, centring, orientation detection or correction of the image by locating a pattern; Special marks for positioning

Abstract

An embodiment of the invention provides a picture tilt detection method and device. The method comprises: constructing training samples comprising vertical picture samples and tilted picture samples; acquiring a preset channel picture of each training sample, and training a convolutional neural network model on the preset channel pictures to obtain a picture tilt detection model; and inputting a picture to be detected into the picture tilt detection model to perform tilt detection on that picture. By constructing training samples from vertical and tilted picture samples and training the convolutional neural network model on their preset channel pictures, the method and device detect whether a picture is tilted, automatically extract and recognize tilt features, classify pictures accordingly, and can perform tilt detection in complex everyday scenes. Compared with the prior art, the recognition rate and robustness are improved, and time and labor are saved.

Description

Picture inclination detection method and device
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a picture inclination detection method and device.
Background
Because of the shooting position and other conditions, a photographed picture is often tilted. For the rectangular structures of a house in particular, tilt seriously harms the picture's appearance.
Tilt correction is an important part of picture preprocessing, and current tilt correction methods focus mainly on documents and text with rectangular frame edges. Tilt detection is the basis of tilt correction, and existing tilt detection methods fall mainly into two groups: line-based methods and projection-based methods. Line-based methods use the Hough transform to detect horizontal and vertical lines and then judge from the line angles whether the picture is tilted; they work well on simple pictures with obvious vertical and horizontal edges but struggle with complex pictures. Projection-based methods project the picture at different angles to obtain several projection pictures and compute the tilt angle from certain statistical characteristics of those projections; however, they must project the whole picture in many directions, require a large amount of computation, and their error probability rises sharply as the picture grows in size and complexity. In addition, the results of methods based on traditional picture processing depend heavily on hyperparameter settings: they perform well only when the parameters are tuned for each scene, so their robustness is poor.
It can be seen that the detection of picture tilt in complex daily scenes is still a difficulty.
Disclosure of Invention
In order to solve the problems in the prior art, the embodiment of the invention provides a picture tilt detection method and device.
In a first aspect, an embodiment of the present invention provides a method for detecting a picture tilt, including: constructing a training sample, wherein the training sample comprises a vertical picture sample and an inclined picture sample; acquiring a preset channel picture of each training sample in the training samples, and training a convolutional neural network model through the preset channel picture of the training samples to obtain a picture tilt detection model; inputting the picture to be detected into the picture inclination detection model, and carrying out inclination detection on the picture to be detected according to the output of the picture inclination detection model.
Further, the vertical picture sample is provided with a label for representing the picture as vertical, and the inclined picture sample is provided with a label for representing the picture as inclined; training the convolutional neural network model through the preset channel picture of the training sample to obtain a picture tilt detection model comprises the following steps: inputting the preset channel picture into the convolutional neural network model, and training the convolutional neural network model by taking the label of the training sample corresponding to the preset channel picture as output to obtain the picture tilt detection model.
Further, the step of training the convolutional neural network model through the preset channel picture of the training sample to obtain a picture tilt detection model further includes: dividing the training sample into a training set and a verification set according to a set proportion; training a convolutional neural network model through a preset channel picture of the training sample in the training set to obtain a picture tilt detection model weight; and evaluating the accuracy and reliability of the picture tilt detection model weight through a preset channel picture of the training sample in the verification set to obtain the optimized picture tilt detection model weight.
Further, the preset channel picture comprises a preset number of preset feature pictures, each preset feature picture corresponding to one channel; the preset channel picture is a three-channel picture, and acquiring the preset channel picture of each training sample in the training samples includes: acquiring an x-direction gradient map, a y-direction gradient map and a grayscale map of each training sample, and stacking the x-direction gradient map, the y-direction gradient map and the grayscale map to form the three-channel picture.
Further, acquiring the x-direction gradient map and the y-direction gradient map of each training sample specifically includes: converting the training sample into HSV space, and extracting the x-direction gradient map and the y-direction gradient map of the hue (H) channel with the Sobel operator.
Further, before converting the training sample into HSV space, the method further includes: performing histogram equalization on the training sample.
Further, the constructing training samples includes: acquiring the vertical picture sample, and carrying out sample augmentation on the vertical picture sample through a preset picture augmentation rule; and acquiring an inclined picture sample according to the vertical picture sample subjected to the sample augmentation.
Further, the preset picture augmentation rule includes: at least one of randomly cropping a picture, randomly scaling, randomly horizontally flipping, and randomly vertically flipping.
Further, the obtaining the tilted picture sample from the vertical picture sample after the sample augmentation includes: and rotating the vertical picture sample subjected to sample augmentation and extracting the largest inscribed rectangle in the overlapping area before and after rotation to obtain the inclined picture sample.
Further, rotating the sample-augmented vertical picture sample includes: randomly rotating the sample-augmented vertical picture sample, with the rotation angle selected according to a normal distribution.
Further, the convolutional neural network model is a multi-scale residual network model, and comprises a cascade residual module, a multi-scale pooling module and a full connection module which are sequentially connected.
Further, the cascaded residual modules comprise 9 residual modules in cascade, each formed by stacking a 1×1 convolution, a 3×3 convolution and a 1×1 convolution, wherein the 3rd and 6th residual modules are each followed by a max pooling layer, and the 9th residual module is connected to the multi-scale pooling module; the activation function in the convolutional layers is ReLU.
Further, the pooling kernels of the multi-scale pooling module are 16×16, 8×8, 4×4 and 2×2 respectively, forming four parallel branches, and each pooling branch is followed by one 3×3 convolution, one 1×1 convolution and a global average pooling layer.
Further, the fully connected module comprises two fully connected layers with a dropout layer between them, and the latter fully connected layer uses a sigmoid function to output the probability that the picture is tilted.
In a second aspect, an embodiment of the present invention provides a device for detecting a tilt of a picture, including: a sample construction module for: constructing a training sample, wherein the training sample comprises a vertical picture sample and an inclined picture sample; the picture tilt detection model construction module is used for: acquiring a preset channel picture of each training sample in the training samples, and training a convolutional neural network model through the preset channel picture of the training samples to obtain a picture tilt detection model; the inclination detection module is used for: inputting the picture to be detected into the picture inclination detection model, and carrying out inclination detection on the picture to be detected according to the output of the picture inclination detection model.
Further, the vertical picture sample is provided with a label indicating that the picture is vertical, and the tilted picture sample is provided with a label indicating that the picture is tilted; when training the convolutional neural network model through the preset channel picture of the training sample to obtain the picture tilt detection model, the picture tilt detection model construction module is specifically configured to: input the preset channel picture into the convolutional neural network model, and train the convolutional neural network model with the label of the training sample corresponding to the preset channel picture as the output, to obtain the picture tilt detection model.
Further, when the image tilt detection model construction module is used for training the convolutional neural network model through the preset channel image of the training sample to obtain the image tilt detection model, the image tilt detection model construction module is further used for: dividing the training sample into a training set and a verification set according to a set proportion; training a convolutional neural network model through a preset channel picture of the training sample in the training set to obtain a picture tilt detection model weight; and evaluating the accuracy and reliability of the picture tilt detection model weight through a preset channel picture of the training sample in the verification set to obtain the optimized picture tilt detection model weight.
Further, the preset channel picture comprises a preset number of preset feature pictures, each preset feature picture corresponding to one channel; the preset channel picture is a three-channel picture, and the picture tilt detection model construction module is specifically configured to: acquire an x-direction gradient map, a y-direction gradient map and a grayscale map of each training sample in the training samples, and stack the x-direction gradient map, the y-direction gradient map and the grayscale map to form the three-channel picture.
Further, when acquiring the x-direction gradient map and the y-direction gradient map of each training sample, the picture tilt detection model construction module is specifically configured to: convert the training sample into HSV space, and extract the x-direction gradient map and the y-direction gradient map of the hue (H) channel with the Sobel operator.
Further, before converting the training sample into HSV space, the picture tilt detection model construction module is further configured to: perform histogram equalization on the training sample.
Further, the sample construction module, when used for constructing training samples, is specifically configured to: acquiring the vertical picture sample, and carrying out sample augmentation on the vertical picture sample through a preset picture augmentation rule; and acquiring an inclined picture sample according to the vertical picture sample subjected to the sample augmentation.
Further, the preset picture augmentation rule includes: at least one of randomly cropping a picture, randomly scaling, randomly horizontally flipping, and randomly vertically flipping.
Further, the sample construction module is specifically configured to, when configured to obtain a tilted picture sample from the vertical picture sample after the sample augmentation: and rotating the vertical picture sample subjected to sample augmentation and extracting the largest inscribed rectangle in the overlapping area before and after rotation to obtain the inclined picture sample.
Further, when rotating the sample-augmented vertical picture sample, the sample construction module is specifically configured to: randomly rotate the sample-augmented vertical picture sample, with the rotation angle selected according to a normal distribution.
Further, the convolutional neural network model is a multi-scale residual network model, and comprises a cascade residual module, a multi-scale pooling module and a full connection module which are sequentially connected.
Further, the cascaded residual modules comprise 9 residual modules in cascade, each formed by stacking a 1×1 convolution, a 3×3 convolution and a 1×1 convolution, wherein the 3rd and 6th residual modules are each followed by a max pooling layer, and the 9th residual module is connected to the multi-scale pooling module; the activation function in the convolutional layers is ReLU.
Further, the pooling kernels of the multi-scale pooling module are 16×16, 8×8, 4×4 and 2×2 respectively, forming four parallel branches, and each pooling branch is followed by one 3×3 convolution, one 1×1 convolution and a global average pooling layer.
Further, the fully connected module comprises two fully connected layers with a dropout layer between them, and the latter fully connected layer uses a sigmoid function to output the probability that the picture is tilted.
In a third aspect, an embodiment of the invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the method as provided in the first aspect when executing the computer program.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method as provided by the first aspect.
According to the image tilt detection method and device, the training samples consisting of the vertical image samples and the tilt image samples are constructed, and the preset channel images of the training samples are input into the convolutional neural network model for training to obtain the image tilt detection model, so that whether the images tilt or not is detected, automatic extraction, identification and image classification of the image tilt characteristics are realized, image tilt detection in complex daily scenes can be realized, and compared with the prior art, the identification rate and robustness are improved, and time and labor are saved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a picture tilt detection method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a construction process of a picture tilt detection model in a picture tilt detection method according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a convolutional neural network model in a picture tilt detection method according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a residual module in a convolutional neural network model in a picture tilt detection method according to an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a picture tilt detection apparatus according to an embodiment of the present invention;
Fig. 6 is a schematic physical structure of an electronic device according to an embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 is a flowchart of a picture tilt detection method according to an embodiment of the present invention. As shown in fig. 1, the method includes:
Step 101, constructing a training sample, wherein the training sample comprises a vertical picture sample and a tilted picture sample.
The embodiment of the invention detects whether a picture is tilted by a machine learning method, training a picture tilt detection model for this purpose. Training the picture tilt detection model first requires constructing training samples. The training samples should include both vertical picture samples and tilted picture samples, so that the neural network learns how the two differ in their image features, which makes picture classification possible. For example, if the model outputs a high probability that a picture is tilted, the picture may be determined to be tilted.
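By way of non-limiting illustration, the decision on the model output might be sketched as follows. The function name `classify_tilt` and the cut-off value of 0.5 are assumptions made for this sketch; the description only states that a high tilt probability may lead to the picture being determined as tilted.

```python
def classify_tilt(prob_tilted: float, threshold: float = 0.5) -> str:
    """Map a tilt probability to a label.

    `threshold` is a hypothetical cut-off chosen for illustration;
    the description does not fix a particular value.
    """
    if not 0.0 <= prob_tilted <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    return "tilted" if prob_tilted >= threshold else "vertical"
```

A stricter or looser threshold could be chosen depending on whether false alarms or missed tilts are more costly in a given application.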
Step 102, obtaining a preset channel picture of each training sample in the training samples, and training the convolutional neural network model through the preset channel picture of the training samples to obtain a picture tilt detection model.
The preset channel picture comprises a preset number of preset feature pictures, each corresponding to one channel. The number of channels is preset, and the type of preset feature picture in each channel can also be preset. A preset channel picture is acquired for each training sample, and the convolutional neural network model is trained on these preset channel pictures to obtain the picture tilt detection model.
Step 103, inputting the picture to be detected into the picture tilt detection model, and performing tilt detection on the picture to be detected according to the output of the picture tilt detection model.
Unlike traditional picture tilt detection, the embodiment of the invention builds the picture tilt detection model with a convolutional neural network model and uses that model for tilt detection. The model automatically recognizes the features that mark a picture as tilted or vertical and thereby judges whether the picture is tilted, which makes it suitable for tilt detection on complex everyday pictures.
According to the image tilt detection method provided by the embodiment of the invention, the training sample consisting of the vertical image sample and the tilt image sample is constructed, and the preset channel image of the training sample is input into the convolutional neural network model for training to obtain the image tilt detection model, so that the image tilt detection model is further used for detecting whether the image is tilted, the automatic extraction and identification of the image tilt characteristics and the image classification are realized, the image tilt detection under a complex daily scene can be realized, and compared with the prior art, the recognition rate and the robustness are improved, and the time and the labor are saved.
To perform picture tilt detection with the picture tilt detection model, the model must first be constructed.
Fig. 2 is a schematic diagram of a construction process of a picture tilt detection model in a picture tilt detection method according to an embodiment of the present invention. Fig. 3 is a schematic structural diagram of a convolutional neural network model in a picture tilt detection method according to an embodiment of the present invention. Fig. 4 is a schematic structural diagram of a residual module in a convolutional neural network model in a picture tilt detection method according to an embodiment of the present invention. The process of constructing the picture tilt detection model is specifically described below with reference to fig. 2, 3, and 4.
Step 201, a vertical picture sample is collected, data augmentation is performed, and an inclined picture sample is generated.
Normal vertical sample pictures are captured with a camera or mobile phone, and once the vertical picture samples are obtained, the data is augmented. Random augmentation includes: random cropping, for example cropping a random region one quarter the size of the original picture; random scaling, for example to between 80% and 120% of the original picture; and random operations such as horizontal flipping.
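The augmentation operations above might be sketched as follows, as an illustration only. The function name `random_augment`, the 50% chance of applying the crop and the flip, the reading of "1/4 size" as half of each side, and the nearest-neighbour resampling are all assumptions of this sketch, not details given in the description.

```python
import numpy as np

def random_augment(img, rng):
    """Randomly crop, scale and flip an H x W x C image array."""
    h, w = img.shape[:2]
    # Random crop of a region with 1/4 of the original area
    # (assumed here to mean half of each side).
    if rng.random() < 0.5:
        ch, cw = h // 2, w // 2
        top = int(rng.integers(0, h - ch + 1))
        left = int(rng.integers(0, w - cw + 1))
        img = img[top:top + ch, left:left + cw]
    # Random scaling to 80%-120% of the current size, using simple
    # nearest-neighbour resampling to stay dependency-free.
    scale = float(rng.uniform(0.8, 1.2))
    h, w = img.shape[:2]
    nh, nw = max(1, int(round(h * scale))), max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(nh) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(nw) / scale).astype(int), w - 1)
    img = img[rows][:, cols]
    # Random horizontal flip.
    if rng.random() < 0.5:
        img = img[:, ::-1]
    return np.ascontiguousarray(img)
```

A caller would typically create one generator, for example `np.random.default_rng(0)`, and pass it to every call so that the augmentation is reproducible.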
The tilted picture samples are obtained by rotating normal vertical picture samples, so the network can learn the difference between a sample before and after rotation, that is, the tilt features. Because strongly tilted pictures are less likely to occur in real scenes, the rotation angle is drawn at random with a normal distribution as its probability. The normal vertical picture sample is rotated at random within the range [-90, 90] degrees; the rotation angle follows a normal distribution whose mean and variance are 0 and 0.01 respectively. The picture is rotated about its center, the largest inscribed rectangle of the overlapping area before and after rotation is computed, and that rectangle is cropped out as the tilted picture sample.
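As a hedged sketch of this step, the angle sampling and the crop size might be computed as below. The function names are hypothetical. The closed-form expression gives the largest axis-aligned rectangle inside the rotated picture; clamping it to the original width and height keeps the crop inside the overlap of the pictures before and after rotation, although for large angles the clamped rectangle is not guaranteed to be the maximal one. The description gives mean 0 and variance 0.01 without stating the unit, so the standard deviation is left as a caller-supplied parameter in degrees.

```python
import math
import random

def sample_rotation_angle(rng, sigma_deg, limit_deg=90.0):
    """Draw an angle from N(0, sigma_deg^2), redrawing until it is within the limit."""
    while True:
        angle = rng.gauss(0.0, sigma_deg)
        if -limit_deg <= angle <= limit_deg:
            return angle

def largest_inscribed_rect(w, h, angle_deg):
    """Size of the largest axis-aligned rectangle inside a w x h rectangle rotated by angle_deg."""
    if w <= 0 or h <= 0:
        return 0.0, 0.0
    angle = math.radians(angle_deg)
    sin_a, cos_a = abs(math.sin(angle)), abs(math.cos(angle))
    width_is_longer = w >= h
    side_long, side_short = (w, h) if width_is_longer else (h, w)
    if side_short <= 2.0 * sin_a * cos_a * side_long or abs(sin_a - cos_a) < 1e-10:
        # Half-constrained case: two crop corners touch the longer sides.
        x = 0.5 * side_short
        wr, hr = (x / sin_a, x / cos_a) if width_is_longer else (x / cos_a, x / sin_a)
    else:
        # Fully constrained case: the crop touches all four sides.
        cos_2a = cos_a * cos_a - sin_a * sin_a
        wr = (w * cos_a - h * sin_a) / cos_2a
        hr = (h * cos_a - w * sin_a) / cos_2a
    return wr, hr

def tilted_crop_size(w, h, angle_deg):
    """Centered crop size that lies inside both the original and the rotated picture."""
    wr, hr = largest_inscribed_rect(w, h, angle_deg)
    return min(wr, w), min(hr, h)
```

The actual pixel rotation would be performed by an image library of the implementer's choice; the crop is then taken symmetrically about the picture center using the returned size.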
Step 202, data preprocessing.
To counter overexposed and overly dark pictures, histogram equalization is performed on each sample picture to increase its contrast. To reduce the influence of illumination and similar factors, each sample picture is converted into HSV space, and the x-direction and y-direction gradient maps of the hue (H) channel are extracted with the Sobel operator. A grayscale map of each sample picture is also acquired, and the three single-channel pictures are stacked to form a three-channel picture that serves as the network input.
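The preprocessing above might be sketched with NumPy alone, as follows. This is an illustration under stated assumptions: the description does not say how histogram equalization is applied to a color picture, so equalizing each RGB channel separately is an assumption, as are the function names, the grayscale weights (0.299, 0.587, 0.114) and the edge-replicated border handling of the Sobel filter.

```python
import numpy as np

def equalize_hist(channel):
    """Histogram-equalize one uint8 channel through its cumulative distribution."""
    hist = np.bincount(channel.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]
    denom = max(int(cdf[-1] - cdf_min), 1)
    lut = np.clip(np.round((cdf - cdf_min) / denom * 255.0), 0, 255).astype(np.uint8)
    return lut[channel]

def rgb_to_hue(rgb):
    """Hue channel of an RGB uint8 image, in degrees within [0, 360)."""
    f = rgb.astype(np.float64) / 255.0
    r, g, b = f[..., 0], f[..., 1], f[..., 2]
    mx, mn = f.max(axis=-1), f.min(axis=-1)
    delta = mx - mn
    safe = np.where(delta > 0, delta, 1.0)  # avoid division by zero on gray pixels
    hue = np.where(mx == r, ((g - b) / safe) % 6.0,
                   np.where(mx == g, (b - r) / safe + 2.0, (r - g) / safe + 4.0))
    return np.where(delta > 0, hue * 60.0, 0.0)

def sobel(img):
    """Return the (x-gradient, y-gradient) 3x3 Sobel responses of a 2-D array."""
    p = np.pad(img.astype(np.float64), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]) - (p[:-2, :-2] + 2 * p[1:-1, :-2] + p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]) - (p[:-2, :-2] + 2 * p[:-2, 1:-1] + p[:-2, 2:])
    return gx, gy

def three_channel_input(rgb):
    """Stack hue x-gradient, hue y-gradient and grayscale into an H x W x 3 array."""
    eq = np.stack([equalize_hist(rgb[..., c]) for c in range(3)], axis=-1)
    gx, gy = sobel(rgb_to_hue(eq))
    gray = eq.astype(np.float64) @ np.array([0.299, 0.587, 0.114])
    return np.stack([gx, gy, gray], axis=-1)
```

Note that hue is a circular quantity, so this plain Sobel filter treats the jump between 359 and 0 degrees as a strong edge; whether and how to handle that wrap-around is left open by the description.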
Step 203, constructing a multi-scale residual network.
The multi-scale residual network comprises cascaded residual modules, a multi-scale pooling module and a fully connected module, connected in sequence.
The multi-scale residual network model consists of 9 residual modules, 1 multi-scale pooling module and 2 fully connected layers. Each residual module is formed by stacking a 1×1 convolution, a 3×3 convolution and a 1×1 convolution, the activation function in the convolutional layers is ReLU, and a max pooling layer follows every 3 residual modules, except that the last residual module is followed by the multi-scale pooling module instead. The pooling kernels of the multi-scale pooling module are 16×16, 8×8, 4×4 and 2×2 respectively, giving four parallel branches. Each pooling branch is followed by one 3×3 convolution and one 1×1 convolution and then by global average pooling. Finally, the four branch results are concatenated as the features of the picture and fed into the fully connected layers for classification. A dropout layer with a rate of 0.5 sits between the fully connected layers, and a sigmoid finally maps the output to the probability that the picture is tilted.
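To make the wiring of the network easier to follow, the sketch below traces the spatial size of the feature maps through the stages described above. It is not an implementation of the network. The input size of 256, the channel width of 64, size-preserving convolutions, 2×2 max pooling with stride 2, and pooling strides equal to the pooling kernel sizes are all assumptions of this sketch; the description does not state them.

```python
def trace_shapes(input_size=256, channels=64, pool_kernels=(16, 8, 4, 2)):
    """Return ([(stage, spatial_size), ...], length of the concatenated feature vector)."""
    trace = []
    size = input_size
    for i in range(1, 10):
        # Residual module: 1x1 -> 3x3 -> 1x1 convolutions with ReLU and a skip
        # connection, assumed here to preserve the spatial size.
        trace.append((f"residual_{i}", size))
        if i in (3, 6):
            size //= 2  # assumed 2x2 max pooling with stride 2
            trace.append((f"maxpool_after_{i}", size))
    feature_length = 0
    for k in pool_kernels:
        if size < k:
            raise ValueError(f"feature map of size {size} is smaller than pooling kernel {k}")
        # Parallel branch: k x k pooling (assumed stride k), then a 3x3 and a 1x1
        # convolution, then global average pooling down to one value per channel.
        trace.append((f"pool_{k}x{k}", size // k))
        feature_length += channels
    # The concatenated vector then passes through FC -> dropout(0.5) -> FC -> sigmoid.
    return trace, feature_length
```

Under these assumptions a 256×256 input reaches the multi-scale pooling module at 64×64, the four branches see 4×4, 8×8, 16×16 and 32×32 maps, and the concatenated feature vector has 4 × 64 = 256 entries.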
Step 204, model training and testing.
Dividing the picture sample into a training set and a verification set, performing iterative training on the designed convolutional neural network model by using the training set to obtain model weights, and performing accuracy and reliability evaluation on the model weights obtained by training by using the verification set to obtain optimized model weights.
For example, the picture samples are divided into a training set and a verification set at a ratio of 9:1, and the model is trained on the training samples by stochastic gradient descent (SGD) with an initial learning rate of 0.001 and a momentum parameter of 0.9, the learning rate being reduced by a factor of 10 every 30 epochs. Further, after each epoch is completed, the accuracy and reliability of the model weights are evaluated on the verification samples to obtain the optimized model weights.
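The example settings above can be sketched in plain Python as follows. The function names and the shuffling seed are hypothetical, and the momentum update shown is one common formulation of SGD with momentum; the description does not specify which variant is used.

```python
import random

def split_samples(samples, train_ratio=0.9, seed=0):
    """Shuffle the samples and split them into (training set, verification set)."""
    items = list(samples)
    random.Random(seed).shuffle(items)
    cut = round(len(items) * train_ratio)
    return items[:cut], items[cut:]

def learning_rate(epoch, base_lr=0.001, step=30, factor=0.1):
    """Step decay: divide the learning rate by 10 every `step` epochs."""
    return base_lr * factor ** (epoch // step)

def sgd_momentum_step(weight, velocity, grad, lr, momentum=0.9):
    """One scalar SGD-with-momentum update; returns (new_weight, new_velocity)."""
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity
```

With these defaults the learning rate is 0.001 for epochs 0 to 29, 0.0001 for epochs 30 to 59, and so on. Evaluating on the verification set after every epoch and keeping the best-scoring weights would be one way to obtain the optimized model weights mentioned above.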
Compared with the prior art, the image tilt detection method based on the convolutional neural network has the advantages and positive effects that: by establishing the convolutional neural network model, the automatic extraction, identification and classification of the picture tilt features are realized, and compared with the prior art, the method not only improves the identification rate and the robustness, but also saves time and labor.
Fig. 5 is a schematic structural diagram of a picture tilt detection apparatus according to an embodiment of the present invention. As shown in fig. 5, the apparatus includes: sample construction module 10, picture tilt detection model construction module 20, and tilt detection module 30, wherein: the sample construction module 10 is for: constructing a training sample, wherein the training sample comprises a vertical picture sample and an inclined picture sample; the picture tilt detection model construction module 20 is configured to: acquiring a preset channel picture of each training sample in the training samples, and training a convolutional neural network model through the preset channel picture of the training samples to obtain a picture tilt detection model; the tilt detection module 30 is for: inputting the picture to be detected into the picture inclination detection model, and carrying out inclination detection on the picture to be detected according to the output of the picture inclination detection model.
According to the image tilt detection device provided by the embodiment of the invention, the training sample consisting of the vertical image sample and the tilt image sample is constructed, and the preset channel image of the training sample is input into the convolutional neural network model for training to obtain the image tilt detection model, so that the image tilt detection device is further used for detecting whether the image is tilted, the automatic extraction and identification of the image tilt characteristics and the image classification are realized, the image tilt detection under a complex daily scene can be realized, and compared with the prior art, the recognition rate and the robustness are improved, and the time and the labor are saved.
The apparatus provided by the embodiment of the present invention is used to carry out the method described above; for its specific functions, refer to the flow of the method, which is not repeated here.
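By way of non-limiting illustration, the division of the apparatus into modules 10, 20, and 30 can be sketched as follows. The class name `TiltDetectionApparatus`, the type aliases, the decision threshold of 0.5, and the stand-in functions are all assumptions made for this sketch and do not appear in the embodiment; in particular, the embodiment does not state how the model output is turned into a tilted / not tilted decision.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Picture = List[List[float]]          # placeholder picture type (assumption)
Sample = Tuple[Picture, int]         # (picture, label); 0 = vertical, 1 = inclined
Model = Callable[[Picture], float]   # returns an estimated probability of tilt


@dataclass
class TiltDetectionApparatus:
    build_samples: Callable[[Sequence[Picture]], List[Sample]]  # module 10
    build_model: Callable[[List[Sample]], Model]                # module 20
    threshold: float = 0.5                                      # assumed cut-off

    def fit(self, vertical_pictures: Sequence[Picture]) -> Model:
        """Modules 10 and 20: construct samples, then train a model on them."""
        return self.build_model(self.build_samples(vertical_pictures))

    def detect(self, model: Model, picture: Picture) -> bool:
        """Module 30: report True when the model output indicates tilt."""
        return model(picture) >= self.threshold


# Trivial stand-ins so that the sketch runs; a real system would plug in the
# sample construction and convolutional neural network training of the method.
def label_all_vertical(pictures):
    return [(p, 0) for p in pictures]


def constant_model_factory(samples):
    return lambda picture: 0.9


apparatus = TiltDetectionApparatus(label_all_vertical, constant_model_factory)
model = apparatus.fit([[[0.0]]])
is_tilted = apparatus.detect(model, [[0.0]])
```

Passing the two construction steps in as callables keeps the three modules separable, in line with the statement that the modules may be combined or distributed according to actual needs.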
Fig. 6 is a schematic diagram of the physical structure of an electronic device according to an embodiment of the present invention. As shown in Fig. 6, the electronic device may include a processor 610, a communication interface 620, a memory 630, and a communication bus 640, wherein the processor 610, the communication interface 620, and the memory 630 communicate with one another via the communication bus 640. The processor 610 may call logic instructions in the memory 630 to perform the following method: constructing training samples, wherein the training samples comprise vertical picture samples and inclined picture samples; acquiring a preset channel picture of each training sample, and training a convolutional neural network model with the preset channel pictures of the training samples to obtain a picture tilt detection model; and inputting a picture to be detected into the picture tilt detection model, and performing tilt detection on the picture to be detected according to the output of the picture tilt detection model.
Further, the logic instructions in the memory 630 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, embodiments of the present invention also provide a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the method provided in the above embodiments, for example including: constructing training samples, wherein the training samples comprise vertical picture samples and inclined picture samples; acquiring a preset channel picture of each training sample, and training a convolutional neural network model with the preset channel pictures of the training samples to obtain a picture tilt detection model; and inputting a picture to be detected into the picture tilt detection model, and performing tilt detection on the picture to be detected according to the output of the picture tilt detection model.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (20)

1. A picture tilt detection method, comprising:
constructing a training sample, wherein the training sample comprises a vertical picture sample and an inclined picture sample;
acquiring a preset channel picture of each training sample in the training samples, and training a convolutional neural network model through the preset channel picture of the training samples to obtain a picture tilt detection model;
inputting a picture to be detected into the picture inclination detection model, and performing inclination detection of the picture to be detected according to the output of the picture inclination detection model;
the vertical picture sample is provided with a label for representing the picture as vertical, and the inclined picture sample is provided with a label for representing the picture as inclined;
the preset channel pictures comprise preset characteristic pictures with preset quantity, and each preset characteristic picture corresponds to a picture of one channel; training the convolutional neural network model through the preset channel picture of the training sample to obtain a picture tilt detection model comprises the following steps:
inputting the preset channel picture into the convolutional neural network model, and training the convolutional neural network model by taking the label of the training sample corresponding to the preset channel picture as output to obtain the picture tilt detection model;
the preset channel picture is a three-channel picture, and the obtaining the preset channel picture of each training sample in the training samples includes:
and acquiring an x-direction gradient image, a y-direction gradient image and a gray scale image of each training sample in the training samples, and superposing the x-direction gradient image, the y-direction gradient image and the gray scale image to form the three-channel image.
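As a non-limiting sketch of how the three-channel picture of claim 1 could be assembled, the following standard-library Python computes x- and y-direction gradient maps with 3x3 Sobel kernels and stacks them with the gray-scale map. The function names, the nested-list image representation, the border handling (edge replication), and the choice to take the gradients from the gray-scale map are assumptions made for illustration.

```python
# Sobel kernels for horizontal (x) and vertical (y) intensity changes.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def filter3x3(gray, kernel):
    """Correlate a 2-D list with a 3x3 kernel; border pixels are replicated."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)  # clamp row to the image
                    cc = min(max(c + dc, 0), w - 1)  # clamp column to the image
                    acc += kernel[dr + 1][dc + 1] * gray[rr][cc]
            out[r][c] = acc
    return out


def three_channel(gray):
    """Stack (x-gradient, y-gradient, gray) per pixel into an H x W x 3 list."""
    gx = filter3x3(gray, SOBEL_X)
    gy = filter3x3(gray, SOBEL_Y)
    return [
        [(gx[r][c], gy[r][c], float(gray[r][c])) for c in range(len(gray[0]))]
        for r in range(len(gray))
    ]


# A vertical edge: dark on the left, bright on the right.
image = [[0, 0, 255, 255] for _ in range(4)]
stacked = three_channel(image)
```

For this test image the x-gradient channel responds strongly at the edge while the y-gradient channel stays at zero, which is the kind of directional cue the three-channel input makes available to the network.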
2. The picture tilt detection method according to claim 1, wherein the training the convolutional neural network model through the preset channel picture of the training sample to obtain the picture tilt detection model further comprises:
dividing the training sample into a training set and a verification set according to a set proportion;
training a convolutional neural network model through a preset channel picture of the training sample in the training set to obtain a picture tilt detection model weight;
and evaluating the accuracy and reliability of the picture tilt detection model weight through a preset channel picture of the training sample in the verification set to obtain the optimized picture tilt detection model weight.
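The division of claim 2 into a training set and a verification set "according to a set proportion" can be sketched as below. The function name `split_samples`, the 0.8 proportion, the fixed seed, and the file names are hypothetical values chosen for the example; the claim does not specify them.

```python
import random


def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and split it into (training, verification)."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Each sample pairs a (hypothetical) file name with its label:
# 0 = vertical, 1 = inclined.
samples = [(f"img_{i}.jpg", i % 2) for i in range(10)]
train_set, val_set = split_samples(samples, train_fraction=0.8)
```

The verification set is held out from training so that the accuracy measured on it reflects pictures the model has not seen.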
3. The method for detecting the inclination of the picture according to claim 1, wherein the step of obtaining the x-direction gradient map and the y-direction gradient map of each of the training samples specifically comprises:
and converting the training sample into an hsv space, and extracting the x-direction gradient map and the y-direction gradient map of a hue h channel by using a sobel operator.
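A minimal sketch of claim 3, assuming 8-bit RGB input held as nested lists of `(r, g, b)` tuples: the hue channel is obtained with the standard-library `colorsys.rgb_to_hsv`, and the Sobel responses are evaluated at one interior pixel. The helper names and the scaling of hue to the range 0 to 255 are illustrative assumptions.

```python
import colorsys


def hue_channel(rgb_image):
    """Return the hue of every pixel, scaled from [0, 1) to [0, 255)."""
    return [
        [colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)[0] * 255.0
         for (r, g, b) in row]
        for row in rgb_image
    ]


def sobel_at(channel, r, c):
    """Sobel (gx, gy) responses at an interior pixel (r, c) of a 2-D list."""
    gx = ((channel[r - 1][c + 1] + 2 * channel[r][c + 1] + channel[r + 1][c + 1])
          - (channel[r - 1][c - 1] + 2 * channel[r][c - 1] + channel[r + 1][c - 1]))
    gy = ((channel[r + 1][c - 1] + 2 * channel[r + 1][c] + channel[r + 1][c + 1])
          - (channel[r - 1][c - 1] + 2 * channel[r - 1][c] + channel[r - 1][c + 1]))
    return gx, gy


# Red on the left, green on the right: equal brightness, different hue.
RED, GREEN = (255, 0, 0), (0, 255, 0)
image = [[RED, RED, GREEN, GREEN] for _ in range(3)]
hue = hue_channel(image)
gx, gy = sobel_at(hue, 1, 1)
```

In this example the boundary between the two colors produces a clear x-direction response in the hue channel and no y-direction response.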
4. A picture tilt detection method as defined in claim 3, wherein prior to said converting the training samples into hsv space, the method further comprises:
and carrying out histogram equalization on the training samples.
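Histogram equalization as recited in claim 4 can be illustrated on a single 8-bit channel as follows. This is a sketch under assumptions: the claim does not say whether equalization is applied per color channel or to a luminance channel, and the function name and nested-list representation are hypothetical.

```python
def equalize_histogram(gray, levels=256):
    """Histogram-equalize a 2-D list of integer intensities in [0, levels)."""
    hist = [0] * levels
    for row in gray:
        for v in row:
            hist[v] += 1

    # Cumulative distribution of the intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = cdf[-1]
    cdf_min = next(c for c in cdf if c > 0)
    if total == cdf_min:  # constant image: there is nothing to spread out
        return [row[:] for row in gray]

    # Map each level so that the occupied range stretches over [0, levels - 1].
    scale = (levels - 1) / (total - cdf_min)
    lut = [max(0, round((c - cdf_min) * scale)) for c in cdf]
    return [[lut[v] for v in row] for row in gray]


low_contrast = [[100, 100], [101, 102]]
equalized = equalize_histogram(low_contrast)
```

The low-contrast input, whose values span only 100 to 102, is stretched to span the full 0 to 255 range, which tends to make edges easier to pick up in the subsequent gradient step.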
5. The picture tilt detection method of claim 1, wherein constructing the training samples comprises:
acquiring the vertical picture sample, and carrying out sample augmentation on the vertical picture sample through a preset picture augmentation rule;
and acquiring an inclined picture sample according to the vertical picture sample subjected to the sample augmentation.
6. The picture tilt detection method as claimed in claim 5, wherein the preset picture augmentation rule comprises: at least one of randomly cropping a picture, randomly scaling, randomly horizontally flipping, and randomly vertically flipping.
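Three of the augmentation rules named in claim 6 (random cropping, random horizontal flipping, random vertical flipping) can be sketched with the standard library as follows; random scaling is omitted for brevity. The function names, the crop size, and the flip probability of 0.5 are assumptions made for this example.

```python
import random


def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    top = rng.randint(0, len(img) - crop_h)
    left = rng.randint(0, len(img[0]) - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


def horizontal_flip(img):
    """Mirror the picture left to right."""
    return [row[::-1] for row in img]


def vertical_flip(img):
    """Mirror the picture top to bottom."""
    return [row[:] for row in img[::-1]]


def augment(img, crop_h, crop_w, rng, flip_prob=0.5):
    """Apply a random crop, then each flip independently with flip_prob."""
    out = random_crop(img, crop_h, crop_w, rng)
    if rng.random() < flip_prob:
        out = horizontal_flip(out)
    if rng.random() < flip_prob:
        out = vertical_flip(out)
    return out


rng = random.Random(42)
image = [[r * 10 + c for c in range(6)] for r in range(6)]
augmented = augment(image, 4, 4, rng)
```

None of these operations rotates the picture content by an oblique angle, so an augmented vertical sample can plausibly keep its "vertical" label.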
7. The picture tilt detection method as claimed in claim 5, wherein the obtaining a tilt picture sample from the vertical picture sample after the sample is amplified comprises:
and rotating the vertical picture sample subjected to sample augmentation and extracting the largest inscribed rectangle in the overlapping area before and after rotation to obtain the inclined picture sample.
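One way to size the crop of claim 7 is sketched below. It makes two simplifying assumptions that the claim does not state: the crop is centered, and it keeps the aspect ratio of the original picture. Under those assumptions the largest rectangle lying inside both the original frame and the rotated frame has a closed form, because all four crop corners must fall inside the rotated frame. The function name is hypothetical.

```python
import math


def inscribed_crop_size(width, height, angle_deg):
    """Size of the largest centered, axis-aligned rectangle with the original
    aspect ratio that lies inside both the original frame and the same frame
    rotated by angle_deg about its center."""
    a = math.radians(angle_deg)
    cos_a, sin_a = abs(math.cos(a)), abs(math.sin(a))
    # A crop of size (s * width, s * height) fits inside the rotated frame when
    #   s * (width * cos_a + height * sin_a) <= width   and
    #   s * (width * sin_a + height * cos_a) <= height.
    scale = min(
        width / (width * cos_a + height * sin_a),
        height / (width * sin_a + height * cos_a),
    )
    return width * scale, height * scale


crop_w, crop_h = inscribed_crop_size(640, 480, 10.0)
```

Cropping to this size after rotation removes the empty corner regions that rotation introduces, so the inclined sample contains only real picture content. A crop without the aspect-ratio constraint could be larger; this sketch does not attempt that.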
8. The picture tilt detection method as claimed in claim 7, wherein the rotating the vertical picture sample after the sample augmentation comprises: randomly rotating the vertical picture sample after the sample augmentation, with the rotation angle in degrees selected at random according to a normal probability distribution.
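The angle selection of claim 8 can be sketched with `random.gauss` from the Python standard library. The standard deviation, the upper bound, and the lower bound below which a picture would not be regarded as tilted are hypothetical values chosen for the example; the claim specifies only that the rotation angle follows a normal distribution.

```python
import random


def sample_rotation_angle(rng, sigma_deg=5.0, max_abs_deg=15.0, min_abs_deg=1.0):
    """Draw a tilt angle in degrees from a zero-mean normal distribution with
    standard deviation sigma_deg, redrawing until the magnitude lies between
    min_abs_deg and max_abs_deg (all three values are assumptions)."""
    while True:
        angle = rng.gauss(0.0, sigma_deg)
        if min_abs_deg <= abs(angle) <= max_abs_deg:
            return angle


rng = random.Random(0)
angles = [sample_rotation_angle(rng) for _ in range(200)]
```

A zero-mean normal distribution yields many slightly tilted samples and few strongly tilted ones, in both rotation directions.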
9. The picture tilt detection method according to claim 1, wherein the convolutional neural network model is a multi-scale residual network model, and the convolutional neural network model comprises a cascade residual module, a multi-scale pooling module and a full connection module which are sequentially connected.
10. A picture tilt detection apparatus, comprising:
a sample construction module for: constructing a training sample, wherein the training sample comprises a vertical picture sample and an inclined picture sample;
the picture tilt detection model construction module is used for: acquiring a preset channel picture of each training sample in the training samples, and training a convolutional neural network model through the preset channel picture of the training samples to obtain a picture tilt detection model;
the tilt detection module is used for: inputting a picture to be detected into the picture tilt detection model, and performing tilt detection of the picture to be detected according to the output of the picture tilt detection model; the vertical picture sample is provided with a label for representing the picture as vertical, and the inclined picture sample is provided with a label for representing the picture as inclined; the preset channel pictures comprise a preset quantity of preset characteristic pictures, and each preset characteristic picture corresponds to a picture of one channel; when training the convolutional neural network model through the preset channel picture of the training sample to obtain the picture tilt detection model, the picture tilt detection model construction module is specifically used for:
inputting the preset channel picture into the convolutional neural network model, and training the convolutional neural network model by taking the label of the training sample corresponding to the preset channel picture as output to obtain the picture tilt detection model;
the preset channel picture is a three-channel picture, and the picture tilt detection model construction module is specifically configured to: and acquiring an x-direction gradient image, a y-direction gradient image and a gray scale image of each training sample in the training samples, and superposing the x-direction gradient image, the y-direction gradient image and the gray scale image to form the three-channel image.
11. The apparatus according to claim 10, wherein the image tilt detection model construction module is further configured to, when training the convolutional neural network model through a preset channel image of the training sample to obtain the image tilt detection model:
dividing the training sample into a training set and a verification set according to a set proportion;
training a convolutional neural network model through a preset channel picture of the training sample in the training set to obtain a picture tilt detection model weight;
and evaluating the accuracy and reliability of the picture tilt detection model weight through a preset channel picture of the training sample in the verification set to obtain the optimized picture tilt detection model weight.
12. The apparatus according to claim 10, wherein the image tilt detection model construction module, when configured to obtain an x-direction gradient map and a y-direction gradient map of each of the training samples, is specifically configured to:
and converting the training sample into an hsv space, and extracting the x-direction gradient map and the y-direction gradient map of a hue h channel by using a sobel operator.
13. The picture tilt detection apparatus of claim 12, wherein the picture tilt detection model construction module, prior to the converting the training samples to hsv space, is further configured to: and carrying out histogram equalization on the training samples.
14. The picture tilt detection apparatus of claim 10, wherein the sample construction module, when configured to construct training samples, is configured to:
acquiring the vertical picture sample, and carrying out sample augmentation on the vertical picture sample through a preset picture augmentation rule;
and acquiring an inclined picture sample according to the vertical picture sample subjected to the sample augmentation.
15. The picture tilt detection apparatus of claim 14, wherein the preset picture augmentation rule comprises: at least one of randomly cropping a picture, randomly scaling, randomly horizontally flipping, and randomly vertically flipping.
16. The picture tilt detection apparatus according to claim 14, wherein the sample construction module, when configured to obtain a tilted picture sample from the vertical picture sample after the sample augmentation, is specifically configured to:
and rotating the vertical picture sample subjected to sample augmentation and extracting the largest inscribed rectangle in the overlapping area before and after rotation to obtain the inclined picture sample.
17. The picture tilt detection apparatus of claim 16, wherein the sample construction module, when configured to rotate the vertical picture sample after the sample augmentation, is configured to: randomly rotate the vertical picture sample after the sample augmentation, with the rotation angle in degrees selected at random according to a normal probability distribution.
18. The picture tilt detection apparatus of claim 10, wherein the convolutional neural network model is a multi-scale residual network model, and the convolutional neural network model comprises a cascade residual module, a multi-scale pooling module, and a full connection module, which are sequentially connected.
19. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the picture tilt detection method according to any one of claims 1 to 9 when the computer program is executed by the processor.
20. A non-transitory computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor implements the steps of the picture tilt detection method according to any of claims 1 to 9.
CN201911113009.2A 2019-11-14 2019-11-14 Picture inclination detection method and device Active CN111127327B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911113009.2A CN111127327B (en) 2019-11-14 2019-11-14 Picture inclination detection method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911113009.2A CN111127327B (en) 2019-11-14 2019-11-14 Picture inclination detection method and device

Publications (2)

Publication Number Publication Date
CN111127327A CN111127327A (en) 2020-05-08
CN111127327B true CN111127327B (en) 2024-04-12

Family

ID=70495599

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911113009.2A Active CN111127327B (en) 2019-11-14 2019-11-14 Picture inclination detection method and device

Country Status (1)

Country Link
CN (1) CN111127327B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112464852B (en) * 2020-12-09 2023-12-05 重庆大学 Vehicle driving license picture self-adaptive correction and identification method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951848A (en) * 2017-03-13 2017-07-14 平安科技(深圳)有限公司 The method and system of picture recognition
CN107273832A (en) * 2017-06-06 2017-10-20 青海省交通科学研究院 Licence plate recognition method and system based on integrating channel feature and convolutional neural networks
CN108830213A (en) * 2018-06-12 2018-11-16 北京理工大学 Car plate detection and recognition methods and device based on deep learning
CN109961006A (en) * 2019-01-30 2019-07-02 东华大学 A kind of low pixel multiple target Face datection and crucial independent positioning method and alignment schemes
KR20190091101A (en) * 2018-01-26 2019-08-05 지의소프트 주식회사 Automatic classification apparatus and method of document type using deep learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106960219B (en) * 2017-03-10 2021-04-16 百度在线网络技术(北京)有限公司 Picture identification method and device, computer equipment and computer readable medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106951848A (en) * 2017-03-13 2017-07-14 平安科技(深圳)有限公司 The method and system of picture recognition
CN107273832A (en) * 2017-06-06 2017-10-20 青海省交通科学研究院 Licence plate recognition method and system based on integrating channel feature and convolutional neural networks
KR20190091101A (en) * 2018-01-26 2019-08-05 지의소프트 주식회사 Automatic classification apparatus and method of document type using deep learning
CN108830213A (en) * 2018-06-12 2018-11-16 北京理工大学 Car plate detection and recognition methods and device based on deep learning
CN109961006A (en) * 2019-01-30 2019-07-02 东华大学 A kind of low pixel multiple target Face datection and crucial independent positioning method and alignment schemes

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
傅鹏 ; 谢世朋 ; .基于级联卷积神经网络的车牌定位.计算机技术与发展.2017,(01),140-143. *
基于卷积神经网络的复杂档案图像倾斜校正方法研究;徐文渊等;《全国第三届"智能电网"会议论文集》;第294-300页 *

Also Published As

Publication number Publication date
CN111127327A (en) 2020-05-08

Similar Documents

Publication Publication Date Title
JP6926335B2 (en) Variable rotation object detection in deep learning
CN105574550A (en) Vehicle identification method and device
CN111445459B (en) Image defect detection method and system based on depth twin network
CN109993040A (en) Text recognition method and device
CN112001403B (en) Image contour detection method and system
US11605210B2 (en) Method for optical character recognition in document subject to shadows, and device employing method
CN112884782B (en) Biological object segmentation method, apparatus, computer device, and storage medium
CN109815823B (en) Data processing method and related product
CN111539456B (en) Target identification method and device
CN114639102B (en) Cell segmentation method and device based on key point and size regression
CN111127327B (en) Picture inclination detection method and device
CN114724246A (en) Dangerous behavior identification method and device
CN107527095A (en) A kind of vehicle detection image-pickup method and system
CN114445615A (en) Rotary insulator target detection method based on scale invariant feature pyramid structure
CN112528782A (en) Underwater fish target detection method and device
CN116524312A (en) Infrared small target detection method based on attention fusion characteristic pyramid network
CN115620083A (en) Model training method, face image quality evaluation method, device and medium
CN112308061B (en) License plate character recognition method and device
CN114399432A (en) Target identification method, device, equipment, medium and product
CN116415019A (en) Virtual reality VR image recognition method and device, electronic equipment and storage medium
CN114387451A (en) Training method, device and medium for abnormal image detection model
CN107563418A (en) A kind of picture attribute detection method based on area sensitive score collection of illustrative plates and more case-based learnings
US11876945B2 (en) Device and method for acquiring shadow-free images of documents for scanning purposes
CN113610184B (en) Wood texture classification method based on transfer learning
CN112883988B (en) Training and feature extraction method of feature extraction network based on multiple data sets

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant