WO2020164278A1 - Image processing method, device, electronic device and readable storage medium - Google Patents

Image processing method, device, electronic device and readable storage medium

Info

Publication number
WO2020164278A1
WO2020164278A1 (PCT/CN2019/118277, CN2019118277W)
Authority
WO
WIPO (PCT)
Prior art keywords
image
feature
neural network
information
layer
Prior art date
Application number
PCT/CN2019/118277
Other languages
English (en)
French (fr)
Inventor
赵峰
王健宗
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020164278A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/24: Classification techniques
    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Definitions

  • This application relates to the technical field of image processing in the field of artificial intelligence, and in particular to an image processing method, device, electronic device, and computer-readable storage medium based on a deep neural network.
  • Convolutional neural networks (CNNs) are increasingly used in computer vision, especially in the field of image classification.
  • Simple neural networks have been used as machine learning techniques in the field of character recognition and image classification.
  • However, training such a neural network requires a large amount of labeled data.
  • Without spending time and effort on training, an easy way to exploit CNN capability is to use a pre-trained CNN as a feature extractor.
  • To generate complex decision surfaces, a support vector machine (SVM) is a very economical method for representing complex surfaces in high-dimensional space, including polynomial and other types of surfaces.
  • A convolutional neural network is a powerful machine learning technique used in the field of deep learning.
  • Simple convolutional neural networks have been used as machine learning techniques in the field of character recognition and image classification.
  • However, the simple CNN and SVM techniques in the prior art do not achieve high classification accuracy on images, so a scheme that combines the two techniques to improve the classification accuracy of image recognition is urgently needed.
  • To solve the above technical problems, this application proposes an image processing method, device, electronic device, and readable storage medium.
  • A first aspect of this application provides an image processing method, including: acquiring image information;
  • according to the image information, using a feature extractor to perform feature extraction to obtain feature information; sending the feature information to a classifier;
  • the classifier performs a calculation according to the feature information to obtain the minimum feature distance between the feature information and the hyperplane in the corresponding classifier;
  • judging whether the minimum feature distance is greater than a preset distance threshold; if it is greater, the classification information of the classifier is used as the final classification information.
  • A second aspect of the present application provides an image processing device, which includes:
  • a collection module, which acquires image information;
  • a feature extraction module, which uses a feature extractor to perform feature extraction according to the image information to obtain feature information;
  • a classification module, which sends the feature information to the classifier, where the classifier performs a calculation based on the feature information to obtain the minimum feature distance between the feature information and the hyperplane in the corresponding classifier;
  • a judging module, which judges whether the minimum feature distance is greater than a preset distance threshold and, if it is, sends a signal to the classification module, which then uses the classification information of the classifier as the final classification information.
  • A third aspect of the present application provides an electronic device, including a memory and a processor, where the memory stores an image processing program that, when executed by the processor, implements the following steps:
  • acquiring image information; according to the image information, using a feature extractor to perform feature extraction to obtain feature information; sending the feature information to a classifier;
  • the classifier performs a calculation according to the feature information to obtain the minimum feature distance between the feature information and the hyperplane in the corresponding classifier;
  • judging whether the minimum feature distance is greater than a preset distance threshold; if it is greater, the classification information of the classifier is used as the final classification information.
  • A fourth aspect of the present application provides a computer non-volatile readable storage medium, which includes an image processing program; when the image processing program is executed by a processor, the steps of the above image processing method are implemented.
  • With this scheme, the minimum feature distance in the classifier is judged, and the classification information is used as the final classification result only when the distance is greater than the preset distance threshold, which increases the accuracy of image classification.
  • Using a convolutional layer of the neural network as the feature extractor to extract the feature information can further increase recognition accuracy.
  • This application also performs pattern-distortion enhancement on the training images, which further increases the accuracy of image recognition.
  • FIG. 1 shows a flowchart of the image processing method of the present application
  • Figure 2 shows the layer structure diagram of the deep convolutional neural network AlexNet of this application
  • Fig. 3 shows the weight structure diagram of the first layer of the deep convolutional neural network AlexNet of the present application
  • FIG. 4 shows a schematic diagram of image enhancement through cosine mode in this application
  • FIG. 5 shows a schematic diagram of image enhancement through rotation and tilt in this application
  • Fig. 6 shows a structural block diagram of the electronic device of the present application.
  • Fig. 1 shows a flowchart of the image processing method of the present application.
  • As shown in Fig. 1, this application discloses an image processing method, including:
  • S102: acquiring image information; S104: according to the image information, using a feature extractor to perform feature extraction to obtain feature information; S106: sending the feature information to the classifier;
  • S108: the classifier performs a calculation according to the feature information to obtain the minimum feature distance between the feature information and the hyperplane in the corresponding classifier; S110: judging whether the minimum feature distance is greater than a preset distance threshold; S112: if it is greater, using the classification information of the classifier as the final classification information.
  • After the picture is acquired, that is, after the image information is acquired, it is input to the feature extractor for feature extraction to obtain feature information.
  • In other words, the feature extractor transforms the input original image into a feature vector, and that feature vector is the feature information.
  • The feature information is then sent to the classifier for classification processing, and the input image is finally output as one of multiple classification contents. For example, if a picture containing a kitten is input, the image information of the picture is first acquired, features are then extracted from this image information to obtain feature information, the feature information is sent to the classifier for classification and recognition, and the classification information is finally output.
  • The classification information shows that the recognition result of the picture is a kitten, and the position of the kitten in the picture can be marked.
  • The classification information may also contain other information, for example, marking a detection frame in the picture, marking the position of the species within the detection frame, marking the species information in the detection frame, and so on.
  • A classifier such as the SVM classifier is a discriminative classifier defined by a separating hyperplane: given a set of labeled training samples, the algorithm outputs an optimal hyperplane used to classify new samples (test samples).
  • That is, the trained classifier has a hyperplane that separates the feature vectors into different categories; the farther the feature vectors in each region are from the hyperplane, the better, which makes the classification more accurate.
  • When a feature vector falls into a certain region, there is a minimum distance between it and the hyperplane. It is judged whether this minimum distance is greater than the preset distance threshold. If it is less than the preset threshold, there may be some error;
  • in that case the image is preprocessed, and then the feature vector calculation and classification are performed again. If it is greater, the classification result is accurate, and the classification information can be used as the final classification result.
  • The preprocessing can include operations such as increasing contrast, increasing color, and removing image noise.
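
For illustration only, the following Python sketch shows how this threshold check and preprocessing retry could be wired together with a scikit-learn style SVM. The threshold value, the `extract_features` helper and the `svm` object are hypothetical stand-ins rather than the patent's own implementation, and a binary SVM is assumed so that the absolute decision value can serve as the distance to the hyperplane.

```python
import numpy as np
from PIL import Image, ImageEnhance, ImageFilter

DIST_THRESHOLD = 0.8   # the "preset distance threshold"; the value is illustrative

def enhance(img):
    """Re-preprocessing applied when the distance test fails:
    increase contrast and colour, then remove some image noise."""
    img = ImageEnhance.Contrast(img).enhance(1.5)
    img = ImageEnhance.Color(img).enhance(1.3)
    return img.filter(ImageFilter.MedianFilter(size=3))

def classify(img, extract_features, svm, max_retries=1):
    """extract_features and svm are assumed to be supplied (e.g. the fc7
    extractor and the trained SVM sketched later in this document)."""
    for attempt in range(max_retries + 1):
        feats = extract_features(img)                   # shape (1, d)
        # For a linear SVM, decision_function is proportional to the
        # geometric distance between the feature vector and the hyperplane.
        min_dist = abs(float(svm.decision_function(feats)[0]))
        if min_dist > DIST_THRESHOLD or attempt == max_retries:
            return svm.predict(feats)[0], min_dist      # final classification
        img = enhance(img)                              # retry after preprocessing
```
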
  • According to an embodiment of this application, the feature extractor is a pre-trained feature extractor, set up by: acquiring a large-scale object image data set;
  • performing image training on the large-scale object image data set to obtain the deep convolutional neural network AlexNet; acquiring a convolutional layer of the deep convolutional neural network AlexNet; and using that convolutional layer as the feature extractor.
  • It should be noted that the feature extractor is a preset feature extractor.
  • The feature extractor is a convolutional layer of the pre-trained deep convolutional neural network AlexNet.
  • Training the deep convolutional neural network AlexNet includes: obtaining a large-scale object image data set, which has 1,000 object categories and 1.2 million training images; performing image training on this data set to obtain the deep convolutional neural network AlexNet; and then obtaining the convolutional-layer information of the deep convolutional neural network AlexNet. The deep convolutional neural network AlexNet has multiple convolutional layers; this application only needs to obtain one of these layers and use it as the feature extractor.
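
As a sketch of this setup (not the patent's own code), the convolutional stack of an AlexNet pre-trained on a large-scale object image data set can be reused as a fixed feature extractor. The example assumes torchvision ≥ 0.13 with its bundled ImageNet weights and a hypothetical file `example.jpg`.

```python
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# AlexNet pre-trained on a large-scale object image data set
# (ImageNet: ~1000 categories, ~1.2 million training images).
alexnet = models.alexnet(weights="IMAGENET1K_V1").eval()
conv_extractor = alexnet.features             # the five convolutional layers

preprocess = T.Compose([
    T.Resize(256), T.CenterCrop(227), T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
with torch.no_grad():
    conv_maps = conv_extractor(img)               # (1, 256, 6, 6) feature maps
    conv_features = torch.flatten(conv_maps, 1)   # 9216-d vector for a classifier
print(conv_features.shape)
```
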
  • FIG. 2 shows a layer structure diagram of the deep convolutional neural network AlexNet of the present application.
  • As shown in Fig. 2, the input of the deep convolutional neural network AlexNet is an image from one of 1000 different categories (such as cats, dogs, etc.), and the output is a vector of 1000 numbers.
  • The i-th element of the output vector is the probability that the input image belongs to the i-th class.
  • The first layer of the deep convolutional neural network AlexNet defines the size of the input image as 227 × 227 × 3.
  • The layer weights are pre-trained on the large-scale object image data set.
  • The first layer of the network learns filters used to capture blob and edge features.
  • The middle layers are a series of five convolutional layers and three fully connected layers, interspersed with rectified linear units (ReLU) and max pooling layers.
  • The last layer is the classification layer, with 1000 classes.
  • The CNN pre-trained on the object image data set is used as the feature extractor.
  • The weights of the first convolutional layer of the deep convolutional neural network AlexNet are shown in Figure 3.
  • Instead of training on object images such as those in the STL-10 database, the CNN pre-trained on object images (i.e., AlexNet) is used as the feature extractor.
  • The 17th layer in the AlexNet layer map, named "fc7", is connected to the classifier as the feature extractor; that is, the feature vector obtained at the 17th layer is passed to the classifier.
  • According to an embodiment of this application, the first layer of the deep convolutional neural network AlexNet includes filters for capturing edge features,
  • the middle layers of the deep convolutional neural network AlexNet include multiple convolutional layers and max pooling layers,
  • the last layer of the deep convolutional neural network AlexNet is a classification layer, and the weights of each layer of the deep convolutional neural network AlexNet are determined through training.
  • The seventeenth layer of the deep convolutional neural network AlexNet is used as the feature extractor, and the feature information obtained at the seventeenth layer of the deep convolutional neural network AlexNet is sent to the classifier.
  • Specifically, this application uses the seventeenth layer of the deep convolutional neural network AlexNet as the feature extractor, where AlexNet includes a layer named "fc7", that is, the "fc7" layer among the fully connected layers of the AlexNet network.
  • The feature vector obtained at the 17th layer is passed to the classifier.
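
A minimal sketch of taking the 4096-dimensional "fc7" activation as the feature vector, assuming the same pre-trained torchvision AlexNet as above; the exact layer indexing reflects torchvision's layout, which may differ from other AlexNet implementations.

```python
import torch
import torch.nn as nn
import torchvision.models as models

alexnet = models.alexnet(weights="IMAGENET1K_V1").eval()

# "fc7" corresponds to the second 4096-unit fully connected layer; keep the
# classifier up to and including its ReLU, dropping the final 1000-way layer.
fc7_head = nn.Sequential(*list(alexnet.classifier.children())[:6])

def extract_fc7(batch):
    """batch: float tensor of shape (N, 3, 227, 227), already normalised."""
    with torch.no_grad():
        x = alexnet.features(batch)
        x = alexnet.avgpool(x)
        x = torch.flatten(x, 1)       # (N, 9216)
        return fc7_head(x)            # (N, 4096) fc7 feature vectors
```
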
  • According to an embodiment of this application, the classifier is a pre-trained classifier.
  • The classifier can use a support vector machine (SVM) for classification.
  • The classifier classifies data by finding the best hyperplane that separates all data points of one class from the data points of the other classes.
  • Support vector machines can represent complex surfaces, including polynomials and radial basis functions.
  • The best hyperplane is the one with the largest margin between the two classes; in other words, the margin is the maximum width of the slab parallel to the hyperplane that contains no interior data points, and the support vectors are the data points closest to the separating hyperplane.
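
The margin and the point-to-hyperplane distance can be made concrete with a small toy example; this is a generic linear-SVM illustration using scikit-learn, not the patent's training data.

```python
import numpy as np
from sklearn.svm import SVC

# Two tiny, linearly separable classes in 2-D.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e3).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

margin = 2.0 / np.linalg.norm(w)                    # width of the empty "slab"
distances = np.abs(X @ w + b) / np.linalg.norm(w)   # distance to w·x + b = 0

print("margin:", margin)
print("distance of the closest points (the support vectors):", distances.min())
```
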
  • Preferably, the training of the above classifier includes the following steps:
  • building an image library that stores images known to contain objects, and a tag library that stores, in a set order, the different tags corresponding to different objects;
  • acquiring the image information of a set number of images in the image library; the feature extraction module extracts the feature information of the set number of images, and the feature information of the set number of images is used as the training set;
  • comparing the objects contained in the set number of images with the objects corresponding to the tags in the tag library to obtain the tag sequence corresponding to each image, and using the tag sequences of the set number of images as the validation set;
  • inputting the training set into the support vector machine to obtain the support vectors and their minimum feature distances from the hyperplane, and obtaining the final classification information of each image in the training set through the judging module, where the classification information includes the tag sequence corresponding to the object in the image;
  • verifying with the validation set and iteratively training the support vector machine to obtain the trained support vector machine.
  • It should be noted that the classifier is trained using the CNN features.
  • The acquired image feature information, that is, the feature vectors, is input into the classifier for training.
  • When high-dimensional CNN feature vectors (each of length 4096) are used, a stochastic gradient descent (SGD) solver is used to speed up training.
  • To measure the classification accuracy of the trained SVM classifier, test image features are extracted through the CNN and passed to the SVM classifier.
  • Using the trained classifier as the classifier for image recognition can improve the accuracy of image recognition.
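
The following sketch shows one way to train such a classifier on extracted CNN feature vectors with a stochastic gradient descent solver, using scikit-learn's `SGDClassifier` with a hinge loss (equivalent to a linear SVM). The random arrays stand in for real fc7 features and labels and are purely illustrative.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder data: X_train would be (N, 4096) fc7 vectors, y_train the labels.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 4096))
y_train = rng.integers(0, 5, size=500)

# Hinge loss + linear model == a linear SVM trained by stochastic gradient
# descent, the fast solver suggested for high-dimensional CNN features.
svm = make_pipeline(StandardScaler(),
                    SGDClassifier(loss="hinge", alpha=1e-4, max_iter=1000))
svm.fit(X_train, y_train)

# decision_function returns one (scaled) hyperplane distance per class;
# the predicted class is the one with the largest value.
print(svm.decision_function(X_train[:3]).shape, svm.predict(X_train[:3]))
```
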
  • According to an embodiment of this application, performing image training on the large-scale object image data set to obtain the deep convolutional neural network AlexNet further includes:
  • using pattern enhancement technology to process the images in the large-scale object image data set to obtain processed images, and training on the processed images to obtain a new deep convolutional neural network AlexNet.
  • The enhancement processing mentioned here can be pattern-distortion enhancement.
  • As data enhancement methods, affine transformation and elastic distortion are well known in character recognition. They are used to generate new samples from the original samples and expand the training set. By applying an affine displacement field to a pattern, simple distortions such as translation, rotation, scaling, and tilt can be generated.
  • Elastic distortion is an image transformation that mimics variations in handwriting style.
  • Specifically, the pattern enhancement technology includes one or more of rotation, tilt, elastic distortion, and cosine-function enhancement.
  • Fig. 4 shows a schematic diagram of image enhancement by the cosine mode in this application.
  • As shown in Fig. 4, this application uses a pattern transformation method based on a cosine function. Specific enhancement methods such as rotation, tilt, and elastic distortion are also applied.
  • Figure 4 shows an example of the cosine-function-enhanced patterns.
  • The number of training images is increased by 31 times relative to the original 5k training samples.
  • By using the cosine function, the original pattern is left-aligned, right-aligned, top-aligned, or bottom-aligned, and the pattern is also center-aligned and expanded.
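
The patent does not give a formula for the cosine-function enhancement, so the following is only one plausible reading, sketched under that assumption: the image is resampled along one axis with a monotone cosine/sine coordinate warp so that the pattern is squeezed toward the left or right edge, or expanded about the centre; top/bottom variants follow by applying the same warp to the row coordinates.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def cosine_warp(img, mode="left"):
    """Resample a 2-D grayscale image with a cosine coordinate warp
    (one possible interpretation of 'cosine function enhancement')."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    u = xs / (w - 1)                              # column position in [0, 1]
    if mode == "left":                            # squeeze the pattern to the left
        src_u = np.sin(np.pi * u / 2.0)
    elif mode == "right":                         # squeeze the pattern to the right
        src_u = 1.0 - np.cos(np.pi * u / 2.0)
    else:                                         # "center": expand the middle
        src_u = np.arccos(1.0 - 2.0 * u) / np.pi
    src_x = src_u * (w - 1)
    return map_coordinates(img, [ys, src_x], order=1, mode="nearest")
```
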
  • Fig. 5 shows a schematic diagram of image enhancement by rotation and tilt in this application.
  • As shown in Fig. 5, this application rotates the image by a certain angle through image rotation processing, or adopts image tilt processing; in both cases the purpose is to transform the image into an image at a certain angle to the original image.
  • Image processing can also be carried out by elastic distortion, transforming the original image into a distorted image.
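
Rotation, tilt (shear) and elastic distortion are standard augmentations; a compact sketch with SciPy is given below for 2-D grayscale images. The parameter values are illustrative, and the elastic distortion follows the usual smoothed random displacement field recipe rather than any specific values from the patent.

```python
import numpy as np
from scipy.ndimage import affine_transform, gaussian_filter, map_coordinates, rotate

def elastic_distortion(img, alpha=34.0, sigma=4.0, seed=None):
    """Random displacement field, smoothed by a Gaussian (sigma), scaled by alpha."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    dx = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    dy = gaussian_filter(rng.uniform(-1, 1, (h, w)), sigma) * alpha
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    return map_coordinates(img, [ys + dy, xs + dx], order=1, mode="reflect")

def rotate_and_tilt(img, angle_deg=10.0, shear=0.2):
    """Rotate the image by a fixed angle, then apply a horizontal shear ('tilt')."""
    rotated = rotate(img, angle_deg, reshape=False, order=1, mode="nearest")
    h, w = rotated.shape
    matrix = np.array([[1.0, 0.0],          # y_in = y_out
                       [shear, 1.0]])       # x_in = shear * y_out + x_out
    offset = [0.0, -shear * h / 2.0]        # keep the pattern roughly centred
    return affine_transform(rotated, matrix, offset=offset, order=1, mode="nearest")
```
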
  • By training on the processed images, a new deep convolutional neural network AlexNet is obtained.
  • This application trains on images with a certain amount of distortion, which can increase the probability of matching images in the network and thus improve the accuracy of image recognition.
  • Fig. 6 shows a structural block diagram of the electronic device of the present application.
  • As shown in Fig. 6, a third aspect of the present application provides an electronic device 6, including a memory 61 and a processor 62, where the memory stores an image processing program that, when executed by the processor, implements the following steps:
  • acquiring image information; according to the image information, using a feature extractor to perform feature extraction to obtain feature information; sending the feature information to a classifier;
  • the classifier performs a calculation according to the feature information to obtain the minimum feature distance between the feature information and the hyperplane in the corresponding classifier;
  • judging whether the minimum feature distance is greater than a preset distance threshold; if it is greater, the classification information of the classifier is used as the final classification information.
  • After the picture is acquired, that is, after the image information is acquired, it is input to the feature extractor for feature extraction to obtain feature information.
  • the feature extractor transforms the input original image into a feature vector, and the feature vector is the feature information.
  • the feature information is sent to the classifier for classification processing, and finally the input image is output as multiple classification contents. For example, if you input a picture containing a kitten, first obtain the image information of the picture, and then extract the feature according to the image information to obtain the feature information, send the feature information to the classifier for classification and recognition, and finally output the classification information.
  • the classification information shows that the recognition result of the picture is a kitten, and the position of the kitten in the picture can be marked.
  • the classification information may also contain other information, for example, marking the detection frame in the picture, marking the position of the species in the detection frame; marking the species information in the detection frame, and so on.
  • A classifier such as the SVM classifier is a discriminative classifier defined by a separating hyperplane: given a set of labeled training samples, the algorithm outputs an optimal hyperplane used to classify new samples (test samples).
  • the trained classifier has a hyperplane, which separates each feature vector into different categories. The farther away the feature vector in each region is from the hyperplane, the better, which can make the classification more accurate.
  • the feature vector falls into a certain area, there will be a minimum distance from the hyperplane. It is judged whether this minimum distance is greater than the preset distance threshold. If it is less than the preset threshold, it indicates that there may be a certain error.
  • the image is preprocessed, and then the feature vector calculation and classification are performed again; if it is greater than, the classification result is accurate, and the classification information can be used as the final classification result.
  • the preprocessing can include operations such as contrast increase, color increase, and image noise removal.
  • According to an embodiment of this application, the feature extractor is a pre-trained feature extractor, set up by acquiring a large-scale object image data set, performing image training on it to obtain the deep convolutional neural network AlexNet, and acquiring a convolutional layer of that network;
  • the convolutional layer is used as the feature extractor.
  • The feature extractor is a preset feature extractor.
  • the feature extractor is a convolutional layer of a pre-trained deep convolutional neural network AlexNet.
  • the training of the deep convolutional neural network AlexNet includes: obtaining a large-scale object image data set, which has 1,000 object categories and 1.2 million training images. Perform image training on the large-scale object image data set to obtain a deep convolutional neural network AlexNet; then obtain the convolutional layer information of the deep convolutional neural network AlexNet, where the deep convolutional neural network AlexNet has multiple convolutional layers , This application only needs to obtain the convolutional layer of one layer and use it as a feature extractor.
  • FIG. 2 shows a layer structure diagram of the deep convolutional neural network AlexNet of the present application.
  • As shown in Fig. 2, the input of the deep convolutional neural network AlexNet is an image from one of 1000 different categories (such as cats, dogs, etc.), and the output is a vector of 1000 numbers.
  • The i-th element of the output vector is the probability that the input image belongs to the i-th class.
  • The first layer of the deep convolutional neural network AlexNet defines the size of the input image as 227 × 227 × 3.
  • The layer weights are pre-trained on the large-scale object image data set.
  • the first layer of the network learns filters used to capture blobs and edge features.
  • the middle layer is a series of five convolutional layers and three fully connected layers, interspersed with rectified linear units (ReLU) and maximum pooling layers.
  • the last layer is the classification layer, with 1000 classes.
  • the pre-trained CNN for the object image data set is used as the feature extractor.
  • the weight of the first convolutional layer of the deep convolutional neural network AlexNet is shown in Figure 3.
  • Instead of training on object images such as those in the STL-10 database, the CNN pre-trained on object images (i.e., AlexNet) is used as the feature extractor.
  • the 17th layer named "fc7" in the deep convolutional neural network AlexNet layer map is connected to the classifier as a feature extractor, that is, the feature vector obtained in the 17th layer is passed to the classifier.
  • the first layer of the deep convolutional neural network AlexNet includes a filter for capturing edge features
  • the middle layer of the deep convolutional neural network AlexNet includes multiple convolutional layers and a maximum pooling layer.
  • the last layer of the deep convolutional neural network AlexNet is a classification layer, and the weight of each layer of the deep convolutional neural network AlexNet is determined through training.
  • the seventeenth layer of the deep convolutional neural network AlexNet is used as a feature extractor, and the feature information obtained at the seventeenth layer of the deep convolutional neural network AlexNet is sent to the classifier.
  • Specifically, this application uses the seventeenth layer of the deep convolutional neural network AlexNet as the feature extractor, where AlexNet includes a layer named "fc7", that is, the "fc7" layer among the fully connected layers of the AlexNet network.
  • the feature vector obtained in the 17th layer is passed to the classifier.
  • the classifier is a classifier that has been pre-trained.
  • The classifier can use a support vector machine (SVM) for classification.
  • the classifier classifies data by finding the best hyperplane that separates all data points of one class from data points of other classes.
  • Support vector machines can represent complex surfaces, including polynomials and radial basis functions.
  • The best hyperplane is the one with the largest margin between the two classes; in other words, the margin is the maximum width of the slab parallel to the hyperplane that contains no interior data points, and the support vectors are the data points closest to the separating hyperplane.
  • It should be noted that the classifier is trained using the CNN features.
  • The acquired image feature information, that is, the feature vectors, is input into the classifier for training.
  • When high-dimensional CNN feature vectors (each of length 4096) are used, a stochastic gradient descent (SGD) solver is used to speed up training.
  • the test image features are extracted through CNN and passed to the SVM classifier.
  • Using the trained classifier as a classifier for image recognition can improve the accuracy of image recognition.
  • According to an embodiment of this application, performing image training on the large-scale object image data set to obtain the deep convolutional neural network AlexNet further includes:
  • using pattern enhancement technology to process the images in the large-scale object image data set to obtain processed images, and training on the processed images to obtain a new deep convolutional neural network AlexNet.
  • the mode enhancement technology includes one or more of rotation, tilt, elastic distortion, and cosine function enhancement.
  • The enhancement processing mentioned here can be pattern-distortion enhancement.
  • affine transformation and elastic distortion are well known in character recognition. They are used to generate new samples from the original samples and expand the training set. By applying an affine displacement field to the pattern, simple distortions such as translation, rotation, scaling, and tilt can be generated.
  • Elastic distortion is an image transformation that mimics the change of handwriting style.
  • Fig. 4 shows a schematic diagram of image enhancement performed by the cosine mode in this application.
  • this application uses a mode transformation method with a cosine function. Specific enhancement methods such as rotation, tilt and elastic distortion are also applied.
  • Figure 4 shows an example of the mode of cosine function enhancement.
  • The number of training images is increased by 31 times relative to the original 5k training samples.
  • By using the cosine function, the original pattern is left-aligned, right-aligned, top-aligned, or bottom-aligned, and the pattern is also center-aligned and expanded.
  • Fig. 5 shows a schematic diagram of image enhancement by rotating and tilting in this application.
  • this application uses image rotation processing to rotate the image at a certain angle; or adopts image tilt processing for the purpose of transforming the image into an image that presents a certain angle with the original image.
  • Image processing can also be carried out by elastic distortion to transform the original image into a distorted image.
  • a new deep convolutional neural network AlexNet is obtained.
  • Training on images with a certain amount of distortion can increase the probability of matching images in the network, thereby improving the accuracy of image recognition.
  • the present application also provides a computer-readable storage medium, the computer-readable storage medium includes an image processing program, and when the image processing program is executed by a processor, the steps of the above-mentioned image processing method are realized.
  • This application also provides an image processing device, including:
  • a collection module, which acquires image information; a feature extraction module, which uses a feature extractor to perform feature extraction according to the image information to obtain feature information;
  • a classification module, which sends the feature information to the classifier, where the classifier performs a calculation based on the feature information to obtain the minimum feature distance between the feature information and the hyperplane in the corresponding classifier;
  • a judging module, which judges whether the minimum feature distance is greater than a preset distance threshold and, if it is, sends a signal to the classification module, which then uses the classification information of the classifier as the final classification information.
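
A minimal object-oriented sketch of these four modules is shown below; the extractor (assumed to map an image to a (1, d) feature array) and the classifier are assumed to be supplied already trained, and the class layout is only illustrative, not the patent's implementation.

```python
from PIL import Image


class ImageProcessingDevice:
    """Collection, feature extraction, classification and judging modules."""

    def __init__(self, extractor, classifier, dist_threshold=0.8):
        self.extractor = extractor            # backend of the feature extraction module
        self.classifier = classifier          # backend of the classification module (SVM)
        self.dist_threshold = dist_threshold  # preset distance threshold

    def collect(self, path):
        """Collection module: acquire the image information."""
        return Image.open(path).convert("RGB")

    def extract(self, image):
        """Feature extraction module: image information -> feature vector."""
        return self.extractor(image)

    def classify(self, features):
        """Classification module: returns (label, minimum feature distance)."""
        dist = abs(float(self.classifier.decision_function(features)[0]))
        return self.classifier.predict(features)[0], dist

    def judge(self, path):
        """Judging module: accept the label only above the distance threshold."""
        label, dist = self.classify(self.extract(self.collect(path)))
        return label if dist > self.dist_threshold else None
```
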
  • the above-mentioned feature extraction module includes:
  • a data set acquisition unit, which acquires the large-scale object image data set;
  • a first training unit, which performs image training on the large-scale object image data set to obtain the deep convolutional neural network AlexNet;
  • a feature extractor acquisition unit, which acquires a convolutional layer of the deep convolutional neural network AlexNet and uses that convolutional layer as the feature extractor;
  • an image processing unit, which uses pattern enhancement technology to process the images in the large-scale object image data set to obtain processed images, where the training unit trains on the processed images to obtain a new deep convolutional neural network AlexNet,
  • and the feature extractor acquisition unit uses a convolutional layer of the new deep convolutional neural network AlexNet as the feature extractor.
  • the first layer of the deep convolutional neural network AlexNet includes a filter for capturing edge features
  • the middle layers of the deep convolutional neural network AlexNet include multiple convolutional layers and max pooling layers, and
  • the last layer of the deep convolutional neural network AlexNet is a classification layer, where the weights of each layer of the deep convolutional neural network AlexNet are determined through training.
  • the above classification module includes:
  • an image library, which stores images known to contain objects;
  • a tag library, which stores, in a set order, the different tags corresponding to different objects;
  • a training set construction unit: the collection module collects the image information of a set number of images in the image library, the feature extraction module extracts the feature information of the set number of images, and the feature information of the set number of images is used as the training set;
  • a validation set construction unit, which compares the objects contained in the set number of images with the objects corresponding to the tags in the tag library to obtain the tag sequence corresponding to each image, and uses the tag sequences of the set number of images as the validation set;
  • a second training unit, which inputs the training set into the support vector machine to obtain the support vectors and their minimum feature distances from the hyperplane, and obtains the final classification information of each image in the training set through the judging module,
  • where the classification information includes the tag sequence corresponding to the object in the image; the validation set is used for verification, and the support vector machine is iteratively trained to obtain the trained support vector machine.
  • The second training unit uses a stochastic gradient descent solver to train the support vector machine.
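
One way to realise the second training unit's iterative training against a validation set is sketched below with `SGDClassifier.partial_fit`; the stopping criterion and hyperparameters are assumptions, not values from the patent.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def train_with_validation(X_train, y_train, X_val, y_val,
                          epochs=20, target_acc=0.95, seed=0):
    """Iteratively train a linear SVM by SGD, verifying on the validation set."""
    rng = np.random.default_rng(seed)
    clf = SGDClassifier(loss="hinge", alpha=1e-4)
    classes = np.unique(y_train)
    for _ in range(epochs):
        order = rng.permutation(len(X_train))          # one shuffled pass
        clf.partial_fit(X_train[order], y_train[order], classes=classes)
        if clf.score(X_val, y_val) >= target_acc:      # check against validation set
            break
    return clf
```
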
  • With the image processing method, device, electronic device, and readable storage medium of this application, the minimum feature distance in the classifier is judged, and the classification information is used as the final classification result only when the distance is greater than the preset distance threshold, which increases the accuracy of image classification.
  • Using a convolutional layer of the neural network as the feature extractor to extract the feature information can further increase recognition accuracy.
  • This application also performs pattern-distortion enhancement on the training images, which further increases the accuracy of image recognition.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • The division of the units is only a logical functional division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
  • The coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
  • The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; they may be located in one place or distributed over multiple network units; some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • the functional units in the embodiments of the present application can all be integrated into one processing unit, or each unit can be individually used as a unit, or two or more units can be integrated into one unit;
  • the unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the foregoing program can be stored in a computer readable storage medium.
  • When executed, the program performs the steps of the foregoing method embodiments; and the foregoing storage medium includes various media that can store program code, such as removable storage devices, read-only memory (ROM), random access memory (RAM), magnetic disks, or optical disks.
  • Alternatively, if the above-mentioned integrated unit of this application is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • Based on this understanding, the technical solution of the embodiments of this application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the methods described in the various embodiments of the present application.
  • the aforementioned storage media include: removable storage devices, ROM, RAM, magnetic disks, or optical disks and other media that can store program codes.

Abstract

An image processing method and device, an electronic device, and a readable storage medium. The method includes: acquiring image information (S102); according to the image information, using a feature extractor to perform feature extraction to obtain feature information (S104); sending the feature information to a classifier (S106); and the classifier performing classification according to the feature information to obtain classification information. With this scheme, the minimum feature distance in the classifier is judged, and the classification information is used as the final classification result only when the distance is greater than a preset distance threshold, which increases the accuracy of image classification. Using a convolutional layer of the neural network as the feature extractor to extract the feature information further increases recognition accuracy. During image training, pattern-distortion enhancement is performed on the images, further increasing the accuracy of image recognition.

Description

一种图像处理方法、装置、电子设备和可读存储介质
本申请要求申请号为201910114848.X,申请日为2019年2月14日,发明创造名称为“一种图像处理方法、装置和可读存储介质”的专利申请的优先权。
技术领域
本申请涉及人工智能领域的图像处理技术领域,具体而言,涉及一种基于深度神经网络的图像处理方法、装置、电子设备和计算机可读存储介质。
背景技术
随着机器学习技术的高速发展,卷积神经网络越来越多的应用于计算机视觉中,尤其是图像分类领域。
卷积神经网络(CNN)是一种在深度学习领域中使用的强大的机器学习技术。简单的神经网络已被用作字符识别和图像分类领域的机器学习技术。然而,训练这样一个神经网络需要大量的标记数据。在不花费时间和精力进行训练的情况下,利用CNN功能的简单方法是使用预先训练的CNN作为特征提取器。为了生成复杂的决策曲面,支持向量机(SVM)是一种非常经济的方法,用于表示高维空间中的复杂曲面,包括多项式和其他类型的曲面。
相关技术中,卷积神经网络是一种在深度学习领域中使用的强大的机器学习技术。简单的卷积神经网络已被用作字符识别和图像分类领域的机器学习技术。但是,现有技术中采用简单的CNN和SVM技术对图像的分类精度并不高,所以设计一种能够结合二者的技术以用来提升图像识别分类精度的方案是亟不可待的。
申请内容
为了解决上述至少一个技术问题,本申请提出了一种图像处理方法、装置、电子设备和可读存储介质。
本申请第一方面提供了一种图像处理方法,包括:
获取图像信息;
根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
将所述特征信息发送至分类器;
分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
判断所述最小特征距离是否大于预设的距离阈值;
若大于,则将所述分类器的分类信息作为最终分类信息。
本申请的第二方面提供一种图像处理装置,该装置包括:
采集模块,获取图像信息;
特征提取模块,根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
分类模块,将所述特征信息发送至分类器,分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
判断模块,判断所述最小特征距离是否大于预设的距离阈值,若大于,发送信号给分类模块,分类模块将所述分类器的分类信息作为最终分类信息。
本申请第三方面提供一种电子设备,包括:存储器、处理器,所述存储器中包括图像处理程序,所述图像处理程序被所述处理器执行时实现如下步骤:
获取图像信息;
根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
将所述特征信息发送至分类器;
分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
判断所述最小特征距离是否大于预设的距离阈值;
若大于,则将所述分类器的分类信息作为最终分类信息。
本申请第四方面提供一种计算机非易失性可读存储介质,所述计算机可读存储介质中包括图像处理程序,所述图像处理程序被处理器执行时,实现如上述图像处理方法的步骤。
通过本申请涉及的图像处理方法、装置、电子设备和可读存储介质,对分类器中的最小特征距离进行判断,当大于预设的距离阈值时才将分类信息作为最终的分类结果,可以增加图像分类的准确性。并且本申请将神经网络中的卷积层作为特征提取器进行特征信息的提取,可以进一步增加识别的准确度。本申请还在图像训练过程中,针对图像进行模式失真增强,进一步增加了图像识别的准确度。
附图说明
图1示出了本申请图像处理方法的流程图;
图2示出了本申请深度卷积神经网络AlexNet的层结构图;
图3示出了本申请深度卷积神经网络AlexNet的第一层权重结构图;
图4示出了本申请通过余弦模式进行图像增强的示意图;
图5示出了本申请通过旋转倾斜进行图像增强的示意图;
图6示出了本申请电子设备的结构框图。
具体实施方式
为了能够更清楚地理解本申请的上述目的、特征和优点,下面结合附图和具体实施方式对本申请进行进一步的详细描述。需要说明的是,在不冲突的情况下,本申请的实施例及实施例中的特征可以相互组合。
在下面的描述中阐述了很多具体细节以便于充分理解本申请,但是,本申请还可以采用其他不同于在此描述的其他方式来实施,因此,本申请的保护范围并不受下面公开的具体实施例的限制。
图1示出了本申请图像处理方法的流程图。
如图1所示,本申请公开了一种图像处理方法,包括:
S102,获取图像信息;
S104,根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
S106,将所述特征信息发送至分类器;
S108,分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
S110,判断所述最小特征距离是否大于预设的距离阈值;
S112,若大于,则将所述分类器的分类信息作为最终分类信息。
需要说明的是,在获取了图片之后,也就是获取了图像信息,便输入特征提取器进行特征提取,得到特征信息。也就是说,特征提取器将输入的原始图像变换为特征向量,所述特征向量便是特征信息。然后将特征信息发送至分类器中,进行分类处理,最终将输入的图像输出为多个分类内容。例如,输入一个包含小猫的图片,则先获取此图片的图像信息,然后根据此图像信息进行特征的提取,得到特征信息,将特征信息发送至分类器中进行分类识别,最终输出分类信息,分类信息中显示图片的识别结果为小猫,并且可以标记小猫所在图片中的位置。当然,分类信息还可以包含其他的信息,例如,在图片中标记检测框,检测框圈定物种的位置;在检测框中标注物种信息等等。本领域技术人员可根据实际需要设定分类信息的种类,但任何基于本申请的技术方案输出分类信息的方法都将落入本申请保护范围内。
需要说明的是,分类器,例如SVM分类器,是一个由分类超平面定义的判别分类器。也就是说给定一组带标签的训练样本,算法将会输出一个最优超平面对新样本(测试样本)进行分类。也就是说,经过训练的分类器存在一个超平面,这个超平面将每个的特征向量分隔为不同的类别,每个区域中的特征向量越远离超平面越好,这样能够使得分类更加准确。当特征向量落入某个区域时,会与所述超平面存在一个最小距离,判断这个最小距离是否大于预设的距离阈值,若小于预设的阈值,则表明可能存在一定的误差,可以进 行图像预处理,然后重新进行特征向量计算和分类;若大于,则表明分类结果准确,可以将分类信息作为最终的分类结果。其中预处理可以包含对比度增加、颜色增加、消除图像噪声等操作。
根据本申请实施例,所述特征提取器为经过预先训练设置的特征提取器,包括:
获取大规模物体图像数据集;
将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;
获取所述深度卷积神经网络AlexNet的卷积层;
将所述卷积层作为特征提取器。
需要说明的是,所属特征提取器为预先设置的特征提取器。所述特征提取器是预先训练的深度卷积神经网络AlexNet的卷积层。在训练深度卷积神经网络AlexNet时包括:获取大规模物体图像数据集,该大规模物体图像数据集有1000个对象类别和120万个训练图像。针对所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;然后获取所述深度卷积神经网络AlexNet的卷积层信息,其中深度卷积神经网络AlexNet具备多个卷积层,本申请只需要获取其中一层的卷积层,将其作为特征提取器。
具体的,图2示出了本申请深度卷积神经网络AlexNet的层结构图。如图2所示,深度卷积神经网络AlexNet输入是1000个不同类型图像(如猫、狗等)中的一个图像,输出是1000个数字的矢量。输出向量的第i个元素即为输入图像属于第i类图像的概率。其中,深度卷积神经网络AlexNet的第一层将输入图像的尺寸定义为227×227×3。层重是针对大规模物体图像数据集预先训练的。网络的第一层学习用于捕获blob和边缘特征的过滤器。中间层是一系列五个卷积层和三个完全连接的层,散布有整流线性单元(ReLU)和最大池层。最后一层是分类层,有1000个类。用于对象图像数据集的预训练CNN用作特征提取器。深度卷积神经网络AlexNet的第一卷积层权重如图3所示。不是训练诸如STL-10数据库中的对象图像,而是将对象图像预训练的CNN(即Alex-Net)用作特征提取器。深度卷积神经网络AlexNet层图中名为“fc7”的第17层作为特征提取器连接到分类器,也就是将在第17层获得的特征向量被传递到分类器。
根据本申请实施例,深度卷积神经网络AlexNet的第一层包括用于捕获边缘特征的过滤器,所述深度卷积神经网络AlexNet的中间层包括多个卷积层和最大池层,所述深度卷积神经网络AlexNet的最后一层为分类层,所述深度卷积神经网络AlexNet的各层的权重通过训练确定。
根据本申请实施例,采用所述深度卷积神经网络AlexNet的第十七层作为特征提取器,将在所述深度卷积神经网络AlexNet的第十七层得到的特征信息发送至分类器中。
需要说明的是,本申请采用所述深度卷积神经网络AlexNet的第十七层作为特征提取器,其中在深度卷积神经网络AlexNet包括命名“fc7”的层,也就是在AlexNet网络中fully connected的全连接层中“fc7”的层。将在第17层获得的特征向量被传递到分类器中。
根据本申请实施例,所述分类器为经过预先训练设置的分类器。
需要说明的是,分类器可以采用支持向量机(Support Vector Machine,SVM)进行分类。所述分类器通过查找将一个类的所有数据点与其他类的数据点分开的最佳超平面来对数据进行分类,支持向量机(分类器)可以表示复杂的表面,包括多项式和径向基函数。最好的超平面是两个类之间最大的超平面,换句话说,边距是平行于超平面的平板的最大宽度,其没有内部数据点,其中,支持向量是最接近分离超平面的数据点。
优选地,上述分类器的训练步骤包括:
构建图像库,存储已知含有物体的图像;
构建标签库,按照设定顺序存储有不同物体对应的不同标签;
获取图像库中设定数量的图像的图像信息,特征提取模块提取所述设定数量的图像的特征信息,将所述设定数量的图像的特征信息作为训练集;
将所述的设定数量的图像中含有的物体与标签库中标签对应的物体比对,获得每一个图像对应的标签序列,将所述设定数量的图像的标签序列作为验证集;
将训练集输入支持向量机,获得支持向量及其与超平面的最小特征距离,通过判断模块获得训练集中每个图像的最终分类信息,所述分类信息包括图像中的物体对应的标签序列,利用验证集进行验证,对支持向量机进行迭代训练,获得训练后的支持向量机。
需要说明的是,使用卷积神经网络功能训练分类器。将获取的图像特征信息,也就是特征向量输入分类器进行训练。当使用高维CNN特征向量(每个长度为4096层)时,使用随机梯度下降(SGD)求解器来加速训练。为了测量训练的SVM分类器的分类准确度,通过CNN提取测试图像特征并将其传递给SVM分类器。将训练后的分类器作为图像识别的分类器,可以提高图像识别的准确率。
根据本申请实施例,所述将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet,还包括:
采用模式增强技术对大规模物体图像数据集中的图像进行处理,得到处理后的图像;
针对所述处理后的图像进行训练,得到新的深度卷积神经网络AlexNet。
需要说明的是,在进行图像训练,得到深度卷积神经网络AlexNet时,还可以将图像进行增强处理。这里所说的增强处理,可以是模式失真增强。其中作为数据增强方法,仿射变换和弹性失真在字符识别中是众所周知的。它们用于从原始样本生成新样本并扩展训练集。通过将仿射位移场应用于模式,可以生成诸如平移,旋转,缩放和倾斜的简单失真。弹性失真是一种模仿手写风格变化的图像变换。
具体的,所述模式增强技术包括旋转、倾斜、弹性扭曲、余弦函数增强中的一种或几种。本领域技术人员应当明了,本申请并不仅仅限定以上的增强模式,任何通过本申请的技术方案进行模式增强的方法都将落入本申请保护范围内。
图4示出了本申请通过余弦模式进行图像增强的示意图。
如图4所示,本申请通过使用具有余弦函数的模式变换方法。具体还应用了一些增强方法,如旋转,倾斜和弹性扭曲。图4显示了余弦函数增强的模式示例。训练图像的数量增加了原始5k训练样本的31倍。通过使用余弦函数,原始模式是左对齐,右对齐,顶对齐或底对齐,模式也是中心对齐和扩大的。
图5示出了本申请通过旋转倾斜进行图像增强的示意图。
如图5所示,本申请通过采用图像的旋转处理,将图像进行一定角度的转动;或者采用图像倾斜的处理,目的均为将图像变换为与原图像呈现一定的角度的图像。还可以通过弹性扭曲的方式进行图像处理,将原始图像变换为扭曲的图像。
通过采用处理后的图像进行训练,得到新的深度卷积神经网络AlexNet。本申请通过处理后的一定失真图像进行训练,可以增加在网络中匹配图像的几率,达到提高图像识别的准确性的目的。
图6示出了本申请电子设备的结构框图。
如图6所示,本申请第二方面提供一种电子设备6,包括:存储器61、处理器62,所述存储器中包括图像处理程序,所述图像处理程序被所述处理器执行时实现如下步骤:
获取图像信息;
根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
将所述特征信息发送至分类器;
分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
判断所述最小特征距离是否大于预设的距离阈值;
若大于,则将所述分类器的分类信息作为最终分类信息。
需要说明的是,在获取了图片之后,也就是获取了图像信息,便输入特征提取器进行特征提取,得到特征信息。也就是说,特征提取器将输入的原始图像变换为特征向量,所述特征向量便是特征信息。然后将特征信息发送至分类器中,进行分类处理,最终将输入的图像输出为多个分类内容。例如,输入一个包含小猫的图片,则先获取此图片的图像信息,然后根据此图像信息进行特征的提取,得到特征信息,将特征信息发送至分类器中进行分类识别,最终输出分类信息,分类信息中显示图片的识别结果为小猫,并且可以标记小猫所在图片中的位置。当然,分类信息还可以包含其他的信息,例如,在图片中标记检测框,检测框圈定物种的位置;在检测框中标注物种信息等等。本领域技术人员可根据实际需要设定分类信息的种类,但任何基于本申请的技术方案输出分类信息的方法都将落入本申请保护范围内。
需要说明的是,分类器,例如SVM分类器,是一个由分类超平面定义的判别分类器。也就是说给定一组带标签的训练样本,算法将会输出一个最优超平面对新样本(测试样本)进行分类。也就是说,经过训练的分类器存在一个超平面,这个超平面将每个的特征向量分隔为不同的类别,每个区域中的特征向量越远离超平面越好,这样能够使得分类更加准确。当特征向量落入某个区域时,会与所述超平面存在一个最小距离,判断这个最小距离是否大于预设的距离阈值,若小于预设的阈值,则表明可能存在一定的误差,可以进行图像预处理,然后重新进行特征向量计算和分类;若大于,则表明分类结果准确,可以将分类信息作为最终的分类结果。其中预处理可以包含对比度增加、颜色增加、消除图像噪声等操作。
根据本申请实施例,所述特征提取器为经过预先训练设置的特征提取器,包括:
获取大规模物体图像数据集;
将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;
获取所述深度卷积神经网络AlexNet的卷积层;
将所述卷积层作为特征提取器。
需要说明的是,所属特征提取器为预先设置的特征提取器。所述特征提取器是预先训练的深度卷积神经网络AlexNet的卷积层。在训练深度卷积神经 网络AlexNet时包括:获取大规模物体图像数据集,该大规模物体图像数据集有1000个对象类别和120万个训练图像。针对所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;然后获取所述深度卷积神经网络AlexNet的卷积层信息,其中深度卷积神经网络AlexNet具备多个卷积层,本申请只需要获取其中一层的卷积层,将其作为特征提取器。
具体的,图2示出了本申请深度卷积神经网络AlexNet的层结构图。如图2所示,深度卷积神经网络AlexNet输入是1000个不同类型图像(如猫、狗等)中的一个图像,输出是1000个数字的矢量。输出向量的第i个元素即为输入图像属于第i类图像的概率。其中,深度卷积神经网络AlexNet的第一层将输入图像的尺寸定义为227×227×3。层重是针对大规模物体图像数据集预先训练的。网络的第一层学习用于捕获blob和边缘特征的过滤器。中间层是一系列五个卷积层和三个完全连接的层,散布有整流线性单元(ReLU)和最大池层。最后一层是分类层,有1000个类。用于对象图像数据集的预训练CNN用作特征提取器。深度卷积神经网络AlexNet的第一卷积层权重如图3所示。不是训练诸如STL-10数据库中的对象图像,而是将对象图像预训练的CNN(即Alex-Net)用作特征提取器。深度卷积神经网络AlexNet层图中名为“fc7”的第17层作为特征提取器连接到分类器,也就是将在第17层获得的特征向量被传递到分类器。
根据本申请实施例,深度卷积神经网络AlexNet的第一层包括用于捕获边缘特征的过滤器,所述深度卷积神经网络AlexNet的中间层包括多个卷积层和最大池层,所述深度卷积神经网络AlexNet的最后一层为分类层,所述深度卷积神经网络AlexNet的各层的权重通过训练确定。
根据本申请实施例,采用所述深度卷积神经网络AlexNet的第十七层作为特征提取器,将在所述深度卷积神经网络AlexNet的第十七层得到的特征信息发送至分类器中。
需要说明的是,本申请采用所述深度卷积神经网络AlexNet的第十七层作为特征提取器,其中在深度卷积神经网络AlexNet包括命名“fc7”的层,也就是在AlexNet网络中fully connected的全连接层中“fc7”的层。将在第17层获得的特征向量被传递到分类器中。
根据本申请实施例,所述分类器为经过预先训练设置的分类器。
需要说明的是,分类器可以采用支持向量机(Support Vector Machine,SVM)进行分类。所述分类器通过查找将一个类的所有数据点与其他类的数据点分开的最佳超平面来对数据进行分类,支持向量机(分类器)可以表示复杂的表面,包括多项式和径向基函数。最好的超平面是两个类之间最大的超平面, 换句话说,边距是平行于超平面的平板的最大宽度,其没有内部数据点,其中,支持向量是最接近分离超平面的数据点。
需要说明的是,使用卷积神经网络功能训练分类器。将获取的图像特征信息,也就是特征向量输入分类器进行训练。当使用高维CNN特征向量(每个长度为4096层)时,使用随机梯度下降(SGD)求解器来加速训练。为了测量训练的SVM分类器的分类准确度,通过CNN提取测试图像特征并将其传递给SVM分类器。将训练后的分类器作为图像识别的分类器,可以提高图像识别的准确率。
根据本申请实施例,所述将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet,还包括:
采用模式增强技术对大规模物体图像数据集中的图像进行处理,得到处理后的图像;
针对所述处理后的图像进行训练,得到新的深度卷积神经网络AlexNet。
具体的,所述模式增强技术包括旋转、倾斜、弹性扭曲、余弦函数增强中的一种或几种。本领域技术人员应当明了,本申请并不仅仅限定以上的增强模式,任何通过本申请的技术方案进行模式增强的方法都将落入本申请保护范围内。
需要说明的是,在进行图像训练,得到深度卷积神经网络AlexNet时,还可以将图像进行增强处理。这里所说的增强处理,可以是模式失真增强。其中作为数据增强方法,仿射变换和弹性失真在字符识别中是众所周知的。它们用于从原始样本生成新样本并扩展训练集。通过将仿射位移场应用于模式,可以生成诸如平移,旋转,缩放和倾斜的简单失真。弹性失真是一种模仿手写风格变化的图像变换。
图4示出了本申请通过余弦模式进行图像增强的示意图。
如图4所示,本申请通过使用具有余弦函数的模式变换方法。具体还应用了一些增强方法,如旋转,倾斜和弹性扭曲。图4显示了余弦函数增强的模式示例。训练图像的数量增加了原始5k训练样本的31倍。通过使用余弦函数,原始模式是左对齐,右对齐,顶对齐或底对齐,模式也是中心对齐和扩大的。
图5示出了本申请通过旋转倾斜进行图像增强的示意图。
如图5所示,本申请通过采用图像的旋转处理,将图像进行一定角度的转动;或者采用图像倾斜的处理,目的均为将图像变换为与原图像呈现一定的角度的图像。还可以通过弹性扭曲的方式进行图像处理,将原始图像变换为扭曲的图像。
通过采用处理后的图像进行训练,得到新的深度卷积神经网络AlexNet。本申请通过处理后的一定失真图像进行训练,可以增加在网络中匹配图像的几率,达到提高图像识别的准确性的目的。
本申请还提供一种计算机可读存储介质,所述计算机可读存储介质中包括图像处理程序,所述图像处理程序被处理器执行时,实现如上述图像处理方法的步骤。
本申请还提供一种图像处理装置,包括:
采集模块,获取图像信息;
特征提取模块,根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
分类模块,将所述特征信息发送至分类器,分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
判断模块,判断所述最小特征距离是否大于预设的距离阈值,若大于,发送信号给分类模块,分类模块将所述分类器的分类信息作为最终分类信息。
优选地,上述特征提取模块包括:
数据集获取单元,获取大规模物体图像数据集;
第一训练单元,将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;
特征提取器获取单元,获取所述深度卷积神经网络AlexNet的卷积层,将所述卷积层作为特征提取器。
进一步优选地,还包括:
图像处理单元,采用模式增强技术对大规模物体图像数据集中的图像进行处理,得到处理后的图像,其中,训练单元针对所述处理后的图像进行训练,得到新的深度卷积神经网络AlexNet,特征提取器获得单元将所属新的深度卷积神经网络AlexNet卷积层作为特征提取器。
此外,优选地,所述深度卷积神经网络AlexNet的第一层包括用于捕获边缘特征的过滤器,所述深度卷积神经网络AlexNet的中间层包括多个卷积层和最大池层,所述深度卷积神经网络AlexNet的最后一层为分类层,所述深度卷积神经网络AlexNet的各层的权重通过训练确定。
优选地,上述分类模块包括:
图像库,存储已知含有物体的图像;
标签库,按照设定顺序存储有不同物体对应的不同标签;
训练集构建单元,采集模块采集图像库中设定数量的图像的图像信息,特征提取模块提取所述设定数量的图像的特征信息,将所述设定数量的图像 的特征信息作为训练集;
验证集构建单元,将所述的设定数量的图像中含有的物体与标签库中标签对应的物体比对,获得每一个图像对应的标签序列,将所述设定数量的图像的标签序列作为验证集;
第二训练单元,将训练集输入支持向量机,获得支持向量及其与超平面的最小特征距离,通过判断模块获得训练集中每个图像的最终分类信息,所述分类信息包括图像中的物体对应的标签序列,利用验证集进行验证,对支持向量机进行迭代训练,获得训练后的支持向量机。
进一步,优选地,所述第二训练单元使用随机梯度下降求解器对支持向量机进行训练。
通过本申请涉及的图像处理方法、装置、电子设备和可读存储介质,对分类器中的最小特征距离进行判断,当大于预设的距离阈值时才将分类信息作为最终的分类结果,可以增加图像分类的准确性。并且本申请将神经网络中的卷积层作为特征提取器进行特征信息的提取,可以进一步增加识别的准确度。本申请还在图像训练过程中,针对图像进行模式失真增强,进一步增加了图像识别的准确度。
在本申请所提供的几个实施例中,应该理解到,所揭露的设备和方法,可以通过其它的方式实现。以上所描述的设备实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,如:多个单元或组件可以结合,或可以集成到另一个装置,或一些特征可以忽略,或不执行。另外,所显示或讨论的各组成部分相互之间的耦合、或直接耦合、或通信连接可以是通过一些接口,设备或单元的间接耦合或通信连接,可以是电性的、机械的或其它形式的。
上述作为分离部件说明的单元可以是、或也可以不是物理上分开的,作为单元显示的部件可以是、或也可以不是物理单元;既可以位于一个地方,也可以分布到多个网络单元上;可以根据实际的需要选择其中的部分或全部单元来实现本实施例方案的目的。
另外,在本申请各实施例中的各功能单元可以全部集成在一个处理单元中,也可以是各单元分别单独作为一个单元,也可以两个或两个以上单元集成在一个单元中;上述集成的单元既可以采用硬件的形式实现,也可以采用硬件加软件功能单元的形式实现。
本领域普通技术人员可以理解:实现上述方法实施例的全部或部分步骤可以通过程序指令相关的硬件来完成,前述的程序可以存储于计算机可读取存储介质中,该程序在执行时,执行包括上述方法实施例的步骤;而前述的 存储介质包括:移动存储设备、只读存储器(ROM,Read-Only Memory)、随机存取存储器(RAM,Random Access Memory)、磁碟或者光盘等各种可以存储程序代码的介质。
或者,本申请上述集成的单元如果以软件功能模块的形式实现并作为独立的产品销售或使用时,也可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请实施例的技术方案本质上或者说对现有技术做出贡献的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机、服务器、或者网络设备等)执行本申请各个实施例所述方法的全部或部分。而前述的存储介质包括:移动存储设备、ROM、RAM、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。

Claims (20)

  1. 一种图像处理方法,其特征在于,包括:
    获取图像信息;
    根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
    将所述特征信息发送至分类器;
    分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
    判断所述最小特征距离是否大于预设的距离阈值;
    若大于,则将所述分类器的分类信息作为最终分类信息。
  2. 根据权利要求1所述的图像处理方法,其特征在于,所述特征提取器为经过预先训练设置的特征提取器,包括:
    获取大规模物体图像数据集;
    将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;
    获取所述深度卷积神经网络AlexNet的卷积层;
    将所述卷积层作为特征提取器。
  3. 根据权利要求2所述的图像处理方法,其特征在于:
    深度卷积神经网络AlexNet的第一层包括用于捕获边缘特征的过滤器,所述深度卷积神经网络AlexNet的中间层包括多个卷积层和最大池层,所述深度卷积神经网络AlexNet的最后一层为分类层,所述深度卷积神经网络AlexNet的各层的权重通过训练确定。
  4. 根据权利要求2所述的图像处理方法,其特征在于:
    采用所述深度卷积神经网络AlexNet的第十七层作为特征提取器,将在所述深度卷积神经网络AlexNet的第十七层得到的特征信息发送至分类器中。
  5. 根据权利要求1所述的图像处理方法,其特征在于,所述分类器为经过预先训练设置的分类器。
  6. 根据权利要求5所述的图像处理方法,其特征在于,所述分类器采用支持向量机,所述分类器的训练步骤包括:
    构建图像库,存储已知含有物体的图像;
    构建标签库,按照设定顺序存储有不同物体对应的不同标签;
    获取图像库中设定数量的图像的图像信息,特征提取模块提取所述设定数量的图像的特征信息,将所述设定数量的图像的特征信息作为训练集;
    将所述的设定数量的图像中含有的物体与标签库中标签对应的物体比对,获得每一个图像对应的标签序列,将所述设定数量的图像的标签序列作为验 证集;
    将训练集输入支持向量机,获得支持向量及其与超平面的最小特征距离,通过判断模块获得训练集中每个图像的最终分类信息,所述分类信息包括图像中的物体对应的标签序列,利用验证集进行验证,对支持向量机进行迭代训练,获得训练后的支持向量机。
  7. 根据根据权利要求6所述的图像处理方法,其特征在于,还包括:
    使用随机梯度下降求解器对支持向量机进行训练。
  8. 根据权利要求2所述的图像处理方法,其特征在于,所述将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet,还包括:
    采用模式增强技术对大规模物体图像数据集中的图像进行处理,得到处理后的图像;
    针对所述处理后的图像进行训练,得到新的深度卷积神经网络AlexNet,将所属新的深度卷积神经网络AlexNet卷积层作为特征提取器。
  9. 一种图像处理装置,其特征在于,包括:
    采集模块,获取图像信息;
    特征提取模块,根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
    分类模块,将所述特征信息发送至分类器,分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
    判断模块,判断所述最小特征距离是否大于预设的距离阈值,若大于,发送信号给分类模块,分类模块将所述分类器的分类信息作为最终分类信息。
  10. 根据权利要求9所述的图像处理装置,其特征在于,所述特征提取模块包括:
    数据集获取单元,获取大规模物体图像数据集;
    第一训练单元,将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;
    特征提取器获取单元,获取所述深度卷积神经网络AlexNet的卷积层,将所述卷积层作为特征提取器。
  11. 根据权利要求10所述的图像处理装置,其特征在于,还包括:
    图像处理单元,采用模式增强技术对大规模物体图像数据集中的图像进行处理,得到处理后的图像,其中,训练单元针对所述处理后的图像进行训练,得到新的深度卷积神经网络AlexNet,特征提取器获得单元将所属新的深度卷积神经网络AlexNet卷积层作为特征提取器。
  12. 根据权利要求10所述的图像处理装置,其特征在于,所述深度卷积神 经网络AlexNet的第一层包括用于捕获边缘特征的过滤器,所述深度卷积神经网络AlexNet的中间层包括多个卷积层和最大池层,所述深度卷积神经网络AlexNet的最后一层为分类层,所述深度卷积神经网络AlexNet的各层的权重通过训练确定。
  13. 根据权利要求9所述的图像处理装置,其特征在于,所述分类模块包括:
    图像库,存储已知含有物体的图像;
    标签库,按照设定顺序存储有不同物体对应的不同标签;
    训练集构建单元,采集模块采集图像库中设定数量的图像的图像信息,特征提取模块提取所述设定数量的图像的特征信息,将所述设定数量的图像的特征信息作为训练集;
    验证集构建单元,将所述的设定数量的图像中含有的物体与标签库中标签对应的物体比对,获得每一个图像对应的标签序列,将所述设定数量的图像的标签序列作为验证集;
    第二训练单元,将训练集输入支持向量机,获得支持向量及其与超平面的最小特征距离,通过判断模块获得训练集中每个图像的最终分类信息,所述分类信息包括图像中的物体对应的标签序列,利用验证集进行验证,对支持向量机进行迭代训练,获得训练后的支持向量机。
  14. 根据权利要求13所述的图像处理装置,其特征在于,所述第二训练单元使用随机梯度下降求解器对支持向量机进行训练。
  15. 一种电子设备,其特征在于,包括:存储器、处理器,所述存储器中包括图像处理程序,所述图像处理程序被所述处理器执行时实现如下步骤:
    获取图像信息;
    根据所述图像信息,利用特征提取器进行特征提取,得到特征信息;
    将所述特征信息发送至分类器;
    分类器根据所述特征信息进行计算,得到特征信息与对应的分类器中超平面的最小特征距离;
    判断所述最小特征距离是否大于预设的距离阈值;
    若大于,则将所述分类器的分类信息作为最终分类信息。
  16. 根据权利要求15所述的电子设备,其特征在于,所述特征提取器为经过预先训练设置的特征提取器,包括:
    获取大规模物体图像数据集;
    将所述大规模物体图像数据集进行图像训练,得到深度卷积神经网络AlexNet;
    获取所述深度卷积神经网络AlexNet的卷积层;
    将所述卷积层作为特征提取器。
  17. 根据权利要求16所述的电子设备,其特征在于:
    深度卷积神经网络AlexNet的第一层包括用于捕获边缘特征的过滤器,所述深度卷积神经网络AlexNet的中间层包括多个卷积层和最大池层,所述深度卷积神经网络AlexNet的最后一层为分类层,所述深度卷积神经网络AlexNet的各层的权重通过训练确定。
  18. 根据权利要求17所述的电子设备,其特征在于,所述深度卷积神经网络AlexNet的第十七层作为特征提取器。
  19. 根据权利要求15所述的电子设备,其特征在于,所述分类器为经过预先训练设置的分类器。
  20. 一种计算机非易失性可读存储介质,其特征在于,所述计算机可读存储介质中包括图像处理程序,所述图像处理程序被处理器执行时,实现如权利要求1至8中任一项所述的一种图像处理方法的步骤。
PCT/CN2019/118277 2019-02-14 2019-11-14 一种图像处理方法、装置、电子设备和可读存储介质 WO2020164278A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910114848.XA CN109993201A (zh) 2019-02-14 2019-02-14 一种图像处理方法、装置和可读存储介质
CN201910114848.X 2019-02-14

Publications (1)

Publication Number Publication Date
WO2020164278A1 true WO2020164278A1 (zh) 2020-08-20

Family

ID=67130148

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118277 WO2020164278A1 (zh) 2019-02-14 2019-11-14 一种图像处理方法、装置、电子设备和可读存储介质

Country Status (2)

Country Link
CN (1) CN109993201A (zh)
WO (1) WO2020164278A1 (zh)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112345531A (zh) * 2020-10-19 2021-02-09 国网安徽省电力有限公司电力科学研究院 一种基于仿生机器鱼的变压器故障检测方法
CN112668449A (zh) * 2020-12-24 2021-04-16 杭州电子科技大学 一种室外自主移动机器人的低风险地貌识别方法
CN113129279A (zh) * 2021-04-08 2021-07-16 合肥工业大学 一种复合绝缘子鸟啄损伤风险等级评估方法
CN113642655A (zh) * 2021-08-18 2021-11-12 杭州电子科技大学 基于支持向量机和卷积神经网络的小样本图像分类方法
CN115713763A (zh) * 2022-11-25 2023-02-24 青海卓旺智慧信息科技有限公司 一种基于深度学习的土豆图像识别系统
CN113129279B (zh) * 2021-04-08 2024-04-30 合肥工业大学 一种复合绝缘子鸟啄损伤风险等级评估方法

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109993201A (zh) * 2019-02-14 2019-07-09 平安科技(深圳)有限公司 一种图像处理方法、装置和可读存储介质
CN110570419A (zh) * 2019-09-12 2019-12-13 杭州依图医疗技术有限公司 特征信息的获取方法、装置和存储介质
CN113239739B (zh) * 2021-04-19 2023-08-01 深圳市安思疆科技有限公司 一种佩戴物的识别方法及识别装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030086593A1 (en) * 2001-05-31 2003-05-08 Chengjun Liu Feature based classification
CN104361096A (zh) * 2014-11-20 2015-02-18 合肥工业大学 一种基于特征富集区域集合的图像检索方法
CN106372648A (zh) * 2016-10-20 2017-02-01 中国海洋大学 基于多特征融合卷积神经网络的浮游生物图像分类方法
CN106934339A (zh) * 2017-01-19 2017-07-07 上海博康智能信息技术有限公司 一种目标跟踪、跟踪目标识别特征的提取方法和装置
CN109993201A (zh) * 2019-02-14 2019-07-09 平安科技(深圳)有限公司 一种图像处理方法、装置和可读存储介质

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109033107B (zh) * 2017-06-09 2021-09-17 腾讯科技(深圳)有限公司 图像检索方法和装置、计算机设备和存储介质
CN109145143A (zh) * 2018-08-03 2019-01-04 厦门大学 图像检索中的序列约束哈希算法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030086593A1 (en) * 2001-05-31 2003-05-08 Chengjun Liu Feature based classification
CN104361096A (zh) * 2014-11-20 2015-02-18 合肥工业大学 一种基于特征富集区域集合的图像检索方法
CN106372648A (zh) * 2016-10-20 2017-02-01 中国海洋大学 基于多特征融合卷积神经网络的浮游生物图像分类方法
CN106934339A (zh) * 2017-01-19 2017-07-07 上海博康智能信息技术有限公司 一种目标跟踪、跟踪目标识别特征的提取方法和装置
CN109993201A (zh) * 2019-02-14 2019-07-09 平安科技(深圳)有限公司 一种图像处理方法、装置和可读存储介质

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112345531A (zh) * 2020-10-19 2021-02-09 国网安徽省电力有限公司电力科学研究院 一种基于仿生机器鱼的变压器故障检测方法
CN112345531B (zh) * 2020-10-19 2024-04-09 国网安徽省电力有限公司电力科学研究院 一种基于仿生机器鱼的变压器故障检测方法
CN112668449A (zh) * 2020-12-24 2021-04-16 杭州电子科技大学 一种室外自主移动机器人的低风险地貌识别方法
CN113129279A (zh) * 2021-04-08 2021-07-16 合肥工业大学 一种复合绝缘子鸟啄损伤风险等级评估方法
CN113129279B (zh) * 2021-04-08 2024-04-30 合肥工业大学 一种复合绝缘子鸟啄损伤风险等级评估方法
CN113642655A (zh) * 2021-08-18 2021-11-12 杭州电子科技大学 基于支持向量机和卷积神经网络的小样本图像分类方法
CN113642655B (zh) * 2021-08-18 2024-02-13 杭州电子科技大学 基于支持向量机和卷积神经网络的小样本图像分类方法
CN115713763A (zh) * 2022-11-25 2023-02-24 青海卓旺智慧信息科技有限公司 一种基于深度学习的土豆图像识别系统

Also Published As

Publication number Publication date
CN109993201A (zh) 2019-07-09

Similar Documents

Publication Publication Date Title
WO2020164278A1 (zh) 一种图像处理方法、装置、电子设备和可读存储介质
Asghar et al. Copy-move and splicing image forgery detection and localization techniques: a review
CN110348319B (zh) 一种基于人脸深度信息和边缘图像融合的人脸防伪方法
Cozzolino et al. Image forgery detection through residual-based local descriptors and block-matching
CN108681746B (zh) 一种图像识别方法、装置、电子设备和计算机可读介质
RU2668717C1 (ru) Генерация разметки изображений документов для обучающей выборки
CN112381775B (zh) 一种图像篡改检测方法、终端设备及存储介质
JP2016134175A (ja) ワイルドカードを用いてテキスト−画像クエリを実施するための方法およびシステム
CN109829467A (zh) 图像标注方法、电子装置及非暂态电脑可读取储存媒体
TWI712980B (zh) 理賠資訊提取方法和裝置、電子設備
CN111695453B (zh) 绘本识别方法、装置及机器人
US10423817B2 (en) Latent fingerprint ridge flow map improvement
Molina-Moreno et al. Efficient scale-adaptive license plate detection system
Chandran et al. Missing child identification system using deep learning and multiclass SVM
CN112862024B (zh) 一种文本识别方法及系统
CN110414622B (zh) 基于半监督学习的分类器训练方法及装置
CN108921006B (zh) 手写签名图像真伪鉴别模型建立方法及真伪鉴别方法
CN112241470B (zh) 一种视频分类方法及系统
CN113255557A (zh) 一种基于深度学习的视频人群情绪分析方法及系统
CN113762326A (zh) 一种数据识别方法、装置、设备及可读存储介质
CN115497124A (zh) 身份识别方法和装置及存储介质
CN112613341A (zh) 训练方法及装置、指纹识别方法及装置、电子设备
Astawa et al. Convolutional Neural Network Method Implementation for License Plate Recognition in Android
Vinod et al. Handwritten Signature Identification and Fraud Detection using Deep Learning and Computer Vision
Tan et al. The impact of data correlation on identification of computer-generated face images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19914787

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 05.10.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19914787

Country of ref document: EP

Kind code of ref document: A1