WO2021164550A1 - Image classification method and apparatus - Google Patents

Image classification method and apparatus

Info

Publication number
WO2021164550A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
classified
clustering
global feature
feature
Prior art date
Application number
PCT/CN2021/075045
Other languages
French (fr)
Chinese (zh)
Inventor
孙哲
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司
Publication of WO2021164550A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques

Definitions

  • This application relates to the field of image processing technology, and in particular to an image classification method and device.
  • the embodiments of the present application provide an image classification method and device, which can reduce the amount of calculation for image classification, and provide a possible way for images with a large amount of calculation to run on terminal hardware.
  • an embodiment of the present application provides an image classification method, and the method includes:
  • the classification result of the image to be classified is determined according to the global feature.
  • An image classification device, the device including:
  • an acquiring unit, configured to acquire the image to be classified;
  • a dividing unit configured to divide the image to be classified to obtain M partial image blocks, where M is a positive integer greater than 1;
  • a clustering unit configured to cluster the M partial image blocks to obtain N clustering results, where N is a positive integer greater than 1;
  • a first determining unit configured to determine a global feature of the image to be classified according to the M local image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
  • the second determining unit is configured to determine the classification result of the image to be classified according to the global feature.
  • An embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for executing the steps in the method described in the first aspect of the embodiments of the present application.
  • An embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute some or all of the steps described in one aspect.
  • the embodiments of the present application provide a computer program product, wherein the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute as implemented in this application.
  • the computer program product may be a software installation package.
  • This application can obtain the image to be classified, divide the image to be classified to obtain M partial image blocks, cluster the M partial image blocks to obtain N clustering results, determine the global feature of the image to be classified according to the M local image blocks and the N clustering results, and determine the classification result of the image to be classified according to the global feature.
  • FIG. 1 is a schematic flowchart of an image classification method provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of an image division process provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of an image clustering provided by an embodiment of the present application.
  • FIG. 4 is a schematic diagram of a process for obtaining image feature vectors according to an embodiment of the present application.
  • Fig. 5 is a functional unit composition diagram of another image classification device provided by an embodiment of the present application.
  • Fig. 6 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • the terminal devices described in the embodiments of the present application include but are not limited to other portable devices such as mobile phones, laptop computers, or tablet computers with touch-sensitive surfaces (for example, touch screen displays and/or touch pads).
  • the device is not a portable communication device, but a desktop computer with a touch-sensitive surface (e.g., touch screen display and/or touch pad).
  • a terminal device including a display and a touch-sensitive surface is described.
  • the terminal device may include one or more other physical user interface devices such as a physical keyboard, mouse, and/or joystick.
  • The terminal device supports various applications, such as one or more of the following: drawing applications, presentation applications, word processing applications, website creation applications, disk burning applications, spreadsheet applications, game applications, phone applications, video conferencing applications, email applications, instant messaging applications, exercise support applications, photo management applications, digital camera applications, web browsing applications, digital music player applications, and/or digital video player applications.
  • Various application programs that can be executed on the terminal device can use at least one common physical user interface device such as a touch-sensitive surface.
  • One or more functions of the touch-sensitive surface and corresponding information displayed on the terminal can be adjusted and/or changed between applications and/or within corresponding applications.
  • the common physical architecture of the terminal for example, a touch-sensitive surface
  • This application proposes an image classification method, which obtains an image to be classified, divides the image to be classified to obtain M partial image blocks, clusters the M partial image blocks to obtain N clustering results, determines the global feature of the image to be classified according to the M local image blocks and the N clustering results, and determines the classification result of the image to be classified according to the global feature. This reduces the amount of calculation for image classification, so that image classification with a large amount of calculation, or classification of high-resolution images, can run on the terminal device.
  • FIG. 1 is a schematic flowchart of an image classification method provided by an embodiment of the present application.
  • the image classification method is applied to a terminal device.
  • the image classification method may include the following steps:
  • the image to be classified can be obtained locally from the terminal device, and the image to be classified can also be received from other devices, which is not limited here.
  • Obtaining the image to be classified locally from the terminal device may refer to obtaining the image to be classified from the memory of the terminal device, or it may be obtaining the photos that have not been stored in the memory when the terminal device is taking a picture.
  • If the image to be classified is a photo taken by the terminal device that has not yet been stored in the memory, the photo can be classified at the time it is stored in the memory, so that no subsequent classification is needed.
  • the image to be classified may refer to an image of the category to be detected.
  • An image usually includes a subject and a background.
  • the subject is the main object of the image.
  • the background is the scene that sets off the subject in the image.
  • The category of the image is determined according to the subject in the image. For example, if the subject in the image is a building, the category of the image is the architectural category; if the subject of the image is green plants, the category of the image is green plants.
  • the image to be detected may be an image obtained by various applications executed on the terminal device, for example, a drawing application, a survey application, a word processing application, a photo management application, etc.
  • Different applications or application scenarios have different subjects and backgrounds in the image.
  • For example, in a survey application, the subject can include entities representing geographic features such as buildings, roads, trees, and rivers; in a word processing application, the subject mainly consists of text. Therefore, the terminal device can first perform a simple classification of the image to be classified according to the application and/or application scenario to which it belongs, that is, determine the application and/or application scenario to which the image to be classified belongs according to its source. This simplifies the processing of images to be classified.
  • the method before dividing the image to be classified, the method further includes: acquiring an original image, and preprocessing the original image to obtain the image to be classified.
  • the original image can also be randomly selected from the image library. There is no restriction on this.
  • the original image size can be cropped into a uniform format.
  • For example, the original image can be uniformly cropped to a size of 512 × 512 and then normalized to obtain the image to be classified.
  • The terminal device may also compress the original image first, thereby further reducing the amount of calculation needed to classify the image.
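
The preprocessing step above can be sketched as follows. This is a minimal illustration and not the method claimed in the application: the text only says that the original image is cropped to a uniform size and normalized, so the centre crop, the division by 255, and the function names `center_crop`, `normalize`, and `preprocess` are all assumptions made for the example. The image is represented as a plain list of pixel rows.

```python
def center_crop(image, size):
    """Crop a 2-D list of pixel rows to a size x size square around the centre.

    Centre cropping is an assumption; the source only says "cropped".
    """
    height, width = len(image), len(image[0])
    if height < size or width < size:
        raise ValueError("image is smaller than the requested crop")
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def normalize(image, max_value=255.0):
    """Scale pixel values to [0, 1]; dividing by 255 is an assumed scheme."""
    return [[pixel / max_value for pixel in row] for row in image]


def preprocess(original, size=512):
    """Crop to a uniform size, then normalize (hypothetical helper)."""
    return normalize(center_crop(original, size))
```

With the default `size=512` this matches the 512 × 512 example in the text; a smaller size can be passed for quick experiments.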
  • the terminal device may divide the image to be classified into M identical partial image blocks according to a preset image size. For example, as shown in FIG. 2, the image to be classified can be divided into 9 rectangular partial image blocks in a nine-square grid manner.
  • the terminal device can also divide the image to be classified into M partial image blocks of the same or different sizes according to a preset image size list or pattern; or, the terminal device can also randomly divide the image to be classified into M partial image blocks.
  • The embodiment of the present application does not limit this.
  • the embodiment of the present application can perform image classification by using M partial image blocks as input data, which not only increases the diversity of data, but also improves the robustness of the image classification model.
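
As a rough sketch of the nine-square-grid division described above, the function below splits an image into `rows × cols` equal, non-overlapping rectangular blocks. The name `divide_into_blocks` and the requirement that the image dimensions divide evenly are assumptions for the example; the application also allows blocks of different sizes and random division, which are not shown here.

```python
def divide_into_blocks(image, rows=3, cols=3):
    """Split a 2-D list of pixel rows into rows * cols equal rectangular blocks.

    Blocks are returned in row-major order. Dimensions that do not divide
    evenly are rejected to keep the sketch simple (an assumption).
    """
    height, width = len(image), len(image[0])
    if height % rows or width % cols:
        raise ValueError("image dimensions must be divisible by the grid size")
    block_height, block_width = height // rows, width // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            band = image[r * block_height:(r + 1) * block_height]
            block = [row[c * block_width:(c + 1) * block_width] for row in band]
            blocks.append(block)
    return blocks
```

With the defaults, M equals 9, as in the nine-square-grid example.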
  • the size of the partial image blocks can be set according to the scene to which the image belongs, and the scene to which the image belongs can be divided according to whether the scene characteristic of the image is fixed.
  • The first scene is an image scene whose scene characteristics are not fixed, such as a natural scene.
  • The second scene is an image scene with fixed scene characteristics, such as a plant scene.
  • Images belonging to natural scenes usually have no clear and fixed scene features.
  • If the size list is used to extract the local features of an image belonging to the plant scene, the sizes in the list may vary greatly, so the extracted local features may not be distinctive and are easily disturbed by other information (such as background information). For example, an image whose content is three or four flowers in a large patch of grass belongs to the flower category, not the grass category; if the size list is used to extract the local features of this image, its category may be misjudged as grass.
  • the scene feature may refer to the image feature that can characterize the scene to which the image belongs. Fixed scene features may mean that the distribution of scene features in the image is relatively concentrated and not scattered; unfixed scene features may mean that the distribution of scene features in the image is relatively scattered and not concentrated.
  • the scene of the image to be tested may refer to the scene of the subject in the image to be tested.
  • For example, if the subject of the image is a plant, the scene to which the image belongs is a plant scene; if the subject of the image is a beach, valley, etc., the scene to which the image belongs is a natural scene.
  • The partial image blocks do not overlap each other, and the shape of the partial image blocks may be rectangular or an irregular polygon.
  • the embodiment of the present application does not limit the shape of the partial image blocks here.
  • The terminal device may determine the size of M according to the resolution of the image to be classified; the resolution of the image to be classified may have a proportional relationship or a mapping relationship with the size of M.
  • The terminal device can determine the shape and quantity of the partial image blocks according to the application and/or application scenario to which the image to be classified belongs. For example, for a remote sensing image generated by a survey application, the feature distribution of the different categories in the high-resolution image is relatively concentrated, that is, subjects belonging to the same category are concentrated on the entire image, so the terminal device can divide the image to be classified into multiple rectangular partial image blocks of the same size. For an image of a person generated by a photographing application, the subject is mostly distributed in the middle of the image, so the terminal device can divide the image in order starting from the middle, and the partial image blocks in the middle part can be larger so that they contain more features.
  • Clustering the M partial image blocks to obtain N clustering results includes: clustering the M partial image blocks using an unsupervised learning clustering method to obtain the N clustering results, where N is a positive integer greater than 1.
  • a classification model needs to be used to obtain the local features of the partial image blocks. Therefore, it is necessary to train the classification model first, and use the trained classification model to obtain the local features of the local image block.
  • When training the classification model, the training samples are first divided to obtain multiple partial training samples, which are then sampled; the sampled training samples are input into the classification model, the classification model outputs the local features of the training samples, and a loss function is used to train the model by backpropagation.
  • the classification model may be used to extract the features of the M partial image blocks respectively to obtain the M partial features.
  • The present application may use a combined local feature, formed by combining multiple features, to perform cluster analysis.
  • Image local features can include color features, LBP features, texture features, etc. Color features depend relatively little on the size, direction, and viewing angle of the image itself; the commonly used color histogram feature describes the proportion of each color in the whole image.
  • Texture is important spatial information in remote sensing images. As resolution increases, the internal structure of ground objects becomes clearer, which appears as the texture structure of the ground objects in the remote sensing image.
  • Texture features can reflect the regular spatial changes of pixels in the target object. Therefore, the terminal device can select different features or feature combinations according to different application scenarios or applications. For example, for remote sensing images, color features and texture features can be combined to form combined local features; local features or local feature combinations can also be selected according to preset feature options, which is not limited in the embodiment of the present application.
  • The combined local features can be one-dimensional, that is, the multiple features can be spliced; for example, the texture feature can be spliced behind the color feature. The combined local features can also be multi-dimensional, that is, the multiple features can form a feature matrix, which is not limited in the embodiment of the present application.
  • the color features are based on the HSL (Hue, Saturation, Lightness) color space, and the color histogram features are extracted. Compared with the RGB color space, the HSL color space is more in line with the visual perception characteristics of the human eye.
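
One simple way to compute a color histogram feature in the HSL color space is sketched below. It is an illustration under stated assumptions rather than the feature extractor used in the application: only the hue channel is binned, the number of bins is arbitrary, and the function name `hue_histogram` is hypothetical. The conversion uses `colorsys.rgb_to_hls` from the Python standard library, which returns values in the order hue, lightness, saturation.

```python
import colorsys


def hue_histogram(pixels, bins=8):
    """Return the fraction of pixels falling in each hue bin.

    `pixels` is a sequence of (r, g, b) tuples with components in 0..255.
    Binning only the hue channel is a simplifying assumption.
    """
    counts = [0] * bins
    total = 0
    for red, green, blue in pixels:
        hue, _lightness, _saturation = colorsys.rgb_to_hls(
            red / 255.0, green / 255.0, blue / 255.0
        )
        index = min(int(hue * bins), bins - 1)  # hue lies in [0, 1)
        counts[index] += 1
        total += 1
    if total == 0:
        raise ValueError("no pixels given")
    return [count / total for count in counts]
```

Each entry of the returned list is the proportion of the block's pixels in that hue range, which corresponds to the "proportion of each color in the whole image" idea behind a color histogram.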
  • the partial image blocks belonging to the same category in the M partial image blocks are clustered together based on the extracted M partial features, and N clustering results are obtained.
  • the size of the M and N depends on the number of categories contained in the M partial image blocks.
  • The number in each partial image block represents the category of that block. The 9 partial image blocks are clustered so that blocks of the same category fall into the same class, yielding 4 clustering results: clustering result 1 contains 3 partial image blocks, and clustering results 2-4 each contain 2 partial image blocks.
  • the clustering methods of unsupervised learning include but are not limited to: K-means clustering algorithm, Birch clustering algorithm, DBSCAN clustering algorithm and K nearest neighbor classification algorithm.
  • A good clustering division should reflect the internal structure of the data set as much as possible, so that samples within the same cluster are as similar as possible and samples in different clusters are as different as possible.
  • Taking the K-means clustering algorithm as an example: from the perspective of distance, a clustering with very small intra-class distances and large inter-class distances is optimal.
  • Similar local features are divided into the same class as much as possible, and dissimilar local features are divided into different classes as much as possible.
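
The K-means idea described above (small intra-class distances, large inter-class distances) can be sketched in a few lines. This is a plain textbook K-means over local feature vectors, not the implementation from the application; the function names, the fixed iteration count, and the naive initialization from the first `k` feature vectors are assumptions chosen to keep the example deterministic.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(features, k, iterations=20):
    """Cluster feature vectors into k groups; returns (labels, centroids).

    Initial centroids are the first k feature vectors (a naive, assumed
    choice); real systems usually use a more careful initialization.
    """
    centroids = [list(feature) for feature in features[:k]]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: each feature goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(feature, centroids[j]))
            for feature in features
        ]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [f for f, label in zip(features, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids
```

In the terms of the text, `features` would hold the M local features, `k` would be N, and blocks sharing a label form one clustering result.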
  • The global feature is the feature vector of the image to be classified. Both the global feature and the local feature are image features of the image to be classified: the global feature is a feature vector extracted from the entire image to be tested, whereas a local feature is a feature vector extracted from a partial image block of the image to be tested.
  • the present application may use a combined global feature formed by a combination of multiple features to perform image classification.
  • Image global features can include color features, LBP features, texture features, etc.
  • The terminal device can select different features or feature combinations according to different application scenarios or applications. For example, for remote sensing images, color features and texture features can be combined to form a combined global feature; global features or global feature combinations can also be selected according to preset feature options, which is not limited in the embodiment of the present application.
  • Determining the global feature of the image to be classified according to the M local image blocks and the N clustering results includes: performing a convolution operation on the M local image blocks with each of the N clustering results respectively to obtain a first feature vector; and performing binary image coding on the first feature vector to obtain the global feature.
  • 9 partial image blocks are respectively convolved with 4 clustering results obtained by clustering the 9 partial image blocks to obtain the first feature vector of the image to be classified.
  • the first feature vector is used to describe the global information of the image to be classified.
  • the first feature vector may be one-dimensional, and the feature vector obtained by convolution of the partial image block and the clustering result may be spliced to obtain the first feature vector; the first feature vector may also be multi-dimensional.
  • The embodiment of the present application does not limit this.
  • the first feature vector may also be one or more.
  • Performing a convolution operation on each of the M partial image blocks and each of the N clustering results may mean convolving the local features extracted from the partial image blocks with the local features extracted from the clustering results to obtain the first feature vector.
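
The text does not spell out the exact form of this convolution, so the following is only one possible reading, offered as a sketch. It assumes that every local feature and every clustering-result feature is a vector of the same length, treats each clustering-result feature as a kernel, and takes the sliding dot product at the single valid position (the cross-correlation that neural-network libraries conventionally call convolution). The M × N responses are spliced into a one-dimensional first feature vector. The function name `first_feature_vector` is hypothetical.

```python
def first_feature_vector(block_features, cluster_features):
    """Splice the responses of every block feature to every cluster feature.

    Each response is a dot product, i.e. a same-length "valid" correlation
    with one output value. This interpretation is an assumption.
    """
    vector = []
    for block in block_features:
        for cluster in cluster_features:
            if len(block) != len(cluster):
                raise ValueError("feature lengths must match")
            vector.append(sum(b * c for b, c in zip(block, cluster)))
    return vector
```

The result has length M × N, with the responses of the first block to all N clustering results coming first.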
  • Performing binary image encoding on the first feature vector to obtain the global feature includes: according to a binary image encoding rule, setting each value in the first feature vector that is greater than the first value to a second value, and setting each value in the first feature vector that is less than or equal to the first value to the first value, to obtain the global feature.
  • the first value can be 0, and the second value can be 1; the first value can also be 255, and the second value can be 0.
  • Each pixel of a grayscale image can take a value in the range (0, 255).
  • After binary image encoding, each pixel has only two possible values or grayscale states.
  • The above global feature can be represented by 8 bits.
  • the first value and the second value may also be other values, which are not limited in the embodiment of the present application.
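
The binary encoding rule stated above maps directly onto a one-line threshold. The sketch below uses the example values from the text (first value 0, second value 1) as defaults; the function name `binary_encode` is an assumption.

```python
def binary_encode(first_feature_vector, first_value=0, second_value=1):
    """Apply the binary encoding rule to a feature vector.

    Values greater than first_value become second_value; values less than
    or equal to first_value become first_value.
    """
    return [
        second_value if value > first_value else first_value
        for value in first_feature_vector
    ]
```

For example, `binary_encode([-1.5, 0, 0.3, 2])` keeps the first two entries at 0 and sets the last two to 1.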
  • Before the convolution operation is performed on the M partial image blocks and each of the N clustering results, a classification model needs to be used to obtain the local features of the partial image blocks and of the clustering results; these local features can be obtained by using the above classification model.
  • the terminal device may input the partial image block and the clustering result into the preset algorithm, and use the preset algorithm to process the image.
  • The preset algorithm may be the Fast RCNN (Fast Region-based Convolutional Neural Network) algorithm.
  • The user can preset the convolution window in the Fast RCNN algorithm, and the terminal uses the convolution window to convolve the image to obtain the first feature vector.
  • the first feature vector refers to a complete matrix obtained after convolving the image.
  • Determining the classification result of the image to be classified according to the global feature includes: performing a convolution operation on the global feature and a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples with labeled categories; and using the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
  • The size of the above convolution operation is 1 × 1.
  • Each value in the above probability vector represents the probability that the image to be classified belongs to the corresponding category, so the category corresponding to the maximum value in the probability vector can be used as the classification result of the image to be classified.
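
For a global feature with a single spatial position, a 1 × 1 convolution reduces to one weighted sum over the feature's channels per output category, so the classification step can be sketched as below. This is a hedged illustration: the per-category weight lists in `kernels` stand in for the trained convolution vector, the softmax normalization is an assumption (the text says the convolution yields the probability vector directly), and the function and category names are hypothetical.

```python
import math


def conv_1x1(global_feature, kernels):
    """One weighted sum of the feature's channels per output category."""
    return [sum(w * x for w, x in zip(kernel, global_feature)) for kernel in kernels]


def softmax(scores):
    """Numerically stable softmax (assumed normalization step)."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def classify(global_feature, kernels, categories):
    """Return the category with the highest probability and the full vector."""
    probabilities = softmax(conv_1x1(global_feature, kernels))
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return categories[best], probabilities
```

Because softmax preserves ordering, the chosen category is the same whether the maximum is taken over the raw convolution outputs or over the normalized probabilities.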
  • The original training samples of each category in the image library can be used for training to obtain the convolution vector, or a collection of image samples marked with image categories can be used for training to obtain the convolution vector, which is not limited in the embodiment of this application. It should be noted that the convolution vector may depend on the image samples with annotated image categories: for different annotated image samples, the convolution vector may differ. Therefore, the terminal device can select image samples suited to the application scenario or application and train on them to obtain the convolution vector. For example, for an observation terminal device, in order to accurately classify remote sensing images and provide detailed ground information, remote sensing image samples with marked categories can be used directly for training to obtain the convolution vector, thereby improving the accuracy of image classification.
  • After the convolution operation, the global feature and the convolution vector directly yield the probability vector of the image to be classified. Because a 1 × 1 convolution requires little computation and parallelizes well, the amount of calculation on the original data and the memory usage are greatly reduced. The image classification algorithm is thus optimized, and the calculation cost of the unsupervised learning method is greatly reduced, while the accuracy of image classification is maintained.
  • The method further includes: acquiring a first image set, the first image set including images with labeled image categories; and using the first image set to train the classifier to be trained to obtain the first classifier.
  • the determining the classification result of the image to be classified according to the global feature includes: inputting the global feature into the first classifier, and outputting the classification result of the image to be classified.
  • Inputting the global features of the image to be classified into the classifier enables the classifier to classify the image according to its global features.
  • Before the classifier is used to obtain the category of the image to be classified, the classifier needs to be trained first. When training the classifier, the global features of the images in the first image set are obtained and input to the classifier, the classifier outputs the categories of the images in the first image set, and target supervision is used to train the classifier by backpropagation.
  • Target supervision refers to the supervision signal used in supervised deep learning, such as a loss function.
  • the classifier may refer to a model that classifies the image to be classified according to the image characteristics of the image to be classified.
  • the classifier in this application may be a non-linear classifier, such as a non-linear support vector machine (Support Vector Machine, SVM).
  • Non-linear classifiers can effectively expand the classification dimension and reduce the defects of linear classifiers such as softmax and fully connected layers in non-linear classification.
  • The clustering results of the local image blocks are obtained by an unsupervised learning clustering method, a supervised learning classification model is used to obtain the global features of the image to be classified, and the supervised learning classification model is used to determine the category of the image to be classified according to the global features.
  • This application improves the performance of the image classification algorithm and reduces power consumption by combining unsupervised learning and supervised deep learning algorithms.
  • The first classifier and the convolution vector may be obtained by training the classifier to be trained on the same image samples, or on different image samples.
  • The embodiment of the application does not limit this.
  • the embodiment of the application proposes an image classification method, which is applied to a terminal device.
  • The image to be classified is divided to obtain M partial image blocks, the M partial image blocks are clustered to obtain N clustering results, the global feature of the image to be classified is determined according to the M local image blocks and the N clustering results, and the classification result of the image to be classified is determined according to the global feature. This reduces the amount of redundant calculation for image classification, simplifies and accelerates the algorithm, and provides a possible way for image classification with a large amount of calculation to run on terminal hardware.
  • an electronic device includes hardware structures and/or software modules corresponding to each function.
  • this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software-driven hardware depends on the specific application and design constraint conditions of the technical solution. Professionals and technicians can use different methods for each specific application to implement the described functions, but such implementation should not be considered beyond the scope of this application.
  • the embodiment of the present application may divide the electronic device into functional units according to the foregoing method examples.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit. It should be noted that the division of units in the embodiments of the present application is illustrative, and is only a logical function division, and there may be other division methods in actual implementation.
  • FIG. 5 is a block diagram of the functional unit composition of an image classification device provided by an embodiment of the present application, which is applied to a terminal device. As shown in FIG. 5, the device includes:
  • the obtaining unit 510 is configured to obtain the image to be classified
  • the dividing unit 520 is configured to divide the image to be classified to obtain M partial image blocks, where M is a positive integer greater than 1;
  • the clustering unit 530 is configured to cluster the M partial image blocks to obtain N clustering results, where N is a positive integer greater than 1;
  • the first determining unit 540 is configured to determine a global feature of the image to be classified according to the M local image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
  • the second determining unit 550 is configured to determine the classification result of the image to be classified according to the global feature.
  • the first determining unit 540 is specifically configured to: perform a convolution operation on the M partial image blocks and each of the N clustering results to obtain a first feature vector; and perform binary image coding on the first feature vector to obtain the global feature.
  • the first determining unit 540 is further specifically configured to: according to a binary image coding rule, set each value in the first feature vector that is greater than a first value to a second value, and set each value in the first feature vector that is less than or equal to the first value to the first value, to obtain the global feature.
  • the second determining unit 550 is specifically configured to: perform a convolution operation on the global feature and a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples with labeled categories; and use the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
  • the size of the convolution operation is 1×1.
  • the obtaining unit 510 is further configured to: obtain a first image set, where the first image set includes images with annotated image categories.
  • the device further includes a training unit 560 configured to train the classifier to be trained using the first image set to obtain the first classifier.
  • the second determining unit 550 is further specifically configured to: input the global feature into the first classifier, and output the classification result of the image to be classified.
  • the clustering unit 530 is specifically configured to use an unsupervised learning clustering method to cluster the M partial image blocks to obtain the N clustering results.
  • before the image to be classified is divided, the obtaining unit 510 is further configured to: obtain an original image, and preprocess the original image to obtain the image to be classified.
  • the embodiment of the present application proposes an image classification device, which is applied to a terminal device.
  • the image to be classified is divided to obtain M partial image blocks, the M partial image blocks are clustered to obtain N clustering results, the global feature of the image to be classified is determined according to the M partial image blocks and the N clustering results, and the classification result of the image to be classified is determined according to the global feature.
  • this reduces the amount of redundant calculation for image classification, simplifies and accelerates the algorithm, and provides a possible way for images requiring a large amount of calculation to be processed on terminal hardware.
  • FIG. 6 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • the terminal device includes one or more processors, one or more memories, one or more communication interfaces, and one or more programs;
  • the one or more programs are stored in the memory, and are configured to be executed by the one or more processors;
  • the program includes instructions for performing the following steps:
  • the classification result of the image to be classified is determined according to the global feature.
  • the program includes instructions that are further used to perform the following steps: performing a convolution operation on the M partial image blocks and each of the N clustering results to obtain a first feature vector; and performing binary image coding on the first feature vector to obtain the global feature.
  • the program includes instructions that are further used to perform the following steps: according to a binary image coding rule, setting each value in the first feature vector that is greater than a first value to a second value, and setting each value in the first feature vector that is less than or equal to the first value to the first value, to obtain the global feature.
  • the program includes instructions that are further used to perform the following steps: performing a convolution operation on the global feature and a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples with labeled categories; and using the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
  • the size of the convolution operation is 1×1.
  • the program includes instructions that are further used to perform the following steps: obtaining a first image set, the first image set including images with labeled image categories; and training the classifier to be trained using the first image set to obtain the first classifier.
  • the program includes instructions that are further used to perform the following steps: inputting the global feature into the first classifier, and outputting the classification result of the image to be classified.
  • the program includes instructions that are further used to perform the following steps: clustering the M partial image blocks using an unsupervised learning clustering method to obtain the N clustering results.
  • before the image to be classified is divided, the program includes instructions for performing the following steps: obtaining an original image, and preprocessing the original image to obtain the image to be classified.
  • the embodiment of the present application proposes an image classification device, which is applied to a terminal device.
  • the image to be classified is divided to obtain M partial image blocks, the M partial image blocks are clustered to obtain N clustering results, the global feature of the image to be classified is determined according to the M partial image blocks and the N clustering results, and the classification result of the image to be classified is determined according to the global feature.
  • this reduces the amount of redundant calculation for image classification, simplifies and accelerates the algorithm, and provides a possible way for images requiring a large amount of calculation to be processed on terminal hardware.
  • An embodiment of the present application also provides a computer storage medium, wherein the computer storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute part or all of the steps of any method recorded in the above method embodiments; the above-mentioned computer includes terminal equipment.
  • the embodiments of the present application also provide a computer program product.
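  • the above-mentioned computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute part or all of the steps of any method recorded in the above method embodiments.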
  • the computer program product may be a software installation package, and the above-mentioned computer includes terminal equipment.
  • the disclosed device may be implemented in other ways.
  • the device embodiments described above are only illustrative. For example, the division of the above-mentioned units is only a logical function division, and there may be other divisions in actual implementation; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical or other forms.
  • the units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the above integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable memory.
  • the technical solution of the present application essentially or the part that contributes to the existing technology or all or part of the technical solution can be embodied in the form of a software product, and the computer software product is stored in a memory.
  • a number of instructions are included to enable a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the foregoing methods of the various embodiments of the present application.
  • the aforementioned memory includes: U disk, Read-Only Memory (ROM, Read-Only Memory), Random Access Memory (RAM, Random Access Memory), mobile hard disk, magnetic disk or optical disk and other media that can store program codes.
  • the program can be stored in a computer-readable memory, and the memory can include: flash memory, ROM, RAM, magnetic disk or CD, etc.

Abstract

The embodiments of the present application disclose an image classification method and apparatus, which are applied to a terminal device. Said method comprises: acquiring an image to be classified; dividing said image to obtain M local image blocks; clustering the M local image blocks to obtain N clustering results; determining a global feature of said image according to the M local image blocks and the N clustering results; and determining a classification result of said image according to the global feature. The present invention can reduce the redundant computation of image classification, realizes a simplified and accelerated algorithm, and allows for the possibility of an image with a large amount of computation to run on terminal hardware.

Description

Image classification method and device
Technical field
This application relates to the field of image processing technology, and in particular to an image classification method and device.
Background
In recent years, image classification has attracted great research interest and has been successfully deployed in many products, such as mobile phones, personal computers, and other terminal devices, where it intelligently solves many practical image processing problems. With the rapid development of deep learning, it has become the leading technique for image classification. However, existing deep learning models usually obtain their results through end-to-end computation. For images that require little computation or have a low input resolution, existing terminal hardware can meet the performance requirements, but images that require heavy computation or have a high resolution may not be processable on terminal hardware.
Summary of the invention
The embodiments of the present application provide an image classification method and device, which can reduce the amount of calculation for image classification and provide a possible way for images requiring a large amount of calculation to be processed on terminal hardware.
In a first aspect, an embodiment of the present application provides an image classification method, the method including:
obtaining an image to be classified;
dividing the image to be classified to obtain M partial image blocks, where M is a positive integer greater than 1;
clustering the M partial image blocks to obtain N clustering results, where N is a positive integer greater than 1;
determining a global feature of the image to be classified according to the M partial image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
determining the classification result of the image to be classified according to the global feature.
In a second aspect, an embodiment of the present application provides an image classification device, the device including:
an obtaining unit, configured to obtain an image to be classified;
a dividing unit, configured to divide the image to be classified to obtain M partial image blocks, where M is a positive integer greater than 1;
a clustering unit, configured to cluster the M partial image blocks to obtain N clustering results, where N is a positive integer greater than 1;
a first determining unit, configured to determine a global feature of the image to be classified according to the M partial image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
a second determining unit, configured to determine the classification result of the image to be classified according to the global feature.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for executing the steps of the method described in the first aspect of the embodiments of the present application.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to execute part or all of the steps described in the first aspect of the embodiments of the present application.
In a fifth aspect, an embodiment of the present application provides a computer program product, wherein the computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute part or all of the steps described in the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
Implementing the embodiments of the present application has the following beneficial effects:
It can be seen that this application obtains an image to be classified, divides the image to be classified to obtain M partial image blocks, clusters the M partial image blocks to obtain N clustering results, determines the global feature of the image to be classified according to the M partial image blocks and the N clustering results, and determines the classification result of the image to be classified according to the global feature. By combining traditional algorithms with deep learning algorithms to optimize the end-to-end deep learning model, the amount of redundant calculation for image classification is reduced, the algorithm is simplified and accelerated, and a possible way is provided for images requiring a large amount of calculation to be processed on terminal hardware.
Description of the drawings
In order to describe the technical solutions in the embodiments of the present application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is a schematic flowchart of an image classification method provided by an embodiment of the present application;
FIG. 2 is a schematic flowchart of image division provided by an embodiment of the present application;
FIG. 3 is a schematic flowchart of image clustering provided by an embodiment of the present application;
FIG. 4 is a schematic flowchart of obtaining image feature vectors provided by an embodiment of the present application;
FIG. 5 is a functional unit composition diagram of another image classification device provided by an embodiment of the present application;
FIG. 6 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
Detailed description
In order to enable those skilled in the art to better understand the solutions of this application, the technical solutions in the embodiments of this application are described clearly and completely below in conjunction with the drawings in the embodiments of this application. Obviously, the described embodiments are only some of the embodiments of this application, not all of them. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
The terms "first", "second", and so on in the specification, the claims, and the above drawings of this application are used to distinguish different objects, not to describe a specific order. In addition, the terms "including" and "having" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, product, or device.
A reference to an "embodiment" herein means that a specific feature, structure, or characteristic described in conjunction with the embodiment may be included in at least one embodiment of this application. The appearance of the phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment that is mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
In specific implementations, the terminal devices described in the embodiments of this application include, but are not limited to, portable devices such as mobile phones, laptop computers, or tablet computers with touch-sensitive surfaces (for example, touch screen displays and/or touch pads). It should also be understood that in some embodiments the device is not a portable communication device but a desktop computer with a touch-sensitive surface (for example, a touch screen display and/or touch pad).
In the embodiments of this application, a terminal device including a display and a touch-sensitive surface is described. However, it should be understood that the terminal device may include one or more other physical user interface devices such as a physical keyboard, a mouse, and/or a joystick.
The terminal device supports various applications, such as one or more of the following: a drawing application, a presentation application, a word processing application, a website creation application, a disk burning application, a spreadsheet application, a game application, a phone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application, and/or a digital video player application.
The various applications that can be executed on the terminal device can use at least one common physical user interface device, such as a touch-sensitive surface. One or more functions of the touch-sensitive surface and the corresponding information displayed on the terminal can be adjusted and/or changed between applications and/or within a given application. In this way, the common physical architecture of the terminal (for example, the touch-sensitive surface) can support various applications with user interfaces that are intuitive and transparent to the user.
At present, most deep learning models obtain their results through end-to-end computation. When a deep learning model is used for image classification, simple tasks can meet the performance requirements on ordinary terminal device hardware. When images are processed at their original resolution, however, as in the field of image enhancement where users care about image details, the input image cannot be scaled down, so the amount of calculation is very large and the algorithm often cannot run on the terminal device. Existing image classification algorithms therefore meet the requirements for simple tasks or low-resolution inputs, but may not be able to run on terminal device hardware such as mobile phones when the amount of calculation is large or the input is an original image.
For this reason, this application proposes an image classification method that obtains an image to be classified, divides the image to be classified to obtain M partial image blocks, clusters the M partial image blocks to obtain N clustering results, determines the global feature of the image to be classified according to the M partial image blocks and the N clustering results, and determines the classification result of the image to be classified according to the global feature. This reduces the amount of calculation for image classification and makes it possible to classify computation-heavy or high-resolution images on the terminal device.
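As a non-limiting illustration, the flow proposed above can be sketched in a few lines of Python with NumPy. The sketch follows the convolution and binary coding steps described for the first determining unit 540 and the maximum-probability rule described for the second determining unit 550, but several details are assumptions made only for illustration: the function names, the use of plain k-means as the unsupervised clustering method, the mean-centering of the blocks, the treatment of each cluster center as a block-sized filter (so that the convolution reduces to a dot product), the softmax, and the `weights` matrix, which stands in for a convolution vector that would in practice be trained on labeled image samples.

```python
import numpy as np

def split_blocks(image, rows, cols):
    """Cut a 2-D image into rows*cols equal, non-overlapping, flattened blocks."""
    bh, bw = image.shape[0] // rows, image.shape[1] // cols
    return np.stack([image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].ravel()
                     for r in range(rows) for c in range(cols)])  # shape (M, D)

def kmeans_centers(blocks, n, iters=10, seed=0):
    """Plain k-means, an assumed choice of unsupervised clustering method."""
    rng = np.random.default_rng(seed)
    centers = blocks[rng.choice(len(blocks), size=n, replace=False)]
    for _ in range(iters):
        dists = ((blocks[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(n):
            members = blocks[labels == k]
            if len(members):  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    return centers  # shape (N, D)

def global_feature(blocks, centers, first_value=0.0, second_value=1.0):
    """Correlate every block with every cluster center, then binarize."""
    # Each center acts as a filter of the same size as a block, so the
    # convolution reduces to a dot product; the result has M * N entries.
    first_vector = (blocks @ centers.T).ravel()
    # Binary coding rule: > first_value -> second_value, otherwise first_value.
    return np.where(first_vector > first_value, second_value, first_value)

def classify(image, weights, rows=3, cols=3, n_clusters=3, seed=0):
    """Return (global feature, probability vector, predicted class index)."""
    blocks = split_blocks(image.astype(np.float64), rows, cols)
    blocks = blocks - blocks.mean()  # assumed centering so responses change sign
    centers = kmeans_centers(blocks, n_clusters, seed=seed)
    feature = global_feature(blocks, centers)
    logits = weights @ feature  # stands in for the 1x1 convolution
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return feature, probs, int(probs.argmax())
```

With a 3×3 division and three clusters, the global feature has 27 binary entries, and `weights` would have one row of 27 values per category.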
In order to illustrate the technical solutions described in this application, specific embodiments are described in detail below.
Please refer to FIG. 1. FIG. 1 is a schematic flowchart of an image classification method provided by an embodiment of the present application. The image classification method is applied to a terminal device. As shown in the figure, the image classification method may include the following steps:
S110. Obtain an image to be classified.
In the embodiments of this application, the image to be classified can be obtained locally from the terminal device or received from another device, which is not limited here. Obtaining the image to be classified locally may mean obtaining it from the memory of the terminal device, or obtaining a photo that the terminal device has just taken and not yet stored in the memory. When the image to be classified is a photo taken by the terminal device that has not yet been stored in the memory, the photo can be classified before it is stored, so that no later classification is needed.
The image to be classified may refer to an image whose category is to be detected. An image usually includes a subject and a background. The subject is the main object shown in the image, the background is the scene that sets off the subject, and the category of the image is determined by its subject. For example, if the subject of the image is a building, the category of the image is the building category; if the subject is a green plant, the category of the image is the green plant category.
There may be one or more images to be classified.
In an implementation of the embodiment of the present application, the image to be detected may be an image produced by any of the applications executed on the terminal device, for example, a drawing application, a survey application, a word processing application, or a photo management application. Different applications or application scenarios have different subjects and backgrounds. For example, for a survey application the subject may include entities that represent geographic features, such as buildings, roads, trees, and rivers, whereas for a word processing application the subject mainly consists of text. The terminal device can therefore first perform a simple classification of the image to be classified according to the application and/or application scenario it belongs to, that is, determine the application and/or application scenario from the source of the image, thereby simplifying the processing of the image to be classified.
Optionally, before the image to be classified is divided, the method further includes: obtaining an original image, and preprocessing the original image to obtain the image to be classified.
Specifically, a large number of images can be collected or photographed in advance, or existing public images can be used, as original images. In other embodiments, the original image may also be selected at random from an image library, which is not limited in the embodiments of this application. Before the original image is processed further, it can first undergo processing such as image compression, enhancement, and restoration. For example, the original images can be cropped to a uniform size, such as 512×512, and normalized to obtain the image to be classified.
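The cropping and normalization mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the application: the function name `preprocess`, the choice of a center crop, the 8-bit input format, and scaling to the range [0, 1] are all hypothetical choices, since the text only states that images are cropped to a uniform size such as 512×512 and normalized.

```python
import numpy as np

def preprocess(original: np.ndarray, size: int = 512) -> np.ndarray:
    """Center-crop an H x W (x C) uint8 image to size x size and scale to [0, 1]."""
    h, w = original.shape[:2]
    if h < size or w < size:
        raise ValueError("image is smaller than the crop size")
    top = (h - size) // 2   # assumed: crop around the image center
    left = (w - size) // 2
    cropped = original[top:top + size, left:left + size]
    # Assumed normalization: map 8-bit pixel values to floats in [0, 1].
    return cropped.astype(np.float32) / 255.0
```

An image larger than the crop size, for example 600×800, yields a 512×512 array with the same number of channels; smaller images are rejected rather than padded.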
Further, when the resolution of the original image is too high, or the original image is too large, so that it exceeds the capability of the terminal device hardware, the terminal device may first compress the original image, thereby further reducing the amount of calculation needed to process the image to be classified.
S120. Divide the image to be classified to obtain M partial image blocks, where M is a positive integer greater than 1.
Specifically, the terminal device may divide the image to be classified into M partial image blocks of the same size according to a preset image size. For example, as shown in FIG. 2, the image to be classified can be divided into 9 rectangular partial image blocks in a nine-square grid. The terminal device may also divide the image to be classified into M partial image blocks of the same or different sizes according to a preset image size list or pattern, or it may divide the image to be classified into M partial image blocks at random, which is not limited in the embodiments of this application. Using the M partial image blocks as input data for image classification both increases the diversity of the data and improves the robustness of the image classification model.
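The nine-square-grid division can be sketched as follows. The function name `split_into_blocks` and the handling of leftover pixels are assumptions for illustration: when the image height or width is not a multiple of the grid size, this sketch simply drops the remainder so that all blocks have the same size.

```python
import numpy as np

def split_into_blocks(image: np.ndarray, rows: int = 3, cols: int = 3):
    """Divide an image into rows * cols equal, non-overlapping rectangular blocks."""
    h, w = image.shape[:2]
    bh, bw = h // rows, w // cols  # block height and width; remainder is dropped
    return [image[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw]
            for r in range(rows)
            for c in range(cols)]
```

With the default arguments, M is 9 and the blocks are returned in row-major order, so the first block is the top-left corner of the image.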
In an implementation of the embodiment of the present application, whether the partial image blocks have the same size can be set according to the scene to which the image belongs, and scenes can be distinguished according to whether the scene features of the image are fixed. For example, the first scene is an image scene whose scene features are not fixed, such as a natural scene; the second scene is an image scene whose scene features are fixed, such as a plant scene. An image belonging to a natural scene usually has no clear, fixed scene features, so a size list can be used to extract its partial image blocks and thus extract effective local features. An image belonging to a plant scene usually has fixed scene features, so its local features are extracted using the same preset size. If a size list were used for an image belonging to a plant scene, the sizes in the list may vary greatly, so the extracted local features may be indistinct and easily disturbed by other information (for example, background information). Consider an image showing three or four flowers in a large patch of grass: it belongs to the flower category, not the grass category, but if a size list is used to extract its local features to determine its category, the image may be judged as grass. Here, a scene feature refers to an image feature that can characterize the scene to which the image belongs. Fixed scene features means that the scene features are concentrated in the image rather than scattered; unfixed scene features means that they are scattered rather than concentrated. The scene to which the image to be tested belongs refers to the scene of its subject: if the subject is a flower, the image belongs to a plant scene; if the subject is a beach, a valley, or the like, the image belongs to a natural scene.
需要说明的是,局部图像块是互不重叠的,且局部图像块的形块可以是矩形,也可以是不规则的多边形,本申请实施例对局部图像块的形状在此不做限定。It should be noted that the partial image blocks do not overlap each other, and the shape blocks of the partial image blocks may be rectangular or irregular polygons. The embodiment of the present application does not limit the shape of the partial image blocks here.
In the embodiments of the present application, the terminal device may determine the value of M according to the resolution of the image to be classified. The resolution may be directly proportional to M or related to M by a mapping: the higher the resolution of the image to be classified, the larger the number M of partial image blocks into which the terminal device may divide it; the lower the resolution, the smaller M. Dividing a high-resolution image into multiple partial image blocks can reduce the complexity of the image classification algorithm, making the algorithm leaner and faster.
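One possible mapping from resolution to M is sketched below. The application does not give a concrete rule, so the target block side of 256 pixels, the clamping bounds, and the function names are all hypothetical values chosen only for illustration.

```python
def choose_grid_side(width, height, target_block_side=256, min_side=2, max_side=8):
    """Return the number of blocks per image side for a square grid.

    All thresholds are illustrative assumptions, not values from the application.
    """
    longest = max(width, height)
    side = -(-longest // target_block_side)  # ceiling division
    # Clamp so that M = side * side is always greater than 1 and stays bounded.
    return max(min_side, min(max_side, side))


def choose_block_count(width, height):
    """Higher resolution yields a larger M; lower resolution yields a smaller M."""
    side = choose_grid_side(width, height)
    return side * side


m_small = choose_block_count(640, 480)    # 3 x 3 grid
m_large = choose_block_count(4000, 3000)  # clamped to an 8 x 8 grid
```

The lower clamp keeps M greater than 1, as the method requires, and the upper clamp bounds the clustering cost on a terminal device.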
In an implementation of the embodiments of the present application, the terminal device may determine the shape and number of the partial image blocks according to the application and/or application scenario to which the image to be classified belongs. For example, in a high-resolution remote sensing image generated by a surveying application, the features of each category are relatively concentrated, that is, subjects belonging to the same category are concentrated in the image, so the terminal device may divide the image to be classified into multiple rectangular partial image blocks of the same size. For a portrait generated by a camera application, the subject is mostly located in the middle of the image, so the terminal device may divide the image starting from the middle, and the partial image blocks in the middle may be larger so that they contain more features.
S130. Cluster the M partial image blocks to obtain N clustering results.
Optionally, clustering the M partial image blocks to obtain N clustering results includes: clustering the M partial image blocks using an unsupervised-learning clustering method to obtain the N clustering results, where N is a positive integer greater than 1.
In the embodiments of the present application, before the M partial image blocks are clustered, a classification model is used to obtain the local features of the partial image blocks. The classification model therefore needs to be trained first, and the trained classification model is then used to obtain the local features. When the classification model is trained, the training samples are first divided to obtain multiple partial training samples; the partial training samples are then sampled, the sampled training samples are input to the classification model, the classification model outputs the local features of the training samples, and a loss function is used for back-propagation training.
In the embodiments of the present application, the classification model may first be used to extract features from each of the M partial image blocks, yielding M local features. To overcome the limitations of a single feature, the present application may perform the cluster analysis on a combined local feature formed from several features. Commonly used local image features include color features, LBP features, texture features, and the like. The color features of an image depend little on the size, orientation, and viewing angle of the image; the commonly used color histogram feature describes the proportion of each color in the whole image. Texture is an important kind of spatial information in remote sensing images: as resolution increases, the internal structure of ground objects becomes clearer, which appears in the remote sensing image as increasingly distinct texture. Compared with spectral information, texture features reflect the regular spatial variation of pixels within a target ground object. The terminal device may therefore select different features or feature combinations for different application scenarios or applications; for example, color features and texture features may be combined into a combined local feature for remote sensing images. Local features or combinations of local features may also be selected according to preset feature options, which is not limited in the embodiments of the present application.
The combined local feature may be one-dimensional, that is, the individual features are concatenated; for example, the texture feature may be appended to the color feature. The combined local feature may also be multi-dimensional, that is, the individual features may form a feature matrix, which is not limited in the embodiments of the present application.
Further, the color feature is a color histogram extracted in the HSL (Hue, Saturation, Lightness) color space; compared with the RGB color space, the HSL color space is closer to the visual perception of the human eye.
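An HSL color histogram of this kind can be sketched with Python's standard `colorsys` module. The bin count of 8, the function name `hsl_histogram`, and the choice to concatenate the three per-channel histograms are assumptions made for illustration; the application does not specify them.

```python
import colorsys


def hsl_histogram(pixels, bins=8):
    """Build a normalized H, S, L histogram from (r, g, b) pixels in 0..255.

    Returns one flat list: the hue bins, then saturation bins, then lightness bins.
    """
    hist = [[0] * bins for _ in range(3)]
    count = 0
    for r, g, b in pixels:
        # colorsys works on floats in [0, 1] and returns (hue, lightness, saturation).
        h, l, s = colorsys.rgb_to_hls(r / 255, g / 255, b / 255)
        for channel, value in enumerate((h, s, l)):
            index = min(int(value * bins), bins - 1)  # a value of 1.0 goes in the last bin
            hist[channel][index] += 1
        count += 1
    if count == 0:
        raise ValueError("at least one pixel is required")
    return [c / count for channel in hist for c in channel]


# Hypothetical block containing one red, one green, one blue and one white pixel.
feature = hsl_histogram([(255, 0, 0), (0, 255, 0), (0, 0, 255), (255, 255, 255)])
```

Normalizing by the pixel count makes histograms from blocks of different sizes comparable, which matters when the partial image blocks are not all the same size.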
In the embodiments of the present application, based on the M extracted local features, the partial image blocks among the M partial image blocks that belong to the same category are clustered together to obtain N clustering results. The value of N depends on the number of categories contained in the M partial image blocks. As shown in FIG. 3, the number in each partial image block indicates its category; the 9 partial image blocks are clustered so that blocks of the same category fall into the same cluster, yielding 4 clustering results, of which clustering result 1 contains 3 partial image blocks and clustering results 2 to 4 each contain 2 partial image blocks.
The unsupervised-learning clustering methods include, but are not limited to, the K-means clustering algorithm, the Birch clustering algorithm, the DBSCAN clustering algorithm, and the K-nearest-neighbor classification algorithm.
Generally speaking, a good clustering should reflect the intrinsic structure of the data set as closely as possible, so that members of the same cluster are as similar as possible and members of different clusters are as different as possible. Taking the K-means clustering algorithm as an example, in terms of distance the optimal clustering is the one that minimizes intra-cluster distances and maximizes inter-cluster distances. In the embodiments of the present application, blocks with similar local features are grouped into the same cluster as far as possible, and blocks with dissimilar local features are placed in different clusters as far as possible.
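The K-means idea described above can be illustrated with a small pure-Python sketch. It is not the application's implementation: the number of clusters `k` is passed in by hand, the centroids are naively initialized from the first `k` features for reproducibility, and the toy feature values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(features, k, iterations=20):
    """Cluster feature vectors into k groups; returns (labels, centroids)."""
    # Assumption: naive initialization from the first k features.
    centroids = [list(f) for f in features[:k]]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: each block joins the cluster with the nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(f, centroids[c]))
                  for f in features]
        # Update step: each centroid moves to the mean of its member features.
        for c in range(k):
            members = [f for f, label in zip(features, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Five hypothetical 2-D local features forming two obvious groups.
features = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9], [0.1, 0.2]]
labels, centroids = kmeans(features, k=2)
```

Each centroid can serve as a compact summary of its cluster, which is one way to represent a "clustering result" in later steps.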
S140. Determine a global feature of the image to be classified according to the M partial image blocks and the N clustering results.
The global feature is a feature vector of the image to be classified. Both the global feature and the local features are image features of the image to be classified: the global feature is a feature vector extracted from the entire image, whereas a local feature is a feature vector extracted from one partial image block of the image.
Further, to overcome the limitations of a single feature, the present application may perform image classification on a combined global feature formed from several features. Global image features may include color features, LBP features, texture features, and the like. The terminal device may select different features or feature combinations for different application scenarios or applications; for example, color features and texture features may be combined into a combined global feature for remote sensing images. Global features or combinations of global features may also be selected according to preset feature options, which is not limited in the embodiments of the present application.
Optionally, determining the global feature of the image to be classified according to the M partial image blocks and the N clustering results includes: performing a convolution operation between each of the M partial image blocks and each of the N clustering results to obtain a first feature vector; and performing binary image coding on the first feature vector to obtain the global feature.
For example, as shown in FIG. 4, the 9 partial image blocks are each convolved with the 4 clustering results obtained by clustering those 9 partial image blocks, yielding the first feature vector of the image to be classified; the first feature vector describes the global information of the image to be classified.
The first feature vector may be one-dimensional, in which case the feature vectors obtained by convolving the partial image blocks with the clustering results are concatenated to form the first feature vector; the first feature vector may also be multi-dimensional, which is not limited in the embodiments of the present application.
There may be one or more first feature vectors.
It should be noted that performing a convolution operation between each of the M partial image blocks and each of the N clustering results may mean convolving the local features extracted from the partial image blocks with the local features extracted from the clustering results to obtain the first feature vector.
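The following sketch shows one possible reading of this step, not a confirmed implementation. It assumes that each partial image block and each clustering result is already summarized by a feature vector of the same length, so that a "valid" convolution with a full-length kernel reduces to a single dot product (strictly a cross-correlation, since the kernel is not flipped). The function name and the toy values are hypothetical.

```python
def first_feature_vector(block_features, cluster_features):
    """Correlate every block feature with every cluster feature.

    Returns a one-dimensional vector of length M * N, ordered block by block.
    """
    vector = []
    for block in block_features:
        for cluster in cluster_features:
            # Full-length kernel: one response per (block, cluster) pair.
            vector.append(sum(b * c for b, c in zip(block, cluster)))
    return vector


# Hypothetical features: M = 3 blocks, N = 2 clustering results.
block_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
cluster_features = [[1.0, -1.0], [0.5, 0.5]]
first_vector = first_feature_vector(block_features, cluster_features)
```

Concatenating the responses corresponds to the one-dimensional form of the first feature vector; arranging them as an M-by-N matrix would correspond to the multi-dimensional form.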
Optionally, performing binary image coding on the first feature vector to obtain the global feature includes: according to a binary image coding rule, setting each value in the first feature vector that is greater than a first value to a second value, and setting each value in the first feature vector that is less than or equal to the first value to the first value, thereby obtaining the global feature.
Here, the first value may be 0 and the second value may be 1; alternatively, the first value may be 255 and the second value may be 0. For example, each pixel of an image may be represented by a value in the range 0 to 255. The binary image coding described above converts the image into a grayscale image in which each pixel has only two possible values or gray-level states, and the global feature may be a feature represented with 8 bits. The first value and the second value may also be other values, which is not limited in the embodiments of the present application.
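The coding rule stated above is simple enough to write down directly. The sketch below uses the first pair of values mentioned in the text (first value 0, second value 1) as defaults; the function name `binarize` and the sample vector are hypothetical.

```python
def binarize(vector, first_value=0, second_value=1):
    """Binary coding rule: values greater than first_value become second_value;
    values less than or equal to first_value become first_value."""
    return [second_value if v > first_value else first_value for v in vector]


# Hypothetical first feature vector.
global_feature = binarize([1.0, 0.5, -1.0, 0.5, 0.0, 1.0])
```

After coding, every element takes one of only two values, which keeps the global feature compact and cheap to process on terminal hardware.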
In the embodiments of the present application, before the convolution operation is performed between each of the M partial image blocks and each of the N clustering results, a classification model is used to obtain the local features of the partial image blocks and of the clustering results; the classification model described above may be used for this purpose.
In an implementation of the embodiments of the present application, the terminal device may input the partial image blocks and the clustering results into a preset algorithm and use the preset algorithm to process the image. The preset algorithm may be the Fast Region-Based Convolutional Neural Network (Fast R-CNN) algorithm. For example, the user may preset the convolution window of the Fast R-CNN algorithm; after the terminal inputs the partial image blocks and the clustering results into the Fast R-CNN algorithm, the terminal convolves the image with the convolution window to obtain the first feature vector. Here, the first feature vector refers to the complete matrix obtained after the image is convolved.
S150. Determine a classification result of the image to be classified according to the global feature.
Optionally, determining the classification result of the image to be classified according to the global feature includes: performing a convolution operation between the global feature and a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples labeled with categories; and taking the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
The kernel size of this convolution operation is 1x1. Each value in the probability vector represents the probability that the image to be classified belongs to the corresponding category, so the category corresponding to the largest value in the probability vector may be taken as the classification result of the image to be classified.
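The classification step above can be sketched as follows. For a single feature vector, a 1x1 convolution is a weighted sum over the feature channels per category. The softmax normalization is an assumption added here so that the outputs sum to 1; the application only says that a probability vector is obtained. The weights, bias, and category names are hypothetical stand-ins for a trained convolution vector.

```python
import math


def conv1x1(feature, weights, bias):
    """1x1 convolution at one position: one weighted sum of channels per category."""
    return [sum(w * x for w, x in zip(row, feature)) + b
            for row, b in zip(weights, bias)]


def softmax(scores):
    """Turn raw scores into probabilities (an assumed normalization)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(feature, weights, bias, class_names):
    """Return the category with the highest probability, plus all probabilities."""
    probabilities = softmax(conv1x1(feature, weights, bias))
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], probabilities


# Hypothetical trained parameters for three categories and a 6-element global feature.
weights = [[1, 1, 0, 1, 0, 1], [0, 0, 1, 0, 1, 0], [0.5] * 6]
bias = [0.0, 0.0, 0.0]
label, probabilities = classify([1, 1, 0, 1, 0, 1], weights, bias,
                                ["plant", "natural", "other"])
```

Because the operation is a handful of multiply-adds per category, it is inexpensive and straightforward to parallelize.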
In the embodiments of the present application, the convolution vector needs to be obtained first. It may be obtained by training on the original training samples of each category in an image library, or by training on collected image samples labeled with image categories, which is not limited in the embodiments of the present application. It should be noted that the convolution vector may depend on the labeled image samples, and different labeled image samples may yield different convolution vectors. The terminal device may therefore select image samples suited to the application scenario or application to train the convolution vector. For example, for a terminal device used for observation, in order to classify remote sensing images accurately and provide detailed ground information, the convolution vector may be trained directly on remote sensing image samples with labeled categories, thereby improving the accuracy of image classification.
Convolving the global feature with the convolution vector directly yields the probability vector of the image to be classified. Because a 1x1 convolution involves little computation and parallelizes well, the amount of computation on the original data and the memory footprint are greatly reduced. The image classification algorithm is thus optimized while its accuracy is preserved, and the computational cost of the unsupervised learning method is greatly reduced without sacrificing classification accuracy.
In an implementation of the embodiments of the present application, the method further includes: acquiring a first image set, where the first image set includes images labeled with image categories; and training a classifier to be trained using the first image set to obtain a first classifier;
determining the classification result of the image to be classified according to the global feature includes: inputting the global feature into the first classifier and outputting the classification result of the image to be classified.
Inputting the global feature of the image to be classified into the classifier enables the classifier to classify the image according to that global feature; for example, for a remote sensing image, the classifier may classify it according to the color features and texture features in the global feature.
In the embodiments of the present application, before the classifier is used to obtain the category of the image to be classified, the classifier needs to be trained, and the trained classifier is then used to obtain the category. When the classifier is trained, the trained classifier is used to obtain the global features of the images in the first image set, the global features are input to the classifier, the classifier outputs the categories of the images in the first image set, and target supervision is used for back-propagation training of the classifier. Here, target supervision refers to the supervision used in supervised deep learning, such as a loss function.
Further, the classifier may be a model that classifies the image to be classified according to its image features. Optionally, the classifier in the present application may be a non-linear classifier, such as a non-linear support vector machine (SVM). A non-linear classifier can effectively extend the classification dimensions and mitigate the shortcomings of linear classifiers, such as softmax and fully connected layers, in non-linear classification.
In the embodiments of the present application, the clustering results of the partial image blocks are obtained by an unsupervised-learning clustering method, the global feature of the image to be classified is obtained using a supervised-learning classification model, and the category of the image to be classified is determined from the global feature using a supervised-learning classification model. By combining unsupervised learning with supervised deep learning algorithms, the present application improves the performance of the image classification algorithm and reduces power consumption.
In the embodiments of the present application, the first classifier and the convolution vector may both be obtained by training the classifier to be trained on a single set of image samples, or they may be obtained by training on different image samples, which is not limited in the embodiments of the present application.
It can be seen that the embodiments of the present application provide an image classification method applied to a terminal device. The method acquires an image to be classified, divides it into M partial image blocks, clusters the M partial image blocks to obtain N clustering results, determines a global feature of the image to be classified according to the M partial image blocks and the N clustering results, and determines the classification result of the image to be classified according to the global feature. This reduces the redundant computation in image classification, makes the algorithm leaner and faster, and provides a possible way to run computation-heavy image tasks on terminal hardware.
The foregoing describes the solutions of the embodiments of the present application mainly from the perspective of the method-side execution process. It can be understood that, to implement the above functions, an electronic device includes hardware structures and/or software modules corresponding to each function. Those skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments provided herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is executed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
The embodiments of the present application may divide the electronic device into functional units according to the foregoing method examples. For example, each functional unit may correspond to one function, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division into units in the embodiments of the present application is illustrative and is only a division by logical function; other divisions are possible in actual implementations.
Please refer to FIG. 5. FIG. 5 is a block diagram of the functional units of an image classification apparatus provided by an embodiment of the present application and applied to a terminal device. As shown in FIG. 5, the apparatus includes:
an acquiring unit 510, configured to acquire an image to be classified;
a dividing unit 520, configured to divide the image to be classified to obtain M partial image blocks, where M is a positive integer greater than 1;
a clustering unit 530, configured to cluster the M partial image blocks to obtain N clustering results, where N is a positive integer greater than 1;
a first determining unit 540, configured to determine a global feature of the image to be classified according to the M partial image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
a second determining unit 550, configured to determine a classification result of the image to be classified according to the global feature.
In an implementation of the embodiments of the present application, the first determining unit 540 is specifically configured to: perform a convolution operation between each of the M partial image blocks and each of the N clustering results to obtain a first feature vector; and perform binary image coding on the first feature vector to obtain the global feature.
In an implementation of the embodiments of the present application, the first determining unit 540 is further specifically configured to: according to a binary image coding rule, set each value in the first feature vector that is greater than a first value to a second value, and set each value in the first feature vector that is less than or equal to the first value to the first value, thereby obtaining the global feature.
In an implementation of the embodiments of the present application, the second determining unit 550 is specifically configured to: perform a convolution operation between the global feature and a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples labeled with categories; and take the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
In an implementation of the embodiments of the present application, the kernel size of the convolution operation is 1x1.
In an implementation of the embodiments of the present application, the acquiring unit 510 is further configured to acquire a first image set, where the first image set includes images labeled with image categories.
In an implementation of the embodiments of the present application, the apparatus further includes a training unit 560, configured to train a classifier to be trained using the first image set to obtain a first classifier.
In an implementation of the embodiments of the present application, the second determining unit 550 is further specifically configured to: input the global feature into the first classifier and output the classification result of the image to be classified.
In an implementation of the embodiments of the present application, the clustering unit 530 is specifically configured to: cluster the M partial image blocks using an unsupervised-learning clustering method to obtain the N clustering results.
In an implementation of the embodiments of the present application, before the image to be classified is divided, the acquiring unit 510 is further configured to: acquire an original image and preprocess the original image to obtain the image to be classified.
It can be understood that the functions of the program modules of the image classification apparatus in the embodiments of the present application may be implemented according to the methods in the foregoing method embodiments; for the specific implementation process, refer to the relevant description of the foregoing method embodiments, which is not repeated here.
It can be seen that the embodiments of the present application provide an image classification apparatus applied to a terminal device. The apparatus acquires an image to be classified, divides it into M partial image blocks, clusters the M partial image blocks to obtain N clustering results, determines a global feature of the image to be classified according to the M partial image blocks and the N clustering results, and determines the classification result of the image to be classified according to the global feature. This reduces the redundant computation in image classification, makes the algorithm leaner and faster, and provides a possible way to run computation-heavy image tasks on terminal hardware.
Please refer to FIG. 6, which is a schematic structural diagram of a terminal device provided by an embodiment of the present application. As shown in FIG. 6, the terminal device includes one or more processors, one or more memories, one or more communication interfaces, and one or more programs;
the one or more programs are stored in the memory and are configured to be executed by the one or more processors;
the programs include instructions for performing the following steps:
acquiring an image to be classified;
dividing the image to be classified to obtain M local image blocks, where M is a positive integer greater than 1;
clustering the M local image blocks to obtain N clustering results, where N is a positive integer greater than 1;
determining a global feature of the image to be classified according to the M local image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
determining a classification result of the image to be classified according to the global feature.
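As a non-limiting illustration of the dividing step listed above, the following minimal sketch splits a grayscale image into M non-overlapping blocks of equal size. The function name `split_into_blocks`, the `block` parameter, and the choice to crop rows and columns that do not fill a whole block are assumptions made for illustration; the application does not prescribe them.

```python
import numpy as np

def split_into_blocks(image: np.ndarray, block: int) -> np.ndarray:
    """Split an H x W grayscale image into non-overlapping block x block patches.

    Returns an array of shape (M, block, block). Rows and columns that do not
    fill a whole block are cropped (an assumption; padding would also work).
    """
    h, w = image.shape
    rows, cols = h // block, w // block
    cropped = image[: rows * block, : cols * block]
    # (rows, block, cols, block) -> (rows, cols, block, block) -> (M, block, block)
    patches = cropped.reshape(rows, block, cols, block).swapaxes(1, 2)
    return patches.reshape(rows * cols, block, block)

# Hypothetical 6 x 6 image divided into 3 x 3 blocks, giving M = 4.
image = np.arange(36, dtype=np.float32).reshape(6, 6)
blocks = split_into_blocks(image, block=3)
print(blocks.shape)  # (4, 3, 3)
```

With a fixed block size, M grows with the image resolution, which is one way the number of blocks could depend on resolution.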
In an implementation of the embodiments of the present application, the programs further include instructions for performing the following steps: performing a convolution operation on the M local image blocks with each of the N clustering results, respectively, to obtain a first feature vector; and performing binary image encoding on the first feature vector to obtain the global feature.
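One possible reading of the convolution step can be sketched as follows. The sketch assumes that each clustering result is a cluster center with the same shape as a local image block, so that applying it to a block as a filter yields a single response (a correlation-style inner product, without kernel flipping). The function name `first_feature_vector` and the flattening of the M x N responses into one vector are illustrative assumptions, not details stated in the application.

```python
import numpy as np

def first_feature_vector(blocks: np.ndarray, centers: np.ndarray) -> np.ndarray:
    """Respond each of M blocks to each of N cluster centers.

    blocks:  (M, h, w) local image blocks
    centers: (N, h, w) clustering results used as filters (assumed centroids)
    Returns a flat vector of length M * N.
    """
    m = blocks.reshape(blocks.shape[0], -1)    # (M, h*w)
    c = centers.reshape(centers.shape[0], -1)  # (N, h*w)
    responses = m @ c.T                        # (M, N) block-vs-center responses
    return responses.reshape(-1)

# Hypothetical data: two 2 x 2 blocks and two 2 x 2 cluster centers.
blocks = np.stack([np.ones((2, 2)), 2 * np.ones((2, 2))])
centers = np.stack([np.ones((2, 2)), -np.ones((2, 2))])
features = first_feature_vector(blocks, centers)
print(features)  # [ 4. -4.  8. -8.]
```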
In an implementation of the embodiments of the present application, the programs further include instructions for performing the following steps: according to a binary image encoding rule, setting each value in the first feature vector that is greater than a first value to a second value, and setting each value in the first feature vector that is less than or equal to the first value to the first value, to obtain the global feature.
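The binary encoding rule described above amounts to a threshold operation. The sketch below assumes, purely for illustration, a first value of 0 and a second value of 1; the application does not fix these values, and the function name `binarize` is hypothetical.

```python
import numpy as np

def binarize(first_feature: np.ndarray,
             first_value: float = 0.0,
             second_value: float = 1.0) -> np.ndarray:
    """Map values > first_value to second_value, and all others to first_value."""
    return np.where(first_feature > first_value, second_value, first_value)

global_feature = binarize(np.array([4.0, -4.0, 0.0, 8.0]))
print(global_feature)  # [1. 0. 0. 1.]
```

Because every entry takes one of only two values, the resulting global feature could be stored compactly, which is consistent with the stated goal of reducing computation on terminal hardware.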
In an implementation of the embodiments of the present application, the programs further include instructions for performing the following steps: performing a convolution operation on the global feature with a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples with labeled categories; and taking the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
In an implementation of the embodiments of the present application, the size of the convolution operation is 1x1.
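For a feature vector, a 1x1 convolution with one filter per category is equivalent to multiplying the feature by a weight matrix. The following sketch illustrates that equivalence under stated assumptions: the weights, the label names, and the use of softmax to turn responses into a probability vector are all hypothetical, since the application only states that a probability vector is obtained and its maximum selected.

```python
import numpy as np

def classify(global_feature: np.ndarray, weights: np.ndarray, labels: list):
    """Apply a 1x1 convolution (one weight row per category) and pick the best category.

    global_feature: (D,) binary feature vector
    weights:        (C, D) trained convolution weights (hypothetical values here)
    """
    logits = weights @ global_feature      # 1x1 convolution == per-category dot product
    exp = np.exp(logits - logits.max())    # softmax (an assumption), numerically stable
    probs = exp / exp.sum()
    return labels[int(np.argmax(probs))], probs

weights = np.array([[1.0, 0.0, 0.0, 1.0],
                    [0.0, 1.0, 1.0, 0.0]])
label, probs = classify(np.array([1.0, 0.0, 0.0, 1.0]), weights, ["scene_a", "scene_b"])
print(label)  # scene_a
```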
In an implementation of the embodiments of the present application, the programs further include instructions for performing the following steps: acquiring a first image set, where the first image set includes images with labeled image categories; and training a classifier to be trained using the first image set to obtain a first classifier.
In an implementation of the embodiments of the present application, the programs further include instructions for performing the following steps: inputting the global feature into the first classifier, and outputting the classification result of the image to be classified.
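The application does not specify the type of classifier. As a stand-in only, the sketch below trains a nearest-centroid classifier on labeled global features and then classifies a new global feature; the class name, the toy features, and the "indoor"/"outdoor" labels are hypothetical.

```python
import numpy as np

class NearestCentroidClassifier:
    """Illustrative stand-in for the 'first classifier' (type not given in the source)."""

    def fit(self, features: np.ndarray, labels: list):
        label_arr = np.asarray(labels)
        self.classes_ = sorted(set(labels))
        # One centroid per category: the mean of that category's global features.
        self.centroids_ = np.stack(
            [features[label_arr == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, feature: np.ndarray) -> str:
        dists = np.linalg.norm(self.centroids_ - feature, axis=1)
        return self.classes_[int(np.argmin(dists))]

# Hypothetical first image set, already reduced to binary global features.
train_x = np.array([[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 1]], dtype=float)
train_y = ["indoor", "indoor", "outdoor", "outdoor"]
clf = NearestCentroidClassifier().fit(train_x, train_y)
print(clf.predict(np.array([1.0, 1.0, 0.0, 1.0])))  # indoor
```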
In an implementation of the embodiments of the present application, the programs further include instructions for performing the following steps: clustering the M local image blocks using an unsupervised-learning clustering method to obtain the N clustering results.
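The application names no particular unsupervised clustering method. K-means is one common choice and is used in the sketch below purely as an example; the function name `kmeans`, the iteration count, and the random initialization are assumptions.

```python
import numpy as np

def kmeans(blocks: np.ndarray, n_clusters: int, n_iters: int = 10, seed: int = 0):
    """Cluster M blocks into n_clusters groups; returns (centers, assignments)."""
    rng = np.random.default_rng(seed)
    x = blocks.reshape(blocks.shape[0], -1).astype(np.float64)  # (M, h*w)
    # Initialize centers from distinct, randomly chosen blocks.
    centers = x[rng.choice(len(x), size=n_clusters, replace=False)]
    for _ in range(n_iters):
        # Squared Euclidean distance from every block to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for k in range(n_clusters):
            members = x[assign == k]
            if len(members):  # keep the old center if a cluster is empty
                centers[k] = members.mean(axis=0)
    return centers.reshape((n_clusters,) + blocks.shape[1:]), assign

# Hypothetical data: two dark and two bright 2 x 2 blocks, clustered with N = 2.
blocks = np.stack([np.full((2, 2), v) for v in (0.0, 1.0, 10.0, 11.0)])
centers, assign = kmeans(blocks, n_clusters=2)
print(centers.shape)  # (2, 2, 2)
```

Returning the centers in block shape lets them be reused directly as filters of the same size as the local image blocks.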
In an implementation of the embodiments of the present application, before the image to be classified is divided, the programs further include instructions for performing the following steps: acquiring an original image, and preprocessing the original image to obtain the image to be classified.
It can be seen that the embodiments of the present application provide an image classification device applied to a terminal device. The device acquires an image to be classified, divides the image to be classified to obtain M local image blocks, clusters the M local image blocks to obtain N clustering results, determines a global feature of the image to be classified according to the M local image blocks and the N clustering results, and determines a classification result of the image to be classified according to the global feature. This reduces redundant computation in image classification, simplifies and accelerates the algorithm, and provides a possible way for computation-heavy image processing to run on terminal hardware.
An embodiment of the present application further provides a computer storage medium, where the computer storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to perform some or all of the steps of any method described in the foregoing method embodiments. The computer includes a terminal device.
An embodiment of the present application further provides a computer program product. The computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to perform some or all of the steps of any method described in the foregoing method embodiments. The computer program product may be a software installation package, and the computer includes a terminal device.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. In addition, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For parts that are not described in detail in one embodiment, refer to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed device may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into the foregoing units is merely a division by logical function, and other divisions are possible in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical or in other forms.
The units described above as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods in the embodiments of the present application. The foregoing memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Those of ordinary skill in the art can understand that all or some of the steps in the methods of the foregoing embodiments may be completed by a program instructing related hardware. The program may be stored in a computer-readable memory, and the memory may include a flash memory, a ROM, a RAM, a magnetic disk, an optical disc, or the like.
The embodiments of the present application are described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are intended only to help understand the method of the present application and its core idea. Meanwhile, those of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (20)

  1. An image classification method, characterized in that the method comprises:
    acquiring an image to be classified;
    dividing the image to be classified to obtain M local image blocks, where M is a positive integer greater than 1;
    clustering the M local image blocks to obtain N clustering results, where N is a positive integer greater than 1;
    determining a global feature of the image to be classified according to the M local image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
    determining a classification result of the image to be classified according to the global feature.
  2. The method according to claim 1, characterized in that the determining a global feature of the image to be classified according to the M local image blocks and the N clustering results comprises:
    performing a convolution operation on the M local image blocks with each of the N clustering results, respectively, to obtain a first feature vector;
    performing binary image encoding on the first feature vector to obtain the global feature.
  3. The method according to claim 2, characterized in that the performing binary image encoding on the first feature vector to obtain the global feature comprises:
    according to a binary image encoding rule, setting each value in the first feature vector that is greater than a first value to a second value, and setting each value in the first feature vector that is less than or equal to the first value to the first value, to obtain the global feature.
  4. The method according to any one of claims 1-3, characterized in that the determining a classification result of the image to be classified according to the global feature comprises:
    performing a convolution operation on the global feature with a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples with labeled categories;
    taking the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
  5. The method according to claim 4, characterized in that the size of the convolution operation is 1x1.
  6. The method according to any one of claims 1-3, characterized in that the method further comprises:
    acquiring a first image set, where the first image set includes images with labeled image categories;
    training a classifier to be trained using the first image set to obtain a first classifier;
    the determining a classification result of the image to be classified according to the global feature comprises: inputting the global feature into the first classifier, and outputting the classification result of the image to be classified.
  7. The method according to any one of claims 1-6, characterized in that the clustering the M local image blocks to obtain N clustering results comprises:
    clustering the M local image blocks using an unsupervised-learning clustering method to obtain the N clustering results.
  8. The method according to any one of claims 1-7, characterized in that the dividing the image to be classified to obtain M local image blocks comprises:
    dividing the image to be classified into M local image blocks of the same size according to a preset image size.
  9. The method according to claim 8, characterized in that the value of M is determined by the resolution of the image to be classified, and the resolution of the image to be classified has a directly proportional relationship or a mapping relationship with the value of M.
  10. An image classification device, characterized in that the device comprises:
    an acquiring unit, configured to acquire an image to be classified;
    a dividing unit, configured to divide the image to be classified to obtain M local image blocks, where M is a positive integer greater than 1;
    a clustering unit, configured to cluster the M local image blocks to obtain N clustering results, where N is a positive integer greater than 1;
    a first determining unit, configured to determine a global feature of the image to be classified according to the M local image blocks and the N clustering results, where the global feature is a feature vector of the image to be classified;
    a second determining unit, configured to determine a classification result of the image to be classified according to the global feature.
  11. The device according to claim 10, characterized in that the first determining unit is specifically configured to:
    perform a convolution operation on the M local image blocks with each of the N clustering results, respectively, to obtain a first feature vector; and perform binary image encoding on the first feature vector to obtain the global feature.
  12. The device according to claim 11, characterized in that the first determining unit is further specifically configured to:
    according to a binary image encoding rule, set each value in the first feature vector that is greater than a first value to a second value, and set each value in the first feature vector that is less than or equal to the first value to the first value, to obtain the global feature.
  13. The device according to any one of claims 10-12, characterized in that the second determining unit is specifically configured to:
    perform a convolution operation on the global feature with a convolution vector to obtain a probability vector of the global feature, where the convolution vector is obtained by training on image samples with labeled categories; and take the category corresponding to the maximum value in the probability vector as the classification result of the image to be classified.
  14. The device according to claim 13, characterized in that the size of the convolution operation is 1x1.
  15. The device according to any one of claims 10-12, characterized in that the device further comprises a training unit;
    the acquiring unit is further configured to: acquire a first image set, where the first image set includes images with labeled image categories;
    the training unit is configured to: train a classifier to be trained using the first image set to obtain a first classifier;
    the second determining unit is further specifically configured to: input the global feature into the first classifier, and output the classification result of the image to be classified.
  16. The device according to any one of claims 10-15, characterized in that the clustering unit is specifically configured to:
    cluster the M local image blocks using an unsupervised-learning clustering method to obtain the N clustering results.
  17. The device according to any one of claims 10-16, characterized in that the dividing unit is specifically configured to:
    divide the image to be classified into M local image blocks of the same size according to a preset image size.
  18. The device according to claim 17, characterized in that the value of M is determined by the resolution of the image to be classified, and the resolution of the image to be classified has a directly proportional relationship or a mapping relationship with the value of M.
  19. A terminal device, characterized in that the terminal device comprises a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for performing the steps of the method according to any one of claims 1-9.
  20. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to perform the method according to any one of claims 1-9.
PCT/CN2021/075045 2020-02-18 2021-02-03 Image classification method and apparatus WO2021164550A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010101515.6A CN111325271B (en) 2020-02-18 2020-02-18 Image classification method and device
CN202010101515.6 2020-02-18

Publications (1)

Publication Number Publication Date
WO2021164550A1 true WO2021164550A1 (en) 2021-08-26

Family

ID=71168841

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/075045 WO2021164550A1 (en) 2020-02-18 2021-02-03 Image classification method and apparatus

Country Status (2)

Country Link
CN (1) CN111325271B (en)
WO (1) WO2021164550A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116206208A (en) * 2023-05-05 2023-06-02 河东区志远苗木种植专业合作社 Forestry plant diseases and insect pests rapid analysis system based on artificial intelligence
CN116612389A (en) * 2023-07-20 2023-08-18 青建国际集团有限公司 Building construction progress management method and system

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111325271B (en) * 2020-02-18 2023-09-12 Oppo广东移动通信有限公司 Image classification method and device
CN111881849A (en) * 2020-07-30 2020-11-03 Oppo广东移动通信有限公司 Image scene detection method and device, electronic equipment and storage medium
CN111652329B (en) * 2020-08-05 2020-11-10 腾讯科技(深圳)有限公司 Image classification method and device, storage medium and electronic equipment
CN113112518B (en) * 2021-04-19 2024-03-26 深圳思谋信息科技有限公司 Feature extractor generation method and device based on spliced image and computer equipment
CN114418030B (en) * 2022-01-27 2024-04-23 腾讯科技(深圳)有限公司 Image classification method, training method and device for image classification model

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7840071B2 (en) * 2006-12-12 2010-11-23 Seiko Epson Corporation Method and apparatus for identifying regions of different content in an image
CN104077597A (en) * 2014-06-25 2014-10-01 小米科技有限责任公司 Image classifying method and device
CN109410196A (en) * 2018-10-24 2019-03-01 东北大学 Cervical cancer tissues pathological image diagnostic method based on Poisson annular condition random field
CN110751218A (en) * 2019-10-22 2020-02-04 Oppo广东移动通信有限公司 Image classification method, image classification device and terminal equipment
CN111325271A (en) * 2020-02-18 2020-06-23 Oppo广东移动通信有限公司 Image classification method and device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008282267A (en) * 2007-05-11 2008-11-20 Seiko Epson Corp Scene discrimination device and scene discrimination method
CN104517113B (en) * 2013-09-29 2017-12-19 浙江大华技术股份有限公司 The sorting technique and relevant apparatus of a kind of feature extracting method of image, image
CN103942564B (en) * 2014-04-08 2017-02-15 武汉大学 High-resolution remote sensing image scene classifying method based on unsupervised feature learning
CN104036293B (en) * 2014-06-13 2017-02-22 武汉大学 Rapid binary encoding based high resolution remote sensing image scene classification method
CN107368791A (en) * 2017-06-29 2017-11-21 广东欧珀移动通信有限公司 Living iris detection method and Related product
CN110580482B (en) * 2017-11-30 2022-04-08 腾讯科技(深圳)有限公司 Image classification model training, image classification and personalized recommendation method and device
CN108647602B (en) * 2018-04-28 2019-11-12 北京航空航天大学 A kind of aerial remote sensing images scene classification method determined based on image complexity
CN109800781A (en) * 2018-12-07 2019-05-24 北京奇艺世纪科技有限公司 A kind of image processing method, device and computer readable storage medium
CN110309865A (en) * 2019-06-19 2019-10-08 上海交通大学 A kind of unmanned plane patrolling power transmission lines pin defect system image-recognizing method

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116206208A (en) * 2023-05-05 2023-06-02 河东区志远苗木种植专业合作社 Forestry plant diseases and insect pests rapid analysis system based on artificial intelligence
CN116612389A (en) * 2023-07-20 2023-08-18 青建国际集团有限公司 Building construction progress management method and system
CN116612389B (en) * 2023-07-20 2023-09-19 青建国际集团有限公司 Building construction progress management method and system

Also Published As

Publication number Publication date
CN111325271A (en) 2020-06-23
CN111325271B (en) 2023-09-12

Similar Documents

Publication Publication Date Title
WO2021164550A1 (en) Image classification method and apparatus
US20200186714A1 (en) Estimating hdr lighting conditions from a single ldr digital image
CN112651438A (en) Multi-class image classification method and device, terminal equipment and storage medium
WO2021196389A1 (en) Facial action unit recognition method and apparatus, electronic device, and storage medium
US11704357B2 (en) Shape-based graphics search
CN106874937B (en) Text image generation method, text image generation device and terminal
US10032091B2 (en) Spatial organization of images based on emotion face clouds
WO2020238515A1 (en) Image matching method and apparatus, device, medium, and program product
CN108961267B (en) Picture processing method, picture processing device and terminal equipment
CN110751218B (en) Image classification method, image classification device and terminal equipment
WO2021129466A1 (en) Watermark detection method, device, terminal and storage medium
WO2016112797A1 (en) Method and device for determining image display information
WO2023000895A1 (en) Image style conversion method and apparatus, electronic device and storage medium
CN111047509A (en) Image special effect processing method and device and terminal
US11804043B2 (en) Detecting objects in a video using attention models
CN111598149B (en) Loop detection method based on attention mechanism
CN111553838A (en) Model parameter updating method, device, equipment and storage medium
WO2022206729A1 (en) Method and apparatus for selecting cover of video, computer device, and storage medium
CN110163095B (en) Loop detection method, loop detection device and terminal equipment
WO2023024413A1 (en) Information matching method and apparatus, computer device and readable storage medium
CN108629767B (en) Scene detection method and device and mobile terminal
US11934958B2 (en) Compressing generative adversarial neural networks
WO2021051562A1 (en) Facial feature point positioning method and apparatus, computing device, and storage medium
Han Texture image compression algorithm based on self-organizing neural network
CN112150347A (en) Image modification patterns learned from a limited set of modified images

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21756441

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21756441

Country of ref document: EP

Kind code of ref document: A1