CN114140637A - Image classification method, storage medium and electronic device - Google Patents


Info

Publication number
CN114140637A
CN114140637A
Authority
CN
China
Prior art keywords
image
cropping
classification
views
view
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111227616.9A
Other languages
Chinese (zh)
Other versions
CN114140637B (en)
Inventor
袁建龙
徐渊鸿
王志斌
李昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Damo Institute Hangzhou Technology Co Ltd
Original Assignee
Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Damo Institute Hangzhou Technology Co Ltd
Priority to CN202111227616.9A
Publication of CN114140637A
Application granted
Publication of CN114140637B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention discloses an image classification method, a storage medium and an electronic device. The method comprises: acquiring an image to be classified; analyzing the image to be classified with a classification model to obtain a classification result, wherein the classification model is trained using multiple frames of images and multiple views obtained by applying data enhancement processing to each frame of image, the multiple views being multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame, each crop view corresponding to a different crop region of the whole image, and the types of cropped objects in the crop views differing from one another; and displaying the classification result. The invention solves the technical problem in the prior art that the classification accuracy of a ground feature classification model is low because the ground feature classification model for the target task is not constructed accurately enough.

Description

Image classification method, storage medium and electronic device
Technical Field
The present invention relates to the field of image classification technologies, and in particular, to an image classification method, a storage medium, and an electronic device.
Background
The world operates under strict physical and biological rules, so observations of the world (images) must exhibit certain a priori regularities. For example, the image coloring task exploits the association between object categories and object color distributions; the image inpainting task exploits the association between object categories and shape textures; and the rotation prediction task exploits the association between object categories and their orientations.
In the related art, there is at present no effective solution for constructing a more accurate classification model for a target task and thereby improving the accuracy of the classification model.
Disclosure of Invention
The embodiment of the invention provides an image classification method, a storage medium and an electronic device, which are used for at least solving the technical problem in the prior art that the classification accuracy of a ground feature classification model is low because the ground feature classification model for the target task is not constructed accurately enough.
According to an aspect of an embodiment of the present invention, there is provided an image classification method, comprising: acquiring an image to be classified; analyzing the image to be classified with a classification model to obtain a classification result, wherein the classification model is trained using multiple frames of images and multiple views obtained by applying data enhancement processing to each frame of image, the multiple views being multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame, each crop view corresponding to a different crop region of the whole image, and the types of cropped objects in the crop views differing from one another; and displaying the classification result.
According to another aspect of the embodiments of the present invention, there is also provided a natural resource ground feature classification method, comprising: acquiring a natural resource image to be classified; analyzing the natural resource image with a ground feature classification model to obtain the natural resource ground feature type corresponding to the image, wherein the ground feature classification model is trained using multiple frames of natural resource images and multiple views obtained by applying data enhancement processing to each frame of natural resource image, the multiple views being multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame, each crop view corresponding to a different crop region of the whole image, and the types of cropped objects in the crop views differing from one another; and displaying the natural resource ground feature type.
According to another aspect of the embodiments of the present invention, there is also provided a method for detecting a change in a building, comprising: acquiring a building image to be detected; analyzing the building image with a change detection model to determine a detection result indicating whether the building image has changed, wherein the change detection model is trained using multiple frames of building images and multiple views obtained by applying data enhancement processing to each frame of building image, the multiple views being multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame, each crop view corresponding to a different crop region of the whole image, and the types of cropped objects in the crop views differing from one another; and displaying the detection result.
According to another aspect of the embodiments of the present invention, there is also provided an image classification method, comprising: receiving an image to be classified from a client; analyzing the image with a classification model to obtain a classification result, wherein the classification model is trained using multiple frames of images and multiple views obtained by applying data enhancement processing to each frame of image, the multiple views being multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame, each crop view corresponding to a different crop region of the whole image, and the types of cropped objects in the crop views differing from one another; and returning the classification result to the client for display on the client.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium comprising a stored program, wherein, when the program runs, the device in which the computer-readable storage medium is located is controlled to execute any one of the above image classification methods.
According to another aspect of the embodiments of the present invention, there is also provided an electronic device, comprising: a processor; and a memory, connected to the processor, for providing the processor with instructions for the following processing steps: acquiring an image to be classified; analyzing the image with a classification model to obtain a classification result, wherein the classification model is trained using multiple frames of images and multiple views obtained by applying data enhancement processing to each frame of image, the multiple views being multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame, each crop view corresponding to a different crop region of the whole image, and the types of cropped objects in the crop views differing from one another; and displaying the classification result.
In the embodiment of the invention, an image to be classified is acquired; the image is analyzed with a classification model, trained using multiple frames of images and multiple views obtained by applying data enhancement processing to each frame of image, to obtain a classification result; and the classification result is displayed.
It is easy to note that the multiple views are multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame of image; each crop view corresponds to a different crop region of the whole image, and the types of cropped objects differ across the crop views. A classification model is trained in advance using multiple frames of images and the multiple views derived from each frame, and the acquired image is then analyzed with this classification model to obtain the classification result. This achieves the purpose of constructing a more accurate classification model for the target task to classify the image: one image is divided into one or more different views and local region enhancement is applied, which improves the classification accuracy of the classification model and thereby solves the prior-art technical problem that a ground feature classification model for the target task, not being constructed accurately enough, has low classification accuracy.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
fig. 1 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing an image classification method;
FIG. 2 is a flow chart of a method of image classification according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an application scenario of an image classification method according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an alternative pre-trained first neural network model, in accordance with embodiments of the present invention;
FIG. 5 is a schematic diagram of an alternative process for pre-training a first neural network model, according to an embodiment of the present invention;
FIG. 6a is a schematic illustration of an alternative remote sensing self-supervised image in accordance with embodiments of the present invention;
FIG. 6b is a schematic diagram of an alternative natural image according to an embodiment of the invention;
FIG. 7 is a flow chart of a method of image classification according to an embodiment of the invention;
FIG. 8 is a flow chart of a method of classifying terrain according to an embodiment of the present invention;
FIG. 9 is a flow chart of a method of detecting a change in terrain in accordance with an embodiment of the present invention;
FIG. 10 is a flow chart of a method for remote sensing image classification according to an embodiment of the invention;
FIG. 11 is a flow chart of a method of terrain identification according to an embodiment of the present invention;
fig. 12 is a diagram illustrating terrain classification performed at a cloud server according to an embodiment of the present invention;
FIG. 13 is a flow chart of a natural resource feature classification method according to an embodiment of the invention;
FIG. 14 is a flow chart of a method for detecting a change in a building feature according to an embodiment of the present invention;
FIG. 15 is a schematic structural diagram of an apparatus for obtaining a neural network model according to an embodiment of the present invention;
fig. 16 is a block diagram of another configuration of a computer terminal according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
First, some terms or terms appearing in the description of the embodiments of the present invention are applicable to the following explanations:
AIEarth: a platform that analyzes the earth with AI techniques to distinguish different ground feature types, applied to problems in natural resources, the water conservancy industry, natural disasters, and the like.
Self-supervised learning: self-supervised learning means that the labels (ground truths) used for machine learning are derived from the data itself, not from manual annotation. First, self-supervised learning belongs to unsupervised learning, so the learning targets need no manual labels. Second, the current field of self-supervised learning can be roughly divided into two branches. The first is self-supervised learning for solving specific tasks, e.g. scene de-occlusion, self-supervised depth estimation, optical flow estimation, image correspondence matching, etc. The other branch is representation learning: in supervised representation learning, a typical example is ImageNet classification; in unsupervised representation learning, the most important method is self-supervised learning.
Codebook: an intermediate representation into which features are encoded as a set of discrete codes.
Example 1
In accordance with an embodiment of the present invention, there is provided an embodiment of an image classification method. It should be noted that the steps illustrated in the flowcharts of the drawings may be performed in a computer system such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order different from the one here.
The method provided by embodiment 1 of the present invention may be implemented in a mobile terminal, a computer terminal, or a similar computing device. Fig. 1 shows a hardware configuration block diagram of a computer terminal (or mobile device) for implementing the image classification method. As shown in fig. 1, the computer terminal 10 (or mobile device 10) may include one or more processors 102 (shown as 102a, 102b, …, 102n; the processors 102 may include, but are not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA), a memory 104 for storing data, and a transmission module 106 for communication functions. In addition, it may also include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the bus), a network interface, a power source, and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 1 is only an illustration and does not limit the structure of the electronic device. For example, the computer terminal 10 may also include more or fewer components than shown in fig. 1, or have a different configuration from that shown in fig. 1.
It should be noted that the one or more processors 102 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the computer terminal 10 (or mobile device). As referred to in the embodiments of the invention, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 104 may be used to store software programs and modules of application software, such as program instructions/data storage devices corresponding to the image classification method in the embodiment of the present invention, and the processor 102 executes various functional applications and data processing by running the software programs and modules stored in the memory 104, so as to implement the image classification method. The memory 104 may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the computer terminal 10. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 can be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computer terminal 10 (or mobile device).
It should be noted that the world operates under strict physical and biological rules, so there must be certain a priori regularities in observations of the world (images). For example, the image coloring task exploits the association between object categories and object color distributions; the image inpainting task exploits the association between object categories and shape textures; and the rotation prediction task exploits the association between object categories and their orientations. By mining more such priors, further self-supervised learning tasks can be designed.
The popularity of self-supervised learning is inevitable: now that various mainstream supervised learning tasks have matured, data has become the most important bottleneck. Through continued analysis of semi-supervised and self-supervised methods, learning effective information from unlabeled data has remained an important research topic; for example, achieving higher performance with self-supervised learning while reducing the amount of data annotation opens up a very rich space of possibilities for self-supervised learning.
Self-supervised learning means that the labels (ground truths) used for machine learning are derived from the data itself, not from manual labels. Typical self-supervised learning methods include: solving jigsaw puzzles, motion propagation, rotation prediction, the recently popular MoCo, and the like. There are other ways to categorize the field; for example, based on the data modality, self-supervised learning can be divided into audio, video, image, and language variants. The embodiment of the invention mainly discusses self-supervised learning on images.
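As a concrete illustration of labels derived from the data itself, the rotation prediction pretext task mentioned above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the patent's implementation; the array size and the `make_rotation_task` helper are hypothetical names introduced here for illustration.

```python
import numpy as np

def make_rotation_task(image: np.ndarray, k: int):
    """Derive a self-supervised (view, label) pair from the image itself:
    rotate by k * 90 degrees and use k as the pseudo-label. No manual
    annotation is needed -- the label comes from the transformation."""
    assert k in (0, 1, 2, 3)
    view = np.rot90(image, k=k)  # rotated view fed to the model
    label = k                    # pseudo-label: which rotation was applied?
    return view, label

# Hypothetical 4x4 single-channel "image" for demonstration
img = np.arange(16).reshape(4, 4)
view, label = make_rotation_task(img, k=1)
```

A model trained to predict which of the four rotations was applied must learn the association between object categories and their orientations, which is exactly the prior this pretext task exploits.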
In the foregoing operating environment, the present invention provides an image classification method as shown in fig. 2. Fig. 2 is a flowchart of an image classification method according to an embodiment of the present invention; as shown in fig. 2, the image classification method includes:
Step S102, acquiring an image to be classified;
Step S104, analyzing the image to be classified with a classification model to obtain a classification result, wherein the classification model is trained using multiple frames of images and multiple views obtained by applying data enhancement processing to each frame of image, the multiple views being multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame, each crop view corresponding to a different crop region of the whole image, and the types of cropped objects in the crop views differing from one another;
Step S106, displaying the classification result.
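The three steps above can be sketched as a minimal, framework-agnostic pipeline. The `classify` stand-in below is hypothetical (a trivial brightness threshold, not the patent's neural network); it only illustrates the acquire/analyze/display flow of steps S102 to S106.

```python
def acquire_image():
    # S102: acquire the image to be classified (stubbed as a nested list)
    return [[0.1, 0.9], [0.4, 0.6]]

def classify(image):
    # S104: analyze the image with the classification model (stubbed here:
    # a fake rule that picks a class from the mean pixel value)
    flat = [v for row in image for v in row]
    return "class_A" if sum(flat) / len(flat) > 0.5 else "class_B"

def display(result):
    # S106: display the classification result
    print(f"classification result: {result}")

image = acquire_image()
result = classify(image)
display(result)
```

In the patent, the interesting part is entirely inside `classify`: the model it applies was pre-trained on multiple crop views per frame, as discussed below.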
In the embodiment of the invention, an image to be classified is acquired; the image is analyzed with a classification model, trained using multiple frames of images and multiple views obtained by applying data enhancement processing to each frame of image, to obtain a classification result; and the classification result is displayed.
It is easy to note that the multiple views are multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame of image; each crop view corresponds to a different crop region of the whole image, and the types of cropped objects differ across the crop views. A classification model is trained in advance using multiple frames of images and the multiple views derived from each frame, and the acquired image is then analyzed with this classification model to obtain the classification result. This achieves the purpose of constructing a more accurate classification model for the target task to classify the image: one image is divided into one or more different views and local region enhancement is applied, which improves the classification accuracy of the classification model and thereby solves the prior-art technical problem that a ground feature classification model for the target task, not being constructed accurately enough, has low classification accuracy.
It should be noted that the embodiment of the present invention may be applied to, but is not limited to, practical application scenarios of ground feature classification, change detection, remote sensing image classification, and ground feature identification. For example, it can also be applied to the following technical fields: the meteorological field (e.g., cloud extraction, weather forecasting, weather early warning, etc.); the natural resource and ecological environment fields (e.g., weather forecast, change detection, ecological red-line change detection, multi-class change detection, ground feature classification, greenhouse extraction, road network extraction, building change detection (satellite, unmanned aerial vehicle), etc.); the water conservancy field (e.g., water area change detection, greenhouse extraction, water body extraction (optical, radar), forest patch extraction, cage culture extraction, sand pit extraction, river house extraction, barrage extraction, photovoltaic power plant extraction, etc.); the agroforestry field (e.g., crop extraction (wheat, rice, potato, etc.), unmanned aerial vehicle crop identification (corn, flue-cured tobacco, etc.), land parcel identification, growth monitoring (index calculation), agricultural assessment, pest monitoring, planting suggestion pushing, etc.); the secondary disaster field (e.g., disaster monitoring, travel disaster warning, etc.); the life services (e.g., take-out, logistics) field (e.g., travel route planning, travel advice pushing, personnel mobilization, price adjustment, etc.); and the city planning field (e.g., road network extraction (satellite, drone), building extraction, building change detection (satellite, drone), fire protection, etc.).
Because pictures have spatial coherence and videos have spatio-temporal coherence, self-supervised learning tasks can be designed to exploit these characteristics, for example the semantic consistency over object space in pictures and the temporal consistency of object motion in videos.
In the embodiment of the application, a classification model is obtained in advance by training with multiple frames of images and the multiple views obtained by applying data enhancement processing to each frame of image; the acquired image is then analyzed with the classification model to obtain the classification result, achieving the purpose of constructing a more accurate classification model for the target task to classify the image.
As an alternative embodiment, the classification model includes a first neural network model, and analyzing the image to be classified with the classification model to obtain the classification result includes:
Step S1041, analyzing the image to be classified with the first neural network model to obtain the classification result, wherein each group of data in the multiple groups of unlabeled data used by the first neural network model includes: a pre-training image, and a first view and a second view obtained by applying data enhancement processing to the pre-training image.
Optionally, the classification model includes the first neural network model, and the image to be classified is analyzed directly with the first neural network model to obtain the classification result. The multiple groups of unlabeled data are remote sensing images acquired in advance, and each pre-training image is a whole image. The data enhancement processing is augmentation, which may be performed offline or online; data enhancement enables limited data to generate more data, increasing the number and diversity (noise data) of training samples and improving the robustness of the model.
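To illustrate how augmentation "enables limited data to generate more data", the sketch below applies simple online transforms to one image array to yield many distinct training samples. The specific transforms (random flips plus light additive noise) are assumptions chosen for illustration, not the transforms prescribed by the patent.

```python
import numpy as np

rng = np.random.default_rng(7)

def augment(image: np.ndarray) -> np.ndarray:
    """One online augmentation pass: random flips plus light noise."""
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1]   # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]   # vertical flip
    # additive noise contributes the "noise data" diversity mentioned above
    return out + rng.normal(0.0, 0.01, out.shape)

image = rng.random((4, 4))
samples = [augment(image) for _ in range(8)]  # limited data -> more data
```

Because the transforms are sampled anew on every pass, this is the "online" mode; precomputing and storing `samples` once before training would correspond to the "offline" mode.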
Optionally, the multiple views are multiple crop views obtained by applying data enhancement processing to the whole image corresponding to each frame of image; each crop view corresponds to a different crop region of the whole image, and the types of cropped objects differ across the crop views. As shown in fig. 3, data enhancement processing is applied to each pre-training image to obtain two crop views, namely the first view and the second view. Since the differences (for example, feature differences) between different crop views are large and matching errors easily occur, the embodiment of the present invention adopts the whole image as the pre-training image.
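The first view/second view construction described above can be sketched as follows. The crop size and the uniformly random crop position are assumptions for illustration; the patent does not fix them at this point.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_crop(image: np.ndarray, size: int) -> np.ndarray:
    """Cut a size x size region at a random position: one 'crop view'."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]

# The whole image serves as the pre-training sample; two crop views of it
# serve as the first view and the second view.
whole = rng.random((8, 8))
first_view = random_crop(whole, size=4)
second_view = random_crop(whole, size=4)
```

Because the two views generally cover different crop regions, their contents (and hence features) can differ substantially, which is the matching difficulty the paragraph above refers to.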
As an alternative embodiment, the classification model includes a second neural network model, and analyzing the image to be classified with the classification model to obtain the classification result includes:
step S1043, analyzing the image to be classified by using the second neural network model to obtain the classification result, where the second neural network model is obtained by training the first neural network model with a target type training set, and each group of data in the multiple groups of unlabeled data used by the first neural network model includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, the classification model includes the second neural network model, which is obtained after training the first neural network model with a target type training set; the image to be classified is analyzed with the second neural network model to obtain the classification result.
Optionally, each set of data in the multiple sets of unlabeled data used by the first neural network model includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, the target type training set is downstream task data, and the downstream task may include, but is not limited to: a ground object classification task, a change detection task, a remote sensing image classification task, a ground object identification task, and the like. The multiple groups of unlabeled data are remote sensing images acquired in advance, each pre-training image is a whole image, and the data enhancement processing is augmentation, which may be performed offline or online. Data enhancement generates more data from limited data, increasing the number and diversity (including noise data) of the training samples and improving the robustness of the model.
Because pictures have spatial coherence and videos have spatio-temporal coherence, self-supervised learning tasks can be designed around these properties, for example by exploiting the spatial semantic consistency of objects in pictures and the temporal consistency of object motion in videos.
Optionally, the first view and the second view may be cropping views. As shown in fig. 3, each pre-training image is subjected to data enhancement processing to obtain two cropping views (crops), that is, a first view and a second view. Because the difference (for example, the feature difference) between different crops is large and matching errors easily occur, the embodiment of the present invention adopts the whole image as the pre-training image.
As an alternative embodiment, a graphical user interface is provided by an electronic device, where content displayed by the graphical user interface at least partially includes an image classification scene, and the method further includes:
step S110, displaying a plurality of task types in the graphical user interface;
step S112, responding to the touch operation acted on the graphical user interface, and determining a target task type from the plurality of task types;
step S114, displaying the first neural network model corresponding to the target task type in the graphical user interface;
and step S116, obtaining the target type training set, and performing model training on the first neural network model by using the target type training set to obtain the second neural network model corresponding to the target task type.
As an alternative embodiment, the content displayed in the graphical user interface at least partially comprises an image classification scene, and a plurality of task types are presented in the graphical user interface; the user can touch the graphical user interface to determine a target task type from the plurality of task types; and further displaying the first neural network model corresponding to the target task type in the graphical user interface.
For example, taking the surface feature classification task as an example, pre-training may first be performed on a large amount of unlabeled data, and the task data of the surface feature classification task forms a surface feature classification training set; the first neural network model loaded with the pre-trained parameters is then trained downstream with the surface feature classification training set to obtain the second neural network model.
It is easy to notice that the first neural network model is obtained in advance by machine learning training on multiple groups of unlabeled data, and the acquired target type training set is then used to train the first neural network model to obtain the second neural network model. This achieves the purpose of training, with the target type training set, a first neural network model obtained from unlabeled data, and thereby obtaining the second neural network model.
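The pre-train-then-fine-tune flow summarized above can be sketched with a toy logistic-regression "model": the pre-trained first model is reduced to a weight vector, and training on the target type training set simply continues from those weights. All names and the synthetic data below are illustrative assumptions, not the patent's implementation.

```python
import numpy as np

def finetune(pretrained_w: np.ndarray, X: np.ndarray, y: np.ndarray,
             lr: float = 0.1, steps: int = 200) -> np.ndarray:
    """Continue training the pre-trained weights on the target type training set."""
    w = pretrained_w.copy()                # second model starts from the first model's parameters
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))   # sigmoid prediction
        w -= lr * X.T @ (p - y) / len(y)   # logistic-regression gradient step
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 4))               # toy downstream-task features
y = (X[:, 0] > 0).astype(float)            # toy target-type labels
w0 = rng.normal(scale=0.01, size=4)        # stands in for the pre-trained first model
w1 = finetune(w0, X, y)                    # the "second neural network model"
acc = np.mean(((1.0 / (1.0 + np.exp(-X @ w1))) > 0.5) == (y > 0.5))
```

The key point is only the initialization: `finetune` would behave identically from random weights, but starting from pre-trained parameters is what lets a small labeled target set suffice.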
As an alternative embodiment, the method further includes:
step S120, performing data enhancement processing on the pre-training image to obtain the first view and the second view;
step S122, acquiring a third view corresponding to the pre-training image;
and step S124, pre-training an initial neural network model by adopting the first view, the second view and the third view to obtain the first neural network model.
In the above optional embodiment, the first neural network model is obtained by machine learning training on multiple groups of unlabeled data, and each group of data in the multiple groups of unlabeled data includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, the types or categories of the objects in the first view and the second view are not consistent. Each pre-training image is subjected to data enhancement processing (augmentation) to obtain a first view and a second view, where the pre-training image is a whole image, and a third view corresponding to the pre-training image is acquired, as shown in fig. 4. The initial neural network model is pre-trained with the first view, the second view and the third view to obtain the first neural network model, as shown in fig. 5. In the process of training the first neural network model, a cropped image does not contain the whole object, which introduces geometric distortion; moreover, even with cropping and distortion, a fixed input scale must still be met. In the first neural network model, spatial pyramid pooling (SPP) can therefore be applied to the convolutional features of the first view and the second view to produce a fixed-length feature vector, which is then passed into the fully connected layer.
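Spatial pyramid pooling as mentioned above, which turns a feature map of arbitrary size into a fixed-length vector before the fully connected layer, can be sketched roughly as follows. This is a numpy max-pooling sketch with an assumed 1/2/4 pyramid; real SPP layers operate on batched tensors inside the network.

```python
import numpy as np

def spatial_pyramid_pool(fmap: np.ndarray, levels=(1, 2, 4)) -> np.ndarray:
    """Max-pool a C x H x W feature map over 1x1, 2x2 and 4x4 grids and
    concatenate the results, giving a fixed-length vector regardless of H, W."""
    c, h, w = fmap.shape
    out = []
    for n in levels:
        hs = np.linspace(0, h, n + 1).astype(int)   # row boundaries of the n x n grid
        ws = np.linspace(0, w, n + 1).astype(int)   # column boundaries
        for i in range(n):
            for j in range(n):
                cell = fmap[:, hs[i]:hs[i + 1], ws[j]:ws[j + 1]]
                out.append(cell.max(axis=(1, 2)))   # one C-vector per grid cell
    return np.concatenate(out)                      # length = C * (1 + 4 + 16)

# Two feature maps of different spatial sizes yield vectors of the same length.
v_small = spatial_pyramid_pool(np.random.default_rng(0).normal(size=(8, 7, 7)))
v_large = spatial_pyramid_pool(np.random.default_rng(1).normal(size=(8, 13, 9)))
```

This is what allows differently sized crops to feed the same fully connected layer without resizing them to one scale.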
As an optional embodiment, pre-training the initial neural network model by using the first view, the second view, and the third view to obtain the first neural network model includes:
step S130, obtaining a codebook corresponding to the third view by using the initial neural network model;
step S132, comparing and learning the first view and the second view based on the codebook to obtain a comparison result;
step S134, adjusting the initial neural network model according to the comparison result to obtain the first neural network model.
It should be noted that a contrastive task usually applies various transformations to pictures, with the optimization goal that different transformations of the same picture lie as close as possible in the feature space while different pictures lie as far apart as possible. In theory those two objectives are all such a task optimizes, but the actual optimization result leans toward the second: although instance discrimination never uses the category labels of objects, similar objects still end up relatively close together in the optimized feature space. This shows that the data have structure and relevance, which instance discrimination skillfully exploits.
As an alternative embodiment, in remote sensing self-supervised learning, most business requirements are based on pixel-level comparison, that is, predicting which category each pixel belongs to. However, when comparing the remote sensing self-supervised image shown in fig. 6a with the natural image shown in fig. 6b, it must first be determined which pixels come from the same ground feature and which belong to different ground feature types before matching can be performed; the embodiment of the present invention therefore provides a codebook-based comparison scheme for the neural network model.
Because the feature difference between different cropping views (crops) is large and matching errors easily occur, the embodiment of the present invention uses the whole image as the pre-training image. The third view corresponding to the pre-training image is processed by the initial neural network model to serve as a codebook; the patches produced by the different crops are encoded with this codebook, and the different cropping views are then compared using the Sinkhorn algorithm for unbalanced optimal transport. That is, the first view and the second view are compared and learned based on the codebook to obtain a comparison result, and the initial neural network model is adjusted according to the comparison result to obtain the first neural network model.
As an optional embodiment, performing comparison learning on the first view and the second view based on the codebook to obtain the comparison result includes:
step S140, encoding the first view based on the codebook to obtain a first encoding result, and encoding the second view based on the codebook to obtain a second encoding result;
step S142, performing feature extraction processing on the first coding result based on the codebook to obtain a first feature vector, and performing feature extraction processing on the second coding result based on the codebook to obtain a second feature vector;
step S144, obtaining the comparison result by using the first feature vector and the second feature vector.
In an embodiment, the first view is encoded based on the codebook to obtain a first encoding result, and the second view is encoded based on the codebook to obtain a second encoding result. Feature extraction is then performed on the first encoding result to obtain a first feature vector f(x_i, c), and on the second encoding result to obtain a second feature vector f(y_j, c); that is, features are extracted to obtain a feature vector for each patch of an image, and the resulting feature map may be of size 7 × 7. The first feature vector and the second feature vector are then compared using the Sinkhorn algorithm for unbalanced optimal transport to obtain the comparison result.
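Steps S140 and S142 can be illustrated with a toy soft-assignment coding against the codebook. The softmax coding, the function names, and the 32-codeword size are assumptions for illustration, not the patent's exact encoder.

```python
import numpy as np

def encode_with_codebook(view: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Step S140 (sketch): express every patch of a view as soft codebook coefficients."""
    sims = view @ codebook.T                          # patch-to-codeword similarity
    e = np.exp(sims - sims.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)           # each row sums to 1

def patch_features(coding: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Step S142 (sketch): turn the coding result back into one feature vector
    per patch, e.g. the 49 vectors of a 7 x 7 feature map."""
    return coding @ codebook

rng = np.random.default_rng(0)
codebook = rng.normal(size=(32, 16))                  # 32 codewords of dimension 16
view = rng.normal(size=(49, 16))                      # 7 * 7 = 49 patch descriptors
f = patch_features(encode_with_codebook(view, codebook), codebook)
```

The per-patch vectors `f` play the role of f(x_i, c) and f(y_j, c) in the comparison of step S144.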
In the embodiment of the present invention, the similarity between the first feature vector and the second feature vector may be computed with the Sinkhorn algorithm. After the comparison result is obtained, it can be determined whether the first feature vector and the second feature vector originate from the same region: if they do, the two vectors are pulled closer together; if they do not, they are pushed apart.
In the foregoing operating environment, the present invention provides a method for acquiring a neural network model as shown in fig. 7, where fig. 7 is a flowchart of a method for acquiring a neural network model according to an embodiment of the present invention, and as shown in fig. 7, the method for acquiring a neural network model includes:
step S202, acquiring a target type training set;
step S204, training a first neural network model by adopting the target type training set to obtain a second neural network model, wherein the first neural network model is obtained by using a plurality of groups of unlabeled data through machine learning training, and each group of data in the plurality of groups of unlabeled data comprises: the method comprises the steps of pre-training images and a first view and a second view which are obtained by carrying out data enhancement processing on the pre-training images.
It should be noted that the embodiment of the present invention may be applied to, but not limited to, a ground feature classification actual application scenario, a change detection actual application scenario, a remote sensing image classification actual application scenario, and a ground feature identification actual application scenario.
As an optional embodiment, the method for acquiring a neural network model provided by the present application acquires a target type training set and trains the first neural network model with it to obtain a second neural network model. When applied to remote sensing image classification, the second neural network model can be applied to the meteorological field (e.g., cloud layer extraction, weather forecasting, weather early warning, etc.); the natural resource and ecological environment field (e.g., change detection, ecological red-line change detection, multi-classification change detection, ground feature classification, greenhouse extraction, road network extraction, building change detection (satellite, unmanned aerial vehicle), etc.); the water conservancy field (e.g., water area change detection, greenhouse extraction, water body extraction (optical, radar), forest patch extraction, cage culture extraction, sand pit extraction, riverside house extraction, barrage extraction, photovoltaic power plant extraction, etc.); the agriculture and forestry field (e.g., crop extraction (wheat, rice, potato, etc.), unmanned aerial vehicle crop identification (corn, flue-cured tobacco, myotonia, etc.), land parcel identification, growth monitoring (index calculation), agricultural assessment, pest monitoring, planting suggestion pushing, etc.); the secondary disaster field (e.g., disaster monitoring, travel disaster warning, etc.); the travel (life services, take-out, logistics) field (e.g., travel route planning, travel advice pushing, personnel mobilization, price adjustment, etc.); and the city planning field (e.g., road network extraction (satellite, drone), building extraction, building change detection (satellite, drone), fire protection, etc.).
Optionally, the target type training set is downstream task data, and the downstream task may include, but is not limited to: a ground object classification task, a change detection task, a remote sensing image classification task, a ground object identification task, and the like. The multiple groups of unlabeled data are remote sensing images acquired in advance, each pre-training image is a whole image, and the data enhancement processing is augmentation, which may be performed offline or online. Data enhancement generates more data from limited data, increasing the number and diversity (including noise data) of the training samples and improving the robustness of the model.
Because pictures have spatial coherence and videos have spatio-temporal coherence, self-supervised learning tasks can be designed around these properties, for example by exploiting the spatial semantic consistency of objects in pictures and the temporal consistency of object motion in videos.
Optionally, the first view and the second view may be cropping views. As shown in fig. 3, each pre-training image is subjected to data enhancement processing to obtain two cropping views (crops), that is, a first view and a second view. Because the difference (for example, the feature difference) between different crops is large and matching errors easily occur, the embodiment of the present invention adopts the whole image as the pre-training image.
For example, taking the surface feature classification task as an example, pre-training may first be performed on a large amount of unlabeled data, and the task data of the surface feature classification task forms a surface feature classification training set; the first neural network model loaded with the pre-trained parameters is then trained downstream with the surface feature classification training set to obtain the second neural network model.
In the embodiment of the invention, a target type training set is acquired, and the first neural network model is trained with it to obtain a second neural network model, where the first neural network model is obtained by machine learning training on multiple groups of unlabeled data and each group of data includes a pre-training image and a first view and a second view obtained by performing data enhancement processing on the pre-training image. It is easy to notice that the first neural network model is obtained in advance from unlabeled data, and the acquired target type training set is then used to train it into the second neural network model, achieving the purpose of adapting a model pre-trained on unlabeled data to the target task.
In an optional embodiment, the method further includes:
step S302, performing data enhancement processing on the pre-training image to obtain the first view and the second view;
step S304, acquiring a third view corresponding to the pre-training image;
step S306, pre-training an initial neural network model by using the first view, the second view, and the third view to obtain the first neural network model.
In the above optional embodiment, the first neural network model is obtained by machine learning training on multiple groups of unlabeled data, and each group of data in the multiple groups of unlabeled data includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, the types or categories of the objects in the first view and the second view are not consistent. Each pre-training image is subjected to data enhancement processing (augmentation) to obtain a first view and a second view, where the pre-training image is a whole image, and a third view corresponding to the pre-training image is acquired, as shown in fig. 4. The initial neural network model is pre-trained with the first view, the second view and the third view to obtain the first neural network model, as shown in fig. 5. In the process of training the first neural network model, a cropped image does not contain the whole object, which introduces geometric distortion; moreover, even with cropping and distortion, a fixed input scale must still be met. In the first neural network model, spatial pyramid pooling (SPP) can therefore be applied to the convolutional features of the first view and the second view to produce a fixed-length feature vector, which is then passed into the fully connected layer.
In an optional embodiment, pre-training the initial neural network model by using the first view, the second view, and the third view to obtain the first neural network model includes:
step S402, obtaining a codebook corresponding to the third view by using the initial neural network model;
step S404, comparing and learning the first view and the second view based on the codebook to obtain a comparison result;
step S406, adjusting the initial neural network model according to the comparison result to obtain the first neural network model.
It should be noted that a contrastive task usually applies various transformations to pictures, with the optimization goal that different transformations of the same picture lie as close as possible in the feature space while different pictures lie as far apart as possible. In theory those two objectives are all such a task optimizes, but the actual optimization result leans toward the second: although instance discrimination never uses the category labels of objects, similar objects still end up relatively close together in the optimized feature space. This shows that the data have structure and relevance, which instance discrimination skillfully exploits.
As an alternative embodiment, in remote sensing self-supervised learning, most business requirements are based on pixel-level comparison, that is, predicting which category each pixel belongs to. However, when comparing the remote sensing self-supervised image shown in fig. 6a with the natural image shown in fig. 6b, it must first be determined which pixels come from the same ground feature and which belong to different ground feature types before matching can be performed; the embodiment of the present invention therefore provides a codebook-based comparison scheme for the neural network model.
Because the feature difference between different cropping views (crops) is large and matching errors easily occur, the embodiment of the present invention uses the whole image as the pre-training image. The third view corresponding to the pre-training image is processed by the initial neural network model to serve as a codebook; the patches produced by the different crops are encoded with this codebook, and the different cropping views are then compared using the Sinkhorn algorithm for unbalanced optimal transport. That is, the first view and the second view are compared and learned based on the codebook to obtain a comparison result, and the initial neural network model is adjusted according to the comparison result to obtain the first neural network model.
In an optional embodiment, the comparing and learning the first view and the second view based on the codebook, and obtaining the comparison result includes:
step S502, coding the first view based on the codebook to obtain a first coding result, and coding the second view based on the codebook to obtain a second coding result;
step S504, feature extraction processing is carried out on the first coding result based on the codebook to obtain a first feature vector, and feature extraction processing is carried out on the second coding result based on the codebook to obtain a second feature vector;
step S506, obtaining the comparison result by using the first feature vector and the second feature vector.
In an embodiment, the first view is encoded based on the codebook to obtain a first encoding result, and the second view is encoded based on the codebook to obtain a second encoding result. Feature extraction is then performed on the first encoding result to obtain a first feature vector f(x_i, c), and on the second encoding result to obtain a second feature vector f(y_j, c); that is, features are extracted to obtain a feature vector for each patch of an image, and the resulting feature map may be of size 7 × 7. The first feature vector and the second feature vector are then compared using the Sinkhorn algorithm for unbalanced optimal transport to obtain the comparison result.
In the embodiment of the present invention, the similarity between the first feature vector and the second feature vector may be computed with the Sinkhorn algorithm. After the comparison result is obtained, it can be determined whether the first feature vector and the second feature vector originate from the same region: if they do, the two vectors are pulled closer together; if they do not, they are pushed apart.
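The Sinkhorn comparison can be sketched as follows. Note this is the standard balanced, entropy-regularised Sinkhorn iteration with uniform marginals; the unbalanced variant referred to above additionally relaxes the marginal constraints, which is omitted here for brevity, and all names and toy features are illustrative.

```python
import numpy as np

def sinkhorn(cost: np.ndarray, reg: float = 0.1, iters: int = 200) -> np.ndarray:
    """Entropy-regularised optimal transport (Sinkhorn): returns a coupling
    matrix whose large entries mark patch pairs that should be matched."""
    K = np.exp(-cost / reg)
    u = np.ones(cost.shape[0])
    v = np.ones(cost.shape[1])
    r = np.full(cost.shape[0], 1.0 / cost.shape[0])   # uniform row marginal
    c = np.full(cost.shape[1], 1.0 / cost.shape[1])   # uniform column marginal
    for _ in range(iters):
        u = r / (K @ v)
        v = c / (K.T @ u)
    return u[:, None] * K * v[None, :]

f1 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # first-view patch features
f2 = np.array([[0.0, 1.0], [0.0, 0.0], [1.0, 0.0]])  # same patches, shuffled order
cost = ((f1[:, None, :] - f2[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
plan = sinkhorn(cost)
match = plan.argmax(axis=1)   # each first-view patch's best second-view match
```

The transport plan recovers the shuffle, i.e. it pairs each patch with the region it originated from, which is the "same region" test used to pull vectors together or push them apart.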
In an alternative embodiment, obtaining the comparison result by using the first feature vector and the second feature vector includes:
step S602, calculating the similarity between the first feature vector and the second feature vector to obtain a weight;
step S604, computing the mean square error between the first feature vector and the second feature vector to obtain a feature distance;
step S606, determining the comparison result by using the weight and the feature distance.
Optionally, in the above embodiment, the similarity between the first feature vector and the second feature vector is calculated to obtain a weight w, w = OT(M(f(x, c), f(y, c))), and the mean square error between the first feature vector and the second feature vector gives the feature distance ||f(x_i, c) - f(y_j, c)||. The comparison result Loss is determined from the weight and the feature distance, where Loss = Σ_{i,j} w_{ij} · ||f(x_i, c) - f(y_j, c)||².
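Under the reading above (a transport-plan weight per patch pair, combined with the pairwise mean-squared feature distance), the comparison loss can be sketched as follows. The exact weighting in the patent's formula may differ, so treat this as an assumed form.

```python
import numpy as np

def codebook_contrast_loss(fx: np.ndarray, fy: np.ndarray, w: np.ndarray) -> float:
    """Weighted mean-squared feature distance:
    Loss = sum_ij w_ij * ||f(x_i, c) - f(y_j, c)||^2,
    with the weights w taken from the Sinkhorn transport plan."""
    d2 = ((fx[:, None, :] - fy[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    return float((w * d2).sum())

fx = np.array([[1.0, 0.0], [0.0, 1.0]])   # first-view patch features
fy = np.array([[1.0, 0.0], [0.0, 1.0]])   # matching second-view features
w = np.eye(2) / 2                         # toy transport plan matching patch i to patch i
loss = codebook_contrast_loss(fx, fy, w)
```

Pairs given high transport weight dominate the loss, so minimizing it pulls matched features together while unmatched pairs (zero weight) are left free.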
as an optional embodiment, a graphical user interface is provided by an electronic device, where content displayed on the graphical user interface at least partially includes a feature classification display scene, and the feature classification method further includes:
step S610, dynamically displaying a classification result in the graphical user interface, wherein the classification result is used for representing the surface feature type corresponding to the image to be classified;
step S612, receiving a modification instruction of the surface feature classification result, adjusting the surface feature type corresponding to the image to be classified, and dynamically displaying the modified surface feature type again in the graphical user interface; or receiving a confirmation instruction of the ground feature classification result, and saving the ground feature type corresponding to the image to be classified for subsequent use.
As an optional embodiment, in the graphical user interface, the user may modify the feature classification result currently displayed: the user touches a modification button or a change button in the graphical user interface, or double-clicks the feature classification result, to trigger a modification instruction, adjust the feature type, and dynamically display the modified feature type again in the graphical user interface.
As another optional embodiment, in the graphical user interface, the user may confirm the feature classification result currently and dynamically displayed in the graphical user interface, that is, the user may touch a confirmation button in the graphical user interface or click the feature classification result to trigger a confirmation instruction, so as to store the feature type corresponding to the image to be classified for subsequent use.
The present invention provides a method for classifying terrain as shown in fig. 8, fig. 8 is a flowchart of a method for classifying terrain according to an embodiment of the present invention, and as shown in fig. 8, the method for classifying terrain includes:
step S702, acquiring an image to be classified;
step S704, analyzing the image to be classified by using a ground feature classification model to obtain the ground feature type corresponding to the image to be classified, where the ground feature classification model is obtained by training a pre-trained neural network model with a ground feature classification training set, the pre-trained neural network model is obtained by machine learning training on multiple groups of unlabeled data, and each group of data in the multiple groups of unlabeled data includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
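A minimal sketch of the inference side of step S704, assuming the trained ground feature classification model reduces to a per-pixel scoring head; the class names, the identity weight matrix, and the toy features are purely illustrative assumptions.

```python
import numpy as np

def classify_pixels(pixel_feats: np.ndarray, W: np.ndarray, classes: list) -> np.ndarray:
    """Per-pixel ground feature classification: score each pixel's feature
    vector against every class and take the argmax."""
    scores = pixel_feats @ W              # shape (n_pixels, n_classes)
    return np.array(classes)[scores.argmax(axis=1)]

classes = ["water", "building", "vegetation"]   # hypothetical ground feature types
W = np.eye(3)                                   # stand-in for the trained model head
feats = np.array([[0.9, 0.1, 0.0],              # two toy pixel feature vectors
                  [0.0, 0.2, 0.8]])
labels = classify_pixels(feats, W, classes)
```

In the method above, `W` would come from the second (fine-tuned) model, and `pixel_feats` from its backbone applied to the image to be classified.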
It should be noted that the embodiment of the present invention may be applied to, but is not limited to, practical surface feature classification scenarios. The surface feature classification method provided in the embodiment of the present invention offers an algorithm that reduces the amount of data labeling: large-scale remote sensing self-supervised pre-training, that is, pre-training on a large amount of unlabeled data, reduces the requirement for labeled data and improves the learning ability of the self-supervised learning module, so that a user can obtain higher neural network model performance by labeling only a small amount of data.
As an optional embodiment, the image to be classified acquired in any one of the following fields is analyzed by using the feature classification model to obtain the corresponding feature type. The fields include, but are not limited to: the meteorological field (e.g., cloud layer extraction, weather forecasting, weather early warning, etc.); the natural resource and ecological environment field (e.g., change detection, ecological red-line change detection, multi-classification change detection, ground feature classification, greenhouse extraction, road network extraction, building change detection (satellite, unmanned aerial vehicle), etc.); the water conservancy field (e.g., water area change detection, greenhouse extraction, water body extraction (optical, radar), forest patch extraction, cage culture extraction, sand pit extraction, riverside house extraction, barrage extraction, photovoltaic power plant extraction, etc.); the agriculture and forestry field (e.g., crop extraction (wheat, rice, potato, etc.), unmanned aerial vehicle crop identification (corn, flue-cured tobacco, myotonia, etc.), land parcel identification, growth monitoring (index calculation), agricultural assessment, pest monitoring, planting suggestion pushing, etc.); the secondary disaster field (e.g., disaster monitoring, travel disaster warning, etc.); the travel (life services, take-out, logistics) field (e.g., travel route planning, travel advice pushing, personnel mobilization, price adjustment, etc.); and the city planning field (e.g., road network extraction (satellite, drone), building extraction, building change detection (satellite, drone), fire protection, etc.).
Optionally, the multiple groups of unlabeled data are ground feature classification remote sensing images acquired in advance, each pre-training image is a whole image, and the data enhancement processing is augmentation, which may be performed offline or online. Data enhancement generates more data from limited data, increasing the number and diversity (including noise data) of the training samples and improving the robustness of the model.
Because pictures have spatial coherence and videos have spatio-temporal coherence, a self-supervised learning task can be designed around these characteristics, for example by exploiting the spatial semantic consistency of objects in pictures and the temporal consistency of object motion in videos.
Optionally, the first view and the second view may each be a cropping view (crop). As shown in fig. 3, each pre-training image is subjected to data enhancement processing to obtain two cropping views, namely the first view and the second view. Because the difference (e.g., the feature difference) between different cropping views can be large and prone to matching errors, the embodiment of the present invention uses the whole image as the pre-training image.
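The two-view augmentation described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the image is a plain 2-D list standing in for a remote-sensing tile, the function name `two_crop_views` is invented, and a real pipeline would use tensors and richer augmentations (flips, color jitter, etc.).

```python
import random

def two_crop_views(image, crop_size, rng=None):
    """Cut two random views (crops) from one whole pre-training image.

    `image` is a list of rows of pixel values; `crop_size` is the side
    length of each square view. Returns (first_view, second_view).
    """
    rng = rng or random.Random(0)
    h, w = len(image), len(image[0])

    def one_crop():
        top = rng.randint(0, h - crop_size)
        left = rng.randint(0, w - crop_size)
        return [row[left:left + crop_size] for row in image[top:top + crop_size]]

    return one_crop(), one_crop()  # first view, second view
```

Because both views come from the same whole image, they overlap the same scene, which is what lets a self-supervised consistency objective compare them.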
For example, for the ground feature classification task, pre-training can be performed on a large amount of unlabeled data, and a ground feature classification training set can be obtained from the task data of the ground feature classification task; downstream training is then performed, with the ground feature classification training set, on the first neural network model loaded with the pre-training parameters to obtain a ground feature classification model.
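The pre-train-then-fine-tune workflow above can be sketched structurally. This is only a skeleton under stated assumptions: the `Model`, `pretrain`, and `finetune` names are invented, and the "training" bodies are stand-ins, since the patent does not fix a loss or architecture.

```python
from dataclasses import dataclass, field

@dataclass
class Model:
    params: dict = field(default_factory=dict)

def pretrain(unlabeled_groups):
    """Self-supervised pre-training stand-in. Each group is
    (pretraining_image, first_view, second_view); a real implementation
    would optimize a consistency loss between the two views. Here we
    only record that the parameters came from pre-training."""
    model = Model()
    model.params["pretrained_on"] = len(unlabeled_groups)
    return model

def finetune(pretrained, labeled_training_set):
    """Downstream training: load the pre-training parameters into a new
    model, then fit it on the (small) labeled training set."""
    downstream = Model(params=dict(pretrained.params))  # load pre-trained params
    downstream.params["classes"] = sorted({label for _, label in labeled_training_set})
    return downstream
```

The key design point mirrored here is that the downstream model starts from the pre-trained parameters, so only a small labeled set is needed.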
In the embodiment of the invention, an image to be classified is obtained; a ground feature classification model is used to analyze the image to be classified and obtain the ground feature type corresponding to it, wherein the ground feature classification model is obtained by training a pre-trained neural network model with a ground feature classification training set, the pre-trained neural network model is obtained through machine learning training using multiple groups of unlabeled data, and each group of data in the multiple groups of unlabeled data includes: a pre-training image and a first view and a second view obtained by performing data enhancement processing on the pre-training image. The purpose of determining the ground feature type corresponding to the image to be classified with the ground feature classification model is thereby achieved, realizing the technical effect of higher self-supervised learning performance with a reduced data labeling amount, and solving the technical problem in the prior art that, without a ground feature classification model that more accurately fits the target task, classification accuracy is low.
The present invention provides a ground feature change detection method as shown in fig. 9. Fig. 9 is a flowchart of the ground feature change detection method according to an embodiment of the present invention; as shown in fig. 9, the ground feature change detection method includes:
step S802, obtaining an image to be detected;
step S804, analyzing the image to be detected by using a change detection model, and determining whether the image to be detected changes, wherein the change detection model is obtained by training a pre-trained neural network model by using a change detection training set, the pre-trained neural network model is obtained by using a plurality of groups of label-free data through machine learning training, and each group of data in the plurality of groups of label-free data comprises: the method comprises the steps of pre-training images and a first view and a second view which are obtained by carrying out data enhancement processing on the pre-training images.
It should be noted that the embodiment of the present invention may be applied to, but is not limited to, practical ground feature change detection scenarios. The ground feature change detection method provided in the embodiment of the present invention provides an algorithm that reduces the data labeling amount: large-scale remote sensing self-supervised pre-training is adopted, that is, pre-training is performed on a large amount of unlabeled data, which reduces the required amount of labeled data and improves the self-supervised learning capability of the self-supervised learning module; a user can obtain higher neural network model performance by labeling only a small amount of data.
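For intuition about what the change detection step computes, a deliberately naive stand-in is sketched below. It compares two co-registered tiles pixel by pixel; the patent's model is a trained neural network, not this difference rule, and the function names and threshold are illustrative assumptions.

```python
def change_score(tile_a, tile_b):
    """Mean absolute per-pixel difference between two co-registered
    tiles (same scene, two acquisition dates). A naive stand-in for the
    learned change detection model described in the text."""
    n = len(tile_a) * len(tile_a[0])
    total = sum(abs(a - b)
                for row_a, row_b in zip(tile_a, tile_b)
                for a, b in zip(row_a, row_b))
    return total / n

def has_changed(tile_a, tile_b, threshold=10.0):
    # threshold is an illustrative assumption, not from the patent
    return change_score(tile_a, tile_b) > threshold
```

A learned model replaces this heuristic with features that are robust to lighting, season, and registration noise, which is why the pre-training above matters.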
As an optional embodiment, in the ground feature change detection method provided by the present application, a change detection model is used to analyze an image to be detected acquired in any one of the following fields, so as to determine whether the image to be detected changes. Any of the above fields includes, but is not limited to: the meteorological field (e.g., cloud extraction, weather forecasting, weather early warning, etc.); the natural resource and ecological environment field (e.g., weather forecast, change detection, ecological red-line change detection, multi-class change detection, ground feature classification, greenhouse extraction, road network extraction, building change detection (satellite, unmanned aerial vehicle), etc.); the water conservancy field (e.g., water area change detection, greenhouse extraction, water body extraction (optical, radar), patch forest extraction, cage culture extraction, sand pit extraction, riverside house extraction, barrage extraction, photovoltaic power plant extraction, etc.); the agriculture and forestry field (e.g., crop extraction (wheat, rice, potato, etc.), unmanned aerial vehicle crop identification (corn, flue-cured tobacco, myotonia, etc.), land parcel identification, growth monitoring (index calculation), agricultural assessment, pest monitoring, planting suggestion pushing, etc.); the secondary disaster field (e.g., disaster monitoring, travel disaster warning, etc.); the life service (take-out, logistics) field (e.g., travel route planning, travel advice pushing, personnel mobilization, price adjustment, etc.); and the urban planning field (e.g., road network extraction (satellite, unmanned aerial vehicle), building extraction, building change detection (satellite, unmanned aerial vehicle), fire protection, etc.).
Optionally, the multiple groups of unlabeled data are remote sensing images to be detected for ground feature changes acquired in advance, each pre-training image is a whole image, and the data enhancement processing is augmentation. Data enhancement can be performed offline or online; it allows limited data to generate more data, increasing the number and diversity (e.g., noise data) of the training samples and improving the robustness of the model.
Because pictures have spatial coherence and videos have spatio-temporal coherence, a self-supervised learning task can be designed around these characteristics, for example by exploiting the spatial semantic consistency of objects in pictures and the temporal consistency of object motion in videos.
Optionally, the first view and the second view may each be a cropping view (crop). As shown in fig. 3, each pre-training image is subjected to data enhancement processing to obtain two cropping views, namely the first view and the second view. Because the difference (e.g., the feature difference) between different cropping views can be large and prone to matching errors, the embodiment of the present invention uses the whole image as the pre-training image.
For example, for the ground feature change detection task, pre-training can be performed on a large amount of unlabeled data, and a ground feature change detection training set can be obtained from the task data of the ground feature change detection task; downstream training is then performed, with the ground feature change detection training set, on the first neural network model loaded with the pre-training parameters to obtain a change detection model.
In the embodiment of the invention, an image to be detected is obtained; adopt to change detection model to wait to detect the image and analyze the aforesaid, confirm that the aforesaid is waited to detect whether the image changes, wherein, above-mentioned change detection model adopts and changes the detection training set and trains the neural network model of pretraining and obtain, above-mentioned pretraining neural network model uses multiunit unlabelled data to obtain through machine learning training, every group data in the above-mentioned multiunit unlabelled data includes: the method comprises the steps of pre-training images and a first view and a second view which are obtained by carrying out data enhancement processing on the pre-training images. The purpose of determining whether the image to be detected changes by adopting the change detection model is achieved, so that the technical effect of realizing higher performance of self-supervision learning by reducing the data label amount is achieved, and the technical problem that the classification accuracy of the ground feature classification model is lower because the ground feature classification model for constructing a target task more accurately is not realized in the prior art is solved.
The invention provides a remote sensing image classification method as shown in fig. 10, fig. 10 is a flow chart of the remote sensing image classification method according to the embodiment of the invention, and as shown in fig. 10, the remote sensing image classification method includes:
step S902, obtaining a remote sensing image to be classified;
step S904, analyzing the remote sensing image to be classified by using a classification model, and determining a classification corresponding to the remote sensing image to be classified, wherein the classification model is obtained by training a pre-trained neural network model by using a remote sensing image classification training set, the pre-trained neural network model is obtained by using multiple sets of label-free data through machine learning training, and each set of data in the multiple sets of label-free data includes: the method comprises the steps of pre-training images and a first view and a second view which are obtained by carrying out data enhancement processing on the pre-training images.
It should be noted that the embodiment of the present invention may be applied to, but is not limited to, practical remote sensing image classification scenarios. The remote sensing image classification method provided in the embodiment of the present invention provides an algorithm that reduces the data labeling amount: large-scale remote sensing self-supervised pre-training is adopted, that is, pre-training is performed on a large amount of unlabeled data, which reduces the required amount of labeled data and improves the self-supervised learning capability of the self-supervised learning module; a user can obtain higher neural network model performance by labeling only a small amount of data.
As an optional embodiment, in the remote sensing image classification method provided by the present application, a classification model is used to analyze a remote sensing image to be classified acquired in any one of the following fields, and the classification corresponding to the remote sensing image to be classified is determined, where any of the fields includes but is not limited to: the meteorological field (e.g., cloud extraction, weather forecasting, weather early warning, etc.); the natural resource and ecological environment field (e.g., weather forecast, change detection, ecological red-line change detection, multi-class change detection, ground feature classification, greenhouse extraction, road network extraction, building change detection (satellite, unmanned aerial vehicle), etc.); the water conservancy field (e.g., water area change detection, greenhouse extraction, water body extraction (optical, radar), patch forest extraction, cage culture extraction, sand pit extraction, riverside house extraction, barrage extraction, photovoltaic power plant extraction, etc.); the agriculture and forestry field (e.g., crop extraction (wheat, rice, potato, etc.), unmanned aerial vehicle crop identification (corn, flue-cured tobacco, myotonia, etc.), land parcel identification, growth monitoring (index calculation), agricultural assessment, pest monitoring, planting suggestion pushing, etc.); the secondary disaster field (e.g., disaster monitoring, travel disaster warning, etc.); the life service (take-out, logistics) field (e.g., travel route planning, travel advice pushing, personnel mobilization, price adjustment, etc.); and the urban planning field (e.g., road network extraction (satellite, unmanned aerial vehicle), building extraction, building change detection (satellite, unmanned aerial vehicle), fire protection, etc.).
Optionally, the multiple groups of unlabeled data are remote sensing images to be detected for ground feature changes acquired in advance, each pre-training image is a whole image, and the data enhancement processing is augmentation. Data enhancement can be performed offline or online; it allows limited data to generate more data, increasing the number and diversity (e.g., noise data) of the training samples and improving the robustness of the model.
Because pictures have spatial coherence and videos have spatio-temporal coherence, a self-supervised learning task can be designed around these characteristics, for example by exploiting the spatial semantic consistency of objects in pictures and the temporal consistency of object motion in videos.
Optionally, the first view and the second view may each be a cropping view (crop). As shown in fig. 3, each pre-training image is subjected to data enhancement processing to obtain two cropping views, namely the first view and the second view. Because the difference (e.g., the feature difference) between different cropping views can be large and prone to matching errors, the embodiment of the present invention uses the whole image as the pre-training image.
For example, for the remote sensing image classification task, pre-training can be performed on a large amount of unlabeled data, and a remote sensing image classification training set can be obtained from the task data of the remote sensing image classification task; downstream training is then performed, with the remote sensing image classification training set, on the first neural network model loaded with the pre-training parameters to obtain a classification model.
In the embodiment of the invention, a remote sensing image to be classified is obtained; a classification model is used to analyze the remote sensing image to be classified and determine the classification corresponding to it, wherein the classification model is obtained by training a pre-trained neural network model with a remote sensing image classification training set, the pre-trained neural network model is obtained through machine learning training using multiple groups of unlabeled data, and each group of data in the multiple groups of unlabeled data includes: a pre-training image and a first view and a second view obtained by performing data enhancement processing on the pre-training image. The purpose of determining the classification corresponding to the remote sensing image to be classified with the classification model is thereby achieved, realizing the technical effect of higher self-supervised learning performance with a reduced data labeling amount, and solving the technical problem in the prior art that, without a ground feature classification model that more accurately fits the target task, classification accuracy is low.
The present invention provides a method for recognizing a feature as shown in fig. 11, where fig. 11 is a flowchart of the method for recognizing a feature according to the embodiment of the present invention, and as shown in fig. 11, the method for recognizing a feature includes:
step S1002, acquiring an image to be identified;
step S1004, analyzing the image to be recognized by using a feature recognition model to determine whether a target feature exists in the image to be recognized, wherein the feature recognition model is obtained by training a pre-trained neural network model by using a feature recognition training set, the pre-trained neural network model is obtained by using multiple sets of unlabeled data through machine learning training, and each set of data in the multiple sets of unlabeled data includes: the method comprises the steps of pre-training images and a first view and a second view which are obtained by carrying out data enhancement processing on the pre-training images.
It should be noted that the embodiment of the present invention may be applied to, but is not limited to, practical ground feature recognition scenarios. The ground feature recognition method provided in the embodiment of the present invention provides an algorithm that reduces the data labeling amount: large-scale remote sensing self-supervised pre-training is adopted, that is, pre-training is performed on a large amount of unlabeled data, which reduces the required amount of labeled data and improves the self-supervised learning capability of the self-supervised learning module; a user can obtain higher neural network model performance by labeling only a small amount of data.
As an optional embodiment, in the ground feature recognition method provided by the present application, a ground feature recognition model is used to analyze an image to be recognized acquired in any one of the following fields, so as to determine whether a target ground feature exists in the image to be recognized, where any of the fields includes but is not limited to: the meteorological field (e.g., cloud extraction, weather forecasting, weather early warning, etc.); the natural resource and ecological environment field (e.g., weather forecast, change detection, ecological red-line change detection, multi-class change detection, ground feature classification, greenhouse extraction, road network extraction, building change detection (satellite, unmanned aerial vehicle), etc.); the water conservancy field (e.g., water area change detection, greenhouse extraction, water body extraction (optical, radar), patch forest extraction, cage culture extraction, sand pit extraction, riverside house extraction, barrage extraction, photovoltaic power plant extraction, etc.); the agriculture and forestry field (e.g., crop extraction (wheat, rice, potato, etc.), unmanned aerial vehicle crop identification (corn, flue-cured tobacco, myotonia, etc.), land parcel identification, growth monitoring (index calculation), agricultural assessment, pest monitoring, planting suggestion pushing, etc.); the secondary disaster field (e.g., disaster monitoring, travel disaster warning, etc.); the life service (take-out, logistics) field (e.g., travel route planning, travel advice pushing, personnel mobilization, price adjustment, etc.); and the urban planning field (e.g., road network extraction (satellite, unmanned aerial vehicle), building extraction, building change detection (satellite, unmanned aerial vehicle), fire protection, etc.).
Optionally, the multiple sets of unlabeled data are remote sensing images for ground feature recognition acquired in advance, each pre-training image is a whole image, and the data enhancement processing is augmentation. Data enhancement can be performed offline or online; it allows limited data to generate more data, increasing the number and diversity (e.g., noise data) of the training samples and improving the robustness of the model.
Because pictures have spatial coherence and videos have spatio-temporal coherence, a self-supervised learning task can be designed around these characteristics, for example by exploiting the spatial semantic consistency of objects in pictures and the temporal consistency of object motion in videos.
Optionally, the first view and the second view may each be a cropping view (crop). As shown in fig. 3, each pre-training image is subjected to data enhancement processing to obtain two cropping views, namely the first view and the second view. Because the difference (e.g., the feature difference) between different cropping views can be large and prone to matching errors, the embodiment of the present invention uses the whole image as the pre-training image.
For example, for the ground feature recognition task, pre-training can be performed on a large amount of unlabeled data, and a ground feature recognition training set can be obtained from the task data of the ground feature recognition task; downstream training is then performed, with the ground feature recognition training set, on the first neural network model loaded with the pre-training parameters to obtain a ground feature recognition model.
In the embodiment of the invention, an image to be recognized is obtained; a ground feature recognition model is used to analyze the image to be recognized and determine whether a target ground feature exists in it, wherein the ground feature recognition model is obtained by training a pre-trained neural network model with a ground feature recognition training set, the pre-trained neural network model is obtained through machine learning training using multiple groups of unlabeled data, and each group of data in the multiple groups of unlabeled data includes: a pre-training image and a first view and a second view obtained by performing data enhancement processing on the pre-training image. The purpose of determining whether the target ground feature exists in the image to be recognized with the ground feature recognition model is thereby achieved, realizing the technical effect of higher self-supervised learning performance with a reduced data labeling amount, and solving the technical problem in the prior art that, without a ground feature classification model that more accurately fits the target task, classification accuracy is low.
The present application provides yet another method of classifying terrain as shown in fig. 12. Fig. 12 is a schematic diagram of classifying features at a cloud server according to an embodiment of the present invention, and as shown in fig. 12, a client uploads an image to be classified to the cloud server, and the cloud server analyzes the image to be classified by using a feature classification model to obtain a feature type corresponding to the image to be classified, where the feature classification model is obtained by training a pre-trained neural network model by using a feature classification training set, the pre-trained neural network model is obtained by using multiple sets of non-labeled data through machine learning training, and each set of the multiple sets of non-labeled data includes: the method comprises the steps of pre-training images and a first view and a second view which are obtained by carrying out data enhancement processing on the pre-training images.
And then, the cloud server feeds back the classification result to the client, and the final classification result is displayed to the user through a graphical user interface of the client. The optional way of displaying the classification result on the graphical user interface has been described in the above embodiments, and is not described herein again.
It should be noted that the ground feature classification method provided in the embodiment of the present application may be applied to, but is not limited to, practical scenarios of ground feature classification, change detection, remote sensing image classification, and ground feature recognition. Through interaction between an SaaS server and a client, the image to be classified is analyzed with the ground feature classification model to obtain the ground feature type corresponding to the image to be classified, and the returned classification result is displayed on the client.
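The client/cloud-server round trip described above can be sketched as a simulated exchange. Everything here is an assumption for illustration: the handler names, the JSON payload shape, and the placeholder classification rule are invented; the real service would run the trained ground feature classification model server-side and communicate over HTTP.

```python
import json

def server_classify(payload):
    """Cloud-side handler stand-in. A real service would run the ground
    feature classification model here; the band-mean rule is only a
    placeholder so the round trip is executable."""
    return {"feature_type": "water" if payload["band_mean"] < 50 else "building"}

def client_classify(image_stats):
    """Client side: serialize the request, 'upload' it (here, a direct
    call), and return the feature type for display in the GUI."""
    body = json.dumps(image_stats)            # upload step stand-in
    result = server_classify(json.loads(body))
    return result["feature_type"]
```

The serialize/deserialize step is kept in the sketch because the client and server in the text exchange images and results over a network boundary.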
The present invention provides a natural resource feature classification method as shown in fig. 13, fig. 13 is a flowchart of a natural resource feature classification method according to an embodiment of the present invention, and as shown in fig. 13, the natural resource feature classification method includes:
step S1102, acquiring a natural resource image to be classified;
step S1104, analyzing the natural resource image to be classified by using a ground feature classification model, and obtaining the natural resource ground feature type corresponding to the natural resource image to be classified. The ground feature classification model is obtained by training with multiple frames of natural resource images and multiple views obtained by performing data enhancement processing on each frame of natural resource image, the multiple views being multiple cutout views obtained by performing data enhancement processing on the whole image corresponding to each frame of natural resource image, where each cutout view corresponds to a different cutout region in the whole image and the types of the cutout objects corresponding to the cutout views are different;
step S1106, displaying the natural resource ground feature type.
In the embodiment of the invention, the natural resource image to be classified is obtained; analyzing the natural resource image to be classified by adopting a surface feature classification model to obtain a classification result, wherein the surface feature classification model is obtained by training a plurality of views obtained by performing data enhancement processing on each frame of image in a plurality of frames of images by using a plurality of frames of images; and displaying the classification result.
It is easy to note that the multiple views are multiple cutout views obtained by performing data enhancement processing on the whole image corresponding to each frame of natural resource image, where each cutout view corresponds to a different cutout region in the whole image and the types of the cutout objects corresponding to the cutout views are different. A ground feature classification model is trained in advance using multiple frames of images and the multiple views obtained by performing data enhancement processing on each frame of image, and the ground feature classification model is then used to analyze the acquired image to obtain a classification result. This achieves the purpose of classifying the image with a ground feature classification model that more accurately fits the target task: after one image is divided into one or more different views, the local regions are enhanced, which achieves the technical effect of improving the classification accuracy of the ground feature classification model and solves the technical problem in the prior art that, without a ground feature classification model that more accurately fits the target task, classification accuracy is low.
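The per-region cutout scheme above (each view covering a different cutout region with a different object type) can be sketched as follows. The function name, the region-box format, and the example coordinates are illustrative assumptions; real region proposals would come from the data pipeline, not hand-written boxes.

```python
def region_cutouts(image, regions):
    """Cut one view per region from a whole image.

    `regions` maps a cutout-object type (e.g., "road", "building") to a
    (top, left, height, width) box. Each returned view covers a
    different cutout region, mirroring the text's 'each cutout view
    corresponds to a different cutout region in the whole image'.
    """
    views = {}
    for object_type, (top, left, height, width) in regions.items():
        views[object_type] = [row[left:left + width]
                              for row in image[top:top + height]]
    return views
```

Because each view isolates a different object type, enhancement applied per view acts on a local region rather than on the whole image at once.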
It should be noted that the embodiment of the present invention may be applied to, but is not limited to, practical ground feature classification scenarios, in which the ground feature classification model is used to analyze the natural resource image to be classified and obtain the natural resource ground feature type corresponding to it, for example, natural ground feature types such as cultivated land, urban areas, water areas, sea areas, mineral deposits, forest vegetation, deserts, and tropical rain forests.
For example, the method can also be applied to the following technical fields: the meteorological field (e.g., cloud extraction, weather forecasting, weather early warning, etc.); the natural resource and ecological environment field (e.g., weather forecast, change detection, ecological red-line change detection, multi-class change detection, ground feature classification, greenhouse extraction, road network extraction, building change detection (satellite, unmanned aerial vehicle), etc.); the water conservancy field (e.g., water area change detection, greenhouse extraction, water body extraction (optical, radar), patch forest extraction, cage culture extraction, sand pit extraction, riverside house extraction, barrage extraction, photovoltaic power plant extraction, etc.); the agriculture and forestry field (e.g., crop extraction (wheat, rice, potato, etc.), unmanned aerial vehicle crop identification (corn, flue-cured tobacco, myotonia, etc.), land parcel identification, growth monitoring (index calculation), agricultural assessment, pest monitoring, planting suggestion pushing, etc.); the secondary disaster field (e.g., disaster monitoring, travel disaster warning, etc.); the life service (take-out, logistics) field (e.g., travel route planning, travel advice pushing, personnel mobilization, price adjustment, etc.); and the urban planning field (e.g., road network extraction (satellite, unmanned aerial vehicle), building extraction, building change detection (satellite, unmanned aerial vehicle), fire protection, etc.).
Because pictures have spatial coherence and videos have spatio-temporal coherence, a self-supervised learning task can be designed around these characteristics, for example by exploiting the spatial semantic consistency of objects in pictures and the temporal consistency of object motion in videos.
In the embodiment of the application, a ground feature classification model is obtained in advance by training with multiple frames of images and multiple views obtained by performing data enhancement processing on each frame of image; the acquired natural resource image to be classified is then analyzed with the ground feature classification model to obtain the natural resource ground feature type. By classifying the image with a ground feature classification model that more accurately fits the target task, the natural resource ground feature type to which the natural resource to be classified belongs is accurately determined.
As an optional embodiment, a graphical user interface is provided by an electronic device, where content displayed on the graphical user interface at least partially includes a feature classification display scene, and the feature classification method further includes:
step S1202, dynamically displaying the natural resource feature type in the graphical user interface;
step S1204, receiving a modification instruction for the natural resource ground feature type, adjusting the natural resource ground feature type corresponding to the natural resource image to be classified, and dynamically displaying the modified natural resource ground feature type again in the graphical user interface; or receiving a confirmation instruction for the natural resource ground feature type, and storing the natural resource ground feature type corresponding to the natural resource image to be classified for subsequent use.
As an optional embodiment, in the graphical user interface, a user may modify a natural resource feature type dynamically displayed currently in the graphical user interface, that is, the user touches a modification button or a change button in the graphical user interface, or double-clicks the natural resource feature type, triggers a modification instruction, so as to further adjust the natural resource feature type, and dynamically displays the modified natural resource feature type again in the graphical user interface.
As another optional embodiment, the user may confirm the natural resource surface feature type currently displayed in the graphical user interface: the user touches a confirmation button in the graphical user interface, or clicks the natural resource surface feature type, to trigger a confirmation instruction, whereupon the natural resource surface feature type corresponding to the natural resource image to be classified is stored for subsequent use.
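The modify-or-confirm flow of steps S1202 to S1204 can be sketched as a small dispatcher. The instruction format, the `state` dictionary, and the example type names ("woodland", "cultivated land") are illustrative assumptions, not details fixed by the embodiment:

```python
def handle_instruction(instruction, state):
    """Dispatch a modification or confirmation instruction for the natural
    resource surface feature type currently displayed in the interface."""
    kind = instruction["kind"]
    if kind == "modify":
        # Adjust the type and dynamically display the modified type again.
        state["displayed"] = instruction["new_type"]
    elif kind == "confirm":
        # Store the confirmed type for subsequent use.
        state["saved"].append(state["displayed"])
    else:
        raise ValueError(f"unknown instruction kind: {kind}")
    return state

state = {"displayed": "woodland", "saved": []}
state = handle_instruction({"kind": "modify", "new_type": "cultivated land"}, state)
state = handle_instruction({"kind": "confirm"}, state)
print(state)  # {'displayed': 'cultivated land', 'saved': ['cultivated land']}
```

In a real interface the "modify" branch would be bound to the modification or change button (or a double-click) and the "confirm" branch to the confirmation button described above.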
The present invention further provides a method for detecting a change in a building. Fig. 14 is a flowchart of a method for detecting a change in a building according to an embodiment of the present invention; as shown in fig. 14, the method includes:
step S1302, acquiring a building image to be detected;
step S1304, analyzing the building image to be detected by using a change detection model, and determining a detection result indicating whether the building image to be detected has changed, where the change detection model is obtained by training with a plurality of frames of building images and a plurality of views obtained by performing data enhancement processing on each frame of building image; the plurality of views are a plurality of cutout views obtained by performing data enhancement processing on the entire image corresponding to each frame of building image, each cutout view corresponds to a different cutout area in the entire image, and the types of the cutout objects corresponding to the respective cutout views are different;
step S1306, displaying the detection result.
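The cutout views referred to in step S1304 can be illustrated with a minimal sketch, where each view is cut from a different area of the whole image. The image size, region coordinates, and NumPy representation are illustrative assumptions:

```python
import numpy as np

def make_cutout_views(image, regions):
    """Cut one view per region from the whole image.

    Each region is (top, left, height, width), so each view corresponds
    to a different cutout area of the whole image.
    """
    return [image[top:top + h, left:left + w].copy()
            for top, left, h, w in regions]

# A 64x64 single-channel "whole image" and three different cutout areas.
whole = np.arange(64 * 64, dtype=np.float32).reshape(64, 64)
regions = [(0, 0, 32, 32), (0, 32, 32, 32), (32, 0, 32, 64)]
views = make_cutout_views(whole, regions)
print([v.shape for v in views])  # [(32, 32), (32, 32), (32, 64)]
```

In the embodiment, each cutout view additionally corresponds to a different type of cutout object; the fixed region list above merely stands in for that selection step.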
It is easy to note that the plurality of views are a plurality of cutout views obtained by performing data enhancement processing on the entire image corresponding to each frame of building image, that each cutout view corresponds to a different cutout region in the entire image, and that the types of the cutout objects corresponding to the respective cutout views are different. A change detection model is trained in advance using the plurality of views obtained by performing data enhancement processing on each frame of building image, and the building image to be detected is analyzed with the change detection model to determine whether it has changed, thereby achieving the purpose of constructing a more accurate change detection model for the target task.
It should be noted that this embodiment of the present invention may be applied to, but is not limited to, practical surface feature classification scenarios. For example, in a building surface feature change detection scenario, the model is configured to detect whether the building surface features have changed and to output a detection result indicating whether the building image to be detected has changed. For instance, a change from a building to wasteland indicates that the building in the image has been demolished, while a change from wasteland to a building indicates that the building in the image is newly constructed.
Because pictures have spatial coherence and videos have spatio-temporal coherence, self-supervised learning tasks can be designed by exploiting these properties, for example the semantic consistency of objects across spatial locations in pictures and the temporal consistency of object motion in videos.
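A common concrete form of such a coherence-based self-supervised task is contrastive learning: the features of two enhanced views of the same image are pulled together, while features of other images are pushed apart. The toy linear encoder, temperature value, and noise-based "enhancement" below are illustrative assumptions rather than the embodiment's actual network:

```python
import numpy as np

def encode(view, weights):
    """Toy linear encoder followed by L2 normalization."""
    feature = view.reshape(-1) @ weights
    return feature / (np.linalg.norm(feature) + 1e-8)

def contrastive_scores(anchor, candidates, temperature=0.1):
    """InfoNCE-style similarity scores of the anchor against candidate features."""
    return np.array([anchor @ c / temperature for c in candidates])

rng = np.random.default_rng(0)
weights = rng.normal(size=(16, 8))
image = rng.normal(size=(4, 4))
# Two enhanced views of the same image (small noise stands in for augmentation).
view_a = image + 0.01 * rng.normal(size=image.shape)
view_b = image + 0.01 * rng.normal(size=image.shape)
other = rng.normal(size=(4, 4))  # a different image acts as the negative sample

z_a = encode(view_a, weights)
z_b = encode(view_b, weights)
z_n = encode(other, weights)
scores = contrastive_scores(z_a, [z_b, z_n])
print(scores.argmax())  # 0: the other view of the same image scores highest
```

Training would then minimize a cross-entropy loss over these scores so that spatially coherent views of one image agree in feature space.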
In this embodiment of the application, a change detection model is trained in advance using a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image. The change detection model is then used to detect whether the building surface features have changed, yielding a detection result indicating whether the building image to be detected has changed. Because the change detection model is constructed to fit the target task more accurately, whether the building surface features have changed can be determined accurately.
Still as shown in fig. 12, a client uploads an image to be classified to a cloud server, and the cloud server analyzes the image to be classified by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; the views are a plurality of cutout views obtained by performing data enhancement processing on the entire image corresponding to each frame of image, each cutout view corresponds to a different cutout region in the entire image, and the types of the cutout objects corresponding to the respective cutout views are different. The classification result is then returned to the client and displayed on the client.
The cloud server then feeds the classification result back to the client, and the final classification result is displayed to the user through a graphical user interface of the client. The optional ways of displaying the classification result on the graphical user interface have been described in the above embodiments and are not repeated here.
It should be noted that the surface feature classification method provided in this embodiment of the present application is suitable for, but not limited to, practical scenarios of surface feature classification, change detection, remote sensing image classification, and surface feature recognition. Through interaction between an SaaS server and a client, the image to be classified is analyzed with a classification model to obtain a classification result, and the returned classification result is displayed on the client.
It should be noted that, for simplicity of description, the above-mentioned method embodiments are described as a series of acts or combination of acts, but those skilled in the art will recognize that the present invention is not limited by the order of acts, as some steps may occur in other orders or concurrently in accordance with the invention. Further, those skilled in the art should also appreciate that the embodiments described in the specification are preferred embodiments and that the acts and modules referred to are not necessarily required by the invention.
Through the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but the former is a better implementation mode in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a computer-readable storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
Example 2
According to an embodiment of the present invention, there is further provided an apparatus for implementing the above method for acquiring a neural network model. Fig. 15 is a schematic structural diagram of an apparatus for acquiring a neural network model according to an embodiment of the present invention; as shown in fig. 15, the apparatus includes:
an obtaining module 110, configured to obtain a target type training set; and a training module 112, configured to train a first neural network model by using the target type training set to obtain a second neural network model, where the first neural network model is obtained through machine learning training with multiple sets of unlabeled data, and each set of data in the multiple sets of unlabeled data includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
It should be noted here that the obtaining module 110 and the training module 112 correspond to steps S202 to S204 in embodiment 1; the examples and application scenarios implemented by the two modules are the same as those of the corresponding steps, but are not limited to the disclosure of embodiment 1. It should also be noted that the above modules may run in the computer terminal 10 provided in embodiment 1 as a part of the apparatus.
In this embodiment of the invention, the obtaining module is configured to obtain a target type training set, and the training module is configured to train a first neural network model by using the target type training set to obtain a second neural network model, where the first neural network model is obtained through machine learning training with multiple sets of unlabeled data, and each set of data includes a pre-training image and a first view and a second view obtained by performing data enhancement processing on the pre-training image. It is easy to note that the first neural network model is trained in advance on multiple sets of unlabeled data through machine learning, and the obtained target type training set is then used to train the first neural network model to obtain the second neural network model, thereby achieving the purpose of adapting a model pre-trained on unlabeled data to the target task with the target type training set.
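The two-stage division of labor between the modules (pre-training on unlabeled view pairs, then fine-tuning on the target type training set) can be sketched as follows. The tiny linear models, learning rates, and data shapes are illustrative assumptions, not the embodiment's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(1)

def pretrain(view_pairs, dim=8, lr=0.1, steps=50):
    """Stage 1 ('first neural network model'): learn a projection that gives
    two enhanced views of the same pre-training image similar features."""
    w = rng.normal(size=(16, dim))
    for _ in range(steps):
        for v1, v2 in view_pairs:
            diff = (v1 - v2) @ w  # feature gap between the paired views
            w -= lr * np.outer(v1 - v2, diff) / len(view_pairs)
    return w

def finetune(w, labeled_set, lr=0.001, steps=100):
    """Stage 2 ('second neural network model'): keep the pre-trained
    projection and fit a linear head on the target type training set."""
    head = np.zeros(w.shape[1])
    for _ in range(steps):
        for x, y in labeled_set:
            feat = x @ w
            head -= lr * ((feat @ head) - y) * feat
    return head

# Unlabeled data: each set is a pre-training image plus two enhanced views of it.
pairs = []
for _ in range(8):
    img = rng.normal(size=16)
    pairs.append((img + 0.01 * rng.normal(size=16),
                  img + 0.01 * rng.normal(size=16)))
w = pretrain(pairs)

# Target type training set: labeled samples for the downstream task.
labeled = [(rng.normal(size=16), float(y)) for y in (0, 1, 0, 1)]
head = finetune(w, labeled)
print(head.shape)  # (8,)
```

The design point is that only the small head is fitted on the (typically scarce) labeled target data, while the projection comes from abundant unlabeled data.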
It should be noted that, reference may be made to the relevant description in embodiment 1 for a preferred implementation of this embodiment, and details are not described here again.
Example 3
According to an embodiment of the present invention, there is further provided an electronic device, which may be any computing device in a computing device group. The electronic device includes a processor and a memory, wherein:
a processor; and a memory, connected to the processor, for providing the processor with instructions for processing the following steps: acquiring an image to be classified; analyzing the image to be classified by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; the views are a plurality of cutout views obtained by performing data enhancement processing on the entire image corresponding to each frame of image, each cutout view corresponds to a different cutout area in the entire image, and the types of the cutout objects corresponding to the respective cutout views are different; and displaying the classification result.
In this embodiment of the invention, the image to be classified is acquired; the image to be classified is analyzed by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; and the classification result is displayed.
It is easy to note that a classification model is trained in advance using a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image, and the image to be classified is analyzed with the classification model to obtain a classification result. This achieves the purpose of constructing a more accurate classification model for the target task: by dividing one image into one or more different views and performing local area enhancement, the classification accuracy of the classification model is improved, which solves the technical problem in the prior art that classification accuracy is low because a surface feature classification model is not constructed accurately for the target task.
It should be noted that, reference may be made to the relevant description in embodiment 1 for a preferred implementation of this embodiment, and details are not described here again.
Example 4
According to the embodiment of the invention, the embodiment of the computer terminal is also provided, and the computer terminal can be any computer terminal device in a computer terminal group. Optionally, in this embodiment, the computer terminal may also be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, the computer terminal may be located in at least one network device of a plurality of network devices of a computer network.
In this embodiment, the computer terminal may execute program code for the following steps of the image classification method: acquiring an image to be classified; analyzing the image to be classified by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; the views are a plurality of cutout views obtained by performing data enhancement processing on the entire image corresponding to each frame of image, each cutout view corresponds to a different cutout area in the entire image, and the types of the cutout objects corresponding to the respective cutout views are different; and displaying the classification result.
Alternatively, fig. 16 is a block diagram of another computer terminal according to an embodiment of the present invention, and as shown in fig. 16, the computer terminal may include: one or more processors 122 (only one of which is shown), memory 124, and peripherals interface 126.
The memory may be configured to store software programs and modules, such as program instructions/modules corresponding to the image classification method and apparatus in the embodiments of the present invention, and the processor executes various functional applications and data processing by running the software programs and modules stored in the memory, so as to implement the image classification method. The memory may include high speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory located remotely from the processor, and these remote memories may be connected to the computer terminal through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The processor can call the information and application programs stored in the memory through the transmission device to perform the following steps: acquiring an image to be classified; analyzing the image to be classified by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; the views are a plurality of cutout views obtained by performing data enhancement processing on the entire image corresponding to each frame of image, each cutout view corresponds to a different cutout area in the entire image, and the types of the cutout objects corresponding to the respective cutout views are different; and displaying the classification result.
Optionally, the processor may further execute program code for the following steps: analyzing the image to be classified by using the first neural network model to obtain the classification result, where each set of data in the multiple sets of unlabeled data used by the first neural network model includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, the processor may further execute program code for the following steps: analyzing the image to be classified by using the second neural network model to obtain the classification result, where the second neural network model is obtained by training the first neural network model with a target type training set, and each set of data in the multiple sets of unlabeled data used by the first neural network model includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, the processor may further execute program code for the following steps: presenting a plurality of task types within the graphical user interface; in response to a touch operation acting on the graphical user interface, determining a target task type from the plurality of task types; displaying the first neural network model corresponding to the target task type in the graphical user interface; and obtaining the target type training set, and performing model training on the first neural network model with the target type training set to obtain the second neural network model corresponding to the target task type.
Optionally, the processor may further execute program code for the following steps: obtaining the first view and the second view by performing data enhancement processing on the pre-training image; acquiring a third view corresponding to the pre-training image; and pre-training an initial neural network model with the first view, the second view, and the third view to obtain the first neural network model.
Optionally, the processor may further execute program code for the following steps: acquiring a codebook corresponding to the third view by using the initial neural network model; performing contrastive learning on the first view and the second view based on the codebook to obtain a comparison result; and adjusting the initial neural network model according to the comparison result to obtain the first neural network model.
Optionally, the processor may further execute program code for the following steps: encoding the first view based on the codebook to obtain a first encoding result, and encoding the second view based on the codebook to obtain a second encoding result; performing feature extraction processing on the first encoding result based on the codebook to obtain a first feature vector, and performing feature extraction processing on the second encoding result based on the codebook to obtain a second feature vector; and obtaining the comparison result by using the first feature vector and the second feature vector.
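One plausible reading of these codebook steps is sketched below: each view is encoded as the index of its nearest codebook entry, a feature vector is extracted from that entry, and the comparison result is the similarity of the two feature vectors. The codebook size, nearest-neighbor quantization rule, and cosine comparison are illustrative assumptions, not details fixed by the embodiment:

```python
import numpy as np

def encode_with_codebook(view, codebook):
    """Encoding result: the index of the codebook entry nearest to the view."""
    distances = np.linalg.norm(codebook - view, axis=1)
    return int(distances.argmin())

def feature_from_code(code, codebook):
    """Feature extraction: look the code up in the codebook and L2-normalize it."""
    entry = codebook[code]
    return entry / (np.linalg.norm(entry) + 1e-8)

def compare(feature_1, feature_2):
    """Comparison result: cosine similarity of the two feature vectors."""
    return float(feature_1 @ feature_2)

rng = np.random.default_rng(2)
codebook = rng.normal(size=(32, 8))  # 32 codewords of dimension 8
# Two enhanced views that both lie near codeword 5 of the codebook.
first_view = codebook[5] + 0.01 * rng.normal(size=8)
second_view = codebook[5] - 0.01 * rng.normal(size=8)

code_1 = encode_with_codebook(first_view, codebook)
code_2 = encode_with_codebook(second_view, codebook)
result = compare(feature_from_code(code_1, codebook),
                 feature_from_code(code_2, codebook))
print(code_1, code_2)  # 5 5: both views map to the same codeword
```

A high comparison result for views of the same image (and a low one across images) is the signal used to adjust the initial neural network model.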
Optionally, the processor may further execute program code for the following steps: acquiring a natural resource image to be classified; analyzing the natural resource image to be classified by using a surface feature classification model to obtain a natural resource surface feature type corresponding to the natural resource image to be classified, where the surface feature classification model is obtained by training with a plurality of frames of natural resource images and a plurality of views obtained by performing data enhancement processing on each frame of natural resource image; and displaying the natural resource surface feature type.
Optionally, the processor may further execute program code for the following steps: dynamically displaying the natural resource surface feature type in the graphical user interface; receiving a modification instruction for the natural resource surface feature type, adjusting the natural resource surface feature type corresponding to the natural resource image to be classified, and dynamically displaying the modified natural resource surface feature type again in the graphical user interface; or receiving a confirmation instruction for the natural resource surface feature type, and storing the natural resource surface feature type corresponding to the natural resource image to be classified for subsequent use.
Optionally, the processor may further execute program code for the following steps: acquiring a building image to be detected; analyzing the building image to be detected by using a change detection model, and determining a detection result indicating whether the building image to be detected has changed, where the change detection model is obtained by training with a plurality of frames of building images and a plurality of views obtained by performing data enhancement processing on each frame of building image; and displaying the detection result.
Optionally, the processor may further execute program code for the following steps: receiving an image to be classified from a client; analyzing the image to be classified by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; and returning the classification result to the client for display on the client.
The embodiment of the present invention provides a scheme for acquiring a neural network model: a target type training set is obtained, and a first neural network model is trained with the target type training set to obtain a second neural network model, where the first neural network model is obtained through machine learning training with multiple sets of unlabeled data, and each set of data includes a pre-training image and a first view and a second view obtained by performing data enhancement processing on the pre-training image. In this way, the first neural network model obtained from unlabeled data is trained with the target type training set to obtain the second neural network model, which solves the technical problem in the prior art that the classification accuracy of a surface feature classification model is low because the model is not constructed accurately for the target task.
It can be understood by those skilled in the art that the structure shown in fig. 16 is only illustrative, and the computer terminal may also be a terminal device such as a smartphone (e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, or a Mobile Internet Device (MID). Fig. 16 does not limit the structure of the electronic device; for example, the computer terminal may include more or fewer components (e.g., a network interface or a display device) than shown in fig. 16, or have a configuration different from that shown in fig. 16.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing hardware associated with the terminal device, where the program may be stored in a computer-readable storage medium, and the computer-readable storage medium may include: flash disks, Read-Only memories (ROMs), Random Access Memories (RAMs), magnetic or optical disks, and the like.
Example 5
Embodiments of a computer-readable storage medium are also provided according to embodiments of the present invention. Optionally, in this embodiment, the computer-readable storage medium may be configured to store program code for executing the image classification method, the neural network model acquisition method, the surface feature change detection method, the surface feature classification method, the remote sensing image classification method, and the surface feature recognition method provided in the foregoing embodiments.
Optionally, in this embodiment, the computer-readable storage medium may be located in any one of a group of computer terminals in a computer network, or in any one of a group of mobile terminals.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: acquiring an image to be classified; analyzing the image to be classified by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; the views are a plurality of cutout views obtained by performing data enhancement processing on the entire image corresponding to each frame of image, each cutout view corresponds to a different cutout area in the entire image, and the types of the cutout objects corresponding to the respective cutout views are different; and displaying the classification result.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: analyzing the image to be classified by using the first neural network model to obtain the classification result, where each set of data in the multiple sets of unlabeled data used by the first neural network model includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: analyzing the image to be classified by using the second neural network model to obtain the classification result, where the second neural network model is obtained by training the first neural network model with a target type training set, and each set of data in the multiple sets of unlabeled data used by the first neural network model includes: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: presenting a plurality of task types within the graphical user interface; in response to a touch operation acting on the graphical user interface, determining a target task type from the plurality of task types; displaying the first neural network model corresponding to the target task type in the graphical user interface; and obtaining the target type training set, and performing model training on the first neural network model with the target type training set to obtain the second neural network model corresponding to the target task type.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: obtaining the first view and the second view by performing data enhancement processing on the pre-training image; acquiring a third view corresponding to the pre-training image; and pre-training an initial neural network model with the first view, the second view, and the third view to obtain the first neural network model.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: acquiring a codebook corresponding to the third view by using the initial neural network model; performing contrastive learning on the first view and the second view based on the codebook to obtain a comparison result; and adjusting the initial neural network model according to the comparison result to obtain the first neural network model.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: encoding the first view based on the codebook to obtain a first encoding result, and encoding the second view based on the codebook to obtain a second encoding result; performing feature extraction processing on the first encoding result based on the codebook to obtain a first feature vector, and performing feature extraction processing on the second encoding result based on the codebook to obtain a second feature vector; and obtaining the comparison result by using the first feature vector and the second feature vector.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: acquiring a natural resource image to be classified; analyzing the natural resource image to be classified by using a surface feature classification model to obtain a natural resource surface feature type corresponding to the natural resource image to be classified, where the surface feature classification model is obtained by training with a plurality of frames of natural resource images and a plurality of views obtained by performing data enhancement processing on each frame of natural resource image; and displaying the natural resource surface feature type.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: dynamically displaying the natural resource surface feature type in the graphical user interface; receiving a modification instruction for the natural resource surface feature type, adjusting the natural resource surface feature type corresponding to the natural resource image to be classified, and dynamically displaying the modified natural resource surface feature type again in the graphical user interface; or receiving a confirmation instruction for the natural resource surface feature type, and storing the natural resource surface feature type corresponding to the natural resource image to be classified for subsequent use.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: acquiring a building image to be detected; analyzing the building image to be detected by using a change detection model, and determining a detection result indicating whether the building image to be detected has changed, where the change detection model is obtained by training with a plurality of frames of building images and a plurality of views obtained by performing data enhancement processing on each frame of building image; and displaying the detection result.
Optionally, in this embodiment, the computer-readable storage medium is configured to store program code for performing the following steps: receiving an image to be classified from a client; analyzing the image to be classified by using a classification model to obtain a classification result, where the classification model is obtained by training with a plurality of frames of images and a plurality of views obtained by performing data enhancement processing on each frame of image; and returning the classification result to the client for display on the client.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
In the above embodiments of the present invention, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided by the present invention, it should be understood that the disclosed technical content may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division of the units is only one kind of logical function division, and other divisions may be adopted in actual implementations: a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, units, or modules, and may be electrical or in other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit, if implemented in the form of a software functional unit and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a computer-readable storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned computer-readable storage media include: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that those skilled in the art can make various modifications and improvements without departing from the principle of the present invention, and such modifications and improvements should also fall within the protection scope of the present invention.
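The codebook-based contrastive pre-training outlined in the embodiments (encode two augmented views against a codebook acquired from a third view, extract features from the coding results, then compare the features) can be sketched roughly as follows. This is a toy illustration only: the scalar codebook, the histogram features, and the cosine-similarity comparison are stand-ins for the learned components of the actual method:

```python
import math

def encode(view, codebook):
    # Nearest-codeword encoding: map each value to the index of the
    # closest codebook entry (a stand-in for learned quantization).
    return [min(range(len(codebook)), key=lambda i: abs(codebook[i] - v))
            for v in view]

def code_histogram(codes, num_codes):
    # Feature extraction sketch: normalized codeword-usage histogram.
    counts = [0.0] * num_codes
    for c in codes:
        counts[c] += 1.0
    total = sum(counts)
    return [c / total for c in counts]

def cosine(a, b):
    # Comparison result: cosine similarity between the two feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

codebook = [0.0, 1.0]          # toy codebook acquired from the "third view"
first_view = [0.1, 0.9, 0.2]   # two augmented views of the same image
second_view = [0.0, 0.8, 0.1]
f1 = code_histogram(encode(first_view, codebook), len(codebook))
f2 = code_histogram(encode(second_view, codebook), len(codebook))
similarity = cosine(f1, f2)    # high for two views of the same image
```

In the actual method, such a comparison result would drive the adjustment of the initial neural network model; here it is only computed to show the data flow.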

Claims (13)

1. An image classification method, comprising:
acquiring an image to be classified;
analyzing the image to be classified by using a classification model to obtain a classification result, wherein the classification model is obtained by training with a plurality of views obtained by performing data enhancement processing on each frame of image in a plurality of frames of images, the plurality of views are a plurality of cropped views, the plurality of cropped views are obtained by performing data enhancement processing on the whole image corresponding to each frame of image, each cropped view in the plurality of cropped views corresponds to a different cropped region in the whole image, and the types of cropped objects corresponding to the respective cropped views in the plurality of cropped views are different;
and displaying the classification result.
2. The image classification method according to claim 1, wherein the classification model comprises a first neural network model, and analyzing the image to be classified by using the classification model to obtain the classification result comprises:
analyzing the image to be classified by using the first neural network model to obtain the classification result, wherein each group of data in a plurality of groups of unlabeled data used by the first neural network model comprises: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
3. The image classification method according to claim 1, wherein the classification model comprises a second neural network model, and analyzing the image to be classified by using the classification model to obtain the classification result comprises:
analyzing the image to be classified by using the second neural network model to obtain the classification result, wherein the second neural network model is obtained by training a first neural network model with a target type training set, and each group of data in a plurality of groups of unlabeled data used by the first neural network model comprises: a pre-training image, and a first view and a second view obtained by performing data enhancement processing on the pre-training image.
4. The method of claim 3, wherein a graphical user interface is provided by the electronic device, the content displayed by the graphical user interface at least partially including an image classification scene, the method further comprising:
presenting a plurality of task types within the graphical user interface;
determining a target task type from the plurality of task types in response to a touch operation acting on the graphical user interface;
displaying the first neural network model corresponding to the target task type in the graphical user interface;
and obtaining the target type training set, and performing model training on the first neural network model by adopting the target type training set to obtain the second neural network model corresponding to the target task type.
5. The image classification method according to claim 2 or 3, characterized in that the method further comprises:
performing data enhancement processing on the pre-training image to obtain the first view and the second view;
acquiring a third view corresponding to the pre-training image;
and pre-training an initial neural network model by adopting the first view, the second view and the third view to obtain the first neural network model.
6. The image classification method of claim 5, wherein pre-training the initial neural network model using the first view, the second view, and the third view to obtain the first neural network model comprises:
acquiring a codebook corresponding to the third view by using the initial neural network model;
performing contrastive learning on the first view and the second view based on the codebook to obtain a comparison result;
and adjusting the initial neural network model according to the comparison result to obtain the first neural network model.
7. The image classification method according to claim 6, wherein performing contrastive learning on the first view and the second view based on the codebook to obtain the comparison result comprises:
coding the first view based on the codebook to obtain a first coding result, and coding the second view based on the codebook to obtain a second coding result;
performing feature extraction processing on the first coding result based on the codebook to obtain a first feature vector, and performing feature extraction processing on the second coding result based on the codebook to obtain a second feature vector;
and obtaining the comparison result by using the first feature vector and the second feature vector.
8. A natural resource surface feature classification method, comprising:
acquiring a natural resource image to be classified;
analyzing the natural resource image to be classified by using a surface feature classification model, and acquiring a natural resource surface feature type corresponding to the natural resource image to be classified, wherein the surface feature classification model is obtained by training with a plurality of frames of natural resource images and a plurality of views obtained by performing data enhancement processing on each frame of natural resource image in the plurality of frames of natural resource images, the plurality of views are a plurality of cropped views, the plurality of cropped views are obtained by performing data enhancement processing on the whole image corresponding to each frame of natural resource image, each cropped view in the plurality of cropped views corresponds to a different cropped region in the whole image, and the types of cropped objects corresponding to the respective cropped views in the plurality of cropped views are different;
and displaying the natural resource surface feature type.
9. The method according to claim 8, wherein a graphical user interface is provided by the electronic device, the content displayed by the graphical user interface at least partially includes a feature classification display scene, and the method further comprises:
dynamically presenting the natural resource feature type within the graphical user interface;
receiving a modification instruction for the natural resource surface feature type, adjusting the natural resource surface feature type corresponding to the natural resource image to be classified, and dynamically displaying the modified natural resource surface feature type again in the graphical user interface; or receiving a confirmation instruction for the natural resource surface feature type, and storing the natural resource surface feature type corresponding to the natural resource image to be classified for subsequent use.
10. A method for detecting a change in a building feature, comprising:
acquiring a building image to be detected;
analyzing the building image to be detected by using a change detection model, and determining a detection result of whether the building image to be detected has changed, wherein the change detection model is obtained by training with a plurality of views obtained by performing data enhancement processing on each frame of building image in a plurality of frames of building images, the plurality of views are a plurality of cropped views, the plurality of cropped views are obtained by performing data enhancement processing on the whole image corresponding to each frame of building image, each cropped view in the plurality of cropped views corresponds to a different cropped region in the whole image, and the types of cropped objects corresponding to the respective cropped views in the plurality of cropped views are different;
and displaying the detection result.
11. An image classification method, comprising:
receiving an image to be classified from a client;
analyzing the image to be classified by using a classification model to obtain a classification result, wherein the classification model is obtained by training with a plurality of views obtained by performing data enhancement processing on each frame of image in a plurality of frames of images, the plurality of views are a plurality of cropped views, the plurality of cropped views are obtained by performing data enhancement processing on the whole image corresponding to each frame of image, each cropped view in the plurality of cropped views corresponds to a different cropped region in the whole image, and the types of cropped objects corresponding to the respective cropped views in the plurality of cropped views are different;
and returning the classification result to the client and displaying the classification result on the client.
12. A computer-readable storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the computer-readable storage medium is located to perform the image classification method according to any one of claims 1 to 5.
13. An electronic device, comprising:
a processor; and
a memory, coupled to the processor, for providing the processor with instructions for performing the following processing steps:
acquiring an image to be classified;
analyzing the image to be classified by using a classification model to obtain a classification result, wherein the classification model is obtained by training with a plurality of views obtained by performing data enhancement processing on each frame of image in a plurality of frames of images, the plurality of views are a plurality of cropped views, the plurality of cropped views are obtained by performing data enhancement processing on the whole image corresponding to each frame of image, each cropped view in the plurality of cropped views corresponds to a different cropped region in the whole image, and the types of cropped objects corresponding to the respective cropped views in the plurality of cropped views are different;
and displaying the classification result.
CN202111227616.9A 2021-10-21 2021-10-21 Image classification method, storage medium and electronic device Active CN114140637B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111227616.9A CN114140637B (en) 2021-10-21 2021-10-21 Image classification method, storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN114140637A true CN114140637A (en) 2022-03-04
CN114140637B CN114140637B (en) 2023-09-12

Family

ID=80395474

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111227616.9A Active CN114140637B (en) 2021-10-21 2021-10-21 Image classification method, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN114140637B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115497010A (en) * 2022-09-30 2022-12-20 北京恒歌科技有限公司 Deep learning-based geographic information identification method and system
US11930022B2 (en) * 2019-12-10 2024-03-12 Fortinet, Inc. Cloud-based orchestration of incident response using multi-feed security event classifications

Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017096758A1 (en) * 2015-12-11 2017-06-15 腾讯科技(深圳)有限公司 Image classification method, electronic device, and storage medium
CN107886509A (en) * 2017-11-24 2018-04-06 苏州珂锐铁电气科技有限公司 A kind of image deflects recognition methods, electronic equipment, storage medium and system
CN110119677A (en) * 2019-03-28 2019-08-13 东南大学 Carbon fiber composite core cable damage testing method based on image classification network
CN110163236A (en) * 2018-10-15 2019-08-23 腾讯科技(深圳)有限公司 The training method and device of model, storage medium, electronic device
US20200012854A1 (en) * 2017-09-08 2020-01-09 Tencent Technology (Shenzhen) Company Ltd Processing method for augmented reality scene, terminal device, system, and computer storage medium
CN110781919A (en) * 2019-09-23 2020-02-11 腾讯云计算(北京)有限责任公司 Classification model training method, classification device and classification equipment
US20200074243A1 (en) * 2017-11-30 2020-03-05 Tencent Technology (Shenzhen) Company Limited Image classification method, personalized recommendation method, computer device and storage medium
CN111160434A (en) * 2019-12-19 2020-05-15 中国平安人寿保险股份有限公司 Training method and device of target detection model and computer readable storage medium
CN111489370A (en) * 2020-03-29 2020-08-04 复旦大学 Remote sensing image segmentation method based on deep learning
US20200250497A1 (en) * 2017-11-01 2020-08-06 Tencent Technology (Shenzhen) Company Limited Image classification method, server, and computer-readable storage medium
US20200250491A1 (en) * 2017-11-01 2020-08-06 Tencent Technology (Shenzhen) Company Limited Image classification method, computer device, and computer-readable storage medium
CN111598174A (en) * 2020-05-19 2020-08-28 中国科学院空天信息创新研究院 Training method of image ground feature element classification model, image analysis method and system
WO2020190772A1 (en) * 2019-03-15 2020-09-24 Futurewei Technologies, Inc. Neural network model compression and optimization
CN111951285A (en) * 2020-08-12 2020-11-17 湖南神帆科技有限公司 Optical remote sensing image woodland classification method based on cascade deep convolutional neural network
CN112232448A (en) * 2020-12-14 2021-01-15 北京大恒普信医疗技术有限公司 Image classification method and device, electronic equipment and storage medium
CN112272830A (en) * 2018-04-20 2021-01-26 希侬人工智能公司 Image classification by label delivery
CN112364933A (en) * 2020-11-23 2021-02-12 北京达佳互联信息技术有限公司 Image classification method and device, electronic equipment and storage medium
CN112734641A (en) * 2020-12-31 2021-04-30 百果园技术(新加坡)有限公司 Training method and device of target detection model, computer equipment and medium
US20210212561A1 (en) * 2018-03-02 2021-07-15 Kowa Company, Ltd. Image classification method, device, and program
CN113435522A (en) * 2021-06-30 2021-09-24 平安科技(深圳)有限公司 Image classification method, device, equipment and storage medium
CN113468939A (en) * 2020-11-30 2021-10-01 电子科技大学 SAR target recognition method based on supervised minimization deep learning model
US20210326708A1 (en) * 2019-05-21 2021-10-21 Beijing Sensetime Technology Development Co., Ltd. Neural network training method and apparatus, and image processing method and apparatus

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
FANG HSUAN CHENG; MOO DI LOO: "An Image Inpainting Method for Stereoscopic Images Based on Hole Classification", 2014 7th International Conference on Ubi-Media Computing and Workshops, page 271 *
LI Xiangxia; JI Xiaohui; LI Bin: "Deep Learning Methods for Fine-Grained Image Classification" (in Chinese), Journal of Frontiers of Computer Science and Technology, vol. 15, no. 10, pages 1830-1842 *
ZHAO Yongwei; LI Ting; LIN Boyu: "Image Classification Method Based on Deep Learning Coding Model" (in Chinese), Advanced Engineering Sciences, no. 01 *

Also Published As

Publication number Publication date
CN114140637B (en) 2023-09-12

Similar Documents

Publication Publication Date Title
Jia et al. Detection and segmentation of overlapped fruits based on optimized mask R-CNN application in apple harvesting robot
Ocer et al. Tree extraction from multi-scale UAV images using Mask R-CNN with FPN
Chen et al. MANet: A multi-level aggregation network for semantic segmentation of high-resolution remote sensing images
CN111209810A (en) Bounding box segmentation supervision deep neural network architecture for accurately detecting pedestrians in real time in visible light and infrared images
CN110414387A (en) A kind of lane line multi-task learning detection method based on lane segmentation
CN107909015A (en) Hyperspectral image classification method based on convolutional neural networks and empty spectrum information fusion
CN111368846B (en) Road ponding identification method based on boundary semantic segmentation
CN114140637B (en) Image classification method, storage medium and electronic device
CN111325271B (en) Image classification method and device
CN110555420B (en) Fusion model network and method based on pedestrian regional feature extraction and re-identification
CN115761529B (en) Image processing method and electronic device
CN104063686A (en) System and method for performing interactive diagnosis on crop leaf segment disease images
Rathore et al. Real-time continuous feature extraction in large size satellite images
Guo et al. Underwater sea cucumber identification via deep residual networks
Pang et al. SGBNet: An ultra light-weight network for real-time semantic segmentation of land cover
CN112561973A (en) Method and device for training image registration model and electronic equipment
Xu et al. Real-time and accurate detection of citrus in complex scenes based on HPL-YOLOv4
CN110598705B (en) Semantic annotation method and device for image
Du et al. Open-pit mine change detection from high resolution remote sensing images using DA-UNet++ and object-based approach
CN113971757A (en) Image classification method, computer terminal and storage medium
Zhao et al. A novel strategy for pest disease detection of Brassica chinensis based on UAV imagery and deep learning
CN114092920A (en) Model training method, image classification method, device and storage medium
Dong et al. A cloud detection method for GaoFen-6 wide field of view imagery based on the spectrum and variance of superpixels
Lu et al. Citrus green fruit detection via improved feature network extraction
Dong et al. A deep learning based framework for remote sensing image ground object segmentation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant