CN117115498A - Method and electronic device for recognizing a weather map - Google Patents

Method and electronic device for recognizing a weather map

Info

Publication number
CN117115498A
Authority
CN
China
Prior art keywords
layer
image
weather
feature
map
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310247067.4A
Other languages
Chinese (zh)
Inventor
周佳
郑龙
周黎
李晶
何燕飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Panzhihua Ecological Environment Information And Technology Assessment Service Center
Tsinghua Solution Information Technology Co ltd
Original Assignee
Panzhihua Ecological Environment Information And Technology Assessment Service Center
Tsinghua Solution Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panzhihua Ecological Environment Information And Technology Assessment Service Center, Tsinghua Solution Information Technology Co ltd filed Critical Panzhihua Ecological Environment Information And Technology Assessment Service Center
Priority to CN202310247067.4A
Publication of CN117115498A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/18Extraction of features or characteristics of the image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/19173Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition
    • G06V30/19Recognition using electronic means
    • G06V30/191Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    • G06V30/1918Fusion techniques, i.e. combining data from various sources, e.g. sensor fusion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

A method and an electronic device for recognizing a weather map. The method for recognizing a weather map comprises the following steps: acquiring a first weather map; extracting image fusion features of the first weather map so that a trained weather map recognition model predicts and recognizes the first weather map according to the image fusion features, wherein the image fusion features are formed by splicing text-level features and image-level features; and outputting the weather category of the first weather map.

Description

Method and electronic device for recognizing a weather map
Technical Field
The present invention relates to the field of weather map recognition technology, and in particular to a method and an electronic device for recognizing a weather map.
Background
In order to respond effectively to meteorological changes and improve disaster monitoring and early-warning capabilities, it is necessary to recognize weather maps. With the development of artificial intelligence, many technologies that analyze weather maps with neural networks have emerged. For example, the prior art processes a weather map with OpenCV (a cross-platform computer vision and machine learning software library): text in the weather map is recognized with Tesseract OCR (Optical Character Recognition), and a neural network built with TensorFlow (an end-to-end open-source machine learning platform) then recognizes and classifies the weather map.
However, the text recognition rate of Tesseract OCR is low, and its support for the text found in weather maps is poor. The approach relies on manually selected picture features such as curve angle contours, wind direction angles, air pressure magnitudes and similarity, which is cumbersome and cannot accurately mine hidden features. In addition, for input layers of the same size, a feedforward neural network requires too many model parameters, making optimal parameters difficult to obtain; that is, the Tesseract-OCR-based model is flawed and ignores the local correlation characteristic of weather maps.
Therefore, quickly outputting the weather category of a weather map with high accuracy is the problem this invention aims to solve.
Disclosure of Invention
The invention aims to provide a method and an electronic device for recognizing a weather map, which can at least recognize an input weather map, quickly output high-accuracy weather categories for each region (or a designated region) and each geographical position of the weather map, accurately analyze the spatial distribution and time-varying characteristics of pollutants during heavy pollution, and provide decision support for scientific understanding of the mechanism that forms heavy pollution.
According to an aspect of the present invention, at least one embodiment provides a method for recognizing a weather map, comprising: acquiring a first weather map; extracting image fusion features of the first weather map so that a trained weather map recognition model predicts and recognizes the first weather map according to the image fusion features, wherein the image fusion features are formed by splicing text-level features and image-level features; and outputting the weather category of the first weather map.
According to an aspect of the present invention, at least one embodiment also provides a method for training a weather map recognition model, comprising: acquiring weather map samples; if the data of the weather map samples is balanced and the data size meets the requirement, extracting training fusion features of the weather map samples, wherein the training fusion features are formed by splicing text-level features and image-level features; and training the weather map recognition model based on the training fusion features, stabilizing it with a categorical_crossentropy loss function.
According to another aspect of the present invention, at least one embodiment also provides an electronic device, including: a processor adapted to execute instructions; and a memory adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by the processor to perform the method for recognizing a weather map and/or the method for training a weather map recognition model described above.
According to another aspect of the present invention, at least one embodiment also provides a system for recognizing a weather map, comprising the electronic device provided by the invention.
According to another aspect of the present invention, at least one embodiment also provides a computer-readable non-volatile storage medium storing computer program instructions that, when executed by a computer, perform the method for recognizing a weather map and/or the method for training a weather map recognition model described above.
According to the embodiments of the invention, a weather map is taken as input and its text-level features and image-level features are extracted, where the text-level features (characters and their spatial positions) are extracted with an EAST text recognition model and the image-level features are extracted with a VGG16 transfer learning model; the weather map is then predicted and classified. The invention integrates several advantages of convolutional neural networks on image data (parameter sharing, local connection and translation invariance) while avoiding analysis of the weather map from fixed index angles such as simple curve angle contours, wind direction angles and air pressure magnitudes.
Drawings
To illustrate the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings required in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the invention, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a weather class according to an embodiment of the invention;
FIG. 2 is a schematic illustration of an application environment according to an embodiment of the invention;
FIG. 3 is a schematic diagram of an electronic device according to an embodiment of the invention;
FIG. 4 is a flow chart of a method for training a weather map recognition model according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a weather map recognition model according to an embodiment of the present invention;
FIG. 6 is a flow chart of a method for recognizing a weather map according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of an acquired weather map according to an embodiment of the present invention;
FIG. 8 is a schematic diagram of an EAST text recognition model according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a VGG16 transfer learning model according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of a weather map recognition model according to an embodiment of the present invention.
Detailed Description
The following describes the technical solutions in the embodiments of the present invention clearly and completely with reference to the accompanying drawings, in which some, but not all, embodiments of the invention are shown. All other embodiments obtained by those skilled in the art based on the embodiments of the invention without inventive effort fall within the scope of the invention.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present invention and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
A weather map, also known as a synoptic chart, is a general term for charts used to analyze atmospheric physical conditions and characteristics, and there are various weather categories depending on different requirements and purposes. Generally, as shown in FIG. 1, the weather categories include eight kinds: east high pressure, low-pressure center, low-pressure inverted trough, low-pressure equalization field, north high pressure, west high pressure, high-pressure center and high-pressure equalization field. Retrospective analysis of the weather map types over a heavy-pollution period in a region can provide decision support for analyzing the diffusion mechanism of pollutants, and for analyzing the spatial distribution and time-varying characteristics of pollutants during an expected heavy-pollution period.
1. East high pressure: located at the rear of a closed high or of the subtropical high; the surface wind is southerly and the ground is controlled by the southerly flow behind the high, so diffusion conditions are average;
2. Low-pressure center: located inside the closed isobars of a low-pressure system; the air flow rises and the weather is rainy, the atmosphere is unstable, and pollutants diffuse easily;
3. Low-pressure inverted trough: the isobars on the surface weather map form an "A" shape; the front of the trough line usually has southerly winds, rising motion and abundant water vapor, which can form clouds and precipitation, while the rear of the trough line mostly has northerly winds, sinking motion and generally clear-to-cloudy weather. The pressure field is weak; diffusion conditions are better in the eastern area and worse in the western area;
4. Low-pressure equalization field: inside an obvious equalization field of a low-pressure system; the atmosphere is stable, which is not conducive to diffusion;
5. North high pressure: located at the bottom of a closed high; the surface wind is easterly and diffusion conditions are average;
6. West high pressure: located ahead of a continental cold high and influenced by cold air; the near-surface flow is controlled by northerly winds, the wind strengthens and the humidity drops, so atmospheric diffusion conditions are good;
7. High-pressure center: located near a high-pressure ridge line or in the subtropical high; the weather is sunny with low wind speed, the air sinks, and temperature-inversion weather forms easily, which is not conducive to diffusion;
8. High-pressure equalization field: the ground generally lies in a weak weather system, the wind speed is below 1 m/s, the atmosphere is stable, and diffusion is not favored.
Currently, many technologies for recognizing and analyzing weather maps with neural networks are emerging. For example, one prior-art weather map feature type recognition system includes a weather map uploading module, a weather map recognition module and a similarity matching module, and recognizes weather maps at two different pressure levels, 500 hPa and surface_pres: the 500 hPa level is used to recognize westerly flow, upper-level ridges and subtropical-high weather in the weather map, and the surface_pres level is used to recognize high-pressure, equalization and typhoon weather in the weather map. However, such prior-art weather map recognition not only occupies a large amount of computing resources but also has low recognition accuracy, so a method that quickly and accurately recognizes the weather category of a weather map is strongly needed.
On the basis of at least one embodiment of the invention, a system for recognizing a weather map is provided, which comprises an electronic device for recognizing a weather map and/or an electronic device for training a weather map recognition model. The system may run in an environment as shown in FIG. 2, which may include a hardware environment and a network environment. The hardware environment includes: the electronic device for recognizing a weather map and/or the electronic device for training a weather map recognition model, hereinafter collectively referred to as electronic device 100; and server 200. The electronic device 100 may operate the server 200 through corresponding instructions so that data may be read, changed, added, and so on. The electronic device 100 may be one or more devices, and may include a plurality of processing nodes that act externally as a single device.
Optionally, the electronic device 100 may also send the acquired first weather map and/or the weather map samples to the server 200, so that the server 200 performs the method for recognizing a weather map and/or the method for training a weather map recognition model according to the present invention. Alternatively, the electronic device 100 may be connected to the server 200 through a network. The network includes wired networks and wireless networks. The wireless networks include, but are not limited to: wide area networks, metropolitan area networks, local area networks and mobile data networks. Typically, mobile data networks include, but are not limited to: Global System for Mobile communications (GSM) networks, Code Division Multiple Access (CDMA) networks, Wideband Code Division Multiple Access (WCDMA) networks, Long Term Evolution (LTE) networks, Wi-Fi networks, ZigBee networks, Bluetooth-based networks, and the like. Different types of communication networks may be operated by different operators. The type of communication network does not limit the embodiments of the present invention.
The electronic device 100, as shown in FIG. 3, includes: a processor 301; and a memory 303 configured to store computer program instructions adapted to be loaded and executed by the processor to perform the method for recognizing a weather map and/or the method for training a weather map recognition model of the present invention (described in more detail later). Optionally, at least one embodiment of the present invention further provides a computer-readable non-volatile storage medium storing computer program instructions that, when executed by a computer, perform the method for recognizing a weather map and/or the method for training a weather map recognition model of the present invention.
The processor 301 may be any suitable processor, for example a central processing unit, a microprocessor or an embedded processor, and may use an x86, ARM or similar architecture. The memory 303 may be any suitable memory device, for example a non-volatile memory device, including but not limited to magnetic, semiconductor and optical memory devices, and may be arranged as a single memory device, an array of memory devices, or a distributed memory device; the embodiments of the present invention are not limited in this respect.
It will be appreciated by those of ordinary skill in the art that the structure of the electronic device 100 described above is merely illustrative and does not limit the structure of the device. For example, the electronic device 100 may also include more or fewer components (e.g., a transmission device) than shown in FIG. 3. The transmission device is used for receiving or transmitting data via a network. In one example, the transmission device is a Radio Frequency (RF) module used to communicate with the Internet wirelessly.
In the above-described operating environment, at least one embodiment of the present invention proposes a method for training a weather map recognition model, which may be loaded and executed by the processor 301; the weather map recognition model formed via the method can at least recognize weather categories of weather maps quickly and with high accuracy. The flowchart of the method is shown in FIG. 4. It should be noted that the steps shown in the flowchart may be performed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flowchart, in some cases the steps may be performed in an order other than that shown here. The method may include the following steps:
Step S402: acquire weather map samples;
Step S404: if the data of the weather map samples is balanced and the data size meets the requirement, extract training fusion features of the weather map samples, wherein the training fusion features are formed by splicing text-level features and image-level features;
Step S406: train the weather map recognition model based on the training fusion features, and stabilize the weather map recognition model with a categorical_crossentropy loss function.
The training method takes weather map samples as input and uses a network model built on the ideas of transfer learning and multi-model feature fusion to judge the weather category of each sample intelligently and efficiently; the weather category of each region (or a designated region) is displayed immediately on a page together with its confidence, assisting staff in judging the weather category.
In step S402, weather map samples are acquired. Optionally, manually labeled historical weather map samples can be obtained for training and prediction, where the weather type of each sample is labeled manually and then reflected in the file name, and the labeling is done in one pass by the same group of people so that it stays consistent.
That is, in order to maintain data consistency of the weather map samples, the invention uses GFS (Global Forecast System) data to draw the weather map samples automatically with NCL (software commonly used for meteorological and ocean plotting), has personnel with meteorological experience label them, and stores them in a specified disk directory or on an FTP server. Storing in a specified disk directory or on an FTP server may work as follows: the acquisition program identifies the weather type from the file name, i.e., the weather map samples acquired for training can be distributed to different folders according to their classification, and the data transmission can use a remote transmission function or facilities such as ftp, sftp or ssh.
In step S404, if the data of the weather map samples is balanced and the data size meets the requirement, the training fusion features of the weather map samples are extracted, where the training fusion features are formed by splicing text-level features and image-level features. That is, the invention performs data cleaning and normalization on the weather map samples before extracting the fusion features, and then extracts the training fusion features mainly from the cleaned and normalized images.
Weather map pictures differ from other classification objects: they contain many isobars, and data such as map boundaries can strongly influence the model, so cleaning removes interference data such as map boundaries, legends and invalid numbers. During cleaning, pixel information can be read from the weather map picture with the Image and putpixel methods of PIL (Python Imaging Library, a powerful and easy-to-use image processing library for the Python platform), and the image is saved after the interference data has been removed, which helps the model produce high-quality output. The Image and putpixel methods are described in detail in the method for recognizing a weather map below.
In addition, in historical weather maps the distribution of samples is extremely unbalanced and the categories differ greatly in size; for example, the low-pressure center category has more than 60 times as many samples as the high-pressure equalization field category. On a small dataset, sample imbalance biases the model prediction toward the large categories, so model accuracy suffers. Unbalanced weather map samples are therefore "extended" by a series of random transformations and the processed data is output; for example, the original unbalanced data is up-sampled with the ImageDataGenerator method in Keras (an open-source Python deep learning framework) so that the weather map sample data is balanced and the dataset is extended. The subsequent training then never sees exactly the same image twice, which helps prevent overfitting and helps the trained model generalize better.
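As a concrete illustration, the following minimal sketch shows how such up-sampling might look with Keras' ImageDataGenerator; the transformation ranges and the upsample helper are illustrative assumptions, not parameters taken from this document:

```python
# Hypothetical augmentation settings: only small geometric jitter, so the
# isobar geometry of a weather map stays physically plausible.
from keras.preprocessing.image import ImageDataGenerator, img_to_array
import numpy as np

augmenter = ImageDataGenerator(
    rotation_range=5,
    width_shift_range=0.05,
    height_shift_range=0.05,
    zoom_range=0.05,
    fill_mode="nearest",
)

def upsample(images, target_count):
    """Return target_count arrays: the originals plus randomly transformed copies."""
    arrays = [img_to_array(im) for im in images]          # PIL images -> arrays
    flow = augmenter.flow(np.stack(arrays), batch_size=1, shuffle=True)
    while len(arrays) < target_count:
        arrays.append(next(flow)[0])                      # one augmented copy
    return arrays
```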
Therefore, after data cleaning and normalization of the weather map samples, if the data is balanced and the data size meets the requirement, the training fusion features of the samples are extracted, where the training fusion features are formed by splicing the text-level feature2 and the image-level feature1. For example, the text-level feature2 of a weather map sample is extracted with the EAST text recognition model; the image-level feature1 is extracted with the VGG16 transfer learning model; and feature2 and feature1 are merged and then normalized to form the fusion feature.
That is, the invention uses the open-source EAST text recognition model to recognize character-level features in the weather map, such as high-pressure centers, low-pressure centers, the values of isobars and the positions of the high- and low-pressure centers, and uses transfer learning to call the VGG16 model (with its fully connected module removed) and retrain it to obtain deep image-level features of the picture. Reusing model parameters through transfer learning both improves training efficiency and reduces training time while yielding advanced image-level features.
Because the invention calls the VGG16 transfer learning model, there are fewer parameters to train: the model has already been trained on the ImageNet dataset, features related to the present dataset can be learned by fine-tuning, and hidden features in the data can be found without laboriously constructing a complex feedforward network to learn features. In the method for recognizing a weather map, the EAST text recognition model and the VGG16 transfer learning model and their parameters may be reused; both models are described in detail in the course of that method below.
In step S406, the weather map recognition model is trained based on the training fusion features and stabilized with the categorical_crossentropy loss function; the training results are stored in the image storage unit or the memory 303 and consist of the predicted weather category and its confidence. Optionally, the weather map recognition model includes, but is not limited to, an input layer, a first fully connected layer, a first Dropout layer, a second fully connected layer, a second Dropout layer and an output layer, and training may include: the training fusion features pass sequentially through the input layer, the first fully connected layer, the first Dropout layer, the second fully connected layer and the second Dropout layer, and are output at the output layer. For example, as shown in FIG. 5, the input layer is a concat layer; the first fully connected layer includes, but is not limited to, 1024 output neurons (fc 1024); the first Dropout layer drops a certain ratio, e.g. 20%, of the output neurons (dropout 0.2); the second fully connected layer includes, but is not limited to, 256 output neurons (fc 256); the second Dropout layer drops a certain ratio, e.g. 20%, of the output neurons (dropout 0.2); and the output layer includes, but is not limited to, 8 output neurons (fc 8). The numbers of output neurons can be adjusted flexibly according to actual requirements; for example, the output layer is set to 8 neurons to recognize 8 weather types and to 12 neurons to recognize 12 weather types.
That is, the invention takes the training fusion features as input to the neural-network weather map recognition model and obtains the probability distribution over weather categories through the input layer, two fully connected layers, two Dropout layers and a Softmax layer. An optional procedure is: construct the model from a concat layer (the input layer), two fully connected layers, two Dropout layers and an output layer; the optimizer is SGD (stochastic gradient descent) with a learning rate of 0.0001, and the loss function is categorical_crossentropy.
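A minimal Keras sketch of this classification head follows; the 1536-dimensional fused input is an assumption for illustration (the document does not state the fused vector length):

```python
# Classification head per the text: fc1024 -> dropout 0.2 -> fc256 ->
# dropout 0.2 -> fc8 softmax, SGD lr=0.0001, categorical_crossentropy.
from keras.models import Model
from keras.layers import Input, Dense, Dropout
from keras.optimizers import SGD

fused = Input(shape=(1536,), name="concat")          # fused text + image features
x = Dense(1024, activation="relu", name="fc1024")(fused)
x = Dropout(0.2)(x)                                  # drop 20% of neurons
x = Dense(256, activation="relu", name="fc256")(x)
x = Dropout(0.2)(x)
out = Dense(8, activation="softmax", name="fc8")(x)  # 8 weather categories

model = Model(fused, out)
model.compile(optimizer=SGD(learning_rate=0.0001),   # older Keras spells this `lr`
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```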
In this way, a fast, high-accuracy weather map recognition model can be trained; the model combines the EAST text recognition model, the VGG16 transfer learning model and other components into a multi-model feature-fusion classification and recognition system. Subsequently, the invention can use the trained weather map recognition model to recognize and analyze the weather types of any weather map (not limited to the 8 weather types listed here); it can quickly output high-accuracy weather categories for each region (or a designated region) and each geographical position of the weather map, and it can provide decision support for scientifically understanding the mechanism that forms heavy pollution and for accurately analyzing the spatial distribution and time-varying characteristics of pollutants during heavy pollution.
In the above operating environment, at least one embodiment of the present invention proposes a method for recognizing weather maps, which can be loaded and executed by the processor 301 and can at least recognize weather categories of weather maps quickly and with high accuracy. The flowchart of the method is shown in FIG. 6. It should be noted that the steps shown in the flowchart may be performed in a computer system such as a set of computer-executable instructions, and, although a logical order is shown in the flowchart, in some cases the steps may be performed in an order other than that shown and described here. The method may include the following steps:
Step S602: acquire a first weather map;
Step S604: extract image fusion features of the first weather map, so that the trained weather map recognition model predicts and recognizes the first weather map according to the image fusion features, wherein the image fusion features are formed by splicing text-level features and image-level features;
Step S606: output the weather category of the first weather map.
According to the embodiments of the invention, a weather map acquired in real time or offline is taken as input, and the trained weather map recognition model outputs its weather category; the model is simple, fast and accurate.
In step S602, a first weather map is acquired. Optionally, a weather map is acquired in real time or offline, and its interference data is removed according to color to form the first weather map. The acquired weather map may be the most recently generated weather map of a province (or region) collected in real time; its source may be a sea-level pressure chart downloaded from websites such as the Central Meteorological Observatory, the Japan Meteorological Agency or the Korea Meteorological Administration, or it may be drawn from GFS data with suitable software, as shown in FIG. 7. Removing the interference data of a weather map according to color may be done with the Image and putpixel methods of PIL (Python Imaging Library, a powerful and easy-to-use image processing library for the Python platform).
In general, a weather map represents the pressure field, pressure data values, isobars, map boundaries, legend and/or invalid numbers with different colors. The colors of the weather map are therefore read with the Image method of PIL, and the map boundaries, legend and invalid numbers are removed by threshold rules on the colors, as follows: (1) read the original weather map material with Image's open method; (2) obtain the numbers of pixels along the length and width of the weather map and traverse the R, G, B values (range 0-255) of every point at length i and width j; (3) when a point's R value is within ±10 of its G value, its G value is within ±10 of its B value, and its R value is within ±10 of its B value, assign the point white (255, 255, 255) with Image's putpixel.
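A minimal PIL sketch of steps (1)-(3); the file path is a placeholder and the ±10 threshold follows the text:

```python
from PIL import Image

def clean_weather_map(path):
    im = Image.open(path).convert("RGB")       # (1) read the original chart
    width, height = im.size                    # (2) pixel dimensions
    for i in range(width):
        for j in range(height):
            r, g, b = im.getpixel((i, j))      # traverse R, G, B of every point
            if abs(r - g) <= 10 and abs(g - b) <= 10 and abs(r - b) <= 10:
                im.putpixel((i, j), (255, 255, 255))   # (3) assign white
    return im
```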
Meanwhile, for noise data, the denoising procedure with the Image and putpixel methods is as follows: (1) convert the color map to a grayscale map with the floating-point formula Gray = R×0.3 + G×0.59 + B×0.11; (2) binarize the grayscale map with a threshold of 115, where pixels above the threshold are represented as white and pixels below it as black, dividing the picture's pixels (gray values) into two classes, 0 and 1, e.g. 0 for black and 1 for white; (3) remove noise with an isolated-point algorithm: count the black points in the 3×3 neighborhood around each black point; if there are fewer than two, the point is an isolated point, and the positions of all isolated points are recorded; (4) assign the recorded noise positions white (255, 255, 255) on the processed RGB image with Image's putpixel. In this way, the various kinds of interference data in the weather map can be removed.
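Steps (1)-(4) could be sketched as follows; the helper name and the boundary handling (border pixels are skipped) are assumptions:

```python
from PIL import Image

def denoise(im):
    """Remove isolated black points, per steps (1)-(4) above."""
    rgb = im.convert("RGB")
    w, h = rgb.size

    def is_black(x, y):
        r, g, b = rgb.getpixel((x, y))
        return r * 0.3 + g * 0.59 + b * 0.11 < 115   # (1) grayscale, (2) binarize

    # (3) a black point with fewer than two black neighbours in its 3x3
    # neighbourhood is an isolated point; record all such positions
    noise = [(x, y)
             for x in range(1, w - 1)
             for y in range(1, h - 1)
             if is_black(x, y)
             and sum(is_black(x + dx, y + dy)
                     for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                     if dx or dy) < 2]

    # (4) paint the recorded noise points white on the RGB image
    for x, y in noise:
        rgb.putpixel((x, y), (255, 255, 255))
    return rgb
```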
In step S604, image fusion features of the first weather map are extracted so that the trained weather map recognition model predicts and recognizes the first weather map according to them. Extracting the image fusion features of the first weather map may include: extracting the text-level feature2 of the first weather map with the EAST text recognition model; extracting the image-level feature1 of the first weather map with the VGG16 transfer learning model; and merging the text-level features with the image-level features and then normalizing them to form the image fusion features. That is, the image fusion features are formed by merging the text-level features and the image-level features, where the text-level features comprise a first feature, a second feature and a third feature representing characters and their spatial positions.
The EAST text recognition model of the present invention includes, but is not limited to, a feature extraction layer (feature extractor stem, PVANet), a feature-merging layer and an output layer, and extracting the text-level features of the first weather map with the EAST text recognition model may include: the feature extraction layer outputs a first-layer feature map f1, a second-layer feature map f2, a third-layer feature map f3 and a fourth-layer feature map f4 through 4 convolution stages with channel numbers 64, 128, 256 and 384; the feature-merging layer up-samples f1 by 2 and splices it with f2 to form a first-layer fused map h1, up-samples h1 by 2 and splices it with f3 to form a second-layer fused map h2, up-samples h2 by 2 and splices it with f4 to form a third-layer fused map h3, and passes h3 through a convolution layer to form a fourth-layer fused map h4, where h1, h2, h3 and h4 are respectively 1/4, 1/8, 1/16 and 1/32 the size of the first weather map; the output layer outputs a first feature using a 1×1 convolution layer with 1 channel, a second feature using a 1×1 convolution layer with 5 channels, and a third feature using a 1×1 convolution layer with 8 channels, where the first feature represents the probability that each pixel belongs to a text region, the second feature is used to predict the text features of a rotated rectangle, and the third feature is used to predict trapezoidal text.
As shown in FIG. 8, the first weather map (Image) passes through 4 convolution stages with channel numbers 64, 128, 256 and 384, whose output feature maps are f1, f2, f3 and f4. f1 is up-sampled by 2 with an unpool layer and spliced with f2 from the layer above; after a 1×1 convolution layer (feature dimension reduction) and a 3×3 convolution layer, the result is up-sampled by 2 with an unpool layer and spliced with f3; after another 1×1 convolution layer (feature dimension reduction) and a 3×3 convolution layer, it is up-sampled by 2 with an unpool layer and spliced with f4. The four feature maps are respectively 1/4, 1/8, 1/16 and 1/32 the size of the original map. In the notation of the EAST paper, the merging branch computes h_1 = f_1, g_i = unpool(h_i) for i ≤ 3, h_i = conv_3×3(conv_1×1([g_(i-1); f_i])) for i = 2, 3, 4, and outputs conv_3×3(h_4).
the output layer after the feature fusion layer has three parts: socre map, RBOX, QUAD. The sorre map outputs a score graph through a convolution layer of 1*1 with an output channel of 1, and represents the probability that each pixel belongs to a text region, so that a text feature featureA is obtained; RBOX, producing 5 channels from two 1*1 convolution layers, wherein 4 channels represent 4 distances from the pixel location to the top, right, bottom, left side boundary of the rectangle, respectively, and 1 channel represents the rotation angle of the bounding box, which is used in part to predict the text feature featureB of the rotated rectangle; QUAD uses 8 numbers to represent the coordinate offset from the four corner vertices { pi|i e {1,2,3,4} of the quadrilateral to the pixel location. Since each distance offset contains two numbers (Δxi, Δyi), the output contains 8 channels, which can predict the text featureC of a trapezoid.
That is, the invention uses the EAST text recognition model to obtain the probability featureA that each pixel belongs to a text region, the rotated-rectangle text feature featureB and the trapezoidal text feature featureC, and after vectorization merges the three into the text-space feature2. As its structure shows, the EAST model extracts features with a multi-scale fusion method in the style of an FCN, used to predict text regions at the pixel level; at the same time it can detect inclined text, since direction information is taken into account, so text in every orientation can be detected. The invention adopts the FPN idea of multi-scale feature fusion: pictures of different sizes are generated, different features are produced for each and predicted separately, and finally the predictions at all sizes are aggregated. This makes the EAST text detector very robust, locating text even when it is blurred, reflective or partially occluded.
The VGG16 transfer learning model comprises, but is not limited to, a first, second, third, fourth and fifth convolution module and 1 fully connected module; the first 5 convolution modules are responsible for feature extraction and the final fully connected module completes the classification task. Each convolution module comprises several convolution layers and a pooling layer; the fully connected module comprises a flatten layer and several fully connected layers. The number of channels is the same within each block; each convolution module uses 3×3 convolution kernels with same padding, and the pooling layers compress the data and the parameter count when building the neural network, reducing overfitting. Here, the invention also deletes the last fully connected module of the VGG16 transfer learning model and adds a global average pooling layer (GlobalAveragePooling2D) to output the image-level features.
Optionally, extracting the image-level features of the first weather map with the VGG16 transfer learning model may include: the first weather map passes sequentially through the first, second, third, fourth and fifth convolution modules to generate the image-level features, where the first convolution module includes, but is not limited to, a 2-layer cascade of 64-channel 3×3 convolution kernels, the second a 2-layer cascade of 128-channel 3×3 convolution kernels, the third a 3-layer cascade of 256-channel 3×3 convolution kernels, the fourth a 3-layer cascade of 512-channel 3×3 convolution kernels, and the fifth a 3-layer cascade of 512-channel 3×3 convolution kernels, as shown in FIG. 9:
For example, the first convolution module includes two convolution layers, each containing 64 3×3 convolution kernels with stride 1, padding = same and a ReLU activation; the output size is 224×224×64. Its pooling layer is 2×2 max pooling with stride 2; max pooling reduces the mean-estimate bias caused by convolution-layer parameter errors and retains more texture information. After the first convolution module the image size is halved: the pooled size becomes 112×112×64.
For example, the second convolution module includes two convolution layers, each containing 128 3×3 convolution kernels with stride 1, padding = same and a ReLU activation; the output size is 112×112×128. Its pooling layer is 2×2 max pooling with stride 2. After the second convolution module the pooled size becomes 56×56×128.
For example, the third convolution module includes three convolution layers, each containing 256 3×3 convolution kernels with stride 1, padding = same and a ReLU activation; the output size is 56×56×256. Its pooling layer is 2×2 max pooling with stride 2. After the third convolution module the pooled size becomes 28×28×256.
For example, the fourth convolution module includes three convolution layers, each containing 512 3×3 convolution kernels with stride 1, padding = same and a ReLU activation; the output size is 28×28×512. Its pooling layer is 2×2 max pooling with stride 2. After the fourth convolution module the pooled size becomes 14×14×512.
For example, the fifth convolution module includes three convolution layers, each containing 512 3×3 convolution kernels with stride 1, padding = same and a ReLU activation; the output size is 14×14×512. Its pooling layer is 2×2 max pooling with stride 2. After the fifth convolution module the pooled size becomes 7×7×512.
According to the method, the first weather map is input, the VGG16 transfer learning model with the fully connected layers removed is built, the weights are loaded, and a GlobalAveragePooling2D layer added after the last layer of the VGG16 transfer learning model outputs the image-level feature1.
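A minimal Keras sketch of this feature extractor; the 224×224 input size is the VGG16 default and an assumption, since the document does not state the resize:

```python
# VGG16 without the fully connected module (include_top=False), ImageNet
# weights loaded, GlobalAveragePooling2D appended: one 512-dim feature1
# vector per weather map.
from keras.applications import VGG16
from keras.layers import GlobalAveragePooling2D
from keras.models import Model

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
feature1_model = Model(base.input, GlobalAveragePooling2D()(base.output))
# feature1 = feature1_model.predict(batch_of_charts)   # shape (n, 512)
```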
The text-level feature2 and the image-level feature1 are then merged and normalized; for example, feature1 and feature2 are merged with a concatenate method and then min-max normalized to form the image fusion features, after which the trained weather map recognition model predicts and recognizes the first weather map from them. The generation of training fusion features in the method for training the weather map recognition model can likewise follow this part, as shown in FIG. 10.
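A minimal NumPy sketch of the fusion step; the small epsilon guard against constant columns is an addition, not from the document:

```python
# Concatenate vectorized text features (feature2) with image-level features
# (feature1), then min-max normalize each column to [0, 1].
import numpy as np

def fuse(feature1, feature2):
    fused = np.concatenate([feature2, feature1], axis=-1)
    lo, hi = fused.min(axis=0), fused.max(axis=0)
    return (fused - lo) / np.maximum(hi - lo, 1e-8)
```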
In step S606, the weather category of the first weather map is output; the weather categories include east high pressure, low-pressure center, low-pressure inverted trough, low-pressure equalization field, north high pressure, west high pressure, high-pressure center and/or high-pressure equalization field. Outputting the weather category of the first weather map may include displaying the weather category and its confidence for each region (or a designated region) of the weather map.
According to the method, the trained weather map recognition model recognizes and analyzes the weather category of any weather map; it can not only quickly output high-accuracy weather categories for each region (or a designated region) and each geographical position of the weather map, but also support scientific understanding of the mechanism that forms heavy pollution and provide decision support for accurately analyzing the spatial distribution and time-varying characteristics of pollutants during heavy pollution.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the invention.

Claims (10)

1. A method for recognizing a weather map, comprising:
acquiring a first weather map;
extracting image fusion characteristics of the first weather map so that a trained weather map recognition model predicts and recognizes the first weather map according to the image fusion characteristics, wherein the image fusion characteristics are formed by splicing text-level characteristics and image-level characteristics;
outputting the weather category of the first weather map.
2. The method of claim 1, wherein acquiring the first weather map comprises:
acquiring a weather map in real time, wherein the weather map represents a pressure field, pressure data values, isobars, map boundaries, a legend and/or invalid numbers with different colors;
and removing interference data of the weather map using the different colors to form the first weather map, wherein the interference data comprises map boundaries, legends and/or invalid numbers.
3. The method of claim 1, wherein extracting the image fusion feature of the first weather map comprises:
extracting text level features of the first weather map using an EAST text recognition model;
extracting image level features of the first weather map by utilizing a VGG16 transfer learning model;
and combining the text-level features with the image-level features, and then carrying out normalization processing to form image fusion features.
4. The method of claim 3, wherein the EAST text recognition model includes a feature extraction layer, a feature fusion layer, and an output layer, wherein the text level features include a first feature, a second feature, and a third feature, wherein extracting the text level features of the first weather map using the EAST text recognition model includes:
the feature extraction layer outputs a first layer feature map f1, a second layer feature map f2, a third layer feature map f3 and a fourth layer feature map f4 through 4 layers of convolution layers with the channel numbers of 64, 128, 256 and 384;
the feature fusion layer up-samples f1 by 2 and splices it with f2 to form a first-layer fused map h1, up-samples h1 by 2 and splices it with f3 to form a second-layer fused map h2, up-samples h2 by 2 and splices it with f4 to form a third-layer fused map h3, and passes h3 through a convolution layer to form a fourth-layer fused map h4, wherein h1, h2, h3 and h4 are respectively 1/4, 1/8, 1/16 and 1/32 the size of the first weather map;
the output layer outputs a first feature using a 1×1 convolution layer with 1 channel, a second feature using a 1×1 convolution layer with 5 channels, and a third feature using a 1×1 convolution layer with 8 channels, wherein the first feature represents the probability that each pixel belongs to a text region, the second feature is used to predict the text features of a rotated rectangle, and the third feature is used to predict trapezoidal text.
5. The method of claim 3, wherein the VGG16 transfer learning model comprises a first convolution module, a second convolution module, a third convolution module, a fourth convolution module and a fifth convolution module, and wherein extracting image-level features of the first weather map using the VGG16 transfer learning model comprises:
passing the first weather map sequentially through the first, second, third, fourth and fifth convolution modules to generate the image-level features, wherein the first convolution module comprises a 2-layer cascade of 64-channel 3×3 convolution kernels, the second convolution module comprises a 2-layer cascade of 128-channel 3×3 convolution kernels, the third convolution module comprises a 3-layer cascade of 256-channel 3×3 convolution kernels, the fourth convolution module comprises a 3-layer cascade of 512-channel 3×3 convolution kernels, and the fifth convolution module comprises a 3-layer cascade of 512-channel 3×3 convolution kernels.
6. The method of claim 2, wherein the weather categories include east high pressure, low-pressure center, low-pressure inverted trough, low-pressure equalization field, north high pressure, west high pressure, high-pressure center and/or high-pressure equalization field, and wherein outputting the weather category of the first weather map comprises:
displaying the weather category and the confidence for each region of the weather map.
7. A method for training a weather map recognition model, comprising:
acquiring a weather map sample;
if the data of the weather map sample is balanced and the data size meets the requirement, extracting training fusion features of the weather map sample, wherein the training fusion features are formed by splicing text-level features and image-level features;
and training the weather map recognition model based on the training fusion features, and stabilizing the weather map recognition model with a categorical_crossentropy loss function.
8. The method of claim 7, wherein the weather map recognition model comprises an input layer, a first fully connected layer, a first Dropout layer, a second fully connected layer, a second Dropout layer and an output layer, and wherein training the weather map recognition model based on the training fusion features comprises:
passing the training fusion features sequentially through the input layer, the first fully connected layer, the first Dropout layer, the second fully connected layer and the second Dropout layer, and outputting at the output layer, wherein the first fully connected layer comprises 1024 output neurons, the first Dropout layer is used to delete 20% of the output neurons, the second fully connected layer comprises 256 output neurons, the second Dropout layer is used to delete 20% of the output neurons, and the output layer comprises 8 output neurons.
9. The method of claim 7, wherein, in the case that the data of the weather map sample is balanced and the data size meets the requirement, extracting training fusion features of the weather map sample comprises:
extracting text-level features of the weather map sample using an EAST text recognition model;
extracting image-level features of the weather map sample using a VGG16 transfer learning model;
and combining the text-level features with the image-level features, and then carrying out normalization processing to form image fusion features.
10. An electronic device, comprising:
a processor adapted to execute instructions; and a memory adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by the processor to perform: a method for recognizing a weather map according to any of claims 1-6 and/or a method for training a weather map recognition model according to any of claims 7-9.
CN202310247067.4A 2023-03-15 2023-03-15 Method and electronic device for recognizing a weather map Pending CN117115498A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310247067.4A CN117115498A (en) Method and electronic device for recognizing a weather map

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310247067.4A CN117115498A (en) Method and electronic device for recognizing a weather map

Publications (1)

Publication Number Publication Date
CN117115498A true CN117115498A (en) 2023-11-24

Family

ID=88804465

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310247067.4A Pending CN117115498A (en) 2023-03-15 2023-03-15 Method and electronic device for recognizing an aerial image

Country Status (1)

Country Link
CN (1) CN117115498A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117688975A (en) * 2024-02-02 2024-03-12 南京信息工程大学 Meteorological event prediction method and system based on evolution rule mining
CN117688975B (en) * 2024-02-02 2024-05-14 南京信息工程大学 Meteorological event prediction method and system based on evolution rule mining


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination