CN115525786B - Method for constructing an ionospheric ionogram (frequency-height graph) classification sample library
- Publication number: CN115525786B
- Application number: CN202211238852.5A
- Authority: CN (China)
- Legal status: Active
Classifications
- G06F16/55: Clustering; Classification (information retrieval of still image data)
- G06F16/51: Indexing; Data structures therefor; Storage structures (information retrieval of still image data)
- G06N3/08: Neural networks; Learning methods
- G06V10/764: Image or video recognition or understanding using pattern recognition or machine learning, using classification, e.g. of video objects
- G06V10/774: Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
- Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a method for constructing an ionospheric ionogram (frequency-height graph) classification sample library, comprising the following steps: S1, collecting all ionogram images corresponding to a manually interpreted TXT identification file in a total folder; S2, parsing the TXT identification file to obtain the category of every ionogram in the total folder; S3, moving each ionogram whose category has been determined into the folder of its class; S4, constructing a sample library in which the classes are relatively balanced through an up-sampling and down-sampling mechanism. The method simulates the influence of natural phenomena on ionograms by introducing different types of random noise, thereby expanding the sample data of each Spread F class and finally establishing a relatively balanced ionogram sample library that serves subsequent supervised learning.
Description
Technical Field
The invention relates to ionogram processing technology, and in particular to a method for constructing an ionospheric ionogram classification sample library.
Background
An ionosonde (ionospheric altimeter) is a remote sensing device that probes the ionospheric space environment by radar echo detection. The image data it generates is called an ionogram (frequency-height graph), in which a clear trace reflects the variation of electron density with height. The ionospheric F layer (about 130 km to 1000 km, the region where the vast majority of spacecraft fly) is not a stable layer. It contains fine plasma structures (density non-uniformities, or irregularities) that scatter the incident wave diffusely, so the echo appears not as a sharp line trace but as a diffuse patch. Spread F is the natural phenomenon caused by these ionospheric plasma irregularities. It affects radio wave propagation and produces characteristic diffuse patterns in the ionogram, and its different forms correspond to different physical laws.
The most widely accepted international classification of Spread F is given in the "Handbook of Ionogram Interpretation and Reduction", revised in 1978 by the International Union of Radio Science (URSI). It divides the graphic features in the ionogram into a frequency type (Frequency Spread F, FSF), a range type (Range Spread F, RSF), a mixed type (Mixed Spread F, MSF) and a branch type (Branch Spread F, BSF). Given the large differences in Spread F features among stations around the world, it also suggests that each station may adapt the classification to its own conditions. The four categories are characterized as follows: (1) the frequency type is a disturbance structure near the F-layer peak height, in which the F-layer trace is clear in the low-frequency band and spread in the high-frequency band; (2) the range type is a structure of non-uniform plasma density near the bottom of the F layer, in which the F-layer trace is clear in the high-frequency band and spread in the low-frequency band; (3) the mixed type has the features of both the frequency type and the range type, and its mechanism is relatively complex; (4) the branch type shows spreading near the F-layer peak frequency together with a Spread F branch distinct from the F-layer trace, which corresponds to plasma structures distributed horizontally with different densities and may be associated with ion precipitation at high latitudes.
Because ionograms are scientifically complex, the international practice to date has been to judge them by eye on the basis of experience. A major drawback of this approach for scientific research is that it introduces subjective human judgment: different researchers apply different criteria to the Spread F categories, and even the same scientist, over a long working period, shifts criteria for Spread F graphic features that vary with year, season, local time and so on.
With the development of China's aerospace technology, and in particular the construction of Phase II of the Meridian Project led by the National Space Science Center of the Chinese Academy of Sciences, more than ten digital ionosondes distributed across China and operating 24 hours a day are to be added in 2023, with a time resolution of about 5 to 15 minutes; engineers are even developing a high-precision detection network with 1-minute resolution. Under these circumstances, the traditional reliance on manual interpretation of ionograms hinders real-time monitoring of the space environment, and human interpreters cannot identify the ionograms of all stations around the clock with a unified criterion. From an application point of view, it is therefore necessary to develop an artificial-intelligence method for identifying the Spread F phenomenon in ionograms. Developing such an intelligent interpretation model by supervised learning requires a sample library with a reasonable distribution of data categories.
Disclosure of Invention
The invention aims to provide a method for constructing an ionospheric ionogram classification sample library. The method simulates the influence of natural phenomena on ionograms by introducing different types of random noise, thereby expanding the sample data of each Spread F class and finally establishing a relatively balanced ionogram sample library that serves subsequent supervised learning.
To achieve the above purpose, the invention provides a method for constructing an ionospheric ionogram classification sample library, comprising the following steps:
S1, collecting all ionogram images corresponding to a manually interpreted TXT identification file in a total folder;
S2, parsing the TXT identification file to obtain the category of every ionogram in the total folder;
S3, moving each ionogram whose category has been determined into the folder of its class;
S4, constructing a sample library in which the classes are relatively balanced through an up-sampling and down-sampling mechanism.
Preferably, step S2 specifically comprises the following steps:
S21, traversing all ionogram images in the total folder and obtaining the drawing time of each image;
S22, reading the manually interpreted TXT identification file row by row and obtaining the ionogram category information recorded in it;
S23, unifying the format of the ionogram drawing times and of the ionogram times recorded in the TXT identification file, and determining the category of each ionogram according to the records in the TXT file.
Preferably, in step S21, the drawing time of each ionogram image is formatted as year, month, day, hour, minute, second, and the initial class is set to 0.
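Step S21 can be illustrated with a minimal sketch. It assumes that each image file name starts with a 14-digit timestamp such as 20050104154500, as in the figure captions; the `.png` extension and the function names `parse_ionogram_time` and `init_labels` are hypothetical and not taken from the patent.

```python
from datetime import datetime
from pathlib import Path


def parse_ionogram_time(filename: str) -> datetime:
    """Read a YYYYMMDDhhmmss drawing time from the start of the file stem (assumed naming)."""
    stem = Path(filename).stem
    return datetime.strptime(stem[:14], "%Y%m%d%H%M%S")


def init_labels(filenames):
    """Map every drawing time to the initial class 0, before any TXT record is applied."""
    return {parse_ionogram_time(name): 0 for name in filenames}


labels = init_labels(["20050104154500.png", "20050116124500.png"])
```

Keying the labels by a `datetime` rather than by the raw file name makes the later comparison with times from the TXT identification file a plain ordering comparison.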
Preferably, in the TXT identification file of step S22, the first column records the date on which the Spread F phenomenon occurred as a day of year, the second column records the start time of the Spread F phenomenon in 24-hour notation, the third column records its end time in 24-hour notation, and the fourth column records its duration, where the first two digits of the fourth column are hours and the last two digits are minutes;
and when the ionograms covered by the TXT identification file in step S22 have different time resolutions, their categories are determined by the following steps:
S221, determining the number of ionogram images in the current folder; if the number is less than or equal to a set value, calculating with the first time resolution, otherwise calculating with the second time resolution;
S222, converting the date in the first column of the TXT identification file from day-of-year notation to year-month-day-hour-minute-second notation;
S223, from the second and third columns of the TXT identification file, obtaining the start and end times of a given Spread F event in year-month-day-hour-minute-second notation;
S224, from the fourth column of the TXT identification file, obtaining the duration of the Spread F event, and then, using the time resolution determined above, estimating the number of ionograms of this event that may exist in the total folder;
S225, labeling the category of the ionograms according to the TXT identification file;
S226, repeating steps S221 to S225 until all manually determined labels have been traversed.
Preferably, the set value in step S221 is 35040 = 24 × 4 × 365, that is, the number of ionograms produced in one year at one image every 15 minutes;
the first time resolution is 15 minutes and the second time resolution is 5 minutes.
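The threshold test of step S221 can be sketched in a few lines. The names `SET_VALUE` and `time_resolution_minutes` are hypothetical; only the value 35040 and the two resolutions come from the text.

```python
# 24 hours x 4 images per hour x 365 days: one year of 15-minute ionograms.
SET_VALUE = 24 * 4 * 365


def time_resolution_minutes(image_count: int) -> int:
    """Return 15 when the folder holds at most one year of 15-minute images, else 5."""
    return 15 if image_count <= SET_VALUE else 5
```

A folder with more images than a full year of 15-minute sampling can produce must contain some 5-minute data, which is why the count alone is used as the switch here.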
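Preferably, in step S224:
the estimated number of ionograms of a Spread F event at the first time resolution is calculated as:
$$\text{numberOfFlag} = \operatorname{round}\left(\left\lfloor \frac{\text{duration}(i)}{100} \right\rfloor \times \frac{60}{15} + \frac{\operatorname{mod}(\text{duration}(i),\ 100)}{15}\right)$$
the estimated number of ionograms of a Spread F event at the second time resolution is calculated as:
$$\text{numberOfFlag} = \operatorname{round}\left(\left\lfloor \frac{\text{duration}(i)}{100} \right\rfloor \times \frac{60}{5} + \frac{\operatorname{mod}(\text{duration}(i),\ 100)}{5}\right)$$
where duration(i) is the duration of the Spread F event in the i-th row of the TXT identification file, written as hours followed by minutes (HHMM).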
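As a worked illustration of the two formulas, the sketch below evaluates numberOfFlag for a duration given as an HHMM integer. The function name `number_of_flag` is hypothetical, and rounding halves upward is an assumption about the intended `round`; Python's built-in `round` rounds halves to the nearest even number, so it is deliberately not used.

```python
import math


def number_of_flag(duration_hhmm: int, resolution_min: int) -> int:
    """Estimate how many ionograms a Spread F event of the given duration spans."""
    hours = duration_hhmm // 100        # floor(duration / 100)
    minutes = duration_hhmm % 100       # mod(duration, 100)
    estimate = hours * 60 / resolution_min + minutes / resolution_min
    return math.floor(estimate + 0.5)   # round half up (assumed rounding rule)


# A 1 h 30 min event: 4 + 2 = 6 images at 15 minutes, 12 + 6 = 18 images at 5 minutes.
at_15 = number_of_flag(130, 15)
at_5 = number_of_flag(130, 5)
```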
Preferably, in step S225, the following cases arising from file loss are handled:
when traversing the TXT identification file row by row, if an ionogram exists at the start position of a row but the ionogram at the end position is missing, the labeling range of the category is confined to the numberOfFlag ionograms following the start position; an ionogram in this range is labeled with the category of the row if the time in its file name is earlier than the time of the end position; if it is later, it is not labeled and the loop exits;
if the ionogram at the start position of the row is missing but an ionogram exists at the end position, the labeling range of the category is confined to the numberOfFlag ionograms preceding the end position; an ionogram in this range is labeled with the category of the row if the hour and minute in its file name are later than those of the start position; if they are earlier, it is not labeled and the loop exits;
and if neither the start position nor the end position has a corresponding ionogram, the loop ends and processing moves directly to the next record of the TXT identification file.
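The fault-tolerant cases above can be sketched as follows. This is an illustrative reading, not the patented implementation: `label_event` and its arguments are hypothetical names, `times` is assumed to be a sorted list of ionogram drawing times, and filtering a sorted window replaces the explicit early exit from the loop, which selects the same ionograms.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def label_event(times, labels, start, end, category, n_flag):
    """Label ionograms of one TXT record; return how many were labeled."""
    i_start = bisect_left(times, start)
    i_end = bisect_left(times, end)
    has_start = i_start < len(times) and times[i_start] == start
    has_end = i_end < len(times) and times[i_end] == end
    if has_start and has_end:
        window = times[i_start:i_end + 1]
    elif has_start:                       # end image lost: look n_flag images forward
        window = times[i_start:i_start + n_flag + 1]
    elif has_end:                         # start image lost: look n_flag images back
        window = times[max(0, i_end - n_flag):i_end + 1]
    else:                                 # both lost: skip this record
        return 0
    count = 0
    for t in window:
        if start <= t <= end:             # never label outside the recorded interval
            labels[t] = category
            count += 1
    return count


# 15-minute grid from 10:00 to 12:00 with the 11:00 image missing.
base = datetime(2011, 2, 18, 10, 0)
times = [base + timedelta(minutes=15 * k) for k in range(9) if k != 4]
labels = {t: 0 for t in times}
labeled = label_event(times, labels, base + timedelta(minutes=30),
                      base + timedelta(minutes=60), category=1, n_flag=2)
```

In the usage example the end image (11:00) is missing, so only the 10:30 and 10:45 ionograms are labeled and the 11:15 ionogram keeps class 0.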
Preferably, in step S4, the order-of-magnitude differences among the numbers of ionograms in the classes are taken into account, and the classes are balanced as follows: ionograms of any class whose count exceeds the set sample size are downsampled, and ionograms of any class whose count is below the set sample size undergo data enhancement by introducing different types of noise.
Preferably, the data enhancement specifically comprises the following steps:
first, copying the original ionograms to the sample library, and defining n noise methods;
then, generating a random number in the range [1, n] to randomly determine the type of noise added to the selected sample;
then, generating random numbers in the range [1, 5] to randomly determine the number of blocks, rows and columns to which noise is applied, and randomly determining the positions in the image at which the block, row and column noise is added;
then, adding the corresponding random noise to the selected image blocks, rows and columns;
finally, adding a noise tag to the file name of the noise-added image, saving the image into the sample library of the corresponding category, and repeating the above operations until the number of ionogram samples of the category meets the requirement.
The number 5 is an empirical value: the noise in existing ionograms usually does not exceed 5 blocks, 5 rows or 5 columns.
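The enhancement loop above can be sketched in pure Python. Images are represented as lists of pixel rows for simplicity; `enhance_class`, `row_salt_pepper` and the file-name tag format are hypothetical, and the single noise method shown stands in for the n methods the text refers to.

```python
import random


def row_salt_pepper(image, count, rng):
    """Overwrite `count` randomly chosen rows with random 0/255 pixels (illustrative noise)."""
    noisy = [row[:] for row in image]
    for r in rng.sample(range(len(noisy)), min(count, len(noisy))):
        noisy[r] = [rng.choice((0, 255)) for _ in noisy[r]]
    return noisy


def enhance_class(originals, target, methods, seed=0):
    """Grow one class to `target` samples; `originals` maps file stem -> image."""
    rng = random.Random(seed)
    library = dict(originals)                    # originals are copied in unchanged
    names = list(originals)
    method_names = list(methods)
    serial = 0
    while len(library) < target:
        source = rng.choice(names)
        k = rng.randint(1, len(method_names))    # noise type drawn from [1, n]
        count = rng.randint(1, 5)                # at most 5 blocks/rows/columns
        tag = method_names[k - 1]
        serial += 1
        noisy = methods[tag](originals[source], count, rng)
        library[f"{source}_{tag}_{serial}"] = noisy   # noise tag in the file name
    return library


originals = {"20060101021500": [[10] * 4 for _ in range(6)]}
library = enhance_class(originals, target=5, methods={"rowsp": row_salt_pepper})
```

The serial number in the generated name is only there to keep names unique when the same source image and noise type are drawn more than once.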
Preferably, the noise methods include at least Gaussian methods and salt-and-pepper methods;
the Gaussian methods include a random Gaussian method, a row Gaussian method and a column Gaussian method;
the salt-and-pepper methods include a random salt-and-pepper method, a row salt-and-pepper method and a column salt-and-pepper method;
in terms of hyperparameters, the noise density of the Gaussian and salt-and-pepper methods is adjustable; the noise width of the row Gaussian, column Gaussian, row salt-and-pepper and column salt-and-pepper methods is adjustable; and the noise coverage area of the random Gaussian and random salt-and-pepper methods is adjustable.
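To make the hyperparameters concrete, the sketch below shows one column method and one random-block method. The function names, the grayscale 0 to 255 pixel range and the parameter names `width`, `sigma` and `density` are assumptions chosen for illustration; the patent does not specify them.

```python
import random


def clip(value):
    """Keep a pixel inside the assumed 0..255 grayscale range."""
    return max(0, min(255, int(round(value))))


def column_gaussian(image, col, width, sigma, rng):
    """Add zero-mean Gaussian noise to `width` columns starting at `col`."""
    noisy = [row[:] for row in image]
    for row in noisy:
        for c in range(col, min(col + width, len(row))):
            row[c] = clip(row[c] + rng.gauss(0, sigma))
    return noisy


def block_salt_pepper(image, top, left, height, width, density, rng):
    """Set a fraction `density` of the pixels in one block to 0 or 255."""
    noisy = [row[:] for row in image]
    for r in range(top, min(top + height, len(noisy))):
        for c in range(left, min(left + width, len(noisy[r]))):
            if rng.random() < density:
                noisy[r][c] = rng.choice((0, 255))
    return noisy


rng = random.Random(0)
image = [[128] * 8 for _ in range(8)]
gauss = column_gaussian(image, col=2, width=3, sigma=20.0, rng=rng)
salted = block_salt_pepper(image, top=1, left=1, height=4, width=4, density=1.0, rng=rng)
```

Both functions only change pixel values in place, so the frequency and height coordinates of every pixel stay where they were.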
The invention therefore has the following beneficial effects:
1. With full consideration of the incompleteness of the original observations and of the manual identification data, the parsing of the manual identification file allows for the coexistence of 5-minute and 15-minute sampling in data of the same year and adopts a fault-tolerant mechanism for ionograms that may be missing, so that the category correspondence between manual identifications and ionograms is established robustly and correctly;
2. The original ionograms from the ionosonde are not altered and remain part of the sample library;
3. The markedly unbalanced distribution of ionograms across classes is addressed: sample balancing and enhancement are achieved by downsampling and by superimposing different types of random noise. The enhancement method does not destroy the basic course of the trace in the ionogram; instead it simulates natural phenomena and the noise that instruments may generate, so the class of each ionogram remains unchanged;
4. Using the sample enhancement method, the ionogram data of the 14 years from 2002 to 2015 were classified according to the manually identified TXT files. The four classes of frequency type FSF, range type RSF, mixed type MSF and strong range type SSF were then each expanded to 20000 samples, giving 100,000 samples in total together with the class without Spread F. The samples were divided into training and validation sets at a ratio of 8:2, and classification accuracies above 93% were obtained on three models: ResNet34_20_5_100 (Resnet 34 Net) (93.20%), res Net34-modified-20-2-200 (93.50%), and residual_provided_net_old_old_25_5 (last_model_92_sgd_25_5.pkl) (93.53%). This demonstrates that the sample enhancement algorithm meets the training requirements of the classification model and lays a foundation for good classification accuracy.
The technical scheme of the invention is described in further detail below with reference to the drawings and the embodiments.
Drawings
Fig. 1 is the ionogram at time 20050104154500 according to the present invention;
Fig. 2 is the ionogram at time 20050116124500 according to the present invention;
Fig. 3 is the original ionogram at time 20060101021500, used as an example in the present invention;
Fig. 4 is the ionogram of Fig. 3 after processing by the random Gaussian method;
Fig. 5 is the ionogram of Fig. 3 after processing by the random salt-and-pepper method;
Fig. 6 is the ionogram of Fig. 3 after processing by the row salt-and-pepper method;
Fig. 7 is the ionogram of Fig. 3 after processing by the column salt-and-pepper method;
Fig. 8 is the ionogram of Fig. 3 after processing by the row Gaussian method;
Fig. 9 is the ionogram of Fig. 3 after processing by the column Gaussian method.
Detailed Description
The present invention is further described below with reference to the accompanying drawings. Note that this embodiment provides a detailed implementation and a specific operating procedure on the premise of the present technical solution, but the protection scope of the present invention is not limited to this embodiment.
The invention comprises the following steps:
S1, collecting all ionogram images corresponding to a manually interpreted TXT identification file in a total folder;
S2, parsing the TXT identification file to obtain the category of every ionogram in the total folder;
Preferably, step S2 specifically comprises the following steps:
S21, traversing all ionogram images in the total folder and obtaining the drawing time of each image;
Preferably, in step S21, the drawing time of each ionogram image is formatted as year, month, day, hour, minute, second, and the initial class is set to 0.
S22, reading the manually interpreted TXT identification file row by row and obtaining the ionogram category information recorded in it;
S23, unifying the format of the ionogram drawing times and of the ionogram times recorded in the TXT identification file, and determining the category of each ionogram according to the records in the TXT file.
Preferably, in the TXT identification file of step S22, the first column records the date on which the Spread F phenomenon occurred as a day of year, the second column records the start time of the Spread F phenomenon in 24-hour notation, the third column records its end time in 24-hour notation, and the fourth column records its duration, where the first two digits of the fourth column are hours and the last two digits are minutes;
In this embodiment, the ionograms generated by the Hainan ionosonde from 2002 to 2015 were manually interpreted and corrected several times, and the interpretation results were recorded following international convention. The manual classification results for 2011 are described as an example:
Table 1: Manual classification of the ionograms of the Hainan ionosonde for 2011
The first column of the TXT file shown in Table 1 records the date on which the Spread F phenomenon occurred as a day of year; for example, 049 denotes day 49 of 2011, that is, February 18, 2011. The next three columns are read as defined above (start time, end time and duration), and the fifth column gives the type of Spread F during that time period.
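The day-of-year conversion and the column layout can be sketched as follows. Since the contents of Table 1 are not reproduced here, the sample row, the whitespace separator and the category code are hypothetical, as is the function name `parse_row`; the sketch also assumes that an event ends on the day it starts.

```python
from datetime import datetime, timedelta


def parse_row(year: int, line: str) -> dict:
    """Split one TXT record into start time, end time, duration and category."""
    doy, start, end, duration, category = line.split()
    day = datetime(year, 1, 1) + timedelta(days=int(doy) - 1)   # day of year -> date

    def at(hhmm: str) -> datetime:
        return day + timedelta(hours=int(hhmm[:2]), minutes=int(hhmm[2:]))

    return {
        "start": at(start),
        "end": at(end),
        "duration_min": int(duration[:2]) * 60 + int(duration[2:]),
        "category": category,
    }


# Hypothetical record: day 049, 21:30 to 23:45, duration 02 h 15 min, category code "1".
record = parse_row(2011, "049 2130 2345 0215 1")
```

For this record the start date evaluates to February 18, 2011, matching the example in the text.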
When the ionograms covered by the TXT identification file in step S22 have different time resolutions, their categories are determined by the following steps:
S221, determining the number of ionogram images in the current folder; if the number is less than or equal to a set value, calculating with the first time resolution, otherwise calculating with the second time resolution;
Preferably, the set value in step S221 is 35040 = 24 × 4 × 365, that is, the number of ionograms produced in one year at one image every 15 minutes;
the first time resolution is 15 minutes and the second time resolution is 5 minutes. (In Phase I of the Meridian Project, led by the National Space Science Center of the Chinese Academy of Sciences, the digital ionosonde deployed at the Hainan station essentially kept a sampling resolution of 15 minutes; later, to meet the needs of operational departments, 5-minute and 15-minute sampling resolutions may coexist in certain periods of certain years.)
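Preferably, in step S224:
the estimated number of ionograms of a Spread F event at the first time resolution is calculated as:
$$\text{numberOfFlag} = \operatorname{round}\left(\left\lfloor \frac{\text{duration}(i)}{100} \right\rfloor \times \frac{60}{15} + \frac{\operatorname{mod}(\text{duration}(i),\ 100)}{15}\right)$$
the estimated number of ionograms of a Spread F event at the second time resolution is calculated as:
$$\text{numberOfFlag} = \operatorname{round}\left(\left\lfloor \frac{\text{duration}(i)}{100} \right\rfloor \times \frac{60}{5} + \frac{\operatorname{mod}(\text{duration}(i),\ 100)}{5}\right)$$
where duration(i) is the duration of the Spread F event in the i-th row of the TXT identification file, written as hours followed by minutes (HHMM).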
S222, converting the date in the first column of the TXT identification file from day-of-year notation to year-month-day-hour-minute-second notation;
S223, from the second and third columns of the TXT identification file, obtaining the start and end times of a given Spread F event in year-month-day-hour-minute-second notation;
S224, from the fourth column of the TXT identification file, obtaining the duration of the Spread F event, and then, using the time resolution determined above, estimating the number of ionograms of this event that may exist in the total folder;
S225, labeling the category of the ionograms according to the TXT identification file;
Preferably, in step S225, the following cases arising from file loss are handled:
when traversing the TXT identification file row by row, if an ionogram exists at the start position of a row but the ionogram at the end position is missing, the labeling range of the category is confined to the numberOfFlag ionograms following the start position, and the time in the file name of each ionogram in this range is compared with the time of the end position: if it is earlier than or equal to the end time, the ionogram is labeled with the category of the row; if it is later, it is not labeled and the loop exits;
if the ionogram at the start position of the row is missing but an ionogram exists at the end position, the labeling range of the category is confined to the numberOfFlag ionograms preceding the end position; an ionogram in this range is labeled with the category of the row if the hour and minute in its file name are later than those of the start position; if they are earlier, it is not labeled and the loop exits;
and if neither the start position nor the end position has a corresponding ionogram, the loop ends and processing moves directly to the next record of the TXT identification file.
S226, repeating steps S221 to S225 until all manually determined labels have been traversed.
S3, moving each ionogram whose category has been determined into the folder of its class;
S4, constructing a sample library in which the classes are relatively balanced through an up-sampling and down-sampling mechanism.
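Step S3 can be sketched with the standard library alone. The folder layout (one subfolder per class code under a library folder), the file names and the function name `move_to_class_folders` are hypothetical; the demo at the bottom runs in a temporary directory so that it leaves no files behind.

```python
import shutil
import tempfile
from pathlib import Path


def move_to_class_folders(total_dir, labels, library_dir):
    """Move each labeled ionogram from total_dir into library_dir/<class>/."""
    total_dir, library_dir = Path(total_dir), Path(library_dir)
    moved = []
    for filename, category in labels.items():
        source = total_dir / filename
        if not source.exists():               # tolerate ionograms that were lost
            continue
        class_dir = library_dir / str(category)
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.move(str(source), str(class_dir / filename))
        moved.append(class_dir / filename)
    return moved


with tempfile.TemporaryDirectory() as tmp:
    total = Path(tmp) / "total"
    total.mkdir()
    (total / "20110218213000.png").write_bytes(b"")
    # The second file is deliberately absent from the total folder.
    labels = {"20110218213000.png": 1, "20110218214500.png": 0}
    moved = move_to_class_folders(total, labels, Path(tmp) / "library")
    moved_names = [p.relative_to(tmp).as_posix() for p in moved]
```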
Preferably, in step S4, the order-of-magnitude differences among the numbers of ionograms in the classes are taken into account, and the classes are balanced as follows: ionograms of any class whose count exceeds the set sample size are downsampled, and ionograms of any class whose count is below the set sample size undergo data enhancement by introducing different types of noise.
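The downsampling half of step S4 can be sketched as a random draw without replacement. The text does not say how the retained ionograms are chosen, so uniform random selection, the fixed seed and the function name `downsample` are assumptions.

```python
import random


def downsample(filenames, target, seed=0):
    """Keep `target` file names chosen uniformly at random, without repetition."""
    ordered = sorted(filenames)
    if len(ordered) <= target:
        return ordered
    return sorted(random.Random(seed).sample(ordered, target))


names = [f"img{i:06d}.png" for i in range(1000)]
kept = downsample(names, target=200)
```

Sorting before sampling makes the selection reproducible for a given seed regardless of the order in which the file system lists the folder.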
Preferably, the data enhancement specifically comprises the following steps:
first, copying the original ionograms to the sample library, and defining n noise methods;
then, generating a random number in the range [1, n] to randomly determine the type of noise added to the selected sample;
then, generating random numbers in the range [1, 5] to randomly determine the number of blocks, rows and columns to which noise is applied, and randomly determining the positions in the image at which the block, row and column noise is added;
then, adding the corresponding random noise to the selected image blocks, rows and columns;
finally, adding a noise tag to the file name of the noise-added image, saving the image into the sample library of the corresponding category, and repeating the above operations until the number of ionogram samples of the category meets the requirement.
The number 5 is an empirical value: the noise in existing ionograms usually does not exceed 5 blocks, 5 rows or 5 columns.
The statistical results of classifying the ionograms generated by the Hainan ionosonde from 2002 to 2015 in this embodiment are shown in Table 2 below:
Table 2: Classification statistics of the ionograms generated by the Hainan ionosonde, 2002-2015
As can be seen from the table above, the five categories account for 91.64%, 2.41%, 1.12%, 3.24% and 1.60% of all ionograms, respectively. The sample data are clearly and significantly unbalanced across the classes. For a supervised learning method that recognizes the Spread F phenomenon in ionograms, a sample library in which the classes are evenly distributed and hold essentially the same number of samples is an important guarantee of rapid convergence when training a deep neural network. The markedly unbalanced distribution of ionograms shown by these statistics must therefore be resolved before a deep-learning-based automatic ionogram recognition model is trained.
Fig. 1 is the frequency-height graph at time 20050104154500; Fig. 2 is the frequency-height graph at time 20050116124500, in which the noise inside the frame is more pronounced. As shown in Figs. 1 and 2, the frequency-height graphs from 2002 to 2015 can be classified year by year by parsing the manual identification file from the first step. The classification statistics are shown in Table 2. Over these 14 years, the five categories contain 426555, 11226, 5201, 15083, and 7428 frequency-height graphs, respectively. In deep neural network training, the order-of-magnitude differences between the sample counts of the categories must be taken into account as far as possible. For example, the category without expansion F is downsampled, which is acceptable because images without expansion F are relatively similar to one another; the regional RSF and strong regional SSF categories should be expanded to 400% and 300% of their original data volume, respectively. When expanding samples, the number of samples should be increased as much as possible while the repetition rate of similar or related samples is kept as low as possible, so as to safeguard the final classification accuracy of the model. Accordingly, 20000 was finally chosen as the number of samples per category in the final sample library.
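The arithmetic behind these figures can be checked with a short sketch. The counts and the target of 20000 come from the text; the function name and the "downsample"/"augment" labels are illustrative assumptions, and categories are referred to only by their position in the list.

```python
# Per-category image counts for 2002-2015, in the order given in the text.
COUNTS = [426555, 11226, 5201, 15083, 7428]
TARGET = 20000  # samples per category in the final library


def balance_plan(counts, target):
    """Return each category's share (%) and the action needed to reach target."""
    total = sum(counts)
    plan = []
    for count in counts:
        share = round(100 * count / total, 2)
        if count > target:
            action = ("downsample", count - target)  # images to drop
        else:
            action = ("augment", target - count)     # noisy copies to create
        plan.append((share, action))
    return plan


for index, (share, (kind, amount)) in enumerate(balance_plan(COUNTS, TARGET), 1):
    print(f"category {index}: {share:.2f}%  {kind} by {amount}")
```

Only the first category exceeds the target and is downsampled; the other four each need several thousand augmented samples.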
Meanwhile, the way a frequency-height graph is drawn shows that it displays the echoes returned by the ionosphere to radar waves of different frequencies in the 0-17 MHz range, and different trace shapes correspond to different physical laws. Sample enhancement of frequency-height graphs therefore must not destroy the correspondence between frequency (abscissa) and height (ordinate); otherwise the features that carry the physical phenomenon in the image are destroyed, which hampers later interpretation by researchers. Conventional image data enhancement methods, including but not limited to rotation, shearing, and stretching, would cause unacceptable damage to the traces that reflect how electron density changes with height, to the point that professionals could no longer recognize the enhanced images at all.
A study of 14 years of frequency-height graphs from the Hainan station found that diffuse reflection of the incident radio wave by fine plasma structures appears as a clearly dispersed distribution in the frequency-height graph. In addition, instability of the altimeter itself sometimes produces abnormal vertical or horizontal noise points in the frequency-height graph. Therefore, to meet the sample enhancement need arising from the pronounced imbalance of ionospheric frequency-height graph categories, conventional image sample enhancement methods cannot be used. Instead, the diffuse reflection caused by fine plasma structures, or the instability of the instrument, is simulated: artificially adding dispersed distributions or noise points to a frequency-height graph increases the number of samples of the same type without changing the category of the graph.
Fig. 3 is the original frequency-height graph at time 20060101021500, used as the example; Fig. 4 is the graph of Fig. 3 after processing by the random Gaussian method; Fig. 5 is the graph of Fig. 3 after processing by the random salt-and-pepper method; Figs. 6 and 7 are the graph of Fig. 3 after processing by the row and column salt-and-pepper methods; Fig. 8 is the graph of Fig. 3 after processing by the row Gaussian method; Fig. 9 is the graph of Fig. 3 after processing by the column Gaussian method.
As shown in Figs. 3-9, the noise methods preferably include at least a Gaussian method and a salt-and-pepper method. The Gaussian methods include the random Gaussian, row Gaussian, and column Gaussian methods; the salt-and-pepper methods include the random, row, and column salt-and-pepper methods. As for hyperparameters, the noise density of the Gaussian and salt-and-pepper methods is adjustable; the noise width of the row Gaussian, column Gaussian, row salt-and-pepper, and column salt-and-pepper methods is adjustable; and the noise coverage area of the random Gaussian and random salt-and-pepper methods is adjustable.
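Two of the six noise methods can be sketched as below to show how the width and density hyperparameters might enter. This is a dependency-free illustration under stated assumptions, not the patent's code: the image is assumed to be an 8-bit grayscale grid stored as a list of rows, and the function names, the `sigma` parameter, and the equal salt/pepper split are hypothetical choices.

```python
import random


def add_row_gaussian(image, row, width, sigma, rng):
    """Add zero-mean Gaussian noise to `width` consecutive rows from `row`."""
    out = [line[:] for line in image]  # leave the original untouched
    for y in range(row, min(row + width, len(out))):
        out[y] = [
            min(255, max(0, round(value + rng.gauss(0, sigma))))
            for value in out[y]
        ]
    return out


def add_column_salt_pepper(image, col, width, density, rng):
    """Set pixels in `width` consecutive columns to 0 or 255 with probability `density`."""
    out = [line[:] for line in image]
    for line in out:
        for x in range(col, min(col + width, len(line))):
            if rng.random() < density:
                line[x] = 255 if rng.random() < 0.5 else 0
    return out


rng = random.Random(0)
graph = [[128] * 8 for _ in range(8)]          # stand-in for a real image
noisy = add_row_gaussian(graph, row=2, width=2, sigma=10.0, rng=rng)
noisy = add_column_salt_pepper(noisy, col=5, width=1, density=0.3, rng=rng)
```

Because the noise only overwrites or perturbs pixel values in place, the frequency and height axes of the graph are left unchanged, which is the property the text requires of any enhancement.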
Therefore, the method of the present invention for constructing an ionospheric frequency-height graph classification sample library first establishes the matching relationship between the manual identification and the frequency-height graphs: the day-of-year plus hour-minute-second format used in the manual identification is converted to a year-month-day-hour-minute-second format, and the association between each mark and the related frequency-height graphs is then established through a robust comparison method. Second, analysis of the multi-year frequency-height graph statistics shows that data enhancement must be performed on the existing categories in order to build an image library whose categories are relatively balanced. The physical meaning of the frequency-height graph rules out crude, generic sample expansion methods such as rotation and cropping; data enhancement of the samples must respect the physical meaning carried by the frequency-height graph. Image analysis and discussion with the relevant researchers indicate that diffuse reflection of the incident radio wave by fine plasma structures, as well as factors such as the altimeter's operating voltage and short-lived local weather, can produce different types of noise in a frequency-height graph. The scheme therefore simulates the influence of these natural phenomena on ionospheric frequency-height graphs by introducing different types of random noise, thereby expanding the sample data of each expansion F category and finally establishing a relatively balanced frequency-height graph library. The subsequent model training results also verify the effectiveness of the sample enhancement algorithm proposed in this scheme.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art will understand that the technical solution of the present invention may still be modified or equivalently replaced, and that such modifications or replacements do not cause the modified technical solution to depart from the spirit and scope of the technical solution of the present invention.
Claims (7)
1. A method for constructing an ionospheric frequency-height graph classification sample library, characterized by comprising the following steps:
S1, collecting all frequency-height graph images corresponding to a manually judged TXT identification file in a total folder;
S2, parsing the TXT identification file to obtain the categories of all frequency-height graphs in the total folder;
wherein step S2 specifically comprises the following steps:
S21, traversing all frequency-height graph images in the total folder and obtaining the drawing time of each image;
S22, reading the manually judged TXT identification file line by line and obtaining the frequency-height graph category information marked in it;
S23, unifying the drawing times of the frequency-height graphs with the frequency-height graph times in the manually judged TXT identification file, and determining the category of each frequency-height graph according to the records in the TXT file;
S3, moving each frequency-height graph whose category has been determined into the folder of its category;
S4, constructing a sample library that is relatively balanced across categories by means of an up-sampling and down-sampling mechanism;
in step S4, taking into account the order-of-magnitude differences among the numbers of frequency-height graphs in the categories, an equalization operation is carried out by introducing different types of noise: the frequency-height graphs of categories whose count exceeds the set sample size are downsampled, and data enhancement is performed on the frequency-height graphs of categories whose count is below the set sample size;
the data enhancement specifically comprises the following steps:
first, copying the original frequency-height graph to the sample library and defining n noise methods;
then, generating a random number in the range [1, n] to randomly determine the type of noise added to the selected sample;
then, generating random numbers in the range [1, 5] to randomly determine the number of blocks, rows, and columns to which noise is applied, and randomly determining the positions of the noisy blocks, rows, and columns in the image;
then, adding the corresponding random noise to the selected image blocks, rows, and columns;
and finally, adding a noise tag to the file name of the noisy image, saving the image to the sample library of the corresponding category, and repeating the operation until the number of frequency-height graph samples in that category meets the requirement.
2. The method for constructing an ionospheric frequency-height graph classification sample library according to claim 1, wherein: in step S21, the drawing time of each frequency-height graph is in year-month-day-hour-minute-second format, and the initial category is set to 0.
3. The method for constructing an ionospheric frequency-height graph classification sample library according to claim 2, wherein: in the TXT identification file of step S22, the first column records the date on which the expansion F phenomenon occurs in day-of-year form, the second column is the start time of the expansion F phenomenon in 24-hour notation, the third column is the end time of the expansion F phenomenon in 24-hour notation, and the fourth column is the duration of the expansion F phenomenon, of which the first two digits are hours and the last two digits are minutes;
and when the time resolutions associated with the TXT identification files in step S22 differ, the categories are determined through the following steps:
S221, counting the frequency-height graph images in the current folder; if the count is less than or equal to a set value, calculating with the first time resolution, otherwise calculating with the second time resolution;
S222, converting the first-column data of the TXT identification file from day-of-year form into year-month-day-hour-minute-second form;
S223, obtaining the start and end times of an expansion F event, in year-month-day-hour-minute-second form, from the second-column and third-column data of the TXT identification file;
S224, obtaining the duration of the expansion F phenomenon from the fourth column of the TXT identification file, and then estimating, according to the determined time resolution, the number of expansion F images that may exist in the total folder;
S225, marking the categories of the frequency-height graphs according to the TXT identification file;
S226, repeating steps S221-S225 until all manually judged marks have been traversed.
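Steps S221 and S222 can be illustrated with a minimal sketch. It assumes, hypothetically, that the first column has already been split into an integer year and an integer day of year, and that times are integers in HHMM form; the exact TXT layout is not specified here, and the function names are illustrative. The threshold 35040 and the 15-minute and 5-minute resolutions are the values stated in claim 4.

```python
from datetime import datetime, timedelta


def resolution_minutes(image_count, threshold=35040):
    """S221: choose the time resolution from the number of images in a folder."""
    return 15 if image_count <= threshold else 5


def doy_to_timestamp(year, day_of_year, hhmm):
    """S222: convert year + day of year + HHMM into a YYYYMMDDHHMMSS string."""
    day = datetime(year, 1, 1) + timedelta(days=day_of_year - 1)
    stamp = day.replace(hour=hhmm // 100, minute=hhmm % 100)
    return stamp.strftime("%Y%m%d%H%M%S")


print(doy_to_timestamp(2005, 16, 1245))
print(resolution_minutes(35040), resolution_minutes(35041))
```

Using `datetime` arithmetic rather than a hand-written month table handles leap years without extra code, and the resulting string has the same form as the timestamps used for the figures in the description.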
4. The method for constructing an ionospheric frequency-height graph classification sample library according to claim 3, wherein: the set value in step S221 is 35040 = 24 × 4 × 365;
the first time resolution is 15 minutes and the second time resolution is 5 minutes.
5. The method for constructing an ionospheric frequency-height graph classification sample library according to claim 4, wherein: in step S224:
the estimated number of images of the expansion F phenomenon at the first time resolution is calculated as: numberOfFlag = round(floor(duration(i)/100) * 60/15 + mod(duration(i), 100)/15);
the estimated number of images of the expansion F phenomenon at the second time resolution is calculated as: numberOfFlag = round(floor(duration(i)/100) * 60/5 + mod(duration(i), 100)/5);
wherein duration(i) is the duration of the expansion F phenomenon in line i of the TXT identification file.
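The two formulas differ only in the resolution, so they can be sketched as one function. This is an illustration under assumptions: the duration is taken as an integer in HHMM form (first two digits hours, last two digits minutes), the function name is hypothetical, and rounding is implemented as round-half-up on the assumption that the original `round` behaves that way.

```python
import math


def number_of_flag(duration_hhmm, resolution_min):
    """Estimate how many images an expansion F event of this duration spans."""
    hours = duration_hhmm // 100      # floor(duration / 100)
    minutes = duration_hhmm % 100     # mod(duration, 100)
    value = hours * 60 / resolution_min + minutes / resolution_min
    # Round half up; Python's built-in round() rounds halves to even.
    return math.floor(value + 0.5)


for duration in (45, 130, 215):
    print(duration, number_of_flag(duration, 15), number_of_flag(duration, 5))
```

For a duration of 0130 (1 hour 30 minutes), the estimate is 6 images at the 15-minute resolution and 18 images at the 5-minute resolution.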
6. The method for constructing an ionospheric frequency-height graph classification sample library according to claim 3, wherein: in step S225, when frequency-height graphs are missing due to file loss:
when traversing the TXT identification file line by line, if the frequency-height graph corresponding to the start position of a line exists but the one corresponding to the end position is missing, the marking range of the category is restricted to the numberOfFlag frequency-height graphs following the start position; within this range, a frequency-height graph is marked with the category of that line if the year-month-day-hour-minute time in its file name is earlier than that of the end position; if it is later, no marking is performed and the loop is exited;
if the frequency-height graph corresponding to the start position of the line is missing but the one corresponding to the end position exists, the marking range of the category is restricted to the numberOfFlag frequency-height graphs preceding the end position; within this range, a frequency-height graph is marked with the category of that line if the year-month-day-hour-minute time in its file name is later than or equal to that of the start position; if it is earlier, no marking is performed and the loop is exited;
and when neither the start position nor the end position has a corresponding frequency-height graph, the loop is ended and the judgment proceeds directly to the next entry of the TXT identification file.
7. The method for constructing an ionospheric frequency-height graph classification sample library according to claim 1, wherein: the noise methods include at least a Gaussian method and a salt-and-pepper method;
the Gaussian methods include the random Gaussian, row Gaussian, and column Gaussian methods;
the salt-and-pepper methods include the random, row, and column salt-and-pepper methods;
as for hyperparameters, the noise density of the Gaussian and salt-and-pepper methods is adjustable; the noise width of the row Gaussian, column Gaussian, row salt-and-pepper, and column salt-and-pepper methods is adjustable; and the noise coverage area of the random Gaussian and random salt-and-pepper methods is adjustable.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211238852.5A CN115525786B (en) | 2022-10-11 | 2022-10-11 | Method for constructing ionospheric frequency high-graph classification sample library |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115525786A CN115525786A (en) | 2022-12-27 |
CN115525786B true CN115525786B (en) | 2024-02-20 |
Family
ID=84702208
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211238852.5A Active CN115525786B (en) | 2022-10-11 | 2022-10-11 | Method for constructing ionospheric frequency high-graph classification sample library |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115525786B (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||