CN115525786A - Method for constructing ionospheric frequency-height map classification sample library - Google Patents

Publication number
CN115525786A
Authority
CN
China
Prior art keywords
frequency
txt
time
image
column
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211238852.5A
Other languages
Chinese (zh)
Other versions
CN115525786B (en)
Inventor
高鹏东
裘初
齐全
王铮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Space Science Center of CAS
Communication University of China
Original Assignee
National Space Science Center of CAS
Communication University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Space Science Center of CAS, Communication University of China filed Critical National Space Science Center of CAS
Priority to CN202211238852.5A priority Critical patent/CN115525786B/en
Publication of CN115525786A publication Critical patent/CN115525786A/en
Application granted granted Critical
Publication of CN115525786B publication Critical patent/CN115525786B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/55 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a method for constructing an ionospheric frequency-height map (ionogram) classification sample library, comprising the following steps: S1, collecting into a general folder all frequency-height map images that correspond to the manually interpreted TXT identification files; S2, parsing the TXT identification files to obtain the category of every frequency-height map in the general folder; S3, moving the frequency-height maps of each determined category into the corresponding classification folder; and S4, constructing a sample library in which the classes are relatively balanced by means of an up-sampling and down-sampling mechanism. The method simulates the influence of natural phenomena on ionospheric frequency-height maps by introducing different types of random noise, thereby expanding the sample data of the various spread F frequency-height maps and finally establishing a relatively balanced frequency-height map sample library that supports subsequent supervised learning.

Description

Method for constructing an ionospheric frequency-height map classification sample library
Technical Field
The invention relates to frequency-height map (ionogram) processing technology, and in particular to a method for constructing an ionospheric frequency-height map classification sample library.
Background
The ionosonde (ionospheric altimeter) is a remote sensing instrument that probes the ionospheric space environment using radar echoes. The image data it produces is called a frequency-height map (ionogram), in which a clear trace reflects the variation of electron density with height. The ionospheric F layer (about 130 km to 1000 km, the region in which most spacecraft fly) is not a stable layer confined to a fixed height range; it contains fine plasma structures (density inhomogeneities, or irregularities) that scatter incident radio waves diffusely, so that the echo appears as a diffuse patch rather than a clear linear trace. Spread F is a natural phenomenon in which ionospheric plasma irregularities affect radio wave propagation and thereby produce characteristic diffuse patterns in the frequency-height map, with different forms corresponding to different physical laws.
The most widely accepted international classification of spread F is given in the 1978 revision of the Handbook of Ionogram Interpretation and Reduction of the International Union of Radio Science (URSI). According to the graphic features in the frequency-height map, spread F is divided into Frequency Spread F (FSF), Range Spread F (RSF), Mixed Spread F (MSF), and Branch Spread F (BSF); the handbook also suggests that, because the characteristics of spread F differ greatly among stations around the world, each station may adopt its own targeted classification. The four categories are as follows: (1) the frequency type is a disturbance structure in which the F-layer trace is clear in the low-frequency band and spread in the high-frequency band, corresponding to heights near the F-layer peak; (2) the range type is a structure in which the F-layer trace is clear in the high-frequency band and spread in the low-frequency band, corresponding to non-uniform plasma density near the bottom of the F layer; (3) the mixed type has the characteristics of both the frequency type and the range type, and its mechanism is relatively complex; (4) the branch type shows spreading near the F-layer peak frequency together with a spread F branch distinct from the F-layer trace, corresponding to horizontally distributed plasma structures of different densities, which may be associated with ion precipitation at high latitudes.
Because frequency-height maps are scientifically specialized and complex, they are still interpreted internationally only by eye. For scientific research this has a major drawback: subjective human judgment is involved, the criteria for judging the spread F type differ between researchers, and even the same researcher may, in the course of the work, shift the criteria applied to spread F graphic features, which themselves vary with year, season, local time, and so on.
With the development of China's aerospace technology, and in particular with the further construction of Phase II of the Meridian Project, in which the National Space Science Center of the Chinese Academy of Sciences participates, more than ten digital ionosondes distributed across China and operating 24 hours a day will be added in 2023, with a time resolution of about 5 to 15 minutes; engineers are even developing a high-precision detection network with a resolution of 1 minute. Under these conditions, the conventional manual interpretation of frequency-height maps is ill-suited to real-time monitoring of the space environment, and human interpreters cannot apply a unified criterion to all stations around the clock. From an application standpoint, it is therefore necessary to develop an artificial-intelligence method for identifying the spread F phenomenon in ionospheric frequency-height maps. To develop such an intelligent interpretation model, a sample library with a reasonable class distribution is indispensable, as required by supervised learning.
Disclosure of Invention
The invention aims to provide a method for constructing an ionospheric frequency-height map classification sample library. The method simulates the influence of natural phenomena on ionospheric frequency-height maps by introducing different types of random noise, thereby expanding the sample data of the various spread F frequency-height maps and finally establishing a relatively balanced frequency-height map sample library that supports subsequent supervised learning.
To achieve the above object, the present invention provides a method for constructing an ionospheric frequency-height map classification sample library, comprising the following steps:
S1, collecting into a general folder all frequency-height map images that correspond to the manually interpreted TXT identification files;
S2, parsing the TXT identification files to obtain the category of every frequency-height map in the general folder;
S3, moving the frequency-height maps of each determined category into the corresponding classification folder;
and S4, constructing a sample library in which the classes are relatively balanced by means of an up-sampling and down-sampling mechanism.
Preferably, step S2 specifically comprises the following steps:
S21, traversing all frequency-height map images in the general folder and obtaining the recording time of each image;
S22, reading the manually interpreted TXT identification file line by line and obtaining the frequency-height map category information recorded in it;
and S23, bringing the recording times of the frequency-height maps and the times recorded in the manually interpreted TXT identification file into a common format, and determining the category of each frequency-height map from the records in the TXT file.
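As an illustration only, step S21 and the common time format of step S23 can be sketched in Python as follows. The sketch assumes that each image file name is its 14-digit recording time (for example `20050104154500.png`, matching the times quoted for the figures); the function name `index_ionograms` and the `.png` extension are hypothetical and are not taken from the patent.

```python
from datetime import datetime
from pathlib import Path


def index_ionograms(filenames):
    """Map each image's recording time to an initial category of 0 (step S21)."""
    index = {}
    for name in filenames:
        stem = Path(name).stem
        try:
            # Assumed file-name layout: year, month, day, hour, minute, second.
            stamp = datetime.strptime(stem, "%Y%m%d%H%M%S")
        except ValueError:
            continue  # ignore files whose names are not timestamps
        index[stamp] = 0  # 0 means "not yet classified"
    return index


index = index_ionograms(["20050104154500.png", "20050116124500.png", "notes.txt"])
```

Keying the index by `datetime` objects means the times parsed from the TXT file can be compared with the image times directly, which is the purpose of step S23.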
Preferably, in step S21, the recording time of each frequency-height map is expressed in the format year, month, day, hour, minute, second, and the initial category of every image is set to 0.
Preferably, in the TXT identification file of step S22, the first column records the date on which the spread F phenomenon occurred as a day of the year; the second column records the start time of the phenomenon in 24-hour notation; the third column records its end time in 24-hour notation; and the fourth column records its duration, in which the first two digits are hours and the last two digits are minutes;
and, because the time resolution of the data covered by the TXT identification file may differ, the categories are determined through the following steps:
S221, counting the frequency-height map images in the current folder; if the count is less than or equal to a set value, the calculation uses a first time resolution, otherwise it uses a second time resolution;
S222, converting the day-of-year date in the first column of the TXT identification file into the year, month, day, hour, minute, second format;
S223, obtaining from the second and third columns of the TXT identification file the start time and end time of a given spread F event, expressed in the year, month, day, hour, minute, second format;
S224, obtaining the duration of the spread F event from the fourth column of the TXT identification file, and then estimating, according to the time resolution determined above, the number of spread F images that may exist in the general folder;
S225, marking the categories of the frequency-height maps according to the TXT identification file;
and S226, repeating steps S221 to S225 until all manually interpreted records have been traversed.
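Steps S222 and S223 can be sketched as below. This is a minimal sketch under stated assumptions: the columns are taken to be whitespace-separated, the year is supplied separately because the first column holds only the day of the year, a fifth column is assumed to carry the category, and an end time earlier than the start time is assumed to mean the event ran past midnight. The sample line and the category code `FSF` are made up for illustration.

```python
from datetime import datetime, timedelta


def parse_record(year, line):
    """Turn one annotation line into (start, end, duration_hhmm, category)."""
    day_of_year, start_hhmm, end_hhmm, duration, category = line.split()
    # S222: day of year -> calendar date (day 001 is 1 January).
    day = datetime(year, 1, 1) + timedelta(days=int(day_of_year) - 1)
    # S223: 24-hour HHMM values -> full timestamps.
    start = day + timedelta(hours=int(start_hhmm[:2]), minutes=int(start_hhmm[2:]))
    end = day + timedelta(hours=int(end_hhmm[:2]), minutes=int(end_hhmm[2:]))
    if end < start:
        end += timedelta(days=1)  # assumption: the event crossed midnight
    return start, end, int(duration), category


start, end, duration, category = parse_record(2011, "049 1315 1445 0130 FSF")
print(start, end, duration, category)
```

Day 049 of 2011 resolves to 18 February 2011, so the start and end times in this example fall on that date.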
Preferably, the set value in step S221 is 35040 = 24 × 4 × 365, that is, the number of images produced in one year at one image every 15 minutes;
the first time resolution is 15 minutes and the second time resolution is 5 minutes.
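The threshold test of step S221 reduces to a single comparison. The sketch below assumes that the folder being counted holds one year of images; the constant and function names are illustrative.

```python
# 24 hours x 4 images per hour x 365 days: one year of 15-minute sampling.
IMAGES_PER_YEAR_AT_15_MIN = 24 * 4 * 365


def time_resolution_minutes(image_count):
    """Return the assumed sampling interval for a folder (step S221)."""
    # More images than a full year of 15-minute sampling can produce
    # implies that at least part of the year was sampled every 5 minutes.
    return 15 if image_count <= IMAGES_PER_YEAR_AT_15_MIN else 5


print(time_resolution_minutes(30000), time_resolution_minutes(60000))
```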
Preferably, in step S224:
the estimated number of spread F images at the first time resolution is calculated as: numberfFlag = round(floor(duration(i)/100) × 60/15 + mod(duration(i), 100)/15);
the estimated number of spread F images at the second time resolution is calculated as: numberfFlag = round(floor(duration(i)/100) × 60/5 + mod(duration(i), 100)/5);
where duration(i) is the duration of the spread F event recorded in the i-th line of the TXT identification file.
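The two formulas differ only in the sampling interval, so they can be written as one function. The following Python sketch is a direct transcription; the function name is hypothetical, and `floor(x + 0.5)` is used in place of Python's built-in `round` so that halves round upward rather than to the nearest even integer.

```python
import math


def estimate_image_count(duration_hhmm, resolution_minutes):
    """Estimate how many images fall inside an event of the given HHMM duration."""
    hours, minutes = divmod(duration_hhmm, 100)  # floor(d / 100) and mod(d, 100)
    value = hours * 60 / resolution_minutes + minutes / resolution_minutes
    return math.floor(value + 0.5)  # round half up for non-negative values


print(estimate_image_count(130, 15))  # 1 h 30 min at 15-minute sampling
print(estimate_image_count(130, 5))   # the same event at 5-minute sampling
```

For a duration of 0130 the estimate is 6 images at 15 minutes and 18 images at 5 minutes.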
Preferably, in step S225, to tolerate missing files:
when the TXT identification file is traversed line by line, if the frequency-height map corresponding to the start position of a line exists but the one corresponding to the end position is missing, the marking range for the category is limited to the estimated number of frequency-height maps following the start position, and the year, month, day, hour, minute, second time in each frequency-height map file name is compared with the time of the end position: if it is less than or equal to the end time, the frequency-height map is marked with the category of that line; if it is greater, the map is not marked and the loop is exited;
if the frequency-height map corresponding to the start position of the line is missing but the one corresponding to the end position exists, the marking range for the category is limited to the estimated number of frequency-height maps preceding the end position, and the time in each frequency-height map file name is compared with the time of the start position: if it is greater than or equal to the start time, the frequency-height map is marked with the category of that line; if it is less, the map is not marked and the loop is exited;
when neither the start position nor the end position has a corresponding frequency-height map, the loop is terminated and processing moves directly to the next record of the TXT identification file.
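One possible reading of this fault-tolerant marking is sketched below. It assumes a sorted list of image times and a dictionary of labels; the function name, the data structures, and the choice of a window of `expected + 1` images (so that both endpoints fit) are assumptions made for illustration. Because the times are sorted, filtering the window is equivalent to exiting the loop at the first image outside the event.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def label_event(times, labels, start, end, expected, category):
    """Mark images of one TXT record; return how many images were marked."""
    i, j = bisect_left(times, start), bisect_left(times, end)
    has_start = i < len(times) and times[i] == start
    has_end = j < len(times) and times[j] == end
    if has_start and has_end:
        window = times[i:j + 1]
    elif has_start:   # end image missing: look forward from the start
        window = [t for t in times[i:i + expected + 1] if t <= end]
    elif has_end:     # start image missing: look backward from the end
        window = [t for t in times[max(0, j - expected):j + 1] if t >= start]
    else:             # neither endpoint exists: skip this record
        return 0
    for t in window:
        labels[t] = category
    return len(window)


base = datetime(2011, 2, 18, 13, 0)
# 15-minute images from 13:00 to 15:00 with the 14:45 image missing.
times = [base + timedelta(minutes=15 * k) for k in range(9) if k != 7]
labels = {t: 0 for t in times}
marked = label_event(times, labels, datetime(2011, 2, 18, 13, 15),
                     datetime(2011, 2, 18, 14, 45), 6, "FSF")
```

In this example the end image is missing, so the six images from 13:15 to 14:30 are marked and the 15:00 image keeps its initial category of 0.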
Preferably, in step S4, because the image counts of the frequency-height map classes differ by orders of magnitude, the classes are balanced as follows: classes whose image count exceeds the set sample number are down-sampled, and classes whose image count falls below the set sample number are augmented by introducing different types of noise.
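The balancing decision can be sketched as a small planning function. The patent does not state how the down-sampled images are chosen; uniform random sampling is assumed here, and the function name and toy file names are hypothetical.

```python
import random


def plan_balancing(class_files, target, seed=0):
    """Decide, per class, which originals to keep and how many noisy copies to add."""
    rng = random.Random(seed)
    kept, to_generate = {}, {}
    for name, files in class_files.items():
        if len(files) > target:
            kept[name] = rng.sample(files, target)   # down-sample (assumed uniform)
            to_generate[name] = 0
        else:
            kept[name] = list(files)                 # keep every original image
            to_generate[name] = target - len(files)  # noisy copies still needed
    return kept, to_generate


class_files = {"none": [f"n{i}.png" for i in range(10)], "RSF": ["r0.png", "r1.png"]}
kept, to_generate = plan_balancing(class_files, target=4)
print(to_generate)
```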
Preferably, the data enhancement specifically comprises the following steps:
first, copying the original frequency-height map images into the sample library and defining n noise methods;
then, generating a random number in the range [1, n] to determine at random the type of noise to be added to the selected sample;
then, generating random numbers in the range [1, 5] to determine at random the number of blocks, rows, and columns to which noise is applied, and determining at random the positions in the image at which the block, row, and column noise is added;
then, adding the corresponding random noise to the selected image blocks, rows, and columns;
and finally, adding a noise tag to the file name of the noisy image, adding the image to the sample library of the corresponding category, and repeating the above operations until the number of frequency-height map samples in the category meets the requirement.
The number 5 above is an empirical value: the noise in existing frequency-height maps typically does not exceed 5 blocks, 5 rows, or 5 columns.
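The random choices in these steps can be sketched as follows. The six method names follow the Gaussian and salt-and-pepper variants named in the patent, while the function names, the image size, and the file-naming scheme for the noise tag are assumptions made for illustration.

```python
import random
from pathlib import Path

NOISE_METHODS = ["random_gaussian", "row_gaussian", "column_gaussian",
                 "random_salt_pepper", "row_salt_pepper", "column_salt_pepper"]


def draw_noise_plan(height, width, rng):
    """Pick a noise method, a count in [1, 5], and random positions."""
    method = NOISE_METHODS[rng.randint(1, len(NOISE_METHODS)) - 1]  # number in [1, n]
    count = rng.randint(1, 5)                                       # number in [1, 5]
    if method.startswith("row"):
        positions = [rng.randrange(height) for _ in range(count)]
    elif method.startswith("column"):
        positions = [rng.randrange(width) for _ in range(count)]
    else:  # block noise: (row, column) of each block's top-left corner
        positions = [(rng.randrange(height), rng.randrange(width)) for _ in range(count)]
    return method, positions


def augmented_name(original_name, method, serial):
    """Add a noise tag to the file name (hypothetical naming scheme)."""
    p = Path(original_name)
    return f"{p.stem}_{method}_{serial}{p.suffix}"


rng = random.Random(1)
method, positions = draw_noise_plan(600, 800, rng)
print(augmented_name("20060101021500.png", method, 1), positions)
```

Tagging the file name with the method keeps each noisy copy traceable to its original image and to the noise type applied.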
Preferably, the noise methods include at least Gaussian methods and salt-and-pepper methods;
the Gaussian methods comprise a random Gaussian method, a row Gaussian method, and a column Gaussian method;
the salt-and-pepper methods comprise a random salt-and-pepper method, a row salt-and-pepper method, and a column salt-and-pepper method;
as to hyperparameters, the noise density of the Gaussian and salt-and-pepper methods is adjustable; the noise width of the row and column Gaussian methods and of the row and column salt-and-pepper methods is adjustable; and the noise coverage area of the random Gaussian and random salt-and-pepper methods is adjustable.
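A minimal pure-Python sketch of such noise follows, treating an image as a list of rows of 8-bit grey values. The patent does not specify how density, width, or coverage area are parameterized; here `density` is assumed to be the fraction of pixels altered inside the chosen region, a row or column band is a rectangle spanning the full width or height, and a random block is a small rectangle. The function name and default values are illustrative.

```python
import random


def add_noise(image, rows, cols, kind, rng, density=0.3, sigma=25.0):
    """Return a copy of `image` with noise applied inside the rectangle rows x cols."""
    out = [line[:] for line in image]  # the original image is left untouched
    for y in rows:
        for x in cols:
            if rng.random() >= density:
                continue  # density controls the fraction of pixels altered
            if kind == "gaussian":
                noisy = out[y][x] + rng.gauss(0.0, sigma)
                out[y][x] = int(min(255, max(0, round(noisy))))
            else:  # salt-and-pepper: force the pixel to black or white
                out[y][x] = rng.choice((0, 255))
    return out


height, width = 6, 8
image = [[128] * width for _ in range(height)]
rng = random.Random(42)
# Row Gaussian noise with a band width of 2 rows.
row_band = add_noise(image, range(2, 4), range(width), "gaussian", rng)
# Column salt-and-pepper noise with a band width of 1 column.
col_band = add_noise(image, range(height), range(5, 6), "salt_pepper", rng)
# Random-block salt-and-pepper noise covering a 3 x 3 area.
block = add_noise(image, range(0, 3), range(0, 3), "salt_pepper", rng, density=1.0)
```

Because noise is confined to the chosen rectangle, pixels outside it keep their values, which is consistent with the stated aim of leaving the overall trace trend intact.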
Therefore, the beneficial effects of the invention are as follows:
1. The manual identification file is parsed with full allowance for incomplete original observations and incomplete manual identification data: the coexistence of 5-minute and 15-minute sampling within the same year is taken into account, and a fault-tolerant mechanism handles frequency-height maps that may be missing. The category correspondence between the manual identification and the frequency-height maps is thus established accurately and robustly;
2. The original ionosonde frequency-height maps are not damaged and remain part of the sample library;
3. The pronounced imbalance in the distribution of the different classes of frequency-height map samples is addressed by down-sampling together with sample enhancement that superimposes different types of random noise; the enhancement does not destroy the basic trend of the traces in the frequency-height map, but simulates noise that natural phenomena and the instrument may produce, so that the classification category of each frequency-height map remains unchanged;
4. Using the sample enhancement method of this scheme, the frequency-height map data of the 14 years from 2002 to 2015 were classified according to the manually identified TXT files; the four classes of frequency-type FSF, range-type RSF, mixed-type MSF, and strong-range SSF were then expanded, each class being brought to 20000 samples, giving 100,000 samples in total. The training and validation sets were divided in a ratio of 8:2, and classification accuracy above 93% was finally achieved on three models: ResNet34_20_5 (Net 34 Net) (93.20%), ResNet34-modified-20-2-200 (93.50%), and residual_identification_Net_old_25_5 (last_model_92_25_5.pkl) (93.53%). This shows that the sample enhancement algorithm used by the invention can meet the training requirements of the classification model and lays a foundation for good classification accuracy.
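The 8:2 division can be illustrated with a simple shuffled split. The patent does not say how the split was performed (for example, whether it was stratified by class or whether noisy copies of one original were kept in the same subset), so the following is only one plausible sketch with hypothetical names.

```python
import random


def split_8_2(samples, seed=0):
    """Shuffle the samples and split them 80% / 20%."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = len(shuffled) * 8 // 10
    return shuffled[:cut], shuffled[cut:]


# One class of 20000 samples, represented here by integer identifiers.
train, validation = split_8_2(range(20000))
print(len(train), len(validation))
```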
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
FIG. 1 is the frequency-height map recorded at time 20050104154500 in accordance with the present invention;
FIG. 2 is the frequency-height map recorded at time 20050116124500 in accordance with the present invention;
FIG. 3 is an original frequency-height map, taking time 20060101021500 as an example;
FIG. 4 is the frequency-height map of FIG. 3 after processing by the random Gaussian method;
FIG. 5 is the frequency-height map of FIG. 3 after processing by the random salt-and-pepper method;
FIG. 6 is the frequency-height map of FIG. 3 after processing by the row salt-and-pepper method;
FIG. 7 is the frequency-height map of FIG. 3 after processing by the column salt-and-pepper method;
FIG. 8 is the frequency-height map of FIG. 3 after processing by the row Gaussian method;
FIG. 9 is the frequency-height map of FIG. 3 after processing by the column Gaussian method.
Detailed Description
The present invention is further described below with reference to the accompanying drawings. It should be noted that this embodiment is based on the above technical solution and provides a detailed implementation and a specific operating procedure, but the protection scope of the invention is not limited to this embodiment.
The invention comprises the following steps:
S1, collecting into a general folder all frequency-height map images that correspond to the manually interpreted TXT identification files;
S2, parsing the TXT identification files to obtain the category of every frequency-height map in the general folder;
Preferably, step S2 specifically comprises the following steps:
S21, traversing all frequency-height map images in the general folder and obtaining the recording time of each image;
Preferably, in step S21, the recording time of each frequency-height map is expressed in the format year, month, day, hour, minute, second, and the initial category of every image is set to 0.
S22, reading the manually interpreted TXT identification file line by line and obtaining the frequency-height map category information recorded in it;
and S23, bringing the recording times of the frequency-height maps and the times recorded in the manually interpreted TXT identification file into a common format, and determining the category of each frequency-height map from the records in the TXT file.
Preferably, in the TXT identification file of step S22, the first column records the date on which the spread F phenomenon occurred as a day of the year; the second column records the start time of the phenomenon in 24-hour notation; the third column records its end time in 24-hour notation; and the fourth column records its duration, in which the first two digits are hours and the last two digits are minutes;
in this embodiment, the frequency elevation map generated by the Hainan altimeter in 2002-2015 is manually interpreted and corrected for multiple times, and the interpretation result is recorded according to an international convention mode, so that the results of manual interpretation classification in 2011 are explained:
table 1 is a chart of records of 2011 annual frequency and altitude chart classification manual interpretation of Hainan altimeter
Figure BDA0003883782800000071
Figure BDA0003883782800000081
The first column of the TXT file shown in Table 1 records the date of the spread F event as a day of the year; for example, 049 denotes the 49th day of 2011, that is, February 18, 2011. The next three columns are read as described above, and the fifth column gives the type of spread F in that time period.
Because the time resolution of the data covered by the TXT identification file in step S22 may differ, the categories are determined through the following steps:
S221, counting the frequency-height map images in the current folder; if the count is less than or equal to a set value, the calculation uses a first time resolution, otherwise it uses a second time resolution;
Preferably, the set value in step S221 is 35040 = 24 × 4 × 365;
the first time resolution is 15 minutes and the second time resolution is 5 minutes (in Phase I of the Meridian Project, in which the National Space Science Center of the Chinese Academy of Sciences took part, the sampling resolution of the digital ionosonde deployed at the Hainan station was generally kept at 15 minutes, but later, to meet the needs of operational departments, 5-minute and 15-minute sampling coexisted during certain periods of certain years).
Preferably, in step S224:
the estimated number of spread F images at the first time resolution is calculated as: numberfFlag = round(floor(duration(i)/100) × 60/15 + mod(duration(i), 100)/15);
the estimated number of spread F images at the second time resolution is calculated as: numberfFlag = round(floor(duration(i)/100) × 60/5 + mod(duration(i), 100)/5);
where duration(i) is the duration of the spread F event recorded in the i-th line of the TXT identification file.
S222, converting the day-of-year date in the first column of the TXT identification file into the year, month, day, hour, minute, second format;
S223, obtaining from the second and third columns of the TXT identification file the start time and end time of a given spread F event, expressed in the year, month, day, hour, minute, second format;
S224, obtaining the duration of the spread F event from the fourth column of the TXT identification file, and then estimating, according to the time resolution determined above, the number of spread F images that may exist in the general folder;
S225, marking the categories of the frequency-height maps according to the TXT identification file;
Preferably, in step S225, to tolerate missing files:
when the TXT identification file is traversed line by line, if the frequency-height map corresponding to the start position of a line exists but the one corresponding to the end position is missing, the marking range for the category is limited to the estimated number of frequency-height maps following the start position, and the year, month, day, hour, minute, second time in each frequency-height map file name is compared with the time of the end position: if it is less than or equal to the end time, the frequency-height map is marked with the category of that line; if it is greater, the map is not marked and the loop is exited;
if the frequency-height map corresponding to the start position of the line is missing but the one corresponding to the end position exists, the marking range for the category is limited to the estimated number of frequency-height maps preceding the end position, and the time in each frequency-height map file name is compared with the time of the start position: if it is greater than or equal to the start time, the frequency-height map is marked with the category of that line; if it is less, the map is not marked and the loop is exited;
when neither the start position nor the end position has a corresponding frequency-height map, the loop is terminated and processing moves directly to the next record of the TXT identification file.
S226, repeating steps S221 to S225 until all manually interpreted records have been traversed.
S3, moving the frequency-height maps of each determined category into the corresponding classification folder;
and S4, constructing a sample library in which the classes are relatively balanced by means of an up-sampling and down-sampling mechanism.
Preferably, in step S4, because the image counts of the frequency-height map classes differ by orders of magnitude, the classes are balanced as follows: classes whose image count exceeds the set sample number are down-sampled, and classes whose image count falls below the set sample number are augmented by introducing different types of noise.
Preferably, the data enhancement specifically comprises the following steps:
first, copying the original frequency-height map images into the sample library and defining n noise methods;
then, generating a random number in the range [1, n] to determine at random the type of noise to be added to the selected sample;
then, generating random numbers in the range [1, 5] to determine at random the number of blocks, rows, and columns to which noise is applied, and determining at random the positions in the image at which the block, row, and column noise is added;
then, adding the corresponding random noise to the selected image blocks, rows, and columns;
and finally, adding a noise tag to the file name of the noisy image, adding the image to the sample library of the corresponding category, and repeating the above operations until the number of frequency-height map samples in the category meets the requirement.
The number 5 above is an empirical value: the noise in existing frequency-height maps typically does not exceed 5 blocks, 5 rows, or 5 columns.
The classification statistics of the frequency-height maps generated by the Hainan ionosonde from 2002 to 2015 in this embodiment are shown in Table 2 below:
Table 2. Classification statistics of the frequency-height maps generated by the Hainan ionosonde from 2002 to 2015
[Table 2: image in the original publication, Figure BDA0003883782800000111]
As can be seen from the above table, the proportions of the five classes of frequency-height maps among all images are 91.64%, 2.41%, 1.12%, 3.24%, and 1.60%. Clearly, the sample data is severely imbalanced between classes. For a supervised learning method that identifies the spread F phenomenon in ionospheric frequency-height maps, a sample library in which the classes are evenly distributed and roughly equal in size is an important guarantee of rapid convergence when a deep neural network is trained. The significant imbalance in the statistical distribution of the frequency-height maps must therefore be resolved before the deep-learning-based automatic recognition model is trained.
FIG. 1 is a frequency height map at time 20050104154500 according to the present invention; FIG. 2 is a frequency height map at time 20050116124500 according to the present invention, in which the boxed region shows relatively pronounced noise. As shown in FIG. 1 and FIG. 2, the frequency height maps of the 14 years from 2002 to 2015 can be classified year by year by parsing the manual identification files in the first step. The classification statistics are shown in Table 2. Over the 14 years, the five categories are no spread F, frequency-type spread F (FSF), range-type spread F (RSF), mixed spread F (MSF) and strong range spread F (SSF), and the numbers of frequency height maps in these five categories are 426555, 11226, 5201, 15083 and 7428, respectively. When training a deep neural network, the order-of-magnitude differences between the sample counts of the categories must be taken into account as far as possible. For example, down-sampling the no-spread-F samples is acceptable because images without spread F are relatively similar to one another, whereas the range-type RSF and strong range SSF categories are expanded to approximately 400% and 300% of their original sizes, respectively. When expanding the samples, the number of samples should be increased as much as possible while the repetition rate of similar or related samples is kept as low as possible, so as to ensure the final classification accuracy of the model. For these reasons, 20000 was finally chosen as the number of samples per category in the final sample library.
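The arithmetic behind these figures can be checked with a short sketch. The class counts and the target of 20000 come from the text above; the function name `plan_balance` and the shape of the returned dictionary are hypothetical.

```python
def plan_balance(counts, target):
    """For each class, report its share of all images and the balancing action."""
    total = sum(counts.values())
    plan = {}
    for name, n in counts.items():
        plan[name] = {
            "share_percent": round(100.0 * n / total, 2),
            # Classes above the target are down-sampled, the rest are augmented.
            "action": "down-sample" if n > target else "augment",
            "to_add": max(target - n, 0),
            "factor": round(target / n, 2),
        }
    return plan


COUNTS = {
    "no spread F": 426555,
    "FSF": 11226,
    "RSF": 5201,
    "MSF": 15083,
    "SSF": 7428,
}

if __name__ == "__main__":
    for name, row in plan_balance(COUNTS, 20000).items():
        print(name, row)
```

With a target of 20000, the RSF and SSF classes need growth factors of about 3.8 and 2.7, in line with the approximate 400% and 300% expansions mentioned above.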
Meanwhile, the principle by which a frequency height map is drawn shows that it displays the ionospheric echoes of radar waves at different frequencies from 0 to 17 MHz, and that different trace shapes correspond to different physical laws. Therefore, sample enhancement of frequency height maps must not destroy the correspondence between frequency (abscissa) and height (ordinate); otherwise, the features that carry the physical phenomena in the image are destroyed and later interpretation by researchers is affected. Conventional image data enhancement methods, including but not limited to rotation, cropping and stretching, cause irreparable damage to the traces that reflect the variation of electron density with height, so that professionals cannot identify or process the enhanced images at all.
After studying the frequency height maps accumulated over 14 years at the Hainan station, it was found that diffuse reflection of incident radio waves by fine plasma structures appears as clearly diffuse patches in the frequency height map. In addition, the ionosonde itself is sometimes unstable, so that abnormal vertical or horizontal noise points appear in the frequency height map. Therefore, to meet the sample enhancement needs arising from the significantly imbalanced category distribution of ionospheric frequency height maps, conventional image sample enhancement methods cannot be used. Instead, diffuse patches or noise points are artificially added to the frequency height map to simulate the diffuse reflection of incident radio waves by fine plasma structures or the effect of instrument instability, so that the number of samples of a category can be increased without changing the category of the frequency height map.
FIG. 3 shows an original frequency height map, taking time 20060101021500 as an example; FIG. 4 is the frequency height map of FIG. 3 after processing by the random Gaussian method; FIG. 5 is the frequency height map of FIG. 3 after processing by the random salt-and-pepper method; FIG. 6 is the frequency height map of FIG. 3 after processing by the row salt-and-pepper method; FIG. 7 is the frequency height map of FIG. 3 after processing by the column salt-and-pepper method; FIG. 8 is the frequency height map of FIG. 3 after processing by the row Gaussian method; FIG. 9 is the frequency height map of FIG. 3 after processing by the column Gaussian method.
As shown in FIGS. 3-9, preferably, the noise methods include at least the Gaussian method and the salt-and-pepper method. The Gaussian method comprises the random Gaussian method, the row Gaussian method and the column Gaussian method; the salt-and-pepper method comprises the random salt-and-pepper method, the row salt-and-pepper method and the column salt-and-pepper method. In terms of hyperparameters, the noise density of the Gaussian and salt-and-pepper methods is adjustable; the noise width of the row Gaussian, column Gaussian, row salt-and-pepper and column salt-and-pepper methods is adjustable; and the noise coverage area of the random Gaussian and random salt-and-pepper methods is adjustable.
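One possible way to realize these six variants with the three adjustable hyperparameters (density, width, coverage area) is sketched below. It assumes 8-bit grayscale NumPy images; the function names, default values and the choice of a rectangular block for the "random" variants are assumptions for illustration, not details given in the patent.

```python
import numpy as np


def salt_pepper(region, rng, density=0.1):
    """Set a `density` fraction of pixels to 0 or 255 with equal probability."""
    out = region.copy()
    mask = rng.random(region.shape) < density
    values = rng.choice(np.array([0, 255], dtype=np.uint8), size=region.shape)
    out[mask] = values[mask]
    return out


def gaussian(region, rng, density=0.1, sigma=30.0):
    """Add zero-mean Gaussian noise to a `density` fraction of pixels."""
    mask = rng.random(region.shape) < density
    noise = rng.normal(0.0, sigma, region.shape) * mask
    return np.clip(region.astype(np.float64) + noise, 0, 255).astype(np.uint8)


def apply_noise(image, rng, noise_fn, kind, width=3, area=(40, 40), **kw):
    """Apply noise_fn to one random row band, column band, or block.

    kind: "row" or "col" (a band `width` pixels wide) or "random"
    (a block of size `area`, i.e. the adjustable coverage area).
    Extra keyword arguments such as `density` are passed to noise_fn.
    """
    out = image.copy()
    h, w = out.shape
    if kind == "row":
        r = int(rng.integers(0, h - width + 1))
        sl = (slice(r, r + width), slice(0, w))
    elif kind == "col":
        c = int(rng.integers(0, w - width + 1))
        sl = (slice(0, h), slice(c, c + width))
    elif kind == "random":
        bh, bw = min(area[0], h), min(area[1], w)
        r = int(rng.integers(0, h - bh + 1))
        c = int(rng.integers(0, w - bw + 1))
        sl = (slice(r, r + bh), slice(c, c + bw))
    else:
        raise ValueError(f"unknown kind: {kind}")
    out[sl] = noise_fn(out[sl], rng, **kw)
    return out
```

For example, `apply_noise(img, rng, salt_pepper, "col", width=2, density=0.3)` would correspond to a column salt-and-pepper variant under these assumptions.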
Therefore, with the method for constructing an ionospheric frequency height map classification sample library adopted by the invention, the matching relationship between the manual identification and the frequency height maps is established first. That is, the day-of-year time record of the manual identification is converted into a year-month-day-hour-minute-second record, and the association between each identification and the related frequency height maps is then established through a robust comparison method. Secondly, analysis of the multi-year statistics of the frequency height maps shows that, in order to build an image library with relatively balanced category samples, data enhancement must be performed on several of the existing categories. The physical meaning of the frequency height map means that general-purpose sample expansion methods such as rotation and cropping cannot simply be applied; data enhancement of the samples must respect the physical meaning carried by the frequency height map. Image analysis and discussion with the relevant researchers showed that diffuse reflection of incident radio waves by fine plasma structures, as well as the operating voltage of the ionosonde, short-lived local weather and other causes, can produce different types of noise on the frequency height map. The scheme therefore simulates the influence of these natural phenomena on the ionospheric frequency height map by introducing different types of random noise, expands the sample data of each spread F category, and finally establishes a relatively balanced frequency height map sample library. The subsequent model training results also verify the effectiveness of the sample enhancement algorithm proposed in this scheme.
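The time conversion mentioned above, from a day-of-year record to a year-month-day-hour-minute-second stamp comparable with image file names such as 20050116124500, could look like the following sketch. The column order (date, start time, end time, duration) follows the claims; the assumptions that the year is supplied separately, that the first column is a plain day-of-year integer, that columns are whitespace-separated, and that an end time earlier than the start time means the event crosses midnight are all illustrative and not stated in the patent.

```python
from datetime import datetime, timedelta


def doy_to_datetime(year, day_of_year, hhmm):
    """Convert (year, day-of-year, HHMM on a 24-hour clock) into a datetime."""
    base = datetime(year, 1, 1) + timedelta(days=day_of_year - 1)
    return base.replace(hour=hhmm // 100, minute=hhmm % 100)


def parse_record(line, year):
    """Parse one row: day-of-year, start HHMM, end HHMM, duration HHMM."""
    fields = line.split()
    doy, start, end, duration = (int(x) for x in fields[:4])
    t_start = doy_to_datetime(year, doy, start)
    t_end = doy_to_datetime(year, doy, end)
    if t_end < t_start:  # assumed: the event runs past midnight
        t_end += timedelta(days=1)
    return t_start, t_end, duration


def stamp(t):
    """Format a datetime as a 14-digit year-month-day-hour-minute-second string."""
    return t.strftime("%Y%m%d%H%M%S")
```

For instance, under these assumptions `stamp(parse_record("16 1245 1430 0145", 2005)[0])` gives a start stamp in the same 14-digit form as the image times quoted in the description.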
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the preferred embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the disclosed embodiments without departing from the spirit and scope of the present invention.

Claims (10)

1. A method for constructing an ionospheric frequency height map classification sample library, characterized by comprising the following steps:
S1, collecting all frequency height map images corresponding to the manually judged TXT identification files in a total folder;
S2, parsing the TXT identification files to obtain the categories of all the frequency height maps in the total folder;
S3, moving the frequency height maps whose categories have been determined into the respective classification folders;
and S4, constructing a sample library in which the categories are relatively balanced through an up-sampling and down-sampling mechanism.
2. The method for constructing the ionospheric frequency height map classification sample library according to claim 1, wherein step S2 specifically comprises the following steps:
S21, traversing all the frequency height map images in the total folder and obtaining the drawing time of each frequency height map image;
S22, reading the manually judged TXT identification files line by line and acquiring the frequency height map category information marked in them;
and S23, unifying the drawing time of the frequency height map images with the frequency height map times in the manually judged TXT identification files, and determining the category of each frequency height map according to the records in the TXT files.
3. The method for constructing the ionospheric frequency height map classification sample library according to claim 2, wherein: in step S21, the drawing time of each frequency height map image is in year-month-day-hour-minute-second format, and the initial category of every image is set to 0.
4. The method for constructing the ionospheric frequency height map classification sample library according to claim 2, wherein: in the TXT identification file of step S22, the first column records the date on which the spread F phenomenon occurred in day-of-year form, the second column is the start time of the spread F phenomenon recorded on the 24-hour clock, the third column is the end time of the spread F phenomenon recorded on the 24-hour clock, and the fourth column is the duration of the spread F phenomenon, wherein the first two digits of the fourth column are hours and the last two digits are minutes;
and when the time resolutions corresponding to the TXT identification files in step S22 differ, the categories are determined by the following steps:
S221, determining the number of frequency height map images in the current folder; if the number is less than or equal to a set value, calculating according to a first time resolution, and otherwise calculating according to a second time resolution;
S222, converting the day-of-year time record into a year-month-day-hour-minute-second time record according to the first column of the TXT identification file;
S223, obtaining the start time and end time of a given spread F event in year-month-day-hour-minute-second form according to the second and third columns of the TXT identification file;
S224, obtaining the duration of the spread F event according to the fourth column of the TXT identification file, and then estimating the number of images of the spread F event that may exist in the total folder according to the determined time resolution;
S225, marking the categories of the frequency height maps according to the TXT identification file;
and S226, repeating steps S221-S225 until all manually judged marks have been traversed.
5. The method for constructing the ionospheric frequency height map classification sample library according to claim 4, wherein: the set value in step S221 is 35040 = 24 × 4 × 365;
the first time resolution is 15 minutes and the second time resolution is 5 minutes.
6. The method for constructing the ionospheric frequency height map classification sample library according to claim 5, wherein: in step S224:
the formula for the estimated number of images of the spread F event at the first time resolution is: numberOfFlag = round(floor(duration(i)/100) × 60/15 + mod(duration(i), 100)/15);
the formula for the estimated number of images of the spread F event at the second time resolution is: numberOfFlag = round(floor(duration(i)/100) × 60/5 + mod(duration(i), 100)/5);
wherein duration(i) is the duration of the spread F event recorded in the i-th line of the TXT identification file.
7. The method for constructing the ionospheric frequency height map classification sample library according to claim 4, wherein: in step S225, when files are missing:
when traversing the TXT identification file line by line, if the frequency height map corresponding to the start position of a line exists and the frequency height map corresponding to the end position is missing, the marking range of the category is restricted to the estimated number of frequency height maps after the start position; if the year-month-day-hour-minute-second time in the file name of a frequency height map is less than that of the end position, the frequency height map is marked with the category of the corresponding line; if it is greater, no mark is made and the loop is exited;
if the frequency height map corresponding to the start position of the line is missing and the frequency height map corresponding to the end position exists, the marking range of the category is restricted to the estimated number of frequency height maps before the end position; if the year-month-day-hour-minute-second time in the file name of a frequency height map is greater than that of the start position, the frequency height map is marked with the category of the corresponding line; if it is less, no mark is made and the loop is exited;
and when neither the start position nor the end position has a corresponding frequency height map, the loop is terminated and processing proceeds directly to the judgment of the next record in the TXT identification file.
8. The method for constructing the ionospheric frequency height map classification sample library according to claim 1, wherein: in step S4, the order-of-magnitude differences between the numbers of frequency height map images in the categories are taken into account, and balancing is performed by introducing different types of noise: for categories whose number of images is higher than the set sample size, the frequency height maps are down-sampled; and for categories whose number of images is lower than the set sample size, data enhancement is performed on the frequency height maps.
9. The method for constructing the ionospheric frequency height map classification sample library according to claim 8, wherein the data enhancement specifically comprises the following steps:
firstly, copying the original frequency height map images to the sample library and defining n noise methods;
then, generating a random number in the range [1, n] to randomly determine the type of noise to add to the selected sample;
then, generating random numbers in the range [1, 5] to randomly determine the number of blocks, rows and columns to which noise is applied, and randomly determining the positions in the image at which the random block, row and column noise is added;
then, adding the corresponding random noise to the determined image blocks, rows and columns;
and adding a noise mark to the file name of the noise-added image, adding the noise-added image to the sample library of the corresponding category, and repeating the operation until the number of frequency height map samples of that category meets the requirement.
10. The method for constructing the ionospheric frequency height map classification sample library according to claim 9, wherein: the noise methods include at least a Gaussian method and a salt-and-pepper method;
the Gaussian method comprises a random Gaussian method, a row Gaussian method and a column Gaussian method;
the salt-and-pepper method comprises a random salt-and-pepper method, a row salt-and-pepper method and a column salt-and-pepper method;
in terms of hyperparameters, the noise density of the Gaussian method and the salt-and-pepper method is adjustable; the noise width of the row Gaussian, column Gaussian, row salt-and-pepper and column salt-and-pepper methods is adjustable; and the noise coverage area of the random Gaussian method and the random salt-and-pepper method is adjustable.
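As an illustration of the image-count estimate in claims 5 and 6 and the missing-file handling in claim 7, a minimal sketch is given below. It assumes that image times are 14-digit year-month-day-hour-minute-second strings kept in a sorted list, that the formula's round() rounds halves upward, and that the "number of frequency height maps" searched after the start or before the end is the estimated count; the function names and the return value are hypothetical.

```python
import math
from bisect import bisect_left

THRESHOLD = 24 * 4 * 365  # 35040: one year of images at one image per 15 minutes


def resolution_minutes(image_count):
    """Pick the time resolution in minutes from the image count in the folder."""
    return 15 if image_count <= THRESHOLD else 5


def number_of_flag(duration, resolution):
    """Estimate the image count for an HHMM duration; halves round up (assumed)."""
    hours, minutes = divmod(duration, 100)
    return math.floor(hours * 60 / resolution + minutes / resolution + 0.5)


def label_event(times, labels, start, end, n_est, category):
    """Label the images of one event in place and return how many were labeled.

    times: sorted list of timestamp strings; labels: same length, initially 0.
    """
    i = bisect_left(times, start)
    j = bisect_left(times, end)
    has_start = i < len(times) and times[i] == start
    has_end = j < len(times) and times[j] == end
    if has_start and has_end:
        picked = range(i, j + 1)
    elif has_start:
        # Look forward over at most n_est images and keep those before the end.
        # Since `times` is sorted, filtering is equivalent to leaving the loop.
        picked = [k for k in range(i, min(i + n_est, len(times))) if times[k] < end]
    elif has_end:
        # Look backward over at most n_est images and keep those after the start.
        picked = [k for k in range(max(j - n_est + 1, 0), j + 1) if times[k] > start]
    else:
        return 0  # neither boundary image exists: skip this record
    for k in picked:
        labels[k] = category
    return len(picked)
```

Fixed-width timestamp strings compare in chronological order, which is why plain string comparison is sufficient in this sketch.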
CN202211238852.5A 2022-10-11 2022-10-11 Method for constructing ionospheric frequency high-graph classification sample library Active CN115525786B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211238852.5A CN115525786B (en) 2022-10-11 2022-10-11 Method for constructing ionospheric frequency high-graph classification sample library

Publications (2)

Publication Number Publication Date
CN115525786A true CN115525786A (en) 2022-12-27
CN115525786B CN115525786B (en) 2024-02-20

Family

ID=84702208

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211238852.5A Active CN115525786B (en) 2022-10-11 2022-10-11 Method for constructing ionospheric frequency high-graph classification sample library

Country Status (1)

Country Link
CN (1) CN115525786B (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5218299A (en) * 1991-03-25 1993-06-08 Reinhard Dunkel Method for correcting spectral and imaging data and for using such corrected data in magnet shimming
JP2000307964A (en) * 1999-04-23 2000-11-02 Casio Comput Co Ltd Device and method for generating moving picture
US20170142358A1 (en) * 2015-11-12 2017-05-18 Canon Kabushiki Kaisha Image processing apparatus and image processing method
US20210201160A1 (en) * 2019-04-29 2021-07-01 Landmark Graphics Corporation Hybrid neural network and autoencoder
US20220092336A1 (en) * 2020-03-26 2022-03-24 Shenzhen Institutes Of Advanced Technology Adversarial image generation method, computer device, and computer-readable storage medium
CN112230205A (en) * 2020-10-16 2021-01-15 哈尔滨工程大学 Underwater target recognition system performance evaluation method using ship radiation noise simulation signal
CN112884003A (en) * 2021-01-18 2021-06-01 中国船舶重工集团公司第七二四研究所 Radar target sample expansion generation method based on sample expander
CN114715430A (en) * 2021-03-31 2022-07-08 中国科学院国家空间科学中心 System for multi-satellite automatic linear formation and time-varying baseline generation
CN113569632A (en) * 2021-06-16 2021-10-29 西安电子科技大学 Small sample local surface slow-speed moving object classification method based on WGAN
CN113900137A (en) * 2021-07-30 2022-01-07 应急管理部国家自然灾害防治研究院 Data processing method and system for high-energy particle detector
CN114429151A (en) * 2021-12-16 2022-05-03 中南大学 Magnetotelluric signal identification and reconstruction method and system based on depth residual error network
CN115063624A (en) * 2022-05-05 2022-09-16 北京交通大学 Small sample classification learning method based on graph neural network
CN115508800A (en) * 2022-08-19 2022-12-23 中国科学院国家空间科学中心 Method and system for screening ionospheric frequency elevation map extension F phenomenon radar graph

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
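ZHENG W., PENGDONG G., ET AL.: "Automatic Detection and Classification of Spread-F From Ionosonde at Hainan With Image-Based Deep Learning Method", SPACE WEATHER, pages 1 - 14 *
李键红; 吴亚榕; 吕巨建: "Single-frame image super-resolution reconstruction based on self-similarity and multi-task Gaussian process regression", Optics and Precision Engineering (光学精密工程), no. 11, 15 November 2018 (2018-11-15), pages 205 - 217 *
王海文; 邱晓晖: "An image data augmentation method based on generative adversarial networks", Computer Technology and Development (计算机技术与发展), no. 03, 5 December 2019 (2019-12-05), pages 57 - 62 *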

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant