CN116975008B - Ship meteorological monitoring data optimal storage method - Google Patents


Info

Publication number
CN116975008B
Authority
CN
China
Prior art keywords
data
preset
analyzed
amplitude
compression
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311226338.4A
Other languages
Chinese (zh)
Other versions
CN116975008A (en
Inventor
李微
崔冬雪
胡芮宁
吴勇
汪新闻
卞军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Hailianzhi Information Technology Co ltd
Original Assignee
Qingdao Hailianzhi Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Hailianzhi Information Technology Co ltd
Priority to CN202311226338.4A
Publication of CN116975008A
Application granted
Publication of CN116975008B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1727 Details of free space management performed by the file system

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention relates to the technical field of data compression storage structures, in particular to a ship meteorological monitoring data optimal storage method. According to the method, the real-time data points of ship meteorological monitoring data are taken as data points to be analyzed, the influence of historical time sequence data in a preset historical neighborhood on real-time data fluctuation analysis is analyzed, a time weighting factor is obtained, and the importance degree of the data points to be analyzed is obtained by combining amplitude fluctuation conditions; acquiring reference weights of importance degrees according to amplitude levels and variation trends of all data points in a preset history neighborhood, and further acquiring weighted importance degrees; and obtaining an acceptable compression size of the data point to be analyzed, constructing an initial compression sub-block with a preset size, continuously increasing the preset size according to the cut-off condition until an adjusted compression sub-block is obtained, and iteratively dividing all data into proper compression sub-blocks for compression storage. The invention adjusts the size of the compressed sub-block according to the importance degree of the data, improves the compression precision and reduces the occupation of the storage space.

Description

Ship meteorological monitoring data optimal storage method
Technical Field
The invention relates to the technical field of data compression storage structures, in particular to a ship meteorological monitoring data optimal storage method.
Background
Ship meteorological monitoring data refers to the various meteorological data collected by a ship during a voyage, which are critical to navigation safety and voyage planning. However, because the data are collected at high frequency and in large volume, they occupy a large amount of storage space and place high demands on real-time processing. The data therefore need to be compressed efficiently to reduce storage space and improve compression efficiency.
When a traditional data compression method compresses meteorological monitoring data, the data are generally divided into compression sub-blocks whose size is set by the periodic characteristics of the data or by a fixed value, and the data in each sub-block are compressed to improve compression efficiency. However, the degree of data fluctuation differs between sub-blocks of different sizes. When data that change smoothly and show high similarity or repetition are compressed with short sub-blocks, the compression effect is poor; when data that contain sharply changing segments are compressed with long sub-blocks, compression accuracy drops and the compression effect deteriorates. The traditional way of choosing the sub-block size therefore tends to give poor compression and wastes storage space.
Disclosure of Invention
In order to solve the technical problems of poor compression effect and storage space waste caused by improper size selection of compressed sub-blocks, the invention aims to provide a ship meteorological monitoring data optimized storage method, which adopts the following specific technical scheme:
the invention provides a ship meteorological monitoring data optimizing and storing method, which comprises the following steps:
acquiring a time sequence of ship meteorological monitoring data in real time;
in the time sequence, taking the real-time data point as the data point to be analyzed; analyzing the influence of the historical time sequence data points in the preset history neighborhood of the data point to be analyzed on the amplitude fluctuation analysis of the real-time data point, and obtaining a time weighting factor; acquiring the importance degree of the data point to be analyzed in the preset history neighborhood according to the time weighting factor and the amplitude fluctuation of all data points in the preset history neighborhood of the data point to be analyzed; acquiring the reference weight of the importance degree according to the amplitude level and the amplitude change trend of the data points in the preset history neighborhood; acquiring the weighted importance degree of the data point to be analyzed in the time sequence according to the importance degree and the reference weight;
acquiring an acceptable compression size of the data point to be analyzed according to the weighted importance degree; constructing an initial compression sub-block according to a preset size by taking the data point to be analyzed as a starting point, sequentially increasing the preset size, and judging whether the initial compression sub-block reaches a preset cut-off condition according to the acceptable compression size of the data point in the initial compression sub-block; until the cut-off condition is met, a compressed sub-block is obtained, a previous point of the end point of the compressed sub-block is used as a new data point to be analyzed, and all the adjusted compressed sub-blocks in the time sequence are obtained in an iterative mode;
and compressing and storing the data in each compressed sub-block according to a preset compression step length.
Further, the acquiring the time weighting factor includes:
obtaining the ranking ordinal of each historical time sequence data point in the preset history neighborhood, multiplying the ranking ordinal by a preset first positive parameter, and normalizing the result to obtain the time weighting factor.
Further, the obtaining the importance degree of the data point to be analyzed in the preset history neighborhood includes:
acquiring an amplitude difference value between the amplitude of each historical time sequence data point and the average value of the amplitudes of all data points in the preset historical neighborhood;
multiplying the time weighting factor by the amplitude difference value and then squaring to obtain a weighted amplitude difference; and obtaining and normalizing the average value of the weighted amplitude differences of all the historical time sequence data points in the preset historical neighborhood to obtain the importance degree of the data points to be analyzed in the preset historical neighborhood.
Further, the obtaining the reference weight of the importance degree includes:
in the preset historical neighborhood of the data point to be analyzed, the average value of the amplitude values of all the data points is differenced with a preset threshold value to obtain an amplitude abnormal difference value, the amplitude increment among all the adjacent data points is obtained, and the amplitude increment variance of the amplitude increment is obtained;
adding a preset second positive parameter to the amplitude increment between the data point to be analyzed and the previous adjacent data point on the time sequence to obtain an amplitude increment reference index of the data point to be analyzed; multiplying the amplitude increment reference index by the amplitude increment variance to obtain an amplitude variation trend index;
dividing the amplitude abnormal difference value by the amplitude variation trend index, and mapping the quotient through an exponential function to obtain an amplitude abnormal fluctuation index; normalizing the amplitude abnormal fluctuation index to obtain the reference weight of the importance degree.
Further, the acquiring the weighted importance degree of the data points to be analyzed in the time sequence includes:
multiplying the importance degree of the data point to be analyzed by the corresponding reference weight to obtain the weighted importance degree of the data point to be analyzed.
Further, the obtaining an acceptable compressed size of the data point to be analyzed according to the weighted importance degree includes:
subtracting a preset third positive parameter from the weighted importance degree of the data point to be analyzed to obtain an adjusted weighted importance degree, multiplying the adjusted weighted importance degree by a preset fourth positive parameter, performing negative correlation mapping and normalization to obtain an acceptable compression index of the data point to be analyzed, and rounding the acceptable compression index down to obtain the acceptable compression size.
Further, the cutoff condition includes:
acquiring the minimum acceptable compression size of all the acceptable compression sizes in the initial compression sub-block;
when the minimum acceptable compression size in the initial compression sub-block is less than or equal to the size of the initial compression sub-block, the cut-off condition is satisfied; wherein when the minimum acceptable compressed size within the initial compressed sub-block is smaller than the size of the initial compressed sub-block, the result of the previous increase of the corresponding initial compressed sub-block is required as the corresponding compressed sub-block.
Further, the method for increasing the preset size of the initial compressed sub-block includes:
and adding a preset number of data points into the initial compressed subblock along the time sequence reverse direction of the time sequence.
Further, the compression method for compressing and storing the data in each compression sub-block according to the preset compression step is differential compression.
Further, the preset history neighborhood includes:
and taking each data point to be analyzed as a time sequence end point, searching forward in time sequence, and obtaining the preset history neighborhood according to a preset length.
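The sub-block growth and cut-off rules listed above can be illustrated with a short sketch. It is only one possible reading of those rules, not the patented implementation: the function name `partition_blocks`, the parameters `initial_size` and `step`, and the choice to keep the initial block when it already exceeds the smallest acceptable size are all assumptions made for illustration. The per-point acceptable compression sizes are taken as given input.

```python
from typing import List, Sequence, Tuple


def partition_blocks(
    acceptable: Sequence[int],
    initial_size: int = 1,   # hypothetical preset initial block size
    step: int = 1,           # hypothetical number of points added per growth
) -> List[Tuple[int, int]]:
    """Split a time-ordered series into sub-blocks, newest point first.

    acceptable[k] is the acceptable compression size of point k.
    Each block is returned as (oldest_index, newest_index), inclusive.
    """
    if initial_size < 1 or step < 1:
        raise ValueError("initial_size and step must be positive")
    blocks: List[Tuple[int, int]] = []
    anchor = len(acceptable) - 1          # data point to be analyzed
    while anchor >= 0:
        available = anchor + 1            # points left, anchor included
        size = min(initial_size, available)
        previous = None                   # size before the latest growth
        while True:
            smallest = min(acceptable[anchor - size + 1 : anchor + 1])
            if smallest < size:
                # Overshoot: fall back to the previous growth result.
                # Assumption: with no previous result, keep the initial block.
                if previous is not None:
                    size = previous
                break
            if smallest == size or size == available:
                break                     # cut-off met, or no older data left
            previous = size
            # Grow the block in the reverse time direction.
            size = min(size + step, available)
        blocks.append((anchor - size + 1, anchor))
        anchor -= size                    # point just before the block end
    return blocks
```

For example, with acceptable sizes `[3, 3, 3, 1, 2, 2]` the newest two points form one block, the point with acceptable size 1 stays alone, and the three oldest points share one block.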
The invention has the following beneficial effects:
According to the embodiment of the invention, the time sequence of the ship meteorological monitoring data is obtained in real time, the real-time data point is taken as the data point to be analyzed, and the time weighting factor is obtained according to the influence of the historical time sequence data in the preset history neighborhood of the data point to be analyzed on the fluctuation analysis of the real-time data. The importance degree of the data point to be analyzed is then obtained by further combining the amplitude fluctuation of the data. The reference weight of the importance degree is acquired according to the amplitude level and variation trend of all data points in the preset history neighborhood, which avoids mistaking gently fluctuating data at an abnormally high level for data of low importance. A weighted importance degree that reflects the fluctuation of the data point to be analyzed more accurately is then obtained from the reference weight and the importance degree. The acceptable compression size of the data point to be analyzed is acquired according to the weighted importance degree; it reflects the minimum compression precision that is acceptable when the data are compressed, and the lower the compression precision, the more the fluctuation characteristics of the data are blurred and lost during compression. An initial compression sub-block of preset size is constructed with the data point to be analyzed as the starting point, and the preset size is increased continuously until an adjusted compression sub-block meeting the cut-off condition is obtained; the point before the end point of the adjusted compression sub-block is taken as the new data point to be analyzed, all adjusted compression sub-blocks are obtained iteratively, and the data are then compressed.
The invention adjusts the size of the compressed sub-block according to the importance degree of the data point, divides the data into the compressed sub-blocks with proper size, and reduces the occupation of the storage space while improving the compression precision.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions and advantages of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are only some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a method for optimizing storage of ship meteorological monitoring data according to an embodiment of the present invention.
Detailed Description
In order to further describe the technical means and effects adopted by the invention to achieve the preset aim, the following description refers to the specific implementation, structure, characteristics and effects of the optimized storage method for ship meteorological monitoring data according to the invention by combining the accompanying drawings and the preferred embodiment. In the following description, different "one embodiment" or "another embodiment" means that the embodiments are not necessarily the same. Furthermore, the particular features, structures, or characteristics of one or more embodiments may be combined in any suitable manner.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
The specific scheme of the ship meteorological monitoring data optimizing and storing method provided by the invention is specifically described below with reference to the accompanying drawings.
Referring to fig. 1, a flowchart of a method for optimizing storage of ship meteorological monitoring data according to an embodiment of the present invention is shown, and the method includes the following steps:
Step S1, acquiring a time sequence of ship meteorological monitoring data in real time.
In order to optimally store the ship meteorological monitoring data, the meteorological data of the ship is monitored and collected in real time through a meteorological instrument, and then the collected ship meteorological monitoring data is analyzed, compressed and optimally stored. The ship meteorological monitoring data comprise key meteorological data such as temperature, humidity, wind speed, wind direction and the like of the ship, the meteorological data have important roles in ship safe sailing and route planning decision, and the safety and efficiency of ship sailing can be improved.
Each kind of meteorological data in the ship meteorological monitoring data is analyzed and processed in the same way, so only the analysis of the humidity data is described here.
In the embodiment of the invention, humidity data of the ship are acquired by a meteorological instrument at a preset sampling frequency, and a time sequence of the humidity data is constructed in the order of the time nodes at which the data were acquired, where each data point in the time sequence is the humidity value at the corresponding time node. In one embodiment of the present invention, the preset sampling frequency is once per second and the sampling period is 6 hours, so the length of the time sequence is 21600; in specific applications, the implementer sets these values according to the specific situation.
Step S2, in the time sequence, taking the real-time data point as the data point to be analyzed; analyzing the influence of the historical time sequence data points in the preset history neighborhood of the data point to be analyzed on the amplitude fluctuation analysis of the real-time data point, and obtaining a time weighting factor; acquiring the importance degree of the data point to be analyzed in the preset history neighborhood according to the time weighting factor and the amplitude fluctuation of all data points in the preset history neighborhood of the data point to be analyzed; acquiring the reference weight of the importance degree according to the amplitude level and the amplitude change trend of the data points in the preset history neighborhood; and acquiring the weighted importance degree of the data point to be analyzed in the time sequence according to the importance degree and the reference weight.
During a voyage, each item of meteorological data normally fluctuates steadily, and when the ship encounters severe meteorological conditions the acquired data fluctuate markedly. The greater the fluctuation of the data, the more the weather conditions are changing at the current moment and the more important the meteorological data are; the crew must pay closer attention and adjust the sailing plan to the current weather conditions to ensure safe navigation. So that accurate humidity fluctuation can be recovered after compression and decompression, the importance degree of each humidity data point is obtained first, a compression sub-block of suitable size is then obtained adaptively from these importance degrees, and all humidity data in the time sequence are divided into the corresponding compression sub-blocks, improving compression efficiency and compression effect.
To facilitate analysis of the fluctuation of each data point and thereby obtain its importance degree, in one embodiment of the present invention each data point to be analyzed is taken as the time sequence end point, the time sequence is searched forward toward earlier data, and the preset history neighborhood is obtained according to a preset length. The preset length $n$ is 1000; in a specific application, the implementer can set it according to the specific situation.
Note that when the preset history neighborhood is constructed, the time sequence available at the current moment may contain fewer than 1000 data points. In that case, the length of the preset history neighborhood is the number of data points in the current time sequence, and the fluctuation of the data point to be analyzed is analyzed within that neighborhood.
Considering that the reference value of historical data in the preset history neighborhood for analyzing the fluctuation of the real-time data decreases the earlier the data were acquired, the embodiment of the invention first takes the real-time data point as the data point to be analyzed, analyzes the influence of the historical time sequence data points in its preset history neighborhood on the amplitude fluctuation analysis of the real-time data point, and acquires the time weighting factor.
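The construction of the preset history neighborhood, including the shortened neighborhood at the start of the time sequence, might be sketched as follows. The function name `history_neighborhood` and the decision to include the data point to be analyzed as the last element are assumptions for illustration; the text only states that the data point is the end point of the search.

```python
from typing import List, Sequence


def history_neighborhood(
    series: Sequence[float],
    t: int,
    length: int = 1000,      # preset length used in the embodiment
) -> List[float]:
    """Return up to `length` points ending at index t (inclusive).

    If fewer than `length` points exist up to t, all of them are
    returned, so the neighborhood is simply shorter early on.
    """
    if not 0 <= t < len(series):
        raise IndexError("t is outside the time sequence")
    start = max(0, t - length + 1)   # clamp at the first sample
    return list(series[start : t + 1])
```

With a ten-point series and `length=3`, the neighborhood of index 4 is the points at indices 2 to 4; with the default length of 1000, the neighborhood of index 1 is just the first two points.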
Preferably, in one embodiment of the present invention, obtaining the time weighting factor includes obtaining a ranking number of each historical time series data point in a preset historical neighborhood, multiplying the ranking number by a preset first positive parameter, and normalizing to obtain the time weighting factor. The calculation formula of the time weighting factor is expressed as:
$$w_i = 1 - \exp(-\alpha \cdot i), \quad i = 1, 2, \dots, n$$

where $w_i$ denotes the time weighting factor describing the influence of the $i$-th historical time sequence data point on the amplitude fluctuation analysis of the data point to be analyzed; $\alpha$ is the preset first positive parameter; $i$ is the ranking ordinal of the historical time sequence data point in the preset history neighborhood of the data point to be analyzed; $n$ is the length of the preset history neighborhood; and $\exp(\cdot)$ is the exponential function with the natural constant $e$ as its base. In one embodiment of the invention, the preset first positive parameter $\alpha$ is 0.005; in a specific application, the implementer sets it according to the specific situation.
In the time weighting factor calculation formula, the ranking ordinal of each historical time sequence data point in the preset history neighborhood is multiplied by the preset first positive parameter and then mapped through an exponential function for normalization. Other normalization methods may be employed in other embodiments of the present invention, which is not limited herein. The smaller the ranking ordinal, the smaller the time weighting factor and the smaller the influence on the analysis of the fluctuation characteristics of the data point to be analyzed.
The greater the amplitude fluctuation of the data points, the greater the importance degree of the data point to be analyzed in the preset history neighborhood, and the more the current fluctuation characteristics of the data should be preserved during data compression.
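A minimal sketch of the time weighting step follows. The text only says that the ordinal multiplied by the first positive parameter is mapped through an exponential function for normalization; the specific form `1 - exp(-alpha * i)` used here is an assumption, chosen because it increases with the ordinal and stays between 0 and 1. The function name `time_weights` is hypothetical.

```python
import math
from typing import List


def time_weights(n: int, alpha: float = 0.005) -> List[float]:
    """Time weighting factors for ordinals 1..n (1 = oldest point).

    Assumed normalization: w_i = 1 - exp(-alpha * i). Older points
    (small i) get small weights; the newest point gets the largest.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return [1.0 - math.exp(-alpha * i) for i in range(1, n + 1)]
```

With the embodiment's values (`n = 1000`, `alpha = 0.005`), the oldest point receives a weight close to 0.005 and the newest a weight of `1 - exp(-5)`, so recent data dominate the fluctuation analysis.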
Preferably, in one embodiment of the present invention, obtaining the importance of the data point to be analyzed in the preset history neighborhood includes obtaining an amplitude difference between the amplitude of each history sequential data point and the average of the amplitudes of all the data points in the preset history neighborhood; multiplying the time weighting factor by the amplitude value difference, and then squaring to obtain a weighted amplitude value difference; and obtaining and normalizing the average value of the weighted amplitude differences of all the historical time sequence data points in the preset historical neighborhood to obtain the importance degree of the data points to be analyzed in the preset historical neighborhood. The calculation formula of the importance degree is expressed as:
$$Q = \mathrm{Norm}\left(\frac{1}{n}\sum_{i=1}^{n}\bigl(w_i\,(x_i - \bar{x})\bigr)^2\right)$$

where $Q$ denotes the importance degree of the data point to be analyzed; $w_i$ denotes the time weighting factor describing the influence of the $i$-th historical time sequence data point on the amplitude fluctuation analysis of the data point to be analyzed; $i$ is the ordinal of the historical time sequence data point; $n$ is the length of the preset history neighborhood; $x_i$ is the amplitude of the $i$-th historical time sequence data point; $\bar{x}$ is the mean of the amplitudes of all data points in the preset history neighborhood; and $\mathrm{Norm}(\cdot)$ is a standard normalization function.
In the importance degree calculation formula, the amplitude difference reflects how each data point in the preset history neighborhood of the data point to be analyzed fluctuates relative to the average amplitude level. The weighted amplitude difference additionally accounts for the influence of the historical time sequence data on the analysis of the real-time data point. Averaging the weighted amplitude differences of all data points in the preset history neighborhood therefore gives a more accurate picture of the average fluctuation of the humidity data in that neighborhood. The larger the fluctuation of the data, the more severe the change in weather conditions, and the more attention the crew needs to pay to the change in meteorological data at that moment so that the sailing plan can be adjusted in time to ensure safe navigation.
When the ship sails for a period of time under severe meteorological conditions, the meteorological data collected by the meteorological monitor change suddenly and sharply in the time sequence, after which the data amplitude stays at a high level and the data as a whole fluctuate gently. Consequently, when the importance degree of the real-time data point is analyzed within the preset history neighborhood, this abnormal condition, in which the amplitude stays at a high level but fluctuates gently, may be mistaken for gentle fluctuation of the data under normal conditions. On this basis, in order to accurately acquire the importance degree of the data point to be analyzed in the preset history neighborhood, the embodiment of the invention first acquires the reference weight of the importance degree according to the amplitude level and the amplitude change trend of the data points in the preset history neighborhood, and then acquires an accurate weighted importance degree from the reference weight and the importance degree.
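The importance computation described above can be sketched as the mean of squared, time-weighted deviations from the neighborhood mean. Two points are assumptions rather than statements of the source: the time weights use the form `1 - exp(-alpha * i)`, and the unspecified "standard normalization function" is replaced by a simple squashing function `squash`. Both function names are hypothetical.

```python
import math
from typing import Sequence


def raw_importance(neigh: Sequence[float], alpha: float = 0.005) -> float:
    """Mean of (w_i * (x_i - mean))^2 over the history neighborhood.

    neigh is in time order, so ordinal i = 1 is the oldest point.
    The result is not yet normalized.
    """
    n = len(neigh)
    if n == 0:
        raise ValueError("neighborhood must not be empty")
    mean = sum(neigh) / n
    total = 0.0
    for i, x in enumerate(neigh, start=1):
        w = 1.0 - math.exp(-alpha * i)   # assumed time weighting form
        total += (w * (x - mean)) ** 2
    return total / n


def squash(v: float) -> float:
    """Assumed stand-in for the normalization: maps [0, inf) to [0, 1)."""
    return 1.0 - math.exp(-v)
```

A flat neighborhood gives a raw importance of zero, while a neighborhood with large swings gives a larger value, so sharply changing humidity readings are ranked as more important than steady ones.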
Preferably, in one embodiment of the present invention, obtaining the reference weight of the importance degree includes, in a preset history neighborhood of the data point to be analyzed, making a difference between a mean value of magnitudes of all data points and a preset threshold value, obtaining a magnitude anomaly difference value, obtaining magnitude increments between all adjacent data points, and obtaining a magnitude increment variance of the magnitude increments; adding a preset second positive parameter to the amplitude increment between the data point to be analyzed and the previous adjacent data point on the time sequence to obtain an amplitude increment reference index of the data point to be analyzed; multiplying the amplitude increment reference index by the amplitude increment variance to obtain an amplitude variation trend index; dividing the amplitude abnormal difference value by the amplitude variation trend index, and mapping the amplitude abnormal difference value into an exponential function to obtain an amplitude abnormal fluctuation index; normalizing the amplitude abnormal fluctuation index to obtain the reference weight of the importance degree. The calculation formula of the reference weight is expressed as:
in (1) the->Reference weight indicating the importance of the data point to be analyzed, +.>Representing the average of the magnitudes of all data points in a preset history neighborhood, +.>Indicate->Amplitude of each historical time sequence data point, +.>Indicate->Ordinal number of each historical sequential data point in sequential sequence, < >>Representing the length of a preset history neighborhood, +.>Representing the ordinal number, +.>Indicate->Historical time series data point relative +.>Amplitude increment of historical time sequence data points, +.>Representing the average value of the amplitude increment among all adjacent data points in the preset history neighborhood, +.>Indicating the +.f. of the data point to be analyzed relative to the immediately preceding adjacent data point>Amplitude increment of historical time sequence data points, +.>Expressed as natural constant->An exponential function of the base +_>For the standard normalization function, +.>Representing a preset threshold value, < >>Representing a preset second positive parameter. In one embodiment of the invention, a threshold value is preset +.>Taking 70%, presetting second positive parameter ∈10%>Taking 1, in a specific application, the practitioner can set according to the specific situation.
In the calculation formula of the reference weight, the preset threshold is subtracted from the mean amplitude to obtain the amplitude abnormal difference value, which reflects whether the amplitudes of the data points in the preset history neighborhood are excessively high. When the mean amplitude is greater than the preset threshold, the overall average level of the data in the neighborhood is abnormally high, and the crew needs to pay attention to the change of the meteorological data at the current moment; otherwise, the judgment must also take the change trend of the data into account. The amplitude increment reference index is multiplied by the amplitude increment variance to obtain the amplitude variation trend index, which reflects the change trend of the data fluctuation in the preset history neighborhood. The amplitude abnormal fluctuation index reflects both the amplitude level and the change trend of the humidity data: the amplitude abnormal difference value is divided by the amplitude variation trend index and then mapped through the exponential function, so that the abnormal condition of the data is judged in combination with the fluctuation trend of the data points. When the amplitude abnormal difference value is greater than zero and large, a small amplitude variation trend index indicates that the data points stay at a high, abnormal amplitude level while fluctuating steadily, and the crew should pay attention to this abnormal condition; a large amplitude variation trend index indicates that the data points stay at a high, abnormal amplitude level while fluctuating more violently, and the crew should likewise pay attention to this abnormal condition.
When the amplitude abnormal difference value is less than zero and small, the ship is under relatively safe meteorological conditions. If the amplitude variation trend index is large, the data points fluctuate strongly; even under relatively safe conditions, violent fluctuation of the meteorological data may indirectly indicate that the subsequent weather will become severe, so the crew needs to pay attention to the fluctuation. If the amplitude variation trend index is small, the sailing conditions of the ship at that moment are relatively safe.
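The reference weight described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `eps` guard against a zero trend index, the clamping of the exponent, and the use of min-max scaling as the "standard normalization function" are all assumptions made for illustration, and amplitudes are assumed to be humidity readings in percent so that the threshold 70% is written as `70.0`. The sketch does not resolve what should happen when the trend index is negative.

```python
import math

def raw_fluctuation_index(window, threshold=70.0, beta=1.0, eps=1e-9):
    """Un-normalized amplitude abnormal fluctuation index for one history
    neighborhood (hypothetical helper; the last element of `window` is the
    data point to be analyzed)."""
    mean_amp = sum(window) / len(window)
    # Amplitude increments between all adjacent data points.
    incs = [b - a for a, b in zip(window, window[1:])]
    mean_inc = sum(incs) / len(incs)
    var_inc = sum((d - mean_inc) ** 2 for d in incs) / len(incs)
    # Amplitude increment reference index times amplitude increment variance.
    trend = (incs[-1] + beta) * var_inc
    if abs(trend) < eps:  # assumption: avoid division by zero
        trend = eps
    exponent = (mean_amp - threshold) / trend
    exponent = max(-50.0, min(50.0, exponent))  # assumption: avoid overflow
    return math.exp(exponent)

def min_max_normalize(values):
    """Assumed stand-in for the 'standard normalization function'."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

low = [40, 42, 41, 43, 45]    # mean amplitude below the threshold
high = [80, 82, 81, 83, 85]   # same fluctuation, mean above the threshold
weights = min_max_normalize([raw_fluctuation_index(low),
                             raw_fluctuation_index(high)])
```

Both example windows have identical increments, so only the amplitude abnormal difference value distinguishes them, and the window above the threshold receives the larger reference weight.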
After the importance degree and the reference weight of the data point to be analyzed are acquired, the weighted importance degree of the data point to be analyzed in the time sequence is obtained from them. This reduces the chance that an abnormal condition, in which the amplitude stays at a high level but fluctuates gently, is mistaken for the gentle fluctuation of normal data, and it yields a more accurate weighted importance degree.
Preferably, in one embodiment of the present invention, obtaining the weighted importance of the data points to be analyzed in the time sequence includes multiplying the importance of the data points to be analyzed with the corresponding reference weights to obtain the weighted importance of the data points to be analyzed. The calculation formula of the weighted importance degree is expressed as:
$$Q = W \times Z$$

where $Q$ denotes the weighted importance degree of the data point to be analyzed, $W$ denotes the reference weight of the importance degree of the data point to be analyzed, and $Z$ denotes the importance degree of the data point to be analyzed.
In the weighted importance degree calculation formula, the reference weight and the importance degree are combined through multiplication to represent that the reference weight and the importance degree have positive correlation with the weighted importance degree, and when the reference weight is larger, the weighted importance degree of the data point to be analyzed is larger; the greater the importance, the greater the weighted importance of the data points to be analyzed.
In the process of acquiring the time sequence of the ship meteorological monitoring data in real time, acquiring the weighted importance degree of the data point to be analyzed corresponding to each moment according to the calculation formula of the weighted importance degree.
Step S3, obtaining an acceptable compression size of the data point to be analyzed according to the weighted importance degree; constructing an initial compressed sub-block of a preset size with the data point to be analyzed as the starting point, sequentially increasing the preset size, and judging, according to the acceptable compression sizes of the data points in the initial compressed sub-block, whether the initial compressed sub-block reaches a preset cut-off condition; once the cut-off condition is met, obtaining a compressed sub-block, taking the point preceding the end point of the compressed sub-block as a new data point to be analyzed, and iteratively obtaining all the adjusted compressed sub-blocks in the time sequence.
In data compression, it is important to select an appropriate compressed sub-block size. A larger compressed sub-block reduces the occupied storage space and the decompression time, while a smaller compressed sub-block preserves the accuracy and detail of the data. In the embodiment of the invention, the acceptable compression size is determined from the weighted importance degree of the data point to be analyzed. The lower the importance degree of a data point, the more gently the data fluctuate and the lower the overall amplitude level, so the larger the acceptable compression size of the data point, which captures the overall trend better and saves storage space. The higher the importance degree of a data point, the more violently the data fluctuate and the higher the overall abnormal amplitude level, so the smaller the acceptable compression size, which better preserves the detail and accuracy of the data.
Preferably, in an embodiment of the present invention, obtaining the acceptable compression size of the data point to be analyzed according to the weighted importance degree includes: subtracting a preset third positive parameter from the weighted importance degree of the data point to be analyzed to obtain an adjusted weighted importance degree; multiplying the adjusted weighted importance degree by a preset fourth positive parameter, then performing negative correlation mapping and normalization to obtain an acceptable compression index of the data point to be analyzed; and rounding the acceptable compression index down to obtain the acceptable compression size. The calculation formula for the acceptable compression size is expressed as:
$$S=\left\lfloor \exp\left(-d\,(Q-c)\right)\right\rfloor$$

where $S$ denotes the acceptable compression size of the data point to be analyzed, $Q$ denotes the weighted importance degree of the data point to be analyzed, $\exp(\cdot)$ denotes the exponential function with the natural constant $e$ as its base, $\lfloor\cdot\rfloor$ is the round-down function, $c$ is the preset third positive parameter, and $d$ is the preset fourth positive parameter. In one embodiment of the invention, the preset third positive parameter $c$ is 1 and the preset fourth positive parameter $d$ is 4; in a specific application, the implementer may set them according to the specific situation.
In the calculation formula of the acceptable compression size, subtracting the preset third positive parameter from the weighted importance degree shifts the domain of the exponential function to the right by one unit, which widens the range of the exponential function and yields a richer set of acceptable compression size values. Rounding the resulting exponential function value down gives an integer size, which preserves the compression accuracy and effect and simplifies the subsequent division into compressed sub-blocks.
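As a worked illustration of the two formulas above, the following Python sketch multiplies the reference weight by the importance degree and maps the result to an acceptable compression size. The function names and the parameter names `c` and `d` are chosen here for illustration; the defaults follow the embodiment values 1 and 4, and the weighted importance degree is assumed to lie in [0, 1].

```python
import math

def weighted_importance(reference_weight, importance):
    """Weighted importance degree: reference weight times importance degree."""
    return reference_weight * importance

def acceptable_size(q, c=1.0, d=4.0):
    """Acceptable compression size: floor(exp(-d * (q - c))).

    q is the weighted importance degree, c the preset third positive
    parameter, d the preset fourth positive parameter.
    """
    return math.floor(math.exp(-d * (q - c)))

print(acceptable_size(1.0))   # 1  (most important point: smallest block)
print(acceptable_size(0.5))   # 7
print(acceptable_size(0.0))   # 54 (least important point: largest block)
```

Under these assumptions the acceptable compression size ranges from 1 to 54, so a more important data point always tolerates a smaller compressed sub-block.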
The acceptable compression size of a data point is the largest compressed sub-block size at which the data point can be compressed without losing its fluctuation details, so that the compression effect is improved as much as possible. The acceptable compression size of the data point to be analyzed is obtained with the above calculation formula, and the data in the time sequence are divided into compressed sub-blocks of suitable size accordingly. In the embodiment of the invention, an initial compressed sub-block of a preset size is first constructed with the data point to be analyzed as the starting point; the preset size is then increased step by step, and whether the initial compressed sub-block reaches the preset cut-off condition is judged from the acceptable compression sizes of the data points in the initial compressed sub-block. The size of the initial compressed sub-block is adjusted in this way until the cut-off condition is met, at which point the adjustment stops. In one embodiment of the invention, the preset size of the initial compressed sub-block is set to 1; in a specific application, the implementer may set it according to the specific situation.
Preferably, in one embodiment of the present invention, it is considered that the minimum acceptable compression size among all data points in an initial compressed sub-block reflects the compression accuracy that the data points can tolerate when compressed within that sub-block. When the minimum acceptable compression size is smaller than the size of the initial compressed sub-block, the data point with the minimum acceptable compression size would lose its fluctuation details and characteristics when compressed, and the compression effect would be poor. In that case the initial compressed sub-block should stop growing, the data point with the minimum acceptable compression size should be removed from it, and that data point should be assigned to the next initial compressed sub-block, where the growth judgment is carried out again. Based on this, the cut-off condition includes: obtaining the minimum acceptable compression size among all acceptable compression sizes within the initial compressed sub-block; when the minimum acceptable compression size in the initial compressed sub-block is less than or equal to the size of the initial compressed sub-block, the cut-off condition is satisfied; wherein, when the minimum acceptable compression size within the initial compressed sub-block is smaller than the size of the initial compressed sub-block, the initial compressed sub-block as it was before the latest increase is taken as the corresponding compressed sub-block.
The preset size of the initial compressed sub-block is increased step by step until the cut-off condition is met, which yields the first adjusted compressed sub-block in the time sequence. To divide all the data in the time sequence into compressed sub-blocks of suitable size, the point preceding the end point of the adjusted compressed sub-block is taken as a new data point to be analyzed, and all the adjusted compressed sub-blocks in the time sequence are obtained iteratively.
Preferably, in one embodiment of the present invention, the method of increasing the preset size of the initial compressed sub-block includes adding a preset number of data points to the initial compressed sub-block along the reverse time direction of the time sequence. In the embodiment of the present invention, the preset number is set to 1; in a specific application, the implementer may set it according to the specific situation.
All the adjusted compressed sub-blocks in the time sequence are obtained through iteration, and each data point is assigned to a suitable compressed sub-block according to its importance degree and acceptable compression size. Data points of high importance, that is, meteorological data that require the crew's close attention, are divided into shorter compressed sub-blocks to improve compression accuracy and retain the important fluctuation characteristics of the original data; meteorological data of low importance, which require little attention from the crew, are divided into longer compressed sub-blocks to reduce the storage space occupied by the data.
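The sub-block construction described above can be sketched as follows. This is an illustrative reading of the cut-off condition, not a definitive implementation: the function name `partition_blocks` and the representation of blocks as inclusive `(start, end)` index pairs are assumptions, the initial size and growth step are both 1 as in the embodiment, and every acceptable compression size is assumed to be at least 1.

```python
def partition_blocks(sizes):
    """Split a time sequence into compressed sub-blocks.

    sizes[i] is the acceptable compression size of data point i, in time
    order. Partitioning starts at the latest point and grows each block
    backwards in time. Returns inclusive (start, end) pairs in time order.
    """
    blocks = []
    end = len(sizes) - 1            # latest point is analyzed first
    while end >= 0:
        start = end
        length = 1                  # preset initial size
        min_size = sizes[end]
        # Grow while the minimum acceptable size still exceeds the block size.
        while min_size > length and start > 0:
            candidate_min = min(min_size, sizes[start - 1])
            if candidate_min < length + 1:
                break               # growth would lose detail: keep previous result
            start -= 1
            length += 1
            min_size = candidate_min
        blocks.append((start, end))
        end = start - 1             # point preceding the block becomes the new start
    blocks.reverse()
    return blocks

print(partition_blocks([5, 5, 5, 1, 3, 3, 3]))
# [(0, 2), (3, 3), (4, 6)]
```

In the example, the point with acceptable size 1 ends up alone in its own sub-block, while its neighbors with larger acceptable sizes are grouped into blocks of three points.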
Step S4, compressing and storing the data in each compressed sub-block according to a preset compression step length.
In the embodiment of the invention, in step S3, all data points in the time sequence are divided into compression sub-blocks with self-adaptive adjustment sizes, and ship weather monitoring data in each compression sub-block are compressed so as to optimize storage.
Preferably, in one embodiment of the present invention, the compression method for compressing and storing the data in each compressed sub-block according to the preset compression step is differential compression.
In one embodiment of the present invention, a differential compression algorithm is employed to compress the data within each compressed sub-block of the time sequence. Differential compression is a common data compression technique that uses the differences between data points to reduce the storage requirement; it is suitable for continuous data such as time-series data and retains the trend information of the data to a certain extent. Because the ship meteorological monitoring data are time-continuous data acquired by the meteorological monitor, differential compression is used to compress them. Since differential compression is prior art, it is not described in detail here; only the main steps of compressing the ship meteorological monitoring data by differential compression in one embodiment of the invention are outlined:
1. According to the preset step length, acquire the mean of the differential values among all ship meteorological monitoring data in each compressed sub-block, and take the mean of the differential values as the data to be compressed;
2. Compress the data to be compressed by run-length encoding to obtain the compressed sequence of the whole time-series data.
In the embodiment of the invention, the preset step length is set to 1; in a specific application, the implementer may set it according to the specific situation. It should be noted that run-length encoding is prior art and is not described in detail here.
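The two compression steps listed above can be sketched in Python as follows. The helper names are hypothetical, and two details are assumptions not stated in the text: a sub-block with too few points to form a difference is represented by 0.0, and the means are rounded to a fixed number of digits so that run-length encoding can merge equal neighbors.

```python
from itertools import groupby

def block_mean_diff(values, step=1):
    """Mean of the differences between points `step` apart within one block."""
    diffs = [values[i] - values[i - step] for i in range(step, len(values))]
    # Assumption: a block too short to have any difference is stored as 0.0.
    return sum(diffs) / len(diffs) if diffs else 0.0

def run_length_encode(seq):
    """Run-length encoding as (value, run length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(seq)]

def compress(series, blocks, step=1, digits=2):
    """Per-block mean difference followed by run-length encoding.

    blocks are inclusive (start, end) index pairs; rounding to `digits`
    is an assumption made so that equal means form runs.
    """
    means = [round(block_mean_diff(series[s:e + 1], step), digits)
             for s, e in blocks]
    return run_length_encode(means)

print(compress([1, 2, 3, 4, 5, 6], [(0, 2), (3, 5)]))
# [(1.0, 2)]
```

Because each sub-block is reduced to a single mean difference, the scheme is lossy inside a block, which is why the earlier steps keep the blocks short wherever the data are important.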
The compressed sequence is stored in the ship meteorological monitoring system. It occupies little storage space, and its compression accuracy is sufficient to reflect the fluctuation characteristics of the data accurately. This makes it convenient to analyze the meteorological data collected in real time, so that the crew can notice abnormal fluctuations and adjust the sailing plan appropriately, improving sailing safety and efficiency.
In summary, the embodiment of the invention acquires the time sequence of the ship meteorological monitoring data in real time; taking the real-time data point as a data point to be analyzed, and obtaining a time weighting factor according to the influence of the historical time sequence data in the preset historical neighborhood of the data point to be analyzed on the real-time data fluctuation analysis; further combining the data amplitude fluctuation to obtain the importance degree of the data point to be analyzed; acquiring reference weights of importance degrees according to amplitude levels and variation trends of all data points in a preset historical neighborhood of the data points to be analyzed; acquiring a weighted importance degree which more accurately reflects the fluctuation condition of the data point to be analyzed according to the reference weight and the importance degree; acquiring an acceptable compression size of a data point to be analyzed according to the weighted importance degree; and constructing an initial compression sub-block with a preset size by taking a data point to be analyzed as a starting point, continuously increasing the preset size according to a cut-off condition until the adjusted compression sub-block meeting the cut-off condition is obtained, taking the point before the end point of the adjusted compression sub-block as a new data point to be analyzed, iteratively obtaining all the adjusted compression sub-blocks, and further compressing data. The invention adjusts the size of the compressed sub-block according to the importance degree of the data point, divides the data into the compressed sub-blocks with proper size, and reduces the occupation of the storage space while improving the compression precision.
It should be noted that: the sequence of the embodiments of the present invention is only for description, and does not represent the advantages and disadvantages of the embodiments. The processes depicted in the accompanying drawings do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.

Claims (5)

1. An optimized storage method for ship meteorological monitoring data, which is characterized by comprising the following steps:
acquiring a time sequence of ship meteorological monitoring data in real time;
in the time sequence, taking real-time data points as data points to be analyzed; analyzing the influence of the historical time sequence data points of the data points to be analyzed in a preset historical neighborhood on time sequence on the amplitude fluctuation analysis of the data points at real time, and obtaining a time weighting factor; acquiring importance degrees of the data points to be analyzed in the preset historical neighborhood according to the time weighting factors and amplitude fluctuation conditions of all the data points in the preset historical neighborhood of the data points to be analyzed; acquiring the reference weight of the importance degree according to the amplitude level and the amplitude change trend of the data points in the preset history neighborhood; acquiring the weighted importance degree of the data points to be analyzed in the time sequence according to the importance degree and the reference weight;
acquiring an acceptable compression size of the data point to be analyzed according to the weighted importance degree; constructing an initial compression sub-block according to a preset size by taking the data point to be analyzed as a starting point, sequentially increasing the preset size, and judging whether the initial compression sub-block reaches a preset cut-off condition according to the acceptable compression size of the data point in the initial compression sub-block; until the cut-off condition is met, a compressed sub-block is obtained, a previous point of the end point of the compressed sub-block is used as a new data point to be analyzed, and all the adjusted compressed sub-blocks in the time sequence are obtained in an iterative mode;
compressing and storing the data in each compressed sub-block according to a preset compression step length;
the obtaining the time weighting factor includes:
acquiring a sequence number of each historical time sequence data point in the preset historical neighborhood, multiplying the sequence number by a preset first positive parameter, and normalizing to obtain a time weighting factor;
the step of obtaining the importance degree of the data point to be analyzed in the preset history neighborhood comprises the following steps:
acquiring an amplitude difference value between the amplitude of each historical time sequence data point and the average value of the amplitudes of all data points in the preset historical neighborhood;
multiplying the time weighting factor by the amplitude difference value and then squaring to obtain a weighted amplitude difference; acquiring and normalizing the average value of the weighted amplitude differences of all the historical time sequence data points in the preset historical neighborhood to obtain the importance degree of the data points to be analyzed in the preset historical neighborhood;
the obtaining of the reference weight of the importance degree comprises the following steps:
in the preset historical neighborhood of the data point to be analyzed, the average value of the amplitude values of all the data points is differenced with a preset threshold value to obtain an amplitude abnormal difference value, the amplitude increment among all the adjacent data points is obtained, and the amplitude increment variance of the amplitude increment is obtained;
adding a preset second positive parameter to the amplitude increment between the data point to be analyzed and the previous adjacent data point on the time sequence to obtain an amplitude increment reference index of the data point to be analyzed; multiplying the amplitude increment reference index by the amplitude increment variance to obtain an amplitude variation trend index;
dividing the amplitude abnormity difference value by the amplitude variation trend index, and mapping the amplitude abnormity difference value into an exponential function to obtain an amplitude abnormity fluctuation index; normalizing the amplitude abnormal fluctuation index to obtain a reference weight of the importance degree;
the step of obtaining the weighted importance degree of the data points to be analyzed in the time sequence comprises the following steps:
multiplying the importance degree of the data point to be analyzed by the corresponding reference weight to obtain the weighted importance degree of the data point to be analyzed;
obtaining an acceptable compressed size of the data point to be analyzed according to the weighted importance level comprises:
subtracting a preset third positive parameter from the weighted importance degree of the data point to be analyzed to obtain an adjustment weighted importance degree, multiplying the adjustment weighted importance degree by a preset fourth positive parameter, performing negative correlation mapping and normalization to obtain an acceptable compression index of the data point to be analyzed, and rounding down the acceptable compression index to obtain an acceptable compression size.
2. The method for optimized storage of marine vessel meteorological monitoring data according to claim 1, wherein the cutoff condition comprises:
acquiring the minimum acceptable compression size of all the acceptable compression sizes in the initial compression sub-block;
when the minimum acceptable compression size in the initial compression sub-block is less than or equal to the size of the initial compression sub-block, the cut-off condition is satisfied; wherein when the minimum acceptable compressed size within the initial compressed sub-block is smaller than the size of the initial compressed sub-block, the result of the previous increase of the corresponding initial compressed sub-block is required as the corresponding compressed sub-block.
3. The optimized storage method of ship meteorological monitoring data according to claim 2, wherein the method for increasing the preset size of the initial compressed sub-block comprises:
adding a preset number of data points to the initial compressed sub-block along the reverse time direction of the time sequence.
4. The optimized storage method of ship meteorological monitoring data according to claim 1, wherein the compression method for compressing and storing the data in each compression sub-block according to a preset compression step is differential compression.
5. The optimized storage method of ship meteorological monitoring data according to claim 1, wherein the preset history neighborhood comprises:
taking each data point to be analyzed as a time sequence end point, searching forward in time sequence, and obtaining the preset history neighborhood according to a preset length.
CN202311226338.4A 2023-09-22 2023-09-22 Ship meteorological monitoring data optimal storage method Active CN116975008B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311226338.4A CN116975008B (en) 2023-09-22 2023-09-22 Ship meteorological monitoring data optimal storage method

Publications (2)

Publication Number Publication Date
CN116975008A CN116975008A (en) 2023-10-31
CN116975008B true CN116975008B (en) 2023-12-15

Family

ID=88483506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311226338.4A Active CN116975008B (en) 2023-09-22 2023-09-22 Ship meteorological monitoring data optimal storage method

Country Status (1)

Country Link
CN (1) CN116975008B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117650791B (en) * 2024-01-30 2024-04-05 苏芯物联技术(南京)有限公司 Welding history airflow data compression method integrating welding process mechanism
CN117814805A (en) * 2024-03-05 2024-04-05 自贡市第一人民医院 Intelligent processing method for data of clinical care equipment
CN117851414B (en) * 2024-03-07 2024-05-17 杭州永德电气有限公司 Lightning arrester aging test data storage method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114358192A (en) * 2022-01-06 2022-04-15 长安大学 Multi-source heterogeneous landslide data monitoring and fusing method
CN115659070A (en) * 2022-12-28 2023-01-31 鸿基骏业环保科技有限公司 Water flow data transmission method based on NB-IOT intelligent water meter
WO2023083454A1 (en) * 2021-11-11 2023-05-19 Huawei Technologies Co., Ltd. Data compression and deduplication aware tiering in a storage system
CN116552745A (en) * 2023-07-10 2023-08-08 中交第一航务工程局有限公司 Ship state monitoring method and device, electronic equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
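The Block LZSS Compression Algorithm; Wei-ling Chang et al.; 2009 Data Compression Conference (DCC 2009); p. 439 *
Research and Implementation of Data Compression Algorithms in Real-Time Databases (实时数据库中数据压缩算法的研究与实现); 胥胜林 (Xu Shenglin); 科技与企业 (06); pp. 100-101 *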

Also Published As

Publication number Publication date
CN116975008A (en) 2023-10-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant