CN115145494B - Disk capacity prediction system and method based on big data time sequence analysis - Google Patents

Info

Publication number
CN115145494B
Authority
CN
China
Prior art keywords
disk
calling
storage
early warning
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210961428.7A
Other languages
Chinese (zh)
Other versions
CN115145494A (en)
Inventor
李卓兵
李庆博
彭珊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Zhenyun Technology Co ltd
Original Assignee
Jiangsu Zhenyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Zhenyun Technology Co ltd filed Critical Jiangsu Zhenyun Technology Co ltd
Priority to CN202210961428.7A priority Critical patent/CN115145494B/en
Publication of CN115145494A publication Critical patent/CN115145494A/en
Application granted granted Critical
Publication of CN115145494B publication Critical patent/CN115145494B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604Improving or facilitating administration, e.g. storage management
    • G06F3/0607Improving or facilitating administration, e.g. storage management by facilitating the process of upgrading existing storage systems, e.g. for improving compatibility between host and storage device
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/3003Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3037Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system component is a memory, e.g. virtual memory, cache
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/30Monitoring
    • G06F11/34Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3452Performance evaluation by statistical analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Quality & Reliability (AREA)
  • Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Computing Systems (AREA)
  • Human Resources & Organizations (AREA)
  • Economics (AREA)
  • Evolutionary Biology (AREA)
  • Game Theory and Decision Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Development Economics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Mathematical Physics (AREA)
  • Human Computer Interaction (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Operations Research (AREA)
  • Tourism & Hospitality (AREA)
  • General Business, Economics & Management (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention discloses a disk capacity prediction system and method based on big data time sequence analysis, relating to the technical field of disk capacity prediction. The method comprises the following steps: retrieving storage call records; sorting out the storage call path corresponding to each storage call record; integrating the path information of each disk; collecting all storage call paths on each disk in order of their storage call record times; calculating the user's average call frequency and call bias rate for each disk on the computer device; calculating a storage management early warning value for each disk from the average call frequency and the call bias rate; analyzing the storage management early warning values of all disks on the computer device together, identifying the disks whose storage management early warning values are abnormal, and setting them as target disks under test; and predicting the disk capacity of each target disk under test.

Description

Disk capacity prediction system and method based on big data time sequence analysis
Technical Field
The invention relates to the technical field of disk capacity prediction, in particular to a disk capacity prediction system and method based on big data time sequence analysis.
Background
With the rapid development of Internet big data, large-scale data centers have increasingly become a necessity of modern society. Under the current trend, PB-level storage data centers are already common, and EB-level and even larger storage data centers are emerging; these masses of data pose a great challenge to disk capacity supervision in data centers.
Traditional disk monitoring can no longer meet current monitoring requirements and cannot make efficient, full use of disk capacity: some disks often sit idle for long periods while the resources of other disks are exhausted, so the storage of a data center is not used fully and reasonably, and resources are wasted.
These problems pose great challenges to the efficiency of disk usage in large-scale data centers, seriously hindering the further growth of their storage capacity and affecting economic benefits.
Disclosure of Invention
The invention aims to provide a disk capacity prediction system and method based on big data time sequence analysis, so as to solve the problems described in the Background.
To solve the above technical problems, the invention provides the following technical solution: a disk capacity prediction system and method based on big data time sequence analysis, the method comprising:
step S100: retrieving, from the historical application log of the computer device, all storage call records generated on the disks of the computer device in response to data storage instructions initiated by a user; extracting information from each storage call record; and, according to the extracted information, sorting out the storage call path corresponding to each storage call record;
step S200: integrating the path information of each disk; collecting all storage call paths on each disk in order of their storage call record times to obtain a storage call path sequence for each disk; calculating the user's average call frequency and call bias rate for each disk on the computer device; calculating a storage management early warning value for each disk from the user's average call frequency and call bias rate for that disk;
step S300: analyzing the storage management early warning values of all disks on the computer device together, identifying the disks whose storage management early warning values are abnormal, and setting them as target disks under test;
step S400: predicting the disk capacity of each target disk under test.
Further, step S100 includes:
step S101: retrieving, from the historical application log of the computer device, all storage call records generated on the disks of the computer device in response to data storage instructions initiated by a user;
step S102: capturing the target call disk in each storage call record; capturing the data to be stored corresponding to the data storage instruction in each storage call record, and acquiring the data information of the data to be stored, the data information including the data source (origin) of the data to be stored; the data source of the data to be stored includes application software on the computer device;
step S103: capturing, in each storage call record, the call mode $way_k$ by which the call to the target call disk was made, where k = 1 or k = 2; the call modes comprise the default call $way_1$ and the changed call $way_2$; if the call mode of the target call disk is the default call $way_1$, the target call disk is the storage disk originally pushed by the computer device to the user or to the corresponding application software in response to the storage instruction; if the call mode of the target call disk is the changed call $way_2$, the target call disk is a disk chosen as a replacement after the computer device had originally pushed a storage disk to the user or to the corresponding application software in response to the storage instruction;
step S104: numbering all disks on the computer device; and extracting the corresponding storage call path [formula omitted in source] from each storage call record generated on each disk of the computer device; one storage call path is extracted from each storage call record; the omitted symbol denotes the j-th storage call path, extracted from the j-th storage call record of the disk numbered i;
The storage call paths of each disk are sorted out as above in order to capture and monitor the computer user's call habits when storing data on each disk of the computer device. Because a computer device has several disks, users tend, out of habit, to store all data on one or a few of them; this burdens those disks with data storage while the utilization of the remaining disks is abnormally low. In that situation some disks sit idle for long periods while the resources of others are exhausted, so the storage of the data center is not used fully and reasonably, and resources are wasted.
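The path extraction of steps S101 to S104 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the source does not define a record format, so the record fields (`disk`, `origin`, `way`, `time`) and the names `StorageCallPath` and `build_path_sequences` are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCallPath:
    """One storage call path (hypothetical layout, one per call record)."""
    disk: int      # number i of the target call disk
    origin: str    # application software the data came from
    way: int       # 1 = default call, 2 = changed call
    time: float    # time of the storage call record


def build_path_sequences(records):
    """Group paths by disk number and order each group by record time."""
    sequences = defaultdict(list)
    for rec in records:
        if rec["way"] not in (1, 2):
            raise ValueError(f"unknown call mode: {rec['way']!r}")
        sequences[rec["disk"]].append(
            StorageCallPath(rec["disk"], rec["origin"], rec["way"], rec["time"])
        )
    for paths in sequences.values():
        paths.sort(key=lambda p: p.time)
    return dict(sequences)


# Made-up records purely for illustration.
records = [
    {"disk": 1, "origin": "editor", "way": 2, "time": 30.0},
    {"disk": 2, "origin": "browser", "way": 1, "time": 5.0},
    {"disk": 1, "origin": "browser", "way": 1, "time": 10.0},
]
sequences = build_path_sequences(records)
```

Sorting each group by time also yields the per-disk storage call path sequence that step S200 starts from.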
Further, step S200 includes:
step S201: calculating the user's average call frequency for each disk: [formula omitted in source]; where $t_{j+1}$ and $t_j$ denote the times corresponding to the (j+1)-th and j-th storage call paths in the storage call path sequence, respectively; $P_i$ denotes the user's average call frequency for the disk numbered i; $m_i$ denotes the total number of storage call paths on the disk numbered i;
step S202: extracting, from all storage call paths of each disk, the set of source application software $\{S_1, S_2, \ldots, S_n\}$ corresponding to that disk, where $S_1, S_2, \ldots, S_n$ denote the 1st, 2nd, ..., n-th application software extracted from the disk's storage call path sequence; classifying the n application software items by the call mode in the storage call paths, obtaining $n_1$ first-class application software items whose call mode is the default call $way_1$ and $n_2$ second-class application software items whose call mode is the changed call $way_2$, with $n_1 + n_2 = n$;
step S203: querying the call modes of the $n_2$ second-class application software items in the storage call paths of the other disks; supposing that $n_3$ of the $n_2$ second-class application software items also have the changed call $way_2$ as their call mode in the storage call paths of other disks, i.e., they are second-class application software there as well; calculating the user's call bias rate for each disk: [formula omitted in source]; where $R_i$ denotes the user's call bias rate for the disk numbered i, and M denotes the total number of application software items on the computer device;
step S204: calculating the storage management early warning value of each disk, $SOS_i = P_i \cdot R_i$, where $SOS_i$ is the storage management early warning value indicating that the user's storage management of the disk numbered i is imbalanced.
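A hedged Python sketch of steps S201 to S204 follows. Only the product $SOS_i = P_i \cdot R_i$ is taken from the text; the formulas for $P_i$ and $R_i$ did not survive in the source, so the two used here (calls per unit of elapsed time, and the share of all M applications redirected only to this disk) are assumptions chosen to match the variable definitions, and all function names are hypothetical.

```python
def average_call_frequency(times):
    """ASSUMPTION: P_i = (m_i - 1) / sum of gaps (t_{j+1} - t_j)."""
    ordered = sorted(times)
    elapsed = sum(b - a for a, b in zip(ordered, ordered[1:]))
    if elapsed <= 0:
        return 0.0  # fewer than two calls, or all at the same instant
    return (len(ordered) - 1) / elapsed


def count_n2_n3(second_class_by_disk, disk):
    """n2: second-class apps on `disk`; n3: those also second-class elsewhere."""
    own = second_class_by_disk[disk]
    elsewhere = set()
    for other, apps in second_class_by_disk.items():
        if other != disk:
            elsewhere |= apps
    return len(own), len(own & elsewhere)


def call_bias_rate(n2, n3, total_apps):
    """ASSUMPTION: R_i = (n2 - n3) / M; the source formula is missing."""
    if total_apps <= 0 or not 0 <= n3 <= n2:
        raise ValueError("need M > 0 and 0 <= n3 <= n2")
    return (n2 - n3) / total_apps


def early_warning_value(p_i, r_i):
    """SOS_i = P_i * R_i, as stated in step S204."""
    return p_i * r_i


# Made-up inputs: apps with a changed call (way_2) on each disk.
second_class = {1: {"a", "b", "c"}, 2: {"b"}, 3: {"d"}}
n2, n3 = count_n2_n3(second_class, 1)
p_1 = average_call_frequency([0, 10, 20, 40])
r_1 = call_bias_rate(n2, n3, total_apps=10)
sos_1 = early_warning_value(p_1, r_1)
```

Keeping the three quantities in separate functions makes it easy to swap in the actual formulas from the patent drawings without touching the product in step S204.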
Further, step S300 includes:
step S301: extracting the storage management early warning values of all disks on the computer device; setting a first early warning threshold and a second early warning threshold, the first early warning threshold being far greater than the second; extracting the disks whose storage management early warning value is greater than the first early warning threshold and the disks whose storage management early warning value is less than the second early warning threshold;
step S302: setting the disks whose storage management early warning value is greater than the first early warning threshold as first-type disks, and the disks whose storage management early warning value is less than the second early warning threshold as second-type disks; when, among all disks, the second-class application software set [symbol omitted in source] of some first-type disk and the first-class application software set [symbol omitted in source] of some second-type disk satisfy [formula omitted in source], where $C_{threshold}$ denotes the coincidence rate threshold, judging that the storage management early warning values of that first-type disk and that second-type disk are abnormal; and setting that first-type disk and that second-type disk as target disks under test;
The screening of target disks under test starts from the correlation between the storage of different disks. If some disks sit idle for long periods while the resources of others are exhausted, these are the disks the user manages in an imbalanced way, and they usually offset one another: the overused disks need their data storage burden reduced, and the long-idle disks should be put to reasonable use.
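The screening in steps S301 and S302 can be illustrated with the Python sketch below. The two-threshold classification follows the text; the coincidence rate formula is missing from the source, so the Jaccard overlap and the `>=` comparison used here are assumptions, and every name and number is hypothetical.

```python
def classify_disks(sos_by_disk, first_threshold, second_threshold):
    """Split disks into first-type (overused) and second-type (idle)."""
    if first_threshold <= second_threshold:
        raise ValueError("first threshold must exceed the second")
    first_type = sorted(d for d, s in sos_by_disk.items() if s > first_threshold)
    second_type = sorted(d for d, s in sos_by_disk.items() if s < second_threshold)
    return first_type, second_type


def coincidence_rate(a, b):
    """ASSUMPTION: Jaccard overlap; the source formula is missing."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_target_disks(sos_by_disk, second_class, first_class,
                      first_threshold, second_threshold, c_threshold):
    """Pair overused and idle disks whose application sets coincide."""
    first_type, second_type = classify_disks(
        sos_by_disk, first_threshold, second_threshold)
    targets = set()
    for hot in first_type:
        for idle in second_type:
            rate = coincidence_rate(second_class.get(hot, set()),
                                    first_class.get(idle, set()))
            if rate >= c_threshold:  # ASSUMPTION: direction of comparison
                targets.update((hot, idle))
    return sorted(targets)


# Made-up early warning values and application sets.
sos = {1: 9.0, 2: 0.1, 3: 2.0}
second_class = {1: {"a", "b"}}       # apps redirected to disk 1
first_class = {2: {"a", "b", "c"}}   # apps whose default is disk 2
targets = find_target_disks(sos, second_class, first_class,
                            first_threshold=5.0, second_threshold=0.5,
                            c_threshold=0.5)
```

In this example disk 1 is overused, disk 2 is idle, and two of the three applications involved appear in both sets, so both disks become target disks under test.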
Further, step S400 includes:
step S401: polling the target disk under test at fixed time intervals to collect disk data; the disk data comprise the timestamp of each poll, the rate of change of the disk's capacity over the fixed time interval, and the ratio of the disk's used capacity to its total capacity; constructing a first time series chart of the capacity change rate against the timestamp and a second time series chart of the ratio of used capacity to total capacity against the timestamp;
step S402: testing the data of the first and second time series charts for stationarity with a unit root test, and differencing the data of a chart if the test shows them to be non-stationary;
step S403: predicting the disk capacity of the target disk under test based on the first and second time series charts; setting early warning values for the first and second time series charts of each target disk under test based on that disk's storage management early warning value; and reminding the user to clean up the disk in time, to expand it, or to balance calls across disks.
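The data collection of step S401 might look like the following Python sketch. It relies on the standard-library call `shutil.disk_usage`; the function names, the choice of bytes per second as the unit of the capacity change rate, and the sample readings are assumptions made for illustration, since the source does not fix them.

```python
import shutil
import time


def read_disk(path):
    """One poll: (timestamp, used bytes, total bytes) for the given path."""
    usage = shutil.disk_usage(path)
    return time.time(), usage.used, usage.total


def build_series(readings):
    """Turn raw polls into the two series described in step S401."""
    first_series = []   # (timestamp, capacity change rate in bytes/second)
    second_series = []  # (timestamp, used capacity / total capacity)
    for i, (ts, used, total) in enumerate(readings):
        second_series.append((ts, used / total))
        if i > 0:  # a rate needs a previous reading
            prev_ts, prev_used, _ = readings[i - 1]
            first_series.append((ts, (used - prev_used) / (ts - prev_ts)))
    return first_series, second_series


def poll(path, interval_s, count):
    """Poll `count` times at a fixed, positive interval."""
    readings = []
    for n in range(count):
        readings.append(read_disk(path))
        if n < count - 1:
            time.sleep(interval_s)
    return build_series(readings)


# Synthetic readings so the example does not depend on a real disk.
readings = [(0.0, 100, 1000), (60.0, 160, 1000), (120.0, 400, 1000)]
first_series, second_series = build_series(readings)
```

Separating `build_series` from `poll` lets the same series construction be applied to historical readings loaded from a log.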
To better implement the method, a disk capacity prediction system based on big data time sequence analysis is also provided, the system comprising: a storage call record extraction module, a storage call path sorting module, a path information integration module, a storage management early warning value calculation module, an abnormal disk judgment module and a disk capacity prediction module;
the storage call record extraction module is used for retrieving, from the historical application log of the computer device, all storage call records generated on the disks of the computer device in response to data storage instructions initiated by a user;
the storage call path sorting module is used for receiving the data of the storage call record extraction module and extracting information from each storage call record; and, according to the extracted information, sorting out the storage call path corresponding to each storage call record;
the path information integration module is used for receiving the data of the storage call path sorting module and integrating the path information of each disk; and collecting all storage call paths on each disk in order of their storage call record times to obtain a storage call path sequence for each disk;
the storage management early warning value calculation module is used for receiving the data of the path information integration module and calculating the user's average call frequency and call bias rate for each disk on the computer device; and calculating a storage management early warning value for each disk from the user's average call frequency and call bias rate for that disk;
the abnormal disk judgment module is used for receiving the data of the storage management early warning value calculation module, analyzing the storage management early warning values of all disks on the computer device together, identifying the disks whose storage management early warning values are abnormal, and setting them as target disks under test;
and the disk capacity prediction module is used for receiving the data of the abnormal disk judgment module and predicting the capacity of the target disks under test.
Further, the storage management early warning value calculation module comprises an average call frequency calculation unit, a call bias rate calculation unit and a storage management early warning value calculation unit;
the average call frequency calculation unit is used for receiving the data of the path information integration module and calculating the user's average call frequency for each disk on the computer device;
the call bias rate calculation unit is used for receiving the data of the path information integration module and calculating the user's call bias rate for each disk on the computer device;
and the storage management early warning value calculation unit is used for receiving the data of the average call frequency calculation unit and the call bias rate calculation unit and calculating the storage management early warning value of each disk.
Further, the storage call path sorting module comprises a record information extraction unit and a path sorting unit;
the record information extraction unit is used for receiving the data of the storage call record extraction module and extracting information from each storage call record;
and the path sorting unit is used for receiving the data of the record information extraction unit and sorting out the storage call path corresponding to each storage call record.
Compared with the prior art, the invention has the following beneficial effects: based on the user's storage call records for each disk on the computer device, the invention analyzes the user's storage call habits for each disk, identifies the disks on which the user's data storage management is imbalanced, and reduces the situation in which some disks of the computer device sit idle for long periods while the resources of others are exhausted; it predicts the capacity of the disks with imbalanced data storage management and intelligently reminds the user, so that the storage of the computer device's data center is used fully and reasonably and resource waste is reduced.
Drawings
The accompanying drawings provide a further understanding of the invention and constitute a part of this specification; together with the embodiments of the invention, they serve to explain the invention. In the drawings:
FIG. 1 is a flow chart of a disk capacity prediction method based on big data time series analysis according to the present invention;
FIG. 2 is a schematic diagram of a disk capacity prediction system based on big data time series analysis according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of these embodiments without inventive effort fall within the scope of protection of the present invention.
Referring to FIG. 1 and FIG. 2, the present invention provides the following technical solution: a disk capacity prediction system and method based on big data time sequence analysis, the method comprising:
step S100: retrieving, from the historical application log of the computer device, all storage call records generated on the disks of the computer device in response to data storage instructions initiated by a user; extracting information from each storage call record; and, according to the extracted information, sorting out the storage call path corresponding to each storage call record;
wherein step S100 includes:
step S101: retrieving, from the historical application log of the computer device, all storage call records generated on the disks of the computer device in response to data storage instructions initiated by a user;
step S102: capturing the target call disk in each storage call record; capturing the data to be stored corresponding to the data storage instruction in each storage call record, and acquiring the data information of the data to be stored, the data information including the data source (origin) of the data to be stored; the data source of the data to be stored includes application software on the computer device;
step S103: capturing, in each storage call record, the call mode $way_k$ by which the call to the target call disk was made, where k = 1 or k = 2; the call modes comprise the default call $way_1$ and the changed call $way_2$; if the call mode of the target call disk is the default call $way_1$, the target call disk is the storage disk originally pushed by the computer device to the user or to the corresponding application software in response to the storage instruction; if the call mode of the target call disk is the changed call $way_2$, the target call disk is a disk chosen as a replacement after the computer device had originally pushed a storage disk to the user or to the corresponding application software in response to the storage instruction;
for example, in a storage call record, the data to be stored is A, the data source (origin) of A is application software B, and the target call disk is the computer's C disk;
if, in the storage call record, the call mode by which the call to the C disk was made is $way_1$, the storage disk originally pushed by the computer device to the user or to application software B in response to the storage instruction was the C disk;
if, in the storage call record, the call mode by which the call to the C disk was made is $way_2$, the storage disk originally pushed by the computer device to the user or to application software B in response to the storage instruction was the computer's D disk, and the user changed the storage disk address from the D disk to the C disk;
step S104: numbering all disks on the computer device; and extracting the corresponding storage call path [formula omitted in source] from each storage call record generated on each disk of the computer device; one storage call path is extracted from each storage call record; the omitted symbol denotes the j-th storage call path, extracted from the j-th storage call record of the disk numbered i;
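The C disk and D disk example under step S103 can be expressed as a short Python sketch. The `StorageCallRecord` fields and the `call_mode` helper are hypothetical names introduced here; the rule itself (default call when the pushed disk is the one actually used, changed call otherwise) is the one described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageCallRecord:
    """Hypothetical record layout for the worked example."""
    data: str          # data to be stored, e.g. "A"
    origin: str        # source application software, e.g. "B"
    pushed_disk: str   # disk originally pushed by the computer device
    target_disk: str   # disk actually called


def call_mode(record):
    """Return 1 (default call, way_1) or 2 (changed call, way_2)."""
    return 1 if record.pushed_disk == record.target_disk else 2


# The device pushed C and the data went to C: default call.
default_call = StorageCallRecord("A", "B", pushed_disk="C", target_disk="C")
# The device pushed D but the user redirected the data to C: changed call.
changed_call = StorageCallRecord("A", "B", pushed_disk="D", target_disk="C")
```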
step S200: integrating the path information of each disk; collecting all storage call paths on each disk in order of their storage call record times to obtain a storage call path sequence for each disk; calculating the user's average call frequency and call bias rate for each disk on the computer device; calculating a storage management early warning value for each disk from the user's average call frequency and call bias rate for that disk;
wherein step S200 includes:
step S201: calculating the user's average call frequency for each disk: [formula omitted in source]; where $t_{j+1}$ and $t_j$ denote the times corresponding to the (j+1)-th and j-th storage call paths in the storage call path sequence, respectively; $P_i$ denotes the user's average call frequency for the disk numbered i; $m_i$ denotes the total number of storage call paths on the disk numbered i;
step S202: extracting, from all storage call paths of each disk, the set of source application software $\{S_1, S_2, \ldots, S_n\}$ corresponding to that disk, where $S_1, S_2, \ldots, S_n$ denote the 1st, 2nd, ..., n-th application software extracted from the disk's storage call path sequence; classifying the n application software items by the call mode in the storage call paths, obtaining $n_1$ first-class application software items whose call mode is the default call $way_1$ and $n_2$ second-class application software items whose call mode is the changed call $way_2$, with $n_1 + n_2 = n$;
step S203: querying the call modes of the $n_2$ second-class application software items in the storage call paths of the other disks; supposing that $n_3$ of the $n_2$ second-class application software items also have the changed call $way_2$ as their call mode in the storage call paths of other disks, i.e., they are second-class application software there as well; calculating the user's call bias rate for each disk: [formula omitted in source]; where $R_i$ denotes the user's call bias rate for the disk numbered i, and M denotes the total number of application software items on the computer device;
step S204: calculating the storage management early warning value of each disk, $SOS_i = P_i \cdot R_i$, where $SOS_i$ is the storage management early warning value indicating that the user's storage management of the disk numbered i is imbalanced;
step S300: analyzing the storage management early warning values of all disks on the computer device together, identifying the disks whose storage management early warning values are abnormal, and setting them as target disks under test;
wherein step S300 includes:
step S301: extracting the storage management early warning values of all disks on the computer device; setting a first early warning threshold and a second early warning threshold, the first early warning threshold being far greater than the second; extracting the disks whose storage management early warning value is greater than the first early warning threshold and the disks whose storage management early warning value is less than the second early warning threshold;
step S302: setting the disks whose storage management early warning value is greater than the first early warning threshold as first-type disks, and the disks whose storage management early warning value is less than the second early warning threshold as second-type disks; when, among all disks, the second-class application software set [symbol omitted in source] of some first-type disk and the first-class application software set [symbol omitted in source] of some second-type disk satisfy [formula omitted in source], where $C_{threshold}$ denotes the coincidence rate threshold, judging that the storage management early warning values of that first-type disk and that second-type disk are abnormal; and setting that first-type disk and that second-type disk as target disks under test;
step S400: respectively predicting the disk capacity of a target disk to be detected;
wherein, step S400 includes:
step S401: polling and collecting disk data of each target disk to be tested at fixed time intervals; the disk data comprises the timestamp of each polling acquisition, the capacity change rate of the disk over the fixed time interval, and the ratio of the used capacity of the disk to its total capacity; constructing a first time sequence diagram of the capacity change rate against the timestamp and a second time sequence diagram of the ratio of used capacity to total capacity against the timestamp;
step S402: performing a data stationarity test on the first time sequence diagram and the second time sequence diagram respectively using a unit root test method, and, if the test shows that the data of a time sequence diagram is non-stationary, differencing the data of that time sequence diagram;
step S403: predicting the disk capacity of each target disk to be tested based on the first time sequence diagram and the second time sequence diagram; setting early warning values on the first time sequence diagram and the second time sequence diagram of each target disk to be tested based on the storage management early warning value corresponding to that disk, and reminding the user in time to clean up the disk, expand the disk, or balance calls across the disks.
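As a minimal sketch of the data handling in steps S401 and S402, the two series can be built from polled samples and then differenced. The names `Sample`, `build_series` and `difference` are hypothetical, and the capacity change rate is assumed here to mean the change in used capacity per unit of time between consecutive polls. The unit root test itself is not implemented in this sketch; in practice it could be delegated to a statistics library, for example the `adfuller` function in `statsmodels.tsa.stattools`.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    timestamp: float  # polling time
    used: float       # used capacity at that time
    total: float      # total capacity of the disk


def build_series(samples):
    """Return (change_rate_series, used_ratio_series) as lists of (timestamp, value)."""
    ordered = sorted(samples, key=lambda s: s.timestamp)
    # Second series: ratio of used capacity to total capacity at each poll.
    ratio = [(s.timestamp, s.used / s.total) for s in ordered]
    # First series: ASSUMED definition, growth of used capacity per unit time.
    rate = [
        (cur.timestamp, (cur.used - prev.used) / (cur.timestamp - prev.timestamp))
        for prev, cur in zip(ordered, ordered[1:])
    ]
    return rate, ratio


def difference(values, order=1):
    """Apply first-order differencing `order` times to a list of numbers."""
    for _ in range(order):
        values = [b - a for a, b in zip(values, values[1:])]
    return values
```

If the unit root test reports a series as non-stationary, `difference` would be applied to its values and the test repeated on the result before any forecasting model is fitted.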
In order to better implement the method, a disk capacity prediction system based on big data time sequence analysis is also provided, the system comprising: a storage calling record extraction module, a storage calling path combing module, a path information integration module, a storage management early warning value calculation module, an abnormal disk judgment module and a disk capacity prediction module;
the storage calling record extraction module is used for retrieving, from the historical application log of the computer equipment, all storage calling records generated on the disks of the computer equipment by data storage instructions initiated by a user;
the storage calling path combing module is used for receiving the data in the storage calling record extraction module and extracting information from each storage calling record; and, according to the information extracted from the storage calling records, combing out the storage calling path corresponding to each storage calling record;
the storage calling path combing module comprises: a record information extraction unit and a path combing unit;
the record information extraction unit is used for receiving the data in the storage calling record extraction module and extracting information from each storage calling record;
the path combing unit is used for receiving the data in the record information extraction unit and combing out the corresponding storage calling path for each storage calling record;
the path information integration module is used for receiving the data in the storage calling path combing module and integrating the path information of each disk respectively; collecting all storage calling paths on each disk in order of the corresponding storage calling record time to obtain the storage calling path sequence of each disk;
the storage management early warning value calculation module is used for receiving the data in the path information integration module and respectively calculating the average calling frequency and the calling deviation rate of a user to each disk on the computer equipment; calculating a storage management early warning value of each disk based on average calling frequency and calling deviation rate of a user on each disk;
the storage management early warning value calculation module comprises an average calling frequency calculation unit, a calling deviation rate calculation unit and a storage management early warning value calculation unit;
the average calling frequency calculation unit is used for receiving the data in the path information integration module and calculating the average calling frequency of a user to each disk on the computer equipment;
the calling deviation rate calculation unit is used for receiving the data in the path information integration module and calculating the calling deviation rate of the user to each disk on the computer equipment;
the storage management early warning value calculation unit is used for receiving the data in the average calling frequency calculation unit and the calling deviation rate calculation unit and calculating the storage management early warning value of each disk;
the abnormal disk judgment module is used for receiving the data in the storage management early warning value calculation module, carrying out integration analysis on the storage management early warning values corresponding to all the disks on the computer equipment, judging and locking the disks with abnormal storage management early warning values, and setting them as target disks to be tested;
the disk capacity prediction module is used for receiving the data in the abnormal disk judgment module and predicting the capacity of the target disks to be tested.
It is noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Finally, it should be noted that the foregoing description is only a preferred embodiment of the present invention and does not limit the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or make equivalent replacements of some of their technical features. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (5)

1. A disk capacity prediction method based on big data time series analysis, the method comprising:
step S100: retrieving, from the historical application log of the computer equipment, all storage calling records generated on the disks of the computer equipment by data storage instructions initiated by a user; extracting information from each storage calling record; according to the information extracted from the storage calling records, combing out the storage calling path corresponding to each storage calling record;
the step S100 includes:
step S101: retrieving, from the historical application log of the computer equipment, all storage calling records generated on the disks of the computer equipment by data storage instructions initiated by a user;
step S102: capturing the target calling disk in each storage calling record; capturing the data to be stored corresponding to the data storage instruction in each storage calling record, and acquiring data information of the data to be stored, the data information comprising the data source origin of the data to be stored; the data source origin of the data to be stored comprises application software on the computer equipment;
step S103: capturing the calling mode way_k used when calling the target calling disk in each storage calling record, where k = 1 or k = 2; the calling modes comprise the default call way_1 and the changed call way_2; if the calling mode of the target calling disk is the default call way_1, the target calling disk is the disk originally pushed to the user or the corresponding application software by the computer equipment based on the storage instruction; if the calling mode of the target calling disk is the changed call way_2, the target calling disk is a disk chosen as a replacement after the computer equipment originally pushed a storage disk to the user or the corresponding application software based on the storage instruction;
step S104: numbering all the disks on the computer equipment; extracting the corresponding storage calling path from each storage calling record generated on each disk of the computer equipment, one storage calling path being extracted from each storage calling record; wherein the j-th storage calling path of the disk numbered i is the path extracted from the j-th storage calling record of that disk;
step S200: integrating the path information of each disk respectively; collecting all storage calling paths on each disk in order of the corresponding storage calling record time to obtain the storage calling path sequence of each disk; calculating the average calling frequency and the calling deviation rate of the user for each disk on the computer equipment; calculating the storage management early warning value of each disk based on the average calling frequency and the calling deviation rate of the user for that disk;
the step S200 includes:
step S201: calculating the average calling frequency P_i of the user for each disk; wherein t_{j+1} and t_j respectively represent the times corresponding to the (j+1)-th and j-th storage calling paths in the storage calling path sequence; P_i represents the average calling frequency of the user for the disk numbered i; m_i represents the total number of storage calling paths on the disk numbered i;
step S202: extracting, from all storage calling paths of each disk, the source application software set {S1, S2, …, Sn} corresponding to that disk; S1, S2, …, Sn respectively represent the 1st, 2nd, …, n-th application software extracted from the storage calling path sequence of the disk; classifying the n application software based on the calling mode in the storage calling paths to obtain n1 first-class application software, whose calling mode is the default call way_1, and n2 second-class application software, whose calling mode is the changed call way_2, with n1 + n2 = n;
step S203: querying, for each of the n2 second-class application software, its calling mode in the storage calling paths corresponding to the other disks; among the n2 second-class application software, n3 are those whose calling mode in the storage calling paths corresponding to other disks is also the changed call way_2, that is, those that are also second-class application software on other disks; calculating the calling deviation rate R_i of the user for each disk; wherein R_i represents the calling deviation rate of the user for the disk numbered i; M represents the total number of application software on the computer equipment;
step S204: calculating the storage management early warning value of each disk: S0S_i = P_i × R_i; wherein S0S_i represents the storage management early warning value indicating that the user's storage management of the disk numbered i is unbalanced;
step S300: integrating and analyzing storage management early warning values corresponding to all the disks on the computer equipment, judging and locking the disk with the abnormal storage management early warning value, and setting the disk as a target disk to be tested;
the step S300 includes:
step S301: extracting the storage management early warning value corresponding to each disk on the computer equipment; setting a first early warning threshold and a second early warning threshold, the first early warning threshold being far greater than the second early warning threshold; extracting, respectively, the disks whose storage management early warning value is greater than the first early warning threshold and the disks whose storage management early warning value is less than the second early warning threshold;
step S302: setting each disk whose storage management early warning value is greater than the first early warning threshold as a first-type disk; setting each disk whose storage management early warning value is less than the second early warning threshold as a second-type disk; when, among all the disks, the second-class application software set of a certain first-type disk and the first-class application software set of a certain second-type disk satisfy the coincidence-rate condition, where C_threshold represents the coincidence rate threshold, judging that the storage management early warning values of that first-type disk and that second-type disk are abnormal; setting that first-type disk and that second-type disk as target disks to be tested;
step S400: predicting the disk capacity of each target disk to be tested.
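The bookkeeping in steps S100 to S204 can be illustrated with a minimal sketch. All names (`CallPath`, `path_sequences`, `average_call_frequency`, `classify_apps`, `count_n3`, `warning_value`) are hypothetical. Two points are assumptions rather than statements of the claimed formulas: the average calling frequency is taken as the number of intervals divided by the elapsed time between the first and last call, and an application that appears with both calling modes on one disk is classified by its most recent record. The calling deviation rate R_i is passed in by the caller, since its formula is not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass

DEFAULT, CHANGED = 1, 2  # way_1 (default call) and way_2 (changed call)


@dataclass(frozen=True)
class CallPath:
    disk: int     # disk number i
    time: float   # storage calling record time
    origin: str   # source application software
    way: int      # DEFAULT or CHANGED


def path_sequences(paths):
    """Group storage calling paths per disk and order each group by record time."""
    sequences = defaultdict(list)
    for path in paths:
        sequences[path.disk].append(path)
    return {disk: sorted(seq, key=lambda p: p.time) for disk, seq in sequences.items()}


def average_call_frequency(sequence):
    """ASSUMED form of P_i: (m_i - 1) intervals divided by the elapsed time."""
    if len(sequence) < 2:
        return 0.0
    span = sequence[-1].time - sequence[0].time
    return (len(sequence) - 1) / span if span > 0 else 0.0


def classify_apps(sequence):
    """Return (first_class, second_class) application sets for one disk."""
    latest_way = {}
    for path in sequence:  # time-ordered, so later records overwrite earlier ones
        latest_way[path.origin] = path.way
    first = {app for app, way in latest_way.items() if way == DEFAULT}
    second = {app for app, way in latest_way.items() if way == CHANGED}
    return first, second


def count_n3(disk, classes):
    """Count second-class apps of `disk` that are also second-class on another disk."""
    second_here = classes[disk][1]
    second_elsewhere = set()
    for other, (_, second) in classes.items():
        if other != disk:
            second_elsewhere |= second
    return len(second_here & second_elsewhere)


def warning_value(p_i, r_i):
    """S0S_i = P_i * R_i, as stated in step S204."""
    return p_i * r_i
```

A caller would build `classes = {disk: classify_apps(seq) for disk, seq in path_sequences(paths).items()}`, read n1 and n2 from the sizes of the two sets, obtain n3 from `count_n3`, and then combine P_i with R_i through `warning_value`.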
2. The method for predicting disk capacity based on time series analysis of big data as set forth in claim 1, wherein said step S400 includes:
step S401: polling and collecting disk data of the target disk to be tested at fixed time intervals; the disk data comprises the timestamp of each polling acquisition, the capacity change rate of the disk over the fixed time interval, and the ratio of the used capacity of the disk to its total capacity; constructing a first time sequence diagram of the capacity change rate against the timestamp and a second time sequence diagram of the ratio of used capacity to total capacity against the timestamp;
step S402: performing a data stationarity test on the first time sequence diagram and the second time sequence diagram respectively using a unit root test method, and, if the test shows that the data of a time sequence diagram is non-stationary, differencing the data of that time sequence diagram;
step S403: predicting the disk capacity of the target disk to be tested based on the first time sequence diagram and the second time sequence diagram; setting early warning values on the first time sequence diagram and the second time sequence diagram of each target disk to be tested based on the storage management early warning value corresponding to that disk, and reminding the user in time to clean up the disk, expand the disk, or balance calls across the disks.
3. A disk capacity prediction system based on big data time series analysis, applied to the disk capacity prediction method based on big data time series analysis of any one of claims 1 to 2, characterized in that the system comprises: a storage calling record extraction module, a storage calling path combing module, a path information integration module, a storage management early warning value calculation module, an abnormal disk judgment module and a disk capacity prediction module;
the storage calling record extraction module is used for retrieving, from the historical application log of the computer equipment, all storage calling records generated on the disks of the computer equipment by data storage instructions initiated by a user;
the storage calling path combing module is used for receiving the data in the storage calling record extraction module and extracting information from each storage calling record; and, according to the information extracted from the storage calling records, combing out the storage calling path corresponding to each storage calling record;
the path information integration module is used for receiving the data in the storage calling path combing module and respectively integrating path information of each disk; collecting all storage calling paths on each disk according to the corresponding storage calling record time, and obtaining a storage calling path sequence of each disk;
the storage management early warning value calculation module is used for receiving the data in the path information integration module and respectively calculating the average calling frequency and the calling deviation rate of a user to each disk on the computer equipment; calculating a storage management early warning value of each disk based on average calling frequency and calling deviation rate of a user on each disk;
the abnormal disk judgment module is used for receiving the data in the storage management early warning value calculation module, carrying out integration analysis on the storage management early warning values corresponding to all the disks on the computer equipment, judging and locking the disks with abnormal storage management early warning values, and setting them as target disks to be tested;
the disk capacity prediction module is used for receiving the data in the abnormal disk judgment module and predicting the capacity of the target disks to be tested.
4. The disk capacity prediction system based on big data time series analysis according to claim 3, wherein the storage management early warning value calculation module comprises an average calling frequency calculation unit, a calling deviation rate calculation unit and a storage management early warning value calculation unit;
the average calling frequency calculation unit is used for receiving the data in the path information integration module and calculating the average calling frequency of a user to each disk on the computer equipment;
the calling deviation rate calculation unit is used for receiving the data in the path information integration module and calculating the calling deviation rate of a user to each disk on the computer equipment;
the storage management early warning value calculation unit is used for receiving the data in the average calling frequency calculation unit and the calling deviation rate calculation unit and calculating the storage management early warning value of each disk.
5. The disk capacity prediction system based on big data time series analysis according to claim 3, wherein the storage calling path combing module comprises: a record information extraction unit and a path combing unit;
the record information extraction unit is used for receiving the data in the storage calling record extraction module and extracting information from each storage calling record;
the path combing unit is used for receiving the data in the record information extraction unit and combing out the corresponding storage calling path for each storage calling record.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210961428.7A CN115145494B (en) 2022-08-11 2022-08-11 Disk capacity prediction system and method based on big data time sequence analysis

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210961428.7A CN115145494B (en) 2022-08-11 2022-08-11 Disk capacity prediction system and method based on big data time sequence analysis

Publications (2)

Publication Number Publication Date
CN115145494A CN115145494A (en) 2022-10-04
CN115145494B true CN115145494B (en) 2023-09-15

Family

ID=83415929

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210961428.7A Active CN115145494B (en) 2022-08-11 2022-08-11 Disk capacity prediction system and method based on big data time sequence analysis

Country Status (1)

Country Link
CN (1) CN115145494B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115361662B (en) * 2022-10-21 2023-01-10 深圳市友恺通信技术有限公司 Network state monitoring and management method and system based on big data

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000305820A (en) * 1999-04-16 2000-11-02 Nec Corp Method for monitoring capacity of disk
CN105094708A (en) * 2015-08-25 2015-11-25 北京百度网讯科技有限公司 Method and apparatus for predicting disk capacity
CN105094698A (en) * 2015-07-08 2015-11-25 浪潮(北京)电子信息产业有限公司 Method for predicting disc capacity based on historical monitoring data
US9892014B1 (en) * 2014-09-29 2018-02-13 EMC IP Holding Company LLC Automated identification of the source of RAID performance degradation
CN109766234A (en) * 2018-12-11 2019-05-17 国网甘肃省电力公司信息通信公司 Disk storage capacity prediction technique based on time series models
US10593380B1 (en) * 2017-12-13 2020-03-17 Amazon Technologies, Inc. Performance monitoring for storage-class memory
CN112131078A (en) * 2020-09-21 2020-12-25 上海上讯信息技术股份有限公司 Method and equipment for monitoring disk capacity
CN113238714A (en) * 2021-05-28 2021-08-10 广东好太太智能家居有限公司 Disk capacity prediction method and system based on historical monitoring data and storage medium
CN113687777A (en) * 2021-07-23 2021-11-23 苏州浪潮智能科技有限公司 Method, device, equipment and medium for predicting usable time of disk
WO2022116922A1 (en) * 2020-12-03 2022-06-09 中兴通讯股份有限公司 Magnetic disk failure prediction method, prediction model training method, and electronic device

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008097502A (en) * 2006-10-16 2008-04-24 Hitachi Ltd Capacity monitoring method and computer system
US11256595B2 (en) * 2019-07-11 2022-02-22 Dell Products L.P. Predictive storage management system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant