CN116541364A - Log data storage method and device, terminal equipment and storage medium - Google Patents

Log data storage method and device, terminal equipment and storage medium

Info

Publication number
CN116541364A
Authority
CN
China
Prior art keywords
index
data
log data
thermal
data model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310821350.3A
Other languages
Chinese (zh)
Inventor
张璞
刘兴川
费越峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Smart City Research Institute Of China Electronics Technology Group Corp
Original Assignee
Smart City Research Institute Of China Electronics Technology Group Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Smart City Research Institute Of China Electronics Technology Group Corp filed Critical Smart City Research Institute Of China Electronics Technology Group Corp
Priority to CN202310821350.3A priority Critical patent/CN116541364A/en
Publication of CN116541364A publication Critical patent/CN116541364A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiment of the application is applicable to the technical field of computers, and provides a method, a device, terminal equipment and a storage medium for storing log data, wherein the method comprises the following steps: acquiring a plurality of types of log data, wherein each type of log data comprises data of at least one service characteristic; dividing a plurality of types of log data into a plurality of categories according to the service characteristics, wherein the log data of any category corresponds to a pre-established data model, and the data model comprises a plurality of indexes; and storing the log data belonging to the same category into the corresponding data model according to the index. By adopting the method, the stability and the efficiency of storing massive log data can be improved.

Description

Log data storage method and device, terminal equipment and storage medium
Technical Field
The embodiment of the application belongs to the technical field of computers, and particularly relates to a log data storage method, device, terminal equipment and storage medium.
Background
Elasticsearch is a search engine that can be used to store and search data. As the business volume of enterprises increases, massive amounts of log data are generated, and a large amount of log data from different sources and with different service characteristics needs to be stored through Elasticsearch.
However, when log data with various service characteristics is collected from multiple sources and stored, the system generates a large number of indexes, and when the indexes cannot roll over normally, the system is prone to stalling. In addition, with the conventional writing method, all data needs to be traversed during a search, which often takes a long time, so search efficiency is low.
Disclosure of Invention
In view of this, the embodiments of the present application provide a method, an apparatus, a terminal device, and a storage medium for storing log data, which can store log data in Elasticsearch according to the type of the log data, so as to improve the stability and efficiency of storing massive log data.
A first aspect of an embodiment of the present application provides a method for storing log data, including:
acquiring a plurality of types of log data, wherein each type of log data comprises data of at least one service characteristic;
dividing a plurality of types of log data into a plurality of categories according to the service characteristics, wherein the log data of any category corresponds to a pre-established data model, and the data model comprises a plurality of indexes;
and storing the log data belonging to the same category into the corresponding data model according to the index.
A second aspect of an embodiment of the present application provides a storage device for log data, including:
an acquisition module, configured to acquire a plurality of types of log data, wherein each type of log data comprises data of at least one service characteristic;
a division module, configured to divide the plurality of types of log data into a plurality of categories according to the service characteristics, wherein the log data of any category corresponds to a pre-established data model, and the data model comprises a plurality of indexes;
and a storage module, configured to store the log data belonging to the same category into the corresponding data model according to the index.
A third aspect of the embodiments of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method for storing log data according to the first aspect when the processor executes the computer program.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements a method of storing log data as described in the first aspect above.
Compared with the prior art, the embodiment of the application has the following beneficial effects:
according to the embodiments of the present application, in the Elasticsearch application scenario, a plurality of corresponding data models are established according to the service characteristics of the log data, and the log data is stored in the indexes created in the data models. In this way, even when the amount of log data is huge and the system needs to generate a large number of indexes, the stability and efficiency of storing the massive log data can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings that are required to be used in the embodiments or the description of the prior art. It is apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic diagram of a method for storing log data according to an embodiment of the present application;
FIG. 2 is a global schematic diagram of a system with multiple data models according to an embodiment of the present application;
FIG. 3 is a schematic flow chart of storing log data according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the indexes in one type of data model according to an embodiment of the present application;
FIG. 5 is a schematic diagram of index pointing changes according to an embodiment of the present application;
FIG. 6 is a schematic flow chart of creating a data model according to an embodiment of the present application;
FIG. 7 is a schematic diagram of creating an initial index according to an embodiment of the present application;
FIG. 8 is a schematic flow chart of managing multiple indexes in a data model according to an embodiment of the present application;
FIG. 9 is a schematic diagram of an index changing from hot to warm according to an embodiment of the present application;
FIG. 10 is a schematic flow chart of managing multiple indexes in a data model according to another embodiment of the present application;
FIG. 11 is a schematic diagram of creating a new version of a warm index according to an embodiment of the present application;
FIG. 12 is a schematic diagram of the state changes of a single-batch index according to an embodiment of the present application;
FIG. 13 is a schematic flow chart of data model management and storage according to an embodiment of the present application;
FIG. 14 is a schematic diagram of the functional modules for log data storage according to an embodiment of the present application;
FIG. 15 is a schematic flow chart of log data retrieval according to an embodiment of the present application;
FIG. 16 is a schematic diagram of a log data storage device according to an embodiment of the present application;
FIG. 17 is a schematic diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system configurations, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations. As used in this specification and the appended claims, the term "if" may be interpreted as "when" or "once" or "in response to determining" or "in response to detecting," depending on the context. Similarly, the phrase "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining" or "in response to determining" or "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]." In addition, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance. Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise.
The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
As described in the Background, when the total amount of log data is huge, a large number of indexes need to be generated in the Elasticsearch application scenario, so operation may stall when the log data is stored, which makes log data storage inefficient. Moreover, after log data is stored in Elasticsearch in this manner, a large amount of index information needs to be traversed during data retrieval, so the load on the data nodes becomes too high, which affects the efficiency of data writing and querying.
In view of the above problems, the present application provides a method, an apparatus, a terminal device, and a storage medium for storing log data, in which a plurality of corresponding data models are created according to the service features of the log data, and the log data is stored in the indexes created in the data models, so as to handle the case where the amount of log data is huge and the system needs to generate a large number of indexes.
The technical scheme of the present application is described below by specific examples.
FIG. 1 is a schematic diagram of a method for storing log data according to an embodiment of the present application. As shown in FIG. 1, the method may specifically include steps S101 to S103:
S101, acquiring various log data.
In an embodiment of the present application, each log data includes data of at least one service feature.
In an actual application scenario, the log data may be obtained from device logs, network logs, security logs, service logs, and the like from various vendors; the embodiment of the present application does not limit the type of the log data.
Typically, a large amount of log data from different sources and with different service characteristics needs to be stored in an Elasticsearch-based storage space.
S102, dividing the log data into a plurality of categories according to the service characteristics.
In the embodiment of the application, after the log data is acquired, the log data is divided into a plurality of categories according to the service characteristics of the log data.
In the embodiment of the present application, log data of any category corresponds to a pre-established data model, and the data model includes a plurality of indexes.
S103, storing the log data belonging to the same category into the corresponding data model according to the index.
In this embodiment of the present application, each index in the data model has a corresponding storage unit. After log data is acquired, it may be stored in the storage unit corresponding to one of the indexes in the data model.
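Steps S101 to S103 can be pictured with a minimal in-memory sketch. The record field `service_feature`, the `DataModel` class, and the fixed target index name are assumptions made for illustration only; the application does not prescribe them.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class DataModel:
    """Hypothetical stand-in for a pre-established data model."""
    name: str
    # index name -> list of stored log records (the "storage unit")
    indexes: dict = field(default_factory=lambda: defaultdict(list))

    def store(self, index_name, record):
        self.indexes[index_name].append(record)


def categorize(logs):
    """S102: group log records by their service feature."""
    categories = defaultdict(list)
    for record in logs:
        categories[record["service_feature"]].append(record)
    return categories


def store_logs(logs, models, target_index="act-idx-1"):
    """S103: store each category in the data model that matches it."""
    for feature, records in categorize(logs).items():
        model = models[feature]
        for record in records:
            model.store(target_index, record)


# S101: the acquired log data (made-up sample records)
logs = [
    {"service_feature": "A", "msg": "login failed"},
    {"service_feature": "B", "msg": "link down"},
    {"service_feature": "A", "msg": "port scan"},
]
models = {"A": DataModel("model-A"), "B": DataModel("model-B")}
store_logs(logs, models)
```

In this sketch every record lands in a single index per model; choosing which index to write to is the subject of the steps that follow.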
In one possible implementation of the embodiment of the present application, FIG. 2 shows a global schematic diagram of a system with multiple data models, where different types of log data have different data models: the data model corresponding to log data with service characteristic A is model A, the data model corresponding to log data with service characteristic B is model B, and the data model corresponding to log data with service characteristic C is model C.
Taking model A as an example, model A contains indexes (idx) for multiple batches, for example, batch1 for batch 1, batch2 for batch 2, and so on.
Referring to the arrows in FIG. 2, a smaller batch number represents an older, or colder, index, and a larger batch number represents a newer, or hotter, index. Specifically, the indexes of batch 1 and batch 2 are warm indexes, where the index of batch 1 was created first. Batch 3 has both a hot index and a warm index: because generating the warm index takes time, there is a lag during which the hot index and the warm index of the same batch exist simultaneously. The indexes of batch 4 and batch 5 are hot indexes, where the index of batch 5 is the most recently created hot index.
FIG. 3 is a schematic flow chart of storing log data according to an embodiment of the present application. As shown in FIG. 3, the method may include the following steps S1031 to S1033:
S1031, determining batch information of each hot index in the data model.
In an embodiment of the present application, the data model includes a plurality of indexes, which include a plurality of hot indexes and further include warm indexes.
It should be noted that newly acquired log data is stored in a hot index rather than a warm index; that is, the hot indexes are used to store newly acquired log data.
S1032, based on the batch information, determining the most recently created hot index for storing the log data.
In the embodiment of the present application, each hot index in the data model has its own batch information, and the order in which the hot indexes were created can be determined from the batch information of each hot index.
In one possible implementation of this embodiment of the present application, FIG. 4 is a schematic diagram of the indexes in a data model. As shown in FIG. 4, from top to bottom, the first row contains the hot indexes. The storage unit corresponding to a hot index is used to store newly acquired data, and the naming rule of a hot index may be act-idx-<data batch>.
For example, the batch information of the hot index act-idx-4 is 4, that of act-idx-5 is 5, and that of act-idx-6 is 6. The larger the value of the batch information, the closer the creation time of the corresponding hot index is to the time at which the log data is being stored, that is, the newer the index. The hot index with the largest batch information can therefore be determined to be the most recently created hot index.
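The selection rule in S1032 can be sketched as follows, assuming the batch information is simply the numeric suffix of the index name. The helper names `batch_of` and `latest_hot_index` are hypothetical.

```python
def batch_of(index_name):
    """Return the numeric batch suffix of a name such as 'act-idx-6'."""
    return int(index_name.rsplit("-", 1)[1])


def latest_hot_index(hot_indexes):
    """The hot index with the largest batch value is the most recent one."""
    return max(hot_indexes, key=batch_of)


hot = ["act-idx-4", "act-idx-6", "act-idx-5"]
newest = latest_hot_index(hot)
```

Comparing the batch as an integer rather than as text matters once batches reach two digits, since a plain string comparison would rank act-idx-9 above act-idx-10.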
In this embodiment of the present application, the most recently created hot index in FIG. 4 is act-idx-6. At this time, the write alias write-idx points to the index act-idx-6, which indicates that log data may be written to that index.
Furthermore, the hot indexes act-idx-4, act-idx-5, and act-idx-6 share a collective alias, active-idx, which is used to manage the indexes as a group. The indexes act-idx-1, act-idx-2, and act-idx-3 are hot indexes that have already been converted to warm indexes and deleted.
S1033, storing the log data belonging to the same category into the storage unit corresponding to the most recently created hot index.
In the embodiment of the present application, after the most recently created hot index in each data model is determined, the acquired log data is stored in the storage unit corresponding to the most recently created hot index in the corresponding data model.
In one possible implementation of the embodiment of the present application, after log data belonging to the same category is stored in the storage unit corresponding to the most recently created hot index, a rolling metric of that hot index may be determined. When the rolling metric reaches a rolling threshold, a new hot index with an incremented data batch is created, and the new hot index is used to store newly received log data.
The rolling metric may be the total amount of data contained in the storage unit corresponding to the most recently created hot index, the time elapsed since that hot index was created, or the number of files in the storage unit corresponding to that hot index.
For example, the rolling threshold may be set to 50 GB of data. When the total amount of data contained in the storage unit corresponding to a hot index exceeds 50 GB, a new hot index with an incremented data batch is created. Specifically, as shown in FIG. 4, when the total amount of data stored in the storage unit corresponding to the hot index act-idx-5 exceeds 50 GB, the hot index act-idx-6 is created, and log data acquired after act-idx-6 is created is stored in the storage unit corresponding to act-idx-6.
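The rolling check can be sketched as a small predicate over the three metrics named above. The `RollThresholds` container, the reading of 50G as 50 * 10^9 bytes, and the use of "greater than or equal" for "reaches" are all assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

GB = 10 ** 9  # assumption: "50G" is read as 50 * 10^9 bytes


@dataclass
class RollThresholds:
    """Hypothetical container; a threshold of None means 'not used'."""
    max_bytes: Optional[int] = 50 * GB
    max_age_seconds: Optional[float] = None
    max_files: Optional[int] = None


def should_roll(size_bytes, age_seconds, file_count, limits):
    """Return True when any configured rolling metric reaches its threshold."""
    checks = [
        (size_bytes, limits.max_bytes),
        (age_seconds, limits.max_age_seconds),
        (file_count, limits.max_files),
    ]
    return any(limit is not None and value >= limit for value, limit in checks)


limits = RollThresholds()
roll_now = should_roll(51 * GB, 3600, 12, limits)      # size metric is past 50 GB
keep_writing = should_roll(10 * GB, 3600, 12, limits)  # no metric has reached its limit
```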
In a possible implementation of this embodiment of the present application, FIG. 5 is a schematic diagram of index pointing changes. As shown in FIG. 5, the hot indexes act-idx-1 and act-idx-2 are hot indexes whose rolling metrics have reached the rolling threshold. Specifically, when the rolling metric of the hot index act-idx-2 reaches the rolling threshold, a new hot index act-idx-3 is created, and the write alias write-idx changes from pointing to act-idx-2 to pointing to act-idx-3. From then on, when log data of the type corresponding to the data model is received, it is written through the write alias into the storage unit corresponding to act-idx-3.
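The repointing of the write alias can be illustrated with a small in-memory model. `HotIndexSet` and its attributes are hypothetical; a real deployment would perform the equivalent alias update inside Elasticsearch rather than in Python objects.

```python
class HotIndexSet:
    """Hypothetical in-memory model of hot indexes plus a write alias."""

    def __init__(self):
        self.storage = {"act-idx-1": []}  # index name -> storage unit
        self.write_alias = "act-idx-1"    # the index that write-idx points to

    def roll(self):
        """Create the next batch and repoint the write alias to it."""
        next_batch = int(self.write_alias.rsplit("-", 1)[1]) + 1
        new_name = f"act-idx-{next_batch}"
        self.storage[new_name] = []
        self.write_alias = new_name
        return new_name

    def write(self, record):
        """Writers go through the alias and never name a concrete index."""
        self.storage[self.write_alias].append(record)


hot = HotIndexSet()
hot.write({"msg": "first"})
hot.roll()                    # the rolling threshold was reached
hot.write({"msg": "second"})  # lands in the new batch
```

Because writers only address the alias, rolling to a new batch requires no change on the writing side.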
In one possible implementation of the embodiment of the present application, when the number of log data files in the storage unit corresponding to an index reaches a preset value, multiple files in the storage unit may be merged. A single file in the storage unit may be referred to as a segment, so this operation may be referred to as segment merging. Segment merging reduces the number of segments in the storage unit, which reduces the number of open file handles and saves system resources, and it also improves retrieval efficiency when data in the data model is retrieved.
FIG. 6 is a schematic flow chart of creating a data model according to an embodiment of the present application. As shown in FIG. 6, the method includes steps S201 to S203:
S201, determining template setting information corresponding to each service feature one by one.
A plurality of corresponding data models need to be created before log data is stored to the corresponding data models.
In the embodiment of the present application, the log data has corresponding service features, and template setting information in one-to-one correspondence with the service features can be determined according to the differences between the service features.
In practical applications, the template setting information may include parameter settings and an index mapping. The parameter settings may include configuration information of the indexes in the data model, for example, the number of shards of an index, the number of replicas, the translog synchronization conditions, and the refresh policy. The index mapping may include the internal structure of the indexes in the data model; for example, the service feature corresponding to a security log is a security feature, and the mappings of the index fields corresponding to the security feature, such as the attack source IP, the attack destination IP, and the attack type, are determined.
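As an illustration of such template setting information, the sketch below builds a dictionary shaped like an Elasticsearch composable index template body. The setting keys and the `ip` and `keyword` field types are standard Elasticsearch names, but the concrete values, the field names `src_ip`, `dst_ip`, and `attack_type`, the index pattern, and the helper `build_template` are assumptions chosen for the example.

```python
def build_template(pattern, shards, replicas, properties):
    """Assemble parameter settings and an index mapping into one template body."""
    return {
        "index_patterns": [pattern],
        "template": {
            "settings": {
                "index": {
                    "number_of_shards": shards,
                    "number_of_replicas": replicas,
                    "refresh_interval": "30s",  # refresh policy (example value)
                    "translog": {               # translog synchronization
                        "durability": "async",
                        "sync_interval": "5s",
                    },
                }
            },
            "mappings": {"properties": properties},
        },
    }


# Hypothetical field mapping for the security feature
security_fields = {
    "src_ip": {"type": "ip"},           # attack source IP
    "dst_ip": {"type": "ip"},           # attack destination IP
    "attack_type": {"type": "keyword"}, # attack type
}
security_template = build_template("act-idx-*", 3, 1, security_fields)
```

One such template per service feature would correspond to the one-to-one relationship between template setting information and service features described above.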
In a possible implementation of the embodiment of the present application, the template setting information may further include an index management policy, and the multiple indexes in the data model may be managed according to the index management policy; for example, the creation, rolling, and merging of indexes and the read and write control of data may be controlled. It should be noted that each index in the data model includes at least the following information: a rolling metric, a temperature change metric, a cold change metric, and/or batch information.
S202, creating a plurality of data templates based on the template setting information.
In the embodiment of the application, after the template setting information corresponding to each data template is determined, a plurality of data templates corresponding to the template setting information one by one are created, wherein any one data template corresponds to one service feature. After the data templates are created, the plurality of data templates may be saved to a database for storing data.
S203, creating the data model corresponding to the log data of each category one by one based on the data template.
In the embodiment of the application, after the data templates are created and saved, the data models corresponding to the respective data templates one by one are created based on the respective data templates, that is, the data models corresponding to the log data of the respective categories one by one are created.
For the log data corresponding to different service features, a corresponding data model is created for storing the log data, and specifically, the log data is stored in a storage unit corresponding to an index in the data model.
Therefore, a data model contains multiple indexes, with log data in the storage units corresponding to the indexes. When the data volume is huge, the number of indexes grows with it, so the indexes need to be managed. For example, the indexes can be divided into hot indexes, warm indexes, and/or cold indexes, and the log data in the storage units corresponding to the different kinds of indexes can be stored on servers with different performance, so as to save hardware resources.
In a possible implementation of the embodiment of the present application, FIG. 7 is a schematic diagram of creating an initial index. After the data model corresponding to each category of log data is created, the acquired log data needs to be stored in the data model. It should be noted that the data model contains no index in its initial state, so an initial index needs to be created when data is first written.
In this embodiment of the present application, as shown in FIG. 7, act-idx-1 is the initial index that is created. At this time, the write alias write-idx is also created, and the write alias is named write-<model name>. For a client writing data into the data model, the write alias is the only visible index name of the data model, and it is used to store the acquired log data into the hot index that the write alias points to. For example, at this time the write alias points to the hot index act-idx-1, and when log data of the type corresponding to the data model is acquired, the log data is stored in the storage unit corresponding to act-idx-1.
In the present embodiment, as shown in FIG. 7, when the initial index is created, a corresponding single control index ctrl-idx-1 and a global control index ctrl-idx are also created.
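The lazy creation of the initial index can be sketched as follows. The `DataModelState` class and the fields kept in the control records (`batch`, `state`, `latest_batch`) are assumptions; the text only states that a single control index and a global control index are created together with the initial index.

```python
class DataModelState:
    """Hypothetical in-memory state of one data model."""

    def __init__(self, name):
        self.name = name
        self.storage = {}           # hot index name -> storage unit
        self.write_alias = None     # None means the model has no index yet
        self.batch_control = {}     # ctrl-idx-n records, one per batch
        self.global_control = None  # ctrl-idx record for the whole model

    def write(self, record):
        """Create the initial index on the first write, then store the record."""
        if self.write_alias is None:
            self._create_initial_index()
        self.storage[self.write_alias].append(record)

    def _create_initial_index(self):
        self.storage["act-idx-1"] = []
        self.write_alias = "act-idx-1"
        # Control records are created together with the initial index.
        self.batch_control["ctrl-idx-1"] = {"batch": 1, "state": "hot"}
        self.global_control = {"model": self.name, "latest_batch": 1}


model = DataModelState("model-A")
was_empty = model.write_alias is None
model.write({"msg": "first record"})
```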
FIG. 8 is a schematic flow chart of managing multiple indexes in a data model according to an embodiment of the present application. As shown in FIG. 8, the method may include steps S301 to S303:
S301, when a temperature change metric of a hot index in the data model reaches a temperature change threshold, creating a warm index corresponding to the hot index.
In an embodiment of the present application, managing the plurality of indexes in the data model according to the index management policy may include managing the hot indexes in the data model.
The information of a hot index may include a temperature change metric, which may be the time elapsed since the hot index was created. The index management policy may set a temperature change threshold, which is compared with the temperature change metric of the hot index; when the temperature change metric of the hot index reaches the temperature change threshold, a warm index corresponding to the hot index is created.
For example, the temperature change threshold may be set to 7 days, with the temperature change metric of a hot index being the total number of days since the hot index was created. When a hot index has existed for 7 days, the warm index corresponding to that hot index is created.
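The 7-day example amounts to a simple age comparison, sketched below. The function name and the use of `datetime` values for the creation time are assumptions of this sketch.

```python
from datetime import datetime, timedelta

WARM_THRESHOLD = timedelta(days=7)  # temperature change threshold from the example


def reaches_warm_threshold(created_at, now, threshold=WARM_THRESHOLD):
    """True when the hot index has existed for at least the threshold."""
    return now - created_at >= threshold


created = datetime(2023, 7, 1, 0, 0, 0)
due = reaches_warm_threshold(created, datetime(2023, 7, 8, 0, 0, 0))
not_due = reaches_warm_threshold(created, datetime(2023, 7, 5, 0, 0, 0))
```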
In one possible implementation of this embodiment of the present application, FIG. 4 is a schematic diagram of the indexes in a data model. As shown in FIG. 4, from top to bottom, the second row contains the query indexes, which are the retrieval index pointers of the data model. Each query index read-idx-n is a read alias, and the naming rule of a read alias may be read-idx-<data batch>.
For upper-layer applications, the read alias is the only visible index name, and the query index can be used to retrieve and read log data in the storage unit corresponding to the index that the read alias points to.
As shown in FIG. 4, from top to bottom, the third row contains the warm indexes. The naming rule of a warm index is indac-idx-<data version>-<data batch>; for example, the warm index indac-idx-1-3 has data version 1 and data batch 3.
As shown in FIG. 4, from top to bottom, the fourth row contains the single control indexes: ctrl-idx-n is the data control block for the index of each batch, which may be used to maintain the state information of the index and may be stored in a relational database. The fifth row is the global control index ctrl-idx, which may also be stored in a relational database and may be used to maintain global information for the data model.
In one possible implementation manner of this embodiment of the present application, fig. 9 is a schematic diagram of index temperature transition, as shown in fig. 9, when a temperature transition index of a thermal index act-idx-1 reaches a temperature transition threshold, wen Suoyin act-idx-1-1 corresponding to the thermal index is generated, in addition, a read alias read-idx-1 needs to be switched to the thermal index, where the read alias is used to read log data corresponding to the thermal index before the thermal index is generated, and after the thermal index is generated, the read alias is switched to Wen Suo to be switched to the thermal index, where the read alias may be used to read log data corresponding to the thermal index.
S302, determining a storage unit corresponding to the thermal index.
In this embodiment, after the warm index corresponding to a thermal index is created, the storage unit corresponding to the thermal index needs to be associated with the warm index so that the log data corresponding to the thermal index can be read through the warm index. The storage unit corresponding to the thermal index therefore needs to be determined.
S303, deleting the thermal index after associating the warm index with the storage unit.
In this embodiment, after the storage unit corresponding to the thermal index is determined, the storage unit is associated with the newly created warm index, so that the log data stored in the storage unit of the original thermal index can be read through the warm index.
After the warm index corresponding to the thermal index has been created and associated with the storage unit, the thermal index may be deleted.
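Steps S301 to S303 can be sketched as follows. This is a minimal in-memory illustration, not the implementation of the application: the `DataModel` class, its two dictionaries, and the function name `transition_to_warm` are hypothetical stand-ins for whatever index store is actually used, and the index and alias names are the ones quoted from fig. 9.

```python
from dataclasses import dataclass, field


@dataclass
class DataModel:
    # Hypothetical in-memory stand-in: index name -> storage unit identifier.
    storage_of: dict = field(default_factory=dict)
    # Alias name -> name of the index the alias currently points to.
    aliases: dict = field(default_factory=dict)


def transition_to_warm(model, thermal_name, warm_name, read_alias, metric, threshold):
    """Apply S301-S303 once the temperature change index reaches its threshold."""
    if metric < threshold:
        return False
    # S301/S302: create the warm index and determine the thermal index's storage unit.
    unit = model.storage_of[thermal_name]
    # S303: associate the warm index with that storage unit ...
    model.storage_of[warm_name] = unit
    # ... switch the read alias so that readers go through the warm index ...
    model.aliases[read_alias] = warm_name
    # ... and only then delete the thermal index.
    del model.storage_of[thermal_name]
    return True


model = DataModel(
    storage_of={"act-idx-1": "unit-1"},
    aliases={"read-idx-1": "act-idx-1"},
)
changed = transition_to_warm(
    model, "act-idx-1", "act-idx-1-1", "read-idx-1", metric=3, threshold=3
)
```

The thermal index is removed last so that the storage unit is always reachable through at least one index while the read alias is being switched.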
Fig. 10 is a schematic flow chart of managing multiple indexes in a data model according to another embodiment of the present application. As shown in fig. 10, the method may include steps S401 to S402:
S401, when any warm index reaches a version update condition, creating a new version of the warm index and deleting the old version of the warm index.
An incompatible update of the data model can be identified in two cases. In the first case, the mapping of the database corresponding to the data model is changed, for example, the mapping of the attribute a to the field a is changed to the mapping of the attribute B; that is, the database where the storage unit corresponding to an index in the data model is located is changed. In the second case, the data type of the log data stored in the storage unit corresponding to an index in the data model is changed, for example, from the field type to the numeric type or the character string type. In either case, it can be determined that the data model has been updated in an incompatible manner.
In this embodiment, when the data model undergoes an incompatible update, that is, when any warm index in the data model reaches the version update condition, a new version of the warm index is created.
In a possible implementation of this embodiment, fig. 11 is a schematic diagram of creating a new version of the warm index. As shown in fig. 11, the mapping of the index act-idx-4 in the data model is changed, so the index reaches the version update condition and a new version of the warm index may be created. Specifically, the new version of the warm index indx-idx-2-1 is created and the old version of the warm index indx-idx-1-1 is deleted. Accordingly, when the thermal index act-idx-2 is later deleted and its warm index is created, the warm index corresponding to act-idx-2 is indx-idx-2-2, where the first 2 is the data version value and the second 2 is the batch information.
It should be noted that, the data version value of the new version of the warm index is incremented compared with the data version value of the old version of the warm index, as shown in fig. 11, the data version value of the old version of the warm index is 1, and the data version value of the new version of the warm index is 2.
In this embodiment, after the new version of the warm index is generated and the storage unit corresponding to the old version of the warm index is associated with it, the read alias read-idx-1 needs to be switched to the new version of the warm index.
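The version update in S401 can be illustrated with a short sketch. It assumes that warm index names follow the pattern `indx-idx-<version>-<batch>` quoted from fig. 11; the regular expression, the dictionaries, and the function names are hypothetical and only illustrate the increment, re-association, and alias switch described above.

```python
import re

# Assumed naming pattern: indx-idx-<data version value>-<batch information>.
WARM_NAME = re.compile(r"^indx-idx-(\d+)-(\d+)$")


def next_version_name(old_name):
    """Return the warm index name with the data version value incremented by one."""
    match = WARM_NAME.match(old_name)
    if match is None:
        raise ValueError(f"not a versioned warm index name: {old_name}")
    version, batch = int(match.group(1)), int(match.group(2))
    return f"indx-idx-{version + 1}-{batch}"


def update_version(storage_of, aliases, old_name, read_alias):
    """Create the new version, hand over the storage unit, and switch the read alias."""
    new_name = next_version_name(old_name)
    # Associate the old version's storage unit with the new version and
    # drop the old version in the same step.
    storage_of[new_name] = storage_of.pop(old_name)
    aliases[read_alias] = new_name
    return new_name


storage_of = {"indx-idx-1-1": "unit-1"}
aliases = {"read-idx-1": "indx-idx-1-1"}
new_name = update_version(storage_of, aliases, "indx-idx-1-1", "read-idx-1")
```

Only the data version value changes; the batch information is carried over unchanged, which matches the example in which indx-idx-1-1 becomes indx-idx-2-1.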
S402, when the cold transfer index of any warm index reaches a cold transfer threshold, deleting the warm index, and deleting and/or transferring the log data in the storage unit corresponding to the deleted warm index.
In this embodiment, when a warm index has existed for too long, the log data stored in its storage unit no longer has reference value, and continuing to store it consumes considerable system resources. Therefore, when the cold transfer index of the warm index reaches the cold transfer threshold, the warm index needs to be deleted, and the log data in its storage unit is either deleted or backed up and transferred to another database for storage.
In a specific implementation, the cold transfer threshold may be set as the total number of days since the warm index was created, for example 7 days. When the cold transfer index of any warm index reaches the cold transfer threshold, that is, when 7 days have elapsed since the warm index was created, the warm index is deleted, and the log data in its storage unit is either deleted or transferred and backed up to another database.
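The age-based cold transfer check can be sketched as below. The 7-day threshold is the example value from the text; the in-memory dictionaries, the optional `archive` target, and the function names are assumptions made for illustration only.

```python
from datetime import datetime, timedelta

COLD_TRANSFER_DAYS = 7  # example threshold taken from the text


def reached_cold_threshold(created_at, now, threshold_days=COLD_TRANSFER_DAYS):
    """True once the warm index has existed for at least threshold_days."""
    return now - created_at >= timedelta(days=threshold_days)


def cold_transfer(warm_indexes, now, archive=None):
    """Delete expired warm indexes; back up their log data first if archive is given.

    warm_indexes maps an index name to a (created_at, log_records) tuple.
    """
    removed = []
    for name in list(warm_indexes):
        created_at, records = warm_indexes[name]
        if reached_cold_threshold(created_at, now):
            if archive is not None:
                archive[name] = records  # transfer before deleting
            del warm_indexes[name]
            removed.append(name)
    return removed


warm_indexes = {
    "indx-idx-1-1": (datetime(2023, 7, 1), ["old record"]),
    "indx-idx-1-2": (datetime(2023, 7, 6), ["recent record"]),
}
archive = {}
removed = cold_transfer(warm_indexes, now=datetime(2023, 7, 8), archive=archive)
```

Passing `archive=None` corresponds to the delete-only variant, while passing a dictionary corresponds to backing up the log data elsewhere before the warm index is removed.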
In one possible implementation manner of the embodiment of the present application, fig. 12 provides a schematic diagram of a single-batch index state change, and as shown in fig. 12, an index initially created is a thermal index act-idx-n, and a corresponding write alias index write-idx, a collective alias active-idx, and a read alias index read-idx-n are created while the thermal index is created, where n is a data batch.
In this embodiment, after the initial index is created, when the rolling index of the initial index reaches the rolling threshold, a new thermal index with an incremented data batch is created; when the temperature change index of the initial index reaches the temperature change threshold, the warm index act-idx-1-n corresponding to the initial index is created and the initial index is deleted.
In this embodiment, when any index in the data model reaches the version update condition, the warm index can be regenerated. Note that the regenerated warm index is the new version of the warm index, and the old version of the warm index is deleted when the new version is generated. As shown in fig. 12, the m-th generation of the new version of the warm index is named index-idx-m-n, where m is the number of times an index in the data model has reached the version update condition and n is the data batch.
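The rolling step can be sketched as follows, assuming thermal index names of the form `act-idx-<batch>` and a single write alias `write-idx` as shown in fig. 12. The helper names and the list-and-dictionary representation are hypothetical.

```python
def batch_of(name):
    """Extract the data batch from a thermal index name such as act-idx-3."""
    return int(name.rsplit("-", 1)[1])


def newest_thermal_index(names):
    """The thermal index with the largest batch is the current write target."""
    return max(names, key=batch_of)


def roll_over(names, aliases, rolling_index, rolling_threshold):
    """Create a thermal index with an incremented batch once the threshold is reached."""
    current = newest_thermal_index(names)
    if rolling_index < rolling_threshold:
        return current
    new_name = f"act-idx-{batch_of(current) + 1}"
    names.append(new_name)
    aliases["write-idx"] = new_name  # newly received log data goes to the new index
    return new_name


names = ["act-idx-1", "act-idx-2"]
aliases = {"write-idx": "act-idx-2"}
target = roll_over(names, aliases, rolling_index=10, rolling_threshold=10)
```

Comparing batches numerically rather than as strings keeps the ordering correct once the batch number reaches two digits.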
In this embodiment, when the cold transfer index of an index reaches the cold transfer threshold, the log data in the partition where the storage unit corresponding to the index is located may be deleted. Alternatively, when the number of documents of log data in the storage unit corresponding to an index reaches a preset value, a plurality of documents in the storage unit may be merged. A single document of the storage unit may be referred to as a segment, so this operation may be referred to as segment merging.
In this embodiment, a warm index may also be opened or closed. When the warm index is in the open state, the log data in its storage unit can be read; when the warm index is in the closed state, the log data in its storage unit cannot be read. The open or closed state may be recorded by marking the state of the storage unit corresponding to the warm index as 0 or 1. For example, when the state identification field of the warm index reads 0, the storage unit corresponding to the warm index cannot be read; when the field reads 1, the storage unit can be read.
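The open and closed states can be illustrated with a small sketch that uses the 0 and 1 values from the text. The dictionaries and the function name `read_logs` are hypothetical, and treating an index with no recorded state as closed is an assumption of this sketch.

```python
CLOSED, OPEN = 0, 1  # state identification values described in the text


def read_logs(state_of, storage, index_name):
    """Return the log data of a warm index only if its state field is 1 (open)."""
    # Assumption: an index without a recorded state is treated as closed.
    if state_of.get(index_name, CLOSED) != OPEN:
        raise PermissionError(f"warm index {index_name} is closed")
    return storage[index_name]


state_of = {"indx-idx-1-1": OPEN, "indx-idx-1-2": CLOSED}
storage = {"indx-idx-1-1": ["record a"], "indx-idx-1-2": ["record b"]}
records = read_logs(state_of, storage, "indx-idx-1-1")
```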
FIG. 13 is a schematic flowchart of managing storage of a data model according to an embodiment of the present application, as shown in FIG. 13, the method includes steps S501-S517:
S501, grabbing log data.
In this embodiment, the log data may be obtained from different paths. For example, the log data obtained from path A may be a device log, and the log data obtained from path B may be a network log.
S502, classifying log data.
In this embodiment, the service feature of the log data may be determined from the path through which the log data is obtained, and the captured log data may be classified by service feature. For example, the service feature of log data obtained from path A may be a device feature, so log data obtained from path A is classified as data whose service feature is a device feature.
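Classification by acquisition path can be sketched as a simple lookup. The path keys, the feature labels (in particular the `network` label for path B), and the `unclassified` fallback are assumptions for illustration; the text only states that path A yields data with a device feature.

```python
from collections import defaultdict

# Hypothetical mapping from acquisition path to service feature.
PATH_TO_FEATURE = {
    "path_a": "device",
    "path_b": "network",
}


def classify(entries):
    """Group (path, log_line) pairs by the service feature of their path."""
    groups = defaultdict(list)
    for path, line in entries:
        feature = PATH_TO_FEATURE.get(path, "unclassified")
        groups[feature].append(line)
    return dict(groups)


grouped = classify([
    ("path_a", "fan speed warning"),
    ("path_b", "connection reset"),
    ("path_a", "disk temperature normal"),
])
```

Each resulting group would then be written to the data model that was created for that service feature.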
S503, generating a data template.
S504, storing the data template.
S505, setting an index management strategy.
S506, writing log data.
Steps S503 to S506 of the present embodiment are similar to steps S101 to S103 of the foregoing embodiment, and can be referred to each other, and the present embodiment is not repeated here.
S507, determining whether a writable index exists.
In the embodiment of the present application, when log data to be written is grabbed, determining a data model corresponding to the log data, determining whether a writable index exists in the data model, and when the writable index does not exist, performing step S508; when there is a writable index, step S509 is performed.
S508, generating an initial index.
The steps of this embodiment are similar to those of fig. 6 in the foregoing embodiment, and reference may be made to the same, and the description of this embodiment is omitted here.
S509, determining whether the rolling index of the thermal index reaches the rolling threshold.
In this embodiment, when the rolling index of the thermal index reaches the rolling threshold, step S510 is performed; when the rolling index of the thermal index does not reach the rolling threshold, step S511 is performed.
S510, rolling over to generate a new thermal index.
Steps S509-S510 of the present embodiment are similar to S1033 of the previous embodiment, and can be referred to each other, and the present embodiment is not repeated here.
S511, determining whether the temperature change index of the thermal index reaches a temperature change threshold.
In the embodiment of the present application, when the temperature change index of the thermal index reaches the temperature change threshold, step S512 is performed; when the temperature change index of the thermal index does not reach the temperature change threshold, step S513 is performed.
S512, generating a warm index.
Steps S511-S512 of the present embodiment are similar to steps S301-S303 of the previous embodiment, and can be referred to each other, and the present embodiment is not repeated here.
S513, determining whether the data model is updated in an incompatible manner.
In the embodiment of the present application, when an incompatible update occurs to the data model, step S514 is performed; when no incompatible update of the data model occurs, step S515 is performed.
S514, generating a new version of the warm index.
Steps S513 to S514 of the present embodiment are similar to steps S401 to S402 of the previous embodiment, and can be referred to each other, and the present embodiment is not described herein.
S515, saving the index.
In this embodiment, when a new index is generated, for example, an initial index, a new thermal index, or a new version of a warm index, the newly generated index is saved.
S516, determining whether there is log data still to be written into the data model.
In this embodiment, after one round of management of the log data and its corresponding data model is completed, it is determined whether there is unwritten log data. If there is, the flow returns to step S507; if there is not, step S517 is performed and the flow of this embodiment ends.
S517, ending.
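The branch conditions of steps S507 to S515 can be summarized in a small decision function. This sketch only encodes the branches stated above; the dictionary keys are hypothetical names for the quantities in the text, and what fig. 13 does after each generating step is not modeled here.

```python
def next_step(state):
    """Return the step chosen by the checks in S507, S509, S511 and S513."""
    if not state["has_writable_index"]:
        return "S508"  # generate an initial index
    if state["rolling_index"] >= state["rolling_threshold"]:
        return "S510"  # roll over to a new thermal index
    if state["temperature_change_index"] >= state["temperature_change_threshold"]:
        return "S512"  # generate a warm index
    if state["incompatible_update"]:
        return "S514"  # generate a new version of the warm index
    return "S515"  # save the index


base = {
    "has_writable_index": True,
    "rolling_index": 0,
    "rolling_threshold": 100,
    "temperature_change_index": 0,
    "temperature_change_threshold": 3,
    "incompatible_update": False,
}
steps = [
    next_step({**base, "has_writable_index": False}),
    next_step({**base, "rolling_index": 100}),
    next_step({**base, "temperature_change_index": 3}),
    next_step({**base, "incompatible_update": True}),
    next_step(base),
]
```

The checks are ordered exactly as in the flow, so a thermal index that has reached both the rolling threshold and the temperature change threshold is rolled over first.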
Fig. 14 is a schematic diagram of a log data storage function module according to an embodiment of the present application, as shown in fig. 14, where the function module at least includes: a log data grabbing module 61, a log data classifying module 62, a data template building module 63, a data model building module 64, a storage module 65 and a log data retrieving module 66.
In this embodiment of the present application, the log data grabbing module 61 may be used to perform a similar content to step S501 in the previous embodiment, and may be referred to each other, which is not described herein again.
In this embodiment, the log data classification module 62 may be used to perform the content similar to the step S502 in the foregoing embodiment, and may be referred to each other, which is not described herein.
In this embodiment, the data template creation module 63 may include the following sub-modules: a data template setting module 631 and a data template storage module 632. The functions of the data template creation module 63 and its sub-modules are similar to steps S101 to S103 in the foregoing embodiment and may be referred to each other; they are not described again here.
In this embodiment, the data model building module 64 may be used to perform content similar to steps S201 to S203 in the previous embodiment, and reference may be made to each other. The data model building module 64 may include the following sub-modules: a data model control module 641 and an index control module 642. The data model control module 641 may further include an index association module 6411 and a data model update module 6412, and the index control module 642 may further include an initial index setup module 6421, an index scroll module 6422, an index warming module 6423 and a new version index generation module 6424.
In this embodiment, the storage module 65 may be used to execute similar contents as those of steps S1031-S1033 in the previous embodiment, and reference may be made to each other.
In embodiments of the present application, log data retrieval module 66 may be used to retrieve target log data in a data model.
In one possible implementation of this embodiment, fig. 15 is a schematic flowchart of log data retrieval provided in one embodiment of the present application. As shown in fig. 15, the method includes steps S601 to S605:
S601, acquiring target data template information corresponding to target log data.
In this embodiment, a plurality of data templates and corresponding data models have been established in advance based on Elasticsearch, and each data model stores log data of the corresponding category. To search for the required target log data across the plurality of data models, the data template information corresponding to the target log data may be input; that is, the target data template information may be determined from the service feature of the target log data, and the target data model is then determined from the target data template information.
S602, searching a target data model stored in a database.
In this embodiment of the present application, after determining a target data template corresponding to target log data, the corresponding target data model may be determined by the target data template, and specifically, the corresponding target data model may be searched in a corresponding database for storing data models.
S603, searching indexes in a database.
In the embodiment of the application, after the target data model is determined, a plurality of indexes contained in the target data model can be searched in a database.
S604, inquiring target log data through indexes.
In this embodiment, the target log data may be queried through the indexes contained in the determined target data model. Because log data is classified and stored by category when it is written, the target log data can be retrieved quickly by querying only a few indexes.
In practical applications, the target log data may be queried through Elasticsearch-based DSL queries or SPL queries.
S605, returning the target log data.
In the embodiment of the application, when the target log data is found, the target log data may be returned to the terminal device that retrieves the target log data.
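As an illustration of the DSL query mentioned in S604, the following sketch builds an Elasticsearch Query DSL request body as a plain Python dictionary. The field names `service_feature` and `@timestamp`, the function name, and the default `size` are assumptions; the application does not specify the document schema, and sending the body to a cluster is left out.

```python
def build_query(service_feature, start, end, size=100):
    """Build a Query DSL body that filters by service feature and time range."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter clauses match documents without affecting scoring.
                "filter": [
                    {"term": {"service_feature": service_feature}},
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
    }


body = build_query("device", "2023-07-01T00:00:00Z", "2023-07-06T23:59:59Z")
```

Such a body would typically be sent to the read alias of the target data model, so that only the indexes of that category are searched.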
It should be noted that the sequence number of each step in the above embodiments does not imply an order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Fig. 16 shows a schematic diagram of a log data storage device provided in an embodiment of the present application. As shown in fig. 16, the device may specifically include an acquisition module 71, a dividing module 72, and a storage module 73, where:
an acquisition module 71, configured to acquire a plurality of types of log data, each of the log data including data of at least one service feature.
In the embodiment of the present application, the acquisition module is further configured to
A dividing module 72, configured to divide the plurality of types of log data into a plurality of categories according to the service characteristics, where the log data in any category corresponds to a pre-established data model, and the data model includes a plurality of indexes;
in the embodiment of the present application, the dividing module 72 is further configured to:
determining template setting information corresponding to each service feature one by one;
creating a plurality of data templates based on the template setting information, wherein any data template corresponds to one service feature;
And creating the data model corresponding to the log data of each category one by one based on the data template.
In the embodiment of the present application, the dividing module 72 is further configured to:
and managing a plurality of indexes in the data model according to the index management strategy, wherein each index at least comprises a rolling index, a temperature conversion index, a cooling conversion index and/or batch information.
In the embodiment of the present application, the dividing module 72 is further configured to:
when the temperature change index of the thermal index in the data model reaches a temperature change threshold, creating a warm index corresponding to the thermal index;
determining a storage unit corresponding to the thermal index;
after associating the warm index with the storage unit, deleting the thermal index.
In the embodiment of the present application, the dividing module 72 is further configured to:
when any warm index reaches a version update condition, creating a new version of the warm index and deleting the old version of the warm index, wherein the data version value of the new version of the warm index is incremented compared with the data version value of the old version of the warm index; associating the storage unit corresponding to the old version of the warm index with the new version of the warm index;
and when the cold transfer index of any warm index reaches a cold transfer threshold, deleting the warm index, and deleting and/or transferring the log data in the storage unit corresponding to the deleted warm index.
And the storage module 73 is configured to store log data belonging to the same category into the corresponding data model according to the index.
In the embodiment of the present application, the storage module 73 is further configured to:
determining batch information of each thermal index in the data model;
determining a newly created thermal index for storing the log data based on the batch information;
and storing the log data belonging to the same category into a storage unit corresponding to the newly created thermal index.
In the embodiment of the present application, the storage module 73 is further configured to:
when the rolling index of the thermal index in the data model reaches a rolling threshold, creating a new thermal index with the data batch increasing, wherein a storage unit corresponding to the new thermal index is used for storing the latest received log data.
For the device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference should be made to the description of the method embodiments.
Fig. 17 is a schematic diagram of a terminal device provided in an embodiment of the present application. As shown in fig. 17, a terminal device 800 in the embodiment of the present application includes: a processor 810, a memory 820 and a computer program 821 stored in said memory 820 and executable on said processor 810. The processor 810 implements the steps of the above-described embodiments of the log data storage method when executing the computer program 821, such as steps S101 to S103 shown in fig. 1. Alternatively, the processor 810 may perform the functions of the modules/units of the apparatus embodiments described above, such as the functions of the modules 71 to 73 shown in fig. 16, when executing the computer program 821.
By way of example, the computer program 821 may be partitioned into one or more modules/units that are stored in the memory 820 and executed by the processor 810 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which may be used to describe the execution of the computer program 821 in the terminal device 800. The terminal device 800 may include, but is not limited to, a processor 810 and a memory 820. It will be appreciated by those skilled in the art that fig. 17 is merely an example of a terminal device 800 and does not limit the terminal device 800, which may include more or fewer components than shown, may combine certain components, or may use different components; for example, the terminal device 800 may also include input and output devices, network access devices, buses, etc.
The processor 810 may be a central processing unit (Central Processing Unit, CPU), but may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 820 may be an internal storage unit of the terminal device 800, for example, a hard disk or a memory of the terminal device 800. The memory 820 may also be an external storage device of the terminal device 800, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card) or the like provided on the terminal device 800. Further, the memory 820 may include both internal storage units and external storage devices of the terminal device 800. The memory 820 is used to store the computer program 821 and other programs and data required by the terminal device 800. The memory 820 may also be used to temporarily store data that has been output or is to be output.
The embodiment of the application also discloses a terminal device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the log data storage method in the previous embodiments when executing the computer program.
The embodiments also disclose a computer readable storage medium storing a computer program which, when executed by a processor, implements the method for storing log data as described in the foregoing embodiments.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting. Although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (10)

1. A method of storing log data, comprising:
acquiring a plurality of types of log data, wherein each type of log data comprises data of at least one service characteristic;
dividing a plurality of types of log data into a plurality of categories according to the service characteristics, wherein the log data of any category corresponds to a pre-established data model, and the data model comprises a plurality of indexes;
and storing the log data belonging to the same category into the corresponding data model according to the index.
2. The method of claim 1, wherein the plurality of indexes includes a plurality of thermal indexes, and wherein storing log data belonging to the same category into the corresponding data model according to the indexes includes:
determining batch information of each thermal index in the data model;
determining a newly created thermal index for storing the log data based on the batch information;
and storing the log data belonging to the same category into a storage unit corresponding to the newly created thermal index.
3. The method according to claim 2, further comprising, after storing log data belonging to the same category in a storage unit corresponding to the thermal index newly created:
when the rolling index of the thermal index in the data model reaches a rolling threshold, creating a new thermal index with the data batch increasing, wherein a storage unit corresponding to the new thermal index is used for storing the latest received log data.
4. A method according to any one of claims 1-3, further comprising:
determining template setting information corresponding to each service feature one by one;
creating a plurality of data templates based on the template setting information, wherein any data template corresponds to one service feature;
and creating the data model corresponding to the log data of each category one by one based on the data template.
5. The method of claim 4, wherein the template setting information includes an index management policy, the method further comprising:
and managing a plurality of indexes in the data model according to the index management strategy, wherein each index at least comprises a rolling index, a temperature conversion index, a cooling conversion index and/or batch information.
6. The method of claim 5, wherein managing the plurality of indexes in the data model according to the index management policy comprises:
when the temperature change index of the thermal index in the data model reaches a temperature change threshold, creating a warm index corresponding to the thermal index;
determining a storage unit corresponding to the thermal index;
after associating the warm index with the storage unit, deleting the thermal index.
7. The method of claim 5 or 6, wherein the plurality of indexes comprise a warm index, the warm index further comprises a data version value, and managing the plurality of indexes in the data model according to the index management policy comprises:
when any warm index reaches a version update condition, creating a new version of the warm index and deleting the old version of the warm index, wherein the data version value of the new version of the warm index is incremented compared with the data version value of the old version of the warm index; associating the storage unit corresponding to the old version of the warm index with the new version of the warm index;
and when the cold transfer index of any warm index reaches a cold transfer threshold, deleting the warm index, and deleting and/or transferring the log data in the storage unit corresponding to the deleted warm index.
8. A storage device for log data, comprising:
an acquisition module, configured to acquire a plurality of types of log data, wherein each type of log data comprises data of at least one service characteristic;
a dividing module, configured to divide the plurality of types of log data into a plurality of categories according to the service characteristics, wherein the log data of any category corresponds to a pre-established data model, and the data model comprises a plurality of indexes;
and a storage module, configured to store the log data belonging to the same category into the corresponding data model according to the index.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the method of storing log data according to any of claims 1-7 when executing the computer program.
10. A computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the method of storing log data according to any one of claims 1-7.
CN202310821350.3A 2023-07-06 2023-07-06 Log data storage method and device, terminal equipment and storage medium Pending CN116541364A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310821350.3A CN116541364A (en) 2023-07-06 2023-07-06 Log data storage method and device, terminal equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116541364A true CN116541364A (en) 2023-08-04

Family

ID=87458211

Country Status (1)

Country Link
CN (1) CN116541364A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107622084A (en) * 2017-08-10 2018-01-23 深圳前海微众银行股份有限公司 Blog management method, system and computer-readable recording medium
CN111522786A (en) * 2020-04-21 2020-08-11 中国建设银行股份有限公司 Log processing system and method
CN111914126A (en) * 2020-07-22 2020-11-10 浙江乾冠信息安全研究院有限公司 Processing method, equipment and storage medium for indexed network security big data
CN112181987A (en) * 2020-10-12 2021-01-05 嘉联支付有限公司 Non-time sequence data processing method
CN113282618A (en) * 2021-06-18 2021-08-20 福建天晴数码有限公司 Optimization scheme and system for retrieval of active clusters of Elasticissearch
CN114817588A (en) * 2022-04-11 2022-07-29 广东华兴银行股份有限公司 Business image data management method, device and storage medium
US11615082B1 (en) * 2020-07-31 2023-03-28 Splunk Inc. Using a data store and message queue to ingest data for a data intake and query system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination